Johannes Koch wrote:
That's because the W3C markup validator does not really know about XHTML
and XML in general.
Nor should it (in general).
The web is not (yet) an XML medium. If you want to serve XHTML (don't
let me stop you) then you're best doing it as text/html under Appendix
C. This is perfectly reasonable to do and works fine.
However a validator encountering such a page has to treat it as an SGML
document (or at least non-XML HTML, where this varies from rigid SGML)
conforming to some mutant SGML DTD derived from the XHTML Schema. It
would be an error to treat it as an XML document and try to validate it
against XML rules, because that's not what you're serving it as.
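
The "what you're serving it as" rule can be sketched as a dispatch on the Content-Type header. This is only an illustration, not how the W3C validator is actually implemented; the function name parse_mode and the returned labels are made up for the example.

```python
def parse_mode(content_type):
    """Pick a parsing mode from a Content-Type header value.

    Hypothetical helper for illustration only: the names and the
    "html" / "xml" / "unknown" labels are assumptions, not part of
    any real validator.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()

    if media_type == "text/html":
        # XHTML served under Appendix C lands here: treat it as HTML.
        return "html"
    if media_type in ("application/xhtml+xml", "application/xml", "text/xml"):
        # Only these media types mean the document is being served as XML.
        return "xml"
    return "unknown"


if __name__ == "__main__":
    print(parse_mode("text/html; charset=utf-8"))
    print(parse_mode("application/xhtml+xml"))
```

The markup itself never enters into the decision: the same bytes are HTML or XML purely according to the header they travel under.
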
Yes, the W3C validator doesn't do XML. If you serve XHTML _as_ XML
(with application/xhtml+xml) then you're going to see odd behaviours -
much as you may if you serve it like that to the web at large. In
particular, don't expect the W3C validator to validate it _as_ an XML
document - features such as namespaces just don't work.
This is not entirely unreasonable though. The W3C validator never
claims to be an XML validator, nor is (and this surprises many people)
an XHTML document an XML document, _if_it's_served_as_HTML_. Your
embedded RDF metadata with namespaces (and I'm the biggest culprit
around) is fine and dandy so long as your XHTML document still _is_
XML, but once you throw it out onto the web wearing the trousers of
HTML, then that namespacing is no longer valid. We know the old HTML
rules about ignoring unknown elements and just rendering their content
- these are enough to keep the embedded metadata "safe" for use
(there's an old last-century "RDF in HTML" note of Dan Brickley's on
just this topic) but while this "safe" "extended" HTML is perfectly
usable, it is no longer valid in the strict sense.
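
To make the namespace point concrete, here is a rough sketch using Python's standard library. It assumes a made-up document with a Dublin Core dc:creator element standing in for the embedded metadata, and it uses html.parser as a stand-in for "an HTML reading" of the page; that module is neither an SGML parser nor what the W3C validator uses.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical XHTML page with one namespaced metadata element.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<head><title>Example</title>'
    '<dc:creator>A. N. Author</dc:creator></head>'
    '<body><p>Hello</p></body></html>'
)


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_as_html(text):
    # HTML reading: "dc:creator" is just an unknown element whose name
    # happens to contain a colon; xmlns:dc is an ordinary attribute.
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


def tags_as_xml(text):
    # XML reading: the prefix is resolved to its namespace URI.
    return [element.tag for element in ET.fromstring(text).iter()]


if __name__ == "__main__":
    print(tags_as_html(DOC))
    print(tags_as_xml(DOC))
```

Read as HTML, the element comes back under the literal name dc:creator, an unknown tag to be skipped over; read as XML, it comes back as {http://purl.org/dc/elements/1.1/}creator, bound to its namespace. The content survives either way, which is the "safe" part, but only the XML reading knows what the prefix means.
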