??? Crazy XHTML Strict Validation Problem ???

Hello all,

I have a very strange situation -- I have a page that validates (using
http://validator.w3.org/) as "XHTML 1.0 Strict" just fine. This page
uses this DOCTYPE:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

When I change the DOCTYPE to (what should be the equivalent):
<!DOCTYPE html SYSTEM
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I get several validation errors.

The page I'm referring to is at:
http://www.absolutejava.com/testing.html

I know you are wondering *WHY* I would make this change. I'll get to
that in a moment. The important point is that, according to my
understanding, both DOCTYPEs use the *same* DTD and so the document
should validate (or not validate) consistently, right?
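
To make the comparison concrete, here is a minimal Python sketch that
pulls the identifiers out of both declarations. The regular expression
and the `parse_doctype` helper are illustrative assumptions, not part of
any validator; they only handle the two double-quoted forms shown above.

```python
import re

# Matches the two external-subset forms of a DOCTYPE declaration:
#   <!DOCTYPE root PUBLIC "public-id" "system-id">
#   <!DOCTYPE root SYSTEM "system-id">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+'
    r'(?:PUBLIC\s+"(?P<public>[^"]*)"\s+"(?P<system1>[^"]*)"'
    r'|SYSTEM\s+"(?P<system2>[^"]*)")\s*>',
    re.IGNORECASE,
)


def parse_doctype(text):
    """Return (root, public_id, system_id); public_id is None for SYSTEM."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        raise ValueError("no DOCTYPE declaration found")
    system_id = m.group("system1")
    if system_id is None:
        system_id = m.group("system2")
    return m.group("root"), m.group("public"), system_id


with_public = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'''
system_only = '''<!DOCTYPE html SYSTEM
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'''

print(parse_doctype(with_public))
print(parse_doctype(system_only))
```

Both calls report the same system identifier; the only difference is
that the second declaration carries no public identifier.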

I also tried copying the DTD from www.w3.org to my server and then
modifying the DOCTYPE accordingly, but I still got the same validation
errors.

The reason I'm doing this is that I want to use "XHTML Strict" *except*
for one small tweak I need to make to the DTD. But, before I can make
the tweak I need the document to validate against a local copy of the
DTD.

Can anyone explain why the different DOCTYPEs produce different
validation results, even though they use the same DTD?

Thanks....

Oct 5 '05 #1
rbronson1976 wrote:
> Hello all,
>
> I have a very strange situation -- I have a page that validates (using
> http://validator.w3.org/) as "XHTML 1.0 Strict" just fine. This page
> uses this DOCTYPE:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>
> When I change the DOCTYPE to (what should be the equivalent):
> <!DOCTYPE html SYSTEM
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>
> I get several validation errors.
>
> The page I'm referring to is at:
> http://www.absolutejava.com/testing.html

The problem is that, without the public identifier, the validator does
not know that the system identifier references an XML DTD rather than
an SGML DTD and, because the document is being served with the wrong
MIME type (text/html instead of application/xhtml+xml), it falls back
to SGML-based validation.

If you change the MIME type sent by the server in the HTTP Content-Type
header, to an XML MIME type, then the validator should behave as expected.

> The reason I'm doing this is that I want to use "XHTML Strict" *except*
> for one small tweak I need to make to the DTD. But, before I can make
> the tweak I need the document to validate against a local copy of the
> DTD.

Ignoring the question of why you want to modify the DTD, you should
consider using HTML and modifying the HTML 4.01 Strict DTD, rather than
trying to use XHTML incorrectly.

http://www.cs.tut.fi/~jkorpela/html/own-dtd.html

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Oct 5 '05 #2
Lachlan,

The public identifier, to my knowledge, is optional. The SYSTEM
identifier is what provides the DTD.

I've also tried other (custom) PUBLIC Identifiers, to no avail. For
example, I tried this DOCTYPE, but I got the same validation failures.

<!DOCTYPE html PUBLIC "-//ABS//DTD XHTML 1.0 Strict Special Tweak//EN"
"http://www.absolutejava.com/DTD/xhtml1-strict.dtd">

You also said, "...the validator does not know that the system
identifier is referencing an XML DTD, rather than an SGML DTD, and
because the document is being served with the wrong MIME type..."

That's an odd thing to say considering the document *DOES* validate
correctly if I use the first DOCTYPE, even though, according to you, I
am using the wrong MIME type (text/html). So, if the MIME type is
causing the problem, why doesn't it cause a problem with the first
DOCTYPE?

I do not believe the "text/html" MIME type is wrong or causing the
problem, although it is not preferred for XHTML. According to
http://www.w3.org/TR/xhtml-media-types, "...the use of 'text/html'
SHOULD be limited to HTML-compatible XHTML 1.0 documents."

In addition, "... XHTML Documents which follow the guidelines set forth
in Appendix C, 'HTML Compatibility Guidelines' may be labeled with the
Internet Media Type "text/html", as they are compatible with most HTML
browsers."

So, I don't think you can say I'm using the wrong MIME type.

Finally, I did try changing the MIME type in the <meta> element to
"application/xhtml+xml" but the same problem occurred.

If anyone *really* knows why I am getting these validation errors,
please respond...but no more half-baked, useless guesses, please.

Oct 5 '05 #3
On 5 Oct 2005, rbronson1976 wrote:
> Finally, I did try changing the MIME type in the <meta> element to
> "application/xhtml+xml" but the same problem occurred.

The MIME type is *not* set in the META thingy; it is set in the HTTP
header!
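
One way to see what the server really declares is to look at the
Content-Type header itself rather than at the markup. The following is a
minimal Python sketch; the `media_type` helper is a made-up name for
illustration, and it relies only on the standard library's header
parsing.

```python
from email.message import Message


def media_type(content_type_header):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type(), msg.get_content_charset()


# To inspect a live page (not run here, since it needs network access):
#
# from urllib.request import urlopen
# with urlopen("http://www.absolutejava.com/testing.html") as resp:
#     print(media_type(resp.headers["Content-Type"]))

print(media_type("text/html; charset=ISO-8859-1"))
print(media_type("application/xhtml+xml"))
```

Whatever a <meta> element inside the document says does not change the
value reported here.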

Oct 5 '05 #4
Really? That's very interesting. I know there is an HTTP header that
specifies content-type, BUT, there is also a <meta> tag (a.k.a.,
"thingy") that supplements the HTTP headers.

So, if the HTTP header, proper, indicates content-type of 'text/html'
and the <meta> tag indicates something else, which one should a user
agent accept? And, which W3 spec indicates this?

Oct 5 '05 #5
rbronson1976 wrote:
> The public identifier, to my knowledge, is optional.

Yes, technically, it is, according to the XML Recommendation.

> The SYSTEM identifier is what provides the DTD.

Yes, it references the external DTD.

> I've also tried other (custom) PUBLIC Identifiers, to no avail.

You misunderstood what I meant. The validator switches to XML mode for
known PUBLIC identifiers for XML documents, such as XHTML, regardless of
the MIME type. Since the validator, obviously, does not know about your
custom PUBLIC identifier, it does not know that it should continue in
XML mode and, because it was served as text/html, defaults to SGML mode.
When the document is served with an XML MIME type, it uses XML mode.

> You also said, "...the validator does not know that the system
> identifier is referencing an XML DTD, rather than an SGML DTD, and
> because the document is being served with the wrong MIME type..."
>
> That's an odd thing to say considering the document *DOES* validate
> correctly if I use the first DOCTYPE, even though, according to you, I
> am using the wrong MIME type (text/html).

That's because the validator knows the XHTML DOCTYPEs.

> So, if the MIME type is causing the problem, why doesn't it cause a
> problem with the first DOCTYPE?

Because, upon encountering a document with a known XML DOCTYPE, the
validator knows that it should continue in XML mode.

> I do not believe the "text/html" MIME type is wrong or causing the
> problem...
> In addition, "... XHTML Documents which follow the guidelines set forth
> in Appendix C, 'HTML Compatibility Guidelines' may be labeled with the
> Internet Media Type "text/html", as they are compatible with most HTML
> browsers."

Although it is allowed by the recommendation, you should be aware that
doing so is considered harmful.

> So, I don't think you can say I'm using the wrong MIME type.

No, it is the *wrong* MIME type, even if it is technically allowed under
certain conditions.

> Finally, I did try changing the MIME type in the <meta> element to
> "application/xhtml+xml" but the same problem occurred.

Change the MIME type in the HTTP headers. The meta element is only
useful for setting the charset in text/html documents when the charset
parameter has been omitted from the HTTP Content-Type header, or when
the file is not being served over HTTP or another protocol that makes
such information available.

In the HTTP headers, for HTML, use:
Content-Type: text/html; charset=XXX
(where XXX is whatever encoding you have used)

For XHTML, use:
Content-Type: application/xhtml+xml

(XML documents are self describing and don't need charset information in
the HTTP headers)
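
As a rough illustration of the two header values above, here is a
minimal Python sketch of a local test server that picks the Content-Type
from the file extension. The extension-to-type mapping, the
`content_type_for` helper, and the port number are assumptions made for
this example; a production server would be configured through its own
settings instead.

```python
import posixpath
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Assumed convention: XHTML files are stored with an .xhtml extension.
XML_TYPES = {
    ".xhtml": "application/xhtml+xml",
    ".xml": "application/xml",
}


def content_type_for(path, charset="utf-8"):
    """Return the Content-Type header value to send for a file path."""
    ext = posixpath.splitext(path)[1].lower()
    if ext in XML_TYPES:
        # XML declares its own encoding, so no charset parameter is added.
        return XML_TYPES[ext]
    if ext in (".html", ".htm"):
        return "text/html; charset=" + charset
    return "application/octet-stream"


class Handler(SimpleHTTPRequestHandler):
    def guess_type(self, path):
        # SimpleHTTPRequestHandler sends this value as the Content-Type.
        return content_type_for(path)


def serve(port=8000):
    """Serve the current directory until interrupted."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling `serve()` from the directory that holds the test page, then
pointing the validator at the resulting URL, is one way to try both
variants without touching the real server.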

> If anyone *really* knows why I am getting these validation errors,
> please respond...but no more half-baked, useless guesses, please.

I do *really* know why you are getting these validation errors; it was
not a "half-baked, useless guess". If you can't remain civil and not
insult those who choose to take the time to assist you, simply because
you failed to understand the advice given, then don't expect too much
from anyone else in the future.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Oct 5 '05 #6
Lachlan,

I'm sorry for being so snotty in my previous post -- this is all just
very frustrating and in the past I've found it's not uncommon for
people who seem to know nothing about a topic to post useless replies.
From your reply I see that you do seem to know what you're talking
about....sorry again.

Anyway, for anyone that may be reading, I took Lachlan's advice and I'd
like to describe what I found -- I also have one final question.

Lachlan seems to be correct regarding the use of a "known public
identifier" (e.g., "-//W3C//DTD XHTML 1.0 Strict//EN") -- When a "known
public identifier" is used in the DOCTYPE it causes the validator to go
into "XML mode", even if the content type of the document is non-XML
(e.g., "text/html"). In fact, using a "known public identifier" seems
to cause the validator to ignore the system identifier completely! For
example, using the following DOCTYPE, my document validated just fine:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "Queen
Victoria">

It seems that once the validator recognized the public identifier
("-//W3C//DTD XHTML 1.0 Strict//EN", in this case) it used some existing
copy of the "xhtml1-strict.dtd" DTD to validate the document. It did
*not* consult the DTD in the system identifier ("Queen Victoria", in
this case), but instead completely ignored it.
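
The behaviour observed so far can be summarised as a small decision
function. This is only a Python sketch of what this thread describes,
not the validator's actual code; the `parse_mode` name and the contents
of the two sets are assumptions chosen for illustration.

```python
# Public identifiers the validator is assumed to recognise as XML DTDs.
KNOWN_XML_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}

# Media types assumed to trigger XML parsing on their own.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parse_mode(public_id, media_type):
    """Return "xml" or "sgml" following the rules described in the thread."""
    if public_id in KNOWN_XML_PUBLIC_IDS:
        return "xml"   # known XHTML DOCTYPE wins regardless of MIME type
    if media_type in XML_MEDIA_TYPES:
        return "xml"   # unknown or missing public id, but served as XML
    return "sgml"      # everything else falls back to SGML rules


print(parse_mode("-//W3C//DTD XHTML 1.0 Strict//EN", "text/html"))
print(parse_mode(None, "text/html"))
print(parse_mode("-//ABS//DTD XHTML 1.0 Strict Special Tweak//EN",
                 "application/xhtml+xml"))
```

The second call corresponds to the SYSTEM-only DOCTYPE served as
text/html, which is the combination that produced the errors.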

This leads to my last question: Is it possible to use a "known public
identifier" *AND* still tell the validator to use a custom DTD? In
other words, I want to use a "known public identifier" so that the
validator goes into "XML mode" BUT I want it to use my version of the
DTD, not the one it has cached somewhere. Or, equivalently, can I use a
custom, unknown public identifier yet still somehow force the validator
into "XML mode"?

As a final note, I find I am able to get the validator to use my custom
DTD but to do so I have to specify a "custom" public identifier *AND* I
have to serve the document as an XML document (e.g.,
"application/xhtml+xml") so that the validator stays in XML mode, just
as Lachlan indicated. The only reason I prefer to serve as "text/html"
is that IE 6, as you probably know, does not understand
"application/xhtml+xml".

Okay, thanks for the replies. At least I can get my documents to
validate using a custom DTD, which is much farther than I was 24 hours
ago.

Oct 6 '05 #7
rbronson1976 wrote:
> Lachlan seems to be correct regarding the use of a "known public
> identifier" (e.g., "-//W3C//DTD XHTML 1.0 Strict//EN") -- When a "known
> public identifier" is used in the DOCTYPE it causes the validator to go
> into "XML mode", even if the content type of the document is non-XML
> (e.g., "text/html"). In fact, using a "known public identifier" seems
> to cause the validator to ignore the system identifier completely! For
> example, using the following DOCTYPE, my document validated just fine:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "Queen
> Victoria">

The system identifier only needs to be dereferenced by a validating user
agent when that user agent's catalogue does not contain the public
identifier.

> It seems that once the validator recognized the public identifier (
> "-//W3C//DTD XHTML 1.0 Strict//EN", in this case) it used some existing
> copy of the "xhtml1-strict.dtd" DTD to validate the document.

That's correct.

> It did *not* consult the DTD in the system identifier ("Queen Victoria", in
> this case), but instead, completely ignores it.

Ignoring the fact that the SI needs to be a URI, that is essentially
correct.

> This leads to my last question: Is it possible to use a "known public
> identifier" *AND* still tell the validator to use a custom DTD?

No. When you use a public identifier, it is expected that the DTD
referenced by the SI matches the one identified by the public identifier.
You need to use <!DOCTYPE html SYSTEM "http://...">, but, for the purpose
of validation, you also need to use an XML validator, not an SGML
validator. In the case of the W3 validator, XML mode is triggered by an
XML MIME type, which is the correct way to do what you want.

However, you can make use of another validator, like Page Valet [1],
that allows you to manually select XML validation, if you choose to
ignore the fact that by serving as text/html, your document will not be
treated as XML by any other UA.

Not only will Page Valet allow you to force XML mode, but it's also a
much better XML validator than the W3 validator, which is just an SGML
validator with a few patches to make it act like an XML validator with
"some limitations".

I recommend that, unless you have a really compelling reason to continue
using XHTML on the client side, you deliver HTML 4.01 with a custom
DTD instead. If your authoring tool/process benefits from using XHTML,
that's fine; you can continue to use XHTML on the back end, but you
should consider transforming it to HTML for the client.
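
As one possible shape for that transformation step, here is a minimal
Python sketch using only the standard library. The `xhtml_to_html`
helper is a hypothetical name, the stock HTML 4.01 Strict DOCTYPE stands
in for whatever custom declaration is eventually used, and the sketch
assumes the input is well-formed XHTML without named entities such as
&nbsp; (the standard parser does not load the DTD that defines them).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Placeholder DOCTYPE; replace with the custom DTD's declaration if needed.
HTML401_STRICT = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                  '"http://www.w3.org/TR/html4/strict.dtd">')


def xhtml_to_html(xhtml_source):
    """Parse XHTML text and re-serialise it using HTML syntax rules."""
    root = ET.fromstring(xhtml_source)
    # Drop the XHTML namespace so no xmlns declarations or prefixes appear.
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith(XHTML_NS):
            elem.tag = elem.tag[len(XHTML_NS):]
    # method="html" writes empty elements such as <br> without a close tag.
    body = ET.tostring(root, encoding="unicode", method="html")
    return HTML401_STRICT + "\n" + body


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Test</title></head>'
    '<body><p>one<br />two</p></body></html>'
)
print(xhtml_to_html(sample))
```

The XHTML source stays the authoring format; only the serialised output
sent to browsers changes.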

[1] http://valet.webthing.com/page/

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Oct 6 '05 #8
On 5 Oct 2005, rbronson1976 wrote:
> Organization: http://groups.google.com
> User-Agent: G2/0.2

The innocents abroad.

> Really? That's very interesting.

What? What is interesting?
Please quote the statement you refer to!

http://www.xs4all.nl/~wijnands/nnq/nquote.html
http://www.netmeister.org/news/learn2quote.html

Oct 6 '05 #9
Lachlan Hunt <sp***********@gmail.com> wrote:
> In the case of the W3
> validator, XML mode is triggered by an XML MIME type, which is the
> correct way to do what you want.

I don't know of any 'XML mode'; this is basically not a question of
XML versus SGML but a question of locating an SGML declaration.

The SGML declaration for XML

<http://validator.w3.org/sgml-lib/xml.dcl>

is -- dramatically -- different from the one for HTML 4

<http://validator.w3.org/sgml-lib/REC-html401-19991224/HTML4.decl>

which in turn is slightly different from e.g. the one for HTML 3

<http://validator.w3.org/sgml-lib/REC-html32-19970114/HTML32.dcl>

And so on.

Using a custom DTD for validation on a remote system is likely to get
you in trouble sooner or later if you don't know in advance which
default declaration will be chosen (pick a card, and jolly good luck).

On a side note, it's no good to draw conclusions about 'how stuff works'
by observing some particular behaviour in the wild; whether or not the
FPI OVERRIDEs the system identifier is just something else to be
configured in the catalog, see e.g.

<http://validator.w3.org/sgml-lib/REC-html401-19991224/HTML4.cat>

Id est, on a different validating system, the OP and Queen Victoria
might encounter quite different behaviour.
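
To illustrate why the outcome depends on configuration, here is a
deliberately simplified Python model of that choice. The `resolve_dtd`
function and the one-entry catalog are assumptions for this example;
real catalogs have more entry types and finer-grained rules than a
single flag.

```python
def resolve_dtd(public_id, system_id, public_map, override):
    """Choose a DTD location for a DOCTYPE.

    public_map maps public identifiers to local DTD files, and override
    models the catalog's OVERRIDE setting: when true, a matching public
    entry is preferred over a system identifier given in the document.
    """
    mapped = public_map.get(public_id) if public_id else None
    if mapped is not None and (override or system_id is None):
        return mapped
    if system_id is not None:
        return system_id
    return mapped  # None when nothing identifies a DTD


catalog = {"-//W3C//DTD XHTML 1.0 Strict//EN": "xhtml1-strict.dtd"}
fpi = "-//W3C//DTD XHTML 1.0 Strict//EN"

print(resolve_dtd(fpi, "Queen Victoria", catalog, override=True))
print(resolve_dtd(fpi, "Queen Victoria", catalog, override=False))
```

With the override on, the catalog's copy is used and the system
identifier is never consulted; with it off, the same document would send
the parser looking for "Queen Victoria".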

(On yet another side note, if you, for example, always use the html
4.01 strict dtd, you'd add something sensible like

doctype html strict.dtd

to your catalog and could free all your documents from the obsolete
cruft and just use <!doctype html system> for validation purposes.
It's utterly silly to want remote validation after publication;
validation, if at all, is useful in the production process, starting
with a local validating system and an editor that can read the catalog
and the dtd as well.)

> Not only will Page Valet allow you to force XML mode, but it's
> also a much better XML validator

It isn't a question of 'better' but rather yes or no. Page Valet lets
you choose an XML parser; the W3C validator doesn't. Thus the latter
isn't an 'XML validator' (validating XML processor) at all.
--
Goodbye and keep cold
Oct 7 '05 #10
