DTD in browsers

Randy Webb wrote:

VK said the following on 5/2/2006 9:48 AM:
If you mean "trying to render it" then FF behavior is the same as for
all other UA's willing to be in use (and not W3C demos). If document is
served as text/html, FF will render it somehow anyhow.
So you are saying it totally disregards the DTD and any hints from the
server how to handle the document?

Except server reported Content-Type (text/plain, text/html, text/xml,
application/xhtml+xml etc.)
DTD string itself is irrelevant (and this string by itself is not a
"hint from the server" but a "hint from the document").

Obviously it doesn't connect every time to w3.org to get a DTD, it uses
a built-in one.

So you are saying, again, that DTD's are irrelevant?

From the document parsing point of view: yes, absolutely irrelevant.
They have some theoretical importance for document indexing and
searching. Most importantly, a DTD allows - so far - switching IE into
the W3C box model (unless it is the short HTML Transitional form).
Without the latter their usage would be limited to ciwas and ciwah
exclusively.

That would be another aspect of your question: what DTD/tag database
is built into FF? Only one so far: XHTML 1.0. The only namespace for
HTML Firefox knows about is
xmlns:html="http://www.w3.org/1999/xhtml"

If that is true, then Firefox is not even close to Standards Compliant.

It is true, but Firefox *is* Standards Compliant - as much as it's
humanly possible without rendering a UA useless and by keeping it
attractive for potential users.

But what decision will it make based on this table - it depends
completely on the Content-Type. Say absolutely the same content with
Content-Type text/html will go through or get adjusted, but with
application/xhtml+xml will lead to a parsing error.

Odd behavior if you tell it text/html with a 4.01 DTD

WWW doesn't go by extensions or formal document signs, never did and
never will. The only important part is Content-Type. It defines
everything.
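
A minimal sketch of the Content-Type point, assuming Python's standard library stands in for a browser: the same markup is accepted by a lenient tag-soup parser when labelled text/html and rejected by an XML parser when labelled application/xhtml+xml. The names handle_document, TagCollector and XML_TYPES are hypothetical, and a real browser does far more than this.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Media types that this sketch routes to the strict XML parser (an assumption).
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


class TagCollector(HTMLParser):
    """Lenient tag-soup parser: records start tags, tolerates bad nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def handle_document(content_type, body):
    """Pick a parser from the Content-Type header alone, ignoring any DOCTYPE."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_TYPES:
        try:
            ET.fromstring(body)
        except ET.ParseError:
            return "xml: parsing error"
        return "xml: ok"
    if media_type == "text/html":
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return "html: rendered"
    return "other: shown as text"


# The <br> and <p> elements are never closed, which is fine for tag soup only.
markup = "<html><body><p>unclosed paragraph<br></body></html>"
print(handle_document("text/html; charset=iso-8859-1", markup))
print(handle_document("application/xhtml+xml", markup))
```

Nothing in the sketch looks at a document type declaration; only the header decides which parser runs.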

And if anyone is curious: the built-in DTD of IE6 is
<http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd>. This is the
only one it's aware of and the only one it uses. Accordingly, the only
type of document existing in IE is <!DOCTYPE HTML PUBLIC "-//W3C//DTD
HTML 4.01 Transitional//EN">

Now that I don't believe.

As you wish. But whether you believe or disbelieve doesn't change
anything in this matter. The only "major" change expected in IE7 will be
the <abbr> element added as a separate entity (now it is treated as a
synonym of <acronym>). Of course IE knows a bunch of other proprietary
tags. It has tables
for behaviors (<public>, <component>, <attach> etc.), tables for VML
(<v:group>, <v:line>, <v:oval> etc.) and so on. But talking about
*those* DTD - from W3C - the above mentioned DTD is the only one.

By providing other DTD's one can switch IE in "CSS1Compat" mode, but
it's just a formal reaction on "Unknown DTD" programmed into the
browser, DTD itself never changes.

Can you prove that?

Oh com'on! Again: "prove me that the sky is blue" ? :-)

<!DOCTYPE FOOBAR "Micro$oft must die!">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
</head>
<body onload="alert(document.compatMode)">
</body>
</html>

Say you can put IE into CSS1Compat mode by placing instead:
<!DOCTYPE FOOBAR "Micro$oft must die!">

Does the fun never end?

See above
document.doctype gives some neat info in Firefox though.

document.doctype is just a convenience accessor for the provided DTD
string, which is hardly accessible otherwise (because it's formally
located outside of any document blocks, even outside of
documentElement). In IE document.doctype==null for all HTML documents -
so as not to upset DTD users too much, I guess.
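
As a rough illustration of the doctype being a separate node outside the root element, here is a sketch using Python's xml.dom.minidom rather than a browser DOM. The sample document and the variable names are assumptions made for the example; minidom only records the identifiers and does not retrieve the DTD.

```python
from xml.dom import minidom

# Sample XHTML document; the DTD URL is recorded but never downloaded here.
XHTML = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Untitled Document</title></head>
<body></body>
</html>"""

doc = minidom.parseString(XHTML)
doctype = doc.doctype

# The three strings carried by the document type declaration.
doctype_info = (doctype.name, doctype.publicId, doctype.systemId)

# The doctype is a child of the document itself, not of the root element.
doctype_is_outside_root = (
    doctype.parentNode is doc and doctype is not doc.documentElement
)

print(doctype_info)
print(doctype_is_outside_root)
```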

May 3 '06

Eric B. Bednarz wrote:

But then I don't have any idea what kind of point you are aiming at; you
were babbling about IE 'using' a particular default DTD in a text/html
context.
I was not babbling as you say: I was explaining to you the basic things
you should know
It was cool that you cared to explain all this amazing rocket science
stuff to me, thank you so much.
You are welcome.
As to "woodoo" (John Woodoo, I
presume), document type declarations work for me in their original sense
every other day when writing HTML.

(woodoo = voodoo)

"in their original sense" I presume then as opaque strings identifying
this or that (X)HTML environment? Sorry for you if it /is/ the original
sense of DOCTYPE and linked DTD's.

standalone="yes" (default value)

Once you located the spec to back up your first statement above, please
make a note about the section that defines this default value as well.

You have to learn XML (real XML, not pseudo-XML XHTML crap) and XSLT
(the latter is not obligatory but would be very nice). I linked in my
previous post a sample and the Mozilla bug at bugzilla.mozilla.org
which contains a lot of useful links and references. You may start with
the latter for the basics of the prolog syntax.

Overall I suggest reading the relevant manuals and making a couple of
your own simple pages - a lot of things get much clearer in
practice.

May 17 '06 #51

Michael Winter

On 17/05/2006 13:17, VK wrote:

[snip]

I had/have/will have nasty arguments with Thomas, but his original
statement that "DTD for XML are always fetched" is totally correct.
Thomas did not write that, and a good thing, too: as a blanket statement,
it is totally false.

[snip]
1) standalone="no" flag in prolog instructs the parser that before
validating it has to retrieve additional definitions from external DTD
The standalone document declaration doesn't instruct a validating
processor to 'do' anything. It is a requirement of validating processors
themselves to process the DTD and any referenced external entities.

The standalone document declaration does have an impact on
well-formedness (see Entity Declared in section 4.1 [p.33]), and on
non-validating processor when reading parameter entities.

[snip]
[...] standalone="yes" (default value) [...]
How on Earth you managed to get that idea is beyond me.

[snip]
And the last but not least:
Currently Firefox cannot load external DTD's at all.
It can. It just chooses not to for the most part. Opera doesn't process
external entities, either.
This is a nasty bug,
It's not a bug at all; on the Web, neither Firefox nor Opera implement
validating XML processors.
but to fix it properly they have to solve somehow the problem with
the bogus DTD from W3C.

Hopefully you've resolved that misconception.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 17 '06 #52

Michael Winter wrote:

1) standalone="no" flag in prolog instructs the parser that before
validating it has to retrieve additional definitions from external DTD

The standalone document declaration doesn't instruct a validating
processor to 'do' anything. It is a requirement of validating processors
themselves to process the DTD and any referenced external entities.

The standalone document declaration does have an impact on
well-formedness (see Entity Declared in section 4.1 [p.33]), and on
non-validating processor when reading parameter entities.

[snip]
[...] standalone="yes" (default value) [...]

How on Earth you managed to get that idea is beyond me.

[snip]

With all my deep respect I can only repeat the advice given to the
previous opponent. Besides a very informative discussion around the
mentioned bug at bugzilla, you may also read
<http://www.w3.org/TR/REC-xml/#sec-rmd>. Actually this and additional
sections are mentioned in the bug thread, but you may want to start
right from the W3C.

May 17 '06 #53

Michael Winter

On 17/05/2006 19:24, VK wrote:

Michael Winter wrote:

[VK:]

[...] standalone="yes" (default value) [...]

How on Earth you managed to get that idea is beyond me.

With all my deep respect I can only repeat the advice given to the
previous opponent.

If there are no external markup declarations, the standalone
document declaration has no meaning. If there are external
markup declarations but there is no standalone document
declaration, the value "no" is assumed.
-- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)
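
The quoted rule can be sketched with Python's bundled expat bindings, whose XmlDeclHandler reports -1 when the standalone clause is omitted. The function name standalone_status and its return strings are hypothetical, and the sketch simplifies "external markup declarations" to "the DOCTYPE has a system identifier", ignoring external parameter entities declared in an internal subset.

```python
from xml.parsers import expat


def standalone_status(xml_text):
    """Apply the XML 1.0 section 2.9 default rule to one document."""
    declared = -1                 # expat passes -1 when the clause is absent
    has_external_subset = False   # simplification: system identifier present

    def on_xml_decl(version, encoding, standalone):
        nonlocal declared
        declared = standalone

    def on_doctype(name, system_id, public_id, has_internal_subset):
        nonlocal has_external_subset
        has_external_subset = system_id is not None

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)

    if declared == 1:
        return "yes (declared)"
    if declared == 0:
        return "no (declared)"
    if has_external_subset:
        return "no (assumed)"     # external declarations, no clause
    return "no meaning"           # no external declarations at all


XHTML_WITHOUT_CLAUSE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    ' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)
print(standalone_status(XHTML_WITHOUT_CLAUSE))
```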

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 17 '06 #54

Eric B. Bednarz

"VK" <sc**********@yahoo.com> writes:

Eric B. Bednarz wrote:
[...] document type declarations work for me in their original sense
every other day when writing HTML.

"in their original sense" I presume then as opaque strings identifying
this or that (X)HTML environment?
No, I mean that Emacs knows where my catalog and nsgmls are, and that my
catalog knows where the DTDs are.
Sorry for you if it /is/ the original
sense of DOCTYPE and linked DTD's.
Oh. Now that's what I call a tough break.
You have to learn XML

Due to this encouragement I'll try some day.

(Emacs also knows where my RELAX NG schemas are; but if I have any
questions about DTDs, you'll be the first one I ask for advice. :)
--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011

May 17 '06 #55

Michael Winter wrote:

If there are no external markup declarations, the standalone
document declaration has no meaning. If there are external
markup declarations but there is no standalone document
declaration, the value "no" is assumed.
-- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)

Bingo! ;-)

Applying the quoted rule to say:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

May 18 '06 #56

Andy Dingley

VK wrote:

I had/have/will have nasty arguments with Thomas, but his original
statement that "DTD for XML are always fetched" is totally correct.

No, that statement is entirely incorrect (and Thomas is too smart to
have said that anyway).

XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.

As an emergent result, it's possible to do useful work with XML without
ever even producing a DTD or Schema, and this is generally the way that
commercial XML work is done. If you ever do fetch the DTD the most
common result is to discover that it's actually some years out of date
and is no longer valid against the structure of the live documents.
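
A small sketch of that claim, assuming Python's xml.etree.ElementTree as the parser: a document with no DOCTYPE and no schema still parses into a tree that can be queried. The order data is invented for the example.

```python
import xml.etree.ElementTree as ET

# No DOCTYPE, no DTD, no schema: just well-formed XML (made-up sample data).
ORDERS = """<?xml version="1.0"?>
<orders>
  <order id="17"><item qty="2">widget</item></order>
  <order id="18"><item qty="5">gadget</item></order>
</orders>"""

root = ET.fromstring(ORDERS)

# Useful work on the parsed tree: map each order id to its item quantity.
quantities = {
    order.get("id"): int(order.find("item").get("qty"))
    for order in root.iter("order")
}
print(quantities)
```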

May 18 '06 #57

VK wrote:

I had/have/will have nasty arguments with Thomas, but his original
statement that "DTD for XML are always fetched" is totally correct.

Andy Dingley wrote:
No, that statement is entirely incorrect (and Thomas is too smart to
have said that anyway).
<http://groups.google.com/group/comp.infosystems.www.authoring.html/tree/browse_frm/thread/4ac44109aac7fa53/2d44e7f84a7a0065?rnum=21&hl=en&_done=%2Fgroup%2Fcomp.infosystems.www.authoring.html%2Fbrowse_frm%2Fthread%2F4ac44109aac7fa53%2F0a4906e0027d1058%3Fhl%3Den%26#doc_6f68f10a6c9518a9>
Thomas 'PointedEars' Lahn wrote:
It is not fetched by tagsoup parsers in many known Web browsers because it
is built-in there. It is definitely fetched by XML parsers in known Web
browsers.
Which is (once again) totally correct in application to /DTD/ in /XML/
(not pseudo-DTD in pseudo-XML aka XHTML)
I just presume that Thomas was not aware of the current bug in Gecko
preventing it from fetching external DTDs. But this bug (though it should
be mentioned) doesn't make the statement as it stands wrong.
XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.
Yet again the lack of practical experience is demonstrated in this
statement. Without a DTD with at least the most necessary extra
entities you are simply not able to parse an HTML template. The only
entities any XML parser is aware of are amp quot apos lt gt. Anything
beyond that has to be added to an internal or external DTD declaration
to let the document validate properly. And - because of the mentioned
Mozilla bug - so far the only option is to use an internal DTD. Take any
real life XML (or XML feed) to find out that the DOCTYPE section can
take many lines right after the prolog.
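
The internal-subset approach described here can be sketched as follows, assuming Python's xml.etree.ElementTree. The five predefined entities need no declaration, while anything else (eacute and nbsp are picked as examples) is declared inside the DOCTYPE of the document itself, so nothing has to be fetched.

```python
import xml.etree.ElementTree as ET

# The five predefined entities work in any XML document, DTD or not.
predefined = ET.fromstring("<p>&amp; &lt; &gt; &apos; &quot;</p>").text

# Extra entities declared in an internal DTD subset (example names only).
WITH_INTERNAL_SUBSET = """<!DOCTYPE p [
  <!ENTITY eacute "&#233;">
  <!ENTITY nbsp "&#160;">
]>
<p>caf&eacute;&nbsp;au lait</p>"""

declared = ET.fromstring(WITH_INTERNAL_SUBSET).text

print(repr(predefined))
print(repr(declared))
```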
As an emergent result, it's possible to do useful work with XML without
ever even producing a DTD or Schema
Only in a narrow set of situations not connected with the Web, see
above.
and this is generally the way that
commercial XML work is done.
Technically impossible and factually wrong. Please follow my advice and
study some real life commercial XML/XSL solution or an XML news feed.
If you ever do fetch the DTD the most
common result is to discover that it's actually some years out of date
and is no longer valid against the structure of the live documents.

Here you're coming back to the W3C's bogus DTD used as opaque
strings. This is the hard choice for you (and the W3C) to make. Either we
agree that there is only one DOCTYPE and only one DTD mechanics equal
to any document where used; or we agree that there is that DOCTYPE
(HTML/XHTML) and this DOCTYPE (XML) with completely different rules and
functionality.

May 18 '06 #58

Andy Dingley

VK wrote:

I had/have/will have nasty arguments with Thomas, but his original
statement that "DTD for XML are always fetched" is totally correct.
Andy Dingley wrote:
No, that statement is entirely incorrect (and Thomas is too smart to
have said that anyway).
<http://groups.google.com/group/comp.infosystems.www.authoring.html/tree/browse_frm/thread/4ac44109aac7fa53/2d44e7f84a7a0065?rnum=21&hl=en&_done=%2Fgroup%2Fcomp.infosystems.www.authoring.html%2Fbrowse_frm%2Fthread%2F4ac44109aac7fa53%2F0a4906e0027d1058%3Fhl%3Den%26#doc_6f68f10a6c9518a9>

And Thomas' exact quote is "It is definitely fetched by XML parsers in
known _WEB_ browsers."
(my emphasis)

Now I have no idea if this is accurate - I haven't tested XML web
browsers.

However it is _not_ the same as saying that the fetch happens for "all
XML parsers". Now from my own direct knowledge I know that much
certainly isn't true. For one thing it can't be true because many of
the world's live XML apps don't even _have_ a DTD (or an accurate DTD)
to fetch.
Which is (once again) totally correct in application to /DTD/ in /XML/
(not pseudo-DTD in pseudo-XML aka XHTML)
XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
not XML at all. XHTML (as XML) is (or should be) perfectly compliant
XML.

Of course a "workable" browser needs to make best sense of any rubbish
it's given, but that's a separate problem (and they're hardly likely to
resolve it by fetching DTDs)

XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.

Yet again the lack of practical experience is demonstrated in this
statement.

I've been delivering commercial XML apps since back in the last
century. I have far more XML experience than you, and I'm no doubt
twice your age and have twice the experience in software engineering
too. I'll listen to "lack of experience" claims from Jukka, Alan or
Nick, but not many others in this ng.
Without a DTD with at least the most necessary extra
entities you are simply not able to parse an HTML template.
HTML ? or XHTML ? And what's a "template" ? If you're going to nit
pick, then you need to be precise.
The only entities any XML parser is aware of are amp quot apos lt gt.
Agreed.

However why is this a problem "for parsing into the XML DOM" ? If an
XML parser meets an unrecognised entity, then it's entitled to choke on
it. For that reason entity references (other than those in the TR) are
not commonly used in XML apps, as they are in widespread use in the
SGML world.

As a specific instance, look at the number of RSS feeds around with
HTML (non-XML) entity references occurring in them, and the errors that
causes. If you want to build a "workable" RSS feed parser then you need
to cope with this, because feeds just aren't reliably XML valid if they
encounter an &eacute;
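
A sketch of the failure and of one possible workaround, assuming Python's xml.etree.ElementTree and html.entities. The helper replace_html_entities is hypothetical, not something any feed library is known to ship, and it is deliberately naive: it would also rewrite entity-like text inside CDATA sections.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def replace_html_entities(text):
    """Rewrite HTML named entity references as numeric character references."""
    def numeric(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)   # leave it for the XML parser to judge
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", numeric, text)


# An RSS-like feed with an HTML entity and no DTD declaring it (sample data).
FEED = ('<rss version="2.0"><channel>'
        '<title>Caf&eacute; news</title>'
        '</channel></rss>')

try:
    ET.fromstring(FEED)
    strict_result = "parsed"
except ET.ParseError:
    strict_result = "rejected"

# After the rewrite the same feed is plain well-formed XML.
title = ET.fromstring(replace_html_entities(FEED)).findtext("channel/title")

print(strict_result)
print(title)
```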

As an emergent result, it's possible to do useful work with XML without
ever even producing a DTD or Schema

Only in a narrow set of situations not connected with the Web, see
above.

Hardly "narrow". In fact it's pretty much all the XML in the world
(the web not yet being a widespread XML medium)

and this is generally the way that
commercial XML work is done.

Technically impossible and factually wrong.

Why? I build XML apps all day - I _very_ rarely see a DTD. If I bother
building something at all, it's far more likely to be XML Schema
anyway. Admittedly I don't use entities.
Please follow my advice and
study some real life commercial XML/XSL solution or an XML news feed.
So where is the DTD for an RSS 2.0 news feed ?! (or most other RSS
versions)

If you ever do fetch the DTD the most
common result is to discover that it's actually some years out of date
and is no longer valid against the structure of the live documents.

Here you're coming back to the W3C's bogus DTD used as opaque
strings.

No, I'm talking about widespread non-web XML practice. DTD's just don't
get written. Almost no commercial XML developers even understand their
syntax!
This is the hard choice for you (and the W3C) to make. Either we
agree that there is only one DOCTYPE and only one DTD mechanics equal
to any document where used; or we agree that there is that DOCTYPE
(HTML/XHTML) and this DOCTYPE (XML) with completely different rules and
functionality.

I'd agree with this statement in the context of web browsing. DTDs are
a design and documentation mechanism, no more (in practice, for the
web). HTML's parsing and usage depends on some internal structure
representation within the browser (I can't say any more detail than
this) and there's no reason why that needs to be a DTD, rather than
explicit code. Doctype identifiers on "the web" are thus merely
treated as opaque strings, not URLs to a DTD that needs to be retrieved
(of course any browser may choose to, but most are unlikely to).

May 18 '06 #59

Michael Winter

On 18/05/2006 09:10, VK wrote:

Michael Winter wrote:
If there are no external markup declarations, the standalone
document declaration has no meaning. If there are external
markup declarations but there is no standalone document
declaration, the value "no" is assumed.
-- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)

Bingo! ;-)

Please don't tell me that you still think you were right. One doesn't
need to be an XML expert to realise that the phrase 'the value "no" is
assumed' means quite the opposite of what you wrote.

If that is, somehow, an admission that you were wrong, you might want to
make it a bit more explicit in future.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 18 '06 #60

Andy Dingley <di*****@codesmiths.com> wrote:

And Thomas' exact quote is "It is definitely fetched by XML parsers in
known _WEB_ browsers."
(my emphasis)
Thomas reads way too many W3C specs and he got infected by the W3C's style
where a crystal clear looking affirmative statement contains one or two
words allowing one to interpret it in N different ways :-)

I guess it will be useful for him to see the actual negative effect of
such writing. Let's imagine for a second that the Thomas' post is one
of W3C's paragraphs and we have to retrieve the "original intended
meaning" out of it.

First of all let's bring the quote back to its source state:
<q>It is definitely fetched by XML parsers in known Web browsers.</q>
As you see there is no emphasis of any kind on the word "Web". By
reading the sentence with normal intonation we see nothing but the regular
term "Web browser" written by the rules of English grammar, thus with the
word "Web" capitalized. Yet this sentence leaves a hole - a very narrow
one though - that you just managed to squeeze into. By adding emphasis: "in
known W`eb browsers" one can pretend that there are some Web browsers
and non-Web browsers. That's a nice try (reinforced by the immediately
invented "XML web browsers") but unfortunately it makes no sense in
English. There are only browsers / Web browsers and nothing more. Yet
different Web browsers are capable of rendering different sets of
electronic documents. How could Thomas have avoided this branch of discussion
(and saved me from typing all these words)? By simply saying it in the
technically proper way: "It is definitely fetched by known Web browsers
if served as XML document".
(I wish some of the W3C writers would find this thread :-)
Now I have no idea if this is accurate - I haven't tested XML web
browsers.
Naturally you didn't, as there are no such things. There are Web browsers
with XML parsers; see the rest a bit above.
However it is _not_ the same as saying that the fetch happens for "all
XML parsers". Now from my own direct knowledge I know that much
certainly isn't true. For one thing it can't be true because many of
the world's live XML apps don't even _have_ a DTD (or an accurate DTD)
to fetch.
1) If an XML document is served as an XML document and 2) it contains an
external DTD declaration and 3) the prolog has the flag standalone="no" then
any standard-compliant XML parser is /obligated/ to retrieve all entities
from the linked DTD before starting the validation. All
standard-compliant browsers indeed do this except Firefox due to the
mentioned bug to be fixed. The relevant part of the XML specs was also
linked and quoted several times here, so if it contradicts your
expectations, you may argue with the W3C, not with me.

Which is (once again) totally correct in application to /DTD/ in /XML/
(not pseudo-DTD in pseudo-XML aka XHTML)

XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
not XML at all. XHTML (as XML) is (or should be) perfectly compliant
XML.

This is why I guess they start with an `XML' prolog? No, of course not,
it's just added for fun so XML parsers would have some job to do.

XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.

Yet again the lack of practical experience is demonstrated in this
statement.

I've been delivering commercial XML apps since back in the last
century. I have far more XML experience than you, and I'm no doubt
twice your age and have twice the experience in software engineering
too. I'll listen to "lack of experience" claims from Jukka, Alan or
Nick, but not many others in this ng.

OK - taking "experience" back, rephrase it to "XHTML-free thinking
experience". While dealing with all these XML-looking imitations it is
easy to forget that there is somewhere real XML with its own real
rules. You may notice that I have also had enough of the "no experience /
foggy mumbling" comments. People A, B and C get you to the point, and D
gets all your frustration. Happens in life, happens on Usenet
:-)

Doctype identifiers on "the web" are thus merely
treated as opaque strings, not URLs to a DTD that needs to be retrieved

And: XHTML documents are not XML documents and do not follow
XML-conforming DOCTYPE/DTD rules.

This statement can be used as the conclusive for the discussion unless
someone has more comments.

May 18 '06 #61

Michael Winter wrote:

If there are no external markup declarations, the standalone
document declaration has no meaning. If there are external
markup declarations but there is no standalone document
declaration, the value "no" is assumed.
-- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)

Bingo! ;-)

Please don't tell me that you still think you were right.

Yes I do - and I'm actually surprised that you don't. I read this quote
in your previous post as an admission of your mistake. If we still
tango, then read the relevant discussion Masayasu Ishikawa - Heikki
Toivonen at <https://bugzilla.mozilla.org/show_bug.cgi?id=69799>

May 18 '06 #62

Andy Dingley

VK wrote:

I guess it will be useful for him to see the actual negative effect of
such writing. Let's imagine for a second that the Thomas' post is one
of W3C's paragraphs and we have to retrieve the "original intended
meaning" out of it.
What deconstructivist twaddle is this?

You're in a hole, stop digging.

It might be true that some XML gadgets somewhere sometimes retrieve a
DTD, but it is not a requirement on XML parsing in total. Now stop
saying it is, stop saying you never said that it is, and stop saying
that other people agreed with you.

...or else get yourself a black turtleneck, shave your head and start
posting about transformative hermeneutics instead.

1) If an XML document is served as an XML document and 2) it contains an
external DTD declaration and 3) the prolog has the flag standalone="no" then
any standard-compliant XML parser is /obligated/ to retrieve all entities
from the linked DTD before starting the validation.
This is reasonably correct, however condition 2) is not usually
encountered in XML documents, especially the more trivial ones, and
_unlike_ SGML, XML still has parsing behaviour permitted (nay,
encouraged) in this case.

XML is parseable without a DTD, unless one is required.
SGML requires a DTD and isn't parseable without one.

Which is (once again) totally correct in application to /DTD/ in /XML/
(not pseudo-DTD in pseudo-XML aka XHTML)

What is "pseudo-XML" about XHTML ?

Bizarre behaviour about the processing of XHTML in some web browsers is
not a fault of XHTML, it's a characteristic of the browsers. (nor is
Appendix C pseudo-XML, because it's clearly presenting its XML-like
self as pseudo-SGML instead)

Incidentally, your logical inferencing is wrong too. If we take real
pseudo XML like M$oft's CDF or ASX files, then these are pseudo-XML, but
that's certainly not to say that all pseudo-XML is implied to be one of
these particular formats. You appear to be arguing over points of logic
that your mind, or at least your prose, just isn't adequate for.
XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
not XML at all. XHTML (as XML) is (or should be) perfectly compliant
XML.

This is why I guess they start with `XML' prolog?

XHTML starts with an XML prolog (if it does), because it's claiming to
be XML, and well-formed valid XML at that. Appendix C XHTML doesn't
have an XML prolog.

OK - taking "experience" back, rephrase it to "XHTML-free thinking
experience".
OK then, I've been arguing free-thinking XHTML in this very newsgroup
since some time in the last century. Not always correctly or wisely, I
grant you, but it was early days and I was young and foolish.

While dealing with all these XML-looking imitations it is
easy to forget that there is somewhere real XML with its own real
rules.
All XML has the same rules - that's the point (and its single huge
benefit over SGML). There aren't exceptions for either the web, or for
your crazy imaginings. This stuff may not always be simple, but it is
(for once) written down and fairly clearly readable.

Doctype identifiers on "the web" are thus merely
treated as opaque strings, not URLs to a DTD that needs to be retrieved

And: XHTML documents are not XML documents and do not follow
XML-conforming DOCTYPE/DTD rules.

In what way do XHTML (not Appendix C) documents on the web _not_
conform to the rules?

NB - Documents - not _processing_. Processing is an artefact of the
processors and I'm sure some of them have very weird and unconformant
behaviours -- I know, I've written RSS parsers that have enormous
inferences built-in to try to work around badly formed XML.

This statement can be used as the conclusive for the discussion unless
someone has more comments.

If you want to have the last word, try not to leave it as "Wibble".

May 18 '06 #63

Michael Winter

On 18/05/2006 14:52, VK wrote:

Michael Winter wrote:
[snip]

Please don't tell me that you still think you were right.

Yes I do [...]

Why does that not surprise me. *sigh*
I read this quote in your previous post as an admission of your
mistake.
I quoted from the specification because either you haven't read it, or
you don't understand it.

You stated in <11*********************@g10g2000cwb.googlegroups.com>
that, in the context of XHTML, the standalone document declaration has a
default value of "yes". However, section 2.9 specifies that if there are
external markup declarations (for example, an external subset), but the
standalone declaration is absent, then the default value is "no".
If we still tango, then read the relevant discussion Masayasu
Ishikawa - Heikki Toivonen at
<https://bugzilla.mozilla.org/show_bug.cgi?id=69799>

Their discussion is irrelevant to what you wrote.

Masayasu Ishikawa points out that the non-validating XML processor in
Mozilla reports undefined entities to be well-formedness errors in
non-standalone documents. However, this behaviour is incorrect:
non-validating XML processors should only treat undefined entities as
well-formedness errors in standalone documents.
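
The distinction can be probed with a non-validating parser. This is a sketch using Python's bundled expat bindings, which by default do not retrieve the external subset; the helper parse_error and the sample markup are assumptions for illustration. The three inputs differ only in whether a DOCTYPE with an external identifier is present and whether standalone="yes" is declared.

```python
from xml.parsers import expat


def parse_error(xml_text):
    """Return expat's error message for xml_text, or None if it parses."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


DOCTYPE = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
           '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">')
BODY = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<body><p>a&nbsp;b</p></body></html>')

# 1. No DTD at all: every entity declaration has been seen, nbsp is not one.
no_dtd = parse_error(BODY)

# 2. External subset referenced but not read, standalone clause omitted.
external_default = parse_error(DOCTYPE + BODY)

# 3. Same document, but explicitly declared standalone.
external_standalone = parse_error(
    '<?xml version="1.0" standalone="yes"?>' + DOCTYPE + BODY
)

for label, result in [("no DTD", no_dtd),
                      ("external, default", external_default),
                      ("external, standalone=yes", external_standalone)]:
    print(label, "->", result)
```

Only the second case leaves room for the entity to be declared somewhere the parser has not looked, which is why it is the one case where an undefined reference need not be fatal.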

When he writes, 'even if external DTD subset is present'[1], he is
referring to the fact that in such circumstances, the default for a
standalone document declaration is "no". A fact that you seem to be
unable to comprehend.

Mike
[1] Masayasu Ishikawa, comment #9
<https://bugzilla.mozilla.org/show_bug.cgi?id=69799#c9>

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 18 '06 #64

Michael Winter wrote:

On 18/05/2006 14:52, VK wrote:
Michael Winter wrote:

[snip]
Please don't tell me that you still think you were right.

Yes I do [...]

Why does that not surprise me. *sigh*
I read this quote in your previous post as an admission of your
mistake.

I quoted from the specification because either you haven't read it, or
you don't understand it.

I considered this quote as the statement "I read the relevant part in
full, here is the proof" :-)

Why did you stop reading in the middle then?

<http://www.w3.org/TR/REC-xml/#sec-rmd>
.... three lines below of what you already read:
Validity constraint: Standalone Document Declaration

The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of:

* attributes with default values, if elements to which these
attributes apply appear in the document without specifications of
values for these attributes, or
* entities (other than amp, lt, gt, apos, quot), if references to
those entities appear in the document, or
* attributes with tokenized types, where the attribute appears in
the document with a value such that normalization will produce a
different value from that which would be produced in the absence of the
declaration, or
* element types with element content, if white space occurs
directly within any instance of those types.
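
The first bullet is easier to see with a concrete case. This sketch assumes Python's xml.etree.ElementTree and, because the standard library parser does not read external subsets, puts the ATTLIST declaration in an internal subset instead; the note element and lang attribute are invented. It shows that a default-value declaration changes what the application receives, which is the kind of difference the constraint is about when the declaration lives in an external DTD.

```python
import xml.etree.ElementTree as ET

# Same element, once with a default-value declaration in scope, once without.
WITH_DECLARATION = """<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en">
]>
<note>hello</note>"""
WITHOUT_DECLARATION = "<note>hello</note>"

with_default = ET.fromstring(WITH_DECLARATION).attrib
without_default = ET.fromstring(WITHOUT_DECLARATION).attrib

# A parser that skips the declaration sees a different document.
print(with_default)
print(without_default)
```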

May 18 '06 #65

VK wrote:

Why did you stop reading in the middle then?

<http://www.w3.org/TR/REC-xml/#sec-rmd>
... three lines below of what you already read:
Validity constraint: Standalone Document Declaration

The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of:

* attributes with default values, if elements to which these
attributes apply appear in the document without specifications of
values for these attributes, or
* entities (other than amp, lt, gt, apos, quot), if references to
those entities appear in the document, or
* attributes with tokenized types, where the attribute appears in
the document with a value such that normalization will produce a
different value from that which would be produced in the absence of the
declaration, or
* element types with element content, if white space occurs
directly within any instance of those types.

Yet after deep thinking I admit that it seems not humanly possible to
get a definitive result out of:

"If there are external markup declarations but there is no standalone
document declaration, the value "no" is assumed."
and
"The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of: <snip>".

These sentences sit in two consecutive paragraphs, but only
a real W3C language specialist can tell what they mean. Does it mean
that for the situations spelled out in the MUST section one has to
explicitly set standalone="no"? Or does it mean that for the
situations spelled out in the MUST section the value "no" must be assumed
and in other situations it /may/ be assumed? God damn, I've never seen
such low clarity in such clear-looking text. I guess I need to ask at
<comp.text.xml>; maybe they have already managed to decrypt this fragment.
In practice it was always obvious to me to set standalone="no" if a DTD
is used, without relying on any default values - but of course that doesn't
prove anything.

May 18 '06 #66

Andy Dingley wrote:

You're in a hole, stop digging.

and at the beginning he wrote:
<q>XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.
</q>

and VK continuously stated that:

1) If a DTD is provided for an XML document, it is /obligated/ to fetch it
before proceeding with validation. If for some reason the parser doesn't want
to retrieve the DTD, it is not allowed to validate/invalidate the
document (and thus raise parsing errors).

2) DTDs are widely used in XML/XSL templates to add extra entities which
would otherwise raise parsing errors (this is not the only use of DTDs
but the most common one).

3) A document not following the rule 1) is not XML-conformant.
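
A minimal sketch of point 2, assuming Python's standard xml.etree.ElementTree parser and a made-up entity named "company" (both are illustrative choices, not anything used in this thread). The entity is declared in an internal DTD subset, so nothing external has to be fetched; without the declaration the same reference is reported as a parse error by this parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one extra entity.
WITH_DECL = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE page [\n'
    '  <!ENTITY company "Example Ltd">\n'
    ']>\n'
    '<page>&company;</page>'
)

# Same reference, but no declaration anywhere.
WITHOUT_DECL = '<?xml version="1.0"?>\n<page>&company;</page>'


def text_or_error(xml_text):
    """Return the root element's text, or a short message if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as exc:
        return "parse error: %s" % exc


print(text_or_error(WITH_DECL))
print(text_or_error(WITHOUT_DECL))
```

The first call should yield the expanded entity text; the second should yield a parse error message, since this parser treats the undeclared entity as an error.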

You're in a hole, stop digging - especially after you've made such
a great step forward by admitting that DTDs in HTML/XHTML are formal
opaque strings and that their usage there is not /totally/ the same as
the DOCTYPE/DTD specs say.

Come on! Come out! I gotcha! :-)

May 18 '06 #67

Jack

VK wrote:

1) If a DTD is provided for an XML document, it is /obligated/ to fetch
it before proceeding with validation. If for some reason the parser doesn't
want to retrieve the DTD, it is not allowed to validate/invalidate
the document (and thus raise parsing errors).
VK also wrote:
3) A document not following the rule 1) is not XML-conformant.

Your rule 1) doesn't refer to a document; the first "it" in the first
sentence implicitly refers to a parser. That is, your rule is saying
that a parser must fetch the DTD for a document if it wishes to validate
that document. You can't use rule 1) to conclude that any given document
is or isn't XML-conformant.

To put it more simply: you're talking rubbish again. And your prose is
significantly less clear than certain W3C recommendations.

--
Jack.

May 18 '06 #68

Jack wrote:

VK wrote:

1) If a DTD is provided for an XML document, it is /obligated/ to fetch
it before proceeding with validation. If for some reason the parser doesn't
want to retrieve the DTD, it is not allowed to validate/invalidate
the document (and thus raise parsing errors).
VK also wrote:

3) A document not following the rule 1) is not XML-conformant.

Your rule 1) doesn't refer to a document; the first "it" in the first
sentence implicitly refers to a parser.

The meaning of "it" is pretty clear from the context (besides, how in the
world could a /document/ fetch a DTD? Even if I had written it explicitly
like that, it would be an obvious typo - yet I did not).
That is, your rule is saying
that a parser must fetch the DTD for a document if it wishes to validate
that document. You can't use rule 1) to conclude that any given document
is or isn't XML-conformant.
Here indeed is a bit of W3C style - it must be contagious :-)

"A document using DOCTYPE and a DTD but not expecting them to be
treated by rule 1) is not XML-conformant"

To put it more simply: you're talking rubbish again. And your prose is
significantly less clear than certain W3C recommendations.

No direct comment - see above.

May 18 '06 #69

VK wrote:

Yet after deep thinking I admit that it seems not humanly possible to
get a definitive result out of:

"If there are external markup declarations but there is no standalone
document declaration, the value "no" is assumed."
and
"The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of: <snip>".

These sentences sit in two consecutive paragraphs, but only
a real W3C language specialist can tell what they mean. Does it mean
that for the situations spelled out in the MUST section one has to
explicitly set standalone="no"? Or does it mean that for the
situations spelled out in the MUST section the value "no" must be assumed
and in other situations it /may/ be assumed? God damn, I've never seen
such low clarity in such clear-looking text. I guess I need to ask at
<comp.text.xml>; maybe they have already managed to decrypt this fragment.
In practice it was always obvious to me to set standalone="no" if a DTD
is used, without relying on any default values - but of course that doesn't
prove anything.

<comp.text.xml> comment:
<q>Since it's a constraint attached to the production for standalone
declarations, I think you should take it as "if there is a standalone
declaration, it must have the value "no" if ...". You're certainly not
required to have one.</q>

That's a creative reading of the damaged text fragment (it never crossed
my mind) and it seems to be the only one that makes sense.

So my original statement that the following is not valid XML syntax per se
(without an explicit standalone="no"):

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

- this statement is not correct. I was wrong and you were right.
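
A minimal sketch of the point settled above, assuming Python's standard xml.parsers.expat module; the helper name standalone_info and the tiny XHTML body are made up for illustration. It reads the prolog, reports whether a standalone declaration is present, and applies the section 2.9 default ("no" is assumed when external markup declarations exist and no declaration is given). The external DTD is only detected here, never fetched.

```python
import xml.parsers.expat


def standalone_info(xml_bytes):
    """Report the declared standalone value and whether an external DTD is referenced."""
    info = {"declared": None, "external_dtd": False}

    def on_xml_decl(version, encoding, standalone):
        # expat passes -1 when standalone is omitted, 0 for "no", 1 for "yes".
        info["declared"] = {0: "no", 1: "yes"}.get(standalone)

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # A system identifier means the DOCTYPE points at an external subset.
        info["external_dtd"] = system_id is not None

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)

    # XML 1.0 section 2.9: external markup declarations without a standalone
    # declaration mean the value "no" is assumed.
    if info["declared"] is None and info["external_dtd"]:
        info["assumed"] = "no"
    else:
        info["assumed"] = info["declared"]
    return info


# The prolog from this post, followed by a made-up minimal document body.
XHTML = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    b' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b'<head><title>t</title></head><body/></html>'
)

print(standalone_info(XHTML))
```

For the prolog shown, nothing is declared, an external DTD is referenced, and the assumed value comes out as "no", which matches the reading that the declaration is optional.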

May 19 '06 #70

Toby Inkster

VK wrote:

1) If a DTD is provided for XML document, it is /obligated/ to fetch it
before proceed with validation.

Correct, but misleadingly phrased.

If a DTD is provided for an XML document, the user agent is *not* obliged
to fetch it *unless* it wants to validate the document. (And many user
agents have no interest in validation -- only well-formedness.)
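
A minimal sketch of that distinction, assuming Python's standard xml.etree.ElementTree, which checks well-formedness only and does not validate. The system identifier below is hypothetical and points at a host that cannot resolve, so the parse could not succeed if the DTD had to be fetched.

```python
import xml.etree.ElementTree as ET

# Hypothetical system identifier: the ".invalid" TLD never resolves.
DOC = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE note SYSTEM "http://example.invalid/note.dtd">\n'
    b'<note><to>reader</to></note>'
)


def is_well_formed(xml_bytes):
    """Return True if the bytes parse as well-formed XML; no DTD validation is attempted."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(DOC))
print(is_well_formed(b"<note><to>reader</note>"))
```

The first document is accepted even though its DTD is unreachable; the second is rejected purely because its tags do not nest.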

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

May 19 '06 #71

Andy Dingley

VK wrote:

and VK continuously stated that:

You're talking about yourself in the third person?

We have clear k00k-sign !
(*plonk* - you're just not worth it)

May 19 '06 #72

Andy Dingley <di*****@codesmiths.com> wrote:

VK wrote:
and VK continuously stated that:

You're talking about yourself in the third person?

We have clear k00k-sign !

(*plonk* - you're just not worth it)

Andy Dingley wrote a while ago:
<q>XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and
SGML.</q>

Not one statement in this quote turned out to be correct. The only correct
statement you were later forced to make was that DTDs in HTML/XHTML are opaque
strings and that they have different functionality than described in the XML
specs. Since you have switched to side topics about my style and the
semantics of my posts, I presume that was the maximum compromise you
are ready to make. I don't dare push you any further.

May 19 '06 #73

Michael Winter

On 18/05/2006 17:26, VK wrote:

Michael Winter wrote:
[snip]

I quoted from the specification because either you haven't read it,
or you don't understand it.

I considered this quote as the statement "I read the relevant part in
full, here is the proof" :-)

Why did you stop reading in the middle then?

I didn't, but I quoted only what was relevant.

[snip]
Validity constraint: Standalone Document Declaration

The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of:

[snip]

And what bearing do you think that has on your assertion that the
default value is "yes"?

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 19 '06 #74

Michael Winter wrote:

And what bearing do you think that has on your assertion that the
default value is "yes"?

No, it was my mistake I admitted in the previous post. Should I do it
again? /I was wrong/

In the case like:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

in an XML-conformant document the standalone value in the prolog is assumed
to be "no" and /cannot/ be set to "yes" (per the "Validity constraint"
section at <http://www.w3.org/TR/REC-xml/#sec-rmd>)

May 19 '06 #75

Michael Winter

On 19/05/2006 07:42, VK wrote:

[Regarding validity constraint, section 2.9, XML 1.0]

<comp.text.xml> comment:
<q>Since it's a constraint attached to the production for standalone
declarations, I think you should take it as "if there is a standalone
declaration, it must have the value "no" if ...". You're certainly not
required to have one.</q>

That's a creative reading of the damaged text fragment [...]

That would be a reasonable reading of English.

First, section 2.9 starts by defining what constitutes an 'external
markup declaration'. It then follows with a description of what a
standalone document declaration represents, its relationship with the
previous definition, and the default value of the declaration. Finally,
it sets out when the value must be 'no' for a document to be valid.

[snip]
I was wrong and you were right.

Thank you for acknowledging that. However, I think I'm past the point of
caring (actually, I think I reached that point a while ago).

I'm fed up of banging my head against the wall, trying to make you see
sense. Don't be surprised if I ignore any discussions you intend to
start with me, even if I reply with corrections to your posts. They will
be for the benefit (or protection) of other readers, not you. You just
aren't worth the effort or the frustration any more.

A cop-out? Perhaps, but it's better than the alternative.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

May 19 '06 #76

Michael Winter wrote:

I'm fed up of banging my head against the wall, trying to make you see
sense. Don't be surprised if I ignore any discussions you intend to
start with me, even if I reply with corrections to your posts. They will
be for the benefit (or protection) of other readers, not you. You just
aren't worth the effort or the frustration any more.

A cop-out? Perhaps, but it's better than the alternative.

I'm sorry if I caused such frustration - "showing off" was neither my
primary nor my secondary purpose. I just wanted to show that
DOCTYPE/DTD in HTML/XHTML documents are formal opaque strings and that they
are only distantly related to the DOCTYPE/DTD used in XML-conformant documents
and described in the W3C specs. As simple as that - yet it seemed to be treated
as a non-disclosure taboo. I really wonder what the response to
the recent
<http://groups.google.com/group/comp.infosystems.www.authoring.html/browse_frm/thread/06e7a7fe8050da5b/a62f3c8bd54ede47?hl=en#a62f3c8bd54ede47>
would have been without this thread.

At the same time I lack a lot of professional knowledge of XML and
XSLT, I am definitely weak at reading W3C docs, and my English may fall apart
- especially after midnight.

cop-out

May 19 '06 #77

Henri Sivonen

In article <11*********************@g10g2000cwb.googlegroups. com>,
"VK" <sc**********@yahoo.com> wrote:

Currently Firefox cannot load external DTD's at all.

That's not true.

Firefox can load external DTDs if the system ID has the chrome URI
scheme. Loading external DTDs has, by design, been prevented for other
URI schemes.

This is a nasty
bug, but to fix it properly they have to solve somehow the problem with
the bogus DTD from W3C.

It is a feature. The XML spec, by design, allows not processing the
external DTD subset. The rationale for the spec feature was browsers. It
would be simply stupid for browsers to suffer the troubles of external
entities when the spec gives a way out.

Mozilla does do a dirty trick in this area, which causes error messages
to cite the wrong reason in a specific case. However, when an error is
displayed, it is legitimate per spec (even if the reason stated is not
legitimate).

--
Henri Sivonen
hs******@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

May 20 '06 #78
