DTDs, www & validation

P: n/a
I've been wandering around the results of numerous googles for several
hours without reaching a conclusive solution, so I'm dipping a tentative
toe back in ciwah...

I've been persuaded here in the past that serving xhtml is a bad thing (tm).

I want the extra constraints xhtml imposes.

It has been suggested before to create a DTD which requires these &
includes all the requirements of HTML 4.01 & validate against that.

I've downloaded such a document from
http://www.spartanicus.utvinternet.i...1-stricter.dtd which I'm
sure looks familiar to all the regulars here.

My questions are pretty simple to state:

How do I make use of this to achieve the above?

I know I can declare <!DOCTYPE HTML SYSTEM
"http://www.spartanicus.utvinternet.ie/html401-stricter.dtd">
(though I'd first either enquire as to whether I should use this url or
duplicate the file on my own server).

However I'm concerned as to whether some browsers will do something
undesirable in their interpretation of my pages when they see this,
rather than a standard public declaration.

Also, http://www.spartanicus.utvinternet.ie/no-xhtml.htm which links to
the above DTD, declares itself as HTML 4.01.

This leads me to wonder, am I supposed to use the !DOCTYPE HTML SYSTEM
form only for development validation & then replace this with a !DOCTYPE
HTML PUBLIC referring to HTML 4.01 prior to the site going live?

However, if that is what one should do, why not just declare an XHTML
1.0 doctype while developing & then replace this with the html 4.01 type
prior to going live to achieve an equivalent effect?

Is there an ideal solution?

Along the way, which resources do the readers here use for validation?
Software installed locally & if so, what, or the online services of w3c
or wdg?

Lastly, for a current project I'm forced to use DreamweaverMX2004; can
it be persuaded to validate against a custom dtd?

Perhaps there's a different & better solution to this whole problem.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #1
81 Replies


P: n/a
Michael Rozdoba <mr**@nowhere.invalid> wrote:
I've downloaded such a document from
http://www.spartanicus.utvinternet.i...1-stricter.dtd which I'm
sure looks familiar to all the regulars here.

My questions are pretty simple to state:

How do I make use of this to achieve the above?
I merely configured my local validator to use my custom DTD rather than the
W3C's DTD when it sees the W3C's standard DOCTYPE declaration. My custom
DTD is more restrictive than the W3C's DTD, so if a document validates
against my DTD, then it will validate against the W3C's DTD as advertised.
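
For readers who have not set up such an override before, here is a minimal sketch of one way to do it, assuming an SP-based validator such as nsgmls that reads an SGML Open catalog. The file names catalog, HTML4.dcl and html401-stricter.dtd are placeholders for local copies, not names that any particular validator ships with:

```sgml
  -- let PUBLIC entries win even when the DOCTYPE also gives a system identifier --
OVERRIDE YES

  -- SGML declaration to use for documents that do not carry their own --
SGMLDECL "HTML4.dcl"

  -- map the W3C's public identifier to the stricter local DTD --
PUBLIC "-//W3C//DTD HTML 4.01//EN" "html401-stricter.dtd"
```

With a catalog like this, something along the lines of "nsgmls -s -c catalog page.html" should check page.html against the local DTD while the page itself keeps the standard DOCTYPE; the exact options may differ between SP versions.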

Technically, if your custom DTD allows markup that the W3C's DTD doesn't
allow (e.g., http://www.cs.tut.fi/~jkorpela/html/strict.dtd), then you're
guilty of false advertizing if you use one of the W3C's standard DOCTYPE
declarations. But I think that's the lesser evil in these days of browsers
with DOCTYPE sniffing.

FWIW, I don't bother using the DTD to enforce XML-like requirements that
all attributes be quoted, that optional closing tags be used, etc. For
that, I use a tool that adds quotes, closing tags, etc.:
http://www.jclark.com/sp/spam.htm
Along the way, which resources do the readers here use for validation?


I've got a copy of nsgmls installed locally:
http://www.jclark.com/sp/nsgmls.htm

I use makefiles to run an HTML preprocessor and other programs to
automatically generate and validate HTML documents from preprocessor
source, data files, etc.
--
Darin McGrew, mc****@stanfordalumni.org, http://www.rahul.net/mcgrew/
Web Design Group, da***@htmlhelp.com, http://www.HTMLHelp.com/

"Cheaters never win; they just finish first." - Johnny Hart
Jul 23 '05 #2

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> writes:
I want the extra constraints xhtml imposes.
Uh, all of them?

<http://groups-beta.google.com/group/alt.html/msg/bab970becd5a41d3>

might answer some of your questions, also some you forgot (groups-beta
is bloody annoying, BTW).
This leads me to wonder, am I supposed to use the !DOCTYPE HTML SYSTEM
form only for development validation & then replace this with a
!DOCTYPE HTML PUBLIC referring to HTML 4.01 prior to the site going
live?
No; you can just use the public identifier for HTML 4.01, and override
that locally. If you haven't, you should look into (X)Emacs which
allows for using your private DTD for the editing part, not just after
the fact control.
Lastly, for a current project I'm forced to use DreamweaverMX2004; can
it be persuaded to validate against a custom dtd?


Since when does Dreamweaver ship with a validating system?
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #3

P: n/a
Darin McGrew wrote:
I merely configured my local validator to use my custom DTD rather
than the W3C's DTD when it sees the W3C's standard DOCTYPE
declaration.
That was why I asked a later question about where posters validate -
local looks like a better solution as one has more control. This seems
much more practical.
My custom DTD is more restrictive than the W3C's DTD, so if a
document validates against my DTD, then it will validate against the
W3C's DTD as advertised.
Indeed.
Technically, if your custom DTD allows markup that the W3C's DTD
doesn't allow (e.g., http://www.cs.tut.fi/~jkorpela/html/strict.dtd),
then you're guilty of false advertizing if you use one of the W3C's
standard DOCTYPE declarations.
Quite - not something I'd do intentionally; anyhow, I'm sure I'm
incompetent enough to make up for that with other errors.
But I think that's the lesser evil in these days of browsers with
DOCTYPE sniffing.

FWIW, I don't bother using the DTD to enforce XML-like requirements
that all attributes be quoted, that optional closing tags be used,
etc. For that, I use a tool that adds quotes, closing tags, etc.:
http://www.jclark.com/sp/spam.htm
Interesting. Thanks, I'll look into that, though I think I'd rather be
in the habit of writing them myself. As long as I am writing code by
hand I'd rather write it the way I mean it to end up, iyswim.
Along the way, which resources do the readers here use for
validation?

I've got a copy of nsgmls installed locally:
http://www.jclark.com/sp/nsgmls.htm


That I definitely will pay a visit to shortly.
I use makefiles to run an HTML preprocessor and other programs to
automatically generate and validate HTML documents from preprocessor
source, data files, etc.


Sounds like fun. I think I'll not get into that for the time being - got
enough on my plate atm. Cheers.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #4

P: n/a
Eric B. Bednarz wrote:
Michael Rozdoba <mr**@nowhere.invalid> writes:

I want the extra constraints xhtml imposes.

Uh, all of them?


Erm, probably not - bad choice of wording. The only ones I had in mind
were closing of elements & quoting of attributes. Though I have no
immediate objections to any constraints which force greater separation
of content & presentation. Does xhtml impose additional constraints
outside of those?
<http://groups-beta.google.com/group/alt.html/msg/bab970becd5a41d3>

might answer some of your questions, also some you forgot
Thanks :)
(groups-beta
is bloody annoying, BTW).
I've not made my mind up - without getting too OT, what's annoying you
the most?
This leads me to wonder, am I supposed to use the !DOCTYPE HTML SYSTEM
form only for development validation & then replace this with a
!DOCTYPE HTML PUBLIC referring to HTML 4.01 prior to the site going
live?

No; you can just use the public identifier for HTML 4.01, and override
that locally.


Which again implies local validation. I'll try to sort that out now.
If you haven't, you should look into (X)Emacs which
allows for using your private DTD for the editing part, not just after
the fact control.


I wonder if I can do that with vim? I'm using vim not as I know it yet
but because I like it & am trying to force myself to learn to use it ;)
I needed a good editor which doesn't require a gui; forget why I plumped
for vi et al rather than emacs.

That all said, I'm required to use DW & also need to use it in order to
learn how to use it - as part of a course module - for my current
project, so the more closely I can integrate a validation solution with
DW the better.

I've already had to edit a couple of its config files manually to get it
to stop choosing transitional dtds which afaic is wasting my time - not
as friendly as I was told it is. That said, the code it produces doesn't
look too bad & it seems to let me stay in control most of the time.
Lastly, for a current project I'm forced to use DreamweaverMX2004; can
it be persuaded to validate against a custom dtd?

Since when does Dreamweaver ship with a validating system?


I can't claim for certain that it does; however, it does allow one to
validate documents one way or another. Also, I've read some third-party
documentation which implies it has its own validation code.

If I wasn't in the middle of several downloads I'd kill my net
connection & try validating a few files. Will do that later.

This is in respect of DW MX2004 v7.0 btw.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #5

P: n/a
In article <41**********************@news.zen.co.uk>,
Michael Rozdoba <mr**@nowhere.invalid> writes:
I've been persuaded here in the past that serving xhtml is a bad thing (tm).
I'd be inclined to call it neutral in the real world. Neither good nor
bad (unless you get carried away).
I want the extra constraints xhtml imposes.
You can get that very easily.
See http://valet.webthing.com/page/parsemode.html
It has been suggested before to create a DTD which requires these &
includes all the requirements of HTML 4.01 & validate against that.


That's the hard way, and leaves you at the mercy of browser quirks.

--
Nick Kew

Nick's manifesto: http://www.htmlhelp.com/~nick/
Jul 23 '05 #6

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> wrote:
I know I can declare <!DOCTYPE HTML SYSTEM
"http://www.spartanicus.utvinternet.ie/html401-stricter.dtd">
(though I'd first either enquire as to whether I should use this url or
duplicate the file on my own server).

However I'm concerned as to whether some browsers will do something
undesirable in their interpretation of my pages when they see this,
rather than a standard public declaration.
Unlikely (I'm not aware of any that do), but it's possible.
Also, http://www.spartanicus.utvinternet.ie/no-xhtml.htm which links to
the above DTD, declares itself as HTML 4.01.

This leads me to wonder, am I supposed to use the !DOCTYPE HTML SYSTEM
form only for development validation & then replace this with a !DOCTYPE
HTML PUBLIC referring to HTML 4.01 prior to the site going live?


As you've realised, it revolves around how you validate. If you want to
use one of the online validators then you'd have to use a custom system
doctype and upload the DTD to your server. (Linking to a DTD on someone
else's server is not nice, and it makes you dependent on that server.)

If you use a local validator then for DTDs that use a subset of the
public DTD, (no new elements), you can use the public doctype and
override the location of the DTD locally to a local DTD. (I've elected
to omit the uri from the doctype declaration but this isn't the best
way).
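
To make the two spellings concrete, this is the standard HTML 4.01 Strict declaration with and without the system identifier. The two declarations are alternatives, not lines of one file, and which DTD a local validator actually loads for either of them depends on its own catalog setup:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- above: public identifier plus system identifier (the URI) -->

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<!-- above: public identifier only, URI omitted -->
```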

Personally I use ARV as a local Windows validator (
http://www.arealvalidator.com ) Note that it has a limitation: it can
only validate files on the local file system, not on a server.

From the link Eric provided you will have understood that certain
differences between XHTML and HTML like mandatory attribute quoting and
element case sensitivity are not governed by the DTD but by the SGML
stuff that goes with it.

I'll expand the bit on validating against a custom DTD on the no-xhtml
document later.

--
Spartanicus
Jul 23 '05 #7

P: n/a
Nick Kew wrote:
In article <41**********************@news.zen.co.uk>,
Michael Rozdoba <mr**@nowhere.invalid> writes:

I've been persuaded here in the past that serving xhtml is a bad thing (tm).

I'd be inclined to call it neutral in the real world. Neither good nor
bad (unless you get carried away).


Last time I innocently tried that around here I got an intellectual
kneecapping :) - you wouldn't be suggesting this isn't the real world
would you? ;)
I want the extra constraints xhtml imposes.


You can get that very easily.
See http://valet.webthing.com/page/parsemode.html


Yes, that popped up recently. Another handy check.
It has been suggested before to create a DTD which requires these &
includes all the requirements of HTML 4.01 & validate against that.


That's the hard way, and leaves you at the mercy of browser quirks.


You mean if a document is made public with such a DTD still in place?
That was what I feared :/

Ah well, no harm. I've gone down the suggested route of installing local
validation via "A Real Validator 1.11" & used it to override the HTML
4.01 DTD with http://www.spartanicus.utvinternet.i...1-stricter.dtd

I think this gives me most of what I want. Though atm I see it won't let
me close empty tags. Is that a bad thing to want to do with something
claiming to be HTML 4.01? I'd rather have my docs in a format as close
to XML as possible, whilst still being valid HTML 4.01.

If it can be done, any pointers to altering the DTD appropriately? If
it's a case of RTFM, any ptrs to that? Cheers.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #8

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> wrote:
I think this gives me most of what I want. Though atm I see it won't let
me close empty tags. Is that a bad thing to want to do with something
claiming to be HTML 4.01? I'd rather have my docs in a format as close
to XML as possible, whilst still being valid HTML 4.01.


Closing empty tags makes it invalid HTML. Well, sorta, depending on the
empty tag. But if it *is* valid HTML, then it isn't what you wanted. For
example, in HTML,

<img src=... alt=... />

is technically equivalent to

<img src=... alt=...>>

or

<img src=... alt=...>&gt;

It doesn't matter that much for tags like <img> or <br> that appear in the
BODY: You just get a few '>' characters scattered here and there. But for
tags like <link> that appear in the HEAD, it causes problems, because the
text content (the extra '>' characters) closes the HEAD.
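
A small sketch of that HEAD case, assuming a strictly conforming SGML parser and the Transitional DTD (where text is allowed directly in BODY); style.css and the author name are made up for illustration:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>NET example</title>
<link rel="stylesheet" href="style.css" />
<!-- A strict SGML parser ends the LINK start tag at the "/" and reads the
     following ">" as character data. Character data cannot appear in HEAD,
     so the end of HEAD and the start of BODY are implied at that point. -->
<meta name="author" content="A. N. Author">
<!-- META now falls inside BODY, where it is not allowed, so the validator
     reports errors here and at the explicit HEAD end tag and BODY start tag. -->
</head>
<body>
<p>Text</p>
</body>
</html>
```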

Appendix C compatibility relies on browsers ignoring this bit of SGML-based
trivia. Most do, but not all.
--
Darin McGrew, mc****@stanfordalumni.org, http://www.rahul.net/mcgrew/
Web Design Group, da***@htmlhelp.com, http://www.HTMLHelp.com/

"You can't strengthen the weak by weakening the strong."
Jul 23 '05 #9

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> wrote:
Though atm I see it won't let
me close empty tags. Is that a bad thing to want to do with something
claiming to be HTML 4.01?
Of course it is, it's not allowed under HTML.
I'd rather have my docs in a format as close
to XML as possible, whilst still being valid HTML 4.01.


Feel free to author in X(HT)ML if it has a real benefit to you (the
ability to use XML tools on the data for example), you should then use
those XML tools to generate HTML from the X(HT)ML and serve that to
clients.

--
Spartanicus
Jul 23 '05 #10

P: n/a
On Fri, 10 Dec 2004 21:27:33 +0000, Michael Rozdoba
<mr**@nowhere.invalid> wrote:
I want the extra constraints xhtml imposes.


It doesn't. XHTML is (by design) a literal transcoding of HTML 4.01
(both Strict and Transitional) from HTML's almost-SGML into XML.
There are _no_ extra constraints imposed by XHTML.

Now if you want to do this (there are negligible good reasons to do
so), then just use XHTML. Or you could invent your own XML-flavour of
HTML 4.01 and use that with its own DTD, but this would have to be
equivalent to XHTML anyway.

XHTML causes problems for serving to browsers (allegedly) because of
its XML nature, not because of its DTD. Any "constraint" you might
wish to gain can only be this same added XML nature, and so you might
as well switch wholesale to XHTML.

XHTML does not modify the HTML 4.01 DTD. If you wanted to do this
(maybe add the <blink> element), then you could certainly do so, but
you'd do that by a custom DTD and could stay within the bounds of
SGML-style HTML.

--
Smert' spamionam
Jul 23 '05 #11

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> writes:
Eric B. Bednarz wrote:
(groups-beta is bloody annoying, BTW).


I've not made my mind up - without getting too OT, what's annoying you
the most?


Legacy URI references to articles are redirected to threads (WTF is
that? It even seems like more work).

To view a single message I have to choose options, and then individual
message (one action for the price of two, how cool); once I'm there, to
go to the thread I have to choose options and, erm, get a link to the
message that I'm currently viewing. If I didn't come from the thread,
dead end.

What's the point of killing the framed tree view when I get a pointless
sidebar instead of some screen estate?

How many web designers does it take to install a light bulb, or combine
impenetrable table-salad with all the disadvantages of CSS layout?

<http://sandbox.bednarz.nl/.swap/beta.png> (20.5KiB)
I wonder if I can do that with vim?
Not that I know of.
I'm using vim not as I know it yet
but because I like it
Oh, but Vim *is* a much better text-editor than Emacs. To write HTML
from scratch, I'd almost always use Vim; but to edit Document Types I'm
not familiar with I'd almost always use Emacs. The same goes for minor
changes, because I'm usually in it anyway because I use it as (local and
remote) file manager, email and news client and to have my cup of tea.
forget why I plumped for vi et al rather than emacs.


Maybe your hands aren't big enough to perform most of the required
keyboard longcuts. :)
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #12

P: n/a
"Andy Dingley" <di*****@codesmiths.com> wrote in message
news:6f********************************@4ax.com...
On Fri, 10 Dec 2004 21:27:33 +0000, Michael Rozdoba
<mr**@nowhere.invalid> wrote:
I want the extra constraints xhtml imposes.

There are _no_ extra constraints imposed by XHTML.


Not correct. E.g., xHTML requires end tags where in HTML they are optional.

Jul 23 '05 #13

P: n/a
Eric B. Bednarz wrote:
Michael Rozdoba <mr**@nowhere.invalid> writes:
[annoying google groups beta]
Legacy URI references to articles are redirected to threads (WTF is
that? It even seems like more work).
Ah yes, I did notice that. Thankfully I usually want thread views, but
now I see what you mean.

[snip critique]

Um, it is beta ;)

I hope they're getting informative feedback.
How many web designers does it take to install a light bulb, or combine
impenetrable table-salad with all the disadvantages of CSS layout?

<http://sandbox.bednarz.nl/.swap/beta.png> (20.5KiB)
Ouch :/

[vim/emacs]
Oh, but Vim *is* a much better text-editor than Emacs. To write HTML
from scratch, I'd almost always use Vim; but to edit Document Types I'm
not familiar with I'd almost always use Emacs. The same goes for minor
changes, because I'm usually in it anyway because I use it as (local and
remote) file manager, email and news client and to have my cup of tea.


LOL. I guess if I'm doing this enough I'll eventually learn which tool
does the best job for each task.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #14

P: n/a
Darin McGrew wrote:
Closing empty tags makes it invalid HTML. Well, sorta, depending on the
empty tag.
[snip]
Appendix C compatibility relies on browsers ignoring this bit of SGML-based
trivia. Most do, but not all.


I was aware that it usually works, but not that this is the result of
what looks like a fluke. I don't mind bending rules to meet my desires,
but I'm not keen on breaking them without very good reason. Thanks.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #15

P: n/a
Spartanicus wrote:
Michael Rozdoba <mr**@nowhere.invalid> wrote:
Though atm I see it won't let me close empty tags. Is that a bad
thing to want to do with something claiming to be HTML 4.01?


Of course it is, it's not allowed under HTML.


I do seem able to ask really stupid questions sometimes.
I'd rather have my docs in a format as close to XML as possible,
whilst still being valid HTML 4.01.


Feel free to author in X(HT)ML if it has a real benefit to you (the
ability to use XML tools on the data for example), you should then
use those XML tools to generate HTML from the X(HT)ML and serve that
to clients.


Yes, understood. It's clear I ought to be serving HTML 4.01 & this must
be valid. I like XHTML, but browsers can't generally handle it when
served correctly.

The above are the only reasonable courses of action. For now I'll stick
with HTML 4.01. When I need to automate handling of larger volumes of
data I'll use XHTML & generate HTML from it either prior to serving or
before uploading to the server.

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #16

P: n/a
C A Upsdell:
xHTML requires end tags where in HTML they are optional.


More precisely, XHTML requires all elements are closed. There's an
important difference.
Jul 23 '05 #17

P: n/a
Neal <ne*****@yahoo.com> writes:
C A Upsdell:
xHTML requires end tags where in HTML they are optional.


More precisely, XHTML requires all elements are closed. There's an
important difference.


Please don't hesitate to elaborate on that.
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #18

P: n/a
Eric B. Bednarz:
Neal:
C A Upsdell:
xHTML requires end tags where in HTML they are optional.

More precisely, XHTML requires all elements are closed. There's an
important difference.

Please don't hesitate to elaborate on that.


Sure.

<p>text</p> is a case where the element is closed, and the required end
tag is included. As long as we're talking about non-empty elements, the
two descriptions are similar enough.

<img ... /> is a closed empty element. The term "end tag" (which clearly
implies there is a "start tag") implies that <img ...></img> would be
correct. While this can be valid XML, it is not appropriate for XHTML. The
img element does not take an end tag - the one tag used to mark that
element is inherently closed.

It's much more useful in this context to think of elements rather than
tags. All elements require closure, but not all tags must appear in
start/end pairs.
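
The spellings under discussion can be put side by side. This XHTML fragment is only an illustration; the file name and alt text are invented:

```html
<!-- a non-empty element with its required end tag -->
<p>text</p>

<!-- an empty element written with the minimized (empty-element) tag -->
<img src="photo.png" alt="A photo" />

<!-- the same empty element written as a start tag plus an end tag:
     well-formed XML, but Appendix C of XHTML 1.0 advises against this
     form for documents sent to HTML browsers -->
<img src="photo.png" alt="A photo"></img>
```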
Jul 23 '05 #19

P: n/a
Eric B. Bednarz <be*****@fahr-zur-hoelle.org> wrote:
Neal <ne*****@yahoo.com> writes:
C A Upsdell:
xHTML requires end tags where in HTML they are optional.


More precisely, XHTML requires all elements are closed. There's an
important difference.


Please don't hesitate to elaborate on that.


An end tag is </foo>

A closed element (in XHTML) is either <foo>...</foo> _OR_ <foo/>.

Steve

Jul 23 '05 #20

P: n/a
Spartanicus wrote:
Michael Rozdoba <mr**@nowhere.invalid> wrote:

I know I can declare <!DOCTYPE HTML SYSTEM
"http://www.spartanicus.utvinternet.ie/html401-stricter.dtd">
(though I'd first either enquire as to whether I should use this url or
duplicate the file on my own server).

However I'm concerned as to whether some browsers will do something
undesirable in their interpretation of my pages when they see this,
rather than a standard public declaration.

Unlikely (I'm not aware of any that do), but it's possible.


My main concern was would IE end up in quirks mode.
Also, http://www.spartanicus.utvinternet.ie/no-xhtml.htm which links to
the above DTD, declares itself as HTML 4.01.

This leads me to wonder, am I supposed to use the !DOCTYPE HTML SYSTEM
form only for development validation & then replace this with a !DOCTYPE
HTML PUBLIC referring to HTML 4.01 prior to the site going live?

As you've realised it revolves around how you validate, if you want to
use one of the online validators then you'd have to use a custom system
doctype and upload the DTD to your server. (linking to a DTD on someone
else's server is not nice


I wouldn't do that unless given permission first.
and it makes you dependent on that server)
Only sensible if that server is much more reliable than yours... erm, but
if yours were down, users wouldn't be able to fetch the HTML, never mind
the DTD. Oops, forget I thought that.
If you use a local validator then for DTDs that use a subset of the
public DTD, (no new elements), you can use the public doctype and
override the location of the DTD locally to a local DTD. (I've elected
to omit the uri from the doctype declaration but this isn't the best
way).
I seem to recall if you give the full uri it triggers certain alignment
bugs in IE 6 (I came across this when looking at articles discussing
modifying Dreamweaver config files recently):

http://www.macromedia.com/devnet/mx/...standards.html
http://www.dwfaq.com/IE6/

Some of the other info is dodgy, so I don't know how reliable that is.

What are the problems of giving no uri?
Personally I use ARV as a local Windows validator (
http://www.arealvalidator.com ) Note that it has a limitation: it can
only validate files on the local file system, not on a server.
Indeed. It seemed to be the only option I could find (a few other
related possibilities did turn up).
From the link Eric provided you will have understood that certain
differences between XHTML and HTML like mandatory attribute quoting and
element case sensitivity are not governed by the DTD but by the SGML
stuff that goes with it.


I didn't take that all in last night... Ah right, okay. I've made the
corresponding changes to html.soc & HTML4.dcl now & attribute value
quoting is now required.

That just leaves element case sensitivity...

Hmm, reading, though not necessarily understanding,
http://home.chello.no/~mgrsby/sgmlintr/chapter3.htm

Leads me to the following guess -

If I want to require elements & attributes to be lowercase the following
is necessary & sufficient:

In the .dcl specify SYNTAX NAMING NAMECASE GENERAL as NO, and in the
dtd, specify all element & attribute names in lowercase.

Is that at all close to the mark?
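
As a rough sketch of what that guess amounts to, assuming HTML4.dcl and the stricter DTD are the files being edited. Both fragments are illustrative excerpts rather than complete files, and have not been tested against any particular validator:

```sgml
-- excerpt from the SGML declaration (HTML4.dcl), inside SYNTAX ... NAMING --
-- the HTML 4.01 original reads NAMECASE GENERAL YES --
         NAMECASE GENERAL NO
                  ENTITY  NO

<!-- excerpt from the DTD: names spelled in lowercase, and "- -" in place
     of "- O" so that the end tag may not be omitted -->
<!ELEMENT p - - (%inline;)*>
<!ATTLIST p %attrs;>
```

If names become case-sensitive, the document type name in the DOCTYPE declaration presumably has to be spelled html as well, since it must match the root element declared in the DTD. Parameter entities such as %inline; would also need the element names inside them lowercased, while keywords such as ELEMENT and ATTLIST stay as they are.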

Either way, editing the dtd in respect of the latter could easily result
in a few errors. Does anyone have a copy of the html 4.01 dtd modified
in this respect /&/ to enforce closing of elements?

--
Michael
m r o z a t u k g a t e w a y d o t n e t
Jul 23 '05 #21

P: n/a
Neal <ne*****@yahoo.com> wrote:
Eric B. Bednarz:
Neal:
C A Upsdell:
xHTML requires end tags where in HTML they are optional.
More precisely, XHTML requires all elements are closed. There's
an important difference. Please don't hesitate to elaborate on that.


Sure.

<p>text</p> is a case where the element is closed,


In the case <p>text<p>foo</p>, the first element is closed as well.
and the required end tag is included.
You mean _XML_ and hence XHTML requires the end tag. But this is a
different issue. You should have written "XHTML requires that all
elements be explicitly closed with end tags". Which is, in fact, a bit
misleading too.
As long as we're talking about non-empty
elements, the two descriptions are similar enough.
What two descriptions? If you mean HTML 4.01 and XHTML, then there
_are_ differences for non-empty elements.
<img ... /> is a closed empty element.
In XHTML, that is. In HTML, by the specifications, it is a closed empty
element followed by the greater than character.
The term "end tag" (which
clearly implies there is a "start tag") implies that <img
...></img> would be correct.
Does it? Anyway, it _is_ correct (though not recommended) in XHTML (and
not in HTML 4.01).
The img element does not take an end tag
It does, in XHTML, though you can (and are recommended to) use the same
tag as both opening and closing.
It's much more useful in this context to think of elements rather
than tags. All elements require closure, but not all tags must
appear in start/end pairs.


Well, that's what I was about to say. But I think you first said
something different.

Remember that the "empty element" stuff is just an obscure oddity
plugged into a markup language. Don't try to find much logic in it.
The magic slash in <img .../> is there to make the world safe for
abstract parsing. Empty elements were introduced into HTML by mistake
(details on this at http://www.cs.tut.fi/~jkorpela/html/empty.html )
and the second mistake was to transmogrify HTML into something
nominally XML based without getting rid of the empty stuff.
We need to live with that, but we don't need to believe all the related
cargo cult histories.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #22

P: n/a
Neal <ne*****@yahoo.com> writes:
<img ... /> is a closed empty element.
That's a NET-enabled start-tag ('<img ... /') shut up by a null
end-tag ('>').
The term "end tag" (which
clearly implies there is a "start tag")
Oh, you already knew that.
implies that <img ...></img>
would be correct.
It *is* correct.
While this can be valid XML, it is not appropriate
for XHTML.
Valid, well-frowned, recommended, good practice, and now even
appropriate. My head is about to explode from all this stuff.
The img element does not take an end tag - the one tag used
to mark that element is inherently closed.
It is *not one tag* and there is no such thing like -- or as (you like
it) -- 'inherently closed tags'.
It's much more useful in this context to think of elements rather than
tags.


No kidding.

To play back your ball, it's much more useful to think of syntactical
components in terms of their contextual functionality first, rather than
jumping to conclusions from their mere appearance as characters just
because they seem to remotely resemble something slightly familiar 'and
never did that before' back then ("Uncle Harry is dead." "No really, you
don't say; he never did *that* before, I can tell you.").
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #23

P: n/a
On Sat, 11 Dec 2004 22:22:03 +0000, Steve Pugh <st***@pugh.net> wrote:
An end tag is </foo>
A closed element (in XHTML) is either <foo>...</foo> _OR_ <foo/>.


Yea; sad isn't it ?

--
Rex
Jul 23 '05 #24

P: n/a
"Jukka K. Korpela" <jk******@cs.tut.fi> writes:

[I was taking for granted that we are talking (about the SGML
declaration of) XML here]
The magic slash in <img .../> is there to make the world safe for
abstract parsing.


s/abstract parsing/real-world tag-soup slurping/

The *magic token* here is '>'; <foo/> is nothing but the XMLised Annex-K
unbundled shorttag notation of what <foo// is for the reference concrete
syntax.

(I'm a bit surprised and you owe me a beer, which means nothing more but
that I owe you one less, you do the math yourself :)
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #25

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> writes:
My main concern was would IE end up in quirks mode.
I see that as a *requirement* for productive sites, not as a concern (so
does msdn, BTW, and they *are* the authoritative source for their
homegrown products).
What are the problems of giving no uri?
What are the problems of giving no document type declaration at all?
Anyway, the canonical form is
mdo("<!"),"DOCTYPE",+ps,document type name,*ps,mdc(">")
Everything else is voodoo^Woptional.
In practical setups the system identifier might be an issue if you also
(or better: only) use the transitional FPI.

Since even Mozilla seems to have recovered from the most clueless
sniffing heuristics by now, you predominantly would want to avoid
ISO HTML (That's a *real standard* and as such still good for Quirks
mode in Opera, but after all that's just a commercial product and as
such inherently crap, so is the faulty documentation which you just
shouldn't trust).
http://www.arealvalidator.com

Indeed. It seemed to be the only option I could find


It also depends on your OS; anyway, there's a handy point-and-shoot
SP-plugin available even for NoteTab.
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #26

P: n/a
In article <m3************@email.bednarz.nl>,
Eric B. Bednarz <be*****@fahr-zur-hoelle.org> writes:
sniffing heuristics by now, you predominantly would want to avoid
ISO HTML (That's a *real standard* and as such still good for Quirks


Have you looked at ISO HTML? Devices like implying <div1> ... <div6>
to enforce a hierarchy seem to me an exercise in futility on a par
with XHTML1.1.
http://www.arealvalidator.com

Indeed. It seemed to be the only option I could find


It also depends on your OS;


Indeed. FYI there's http://www.webthing.com/software/validator-lite/
for platforms with GTK on - which means in principle more-or-less
everything. But it's very much a quick&dirty hack just to wrap
OpenSP in a GUI. I'd expect arealvalidator to offer more on Windows.

--
Nick Kew
Jul 23 '05 #27

P: n/a
In article <m3************@email.bednarz.nl>,
Eric B. Bednarz <be*****@fahr-zur-hoelle.org> wrote:
"Jukka K. Korpela" <jk******@cs.tut.fi> writes:

[I was taking for granted that we are talking (about the SGML
declaration of) XML here]
The magic slash in <img .../> is there to make the world safe for
abstract parsing.


s/abstract parsing/real-world tag-soup slurping/


Nope, the magic slash is indeed there to allow one-tag empty elements in
XML to be parsed unambiguously without a DTD. (XML does not make a
normative reference to SGML, so it does not matter how the tokenization
or parsing would work as per Annex K.)

The magic slash does not make the world safe for real-world tag slurping
in any way. Check out
http://iki.fi/hsivonen/test/bogo-empty-element.html using Mozilla's DOM
Inspector, for example. The magic slash in <span/> and <span /> is
ignored.
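
As a rough illustration of the XML half of this point (a sketch using Python's standard xml.etree.ElementTree, which nobody in the thread mentions, and sample markup made up for the purpose): a non-validating XML parser accepts <span/> and <span /> as complete empty elements with no DTD available at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; no DTD is supplied anywhere.
SAMPLE = "<p>a<span/>b<span />c</p>"

root = ET.fromstring(SAMPLE)
spans = list(root)

for span in spans:
    # text is None and len() is 0 because nothing sits inside the element;
    # the character data after each empty-element tag ends up in .tail
    print(span.tag, span.text, len(span), repr(span.tail))
```

This only shows how an XML parser behaves. A tag-soup HTML parser is a separate matter, which is what the test page above is about.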

--
Henri Sivonen
hs******@iki.fi
http://iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
Jul 23 '05 #28

P: n/a
Michael Rozdoba <mr**@nowhere.invalid> wrote:
What are the problems of giving no uri?
None that I'm aware of for the HTML 4.01 *Strict* doctype, although it's
not allowed by the specs.
That just leaves element case sensitivity...


I haven't bothered enforcing that, I didn't see the point since UAs are
not going to care one iota.

Remember that to code using all the extra "strictness" of xhtml does not
need a custom DTD at all, it's purely down to the author. The only
benefit you get from incorporating it into a DTD is validator warnings
if you get it wrong. There may be some benefits to doing that for
mandatory closing of elements, but I can't see any point in requiring
case sensitivity.

--
Spartanicus
Jul 23 '05 #29

P: n/a
In article
<6i********************************@news.spartanicus.utvinternet.ie>,
Spartanicus <me@privacy.net> wrote:
Michael Rozdoba <mr**@nowhere.invalid> wrote:
What are the problems of giving no uri?
None that I'm aware of for the HTML 4.01 *Strict* doctype,


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> triggers the Quirks
mode in Mac IE 5. It does not have anything to do with validation,
though.
although it's not allowed by the specs.


Which specs?

--
Henri Sivonen
hs******@iki.fi
http://iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
Jul 23 '05 #30

P: n/a
On Sun, 12 Dec 2004 14:24:37 +0200, Henri Sivonen <hs******@iki.fi>
wrote:
In article
<6i********************************@news.spartanicus.utvinternet.ie>,
Spartanicus <me@privacy.net> wrote:
Michael Rozdoba <mr**@nowhere.invalid> wrote:
What are the problems of giving no uri?


None that I'm aware of for the HTML 4.01 *Strict* doctype,
although it's not allowed by the specs.


Which specs?


The HTML 4.01 specs.
http://www.w3.org/TR/html401/struct/global.html#h-7.2

"HTML 4.01 specifies three DTDs, so authors must include one of the
following document type declarations in their documents. The DTDs vary
in the elements they support.
....
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
....
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
....
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">"

So authors MUST include one of the three listed doctypes. Hence the
versions without URIs aren't allowed.

Steve

Jul 23 '05 #31

P: n/a
Steve Pugh <st***@pugh.net> writes:
http://www.w3.org/TR/html401/struct/global.html#h-7.2
3PM and already time for the best medicine against a hangover.
"HTML 4.01 specifies three DTDs, so authors must include one of the
following document type declarations in their documents. The DTDs vary
in the elements they support.
...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">"

So authors MUST include one of the three listed doctypes. Hence the
versions without URIs aren't allowed.


The next time you visit those specs, feel free to view source. :)

(it is rather obvious following this 'strict' interpretation that the
public identifier MUST be immediately followed by a line feed and that
the system identifier SHALL be preceded by eight space characters --
another interpretation is that there are many syntactically different
but factually identical ways to do the same thing, namely include a
bunch of markup declarations, so the meaning of 'one of the following'
is completely up to the reader)
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #32

P: n/a
Henri Sivonen <hs******@iki.fi> writes:
Nope, the magic slash is indeed there to allow one-tag empty elements in
XML to be parsed unambiguously without a DTD. (XML does not make a
normative reference to SGML, so it does not matter how the tokenization
or parsing would work as per Annex K.)
You are right on that. The word-count for SGML in the XML spec's short
abstract is 3 though, in rather unambiguous contextual prose. But that
probably doesn't tell much. As I was there anyway I had myself
convinced that 'empty-element tags' are an XMLism, something I
apparently had forgotten (it doesn't make a difference from my POV, but
your point is taken). I read the XML spec 4 or 5 times in the past, to
no avail, since it always left me exactly where I started ("what's it
good for anyway?").
The magic slash does not make the world safe for real-world tag slurping
in any way. Check out
http://iki.fi/hsivonen/test/bogo-empty-element.html using Mozilla's DOM
Inspector, for example. The magic slash in <span/> and <span /> is
ignored.


That *is* why it is supposed to be 'safe', or 'compatible'. I think we
have a misunderstanding there (which doesn't really matter as far as I'm
concerned).
--
| ) Più Cabernet,
-( meno Internet.
| ) http://bednarz.nl/
Jul 23 '05 #33

P: n/a
In article <m3************@email.bednarz.nl>,
Eric B. Bednarz <be*****@fahr-zur-hoelle.org> wrote:
I read the XML spec 4 or 5 times in the past, to
no avail, since it always left me exactly where I started ("what's it
good for anyway?").


It is good for storing or transmitting a byte representation of a tree
that consists of labeled internal/leaf "element" nodes and leaf text
nodes with Unicode text, where the element nodes can also have
"attribute" name-value pairs. This representation can be parsed using
widely-available off-the-shelf tools without the recipient needing to
know how to use a parser generator for generating a parser for a given
grammar.

Such a tree could be represented using SGML as well, but the syntax of
SGML is a serious overkill for the purpose.
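
The tree model described above can be sketched with an off-the-shelf parser. This is only an illustration using Python's standard xml.etree.ElementTree; the element names, attribute, and text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Build a small tree: labeled element nodes, one attribute, Unicode text leaves.
root = ET.Element("post", {"id": "34"})
author = ET.SubElement(root, "author")
author.text = "Henri Sivonen"
body = ET.SubElement(root, "body")
body.text = "Più Cabernet"

# Serialize to a byte representation, then parse it back without any
# grammar or DTD being supplied to the recipient.
data = ET.tostring(root, encoding="utf-8")
parsed = ET.fromstring(data)

print(parsed.tag, parsed.get("id"), parsed.find("body").text)
```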

--
Henri Sivonen
hs******@iki.fi
http://iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
Jul 23 '05 #34

P: n/a
On Sun, 12 Dec 2004 15:15:12 +0100, Eric B. Bednarz
<be*****@fahr-zur-hoelle.org> wrote:
Steve Pugh <st***@pugh.net> writes:
So authors MUST include one of the three listed doctypes. Hence the
versions without URIs aren't allowed.
The next time you visit those specs, feel free to view source. :)


Therefore, either this "must" in the specs is incorrect, or the validator
is incorrectly passing the shortened DTD as valid. Rather than continue
the cat-mouse, does anyone know which it is?

I'd guess that since it's been 5 years since this document has been up, if
it were an error in the language of the specs it would appear in the
errata. My buck forty-nine's on the validator passing a form of the DTD
not expressly permitted.
(it is rather obvious following this 'strict' interpretation that the
public identifier MUST be immediately followed by a line feed and that
the system identifier SHALL be preceded by eight space characters --
It's clear from other examples of this DTD on the same page that this is
merely a wrapped line with indent. Without explicit "musts" and "shalls"
in the text, this assertion (which I know you are making in jest anyhow)
doesn't hold water.
another interpretation is that there are many syntactically different
but factually identical ways to do the same thing, namely include a
bunch of markup declarations, so the meaning of 'one of the following'
is completely up to the reader)


Then that should also be specified. Long and short of it - something's
messy here in 7.2. We should not have to be divining the truth out of a
5-year-old document.
Jul 23 '05 #35

P: n/a
On Sun, 12 Dec 2004 12:50:24 -0500, Neal <ne*****@yahoo.com> wrote:
On Sun, 12 Dec 2004 15:15:12 +0100, Eric B. Bednarz
<be*****@fahr-zur-hoelle.org> wrote:
Steve Pugh <st***@pugh.net> writes:

So authors MUST include one of the three listed doctypes. Hence the
versions without URIs aren't allowed.
The next time you visit those specs, feel free to view source. :)


Shocking isn't it? Transitional.
Therefore, either this "must" in the specs is incorrect, or the validator
is incorrectly passing the shortened DTD as valid. Rather than continue
the cat-mouse, does anyone know which it is?
Or...
It's one of the requirements in the spec that can not be expressed in
the DTD (after all, how could it?).
I'd guess that since it's been 5 years since this document has been up, if
it were an error in the language of the specs it would appear in the
errata.


I wouldn't give them that much credit. I think it is a mistake in the
spec and that the authors never intended to limit the doctypes to
those exact variants.

Steve

Jul 23 '05 #36

P: n/a
In article <op**************@news.individual.net>,
Neal <ne*****@yahoo.com> wrote:
On Sun, 12 Dec 2004 15:15:12 +0100, Eric B. Bednarz
<be*****@fahr-zur-hoelle.org> wrote:
Steve Pugh <st***@pugh.net> writes:
So authors MUST include one of the three listed doctypes. Hence the
versions without URIs aren't allowed.
The next time you visit those specs, feel free to view source. :)


Therefore, either this "must" in the specs is incorrect, or the validator
is incorrectly passing the shortened DTD as valid. Rather than continue
the cat-mouse, does anyone know which it is?


What premises shall we accept true? Shall we pretend HTML is an
application of SGML and the SGML rules apply? (This is just a mind game.
Theory and practice are wildly different when it comes to the supposed
SGML nature of HTML.)

In SGML (but not in XML), it is permissible to refer to an external
entity using merely a public id without a system id. The validator uses
an SGML parser. Browsers don't. (And browsers don't resolve external
entities.)
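
To make the "public id without a system id" form concrete, here is a minimal sketch of a doctype splitter in Python. It is an assumption-laden toy, not how the validator or any SGML parser works: the regular expression, the function name parse_doctype, and the restriction to double-quoted literals are all simplifications introduced for this example.

```python
import re

# Toy pattern: document type name, then optionally PUBLIC with a public
# identifier and an optional system identifier, or SYSTEM with an optional
# system identifier. Only double-quoted literals are handled.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>[A-Za-z][\w.-]*)'
    r'(?:\s+(?:PUBLIC\s+"(?P<public>[^"]*)"(?:\s+"(?P<system_pub>[^"]*)")?'
    r'|SYSTEM(?:\s+"(?P<system>[^"]*)")?))?'
    r'\s*>',
    re.IGNORECASE,
)


def parse_doctype(decl):
    """Return name, public id and system id, or None if the toy pattern fails."""
    m = DOCTYPE_RE.match(decl.strip())
    if m is None:
        return None
    system = m.group("system_pub")
    if system is None:
        system = m.group("system")
    return {"name": m.group("name"), "public": m.group("public"), "system": system}


short_form = parse_doctype('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">')
full_form = parse_doctype(
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '        "http://www.w3.org/TR/html4/strict.dtd">'
)
print(short_form)
print(full_form)
```

Under these assumptions the short form simply comes back with no system identifier, leaving it to whoever resolves the entity to locate the DTD from the public identifier.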
I'd guess that since it's been 5 years since this document has been up, if
it were an error in the language of the specs it would appear in the
errata. My buck forty-nine's on the validator passing a form of the DTD
not expressly permitted.
Are you aware that the validator checks whether the document is
internally consistent (elements and attributes used as declared)
according to an SGML formalism and does not check whether the prose of
the HTML spec is obeyed?

The validator isn't what it is often hyped to be.
Then that should also be specified. Long and short of it - something's
messy here in 7.2. We should not have to be divining the truth out of a
5-year-old document.


Perhaps the reason is that the HTML spec makes a normative reference to
SGML but goes on to make statements that it wouldn't be allowed to make
if it was seriously a conforming application of SGML.

--
Henri Sivonen
hs******@iki.fi
http://iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html
Jul 23 '05 #37

P: n/a
On Sun, 12 Dec 2004 12:50:24 -0500, Neal wrote:

[Re Section 7.2 about what doctype declarations "must" look like]
Therefore, either this "must" in the specs is incorrect, or the validator
is incorrectly passing the shortened DTD as valid. Rather than continue
the cat-mouse, does anyone know which it is?
The latter, because the guts of the validator is an SGML parser. The
provisions of Sec 7.2 make no sense in SGML, so naturally the parser can't
enforce them. The wrapper CGI script/program could, though.
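
A check of that kind in a wrapper might look roughly like the following Python sketch. It is purely hypothetical and is not taken from any actual validator code. It compares a declaration against the three forms quoted from Sec 7.2, and it assumes that runs of whitespace (including the line break and indent) are insignificant, which is itself one of the points under dispute in this thread.

```python
import re

# The three declarations quoted earlier from section 7.2, on one line each.
LISTED = {
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" '
    '"http://www.w3.org/TR/html4/frameset.dtd">',
}


def normalise(decl):
    # Assumption: collapse any run of whitespace to a single space.
    return re.sub(r"\s+", " ", decl.strip())


def is_listed_doctype(decl):
    """True only if decl is literally one of the three listed declarations."""
    return normalise(decl) in LISTED


print(is_listed_doctype('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
                        '        "http://www.w3.org/TR/html4/strict.dtd">'))
print(is_listed_doctype('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">'))
```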
Long and short of it - something's messy here in 7.2.


There's nothing not messy in 7.2. There is deep confusion of its own
making regarding the nature and purpose of doctype declarations; deep
confusion inherited from other quarters regarding URIs vis-a-vis PUBLIC
and SYSTEM identifiers; and deep confusion about being taken seriously.
Jul 23 '05 #38

P: n/a
On Sun, 12 Dec 2004 18:26:01 +0000, Steve Pugh wrote:
On Sun, 12 Dec 2004 12:50:24 -0500, Neal <ne*****@yahoo.com> wrote:

I'd guess that since it's been 5 years since this document has been up,
if it were an error in the language of the specs it would appear in the
errata.


I wouldn't give them that much credit. I think it is a mistake in the spec
and that the authors never intended to limit the doctypes to those exact
variants.


Actually, there is no mistake. The authors intended precisely that. The
why of it is an open secret.

Jul 23 '05 #39

P: n/a
On Sat, 11 Dec 2004 01:14:03 +0000, Spartanicus wrote:
If you use a local validator then for DTDs that use a subset of the public
DTD, (no new elements), you can use the public doctype and override the
location of the DTD locally to a local DTD. (I've elected to omit the uri
from the doctype declaration but this isn't the best way).


What is the best way?

I'm not sure what the OP is looking for here. A magical doctype declaration
that "works" both locally and globally? (As a matter of fact, there *is*
an answer for that, for a suitably restricted value of "globally", but it
may run afoul of latter-day confusions about PUBLIC and SYSTEM.)

Consider the benefits of

<!DOCTYPE html SYSTEM>

or even just

<!DOCTYPE html>

(Catalogs, anyone?)
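
For readers who have not met catalogs: the idea is that a local file maps a public identifier to whatever DTD file you want the parser to read. The Python sketch below only illustrates that lookup. The catalog entry, the local file name, and both function names are assumptions made for the example, and a real catalog resolver has further entry types and precedence rules that are ignored here.

```python
import re

# Hypothetical catalog text: one PUBLIC entry pointing the HTML 4.01 Strict
# public identifier at a local copy of a stricter DTD.
CATALOG_TEXT = '''
PUBLIC "-//W3C//DTD HTML 4.01//EN" "html401-stricter.dtd"
'''

ENTRY_RE = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def load_public_map(text):
    """Collect PUBLIC entries as {public identifier: local path}."""
    return dict(ENTRY_RE.findall(text))


def resolve(public_id, system_id, public_map):
    # Simplification: prefer the catalog mapping when one exists, otherwise
    # fall back to the system identifier from the document (possibly None).
    return public_map.get(public_id, system_id)


public_map = load_public_map(CATALOG_TEXT)
print(resolve("-//W3C//DTD HTML 4.01//EN", None, public_map))
```

With a mapping like this, the document itself can keep the standard public doctype while the local validation run reads the stricter DTD.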
Jul 23 '05 #40

P: n/a
"Arjun Ray" <ar**@nmds.com.invalid> wrote:
If you use a local validator then for DTDs that use a subset of the public
DTD, (no new elements), you can use the public doctype and override the
location of the DTD locally to a local DTD. (I've elected to omit the uri
from the doctype declaration but this isn't the best way).


What is the best way?


Include the uri.

--
Spartanicus
Jul 23 '05 #41

P: n/a
On Sun, 12 Dec 2004 23:04:02 +0000, Spartanicus wrote:
"Arjun Ray" <ar**@nmds.com.invalid> wrote:

What is the best way?


Include the uri.


Dreadful.

(Though, admittedly, the error of using SYSTEM identifiers for URIs is a
very deep-seated one, despite the official sanction.)

Jul 23 '05 #42

P: n/a
"Arjun Ray" <ar**@nmds.com.invalid> wrote:
What is the best way?


Include the uri.


Dreadful.


I'm not persuaded by your arguments.

--
Spartanicus
Jul 23 '05 #43

P: n/a
On Sun, 12 Dec 2004 18:13:10 +0200, Henri Sivonen wrote:
Such a tree could be represented using SGML as well, but the syntax of
SGML is a serious overkill for the purpose.


On the contrary, the syntax of XML is serious overkill for the purpose.

Jul 23 '05 #44

P: n/a
Arjun:
The why of it is an open secret.


I like secrets. Tell me...

Jul 23 '05 #45

P: n/a
On Sun, 12 Dec 2004 21:36:15 -0500, Neal wrote:
Arjun:

The why of it is an open secret.


I like secrets. Tell me...


(1) To support Doctype Sniffing ("version information")
(2) Keeping The Web Safe For Netploder


Jul 23 '05 #46

P: n/a
Arjun:
Neal:
Arjun:

The why of it is an open secret.


I like secrets. Tell me...


(1) To support Doctype Sniffing ("version information")
(2) Keeping The Web Safe For Netploder


So, using the specified format of the doctype is ultimately a wise thing to
do.
Jul 23 '05 #47

P: n/a
On Sun, 12 Dec 2004 23:07:44 -0500, Neal wrote:
Arjun:

(1) To support Doctype Sniffing ("version information")
(2) Keeping The Web Safe For Netploder


So, using the specified format of the doctype is ultimately a wise thing to
do.


Only if you think DS and KTWSFN are Good Things.

Jul 23 '05 #48

P: n/a
On Mon, 13 Dec 2004 04:18:36 GMT, Arjun Ray <ar**@nmds.com.invalid> wrote:
On Sun, 12 Dec 2004 23:07:44 -0500, Neal wrote:
Arjun:

(1) To support Doctype Sniffing ("version information")
(2) Keeping The Web Safe For Netploder


So, using the specified format of the doctype is ultimately a wise thing to
do.


Only if you think DS and KTWSFN are Good Things.


Which depends on whether you have rent money on hand, no?

Jul 23 '05 #49

P: n/a
On Sun, 12 Dec 2004 23:26:33 -0500, Neal wrote:
On Mon, 13 Dec 2004 04:18:36 GMT, Arjun Ray <ar**@nmds.com.invalid> wrote:

Only if you think DS and KTWSFN are Good Things.


Which depend on whether you have rent money on hand, no?


Not really. If the rent money is predicated on following orders, then the
effective wisdom (or lack of it) will have been exercised by someone else.
Jul 23 '05 #50

81 Replies
