
Are XML-style "/>" tags valid in 4.01 Transitional? I get weird answers from validators.

P: n/a
Consider the following HTML.

----------

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<title>Untitled</title>
<link rel="Stylesheet" href="mystylesheet.css" />
</head><body>
<img src="myimage.gif" alt="my image" width="100" height="100" />
</body>
</html>

----------

There's a guy at work who insists on coding like this. Or rather,
*some* of his images have a trailing slash, not others, and his link
tags, as above, have them, but not his meta-tags.

I don't know where he picked up the habit, but he says it's valid 4.01
Transitional, and it's also "best practice".

If it is, why aren't all his single tags like this? But let's move on.

In an attempt to get an official answer on this, I validate the above
at the W3C.

I get this result:

----------

Line 7, column 6: end tag for element "HEAD" which is not open
(explain...).

Line 8, column 5: document type does not allow element "BODY" here
(explain...).

----------

If I validate it with BBEdit's built-in Check Syntax, it gives me this:

----------

File "xmlstyle.html"; Line 7: Document type doesn't permit empty XML
element; "<link/>".
File "xmlstyle.html"; Line 10: Document type doesn't permit empty XML
element; "<img/>".

----------

When I show these results to the coder, he says the W3C is complaining
about "HEAD and BODY tags on the same line" which is laughable, but
he's got a point when he says "BBEdit says the tags are empty. They're
not empty."

I could go on trying other validators, but I'm not happy with the
results of these.

Validated as 4.01 Strict, by the way, these tags are definitely
errors. "Character Data Is Not Allowed Here" with an arrow pointing to
the end of the tag.

What I want is someone to give me a definitive response, backed up by
a link to a reputable website, where it gives an answer either way. I
think I'm right but I can't cite anything.

To me coding this way is something like the HTML equivalent of wearing
a baseball cap backwards...
Jul 20 '05 #1
8 Replies


P: n/a
rf

"Hostile17" <ho*******@bigfoot.com> wrote in message
news:c6*************************@posting.google.com...
> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. Or rather,
> *some* of his images have a trailing slash, not others, and his link
> tags, as above, have them, but not his meta-tags.

A / at the end of a tag is not valid with any 4.01 DTD. Your guy is wrong.

Go to the spec - http://www.w3.org/TR/html4/ - and have a look.

It only appears to "work" because the browser's error correction is
kicking in and throwing away the invalid '/' attribute.
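
As a rough illustration of that tolerance, here is a minimal Python sketch. It uses the standard-library html.parser module as a stand-in for a forgiving, browser-style parser. That is an assumption made for illustration only: html.parser is neither a browser nor an SGML validator. TagLogger and parse are hypothetical names.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records the events a lenient HTML parser reports for some markup."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for an ordinary start tag such as <link ...>
        self.events.append(("start", tag, dict(attrs)))

    def handle_startendtag(self, tag, attrs):
        # Called for an XML-style tag such as <link ... />
        self.events.append(("startend", tag, dict(attrs)))

    def handle_data(self, data):
        # Record any non-whitespace character data the parser sees
        if data.strip():
            self.events.append(("data", data))


def parse(markup):
    """Return the list of events the lenient parser produces for markup."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


print(parse('<link rel="Stylesheet" href="mystylesheet.css" />'))
print(parse('<link rel="Stylesheet" href="mystylesheet.css">'))
```

In this sketch both spellings yield a single link element with the same attributes and no stray character data, which would explain why the page looks fine on screen even though the markup is wrong for HTML 4.01.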
> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

No, it is not. It is an XHTML rule that empty elements be closed, most easily
by including the closing, the /, in the opening tag. This is actually a
requirement of XML. It has nothing to do with HTML.

> If it is, why aren't all his single tags like this, but let's move on.

Because he is only wrong some of the time :-)

> In an attempt to get an official answer on this, I validate the above
> at the W3C.
>
> I get this result:
>
> ----------
>
> Line 7, column 6: end tag for element "HEAD" which is not open
> (explain...).
>
> Line 8, column 5: document type does not allow element "BODY" here
> (explain...).

The validator's error recovery is not quite as good as the browsers'. It is
misinterpreting /> as something else and getting screwed up a bit further
down.

Probably it is interpreting something in the head as body text. This is
quite permissible; a UA should implicitly close the head element and open a
body element. You get exactly the same effect if you use something like
<p>text</p> inside your head element.

So, when the validator gets to the </head> tag it raises an error. The head
element has already been closed. When it gets to the <body> tag it raises an
error. The body element has already been opened. You cannot have nested
body elements.
> ----------
>
> If I validate it with BBEdit's built-in Check Syntax, it gives me this:
>
> ----------
>
> File "xmlstyle.html"; Line 7: Document type doesn't permit empty XML
> element; "<link/>".
> File "xmlstyle.html"; Line 10: Document type doesn't permit empty XML
> element; "<img/>".
>
> ----------
>
> When I show these results to the coder, he says the W3C is complaining
> about "HEAD and BODY tags on the same line" which is laughable,

Very true. See above.

> but
> he's got a point when he says "BBEdit says the tags are empty. They're
> not empty."

Look again. It does not say empty tag, it says empty element. The image and
link elements are indeed empty elements; they have no content like that
title element up there does. All the goodies with a link or image element
happen in the opening tag.

With HTML, empty elements do not have a closing tag. So, if you attempt to
close the element in the opening tag, BBEdit, correctly, gets upset. It would
also get upset if you said something like <img...>description of
image</img>. It's just not allowed.

> I could go on trying other validators, but I'm not happy with the
> results of these.

Don't bother. Use the specs.

> Validated as 4.01 Strict, by the way, these tags are definitely
> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
> the end of the tag.

Correct. Transitional allows invalid attributes (the /). Strict does not.
You want a new attribute? Add it to the DTD.

> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way. I
> think I'm right but I can't cite anything.

As I said, go to the spec above. Also look up the XHTML spec. There is a
good description of what is new/different with XHTML and the above is
specifically mentioned.

Cheers
Richard.
Jul 20 '05 #2

P: n/a
On 11 Sep 2003 18:01:24 -0700, ho*******@bigfoot.com (Hostile17)
wrote:
> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. [...]
> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way. I
> think I'm right but I can't cite anything.

Jukka has a nice article that you may be interested in:

<http://www.cs.tut.fi/~jkorpela/html/empty.html>

Nick

--
Nick Theodorakis
ni******************@urmc.rochester.edu
Jul 20 '05 #3

P: n/a
Hostile17 wrote:
> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

Technically it is sometimes valid HTML 4.01 Transitional, although it
doesn't necessarily mean what he thinks it means.

For example "<hr />" in HTML 4.01 Transitional *technically* means a
horizontal line followed by ">".

This is thus valid within the body of a document. However, within the head
it is not, as you would have a ">" in the head of the document, and you
aren't allowed to start putting characters there -- only metas, links,
styles, scripts, titles, etc.

While technically valid, it is certainly not best practice in HTML.
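
The "<hr />" reading above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not an SGML parser: it assumes the element is declared EMPTY (as hr, img and link are), it only copes with double-quoted or slash-free attribute values, and read_net_start_tag is a hypothetical name.

```python
import re

# Toy model only: under the SGML shorthand, "<name attrs /" is a complete
# start tag. For an EMPTY element the element ends right there, so whatever
# follows the slash is ordinary character data.
NET_START_TAG = re.compile(r'<([A-Za-z][A-Za-z0-9]*)((?:[^<>/"]|"[^"]*")*)/')


def read_net_start_tag(markup):
    """Split markup into (element name, character data left after the '/')."""
    match = NET_START_TAG.match(markup)
    if match is None:
        raise ValueError("no start tag ending in '/' found")
    return match.group(1).lower(), markup[match.end():]


print(read_net_start_tag("<hr />"))
print(read_net_start_tag('<img src="myimage.gif" alt="my image" />'))
```

In both calls the text left over after the tag is the single character ">", which is the stray character data that is harmless in the body and illegal in the head.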

--
Toby A Inkster BSc (Hons) ARCS
Contact Me - http://www.goddamn.co.uk/tobyink/?id=132

Jul 20 '05 #4

P: n/a
In article <id*******************@news-server.bigpond.net.au>, one of infinite monkeys
at the keyboard of "rf" <ma**********@the.time> wrote:

(chop broadly accurate reply)

> A / at the end of a tag is not valid with any 4.01 DTD. Your guy is wrong.

Technically that's not entirely right, for reasons below. For practical
purposes you are right.

>> Line 7, column 6: end tag for element "HEAD" which is not open
>> (explain...).
>>
>> Line 8, column 5: document type does not allow element "BODY" here
>> (explain...).
>
> The validator's error recovery is not quite as good as the browsers. It is
> misinterpreting /> as something else and getting screwed up a bit further
> down.

Nope. The "/>" isn't an error under strict SGML rules, although the second
character (">") may be. If it happens in the HEAD then an artifact of
the HTML/Legacy DTD causes it to close the HEAD and open the BODY.
This kind of ambiguity is just one of many reasons to use HTML strict.

> Probably it is interpreting something in the head as body text. This is
> quite permissable, a UA should implicitly close the head element and open a
> body element. You get exactly the same effect if you use something like
> <p>text</p> inside your head element.

It's not just permissible; it's required (though under HTML strict this
is not the case - your <p> would indeed terminate <HEAD> but the "/>"
is just an error).

To see what's going on, use Page Valet and select "visual" mode:
it will display your HTML normalised.

To get more useful error messages, select a more helpful parse mode.
The default selections in either Page Valet or the WDG Validator
will do this.

>> I could go on trying other validators, but I'm not happy with the
>> results of these.
>
> Don't bother. Use the specs.

Careful! That's more complex than you realise.

>> Validated as 4.01 Strict, by the way, these tags are definitely
>> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
>> the end of the tag.
>
> Correct.

Yes, but

> Transitional allows invalid attributes (the /). Strict does not.
> You want a new attribute? Add it to the DTD.

Totally wrong. Transitional doesn't allow invalid attributes, and /
is not an attribute - it's an abbreviated way to close the tag.

Some of us regard this as a bug in the HTML spec. But it's the
validator's job to implement the spec, warts and all. That's why
other validators (Page Valet and WDG Validator) offer users the
choice of parse modes (WDG calls it "warnings").

>> What I want is someone to give me a definitive response, backed up by
>> a link to a reputable website, where it gives an answer either way. I
>> think I'm right but I can't cite anything.

http://valet.webthing.com/page/parsemode.html
--
Nick Kew

In urgent need of paying work - see http://www.webthing.com/~nick/cv.html
Jul 20 '05 #5

P: n/a
Hostile17 wrote:
> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. Or rather,
> *some* of his images have a trailing slash, not others, and his link
> tags, as above, have them, but not his meta-tags.

It sounds like cargo-cult behaviour. Can he explain why he does it?

> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

The example you gave is invalid, but the practice of XML-style empty
elements _may_ result in a valid document, although only by coincidence,
and it will not mean what he thinks it means. If he's concerned with "best
practice", then he should be far more worried about using the Transitional
document type.

The real problem, though, is that he's shifted the burden of proof onto you,
when it's _him_ that needs to justify his position (as you can clearly
demonstrate error messages, even if you don't understand them).

[snip]

> In an attempt to get an official answer on this, I validate the above
> at the W3C.
>
> I get this result:

[snip errors]

> If I validate it with BBEdit's built-in Check Syntax, it gives me this:

[snip]

> When I show these results to the coder, he says the W3C is complaining
> about "HEAD and BODY tags on the same line" which is laughable,

He has completely misunderstood the error message. Apart from anything
else, it's trivial to prove him wrong by simply putting them on separate
lines.

> but he's got a point when he says "BBEdit says the tags are empty. They're
> not empty."

BBEdit says that it doesn't permit empty XML elements. You are mixing up
tags and elements. <img> elements, for example, are always empty, although
the tags are not. Empty elements have no content, although they usually
have attributes.

> I could go on trying other validators, but I'm not happy with the
> results of these.
They are working properly. Let's walk through the code:

> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">

This is an HTML 4.01 Transitional document.

> <html>

Opens the <html> element.

> <head>

Opens the <head> element, which is within the <html> element.

> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">

Opens and closes a <meta> element, which is within the <head> element. This
is an empty element.

> <title>Untitled</title>

Opens the <title> element, within the <head> element, which contains the
character data 'Untitled', and closes it.

> <link rel="Stylesheet" href="mystylesheet.css" />

Opens a <link> element. This is where the problem lies. An obscure part of
HTML allows authors to use shorthand syntax to write elements. For
example, the following are equivalent:

<em>...</em>
<em/.../

No browser (that I know of) implements this part of HTML, only the
validator, which is why you haven't heard of it. So, going back to the
code:

> <link rel="Stylesheet" href="mystylesheet.css" />

This is actually equivalent to:

<link rel="Stylesheet" href="mystylesheet.css" >/

Now, <link> elements are always empty. So, upon encountering the solidus
(slash), the parser jumps back out one layer, to the <head> element. The
<head> element cannot contain character data, so the parser jumps out
another level (the closing tag for the <head> element is optional), to the
<html> element. The <html> element cannot contain character data either,
_but_, another little-known corner of HTML states that the <body> element
can be implied; that is to say you don't need to explicitly start it with
an opening tag. So, going back to the code (again):

> <link rel="Stylesheet" href="mystylesheet.css" />

This opens a <link> element, which is empty, closes the <head> element and
starts the <body> element, which contains a solidus character.

> </head><body>

Obviously, at this point, there is no <head> element to close, and you can
only have one <body> element per document, which is already open at this
point. This is the error.
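
The walkthrough above can be mimicked with a small Python toy. It is only a sketch under loud assumptions: it is not an SGML parser, it knows just enough rules to model this one page, the names EMPTY, validate and document are hypothetical, and the error strings are simply copied from the validator output quoted earlier in the thread. The stray character after the link tag is passed in as a plain text token.

```python
# Elements that never have content or an end tag (toy subset).
EMPTY = {"meta", "link", "img", "hr", "br"}


def validate(tokens):
    """Return the errors a toy parser reports for (kind, value) tokens."""
    stack, errors = [], []
    for kind, value in tokens:
        if kind == "text":
            # Character data cannot sit directly in <head> or <html>:
            # imply </head> if needed, then imply <body>.
            if stack and stack[-1] == "head":
                stack.pop()
            if stack and stack[-1] == "html":
                stack.append("body")
        elif kind == "start":
            if value == "body" and "body" in stack:
                errors.append('document type does not allow element "BODY" here')
            elif value not in EMPTY:
                stack.append(value)
        elif kind == "end":
            if value in stack:
                # Close everything up to and including the named element.
                while stack.pop() != value:
                    pass
            else:
                errors.append(f'end tag for element "{value.upper()}" which is not open')
    return errors


def document(stray_text):
    """Tokens for the example page; stray_text is what follows <link ...>."""
    tokens = [("start", "html"), ("start", "head"), ("start", "meta"),
              ("start", "title"), ("text", "Untitled"), ("end", "title"),
              ("start", "link")]
    if stray_text:
        tokens.append(("text", stray_text))
    tokens += [("end", "head"), ("start", "body"), ("start", "img"),
               ("end", "body"), ("end", "html")]
    return tokens


print(validate(document(">")))  # page written with <link ... />
print(validate(document("")))   # page written with <link ...>
```

With the stray character present, the toy reports the same two complaints the W3C validator gave, in the same order; without it, it reports none. The character after the img tag is left out of the model because character data is allowed in the body under Transitional.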
[snip]

> Validated as 4.01 Strict, by the way, these tags are definitely
> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
> the end of the tag.

This is actually a different error. HTML 4.01 Strict documents do not allow
character data directly within <body> elements; it needs to be contained
within another element, such as a <p> element. The following is not valid
HTML 4.01 Strict:

....
<body>
Britney

The following is valid HTML 4.01 Strict:

....
<body>
<p>
Britney

> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way.

[snip]

The only definitive source is the specification; unfortunately, it isn't
trivial to point to a single part to say "this is wrong".

<URL:http://www.w3.org/TR/html401/>

However, if you can make do with a less definitive, but pretty clear
statement from the validator guys, have a look at the validator FAQ:

<URL:http://validator.w3.org/docs/help.html#faq-linkandmeta>

--
Jim Dabell

Jul 20 '05 #6

P: n/a
Jim Dabell wrote:
>> <link rel="Stylesheet" href="mystylesheet.css" />
>
> This opens a <link> element, which is empty, closes the <head> element and
> starts the <body> element, which contains a solidus character.


I think it's a greater-than character, actually... the solidus closes
the link element, and it's the character after it that is considered to
be plain text. But this is such an obscure HTML "feature" that I get
confused about it too.

--
== Dan ==
Dan's Mail Format Site: http://mailformat.dan.info/
Dan's Web Tips: http://webtips.dan.info/
Dan's Domain Site: http://domains.dan.info/

Jul 20 '05 #7

P: n/a
Daniel R. Tobias wrote:
> Jim Dabell wrote:
>>> <link rel="Stylesheet" href="mystylesheet.css" />
>>
>> This opens a <link> element, which is empty, closes the <head> element
>> and starts the <body> element, which contains a solidus character.
>
> I think it's a greater-than character, actually... the solidus closes
> the link element, and it's the character after it that is considered to
> be plain text. But this is such an obscure HTML "feature" that I get
> confused about it too.


Oops, yes, you are right. I knew the rule, I don't know how I ended up
switching the characters in the explanation. I hope I didn't confuse the
issue further - cheers for spotting it :)
--
Jim Dabell

Jul 20 '05 #8

P: n/a
Thank you all very much for your help.

I really appreciate it.
Jul 20 '05 #9
