Are XML-style "/>" tags valid in 4.01 Transitional? I get weird answers from validators.

Consider the following HTML.

----------

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<title>Untitled</title>
<link rel="Stylesheet" href="mystylesheet.css" />
</head><body>
<img src="myimage.gif" alt="my image" width="100" height="100" />
</body>
</html>

----------

There's a guy at work who insists on coding like this. Or rather,
*some* of his images have a trailing slash, not others, and his link
tags, as above, have them, but not his meta-tags.

I don't know where he picked up the habit, but he says it's valid 4.01
Transitional, and it's also "best practice".

If it is, why aren't all his single tags like this? But let's move on.

In an attempt to get an official answer on this, I validate the above
at the W3C.

I get this result:

----------

Line 7, column 6: end tag for element "HEAD" which is not open
(explain...).

Line 8, column 5: document type does not allow element "BODY" here
(explain...).

----------

If I validate it with BBEdit's built-in Check Syntax, it gives me this:

----------

File "xmlstyle.html"; Line 7: Document type doesn't permit empty XML
element; "<link/>".
File "xmlstyle.html"; Line 10: Document type doesn't permit empty XML
element; "<img/>".

----------

When I show these results to the coder, he says the W3C is complaining
about "HEAD and BODY tags on the same line" which is laughable, but
he's got a point when he says "BBEdit says the tags are empty. They're
not empty."

I could go on trying other validators, but I'm not happy with the
results of these.

Validated as 4.01 Strict, by the way, these tags are definitely
errors. "Character Data Is Not Allowed Here" with an arrow pointing to
the end of the tag.

What I want is someone to give me a definitive response, backed up by
a link to a reputable website, where it gives an answer either way. I
think I'm right but I can't cite anything.

To me coding this way is something like the HTML equivalent of wearing
a baseball cap backwards...
Jul 20 '05 #1
rf

"Hostile17" <ho*******@bigfoot.com> wrote in message
news:c6*************************@posting.google.com...

> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. Or rather,
> *some* of his images have a trailing slash, not others, and his link
> tags, as above, have them, but not his meta-tags.

A / at the end of a tag is not valid with any 4.01 DTD. Your guy is wrong.

Go to the spec - http://www.w3.org/TR/html4/ - and have a look.

The reason it appears to "work" is that the browser's error correction is
kicking in and throwing away the invalid '/' attribute.

> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

No, it is not. It is an XHTML rule that empty elements be closed, most easily
by including the closing, the /, in the opening tag. This is actually a
requirement of XML. It has nothing to do with HTML.
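
For illustration, here is a minimal sketch of the same page written as
XHTML 1.0 Transitional, the document type where the trailing slash is the
expected form. The DOCTYPE and xmlns values are the ones published in the
XHTML 1.0 spec; the file names are carried over from the example above, and
the <p> wrapper and type attribute are additions of this sketch.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<!-- Every empty element is closed with " />", including meta -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled</title>
<link rel="stylesheet" href="mystylesheet.css" type="text/css" />
</head>
<body>
<p><img src="myimage.gif" alt="my image" width="100" height="100" /></p>
</body>
</html>
```

Note that XHTML also requires lowercase element names and quoted attribute
values, so the unquoted http-equiv=Content-Type from the original would have
to change as well.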

> If it is, why aren't all his single tags like this, but let's move on.

Because he is only wrong some of the time :-)

> In an attempt to get an official answer on this, I validate the above
> at the W3C.
>
> I get this result:
>
> ----------
>
> Line 7, column 6: end tag for element "HEAD" which is not open
> (explain...).
>
> Line 8, column 5: document type does not allow element "BODY" here
> (explain...).

The validator's error recovery is not quite as good as the browsers'. It is
misinterpreting /> as something else and getting screwed up a bit further
down.

Probably it is interpreting something in the head as body text. This is
quite permissible; a UA should implicitly close the head element and open a
body element. You get exactly the same effect if you use something like
<p>text</p> inside your head element.

So, when the validator gets to the </head> tag it raises an error. The head
element has already been closed. When it gets to the <body> tag it raises an
error. The body element has already been opened. You cannot have nested
body elements.

> ----------
>
> If I validate it with BBEdit's built-in Check Syntax, it gives me this:
>
> ----------
>
> File "xmlstyle.html"; Line 7: Document type doesn't permit empty XML
> element; "<link/>".
> File "xmlstyle.html"; Line 10: Document type doesn't permit empty XML
> element; "<img/>".
>
> ----------
>
> When I show these results to the coder, he says the W3C is complaining
> about "HEAD and BODY tags on the same line" which is laughable,

Very true. See above.

> but
> he's got a point when he says "BBEdit says the tags are empty. They're
> not empty."

Look again. It does not say empty tag, it says empty element. The image and
link elements are indeed empty elements; they have no content like that
title element up there does. All the goodies with a link or image element
happen in the opening tag.

With HTML, empty elements do not have a closing tag. So, if you attempt to
close the element in the opening tag, BBEdit, correctly, gets upset. It would
also get upset if you said something like <img...>description of
image</img>. It's just not allowed.
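
As a sketch of what that means in practice, this is the original page with
the empty elements written the HTML 4.01 way: a start tag only, with no slash
and no end tag. Everything else is unchanged from the example at the top of
the thread, apart from quoting the http-equiv value and putting <body> on its
own line, neither of which is required.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<!-- meta, link and img are empty elements: start tag only -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Untitled</title>
<link rel="Stylesheet" href="mystylesheet.css">
</head>
<body>
<img src="myimage.gif" alt="my image" width="100" height="100">
</body>
</html>
```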

> I could go on trying other validators, but I'm not happy with the
> results of these.

Don't bother. Use the specs.

> Validated as 4.01 Strict, by the way, these tags are definitely
> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
> the end of the tag.

Correct. Transitional allows invalid attributes (the /). Strict does not.
You want a new attribute? Add it to the DTD.

> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way. I
> think I'm right but I can't cite anything.

As I said, go to the spec above. Also look up the XHTML spec. There is a
good description of what is new/different with XHTML and the above is
specifically mentioned.

Cheers
Richard.
Jul 20 '05 #2
On 11 Sep 2003 18:01:24 -0700, ho*******@bigfoot.com (Hostile17)
wrote:

> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. [...]
> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way. I
> think I'm right but I can't cite anything.

Jukka has a nice article that you may be interested in:

<http://www.cs.tut.fi/~jkorpela/html/empty.html>

Nick

--
Nick Theodorakis
ni******************@urmc.rochester.edu
Jul 20 '05 #3
Hostile17 wrote:

> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

Technically it is sometimes valid HTML 4.01 Transitional, although it
doesn't necessarily mean what he thinks it means.

For example "<hr />" in HTML 4.01 Transitional *technically* means a
horizontal line followed by ">".

This is thus valid within the body of a document. However, within the head
it is not, as you would have a ">" in the head of the document, and you
aren't allowed to start putting characters there -- only metas, links,
styles, scripts, titles, etc.

While technically valid, it is certainly not best practice in HTML.
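
A small sketch of that reading, assuming a parser that follows the SGML rules
of HTML 4.01 to the letter (browsers do not): the "/" ends the tag early, and
the ">" that follows is left over as text.

```html
<!-- What the author typed -->
<hr />

<!-- What a strict SGML parser reads: an hr element, then a literal ">" -->
<hr>&gt;
```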

--
Toby A Inkster BSc (Hons) ARCS
Contact Me - http://www.goddamn.co.uk/tobyink/?id=132

Jul 20 '05 #4
In article <id*******************@news-server.bigpond.net.au>, one of infinite monkeys
at the keyboard of "rf" <ma**********@the.time> wrote:

(chop broadly accurate reply)

> A / at the end of a tag is not valid with any 4.01 DTD. Your guy is wrong.

Technically that's not entirely right, for reasons below. For practical
purposes you are right.

>> Line 7, column 6: end tag for element "HEAD" which is not open
>> (explain...).
>>
>> Line 8, column 5: document type does not allow element "BODY" here
>> (explain...).
>
> The validator's error recovery is not quite as good as the browsers'. It is
> misinterpreting /> as something else and getting screwed up a bit further
> down.

Nope. The "/>" isn't an error under strict SGML rules, although the second
character (">") may be. If it happens in the HEAD then an artifact of
the HTML/Legacy DTD causes it to close the HEAD and open the BODY.
This kind of ambiguity is just one of many reasons to use HTML strict.

> Probably it is interpreting something in the head as body text. This is
> quite permissible; a UA should implicitly close the head element and open a
> body element. You get exactly the same effect if you use something like
> <p>text</p> inside your head element.

It's not just permissible; it's required (though under HTML strict this
is not the case - your <p> would indeed terminate <HEAD> but the "/>"
is just an error).

To see what's going on, use Page Valet and select "visual" mode:
it will display your HTML normalised.

To get more useful error messages, select a more helpful parse mode.
The default selections in either Page Valet or the WDG Validator
will do this.

>> I could go on trying other validators, but I'm not happy with the
>> results of these.
>
> Don't bother. Use the specs.

Careful! That's more complex than you realise.

>> Validated as 4.01 Strict, by the way, these tags are definitely
>> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
>> the end of the tag.
>
> Correct.

Yes, but

> Transitional allows invalid attributes (the /). Strict does not.
> You want a new attribute? Add it to the DTD.

Totally wrong. Transitional doesn't allow invalid attributes, and /
is not an attribute - it's an abbreviated way to close the tag.

Some of us regard this as a bug in the HTML spec. But it's the
validator's job to implement the spec, warts and all. That's why
other validators (Page Valet and WDG Validator) offer users the
choice of parse modes (WDG calls it "warnings").

>> What I want is someone to give me a definitive response, backed up by
>> a link to a reputable website, where it gives an answer either way. I
>> think I'm right but I can't cite anything.

http://valet.webthing.com/page/parsemode.html
--
Nick Kew

In urgent need of paying work - see http://www.webthing.com/~nick/cv.html
Jul 20 '05 #5
Hostile17 wrote:

> Consider the following HTML.
>
> ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> <html>
> <head>
> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
> <title>Untitled</title>
> <link rel="Stylesheet" href="mystylesheet.css" />
> </head><body>
> <img src="myimage.gif" alt="my image" width="100" height="100" />
> </body>
> </html>
>
> ----------
>
> There's a guy at work who insists on coding like this. Or rather,
> *some* of his images have a trailing slash, not others, and his link
> tags, as above, have them, but not his meta-tags.

It sounds like cargo-cult behaviour. Can he explain why he does it?

> I don't know where he picked up the habit, but he says it's valid 4.01
> Transitional, and it's also "best practice".

The example you gave is invalid, but the practice of XML-style empty
elements _may_ result in a valid document, although only by coincidence,
and it will not mean what he thinks it means. If he's concerned with "best
practice", then he should be far more worried about using the Transitional
document type.

The real problem, though, is that he's shifted the burden of proof onto you,
when it's _him_ that needs to justify his position (as you can clearly
demonstrate error messages, even if you don't understand them).

[snip]

> In an attempt to get an official answer on this, I validate the above
> at the W3C.
>
> I get this result:

[snip errors]

> If I validate it with BBEdit's built-in Check Syntax, it gives me this:

[snip]

> When I show these results to the coder, he says the W3C is complaining
> about "HEAD and BODY tags on the same line" which is laughable,

He has completely misunderstood the error message. Apart from anything
else, it's trivial to prove him wrong by simply putting them on separate
lines.

> but he's got a point when he says "BBEdit says the tags are empty. They're
> not empty."

BBEdit says that it doesn't permit empty XML elements. You are mixing up
tags and elements. <img> elements, for example, are always empty, although
the tags are not. Empty elements have no content, although they usually
have attributes.

> I could go on trying other validators, but I'm not happy with the
> results of these.

They are working properly. Let's walk through the code:

> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">

This is an HTML 4.01 Transitional document.

> <html>

Open the <html> element.

> <head>

Open the <head> element, which is within the <html> element.

> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">

Opens and closes a <meta> element, which is within the <head> element. This
is an empty element.

> <title>Untitled</title>

Opens the <title> element, within the <head> element, which contains the
character data 'Untitled', and closes it.

> <link rel="Stylesheet" href="mystylesheet.css" />

Opens a <link> element. This is where the problem lies. An obscure part of
HTML allows authors to use shorthand syntax to write elements. For
example, the following are equivalent:

<em>...</em>
<em/.../

No browser (that I know of) implements this part of HTML, only the
validator, which is why you haven't heard of it. So, going back to the
code:

> <link rel="Stylesheet" href="mystylesheet.css" />

This is actually equivalent to:

<link rel="Stylesheet" href="mystylesheet.css" >/

Now, <link> elements are always empty. So, upon encountering the solidus
(slash), the parser jumps back out one layer, to the <head> element. The
<head> element cannot contain character data, so the parser jumps out
another level (the closing tag for the <head> element is optional), to the
<html> element. The <html> element cannot contain character data either,
_but_, another little-known corner of HTML states that the <body> element
can be implied; that is to say you don't need to explicitly start it with
an opening tag. So, going back to the code (again):

> <link rel="Stylesheet" href="mystylesheet.css" />

This opens a <link> element, which is empty, closes the <head> element and
starts the <body> element, which contains a solidus character.

> </head><body>

Obviously, at this point, there is no <head> element to close, and you can
only have one <body> element per document, which is already open at this
point. This is the error.

[snip]

> Validated as 4.01 Strict, by the way, these tags are definitely
> errors. "Character Data Is Not Allowed Here" with an arrow pointing to
> the end of the tag.

This is actually a different error. HTML 4.01 Strict documents do not allow
character data directly within <body> elements; it needs to be contained
within another element, such as a <p> element. The following is not valid
HTML 4.01 Strict:

....
<body>
Britney

The following is valid HTML 4.01 Strict:

....
<body>
<p>
Britney
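
For completeness, a sketch of the valid fragment above filled out into a whole
HTML 4.01 Strict document. The DOCTYPE is the standard Strict one from the
HTML 4.01 spec; the title is a placeholder, and the end tags are written out
even though HTML 4.01 lets several of them be omitted.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Untitled</title>
</head>
<body>
<!-- Text sits inside a block element, never directly in body -->
<p>Britney</p>
</body>
</html>
```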

> What I want is someone to give me a definitive response, backed up by
> a link to a reputable website, where it gives an answer either way.

[snip]

The only definitive source is the specification; unfortunately, it isn't
trivial to point to a single part and say "this is wrong".

<URL:http://www.w3.org/TR/html401/>

However, if you can make do with a less definitive, but pretty clear
statement from the validator guys, have a look at the validator FAQ:

<URL:http://validator.w3.org/docs/help.html#faq-linkandmeta>

--
Jim Dabell

Jul 20 '05 #6
Jim Dabell wrote:

>> <link rel="Stylesheet" href="mystylesheet.css" />
>
> This opens a <link> element, which is empty, closes the <head> element and
> starts the <body> element, which contains a solidus character.


I think it's a greater-than character, actually... the solidus closes
the link element, and it's the character after it that is considered to
be plain text. But this is such an obscure HTML "feature" that I get
confused about it too.

--
== Dan ==
Dan's Mail Format Site: http://mailformat.dan.info/
Dan's Web Tips: http://webtips.dan.info/
Dan's Domain Site: http://domains.dan.info/

Jul 20 '05 #7
Daniel R. Tobias wrote:

> Jim Dabell wrote:
>
>>> <link rel="Stylesheet" href="mystylesheet.css" />
>>
>> This opens a <link> element, which is empty, closes the <head> element
>> and starts the <body> element, which contains a solidus character.
>
> I think it's a greater-than character, actually... the solidus closes
> the link element, and it's the character after it that is considered to
> be plain text. But this is such an obscure HTML "feature" that I get
> confused about it too.


Oops, yes, you are right. I knew the rule, I don't know how I ended up
switching the characters in the explanation. I hope I didn't confuse the
issue further - cheers for spotting it :)
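
With that correction applied, the original document can be sketched in
normalised form roughly as follows. This is a hand-written illustration of
the parse described in this thread, not output from the W3C validator or
Page Valet; the comments mark the tags the parser implies.

```html
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Untitled</title>
<link rel="Stylesheet" href="mystylesheet.css">
</head>  <!-- implied end tag: HEAD cannot hold character data -->
<body>   <!-- implied start tag: BODY opens to hold the stray character -->
&gt;
<!-- The literal </head><body> in the source comes next; both are errors,
     because HEAD is already closed and BODY is already open. -->
<img src="myimage.gif" alt="my image" width="100" height="100">&gt;
</body>
</html>
```

The second stray ">" after the image is allowed in Transitional because it
lands inside the body, which is why only the <link> line triggers errors.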
--
Jim Dabell

Jul 20 '05 #8
Thank you all very much for your help.

I really appreciate it.
Jul 20 '05 #9
