XHTML user agent behavior regarding empty elements

Mikko Ohtamaa

From the XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)
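
A minimal sketch of this equivalence, assuming Python's standard xml.etree.ElementTree as the XML parser. The helper name same_element is made up for illustration and is not part of any specification.

```python
import xml.etree.ElementTree as ET

def same_element(a: str, b: str) -> bool:
    """Return True if two XML fragments parse to the same element tree."""
    def shape(elem):
        # ElementTree reports absent text as None; treat it like the empty string.
        return (elem.tag, sorted(elem.attrib.items()), elem.text or "",
                elem.tail or "", [shape(child) for child in elem])
    return shape(ET.fromstring(a)) == shape(ET.fromstring(b))

print(same_element("<foo></foo>", "<foo/>"))  # True
print(same_element("<p> </p>", "<p/>"))       # False: a space is content
```

To an XML parser the two spellings of an empty element are indistinguishable, while an element holding a single space is not empty.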

From the XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From the XML point of view, <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) using <div></div> only. Should
XHTML browsers accept empty-element tags?

A little testing shows that they do not. Both IE 5.5 and Netscape
7.0 fail to render the following XHTML code correctly. They treat the
empty-element tag <div/> as equal to <div>.

This is a nuisance, since when you are producing XHTML from XML with an XSLT
transform, XSLT transformers present empty elements using empty-element
tag notation. You must use an external postprocessor to change <div/>
elements to <div></div> pairs.

<?xml version="1.0" encoding="utf-8" ?>
<html>
<body>

<div style="margin-left: 10%; background: blue">
A working sample.
<div style="margin-left: 10%; background: red">
Lalihoo!
<div id="blaah"></div>
Am I red?
</div>
Am I blue?
</div>

<br/>

<div style="margin-left: 10%; background: blue">
Hiihoo!

<div style="margin-left: 10%; background: red">
Lalihoo!
<div id="blaah"/>
Am I red?
</div>
Am I blue? No, I am red because I am confused.
</div>
</body>
</html>

Jul 20 '05 #1


Julian F. Reschke

"Mikko Ohtamaa" <mo*@sneakmail.zzn.com> wrote in message
news:26**************************@posting.google.com...

From XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From XML point of view <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?
a) The quote in C.3 is from the (non-normative) chapter "HTML compatibility
guidelines".

b) They must.
A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly. They consider
empty-element tag <div/> equal to <div>.
IE is known not to support XHTML. For NS 7, this may be a bug that needs to
be fixed. Make sure that you are serving the XHTML in a way that the browser
is *aware* that this is not HTML, though.
...

Julian

Jul 20 '05 #2

Johannes Koch

Mikko Ohtamaa wrote:

From XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From XML point of view <div/> and <div></div> are equal.
From XML 1.0 Second Edition:
Empty-element tags may be used for any element which has no content,
whether or not it is declared using the keyword EMPTY. For
interoperability, the empty-element tag should be used, and should only
be used, for elements which are declared EMPTY.
However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?
Yes, they should.
A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly.
IE 5.5 is no XHTML browser, maybe it can be called an XML browser.
In various browsers XML rules are only applied when the content is known
to be XML (via an appropriate Content-Type HTTP header).
They consider
empty-element tag <div/> equal to <div>.

In tag soup mode.

No f'up2 set, because it may be interesting for both groups.
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)

Jul 20 '05 #3

Jukka K. Korpela

mo*@sneakmail.zzn.com (Mikko Ohtamaa) wrote:

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not
EMPTY (for example, an empty title or paragraph) do not use the
minimized form (e.g. use <p> </p> and not <p />).
I think it needs to be mentioned that the HTML 4.01 specification
explicitly frowns upon empty paragraphs and says authors should not use
them and browsers should ignore them. It's not clear whether <p> </p> is
empty or not; a space character as the content is not the same as lack of
content (and the common construct <p>&nbsp;</p> that various programs
spit out is yet another thing).
A little testing shows that this is not the case. Both IE 5.5 and
Netscape 7.0 fail to render following XHTML code correctly. They
consider empty-element tag <div/> equal to <div>.
No wonder. And rumors say that there are even some small browsers that
process the construct <div/> _correctly_ by HTML rules as valid up to and
including HTML 4.01, namely as equivalent to <div>> (where the second
greater than sign is a data character).
This is nuisance, since when you are producing XHTML from XML with
XSLT transform, XSLT transformers present empty elements using
empty-element tag notation. You must use external postprocessor to
change <div/> elements to <div></div> pairs.

Why do you generate elements with empty content in the first place?
What is the meaning of a <div> element with no content, given that the
<div> element has no semantics except in the abstract sense that it
constitutes a block-level element?

Empty elements are extremely confusing, see
http://www.cs.tut.fi/~jkorpela/html/empty.html

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #4

Johannes Koch

Mikko Ohtamaa wrote:

I am using MSXML (Microsoft XML engine) to transform XML data to XHTML
reports.
Why do you want to create _X_HTML reports when several browsers don't
know about _X_HTML? Produce HTML instead.
In XSLT it is too heavy to check if each element will be empty and
implement a wrapper for it.

<!-- Emit the <div> only when the source element has non-whitespace text -->
<xsl:template match="foo">
  <xsl:if test="normalize-space(.) != ''">
    <div class="{local-name()}">
      <xsl:value-of select="."/>
    </div>
  </xsl:if>
</xsl:template>

Is this really too heavy?

xpost and f'up2 ctx
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)

Jul 20 '05 #5

David Madore

Mikko Ohtamaa in litteris
<26**************************@posting.google.com> scripsit:

From XML point of view <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?
If the document is served with MIME content-type
"application/xhtml+xml", then <div /> _must_ be treated as equivalent
to <div></div>; on the other hand, if the document is served with MIME
content-type "text/html", then the browser is free to treat the
content as tag soup.

See <URL: http://www.w3.org/TR/xhtml-media-types/ > for more
information.
A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly. They consider
empty-element tag <div/> equal to <div>.
Mozilla (and Mozilla derivatives, such as Netscape7) treat <div/> as
equivalent to <div> when parsing the document as HTML, but as
<div></div> when parsing it as XHTML. The difference is determined by
the MIME content-type as explained above, or, in the absence of
higher-level protocol information, by the extension.
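
The decision described above can be sketched as follows. This is only an illustration, not Mozilla's actual logic: the function name parsing_mode, the extra XML media types beyond application/xhtml+xml, and the extension list are all assumptions made for the example.

```python
import os

# Only application/xhtml+xml is named in the thread; the others are assumed.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}
# Assumed extension mapping, used only when no MIME type is available.
XML_EXTENSIONS = {".xhtml", ".xht", ".xml"}

def parsing_mode(content_type=None, filename=None):
    """Return "xml" or "tag-soup" for a document, preferring the MIME type."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" and normalise case.
        media_type = content_type.split(";", 1)[0].strip().lower()
        return "xml" if media_type in XML_TYPES else "tag-soup"
    if filename:
        ext = os.path.splitext(filename)[1].lower()
        return "xml" if ext in XML_EXTENSIONS else "tag-soup"
    return "tag-soup"

print(parsing_mode("application/xhtml+xml; charset=utf-8"))  # xml
print(parsing_mode("text/html"))                             # tag-soup
print(parsing_mode(None, "report.xhtml"))                    # xml
```

In "xml" mode <div/> is an empty element; in "tag-soup" mode it is read as an opening <div>.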

Note that Mozilla is about the only browser which supports the
application/xhtml+xml content-type anyway.
This is nuisance, since when you are producing XHTML from XML with XSLT
transform, XSLT transformers present empty elements using empty-element
tag notation. You must use external postprocessor to change <div/>
elements to <div></div> pairs.

Simply use <xsl:comment> to create a comment inside the <div> element
if it has any chance of being empty: this will prevent it from being
minimized. I use "" in this context.
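
The same effect can be observed with other XML serializers. A small sketch, with Python's xml.etree.ElementTree standing in for the XSLT processor (an assumption; the thread itself concerns MSXML), and with keep_open as a made-up helper name:

```python
import xml.etree.ElementTree as ET

def keep_open(elem):
    """Append a placeholder comment to elem if it has no text and no children."""
    if not (elem.text or len(elem)):
        elem.append(ET.Comment(" "))
    return elem

div = ET.Element("div", {"id": "blaah"})
print(ET.tostring(div, encoding="unicode"))             # <div id="blaah" />
print(ET.tostring(keep_open(div), encoding="unicode"))  # <div id="blaah"><!-- --></div>
```

Once the element has a comment child it is no longer empty, so the serializer has to write separate start and end tags.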

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #6

Alan J. Flavell

On Mon, Sep 1, David Madore inscribed on the eternal scroll:

Simply use <xsl:comment> to create a comment inside the <div> element
if it has any chance of being empty: this will prevent it from being
minimized. I use "" in this context.

The div element is designed to contain, well, "content". If there
isn't any content, then it's semantically meaningless (syntax or no
syntax). Surely the logical move would be to take it out, rather than
looking for other kinds of content-free clutter to stick into it?

(I did once have a program that ran faster by inserting a NOP, but
that's a different story entirely.)

all the best

Jul 20 '05 #7

John Bokma

Alan J. Flavell wrote:

(I did once have a program that ran faster by inserting a NOP, but
that's a different story entirely.)

Quad word alignment pops up :-)

--
Kind regards, feel free to mail: mail(at)johnbokma.com (or reply)
virtual home: http://johnbokma.com/ ICQ: 218175426
John web site hints: http://johnbokma.com/websitedesign/

Jul 20 '05 #8

David Madore

"Alan J. Flavell" in litteris
<Pi*******************************@lxplus096.cern.ch> scripsit:

The div element is designed to contain, well, "content". If there
isn't any content, then it's semantically meaningless (syntax or no
syntax). Surely the logical move would be to take it out, rather than
looking for other kinds of content-free clutter to stick into it?

Generally speaking, I agree with you. There are rare cases, however,
where I find an empty <div> or <span> element to be useful and
appropriate. Here's one:

<div style="border: solid">
<img src="pornpicture.jpg" width="120" height="240"
alt="[Highly erotic image]" style="float: left" />
<p>To the left is a picture of me. Blah, blah, blah.</p>
<div style="clear: both"></div>
</div>

- in other words, the empty <div> is used to make sure that the border
of the outer <div> fully goes around the image even if the text is too
short for that.

Another case is when you want to style an element using the CSS
"content" property: sometimes there is nothing else to put in the
element. One interesting hack consists of using the CSS "content"
property on an empty <span> element as it seems to be the only way to
include foreign text in an HTML document without embedding it.
Similarly, using the Mozilla-invented XBL language it might turn out
to be useful to bind to empty <div> or <span> elements.

Another case is when the <div> or <span> element starts empty, but
receives dynamical content through the Document Object Model, e.g.,
via ECMAscript. Of course, the DOM might be used to create the <div>
or <span> element itself, but it might then be a major hassle to get
it in the right place, whereas an empty <div> or <span> element with a
correct id tag is so simple to locate in the DOM!

Speaking of which, of course, an empty <div> might be useful if you
want several anchors pointing to the same place in an HTML document.
It isn't very elegant, and I would advise against it in general, but
sometimes it seems to be the right thing to do.

But, again, in general, I agree with you: unless content generation
makes it very hard to tell in advance whether the <div> will be empty,
it is better to leave out empty <div>s.

Besides, I was using <div> just as an example: there are other
possibly empty tags to which the poster's question might validly
apply. <script> springs to my mind. (Unfortunately, as far as
<script> goes, there is the nasty problem of XML's PCDATA versus
SGML's CDATA content...)

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #9

Headless

David Madore wrote:

Point of order; don't cross post replies.

Note that Mozilla is about the only browser which supports the
application/xhtml+xml content-type anyway.

Ahem: Opera.

Mozilla doesn't support incremental rendering of XHTML, a nasty
drawback.
Headless

--
Email and usenet filter list: http://www.headless.dna.ie/usenet.htm

Jul 20 '05 #10

Jukka K. Korpela

da**********@ens.fr (David Madore) wrote:

There are rare cases, however,
where I find an empty <div> or <span> element to be useful and
appropriate.
Let's see your examples:
<div style="clear: both"></div>
You should assign clear: both to the next element. If there is no next
element in the document, no clearing is needed.
Another case is when you want to style an element using the CSS
"content" property:
The content property applies to :before and :after pseudo-elements only,
so you just need to select whether you wish to have the text inserted
before or after some text in the document.
One interesting hack consists of using the CSS "content"
property on an empty <span> element as it seems to be the only way to
include foreign text in an HTML document without embedding it.
Would that really fall within the principle of using CSS for optional
presentational suggestions? It's hardly a good argument in favor of
something that it would be needed for a hack that shouldn't be used. But
even for such a hack, you can simply assign the content property to a
suitable pseudo-element (as you need to do anyway, but the point is that
the pseudo-element can be derived from a real element, as opposite to an
artificial element with empty content).
Similarly, using the Mozilla-invented XBL language it might turn out
to be useful to bind to empty <div> or <span> elements.
A similar case indeed, except that you're referring to a browser-specific
invention, it seems.
Another case is when the <div> or <span> element starts empty, but
receives dynamical content through the Document Object Model, e.g.,
via ECMAscript.
This is the kind of emptiness that potentially makes sense in SGML-based
markup, but whether it makes sense in authoring for the WWW is less clear.
Of course, the DOM might be used to create the <div>
or <span> element itself,
I think you just objected to your own example. If scripting is actually used
to change the document's structure by adding elements, why would you hide
this with making them technically static?
Speaking of which, of course, an empty <div> might be useful if you
want several anchors pointing to the same place in an HTML document.
It isn't very elegant, and I would advise against it in general, but
sometimes it seems to be the right thing to do.

The need still needs to be proven.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #11

David Madore

"Jukka K. Korpela" in litteris
<Xn*****************************@193.229.0.31> scripsit:

Let's see your examples:

da**********@ens.fr (David Madore) wrote:
<div style="clear: both"></div>
You should assign clear: both to the next element. If there is no next
element in the document, no clearing is needed.

Maybe you didn't read my example completely. I'm not using the
"clear" property to clear the next element, but to clear the border of
the surrounding <div>.

Here's an example (except that I didn't have a nice porn picture to
use, sorry): please compare

<URL: http://www.eleves.ens.fr:8080/home/m...st/float1.html >
and
<URL: http://www.eleves.ens.fr:8080/home/m...st/float2.html >

(the first uses an empty <div> as I suggest, and the second puts the
clear property on the next element).

All browsers I have at hand display them differently, and that is also
what I understand from the CSS spec should be done. And evidently
there are cases when the first presentation is wanted, not the second:
in this case I think putting an empty <div> is a perfectly reasonable
solution, and I don't see in what way it would be harmful.

Another case is when you want to style an element using the CSS
"content" property:

The content property applies to :before and :after pseudo-elements only,
so you just need to select whether you wish to have the text inserted
before or after some text in the document.

Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.

One interesting hack consists of using the CSS "content"
property on an empty <span> element as it seems to be the only way to
include foreign text in an HTML document without embedding it.

Would that really fall within the principle of using CSS for optional
presentational suggestions? It's hardly a good argument in favor of
something that it would be needed for a hack that shouldn't be used.

I would very much prefer if the fathers and normalizers of HTML had
foreseen the usefulness of a tag to include plain text (or
inline-level HTML) from a foreign source within HTML (without creating
a block-level element for embedding). But given that this tag doesn't
exist, what else can I do? I agree that it's a hack to use CSS for
that, and most often contrary to the goals and principles of CSS
(though not always: sometimes the inserted text *is* optional and of
presentational nature), but until someone suggests a better
solution...
But
even for such a hack, you can simply assign the content property to a
suitable pseudo-element (as you need to do anyway, but the point is that
the pseudo-element can be derived from a real element, as opposite to an
artificial element with empty content).

See above: if the content is generated, it is not always easy, or even
possible, to get at the previous or next element.

Or it may be simply a matter of elegance. For example, consider this:

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"></span>]</p>

with a CSS rule like

#insert-stylesheet-name-here:before { content: "Foobar"; }

in the "Foobar" stylesheet, and similarly in the others. Now it is
true that I might also write this as

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here">]</span></p>

I just happen to think it is more elegant to use an empty <span> tag,
because it avoids misbalancing the braces.

(Of course, you might then point out that the <span> shouldn't be
empty, it should contain the word "none", and CSS should be used to
avoid displaying that word when a stylesheet is active. Right. We
could continue the byzantine discussion indefinitely in this line.)

Similarly, using the Mozilla-invented XBL language it might turn out
to be useful to bind to empty <div> or <span> elements.

A similar case indeed, except that you're referring to a browser-specific
invention, it seems.

Yes, and so? There's nothing wrong with browser-specific inventions
if they're useful and are employed in a way that gracefully degrades
on other browsers.

Another case is when the <div> or <span> element starts empty, but
receives dynamical content through the Document Object Model, e.g.,
via ECMAscript.

This is the kind of emptiness that potentially makes sense in SGML-based
markup, but whether it makes sense in authoring for the WWW is less clear.

I'm not sure I understand this comment.

Of course, the DOM might be used to create the <div>
or <span> element itself,

I think you just objected to your own example. If scripting is actually used
to change the document's structure by adding elements, why would you hide
this with making them technically static?

It's not a matter of hiding the fact that dynamic content will be
inserted. It's just that if there is a small (and optional) amount of
it, it is much simpler to dump it in an already existent, but empty,
<span> or <div> tag, which is located using getElementById(), than to
create that tag in the first place.

Speaking of which, of course, an empty <div> might be useful if you
want several anchors pointing to the same place in an HTML document.
It isn't very elegant, and I would advise against it in general, but
sometimes it seems to be the right thing to do.

The need still needs to be proven.

Why is the burden of the proof on my shoulders? Suppose you proved
that the need cannot arise?

It seems that in every case I've given (except the first, where I
still see no workaround) you've told me "this isn't absolutely
necessary" and I've answered "yes, but it's convenient". I hope we
can agree on this: that empty <div> or <span> elements are not
necessary, but they are sometimes convenient. Now suppose you told me
what is *wrong* about them?

If there is some kind of dogmatic reason ("Natura abhorret vacuum"?)
for not ever using empty <div> or <span> tags, then I will refrain
from further discussion. My religion doesn't forbid empty <div> or
<span> tags: it just frowns upon their *gratuitous* use, but allows
them when they make things simpler, or more convenient, and when no
other inconvenience results (and I'd like to know what inconvenience
can be caused by an empty tag). In that case, let us just let our
religions be at peace and people can make up their own minds as to what
gospel they will follow. I do not intend to flame or debate endlessly
about what is The Right Thing.

On the other hand, if you have an important practical reason for not
using empty <div> and <span> tags (such as "this-or-that browser will
break to pieces upon encountering them" or "they cause a serious
accessibility problem for people with this-or-that disability"), then
I would certainly like to hear it.

Cheers,

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #12

Alan J. Flavell

On Tue, Sep 2, David Madore inscribed on the eternal scroll:

Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.

We are still free to discuss the quality of the end result, surely, no
matter what technique was used to generate it? If the tools then
prove inadequate to the task, we would have to decide which is more
important - to use the tools at hand, or to produce a quality product.

I have been known to pass the result through a post-filter where I
wasn't satisfied with the output of some tool that I needed to use for
other reasons; and no doubt I'll be doing the same again if/when a
similar situation arises.
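
A post-filter of the kind raised in this thread, turning <div/> into <div></div>, might look roughly like the sketch below. It assumes the input is well-formed serializer output without CDATA sections or comments that contain tag-like text, and it works on the text with a regular expression instead of a real XML parser. The names VOID and expand_empty_elements are made up for illustration; the VOID set lists the elements declared EMPTY in XHTML 1.0 Strict.

```python
import re

# Elements declared EMPTY in XHTML 1.0 Strict; these keep the minimized form.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}

EMPTY_TAG = re.compile(
    r"""<([A-Za-z_][\w:.-]*)                          # element name
        ((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|'[^']*'))*)  # quoted attributes
        \s*/>                                          # end of an empty-element tag
    """,
    re.VERBOSE,
)

def expand_empty_elements(xhtml: str) -> str:
    """Rewrite <name .../> as <name ...></name> for every non-void element."""
    def repl(match):
        name, attrs = match.group(1), match.group(2)
        if name in VOID:
            return match.group(0)  # leave <br/>, <img .../> and friends alone
        return f"<{name}{attrs}></{name}>"
    return EMPTY_TAG.sub(repl, xhtml)

print(expand_empty_elements('<div id="blaah"/>'))  # <div id="blaah"></div>
print(expand_empty_elements('<br/>'))              # <br/>
```

Running the filter over the second half of the original test document would replace the minimized <div id="blaah"/> with a start-tag and end-tag pair, matching the first half.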

Jul 20 '05 #13

David Madore

"Alan J. Flavell" in litteris
<Pi*******************************@lxplus078.cern.ch> scripsit:

We are still free to discuss the quality of the end result, surely, no
matter what technique was used to generate it? If the tools then
prove inadequate to the task, we would have to decide which is more
important - to use the tools at hand, or to produce a quality product.

Certainly. But I still fail to see why having empty <div> or <span>
elements degrades the "quality" of an (X)HTML document, apart from the
dogmatic "you're not supposed to" which in my opinion is certainly not
a sufficient argument to justify going through the pains of
post-processing the document in order to remove these empty tags (and
somehow relocate their style properties).

Cheers,

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #14

Jukka K. Korpela

da**********@ens.fr (David Madore) wrote:

Maybe you didn't read my example completely. I'm not using the
"clear" property to clear the next element, but to clear the border of
the surrounding <div>.
The meaning of the clear property is to stop floating, so I cannot see why
you could not use it the way I suggested. It seems to me that you are
imitating <br clear="..."> in CSS, rather than making full use of CSS
possibilities. I don't see how you would "clear the border"; a border
property affects the element that it is assigned to, and you can assign a
height property to the element if you wish to make it taller than its
content needs.
Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.
You're referring to content generated by server- or client-side scripting
or preprocessing, right? The content generated by the CSS 'content'
property is something different. In any case, the tools you use for
generating content e.g. server-side should be selected to match the needs,
not vice versa.
I would very much prefer if the fathers and normalizers of HTML had
foreseen the usefulness of a tag to include plain text (or
inline-level HTML) from a foreign source within HTML (without creating
a block-level element for embedding).
Well, they did in a sense - but browsers have not implemented the SGML way
of using entities (except in the trivial sense of supporting a predefined
set of entity references that expand to character references).

I agree with the idea that a simple markup system like HTML should have
had a simple include feature. But CSS is _not_ the solution to that. There
are several better approaches, as described in the c.i.w.a.h. FAQ.
(though not always: sometimes the inserted text *is* optional and of
presentational nature)
Then it should be something that accompanies the presentation of some
existing element. Besides, in WWW authoring the whole idea of CSS
generated content is mostly just theoretical, due to lack of support by
the current market leader among browsers.
<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"></span>]</p>
I fail to see what this relates to. Why would a document contain style
sheet names that way?

A similar case indeed, except that you're referring to a
browser-specific invention, it seems.

Yes, and so? There's nothing wrong with browser-specific inventions
if they're useful and are employed in a way that gracefully degrades
on other browsers.

The point is that you make arguments in favor of hacks, on the grounds
that some hacks need them.
It seems that in every case I've given (except the first, where I
still see no workaround) you've told me "this isn't absolutely
necessary" and I've answered "yes, but it's convenient".
I think that for every case, including the first, I have shown that
there is no need for using a <div> or <span> with empty content.
On the other hand, if you have an important practical reason for not
using empty <div> and <span> tags

First, there is no practical need for <div> and <span> elements with empty
content (to use the proper terms).

Second, we have the precedent of <p></p>, which has caused much confusion
- it has been used for layout, and the HTML specification explicitly says
that it should not be used, and that browsers should ignore such elements.
And browsers do not generally do that, so we really have a confusion.

Third, to take a simple example, such elements mess up the document
appearance when a user style sheet is used in order to make all <div>
elements bordered, so that the structure can be seen.

Followups trimmed - I think we are now so far from general XML that this
belongs to the HTML group only.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #15

David Madore

"Jukka K. Korpela" in litteris
<Xn****************************@193.229.0.31> scripsit:

The meaning of the clear property is to stop floating, so I cannot see why
you could not use it the way I suggested. It seems to me that you are
imitating <br clear="..."> in CSS, rather than making full use of CSS
possibilities. I don't see how you would "clear the border"; a border
property affects the element that it is assigned to, and you can assign a
height property to the element if you wish to make it taller than its
content needs.
The clear attribute is deprecated in HTML4 or XHTML. Certainly <br
style="clear: both" /> does the trick, but I fail to see in what way
it is any better than <div style="clear: both"></div>. The HTML4 spec
says that "The BR element forcibly breaks (ends) the current line of
text", and that's not what it's being used for: I'd say that using
<br/> except within a <p> (or somesuch) with text immediately
preceding and following is far worse style than using an empty <div>.

Restate that with an example: I believe that
<URL: http://www.eleves.ens.fr:8080/home/m...st/float1.html >
is better HTML style than
<URL: http://www.eleves.ens.fr:8080/home/m...st/float3.html >
(and the two should be rendered more or less identically).

(Note: I use the style attribute in my examples merely so they can be
written more concisely, but it doesn't have to be so; the class
attribute and an appropriate stylesheet would do just as well, of
course.)
You're referring to content generated by server- or client-side scripting
or preprocessing, right?
Yes, sorry, my wording was confusing.
In any case, the tools you use for
generating content e.g. server-side should be selected to match the needs,
not vice versa.
That's very well in theory, but in practice we have to do with the
tools we have. Unless you were to give a *compelling* reason for not
using empty <div> and <span> elements, which you failed to do.
Altering an entire production chain merely to avoid empty <div> and
<span> elements is hardly a serious suggestion.
Well, they did in a sense - but browsers have not implemented the SGML way
of using entities (except in the trivial sense of supporting a predefined
set of entity references that expand to character references).
Precisely.
I agree with the idea that a simple markup system like HTML should have
had a simple include feature. But CSS is _not_ the solution to that. There
are several better approaches, as described in the c.i.w.a.h. FAQ.
If you refer to question 5.1 in the FAQ (at <URL:
http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
includes are not an option for me for reasons that I won't go into.
I'm willing to hear about any other proposal (or any other bit of the
FAQ that I might have missed).

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"></span>]</p>

I fail to see what this relates to. Why would a document contain style
sheet names that way?

Because we can have alternate stylesheets, and a well-designed
browser, or a little bit of ECMAscript magic, lets the user choose
among them. It might be nice to let the stylesheet name appear
somewhere within the content.

Yes, and so? There's nothing wrong with browser-specific inventions
if they're useful and are employed in a way that gracefully degrades
on other browsers.

The point is that you make arguments in favor of hacks, on the grounds
that some hacks need them.

And so what? Hacks won't go away merely because we call them "hacks".
Sometimes, regretfully, they are needed, because existing tools don't
do the job.

It seems that in every case I've given (except the first, where I
still see no workaround) you've told me "this isn't absolutely
necessary" and I've answered "yes, but it's convenient".

I think that for every case, including the first, I have shown that
there is no need for using a <div> or <span> with empty content.

You have shown that there is no logical necessity to use them, indeed,
but you have not shown that they are not useful or that they are
harmful.

On the other hand, if you have an important practical reason for not
using empty <div> and <span> tags

First, there is no practical need for <div> and <span> elements with empty
content (to use the proper terms).

There is no practical need, but I maintain that there is practical
usefulness.

Second, we have the precedent of <p></p>, which has caused much confusion
- it has been used for layout, and the HTML specification explicitly says
that it should not be used, and that browsers should ignore such elements.
And browsers do not generally do that, so we really have a confusion.

But an empty paragraph is certainly an absurdity: a paragraph, by
definition, cannot be empty. But what is a <div> or a <span>, anyway?
The HTML specification does not enlighten us.

And saying that "it will cause confusion" is handwaving. What
practical problems do you expect should turn up?

Third, to take a simple example, such elements mess up the document
appearance when a user style sheet is used in order to make all <div>
elements bordered, so that the structure can be seen.

That's hardly convincing. First, we'll either get empty borders, or
horizontal rules, or simply nothing at all, and in either case, it
accurately reflects the document's structure (there *is* a <div>
there, and it's empty - knowing whether it should be shown or not is
as pointless as knowing how many angels can stand on the head of a
pin).

Followups trimmed - I think we are now so far from general XML that this
belongs to the HTML group only.

Seems appropriate. I suggest that we drop this discussion, anyway,
since it is getting us nowhere and I think that each of us understands
the other's arguments: do we agree to disagree? Other people can make
their own mind as to whether empty <div> and <span> elements are, or
not, needed / useful / harmful / dangerous.

I would be interested, however, in any suggestions on how to include
foreign inline-level content in HTML without using the CSS content
property or the Mozilla-invented XBL language, given that server-side
includes are ruled out.

Cheers,

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #16

Jukka K. Korpela

da**********@ens.fr (David Madore) wrote:

The clear attribute is deprecated in HTML4 or XHTML.

Certainly. What I referred to was the fact that the use of the clear
property for empty elements means that you are simulating <br clear="...">
in CSS. And I noted that the natural approach is to use the clear property
for content elements, not for elements without content (like <br>, or
<div> with empty content).

In any case, the tools you use for
generating content e.g. server-side should be selected to match the
needs, not vice versa.

That's very well in theory, but in practice we have to do with the
tools we have. Unless you were to give a *compelling* reason for not
using empty <div> and <span> elements, which you failed to do.

You might just as well argue in favor of using <font> because some
generating software produces it, or generates something for which you need
<font>.

Altering an entire production chain merely to avoid empty <div> and
<span> elements is hardly a serious suggestion.

Surely many more production chains rely on <font>.

If you refer to question 5.1 in the FAQ (at <URL:
http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
includes are not an option for me for reasons that I won't go into.

The FAQ describes other approaches too, and actually doesn't favor server-
side includes as much as many other suggestions do.

But I really cannot see why server-side processing is excluded when you
actually refer to content generated server-side, as it seems.

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"></span>]</p>

I fail to see what this relates to. Why would a document contain
style sheet names that way?

Because we can have alternate stylesheets, and a well-designed
browser, or a little bit of ECMAscript magic, lets the user choose
among them.

A well-designed browser lets the user choose between style sheets, but
here you seem to be trying to do something that creates a page-specific
method for the same purpose. This might be useful, in the present
situation, but it's difficult to see how your code would relate to that.
Do you mean that people using CSS-disabled browsers should see
"Stylesheet name (if applicable): ", instead of not seeing such a thing?

And so what? Hacks won't go away merely because we call them "hacks".

If we agree on the observation that <div></div> is a hack, we have reached
an obvious conclusion after quite some discussion.

But an empty paragraph is certainly an absurdity: a paragraph, by
definition, cannot be empty. But what is a <div> or a <span>, anyway?

A <div> or <span> with empty content is the same as <p></p>, just without
the paragraph semantics...

The HTML specification does not enlighten us.

... and no _explicit_ statement against them in the specs. But if <p></p>
is not recommended and should be ignored by user agents, doesn't the same
apply to <div></div> and <span></span> a fortiori?

I would be interested, however, in any suggestions on how to include
foreign inline-level content in HTML without using the CSS content
property or the Mozilla-invented XBL language, given that server-side
includes are ruled out.

Just write it. And here you have genuine use for <span> or <div>, since
the lang attribute needs some markup element to which it can be attached.
If your document should have some content and should not have it, I'm
afraid you need to explain a bit.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #17

David Madore

"Jukka K. Korpela" in litteris
<Xn*****************************@193.229.0.31> scripsit:

Certainly. What I referred to was the fact that the use of the clear
property for empty elements means that you are simulating <br clear="...">
in CSS. And I noted that the natural approach is to use the clear property
for content elements, not for elements without content (like <br>, or
<div> with empty content).

Could you stop beating about the bush and make explicit the meaning of
"use the clear property for content elements, not for elements without
content"? Can you write down explicitly the markup that you would
use? I give my very simple example again: the basic page (written
with an empty <div> element) is <URL:
http://www.eleves.ens.fr:8080/home/m...st/float1.html >. At
first I understood that you meant me to write <URL:
http://www.eleves.ens.fr:8080/home/m...st/float2.html > instead,
which puts the clear property on the following <p>, and I underlined
that it does not render in the same way (so it is not acceptable).
Then I understood that you meant me to write as in <URL:
http://www.eleves.ens.fr:8080/home/m...st/float3.html >, which
uses <br/> instead. And now you say that I shouldn't be using clear
for content elements.

So, please put an end to the confusion, and rewrite <URL:
http://www.eleves.ens.fr:8080/home/m...st/float1.html > to
produce the same presentation effect, in the way you think it should
be done. This should take you five seconds and put an end to this
silly dialogue of the deaf.

If you refer to question 5.1 in the FAQ (at <URL:
http://www.faqs.org/faqs/www/authoring-faq/ >), well, server-side
includes are not an option for me for reasons that I won't go into.

The FAQ describes other approaches too, and actually doesn't favor server-
side includes as much as many other suggestions do.

Would you mind pointing to a specific place within the FAQ?

But I really cannot see why server-side processing is excluded when you
actually refer to content generated server-side, as it seems.

Yes, but the content that needs to be included is not available to the
server that does the processing, strange as it may seem. The main
HTML content is processed on one computer, is served from another
computer, and further inline content (which is dynamic, whereas the
rest is mostly static) should be inserted that is served from a third
computer. Details are unimportant, but the bottom line is that the
main content cannot use server-side includes.

And so what? Hacks won't go away merely because we call them "hacks".

If we agree on the observation that <div></div> is a hack, we have reached
an obvious conclusion after quite some discussion.

No, I do not agree with this, and I wish you weren't so condescending.
But I think that we've both given ample arguments by now and it is
useless to continue discussing along this line. I'd just like to know
how you propose to replace the clear property on the empty <div>
(example above), and if you have any totally novel suggestion for
including inline-level HTML content from an HTML document.

A <div> or <span> with empty content is the same as <p></p>, just without
the paragraph semantics...

Precisely, and it is the paragraph semantics which pose problem,
because a paragraph should not be empty, whereas I see no *a priori*
reason why an abstract container element should not be empty. But as
I just said, I drop this line of discussion.

--
David A. Madore
(da**********@ens.fr,
http://www.eleves.ens.fr:8080/home/madore/ )

Jul 20 '05 #18

Jukka K. Korpela

da**********@ens.fr (David Madore) wrote:

Could you stop beating about the bush and make explicit the meaning of
"use the clear property for content elements, not for elements without
content"?

The formulation is very explicit. An element without content is an
element that has no content.

Can you write down explicitly the markup that you would use?

I would use no extra markup, except a class attribute when needed.

So, please put an end to the confusion, and rewrite <URL:
http://www.eleves.ens.fr:8080/home/m...st/float1.html > to
produce the same presentation effect

What "same presentation effect"? It renders essentially differently e.g.
on IE 6 and Mozilla 1.3.

If you wish to get consultation help from me with your specific problems,
you need to be prepared for discussions concerning what you really want,
disclosing the real life case, and negotiating on the fee beforehand.

Would you mind pointing to a specific place within the FAQ?

You had already found the right place. Now read it - it does _not_ present
SSI as the only answer to "How do I include a file".

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #19

Mikko Ohtamaa

Johannes Koch <ko**@w3development.de> wrote in message news:<bi************@ID-61067.news.uni-berlin.de>...

Mikko Ohtamaa wrote:
I am using MSXML (Microsoft XML engine) to transform XML data to XHTML
reports.

Why do you want to create _X_HTML reports, when several browsers don't
know about _X_HTML. Produce HTML instead.

Yes, we fell back to HTML. The original reason for using XHTML was
character encoding difficulties with MSXML and HTML, but we managed to
work around them another way.

In XSLT it is too heavy to check if each element will be empty and
implement a wrapper for it.

<xsl:template match="foo">
<xsl:if test="normalize-space(.) != ''">
<div class="{local-name()}">
<xsl:value-of select="."/>
</div>
</xsl:if>
</xsl:template>

Is this really too heavy?

The empty <div/> problem was not in XSLT itself. When MSXML transforms
XML data into an XHTML document, the target document resides in memory as
an MSXML DOM tree. The DOM tree has no information about start tags
and end tags. When the DOM tree is serialized to XML, all empty elements
are written as self-closing tags. So the output is the same whether you
use <div></div> or <div/> in the XSLT stylesheets.
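As an illustration of the kind of external postprocessor mentioned at the start of the thread, here is a minimal Python sketch that rewrites minimized tags in the serialized output. It is not what was used in this project; the function name `expand_empty_elements` and the regular expression are illustrative assumptions, and the pattern assumes quoted attribute values and does not treat comments or CDATA sections specially.

```python
import re

# Elements whose content model is EMPTY in XHTML 1.0; these may stay minimized.
VOID_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}

# Matches an empty-element tag such as <div id="x"/>.
# Assumption: attribute values are quoted and contain no '<' character.
EMPTY_TAG = re.compile(
    r"""<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:"[^"<]*"|'[^'<]*'))*)\s*/>"""
)


def expand_empty_elements(xhtml: str) -> str:
    """Rewrite <foo/> as <foo></foo> unless foo is declared EMPTY."""

    def repl(match: re.Match) -> str:
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            # Keep the minimized form, with the space old browsers expect.
            return f"<{name}{attrs} />"
        return f"<{name}{attrs}></{name}>"

    return EMPTY_TAG.sub(repl, xhtml)


if __name__ == "__main__":
    print(expand_empty_elements('<div id="blaah"/><br/>'))
```

A regular expression is only a stopgap here; a serializer that knows which elements are declared EMPTY would be more robust if one is available.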

(Empty <div/> elements were produced by the XSLT because missing fields
are accepted in the input XML.)

Also, the resulting XHTML is stored in a file instead of being served
directly from a web server with MIME type support. Even if there are
<?xml...?> and <!doctype...> declarations, browsers fail to identify the
file contents as XHTML.

-Mikko

Jul 20 '05 #20

Ernest Cline

"Headless" <me@privacy.net> wrote:

David Madore wrote:

Point of order; don't cross post replies.
Note that Mozilla is about the only browser which supports the
application/xhtml+xml content-type anyway.

Ahem: Opera.

Opera is nice but it has what I consider to be a stupid design decision on
its part concerning application/xhtml+xml. It fails to handle entities such
as &eacute; correctly. It is true that the XML specs say that a user agent
can do that for XML in general and if the document were being served as
application/xml, I would agree with their decision, but since Opera
indicates that it supports xhtml+xml then in my opinion if it gets back a
document served as xhtml+xml it should parse the entities. Since they have
to support the entities for HTML anyway, I fail to see why they are being so
obstinate.

Jul 20 '05 #21

Henri Sivonen

In article <9L*****************@newsread2.news.atl.earthlink. net>,
"Ernest Cline" <er*********@mindspring.communism> wrote:

Opera is nice but it has what I consider to be a stupid design decision on
its part concerning application/xhtml+xml. It fails to handle entities such
as &eacute; correctly. It is true that the XML specs say that a user agent
can do that for XML in general and if the document were being served as
application/xml, I would agree with their decision, but since Opera
indicates that it supports xhtml+xml then in my opinion if it gets back a
document served as xhtml+xml it should parse the entities. Since they have
to support the entities for HTML anyway, I fail to see why they are being so
obstinate.

As I understand it, the main reason why the XML specification allows
non-validating XML processors not to process external entities is to
accommodate browsers. Therefore, it would be silly for browsers not to
use this opportunity for optimizing performance.

The XHTML DTDs are huge. It doesn't make sense to parse them (even from
a local catalog) in an interactive application only to get some
character entities.

By the way, Safari doesn't support the character entities, either.
Mozilla cheats and uses an abridged DTD. What Mozilla does is quite icky.

Entity support in HTML has nothing to do with this, since HTML gets a
tag soup treatment.

--
Henri Sivonen
hs******@iki.fi
http://www.iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Jul 20 '05 #22

Joel Shepherd

Henri Sivonen wrote:

In article <9L*****************@newsread2.news.atl.earthlink. net>,
"Ernest Cline" <er*********@mindspring.communism> wrote:
[Opera] fails to handle entities such as &eacute; correctly.
As I understand it, the main reason why the XML specification
allows non-validating XML processors not to process external
entities is to accommodate browsers. Therefore, it would be silly
for browsers not to use this opportunity for optimizing
performance.

Optimizing performance sounds like a very weak rationale for not
rendering characters correctly. I could make my own software much
faster as well, if I could chop out some of the basic functional
requirements. That wouldn't make it *better* though.

The XHTML DTDs are huge. It doesn't make sense to parse them (even
from a local catalog) in an interactive application only to get
some character entities.

How many XHTML DTDs would a browser need to know about? Why would it
not make sense to cache the *parsed* version of each, thereby
protecting performance and enabling the browser to render entities
correctly and efficiently?

--
Joel.

Jul 20 '05 #23

Henri Sivonen

In article <M7****************@newsread4.news.pas.earthlink.n et>,
Joel Shepherd <jo******@ix.netcom.com> wrote:

Henri Sivonen wrote:
In article <9L*****************@newsread2.news.atl.earthlink. net>,
"Ernest Cline" <er*********@mindspring.communism> wrote:
[Opera] fails to handle entities such as &eacute; correctly.
As I understand it, the main reason why the XML specification
allows non-validating XML processors not to process external
entities is to accommodate browsers. Therefore, it would be silly
for browsers not to use this opportunity for optimizing
performance.

Optimizing performance sounds like a very weak rationale for not
rendering characters correctly.

There are already two other ways of representing characters correctly.

Getting a *third* way for representing characters when the two other
ways can represent all of Unicode and this third way can represent only
a small subset sounds like a very weak rationale for requiring user
agents to parse additional fluff every time a document is parsed.

I could make my own software much
faster as well, if I could chop out some of the basic functional
requirements. That wouldn't make it *better* though.

The two other ways of representing all the characters that are allowed
in XML are
1) using an encoding that can represent all of Unicode (UTF-*)
and
2) using numeric character references (&#1234;).
I consider support for UTF-8 a basic functional requirement for software
that is used for authoring XML documents. Do you?
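To make the three representations concrete, here is a small Python sketch using the standard library's expat-based parser, which, like the browsers discussed here, does not read an external DTD by default. The helper name `parse_text` is an illustrative assumption. The literal UTF-8 character and the numeric character reference both parse; the named entity is rejected as undefined because no DTD declares it.

```python
import xml.etree.ElementTree as ET


def parse_text(doc: bytes) -> str:
    """Parse a tiny XML document and return the text of its root element."""
    return ET.fromstring(doc).text


# 1) The character itself, in an encoding that covers Unicode (UTF-8 here).
literal = parse_text("<p>\u00e9</p>".encode("utf-8"))

# 2) A numeric character reference; no DTD is needed to resolve it.
numeric = parse_text(b"<p>&#233;</p>")

# 3) A named entity; without a DTD declaring it, the parser reports an error.
try:
    parse_text(b"<p>&eacute;</p>")
    named_ok = True
except ET.ParseError:
    named_ok = False

if __name__ == "__main__":
    print(literal == numeric, named_ok)
```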

The XHTML DTDs are huge. It doesn't make sense to parse them (even
from a local catalog) in an interactive application only to get
some character entities.

How many XHTML DTDs would a browser need to know about?

To meet what requirement? Modularization of XHTML makes it possible for
anyone to concoct a new language variant that is in the "XHTML family"
of languages. Also, anyone can take an existing W3C XHTML DTD and refer
to it using a local system id without the public id.

Why would it not make sense to cache the *parsed* version of each,
thereby protecting performance and enabling the browser to render
entities correctly and efficiently?

Not really. Grammar caching is hard because declarations in the internal
DTD subset can substantially affect the external DTD subset. It would be
possible to optimize the common cases somewhat, though. The browser
could cache the data structures built when parsing a couple of W3C XHTML
DTDs and use the cached versions if the internal DTD subset is empty.
However, inflicting this kind of complexity on user agents just in order
to get a third way of representing some characters doesn't make sense.
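For readers who want to see what the common-case optimization described above might look like, here is a rough Python sketch. Everything in it is an assumption made for illustration: the `DOCTYPE` pattern is deliberately simplistic, `load_entities` stands in for a real DTD parse by copying Python's built-in HTML entity table, and `parse_calls` exists only to show how often the "parse" runs. The cached grammar is used only when the internal DTD subset is empty.

```python
import re
from functools import lru_cache
from html.entities import name2codepoint

# Simplistic DOCTYPE matcher: public id, system id, optional internal subset.
DOCTYPE = re.compile(
    r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"([^"]*)"\s+"[^"]*"\s*(\[(.*?)\])?\s*>',
    re.S,
)

parse_calls = []  # records each simulated DTD parse, for demonstration only


@lru_cache(maxsize=None)
def load_entities(public_id: str) -> dict:
    """Stand-in for parsing a DTD; runs once per public identifier."""
    parse_calls.append(public_id)
    return dict(name2codepoint)


def entities_for(document: str):
    """Return a cached entity table, or None when caching is not safe."""
    match = DOCTYPE.search(document)
    if match is None:
        return None
    internal_subset = match.group(3)
    if internal_subset and internal_subset.strip():
        # Internal declarations can alter the external subset: do not reuse.
        return None
    return load_entities(match.group(1))
```

Even this toy version has to special-case the internal subset, which gives some sense of the complexity being objected to.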

--
Henri Sivonen
hs******@iki.fi
http://www.iki.fi/hsivonen/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Jul 20 '05 #24
