Patient Guy <Pa*********@nowhere.to.be.found.com> wrote:
> In my reading of the Strict and Transitional DTD for HTML 4.0, the
> table row (TR) elements are contained within table section elements:
> THEAD, TFOOT, and TBODY.
That is correct. The same is true for HTML 4.01, which is better reading
these days than HTML 4.0 (though the differences are small and don't matter
here).
> The table section elements are the defined contents of the TABLE
> element. The TR element is not defined as an "immediate" or "direct"
> contained element of TABLE.
Correct. Things may _look_ different when you look at the source code of a
valid HTML 4.01 document, but that's because you don't see the omissible
start and end tags if they have been omitted.
In XHTML things change, since it does not allow any start or end tag
omission. The authors of XHTML 1.0 did not want to disallow the old practice
of writing TR elements inside a TABLE element without intervening markup, so
they made up an ad hoc rule, allowing TR as a direct child of TABLE
(well, tr as a direct child of table, to speak XML):
<!ELEMENT table
(caption?, (col*|colgroup*), thead?, tfoot?, (tbody+|tr+))>
> Given that
> i. the use of the table section elements is optional,
Not really. The elements THEAD and TFOOT are optional, but a TBODY element
is a required part of a TABLE element's content. The _tags_ <tbody> and
</tbody> are optional, though.
> but that
> ii. TBODY is implicit when no table section elements are specified with
> an HTML table,
Not really. The element's start and end tags are implicit.
> what does an HTML parsing agent (browser) do when tree-building from a
> TABLE node and it encounters a TR element without having encountered
> any table section element?
Technically, the HTML specifications do not require an HTML parsing agent
to build any tree. For practical reasons, though, some tree structure is
needed.
The appropriate way to handle a <tr> tag encountered inside a TABLE element
but without having seen any <thead>, <tfoot>, or <tbody> tag is to imply a
<tbody> tag. Consequently, the tree should have a TBODY element as a
subelement of TABLE and with the TR element(s) as its subelements. Whether
browsers actually do this might not be directly visible.
> Does it:
> 1. append the TR element node to the TABLE node?
That would be inappropriate for HTML 4.0 or HTML 4.01 (though correct for
XHTML 1.0).
> OR
> 2. examine for the presence of a table section node, and failing to
> find one instantiated, perform the following steps in this order:
> a. instantiate a TBODY element node
> b. append it to the TABLE node
> c. create the TR element node
> d. append the TR node to the TBODY node
> e. continue reading the HTML and develop the table according to the
> specification?
Well, yes, that would be a different (and somewhat more complicated) way of
putting what I described, assuming that "appending" means creating something
that corresponds to a subelement relationship.
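For illustration, here is a minimal Python sketch of that variant, built on
the standard html.parser module. The names Node and TableTreeBuilder are made
up for this example. It assumes that the </td> and </tr> end tags are written
out, and it ignores text content and attributes. A real HTML parser has to
infer far more than this, so treat it as a sketch of the idea only.

```python
from html.parser import HTMLParser

# Elements with no content and no end tag (a small, incomplete sample).
EMPTY = {"col", "br", "hr", "img", "input"}


class Node:
    """A bare-bones element node (hypothetical helper for this sketch)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def outline(self):
        """Return the subtree as nested names, e.g. 'table(tbody(tr(td)))'."""
        if not self.children:
            return self.name
        inner = ", ".join(child.outline() for child in self.children)
        return f"{self.name}({inner})"


class TableTreeBuilder(HTMLParser):
    """Builds an element tree, implying <tbody> when its start tag is omitted."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # A <tr> directly inside TABLE means the <tbody> start tag was
        # omitted: create the TBODY node first (steps a and b).
        if tag == "tr" and self.current.name == "table":
            self.current = Node("tbody", self.current)
        # Create the element and append it to its parent (steps c and d).
        node = Node(tag, self.current)
        if tag not in EMPTY:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name. Elements passed
        # on the way (such as an implied TBODY at </table>) are closed too.
        node = self.current
        while node is not self.root and node.name != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


if __name__ == "__main__":
    builder = TableTreeBuilder()
    builder.feed("<table><tr><td>a</td></tr><tr><td>b</td></tr></table>")
    builder.close()
    print(builder.root.outline())
    # prints: #document(table(tbody(tr(td), tr(td))))
```

With this sketch, markup that spells out <tbody> and </tbody> produces the
same tree as markup that omits them, which is the whole point of the tags
being optional while the element is not.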
It's really a special case of handling omissible tags. Consider the
following fairly minimal HTML 4.01 document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<title>demo</title>
<p>Hello world.</p>
Bye world.
An HTML parser needs to infer quite a few start and end tags or - to put
it another way - to recognize elements even though some start and end
tags have been omitted. For example, upon seeing the tag <p>, the parser
needs to conclude that the HEAD element that had been started (though its
start tag had to be inferred) must be considered as closed (in a sense,
</head> is implied first) and that the BODY element has been started (in a
sense, <body> is implied next). The logical document tree contains the
HTML element as the root element and the HEAD and BODY elements as its
subelements (even though all three of these elements have their start and
end tags omitted), and the P element is a subelement of BODY.
This means, for example, that if there is a style sheet rule that applies
to the BODY element, it affects the text "Bye world.", which is directly
inside the BODY element, and it may affect indirectly, via inheritance, the
text inside the P element. Some browsers used to get this wrong - e.g.,
they did not apply such a rule when the start tag <body> was omitted - so
probably they were not that good at building document trees correctly.
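The inference described for the demo document can be sketched in Python
along the following lines, again using the standard html.parser module. The
class name ImpliedTagParser, the HEAD_ONLY set, and the "placed" list are
inventions for this example. The sketch assumes that end tags other than
</html>, </head>, and </body> are written out, and it leaves SCRIPT and
OBJECT (allowed in both HEAD and BODY) out of consideration.

```python
from html.parser import HTMLParser

# Elements that may occur only in HEAD (simplified for this sketch).
HEAD_ONLY = {"title", "base", "meta", "link", "style"}

DEMO = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<title>demo</title>
<p>Hello world.</p>
Bye world.
"""


class ImpliedTagParser(HTMLParser):
    """Records where each tag and text run ends up in the logical tree."""

    def __init__(self):
        super().__init__()
        self.stack = []    # names of open elements, outermost first
        self.placed = []   # (path of the parent element, item) pairs

    def _enter_section(self, section):
        """Make sure we are inside html/head or html/body."""
        if not self.stack:
            self.stack = ["html"]            # <html> is implied
        if self.stack[1:2] == ["head"] and section == "body":
            self.stack = ["html"]            # </head> is implied
        if len(self.stack) == 1:
            self.stack.append(section)       # <head> or <body> is implied

    def handle_starttag(self, tag, attrs):
        if tag == "html":                    # explicit <html>: just open it
            self.stack = self.stack or ["html"]
            return
        if tag in ("head", "body"):          # explicit section start tag
            self._enter_section(tag)
            return
        in_body = self.stack[1:2] == ["body"]
        wants_head = tag in HEAD_ONLY and not in_body
        self._enter_section("head" if wants_head else "body")
        self.placed.append(("/".join(self.stack), f"<{tag}>"))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open element with this name, if any. The
        # end tags of html, head, and body are simply ignored here.
        if tag in self.stack[2:]:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return                           # white space implies nothing
        if len(self.stack) < 3:              # not inside title, p, etc.
            self._enter_section("body")      # so the text belongs to BODY
        self.placed.append(("/".join(self.stack), text))


if __name__ == "__main__":
    parser = ImpliedTagParser()
    parser.feed(DEMO)
    parser.close()                           # flushes the trailing text
    for path, item in parser.placed:
        print(f"{path}: {item}")
```

Run on the demo document, this places <title> under html/head, <p> under
html/body, and the text "Bye world." directly under html/body, in line with
the tree described above.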
***
You pointlessly crossposted to _three_ groups. The JavaScript group was
_certainly_ wrong for this question. I have restricted followups to the
group that is most specifically devoted to the topic that your question
belongs to.
--
Yucca,
http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring:
http://www.cs.tut.fi/~jkorpela/www.html