Root element specified by DTD ?

What specifies the permitted root element(s) for a document ? HTML,
SGML, XHTML or XML ?
Valid HTML documents need to have a well-known DTD and a doctypedecl in
each document like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">

The document's root element is "HTML", and is specified by the
doctypedecl. For HTML and XHTML it's possible that the prose of their
recommendation restricts it too.
My question is, is there any way to author a non-HTML DTD (SGML or XML)
so as to restrict valid documents to only allow a certain subset of
their elements to be used as the root element? Can this restriction be
expressed _entirely_ within a DTD? Is this used within the HTML DTDs ?
(i.e. not just in the doctypedecl)
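
To make the question concrete: a DTD declares element types, but the name in the doctypedecl is what picks the root. The sketch below is only an illustration of that split. The toy DTD, the file name "toy.dtd", and the helper names are all made up for the example, and the regular expression is a rough scan for element declarations, not a DTD parser.

```python
import re

# Hypothetical toy DTD, not taken from any real specification.
TOY_DTD = """
<!ELEMENT report (section+)>
<!ELEMENT section (para+)>
<!ELEMENT para (#PCDATA)>
"""


def declared_elements(dtd_text):
    """Return the element type names declared with <!ELEMENT ...>."""
    return re.findall(r"<!ELEMENT\s+([A-Za-z_][\w.-]*)", dtd_text)


def doctype_for(root, system_id="toy.dtd"):
    """Build a doctype declaration that names `root` as the document element."""
    return f'<!DOCTYPE {root} SYSTEM "{system_id}">'


# Every declared element type can be named as the root by some document;
# nothing in the element declarations themselves singles one out.
for name in declared_elements(TOY_DTD):
    print(doctype_for(name))
```

Each printed declaration could head a document whose root is that element, which is why a rule such as "only report may be the root" has to live somewhere other than the element declarations.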

Is this fragment a valid HTML document ? If not, why isn't it? Just
which part of its definition is forbidding this fragmentary use?
<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<div>
<p>Foo</p>
</div>
Good tutorial refs on DTDs are also welcome. I don't know anything like
enough on DTD innards.

Thanks

Jun 2 '06
Jukka K. Korpela wrote:
In future, please quote or paraphrase the message that you are
commenting on.
I usually do. Apologies.
It depends on. There's no law that requires additional rules
Granted. It's rare that there aren't any, in my experience, unless the
document type is pure structure.

Think of
these as "higher-level syntax checking"; the application is always
going to impose semantic constraints as well.


What's "higher-level" here?


Higher than the basic XML syntax.
Anyway, in the issue discussed in this
thread, it is the additional _syntactic_ constraints that imply that a
certain kind of document is not an HTML document.
That's what I was agreeing with, though apparently I may have phrased it
badly. The DTD is not always a completely constrained specification of
"a kind of document". That flexibility may in fact have been deliberate;
I strongly suspect the intent was that a single DTD could describe
several documents which share related structures.
Whether HTML specifications make such a
requirement is debatable; the prose in the specs is a mixture of
normative-looking prose, comments, hints, wishful thinking, etc.)


http://www.w3.org/TR/1999/REC-html40...bal.html#h-7.1

The complicating factor here is the use of the word "should". The HTML4
spec predates the W3C's adoption of the normative use of MAY, SHOULD,
and MUST to mean "optional", "don't violate this without extremely good
reason", and "required by the spec" respectively. So we need to
crosscheck that.

XHTML 1.0 does follow that convention, so we can backhandedly check the
intent by looking at that spec. There, a Strictly Conforming Document
must (!) have html as its root element, and this is *NOT* flagged as one
of the differences from HTML4 either in this spec or in the
compatibility guidelines (http://www.w3.org/TR/xhtml1/#guidelines). This
strongly suggests that the W3C intended that HTML4 docs follow this rule.

I agree, that's a less than ideal way to answer this question, but I can
tell you that even folks working on the W3C's specs often have to resort
to that kind of pointer chasing to nail things down.

If you need a fully official answer... I haven't checked; are any of us
members of the (X)HTML Working Group? If not, I'd suggest dropping a
quick note to ww******@w3.org and suggesting that it might be good to
have an erratum which clarifies whether this "should" was intended to be
"must" or not. (I checked; there isn't one.)

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 3 '06 #21
VK <sc**********@y ahoo.com> scripsit:
Alan J. Flavell wrote:
I'm saying that - no matter which specific HTML DTD were to be called
out from the above DOCTYPE - the result could be an HTML fragment,
but it would be unreasonable to claim it as an "HTML document".
You have no choice but claim it as "HTML document".


Surely there's the option of being silent? And, in fact, saying that it is
not an HTML document.
It is served with
"Content-Type: text/html",
So what? Serving it as image/gif would not make it a GIF image. The Internet
media type would be incorrectly declared. A Content-Type declaration does
not magically _make_ the data conform to the specification of a specific
media type.
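
A small sketch of that point, assuming nothing beyond the well-known GIF signatures ("GIF87a" and "GIF89a" at the start of the data). The function names are invented for the illustration, and only image/gif is examined.

```python
def looks_like_gif(data: bytes) -> bool:
    """True if the payload starts with one of the two GIF signatures."""
    return data.startswith((b"GIF87a", b"GIF89a"))


def gif_label_is_correct(content_type: str, data: bytes) -> bool:
    """Check a declared image/gif label against the bytes actually sent.

    The label is only a claim about the payload; the payload decides
    whether the claim is true.
    """
    if content_type != "image/gif":
        raise ValueError("this sketch only examines image/gif labels")
    return looks_like_gif(data)


html_bytes = b'<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN">'
print(gif_label_is_correct("image/gif", html_bytes))
print(gif_label_is_correct("image/gif", b"GIF89a\x01\x00\x01\x00"))
```

The same reasoning applies in the other direction: labelling the div fragment as text/html says what the sender claims, not what the data conforms to.
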
for local files it is
served as the same type by association .html,.htm... --> text/html.
That's a rule that you just made up. Besides, nobody said the filename
suffix is .html or .htm. For all that you can know, it can be .gif or .foo.
So before any DTD you /have/ to explicitly declare what document you
are serving


Nope. Nobody forces you to serve a document on the Internet, or using HTTP
in particular.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Jun 3 '06 #22
Joe Kesselman <ke************ @comcast.net> scripsit:
http://www.w3.org/TR/1999/REC-html40...bal.html#h-7.1

The complicating factor here is the use of the word "should".
I don't see any "should" in the statement "An HTML 4 document is composed of
three parts:...", which explicitly mentions the <head> element, which by the
DTD must be nonempty. (The <head> and </head> tags are omissible, but the
<title> element is not.)
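
As a rough illustration of "tags omissible, element not", here is a sketch that only looks for a title start tag, using the standard library's html.parser. That module is a tokenizer, not a validator and not an SGML parser, so it does not infer the omitted <head> tags; the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Record whether a <title> start tag appears anywhere in the input."""

    def __init__(self):
        super().__init__()
        self.has_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "title":
            self.has_title = True


def has_title(source: str) -> bool:
    finder = TitleFinder()
    finder.feed(source)
    finder.close()
    return finder.has_title


# <html>, <head> and <body> tags omitted, but the title element is present.
with_title = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
    "<title>Foo</title><p>Foo</p>"
)
# The fragment discussed in this thread: no title at all.
fragment = (
    '<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
    "<div><p>Foo</p></div>"
)

print(has_title(with_title), has_title(fragment))
```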

Besides, reading a bit further, under 7.2 you find
"HTML 4.01 specifies three DTDs, so authors must include one of the
following document type declarations in their documents."
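
A sketch of what checking that sentence could look like. The three public identifiers are the ones for the HTML 4.01 Strict, Transitional and Frameset DTDs. Everything else here is an assumption made for the illustration: the helper name, the regular expression, and the decision to ignore the system identifier (the URL).

```python
import re

# Public identifiers of the three HTML 4.01 DTDs.
HTML401_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN",
    "-//W3C//DTD HTML 4.01 Transitional//EN",
    "-//W3C//DTD HTML 4.01 Frameset//EN",
}

# Rough pattern for '<!DOCTYPE name PUBLIC "public id"'; not a full parser.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+(\S+)\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def uses_listed_doctype(source: str) -> bool:
    """True if the document opens with a declaration that names HTML as
    the document type and uses one of the three public identifiers."""
    match = DOCTYPE_RE.match(source.lstrip())
    if match is None:
        return False
    name, public_id = match.groups()
    return name.lower() == "html" and public_id in HTML401_PUBLIC_IDS


listed = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
          '"http://www.w3.org/TR/html4/strict.dtd">')
div_doctype = ('<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
               '"http://www.w3.org/TR/html4/strict.dtd">')

print(uses_listed_doctype(listed), uses_listed_doctype(div_doctype))
```

The div declaration refers to one of the three DTDs but is not one of the listed declarations, because the name after DOCTYPE differs.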

Regarding the more abstract and more vague question what is an "HTML
document" in general, surely any reasonable definition would require
syntactic conformance to _some_ published specification (though not
necessarily one that uses a DTD, for example). The issue was a document that
contains a DOCTYPE declaration referring to an HTML 4.01 DTD, so what HTML
specification could it possibly comply with?
If you need a fully official answer... I haven't checked; are any of
us members of the (X)HTML Working Group? If not, I'd suggest dropping
a quick note to ww******@w3.org and suggesting that it might be good
to have an erratum which clarifies whether this "should" was intended
to be "must" or not. (I checked; there isn't one.)


They are clearly not interested in doing such things. Look at the errata:
http://www.w3.org/MarkUp/html4-updates/errata
(The absence of any additions since May 2001 does not mean that no errors
have been reported.)
HTML 4.01 is closed for all practical purposes, with all the flaws,
ambiguities, and vagueness.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Jun 3 '06 #23
Jukka K. Korpela wrote:
Peter Flynn <pe********@m.s ilmaril.ie> scripsit:
Is this fragment a valid HTML document ?
Yes, perfectly.


No, it is a valid SGML document, but it is not an HTML document, as
defined in HTML specifications.


Yes, if you need to reference the HTML Spec in addition to the DTD.
That would indicate the validity, but the HTML 4.01 specification
requires that one of three specific DOCTYPE declarations be used - not
just that one of three DTDs be used.
That's why it is unenforceable by a standard parser. Only browsers
implement this requirement, and they are not conforming SGML
applications.
And this isn't one of them.
Moreover, the specification explicitly says:
"After document type declaration, the remainder of an HTML document is
contained by the HTML element."
http://www.w3.org/TR/REC-html40/stru...bal.html#h-7.3


I'm not clear why you were asking this question if you already knew
the answer.

///Peter

Jun 3 '06 #24
Peter Flynn <pe********@m.s ilmaril.ie> scripsit:
Jukka K. Korpela wrote:
Peter Flynn <pe********@m.s ilmaril.ie> scripsit:
Is this fragment a valid HTML document ?

Yes, perfectly.
No, it is a valid SGML document, but it is not an HTML document, as
defined in HTML specifications.


Yes, if you need to reference the HTML Spec in addition to the DTD.


I'm not sure I see what you are saying "Yes" to and what the if statement
relates to. Surely what is or is not an HTML document is to be defined in
HTML specifications, not in a DTD.
That would indicate the validity, but the HTML 4.01 specification
requires that one of three specific DOCTYPE declarations be used -
not just that one of three DTDs be used.


That's why it is unenforceable by a standard parser.


Yes, but the question was not whether something can be enforced.
Only browsers
implement this requirement, and they are not conforming SGML
applications.
They surely aren't, but they don't implement the requirement. They simply
started using the presence and exact form of a DOCTYPE declaration to decide
on the "quirks" vs. "standard" mode. They don't reject a document on the
grounds that it lacks a correct DOCTYPE; they simply process it differently.
(OK, you might say that "quirks" mode intentionally deviates from the
standards, but this is really just a difference in degree - the "standards"
mode isn't standard-conforming either. Besides, "quirks" mode largely means
intentionally broken CSS implementation rather than intentionally broken
HTML implementation. )
I'm not clear why you were asking this question if you already knew
the answer.


I wasn't. It wasn't me who asked the original question. I'm just commenting
on the answers.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Jun 3 '06 #25
Chris Morris wrote:
Lachlan Hunt <sp***********@ gmail.com> writes:
Andy Dingley <di*****@codesm iths.com> wrote:
Is this fragment a valid HTML document ?...
<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<div>
<p>Foo</p>
</div>

Yes, it's valid. The validator would have told you that.


It's valid, but is it a valid *HTML* document?


There is a difference between validity and conformance. It is a valid
document, though it is not a conforming HTML document.
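
The distinction can be sketched as two separate checks. This is only an approximation under stated assumptions: the fragment is parsed as XML with xml.dom.minidom (HTML 4.01 is really an SGML application), the external DTD is not read, and the first check covers only the rule that the doctype name must match the root element, not full DTD validation. The function names are invented for the example.

```python
from xml.dom import minidom

# The fragment from this thread, treated as XML for the sake of the sketch.
FRAGMENT = (
    '<!DOCTYPE div PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<div>\n<p>Foo</p>\n</div>\n"
)


def root_matches_doctype(source: str) -> bool:
    """Parser-level check: the name in the doctype declaration is the
    name of the root element."""
    doc = minidom.parseString(source)
    return (doc.doctype is not None
            and doc.doctype.name == doc.documentElement.tagName)


def root_is_html(source: str) -> bool:
    """Prose-level check from the HTML specification: after the doctype,
    the rest of the document is contained by the HTML element."""
    doc = minidom.parseString(source)
    return doc.documentElement.tagName.lower() == "html"


print("root matches doctype:", root_matches_doctype(FRAGMENT))
print("root is html:", root_is_html(FRAGMENT))
```

The first check is the kind of thing a parser such as nsgmls can enforce from the declaration alone; the second comes from the specification's prose and needs a separate test.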

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Jun 4 '06 #26
Jukka K. Korpela wrote:
HTML 4.01 is closed for all practical purposes, with all the flaws,
ambiguities, and vagueness.


Granted; new effort is going into XHTML. But in my experience, that
doesn't mean you can't get answers about HTML if you ask intelligent
questions.

I don't care enough to pursue it further. If you do, either try to get
an official ruling or live with ambiguity.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 4 '06 #27

Henri Sivonen wrote:
Since you haven't invested in learning DTDs, unless you have a
non-negotiable requirement to use them, I suggest learning RELAX NG
Compact Syntax instead:
http://relaxng.org/compact-tutorial-20030326.html


Thanks to everyone for their contributions to this useful thread.

As for Relax, I've been using it for a couple of years now and
have found it an excellent format for human-readable definitions. However
most of my actual work is with Schema, simply because it's the
data-typing layer I use with some OWL work (although Relax is making
inroads there).

This particular job needs to be built around DTDs though, something
which so far I've managed to avoid bothering with.

Jun 5 '06 #28
Joe Kesselman wrote:
I don't know _what_ the validator is telling me. As an example (from
Tidy) it gives a warning
"inserting missing 'title' element"


Tidy isn't a validator. It's a tool for repairing broken documents.


Agreed. But it's already on my desktop and nsgmls isn't
(or at least is refusing to install and work right thus far)
Anyone care to comment on what Tidy thinks this document _is_ ?

Now I think we can agree that "<!doctype...><div><p>Foo</p></div>" is
probably a valid HTML fragment, but that it's not correct to serve such
things over the web.

Now what's Tidy trying to interpret it as? As far as I can judge, Tidy
thinks this is _also_ a valid HTML document, albeit one that needs a lot
of implicit content adding to <head> beforehand. Is this at all
justifiable, or is Tidy completely out to lunch here?

Jun 5 '06 #29
