xhtml, custom dtds, and MIME types

Hi All,

Looking for a little expert advice on a few web standards issues. I am
currently trying to understand the impact of web standards for a web
application I work with. I have been doing a lot of research in the
areas of XHTML and WAI compliance, and am attempting to come up with a
recommendation for our product in terms of standards level compliance.
Ideally, I would like to be at XHTML 1.0 Strict.

However, in my reading I have come across some discrepancies in my
attempt to understand these standards. In a perfect world, I would
render my content as "application/xhtml+xml" as is the recommendation
from the W3C, but unfortunately this is the real world, and the
application needs to be backward compatible and work in IE. I plan to
use content negotiation if possible, but may end up *sigh* always
rendering as "text/html" MIME type.
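
For reference, here is a minimal sketch of what that negotiation could look like on the server. It is written in Python purely as an illustration, since nothing here says what the application runs on. The function name choose_media_type and its deliberately simplified Accept parsing are assumptions: it serves application/xhtml+xml only to clients that list it explicitly with a non-zero q-value, and text/html to everyone else.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def choose_media_type(accept_header: str) -> str:
    """Pick the MIME type to serve for a given HTTP Accept header value.

    Simplified on purpose: wildcards such as */* are NOT treated as
    support for XHTML, so clients that do not name it get text/html.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != XHTML:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        if quality > 0:
            return XHTML
    return HTML
```

A client sending only "*/*" gets text/html here. A response negotiated this way should also carry a "Vary: Accept" header so that caches keep the two variants apart.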

The problem I have is that our application uses and requires
proprietary HTML attributes, such as "required", which the validator
reports as errors. In doing some research, it appears that I could
customize the DTD to include these attributes, and then some validators
would accept them (apparently not the W3C one, which I have heard
doesn't understand customized DTDs).

And now the questions.....
1. Is it allowable to customize a DTD so that custom attributes do not
throw validation errors?
2. Does rendering a page with MIME type "text/html" still recognize
custom DTD declarations?
3. Is it allowable to not use a custom DTD, and on the web site state
that we are XHTML 1.0 Strict compliant, use the little W3C compliant
logo, and elaborate that the only non-valid sections are proprietary
tags?
4. If we do use a custom DTD, which the W3C validator doesn't
understand, can we still use their logo for compliance if we validate
on another validator (such as the WDG's)?
5. Do the validators provided in Firefox and the IE Dev toolbar beta,
for example, understand custom DTD's?

Sorry for the long post and all the questions, but this is such a grey
area and quickly gets confusing (especially when talking about backward
compatibility and MIME types) and I want to be sure that my
recommendations are in fact possible to accomplish.

Thanks in advance

Jan 27 '06 #1


wa*******@gmail.com wrote:
> However, in my reading I have come across some discrepancies in my
> attempt to understand these standards. In a perfect world, I would
> render my content as "application/xhtml+xml" as is the recommendation
> from the W3C, but unfortunately this is the real world, and the
> application needs to be backward compatible and work in IE. I plan to
> use content negotiation if possible, but may end up *sigh* always
> rendering as "text/html" MIME type.
Don't bother with content negotiation, just stick with HTML 4.01.
Sending application/xhtml+xml to Mozilla/Firefox currently prevents
incremental rendering of the page, which would be a fairly major problem
for a relatively large page downloaded over a dialup connection and
still a minor annoyance for those with faster connections.
> The problem I have is that our application uses and requires
> proprietary HTML attributes, such as "required", which the validator
> reports as errors.
Then, your application is broken. You could instead use
class="required" and then any client side scripts that currently make
use of the required attribute could just look for that instead.
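
A rough sketch of that lookup follows, in Python rather than a client-side script only so that it runs outside a browser. The names RequiredFieldFinder and find_required_fields and the sample form are hypothetical. The point is that class holds space-separated tokens, so "required" can sit alongside other values.

```python
from html.parser import HTMLParser


class RequiredFieldFinder(HTMLParser):
    """Collect the name of every element whose class list contains a marker token."""

    def __init__(self, marker="required"):
        super().__init__()
        self.marker = marker
        self.required_names = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # class is a space-separated token list; a valueless attribute yields None
        tokens = (attributes.get("class") or "").split()
        if self.marker in tokens:
            self.required_names.append(attributes.get("name"))


def find_required_fields(markup):
    finder = RequiredFieldFinder()
    finder.feed(markup)
    finder.close()
    return finder.required_names


SAMPLE_FORM = (
    '<form>'
    '<input name="email" class="required wide">'
    '<input name="nickname" class="wide">'
    '<input name="phone" class="numeric required">'
    '</form>'
)
```

Calling find_required_fields(SAMPLE_FORM) picks out the email and phone fields and skips nickname, with no non-standard attribute involved.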
> In doing some research, it appears that I could customize the DTD to
> include these attributes, and then some validators would accept them
> (apparently not the W3C one, which I have heard doesn't understand
> customized DTDs).
No, it understands them just fine, except that when you validate with a
custom XML DTD, you need to use a MIME type that will trigger XML
parsing mode, instead of SGML mode. It defaults to SGML mode for
unknown DTDs. Since you'll be serving it as text/html, you may as well
use a customised HTML 4 DTD.
> And now the questions.....
> 1. Is it allowable to customize a DTD so that custom attributes do not
> throw validation errors?
Yes.
> 2. Does rendering a page with MIME type "text/html" still recognize
> custom DTD declarations?
Yes, but the W3C validator will parse it with SGML mode instead of XML mode.
> 3. Is it allowable to not use a custom DTD, and on the web site state
> that we are XHTML 1.0 Strict compliant, use the little W3C compliant
> logo, and elaborate that the only non-valid sections are proprietary
> tags?
When it comes to those little icons, there's basically an honesty policy
in effect. Nothing will happen to you for using them on an invalid
page, except that the lie might confuse people (assuming they aren't web
developers) that click on it, wondering what on earth the little icon
means; in which case they'll be presented with an equally confusing
error message that they won't have a clue how to fix.
> 4. If we do use a custom DTD, which the W3C validator doesn't
> understand, can we still use their logo for compliance if we validate
> on another validator (such as the WDG's)?
Again, there's nothing stopping you from lying.
> 5. Do the validators provided in Firefox and the IE Dev toolbar beta,
> for example, understand custom DTD's?


What? I'm assuming you mean something like the web developer toolbars
for Firefox and IE, which simply make use of the W3C validator, or maybe
the HTML Tidy extension for Firefox, which makes use of HTML Tidy. I'm
not sure whether HTML Tidy supports custom DTDs or not.

These articles should be useful for you.
http://www.cs.tut.fi/~jkorpela/html/own-dtd.html
http://www.cs.tut.fi/~jkorpela/html/validation.html

In particular, take note of the section about the validation icons in
the validation article.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Jan 28 '06 #2

Lachlan Hunt wrote:
> No, it understands them just fine, except that when you validate with a
> custom XML DTD, you need to use a MIME type that will trigger XML
> parsing mode, instead of SGML mode. It defaults to SGML mode for
> unknown DTDs. Since you'll be serving it as text/html, you may as well
> use a customised HTML 4 DTD.
Indeed, sending custom-XHTML as text/html is just plain wrong.
>> 2. Does rendering a page with MIME type "text/html" still recognize
>> custom DTD declarations?
>
> Yes, but the W3C validator will parse it with SGML mode instead of XML
> mode.


As will other validators, including those that grok XML better than
the W3C ones (at least those I can speak for without faffing about
feeding them testcases).
> What? I'm assuming you mean something like the web developer toolbars
> for Firefox and IE, which simply make use of the W3C validator, or maybe
> the HTML Tidy extension for Firefox, which makes use of HTML Tidy. I'm
> not sure whether HTML Tidy supports custom DTDs or not.
Tidy doesn't support DTDs. It's never claimed to be a validator
(although it is used *alongside* a true validator in some tools).
> In particular, take note of the section about the validation icons in
> the validation article.


Maybe. But the views of its author are not universal amongst informed
opinion.

--
Nick Kew
Jan 28 '06 #3

Thanks for the replies Lachlan and Nick....some very useful information
in there.....Lachlan, are you recommending not to go with an XHTML
DOCTYPE declaration at all, or just saying to always render as
"text/html" and forego content negotiation but still use the XHTML
DOCTYPE declaration?

Unfortunately, in terms of the custom attributes, we have approximately
a dozen or so, so using the "class" attribute as the catch all I don't
think will work - I think instead we would almost have to strip out all
of the invalid attributes prior to rendering to the browser, though
this will mean all client side validation is lost unless we use a
custom DTD.

If I understand what you have said though Lachlan about custom DTD's
and validating, I should be using a customized HTML DTD as the only way
I can validate with a customized XHTML DTD is to use a MIME type that
triggers an XML parser, and the text/html MIME type doesn't do this -
it invokes the SGML parser so it won't recognize the custom XHTML DTD.
So, my question would be, should I even be trying for XHTML compliance
at this point since I have to render to "text/html" anyway? I guess
the problem that I am going to have with this setup is that I want to
be able to validate with XHTML 1.0 Strict, but also have custom
attributes (which requires a custom DTD) but based on browser
restrictions I have to render as MIME type of "text/html", which means
I am never going to be able to reach my goal....is that an accurate
assessment from your comments?

Thanks again

Jan 30 '06 #4

wa*******@gmail.com wrote:
> Thanks for the replies Lachlan and Nick....some very useful information
> in there.....Lachlan, are you recommending not to go with an XHTML
> DOCTYPE declaration at all, or just saying to always render as
> "text/html" and forego content negotiation but still use the XHTML
> DOCTYPE declaration?
I thought that was quite obvious from my statements, but think about it
this way:

1. Sending application/xhtml+xml to Mozilla prevents incremental rendering.
2. IE doesn't support application/xhtml+xml and requires text/html.
3. Sending XHTML as text/html is wrong, and is effectively the same as
sending incorrectly labelled HTML 4.

Therefore, do not send as application/xhtml+xml and do not use XHTML.
Use HTML 4.01 instead.
> Unfortunately, in terms of the custom attributes, we have approximately
> a dozen or so, so using the "class" attribute as the catch all I don't
> think will work
Firstly, be very careful using custom attributes. If you use, for
example, a foo attribute and in the not too distant future that foo
attribute is defined in a spec and implemented by browsers, your
interpretation of the attribute by your own script may be different from
that defined in a future spec, and thus may cause incompatibilities in
the future.

If you are absolutely positive that the class attribute isn't suitable
for your needs (remembering that it can contain multiple space-separated
values, it's not just limited to one) then use a vendor prefix, such as
the company's name or abbreviation, much like what is done with
proprietary/experimental CSS properties.

eg. instead of using
<p foo="bar">
use
<p x-foo="bar">

(where x stands for the name or abbreviation of the company)

If you were able to use real XHTML, you could instead define your own
namespace as in:

<html xmlns:x="http://www.example.org/x"
xmlns="http://www.w3.org/1999/xhtml">
....
<p x:foo="bar">

But, since you need to send it as HTML, that is not a real option.
Plus, DTDs don't really work well with namespaces anyway, you'd have to
go with RNG or Schema based validation.
> - I think instead we would almost have to strip out all
> of the invalid attributes prior to rendering to the browser, though
> this will mean all client side validation is lost unless we use a
> custom DTD.
Client side validation? Do you realise that browsers don't use
validating parsers and therefore they don't read the DTD? Or were you
talking about client side form validation, in which case, that has
nothing to do with a DTD.
> If I understand what you have said though Lachlan about custom DTD's
> and validating, I should be using a customized HTML DTD as the only way
> I can validate with a customized XHTML DTD is to use a MIME type that
> triggers an XML parser, and the text/html MIME type doesn't do this -
> it invokes the SGML parser so it won't recognize the custom XHTML DTD.
Yes.
> So, my question would be, should I even be trying for XHTML compliance
> at this point since I have to render to "text/html" anyway?
No.
> I guess the problem that I am going to have with this setup is that I want to
> be able to validate with XHTML 1.0 Strict,


Why? What does valid XHTML 1.0 Strict do that valid HTML 4.01 Strict
doesn't?

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Jan 31 '06 #5

Hi Lachlan....thanks again.

When I was talking about client side validation, I was talking about
Javascript and I was talking in the context of it using our custom
attributes when validating the content of a page on the client side -
totally distinct from any DTD.

In terms of the XHTML vs. HTML Strict, I was under the impression that
the benefit of XHTML was that it forces well formed documents (all
elements closed, lowercasing, etc...), whereas the HTML standard did
not. Perhaps I am mistaken in that....

I had never heard of using a vendor prefix on attributes....definitely
sounds like a solution for us. Presumably the validators (W3C, etc...)
ignore attributes that have a vendor prefix?

Thanks

Jan 31 '06 #6

wa*******@gmail.com wrote:
> In terms of the XHTML vs. HTML Strict, I was under the impression that
> the benefit of XHTML was that it forces well formed documents (all
> elements closed, lowercasing, etc...), whereas the HTML standard did
> not. Perhaps I am mistaken in that....
Yes, XHTML includes some additional requirements such as explicit end
tags (elements are always closed in HTML as well) but that means
nothing unless the pages are being fed to an XML parser. In text/html
web pages, browsers will treat them just the same.
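
The difference can be seen outside a browser too. Below is a small sketch in Python using its standard-library parsers as stand-ins. That is an assumption for illustration: html.parser is only a lenient tokenizer, not a browser engine, and ElementTree plays the part of a generic XML parser. The sample strings and helper names are made up.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

HTML_STYLE = "<p>first line<br>second line</p>"      # br left unclosed, fine in HTML
XHTML_STYLE = "<p>first line<br />second line</p>"   # br explicitly closed


def is_well_formed_xml(markup):
    """True if an XML parser accepts the markup, False if it rejects it."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


class TagCollector(HTMLParser):
    """Record start tags the way a lenient text/html tokenizer sees them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

The XML parser rejects HTML_STYLE and accepts XHTML_STYLE, while the lenient HTML tokenizer reports the same p and br start tags for both. The stricter syntax only has an effect once an XML parser is actually involved.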
> I had never heard of using a vendor prefix on attributes....definitely
> sounds like a solution for us. Presumably the validators (W3C, etc...)
> ignore attributes that have a vendor prefix?


No, they quite correctly flag it up as an error unless you use a custom
DTD. Lachlan explained why you should use vendor prefixes - to avoid
clashes in browsers with future official attributes.

Steve

Jan 31 '06 #7

Ok....thanks Steve....a few last questions - I was having a look at
some other sites on the Web that post the XHTML 1.0 compliant image and
ran them through the validator and they passed. What I did notice was
that some of these sites did not have a content type (i.e. MIME type)
declaration....what happens when this is omitted? Do the browsers
revert to their default behaviours of rendering as text/html?

In terms of the benefit of XHTML compliance, I believe the other
advantage is that it is where the technology is going and allows for
write-once run on multiple devices/platforms, so even though there is
no direct benefit at this time due to rendering as text/html, it may be
of benefit going forward - just trying to be proactive for changes and
minimize potential rework.

Thanks

Feb 1 '06 #8

wardy wrote:
> Ok....thanks Steve....a few last questions - I was having a look at
> some other sites on the Web that post the XHTML 1.0 compliant image and
> ran them through the validator and they passed. What I did notice was
> that some of these sites did not have a content type (i.e. MIME type)
> declaration....what happens when this is omitted? Do the browsers
> revert to their default behaviours of rendering as text/html?
Where were you looking for the content-type header? In the actual HTTP
headers, or somewhere else? (We're NOT talking about meta tags here.) Can
you post the URL of a site that you don't think has a content-type
header?
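
If it helps, here is one way to check the real HTTP header rather than any meta element, sketched in Python with only the standard library. The helper names are made up for illustration, and fetch_media_type needs network access and assumes the server answers HEAD requests.

```python
from email.message import Message
from urllib.request import Request, urlopen


def media_type_of(header_value):
    """Extract the bare media type from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_type() drops parameters such as charset and lowercases the type
    return message.get_content_type()


def fetch_media_type(url):
    """Return the media type a server declares for url, or None if it sends none."""
    with urlopen(Request(url, method="HEAD")) as response:
        header_value = response.headers.get("Content-Type")
    return None if header_value is None else media_type_of(header_value)
```

For example, media_type_of("text/html; charset=ISO-8859-1") gives "text/html", which is the part that decides whether a browser uses its HTML or XML parser.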
> In terms of the benefit of XHTML compliance, I believe the other
> advantage is that it is where the technology is going and allows for
The technology doesn't know where it's going. XHTML 1.1 is an exercise
in futility. XHTML 2.0 is deliberately broken as far as backwards
compatibility is concerned. The current buzz is around HTML 5 but it may
just be buzz. With IE7 offering only minimal improvements over IE6 it
looks like the current state of play is going to last for several years
without any major changes.
> write-once run on multiple devices/platforms,
Oddly enough that was the original reason for creating HTML in the
first place.

Whilst there are a few mobile platforms that only parse XHTML, most
will parse HTML and XHTML equally well/badly. With 99.999% of the web
made of bad HTML/bad XHTML pretending to be HTML there's no other
sensible option for device developers.
> so even though there is
> no direct benefit at this time due to rendering as text/html, it may be
> of benefit going forward - just trying to be proactive for changes and
> minimize potential rework.


Transforming valid HTML 4.01 Strict to valid XHTML 1.0 Strict is
trivial.

Steve

Feb 1 '06 #9

Hi Steve, again, thanks for all the help...this really is a minefield
of information and possibilities, especially when you're used to
dealing in black and white.

I think my statement in the last post about omitting the content type
was incorrect. When I went back and viewed one of the sites
(www.onesuffolk.co.uk) and checked the headers, the content type was
there and it was "text/html". So, for instance, their site is XHTML
Transitional but renders as "text/html". Is this OK (I guess the answer
is subjective), given that their site does validate at the W3C using
that setup?

I guess for us, it's going to come down to the fact that we will likely
require a custom DTD for the proprietary attributes we use, and so the
only way we can have that functionality and still validate would be to
use the HTML DOCTYPE and render as "text/html", thereby invoking the
SGML parser for validation, which understands a customized HTML DTD. As
I understand it, if I tried an XHTML DOCTYPE with a custom XHTML DTD but
rendered to text/html, this would still invoke the SGML parser, which
can't understand a custom XHTML DTD, so it would revert to the
traditional DTD and cough on our proprietary attributes. Correct me if
I am wrong here please....:)

JohnW

Feb 2 '06 #10
