XHTML W3C validation, UTF-8 and meta tags

I can't understand the warning I'm getting from the W3C
validator. Here it is, along with the source code that it is not
fully satisfied with. What meta-tags should I be including?

Here is the warning I got from the W3C validator.

Note: The HTTP Content-Type header sent by your web browser
(unknown) did not contain a "charset" parameter, but the
Content-Type was one of the XML text/* sub-types (text/xml). The
relevant specification (RFC 3023) specifies a strong default of
"us-ascii" for such documents so we will use this value
regardless of any encoding you may have indicated elsewhere. If
you would like to use a different encoding, you should arrange
to have your browser send this new encoding information.

Here is my test file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xml:lang="en"
xmlns="http://www.w3.org/1999/xhtml">
<!-- Test XHTML -->
<head>
<meta http-equiv="content-type" content="text/html;
charset=UTF-8" />
<title>Sample XHTML</title>
</head>
<body>
<div>
<a href="hello.html">Hello</a> again
<img src="Hello.gif" alt="Hello" height="121"
width="400"/>
<form action="bye-bye/" method="get">
<div>
<input type="hidden" name="config"/>
</div>
</form>
<ol style="LIST-STYLE-TYPE: lower-alpha">
<li>One</li>
<li>Two</li>
</ol>
</div>
<p>Some extended characters: &mdash; &alpha;   &ge;
</p>
</body>
</html>

Jul 20 '05 #1


Zenobia wrote:

I can't understand the warning I'm getting from the W3C
validator. Here it is, along with the source code that it is not
fully satisfied with. What meta-tags should I be including?

Here is the warning I got from the W3C validator.

Note: The HTTP Content-Type header sent by your web browser
(unknown) did not contain a "charset" parameter, but the
Content-Type was one of the XML text/* sub-types (text/xml). The
relevant specification (RFC 3023) specifies a strong default of
"us-ascii" for such documents so we will use this value
regardless of any encoding you may have indicated elsewhere. If
you would like to use a different encoding, you should arrange
to have your browser send this new encoding information.

Here is my test file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xml:lang="en"
xmlns="http://www.w3.org/1999/xhtml">
<!-- Test XHTML -->
<head>
<meta http-equiv="content-type" content="text/html;
charset=UTF-8" />


You are using the file upload validation, right? I am afraid that if
you want to validate your XHTML, you are better off first uploading
the file via FTP to a public HTTP server and then validating by
giving the URL to the W3C validator. Then the W3C validator, I think,
reads the XML declaration to find the encoding, and you don't get that
warning.
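
As a rough illustration of "reading the XML declaration to find the encoding", here is a minimal Python sketch. It is not the validator's actual code; the helper name xml_declared_encoding is hypothetical, and the regular expression assumes an ASCII-compatible encoding (it will not recognise UTF-16 input).

```python
import re

# Matches an XML declaration at the very start of the document and captures
# the value of its encoding pseudo-attribute. Assumes ASCII-compatible bytes.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def xml_declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None


sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"/>'
)
print(xml_declared_encoding(sample))      # the declared name, as written
print(xml_declared_encoding(b"<html/>"))  # no declaration present
```

Whether the declared value is actually honoured depends on what the transport says, which is the point of the warning in the original post.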
--

Martin Honnen
http://JavaScript.FAQTs.com/

Jul 20 '05 #2

On Sun, 11 Apr 2004, Zenobia wrote:

What meta-tags should I be including?

Probably the wrong question. The HTTP protocol doesn't consist of "meta
tags".

Here is the warning I got from the W3C validator.

Note: The HTTP Content-Type header sent by your web browser
                                   ^^^^^^^^^^^^^^^^^^^^^^^^
(unknown) did not contain a "charset" parameter, but the
Content-Type was one of the XML text/* sub-types (text/xml).

I take it that you submitted your content for validation by uploading
a local file, rather than by giving the validator a URL which it could
access on your own system?

The validator is warning you that it received the submission without
the browser telling it what character encoding ("charset") the browser
thought it was using. This is actually quite common browser
behaviour, for legacy reasons, although by rights - as the validator
is explaining to you - the recipient is entitled to assume us-ascii in
this context unless the browser explicitly tells it otherwise.

In this particular instance, where your source code in fact contains
only us-ascii characters, the warning is harmless. Since us-ascii is
a subset of utf-8 encoding, you're doing no harm by letting XML
believe that it's utf-8.
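
To make the subset point concrete, here is a small Python sketch. The sample string is taken from the test file in the original post; the helper name is_pure_ascii is just an illustration. It shows that markup which writes its "extended" characters as entity references is byte-for-byte identical in us-ascii and utf-8, whereas a literal em dash is not representable in us-ascii at all.

```python
def is_pure_ascii(text: str) -> bool:
    """Return True if every character of text can be encoded as us-ascii."""
    try:
        text.encode("us-ascii")
    except UnicodeEncodeError:
        return False
    return True


# Entity references such as &mdash; are spelled with plain ASCII characters.
source = "<p>Some extended characters: &mdash; &alpha; &ge;</p>"

ascii_bytes = source.encode("us-ascii")
utf8_bytes = source.encode("utf-8")

same_bytes = ascii_bytes == utf8_bytes                # identical byte sequences
round_trip = ascii_bytes.decode("utf-8") == source    # ASCII bytes are valid UTF-8

print(same_bytes, round_trip)
print(is_pure_ascii(source), is_pure_ascii("\u2014"))  # literal em dash fails
```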

You could fool the validator by renaming the local file to imply HTML
rather than XML. It's a bit of a juggling act, due to the messy
consequences of the W3C trying to achieve the impossible and make a
seamless transition to XHTML - and making matters a lot worse than if
they'd insisted on the big bang right from the outset. The notorious
"XHTML/1.0 Appendix C" is only part of this mess.

Or more usefully (and especially if you want others to help you
constructively), you could put the test material on a web-accessible
server (checking that the server sends an appropriate HTTP content
header when it serves the content out), and then submit it by URL
rather than by upload.
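
One way to check what Content-Type a server really sends is sketched below in Python. Everything here is an assumption for illustration: the tiny local server, its handler, and the choice of "application/xhtml+xml; charset=utf-8" stand in for whatever your real server is configured to send. In practice you would point fetch_content_type at your own public URL.

```python
import http.server
import threading
import urllib.request

# Stand-in document; a real test would serve your actual XHTML file.
XHTML = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>Sample XHTML</title></head><body><p>ok</p></body></html>"
)


class Handler(http.server.BaseHTTPRequestHandler):
    """Serves the sample document with an explicit charset parameter."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/xhtml+xml; charset=utf-8")
        self.send_header("Content-Length", str(len(XHTML)))
        self.end_headers()
        self.wfile.write(XHTML)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_content_type(url):
    """Return (media type, charset or None) from the response headers."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        headers = response.headers
        return headers.get_content_type(), headers.get_content_charset()


def demo():
    """Start the stand-in server on a free local port and inspect its header."""
    server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_content_type(f"http://127.0.0.1:{port}/test.xhtml")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

If the charset comes back as None for an XML text/* type, you can expect the same warning when validating by URL.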
Here is my test file:

Yes, but that doesn't really help, since if I save that as foobar.html
and submit it, there are no complaints from the validator.

However, if I rename it to foobar.xml and submit -that-, then the
validator pops up with this warning.

Unfortunately, it seems that the validator folk have implemented this
intended-to-be-informative warning without actually getting around to
documenting it. At least, I couldn't find any further explanation in
their documentation (nor could Google, though it found a couple of
useful discussions on the mailing list archives)[1]

Other contributors may get different effects depending on which
browser they use, and how their browser/ filename-extension/
content-type mappings are set.

It really is much preferable (as repeatedly recommended on this group)
if you can put your specimen on a web server, and tell the hon
Usenauts its URL.

good luck

[1] As a general tip: don't be afraid to type error message fragments
verbatim into Google when you're in doubt as to what they mean. In
this case, it wasn't as helpful as it might be, but in general it
is likely to give you a quick answer without waiting for a response
from usenet.
Jul 20 '05 #4

"Zenobia" <5.**********@spamgourmet.com> wrote in message
news:b6********************************@4ax.com...
I can't understand the warning I'm getting from the W3C
validator.
....
Note: The HTTP Content-Type header sent by your web browser
(unknown) did not contain a "charset" parameter, but the
Content-Type was one of the XML text/* sub-types (text/xml).
....
Here is my test file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xml:lang="en"
xmlns="http://www.w3.org/1999/xhtml">
<!-- Test XHTML -->
<head>
<meta http-equiv="content-type" content="text/html;
charset=UTF-8" />


Try <meta http-equiv="content-type" content="text/html; charset=utf-8" />
In general XHTML doesn't like to get uppercase parameters.
Jul 20 '05 #6

In article <b6********************************@4ax.com>,
Zenobia <5.**********@spamgourmet.com> writes:
fully satisfied with. What meta-tags should I be including?


None. It's warning that HTTP rules take precedence over XML rules,
and your document will be treated as us-ascii regardless of what
you put in it, because that's what your browser said.

You might be able to fix it in your browser configuration, or
work around it by removing the xmldecl and/or giving your file a
.html "extension". If your browser uploads it as text/html, then
the validator will apply HTML rules so your charset declaration
will be used in the absence of one from the browser.
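
The precedence described above can be sketched in Python roughly as follows. This is a simplified model under stated assumptions, not the validator's real logic: the function names are hypothetical, only three media types are handled, and the application/xml branch follows RFC 3023's rule that the XML declaration (defaulting to UTF-8) applies when no charset parameter is sent.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header into (media type, charset or None)."""
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_type(), message.get_content_charset()


def effective_encoding(content_type_header, xml_decl_encoding=None, meta_charset=None):
    """Pick the encoding a recipient would use, simplified.

    1. A charset parameter in the HTTP header always wins.
    2. text/xml without a charset falls back to us-ascii (RFC 3023),
       whatever the document itself declares.
    3. application/xml without a charset uses the XML declaration.
    4. text/html without a charset uses the meta declaration, if any.
    """
    media_type, http_charset = parse_content_type(content_type_header)
    if http_charset:
        return http_charset
    if media_type == "text/xml":
        return "us-ascii"
    if media_type == "application/xml":
        return (xml_decl_encoding or "utf-8").lower()
    if media_type == "text/html":
        return meta_charset.lower() if meta_charset else None
    return None


# The situation in the original post: uploaded as text/xml, no charset sent.
print(effective_encoding("text/xml", xml_decl_encoding="UTF-8", meta_charset="UTF-8"))
# The workaround: upload as text/html so the meta declaration is consulted.
print(effective_encoding("text/html", meta_charset="UTF-8"))
```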

--
Nick Kew

Nick's manifesto: http://www.htmlhelp.com/~nick/
Jul 20 '05 #8

JotM <ja*****@12move.netherlands> wrote:
Try <meta http-equiv="content-type" content="text/html;
charset=utf-8" /> In general XHTML doesn't like to get uppercase
parameters.


Please elaborate.

--
David Håsäther
Jul 20 '05 #10

"David Håsäther" <ha******@msn.com> wrote in message
news:Xn***************************@195.67.237.51...
JotM <ja*****@12move.netherlands> wrote:
Try <meta http-equiv="content-type" content="text/html;
charset=utf-8" /> In general XHTML doesn't like to get uppercase
parameters.


Please elaborate.


The OP's file states
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />

Now XHTML requires parameters to be in between quotes and in lowercase, not
UPPERCASE.
An XML validator therefore should ignore the UTF part in the stated charset
declaration (especially when one chooses to use the strict DTD).

That works, as my testpage is neatly validated by the W3C validator. (
http://validator.w3.org/check?uri=ht...nl%2Findex.htm )

So, use:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
not UTF-8 in the last bit.

Cheers

Jul 20 '05 #12

"Nick Kew" <ni**@hugin.webthing.com> wrote in message
news:v7************@webthing.com...
In article <b6********************************@4ax.com>,
Zenobia <5.**********@spamgourmet.com> writes:
fully satisfied with. What meta-tags should I be including?


None. It's warning that HTTP rules take precedence over XML rules,

.... [snap]

Yet I found that when I did that (declaring the encoding only in the XML
prolog of an XHTML 1.1 document of the "application/xhtml+xml" type), only
Netscape and Amaya rendered the page correctly. IE6 and Opera needed the
charset declaration to be in a meta to interpret UTF-8 right.
Jul 20 '05 #14

In article <c5**********@reader13.wxs.nl>,
"JotM" <ja*****@12move.netherlands> writes:
(chop)


Since three people posted different formulations of the correct answer,
I thought this one could be safely ignored. But evidently not.

The charset parameter is case-insensitive.
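
As an analogy only (this is Python's codec registry, not the validator or the XHTML DTD), the following sketch shows a charset label being resolved without regard to case, so "UTF-8" and "utf-8" name the same encoding.

```python
import codecs

# Different spellings of the same charset label.
labels = ["UTF-8", "utf-8", "Utf-8"]

# codecs.lookup normalises the label before searching the registry,
# so every spelling resolves to the same codec.
canonical = {codecs.lookup(label).name for label in labels}

print(canonical)
```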

--
Nick Kew

Nick's manifesto: http://www.htmlhelp.com/~nick/
Jul 20 '05 #16

Nick Kew wrote:
In article <c5**********@reader13.wxs.nl>,
"JotM" <ja*****@12move.netherlands> writes:

(chop)

Since three people posted different formulations of the correct answer,
I thought this one could be safely ignored. But evidently not.

The charset parameter is case-insensitive.

Mmm.
My apologies. Read the rest of the thread and did some googling.
Learned a lot today. Thanks.

Jul 20 '05 #18

JotM scribbled something along the lines of:
Nick Kew wrote:
In article <c5**********@reader13.wxs.nl>,
"JotM" <ja*****@12move.netherlands> writes:

(chop)


Since three people posted different formulations of the correct answer,
I thought this one could be safely ignored. But evidently not.

The charset parameter is case-insensitive.

Mmm.
My apologies. Read the rest of the thread and did some googling.
Learned a lot today. Thanks.


The thing is that XHTML attributes are case sensitive. That means values
like type="" on <input/> and other values that must match a predefined
set as well as the attribute names must be lowercase because the DTD
(and Schema, if there was one) defines them in lowercase.

--
Alan Plum, WAD/WD, Mushroom Cloud Productions
http://www.mushroom-cloud.com/
Jul 20 '05 #20

On Sat, 17 Apr 2004, Ashmodai wrote:

The thing is that XHTML attributes are case sensitive.

As far as XHTML is concerned, indeed they are. But where the values
are used in some other context, the precise rules can be rather
confusing.

That means values like type="" on <input/> and other values that
must match a predefined set as well as the attribute names must be
     ^^^^^^^^^^^^^^^^^^^^^^
lowercase because the DTD (and Schema, if there was one) defines
them in lowercase.


That's an important point. The charset attribute's values, which
started this subthread, are not matched against a predefined list in
XHTML.

And it means, paradoxically, that the form submission methods MUST be
specified as "get" and "post" in XHTML[1], in spite of the fact that
the corresponding HTTP methods MUST be specified as "GET" and "POST" -
since in HTTP too [2] the method is case-sensitive - but
HTTP rules require the method to be upper-case.

[1]
<!ATTLIST form
%attrs;
action %URI; #REQUIRED
method (get|post) "get"
[...]

[2] RFC2616 section 5.1.1
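
The get/GET paradox can be sketched in Python as below. The names XHTML_FORM_METHODS and http_method_for_form are hypothetical; the sketch only models the two rules quoted above: the attribute value must match the DTD's lowercase enumeration exactly, and the HTTP method token derived from it is upper-case.

```python
# The enumerated values allowed by the XHTML 1.0 DTD for form/@method.
XHTML_FORM_METHODS = ("get", "post")


def http_method_for_form(method_attr="get"):
    """Map a form's method attribute to the HTTP request method.

    The comparison is deliberately case-sensitive, because XML attribute
    values are matched against the DTD enumeration exactly as written.
    The default of "get" mirrors the DTD default.
    """
    if method_attr not in XHTML_FORM_METHODS:
        raise ValueError(
            f"method={method_attr!r} is not one of {XHTML_FORM_METHODS}"
        )
    # HTTP method tokens are case-sensitive too, and are upper-case.
    return method_attr.upper()


print(http_method_for_form("get"))
print(http_method_for_form("post"))
```

Passing "GET" as the attribute value raises ValueError in this sketch, which mirrors a validator rejecting it against the Strict DTD.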
Jul 20 '05 #21

In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> writes:
The thing is that XHTML attributes are case sensitive.


As far as XHTML is concerned, indeed they are. But where the values
are used in some other context, the precise rules can be rather
confusing.


If I might blow my own trumpet here ...

To see how this works, go to Page Valet[1], and validate any XHTML page.
Under the options, select:
Parser Xerces
Report Format Visual Validator
(other options are irrelevant here)

It will show your markup normalised, including highlighting the various
different classes of attribute values.

[1] http://valet.webthing.com/page/

--
Nick Kew

Nick's manifesto: http://www.htmlhelp.com/~nick/
Jul 20 '05 #22
