W3C HTML Validator Error - Invalid content-type

Trevor Orton

Hello, I'm having a slight problem using the W3C HTML validator. I've
reviewed the FAQs with no luck, so hopefully someone here will be kind
enough to point me in the right direction.

I recently validated my site's css and the bulk of the html pages via the
W3C validator but was forced to use the file upload option because the
'Validate by URL' option returns the following error:

Sorry, I am unable to validate this document because its content type is

http://www.ortonage.com, which is not currently supported by this service.
The Content-Type field is sent by your web server (or web browser if you use
the file upload interface) and depends on its configuration. Commonly, web
servers will have a mapping of filename extensions (such as ".html") to MIME
Content-Type values (such as text/html).

I'd now like to determine why I'm receiving the error. The web site is
http://www.ortonage.com. So far, I've verified that the server mime types
are configured correctly on the web server (MS IIS). I know I'm probably
going to get beat up because I'm using MS IIS but it has met my needs so
far. :) I've also verified the doctype is correct <!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd"> and that the meta tag <META
HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> is
correct. I'm now at a loss on where to look next.

The site uses a URL rewriter so even though the URL appears as static to the
client it gets modified by the server to a dynamic URL before processing and
responding to the client. I'm wondering if the server is not including the
correct content-type in the header information because the ".htm" is
stripped from the URL during processing and before the response is actually
returned to the client side. If that is the case, any ideas on how to
rectify the problem? I also have the meta tag equivalent in each web page
so I thought that it might guarantee the content-type is set correctly.
Works fine for Netscape and IE browsers. I've only seen this problem on W3C
so far.
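
The hypothesis above can be illustrated with a minimal Python sketch. It only shows how an extension-based lookup behaves once the ".htm" is gone; it says nothing about what IIS does internally, and the example paths are hypothetical.

```python
import mimetypes
from typing import Optional


def content_type_for(path: str) -> Optional[str]:
    """Return the MIME type guessed from the path's extension, or None."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type


# Hypothetical paths, used only for illustration.
print(content_type_for("/products/index.htm"))  # a type is found via ".htm"
print(content_type_for("/products/index"))      # no extension, so no type
```

If a lookup like this is the only source of the Content-Type value, a request whose extension was removed ends up with no type unless the code that generates the page sets one itself.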

Any suggestions would be very much appreciated.

Thanks.

Trevor Orton (www.ortonage.com)

Jul 23 '05 #1


Steve Pugh

On Sat, 13 Nov 2004 00:48:06 -0500, "Trevor Orton"
<or******@yahoo.com> wrote:

> I recently validated my site's css and the bulk of the html pages via the
> W3C validator but was forced to use the file upload option because the
> 'Validate by URL' option returns the following error:
>
> Sorry, I am unable to validate this document because its content type is
>
> http://www.ortonage.com, which is not currently supported by this service.
> The Content-Type field is sent by your web server (or web browser if you use
> the file upload interface) and depends on its configuration. Commonly, web
> servers will have a mapping of filename extensions (such as ".html") to MIME
> Content-Type values (such as text/html).
>
> I'd now like to determine why I'm receiving the error. The web site is
> http://www.ortonage.com. So far, I've verified that the server mime types
> are configured correctly on the web server (MS IIS).

No they're not.

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 13 Nov 2004 06:32:48 GMT
Set-Cookie: u=1704538305; expires=Tuesday, 11-Nov-14 06-32-48 GMT;
path=/;
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 13 Nov 2004 06:32:48 GMT

See? No Content-Type header at all.
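
The same check can be done mechanically. The following is a rough Python sketch, assuming the response head has already been captured as text; the function names are made up for this example, and the sample is a shortened copy of the headers shown above.

```python
from typing import Dict, List


def parse_header_block(raw: str) -> Dict[str, List[str]]:
    """Collect header fields from a raw response head, keyed by lowercase name."""
    headers: Dict[str, List[str]] = {}
    for line in raw.splitlines():
        # Skip blank lines and status lines such as "HTTP/1.1 200 OK".
        if not line.strip() or line.startswith("HTTP/"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers


def has_content_type(raw: str) -> bool:
    """True if the response head carries a Content-Type field."""
    return "content-type" in parse_header_block(raw)


RESPONSE_HEAD = """HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 13 Nov 2004 06:32:48 GMT
"""

print(has_content_type(RESPONSE_HEAD))  # prints False
```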

Steve

Jul 23 '05 #2

Leonard Blaisdell

In article <z4********************@rogers.com>, "Trevor Orton"
<or******@yahoo.com> wrote:

> Any suggestions would be very much appreciated.

<http://www.htmlhelp.com/tools/validator/> says everything is OK. The W3C
validator can't parse it. I'll leave it to others to tell you why.
You asked only for a solution to one problem. I have to tell you that from
a design point of view, you have two.

leo

--
<http://web0.greatbasin.net/~leo/>

Jul 23 '05 #3

David Ross

First of all, the validator is saying there is something wrong with
your server, not with your page. It never even tried to validate
your page.

You need to have your Web host correct the server's .htaccess file
to define a MIME type for HTML files. It's not unusual for an
individual Web site to also have its own .htaccess file for
esoteric MIME types; I have one. However, your Web server should
already have HTML defined in its master .htaccess.

However, I can tell you that you definitely have a design problem.
A well-designed Web page should not require right-left scrolling,
especially for text.

--

David E. Ross
<http://www.rossde.com/>

I use Mozilla as my Web browser because I want a browser that
complies with Web standards. See <http://www.mozilla.org/>.

Jul 23 '05 #4

Trevor Orton

> > http://www.ortonage.com. So far, I've verified that the server mime types
> > are configured correctly on the web server (MS IIS).
>
> No they're not.

Ok, I'll have to dig into why IIS is not transmitting the content-type
header even though it is configured to do so. I'm guessing that IIS is
confused because I modify the URL before processing. The fix may be to
force the content-type header through the dynamic page generation sequence
instead of relying on the server.

I assume then that the meta tag for content-type is of little value to the
W3C validator and I assume then that the browsers must either use the meta
tag or assume text/html in the absence of the content-type header.

Thanks for the response Steve.

Jul 23 '05 #5

Trevor Orton

> <http://www.htmlhelp.com/tools/validator/> says everything is OK. The W3C
> validator can't parse it. I'll leave it to others to tell you why.
> You asked only for a solution to one problem. I have to tell you that from
> a design point of view, you have two.
>
> leo

Hi Leo. Yes, I had already run the same validator but it did not complain,
nor did W3C when I uploaded the page source individually.

Please comment on what you feel are design issues (even if you don't want to
get into why or how to fix them). The site was my first attempt with css and
compliant html so I'm not surprised you feel there are other problems. It
seems to be rendering correctly under recent versions of IE and NS and I
plan to do some testing with additional browsers in the future when time
permits.

Thanks for your time.

Trevor

Jul 23 '05 #6

Trevor Orton

"David Ross" <no****@nowhere.not> wrote in message
news:41***************@nowhere.not...

> First of all, the validator is saying there is something wrong with
> your server, not with your page. It never even tried to validate
> your page.

Makes sense.

> You need to have your Web host correct the server's .htaccess file
> to define a MIME type for HTML files. It's not unusual for an
> individual Web site to also have its own .htaccess file for
> esoteric MIME types; I have one. However, your Web server should
> already have HTML defined in its master .htaccess.

I didn't know MS IIS had a .htaccess file. I thought .htaccess was used
with linux based web servers only. My IIS GUI shows that text/html is
configured correctly so I'll have to investigate this one further.

> However, I can tell you that you definitely have a design problem.
> A well-designed Web page should not require right-left scrolling,
> especially for text.

Yes, I used absolute positioning in my css and may have used too wide of a
page design. Also, I have only tested it on recent versions of IE and NS so
far so it could be brutal on other browsers at the moment and I wouldn't
even know. I've got some work to do on that end.

Thanks for your feedback.

Trevor

Jul 23 '05 #7

Harrie

Trevor Orton said the following on 11/13/04 20:55:

"David Ross" <no****@nowhere.not> wrote in message
news:41***************@nowhere.not...
> > You need to have your Web host correct the server's .htaccess file
> > to define a MIME type for HTML files. It's not unusual for an
> > individual Web site to also have its own .htaccess file for
> > esoteric MIME types; I have one. However, your Web server should
> > already have HTML defined in its master .htaccess.
>
> I didn't know MS IIS had a .htaccess file. I thought .htaccess was used
> with linux based web servers only.

AFAIK .htaccess is specific to Apache. But Apache is not Linux based
only, it's a cross platform web server that also runs on Windows. You
might want to give it a try.

--
Regards
Harrie

Jul 23 '05 #8

Harrie

Trevor Orton said the following on 11/13/04 06:48:

> The site uses a URL rewriter so even though the URL appears as static to the
> client it gets modified by the server to a dynamic URL before processing and
> responding to the client. I'm wondering if the server is not including the
> correct content-type in the header information because the ".htm" is
> stripped from the URL during processing and before the response is actually
> returned to the client side. [..]

Why (stripping the ".htm" part)?

> [..] If that is the case, any ideas on how to
> rectify the problem?

Maybe by not stripping the ".htm" part?

--
Regards
Harrie

Jul 23 '05 #9

Brian

Trevor Orton wrote:

> I didn't know MS IIS had a .htaccess file.

AFAIK, it doesn't.

> I thought .htaccess was used with linux based web servers only.

No. It is used with *Apache* only, regardless of operating system. I use
it on Apache/Linux and Apache/Windows.

--
Brian (remove "invalid" to email me)

Jul 23 '05 #10

Brian

Trevor Orton wrote:

> I assume then that the meta tag for content-type is of little value
> to the W3C validator

Or indeed to conforming web user-agents. The meta tag hack is only
useful for those who want to save the page and open the local copy
later, perhaps when they are offline.

> and I assume then that the browsers must either use the meta tag or
> assume text/html in the absence of the content-type header.

No, but they are allowed to guess if there's no content-type. Obviously,
there's no sense in making the ua guess if you're the author.
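
To make the difference concrete, here is a deliberately simplified Python sketch of a strict client (which refuses to proceed, as the validator does) next to a lenient one (which guesses). The function names and the sniffing rule are assumptions made for illustration; this is not the algorithm any real browser uses.

```python
from typing import Mapping, Optional


def declared_media_type(headers: Mapping[str, str]) -> Optional[str]:
    """Return the media type from a Content-Type header, without parameters."""
    for name, value in headers.items():
        if name.lower() == "content-type":
            media_type = value.split(";", 1)[0].strip().lower()
            return media_type or None
    return None


def strict_client(headers: Mapping[str, str]) -> str:
    """Refuse to process a response that does not declare its type."""
    media_type = declared_media_type(headers)
    if media_type is None:
        raise ValueError("no Content-Type header; refusing to guess")
    return media_type


def lenient_client(headers: Mapping[str, str], body: bytes) -> str:
    """Use the declared type if present, otherwise make a crude guess."""
    media_type = declared_media_type(headers)
    if media_type is not None:
        return media_type
    # Crude guess from the first bytes of the body (illustrative only).
    head = body[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "text/html"
    return "application/octet-stream"
```

With no Content-Type header, `strict_client` raises an error while `lenient_client` falls back to its guess, which is roughly why the page displayed in browsers but was rejected by the validator.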

--
Brian (remove "invalid" to email me)

Jul 23 '05 #11

Trevor Orton

> Why (stripping the ".htm" part)?

The server extension (dll) does not require it.

I had a brief look at the server side code (dll) and I think that it may be
deleting the content-type header (and possibly others) during the processing
sequence. The server itself is probably working correctly. Funny how it
has been like this for months and the W3C validator was the only thing that
complained. I'll have to debug it on my test server tomorrow.

Thanks everyone for your help.

Trevor

Jul 23 '05 #12

Alan J. Flavell

On Sun, 14 Nov 2004, Brian wrote:

> > I thought .htaccess was used with linux based web servers only.
>
> No. It is used with *Apache* only,

The idea derives from the NCSA HTTPD, and has been adopted by more
than just one HTTPD that was based on that idea. The Apache server is
by far the most common server which fits that description, but
.htaccess isn't really exclusive to Apache.

Jul 23 '05 #13

Harlan Messinger

"Trevor Orton" <or******@yahoo.com> wrote:

"David Ross" <no****@nowhere.not> wrote in message
news:41***************@nowhere.not...
> > First of all, the validator is saying there is something wrong with
> > your server, not with your page. It never even tried to validate
> > your page.
>
> Makes sense.
>
> > You need to have your Web host correct the server's .htaccess file
> > to define a MIME type for HTML files. It's not unusual for an
> > individual Web site to also have its own .htaccess file for
> > esoteric MIME types; I have one. However, your Web server should
> > already have HTML defined in its master .htaccess.
>
> I didn't know MS IIS had a .htaccess file.

It doesn't. Either David overlooked that you were talking about IIS,
or he didn't realize that .htaccess isn't common to all web servers.

Especially with the part about having a URL-rewriting server
extension, it sounds like you'll really get a better response in an
IIS-related newsgroup. Your problem isn't related to HTML at all.

--
Harlan Messinger
Remove the first dot from my e-mail address.
Veuillez ôter le premier point de mon adresse de courriel.

Jul 23 '05 #14

David Ross

Trevor Orton wrote [in part]:

I previously wrote [also in part]:
> > However, I can tell you that you definitely have a design problem.
> > A well-designed Web page should not require right-left scrolling,
> > especially for text.
>
> Yes, I used absolute positioning in my css and may have used too wide of a
> page design. Also, I have only tested it on recent versions of IE and NS so
> far so it could be brutal on other browsers at the moment and I wouldn't
> even know. I've got some work to do on that end.

Don't use absolute positioning or sizing. Alternative Web browsers
(e.g., Web-capable cell phones) can't handle it. It also creates a
problem for users who don't maximize their browser windows.
Instead, use relative positioning and sizing.

--

David E. Ross
<http://www.rossde.com/>

I use Mozilla as my Web browser because I want a browser that
complies with Web standards. See <http://www.mozilla.org/>.

Jul 23 '05 #15

Leonard Blaisdell

In article <Ad********************@rogers.com>, "Trevor Orton"
<or******@yahoo.com> wrote:

> Please comment on what you feel are design issues (even if you don't want to
> get into why or how to fix them). The site was my first attempt with css and
> compliant html so I'm not surprised you feel there are other problems.

First, two newsgroups that handle site design specifically:

alt.html.critique
comp.infosystems.www.authoring.site-design

I find the first reference to be more active.

Second, I use the ancient resolution of 640x480. Your design causes an
enormous sideways scroll. You have large areas of white space toward the
bottom. There are other design issues.

Try the following link to think about creating sites for "all" resolutions.
<http://www.allmyfaqs.com/faq.pl?AnySizeDesign>
And try the alt.html.critique newsgroup for candid appraisal of your design.

> Thanks for your time.

I'm throwing you to the sharks ;-)

leo

--
<http://web0.greatbasin.net/~leo/>

Jul 23 '05 #16

Harrie

Trevor Orton said the following on 11/14/04 05:40:

> > Why (stripping the ".htm" part)?
>
> The server extension (dll) does not require it.

That's not an answer why you strip it. My browser doesn't require it
also, but it doesn't strip.

If a file is called index.htm I see no reason for whatever extension to
strip the .htm part.

--
Regards
Harrie

Jul 23 '05 #17

Mark Tranchant

Harrie wrote:

> Trevor Orton said the following on 11/14/04 05:40:
> > > Why (stripping the ".htm" part)?
> >
> > The server extension (dll) does not require it.
>
> That's not an answer why you strip it. My browser doesn't require it
> also, but it doesn't strip.
>
> If a file is called index.htm I see no reason for whatever extension to
> strip the .htm part.

It's useful to specify URLs without the extension. For example:

http://tranchant.plus.com/ie

is actually ie.phpc, but if pigs flew and hell froze over, I might want
to change to IIS/ASP, and rename the file to ie.asp. Sure, I could
rewrite the URL at the server, but I prefer to just drop the extension.

Why should I expose the underlying technology any more than required?

See http://www.w3.org/Provider/Style/URI, just over half way down, under
"What to leave out".

--
Mark.
http://tranchant.plus.com/

Jul 23 '05 #18

Harrie

Mark Tranchant said the following on 11/17/04 10:38:

> Harrie wrote:
> > Trevor Orton said the following on 11/14/04 05:40:
> > > > Why (stripping the ".htm" part)?
> > >
> > > The server extension (dll) does not require it.
> >
> > That's not an answer why you strip it. My browser doesn't require it
> > also, but it doesn't strip.
> >
> > If a file is called index.htm I see no reason for whatever extension
> > to strip the .htm part.
>
> It's useful to specify URLs without the extension. For example:
>
> http://tranchant.plus.com/ie
>
> is actually ie.phpc, but if pigs flew and hell froze over, I might want
> to change to IIS/ASP, and rename the file to ie.asp. Sure, I could
> rewrite the URL at the server, but I prefer to just drop the extension.

Thanks Mark, I'm aware of this, but this doesn't seem (to me at least)
what the OP is doing. In the example you gave both URLs work (with or
without an extension), but the OP said a server extension is stripping
off the extension, so I doubt if this is the same (but maybe I'm not
getting what the dll/extension is doing on the server).

> Why should I expose the underlying technology any more than required?
>
> See http://www.w3.org/Provider/Style/URI, just over half way down, under
> "What to leave out".

Yes, I've been using this myself for a few months and it has already paid
off (except for web statistics which show a lot of "unknown" file types
because of this).

--
Regards
Harrie

Jul 23 '05 #19

Brian

Harrie wrote:

> Mark Tranchant said the following on 11/17/04 10:38:
> > Harrie wrote:
> > >
> > > If a file is called index.htm I see no reason for whatever
> > > extension to strip the .htm part.
> >
> > It's useful to specify URLs without the extension. For example:
> >
> > http://tranchant.plus.com/ie
> >
> > is actually ie.phpc
>
> Thanks Mark, I'm aware of this, but this doesn't seem (to me at
> least) what the OP is doing. In the example you gave both URLs work
> (with or without an extension),

By "extension", you only mean the last few characters of the url,
starting with the period ".", right? Because there is no file extension
on the client end of an http transaction. So whether a resource is
available using two different urls doesn't really figure in here afaics.

> but the OP said a server extension is stripping off the extension,

What extension? The extension is only useful on the server end. I'm
afraid I don't quite follow you.

> > See http://www.w3.org/Provider/Style/URI, just over half way down,
> > under "What to leave out".
>
> Yes, I'm using this myself for a few months and already it paid off
> (except for web statistics which show a lot of "unknown" file types
> because of this).

(?) But this is unrelated to the url.

--
Brian (remove "invalid" to email me)

Jul 23 '05 #20

Harrie

Brian said the following on 11/18/04 23:06:

> Harrie wrote:
> > Mark Tranchant said the following on 11/17/04 10:38:
> > > Harrie wrote:
> > > >
> > > > If a file is called index.htm I see no reason for whatever extension
> > > > to strip the .htm part.
> > >
> > > It's useful to specify URLs without the extension. For example:
> > >
> > > http://tranchant.plus.com/ie
> > >
> > > is actually ie.phpc
> >
> > Thanks Mark, I'm aware of this, but this doesn't seem (to me at least)
> > what the OP is doing. In the example you gave both URLs work (with or
> > without an extension),
>
> By "extension", you only mean the last few characters of the url,
> starting with the period ".", right?

In the first part (".. for whatever extension ..") I was referring to a
server extension, like an Apache module. The OP answered my question
why the ".htm" part was stripped off:

<quote>
The server extension (dll) does not require it.
</quote>

But maybe he wasn't talking about an Apache module (which is a .dll file
on Windows if I'm not mistaken), but about the ".dll" part of the file.

In my answer to Mark I was talking about the file/extension part. Sorry
for the confusion.

> Because there is no file extension on the client end of an http transaction.

Agreed.

> So whether a resource is
> available using two different urls doesn't really figure in here afaics.

From my point of view (which might very well be the problem) it does.
In the example Mark gave two URLs point to the same file, but the OP
said in his first post:

<quote>
I'm wondering if the server is not including the correct content-type in
the header information because the ".htm" is stripped from the URL
during processing and before the response is actually returned to the
client side.
</quote>

Which looks to me as if a URL containing a ".htm" part, when specified by
a client, will have that part stripped off by the server (which it doesn't
do when looking at his site, but at least that's what I get from the quoted
part).

> > but the OP said a server extension is stripping off the extension,
>
> What extension? The extension is only useful on the server end. I'm
> afraid I don't quite follow you.

"server extension" ==> server module
"the extension" ==> any file extension

But by using the word extension twice in the same line it only adds to
the confusion. I think it's best to let it rest, it's not that important.

> > > See http://www.w3.org/Provider/Style/URI, just over half way down,
> > > under "What to leave out".
> >
> > Yes, I'm using this myself for a few months and already it paid off
> > (except for web statistics which show a lot of "unknown" file types
> > because of this).
>
> (?) But this is unrelated to the url.

True, but not to web statistics (which is server related). I'm using
AWStats and by file types I'm referring to this:
http://awstats.sourceforge.net/cgi-b...ight#filetypes
(or http://tinyurl.com/4q9he)

--
Regards
Harrie

Jul 23 '05 #21

Trevor Orton

Just to clarify things here since it seems that I may have confused matters.

The process used is actually quite common. A URL rewriter, in the form of
an MS ISAPI filter (dll) running on the web server is used to convert a
static URL to a dynamic URL before processing. The dynamic URL refers to a
dll (MS ISAPI extension), not an htm page which is why I said "the server
extension (dll) does not require it". No static web pages actually exist.
The reason for displaying static URLs instead of dynamic URLs to the user is
mainly for improved search engine indexing. The dynamic URL is then
processed via the MS ISAPI extension (dll) which creates and returns the web
page on the fly. The web pages created only include static links, not
dynamic links.
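
As a rough illustration of the rewriting step described above, here is a Python sketch. The real filter is an ISAPI dll whose rules are not shown in this thread, so the URL pattern, the `/scripts/pages.dll` path and the `page` parameter are all hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Hypothetical pattern: a single-segment "static" URL ending in ".htm".
STATIC_URL = re.compile(r"^/(?P<page>[A-Za-z0-9_-]+)\.htm$")


def rewrite(path: str) -> Optional[str]:
    """Map a static-looking URL to a dynamic one, or None if it doesn't match."""
    match = STATIC_URL.match(path)
    if match is None:
        return None
    # The ".htm" suffix is dropped; only the page name is passed on.
    return "/scripts/pages.dll?" + urlencode({"page": match.group("page")})


print(rewrite("/about.htm"))  # prints /scripts/pages.dll?page=about
```

Once a request has been rewritten this way, the extension's code is what produces the response, so it also has to supply the headers a static file handler would otherwise have added.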

FYI: I rectified the original problem posted by adding the content-type
header during the web page creation in the MS ISAPI extension. I also added
the content-length while I was at it. Now the challenge is to speed things
up. :)
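
For readers who want to see the shape of that fix, here is a small sketch using a Python WSGI application as a stand-in for the ISAPI extension. It is not the actual extension code, and `render_page` is a hypothetical placeholder for the page generator; the point is only that the code building the page sets Content-Type and Content-Length itself.

```python
from typing import Callable, Iterable


def render_page(title: str) -> bytes:
    """Hypothetical page generator; returns the encoded HTML body."""
    html = (
        "<html><head><title>" + title + "</title></head>"
        "<body><h1>" + title + "</h1></body></html>"
    )
    return html.encode("iso-8859-1")


def application(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI app that always declares the type and length of what it sends."""
    body = render_page("Example")
    headers = [
        ("Content-Type", "text/html; charset=iso-8859-1"),
        # Content-Length counts bytes of the encoded body, not characters.
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the application can be served with the standard library, for example `wsgiref.simple_server.make_server("localhost", 8000, application).serve_forever()`.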

Hope that helps clear up the confusion.

Trevor

Previously quoted:

<quote>
The server extension (dll) does not require it.
</quote>

<quote>
I'm wondering if the server is not including the correct content-type in
the header information because the ".htm" is stripped from the URL
during processing and before the response is actually returned to the
client side.
</quote>

Jul 23 '05 #22

Harrie

Trevor Orton said the following on 11/22/04 05:56:

Please don't top post:

http://www.xs4all.nl/%7ewijnands/nnq/nquote.html#Q7

> Previously quoted:
> <quote>
> The server extension (dll) does not require it.
> </quote>
>
> <quote>
> I'm wondering if the server is not including the correct content-type in
> the header information because the ".htm" is stripped from the URL
> during processing and before the response is actually returned to the
> client side.
> </quote>
>
> Just to clarify things here since it seems that I may have confused matters.
>
> The process used is actually quite common. A URL rewriter, in the form of
> an MS ISAPI filter (dll) running on the web server is used to convert a
> static URL to a dynamic URL before processing. The dynamic URL refers to a
> dll (MS ISAPI extension), not an htm page which is why I said "the server
> extension (dll) does not require it". No static web pages actually exist.
> The reason for displaying static URLs instead of dynamic URLs to the user is
> mainly for improved search engine indexing. The dynamic URL is then
> processed via the MS ISAPI extension (dll) which creates and returns the web
> page on the fly. The web pages created only include static links, not
> dynamic links.

I'm still confused why you strip the ".htm" part, although this example
doesn't show that you do, but if it works for you, who am I to argue.

--
Regards
Harrie

Jul 23 '05 #23
