application/xhtml+xml in IE

Gustaf

I just read this article from today:

http://webstandards.org/buzz/archive/2005_09.html

I need some help understanding this sentence:

The W3C recommends XHTML 1.1 should be served with the
application/xhtml+xml MIME type, something that Internet
Explorer does not currently support.

I thought it was the web *server* that serves documents in a certain
MIME type, not the web *browser*. I don't see why it matters whether IE
*knows* it's dealing with XHTML as long as the page is valid, is served
with a correct Content-Type header, and displayed correctly in the browser.

Gustaf

Sep 1 '05 #1


Benjamin Niemann

Gustaf wrote:

I just read this article from today:

http://webstandards.org/buzz/archive/2005_09.html

I need some help understanding this sentence:

The W3C recommends XHTML 1.1 should be served with the
application/xhtml+xml MIME type, something that Internet
Explorer does not currently support.

I thought it was the web *server* that serves documents in a certain
MIME type, not the web *browser*. I don't see why it matters whether IE
*knows* it's dealing with XHTML as long as the page is valid, is served
with a correct Content-Type header, and displayed correctly in the
browser.

It certainly matters, if the document is not displayed at all ;)

IE does not recognize documents *served* as application/xhtml+xml as
something it can display. If you try to load such a document, it will pop up
a 'Save as..', 'Open with..' or similar dialog. And an XHTML 1.1 document
is not valid if it is served as text/html - the only MIME type IE
understands (for displaying (X)HTML documents).

Serving XHTML 1.0 as text/html is valid and advocated by the W3C as a
temporary workaround, but many people (including me) think that this is an
ugly, unnecessary hack that should be avoided. XHTML 1.0 has no added value
over HTML 4.01, especially if served as text/html.
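
For illustration only, the text/html workaround is usually implemented as
server-side content negotiation. The following is a minimal sketch, assuming
an XHTML 1.0 page and a hypothetical helper named choose_content_type that is
handed the raw Accept request header. It sends application/xhtml+xml only to
browsers that list that type explicitly, and treats a bare */* as not
sufficient:

```python
def choose_content_type(accept_header):
    """Return the media type to send for an XHTML 1.0 page.

    Hypothetical helper: only an explicit application/xhtml+xml entry
    with a non-zero q value counts; a wildcard such as */* does not.
    """
    for item in accept_header.split(","):
        media_range, _, params = item.strip().partition(";")
        if media_range.strip().lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when it is not given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# A request header that only offers image types and */* falls back to text/html.
print(choose_content_type("image/gif, image/jpeg, */*"))
# A request header that names the XHTML type explicitly gets it.
print(choose_content_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
```

Whether such a fallback is desirable is exactly the matter of opinion
discussed above; the sketch only shows the mechanics.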

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/

Sep 1 '05 #2

Gustaf

Benjamin Niemann wrote:

IE does not recognize documents *served* as application/xhtml+xml as a
document it could display. It will popup a 'Save as..', 'Open with..' or
dunno dialog, if you try to load such a document.

That's strange. I started to make the switch to XHTML 1.1, and I have no
problem in IE6 on Windows XP SP2. Maybe the problem you mention occurs
in IE5 (which is bad enough). Can anyone confirm this? I saw that IE6
doesn't include "application/xhtml+xml" in the "Accept" HTTP header in
the request, but the page still renders properly. Maybe it shouldn't.

For those interested, I wrote a bit on how to write conformant XHTML 1.1
documents (the URL is temporary). Enjoy. :-)

http://gusgus.cn/www/xhtml/authoringxhtml11.html

It looks identical in IE and Firefox. In Opera, the <pre> elements are
displayed in a smaller font.

Gustaf

Sep 1 '05 #3

Darin McGrew

Gustaf <gu*****@algonet.se> wrote:

For those interested, I wrote a bit on how to write conformant XHTML 1.1
documents (the URL is temporary). Enjoy. :-)

http://gusgus.cn/www/xhtml/authoringxhtml11.html

It looks identical IE and Firefox. In Opera, the <pre> elements are
displayed in a smaller font.

Interesting. My copy of MSIE renders it as

File Download

You have chosen to download a file from this location.

authoringxhtml11.html from gusgus.cn

What would you like to do with this file?
( ) Open this file from its current location
(*) Save this file to disk

[OK] [Cancel] [More Info]

If I select "Open this file from its current location", then it calls Opera to
display the file.
--
Darin McGrew, mc****@stanfordalumni.org, http://www.rahul.net/mcgrew/
Web Design Group, da***@htmlhelp.com, http://www.HTMLHelp.com/

"I used to have a handle on life, but it broke."

Sep 1 '05 #4

Alan J. Flavell

On Thu, 1 Sep 2005, Darin McGrew wrote:

Gustaf <gu*****@algonet.se> wrote:

http://gusgus.cn/www/xhtml/authoringxhtml11.html

Interesting. My copy of MSIE renders it as

File Download

[...] ( ) Open this file from its current location [...]

Yup, back with Win/NT4 I configured IE to use Mozilla to open this
content-type ...

If I select "Open this file from its current location", then calls
Opera to display the file.

I've no argument with that...

But interestingly, if I try to use MSIE to access the above URL, I get
an alert saying "Your current security settings do not allow this file
to be downloaded".

MSIE always seems to have a new trick up its sleeve.

Maybe you haven't applied the latest MS security fixes? :-}

Sep 1 '05 #5

Jukka K. Korpela

Gustaf <gu*****@algonet.se> wrote:

Benjamin Niemann wrote:

IE does not recognize documents *served* as application/xhtml+xml as a
document it could display. It will popup a 'Save as..', 'Open with..' or
dunno dialog, if you try to load such a document.

That's strange. I started to make the switch to XHTML 1.1, and I have no
problem in IE6 on Windows XP SP2. Maybe the problem you mention occurs
in IE5 (which is bad enough).

No, it occurs on IE6 as well.

I saw that IE6
doesn't include "application/xhtml+xml" in the "Accept" HTTP header in
the request, but the page still renders properly.

IE6 does not explicitly mention application/xhtml+xml but it includes */*,
which means "anything goes".

http://gusgus.cn/www/xhtml/authoringxhtml11.html

It looks identical IE and Firefox.

Not here. I don't know what is going on, but on IE6 I see a dialog pop up,
then vanish, and the page opens in the browser _but_ with no CSS effects
and with no images, apparently because the browser shows a copy that it has
stored in the Temporary Internet Files folder.

I don't know a way to check what a server sends to IE6 specifically. It
could be something different from what one gets by just talking to the
server at port 80: the response is something like

HTTP/1.1 200 OK
Date: Thu, 01 Sep 2005 22:11:04 GMT
Server: Apache/1.3.33 (Unix) mod_ssl/2.8.22 OpenSSL/0.9.7e PHP/4.4.0
Last-Modified: Tue, 30 Aug 2005 16:22:23 GMT
ETag: "4bfc82-1a71-431487bf"
Accept-Ranges: bytes
Content-Length: 6769
Content-Type: application/xhtml+xml; charset=utf-8

That looks pretty normal, and essentially the same as for XHTML files that
IE6 refuses to open.

But now I notice that if I use the filename suffix .html for an XHTML file,
then I get the same reaction from IE6 as with your document. It seems that
IE6 (on my system at least) automagically downloads the file and opens it
in the browser - but from the temporary folder.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Sep 1 '05 #6

Gustaf

Jukka K. Korpela wrote:

IE6 does not explicitly mention application/xhtml+xml but it includes */*,
which means "anything goes".

Yes, I noticed. My guess is that resources fetched as */* trigger the
File Download dialog. But I still don't understand why I can browse
pages served as "application/xhtml+xml" in IE6. Am I really alone about
this?

My Windows is fully updated. My version of IE is:

6.0.2900.2180.xpsp_sp2_gdr.050301-1519

But now I notice that if I use the filename suffix .html for an XHTML file,
then I get the same reaction from IE6 as with your document.

If I use an .xhtml suffix, I get the File Download dialog, like the
others here. (There's a copy of the same document with an .xhtml suffix
at the same place now.)

Gustaf

Sep 2 '05 #7

Brian

Alan J. Flavell wrote:

interestingly, if I try to use MSIE to access the above URL, I get an
alert saying "Your current security settings do not allow this file
to be downloaded".

MSIE always seems to have a new trick up its sleeve.

Maybe you haven't applied the latest MS security fixes? :-}

:-D That was just plain mean. And rather funny, too. Thanks for that.

--
Brian

Sep 2 '05 #8

Henri Sivonen

In article <0Y********************@giganews.com>,
Gustaf <gu*****@algonet.se> wrote:

But I still don't understand why I can browse
pages served as "application/xhtml+xml" in IE6. Am I really alone about
this?

Do you have the MathPlayer plug-in installed? Or have you tweaked the
registry manually so that application/xhtml+xml becomes an alias for
text/html?

--
Henri Sivonen
hs******@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Sep 2 '05 #9

Jukka K. Korpela

Gustaf wrote:

But I still don't understand why I can browse
pages served as "application/xhtml+xml" in IE6. Am I really alone about
this?

No, you aren't. Now testing on Win XP SP 2, using IE 6 to access
http://gusgus.cn/www/xhtml/authoringxhtml11.html
opens your document as intended, with styles and images. However, on the
status line I see, for a short time, a message about _loading_ a file,
i.e. a message I don't normally see when accessing a web page.

If I try to access
http://gusgus.cn/www/xhtml/authoringxhtml11.xhtml
I get the dialog. This seems to be the same as your experience.

My conjecture is that IE 6 (on XP) first recognizes the document as
being of an unknown application type (from the HTTP headers, since IE 6
does not grok application/xhtml+xml). Then, after downloading it, it
starts to wonder what to do with it. It uses its usual suffix sniffing
and decides that .html means it's HTML (good old HTML) after all, and
automagically opens it in the browser window - whereas .xhtml is Greek
to it, at least by default.

I guess if I played with the filename suffix association settings in
Windows, IE 6 might open even URLs ending with .xhtml as (good old) HTML
documents. I'm not particularly interested in messing around with that,
though, and it wouldn't really be that relevant to HTML authoring for
the WWW - we can't expect users to play such games just to see our
glorious XHTML documents rendered by HTML rules.

Sep 2 '05 #10

Guy Macon

Jukka K. Korpela wrote:

No, you aren't. Now testing on Win XP SP 2, using IE 6 to access
http://gusgus.cn/www/xhtml/authoringxhtml11.html
opens your document as intended, with styles and images. However, on the
status line I see, for a short time, a message about _loading_ a file,
i.e. a message I don't normally see when accessing a web page.

If I try to access
http://gusgus.cn/www/xhtml/authoringxhtml11.xhtml
I get the dialog. This seems to be the same as your experience.

I just tried it on a fresh install of Windows 2000 Advanced Server
with no customization, and with a standard install of Mozilla
Firefox 1.06. IE is 6.0.2800.1106. All security updates are
installed.

WITH FIREFOX

http://gusgus.cn/www/xhtml/authoringxhtml11.xhtml and .html display fine.

WITH INTERNET EXPLORER

http://gusgus.cn/www/xhtml/authoringxhtml11.xhtml and .html display the
same dialog box:

File Download

Some files can harm your computer. If the file information below looks
suspicious, or you do not fully trust the source, do not open or save this
file.

File name: authoringxhtml11.html

File Type: FIERFOXHTML (looks like FireFox set up a file association...)

From: gusgus.cn

Would you like to open the file or save it to your computer?

[Open] [Save] [Cancel] [More info] (save is the default)

[ ] Always ask before opening this type of file (selected by default)

Save and Cancel do what you would expect.
More Info pulls up the downloading files section of IE help.
Open displays the page in FireFox with no CSS formatting and a URL of
file:///i:/Documents%20and%20Settings/Administrator/Local%20Settings/Temporary%20Internet%20Files/Content.IE5/8IA23D5F/authoringxhtml11%5B1%5D.html

The Always Ask check box is checked and won't stay unchecked.

Adding gusgus.cn to trusted sites in IE does not change the behavior.

Sep 2 '05 #11

Guy Macon

http://gusgus.cn/www/xhtml/authoringxhtml11.html says...

"According to the rules of XML, skipping the XML declaration
is okay only when using either UTF-8 or UTF-16 as character
encoding in the document."

I was under the impression that US-ASCII was OK as well.
Does anyone have a reference for the above?

Sep 2 '05 #12

Gustaf

Guy Macon wrote:

"According to the rules of XML, skipping the XML declaration
is okay only when using either UTF-8 or UTF-16 as character
encoding in the document."

I was under the impression that US-ASCII was OK as well.
Does anyone have a reference for the above?

If the encoding declaration is omitted, XML only admits UTF-8 or UTF-16.

http://www.w3.org/TR/REC-xml/#charencoding

But you can write pure ASCII documents just fine, since characters in
the ASCII range are encoded the same in UTF-8. That is, you don't need
an XML declaration for pure ASCII documents, since they will be treated
as UTF-8 documents.
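
A quick way to see this, as a minimal sketch (the helper name
is_ascii_compatible is made up for the example): encode the same text as
US-ASCII and as UTF-8 and compare the bytes.

```python
def is_ascii_compatible(text):
    """Return True if text encodes to the same bytes in US-ASCII and UTF-8."""
    try:
        ascii_bytes = text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character lies outside the ASCII range.
        return False
    return ascii_bytes == text.encode("utf-8")


markup = '<html xmlns="http://www.w3.org/1999/xhtml" lang="en">'
print(is_ascii_compatible(markup))        # pure ASCII markup
print(is_ascii_compatible("caf\u00e9"))   # contains a non-ASCII character
```

Any document for which this holds can be read by an XML processor as UTF-8
without an encoding declaration.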

Gustaf

Sep 2 '05 #13

Gustaf

Henri Sivonen wrote:

Do you have the MathPlayer plug-in installed? Or have you tweaked the
registry manually so that application/xhtml+xml becomes an alias for
text/html?

Of course not. I would have mentioned that.

Gustaf

Sep 2 '05 #14

Alan J. Flavell

On Fri, 2 Sep 2005, Guy Macon wrote:

http://gusgus.cn/www/xhtml/authoringxhtml11.html says...

"According to the rules of XML, skipping the XML declaration
is okay only when using either UTF-8 or UTF-16 as character
encoding in the document."

I was under the impression that US-ASCII was OK as well.

US-ASCII is just a special case of utf-8, in this sense: so yes,
that's OK too.[1]

Note that (to use recent Unicode terminology), the "character
encoding" of utf-16 comprises three "character encoding schemes":
utf-16 with BOM (where the byte ordering is discerned by reading the
BOM), utf-16LE and utf-16BE (where the byte ordering is laid down by
the name of the encoding scheme). I rather suspect that only the
first of those three schemes is legal XML without using the ?xml
thingy to specify the encoding (scheme!).

Does anyone have a reference for the above?

I'll leave that to someone who happens to have it at their fingertips,
if you didn't find it yourself.

But the character encoding scheme which is advertised from an HTTP
server via the MIME "charset=" is authoritative, according to RFC2616,
and this attribute should not be omitted according to security alert
CA-2000-02, so the <?xml thingy should really only be getting *used*
in non-HTTP contexts (e.g. reading a local file): if the ?xml thingy
specified an encoding in an HTTP context, that evidently needs to be
consistent with what the server's HTTP Content-type header says.[2]

cheers

[1] Amusingly, HTTP rules say that the default is iso-8859-1.

So you can present us-ascii to XML, and allow it to assume that it's
utf-8, and at the same time present it to HTTP and allow -it- to
assume that it's iso-8859-1, and *both of them are correct*, in this
special case ;-)

[2] Readers should not confuse this with any "meta http-equiv"
content-type, which has no meaning as far as XML is concerned.
But you knew that, right?

Sep 2 '05 #15

Alan J. Flavell

On Fri, 2 Sep 2005, while I was getting myself up to speed with
another followup, I now see that Gustaf wrote:

If the encoding declaration is omitted, XML only admits UTF-8 or UTF-16.

http://www.w3.org/TR/REC-xml/#charencoding

Thanks - I see that this actually confirms what I had just posted:

___
/
utf-16 with BOM (where the byte ordering is discerned by reading the
BOM), utf-16LE and utf-16BE (where the byte ordering is laid down by
the name of the encoding scheme). I rather suspect that only the
first of those three schemes is legal XML without using the ?xml
thingy to specify the encoding (scheme!).
\___

cheers

Sep 2 '05 #16

Guy Macon

Gustaf wrote:

Guy Macon wrote:
"According to the rules of XML, skipping the XML declaration
is okay only when using either UTF-8 or UTF-16 as character
encoding in the document."

I was under the impression that US-ASCII was OK as well.
Does anyone have a reference for the above?

If the encoding declaration is omitted, XML only admits UTF-8 or UTF-16.

http://www.w3.org/TR/REC-xml/#charencoding

But you can write pure ASCII documents just fine, since characters in
the ASCII range are encoded the same in UTF-8. That is, you don't need
an XML declaration for pure ASCII documents, since they will be treated
as UTF-8 documents.

On my website (http://www.guymacon.com/) I set my .htaccess so that
the server response includes:

Content-Type: text/html; charset=us-ascii

Then I wrote my markup with no XML declaration (to avoid triggering
quirks mode in the brain-dead MS browser):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII" />
...

(The meta http-equiv is useless, but I don't think it hurts anything.)

So, unless I am mistaken, I am using US-ASCII character encoding,
and I have skipped the XML declaration, and yet I am within the
rules of XML. So would it be fair to say that...

"According to the rules of XML, skipping the XML declaration
is okay only when using either UTF-8 or UTF-16 as character
encoding in the document."

...should be changed to...

"According to the rules of XML, skipping the XML declaration
is okay only when using US-ASCII, UTF-8 or UTF-16 as the
character encoding in the document."

...?

Or am I missing something? (It wouldn't be the first time...)

Sep 2 '05 #17

Alan J. Flavell

On Fri, 2 Sep 2005, Guy Macon wrote:

http://www.w3.org/TR/REC-xml/#charencoding

On my website (http://www.guymacon.com/) I set my .htaccess so that
the server response includes:

Content-Type: text/html; charset=us-ascii

Then I wrote my markup with no XML declaration

Read the cited reference carefully again - it includes the following
at 4.3.3, third paragraph:

In the absence of external character encoding information (such as
MIME headers), parsed entities which are stored in an encoding other
than UTF-8 or UTF-16 MUST begin with a text declaration [...]
containing an encoding declaration.

But you *are* providing "external character encoding information", so
you don't need the ?xml thingy (ahem, the xml "text declaration").

So, what you are doing is fine, but your explanation of why you are
doing it was adrift, and your proposed correction to the spec was
off-beam. Even if the HTTP Content-type had advertised
charset=iso-8859-2, or Big5 or whatever, you *still* would not have
needed the ?xml thingy. (Of course, this only works for encoding
schemes which are supported by the XML processor in question, but
that applies whichever way you communicate the character encoding to
them!)
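
As a rough sketch of that precedence (external information first, then the
document's own declaration, then the UTF-8/UTF-16 default), assuming a
hypothetical function xml_encoding that receives the charset from the HTTP
Content-Type header (or None) and the raw bytes of the document. It is a
simplification: it only looks at a BOM and at a declaration written in an
ASCII-compatible encoding, not at the full autodetection a real XML
processor performs.

```python
import re

# Encoding pseudo-attribute inside an "<?xml ... ?>" declaration at the very
# start of the document (simplified; assumes ASCII-compatible bytes).
_DECLARATION = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def xml_encoding(http_charset, body):
    """Pick the encoding of an XML document, most authoritative source first."""
    if http_charset:
        # External information (e.g. the HTTP header) takes precedence.
        return http_charset.lower()
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"   # UTF-8 byte order mark
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"  # UTF-16 byte order mark, either byte order
    match = _DECLARATION.match(body)
    if match:
        # Encoding declaration inside the document itself.
        return match.group(1).decode("ascii").lower()
    # Nothing declared anywhere: XML assumes UTF-8.
    return "utf-8"


# The case from this thread: charset in the header, no XML declaration.
print(xml_encoding("us-ascii", b"<!DOCTYPE html><html/>"))
# No external information, but a declaration in the document.
print(xml_encoding(None, b'<?xml version="1.0" encoding="ISO-8859-2"?><html/>'))
```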

good luck

Sep 2 '05 #18

Lachlan Hunt

Alan J. Flavell wrote:

[1] Amusingly, HTTP rules say that the default is iso-8859-1.

Only for text/* media types (including text/xml). application/*
(including application/xml and application/xhtml+xml) do not have a
default charset defined by HTTP rules.

So you can present us-ascii to XML, and allow it to assume that it's
utf-8, and at the same time present it to HTTP and allow -it- to
assume that it's iso-8859-1. and *both of them are correct*, in this
special case ;-)

That's why RFC3023 enforces a US-ASCII default for text/xml, since it is
a subset of both.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Sep 3 '05 #19

Henri Sivonen

In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:

But the character encoding scheme which is advertised from an HTTP
server via the MIME "charset=" is authoritative, according to RFC2616,

Yes.

and this attribute should not be omitted according to security alert
CA-2000-02,

That alert did not convince me. Do you have concrete threat scenarios?
Do they involve encodings that do not map Basic Latin to US-ASCII bytes?

so the <?xml thingy should really only be getting *used*
in non-HTTP contexts (e.g reading a local file):

I strongly disagree. There is no security risk in using application/*
types and internal character encoding information. Even the TAG
disagrees with RFC 3023.

The TAG has found "Thus there is no ambiguity when the charset is
omitted, and the STRONGLY RECOMMENDED injunction [of RFC 3023] to use
the charset is misplaced for application/xml and for non-text "+xml"
types." (http://www.w3.org/2001/tag/2004/0430...char-encoding).

RFC 3023's insistence on declaring the encoding authoritatively outside
the XML byte stream itself is, in my opinion, as silly as insisting on
declaring the compression method of a zip archive authoritatively on the
HTTP level instead of using the information stored in the file.

if the ?xml thingy
specified an encoding in an HTTP context, that evidently needs to be
consistent with what the server's HTTP Content-type header says.[2]

Yes, if the Content-Type says something about the encoding, it had
better be consistent with the document.

[1] Amusingly, HTTP rules say that the default is iso-8859-1.

But that does not apply to application/* types. For text/xml there is
RFC 3023, which says US-ASCII. For text/html, there is reality, which
disagrees. For text/css, the CSS WG has more practical rules.

--
Henri Sivonen
hs******@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Sep 3 '05 #20

Alan J. Flavell

On Sat, 3 Sep 2005, Lachlan Hunt wrote:

Alan J. Flavell wrote:
[1] Amusingly, HTTP rules say that the default is iso-8859-1.

Only for text/* media types (including text/xml). application/* (including
application/xml and application/xhtml+xml) do not have a default charset
defined by HTTP rules.

Oh yes, you're right: thanks for the correction.

So you can present us-ascii to XML, and allow it to assume that
it's utf-8, and at the same time present it to HTTP and allow -it-
to assume that it's iso-8859-1. and *both of them are correct*, in
this special case ;-)

That's why RFC3023 enforces a US-ASCII default for text/xml, since
it is a subset of both.

Yes, I think you'll find that RFC2616 was anomalous in defining that
default of iso-8859-1 for text/* media types: the normal MIME default
for text/* was indeed us-ascii, as set out in RFC2046 (which is part
of the MIME specifications).
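
Summarised as a small lookup, purely as a sketch of the defaults named in
this thread (the function name default_charset is made up, and as noted
elsewhere in the thread, real browsers do not necessarily follow these
defaults for text/html, and text/css has rules of its own):

```python
def default_charset(media_type):
    """Charset assumed when the Content-Type header carries no charset.

    Returns None where no default is defined at the HTTP/MIME level.
    """
    media_type = media_type.strip().lower()
    if media_type == "text/xml":
        return "us-ascii"    # RFC 3023
    if media_type.startswith("text/"):
        return "iso-8859-1"  # RFC 2616 default for text/* over HTTP
    # application/* (including application/xhtml+xml): no default here;
    # for XML types the document's own rules apply instead.
    return None


for media_type in ("text/html", "text/xml", "application/xhtml+xml"):
    print(media_type, "->", default_charset(media_type))
```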

Sep 3 '05 #21

Lachlan Hunt

Gustaf wrote:

For those interested, I wrote a bit on how to write conformant XHTML 1.1
documents (the URL is temporary). Enjoy. :-)

http://gusgus.cn/www/xhtml/authoringxhtml11.html

You have made a number of mistakes in that document.

1. Triggering standards mode

Firstly, it's called a DOCTYPE declaration, not a "DOCTYPE tag".

Secondly, while you are correct that standards/quirks mode will usually
be triggered by the presence (or absence) of various DOCTYPEs, all
browsers that support XHTML will use standards mode when the document is
served as XML, regardless of the DOCTYPE used (even if it's omitted).

2. The XML declaration

It is correct that the XML declaration will trigger quirks mode in IE,
however (as already pointed out in this thread) IE does not support
application/xhtml+xml (although, it seems that it will sometimes use
content sniffing if the file has a .html extension). Basically, if
you're going to serve XHTML with the correct MIME type, IE bugs are
irrelevant.

3. Choice of character encoding

| According to the rules of XML, skipping the XML declaration is okay
| only when using either UTF-8 or UTF-16 as character encoding in the
| document.

That's only true when the encoding is not specified in a higher level
protocol, such as the HTTP content-type header.

4. Setting Content-Type in the HTTP headers

You suggest that authors send this:

Content-Type: application/xhtml+xml; charset=utf-8

However, the W3C TAG WG disagree:

# Good practice: XML and character encodings
#
# In general, a representation provider SHOULD NOT specify the character
# encoding for XML data in protocol headers since the data is
# self-describing.

(Note: that only really applies to application/*+xml, not text/xml,
which they recommend should be avoided)

http://www.w3.org/TR/2004/REC-webarc...ml-media-types

5. Setting Content-Type in the meta element

<meta http-equiv="Content-Type" content="application/xhtml+xml;
charset=utf-8"/>

That's completely useless for determining the MIME type, even if the
file is read from the local file system. Typically, when read from a
local file system, files with a .htm or .html file extension are
processed as text/html and files with .xht or .xhtml are processed as
application/xhtml+xml.

When it's processed as text/html, UAs are lenient enough with their
parsing to determine the character encoding from the meta element, but
will not begin processing the document as XML. When it's parsed as XML,
the meta element is completely useless and UAs will obey the XML
declaration (if present) or default to UTF-8/UTF-16, as defined by the
XML rec.

6. Saving documents

| it's best to avoid the BOM in UTF-8, because its presence is not
| supported everywhere.
| ...
| 2. UTF-8 documents must not be saved with a BOM.

That is not true for XML documents. XML processors are required to
fully support UTF-8 and UTF-16 (including the BOM). That guideline is
only relevant for serving UTF-8 encoded files as text/html to obsolete
browsers.

By the way, SuperEdi is another good editor that supports Unicode, and
even includes an option to omit the BOM.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Sep 3 '05 #22

Alan J. Flavell

On Sat, 3 Sep 2005, Henri Sivonen wrote:

"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:

and this attribute should not be omitted according to security alert
CA-2000-02,

That alert did not convince me. Do you have concrete threat scenarios?

Sorry, I can't offer anything specific, beyond what it says at
http://www.cert.org/tech_tips/malici...igation.html#3

I've seen a case where dangerous markup was sneaked into a web page by
a technique of this kind, but I can no longer give you chapter and
verse, sorry. I don't think it's become a popular attack scenario.

I recognised that it's a complex issue, and that the Apache folks
responded by related changes in their default configurations, and
concluded it would be best to follow their advice as far as possible.
But this is for text/* MIME types.

so the <?xml thingy should really only be getting *used*
in non-HTTP contexts (e.g reading a local file):

I strongly disagree.

Well, I've been caught-out on the difference between text/* and
application/* data types, so I'm in no position to argue... But just
to clarify what I was trying to say there:

* if the ?xml encoding specifier is /present/, (which you presumably
favour), it will be overridden when the HTTP Content-type also
specifies the charset= attribute. Then, the ?xml encoding specifier
can do nothing better than to repeat what is already known and
authoritative from HTTP: in that sense, it is not actually /used/,
even though it's present.

There is no security risk in using application/*
types and internal character encoding information.

I can't see any reason to dispute that. The CERT CA alert relates
specifically to text/html, and at its widest to text/* MIME types.

Thanks for the corrections.

Sep 3 '05 #23

Henri Sivonen

In article <Pi******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:

Sorry, I can't offer anything specific, beyond what it says at
http://www.cert.org/tech_tips/malici...igation.html#3

Thanks.

It seems to me that server-side programs including tainted snippets of
text on the byte level are a major part of the problem, and that it could
be avoided if the server-side programs operated on the character level
internally (in which case they'd have to perform a bytes to characters
conversion of the tainted text before doing anything with it).
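
A minimal sketch of what operating on the character level could look like,
assuming a hypothetical helper include_tainted and a page template with a
{snippet} placeholder (both invented for this example). The tainted bytes
are decoded first, escaped as characters, and only the finished page is
encoded again:

```python
from html import escape


def include_tainted(template, tainted_bytes, encoding="utf-8"):
    """Insert untrusted input into a page after decoding it to characters."""
    # Bytes to characters first; malformed input raises UnicodeDecodeError
    # instead of being copied into the page byte for byte.
    text = tainted_bytes.decode(encoding)
    # Escape markup-significant characters on the character level.
    page = template.format(snippet=escape(text))
    # Encode the complete page once, in the encoding that is also declared.
    return page.encode(encoding)


print(include_tainted("<p>{snippet}</p>", b"<script>alert(1)</script>"))
```

The encoding used here has to be the same one the server advertises for the
page; otherwise the escaping and the browser's decoding can disagree.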

--
Henri Sivonen
hs******@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Sep 4 '05 #24
