This may be too far off topic, however I was looking at this page http://www.hixie.ch/advocacy/xhtml about XHTML problems by Ian Hickson.
It is served as text/plain, according to Firefox
Response Headers - http://www.hixie.ch/advocacy/xhtml
Date: Wed, 23 Nov 2005 21:36:06 GMT
Server: Apache/1.3.33 (Unix) DAV/1.0.3 mod_fastcgi/2.4.2
mod_gzip/1.3.26.1a PHP/4.3.10 mod_ssl/2.8.22 OpenSSL/0.9.7e
Vary: Accept-Encoding,User-agent
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-Hixie
Last-Modified: Sat, 17 Sep 2005 12:16:19 GMT
Etag: "17063c7-4a12-432c0913"
Accept-Ranges: bytes
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/plain; charset=utf-8
Content-Encoding: gzip
Content-Length: 7452
200 OK
The page displays in Firefox, and in Opera as if the text were
surrounded by pre tags. In Safari 2, the page displays as a single long
(but word wrapped) string, as if Safari were treating it as HTML markup.
The interesting point to me is that the displayed contents are
incomplete. Safari has the contents, as looking at source confirms.
The places where the contents are not displayed are
<script type="text/javascript"><!--//--><![CDATA[//><!--
...
//--><!]]></script>
which is replaced by *
<script type="text/javascript"><!--//--><![CDATA[//><!--
...
//--><!]]></script>
which is not replaced by anything.
The document as displayed truncates on the next paragraph, when it
encounters
<script> and <style>
Given that the script element never closes, it seems reasonable to hide
the contents.
So my question is, should a browser display a file served as text/plain
the way Firefox and Opera do, or should a browser look deep inside the
file for HTML (or other tags) the way Safari does?
Or should it use some heuristic to second guess the server, given the
number of servers that do not correctly identify content-type?
If a browser pays attention only to the content-type as provided by the
server, what should it do about a file.css served as text/html instead
of text/css? Or isn't that a problem when the css file could be
considered to be included in the html file that calls it?
-- http://www.ericlindsay.com
On Thu, 24 Nov 2005, Eric Lindsay wrote:
This may be too far off topic, however I was looking at this page http://www.hixie.ch/advocacy/xhtml about XHTML problems by Ian Hickson.
I've often seen plain-text documents from Hixie, but I must admit
I hadn't looked at their headers.
It is served as text/plain, according to Firefox Response Headers - http://www.hixie.ch/advocacy/xhtml
[...] Vary: Accept-Encoding,User-agent
[...] Content-Type: text/plain; charset=utf-8
Which is at least *suggestive* that there might be other variants
available, although we don't know what they are...
But a visit to http://www.hixie.ch/advocacy/ shows a conventional
directory listing. If there's any alternative version served out to
other browsers or in other character encodings, it would have to be
done by some kind of server conversion...? *Do* note that
accept-language is *not* one of the negotiation dimensions according
to that Vary header, even though there appears to be a French
translation available in the directory listing.
The page displays in Firefox, and in Opera as if the text were surrounded by pre tags.
Well no, it displays "as plain text". There are big differences
between the two assertions, when the material contains markup and
&-notations - which this does.
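To make the difference concrete, here is a minimal sketch in Python. The helper name as_pre_markup and the sample string are made up for illustration. It shows that text dropped into pre is still parsed as HTML, so it only comes out looking like plain text if the markup-significant characters are escaped first.

```python
import html

def as_pre_markup(plain_text: str) -> str:
    """Wrap text in <pre> so that an HTML parser displays it literally.

    Hypothetical helper: escapes &, < and > before wrapping.
    """
    return "<pre>" + html.escape(plain_text, quote=False) + "</pre>"

sample = "Use <script> and &amp; with care"

# Naive wrapping: an HTML parser would still see a script start tag
# and an entity reference inside the pre element.
naive = "<pre>" + sample + "</pre>"

# Escaped wrapping: this is what matches a text/plain rendering.
escaped = as_pre_markup(sample)

print(naive)
print(escaped)
```

A text/plain response needs none of this, because no HTML parsing takes place at all.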
In Safari 2, the page displays as a single long (but word wrapped) string, as if Safari were treating it as HTML markup.
Booooooh!
The interesting point to me is that the displayed contents are incomplete. Safari has the contents, as looking at source confirms. The places where the contents are not displayed are
<script type="text/javascript"><!--//--><![CDATA[//><!-- ... //--><!]]></script>
This is fun stuff, but you really mustn't let yourself be so grossly
diverted from making real web pages, or you'll risk ending up like me
- posting too much about pedantic detail, and never getting around to
updating my sadly obsolescent web pages. Not good.
So my question is, should a browser display a file served as text/plain the way Firefox and Opera do,
Of course.
or should a browser look deep inside the file for HTML (or other tags) the way Safari does?
Sigh. I've been battering on about the mandate of RFC2616, but
somehow it doesn't seem to have sunk home. See the notes below the
table at http://ppewww.ph.gla.ac.uk/~flavell/....html#browconf ,
which now take you directly to the relevant section of (the W3C's
HTML-ised copy of) RFC2616 - http://www.w3.org/Protocols/rfc2616/....html#sec7.2.1
Or should it use some heuristic to second guess the server,
Absolutely and utterly not. RFC2616 forbids it.
given the number of servers that do not correctly identify content-type?
It would still be permissible for a browser to say to its user "excuse
me, this content seems to be the wrong type. At some risk to your
security, I could try to guess this, are you prepared to take that
chance?". What RFC2616 is ruling out is that a client agent should
take it upon itself to unilaterally second-guess, without informed
consent from its user. That's my best interpretation, anyway.
If a browser pays attention only to the content-type as provided by the server, what should it do about a file.css served as text/html instead of text/css?
Per RFC2616, it's mandated to ignore it, i.e. to render the HTML
without it, and Mozilla does so[1]: that's correct behaviour.
Unfortunately, some other browsers are not so cautious. The web would
be a better place if they were.
[1] at least in its Standards mode.
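As a rough illustration of the strict behaviour described above, here is a small Python sketch. The function names and the returned strings are hypothetical and do not come from any real browser; the point is only that the decision is taken from the Content-Type header and never from a peek at the body.

```python
def media_type(content_type: str) -> str:
    """Return the bare type/subtype, dropping parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()

def render_mode(content_type: str) -> str:
    """Choose how to present a top-level document from the header alone."""
    kind = media_type(content_type)
    if kind == "text/html":
        return "parse as HTML"
    if kind == "text/plain":
        return "show as plain text"
    return "offer to save or pass to a helper"

def stylesheet_usable(content_type: str) -> bool:
    """A linked stylesheet is applied only when it is labelled text/css."""
    return media_type(content_type) == "text/css"

print(render_mode("text/plain; charset=utf-8"))
print(stylesheet_usable("text/html; charset=iso-8859-1"))
print(stylesheet_usable("text/css"))
```

A client that wanted to offer the "are you prepared to take that chance?" prompt could add it as a separate, user-confirmed step without changing these functions.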
On Wed, 23 Nov 2005, Alan J. Flavell wrote:
Vary: Accept-Encoding,User-agent
If there's any alternative version served out to other browsers or in other character encodings, it would have to be done by some kind of server conversion...?
Sorry, I shot my mouth off too quickly on that point. It wasn't
"accept-charset" in that header, it was "accept-encoding". That's why
his server has sent gzip-ed content, because the browser said it was
willing to accept that encoding. Nothing to do with
character-encoding ("charset"). Sorry for that - spotted my mistake
just too late!
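For anyone else who mixes the two up, a small Python sketch of the two separate layers may help. The helper names are invented for this example: the charset turns characters into bytes, and the content-coding (gzip here) then compresses those bytes for transfer.

```python
import gzip

def vary_fields(vary_header: str) -> list:
    """Split a Vary header value into lower-cased request header names."""
    return [name.strip().lower() for name in vary_header.split(",") if name.strip()]

def to_wire(text: str, charset: str = "utf-8") -> bytes:
    """Apply the charset first, then the gzip content-coding."""
    body = text.encode(charset)      # character encoding ("charset")
    return gzip.compress(body)       # content-coding ("Content-Encoding: gzip")

def from_wire(wire: bytes, charset: str = "utf-8") -> str:
    """Undo the two layers in reverse order."""
    return gzip.decompress(wire).decode(charset)

fields = vary_fields("Accept-Encoding,User-agent")
print(fields)
print("accept-charset" in fields)
print(from_wire(to_wire("caf\u00e9")))
```

The Vary value is the one from the quoted response headers; it names Accept-Encoding and User-agent only, so nothing in it suggests negotiation on charset or language.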
--
Post in haste, repent at leisure...
Eric Lindsay <NO**********@ericlindsay.com> writes:
The page displays in Firefox, and in Opera as if the text were surrounded by pre tags. In Safari 2, the page displays as a single long (but word wrapped) string, as if Safari were treating it as HTML markup.
Filed bug #4353871, at: <http://bugreporter.apple.com>.
sherm--
--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org
Alan J. Flavell said the following on 11/24/2005 00:04 +0200:
But a visit to http://www.hixie.ch/advocacy/ shows a conventional directory listing. If there's any alternative version served out to other browsers or in other character encodings, it would have to be done by some kind of server conversion...? *Do* note that accept-language is *not* one of the negotiation dimensions according to that Vary header, even though there appears to be a French translation available in the directory listing.
Which is a directory, which contains a French version in HTML.
Speaking of language, what's it with his "Content-Language:
en-GB-Hixie"? Is it a valid Content-Language?
--
Regards
Harrie
In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:
It is served as text/plain, according to Firefox Response Headers - http://www.hixie.ch/advocacy/xhtml
Which is at least *suggestive* that there might be other variants available, although we don't know what they are...
Like you, I couldn't see anything in the directory to indicate that some
user agent may have received a different version. But that was why I
used Firefox rather than curl when I looked at the header. Now that you
folks have pointed out that dynamically generated sites may be doing all
sorts of things according to user agent (or spider), I am no longer sure
of anything.
The page displays in Firefox, and in Opera as if the text were surrounded by pre tags.
Well no, it displays "as plain text". There are big differences between the two assertions, when the material contains markup and &-notations - which this does.
So now I need to chase up what really happens with pre? That was
another tag I thought I could just ignore.
This is fun stuff, but you really mustn't let yourself be so grossly diverted from making real web pages, or you'll risk ending up like me - posting too much about pedantic detail, and never getting around to updating my sadly obsolescent web pages. Not good.
I already have a large collection of sadly obsolescent web pages. I
just hope when I understand this a bit better I eventually get around to
putting together a content management system and update them.
So my question is, should a browser display a file served as text/plain the way Firefox and Opera do,
Of course.
Good, that is what I thought.
or should a browser look deep inside the file for HTML (or other tags) the way Safari does?
Sigh. I've been battering on about the mandate of RFC2616, but somehow it doesn't seem to have sunk home. See the notes below the table at http://ppewww.ph.gla.ac.uk/~flavell/....html#browconf , which now take you directly to the relevant section of (the W3C's HTML-ised copy of) RFC2616 - http://www.w3.org/Protocols/rfc2616/....html#sec7.2.1
Thanks for that direct link Alan. That certainly is a clear demand that
it not be done. I was pretty sure Safari was wrong, but given they seem
to keep working on it, I thought maybe they knew something I didn't.
Bug report sent.
If a browser pays attention only to the content-type as provided by the server, what should it do about a file.css served as text/html instead of text/css?
Per RFC2616, it's mandated to ignore it, i.e to render the HTML without it, and Mozilla does so[1]: that's correct behaviour. Unfortunately, some other browsers are not so cautious. The web would be a better place if they were.
[1] at least in its Standards mode.
I had noticed that Mozilla said it ignored incorrectly served CSS files.
I wasn't actually sure that was really the case, because I originally
checked servers with curl, and in one case found my css file served as
text/html
curl --head www.sheltersrus.com.au/sheltersrus.css
HTTP/1.1 302 Found
Date: Thu, 24 Nov 2005 03:06:05 GMT
Server: Apache/1.3.29 Sun Cobalt (Unix) mod_ssl/2.8.16 OpenSSL/0.9.6m
PHP/4.3.4 mod_auth_pam_external/0.1 FrontPage/5.0.2.2510 mod_perl/1.26
Location: http://site.sheltersrus.com.au/sheltersrus.css
Content-Type: text/html; charset=iso-8859-1
However look at the different header for the same file from Firefox
Response Headers - http://site.sheltersrus.com.au/sheltersrus.css
Date: Thu, 24 Nov 2005 03:07:34 GMT
Server: Apache/1.3.29 Sun Cobalt (Unix) mod_ssl/2.8.16 OpenSSL/0.9.6m
PHP/4.3.4 mod_auth_pam_external/0.1 FrontPage/5.0.2.2510 mod_perl/1.26
Last-Modified: Sat, 19 Nov 2005 06:58:32 GMT
Etag: "844310-524-437ecd18"
Accept-Ranges: bytes
Content-Length: 1316
Keep-Alive: timeout=15
Connection: Keep-Alive
Content-Type: text/css
200 OK
The charset variation is also interesting. They almost look like
different files. Oops. I think they are. Look at this.
curl --head site.sheltersrus.com.au/sheltersrus.css
HTTP/1.1 200 OK
Date: Thu, 24 Nov 2005 03:14:51 GMT
Server: Apache/1.3.29 Sun Cobalt (Unix) mod_ssl/2.8.16 OpenSSL/0.9.6m
PHP/4.3.4 mod_auth_pam_external/0.1 FrontPage/5.0.2.2510 mod_perl/1.26
Last-Modified: Sat, 19 Nov 2005 06:58:32 GMT
ETag: "844310-524-437ecd18"
Accept-Ranges: bytes
Content-Length: 1316
Content-Type: text/css
So I guess I need to check every site for 302 responses instead of 200.
I thought the thing was named www.sheltersrus.com.au, not
site.sheltersrus.com.au
-- http://www.ericlindsay.com
Harrie wrote in message news:43***********************@news.xs4all.nl...
Alan J. Flavell said the following on 11/24/2005 00:04 +0200:
But a visit to http://www.hixie.ch/advocacy/ shows a conventional directory listing. If there's any alternative version served out to other browsers or in other character encodings, it would have to be done by some kind of server conversion...? *Do* note that accept-language is *not* one of the negotiation dimensions according to that Vary header, even though there appears to be a French translation available in the directory listing.
Which is a directory, which contains a French version in HTML.
Speaking of language, what's it with his "Content-Language: en-GB-Hixie"? Is it a valid Content-Language?
In theory, yes. http://www.faqs.org/rfcs/rfc3066.html
"2.1 Language tag syntax
The language tag is composed of one or more parts: A primary language
subtag and a (possibly empty) series of subsequent subtags."
*series of subsequent subtags*
Practically, no.
"2.2 Language tag sources
The namespace of language tags is administered by the Internet
Assigned Numbers Authority (IANA) [RFC 2860] according to the rules
in section 3 of this document."
http://www.iana.org/assignments/language-tags doesn't show "en-GB-Hixie" to be registered.
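The "in theory, yes" half can be checked mechanically. Below is a minimal Python sketch of the RFC 3066 section 2.1 syntax only: a primary subtag of 1 to 8 letters, followed by any number of subtags of 1 to 8 letters or digits. The function name is made up, and the check says nothing about whether a tag is registered with IANA, which is the "practically, no" half.

```python
import re

# RFC 3066 section 2.1 syntax:
#   Language-Tag   = Primary-subtag *( "-" Subtag )
#   Primary-subtag = 1*8ALPHA
#   Subtag         = 1*8(ALPHA / DIGIT)
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

def is_well_formed(tag: str) -> bool:
    """Syntax check only; registration is not consulted."""
    return LANGUAGE_TAG.fullmatch(tag) is not None

for tag in ("en-GB", "en-GB-Hixie", "en_GB", "en-"):
    print(tag, is_well_formed(tag))
```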
On Thu, 24 Nov 2005, Eric Lindsay wrote:
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:
Which is at least *suggestive* that there might be other variants available, although we don't know what they are...
Like you, I couldn't see anything in the directory to indicate that some user agent may have received a different version.
Right - other than maybe sending the file compressed (e.g. gzip) if the
client agent says via Accept-encoding that it accepts that.
The page displays in Firefox, and in Opera as if the text were surrounded by pre tags.
Well no, it displays "as plain text". There are big differences between the two assertions, when the material contains markup and &-notations - which this does.
So now I need to chase up what really happens with pre?
Just normal HTML parsing!
checked servers with curl, and in one case found my css file served as text/html
curl --head www.sheltersrus.com.au/sheltersrus.css
HTTP/1.1 302 Found
Date: Thu, 24 Nov 2005 03:06:05 GMT
Server: Apache/1.3.29 Sun Cobalt (Unix) mod_ssl/2.8.16 OpenSSL/0.9.6m
PHP/4.3.4 mod_auth_pam_external/0.1 FrontPage/5.0.2.2510 mod_perl/1.26
Location: http://site.sheltersrus.com.au/sheltersrus.css
Content-Type: text/html; charset=iso-8859-1
No, that's a redirection response. If you want curl to follow
redirection responses, you need this option:
-L/--location
(HTTP/HTTPS) If the server reports that the requested page has a
different location (indicated with the header line Location:)
this flag will let curl attempt to reattempt the get on the new
place.
See its man page.
When you get a status 30x redirection response, it usually comes
with a text/html body part. Most normal client agents however will
respond to the 30x status by proceeding to the new URL given in the
Location: header of the response. (RFC2616 for details).
In this case, the server is recognising the request for http://www.sheltersrus.com.au/sheltersrus.css and redirecting
the request to http://site.sheltersrus.com.au/sheltersrus.css
(a rather curious thing to do - I would rather have expected
the opposite, seeing that www.sheltersrus.com.au is likely to
be the human-expected name for the site).
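What a redirect-following client does with those two responses can be sketched in Python without touching the network. The CANNED table below copies only the status codes, Location and Content-Type values quoted in this thread; the follow function and its hop limit are hypothetical, and it assumes Location holds an absolute URL as it does here.

```python
# Canned responses built from the headers quoted in this thread.
CANNED = {
    "http://www.sheltersrus.com.au/sheltersrus.css": (
        302,
        {
            "Location": "http://site.sheltersrus.com.au/sheltersrus.css",
            "Content-Type": "text/html; charset=iso-8859-1",
        },
    ),
    "http://site.sheltersrus.com.au/sheltersrus.css": (
        200,
        {"Content-Type": "text/css"},
    ),
}

def follow(url, responses, max_hops=5):
    """Follow Location headers and report each hop plus the final Content-Type."""
    hops = []
    for _ in range(max_hops):
        status, headers = responses[url]
        hops.append((status, url))
        if status in (301, 302, 303, 307) and "Location" in headers:
            # The text/html type on this hop describes the redirect notice,
            # not the stylesheet, so it is simply not used.
            url = headers["Location"]
            continue
        return hops, headers.get("Content-Type")
    raise RuntimeError("too many redirects")

hops, final_type = follow("http://www.sheltersrus.com.au/sheltersrus.css", CANNED)
for status, hop_url in hops:
    print(status, hop_url)
print(final_type)
```

The text/html seen on the first hop never reaches the stylesheet check; only the Content-Type of the final 200 response counts.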
However look at the different header for the same file from Firefox
Response Headers - http://site.sheltersrus.com.au/sheltersrus.css
Exactly!
So I guess I need to check every site for 302 responses instead of 200.
As I say, you can use the -L option on curl.
In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:
So I guess I need to check every site for 302 responses instead of 200.
As I say, you can use the -L option on curl.
Thanks Alan. curl -L --head URL works fine, and reports the redirect
and the actual page just fine. I'll use that as my default set of
options in future, so I actually notice redirects.
I tend to get a bit lost in the options in curl. I had only really
looked at it for doing something like uploading web pages. Having
looked at curl, I used a here document with the command line ftp
instead. Seemed a whole heap easier to understand.
-- http://www.ericlindsay.com
In article <m2************@Sherm-Pendleys-Computer.local>,
Sherm Pendley <sh***@dot-app.org> wrote:
Eric Lindsay <NO**********@ericlindsay.com> writes:
The page displays in Firefox, and in Opera as if the text were surrounded by pre tags. In Safari 2, the page displays as a single long (but word wrapped) string, as if Safari were treating it as HTML markup.
Filed bug #4353871, at: <http://bugreporter.apple.com>.
Thanks Sherm. I'm not a developer, so I can only use the bug reporting
menu item in Safari.
-- http://www.ericlindsay.com