what's wrong with this page?

hi

i've got kind of a strange problem i'm struggling with here. this
page:

http://www.pohflepp.de/eavesdripping4.html

won't be recognized as html/xml by firefox. i've tried everything that
i could think of but still it doesn't work. looks great in safari,
though.

please have a look at it and tell me what i'm doing wrong!
thanks a lot
sascha

Aug 11 '05 #1
On 11/08/2005 11:28, pl*****@gmail.com wrote:
> i've got kind of a strange problem i'm struggling with here. this
> page:
>
> http://www.pohflepp.de/eavesdripping4.html


This is an encoding problem. Part of the document is encoded as
something compatible with US-ASCII, but other parts are trying to be
UTF-16. Add to this the fact that the XML prolog states the character
encoding is UTF-8, the server sends no character encoding information at
all, and a META element claims ISO-8859-1 (if the browser even gets that
far), and you have a real mess.
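One rough way to see this kind of mix for yourself is to count NUL bytes in fixed-size windows of the file. ASCII-range text stored as UTF-16 carries a zero byte next to every character, while US-ASCII, ISO-8859-1 and UTF-8 text normally carries none. The sketch below is only an illustration: the function name `nul_profile`, the window size and the sample bytes are made up for the example and are not taken from the page in question.

```python
def nul_profile(data: bytes, window: int = 16) -> list:
    """Return (offset, share of NUL bytes) for each fixed-size window of data."""
    profile = []
    for start in range(0, len(data), window):
        chunk = data[start:start + window]
        profile.append((start, chunk.count(b"\x00") / len(chunk)))
    return profile


# Hypothetical sample: an ASCII fragment followed by a UTF-16 fragment,
# concatenated the way the page appears to have been assembled.
ascii_part = b"<html><body>\n\n\n\n"
utf16_part = "<p>Hello</p>\n\n\n\n".encode("utf-16-le")

for offset, share in nul_profile(ascii_part + utf16_part):
    print(offset, share)
```

A share near 0.5 suggests UTF-16 text in the ASCII range, and a share of 0 suggests a single-byte or UTF-8 region. It is a heuristic, not a real encoding detector.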

Choose an encoding (ISO-8859-1 is probably OK), stick to it, and fix
your server to send the charset parameter in the Content-Type header.
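As a sketch of what "send the charset parameter" means in practice, here is a minimal handler built on Python's standard `http.server` module. It assumes nothing about the server the page actually runs on; `PAGE`, `ENCODING`, `content_type` and `Handler` are names invented for the example. The point is only that the bytes on the wire and the charset named in the header agree.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical page content and the single encoding chosen for it.
PAGE = "<!DOCTYPE html><title>Test</title><p>Gr\u00fc\u00dfe</p>"
ENCODING = "iso-8859-1"


def content_type(encoding: str) -> str:
    """Build a Content-Type value that names the character encoding."""
    return f"text/html; charset={encoding}"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode the body with the same encoding announced in the header.
        body = PAGE.encode(ENCODING)
        self.send_response(200)
        self.send_header("Content-Type", content_type(ENCODING))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, pass `Handler` to `http.server.HTTPServer(("127.0.0.1", 8000), Handler)` and call `serve_forever()`. On a production web server the same effect normally comes from a configuration setting rather than from code.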

[snip]

On a different note, remove the XML prolog, the xmlns and xml:lang
attributes in the HTML start tag, and change your DOCTYPE to HTML 4.01
Strict. You aren't writing XHTML, so there's no point claiming
otherwise. Furthermore, replace the BR elements in your markup with CSS
margins on the relevant elements.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Aug 11 '05 #2
> http://www.pohflepp.de/eavesdripping4.html
> won't be recognized as html/xml by firefox


I don't really blame it! It's a right old mess.

How are you producing this page? It looks like automatically generated
code coming through UTF-16 at some points and UTF-8 (?) at others -
then just concatenated together.

I haven't seen a mess like that since I last used ObTree!

Aug 11 '05 #3
Sascha wrote:
> i've got kind of a strange problem i'm struggling with here. this
> page:
>
> http://www.pohflepp.de/eavesdripping4.html
>
> won't be recognized as html/xml by firefox.


You've already heard what you're doing wrong. Here's one way to fix it.
Start out by reading Joel Spolsky's great article, "The Absolute
Minimum Every Software Developer Absolutely, Positively Must Know About
Unicode and Character Sets (No Excuses!)". Get it here:

http://www.joelonsoftware.com/articles/Unicode.html

The other people who posted are correct that you have mixed three
different character encodings, and not applied any of them consistently.
View the source in Firefox ("View -> Page Source") to see what's going
on.

The basic problem you are having is realizing that HTML tags themselves
must be in 7-bit ASCII, while the content between the tags

<p>L.i.k.e. .t.h.i.s.</p>

should be in the encoding set that you have defined in the head portion
of the web document (UTF-8, UTF-16 or whatever). Taking a web page, and
artificially doing "save as UTF-16" will convert the tags themselves to
UTF-16, which is not what you want to do.

Since the text of your document is actually English, you probably
should be using the ISO-8859-1 encoding throughout. And as was already
pointed out, don't use an <?xml ...?> declaration unless you are actually using
XML, which you are not (at least, not in this document!). Here's how I
would fix it quickly:

sed "1d; s/\x00//g" eavesdripping4.html | tidy >eaves_test.html

Then edit eaves_test.html in a good HTML editor. You can get sed (the
stream editor) from any good source of Unix text tools (in Windows, try
http://gnuwin32.sourceforge.net or http://unxutils.sourceforge.net).
And you can get HTML-Tidy, which cleans up and reformats your HTML,
from http://tidy.sourceforge.net .
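If sed is not at hand, the first half of that pipeline (drop line 1, strip NUL bytes) can be approximated in a few lines of Python. This is a sketch, not a replacement for HTML-Tidy, and the function name is made up for the example. It also shares the one-liner's limitation: removing NUL bytes only recovers text whose UTF-16 code units have a zero high byte, that is, characters up to U+00FF.

```python
def strip_first_line_and_nuls(data: bytes) -> bytes:
    """Drop everything up to and including the first line feed, then remove NUL bytes."""
    _, _, rest = data.partition(b"\n")
    return rest.replace(b"\x00", b"")


# Hypothetical input: an ASCII XML prolog on line 1, then a UTF-16 fragment.
sample = b'<?xml version="1.0" encoding="UTF-8"?>\n' + "<p>Like this</p>\n".encode("utf-16-le")
print(strip_first_line_and_nuls(sample))
```

For a real file, read it with `pathlib.Path("eavesdripping4.html").read_bytes()`, pass the bytes through the function, and write the result out with `write_bytes()` before running tidy on it.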

Kind regards,

Eric Pement

Aug 11 '05 #4
In <11**********************@g47g2000cwa.googlegroups.com>, on
08/11/2005 at 07:26 AM, pe*****@northpark.edu said:
> The basic problem you are having is realizing that HTML tags
> themselves must be in 7-bit ASCII, while the content between the tags
>
> <p>L.i.k.e. .t.h.i.s.</p>
>
> should be in the encoding set that you have defined in the head
> portion of the web document (UTF-8, UTF-16 or whatever).


That's possible for UTF-8, since it encodes ASCII as itself, but how
could you encode the tags as ASCII and the remaining text as UTF-16?
Doesn't UTF-16 encode each character in one or more 16-bit bytes?
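The point is easy to check by encoding a tag three ways. The snippet below is a small illustration using Python's built-in codecs; the variable names are arbitrary, and `utf-16-le` is chosen only so that no byte order mark is prepended.

```python
tag = "<p>"

as_ascii = tag.encode("ascii")
as_utf8 = tag.encode("utf-8")
as_utf16 = tag.encode("utf-16-le")  # little-endian, no byte order mark

print(as_ascii)   # b'<p>'
print(as_utf8)    # b'<p>'
print(as_utf16)   # b'<\x00p\x00>\x00'
```

The ASCII and UTF-8 bytes are identical, while the UTF-16 form uses two bytes per character, so a UTF-16 document cannot contain tags stored as plain 7-bit ASCII.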

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to sp******@library.lspace.org

Aug 12 '05 #5
On 11/08/2005 15:26, pe*****@northpark.edu wrote:

[snip]
> The basic problem you are having is realizing that HTML tags
> themselves must be in 7-bit ASCII [...]


As someone else has noted, this cannot be true. UTF-16 is a perfectly
legitimate character encoding for transmitting HTML (as long as the
/server/ indicates this), and one certainly cannot represent tags in
7-bit ASCII using this scheme.

You also seem to be advocating mixing encoding schemes. Part of the
original problem was that the OP was doing just this, as well as giving
no reliable indication as to how the document was encoded.

I believe what you're confusing is the section in the HTML specification
that suggests what to do when character encoding information must be
obtained from the document itself, rather than the server. In this case,
"ASCII-valued bytes [must] stand for ASCII characters (at least until
the META element is parsed)." (5.2.2) In other words, until the user
agent encounters a META element that indicates the true encoding, it
should not be presented with anything but 7-bit ASCII. However, I would
not recommend this.
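To make that fallback concrete, here is a rough sketch of sniffing a charset from a META element by scanning the raw bytes. It is a simplification and not the algorithm any particular browser uses; the regular expression, the 1024-byte limit and the function name are assumptions made for the example.

```python
import re
from typing import Optional

# Matches charset=... inside a <meta ...> tag, working on raw bytes.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def sniff_meta_charset(data: bytes, limit: int = 1024) -> Optional[str]:
    """Return the charset named in an early META element, or None if none is found."""
    match = META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


head = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head>'

print(sniff_meta_charset(head))
print(sniff_meta_charset(head.decode("ascii").encode("utf-16-le")))
```

The first call finds the declaration because the bytes are ASCII-compatible. The second finds nothing, since in UTF-16 the bytes of `<meta` are interleaved with NULs. That is the practical reason a UTF-16 document needs the server, or a byte order mark, to announce its encoding.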

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Aug 12 '05 #6
