Bytes IT Community

changing or at least detecting character encoding via javascript ?

P: n/a
Hi all,

I would like to know whether it is possible to manipulate the
character encoding settings in MS Internet Explorer 5.0, 5.5 and 6.0.
The problem is that the default installation of MS IE seems to have
the default encoding hard-set to "Western European (ISO)", which means
iso-8859-1. When browsing pages with Central/Eastern European
characters, these are converted to iso-8859-1 and so displayed wrongly.

I would have expected "auto-select" to be the default, so that the
browser can select the right encoding according to the meta tags in
the head of the web page. But apparently this is not the case.

Is it possible to use JavaScript or a Java applet to read the
current client character encoding setting and/or change it to
"auto-select"? How would I do this?

Thanks in advance,

David Komanek
Jul 20 '05 #1
10 Replies


P: n/a


David Komanek wrote:
> Hi all,
>
> I have a question if it is possible to manipulate the settings of
> character encoding in MS Internet Explorer 5.0, 5.5 and 6.0. The
> problem is that the default installation of MS IE seems to have hard
> selected default encoding to "Western European (ISO)", which means
> iso-8859-1. When browsing pages with some Central/Eastern European
> characters these are converted to iso-8859-1 so displayed wrong.
>
> I would suppose the "auto-select" option should be default, so the
> browser can select the right encoding according to the meta-tags in
> the head of webpage. But this is apparently not true.
>
> Please, is it possible to use JavaScript or Java applet to get the
> information about the current client character encoding settings
> and/or change it to the "auto-select" value ? How to do this ?


What about using the HTML <meta> tag:
<meta http-equiv="Content-Type" content="text/html; charset=yourCharsetHere">

--

Martin Honnen
http://JavaScript.FAQTs.com/

Jul 20 '05 #2

P: n/a
Hello!

ko*****@natur.cuni.cz (David Komanek) wrote in message news:<e5**************************@posting.google. com>...
> Hi all,
>
> I have a question if it is possible to manipulate the settings of
> character encoding in MS Internet Explorer 5.0, 5.5 and 6.0. The
> problem is that the default installation of MS IE seems to have hard
> selected default encoding to "Western European (ISO)", which means
> iso-8859-1.

No, there is no such thing in Internet Explorer as a
'default encoding' (Netscape/Mozilla do have such a thing).

> When browsing pages with some Central/Eastern European
> characters these are converted to iso-8859-1 so displayed wrong.

Martin and VK have already answered that: it's a _site_ problem.
The site probably does not specify its encoding, so you need to choose
it manually in IE's menu. The exception is when the page you visited
right before was Central European; then IE will show the new page
OK, because if a page does not specify its encoding, IE uses the
*last used encoding* to show it.

--
Regards,
Paul Gorodyansky
"Russian On-screen Keyboard"
(based on the JavaScript code by Martin Honnen et al):
http://ourworld.compuserve.com/homep...r/onscreen.htm
Jul 20 '05 #3

P: n/a
Hi all,

thank you for the responses. Unfortunately my colleague is abroad, in
the Netherlands, and I have no way to experiment with his computer (or
with the other computers in his department :-) But what I can say for
sure is that I have the appropriate meta tag in the page: iso-8859-2.
He says iso-8859-1 is the setting he sees selected in the
"View | Encoding" menu. And he sees all the Czech characters converted
to their English equivalents. For example, he sees "Á" as a plain "A"
if I use the normal character. The only two ways I have of getting the
right character onto his display are to use the &Aacute; entity itself
or to recode the page to utf-8, right? But if I use the "normal
character" (not the corresponding entity) in the HTML source and my
colleague manually switches the encoding to "Central European (ISO)",
which means iso-8859-2, voila, he sees the character correctly ... but
try telling all the people abroad to do this ... :-)
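
For comparison, here is a minimal Python sketch of what a pure charset mislabel would do: the page's ISO-8859-2 bytes are decoded as ISO-8859-1. The helper name `misread` is made up for illustration. If mislabelling alone were at work, one would expect a different accented or symbol character rather than a bare ASCII letter, so the stripping described above may well happen somewhere else along the way.

```python
def misread(text, actual="iso8859_2", assumed="iso8859_1"):
    """Encode text in its real charset, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


# "Á" sits at the same byte value (0xC1) in both charsets, so it survives.
# "ě" is byte 0xEC in ISO-8859-2; read as ISO-8859-1 that byte is "ì".
for ch in "Áěř":
    print(ch, "->", misread(ch))
```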

I am pretty sure I have the meta tag right, because I see the characters
exactly as I should on my Windows machine (and on many others close to
me), even though the default codepage in Czech editions of Windows is
cp-1250, which is a different one. Yes, it differs only in a few
characters, but I tried them, too, with no problems.
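
The "few characters" can be listed with a short Python sketch that compares the byte each Czech letter gets in the two codepages. The constant `CZECH` and the helper `differing` are illustrative names, not anything from the thread.

```python
CZECH = "áčďéěíňóřšťúůýž"


def differing(chars, a="cp1250", b="iso8859_2"):
    """Return the characters whose byte value differs between codecs a and b."""
    return [c for c in chars if c.encode(a) != c.encode(b)]


print(differing(CZECH))          # lowercase letters that differ
print(differing(CZECH.upper()))  # uppercase letters that differ
```

Only the letters returned here would be affected by a cp-1250 versus iso-8859-2 mix-up; all other Czech letters share the same byte in both.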

I would agree that if my colleague did not have the fonts properly
installed, he would see strange characters. But why are the characters
implicitly converted on his side? And why on so many computers? Is it
possible that his proxy does it?

Thanks,

David


Jul 20 '05 #4

P: n/a
David,

David Komanek <ko*****@natur.cuni.cz> wrote in message news:<3f***********************@news.frii.net>...
> Hi all,
>
> thank you for the responses. Unfortunately my colleague is abroad, in
> Netherlands and I have no possibility to play with his computer (and all
> computers in his department, too :-) But What I can tell for sure is
> that I have the appropriate meta-tag in the page: iso-8859-2. He says he
> has iso-8859-1 is his setting what he sees in the "view|encoding" menu
> as selected.


If you let us know the URL, it will be easier for us to
help you.
Anyway, the above often happens with Russian too, for the following reason:
- the author creates a good page with a correct META...charset=
- he places the .html on the Web server of his Internet provider
- the provider's Web server is configured in such a way that
  it puts charset=iso-8859-1 ("Western European") into the
  HTTP header that is sent to the reader along with the page itself
  ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
- the HTTP header, by the standards, has higher priority than
  META...charset=, so the browser treats it as an iso-8859-1 page!

So your friend needs to ask the Web server people if they do the above.
For example, my Internet provider, CompuServe, does NOT fill out the
charset field of the HTTP header, so in my files META...charset=
works OK.
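
The precedence rule can be sketched in Python as follows. This is only an illustration: the function names are invented, the META detection is a crude regular expression rather than a real HTML parser, and actual browsers apply more rules than this.

```python
import re
from email.message import Message

# Very rough pattern for charset=... inside a <meta> tag (assumption: no real parsing).
META_RE = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE)


def header_charset(content_type):
    """Charset parameter of a Content-Type header value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html_bytes):
    """Charset named in a META tag near the top of the page, or None."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, html_bytes):
    """The HTTP header wins; META is consulted only when the header is silent."""
    return header_charset(content_type) or meta_charset(html_bytes)


page = b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">'
print(effective_charset("text/html; charset=ISO-8859-1", page))  # header overrides META
print(effective_charset("text/html", page))                      # META is used
```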

There is a test page that shows the HTTP header, so:
- create a Web page *without* META...charset= in it
- place it on the Web server
- go to this page, get the screen with the HTTP header and see
  what the value of the "Charset" field is:
  http://www.delorie.com/web/headers.html

If you do the above for _my_ page, where there is no META...charset=
http://ourworld.compuserve.com/homep...r/test1251.htm
you will see that CompuServe leaves the Charset field empty...
--
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet":
http://ourworld.compuserve.com/homepages/PaulGor/
Jul 20 '05 #5

P: n/a
David Komanek wrote:

> Hi all,
>
> thank you for the responses. Unfortunately my colleague is abroad, in
> Netherlands and I have no possibility to play with his computer (and all
> computers in his department, too :-) But What I can tell for sure is
> that I have the appropriate meta-tag in the page: iso-8859-2. He says he
> has iso-8859-1 is his setting what he sees in the "view|encoding" menu
> as selected. And all the Czech characters he sees converted to the
> english equivalents.


I have an ISO-8859-2 test page (because I work as a software I18n
engineer), so you can ask your friend to check how it is shown using
*my* provider, which does not fill out the charset of the HTTP header:
http://ourworld.compuserve.com/homep...gor/8859-2.htm

--
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet":
http://ourworld.compuserve.com/homepages/PaulGor/
Jul 20 '05 #6

P: n/a
Paul Gorodyansky wrote:
> David,
>
> David Komanek <ko*****@natur.cuni.cz> wrote in message news:<3f***********************@news.frii.net>...
>> Hi all,
>>
>> thank you for the responses. Unfortunately my colleague is abroad, in
>> Netherlands and I have no possibility to play with his computer (and all
>> computers in his department, too :-) But What I can tell for sure is
>> that I have the appropriate meta-tag in the page: iso-8859-2. He says he
>> has iso-8859-1 is his setting what he sees in the "view|encoding" menu
>> as selected.
>
> If you would let us know the URL it would be easier for us to
> help you.
> Any way, the above happens often with Russian too for the following reason:
> - author created good page with correct META...charset=
> - he placed .html to the Web Server of his Internet Provider
> - The Web Server of the Provider is configured in such a way that
>   it places Charset=iso-8859-1 ("Western European") into
>   HTTP Header that is sent along with the page itself to a reader
>   ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
>
> - HTTP Header, by the standards, has higher priority than META...charset=
>   so browser gets it as a iso-8859-1 page!
>
> So your friend needs to ask Web Server people if they do the above.
> For example, my Internet Provider, CompuServe, does NOT fill out
> Charset field of HTTP Header, so in my files META...charset=
> works OK.


Is it possible that you might also get 8859-1 because the client sends
this in the Accept-charset request header? Without providing for
alternatives, and regardless of server configuration?

> There is a test page that shows HTTP Header, so:
> - create a Web page *without* META...charset= in it
> - place it to the Web Server
> - go to this page, get the screen with HTTP Header and see
>   what is the value of "Charset" field:
>   http://www.delorie.com/web/headers.html


OT: The above URL is an example of an application that is broken by
Verisign's implementing "sitefinder".

Regards
Stephen

Jul 20 '05 #7

P: n/a
Hi,

Stephen wrote:

> Paul Gorodyansky wrote:
>> David,
>>
>> David Komanek <ko*****@natur.cuni.cz> wrote in message news:<3f***********************@news.frii.net>...
>>> Hi all,
>>>
>>> thank you for the responses. Unfortunately my colleague is abroad, in
>>> Netherlands and I have no possibility to play with his computer (and all
>>> computers in his department, too :-) But What I can tell for sure is
>>> that I have the appropriate meta-tag in the page: iso-8859-2. He says he
>>> has iso-8859-1 is his setting what he sees in the "view|encoding" menu
>>> as selected.
>>
>> If you would let us know the URL it would be easier for us to
>> help you.
>> Any way, the above happens often with Russian too for the following reason:
>> - author created good page with correct META...charset=
>> - he placed .html to the Web Server of his Internet Provider
>> - The Web Server of the Provider is configured in such a way that
>>   it places Charset=iso-8859-1 ("Western European") into
>>   HTTP Header that is sent along with the page itself to a reader
>>   ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
>>
>> - HTTP Header, by the standards, has higher priority than META...charset=
>>   so browser gets it as a iso-8859-1 page!
>>
>> So your friend needs to ask Web Server people if they do the above.
>> For example, my Internet Provider, CompuServe, does NOT fill out
>> Charset field of HTTP Header, so in my files META...charset=
>> works OK.
>
> Is it possible that you might also get 8859-1 because the client sends
> this in the Accept-charset request header? Without providing for
> alternatives, and regardless of server configuration?


No, not really. First, and it's easy to verify, many browsers (and
MS Internet Explorer is one of them) do *not* fill out the
Accept-Charset field. You can check it, for example, using the
"CGI Test Script" link here: http://koi8.pp.ru/frame.html?htmlreq.html

Second, Accept-Charset serves a different purpose. When a server has
*several* variants of the same page, say one containing the Russian
text in KOI8-R encoding and another in Windows-1251 encoding, then
a browser tells the server via Accept-Charset: koi8-r what it
can take. The server cannot *make* a document KOI8-R if it does not
have such a variant. It is the same in our case: if the server holds
an ISO-8859-2 document and the browser (e.g. Mozilla) requests
ISO-8859-1, that does not mean at all that the server will send the
existing -2 document as -1.
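
A rough Python sketch of that kind of negotiation, assuming a hypothetical server that keeps one stored copy of the page per charset. The function names and the `variants` mapping are invented for illustration, and the "*" wildcard is deliberately ignored.

```python
def parse_accept_charset(header):
    """Return charset names from an Accept-Charset value, highest q first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        prefs.append((name.strip().lower(), quality))
    # sorted() is stable, so equally weighted charsets keep their header order
    ranked = sorted(prefs, key=lambda pref: -pref[1])
    return [name for name, quality in ranked if quality > 0]


def choose_variant(accept_charset, variants, default):
    """Pick a stored variant the client accepts; never transcode.

    variants maps a charset name to an existing copy of the document.
    If nothing matches, the default stored copy is served as it is.
    """
    for name in parse_accept_charset(accept_charset or ""):
        if name in variants:
            return name
    return default


stored = {"koi8-r": b"...", "windows-1251": b"..."}
print(choose_variant("koi8-r", stored, "windows-1251"))
print(choose_variant("iso-8859-1", {"iso-8859-2": b"..."}, "iso-8859-2"))
```

The second call shows the case from this thread: a request for ISO-8859-1 does not turn the existing ISO-8859-2 document into something else.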

--
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet":
http://ourworld.compuserve.com/homepages/PaulGor/
Jul 20 '05 #8

P: n/a
Paul Gorodyansky wrote:
> Hi,
>
> Stephen wrote:
>> Paul Gorodyansky wrote:
>>> David,
>>>
>>> David Komanek <ko*****@natur.cuni.cz> wrote in message news:<3f***********************@news.frii.net>...
>>> [...snip...]
>>
>> Is it possible that you might also get 8859-1 because the client sends
>> this in the Accept-charset request header? Without providing for
>> alternatives, and regardless of server configuration?
>
> No, not really. First - and it's easy to verify - many browsers - and
> MS Internet Explorer is one of them - do *not* fill out Accept-Charset
> field - you can check it for example using "CGI Test Script" link
> here: http://koi8.pp.ru/frame.html?htmlreq.html
>
> Second, Accept-Charset is for different reason - when server has
> *several* variants of the same page, say one contains same Russian
> text in KOI8-R encoding, another - in Windows-1251 encoding, then
> a browser via Accept-Charset=koi8-r tells the server what it
> can take. Server can not *make* a document to be KOI8-R if it does not
> have such. Same in our case - if server contains ISO-8859-2
> document and browser (e.g. Mozilla) requests ISO-8859-1, then
> it does not mean at all that server will send existing -2 document
> as -1.

Of course. Thanks for the commentary. I did notice that Gecko-based
browsers (Netscape 7.0, Moz 1.4, Firebird) do send Accept-charset. And
contrary to what I was remembering, you are right about IE: it does not.
Thanks again,
Stephen

Jul 20 '05 #9

P: n/a
Thank you all for your help.

In the meantime I got a workaround for my problem by recoding the
pages to utf8, as was suggested here. Because the encoding is done by
a module in Apache on the server, where the implicit codepage served
to clients is iso-8859-2, I just prefixed the page paths with /utf8,
which tells the server to use the explicit encoding "utf-8". So, for
example, one of the recoded pages where the problem occurs is

http://www.natur.cuni.cz/utf8/fem_mo...index.php?id=4

the original one is now as

http://www.natur.cuni.cz/fem_modflow..._test.php?id=4

Please, could somebody from a non-Central/Eastern European region tell
me what (s)he sees on the page between the lines

"Organizing Committee"

and

"Institute of Hydrogeology, Engineering Geology and Applied
Geophysics" ?

There should be the name "Zbynek Hrkal", where the "e" has a special
decoration (something like a tilde, but not exactly; I have no idea
what this letter is called in English, sorry (does anybody know?)). I
see it right in both encodings, with MS IE 6, Netscape 7.1, Mozilla
..... but my colleague in the Netherlands sees it correctly only in
utf8, not in the original iso-8859-2. In the latter case he sees a
"regular e" instead.

I do not know how to get the HTTP header from the server. When I
connect to port 80 of the web server via Unix telnet and type

GET /fem_modflow/index_test.php?id=4

I get just the source of the webpage, no http header lines:

# telnet www.natur.cuni.cz 80
Trying 195.113.56.1...
Connected to tao.natur.cuni.cz.
Escape character is '^]'.
GET /fem_modflow/index_test.php?id=4
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
.....
.....
.....
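
A request line with no HTTP version, like the one typed above, is handled as an old-style HTTP/0.9 simple request, and the reply to that is the document body alone with no status line or headers. Adding a version (and a Host line) should make the server send its headers. Below is a minimal Python sketch under that assumption; the function names are invented, and the network call is left for the reader to try against their own server.

```python
import socket


def build_head_request(host, path):
    """Build a HEAD request that names an HTTP version, so headers come back."""
    lines = [f"HEAD {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def fetch_headers(host, path, port=80, timeout=10):
    """Send the request and return the status line plus header lines."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_head_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    head = b"".join(chunks).split(b"\r\n\r\n", 1)[0]
    return head.decode("iso-8859-1").splitlines()


# Example (needs network access):
# for line in fetch_headers("www.natur.cuni.cz", "/fem_modflow/index_test.php?id=4"):
#     print(line)
```

The same thing works by hand in telnet: type the request line with " HTTP/1.0" at the end, then a Host line, then an empty line.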

Thanks again for your comments.

With best regards,

David Komanek
Jul 20 '05 #10

P: n/a
> - HTTP Header, by the standards, has higher priority than META...charset=
>   so browser gets it as a iso-8859-1 page!
>
> So your friend needs to ask Web Server people if they do the above.
> For example, my Internet Provider, CompuServe, does NOT fill out
> Charset field of HTTP Header, so in my files META...charset=
> works OK.


Well, this seems to be the problem. Thank you. The header displayed by
http://www.delorie.com/web/headers.html says the charset is
"us-ascii", regardless of the AddDefaultCharset setting in Apache's
httpd.conf, the php.ini setting, and a "header()" call as the first
line of the PHP source itself. Very strange. And even stranger is that
on some computers the meta-tag-based encoding information takes
precedence and on some it does not ....
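
If something on the way to the client really does reduce the page to us-ascii, accented letters would come out as their base letters, which would match the "regular e" report. The Python sketch below only illustrates that effect; it is an assumption about what such a step might do, not a description of the actual Apache module, and `to_ascii` is an invented name.

```python
import unicodedata


def to_ascii(text):
    """Strip diacritics by decomposing characters and dropping non-ASCII marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Zbyněk Hrkal"))
print(to_ascii("Á"))
```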

David Komanek
Jul 20 '05 #11
