
Unicode browser support charts

I recently completed a web page, "Browser Tests of Entities in 2004".
http://www.santagata.us/characters/C...rEntities.html

It shows those characters that work in all of the version 5.2+
browsers that were tested and those that only work in some of them.
Take a look, maybe you'll consider it useful.

This is not my field (I'm an architect - you know, the house
construction kind), so if you notice any inaccuracies I'd appreciate a
note. If you don't get a character that the charts say you should
get, I'd like to hear that too, but only if you have Arial & Times
installed.

Thanks,
Nancy
Jul 20 '05 #1
5 Replies


Nancy scribbled something along the lines of:
I recently completed a web page, "Browser Tests of Entities in 2004".
http://www.santagata.us/characters/C...rEntities.html

It shows those characters that work in all of the version 5.2+
browsers that were tested and those that only work in some of them.
Take a look, maybe you'll consider it useful.

This is not my field (I'm an architect - you know, the house
construction kind), so if you notice any inaccuracies I'd appreciate a
note. If you don't get a character that the charts say you should
get, I'd like to hear that too, but only if you have Arial & Times
installed.

Thanks,
Nancy


Well, get your site valid (replace all ampersands ("&") in your links
with the appropriate HTML entity ("&amp;")) please.
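
To illustrate, here is a minimal Python sketch of that fix. The URL is a made-up example (not a link from the page under discussion), and using Python's standard html module is just one assumed way to do the escaping:

```python
from html import escape

# Hypothetical link with a query string; not taken from the page under discussion.
raw_url = "http://example.com/chart?block=arrows&font=arial"

# escape() rewrites "&" as "&amp;" (and also escapes <, > and quotes),
# which is what a validator expects inside an href attribute.
href = escape(raw_url, quote=True)
anchor = f'<a href="{href}">Arrows chart</a>'
print(anchor)
```

The browser turns "&amp;" back into "&" when it follows the link, so the address that is actually requested does not change.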

Also stop calling modern browsers "5.2+ browsers". Just call them
mainstream browsers if you're referring to MSIE,
Mozilla/Netscape/Firefox, Opera, etc.

Also, please specify which exact version you are testing. Firefox and
Netscape don't need to be tested if you're testing the latest Mozilla
release. Their support can be assumed to be nearly identical (except
when testing Mozilla on multiple platforms).

Please also don't use glyphs as decoration just because you find them
pretty. If you need to do so, use graphics, not characters -- otherwise
the user might see some very decorative question marks or empty blocks.

Some corrections to your table:
The Arc (U+2312) is not supported by Gecko browsers, nor is the large
circle (U+25EF) or the third CJK symbol (U+4E1B).
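
For anyone wanting to check which characters those code points are, a small Python sketch (assuming the standard unicodedata module; the describe helper is a name made up for this example) prints the official names and both numeric forms:

```python
import unicodedata

def describe(cp: int) -> str:
    """Return the code point, its decimal and hex references, and its Unicode name."""
    name = unicodedata.name(chr(cp), "<no name>")
    return f"U+{cp:04X}  &#{cp};  &#x{cp:X};  {name}"

# The three code points mentioned above.
for cp in (0x2312, 0x25EF, 0x4E1B):
    print(describe(cp))
```

This only identifies the characters. Whether a given browser shows a glyph for them still has to be tested on the machine in question.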

Other than that, good work.
--
Alan Plum, WAD/WD, Mushroom Cloud Productions
http://www.mushroom-cloud.com/
Jul 20 '05 #2

On Sat, 17 Apr 2004, Ashmodai wrote:
http://www.santagata.us/characters/C...rEntities.html

after criticising the lack of precise version details, seems to
go and do the same:
The Arc (U+2312) is not supported by Gecko browsers,
What exactly do you mean by "not supported"? I'm looking at it right
now on Win Mozilla 1.6
nor is the large circle (U+25EF)


It's showing a circle, though it's not particularly large...

Presumably the presence or absence of these characters on our displays
depends not on the choice of browser/version but rather on the
availability of fonts.
Anyhow, let's get back to the page itself.

I have a criticism of the cited page in its use of terminology. It
seems to consistently refer to "numeric character references" as
"numeric entities", and refers collectively to (what are properly
called) "numeric character references" and "character entities" under
the major heading of "entities". This sloppy terminology is quite
widespread, and I wouldn't normally get worked-up about it in casual
usage, but in a situation like this where the distinction is rather
important, I really would rather see the terms used accurately.

The correct terminology is surely that given in
http://www.w3.org/TR/html4/charset.html#h-5.3

There are three ways to include characters in a document: the
character itself, and two kinds of "character reference".

These two kinds of "character reference" are:

- the numeric character reference

- the character entity (for those relatively few characters where
one is defined).

In a testing situation like this, it's essential to make it clear
which of the two is under discussion. By referring to both kinds
loosely as "entities", it causes confusion in the mind of those who
are familiar with the correct terminology, IMHO.
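
As a rough illustration, the three forms can be compared with a short Python sketch. It assumes Python's standard html module as the decoder, which is only a stand-in for what a browser does, and the sample character is my own choice:

```python
from html import unescape
from html.entities import codepoint2name

# Three ways to write LATIN SMALL LETTER E WITH ACUTE (U+00E9) in HTML.
literal = "\u00e9"        # the character itself
numeric_ref = "&#233;"    # numeric character reference
entity_ref = "&eacute;"   # character entity reference

# All three decode to the same single character.
decoded = [unescape(s) for s in (literal, numeric_ref, entity_ref)]
print(decoded)

# HTML 4 defines a named entity for relatively few characters; any character
# can still be written as a numeric character reference.
print(codepoint2name.get(0x00E9))  # eacute
print(codepoint2name.get(0x2312))  # None (no HTML 4 entity for ARC)
```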

Of course the numeric character reference then comes in two flavours,
the decimal or the hexadecimal form. There's what seems to be a
pointless and incorrect distinction drawn between these in the
covering notes to the page:

<li>A numeric entity which is a decimal encoding of older Unicode
characters, [iso-8859-1].

<li>A numeric entity which is an 8-bit encoding (hex)
of present & future Unicode characters, [ISO10646].

There's no such distinction. Either a browser understands the
&#x...; form or it does not: earlier ones (e.g. NN4.* versions) did
not. But they had already started to understand Unicode values in
decimal per RFC2070 at an earlier stage of development. What I'm
saying is that decimal references are BY NO MEANS limited to
iso-8859-1 characters, but can be used even more widely than the
hexadecimal form, as shown e.g. in my rough-and-ready tables below
http://ppewww.ph.gla.ac.uk/~flavell/unicode/
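
A minimal Python sketch of the same point: the decimal and hexadecimal forms are two spellings of one code point, and neither is tied to iso-8859-1. The helper names decimal_ref and hex_ref are made up for this example, and Python's html.unescape stands in for a browser's parser:

```python
from html import unescape

def decimal_ref(ch: str) -> str:
    """Numeric character reference in decimal, e.g. '&#8364;'."""
    return f"&#{ord(ch)};"

def hex_ref(ch: str) -> str:
    """Numeric character reference in hexadecimal, e.g. '&#x20AC;'."""
    return f"&#x{ord(ch):X};"

# One Latin-1 character and two characters well outside Latin-1.
for ch in ("\u00e9", "\u20ac", "\u2312"):
    d, h = decimal_ref(ch), hex_ref(ch)
    # Both spellings decode to the same character.
    assert unescape(d) == unescape(h) == ch
    print(f"U+{ord(ch):04X}  {d}  {h}")
```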

My test pages of course are much less visually attractive than the
page under discussion, but they were made for a rather different
purpose.

And a word of warning about tests with IE: the results depend not only
on what fonts are available, but also on which language options have
been installed. For example, I found that installing Japanese
language support (which I didn't actually need because I can't read
it) nevertheless enabled a whole swath of useful symbols that hadn't
previously worked, and that had no obvious relationship to Japanese.

all the best
Jul 20 '05 #3

On 17 Apr 2004, Nancy wrote:
I recently completed a web page, "Browser Tests of Entities in 2004".
http://www.santagata.us/characters/C...rEntities.html
Your "results" are just too simplistic! The browser [version] is only
one factor. Other factors are the operating system [version], the
installed "language kits"/"language packs", the installed fonts,
the chosen fonts in the browser, the browser's settings such as
"Allow documents to use other fonts", etc.

For example, the developer of iCab for Macintosh once wrote that
you would need to install the huge Korean Language Kit in order to
display mere Latin-1 characters. (Which is a ridiculous demand BTW.)

It can make a difference if a browser runs under Windows 95 or 98.

And there's also Netscape 4, which could display quite a lot of symbols
if you just used the decimal notation.
It shows those characters that work in all of the version 5.2+
browsers that were tested


Where can I get Lynx 5.2?

Jul 20 '05 #4

Alan P.,

While I disagree with a lot of your comments, I can't thank you enough
for saying "(replace all ampersands ("&") in your links with the
appropriate HTML entity ("&amp;")) please". I was experiencing a
complete mental block about that. 4500 lines of valid code and I just
didn't get what the validator's problem was on that line.

the user might see some very decorative question marks or empty blocks
That's the WHOLE POINT of the page! How many did you see?

Also, please specify which exact version you are testing. Firefox and
Netscape don't need to be tested if you're testing the latest Mozilla
release. Their support can be assumed to be nearly identical (except
when testing Mozilla on multiple platforms).
I was surprised to find that
1) the browser's version didn't make a difference after version 4
2) Netscape, Mozilla, & Firebird did not have the same results, nor
were they the same for a browser on the other platform.
The 2 exceptions to this were between IE5 & IE6, the NON-BREAKING
HYPHEN and DOUBLE LOW LINE. That I don't understand.

But this isn't very relevant since the page is for well-recognized
characters. If IE with the fonts that Windows recommends doesn't get a
character then it isn't on the page, and the majority of characters
that IE got everyone got. On the other hand, most of MISCELLANEOUS
SYMBOLS weren't included because every browser had a different opinion
about which to include. Yes, it tended to be Gecko or Windows, but not
that neat. The test results came from multiple sources and multiple
test pages. If I didn't have several "yes" results, then one "no" would
override. Anyway, I'll try to say this on the page.

You're right about that third CJK symbol (U+4E1B); I knew that but
forgot. Thanks. If you got those you must have a language pack
installed, but you should have gotten the arc & circle. If your
Windows is from early '98 you may have missed a relevant update.

Alan F.,
It seems to consistently refer to "numeric character references" as
"numeric entities", and refers to collectively to (what are properly
called) "numeric character references" and "character entities" under
the major heading of "entities".
You're right about this. When I started the page everywhere I looked
used different nomenclature, and I didn't know html 4 defined it. I'll
clean it up.
There's what seems to be a
pointless and incorrect distinction drawn between these in the
covering notes to the page:
Fixed

For example, I found that installing Japanese
language support (which I didn't actually need because I can't read
it) nevertheless enabled a whole swath of useful symbols that hadn't
previously worked, and that had no obvious relationship to Japanese.


I think the install probably also installed "Lucida Sans Unicode" or
"Arial Unicode MS". Are they on your computer? Or enabled something as
described here:
http://www.i18nguy.com/surrogates.html
Nancy
Jul 20 '05 #5

On Mon, 19 Apr 2004, Nancy wrote:
For example, I found that installing Japanese
language support (which I didn't actually need because I can't read
it) nevertheless enabled a whole swath of useful symbols that hadn't
previously worked, and that had no obvious relationship to Japanese.
I think the install probably also installed "Lucida Sans Unicode" or
"Arial Unicode MS".


I had both of *those* already. You're right that it installed some
additional fonts, though, so maybe it was one or other of those
(MSMINCHO.TTC, MSGOTHIC.TTC) which did the trick. But I didn't tell
it to use those fonts explicitly.
Or enabled something as described here:
http://www.i18nguy.com/surrogates.html


Interesting point - but no, the symbols which I'm thinking about were
in the BMP, they didn't need support for surrogates in order to make
them work.
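
For readers unsure of the distinction, a short Python sketch: a code point in the BMP (U+0000 to U+FFFF) fits in one UTF-16 code unit, while anything above it needs a surrogate pair. The helper names are made up for this example, and the musical symbol is just a sample non-BMP character, not one from the page under discussion:

```python
def in_bmp(cp: int) -> bool:
    """True if the code point lies in the Basic Multilingual Plane."""
    return 0 <= cp <= 0xFFFF

def utf16_units(ch: str) -> list[int]:
    """Return the UTF-16 code units used to encode a single character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

# U+2312 ARC is in the BMP: one code unit, no surrogates involved.
print(in_bmp(0x2312), [hex(u) for u in utf16_units("\u2312")])

# U+1D11E MUSICAL SYMBOL G CLEF is outside the BMP: two surrogate code units.
print(in_bmp(0x1D11E), [hex(u) for u in utf16_units("\U0001D11E")])
```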

Unfortunately I don't have any software handy which will show me
what's inside a TTC file.

Your web page certainly shows an interesting and nicely-made snapshot
of what a regular user might see; but it looks as if there's a great
deal of potential variability between installations, so I'd suggest
that readers need to be warned of that possibility, somehow.

all the best

Jul 20 '05 #6
