multiple versions of "Extended ASCII characters"(No. 128 to 255)

wob
Many thanks to those who responded to my question about "putting greek char
into C string". In searching for a solution, I noticed that there is more
than one version of the "Extended ASCII characters" (No. 128 to 255). E.g., in
one version No. 224 is the symbol alpha, in another it's an "a" with a ` on
it... How come?

You can see it here:

http://www.kturby.com/cables/ascii2.htm

http://www.idevelopment.info/data/Pr...ii_table.shtml
Nov 15 '05 #1
"wob" writes:
Many thanks to those who responded to my question about "putting greek char
into C string". In searching for a solution, I noticed that there is
more than one version of the "Extended ASCII characters" (No. 128 to 255).
E.g., in one version No. 224 is the symbol alpha, in another it's an "a"
with a ` on it... How come?


The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related terms that might help you pursue this subject on
Google: font, code page.

There is now, and always has been, only one ASCII, and it contains 128
characters: basically the American version of the Latin alphabet, plus
digits, punctuation, and control characters. There is no established
graphic to identify the control characters.
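
Since ASCII stops at code 127, one quick way to tell whether a C string is safe from the "which extension?" problem is to check that no byte has its top bit set. The following is only a minimal sketch; `is_plain_ascii` is a hypothetical helper name, not a standard library function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every byte of the NUL-terminated string s lies in the
   7-bit ASCII range 0..127. Hypothetical helper for illustration. */
bool is_plain_ascii(const char *s)
{
    assert(s != NULL);
    /* Go through unsigned char so the test does not depend on whether
       plain char is signed on this platform. */
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p > 127) {
            return false;   /* top bit set: meaning depends on the character set */
        }
    }
    return true;
}
```

A string that passes this check reads the same under any character set that contains ASCII as a subset; a string that fails it needs to be accompanied by the name of its character set.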
Nov 15 '05 #2
osmium wrote:
"wob" writes:

Many thanks to those who responded to my question about "putting greek char
into C string". In searching for a solution, I noticed that there is
more than one version of the "Extended ASCII characters" (No. 128 to 255).
E.g., in one version No. 224 is the symbol alpha, in another it's an "a"
with a ` on it... How come?

The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset.


Precisely!

IMHO, the phrase "Extended ASCII" should be banned from any discussion. People
too often say "Extended ASCII" when they mean "some unknown character set that
shares a common set of characters with ASCII", and expect a precise answer
relating to ASCII.
There are probably hundreds of these.
One of the ISO working committees keeps a website just as a catalog of
character sets. The URL is http://anubis.dkuug.dk/i18n/charmaps/
ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.
"coded character set" or "coded characterset"
Also, related to "characterset translation"

There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.


See http://anubis.dkuug.dk/i18n/charmaps/ASCII for an ASCII-to-Unicode table.
While you /can/ purchase the ASCII specs from ISO, the ECMA provides identical
specs for free at
http://www.ecma-international.org/pu...t/ECMA-006.pdf,
http://www.ecma-international.org/pu...t/ECMA-048.pdf, and
http://www.ecma-international.org/pu...t/ECMA-035.pdf

--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 15 '05 #3
On Thu, 21 Jul 2005 10:59:45 -0500, wob
<wo***@yahoo.com> wrote:
Many thanks to those who responded to my question about "putting greek char
into C string". In searching for a solution, I noticed that there is more
than one version of the "Extended ASCII characters" (No. 128 to 255). E.g., in
one version No. 224 is the symbol alpha, in another it's an "a" with a ` on
it... How come?


There is no such thing as "Extended ASCII" in any meaningful form. It's
like "C with extensions": the extended parts are done by whoever wants
them.

ASCII defines /only/ characters using the bottom 7 bits, thus the
characters numbered 0 to 127. Various people have decided that they
want more, so they allocated them to codes above 127 as they felt like
it. Line drawing characters, European accented characters (at least
four versions used commonly in Europe), mathematical symbols, Cyrillic
(Russian) characters, Greek, funny faces, you name it. And of course
Microsoft came up with its own ones different from any others.
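
To make the original question concrete, here is a small sketch of how the same byte, 224 (0xE0), decodes under two of these character sets. It assumes the "alpha" table is IBM code page 437 and the "a with a `" table is ISO 8859-1; the thread does not name either table, so that identification is a guess. The names `enum charset` and `decode_byte` are made up for illustration, and only byte 224 of code page 437 is filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Two of the many 8-bit character sets that contain ASCII as a subset. */
enum charset { CS_CP437, CS_ISO_8859_1 };

/* Returns the Unicode code point for byte b under character set cs.
   Hypothetical helper: the CP437 branch only knows byte 0xE0 and returns
   U+FFFD (REPLACEMENT CHARACTER) for everything else above 127. */
unsigned long decode_byte(enum charset cs, unsigned char b)
{
    assert(cs == CS_CP437 || cs == CS_ISO_8859_1);
    if (b < 128) {
        return b;                   /* the shared ASCII range */
    }
    if (cs == CS_ISO_8859_1) {
        return b;                   /* Latin-1 bytes equal their code points */
    }
    /* cs == CS_CP437: 0xE0 is GREEK SMALL LETTER ALPHA */
    return (b == 0xE0) ? 0x03B1UL : 0xFFFDUL;
}
```

Calling `decode_byte(CS_CP437, 224)` gives U+03B1 (alpha), while `decode_byte(CS_ISO_8859_1, 224)` gives U+00E0 (a with grave accent). The byte is the same; only the table used to interpret it differs.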

Recently (i.e. in the last 20 years) there have been attempts to
standardise, but because not all of the characters can fit into the
'spare' 128 available positions there are lots of variants in the
ISO-8859 standard (at least 10 variants). See for instance

http://czyborra.com/charsets/iso8859.html

It was realised that what was really wanted was a much expanded
character space, to allow for the thousands of Chinese characters and
other languages to be added, so Unicode was born. This assigns each
character exactly one numbered position, its code point (some of the
characters look alike but are in different national or specific sets so
they are treated as different characters). Code points are commonly
stored in 16 or 32 bit units (UTF-16 or UTF-32); only the 32 bit form is
truly fixed-width, since UTF-16 needs two units for code points above
U+FFFF.

Because much software still uses 8 bit strings (and 8 bit transport
paths), Unicode also specifies a method of converting a 'wide' (16 or
32 bit) character into a string of 8 bit characters. This system,
UTF-8 (Unicode Transformation Format, 8 bit), keeps each ASCII character
as a single byte with the top bit zero, so it is compatible with 7 bit
ASCII, and bytes with the top bit set are not valid on their own, only
as part of a "multi-byte character" sequence.
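
As an illustration of that conversion, here is a minimal sketch of a UTF-8 encoder for a single code point. The function name `utf8_encode` and its interface are my own choices for this example, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>

/* Encodes the Unicode code point cp as UTF-8 into out, which must have
   room for 4 bytes. Returns the number of bytes written (1 to 4), or 0
   if cp is a surrogate or lies above U+10FFFF. Hypothetical helper. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                        /* ASCII: one byte, top bit clear */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {     /* surrogates are not characters */
        return 0;
    }
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, the Greek alpha from the original question (U+03B1) comes out as the two bytes 0xCE 0xB1, both with the top bit set, while 'A' (U+0041) stays the single byte 0x41.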

The web page above has descriptions of the ISO 8859 variants, and also
points to articles and descriptions of Unicode, UTF-8 and other matters.

This is relevant to C in the support for 'wide' characters and multibyte
characters, and the functions which transform and output them.
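
As a rough illustration of those functions, the sketch below wraps the standard `wcstombs` call, which converts a wide-character string into the multibyte encoding of the current locale. The wrapper name `wide_to_multibyte` and its error convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Converts the wide string ws to a NUL-terminated multibyte string in buf,
   using the encoding of the current locale. Returns the number of bytes
   written (excluding the terminator), or (size_t)-1 if a character cannot
   be represented or buf is too small. Hypothetical wrapper. */
size_t wide_to_multibyte(const wchar_t *ws, char *buf, size_t bufsize)
{
    assert(ws != NULL && buf != NULL && bufsize > 0);
    size_t n = wcstombs(buf, ws, bufsize);
    if (n == (size_t)-1 || n == bufsize) {
        /* Conversion failed, or the output filled the buffer and so was
           not NUL-terminated: report failure with an empty string. */
        buf[0] = '\0';
        return (size_t)-1;
    }
    return n;
}
```

A program would normally call `setlocale(LC_ALL, "")` from `<locale.h>` first so that the user's encoding is used. In the default "C" locale only the basic characters are guaranteed to convert, and whether a wide string such as `L"\u03b1"` succeeds and how many bytes it produces depends on the locale in effect.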

Chris C
Nov 15 '05 #4
On Thu, 21 Jul 2005 09:32:50 -0700, "osmium" <r1********@comcast.net>
wrote:
<snip>
The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.
Right.
There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.

There is only one ASCII now, but it has changed significantly at least
once, when lowercase and the other 6/x and 7/x characters were added,
IIRC about 1968.
And to be pedantic it went through periods of being designated USASCII
and ANSCII as the name of the organization changed, but this did not
imply any substantive change. The American alphabet is the (modern)
English alphabet, at least for America = US plus most of CA; there are
other American countries (primarily) using other languages.

There _is_ a standard for graphical representations for control
characters, albeit at least mostly just two-letter mnemonics jammed
together, not "graphical" in the common sense of pictorial or iconic:
ISO 2047, IIRC based on and superseding an X3.n like 646 versus ASCII;
but it certainly hasn't been widely used or even known. I have seen
what I believe(d) were displays obeying it on various datascopes, and
a few (real) terminals back-in-the-day in "show controls" mode.
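
For what it's worth, a "show controls" display of the mnemonic kind can be sketched with a simple lookup table. The version below uses the abbreviations from the ASCII standard itself (NUL, SOH, ..., US, DEL); it does not attempt to reproduce the ISO 2047 symbols, and `ascii_control_name` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the ASCII abbreviation for control character c, or NULL if c is
   a printable or non-ASCII code. Hypothetical helper for illustration. */
const char *ascii_control_name(unsigned char c)
{
    static const char *const names[] = {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US"
    };
    assert(sizeof names / sizeof names[0] == 32);   /* codes 0..31 */
    if (c < 32) {
        return names[c];
    }
    if (c == 127) {
        return "DEL";
    }
    return NULL;    /* printable ASCII or a code above 127 */
}
```

A terminal in "show controls" mode could print the returned string in place of acting on the control character, falling back to the character itself when the function returns NULL.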

- David.Thompson1 at worldnet.att.net
Nov 15 '05 #5
