multiple versions of "Extended ASCII characters"(No. 128 to 255)

wob
Many thanks to those who responded to my question about "putting greek char
into C string". In searching for a solution, I noticed that there is more
than one version of "Extended ASCII characters" (No. 128 to 255), e.g., in
one version No. 224 is the symbol alpha, in another, it's an "a" with a ` on
it... How come?

You can see it here:

http://www.kturby.com/cables/ascii2.htm

http://www.idevelopment.info/data/Pr...ii_table.shtml
Nov 15 '05 #1
"wob" writes:
Many thanks for those who responded to my question of "putting greek char
into C string". In searching for an solution, I noticed that there are
more than one version of "Extended ASCII characters"(No. 128 to 255) .
e.g., in one version No. 224 is the symbol alpha, in another, it's a "a"
with a ` on it... How come?


The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
Google: font, code page.
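
To make the "No. 224" puzzle concrete, here is a minimal sketch in C. It assumes the two tables in question are IBM code page 437 (where byte 224 is the Greek alpha) and ISO 8859-1 (where it is an "a" with a grave accent); the linked pages may well show other sets. The names `charset` and `decode_byte` are made up for this illustration, and only the one CP437 entry needed here is filled in.

```c
#include <assert.h>

/* Two of the many 8-bit character sets that contain ASCII as a subset. */
enum charset { CS_CP437, CS_ISO_8859_1 };

/*
 * Return the Unicode code point that `byte` stands for in `cs`.
 * Illustration only: apart from byte 224, CP437 bytes above 127 are
 * not mapped here and yield U+FFFD (REPLACEMENT CHARACTER).
 */
unsigned long decode_byte(enum charset cs, unsigned char byte)
{
    assert(cs == CS_CP437 || cs == CS_ISO_8859_1);

    if (byte < 128)
        return byte;        /* the ASCII range is common to both sets */
    if (cs == CS_ISO_8859_1)
        return byte;        /* Latin-1 byte values equal their code points */
    if (byte == 224)
        return 0x03B1UL;    /* CP437: GREEK SMALL LETTER ALPHA */
    return 0xFFFDUL;        /* not covered by this sketch */
}
```

The same byte value gives U+03B1 (alpha) under one set and U+00E0 (a with grave) under the other, which is why a bare "character 224" has no single meaning.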

There is now, and always has been, only one ASCII, and it contains 128
characters: basically the American version of the Latin alphabet, plus
digits, punctuation and control characters. There is no established
graphic to identify the control characters.
Nov 15 '05 #2
osmium wrote:
"wob" writes:

Many thanks for those who responded to my question of "putting greek char
into C string". In searching for an solution, I noticed that there are
more than one version of "Extended ASCII characters"(No. 128 to 255) .
e.g., in one version No. 224 is the symbol alpha, in another, it's a "a"
with a ` on it... How come?

The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset.


Precisely!

IMHO, the phrase "Extended ASCII" should be banned from any discussion. People
too often say "Extended ASCII" when they mean "some unknown character set that
shares a common set of characters with ASCII", and expect a precise answer
relating to ASCII.
There are probably hundreds of these.
One of the ISO working committees keeps a website just as a catalog of
character sets. The URL is http://anubis.dkuug.dk/i18n/charmaps/
ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.
"coded character set" or "coded characterset"
Also, related to "characterset translation"

There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.


See http://anubis.dkuug.dk/i18n/charmaps/ASCII for an ASCII-to-Unicode table.
While you /can/ purchase the ASCII specs from ISO, the ECMA provides identical
specs for free at
http://www.ecma-international.org/pu...t/ECMA-006.pdf,
http://www.ecma-international.org/pu...t/ECMA-048.pdf, and
http://www.ecma-international.org/pu...t/ECMA-035.pdf

--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 15 '05 #3
On Thu, 21 Jul 2005 10:59:45 -0500, wob
<wo***@yahoo.com> wrote:
Many thanks for those who responded to my question of "putting greek char
into C string". In searching for an solution, I noticed that there are more
than one version of "Extended ASCII characters"(No. 128 to 255) . e.g., in
one version No. 224 is the symbol alpha, in another, it's a "a" with a ` on
it... How come?


There is no such thing as "Extended ASCII" in any meaningful form. It's
like "C with extensions": the extended parts are done by whoever wants
them.

ASCII defines /only/ characters using the bottom 7 bits, thus the
characters numbered 0 to 127. Various people have decided that they
want more, so they allocated them to codes above 127 as they felt like
it. Line drawing characters, European accented characters (at least
four versions used commonly in Europe), mathematical symbols, Cyrillic
(Russian) characters, Greek, funny faces, you name it. And of course
Microsoft came up with its own ones different from any others.
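
Since anything above 127 depends on which of those extensions is in use, a cheap first check is whether a string stays inside the 7-bit range at all. A minimal sketch (the function name `is_pure_ascii` is just an illustration, not a standard function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if every byte of the NUL-terminated string `s` is in the
 * ASCII range 0..127, and 0 as soon as a byte with the top bit set
 * is found.
 */
int is_pure_ascii(const char *s)
{
    const unsigned char *p;

    assert(s != NULL);

    /* Go through unsigned char: plain char may be signed, in which
       case bytes above 127 would compare as negative values. */
    for (p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p > 127)
            return 0;
    }
    return 1;
}
```

A string that passes this check reads the same under every ASCII-compatible character set; one that fails cannot be interpreted without knowing which set it was written in.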

Recently (i.e. in the last 20 years) there have been attempts to
standardise, but because all of the characters can't fit into the
'spare' 128 available positions there are lots of variants in the
ISO-8859 standard (at least 10 variants). See for instance

http://czyborra.com/charsets/iso8859.html

It was realised that what was really wanted was a much expanded
character space, to allow for the thousands of Chinese characters and
other languages to be added, so Unicode was born. It assigns each
character to only one position, its code point (some of the characters
look alike but are in different national or specific sets so they are
treated as different characters). Code points are usually stored in
units of 16 or 32 bits: UTF-32 is fixed-width, while UTF-16 needs two
16 bit units for characters outside the Basic Multilingual Plane.

Because much software still uses 8 bit strings (and 8 bit transport
paths), Unicode also specifies a method of converting a 'wide' (16 or
32 bit) character into a string of 8 bit characters. This system,
UTF-8 (Unicode Transformation Format, 8 bit), encodes each ASCII
character as a single byte with the top bit zero, so it is compatible
with 7 bit ASCII; bytes with the top bit set are not valid on their
own, only as part of a "multi-byte character" sequence.
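
The UTF-8 layout described above can be sketched in a few lines of C. This is only an illustration (the name `utf8_encode` is made up, and real programs would normally use a library), but it shows how the Greek alpha from the original question, U+03B1, ends up as two bytes in an ordinary char string.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encode the Unicode code point `cp` as UTF-8 into `out`, which must
 * have room for 4 bytes. Returns the number of bytes written, or 0 if
 * `cp` cannot be encoded (a surrogate, or a value above 0x10FFFF).
 */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80UL) {                    /* ASCII: one byte, top bit zero */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800UL) {                   /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800UL && cp <= 0xDFFFUL) /* surrogates are not characters */
        return 0;
    if (cp < 0x10000UL) {                 /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFFUL) {               /* 11110xxx and three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For U+03B1 this writes the bytes 0xCE 0xB1, so the C string "\xCE\xB1" is an alpha for any program that treats the string as UTF-8. Every byte of a multi-byte sequence has the top bit set, so none of them can be mistaken for an ASCII character.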

The web page above has descriptions of the ISO 8859 variants, and also
points to articles and descriptions of Unicode, UTF-8 and other matters.

This is relevant to C in the support for 'wide' characters and multibyte
characters, and the functions which transform and output them.
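
As a rough sketch of what that support looks like: the standard function wcstombs() converts a wide string to the multibyte encoding of the current locale. The wrapper `wide_to_multibyte` and the array `greek_ab` are names invented for this example, and the array assumes that wchar_t values are Unicode code points, which is common but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Greek alpha and beta as wide characters (assumes wchar_t holds
   Unicode code points, which the C standard does not promise). */
const wchar_t greek_ab[] = { 0x03B1, 0x03B2, 0 };

/*
 * Convert the wide string `ws` to the multibyte encoding of the
 * current locale, writing at most `bufsize` bytes to `buf`.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or (size_t)-1 if some character has no representation in the
 * locale. If the result equals `bufsize`, no NUL was written.
 */
size_t wide_to_multibyte(const wchar_t *ws, char *buf, size_t bufsize)
{
    assert(ws != NULL && buf != NULL);
    return wcstombs(buf, ws, bufsize);
}
```

A program would normally call setlocale(LC_CTYPE, "") (from <locale.h>) before converting. Under a UTF-8 locale, greek_ab should come out as four bytes; in the default "C" locale the conversion of non-ASCII characters typically fails, which is the multibyte side of the same "which character set?" question.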

Chris C
Nov 15 '05 #4
On Thu, 21 Jul 2005 09:32:50 -0700, "osmium" <r1********@comcast.net>
wrote:
<snip>
The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.
Right.
There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.

There is only one ASCII now, but it has changed significantly at least
once, when lowercase and the other 6/x and 7/x characters were added,
IIRC about 1968. And to be pedantic, it went through periods of being
designated USASCII and ANSCII as the name of the organization changed,
but this did not imply any substantive change. The American alphabet is
the (modern) English alphabet, at least for America = US plus most of CA;
there are other American countries (primarily) using other languages.

There _is_ a standard for graphical representations for control
characters, albeit at least mostly just two-letter mnemonics jammed
together, not "graphical" in the common sense of pictorial or iconic:
ISO 2047, IIRC based on and superseding an X3.n like 646 versus ASCII;
but it certainly hasn't been widely used or even known. I have seen
what I believe(d) were displays obeying it on various datascopes, and
a few (real) terminals back-in-the-day in "show controls" mode.

- David.Thompson1 at worldnet.att.net
Nov 15 '05 #5
