How to put non-ascii characters into C string?

wob
Hi there,

I wish to show some special characters, such as the character for "alpha" and
the symbol for degrees. Could anyone please give me some suggestions on how to do that?

Thank you very much!

Owen
Nov 15 '05 #1
wob wrote:
Hi there,

I wish to show some special characters such as the char for "alpha" and the
symbol for degrees.. Anyone please give me some suggestions how to do that?


The C Standard guarantees the presence of a "basic execution
character set" consisting of upper- and lower-case unaccented
letters, the digits zero through nine, assorted punctuation marks,
and a few special-purpose characters like '\n'. No matter what C
implementation you are using, these characters will be present and
available for your use.

The Standard also permits additional characters in the (non-
"basic") "execution character set." However, it does not require
that additional characters exist, nor does it specify any such
additional characters. The consequence is that any such extra
characters are available only at the whim of the implementation,
and different implementations will have different (possibly empty)
sets of extra characters.

The numeric character code that produces "alpha" on one
implementation may produce "upper left corner" on another and
"the Euro symbol" on still another. Thus, the way you ask for
an "alpha" to be produced will be specific to your system and
may not work on others; you might send "the alpha code" to the
output stream and see nothing but a smiley face or a blank.

The standardization of character sets and character codes is
a relatively recent development, and C has not yet caught up with
it. C still maintains a sort of agnosticism towards such matters
(which is why C is easily implemented on systems that use different
character repertoires), but it does complicate things when one wants
to use "exotic" characters. The trouble arises even within the
family of Latin-ish alphabets: One cannot write "Così fan tutte"
or "Götterdämmerung" or "Aïda" with the facilities guaranteed
by C.

So: You're stuck with the unhappy task of trying to figure out
what character codes (if any!) produce "alpha" and "degree sign"
on your system, and with the realization that the same codes might
not do anything sensible on the next system you use. As a purely
practical and ad-hoc approach, you could write yourself a little
program that runs through every `char' value and displays the glyph
your system produces for each. If you happen to find "alpha" and
"degree symbol" among the rendered glyphs you're in luck -- but
never forget that the same codes may do something completely
different on other systems.

There was a perfect world around here somewhere, but I think
I left it in the pocket of my other trousers.

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 15 '05 #2
wob wrote:
I wish to show some special characters such as the char for "alpha" and the
symbol for degrees.. Anyone please give me some suggestions how to do that?


On my Fedora Core 4 system (which uses Unicode UTF-8) I can in fact do:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* strlen returns size_t, so print it with %zu (C99) rather than %i */
    printf("The character α occupies %zu bytes.\n", strlen("α"));
    return 0;
}

and the program will output:

The character α occupies 2 bytes.

In Emacs, special Unicode characters can be inserted with
`set-input-method' and argument `TeX'. α is then inserted by typing
`\alpha'. (This message will not display correctly without UTF-8 support.)
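The two bytes counted above are the UTF-8 encoding of the code point U+03B1. As a rough sketch of how a code point turns into UTF-8 bytes, independent of how the source file itself is encoded, something like the following could be used. The name `utf8_encode` is invented for this illustration, and the function does not reject surrogate code points, so it is a demonstration rather than a validating encoder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): encode one Unicode code
 * point as UTF-8 into out[], which must hold at least 5 bytes.
 * Returns the number of bytes written (not counting the terminating
 * '\0'), or 0 if cp is above U+10FFFF. Surrogates are not rejected. */
size_t utf8_encode(unsigned long cp, char out[5])
{
    size_t n;

    assert(out != NULL);

    if (cp < 0x80UL) {                 /* 1 byte:  0xxxxxxx */
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800UL) {         /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000UL) {       /* 3 bytes: 1110xxxx + 2 continuation */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFFUL) {     /* 4 bytes: 11110xxx + 3 continuation */
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        n = 0;                         /* outside the Unicode range */
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, `utf8_encode(0x03B1, buf)` returns 2 and stores the bytes 0xCE 0xB1, and the degree sign U+00B0 comes out as 0xC2 0xB0. Passing `buf` to `fputs` should then show the glyph, but only on a terminal that actually interprets its input as UTF-8.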
August
Nov 15 '05 #3
"wob" <wo***@hotmail.com> wrote:
# Hi there,
#
# I wish to show some special characters such as the char for "alpha" and the
# symbol for degrees.. Anyone please give me some suggestions how to do that?

If the encoding of the glyph doesn't include a null byte, you can use fprintf
or fputs or %s or %c. However, how glyphs are encoded is in flux at the moment.
It depends on the font, or on whether you're using Unicode, which gives one code
for all fonts. Even with Unicode you have to know what the byte encoding is, such
as UTF-8; without Unicode you may be dealing with Latin-1 or MacRoman or ...
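To make the null-byte caveat concrete, here is a small sketch that checks an encoded byte sequence for zero bytes. The helper name `has_nul_byte` and the two sample arrays are made up for illustration; the arrays hold the text "A" followed by alpha (U+03B1) in UTF-8 and in UTF-16LE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report whether an encoded byte sequence of
 * length len contains a zero byte, which would end a C string early. */
int has_nul_byte(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);
    return memchr(bytes, 0, len) != NULL;
}

/* "A" followed by alpha (U+03B1), in two encodings. */
const unsigned char sample_utf8[]    = { 0x41, 0xCE, 0xB1 };
const unsigned char sample_utf16le[] = { 0x41, 0x00, 0xB1, 0x03 };
```

The UTF-8 sample has no zero byte, so it can be terminated with '\0' and handed to `fputs` or `%s`. The UTF-16LE sample has a zero as its second byte, so the string functions would see only "A" and stop.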

It all depends on the context in which you want to specify a character like
"alpha".
--
SM Ryan http://www.rawbw.com/~wyrmwif/
So basically, you just trace.
Nov 15 '05 #4
----- Original Message -----
From: "Eric Sosman" <es*****@acm-dot-org.invalid>
Newsgroups: comp.lang.c
Sent: Wednesday, July 20, 2005 9:09 PM
Subject: Re: How to put non-ascii characters into C string?

wob wrote:
Hi there,

I wish to show some special characters such as the char for "alpha" and the symbol for degrees.. Anyone please give me some suggestions how to do
that?

[snip]
So: You're stuck with the unhappy task of trying to figure out
what character codes (if any!) produce "alpha" and "degree sign"
on your system, and with the realization that the same codes might
not do anything sensible on the next system you use. As a purely
practical and ad-hoc approach, you could write yourself a little
program that runs through every `char' value and displays the glyph
your system produces for each. If you happen to find "alpha" and
"degree symbol" among the rendered glyphs you're in luck -- but
never forget that the same codes may do something completely
different on other systems.

There was a perfect world around here somewhere, but I think
I left it in the pocket of my other trousers.

--
Eric Sosman
es*****@acm-dot-org.invalid


The following code works for me, although some of the control characters
mangle the output in places:

#include <stdio.h>
#define MY_CHAR_MAX 256

int main(void) {
    int i;

    /* Print three character codes per row, covering 0..254 */
    for (i = 0; i < (MY_CHAR_MAX - 1); i += 3) {
        printf("Char value: %d = %-8c", i, i);
        printf("Char value: %d = %-8c", i+1, i+1);
        printf("Char value: %d = %c\n", i+2, i+2);
    } /* for i */
    /* The loop leaves i == 255, the one remaining code */
    printf("Char value: %d = %c\n", i, i);

    return 0;
} /* main */

Apparently, my implementation has no "alpha", and the degree symbol is value
176.
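One way to avoid the mangled output mentioned above is to print a placeholder instead of each control character. The following is only a sketch, assuming C99 `snprintf`; the names `format_cell` and `dump_chars` are invented here and are not part of the program above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: format one table cell for character code `code`
 * into buf. Codes that iscntrl() reports as control characters in the
 * current locale are shown as '.', so they cannot disturb the layout.
 * Returns whatever snprintf returns. */
int format_cell(int code, char *buf, size_t size)
{
    assert(buf != NULL && code >= 0 && code <= UCHAR_MAX);
    return snprintf(buf, size, "Char value: %3d = %c",
                    code, iscntrl(code) ? '.' : code);
}

/* Print every value an unsigned char can hold, three cells per row. */
void dump_chars(FILE *out)
{
    char cell[32];
    int i;

    for (i = 0; i <= UCHAR_MAX; i++) {
        format_cell(i, cell, sizeof cell);
        /* End the row after every third cell and after the last code */
        fprintf(out, "%-24s%s", cell,
                (i % 3 == 2 || i == UCHAR_MAX) ? "\n" : "");
    }
}
```

Calling `dump_chars(stdout)` from `main` prints the whole table. Which glyphs show up for the codes above 127 still depends entirely on the system and terminal, exactly as with the original program.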

-Charles
Nov 15 '05 #5
Charles M. Reinke wrote:
[snip]
Apparently, my implementation has no "alpha", and the degree symbol is value
176.


Funny, when I ran your code, I find that value 176 is empty. OTOH, I get

Char value: 192 = {       Char value: 193 = A       Char value: 194 = B
Char value: 195 = C       Char value: 196 = D       Char value: 197 = E
Char value: 198 = F       Char value: 199 = G       Char value: 200 = H
Char value: 201 = I       Char value: 202 =         Char value: 203 =
Char value: 204 = ö       Char value: 205 =         Char value: 206 = ó
Char value: 207 =         Char value: 208 = }       Char value: 209 = J
Char value: 210 = K       Char value: 211 = L       Char value: 212 = M
Char value: 213 = N       Char value: 214 = O       Char value: 215 = P
Char value: 216 = Q       Char value: 217 = R       Char value: 218 =
Char value: 219 =         Char value: 220 =         Char value: 221 =
Char value: 222 =         Char value: 223 =         Char value: 224 = \
Char value: 225 =         Char value: 226 = S       Char value: 227 = T
Char value: 228 = U       Char value: 229 = V       Char value: 230 = W
Char value: 231 = X       Char value: 232 = Y       Char value: 233 = Z

What sort of "Extended ASCII" do I have?

--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 15 '05 #6
Lew Pitcher wrote:
Charles M. Reinke wrote:

[snip]

What sort of "Extended ASCII" do I have?


Hint, here's how I compiled and ran the code...

//PITCHLW JOB (0000),' LEW PITCHER',CLASS=E,MSGCLASS=A,
// PRTY=8,NOTIFY=LDP
/*ROUTE PRINT LOCAL
//CC EXEC EDCCLG,
// INFILE='TEST.LDP.SOURCE(CHARTEST)'

--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 15 '05 #7
"Lew Pitcher" <Le*********@td.com> wrote in message
news:Ul*******************@news20.bellglobal.com.. .
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lew Pitcher wrote:

[snip]
What sort of "Extended ASCII" do I have?


Hint, here's how I compiled and ran the code...

//PITCHLW JOB (0000),' LEW PITCHER',CLASS=E,MSGCLASS=A,
// PRTY=8,NOTIFY=LDP
/*ROUTE PRINT LOCAL
//CC EXEC EDCCLG,
// INFILE='TEST.LDP.SOURCE(CHARTEST)'

[snip]

Let's see, that looks like something from an IBM mainframe, maybe OS/390.
If that's the case, then the character set should be EBCDIC, which according
to Wikipedia is incompatible with ASCII and therefore not any kind of
"Extended ASCII". Based on the output you showed, I'd say you're using the
CCSID 500 (or something similar) variant of EBCDIC, which would explain why
e.g. "Char value: 202" was something other than "J".

Am I close? What did I win?
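The same guess can be made from inside a program by looking at the code of a known letter. This is only a sketch: `guess_charset` and `letters_contiguous` are names invented for illustration, and the value 193 for 'A' is taken from the listing Lew posted, not from any standard header.

```c
#include <assert.h>

enum charset_family { CS_ASCII_COMPATIBLE, CS_EBCDIC, CS_UNKNOWN };

/* Hypothetical helper: guess the execution character set family from
 * the code of 'A'. 65 is the ASCII value; 193 is the value shown for
 * 'A' in the EBCDIC listing earlier in the thread. */
enum charset_family guess_charset(void)
{
    /* The digits are the one run C guarantees to be consecutive */
    assert('9' - '0' == 9);

    if ('A' == 65)
        return CS_ASCII_COMPATIBLE;
    if ('A' == 193)
        return CS_EBCDIC;
    return CS_UNKNOWN;
}

/* Returns 1 if 'A'..'Z' occupy 26 consecutive codes, else 0.
 * In the EBCDIC listing 'I' is 201 but 'J' is 209, so it is 0 there. */
int letters_contiguous(void)
{
    return 'Z' - 'A' == 25;
}
```

On an ASCII-based system `guess_charset()` gives `CS_ASCII_COMPATIBLE` and `letters_contiguous()` gives 1. On the system that produced the listing above, 'Z' minus 'A' would be 233 - 193 = 40, so the second function would return 0.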

-Charles
Nov 15 '05 #8
Charles M. Reinke wrote:
"Lew Pitcher" <Le*********@td.com> wrote in message
news:Ul*******************@news20.bellglobal.com.. .
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lew Pitcher wrote:
[snip]
What sort of "Extended ASCII" do I have?


Hint, here's how I compiled and ran the code...

//PITCHLW JOB (0000),' LEW PITCHER',CLASS=E,MSGCLASS=A,

[snip] I'm I close? What did I win?


Good analysis. You win not being bothered by me for a week :-)
--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
Nov 15 '05 #9
Lew Pitcher wrote:
.... snip ...
What sort of "Extended ASCII" do I have?


I'm not sure, but I suspect you have EBCDIC, which is not ASCII.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 15 '05 #10
CBFalconer wrote:
Lew Pitcher wrote:

... snip ...
What sort of "Extended ASCII" do I have?

I'm not sure, but I suspect you have EBCDIC, which is not ASCII.


Yah. It was a trick question. :-)

I sometimes like to 'remind' clc posters that conforming C isn't restricted to
character sets that map ASCII to the first 128 code points. It's a failing of
mine, I know, and I pay penance for it every day; I'm a COBOL programmer by
trade :-S
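A practical consequence: a range test such as `c >= 'A' && c <= 'Z'` quietly assumes the letters have consecutive codes. In the listing earlier in the thread, '}' has value 208, between 'I' (201) and 'J' (209), so that test would accept '}' there. A sketch of a lookup that makes no such assumption follows; `upper_index` and `is_basic_upper` are invented names. The standard `isupper` is the usual tool, but it is locale-dependent and may also accept accented letters.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper: 0..25 for 'A'..'Z', or -1 for anything else.
 * Works by table lookup, so it does not depend on the letters having
 * consecutive codes. c must be representable as unsigned char. */
int upper_index(int c)
{
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    assert(c >= 0 && c <= UCHAR_MAX);

    /* strchr would also match the terminating '\0', so exclude 0 */
    p = (c != 0) ? strchr(upper, c) : NULL;
    return p ? (int)(p - upper) : -1;
}

/* Hypothetical helper: 1 if c is one of the 26 unaccented upper-case
 * letters of the basic character set, else 0. */
int is_basic_upper(int c)
{
    return upper_index(c) >= 0;
}
```

For example, `upper_index('J')` is 9 on any conforming implementation, ASCII or EBCDIC, while `is_basic_upper('}')` is 0 on both.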
--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
Nov 15 '05 #11
CBFalconer wrote:
Lew Pitcher wrote:
[...]
What sort of "Extended ASCII" do I have?


I'm not sure, but I suspect you have EBCDIC, which is not ASCII.


It's "Distended ASCII."

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 15 '05 #12
wob
Many thanks to all!
Nov 15 '05 #13
