Bytes IT Community

UTF-8 -> ASCII in the C language?

P: n/a
I have a class that reads one line (terminated by \r\n) from a string.
When I read a line from a UTF-8 string, I get nothing back.
Should I convert UTF-8 to ASCII? Is there a function that converts
UTF-8 to ASCII? Thanks very much for your help!

Nov 14 '05 #1
12 Replies


P: n/a
On 6 Feb 2005 00:18:24 -0800, "chunhui_true" <ch*********@gmail.com>
wrote in comp.lang.c:
> i have a class, it can read one line(\r\n ended) from string,when i
> read line from utf8 string i can't get any thing!
There is no such thing as a "class" in C. Perhaps you are looking for
comp.lang.c++ down the hall.

Note that both C and C++ are case sensitive languages, so you had
better learn how to use the shift key on your keyboard.
> maybe i should conversion utf8 to ascii??there is any function can
> conversion utf8 to ascii? very thanks to your help!!


There is no such thing as "ascii". There is ASCII, the American
Standard Code for Information Interchange. But neither C nor C++
defines or requires any specific character set, neither ASCII nor UTF-8
nor any other.

So most likely comp.lang.c++ won't want your question either. You
need a group that supports your particular compiler/operating system
combination, but since you posted through Google I have no information
on which to suggest one to you.

And you need to learn how to use your shift key. And the space bar.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #2

P: n/a
chunhui_true wrote:
> i have a class, it can read one line(\r\n ended) from string,when i
> read line from utf8 string i can't get any thing!
> maybe i should conversion utf8 to ascii??there is any function can
> conversion utf8 to ascii? very thanks to your help!!


I would recommend writing the bytes of each string out as a number and
as a text value so that you can see them. You might find that this is a
case where it isn't as difficult as it seems.
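
A minimal sketch of that suggestion follows. The helper name dump_bytes and its "HH:c" output format are illustrative assumptions, not something from this thread; the idea is only to make every byte visible, including the ones a terminal would not show.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats each byte of src as "HH:c" (hex value, then
 * the character itself if it is printable ASCII, otherwise '.'), separated
 * by single spaces. Returns the number of characters written to out, or -1
 * if out is too small. */
int dump_bytes(const unsigned char *src, size_t len, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        unsigned char b = src[i];
        char shown = (b >= 0x20 && b <= 0x7E) ? (char)b : '.';
        int n = snprintf(out + used, outsz - used, "%s%02X:%c",
                         i ? " " : "", (unsigned)b, shown);

        if (n < 0 || (size_t)n >= outsz - used)
            return -1;              /* output buffer exhausted */
        used += (size_t)n;
    }
    return (int)used;
}
```

For the five bytes 'A', 0xC3, 0xA9, '\r', '\n' this produces 41:A C3:. A9:. 0D:. 0A:. so a UTF-8 sequence or a stray control byte stands out immediately.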

-Chris

Nov 14 '05 #3

P: n/a
Thanks for your suggestion. My English is very poor, and this is my
first time using newsgroups, so I don't know the culture here. In time,
I think I will be able to communicate with everyone well.

Nov 14 '05 #4

P: n/a
Jack Klein wrote:
> "chunhui_true" <ch*********@gmail.com> wrote:
>> i have a class, it can read one line(\r\n ended) from string,
>> when i read line from utf8 string i can't get any thing!
> There is no such thing as a "class" in C.


But there is such a thing in object oriented programming.
<snip>
>> maybe i should conversion utf8 to ascii??there is any function can
>> conversion utf8 to ascii? very thanks to your help!!
> There is no such thing as "ascii". There is ASCII, the American
> Standard Code for Information Interchange. But neither C nor C++
> defines or requires any specific character set, neither ASCII nor
> UTF-8 nor any other.


More importantly, ASCII characters remain unchanged under UTF8,
so the question is likely ill-formed to begin with.

Nonetheless, it's certainly possible to convert UTF8 sequences back
to their original character codes in C. However, there is no specific
standard function for this purpose alone.
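
As a hedged illustration of converting a UTF-8 sequence back to its character code by hand, here is one possible decoder. The function name utf8_decode and its return convention are assumptions made for this sketch; it follows the modern UTF-8 definition (at most four bytes, no overlong forms, no surrogates).

```c
#include <stddef.h>

/* Hypothetical helper: decodes one UTF-8 sequence starting at s (at most
 * len bytes available). On success stores the code point in *cp and returns
 * the number of bytes consumed (1 to 4). Returns 0 on malformed, truncated,
 * or overlong input. */
size_t utf8_decode(const unsigned char *s, size_t len, unsigned long *cp)
{
    /* Smallest code point that legitimately needs 2, 3, or 4 bytes. */
    static const unsigned long min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    unsigned long c;
    size_t need, i;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {              /* ASCII: the byte is the code point */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 4; c = s[0] & 0x07;
    } else {
        return 0;                   /* stray continuation or invalid lead */
    }

    if (len < need)
        return 0;                   /* sequence is cut off */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;               /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min_cp[need] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                   /* overlong, out of range, or surrogate */

    *cp = c;
    return need;
}
```

Note the first branch: a byte below 0x80 decodes to itself, which is the sense in which ASCII is unchanged under UTF-8.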
<snip>


--
Peter

Nov 14 '05 #5

P: n/a
On 15 Feb 2005 19:07:05 -0800,
Peter Nilsson <ai***@acay.com.au> wrote:


> Nonetheless, it's certainly possible to convert UTF8 sequences back
> to their original character codes in C. However, there is no specific
> standard function for this purpose alone.


The standard mbtowc function (declared in <stdlib.h>) could do that:
when the current locale uses UTF-8, it converts UTF-8, which is a
multibyte encoding, to wide characters, which on many platforms hold
Unicode code points, and the first 128 Unicode values are identical to
US-ASCII.
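
A rough sketch of the locale-based route, using mbstowcs (the string-at-a-time sibling of mbtowc). The function name utf8_to_wide and the list of locale names tried are assumptions; which UTF-8 locales exist, and whether wchar_t values are Unicode code points, depends on the platform.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts a NUL-terminated UTF-8 string to wide
 * characters via the C library. Returns the number of wide characters
 * stored (not counting any terminator), or -1 if no UTF-8 locale could be
 * selected or the input is not valid in that locale.
 * Side effect: changes the process-wide LC_CTYPE locale. */
long utf8_to_wide(const char *utf8, wchar_t *out, size_t outlen)
{
    /* Assumed locale names; adjust for the target system. */
    static const char *const names[] = { "C.UTF-8", "en_US.UTF-8", "en_US.utf8" };
    size_t i, n;
    int selected = 0;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (setlocale(LC_CTYPE, names[i]) != NULL) {
            selected = 1;
            break;
        }
    }
    if (!selected)
        return -1;

    n = mbstowcs(out, utf8, outlen);
    if (n == (size_t)-1)
        return -1;                  /* invalid multibyte sequence */
    return (long)n;
}
```

Because the result depends on the locale, a hand-written decoder is often the more predictable choice for protocol data captured off the wire.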

Villy
Nov 14 '05 #6

P: n/a
Do ASCII characters really remain unchanged under UTF-8? If they are
unchanged, why can't I printf them to the screen?
I use libpcap to capture FTP commands from Ethernet. One class collects
all the packets, reassembles the flows, and buffers them; another class
reads a line (terminated by \r\n) from the buffer. Each time, I read one
line from the buffer to get a command.
When I use CuteFTP, I can get all the commands and printf them to the
screen. But when I use IE for FTP, I see one command, "set utf8 on", and
after that I can't printf the following commands. Should I convert UTF-8
to ASCII? :(
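
For reference, a line reader for this kind of buffer does not need to know about UTF-8 at all, as long as it works on byte counts rather than on C strings. The sketch below is an assumption about how such a reader could look (the name next_crlf_line is made up here); it treats every byte from 0x00 to 0xFF as ordinary data and only looks for the CR LF pair.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of the first complete line in
 * buf, including its terminating "\r\n", or 0 if the buffer does not yet
 * contain a complete line. Uses explicit lengths, so bytes >= 0x80 and
 * embedded NUL bytes are passed through untouched. */
size_t next_crlf_line(const unsigned char *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t remaining = len;

    while (remaining >= 2) {
        /* Search all but the last byte so that cr[1] is always in range. */
        const unsigned char *cr = memchr(p, '\r', remaining - 1);

        if (cr == NULL)
            return 0;
        if (cr[1] == '\n')
            return (size_t)(cr - buf) + 2;

        /* Lone CR: keep scanning after it. */
        remaining -= (size_t)(cr - p) + 1;
        p = cr + 1;
    }
    return 0;
}
```

If a reader like this finds the lines but nothing appears on screen, the problem is more likely in how the bytes are printed than in the buffering.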

Nov 14 '05 #7

P: n/a
On 16 Feb 2005 18:10:46 -0800,
chunhui_true <ch*********@gmail.com> wrote:

> Do ASCII characters really remain unchanged under UTF-8? If they are
> unchanged, why can't I printf them to the screen?


They certainly do. Latin-1 characters, however, do not, except for the
ASCII subset. Also, Unicode and ASCII assign the same code values to
the ASCII subset of Unicode.

See the description of UTF-8 in, for example, RFC 2044 (since superseded by RFC 3629):
|
| Network Working Group                                      F. Yergeau
| Request for Comments: 2044                          Alis Technologies
| Category: Informational                                  October 1996
|
|
| UTF-8, a transformation format of Unicode and ISO 10646
|
| Status of this Memo
|
| This memo provides information for the Internet community. This memo
| does not specify an Internet standard of any kind. Distribution of
| this memo is unlimited.
|
| Abstract
|
| The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993 jointly
| define a 16 bit character set which encompasses most of the world's
| writing systems. 16-bit characters, however, are not compatible with
| many current applications and protocols, and this has led to the
| development of a few so-called UCS transformation formats (UTF), each
| with different characteristics. UTF-8, the object of this memo, has
| the characteristic of preserving the full US-ASCII range: US-ASCII
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| characters are encoded in one octet having the usual US-ASCII value,
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| and any octet with such a value can only be an US-ASCII character.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| This provides compatibility with file systems, parsers and other
| software that rely on US-ASCII values but are transparent to other
| values.
| [...]
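
To make the highlighted property concrete, here is a small sketch of code that relies on US-ASCII values but is transparent to other values. The function name ascii_upper is an assumption for illustration: it upper-cases only the bytes 'a' to 'z' and leaves every byte of a multibyte UTF-8 sequence alone, which is safe because no byte below 0x80 can occur inside such a sequence.

```c
#include <stddef.h>

/* Hypothetical helper: upper-cases ASCII letters in place and returns the
 * number of non-ASCII bytes (>= 0x80) that were left untouched. */
size_t ascii_upper(unsigned char *buf, size_t len)
{
    size_t untouched = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = (unsigned char)(buf[i] - 'a' + 'A');
        else if (buf[i] >= 0x80)
            untouched++;            /* part of a multibyte sequence */
    }
    return untouched;
}
```

A command parser written this way keeps working when the arguments switch from ASCII to UTF-8, since the command verbs themselves stay plain ASCII.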

Villy
Nov 14 '05 #9

P: n/a
Oh, thanks!
Since it says:
| This provides compatibility with file systems, parsers and other
| software that rely on US-ASCII values but are transparent to other
| values.
does that mean my program, which parses the old ASCII FTP commands, can
also parse the UTF-8 commands? But then why can't I read a line from the
buffer after "set utf8 on"?

Nov 14 '05 #10

P: n/a
chunhui_true wrote:
...
> Does that mean my program, which parses the old ASCII FTP commands,
> can also parse the UTF-8 commands? But then why can't I read a line
> from the buffer after "set utf8 on"?


Do you have a question on the ISO C language?

Comp.lang.c is the wrong forum for (repeatedly) asking questions
about utf8 and ftp commands.

Try comp.programming (say) instead, or a platform specific newsgroup
that caters for your current development tools. Also note that you
are much more likely to get useful responses by posting sample code
that exhibits the problems you're having.

--
Peter

Nov 14 '05 #11

P: n/a
"chunhui_true" <ch*********@gmail.com> wrote:
# Oh, thanks!
# Since it says:
# | This provides compatibility with file systems, parsers and other
# | software that rely on US-ASCII values but are transparent to other
# | values.
# does that mean my program, which parses the old ASCII FTP commands, can
# also parse the UTF-8 commands? But then why can't I read a line from the
# buffer after "set utf8 on"?

Because readline is not transparent to character codes 0x80 - 0xFF? You'd
have to examine the source; lots of older programs relied on the 'fact' that
characters were only seven bits wide and used the extra bit as a flag.
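
Two small functions can illustrate the failure mode described here; both names are made up for this sketch. The first mimics what 7-bit-only code effectively does to each byte, and the second shows the cast needed to detect high bytes reliably when plain char is signed.

```c
#include <stddef.h>

/* Mimics legacy code that treats bit 7 as a flag and discards it. */
unsigned char strip_high_bit(unsigned char b)
{
    return (unsigned char)(b & 0x7F);
}

/* Counts bytes in the range 0x80 - 0xFF. The cast matters: where plain
 * char is signed, such bytes are negative, and a test like s[i] >= 0x80
 * without the cast would never be true. */
size_t count_high_bytes(const char *s, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if ((unsigned char)s[i] >= 0x80)
            n++;
    return n;
}
```

With the high bit stripped, the UTF-8 bytes 0xC3 0xA9 turn into 0x43 0x29, that is "C)", so the text that comes out no longer resembles what went in.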

--
SM Ryan http://www.rawbw.com/~wyrmwif/
If your job was as meaningless as theirs, wouldn't you go crazy too?
Nov 14 '05 #12

P: n/a
Sorry, I'm new to newsgroups. What topics does this forum discuss?

Nov 14 '05 #13
