Bytes | Software Development & Data Engineering Community

How to get the ascii code of Chinese characters?

Hi, everyone:

Do you have any ideas?

Say whatever you know about this.
Thanks.

Aug 19 '06
On 2006-08-19 16:54:36, Peter Maas wrote:

> Gerhard Fiedler wrote:
>> Well, ASCII can represent the Unicode numerically -- if that is what the OP
>> wants.
>
> No. ASCII characters range is 0..127 while Unicode characters range is
> at least 0..65535.

Actually, Unicode goes beyond 65535. But right in this sentence, you
represented the number 65535 with ASCII characters, so it doesn't seem to
be impossible.

>> For example, "U+81EC" (all ASCII) is one possible -- not very
>> readable though <g> -- representation of a Hanzi character (see
>> http://www.cojak.org/index.php?funct...kup&term=81EC).
>
> U+81EC means a Unicode character which is represented by the number
> 0x81EC.

Exactly. Both versions represented in ASCII right in your message :)

> UTF-8 maps Unicode strings to sequences of bytes in the range 0..255,
> UTF-7 maps Unicode strings to sequences of bytes in the range 0..127.
> You *could* read the latter as ASCII sequences but this is not correct.

Of course not "correct". I guess the only "correct" representation is the
original Chinese character. But the OP doesn't seem to want this... so a
non-"correct" representation is necessary anyway.

> How to do it in Python? Let chinesePhrase be a Unicode string with
> Chinese content. Then
>
> chinesePhrase_7bit = chinesePhrase.encode('utf-7')
>
> will produce a sequence of bytes in the range 0..127 representing
> chinesePhrase and *looking like* a (meaningless) ASCII sequence.

Actually, no. There are quite a few code positions in the range 0..127 that
don't "look like" anything (non-printable). And, as you say, this is rather
meaningless.

> chinesePhrase_16bit = chinesePhrase.encode('utf-16be')
>
> will produce a sequence with Unicode numbers packed in a byte
> string in big endian order. This is probably closest to what
> the OP wants.

That's what you think... but it's not really ASCII. If you want this in
ASCII, and readable, I still suggest transforming this sequence of 2-byte
values (for Chinese characters it will be 2 bytes per character) into a
sequence of something like U+81EC (or 0x81EC if you are a C fan or 81EC if
you can imply the rest)... that's where we come back to my original
suggestion :)

Gerhard

Aug 19 '06 #11
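
A minimal sketch of the encodings discussed in the post above, assuming Python 3 (where every str is already Unicode; the thread itself predates it). The names chinese_phrase and to_u_notation are illustrative and do not come from the thread. It encodes a short phrase with UTF-7 and UTF-16 big endian, then builds the readable "U+81EC" style form from the code points.

```python
# Assumption: Python 3. The phrase is written with escapes so this file stays ASCII.
chinese_phrase = "\u81ec\u4e2d"  # two Hanzi: U+81EC and U+4E2D

# UTF-7 yields only byte values in 0..127; UTF-16-BE packs each of these
# characters into two bytes, high byte first.
utf7_bytes = chinese_phrase.encode("utf-7")
utf16_bytes = chinese_phrase.encode("utf-16-be")


def to_u_notation(text):
    """Return a readable, pure-ASCII form such as 'U+81EC U+4E2D'."""
    return " ".join("U+%04X" % ord(ch) for ch in text)


print(utf7_bytes)
print(utf16_bytes.hex())
print(to_u_notation(chinese_phrase))
```

The U+XXXX form is built from the code points directly, so it does not depend on which byte encoding is chosen.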

many_years_after wrote:
> hi:
>
> what I want to do is just to make numbers as people input some Chinese
> character (hanzi, I mean). The same character will create the same
> number. So I think ascii code can do this very well.

Possibly you have "create" upside-down. Could you possibly be talking
about an "input method", in which people type in ascii letters (and
maybe numbers) and the *result* is a Chinese character? In other words,
what *everybody* uses to input Chinese characters?

Perhaps you could ask on the Chinese Python newsgroup.

*GIVE* *EXAMPLES* of what you want to do.

Aug 19 '06 #12

John Machin wrote:
> many_years_after wrote:
>> hi:
>>
>> what I want to do is just to make numbers as people input some Chinese
>> character (hanzi, I mean). The same character will create the same
>> number. So I think ascii code can do this very well.
>
> Possibly you have "create" upside-down. Could you possibly be talking
> about an "input method", in which people type in ascii letters (and
> maybe numbers) and the *result* is a Chinese character? In other words,
> what *everybody* uses to input Chinese characters?
>
> Perhaps you could ask on the Chinese Python newsgroup.
>
> *GIVE* *EXAMPLES* of what you want to do.

Well, people may input from the keyboard. They input some Chinese
characters, then I want to create a number. The same number will be
created if they input the same Chinese characters.

Aug 20 '06 #13
"many_years_after" <sh*****@gmail.com> writes:
> Well, people may input from keyboard. They input some Chinese
> characters, then, I want to create a number. The same number will be
> created if they input the same Chinese characters.

You seem to be looking for a hash.

<URL:http://docs.python.org/lib/module-md5>
<URL:http://docs.python.org/lib/module-sha>

If not, please tell us what your *purpose* is. It's not at all clear
from your questions what you are trying to achieve.

--
\ "I was in a bar the other night, hopping from barstool to |
`\ barstool, trying to get lucky, but there wasn't any gum under |
_o__) any of them." -- Emo Philips |
Ben Finney

Aug 20 '06 #14
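
A minimal sketch of the hash suggestion above, assuming Python 3, where the hashlib module supersedes the md5 and sha modules linked in the post. The function name text_to_number and the choice of SHA-256 over UTF-8 bytes are assumptions made for illustration, not something the thread specifies.

```python
import hashlib


def text_to_number(text):
    """Map a Unicode string to a repeatable integer.

    Assumption: SHA-256 over the UTF-8 bytes. Any fixed encoding and any
    hashlib algorithm would give the same "same input, same number" property.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


phrase = "\u4e2d\u6587"  # two Hanzi, written as escapes
print(text_to_number(phrase))
```

The built-in hash() is deliberately avoided here: for strings its result is randomized per process by default, so it would not give the same number from one run to the next. A hash also cannot be turned back into the original characters.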
Gerhard Fiedler wrote:
>> No. ASCII characters range is 0..127 while Unicode characters range is
>> at least 0..65535.
>
> Actually, Unicode goes beyond 65535.

you may want to look up "at least" in a dictionary.

</F>

Aug 20 '06 #15
many_years_after wrote:
> Well, people may input from keyboard. They input some Chinese
> characters, then, I want to create a number. The same number will be
> created if they input the same Chinese characters.

assuming you mean "code point" rather than "ASCII code" (ASCII is a
specific encoding that *doesn't* include Chinese characters), "ord" is
what you want:

char = read_from_some_input_device()
code = ord(char)

see:

http://pyref.infogami.com/ord

</F>

Aug 20 '06 #16
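
A minimal sketch of the ord approach above, assuming Python 3. The helper code_points is a hypothetical name for illustration; ord itself accepts only a single character, so a whole input string has to be handled one character at a time.

```python
def code_points(text):
    """Return the Unicode code point of every character in text."""
    return [ord(ch) for ch in text]


hanzi = "\u81ec"  # the U+81EC character mentioned earlier in the thread

print(ord(hanzi))        # 33260
print(hex(ord(hanzi)))   # 0x81ec
print(code_points("\u4e2d\u6587"))
```

The mapping is reversible: chr(33260) gives back the same character, which a hash would not allow.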
In message <ma***************************************@python.org>, Fredrik
Lundh wrote:
> Gerhard Fiedler wrote:
>>> No. ASCII characters range is 0..127 while Unicode characters range is
>>> at least 0..65535.
>>
>> Actually, Unicode goes beyond 65535.
>
> you may want to look up "at least" in a dictionary.

Maybe you need to do the same for "actually".
Aug 20 '06 #17
On 2006-08-20 05:56:05, Fredrik Lundh wrote:
>>> No. ASCII characters range is 0..127 while Unicode characters range is
>>> at least 0..65535.
>>
>> Actually, Unicode goes beyond 65535.
>
> you may want to look up "at least" in a dictionary.

As homework, try to parse "at least until" and "goes beyond" and compare
the two (a dictionary is not necessarily of help with this :)

"range is at least 0..65535" : upper_bound >= 65535
"goes beyond 65535"          : upper_bound > 65535

For some discussions (like how to represent code points etc) this
distinction is crucial.

Gerhard

Aug 20 '06 #18
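
A minimal sketch of why the upper bound matters when representing code points, assuming Python 3.3 or later. The variable names are illustrative. A character above 65535 no longer fits in one 16-bit unit, so the "2 bytes per character" assumption for UTF-16 stops holding.

```python
import sys

bmp_char = "\u81ec"         # inside 0..65535
astral_char = "\U00020000"  # first code point of CJK Extension B, above 65535

# On Python 3.3+ the highest code point is always 0x10FFFF.
print(sys.maxunicode)

# Two bytes in UTF-16 for the first character, four (a surrogate pair)
# for the second.
print(len(bmp_char.encode("utf-16-be")))
print(len(astral_char.encode("utf-16-be")))

# A four-digit form like U+81EC is too narrow for the second character.
print("U+%04X" % ord(astral_char))
```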
Gerhard Fiedler wrote:
>>> Actually, Unicode goes beyond 65535.
>>
>> you may want to look up "at least" in a dictionary.
>
> As homework, try to parse "at least until" and "goes beyond" and compare
> the two (a dictionary is not necessarily of help with this :)
>
> "range is at least 0..65535" : upper_bound >= 65535
> "goes beyond 65535"          : upper_bound > 65535
>
> For some discussions (like how to represent code points etc) this
> distinction is crucial.

do you know anything about how Unicode is used in real life, or are you
just squabbling?

</F>

Aug 20 '06 #19
On 2006-08-20 10:31:20, Fredrik Lundh wrote:
>> "range is at least 0..65535" : upper_bound >= 65535
>> "goes beyond 65535"          : upper_bound > 65535
>>
>> For some discussions (like how to represent code points etc) this
>> distinction is crucial.
>
> do you know anything about how Unicode is used in real life, or are you
> just squabbling?

Your point is?

Gerhard

Aug 20 '06 #20
