Chinese character sets

Hi all, I have to build a website in Chinese!

Basically I just need to know how to output Chinese characters. I am
assuming it's very easy, but I have never done it before. I can, however,
do simple things like changing the formats of currency and calendars and
so on.

I am guessing the answer is quite simple; I assume Unicode supports
all the Chinese characters, right? Ideally I'd like them to be
able to enter their own content through a WYSIWYG editor or something
similar in their native language so that I don't have to worry about any
translation. Knowing me, I'd intend to write "give your dog a bone" but
end up writing "I'd like to bone your dog".

Cheers
Steve

Feb 27 '06 #1
It's really just a matter of choosing your encoding, as far as writing
the file is concerned. You might choose Big5, or UTF-8. The latter is
somewhat handier in various ways, and can cover all Unicode strings.

Strings in .NET are Unicode by default, so you shouldn't have many
problems.

See http://www.pobox.com/~skeet/csharp/unicode.html for a bit more
information.
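
As a rough illustration of the encoding choice described above, here is a minimal Python sketch (not .NET; the helper name `try_encode` and the sample strings are made up for this example). It assumes Python's built-in `utf-8` and `big5` codecs and shows that UTF-8 can represent text that Big5 cannot.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical sample strings, chosen only for illustration.
traditional = "中文網站"        # Traditional Chinese only
mixed = "中文網站 한글"          # Chinese plus Korean Hangul

# Collect the result of each combination; None marks "cannot be encoded".
results = {
    (sample, encoding): try_encode(sample, encoding)
    for sample in (traditional, mixed)
    for encoding in ("utf-8", "big5")
}
```

The four-character Traditional sample takes 12 bytes in UTF-8 and 8 bytes in Big5, while the mixed sample cannot be written in Big5 at all, which is the practical sense in which UTF-8 is "handier".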

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 27 '06 #2
Cheers Jon.

In my research I have discovered that lots of people seem to have
problems storing Chinese characters in an English SQL Server 2000
instance.
Has anyone experienced this? Any advice before I design the schema?
Essentially the plan is to create a Chinese web site and provide an admin
page that lets them edit all their own content through a WYSIWYG editor
that supports Chinese characters. The Chinese characters get saved in the
SQL database and retrieved whenever a content page is requested.
Does anyone see any fundamental flaws, anything that may not be possible
with Chinese characters?
Also, is it usual to let the user choose between Traditional and
Simplified Chinese? I don't know anything about the language and am not
sure if Chinese sites normally support this feature. Perhaps they are
all in one or the other as a general rule?

Thanks,
Steve

Feb 27 '06 #3
Steven Nagy <le*********@hotmail.com> wrote:
> In my research I have discovered that lots of people seem to have
> problems storing Chinese characters in an English SQL Server 2000
> instance.
> Has anyone experienced this? Any advice before I design the schema?

My experience with this is that there are issues with collation,
particularly with Japanese (not sure about Chinese), and that often it
depends on how the *instance* was created to start with, unless the
database itself specifies the collation on creation.

I'm not a SQL Server expert, but I'd definitely try some
experimentation, and make sure that all the appropriate fields are
Unicode text fields.

> Essentially the plan is to create a Chinese web site and provide an admin
> page that lets them edit all their own content through a WYSIWYG editor
> that supports Chinese characters. The Chinese characters get saved in the
> SQL database and retrieved whenever a content page is requested.
> Does anyone see any fundamental flaws, anything that may not be possible
> with Chinese characters?

No, that should be fine.

> Also, is it usual to let the user choose between Traditional and
> Simplified Chinese? I don't know anything about the language and am not
> sure if Chinese sites normally support this feature. Perhaps they are
> all in one or the other as a general rule?

I don't *think* you can easily just switch text between the two - but
you could save whatever the user enters, whichever character set they
use.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 27 '06 #4
Cheers Jon.

The StrConv function seems to support translation back and forth
between simplified and traditional. I was thinking of having it do a
conversion before output depending on user setting.
Here is the doc:
http://msdn2.microsoft.com/en-us/library/cd7w43ec.aspx

Thanks for your input.

Feb 27 '06 #5
If you're using the data objects the .NET Framework provides rather than
third-party ones, and you remember to pass all your values to the SQL string
as SqlParameter objects, you should be quite safe; so far I haven't
experienced any problems. (Of course, you have to configure your database to
use Unicode first.)

For other cases, there's a lot to do. For example, in MySQL you
can use "char(hex values separated by commas)" to store a Big5 string without
problems even if the database is set to an ASCII-only charset. Read your SQL
manual to see the database-specific syntax you can use.

Note that escaping by hand with something like String.Replace("\"", "\"\"")
won't do: .NET treats a double-byte character as a single char, so the
replacement never sees the individual bytes.
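
The advice about passing values as parameters instead of splicing them into the SQL text can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the `pages` table and `save_and_load` helper are hypothetical. With SQL Server the column would be an NVARCHAR and the value would travel in a SqlParameter, but the principle is the same.

```python
import sqlite3


def save_and_load(title):
    """Store a title through a bound parameter and read it back unchanged."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical table; SQLite TEXT columns hold Unicode text.
        conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
        # The "?" placeholder keeps the value out of the SQL text, so
        # quotes and multi-byte characters need no manual escaping.
        conn.execute("INSERT INTO pages (title) VALUES (?)", (title,))
        row = conn.execute(
            "SELECT title FROM pages WHERE title = ?", (title,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

Because the driver handles the value, a string that mixes Chinese text with quote characters comes back exactly as it went in.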

Feb 28 '06 #6
If you need conversion, I'd recommend storing everything in
Traditional Chinese, to avoid losing information during
conversion.

You know, CHS/CHT conversion is many-to-one: several Traditional characters
can collapse into a single Simplified character, so going from CHS back to
CHT means guessing. Normally the conversion routines do a good job of finding
the most probable replacement. But for something like a username, that doesn't
help much. And I can tell you that many users will be upset if you display the
name they entered wrongly.

Storing everything in CHT from the beginning will save you a lot of problems
later.
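
The information loss described above can be seen with a tiny hand-written table. This is a sketch only: the mapping `TRAD_TO_SIMP` and both helper functions are hypothetical, cover just four characters, and say nothing about how StrConv or any real conversion library behaves.

```python
# Hypothetical, deliberately tiny Traditional -> Simplified table.
TRAD_TO_SIMP = {
    "發": "发",  # to emit, to develop
    "髮": "发",  # hair
    "後": "后",  # after, behind
    "后": "后",  # empress (same character in both scripts)
}


def to_simplified(text):
    """Map each character through the table; unknown characters pass through."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates(simplified_char):
    """List every Traditional character in the table that maps to this one."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == simplified_char)
```

Going to Simplified is a plain lookup, but going back from 发 or 后 yields two candidates each, so a converter has to guess from context. With a personal name there is often no context to guess from.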

Feb 28 '06 #7
> In my research I have discovered that lots of people seem to have
> problems storing the Chinese chars in an English SQL Server 2000
> instance.

Go Unicode all the way.
UTF-8 for the web pages and forms, NCHAR & NVARCHAR for the database.

> Also, is it usual to allow the user to option between traditional and
> simplified Chinese?

You should think of them as different languages.
So if you allow switching between French, German, and English, then you should
allow it for Simplified/Traditional Chinese too.

> Perhaps they are
> all in 1 or the other as a general rule?

No.

> The StrConv function seems to support translation back and forth
> between simplified and traditional.

Useless. Even if you convert between the Simplified Chinese encoding (GB2312)
and the Traditional Chinese encoding (Big5), there are still linguistic
differences (more than between US/Australian/UK/Indian English).
Yes, the user might understand something, but it will be obvious that it is
not the real thing.

> You know, CHS to CHT conversion is a many to one conversion. ....
> Store everything in CHT in the beginning will save you from a lot of
> problem later.

Please don't. The two are different languages; there is no conversion.
It is more like translation. Compare again with English.
One might come up with a list of search-and-replace pairs (color/colour and
so on), but then you have full expressions (money for jam <=> easy money,
fitted carpet <=> wall-to-wall carpeting).

Best advice: keep them separate, consider them different languages.
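
One way to read "keep them separate" is to store each variant as its own locale, the same way French and German content would be stored. The sketch below assumes the language tags `zh-TW` and `zh-CN`; the `CONTENT` table and `get_text` helper are hypothetical. The word for "software" is a case where the two variants differ in vocabulary, not just in character shapes.

```python
# Hypothetical per-locale content store; each variant is authored separately.
CONTENT = {
    "zh-TW": {"welcome": "歡迎光臨", "software": "軟體"},
    "zh-CN": {"welcome": "欢迎光临", "software": "软件"},
}


def get_text(key, locale, default_locale="zh-TW"):
    """Look up a string for the locale, falling back to the default locale."""
    table = CONTENT.get(locale, CONTENT[default_locale])
    return table.get(key, CONTENT[default_locale][key])
```

A character-by-character conversion of 軟體 would give 软体 rather than 软件, which is why the strings are written per locale instead of being derived from one another.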

--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Feb 28 '06 #8
Ok thanks for that.
Which of the 2 do you recommend? Traditional or Simplified?

Feb 28 '06 #9
> Ok thanks for that.
> Which of the 2 do you recommend? Traditional or Simplified?

There is no direct answer for this. What customers do you have?
It's like asking "Which of the 2 do you recommend: German or French?"
--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Mar 1 '06 #10
The customers are Chinese!
So you can't use a French/German comparison, because those are two
different countries.
Germans don't have two different writing scripts.

I guess what I am asking is:
Which is used more predominantly on Chinese websites, Traditional or
Simplified?
I have no idea what markets the two separate scripts would be targeting
because I have no idea who uses which script.

Mar 1 '06 #11
I'd say half and half.

CHT is commonly used in Taiwan, Hong Kong and Macau, while CHS is
used in mainland China and Singapore.

So the decision depends on which audience your website is intended for. If
your audience includes both, you had better support both languages.

Mar 2 '06 #12
Cool, cheers.

Mar 2 '06 #13
