How to validate a string containing Chinese?

Hi All,

I want to validate a string and check whether it contains any Chinese
characters (simplified or traditional). I've tried using RegExp and Encoding,
but without success.
Can someone point me in the right direction?

Kind regards,
Kevin
Dec 20 '05 #1


Hi Kevin,

I am not sure why you would need the Encoding class. All strings are
internally Unicode (UTF-16) in .NET; Encoding only matters when you are
converting to or from _byte_ arrays. Therefore, what I think you need first
is to determine which range(s) of numeric character codes cover the Chinese
characters (Han ideographs). Then, just define a RegExp pattern that matches
any character in those ranges. For example, if there are just two ranges:

[\uXXXX-\uYYYY]|[\uZZZZ-\uWWWW]

(of course XXXX must be numerically less than YYYY, the same goes for ZZZZ
and WWWW).
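
To make the placeholder pattern above concrete, here is a minimal sketch. The two ranges are my assumption, not something given in this thread: U+4E00-U+9FFF (CJK Unified Ideographs) and U+3400-U+4DBF (CJK Unified Ideographs Extension A). The class and method names are hypothetical. Note that these blocks are shared with Japanese kanji and Korean hanja, so this detects Han ideographs in general, and it cannot tell simplified from traditional.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HanRegexCheck
{
    // Assumed ranges (not stated in the thread):
    //   U+3400-U+4DBF  CJK Unified Ideographs Extension A
    //   U+4E00-U+9FFF  CJK Unified Ideographs
    // Ideographs outside the Basic Multilingual Plane (stored as
    // surrogate pairs in .NET strings) are NOT covered by this pattern.
    private static readonly Regex HanPattern =
        new Regex(@"[\u3400-\u4DBF]|[\u4E00-\u9FFF]");

    // Returns true if at least one character of text falls in the ranges above.
    public static bool ContainsHan(string text)
    {
        return HanPattern.IsMatch(text);
    }
}
```

With this, `HanRegexCheck.ContainsHan("Hello 世界")` should be true and `HanRegexCheck.ContainsHan("Hello")` false.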

--
Sincerely,
Dmytro Lapshyn [Visual Developer - Visual C# MVP]
Dec 20 '05 #2

Dmytro gave a regular expression solution - I'd just hard-code it,
personally. Iterate through each character in the string and check
whether it's in the range you're interested in. I think that's a bit
more readable than the regular expression solution, although if there
are lots of ranges to consider, a regular expression formatted on
multiple lines, with one range and a comment per line, might be better
than a hard-coded solution. (The hard-coded solution is likely to be
faster too, but I wouldn't worry about that until you've determined
that it's actually a performance bottleneck.)
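
A rough sketch of both options described above, side by side: a hard-coded loop and a regular expression laid out with one range and one comment per line. As before, the ranges (U+3400-U+4DBF and U+4E00-U+9FFF) are an assumption rather than something specified in this thread, and the names `HanCheck`, `ContainsHanLoop` and `ContainsHanRegex` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HanCheck
{
    // Hard-coded version: walk the string and test each UTF-16 code unit
    // against the assumed ranges. Characters stored as surrogate pairs
    // (outside the Basic Multilingual Plane) are not detected.
    public static bool ContainsHanLoop(string text)
    {
        foreach (char c in text)
        {
            if ((c >= '\u3400' && c <= '\u4DBF') ||   // Extension A (assumed)
                (c >= '\u4E00' && c <= '\u9FFF'))     // CJK Unified Ideographs (assumed)
            {
                return true;
            }
        }
        return false;
    }

    // Regex version: IgnorePatternWhitespace lets the pattern span several
    // lines and allows '#' comments, so each range can be documented.
    private static readonly Regex HanPattern = new Regex(@"
          [\u3400-\u4DBF]   # CJK Unified Ideographs Extension A (assumed)
        | [\u4E00-\u9FFF]   # CJK Unified Ideographs (assumed)
        ", RegexOptions.IgnorePatternWhitespace);

    public static bool ContainsHanRegex(string text)
    {
        return HanPattern.IsMatch(text);
    }
}
```

Both methods should agree on any input, for example true for "abc 中文" and false for "abc". Adding a range means one extra condition in the loop or one extra commented line in the pattern.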

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 20 '05 #3
