Translate UTF16 into lower ascii

P: n/a
Bob
Is there an easy way to translate odd UTF8/16 characters (like letters
with umlauts, vowels with accent symbols above) into the closest
'look-alike' lower ascii equivalent (A-Z, a-z)?

This is something that has probably been done, but I can't think of a
good search key for finding the code.

Nov 21 '08 #1
4 Replies


P: n/a
Bob
On Fri, 21 Nov 2008 02:44:24 -0500, "Michael B. Trausch"
<mi**@trausch.us> wrote:
>On Fri, 21 Nov 2008 01:56:04 -0500
>Bob <Bo*@nospam.com> wrote:
>>Is there an easy way to translate odd UTF8/16 characters (like letters
>>with umlauts, vowels with accent symbols above) into the closest
>>'look-alike' lower ascii equivalent (A-Z, a-z)?
>>
>>This is something that has probably been done, but I can't think of a
>>good search key for finding the code.
>
>There may be a library out there somewhere, but I am sure that it is so
>obscure that I can't find it.
>
>Your best bet would be to try to transliterate what you can and drop
>what you can't transliterate. A table-based approach would be the only
>way I can see being able to do it reasonably. Maybe looking for a list
>of transliterations that you could preprocess into a table would be
>ideal?
>
>--- Mike

Very likely that someone has already done this, as there are occasions
that plain 'lower ascii' must be used, like on cell phone keypads. If
someone wanted to enter the name "Andre" on a cell phone, there would
be no access to an E with the accent over it.

Now, to find it...
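
The table-based approach Mike suggests above could look roughly like the following. This is only a minimal sketch in Python (chosen for brevity; nobody in the thread posted code), and the table `TRANSLIT_TABLE` and the function `to_lower_ascii` are hypothetical names with a deliberately tiny set of entries. It transliterates what it knows and drops the rest.

```python
# Hypothetical, deliberately tiny lookup table; a real one would cover far more characters.
TRANSLIT_TABLE = {
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U",
    "é": "e", "è": "e", "É": "E",
    "ñ": "n", "ç": "c",
    "ß": "ss",  # one character may map to several ASCII letters
}


def to_lower_ascii(text: str) -> str:
    """Map each character through the table; drop non-ASCII characters that are not listed."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already plain ASCII, keep as-is
        elif ch in TRANSLIT_TABLE:
            out.append(TRANSLIT_TABLE[ch])
        # else: no known look-alike, so the character is dropped
    return "".join(out)


print(to_lower_ascii("André Müller"))  # Andre Muller
```

The obvious cost of this design is that the table has to be built and maintained by hand, which is why a ready-made list of transliterations would be the ideal starting point.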
Nov 21 '08 #2

P: n/a
Bob
On Fri, 21 Nov 2008 09:00:53 +0100, Jérémy Jeanson
<je************@free.fr> wrote:
>System.Text.ASCIIEncoding has some methods to convert and translate
>chars. You can find many examples on MSDN.

Entirely appropriate to hear from someone with two accents in their
name. <G> Good example here, as I wouldn't know how to type your name
as you have it spelled above. And you wouldn't want to drop the two
E's...you'd translate to lower ascii E when necessary.

I presume that you're referring to the Decoder.Convert functions via
ASCIIEncoding classes. I didn't see anything that looked like it would
do this.
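
To illustrate why a plain ASCII encoder is not enough on its own, here is a small sketch in Python (used only for brevity; this is Python's built-in `ascii` codec, not the .NET `ASCIIEncoding` class). The codec can replace or drop characters it cannot encode, but it never picks a look-alike letter.

```python
name = "Jérémy"

# errors="replace" substitutes "?" for every character the ASCII codec cannot encode.
replaced = name.encode("ascii", errors="replace").decode("ascii")

# errors="ignore" silently drops those characters instead.
ignored = name.encode("ascii", errors="ignore").decode("ascii")

print(replaced)  # J?r?my
print(ignored)   # Jrmy
```

Neither result keeps the two E's, so some separate transliteration step is still needed before encoding.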
Nov 21 '08 #3

P: n/a
Bob wrote:
>Very likely that someone has already done this, as there are occasions
>that plain 'lower ascii' must be used, like on cell phone keypads. If
>someone wanted to enter the name "Andre" on a cell phone, there would
>be no access to an E with the accent over it.

Really? I have all umlauts available on my mobile (and it is not a
special or expensive model). It depends on the language setting: if it
is set to English, then of course there are no special characters. Think
about Chinese or Japanese mobiles: they do not have 2000+ tiny keys,
but I guess you can send Chinese text using the keypad somehow...

Michael
Nov 21 '08 #4

P: n/a
MC

"Bob" <Bo*@nospam.com> wrote in message news:8d********************************@4ax.com...
>Is there an easy way to translate odd UTF8/16 characters (like letters
>with umlauts, vowels with accent symbols above) into the closest
>'look-alike' lower ascii equivalent (A-Z, a-z)?
>
>This is something that has probably been done, but I can't think of a
>good search key for finding the code.

Check an earlier thread here about "Remove accents" or somesuch.

The key idea is to "normalize" the Unicode into a decomposed form, so that the accents become combining characters (e.g., the acute accent is a separate character from the letter it appears on), then remove the combining characters (the common accents fall in the Combining Diacritical Marks block, U+0300 to U+036F).
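
The normalize-then-strip idea can be sketched as follows. This is a minimal illustration in Python using the standard `unicodedata` module, not code from the earlier thread; the function name `strip_accents` is hypothetical. In .NET the corresponding pieces would presumably be `String.Normalize` with `NormalizationForm.FormD` and a filter on `UnicodeCategory.NonSpacingMark`.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose accented letters, then remove the combining marks."""
    # NFD splits a precomposed letter such as "é" into "e" plus U+0301 (combining acute).
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks; keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Jérémy André Müller"))  # Jeremy Andre Muller
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged, so a small lookup table for those leftovers may still be needed if the output must be strictly A-Z, a-z.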
Nov 21 '08 #5
