Translate accented characters

Is there anything in the framework which will help translate accented
characters in strings to their standard counterparts?

e.g. "Gráda" to "Grada"
Nov 17 '05 #1

"JezB" <je***********@blueyonder.co.uk> wrote in message
news:uQ**************@TK2MSFTNGP14.phx.gbl...
> Is there anything in the framework which will help translate accented
> characters in strings to their standard counterparts?
>
> e.g. "Gráda" to "Grada"


Interestingly enough, I was looking for exactly the same thing recently, and
was unable to find anything native to the Framework, so I ended up writing
my own mapping function. Easy enough for the Latin languages (e.g. French,
Spanish, Italian, Portuguese etc), fairly simple for German (e.g. any vowel
with an umlaut is replaced by the unmodified vowel + 'e'), a little messier
for the Scandinavian languages, even worse for Greek and Cyrillic, and
almost impossible for the Eastern European languages with diacritics.
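
A mapping function of the kind described above might look roughly like the sketch below. The class name AccentTable, the method name Translate, and the table contents are all hypothetical and deliberately incomplete; they only illustrate the table-lookup idea for a few Latin-language characters, not the function the poster actually wrote.

```csharp
using System.Collections.Generic;
using System.Text;

public static class AccentTable
{
    // Hypothetical, incomplete table: extend it for the languages you need.
    // Values are strings so that one character can map to several (e.g. "ae").
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'á', "a" }, { 'à', "a" }, { 'â', "a" }, { 'ã', "a" },
        { 'é', "e" }, { 'è', "e" }, { 'ê', "e" }, { 'ë', "e" },
        { 'í', "i" }, { 'î', "i" }, { 'ï', "i" },
        { 'ó', "o" }, { 'ô', "o" }, { 'õ', "o" },
        { 'ú', "u" }, { 'ù', "u" }, { 'û', "u" },
        { 'ç', "c" }, { 'ñ', "n" },
        { 'Á', "A" }, { 'É', "E" }, { 'Í', "I" }, { 'Ó', "O" }, { 'Ú', "U" }
    };

    // Replaces every character found in the table; all others pass through unchanged.
    public static string Translate(string input)
    {
        StringBuilder sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            string replacement;
            if (Map.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this table, AccentTable.Translate("Gráda") returns "Grada". Characters missing from the table are left as they are, so gaps show up as untouched accents rather than as lost text.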

What is the business purpose behind your need to do this, AAMOI?
Nov 17 '05 #2

I'm passing artist/album names stored within MP3 files through Amazon's web
service to look up album details. Many of the artist names have accented
characters, since I am interested in world/Celtic music, but Amazon's search
criteria seem to be based on normalized, unaccented strings. It would be a
real pain to edit all my ID3 tags!

Nov 17 '05 #3

Hi,

There is nothing like this in the framework. What you can do is use
String.Replace; it will be slower, but there are only 5 vocals after all :)
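
As a rough sketch of the String.Replace idea, assuming the tags are in a single language (Spanish is used here purely as an example) and that only lowercase accented vowels matter. The names VowelReplacer and Strip are made up for illustration.

```csharp
public static class VowelReplacer
{
    // Assumption: one language (Spanish), lowercase accented vowels only.
    // Each Replace call allocates a new string, which is why this approach
    // is slower than a single pass over the characters.
    public static string Strip(string input)
    {
        return input
            .Replace('á', 'a')
            .Replace('é', 'e')
            .Replace('í', 'i')
            .Replace('ó', 'o')
            .Replace('ú', 'u');
    }
}
```

For example, VowelReplacer.Strip("canción") returns "cancion". Anything outside those five characters is left untouched.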

Cheers,

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation

Nov 17 '05 #4

Hi Jez,

There is nothing pre-made in .NET that will do what you want. You need to create a translation table and translate each character as necessary.

There is a method that seems to work in most cases involving translation between different encodings, but I cannot guarantee that it works in all cases.

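using System.Text;

// Sample input: accented Latin letters.
string s = "àáâãäåèéêëìíîïòóôõöøùúûüýÿ";
// Code page 1251 (Cyrillic) has no accented Latin letters, so the encoder's
// best-fit mapping substitutes the unaccented base letter.
byte[] b = Encoding.GetEncoding(1251).GetBytes(s);
string t = Encoding.ASCII.GetString(b);

//t == aaaaaaeeeeiiiioooooouuuuyy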
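
For readers on .NET 2.0 or later, another option alongside the encoding round-trip above is Unicode normalization: String.Normalize with NormalizationForm.FormD splits an accented letter into its base letter plus combining marks, and the marks can then be dropped. This is only a sketch, and the names DiacriticStripper and RemoveDiacritics are hypothetical.

```csharp
using System.Globalization;
using System.Text;

public static class DiacriticStripper
{
    public static string RemoveDiacritics(string input)
    {
        // FormD decomposes each accented letter into base letter + combining mark(s).
        string decomposed = input.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks and keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        // Recompose whatever is left into the usual precomposed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Here DiacriticStripper.RemoveDiacritics("Gráda") returns "Grada". Letters that have no decomposition, such as ø, æ and ß, pass through unchanged, so they would still need a translation table.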
--
Happy coding!
Morten Wennevik [C# MVP]
Nov 17 '05 #5

"Morten Wennevik" <Mo************@hotmail.com> wrote in message
news:op.sqif0dt2klbvpo@stone...
> There is a method that seems to work in most cases involving translation
> between different encodings, but I cannot guarantee that it works in all
> cases.
>
> string s = "àáâãäåèéêëìíîïòóôõöøùúûüýÿ";
> byte[] b = Encoding.GetEncoding(1251).GetBytes(s);
> string t = Encoding.ASCII.GetString(b);
>
> //t == aaaaaaeeeeiiiioooooouuuuyy


No use at all for German:

ä = ae
ö = oe
ü = ue
ß = ss
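
The German rules listed above could be sketched as a single pass over the string. The names GermanTransliterator and Transliterate are hypothetical, and the uppercase cases are an added assumption that the post does not mention.

```csharp
using System.Text;

public static class GermanTransliterator
{
    public static string Transliterate(string input)
    {
        // Output can grow, since each mapped character becomes two.
        StringBuilder sb = new StringBuilder(input.Length + 8);
        foreach (char c in input)
        {
            switch (c)
            {
                case 'ä': sb.Append("ae"); break;
                case 'ö': sb.Append("oe"); break;
                case 'ü': sb.Append("ue"); break;
                case 'ß': sb.Append("ss"); break;
                // Assumption: uppercase umlauts follow the same rule.
                case 'Ä': sb.Append("Ae"); break;
                case 'Ö': sb.Append("Oe"); break;
                case 'Ü': sb.Append("Ue"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }
}
```

For example, GermanTransliterator.Transliterate("Müller Straße") returns "Mueller Strasse", whereas the encoding trick would give "Muller" for the first word.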
Nov 17 '05 #6

"Ignacio Machin ( .NET/ C# MVP )" <ignacio.machin AT dot.state.fl.us> wrote
in message news:%2****************@TK2MSFTNGP10.phx.gbl...
> There is nothing like this in the framework. What you can do is use
> String.Replace; it will be slower, but there are only 5 vocals after all :)


If by "vocals" you mean "vowels", then that just isn't the case in many
languages...
Nov 17 '05 #7

"JezB" <je***********@blueyonder.co.uk> wrote in message
news:Ow**************@TK2MSFTNGP10.phx.gbl...
> I'm passing artist/album names stored within MP3 files through Amazon's
> web service to look up album details. Many of the artist names have
> accented characters, since I am interested in world/Celtic music, but
> Amazon's search criteria seem to be based on normalized, unaccented
> strings. It would be a real pain to edit all my ID3 tags!


Then the translation table approach is what you need here...
Nov 17 '05 #8

That's good enough for me!!! This is just a hobby program, so it doesn't need
to be foolproof.
Many thanks, Morten.

Nov 17 '05 #9

Hi,

Yes, I meant vowels :)

It was a bit of "Spanglish": vocales = vowels in Spanish :)
You are right, but in any one language there are not that many, and judging
by the OP's description he most likely has the MP3 tags in a single language.

cheers,

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation

Nov 17 '05 #10
