string conversion latin2 to ascii

Hi all,

Sorry for a newbie question. I have a Unicode string (or, more precisely,
a latin2-encoded one) containing non-ASCII characters, e.g.

s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"

I would like to convert this string to plain ASCII (using some lookup
table for latin2)

to get

-Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA

Thanks for any hints! Regards, Martin Landa
Nov 27 '07 #1
With a little googling, I found this:

http://www.peterbe.com/plog/unicode-to-ascii

You might also find this article useful:

http://www.reportlab.com/i18n/python..._tutorial.html

Mike
Nov 27 '07 #2
sorry for a newbie question. I have unicode string (or better say
latin2 encoding) containing non-ascii characters, e.g.

s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
That's not a Unicode string (at least in Python 2); it is
a latin-2 encoded byte string; it has nothing to do with Unicode.
I would like to convert this string to plain ascii (using some lookup
table for latin2)

to get

-Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
I recommend using string.translate. You need a translation
table for that, which is best generated with string.maketrans.

import string
# Python 2; assumes s and these literals are latin-2 bytes (one byte per letter)
table = string.maketrans("áží", "azi")
print s.translate(table)

HTH,
Martin
Nov 27 '07 #3
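For readers on Python 3, where string.maketrans no longer exists, the same lookup-table idea can be sketched with str.maketrans and str.translate. This is only an illustration: the table below covers the Czech letters and is an assumption about what the original poster needs, and the names ACCENTED, PLAIN, TABLE and latin2_bytes_to_ascii are made up for this example.

```python
# Hand-written lookup table for Czech accented letters (lower and upper case).
# The choice of letters is an assumption; extend it for other languages.
ACCENTED = "áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ"
PLAIN = "acdeeinorstuuyzACDEEINORSTUUYZ"

# str.maketrans needs both strings to have the same length.
TABLE = str.maketrans(ACCENTED, PLAIN)


def latin2_bytes_to_ascii(raw: bytes) -> str:
    """Decode latin-2 bytes to text, then replace accented letters via TABLE."""
    text = raw.decode("iso-8859-2")
    return text.translate(TABLE)


s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
print(s.translate(TABLE))
```

Characters that are not in the table pass through unchanged, so the result is only guaranteed to be ASCII if the table covers every accented letter in the input.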
On Nov 28, 8:45 am, kyoso...@gmail.com wrote:
With a little googling, I found this:

http://www.peterbe.com/plog/unicode-to-ascii
and if the OP has the patience to read *ALL* the comments on that blog
entry, he will find that comment[-2] points to

http://effbot.python-hosting.com/fil...xt/unaccent.py

and comment[-1] (from the blog owner) is "Brilliant! Thank you."

The bottom line is that there is no universal easy solution; you need
to handcraft a translation table suited to your particular purpose
(e.g. do you want u-with-umlaut to become u or ue?). The
unicodedata.normalize function is useful for off-line preparation of a
set of candidate mappings for that table; it should not be applied
either on-line or blindly.

Cheers,
John
Nov 27 '07 #4
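One way to do the off-line preparation John describes might look like the sketch below. It walks the upper half of a single-byte encoding, decomposes each character with unicodedata.normalize, and keeps only the ASCII part as a suggestion. The function name candidate_mappings is made up for this example, and the output is meant to be reviewed by hand, not used directly.

```python
import unicodedata


def candidate_mappings(encoding="iso-8859-2"):
    """Return {char: suggested ASCII replacement} for bytes 0x80-0xFF."""
    candidates = {}
    for byte in range(0x80, 0x100):
        try:
            char = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # byte is not assigned in this encoding
        # NFKD splits e.g. "ž" into "z" plus a combining caron.
        decomposed = unicodedata.normalize("NFKD", char)
        # Keep only the ASCII part; an empty string means "decide by hand".
        candidates[char] = "".join(c for c in decomposed if ord(c) < 128)
    return candidates


table = candidate_mappings()
for ch in "ážíüßł":
    print(repr(ch), "->", repr(table[ch]))
```

The suggestions show why blind use is risky: "ü" comes out as "u" (never "ue"), and letters such as "ß" and "ł" have no decomposition at all, so their suggestion is empty and they would silently disappear unless the table is edited by hand.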
* Martin Landa <la**********@gmail.com>, 2007-11-27:
I have unicode string (or better say latin2 encoding) containing
non-ascii characters, e.g.

s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"

I would like to convert this string to plain ascii (using some lookup
table for latin2)

to get

-Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
You may try python-elinks
<http://freshmeat.net/projects/python-elinks/>:

>>> import elinks
>>> print "Ukázka_mo\236nosti_vyu\236ití_programu_OpenJUMP_v_SOA".decode('Windows-1250').encode('ASCII', 'elinks')
Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
--
Jakub Wilk
Nov 28 '07 #5
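The decode-then-encode pattern above relies on a custom error handler name passed to encode. Without python-elinks installed, a similar effect can be sketched in Python 3 with the standard codecs.register_error. This is not how python-elinks is implemented; the handler name "strip_accents" and the function strip_accents_handler are made up for this example, and the fallback to "?" is an arbitrary choice.

```python
import codecs
import unicodedata


def strip_accents_handler(error):
    """Replace each unencodable character with its ASCII base letter, or '?'."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    pieces = []
    for ch in error.object[error.start:error.end]:
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if ord(c) < 128)
        pieces.append(base or "?")  # no ASCII base letter: fall back to '?'
    # Return the replacement text and the position where encoding resumes.
    return ("".join(pieces), error.end)


codecs.register_error("strip_accents", strip_accents_handler)

# The same bytes as in the post above (\236 octal is 0x9E, "ž" in Windows-1250).
raw = b"Uk\xe1zka_mo\x9enosti_vyu\x9eit\xed_programu_OpenJUMP_v_SOA"
text = raw.decode("cp1250")
print(text.encode("ascii", "strip_accents").decode("ascii"))
```

As John's warning suggests, a handler built on unicodedata.normalize applies the same rule to every character, so it is best kept for cases where a "?" or a bare base letter is an acceptable result.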
On Nov 27, 5:08 pm, John Machin <sjmac...@lexicon.net> wrote:
The bottom line is that there is no universal easy solution; you need
to handcraft a translation table suited to your particular purpose
(e.g. do you want u-with-umlaut to become u or ue?). The
unicodedata.normalize function is useful for off-line preparation of a
set of candidate mappings for that table; it should not be applied
either on-line or blindly.

Cheers,
John
Sorry...I didn't know about translation tables or I would have
mentioned that instead. My bad.

Mike
Nov 28 '07 #6
