470,604 Members | 2,024 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 470,604 developers. It's quick & easy.

How to do special encode in string ?

Hi !

I'm hungarian, we use special characters like:
á - a'
ő -o"

etc.

I want to encode this characters to in config file I see these
characters as \nnn format.
And I want to decode it automatically with python.

How to I do it without write complex converter tool ?

Thanx for it:
FT

Example:
Encode("az állam én vagyok") -> "az \xe1llam \xe9n vagyok"

Decode("az \xe1llam \xe9n vagyok") -> "az állam én vagyok"

Jul 18 '05 #1
3 2160
"fo***********@anonym.hu" <fo***********@anonym.hu> wrote in
news:ma***********************************@python. org:
Encode("az állam én vagyok") -> "az \xe1llam \xe9n vagyok"

Decode("az \xe1llam \xe9n vagyok") -> "az állam én vagyok"

s = "az \xe1llam \xe9n vagyok"
print s.decode('latin-1') az állam én vagyok


You want to use unicode strings if you have characters outside the ASCII
range. The decode method on a byte string will let you convert it to a
unicode string, and the encode method will let you convert it back to byte
string.

The tricky bit is that you need to know the correct encoding to use as \xe1
could mean different characters, but in this case it looks as though you
meant latin-1.
Jul 18 '05 #2
On 21 Jun 2004 10:14:57 GMT, Duncan Booth <me@privacy.net> wrote:
"fo***********@anonym.hu" <fo***********@anonym.hu> wrote in
news:ma***********************************@python .org:
Encode("az állam én vagyok") -> "az \xe1llam \xe9n vagyok"

Decode("az \xe1llam \xe9n vagyok") -> "az állam én vagyok"

s = "az \xe1llam \xe9n vagyok"
print s.decode('latin-1')az állam én vagyok


You want to use unicode strings if you have characters outside the ASCII
range. The decode method on a byte string will let you convert it to a
unicode string, and the encode method will let you convert it back to byte
string.

The tricky bit is that you need to know the correct encoding to use as \xe1
could mean different characters, but in this case it looks as though you
meant latin-1.


For Hungarian long umlauts, you'll want to use latin-2 (or iso8859-2).
--
Christopher
Jul 18 '05 #3
> Encode("az llam n vagyok") -> "az \xe1llam \xe9n vagyok"

Decode("az \xe1llam \xe9n vagyok") -> "az llam n vagyok"


The functions you want are str.encode and str.decode:
"az llam n vagyok".encode("string_escape") -> "az \xe1llam \xe9n
vagyok"
"az \xe1llam \xe9n vagyok".decode("string_escape") -> "az llam n
vagyok"

If you choose to use Unicode strings instead, use the "unicode_escape"
codec instead of the "string_escape" codec.

A list of the standard encodings is available at
http://docs.python.org/lib/node127.html if you need with some other
format (rot13 is my personal favourite :P).
Jul 18 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

6 posts views Thread by Jonas Meurer | last post: by
1 post views Thread by Asha | last post: by
2 posts views Thread by Alan Searle | last post: by
reply views Thread by Luis Corrales | last post: by
6 posts views Thread by 7stud | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.