Is there a way to change the default string encoding?

Is there a way to change the default string encoding used by the
string.encode() method? My default environment is utf-8 but I need it
to be latin-1 to avoid errors like this:
>>> 'Andr\xe9 Ramel'.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 4: ordinal not in range(128)

I can't change the code to pass an encoding argument to the decode
method because it's someone else's code.

Thanks,
rg
Aug 21 '07 #1
Ron Garret wrote:
> Is there a way to change the default string encoding used by the
> string.encode() method?
encode() or decode()? Encoding is best handled by the output stream, e.g. by
using codecs.open(...) instead of the builtin open(...).
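
A minimal sketch of that idea, assuming the text is already a unicode string and the target file should be latin-1. The file name and the temporary directory are made up for the illustration; only codecs.open(...) comes from the post above.

```python
import codecs
import os
import tempfile

text = u"Andr\xe9 Ramel"  # a unicode string, not a byte string

# Hypothetical output path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "names.txt")

# codecs.open returns a file object that encodes unicode on write,
# so the calling code never has to call .encode() itself.
with codecs.open(path, "w", encoding="latin-1") as out:
    out.write(text)

# Reading the file back in binary mode shows the raw latin-1 bytes.
with open(path, "rb") as raw:
    data = raw.read()
```

With this arrangement the encoding is chosen once, where the stream is opened, rather than at every place a string is written.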

> My default environment is utf-8 but I need it
> to be latin-1 to avoid errors like this:
> >>> 'Andr\xe9 Ramel'.decode()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 4: ordinal not in range(128)
If your environment were latin-1, you'd get the same error because Python
assumes ascii by default.
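
To make that concrete, here is a small sketch that names the codec explicitly instead of relying on the default. The variable names are only for illustration. It also suggests why a utf-8 default would not help with this particular byte string: 0xe9 followed by a space is not valid utf-8.

```python
raw = b"Andr\xe9 Ramel"  # byte string containing the latin-1 byte 0xe9


def can_decode(data, encoding):
    """Return True if data decodes cleanly with the given codec."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# An ascii decode is what the argument-less decode() amounts to in the
# traceback above, and it fails on the byte 0xe9.
ascii_ok = can_decode(raw, "ascii")

# utf-8 fails too, because 0xe9 starts a multi-byte sequence that is
# never completed.
utf8_ok = can_decode(raw, "utf-8")

# latin-1 maps every byte to a code point, so this always succeeds.
name = raw.decode("latin-1")
```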
> I can't change the code to pass an encoding argument to the decode
> method because it's someone else's code.
Does that code accept unicode strings? Try to pass u"Andre\xe9 Ramel"
instead of the byte string.
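
One way to do that without touching the other code is to decode at the boundary, before handing the value over. The helper below is a hypothetical sketch (the name to_text and the latin-1 default are assumptions taken from the question, not part of any library). It only helps if the receiving code really accepts unicode and does not call decode() on it unconditionally.

```python
def to_text(value, encoding="latin-1"):
    """Return a unicode string, decoding byte strings with an explicit codec.

    Values that are already unicode are passed through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both calls yield the same unicode string.
from_bytes = to_text(b"Andr\xe9 Ramel")
from_text = to_text(u"Andr\xe9 Ramel")
```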

If all else fails there's
>>> sys.setdefaultencoding("latin1")
>>> "Andre\xe9 Ramel".decode()
u'Andre\xe9 Ramel'

but that's an evil hack, you should rather talk to the maintainer of the
offending code to update it to accept unicode.

Peter
Aug 21 '07 #2
In article <fa*************@news.t-online.com>,
Peter Otten <__*******@web.de> wrote:
> If all else fails there's
> >>> sys.setdefaultencoding("latin1")
> >>> "Andre\xe9 Ramel".decode()
> u'Andre\xe9 Ramel'
>
> but that's an evil hack, you should rather talk to the maintainer of the
> offending code to update it to accept unicode.
Yes, but I need to hack around it until I can get it fixed.

Thanks!

rg
Aug 21 '07 #3
On Mon, 20 Aug 2007 18:44:39 -0700, Ron Garret wrote:
> Is there a way to change the default string encoding ...
See Dive Into Python, section 9:
http://diveintopython.org/xml_processing/unicode.html

That will help.

Bye
Fabio
Aug 21 '07 #4
Ron Garret wrote:
> In article <fa*************@news.t-online.com>,
> Peter Otten <__*******@web.de> wrote:
>> If all else fails there's
>> >>> sys.setdefaultencoding("latin1")
>> >>> "Andre\xe9 Ramel".decode()
>> u'Andre\xe9 Ramel'
>>
>> but that's an evil hack, you should rather talk to the maintainer of the
>> offending code to update it to accept unicode.
>
> Yes, but I need to hack around it until I can get it fixed.
Oops, the snippet above omits the actual hack. It should be
>>> import sys
>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding("latin1")
>>> "Andre\xe9 Ramel".decode()
u'Andre\xe9 Ramel'
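
If the hack has to live in a script for a while, a guarded version might look like the sketch below. The function name is made up for the illustration. The reload(sys) step is needed because site.py removes sys.setdefaultencoding at startup. Python 3 has no such function at all, so the sketch does nothing there.

```python
import sys


def set_default_encoding_py2(name):
    """Apply the reload(sys) hack on Python 2 only.

    Returns True if the default encoding was changed, False otherwise.
    """
    if sys.version_info[0] != 2:
        # Python 3 has no sys.setdefaultencoding; leave everything alone.
        return False
    reload(sys)  # Python 2 builtin; brings back sys.setdefaultencoding
    sys.setdefaultencoding(name)
    return True


changed = set_default_encoding_py2("latin1")
```

Keeping the call in one clearly named function makes it easier to find and delete once the other code accepts unicode.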

Peter
Aug 21 '07 #5
