
Unicode and exception strings

Assuming an exception like:

x = ValueError(u'\xf8')

AFAIK the common way to get a string representation of the exception
as a message is to simply convert it to a string: str(x). This will
result in a "UnicodeError: ASCII encoding error: ordinal not in
range(128)".

The common way to fix this is with something like
u'\xf8'.encode("ascii", 'replace'). However, I can't find any way to
tell ValueError's __str__ method which encoding to use.

Is it possible to solve this without using sys.setdefaultencoding()
from sitecustomize?
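
For reference, a small sketch of what the encode call mentioned above does with the different error handlers. It only illustrates the string method and is not a fix for str(x) itself. The variable names are made up for illustration, and the u''/b'' literals are used so that the snippet should run on Python 2.6+ as well as Python 3:

```python
# The non-ASCII character from the example (LATIN SMALL LETTER O WITH STROKE).
message = u'\xf8'

# 'replace' substitutes a question mark for anything ASCII cannot represent.
replaced = message.encode('ascii', 'replace')

# 'backslashreplace' keeps the code point visible as an escape sequence.
escaped = message.encode('ascii', 'backslashreplace')

# 'xmlcharrefreplace' emits a numeric character reference instead.
xml_ref = message.encode('ascii', 'xmlcharrefreplace')

# Without an error handler the default 'strict' mode raises UnicodeEncodeError.
try:
    message.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Which handler to pick depends on whether the log reader needs to recover the original character; 'backslashreplace' loses no information, while 'replace' does.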

Regards,
Rune Frøysa
Jul 18 '05 #1
On 09 Jan 2004 13:18:39 +0100, Rune Froysa <ru*********@usit.uio.no>
wrote:
> Assuming an exception like:
>
> x = ValueError(u'\xf8')
>
> AFAIK the common way to get a string representation of the exception
> as a message is to simply cast it to a string: str(x). This will
> result in an "UnicodeError: ASCII encoding error: ordinal not in
> range(128)".
>
> The common way to fix this is with something like
> u'\xf8'.encode("ascii", 'replace'). However I can't find any way to
> tell ValueErrors __str__ method which encoding to use.


Rune, I'm not understanding what your problem is.

Is there any reason you're not using, for example, just repr(u'\xf8')?

In one program I have that occasionally runs into a line that includes
some UTF-8-encoded Chinese characters, I have something like this:

try:
    _display_text = _display_text + "%s\n" % line
except UnicodeDecodeError:
    try:
        # decode those UTF8 nasties
        _display_text = _display_text + "%s\n" % line.decode('utf-8'))
    except UnicodeDecodeError:
        # if that still doesn't work, punt
        # (I don't think we'll ever reach this, but just in case)
        _display_text = _display_text + "%s\n" % repr(line)

I don't know if this will help you or not.

Jul 18 '05 #2
On Fri, 09 Jan 2004 19:44:21 GMT, Terry Carroll <ca*****@tjc.com> wrote:
> In one program I have that occasionally runs into a line that includes
> some (UTF-8) Unicode-encoded Chinese characters , I have something like
> this:
Sorry, a stray parenthesis crept in here (since this is a pared down
version of my actual code). It should read:
try:
    _display_text = _display_text + "%s\n" % line
except UnicodeDecodeError:
    try:
        # decode those UTF8 nasties
        _display_text = _display_text + "%s\n" % line.decode('utf-8')
    except UnicodeDecodeError:
        # if that still doesn't work, punt
        # (I don't think we'll ever reach this, but just in case)
        _display_text = _display_text + "%s\n" % repr(line)

> I don't know if this will help you or not.
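
The same fallback chain can be wrapped in a function. This is only a sketch under assumptions: append_line is a hypothetical name, the code is written for Python 3, it checks the type explicitly instead of relying on the implicit-decode exception (which Python 3 does not raise when formatting), and it assumes byte strings are meant to be UTF-8:

```python
def append_line(display_text, line):
    """Return display_text with line appended, followed by a newline.

    Hypothetical helper: byte strings are decoded as UTF-8 when possible,
    and shown via repr() when they are not valid UTF-8.
    """
    if isinstance(line, bytes):
        try:
            # Decode the UTF-8 bytes into text.
            line = line.decode('utf-8')
        except UnicodeDecodeError:
            # Not valid UTF-8: fall back to an escaped representation.
            line = repr(line)
    return display_text + "%s\n" % line


text = append_line("", "plain")
text = append_line(text, b"\xe4\xb8\xad")   # UTF-8 bytes for one Chinese character
text = append_line(text, b"\xff")           # not valid UTF-8, ends up as repr()
```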


Jul 18 '05 #3
Terry Carroll <ca*****@tjc.com> writes:
> On 09 Jan 2004 13:18:39 +0100, Rune Froysa <ru*********@usit.uio.no>
> wrote:
>> Assuming an exception like:
>>
>> x = ValueError(u'\xf8')
>>
>> AFAIK the common way to get a string representation of the exception
>> as a message is to simply cast it to a string: str(x). This will
>> result in an "UnicodeError: ASCII encoding error: ordinal not in
>> range(128)".
>>
>> The common way to fix this is with something like
>> u'\xf8'.encode("ascii", 'replace'). However I can't find any way to
>> tell ValueErrors __str__ method which encoding to use.
>
> Rune, I'm not understanding what your problem is.
>
> Is there any reason you're not using, for example, just repr(u'\xf8')?


The problem is that I have little control over the message string that
is passed to ValueError(). All my program knows is that it has caught
one such error, and that its message string is in unicode format. I
need to access the message string (for logging etc.).

> _display_text = _display_text + "%s\n" % line.decode('utf-8'))


This does not work, as I'm unable to get at the 'line', which is
stored internally in the ValueError class (and generated by its
__str__ method).
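
To make the situation concrete, here is a hedged sketch of one way to reach the message of a caught exception without calling str() on it. It assumes the message was passed as the first constructor argument and is therefore available as args[0]. message_for_log is a hypothetical helper name, and the 'backslashreplace' handler is just one possible choice:

```python
def message_for_log(exc, encoding='ascii'):
    """Return the first argument of exc as bytes that are safe to log.

    Hypothetical helper; assumes exc.args[0], if present, is a string.
    """
    text = exc.args[0] if exc.args else u''
    if isinstance(text, bytes):
        # Already a byte string: pass it through unchanged.
        return text
    # Escape anything the target encoding cannot represent.
    return text.encode(encoding, 'backslashreplace')


try:
    raise ValueError(u'\xf8')
except ValueError as e:
    logged = message_for_log(e)
```

This sidesteps __str__ entirely, so no default encoding is involved.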

Regards,
Rune Frøysa
Jul 18 '05 #4
On Wed, 14 Jan 2004 01:32:36 GMT, Terry Carroll <ca*****@tjc.com> wrote:
> You can try to extract it as above, and then decode it with the codecs
> module, but if it's only the first byte, it won't decode correctly:
>
> >>> import codecs
> >>> d = codecs.getdecoder('utf-8')
> >>> x.args[0]
> u'\xf8'
> >>> d.decode(x.args[0])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'builtin_function_or_method' object has no attribute 'decode'

Oops. Copy-and-pasted the wrong line here. Let's try that again:

>>> x = ValueError(u'\xf8')
>>> import codecs
>>> d = codecs.getdecoder('utf-8')
>>> d(x.args[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 0: ordinal not in range(128)


*That's* the exception I was trying to show, not the AttributeError you
get when you use the decoder wrongly!
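
A note on why the call above fails, with a sketch of the opposite direction. A decoder expects bytes, so when it is handed a unicode object, Python 2 first tries to encode that object implicitly with the default ASCII codec, which is where the UnicodeEncodeError comes from. The following is only an illustration, assuming the goal is to turn the unicode message into UTF-8 bytes and back. The variable names are made up:

```python
import codecs

x = ValueError(u'\xf8')

# Look up the stateless encoder and decoder functions for UTF-8.
encode = codecs.getencoder('utf-8')
decode = codecs.getdecoder('utf-8')

# Encoding goes from text to bytes; it returns (bytes, characters consumed).
encoded, chars_consumed = encode(x.args[0])

# Decoding goes from bytes back to text; it returns (text, bytes consumed).
decoded, bytes_consumed = decode(encoded)
```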

Jul 18 '05 #5
