Unicode and exception strings

Rune Froysa

Assuming an exception like:

x = ValueError(u'\xf8')

AFAIK the common way to get a string representation of the exception
as a message is to simply cast it to a string: str(x). This will
result in an "UnicodeError: ASCII encoding error: ordinal not in
range(128)".

The common way to fix this is with something like
u'\xf8'.encode("ascii", 'replace'). However I can't find any way to
tell ValueError's __str__ method which encoding to use.

Is it possible to solve this without using sys.setdefaultencoding()
from sitecustomize?

Regards,
Rune Frøysa

Jul 18 '05 #1


Terry Carroll

On 09 Jan 2004 13:18:39 +0100, Rune Froysa <ru*********@usit.uio.no>
wrote:

Assuming an exception like:

x = ValueError(u'\xf8')

AFAIK the common way to get a string representation of the exception
as a message is to simply cast it to a string: str(x). This will
result in an "UnicodeError: ASCII encoding error: ordinal not in
range(128)".

The common way to fix this is with something like
u'\xf8'.encode("ascii", 'replace'). However I can't find any way to
tell ValueError's __str__ method which encoding to use.

Rune, I'm not understanding what your problem is.

Is there any reason you're not using, for example, just repr(u'\xf8')?

In one program I have that occasionally runs into a line that includes
some (UTF-8) Unicode-encoded Chinese characters, I have something like
this:

try:
    _display_text = _display_text + "%s\n" % line
except UnicodeDecodeError:
    try:
        # decode those UTF8 nasties
        _display_text = _display_text + "%s\n" % line.decode('utf-8'))
    except UnicodeDecodeError:
        # if that still doesn't work, punt
        # (I don't think we'll ever reach this, but just in case)
        _display_text = _display_text + "%s\n" % repr(line)

I don't know if this will help you or not.

Jul 18 '05 #2

Terry Carroll

On Fri, 09 Jan 2004 19:44:21 GMT, Terry Carroll <ca*****@tjc.com> wrote:

In one program I have that occasionally runs into a line that includes
some (UTF-8) Unicode-encoded Chinese characters, I have something like
this:

Sorry, a stray parenthesis crept in here (since this is a pared down
version of my actual code). It should read:

try:
    _display_text = _display_text + "%s\n" % line
except UnicodeDecodeError:
    try:
        # decode those UTF8 nasties
        _display_text = _display_text + "%s\n" % line.decode('utf-8')
    except UnicodeDecodeError:
        # if that still doesn't work, punt
        # (I don't think we'll ever reach this, but just in case)
        _display_text = _display_text + "%s\n" % repr(line)

I don't know if this will help you or not.

Jul 18 '05 #3
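
The try/except fallback in the corrected snippet can be wrapped in a small function. This is only a minimal sketch, assuming Python 2.6+ or 3.3+ (so that both the b'' and u'' literals parse); to_display_text is a hypothetical name that does not appear in the posts above, and it returns the text instead of appending to _display_text. Unlike the original, it decodes as UTF-8 straight away instead of first relying on the implicit ASCII conversion, which gives the same result for plain ASCII lines.

```python
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def to_display_text(line):
    """Return `line` as text: UTF-8 decode if possible, else fall back to repr()."""
    if isinstance(line, text_type):
        # Already text, nothing to decode.
        return line
    try:
        # decode those UTF8 nasties
        return line.decode('utf-8')
    except UnicodeDecodeError:
        # Not valid UTF-8: punt and show the escaped representation instead.
        return text_type(repr(line))


print(repr(to_display_text(b'\xe4\xb8\xad')))  # valid UTF-8 bytes
print(repr(to_display_text(b'\xff\xfe')))      # invalid UTF-8, falls back to repr()
```

Note that this only helps when the raw line is in hand; it does not by itself reach a message stored inside an exception object.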

Rune Froysa

Terry Carroll <ca*****@tjc.com> writes:

On 09 Jan 2004 13:18:39 +0100, Rune Froysa <ru*********@usit.uio.no>
wrote:

Assuming an exception like:

x = ValueError(u'\xf8')

AFAIK the common way to get a string representation of the exception
as a message is to simply cast it to a string: str(x). This will
result in an "UnicodeError: ASCII encoding error: ordinal not in
range(128)".

The common way to fix this is with something like
u'\xf8'.encode("ascii", 'replace'). However I can't find any way to
tell ValueError's __str__ method which encoding to use.

Rune, I'm not understanding what your problem is.

Is there any reason you're not using, for example, just repr(u'\xf8')?

The problem is that I have little control over the message string that
is passed to ValueError(). All my program knows is that it has caught
one such error, and that its message string is in unicode format. I
need to access the message string (for logging etc.).

_display_text = _display_text + "%s\n" % line.decode('utf-8'))

This does not work, as I'm unable to get at the 'line', which is
stored internally in the ValueError class (and generated by its __str__
method).

Regards,
Rune Frøysa

Jul 18 '05 #4
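
Since the message is kept in the exception's args tuple, one way around the failing str(x) is to read it from there and encode it explicitly, without touching sys.setdefaultencoding(). What follows is a sketch under stated assumptions, not a definitive fix: message_text is a hypothetical helper, byte-string arguments are assumed to be UTF-8, and the code is written to run on Python 2.6+ and 3.3+.

```python
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def message_text(exc):
    """Build the exception message as text from exc.args, without calling str(exc)."""
    parts = []
    for arg in exc.args:
        if isinstance(arg, text_type):
            parts.append(arg)
        elif isinstance(arg, bytes):
            # Assumption: byte-string messages are UTF-8; undecodable bytes are replaced.
            parts.append(arg.decode('utf-8', 'replace'))
        else:
            # Non-string arguments (numbers, tuples, ...) are shown via repr().
            parts.append(text_type(repr(arg)))
    return u' '.join(parts)


x = ValueError(u'\xf8')
# The caller, not __str__, now chooses the encoding and the error handling.
log_line = message_text(x).encode('ascii', 'replace')
print(repr(log_line))
```

Joining several arguments with a space is a choice made for this sketch; str() on an exception with more than one argument formats the whole tuple instead.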

Terry Carroll

On Wed, 14 Jan 2004 01:32:36 GMT, Terry Carroll <ca*****@tjc.com> wrote:

You can try to extract it as above, and then decode it with the codecs
module, but if it's only the first byte, it won't decode correctly:

>>> import codecs
>>> d = codecs.getdecoder('utf-8')
>>> x.args[0]
u'\xf8'
>>> d.decode(x.args[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'builtin_function_or_method' object has no attribute 'decode'

Oops. Copy-and-pasted the wrong line here. Let's try that again:

>>> x = ValueError(u'\xf8')
>>> import codecs
>>> d = codecs.getdecoder('utf-8')
>>> d(x.args[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 0: ordinal not in range(128)

*That's* the exception I was trying to show, not the AttributeError you
get when you use the decoder wrongly!

Jul 18 '05 #5
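
The UnicodeEncodeError above comes from handing a decoder something that is already a unicode object: on Python 2 the object is first encoded implicitly with the default ASCII codec, and that step is what fails. Decoding is for byte strings; a unicode message needs encoding instead. A short sketch of the distinction, assuming Python 2.6+ or 3.3+ (the variable names are illustrative only):

```python
import codecs

x = ValueError(u'\xf8')
message = x.args[0]  # already text, so it must be encoded, not decoded

utf8_bytes = message.encode('utf-8')                         # text -> bytes
ascii_lossy = message.encode('ascii', 'replace')             # unencodable characters become '?'
ascii_escaped = message.encode('ascii', 'backslashreplace')  # keeps the code point as an escape

# A decoder goes the other way: bytes -> (text, number of bytes consumed).
d = codecs.getdecoder('utf-8')
decoded, consumed = d(utf8_bytes)

print(repr(utf8_bytes), repr(ascii_lossy), repr(ascii_escaped))
print(decoded == message, consumed)
```

For logging, the 'backslashreplace' handler may be preferable to 'replace', since the original character can still be recovered from the escape.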
