
minidom's setAttribute + UnicodeDecodeError

Hi, everybody.
In this excerpt of code

from types import UnicodeType

enc = 'some_type_of_encoding'

def _encode(v):
    # Encode unicode objects to a byte string in the chosen encoding
    if isinstance(v, UnicodeType):
        v = v.encode(enc)
    return v

.....
node.setAttribute('style:name', _encode(value))
.....

I get a UnicodeDecodeError:
------------------------------------------------------------
Traceback (most recent call last):
  File "stnreplace.py", line 107, in ?
    StylesHelper(fname).replace(trdict)
  File "stnreplace.py", line 63, in replace
    node.setAttribute('style:name', _encode(uval))
  File "/usr/local/lib/python2.3/site-packages/_xmlplus/dom/minidom.py", line 704, in setAttribute
    elif value != attr.value:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 14: ordinal not in range(128)
------------------------------------------------------------

The value passed to setAttribute is a UTF-8 byte string. I tried
different encodings, but no luck.
Could somebody suggest how to solve this? The problem seems to be the
'ascii' codec in minidom, but how do I make it handle more than just ASCII?
TIA

P.S. Tested with python2.3.3 & python2.3.4

Best regards,
Ruslan

Jul 18 '05 #1
1 Reply


Ruslan wrote:
def _encode(v):
    if isinstance(v, UnicodeType):
        v = v.encode(enc)
    return v

....
node.setAttribute('style:name', _encode(value))
....
[...]
Could somebody give any suggestion how to solve that? Seems problem is
in 'ascii' codec in minidom, but how to make it handle not just ascii?


The problem is in your code. node.setAttribute requires both the
attribute name and the attribute value to be Unicode objects, as
per the DOM spec.

For backwards-compatibility, ease-of-use, and performance reasons,
it does not actually check that these are Unicode objects, and it
will work with byte strings just fine as long as they are ASCII.
But this would still be an error in the application, which really
needs to pass Unicode objects.
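
To illustrate why ASCII-only byte strings appear to work while a UTF-8 value fails, here is a minimal sketch. It is written for Python 3, where the conversion has to be spelled out; the traceback above suggests that Python 2.3 performed the same ASCII decode implicitly when a byte string was compared with a Unicode object. The helper name implicit_ascii_decode and the sample values are made up for the illustration.

```python
def implicit_ascii_decode(data):
    """Mimic the ASCII decode that Python 2 applied when mixing str and unicode."""
    return data.decode("ascii")

# Pure-ASCII bytes decode cleanly, so the mistake goes unnoticed.
ascii_result = implicit_ascii_decode(b"Heading 1")

# UTF-8 encoded Cyrillic text starts with byte 0xd0, which is outside ASCII.
try:
    implicit_ascii_decode("Заголовок".encode("utf-8"))
    failed = False
    message = ""
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)
```

The message captured here names the same 'ascii' codec and the same byte 0xd0 as the traceback in the question.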

IOW: just remove the _encode call, and all will be fine.
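
A minimal sketch of that advice, written for Python 3 (where str is the Unicode type) rather than the Python 2.3 used in the question. The helper to_text, the element name "root", and the sample value are assumptions made for the illustration. The idea is to hand setAttribute text, and to choose an encoding only when serializing the document.

```python
from xml.dom.minidom import getDOMImplementation

def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Build a tiny document with a single root element.
doc = getDOMImplementation().createDocument(None, "root", None)
node = doc.documentElement

# Hypothetical input: a UTF-8 byte string, e.g. read from a file.
raw = "Заголовок".encode("utf-8")

# Pass Unicode text to the DOM, never encoded bytes.
node.setAttribute("style:name", to_text(raw))

# Encoding happens once, at output time.
xml_bytes = doc.toxml(encoding="utf-8")
```

In the original Python 2 code the equivalent would be to pass the unicode object straight to setAttribute, or to decode a byte string to unicode first if that is what the program holds.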

Regards,
Martin
Jul 18 '05 #2
