minidom's setAttribute + UnicodeDecodeError

Hi, everybody.
In this excerpt of code

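from types import UnicodeType

enc = 'some_type_of_encoding'

def _encode(v):
    # Encode Unicode objects to byte strings; pass other values through.
    if isinstance(v, UnicodeType):
        v = v.encode(enc)
    return v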

.....
node.setAttribute('style:name', _encode(value))
.....

I get a UnicodeDecodeError:
------------------------------------------------------------
Traceback (most recent call last):
  File "stnreplace.py", line 107, in ?
    StylesHelper(fname).replace(trdict)
  File "stnreplace.py", line 63, in replace
    node.setAttribute('style:name', _encode(uval))
  File "/usr/local/lib/python2.3/site-packages/_xmlplus/dom/minidom.py", line 704, in setAttribute
    elif value != attr.value:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 14: ordinal not in range(128)
------------------------------------------------------------

The value passed to setAttribute is a UTF-8 string. I tried different
encodings, but no luck.
Could somebody suggest how to solve this? The problem seems to be the
'ascii' codec in minidom, but how can I make it handle more than ASCII?
TIA

P.S. Tested with Python 2.3.3 and Python 2.3.4.

Best regards,
Ruslan

Jul 18 '05 #1
1 Reply
Ruslan wrote:
def _encode(v):
    if isinstance(v, UnicodeType):
        v = v.encode(enc)
    return v

....
node.setAttribute('style:name', _encode(value))
....
[...]
Could somebody suggest how to solve this? The problem seems to be the
'ascii' codec in minidom, but how can I make it handle more than ASCII?


The problem is in your code. node.setAttribute requires both the
attribute name and the attribute value to be Unicode objects, as
per the DOM spec.

For backwards-compatibility, ease-of-use, and performance reasons,
it does not actually check that these are Unicode objects, and it
will work with byte strings just fine as long as they are ASCII.
But this would still be an error in the application, which really
needs to pass Unicode objects.
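
The ASCII-only behaviour can be reproduced outside minidom. The traceback ends at the comparison `value != attr.value`; assuming Python 2 has to decode the byte string with its default 'ascii' codec before it can compare it with a Unicode value, the same failure appears when the decode is done explicitly. This is only a sketch of that assumption, and the sample character is made up for illustration.

```python
# A UTF-8 encoded byte string, similar to the value passed to setAttribute.
# The sample character (a Cyrillic letter) is a hypothetical stand-in.
raw = u'\u0414'.encode('utf-8')  # first byte is 0xd0

try:
    # Assumed equivalent of the implicit conversion done by the comparison.
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# A pure-ASCII byte string decodes without complaint, which is why
# ASCII-only attribute values appear to work.
plain = 'style'.encode('ascii').decode('ascii')
```

Here `message` holds the text of the UnicodeDecodeError, naming byte 0xd0 as in the traceback above, while `plain` decodes cleanly.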

IOW: just remove the _encode call, and all will be fine.
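
A minimal sketch of that advice, assuming a small stand-in document and a made-up non-ASCII style name (neither comes from the original script): the value stays a Unicode object when it goes into setAttribute, and an encoding is chosen only when the document is serialized.

```python
from xml.dom import minidom

# Hypothetical stand-in for the styles document handled by stnreplace.py.
doc = minidom.parseString('<styles><style/></styles>')
node = doc.getElementsByTagName('style')[0]

# Hypothetical non-ASCII style name, kept as a Unicode object.
value = u'\u0417\u0430\u0433\u043e\u043b\u043e\u0432\u043e\u043a'

# Pass the Unicode value directly; no _encode call.
node.setAttribute('style:name', value)

# Encode only at the output boundary: toxml with an explicit encoding
# returns the serialized document as a byte string.
xml_bytes = doc.toxml(encoding='utf-8')
```

Reading the attribute back with `node.getAttribute('style:name')` returns the same Unicode value, and the encoded bytes show up only in `xml_bytes`.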

Regards,
Martin
Jul 18 '05 #2
