MySQL: 'latin-1' codec can't encode character

Hi.

I'm trying to store some text in a MySQL field (v 3.23.58) using
MySQLdb (v 1.2.1c3).

The text is: "telephone…" (note the last character)

And I get this error message:
-----------
  File "/usr/lib/python2.3/site-packages/MySQLdb/connections.py", line 33, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2026' in position 288: ordinal not in range(256)
-----------------------

Position 288 is the character I've mentioned. I suppose I must convert
this character into one that MySQL can store, but I have no idea how to
do it. Any suggestions?

Thank you very much.

Jul 19 '05 #1

the character \u2026 is not part of the ISO-8859-1 character set. if you
insist on storing it in an 8-bit string, you have to find an 8-bit encoding
that includes that character (UTF-8 is one such alternative).
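
As a rough illustration of the point above (a sketch written for current Python 3, not the Python 2.3 used in this thread; the variable names are only for illustration), encoding the same character fails for ISO-8859-1 but works for UTF-8:

```python
text = u"telephone\u2026"  # ends with U+2026 HORIZONTAL ELLIPSIS

# ISO-8859-1 only covers code points 0-255, so U+2026 cannot be encoded.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 can represent every Unicode code point; U+2026 becomes three bytes.
utf8_bytes = text.encode("utf-8")

print(latin1_ok)   # False
print(utf8_bytes)  # b'telephone\xe2\x80\xa6'
```

Whether the column and the connection can actually hold UTF-8 bytes depends on the MySQL and MySQLdb setup, which this sketch does not cover.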

if MySQL is set to store ISO-8859-1 only, you can replace the character
with three periods, drop it (use the "ignore" encoding option) or
replace it with a suitable marker (use the "replace" encoding option).
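
The three options above might look roughly like this (a minimal sketch in current Python 3 syntax; the variable names are made up for illustration):

```python
text = u"telephone\u2026"

# Option 1: substitute three ASCII periods before encoding.
as_periods = text.replace(u"\u2026", u"...").encode("iso-8859-1")

# Option 2: silently drop anything the codec cannot represent.
dropped = text.encode("iso-8859-1", "ignore")

# Option 3: substitute the codec's replacement marker ("?" when encoding).
marked = text.encode("iso-8859-1", "replace")

print(as_periods)  # b'telephone...'
print(dropped)     # b'telephone'
print(marked)      # b'telephone?'
```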

</F>

Jul 19 '05 #2
Hi Fredrik.

Thank you very much for your quick answer.

Do you suggest replacing it with a regexp, or must I convert the whole
text into a suitable encoding?

Regards.


Jul 19 '05 #3

a simple solution would be to manually create a table of problematic
unicode characters, use the translate method on the unicode string,
and then encode using the "replace" option.

charmap = {
    0x2026: u"...",
    # ...
}

text = u'telephone\u2026'

text = text.translate(charmap)
text = text.encode("iso-8859-1", "replace")

print text

http://docs.python.org/lib/string-methods.html
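
To show why the "replace" step still matters after translate, here is a small sketch (current Python 3 syntax; the euro sign is an arbitrary example of a character left out of the table on purpose):

```python
charmap = {
    0x2026: u"...",  # horizontal ellipsis, as in the table above
}

# \u2026 is in the table; \u20ac (euro sign) is deliberately left out.
text = u"telephone\u2026 \u20ac5"

translated = text.translate(charmap)                  # mapped characters are rewritten
encoded = translated.encode("iso-8859-1", "replace")  # anything left over becomes "?"

print(encoded)  # b'telephone... ?5'
```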

if you want more control of the replacement, you can skip the translate
step and use your own error handler, e.g.

charmap = ... see above ...

def fixunicode(info):
    s = info.object[info.start:info.end]
    try:
        return charmap[ord(s)], info.end
    except KeyError:
        # fallback
        return u"<U+%04x>" % ord(s), info.end

import codecs
codecs.register_error("fixunicode", fixunicode)

text = u'telephone\u2026'

text = text.encode("iso-8859-1", "fixunicode")

hope this helps!

</F>

Jul 19 '05 #4
Fredrik Lundh wrote:
[...]
if you want more control of the replacement, you can skip the translate
step and use your own error handler, e.g.

charmap = ... see above ...

def fixunicode(info):
    s = info.object[info.start:info.end]
    try:
        return charmap[ord(s)], info.end


This will fail if there's more than one consecutive unencodable
character; better use
    return charmap[ord(s[0])], info.start+1
or
    return "".join(charmap.get(ord(c), u"<U+%04x>" % ord(c)) for c in s), info.end
(without the try:) instead.
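
Put together, the second variant could look something like this (a sketch in current Python 3 syntax; the one-entry charmap and the test string are stand-ins, not anything from the original posts):

```python
import codecs

charmap = {0x2026: u"..."}  # hypothetical table, same idea as earlier in the thread

def fixunicode(info):
    # Only encoding errors are handled here; anything else is re-raised.
    if not isinstance(info, UnicodeEncodeError):
        raise info
    # The failing slice may span several consecutive unencodable characters,
    # so each one is looked up on its own.
    s = info.object[info.start:info.end]
    fixed = u"".join(charmap.get(ord(c), u"<U+%04x>" % ord(c)) for c in s)
    return fixed, info.end

codecs.register_error("fixunicode", fixunicode)

# Two ellipses in a row, then a character that is not in the table.
text = u"telephone\u2026\u2026 \u20ac"
encoded = text.encode("iso-8859-1", "fixunicode")

print(encoded)  # b'telephone...... <U+20ac>'
```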

Bye,
Walter Dörwald
Jul 19 '05 #5