unicode keys in dicts

Hi all,

is the following behaviour normal :
>>> d = {"é" : 1}
>>> d["é"]
1
>>> d[u"é"]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: u'\xe9'

It seems that "é" and u"é" are not considered the same key (in Python
2.3.3), though they have the same hash code (as returned by hash()).

And "e" and u"e" (unaccented characters) are considered the same!

Jiba
Jul 18 '05 #1
>>> chr(0xe9) == unichr(0xe9)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeError: ASCII decoding error: ordinal not in range(128)

Unequal objects can hash to the same value, and a dict lookup matches a
key only when the hashes match and the keys also compare equal. Your two
keys are not equal (in fact, you can't even compare them on my system).
They would be comparable but not equal on many systems, for instance one
where the system's encoding is Microsoft's CP850.
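
To illustrate the point about hashes, here is a minimal sketch. It is written for Python 3 rather than the 2.3.3 discussed in this thread, and the `Key` class is a hypothetical type invented for the example: all of its instances share one hash value, yet the dict still tells them apart because it also checks equality.

```python
class Key:
    """Hypothetical key type: every instance hashes alike, equality compares names."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Deliberately identical for all instances, to force hash collisions.
        return 42

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


d = {Key("a"): 1}

# The two keys collide on hash value...
same_hash = hash(Key("a")) == hash(Key("b"))

# ...but only the key that also compares equal is found in the dict.
found_equal = Key("a") in d
found_unequal = Key("b") in d

print(same_hash, found_equal, found_unequal)
```

So an identical hash() result for "é" and u"é" says nothing by itself about whether they are the same key.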

You can misconfigure your system to assume that byte strings are in (e.g.)
iso-8859-1 encoding by changing site.py.

Jeff

Jul 18 '05 #2
Jiba wrote:

is the following behaviour normal :
>>> d = {"é" : 1}
>>> d["é"]
1
>>> d[u"é"]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: u'\xe9'

It seems that "é" and u"é" are not considered the same key (in Python
2.3.3), though they have the same hash code (as returned by hash()).

And "e" and u"e" (unaccented characters) are considered the same!


Well, "e" and u"e" _are_ the same character, while the unicode value that
comes from decoding the "é" byte string depends entirely on which codec
you use for the decoding. It is the same as u"é" only when decoded using
certain codecs. ASCII is 7-bit only, so the "é" byte is not legal in
ASCII, which is likely your default encoding.

For example, try "é".decode('iso-8859-1') and you will probably get the
unicode value you were expecting.
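
As a rough sketch of the codec dependence, here is the same idea in Python 3, where byte strings and text are separate types (`bytes` and `str`) and must be converted explicitly. The byte value `0xE9` is an assumption: it is what "é" would be on a Latin-1 terminal, and the variable names are made up for the example.

```python
raw = b"\xe9"  # assumption: the single byte a Latin-1 terminal sends for "é"

# The same byte decodes to different characters under different codecs.
as_latin1 = raw.decode("iso-8859-1")  # U+00E9, the expected "é"
as_cp850 = raw.decode("cp850")        # a different character under CP850

# ASCII is 7-bit only, so this byte cannot be decoded as ASCII at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# A plain "e" is in the ASCII range, so these codecs all agree on it.
same_e = b"e".decode("ascii") == b"e".decode("iso-8859-1") == b"e".decode("cp850")

# Decoding the key before storing it makes the text lookup succeed.
d_text = {raw.decode("iso-8859-1"): 1}
d_bytes = {raw: 1}

print("\xe9" in d_text)   # the decoded key matches the text key
print("\xe9" in d_bytes)  # the undecoded byte key does not
```

The design choice this suggests is to decode byte strings once, with a known codec, at the point where they enter the program, and to use only unicode values as dict keys after that.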

I'm not the best to answer this, but I would at least say that the above
behaviour is considered "normal", though it can be surprising to those
of us not expert in Unicode issues...

-Peter
Jul 18 '05 #3
