Why are there no ordered dictionaries? - Page 10

Christoph Zwerschke

This is probably a FAQ, but I dare to ask it nevertheless since I
haven't found a satisfying answer yet: Why isn't there an "ordered
dictionary" class at least in the standard list? Time and again I am
missing that feature. Maybe there is something wrong with my programming
style, but I rather think it is generally useful.

I fully agree with the following posting where somebody complains why so
very basic and useful things are not part of the standard library:
http://groups.google.com/group/comp....52d2f771a49857

Are there plans to get it into the standard lib sometime?

-- Christoph

Nov 22 '05

Subscribe Reply

210

10547

« First
<
8
9
10
11
12
20
>
Last »

Ben Sizer

Fredrik Lundh wrote:

Ben Sizer wrote:
This is interesting; I would have thought that the tuple is read and a
dictionary created by inserting each pair sequentially. Is this not the
case?
pointers to the members of each pair, yes. but a pointer copy is a
cheap operation (for the given use case, we're only talking about a
few dozen pairs anyway, at the most).

I was really thinking more about the other work, such as the hashing
and whatever, but I guess that is very efficient anyway.
this is a common fallacy; Python programmers underestimate the
cost of adding extra layers to their code (e.g. by using an ordered
dict data structure that has to incrementally update both a list and
a dictionary), and overestimate the cost of letting the C layer do
bulk operations.

If it was me I would probably have just used a list and searched it
linearly: premature optimisation is the root of all evil, etc. But then
I've never found a need for an ordered dictionary anyway; I always felt
they were more an artifact of the language implementation than a
reflection of something inherently useful.

However, you have to forgive people for falling prey to the 'fallacy'
you describe - for years there's been an attempt to teach people to use
proper data structures and algorithms instead of relying on
micro-optimisations (ie. "it's too slow: redo it in assembly"). So
often, the first port of call for a good programmer will be to try and
find a structure that maps directly to the problem.

--
Ben Sizer

Nov 22 '05 #91

Fredrik Lundh

Stuart McGraw wrote

What would improve the Cheese Shop's interface for you?

Getting rid of those damn top level links to old versions.
Seeing a long list of old versions, when 99% of visitors are
only interested in the current version, is just visual noise,
and really lame. Move the old version links onto the page
describing the software.

hmm? the pypi package automatically hides old versions when
you post new ones, and it's been that way for ages...

(which is bloody annoying if you're a package developers, since it
means that alphas for the next release hides the most recent stable
version)

looking at the full index, ZODB seems to be the only package that's
available in more than just one stable and one development version...

</F>

Nov 22 '05 #92

Bengt Richter

On 22 Nov 2005 02:16:22 -0800, "Fuzzyman" <fu******@gmail .com> wrote:

Kay Schluehr wrote:
Christoph Zwerschke wrote:
> That would be also biased (in favour of Python) by the fact that
> probably very little people would look for and use the package in the
> cheese shop if they were looking for ordered dicts.

Does anyone actually use this site? While the Vaults offered a nice
place and a nice interface the Cheese Shop has the appeal of a code
slum.

Hmmm.. it's *the* repository for Python code. I expect quite a few
people use it...

:-)

I hadn't realized how much stuff was there. I generally google for stuff,
but I will be looking directly now.

BTW, I made a mod[1] to your odict that I think/hope is going to be generally faster.
It requires 2.4 though. It passes the same doctest, so its probably close to the same.
It stores the ordering info differently, but is also a dict subclass.

Do you happen to have a timing test that exercises various aspects, so I can try it
before I embarrass myself? Otherwise I guess I'll write something.

Would the would-be users chime in with some idea of what kinds of operations are
most important timing-wise? Which would get the most use? How dynamic would ordering typically be?

[1] fairly pervasive little mods actually
[ 9:15] C:\pywk\clp>dif f odict.py odictb.py |wc
146 458 4948

[ 9:15] C:\pywk\clp>wc odict*.py
467 1228 12483 odict.py
511 1500 14728 odictb.py
978 2728 27211 Totals

Regards,
Bengt Richter

Nov 22 '05 #93

Stuart McGraw

"Fredrik Lundh" <fr*****@python ware.com> wrote in message news:ma******** *************** *************** *@python.org...

Stuart McGraw wrote
What would improve the Cheese Shop's interface for you?

Getting rid of those damn top level links to old versions.
Seeing a long list of old versions, when 99% of visitors are
only interested in the current version, is just visual noise,
and really lame. Move the old version links onto the page
describing the software.

hmm? the pypi package automatically hides old versions when
you post new ones, and it's been that way for ages...

(which is bloody annoying if you're a package developers, since it
means that alphas for the next release hides the most recent stable
version)

looking at the full index, ZODB seems to be the only package that's
available in more than just one stable and one development version...

http://cheeseshop.python.org/pypi?:a...rowse&asdf=405
- ClientForm-0.1.17
- ClientForm-0.2.1b
....
- EmPy_3.1
- EmPy_3.1.1
- EmPy_3.2
- EmPy_3.3
....
- FauxIdent-1.1
- FauxIdent-1.2
- FauxIdent-1.2.1
....
Well, it is better than I remember it being a while (year?) ago, my
recollection is that many packages had many, many old versions
listed but now I usualy see only a couple versions.

Hmm, so two versions means one is a development version,
and the other is a stable version? I did not know that, and did
not see it documented on the site. I would say documenting
that would be an interface improvement.

I still think it would be better to have just a package name
(with current version) listed in the index page(s), and have alternate
versions (old, alpha testing, etc) listed on the package's description
page.

Nov 22 '05 #94

Bengt Richter

On Tue, 22 Nov 2005 10:06:07 +0100, Christoph Zwerschke <ci**@online.de > wrote:

Bengt Richter schrieb:
Ok, so if not in the standard library, what is the problem? Can't find what
you want with google and PyPI etc.? Or haven't really settled on what your
_requirements_ are? That seems to be the primary problem people who complain
with "why no sprollificator mode?" questions.
What I don't understand is why legitimate questions such as "why are
there no ordered dictionaries" are immediately interpreted as
*complaints* and not just as questions. If I ask such a question, I am
not complaining but trying to simply figure out *why* there is no such
thing. Probably there are reasons and all I want to know is find these
reasons and learn a little bit more about Python in doing so.

Why can't such questions be discussed in a factual, calm and friendly way?

Sorry, I was tired and vented some impatience. I'm mostly friendly ;-)
I took the "why" in a different sense than it was meant, I guess.
Sort of like hearing "why haven't I been served yet" in a restaurant, and
having to say, "I don't work here, you'll have to ask the waiter."

They don't know what they really mean when it comes down to a DYFR
(Define Your Felicitous Requirements) challenge.
I don't think that this was true in this case, and even if this is the
outcome, those who asked the question will have learned something.

I agree again.
I think a discussion group is not there for only presenting mature,
sophisticate d thoughts and concepts, but also for "thinking loud"
together with other about these issues. We all know that clarifying our
thoughts works often best if you discuss them with others. And I think
that's one purpose of discussion lists. Asking questions should not be
immediately be discouraged, even silly questions. If it is really a FAQ,
you can simply point to the FAQ or add the answer in the FAQ list if it
is missing there.

Agreed again. Thank you for your nice reply.

Regards,
Bengt Richter

Nov 22 '05 #95

Fredrik Lundh

Stuart McGraw wrote:

Hmm, so two versions means one is a development version,
and the other is a stable version? I did not know that, and did
not see it documented on the site. I would say documenting
that would be an interface improvement.
well, that's up to the developer. when you upload a new version, all
older ones are automagically hidden. the only way to make old versions
appear again is to "unhide" them via a web form. for the few packages
I sampled, the older versions were stable, and the latest one was "less
stable", but I didn't check all of the...
I still think it would be better to have just a package name
(with current version) listed in the index page(s), and have alternate
versions (old, alpha testing, etc) listed on the package's description
page.

agreed.

a "nonstable"-property in the setup file would be nice too (so that stable
versions don't disappear when you upload an alpha or beta...)

</F>

Nov 22 '05 #96

Bengt Richter

On 22 Nov 2005 03:07:47 -0800, "bo****@gmail.c om" <bo****@gmail.c om> wrote:

Bengt Richter wrote:
Ok, so if not in the standard library, what is the problem? Can't find what
you want with google and PyPI etc.? Or haven't really settled on what your
_requirements_ are? That seems to be the primary problem people who complain
with "why no sprollificator mode?" questions. They don't know what they really
mean when it comes down to a DYFR (Define Your Felicitous Requirements) challenge.
So DYFR ;-)Beat me. I am not the one asking the question.

Sorry, I thought you wanted an ordered dict too.

>> > parsing or not parsing is not the point, and parsing/converting is
>> > still "create a new view" of an existing data structure.
>>

So you'd like the mechanics to be automated and hidden? Then you need to
DYFR for using the black box you want. Methods, semantics.

Lose you. don't know what you want to say.

I like solving problems. I just get frustrated when people don't focus on getting
the problem defined, which IME is 2/3 of the way to a solution. I don't mind,
in fact enjoy, rambling musings, but if someone seems actually to want a solution
for something, I like to try to realize it concretely.

After finally reading that the odict.py in PyPI by Larosa/Foord was what was desired,
but slower in use than what Fredrik posted, I decided to see if I could speed up odict.py.
I now have a version that I think may be generally faster.

I still don't know whether it will be of any user w.r.t. the requirements of anyone
on the bandwagon of asking for some kind of ordered dict, but we'll see what we'll see ;-)

Regards,
Bengt Richter

Nov 22 '05 #97

Bengt Richter

On Tue, 22 Nov 2005 10:26:22 +0100, Christoph Zwerschke <ci**@online.de > wrote:

Bengt Richter wrote:
d = OrderedDict(); d[1]='one'; d[2]='two' =>> list(d) => [1, 2]
ok, now we do d[1]='ein' and what is the order? list(d) => [2, 1] ??
Or do replacements not count as "insertions "?
If you simply set a value for a key that already exists, the order
should not be changed. I think this is intuitive.
Or maybe you want to permit append and NOT prevent
[('a',1), ('a':2)] and maybe d['a'] => [1, 2] ???

You could ask the same question about dict. I think that is not an
option. Why should you want odict behave different than dict?

Well, it was beginning to remind of RDB with possible non-unique keys,
where a select can get you multiple records back.
I still believe that the concept of an "ordered dictionary" ("behave
like dict, only keep the order of the keys") is intuitive and doesn't
give you so much scope for ambiguity. But probably I need to work on an
implementati on to become more clear about possible hidden subtleties.

Does that mean that the Larosa/Foord odict.py implementation in PyPI
does not do what you want?

Regards,
Bengt Richter

Nov 22 '05 #98

Christoph Zwerschke

Alex Martelli wrote:

Perl's _arrays_ are a bit like Python _lists_, and ordered; it's the
_hashes_ that are a bit like Python _dicts_, and unordered. PHP's use
of "array" for both concepts may indeed be a bit confusing.

Perl's _hashes_ have been also called _associative arrays_ originally.

Anyway, it would be interesting to examine this in detail and how this
is implemented in other languages.

Ok, I just did a little research an compared support for ordered dicts
in some other languages:

* Perl has hashes (associative arrays) which are not ordered.
Here also people are asking for and implementing "ordered hashes",
e.g. http://perltraining.com.au/tips/2005-06-29.html
http://search.cpan.org/dist/Tie-IxHa.../Tie/IxHash.pm
http://search.cpan.org/dist/Tie-Inse...rtOrderHash.pm
http://www.yapc.org/America/previous...or/pinyan.html

* Ruby hashes are not ordered.
Again people are asking for and implementing "ordered hashes".
e.g. http://raa.ruby-lang.org/project/orderedhash/
http://groups.google.com/group/comp....28a870925f23f4

* Smalltalk: Innately has unordered Dictionaries.
People are asking for and implementing "OrderedDiction aries" as well,
e.g. http://exept.eu.org:8080/ClassDoc/cl...eredDictionary

* Lisp has (ordered) "associatio n lists".

* PHP has ordered associative arrays.

* Awk, TCL: Associative arrays are unordered.

* C++ has a Map template in the STL which is ordered (a "Sorted
Associative Container").

* Java: Has "LinkedHash Map" which is ordered.

So ordered dictionaries don't seem to be such an exotic idea.

All implementations I found were pretty unequivocally the same that I
had in mind, using insertion order, appending the latest inserted keys
at the end, not changing the order if an existing key is re-inserted,
and deleting keys together with entries.

I also found a discussion thread like the current where similar
arguments were mentioned for and against ordered dictionaries:

In http://mail.python.org/pipermail/pyt...ch/052041.html,
Nick Coghlan raises the following rhetorical question:

Would the default semantics below really be that suprising?

"An ordered dictionary remembers the order in which keys are first seen
and, when iterated over, returns entries based on that order. This
applies to direct iteration, iteration over values and (key, value)
pairs, to the list-producing methods (i.e. keys(), values() and items())
and to any other operations that involve implicit iteration (e.g.
converting to a string representation) . Overwriting an entry replaces
its value, but does not affect its position in the key order. Removing
an entry (using 'del') _does_ remove it from the key order. Accordingly,
if the entry is later recreated, it will then occur last in the key
order. This behaviour is analagous to that of a list constructed using
only list.append() to add items (indeed, the key order can be thought of
as a list constructed in this fashion, with keys appended to the list
when they are first encountered)."

This was also the semantics I immediately had in mind when I thought
about ordered dictionaries, though I hadn't read anything about ordered
dictionaries before and it is the same semantics that I found others
have implemented in other languages.

I can't help but I still find it unambiguous and intuitive enough to
consider it "the one" standard implementation for ordered dictionaries.

Also, in the use cases mentioned (describing database columns, html form
fields, configuration parameters etc.), the dictionary is usually only
created once and then not changed, so different handling of re-insertion
or deletion of keys would not even be relevant in these cases.

-- Christoph

Nov 22 '05 #99

Christoph Zwerschke

Bengt Richter wrote:

I'm mostly friendly ;-)

I'm pretty sure you are :-)

-- Chris

Nov 22 '05 #100