Why are there no ordered dictionaries? - Page 4

Christoph Zwerschke

This is probably a FAQ, but I dare to ask it nevertheless since I
haven't found a satisfying answer yet: Why isn't there an "ordered
dictionary" class at least in the standard list? Time and again I am
missing that feature. Maybe there is something wrong with my programming
style, but I rather think it is generally useful.

I fully agree with the following posting where somebody complains why so
very basic and useful things are not part of the standard library:
http://groups.google.com/group/comp....52d2f771a49857

Are there plans to get it into the standard lib sometime?

-- Christoph

Nov 22 '05

Subscribe Post Reply

210

10251

« First
<
2
3
4
5
>

Christoph Zwerschke

Bengt Richter wrote:

I think the concept has converged to a replace-or-append-by-key ordering
of key:value items with methods approximately like a dict. We're now
into usability aspects such as syntactic sugar vs essential primitives,
and default behaviour vs selectable modes, ISTM.
Yes, and we would be good if we do not stop the discussion at this point
with nothing, but really create such a sophisticated implementation.
Whether it will become popular or go to the standard lib some day is a
completely different question.
E.g., it might be nice to have a mode that assumes d[key] is d.items()[k][1] when key is an integer, and otherwise uses dict lookup, for cases where the use case is just string dict keys.

I also thought about that and I think PHP has that feature, but it's
probably better to withstand the temptation to do that. It could lead to
an awful confusion if the keys are integers.

-- Christoph

Nov 23 '05 #151

Christoph Zwerschke

>>* C++ has a Map template in the STL which is ordered (a "Sorted

Associative Container").

Alex Martelli wrote:
Ordered *by comparisons on keys*, NOT by order of insertion -- an
utterly and completely different idea.

Shame on me. I talked so much about the difference between "ordered" and
"sorted" and now I wrote such a confusing sentence. You're right, C++
Maps are not an example for "ordered dictionaries", but for "sorted
dictionaries".

-- Christoph

Nov 23 '05 #152

Christoph Zwerschke

Duncan Booth schrieb:

In Javascript Object properties (often used as an associative array) are
defined as unordered although as IE seems to always store them in the order
of original insertion it wouldn't surprise me if there are a lot of
websites depending on that behaviour.

You're right with both. The ECMA language definition says object
properties are an unordered collection, but MSIE and probably other
browsers keep the order in which they were created. Of course one should
not rely on that.

-- Christoph

Nov 23 '05 #153

Carsten Haese

On Wed, 2005-11-23 at 15:17, Christoph Zwerschke wrote:

Bengt Richter wrote:
> I think the concept has converged to a replace-or-append-by-key ordering
> of key:value items with methods approximately like a dict. We're now
> into usability aspects such as syntactic sugar vs essential primitives,
> and default behaviour vs selectable modes, ISTM.

Yes, and we would be good if we do not stop the discussion at this point
with nothing, but really create such a sophisticated implementation.
Whether it will become popular or go to the standard lib some day is a
completely different question.
> E.g., it might be nice to have a mode that assumes d[key] is

d.items()[k][1] when
> key is an integer, and otherwise uses dict lookup, for cases where

the use
> case is just string dict keys.

I also thought about that and I think PHP has that feature, but it's
probably better to withstand the temptation to do that. It could lead to
an awful confusion if the keys are integers.

Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.

-Carsten

Nov 23 '05 #154

Christoph Zwerschke

Alex Martelli wrote:

However, since Christoph himself just misclassified C++'s std::map as
"ordered" (it would be "sorted" in this new terminology he's now
introducing), it seems obvious that the terminological confusion is
rife. Many requests and offers in the past for "ordered dictionaries"
(e.g. on this group) were also "sorted", NOT "ordered", in this new
terminology.

I'm sorry again. Please don't conclude that others are as uneducated as
I am (I haven't studied computer science but mathematics). Speaking
about "ordered" and "sorted" in the context of collections is not a new
terminology I am introducing, but seems to be pretty common in computer
science (see, e.g.
http://www.gamespp.com/java/dataStru...avaPart6.html).

Perhaps Pythonists are not used to that terminology, since they use the
term "list" for an "ordered collection". An ordered dictionary is a
dictionary whose keys are a (unique) list. Sometimes it is also called a
"sequence" (therefore in the odict implementation, the keys attribute
has this name). A unique list, in turn, can also be called an "ordered
set", so you can also understand an ordered dictionary as a dictionary
whose keys are an ordered set. I think all of this is pretty logical ;-)

-- Christoph

Nov 23 '05 #155

Christoph Zwerschke

Ganesan Rajagopal wrote:

the definition of "sorted" and "ordered", before we can > go on ? Sorted
would be ordered by key comparison. Iterating over such a container will
give you the keys in sorted order. Java calls this a SortedMap. See
http://java.sun.com/j2se/1.4.2/docs/...SortedMap.html C++ STL
map container is also a Sorted Associative container. See
http://www.sgi.com/tech/stl/Map.html Ganesan

Exactly, that's "sorted." "Ordered" means the same there is some order
between the existing elements, but there is no magic (i.e. a general
comparison function) for ordering new elements. Thus, if you add an
element to an ordered collection, it simply gets appended (is considered
as the greatest of all elements) by convention, whereas if you add an
element to a sorted collection, it will be inserted into the correct
place by using the comparison function.

-- Christoph

Nov 23 '05 #156

Christoph Zwerschke

Alex Martelli schrieb:

I detest and abhor almost-sequences which can't be sliced (I consider
that a defect of collections.deque). If the ordered dictionary records
by its sequencing the time order of key insertion, being able to ask for
"the last 5 keys entered" or "the first 3 keys entered" seem to me to be
perfectly natural use cases, and most naturally accomplished by slicing
of course, d[-5:] and d[:3] respectively.

I agree. A use case was requested: Say your dictionary holds form
fields, and you know the last field is always a hidden field you wont to
ignore in your output, or your dictionary holds attributes of a
database, and you don't want to print the first attribute which is
always some kind of uninteresting OID, then you would write
"for field in d[1:]" or "for field in d[:-1]".

(Otherwise, you would have to write "for field in d.keys()[1:]" etc.)

-- Christoph

Nov 23 '05 #157

Christoph Zwerschke

Bengt Richter wrote:

>>> from odictb import OrderedDict
>>> d1 = OrderedDict([(1, 11), (2, 12), (3, 13)])
>>> d1 {1: 11, 2: 12, 3: 13} >>> d1[1:] {2: 12, 3: 13} >>> d1[0:1] + d1[2:3] {1: 11, 3: 13} >>> d1.reverse()
>>> d1 {3: 13, 2: 12, 1: 11} >>> d1.insert(1, (4,14))
>>> d1 {3: 13, 4: 14, 2: 12, 1: 11} >>> d1.items() [(3, 13), (4, 14), (2, 12), (1, 11)] >>> d1.keys() [3, 4, 2, 1] >>> d1.values() [13, 14, 12, 11] >>> d1[1:2] {4: 14} >>> d1[-1:]

{1: 11}

Que mas?

Eso es exactamente lo que yo queria haber!

-- Chris

Nov 23 '05 #158

Christoph Zwerschke

>> I think it would be probably the best to hide the keys list from the

public, but to provide list methods for reordering them (sorting,
slicing etc.).

Tom Anderson wrote:
I'm not too keen on this - there is conceptually a list here, even if
it's one with unusual constraints, so there should be a list i can
manipulate in code, and which should of course be bound by those
constraints.

Think of it similar as the case of an ordinary dictionary: There is
conceptually a set here (the set of keys), but you cannot manipulate it
directly, but only through the according dictionary methods.

For an ordedred dictionary, there is conceptually a list (or more
specifically a unique list). Again you should not manipulate it
directly, but only through methods of the ordered dictionary.

This sounds at first more complicated, but is in reality more easy.

For instance, if I want to put the last two keys of an ordered dict d at
the beginning, I would do it as d = d[:-2] + d[-2:].

With the list attribute (called "sequence" in odict, you would have to
write: d.sequence = d.sequence[:-2] + d.sequence[-2:]. This is not only
longer to write down, but you also have to know that the name of the
attribute is "sequence". Python's strength is that you don't have to
keep many details in mind because it has a small "basic vocabulary" and
orthogonal use. I prefer the ordered dictionary does not introduce new
concepts or attributes if everything can be done intuitively with the
existing Python methods and operators.

-- Christoph

Nov 23 '05 #159

Christoph Zwerschke

Fuzzyman wrote:

So what do you want returned when you ask for d1[1] ? The member keyed
by 1, or the item in position 1 ?
In case of conflict, the ordered dictionary should behave like a
dictionary, not like a list. So d1[1] should be the member keyed by 1,
not the item in position 1. Only in case there is no member keyed by 1,
the item in position 1 could be returned, but I think that would be too
adventurous a hattrick and can lead to big confusion. Better to raise a
KeyError in that case.

But no other way to directly manipulate the keys should be provided.

Why - don't trust yourself with it ?

No, because I think it is not needed if list operations like insert are
directly possible on your dictionary.

But maybe methods such as setkeys() and setvalues() would be nice to
have in addition.

Instead of writing d.sequence = new_sequence, I would write
d.setkeys(new_sequence). But I'm not sure what to do if new_sequence is
not a permutation of the old one. Raise a KeyError? Or even swallow
this? For instance

d = OrderedDict((1,11), (2,12))

d.setkeys((2, 1)) ==> d = OrderedDict((2, 11), (1, 12))

d.setkeys((3, 4)) ==> d = OrderedDict((3, 11), (4, 12)) (or KeyError?)

d.setvalues((12, 11)) ==> d = OrderedDict((1, 12), (2, 11))

d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14)) (always ok)

(Or maybe better set_keys in analogy to has_key?)

I hesitate making keys and values managed properties, because this would
conflict with them being methods in ordinary dicts. Ordered dicts should
resemble ordinary dicts as closely as possible. And giving it a
different name like "sequence" I find confusing and unintuitive.

A resort could be the following: If keys() is given a sequence as
argument, then use this as the new sequence of keys, and similar with
values(). This way, no new methods need to be introduced.

-- Christoph

Nov 23 '05 #160

Christoph Zwerschke

Fuzzyman wrote:

- the internal keys list should be hidden
I disagree. It is exposed so that you can manually change the order
(e.g. to create a "sorted" dict, rather than one ordered by key
insertion). What do you *gain* by hiding it ?
See my other posting. Of course hiding the list can only be done, if
- list methods be mixed in instead

In this case, you can change the order directly by using the list
methods on the dictionary. Sorting would be an example. Instead of

d.sequence = sorted(d.sequence)

you could simply write d.sort() which does the same.
Hmm... I did consider this. Which list methods would you consider
appropriate ?

Everything method that does not clash with the use as dictionary. For
instance, both lists and dicts have __getitem__ and __setitem__, so im
this case, the dictionary method must take precedence. But a dictionary
has not __getslice__ and __setslice__, so here the list methods can be
used (__getslice__ is actually deprecated, but you get the idea). In
some cases, like __delitem__, both have it, but there is no clash.

Other interesting methods are sort() and reverse().

Here, we have another problem however: There is not only the list of
keys, but also the list of values, and sometimes, as in the case of
sort() and reverse(), it would be also nice to have it operate on the
list of values. How should this be done? PHP solves it by using two
different methods ksort and asort for keys and values. In this notation:

d = OrderedDict( (1,13), (3,12), (2,11) )

d.ksort() ==> d = ( (1,13), (2,11), (3,12) )
d.asort() ==> d = ( (2,11), (3,12), (1,13) )

Similar for reverse().

If the keys() and values() methods would be extended to be setters, then
d.ksort() = d.keys(sorted(d.keys())) and
d.asort() = d.values(sorted(d.values()))

Anyway, I don't like "ksort" and "asort". If it must be, I'd rather use

d.ksort() = d.sortkeys()
d.asort() = d.sortvalues()

d.sort() could default to one of them (not sure which one).

-- Christoph

Nov 23 '05 #161

Christoph Zwerschke

Fuzzyman wrote:

That's not the only use case. Other use cases are to have a specific
order, not based on entry time.

Simple example :
d1 = OrderedDict(some_dict.items())
d1.sequence.sort()
d1 is now an ordered dict with the keys in alphabetic order.
As I said, I would not need to access the sequence, if I can write
d1.sort() or d1.sortkeys()
If you don't want to modify sequence, don't. If you want a copy do :
seq = d1.sequence[:]

This is not needed since you can do the same with: seq = d1.keys()

-- Christoph

Nov 23 '05 #162

Christoph Zwerschke

Carsten Haese schrieb:

Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.

Exactly. But I don't think in this case such methods would be needed.
You easily get the i-th value in the ordered dict as d.values()[i].

-- Chris

Nov 23 '05 #163

Christoph Zwerschke

Here is another old posting I just found which again gives the same use
cases and semantics:

http://mail.python.org/pipermail/pyt...ch/051974.html

"Keys are iterated over in the order that they are added. Setting a
value using a key that compares equal to one already in the dict
replaces the value, but not the key, and does not change the iteration
order. Removing a key (and value) then re-adding it moves the key to the
end of the iteration order."

So far all those who would like to have an ordered dict seem to have the
same semantics in mind.

Nov 23 '05 #164

Carsten Haese

On Wed, 23 Nov 2005 23:39:22 +0100, Christoph Zwerschke wrote

Carsten Haese schrieb:
Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.

Exactly. But I don't think in this case such methods would be
needed. You easily get the i-th value in the ordered dict as
d.values()[i].

-- Chris

True enough, but unless the odict has its list of values on hand, you're
asking the odict to build a list of all its values just so that you can get
the i'th element. Having a method that does the equivalent of d[d.sequence[i]]
would be cleaner and more efficient.

-Carsten

Nov 24 '05 #165

Alex Martelli

Christoph Zwerschke <ci**@online.de> wrote:
...

d.ksort() = d.sortkeys()
d.asort() = d.sortvalues()

d.sort() could default to one of them (not sure which one).

Define JUST d.sort, you can trivially implement the other as
d.sort(key=d.get).
Alex

Nov 24 '05 #166

Fuzzyman

Alex Martelli wrote:

Fuzzyman <fu******@gmail.com> wrote:
...
- the internal keys list should be hidden
I disagree. It is exposed so that you can manually change the order
(e.g. to create a "sorted" dict, rather than one ordered by key
insertion).

What do you *gain* by hiding it ?

Freedom of implementation, of course. E.g., I can perhaps get better
performance by keeping a red-black tree of (insertiontime,key) pairs, so
for example deleting a key is O(log(N)) rather than O(N) as it would
have to be if the sequence had to be kept as a Python list.

What ELSE do you ever gain by enforcing abstraction and information
hiding? Conceptual integrity and implementation freedom...

The poster I was replying to inferred that we should hide it to protect
him from breaking it. :-)

Your reason is much more valid than the one I inferred. (And probably
worth chanign the code for).

- list methods should be mixed in instead
Hmm... I did consider this. Which list methods would you consider
appropriate ?

The difficulty is that integers are valid keys for a dictionary.
Perhaps a subclass that uses integers as indexes instead...

What about allowing slicing, taking advantage of the fact that slices
are NOT valid dictionary keys?

Hmm... ok, simple enough to implement. What would anyone *use* it for
though ?
Presumably sort and reverse methods are also desired.

Yeah - good idea. Insert is also good - I can't remember if we provide
this or not.

Thanks
Fuzzyman
http://www.voidspac.org.uk/python/index.shtml

You can always perform list operations on the ``sequence`` attribute of
course.

But NOT having to expose .sequence as a real mutable list whose changes
affect ordering has many advantages, as above noticed.
Alex

Nov 24 '05 #167

Fuzzyman

Ok.

That answers a question in the post I've just made. This thread is hard
to follow.

Thanks

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Nov 24 '05 #168

Fuzzyman

Christoph Zwerschke wrote:

Fuzzyman wrote:
So what do you want returned when you ask for d1[1] ? The member keyed
by 1, or the item in position 1 ?
In case of conflict, the ordered dictionary should behave like a
dictionary, not like a list. So d1[1] should be the member keyed by 1,
not the item in position 1. Only in case there is no member keyed by 1,
the item in position 1 could be returned, but I think that would be too
adventurous a hattrick and can lead to big confusion. Better to raise a
KeyError in that case.

I agree - hard to trace bugs lie this way....

[snip..]
Instead of writing d.sequence = new_sequence, I would write
d.setkeys(new_sequence). But I'm not sure what to do if new_sequence is
not a permutation of the old one. Raise a KeyError? Or even swallow
this? For instance

This is an interesting question - I don't know which is the best
behaviour here.

It's probably better to raise a KeyError - that way the error will
occur when it happens.

The simplest way of doing this is to check that the new key list
(sorted) is the same as the original one (sorted). This is potentially
quite an expensive operation.

It might be faster to compare sets with Python 2.4 - but we intend to
retain compatibility with Python 2.2.
d = OrderedDict((1,11), (2,12))

d.setkeys((2, 1)) ==> d = OrderedDict((2, 11), (1, 12))

d.setkeys((3, 4)) ==> d = OrderedDict((3, 11), (4, 12)) (or KeyError?)

d.setvalues((12, 11)) ==> d = OrderedDict((1, 12), (2, 11))

d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14)) (always ok)

(Or maybe better set_keys in analogy to has_key?)

I hesitate making keys and values managed properties, because this would
conflict with them being methods in ordinary dicts. Ordered dicts should
resemble ordinary dicts as closely as possible. And giving it a
different name like "sequence" I find confusing and unintuitive.

You really want to be able to set values as a sequence ? I suppose it's
even easier to implement than changing keys, but I'd never considered
it.
A resort could be the following: If keys() is given a sequence as
argument, then use this as the new sequence of keys, and similar with
values(). This way, no new methods need to be introduced.

That's not a bad idea. I'll chat with Nicola Larosa and see what he
thinks.

It does break backawards compatibility of course...

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
-- Christoph

Nov 24 '05 #169

Duncan Booth

Christoph Zwerschke wrote:

Duncan Booth schrieb:
In Javascript Object properties (often used as an associative array)
are defined as unordered although as IE seems to always store them in
the order of original insertion it wouldn't surprise me if there are
a lot of websites depending on that behaviour.

You're right with both. The ECMA language definition says object
properties are an unordered collection, but MSIE and probably other
browsers keep the order in which they were created. Of course one
should not rely on that.

Yes, the real fun bit is arrays. If you create an array using the 'new
Array' or [ ... ] then a for..in loop might well trick you into thinking it
is going to go through the elements in order, but if you assign to elements
directly it breaks down:

a = ['a', 'b', 'c'];
a[4] = 'e';
a[3] = 'd';
for (var k in a) {
alert(a[k]+'='+a[k]);
}

On IE this will go through elements in the order 0, 1, 2, 4, 3.

Also, it is original order of insertion which matters, not the 'last
in/last out' you might have expected. In other words it looks rather as
though IE simply sets deleted properties to undefined (and skips them on
iteration), it never really deletes a property, so anyone who tries to use
a Javascript object as an associative array with lots of rapidly changing
keys could be in for a nasty shock.

Nov 24 '05 #170

Fuzzyman

Carsten Haese wrote:

On Wed, 23 Nov 2005 23:39:22 +0100, Christoph Zwerschke wrote
Carsten Haese schrieb:
Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.
Exactly. But I don't think in this case such methods would be
needed. You easily get the i-th value in the ordered dict as
d.values()[i].

-- Chris

True enough, but unless the odict has its list of values on hand, you're
asking the odict to build a list of all its values just so that you can get
the i'th element. Having a method that does the equivalent of d[d.sequence[i]]
would be cleaner and more efficient.

I'm going to add some of the sequence methods. I'm *not* going to allow
indexing, but I will allow slicing.

You can also do d[d.keys()[i]]

This provides two ways of fetching values by index, so I don't want to
add another.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml -Carsten

Nov 24 '05 #171

Piet van Oostrum

>>>>> Christoph Zwerschke <ci**@online.de> (CZ) escribió:

CZ> Eso es exactamente lo que yo queria haber!

¿Haber? ¿Tener? :=(
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org

Nov 24 '05 #172

Fuzzyman

By the way, Nicola and I will be working on an improving odict in line
with several of the suggestions in this thread.

See :
http://www.voidspace.org.uk/python/w..._19.shtml#e140

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Nov 24 '05 #173

Christoph Zwerschke

Duncan Booth schrieb:

On IE this will go through elements in the order 0, 1, 2, 4, 3.

Oops! I bet most people would not expect that, and it is probably not
mentioned in most Javascript tutorials. I think this is a weakpoint of
the ECMA definition, not MSIE.

-- Christoph

Nov 24 '05 #174

Fuzzyman

Tom Anderson wrote:

On Tue, 22 Nov 2005, Christoph Zwerschke wrote:
One implementation detail that I think needs further consideration is in
which way to expose the keys and to mix in list methods for ordered
dictionaries.

In Foord/Larosa's odict, the keys are exposed as a public member which
also seems to be a bad idea ("If you alter the sequence list so that it
no longer reflects the contents of the dictionary, you have broken your
OrderedDict").

I think it would be probably the best to hide the keys list from the public,
but to provide list methods for reordering them (sorting, slicing etc.).
I'm not too keen on this - there is conceptually a list here, even if it's
one with unusual constraints, so there should be a list i can manipulate
in code, and which should of course be bound by those constraints.

I think I am now in favour of hiding hte sequence attribute.

You will be able to mutate the the keys list through :

d1 = OrderedDict(some_sequence_of_items)
keys = d1.keys()
keys.sort() # or other mutation
d1.keys(keys)

Admittedly this is a lot slower than :

d1 = OrderedDict(some_sequence_of_items)
d1.sequence.sort()

*but* it frees the squence attribute from any implementation details.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
I think the way to do it is to have a sequence property (which could be a
managed attribute to prevent outright clobberation) which walks like a
list, quacks like a list, but is in fact a mission-specific list subtype
whose mutator methods zealously enforce the invariants guaranteeing the
odict's integrity.

I haven't actually tried to write such a beast, so i don't know if this is
either of possible and straightforward.

tom

--
When I see a man on a bicycle I have hope for the human race. --
H. G. Wells

Nov 24 '05 #175

Bengt Richter

On Wed, 23 Nov 2005 23:00:29 +0100, Christoph Zwerschke <ci**@online.de> wrote:

Fuzzyman wrote:
So what do you want returned when you ask for d1[1] ? The member keyed
by 1, or the item in position 1 ?
In case of conflict, the ordered dictionary should behave like a
dictionary, not like a list. So d1[1] should be the member keyed by 1,
not the item in position 1. Only in case there is no member keyed by 1,
the item in position 1 could be returned, but I think that would be too
adventurous a hattrick and can lead to big confusion. Better to raise a
KeyError in that case.
But no other way to directly manipulate the keys should be provided.

Why - don't trust yourself with it ?

No, because I think it is not needed if list operations like insert are
directly possible on your dictionary.

But maybe methods such as setkeys() and setvalues() would be nice to
have in addition.

Instead of writing d.sequence = new_sequence, I would write
d.setkeys(new_sequence). But I'm not sure what to do if new_sequence is
not a permutation of the old one. Raise a KeyError? Or even swallow
this? For instance

d = OrderedDict((1,11), (2,12))

d.setkeys((2, 1)) ==> d = OrderedDict((2, 11), (1, 12))

d.setkeys((3, 4)) ==> d = OrderedDict((3, 11), (4, 12)) (or KeyError?)

d.setvalues((12, 11)) ==> d = OrderedDict((1, 12), (2, 11))

d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14)) (always ok)

Too many weird possibilities for unexpected key-value associations.
d.setitems() might be safer, but see d.items[i:j] below (without eliminating d.items() ;-)

The implication above is that OrderedDict takes an *args argument,
but really it takes a single argument that is a sequence of k,v pairs,
(and maybe some keyword options).

(Or maybe better set_keys in analogy to has_key?)

I hesitate making keys and values managed properties, because this would
conflict with them being methods in ordinary dicts. Ordered dicts should
resemble ordinary dicts as closely as possible. And giving it a
different name like "sequence" I find confusing and unintuitive.

A resort could be the following: If keys() is given a sequence as
argument, then use this as the new sequence of keys, and similar with
values(). This way, no new methods need to be introduced.

You could make keys, values, and items custom descriptors which would define __call__
for the old key() etc accesses, well as __getitem__ and __setitem__. Then you could write

d.items[0], d.items[-1] = d.items[-1], d.items[0]

to swap the order of the first and last items in the thing (I hesitate to say dict ;-)
You could also operate on slices. BTW, before I showed an example where d[2:3] returned
a new dict instance rather than d.items()[:]. I think the items list is better, since
they naturally add, sort, reverse, etc as lists, and you can write OrderedDict(d[2:3]+d[1:2])
if you want a new dict.

A slice assignment to d[i:j] is really sugar for something, but we have to decide exactly
what in case of duplicate keys in the incoming items, either with each other (effictively
a shorter incoming list with the last key:value pairs replacing prior pairs with the same key,
but the first key determining placement -- but where? what if a key doesn't exists outside of
the target slice (hence not within the eliminated slice, since the whole target dict item list
has unique keys)? One way to define it would be
d.items[i:j] = itemseq

to be implemented as sugar for
newitems = d.items[:i] + list(itemseq) + d.items[j:]
d.clear()
d.update(newitems)
so
d.reverse()
could be spelled
d.items[:] = d.items[::-1]

If you are using slices, you can safely use them directly in the __getitem__ of th dict interface,
as I did in the "Que mas?" post. So we don't have to write d.items[i:j] and could write d[i:j].
The thing is, d[i:j] will tempt to skip ".items" in d.items[i] and write d[i], which has the dict key meaning,
not the item list index. It is faster not have a descriptor between though.

I think maybe allowing write to keys or values is pretty iffy. There are too many weird
re-associations of keys and values possible, and which you could do my other means if you
really really wanted to. But I do think full index and slice read access would be fine.

I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] -- i.e, compare
the item lists to implement comparisons.

Detailed requirements are most of the work ;-)
I'm thinking now to try subclassing list in a constrained way instead of dict, but well see.

Regards,
Bengt Richter

Nov 24 '05 #176

Christoph Zwerschke

Fuzzyman schrieb:

I'm going to add some of the sequence methods. I'm *not* going to allow
indexing, but I will allow slicing.

You can also do d[d.keys()[i]]

This provides two ways of fetching values by index, so I don't want to
add another.

And this would be probably faster than d.values()[i] in the usual
implementation where values() has to be built first as Carsten noted.

-- Chris

Nov 24 '05 #177

Fuzzyman

d.keys() will still return a copy of the list, so d.keys()[i] will
still be slower than d.sequence[i]

Slicing shouldn't be too much slower though.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Nov 24 '05 #178

Fuzzyman

Bengt Richter wrote:

On Wed, 23 Nov 2005 23:00:29 +0100, Christoph Zwerschke <ci**@online.de> wrote:
[snip..] I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] -- i.e, compare
the item lists to implement comparisons.

IIUC then the odict effectively already does this.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
Detailed requirements are most of the work ;-)
I'm thinking now to try subclassing list in a constrained way instead of dict, but well see.

Regards,
Bengt Richter

Nov 24 '05 #179

Christoph Zwerschke

Fuzzyman wrote:

You will be able to mutate the the keys list through :

d1 = OrderedDict(some_sequence_of_items)
keys = d1.keys()
keys.sort() # or other mutation
d1.keys(keys)

Admittedly this is a lot slower than :

d1 = OrderedDict(some_sequence_of_items)
d1.sequence.sort()

*but* it frees the squence attribute from any implementation details.

You should also implement

d1.sort() or d1.sortkeys()

which will have no performance drawbacks.

-- Christoph

Nov 24 '05 #180

Christoph Zwerschke

Bengt Richter schrieb:

d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14))
The implication above is that OrderedDict takes an *args argument,
but really it takes a single argument that is a sequence of k,v pairs,
(and maybe some keyword options).
Right. Interpret it as a short notation only, I did not want to change
the interface. I think it's a common mistake to forget the brackets
because you think one pair should be enough. At least I often forget it.
You could make keys, values, and items custom descriptors which would define __call__
for the old key() etc accesses, well as __getitem__ and __setitem__. Then you could write

d.items[0], d.items[-1] = d.items[-1], d.items[0]

to swap the order of the first and last items in the thing (I hesitate to say dict ;-)
You could also operate on slices.
Nice. You could also get the i-th value with d.values[i].

But is this considered good style or would it be considered "dirty" to
have a callable member that also supports indexing and slicing? (I don't
know, just asking?) Plus, it opens a can of worms by increasing the
complexity tremendously. Let's see whether this can be handled.
BTW, before I showed an example where d[2:3] returned
a new dict instance rather than d.items()[:]. I think the items list
is better, since they naturally add, sort, reverse, etc as lists,
and you can write OrderedDict(d[2:3]+d[1:2]) if you want a new dict.
Not sure about that. I would rather expect that if you slice an object,
you get an object of the same type. And you can also add, sort, reverse,
etc the ordered dict if you want.
A slice assignment to d[i:j] is really sugar for something, but we have to decide exactly
what in case of duplicate keys in the incoming items, either with each other ...
You mean slice assignments to d.items[i:j]. If you make the slice
assignment to d[i:j] then at least you cannot have conflicts in the
incoming items. The first question is: Should a slice assignment be
treated as deletion with subsequential addition (changing the order) or
as a replacement (i.e. try to not change the order)? I agree that the
second would be the expected semantics, but as you mentioned more
difficult to implement.
One way to define it would be
d.items[i:j] = itemseq

to be implemented as sugar for
newitems = d.items[:i] + list(itemseq) + d.items[j:]
d.clear()
d.update(newitems)
Which should be the same as
d = OrderedDict(d.items[:i] + list(itemseq) + d.items[j:])
Sounds reasonable.
So
d.reverse()
could be spelled
d.items[:] = d.items[::-1]
Slice assignment for keys and values would also allow to set a new key
order with

d.keys[:] = newkeyseq

So no need to define a setkeys() method or let keys() take arguments.
If we want to allow this kind of things at all.
If you are using slices, you can safely use them directly in the __getitem__ of th dict interface,
as I did in the "Que mas?" post. So we don't have to write d.items[i:j] and could write d[i:j].
The thing is, d[i:j] will tempt to skip ".items" in d.items[i] and write d[i], which has the dict
key meaning, not the item list index. It is faster not have a descriptor between though.

I still think d[i:j] should return an ordered dict, not an item list.
I think maybe allowing write to keys or values is pretty iffy. There are too many weird
re-associations of keys and values possible, and which you could do my other means if you
really really wanted to. But I do think full index and slice read access would be fine.
There are different opinions. Fuzzyman would probably say "Don't trust
yourself?" I myself am undecided. Perhaps you could expect that somebody
knows what he does if he makes a slice assignment to d.keys. In any way,
such an assignment should not "break" the directory (with "broken" I
mean that the internal keys sequence does not match the keys of the
internal dictionary). If anything does not match, it should raise a
KeyException. If it is implemented that way, I think we can allow it.
I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] -- i.e, compare
the item lists to implement comparisons.

Wouldn't it be more performant to compare for
d1.internal_dict==d2.internal_dict and
d1.internal_sequence==d2.internal_sequence?
You don't keep track of the item lists, they need to be built on every
occasion.

-- Christoph

Nov 24 '05 #181

Christoph Zwerschke

Fuzzyman schrieb:

d.keys() will still return a copy of the list, so d.keys()[i] will
still be slower than d.sequence[i]

Right, I forgot that. Bengt suggested to implement __call__ as well as
__getitem__ and __setitem__ for keys, values and items.

In this case, you could very effectively access it as d.values[i].

-- Christoph

Nov 24 '05 #182

dcalvelo

"hacer" probablemente.

DCA.

Piet van Oostrum wrote:

>> Christoph Zwerschke <ci**@online.de> (CZ) escribió:

CZ> Eso es exactamente lo que yo queria haber!

¿Haber? ¿Tener? :=(
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org

Nov 25 '05 #183

Fuzzyman

Christoph Zwerschke wrote:

Fuzzyman schrieb:
d.keys() will still return a copy of the list, so d.keys()[i] will
still be slower than d.sequence[i]
Right, I forgot that. Bengt suggested to implement __call__ as well as
__getitem__ and __setitem__ for keys, values and items.

In this case, you could very effectively access it as d.values[i].

That means making keys, values, and items custom objects.
Creating a new instance would have the overhead of creating 4 new
objects instead of just 1. Is the added convenience worth it ? (Plus
the extra layers of method calls for each access).

I'm not sure. It's a nice idea in terms of using it (we could just
leave the sequence attribute as an alias for the new keys attribute -
for backwards compatibility).

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
-- Christoph

Nov 25 '05 #184

Christoph Zwerschke

Le ruego me perdone.

replace('haber', random.choice('tener', 'hacer', 'lograr'))

Mi espanol es peor que mi python.

-- Christoph

Nov 25 '05 #185

Fuzzyman

Sure - that was just an example of mutating the keys list without
having direct access to it.

If keys was implemented as an object (with a ``__call__`` method) then
we could also implement sequence methods on it - making it easier to
mutate the keys/values/items directly.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Nov 25 '05 #186

Fuzzyman

Nov 25 '05 #187

Fuzzyman

Alex Martelli wrote:

Fuzzyman <fu******@gmail.com> wrote:
There is already an update method of course. :-)

Slicing an ordered dictionary is interesting - but how many people are
actually going to use it ? (What's your use case)
I detest and abhor almost-sequences which can't be sliced (I consider
that a defect of collections.deque). If the ordered dictionary records
by its sequencing the time order of key insertion, being able to ask for
"the last 5 keys entered" or "the first 3 keys entered" seem to me to be
perfectly natural use cases, and most naturally accomplished by slicing
of course, d[-5:] and d[:3] respectively.

If you slice an ordered dictionary, I assume you would expect to get an
ordered dictionary back ?

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

Alex

Nov 25 '05 #188

Christoph Zwerschke

Fuzzyman wrote:

That means making keys, values, and items custom objects.
Creating a new instance would have the overhead of creating 4 new
objects instead of just 1. Is the added convenience worth it ? (Plus
the extra layers of method calls for each access).
I'm not sure about that either. But since you are using odict for
convenience reasons anyway, and not for performance reasons, it would be
consequential. Performance testing should be made anyway, so this could
be tested as well. I think that creating these 3 additional objects will
not matter much if the dict has more than a handful of items. And the
extra layers will be only called if you really access these keys, values
or items attributes which will not happen very frequently. Normally, you
just loop over an ordered directory and acess keyed values.
I'm not sure. It's a nice idea in terms of using it (we could just
leave the sequence attribute as an alias for the new keys attribute -
for backwards compatibility).

Yes, you could make it a deprecated feature.

-- Christoph

Nov 25 '05 #189

Alex Martelli

Fuzzyman <fu******@gmail.com> wrote:
...

If you slice an ordered dictionary, I assume you would expect to get an
ordered dictionary back ?

That would be helpful, yes, though there are precedents for types whose
slicing doesn't return an instance of that type (e.g. slices of an mmap
are instances of str, not of mmap, if I recall correctly), most
sliceable sequences do follow that pattern.
Alex

Nov 25 '05 #190

Tom Anderson

On Wed, 23 Nov 2005, Christoph Zwerschke wrote:

Alex Martelli wrote:
However, since Christoph himself just misclassified C++'s std::map as
"ordered" (it would be "sorted" in this new terminology he's now
introducing), it seems obvious that the terminological confusion is
rife.
Speaking about "ordered" and "sorted" in the context of collections is
not a new terminology I am introducing, but seems to be pretty common in
computer science

This is quite true. I haven't seen any evidence for 'rife'
misunderstanding of these terms.

That said ...
Perhaps Pythonists are not used to that terminology, since they use the
term "list" for an "ordered collection". An ordered dictionary is a
dictionary whose keys are a (unique) list. Sometimes it is also called a
"sequence"

Maybe we should call it a 'sequenced dictionary' to fit better with
pythonic terminology?

tom

--
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May

Nov 25 '05 #191

Tom Anderson

On Wed, 23 Nov 2005, Carsten Haese wrote:

On Wed, 2005-11-23 at 15:17, Christoph Zwerschke wrote:
Bengt Richter wrote:
> E.g., it might be nice to have a mode that assumes d[key] is

d.items()[k][1] when
> key is an integer, and otherwise uses dict lookup, for cases where

the use
> case is just string dict keys.

I also thought about that and I think PHP has that feature, but it's
probably better to withstand the temptation to do that. It could lead
to an awful confusion if the keys are integers.

Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.

+1

Overloading [] to sometimes refer to keys and sometimes to indices is a
really, really, REALLY bad idea. Let's have it refer to keys, and do
indices either via a sequence attribute or the return value of items().

More generally, if we're going to say odict is a subtype of dict, then we
have absolutely no choice but to make the methods that it inherits behave
the same way as in dict - that's what subtyping means. That means not
doing funky things with [], returning a copy from items() rather than a
live view, etc.

So, how do we provide mutatory access to the order of items? Of the
solutions discussed so far, i think having a separate attribute for it -
like items, a live view, not a copy (and probably being a variable rather
than a method) - is the cleanest, but i am starting to think that
overloading items to be a mutable sequence as well as a method is quite
neat. I like it in that the it combines two things - a live view of the
order and a copy of the order - that are really two aspects of one thing,
which seems elegant. However, it does strike me as rather unpythonic; it's
trying to cram a lot of functionality in an unexpected combination into
one place. Sparse is better than dense and all that. I guess the thing to
do is to try both out and see which users prefer.

tom

--
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May

Nov 25 '05 #192

Tom Anderson

On Wed, 23 Nov 2005, Christoph Zwerschke wrote:

Tom Anderson wrote:
I think it would be probably the best to hide the keys list from the
public, but to provide list methods for reordering them (sorting, slicing
etc.).
one with unusual constraints, so there should be a list i can manipulate in
code, and which should of course be bound by those constraints.

Think of it similar as the case of an ordinary dictionary: There is
conceptually a set here (the set of keys), but you cannot manipulate it
directly, but only through the according dictionary methods.

Which is a shame!
For an ordedred dictionary, there is conceptually a list (or more
specifically a unique list). Again you should not manipulate it
directly, but only through methods of the ordered dictionary.

This sounds at first more complicated, but is in reality more easy.

For instance, if I want to put the last two keys of an ordered dict d at
the beginning, I would do it as d = d[:-2] + d[-2:].
As i mentioned elsewhere, i think using [] like this is a terrible idea -
and definitely not easier.
With the list attribute (called "sequence" in odict, you would have to
write: d.sequence = d.sequence[:-2] + d.sequence[-2:]. This is not only
longer to write down, but you also have to know that the name of the
attribute is "sequence".
True, but that's not exactly rocket science. I think the rules governing
when your [] acts like a dict [] and when it acts like a list [] are
vastly more complex than the name of one attribute.
Python's strength is that you don't have to keep many details in mind
because it has a small "basic vocabulary" and orthogonal use.
No it isn't - it's in having a wide set of basic building blocks which do
one simple thing well, and thus which are easy to use, but which can be
composed to do more complex things. What are other examples of this kind
of 'orthogonal use'?
I prefer the ordered dictionary does not introduce new concepts or
attributes if everything can be done intuitively with the existing
Python methods and operators.

I strongly agree. However, i don't think your overloading of [] is at all
intuitive.

tom

--
YOU HAVE NO CHANCE TO ARRIVE MAKE ALTERNATIVE TRAVEL ARRANGEMENTS. --
Robin May

Nov 25 '05 #193

Christoph Zwerschke

Tom Anderson schrieb:

Maybe we should call it a 'sequenced dictionary' to fit better with
pythonic terminology?

That's not such a bad idea. Note that it is called like that in the
Python version of the "Programming Language Examples Alike Cookbook":

http://pleac.sourceforge.net/pleac_p...es.html#AEN250

Anyway, I still favor the more common term "ordered dictionary".

-- Christoph

Nov 25 '05 #194

Christoph Zwerschke

It seems everybody is in full agreement here.

I have the same mixed feeling about letting keys/values/items become
both managed list attributes and still returning copies of the lists
when called in the usual way as methods.

I don't know any precedent for doing things that way and i'm unsure
whether it meets the Zen of Python. But anyway the idea is nice enough
to try it out.

-- Chris

Nov 25 '05 #195

Christoph Zwerschke

Tom Anderson wrote:

True, but that's not exactly rocket science. I think the rules governing
when your [] acts like a dict [] and when it acts like a list [] are
vastly more complex than the name of one attribute.
I think it's not really rocket science either to assume that an ordered
dictionary behaves like a dictionary if you access items by subscription
and like a list if you use slices (since slice indexes must evaluate to
integers anyway, they can only be used as indexes, not as keys).

Python's strength is that you don't have to keep many details in mind
because it has a small "basic vocabulary" and orthogonal use.

No it isn't - it's in having a wide set of basic building blocks which
do one simple thing well, and thus which are easy to use, but which can
be composed to do more complex things. What are other examples of this
kind of 'orthogonal use'?

I mean for instance that you can deal with a tuple in the same way as
with a string and things like that. Concerning "small": Somebody coined
the good slogan "Python fits my brain" but I could also say "Python fits
*into* my brain" (I mean without the batteries of course ;-) - you
pretty soon have all the built-in functins, datatypes and their use in
your head. On the other side of course, Python is a huge field and you
can discuss endlessly as we are doing here.

Anyway, my argument was not good (avoiding new attributes names in order
to keep the vocabulary small).

And by the way, what both of us listed as strengths of Python will be
probably claimed by protagonists of any other language as well.

-- Christoph

Nov 25 '05 #196

Tom Anderson

On Fri, 25 Nov 2005, Christoph Zwerschke wrote:

Tom Anderson wrote:
True, but that's not exactly rocket science. I think the rules governing
when your [] acts like a dict [] and when it acts like a list [] are vastly
more complex than the name of one attribute.

I think it's not really rocket science either to assume that an ordered
dictionary behaves like a dictionary if you access items by subscription
and like a list if you use slices (since slice indexes must evaluate to
integers anyway, they can only be used as indexes, not as keys).

When you put it that way, it makes a certain amount of sense - [:] is
always about index, and [] is always about key. It's still icky, but it is
completely unambiguous.

tom

--
I content myself with the Speculative part [...], I care not for the
Practick. I seldom bring any thing to use, 'tis not my way. Knowledge
is my ultimate end. -- Sir Nicholas Gimcrack

Nov 26 '05 #197

Bengt Richter

On Thu, 24 Nov 2005 18:42:45 +0100, Christoph Zwerschke <ci**@online.de> wrote:

Bengt Richter schrieb:
d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14))
The implication above is that OrderedDict takes an *args argument,
but really it takes a single argument that is a sequence of k,v pairs,
(and maybe some keyword options).

Right. Interpret it as a short notation only, I did not want to change
the interface. I think it's a common mistake to forget the brackets
because you think one pair should be enough. At least I often forget it.

Ok, I forget this too.
You could make keys, values, and items custom descriptors which would define __call__
for the old key() etc accesses, well as __getitem__ and __setitem__. Then you could write

d.items[0], d.items[-1] = d.items[-1], d.items[0]

to swap the order of the first and last items in the thing (I hesitate to say dict ;-)
You could also operate on slices.
Nice. You could also get the i-th value with d.values[i].

But is this considered good style or would it be considered "dirty" to
have a callable member that also supports indexing and slicing? (I don't
know, just asking?) Plus, it opens a can of worms by increasing the
complexity tremendously. Let's see whether this can be handled.

I suspect not that complex, but it has the same performance disadvantage as
properties.

BTW, before I showed an example where d[2:3] returned
a new dict instance rather than d.items()[:]. I think the items list
is better, since they naturally add, sort, reverse, etc as lists,
and you can write OrderedDict(d[2:3]+d[1:2]) if you want a new dict.
Not sure about that. I would rather expect that if you slice an object,
you get an object of the same type. And you can also add, sort, reverse,
etc the ordered dict if you want.
A slice assignment to d[i:j] is really sugar for something, but we have to decide exactly
what in case of duplicate keys in the incoming items, either with each other ...

You mean slice assignments to d.items[i:j]. If you make the slice
assignment to d[i:j] then at least you cannot have conflicts in the
incoming items. The first question is: Should a slice assignment be
treated as deletion with subsequential addition (changing the order) or
as a replacement (i.e. try to not change the order)? I agree that the
second would be the expected semantics, but as you mentioned more
difficult to implement.
One way to define it would be
d.items[i:j] = itemseq

to be implemented as sugar for
newitems = d.items[:i] + list(itemseq) + d.items[j:]
d.clear()
d.update(newitems)

Which should be the same as
d = OrderedDict(d.items[:i] + list(itemseq) + d.items[j:])
Sounds reasonable.

Actually it's not the same, which is why I wrote it with update.
Analogous to slice assignment in lists, the referred-to object
gets mutated. Hence preexisting references to the object see the change too.

If you just rebind d with a new OrderedDict object, previous bindings
are not affected. I.e.,
d = OrderedDict(sorted((k,i)) for i,k in enumerate('abc'))
e = d
d = OrderedDict(d.items[:1] + [('b', 100)] + d.items[2:])
d.items()[1] # => ('b', 100)
e.items()[1] # => ('b', 1) # unaffected

So
d.reverse()
could be spelled
d.items[:] = d.items[::-1]
Slice assignment for keys and values would also allow to set a new key
order with

d.keys[:] = newkeyseq

Do you really mean just re-ordering the keys without a corresponding reording of values??
That would be a weird renaming of all values. Or do you means that any key should still
retrieve the same value as before if used as d[key]? In which case the values must undergo
the same permutation as the keys. I.e., you are assuming key->value pairings remain stable
through any key reorderings?

So no need to define a setkeys() method or let keys() take arguments.
If we want to allow this kind of things at all.
If you are using slices, you can safely use them directly in the __getitem__ of th dict interface,
as I did in the "Que mas?" post. So we don't have to write d.items[i:j] and could write d[i:j].
The thing is, d[i:j] will tempt to skip ".items" in d.items[i] and write d[i], which has the dict
key meaning, not the item list index. It is faster not have adescriptor between though.

I still think d[i:j] should return an ordered dict, not an item list.

Easy to do either way.

I think maybe allowing write to keys or values is pretty iffy. There are too many weird
re-associations of keys and values possible, and which you could do my other means if you
really really wanted to. But I do think full index and slice read access would be fine.
There are different opinions. Fuzzyman would probably say "Don't trust
yourself?" I myself am undecided. Perhaps you could expect that somebody
knows what he does if he makes a slice assignment to d.keys. In any way,
such an assignment should not "break" the directory (with "broken" I
mean that the internal keys sequence does not match the keys of the
internal dictionary). If anything does not match, it should raise a
KeyException. If it is implemented that way, I think we can allow it.

Exactly what, though? should e.g.
d.keys[3] = newk3
mean (not a suggested implementation, just to define semantics)
keys = d.keys()
if newk3 in keys and keys.index(newk3)!=3:
raise ValueError,'Attempt to introduce duplicate key'
items = d.items()
items[3] = (newk3, items[3][1])
d.clear()
d.update(items)
?
This would allow what you might call renaming in place.
Similarly
d.keys[i:j] = newkeysij
might have the semantics of
keys = d.keys()
outside = set(keys[:i])+set(keys[j:])
if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
raise ValueError,'Attempt to introduce duplicate key(s)'
items = d.items()
items[i:j] = [(k, items[kx+i][1]) for kx,k in enumerate(newkeysij)]
d.clear()
d.update(items)

Is this what is desired?

I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] -- i.e, compare
the item lists to implement comparisons.

Wouldn't it be more performant to compare for
d1.internal_dict==d2.internal_dict and
d1.internal_sequence==d2.internal_sequence?
You don't keep track of the item lists, they need to be built on every
occasion.

That depends on how you implement ;-)

Back from holiday, so maybe I'll hack something out.

Regards,
Bengt Richter

Nov 27 '05 #198

Bengt Richter

On Fri, 25 Nov 2005 19:42:49 +0000, Tom Anderson <tw**@urchin.earth.li> wrote:

On Wed, 23 Nov 2005, Carsten Haese wrote:
On Wed, 2005-11-23 at 15:17, Christoph Zwerschke wrote:
Bengt Richter wrote:

> E.g., it might be nice to have a mode that assumes d[key] is
d.items()[k][1] when
> key is an integer, and otherwise uses dict lookup, for cases where
the use
> case is just string dict keys.

I also thought about that and I think PHP has that feature, but it's
probably better to withstand the temptation to do that. It could lead
to an awful confusion if the keys are integers.
Thus quoth the Zen of Python:
"Explicit is better than implicit."
"In the face of ambiguity, refuse the temptation to guess."

With those in mind, since an odict behaves mostly like a dictionary, []
should always refer to keys. An odict implementation that wants to allow
access by numeric index should provide explicitly named methods for that
purpose.

+1

Overloading [] to sometimes refer to keys and sometimes to indices is a
really, really, REALLY bad idea. Let's have it refer to keys, and do
indices either via a sequence attribute or the return value of items().

More generally, if we're going to say odict is a subtype of dict, then we
have absolutely no choice but to make the methods that it inherits behave
the same way as in dict - that's what subtyping means. That means not
doing funky things with [], returning a copy from items() rather than a
live view, etc.

OTOH,
{}[:]

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: unhashable type
I.e., slices are not valid keys for ordinary dicts, and slices tie in
very well with the ordered aspect of ordered dicts, so that's an
argument for permitting it via the indexing syntax, not just items[:]
or items()[:] which have related but not identical semantics.
So, how do we provide mutatory access to the order of items? Of the
solutions discussed so far, i think having a separate attribute for it -
like items, a live view, not a copy (and probably being a variable rather
than a method) - is the cleanest, but i am starting to think that
overloading items to be a mutable sequence as well as a method is quite
neat. I like it in that the it combines two things - a live view of the
order and a copy of the order - that are really two aspects of one thing,
which seems elegant. However, it does strike me as rather unpythonic; it's
trying to cram a lot of functionality in an unexpected combination into
one place. Sparse is better than dense and all that. I guess the thing to
do is to try both out and see which users prefer.

I wonder who is going to use it for what.

Regards,
Bengt Richter

Nov 27 '05 #199

Christoph Zwerschke

Bengt Richter wrote:

d.keys[:] = newkeyseq
Do you really mean just re-ordering the keys without a corresponding reording of values??
That would be a weird renaming of all values. Or do you means that any key should still
retrieve the same value as before if used as d[key]? In which case the values must undergo
the same permutation as the keys. I.e., you are assuming key->value pairings remain stable
through any key reorderings?

Since it is considered as being a dictionary in the first place, the
key->value pairings should of course stay stable. In the usual
implementation based on an ordinary dictionary with an additional key
list ("sequence" in the Foord/Larosa and "_keys" in the Bejamin/Winter
implementation), you would only set the key list, since the value list
is generated dynamically. But if your implementation keeps internal
values or items lists, these need to be adjusted as well.

I will assume that d has is a Foord/Larosa ordered dict with "sequence"
attribute in the following.

Then, with other words,

d.keys[:] = newkeyseq

should do the same as:

d.sequence = newkeyseq
Exactly what, though? should e.g.
d.keys[3] = newk3
mean (not a suggested implementation, just to define semantics)
keys = d.keys()
if newk3 in keys and keys.index(newk3)!=3:
raise ValueError,'Attempt to introduce duplicate key'
items = d.items()
items[3] = (newk3, items[3][1])
d.clear()
d.update(items)
Yes, that would be the correct semantics. Of course this should not be
the real implementation and use KeyError instead of ValueError. With
other words,

d.keys[i] = newkey

sould be the same as:

if d.sequence[i] != newkey:
if newkey in d.sequence:
raise KeyError,'Attempt to introduce duplicate key'
else:
d.sequence[i] = newkey
This would allow what you might call renaming in place.
Similarly
d.keys[i:j] = newkeysij
might have the semantics of
keys = d.keys()
outside = set(keys[:i])+set(keys[j:])
if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
raise ValueError,'Attempt to introduce duplicate key(s)'
items = d.items()
items[i:j] = [(k, items[kx+i][1]) for kx,k in enumerate(newkeysij)]
d.clear()
d.update(items)

Is this what is desired?
Not quite, because it does not preserve the key->value pairings (see
above) and it would behave strangely or raise an exception if the new
slice is larger. The following code would do:

keys = d.keys()
outside = set(keys[:i])|set(keys[j:])
if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
raise ValueError,'Attempt to introduce duplicate key(s)'
items = d.items()
items[i:j] = [(k, d.get(k, None)) for k in newkeysij]
d.clear()
d.update(items)

(Note that there was a bug in the second line. You cannot add sets.)

Again, this would be equivalent to:

seq = d.sequence
newseq = seq[:]
newseq[i:j] = newkeysij
if len(newseq) != len(set(newseq)):
raise KeyError,'Attempt to introduce duplicate key(s)'
for k in set(seq[i:j]) - set(newkeysij):
del d[k]
for k in set(newkeysij) - set(seq[i:j]):
d[k] = None
d.sequence = newseq

You don't keep track of the item lists, they need to be built on every
occasion.

That depends on how you implement ;-)

Ok, I was thinking of the usual implementations.
Back from holiday, so maybe I'll hack something out.

Let us know when you have something to check out.

Maybe Fuzzyman can make some moderate improvements to the existing
odict.py, and you can do something more "radical". Then we have two
"reference implementations" and can compare how they prove regarding
performance and usability.

-- Christoph

Nov 27 '05 #200