Tuple question

Will McGugan

Hi,

Why is it that a tuple doesn't have the methods 'count' and 'index'? It
seems they could be present on an immutable object.

I realise it's easy enough to convert the tuple to a list and do this,
I'm just curious why it is necessary.
Thanks,

Will McGugan

Jul 18 '05 #1

Wai Yip Tung

I'm not sure what you mean by index. But you can use len() to get the
number of objects in a tuple, e.g.

>>> t = (1, 2, 3)
>>> len(t)
3
>>> t[2]
3

Jul 18 '05 #2

Will McGugan

Wai Yip Tung wrote:

I'm not sure what you mean by index. But you can use len() to get
the number of objects in a tuple, e.g.

>>> t = (1, 2, 3)
>>> len(t)
3
>>> t[2]
3

Lists have an index method that returns the index of the first occurrence
of an element, but tuples don't (nor count). Just wondering why.

>>> l = [1, 2, 3]
>>> t = (1, 2, 3)
>>> l.index(2)
1
>>> t.index(2)
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in ?
    t.index(2)
AttributeError: 'tuple' object has no attribute 'index'
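
The lookup can also be written as a plain function, which avoids the round trip through a list. A minimal sketch, assuming the hypothetical helper names seq_index and seq_count (they are not part of Python); they rely only on iteration, so they accept tuples, lists and strings alike:

```python
def seq_index(seq, value):
    """Return the position of the first item equal to value.

    Raises ValueError when value is absent, mirroring list.index().
    """
    for position, item in enumerate(seq):
        if item == value:
            return position
    raise ValueError("%r is not in sequence" % (value,))


def seq_count(seq, value):
    """Return how many items in seq compare equal to value."""
    return sum(1 for item in seq if item == value)


t = (1, 2, 3, 2)
print(seq_index(t, 2))
print(seq_count(t, 2))
```

Later Python releases (2.6 onwards) added index() and count() to tuple itself, so the helpers are only needed on older interpreters.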

Jul 18 '05 #3

Wai Yip Tung

Oops, I misunderstood what you said about count and index. Now I get it.

Speaking as a user of Python, here is my take:

You consider a tuple an immutable version of a list. But in Python's design
they have different purposes. A list is a collection of homogeneous items, while
a tuple is a convenient grouping of any kind of items. For example, you can
use them this way:

users = ['admin', 'user1', 'user2']
address = ('www.python.org', 80)

index and count only make sense when the collection is homogeneous.
Therefore they are not defined for tuple.

tung

On Thu, 02 Sep 2004 17:40:27 +0100, Will McGugan
<ne**@NOwillmcguganSPAM.com> wrote:

Wai Yip Tung wrote:

I'm not sure what you mean by index. But you can use len() to get
the number of objects in a tuple, e.g.

>>> t = (1, 2, 3)
>>> len(t)
3
>>> t[2]
3

Lists have an index method that returns the index of the first occurrence
of an element, but tuples don't (nor count). Just wondering why.

>>> l = [1, 2, 3]
>>> t = (1, 2, 3)
>>> l.index(2)
1
>>> t.index(2)
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in ?
    t.index(2)
AttributeError: 'tuple' object has no attribute 'index'

Jul 18 '05 #4

Gandalf

users = ['admin', 'user1', 'user2']
address = ('www.python.org', 80)

index and count only make sense when the collection is homogeneous.
Therefore they are not defined for tuple.

Why?

address = ['www.python.org',80]

A list can hold any kind of object. I think that the 'index' method for
tuples would be a good idea.

Jul 18 '05 #5

Peter Hansen

Will McGugan wrote:

Why is it that a tuple doesn't have the methods 'count' and 'index'? It
seems they could be present on an immutable object.

I realise it's easy enough to convert the tuple to a list and do this,
I'm just curious why it is necessary.

Please see these recent threads, and read the FAQ:

http://groups.google.ca/groups?threa...t%40python.org

and

http://groups.google.ca/groups?threa...ing.google.com

-Peter

Jul 18 '05 #6

Roy Smith

In article <ma**************************************@python.org>,
Gandalf <ga*****@geochemsource.com> wrote:

users = ['admin', 'user1', 'user2']
address = ('www.python.org', 80)

index and count only make sense when the collection is homogeneous.
Therefore they are not defined for tuple.

Why?

address = ['www.python.org',80]

A list can hold any kind of object. I think that the 'index' method for
tuples would be a good idea.

Personally, I think it should be more general. I think index should be
a sequence method, and tuple should just inherit from that.
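
Later Python versions offer something close to this idea: the abstract base class collections.abc.Sequence supplies index() and count() as mixin methods to any class that defines __getitem__ and __len__. A minimal sketch, assuming Python 3 and a hypothetical FrozenList class invented here for illustration:

```python
from collections.abc import Sequence


class FrozenList(Sequence):
    """Hypothetical read-only sequence; only __getitem__ and __len__ are written by hand."""

    def __init__(self, items):
        self._items = tuple(items)

    def __getitem__(self, position):
        return self._items[position]

    def __len__(self):
        return len(self._items)


address = FrozenList(['www.python.org', 80])
# index() and count() are inherited from Sequence, not defined above.
print(address.index(80))
print(address.count('www.python.org'))
```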

Jul 18 '05 #7

Donn Cave

In article <ma**************************************@python.org>,
Gandalf <ga*****@geochemsource.com> wrote:

users = ['admin', 'user1', 'user2']
address = ('www.python.org', 80)

index and count only make sense when the collection is homogeneous.
Therefore they are not defined for tuple.

Why?

address = ['www.python.org',80]

A list can hold any kind of object. I think that the 'index' method for
tuples would be a good idea.

Yes, lists and tuples can hold the same kinds of objects.
I don't care whether I manage to convince you that the
index method is not needed, but here's my take on an aspect
of the homogeneity issue, a somewhat obscure point that
isn't explained in the FAQ.
(http://www.python.org/doc/faq/genera...parate-tuple-and-list-data-types)

Lists are not naturally homogeneous because each item is
of the same type as the next. That would be sort of absurd
in a language like Python, where that kind of typing isn't
done. Rather they are homogeneous because if you say that
an object is "list of (something)", typically a slice
of that list will still be a valid "list of (something)" -
a list of hosts, a list of dictionary keys, etc. In this
less concrete sense of type, the list itself has a type
that applies not only to the whole list but to any slice.
The list object has all kinds of support for iterative
traversal, deletion, extension, etc., because these are
naturally useful for this kind of sequence.

On the other hand, we normally use tuples for data that
is meaningful only when it's intact. The (key, value)
pair that comes back from dict.items(), for example. Each
value may very well be a string, but the sequence is not
homogeneous in the sense we're talking about, and index()
is not useful.

Donn Cave, do**@u.washington.edu

Jul 18 '05 #8

Colin J. Williams

Wai Yip Tung wrote:

Oops, I misunderstood what you said about count and index. Now I get it.

Speaking as a user of Python, here is my take:

You consider a tuple an immutable version of a list. But in Python's design
they have different purposes. A list is a collection of homogeneous items,
while a tuple is a convenient grouping of any kind of items. For
example, you can use them this way:

users = ['admin', 'user1', 'user2']
address = ('www.python.org', 80)

index and count only make sense when the collection is homogeneous.

What about:

addresses= list(address)
print addresses
users.append(address)
print users

Colin W.

Therefore they are not defined for tuple.

tung

On Thu, 02 Sep 2004 17:40:27 +0100, Will McGugan
<ne**@NOwillmcguganSPAM.com> wrote:
Wai Yip Tung wrote:

I'm not sure what you mean by index. But you can use len() to get
the number of objects in a tuple, e.g.

>>> t = (1, 2, 3)
>>> len(t)
3
>>> t[2]
3

Lists have an index method that returns the index of the first
occurrence of an element, but tuples don't (nor count). Just wondering
why.

>>> l = [1, 2, 3]
>>> t = (1, 2, 3)
>>> l.index(2)
1
>>> t.index(2)
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in ?
    t.index(2)
AttributeError: 'tuple' object has no attribute 'index'

Jul 18 '05 #9

Gandalf

Please see these recent threads, and read the FAQ:

http://groups.google.ca/groups?threa...t%40python.org

This is from that thread:
Note that while .index() makes sense for some sequences,
such as strings and lists, it doesn't make sense for the
way in which tuples are "supposed to be used", which is
as collections of heterogeneous data and not usually as
simply read-only lists.

Why is it not useful to have an index() method for collections of heterogeneous data?

Suppose you have a big amount of data stored in tuples (to use less memory).
You may want to extract slices from the tuples from a given index determined by an object.
This is just an example, however it is quite realistic (e.g. using tuples instead of lists
because there is a huge amount of static data that you need to access quickly).

Jul 18 '05 #10

Peter Hansen

Gandalf wrote:

This is from that thread:
Note that while .index() makes sense for some sequences,
such as strings and lists, it doesn't make sense for the
way in which tuples are "supposed to be used", which is
as collections of heterogeneous data and not usually as
simply read-only lists.

Why is it not useful to have an index() method for collections of
heterogeneous data?

Because you will already know where the different items or
types of items are in the sequence. If you don't, it's
probably not heterogeneous data using the definition that
is being used by those saying that tuples are not just
immutable lists.

Suppose you have a big amount of data stored in tuples (to use less
memory).

Why do you think tuples use significantly less memory than lists?
As far as I know, they don't. (They do use less, but if you are
really talking about huge amounts of data such that you would
be trying to optimize in this way, then the amount that they use
is not *significantly* less.)

You may want to extract slices from the tuples from a given index
determined by an object.
This is just an example, however it is quite realistic (e.g. using
tuples instead of lists because there is a huge amount of static data
that you need to access quickly).

Actually, it's realistic but unwise and a waste of time. Use lists,
that's what they were meant for...

-Peter

Jul 18 '05 #11

Roy Smith

In article <ma**************************************@python.org>,
Gandalf <ga*****@geochemsource.com> wrote:

Please see these recent threads, and read the FAQ:

http://groups.google.ca/groups?threa....5135.python-list%40python.org

This is from that thread:
Note that while .index() makes sense for some sequences,
such as strings and lists, it doesn't make sense for the
way in which tuples are "supposed to be used", which is
as collections of heterogeneous data and not usually as
simply read-only lists.

Why is it not useful to have an index() method for collections of
heterogeneous data?

Suppose you have a big amount of data stored in tuples (to use less
memory).
You may want to extract slices from the tuples from a given index determined
by an object.
This is just an example, however it is quite realistic (e.g. using tuples
instead of lists because there is a huge amount of static data that you
need to access quickly).

Also, you must use tuples instead of lists as dictionary keys. If your
keys are inherently arbitrary length ordered collections of homogeneous
data, you might very well want to use things like index() on the keys.
In this case, they really are "immutable lists".

I understand the argument that tuples are supposed to be the moral
equivalent of anonymous C structs, but if that's the case, why do things
like len() and slicing work on them? Not to mention "in" (either as a
test or as an iterator). None of those things make sense to do on
structs.

It's really pretty arbitrary that of the things you can do on immutable
sequences, index() and count() are special-cased as inappropriate
operations for tuples.

Looking over the Zen list, I'd say any of:

Simple is better than complex.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.

argue for allowing index() and count() to be used on tuples.

Jul 18 '05 #12

Aahz

In article <P5********************@powergate.ca>,
Peter Hansen <pe***@engcorp.com> wrote:

Why do you think tuples use significantly less memory than lists?
As far as I know, they don't. (They do use less, but if you are
really talking about huge amounts of data such that you would
be trying to optimize in this way, then the amount that they use
is not *significantly* less.)

Actually, if you have large numbers of short sequences, the memory
savings from tuples can indeed be significant. I don't remember off-hand
what the number is, but I think it's something on the order of 20%.
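
One way to check the figure on a given interpreter is sys.getsizeof, which was added in later Python versions (2.6 onwards). A minimal sketch, assuming CPython; the numbers differ between versions and platforms, and the sample values are only illustrative:

```python
import sys

short_tuple = ('www.python.org', 80)
short_list = ['www.python.org', 80]

tuple_size = sys.getsizeof(short_tuple)
list_size = sys.getsizeof(short_list)

# getsizeof measures the container only, not the objects it refers to.
print("tuple: %d bytes" % tuple_size)
print("list:  %d bytes" % list_size)
print("saving: %.0f%%" % (100.0 * (list_size - tuple_size) / list_size))
```

The relative saving shrinks as the sequences get longer, because the fixed per-object overhead is spread over more items.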
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"To me vi is Zen. To use vi is to practice zen. Every command is a
koan. Profound to the user, unintelligible to the uninitiated. You
discover truth everytime you use it." --*****@lion.austin.ibm.com

Jul 18 '05 #13

Mel Wilson

In article <ma**************************************@python.org>,
Gandalf <ga*****@geochemsource.com> wrote:

Note that while .index() makes sense for some sequences,
such as strings and lists, it doesn't make sense for the
way in which tuples are "supposed to be used", which is
as collections of heterogeneous data and not usually as
simply read-only lists.

Why is it not useful to have an index() method for collections of heterogeneous data?

Since the data are heterogeneous, they're interpreted
according to their positions in the tuple. In a tuple of
name, address, city:

('Marlene Stewart Street', 'Lois Lane', 'Margaret Bay')

It makes no sense to exchange elements, or treat one like
another, no matter that they're all strings, and no matter
what you think they look like. Just finding the value 'Lois
Lane' in the tuple doesn't tell you what 'Lois Lane' means.

All of which is belied by our recent experience of rows
of database fields kicked off by a row of column labels:

('name', 'address', 'city')
('Marlene Stewart Street', 'Lois Lane', 'Margaret Bay')
('Cecil Rhodes', 'Frank Court', 'Saint John')

Then my first and only reaction is to turn the whole mess into a
list of dictionaries and forget about the list/tuple distinction
for the rest of the day.
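
That conversion can be sketched in a few lines, assuming the first row holds the column labels as in the example above (the variable names labels and records are made up for illustration):

```python
rows = [
    ('name', 'address', 'city'),
    ('Marlene Stewart Street', 'Lois Lane', 'Margaret Bay'),
    ('Cecil Rhodes', 'Frank Court', 'Saint John'),
]

labels = rows[0]
# Pair each label with the value in the same position of every data row.
records = [dict(zip(labels, row)) for row in rows[1:]]

for record in records:
    print(record['name'] + ' lives in ' + record['city'])
```

Once the fields are looked up by label, 'Lois Lane' can only be an address, whatever it looks like.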

Regards. Mel.

Jul 18 '05 #14

Dan Christensen

I'm not sure I buy the arguments against an index operation for
tuples. For example, suppose we were conducting a vote about
something, hmm, let's say decorator syntax. <wink> And suppose that
each person is allowed three distinct votes, ranked in order, with the
first vote getting 3 points, the second 2, and the third 1. We might
store the votes in a database, whose rows would naturally be tuples
like

('J2', 'C64', 'X11')

Now suppose we want to calculate the total number of points for
proposal X, but don't need to compute the totals for the other
choices. Code like the following would be a pretty natural
approach:

for row in rows:
    try:
        points += 3-row.index(X)
    except:
        pass

I realize that there are different ways to code it, but most
are simply reimplementations of the proposed index function.

Dan

Jul 18 '05 #15

Jason Lai

Dan Christensen wrote:

I'm not sure I buy the arguments against an index operation for
tuples. For example, suppose we were conducting a vote about
something, hmm, let's say decorator syntax. <wink> And suppose that
each person is allowed three distinct votes, ranked in order, with the
first vote getting 3 points, the second 2, and the third 1. We might
store the votes in a database, whose rows would naturally be tuples
like

('J2', 'C64', 'X11')

Now suppose we want to calculate the total number of points for
proposal X, but don't need to compute the totals for the other
choices. Code like the following would be a pretty natural
approach:

for row in rows:
    try:
        points += 3-row.index(X)
    except:
        pass

I realize that there are different ways to code it, but most
are simply reimplementations of the proposed index function.

Dan

Well,

for index, row in enumerate(rows):
    points += 3 - i

It's not really reimplementing the index function. You already have the
index. I think for most of the cases where people use tuples, you
already know what you're using each index for.

- Jason

Jul 18 '05 #16

Dan Christensen

Jason Lai <jm***@uci.edu> writes:

Dan Christensen wrote:
for row in rows:
    try:
        points += 3-row.index(X)
    except:
        pass

I realize that there are different ways to code it, but most
are simply reimplementations of the proposed index function.

Dan

for index, row in enumerate(rows):
    points += 3 - index

[I changed "i" to "index" in the last line.]

That doesn't do the same thing, but that's because my description
wasn't clear. rows is a list of tuples, with each tuple being one
person's three votes. E.g.

rows = [('J2', 'C64', 'X11'), ('U2', 'J2', 'P3000')]

So if X == 'J2', the code should calculate 3+2=5 points.
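
A runnable sketch of that calculation which does not depend on tuples having an index method. The function name points_for is hypothetical, and the inner loop is exactly the kind of hand-written reimplementation of index mentioned earlier:

```python
def points_for(proposal, rows):
    """Total the points one proposal earns across all ballots."""
    points = 0
    for row in rows:
        # Linear search for the proposal within this person's ballot.
        for position, choice in enumerate(row):
            if choice == proposal:
                points += 3 - position   # first place 3, second 2, third 1
                break
    return points


rows = [('J2', 'C64', 'X11'), ('U2', 'J2', 'P3000')]
print(points_for('J2', rows))  # 3 + 2 = 5
```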

Dan

Jul 18 '05 #17

Peter Hansen

Aahz wrote:

In article <P5********************@powergate.ca>,
Peter Hansen <pe***@engcorp.com> wrote:
Why do you think tuples use significantly less memory than lists?
As far as I know, they don't. (They do use less, but if you are
really talking about huge amounts of data such that you would
be trying to optimize in this way, then the amount that they use
is not *significantly* less.)

Actually, if you have large numbers of short sequences, the memory
savings from tuples can indeed be significant. I don't remember off-hand
what the number is, but I think it's something on the order of 20%.

Differing definitions of "significant", I guess, because
nothing less than about a 2:1 ratio would make me consider
optimizing to use tuples instead of lists...

Consider, for example, that one actually has to build the
tuple in the first place... how can you do that without
having the info in a list to begin with? (I'm sure there
are ways if one is ingenious, but I think the answers
would just go to prove the point I was making.)

-Peter

Jul 18 '05 #18

Jason Lai

Dan Christensen wrote:

Jason Lai <jm***@uci.edu> writes:

Dan Christensen wrote:

for row in rows:
    try:
        points += 3-row.index(X)
    except:
        pass

I realize that there are different ways to code it, but most
are simply reimplementations of the proposed index function.

Dan

for index, row in enumerate(rows):
    points += 3 - index

[I changed "i" to "index" in the last line.]

That doesn't do the same thing, but that's because my description
wasn't clear. rows is a list of tuples, with each tuple being one
person's three votes. E.g.

rows = [('J2', 'C64', 'X11'), ('U2', 'J2', 'P3000')]

So if X == 'J2', the code should calculate 3+2=5 points.

Dan

Ah, okay, I see my mistake. I should read closer before replying :P

Well, I see your point, although I still don't think it would happen
that often. I also don't see it as a newbie trap, because currently all
the other "mutable sequence" member functions are only for lists, not
tuples. Index and count are the only mutable sequence functions that
might apply to tuples. One could argue that those functions might be
useful (to a lesser extent, probably) for iterators too.

- Jason Lai

Jul 18 '05 #19

Donn Cave

Quoth Dan Christensen <jd*@uwo.ca>:
| I'm not sure I buy the arguments against an index operation for
| tuples. For example, suppose we were conducting a vote about
| something, hmm, let's say decorator syntax. <wink> And suppose that
| each person is allowed three distinct votes, ranked in order, with the
| first vote getting 3 points, the second 2, and the third 1. We might
| store the votes in a database, whose rows would naturally be tuples
| like
|
| ('J2', 'C64', 'X11')
|
| Now suppose we want to calculate the total number of points for
| proposal X, but don't need to compute the totals for the other
| choices. Code like the following would be a pretty natural
| approach:
|
| for row in rows:
|     try:
|         points += 3-row.index(X)
|     except:
|         pass
|
| I realize that there are different ways to code it, but most
| are simply reimplementations of the proposed index function.

The algorithm is fine, it's the choice of sequence that's debatable.
Once you get index() for this application, next you'll want append().
After all, there's no apparent reason that at any time there must be
exactly 3 votes, so what if you want to collect data in several passes -

for person, vote in data:
    try:
        row = votes[person]
    except KeyError:
        row = []
        votes[person] = row
    if len(row) < 3:
        row.append(vote)

Your application is really suited for a list or dictionary. It can,
conceptually, support mutations like insert, delete and append, with
fairly obvious semantics in terms of your application. I mean, if
you delete the first item, then the second item becomes the 3 point
vote, etc. (In an application where that's not the case, then you
probably want a dictionary, where deleting whatever item has no
effect on the others.) Compare with the mtime tuple returned by
time.localtime() - minus the first item, about the most you can say
is it's an mtime minus its first item.
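
The multi-pass collection can be tried out with some made-up input. A minimal sketch, assuming hypothetical sample data (the names alice and bob and the extra fourth vote are invented here) and using dict.setdefault in place of the try/except:

```python
# Hypothetical input: (person, vote) pairs arriving in arbitrary order.
data = [('alice', 'J2'), ('bob', 'U2'), ('alice', 'C64'),
        ('bob', 'J2'), ('alice', 'X11'), ('bob', 'P3000'),
        ('alice', 'A1')]   # a fourth vote from alice, which is dropped

votes = {}
for person, vote in data:
    row = votes.setdefault(person, [])   # start an empty ballot on first sight
    if len(row) < 3:
        row.append(vote)

for person in sorted(votes):
    print(person, votes[person])
```

Each ballot grows one vote at a time, which is the kind of mutation a list supports and a tuple does not.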

No one is saying you must use a list then. Do whatever you want!
But no one is forcing you to use a tuple, either (in this hypothetical
application), and if you need an index function, you know where to get it.

Donn Cave, do**@drizzle.com

Jul 18 '05 #20

Arthur

"Donn Cave" <do**@drizzle.com> wrote in message
news:1094189336.822541@yasure...

No one is saying you must use a list then. Do whatever you want!
But no one is forcing you to use a tuple, either (in this hypothetical
application), and if you need an index function, you know where to get
it.

Which, to me, is what it boils down to. What are you trying to do, and
where's the fit? At a very practical level. I continue to think any
general discussion of homogeneity/heterogeneity is misdirecting. Python
relies on lists to be able to do duty in many different kinds of
circumstances, as opposed to accessing specialized containers, as in other
languages. Clearly, in some of those circumstances homogeneity, in some
sense or other, is of the essence. In others it clearly is not. It's the
append method one is after, for example. In a dynamic app, append and
ordered access solves a set of problems that may or may not be reasonably
conceptualized as related to homogeneity. So any attempt to describe
anything about lists vs. tuples in terms of its data content always in the
end seems unnecessarily reductionist, IMO - if that's the right word.

Art

Jul 18 '05 #21

Donn Cave

Quoth "Arthur" <aj******@optonline.com>:
| ... In a dynamic app, append and
| ordered access solves a set of problems that may or may not be reasonably
| conceptualized as related to homogeneity. So any attempt to describe
| anything about lists vs. tuples in terms of its data content always in the
| end seems unnecessarily reductionist, IMO - if that's the right word.

Say, have we been here before? Remember, it really isn't about the
data content considered separately, rather the synthesis of structure
and data.

Donn Cave, do**@drizzle.com

Jul 18 '05 #22

Alex Martelli

Donn Cave <do**@u.washington.edu> wrote:
...

On the other hand, we normally use tuples for data that
is meaningful only when it's intact. The (key, value)

So by this argument len(t) should not work if t is a tuple...

I've never accepted the BDFL's explanations on what tuples are for; like
Python beginners I use them as immutable lists (to index into a
dictionary or be set members) and curse their lack of useful methods.

pair that comes back from dict.items(), for example. Each
value may very well be a string, but the sequence is not
homogeneous in the sense we're talking about, and index()
is not useful.

Even for a pair I sometimes like to know if 42 is the key, the value,
or neither. index is handy for that... but not if the pair is a tuple,
only if it's a list. Rationalize as you will, it's still a Python wart.

Pseudotuples with NAMED (as well as indexed) arguments, as modules stat
and time now return, may be a different issue. Not sure why we never
made declaring such pseudotuples as usertypes as easy as it should be, a
custom metaclass in some stdlib module shd be enough. But tuples whose
items can't be named, just indexed or sliced, just are not a good fit
for the kind of use case you and Guido use to justify tuple's lack of
methods, IMHO.
Alex

Jul 18 '05 #23

Alex Martelli

Peter Hansen <pe***@engcorp.com> wrote:
...

Consider, for example, that one actually has to build the
tuple in the first place... how can you do that without
having the info in a list to begin with? (I'm sure there
are ways if one is ingenious, but I think the answers
would just go to prove the point I was making.)

tuple(somegenerator(blah)) will work excellently well. In 2.4, you can
even often code that 'somegenerator' inline as a generator
comprehension. So this 'having the info in a list' argument sounds just
totally bogus to me.

Say I want to work with some primes and I have a primes generator.
Primes aren't going to change, so a tuple is a natural. I start with,
e.g.,

ps = tuple(itertools.islice(primes(), 999999))

...and then I'm stumped because I can't index into ps to find, say, the
progressive number of some given prime N by ps.index(N). How silly,
having to keep ps a list, i.e. mutable (when it intrinsically isn't)
just to be able to index into it! [I can usefully exploit ps's
sortedness via module bisect... ignoring the latter's specs and docs
that keep screaming LISTS, bisect.bisect DOES work on tuples... but I
wouldn't feel comfy about that surviving, given said docs and
specs...:-)]
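
The bisect route can be sketched as follows. The primes() generator here is a hypothetical stand-in using simple trial division, sorted_index is a made-up helper name, and the tuple is kept short so the example runs quickly:

```python
import bisect
import itertools


def primes():
    """Hypothetical stand-in for the primes generator: simple trial division."""
    found = []
    for candidate in itertools.count(2):
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate


def sorted_index(seq, value):
    """Locate value in an ascending sequence by binary search."""
    position = bisect.bisect_left(seq, value)
    if position < len(seq) and seq[position] == value:
        return position
    raise ValueError("%r is not in sequence" % (value,))


ps = tuple(itertools.islice(primes(), 1000))
print(sorted_index(ps, 13))
```

Because it exploits the sortedness of ps, the lookup takes a logarithmic number of comparisons, where a linear index() scan would walk the tuple from the start.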
Alex

Jul 18 '05 #24

Andrew Durdin

On Sat, 4 Sep 2004 11:00:30 +0200, Alex Martelli <al*****@yahoo.com> wrote:

Pseudotuples with NAMED (as well as indexed) arguments, as modules stat
and time now return, may be a different issue. Not sure why we never
made declaring such pseudotuples as usertypes as easy as it should be, a
custom metaclass in some stdlib module shd be enough. But tuples whose
items can't be named, just indexed or sliced, just are not a good fit
for the kind of use case you and Guido use to justify tuple's lack of
methods, IMHO.

Such "pseudotuples" are easy enough to implement. Since I'm not all
that crash hot with metaclasses, I just made a tuple subclass (see
below). Would a metaclass implementation offer any significant
benefits over a subclass?

class NamedTuple(tuple):
    """Builds a tuple with elements named and indexed.

    A NamedTuple is constructed with a sequence of (name, value) pairs;
    the values can then be obtained by looking up the name or the value.
    """

    def __new__(cls, seq):
        return tuple.__new__(cls, [val for name,val in seq])

    def __init__(self, seq):
        tuple.__init__(self)
        tuple.__setattr__(self, "_names",
            dict(zip([name for name,val in seq], range(len(seq)))))

    def __getattr__(self, name):
        try:
            return tuple.__getitem__(self, self.__dict__["_names"][name])
        except KeyError:
            raise AttributeError, "object has no attribute named '%s'" % name

    def __setattr__(self, name, value):
        if self._names.has_key(name):
            raise TypeError, "object doesn't support item assignment"
        else:
            tuple.__setattr__(self, name, value)

# Example
if __name__ == "__main__":
    names = ("name", "age", "height")
    person1 = NamedTuple(zip(names, ["James", "26", "185"]))
    person2 = NamedTuple(zip(names, ["Sarah", "24", "170"]))

    print person1.name
    for i,name in enumerate(names):
        print name, ":", person2[i]
(Submitted to the Cookbook: recipe #303439)

Jul 18 '05 #25

Arthur

"Donn Cave" <do**@drizzle.com> wrote in message
news:1094268184.524370@yasure...

Quoth "Arthur" <aj******@optonline.com>:
| ... In a dynamic app, append and
| ordered access solves a set of problems that may or may not be reasonably
| conceptualized as related to homogeneity. So any attempt to describe
| anything about lists vs. tuples in terms of its data content always in the
| end seems unnecessarily reductionist, IMO - if that's the right word.

Say, have we been here before?

Have we? ;)

Remember, it really isn't about the
data content considered separately, rather the synthesis of structure
and data.

Yes. I slipped. Continuing to discuss the issue in terms of homogeneity and
heterogeneity (in any sense) seems unnecessarily reductionist, IMO - if
that's the right word.

That Guido conceptualizes in some hard to define way related to these
concepts may in fact explain why things are as they are.

And I guess some of the questions that lead into these discussions are
more of the "why are things as they are" kind, rather than anything related to
the practical use of lists and tuples. And in the context of the question
of "why things are as they are" it is hard to avoid discussion of
homogeneity and heterogeneity - which is really mostly an attempt to psyche
out Guido's reasoning.

I guess I don't do PEPs, because I am a humble user - more interested in
picking things up once they are, and as they are - and accomplishing what I
need to accomplish.

I certainly *don't* think the concepts of homogeneity and heterogeneity
help a twit.

A clue about performance issues around tuples vs. lists is *much* more
interesting to me - for example.

Even a 20%-er.

Art

Jul 18 '05 #26

Alex Martelli

Andrew Durdin <ad*****@gmail.com> wrote:
...

and time now return, may be a different issue. Not sure why we never
made declaring such pseudotuples as usertypes as easy as it should be, a
custom metaclass in some stdlib module shd be enough. But tuples whose
...

Such "pseudotuples" are easy enough to implement. Since I'm not all
that crash hot with metaclasses, I just made a tuple subclass (see
below). Would a metaclass implementation offer any significant
benefits over a subclass?

I think of a tuple with a given sequence of names for its fields as a
type (a subclass of tuple, sure). For example, the name->index
correspondence for all the pseudotuples-with-9-items returned by module
time is just the same one -- why would I want to carry around that
dictionary for each INSTANCE of a time-pseudotuple, rather than having
it once and for all in the type? So I'd have, say:

example_type = tuple_with_names('foo', 'bar', 'baz')
assert issubclass(example_type, tuple)

and then I could make as many instances of this subclass of tuple as
needed, with a call like either example_type(1,2,3) or
example_type(baz=3,foo=1,bar=2) [the first form would accept 3
positional arguments, the second one 3 named arguments -- they're all
needed of course, but passing them as named will often result in clearer
and more readable application-level code].

You prefer to specify the names every time you make an instance and
build and carry the needed name->index dict along with each instance.
Ah well, I guess that's OK, but I don't really see the advantage
compared to the custom-metaclass approach.
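
Later Python versions ship a factory along these lines, collections.namedtuple (added in Python 2.6). A minimal sketch of the calls described above, using that factory in place of the hypothetical tuple_with_names; note that it builds an ordinary tuple subclass rather than relying on a custom metaclass:

```python
from collections import namedtuple

# The field names are stored once, on the type, not on every instance.
example_type = namedtuple('example_type', ['foo', 'bar', 'baz'])
assert issubclass(example_type, tuple)

by_position = example_type(1, 2, 3)
by_name = example_type(baz=3, foo=1, bar=2)

print(by_position == by_name)
print(by_name.bar, by_name[1])
```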
(Submitted to the Cookbook: recipe #303439)

Great, thanks -- I should get the snapshot tomorrow (or whenever they do
start working again over in Vancouver, since I get it from
ActiveState:-) and I'll be happy to consider presenting your approach
(and a custom metaclass for contrast;-).
Alex

Jul 18 '05 #27

Donn Cave

Quoth al*****@yahoo.com (Alex Martelli):
| Donn Cave <do**@u.washington.edu> wrote:
| ...
|> On the other hand, we normally use tuples for data that
|> is meaningful only when it's intact. The (key, value)
|
| So by this argument len(t) should not work if t is a tuple...

I expect it's used relatively infrequently, and for different
reasons. "if len(info) == 5", for example - just from that
line from a relatively popular Python application, would you
guess info is a list, or a tuple?

| I've never accepted the BDFL's explanations on what tuples are for; like
| Python beginners I use them as immutable lists (to index into a
| dictionary or be set members) and curse their lack of useful methods.
|
| > pair that comes back from dict.items(), for example. Each
| > value may very well be a string, but the sequence is not
| > homogeneous in the sense we're talking about, and index()
| > is not useful.
|
| Even for a pair I sometimes like to know if 42 is the key, the value,
| or neither. index is handy for that... but not if the pair is a tuple,
| only if it's a list. Rationalize as you will, it's still a Python wart.

Maybe the problem is that tuples have too many features already.
It's sort of silly that they're indexed by number, and if that
weren't allowed, we would find fewer people trying to make lists
of them.

| Pseudotuples with NAMED (as well as indexed) arguments, as modules stat
| and time now return, may be a different issue. Not sure why we never
| made declaring such pseudotuples as usertypes as easy as it should be, a
| custom metaclass in some stdlib module shd be enough. But tuples whose
| items can't be named, just indexed or sliced, just are not a good fit
| for the kind of use case you and Guido use to justify tuple's lack of
| methods, IMHO.

There you go, they shouldn't be indexed or sliced, that's right!
Named attributes would be nice, but otherwise you use pattern
matching (to the extent supported in Python -- key, value = item.)
Makes for more readable code.
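
A minimal sketch of the unpacking style described above, contrasted with indexing. It is written for current Python 3 (the thread's own snippets use Python 2 syntax), and the dictionary contents are made up for illustration.

```python
# Hypothetical data: host names mapped to port numbers.
ports = {"www.python.org": 80, "localhost": 8080}

# Indexing treats each (key, value) pair as a small sequence.
by_index = [item[0] + ":" + str(item[1]) for item in ports.items()]

# Unpacking gives both halves of the pair a name in one step.
by_unpacking = [host + ":" + str(port) for host, port in ports.items()]

print(by_unpacking)
```

Both comprehensions build the same list; the second simply never mentions a numeric position.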

Donn Cave, do**@drizzle.com

Jul 18 '05 #28

Bengt Richter

On Sun, 05 Sep 2004 04:39:43 -0000, "Donn Cave" <do**@drizzle.com> wrote:

Quoth al*****@yahoo.com (Alex Martelli):
| Donn Cave <do**@u.washington.edu> wrote:
| ...
|> On the other hand, we normally use tuples for data that
|> is meaningful only when it's intact. The (key, value)
|
| So by this argument len(t) should not work if t is a tuple...

I expect it's used relatively infrequently, and for different
reasons. "if len(info) == 5", for example - just from that
line from a relatively popular Python application, would you
guess info is a list, or a tuple?

| I've never accepted the BDFL's explanations on what tuples are for; like
| Python beginners I use them as immutable lists (to index into a
| dictionary or be set members) and curse their lack of useful methods.
|
| > pair that comes back from dict.items(), for example. Each
| > value may very well be a string, but the sequence is not
| > homogeneous in the sense we're talking about, and index()
| > is not useful.
|
| Even for a pair I sometimes like to know if 42 is the key, the value,
| or neither. index is handy for that... but not if the pair is a tuple,
| only if it's a list. Rationalize as you will, it's still a Python wart.

Maybe the problem is that tuples have too many features already.
It's sort of silly that they're indexed by number, and if that
weren't allowed, we would find fewer people trying to make lists
of them.

| Pseudotuples with NAMED (as well as indexed) arguments, as modules stat
| and time now return, may be a different issue. Not sure why we never
| made declaring such pseudotuples as usertypes as easy as it should be, a
| custom metaclass in some stdlib module shd be enough. But tuples whose
| items can't be named, just indexed or sliced, just are not a good fit
| for the kind of use case you and Guido use to justify tuple's lack of
| methods, IMHO.

There you go, they shouldn't be indexed or sliced, that's right!
Named attributes would be nice, but otherwise you use pattern
matching (to the extent supported in Python -- key, value = item.)
Makes for more readable code.

How about just named read-only but redefinable views? E.g.,

>>> class TV(tuple):
...     """tuple view"""
...     _views = {}
...     def __getattr__(self, name):
...         try: return tuple.__getitem__(self, self.__class__._views[name])
...         except KeyError: raise AttributeError, '%s is not a tuple view.' %name
...     def __setattr__(self, name, ix):
...         self.__class__._views[name] = ix
...     def __delattr__(self, name): del self.__class__._views[name]
...
>>> t=TV(range(3,10))
>>> t
(3, 4, 5, 6, 7, 8, 9)
>>> t.a
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 6, in __getattr__
AttributeError: a is not a tuple view.
>>> t.a = 3
>>> t.a
6
>>> t.b = slice(4,6)
>>> t.b
(7, 8)
>>> del t.a
>>> t.a
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 6, in __getattr__
AttributeError: a is not a tuple view.
>>> t
(3, 4, 5, 6, 7, 8, 9)
>>> t.a = slice(None,None,-1)
>>> t.a
(9, 8, 7, 6, 5, 4, 3)

Of course, with the views in the class's _views dict, further instances share
previous definitions:
>>> t2 = TV('abcdefg')
>>> t2
('a', 'b', 'c', 'd', 'e', 'f', 'g')
>>> t2.a
('g', 'f', 'e', 'd', 'c', 'b', 'a')
>>> t2.b
('e', 'f')

You could generalize further with properties defining whatever viewing
function you want, of course.
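
One way the property-based generalization could look, as a sketch in current Python 3. The class name `Point3` and its field names are hypothetical, and the views are fixed per class rather than redefinable at run time.

```python
from operator import itemgetter


class Point3(tuple):
    """Tuple subclass exposing read-only named views of its items."""

    __slots__ = ()  # no per-instance __dict__, so it stays as light as a tuple

    x = property(itemgetter(0))
    y = property(itemgetter(1))
    z = property(itemgetter(2))
    # A view need not be a single item: any function of the tuple will do.
    reversed_view = property(lambda self: self[::-1])


p = Point3((3, 4, 5))
print(p.x, p.z, p.reversed_view)
```

Because the properties define no setter, assigning to `p.x` raises AttributeError, which keeps the views read-only.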

Regards,
Bengt Richter

Jul 18 '05 #29

Andrew Durdin

On Sat, 4 Sep 2004 15:58:08 +0200, Alex Martelli <al*****@yahoo.com> wrote:

> I think of a tuple with a given sequence of names for its fields as a
> type (a subclass of tuple, sure). For example, the name->index
> correspondence for all the pseudotuples-with-9-items returned by module
> time is just the same one -- why would I want to carry around that
> dictionary for each INSTANCE of a time-pseudotuple, rather than having
> it once and for all in the type?

<snip>

> You prefer to specify the names every time you make an instance and
> build and carry the needed name->index dict along with each instance.
> Ah well, I guess that's OK, but I don't really see the advantage
> compared to the custom-metaclass approach.

Ah. This is one reason why a non-metaclass version is not so good.
There really is no advantage to my inheritance-based implementation --
I tried to make a metaclass version but ran into some issues. However,
on my second try I succeeded -- see below.

> Great, thanks -- I should get the snapshot tomorrow (or whenever they do
> start working again over in Vancouver, since I get it from
> ActiveState:-) and I'll be happy to consider presenting your approach
> (and a custom metaclass for contrast;-).

Below is a better (=easier to use) implementation using metaclasses;
I've submitted it to the Cookbook anyway (recipe #303481) despite
being past the deadline. The NamedTuple function is for convenience
(although the example doesn't use it for the sake of explicitness).
NamedTuples accept a single argument: a sequence -- as for tuple() --
or a dictionary with (at least) the names that the NamedTuple expects.

class NamedTupleMetaclass(type):
    """Metaclass for a tuple with elements named and indexed.

    NamedTupleMetaclass instances must set the 'names' class attribute
    with a list of strings of valid identifiers, being the names for the
    elements. The elements can then be obtained by looking up the name
    or the index.
    """

    def __init__(cls, classname, bases, classdict):
        super(NamedTupleMetaclass, cls).__init__(classname, bases, classdict)

        # Must derive from tuple
        if tuple not in bases:
            raise ValueError, "'%s' must derive from tuple type." % classname

        # Create a dictionary to keep track of name->index correspondence
        cls._nameindices = dict(zip(classdict['names'],
                                    range(len(classdict['names']))))

        def instance_getattr(self, name):
            """Look up a named element."""
            try:
                return self[self.__class__._nameindices[name]]
            except KeyError:
                raise AttributeError, "object has no attribute named '%s'" % name

        cls.__getattr__ = instance_getattr

        def instance_setattr(self, name, value):
            raise TypeError, ("'%s' object has only read-only attributes "
                              "(assign to .%s)" % (self.__class__.__name__, name))

        cls.__setattr__ = instance_setattr

        def instance_new(cls, seq_or_dict):
            """Accept either a sequence of values or a dict as parameters."""
            if isinstance(seq_or_dict, dict):
                seq = []
                for name in cls.names:
                    try:
                        seq.append(seq_or_dict[name])
                    except KeyError:
                        raise KeyError, "'%s' element of '%s' not given" % (name, cls.__name__)
            else:
                seq = seq_or_dict
            return tuple.__new__(cls, seq)

        cls.__new__ = staticmethod(instance_new)


def NamedTuple(*namelist):
    """Class factory function for creating named tuples."""
    class _NamedTuple(tuple):
        __metaclass__ = NamedTupleMetaclass
        names = list(namelist)

    return _NamedTuple


# Example follows
if __name__ == "__main__":
    class PersonTuple(tuple):
        __metaclass__ = NamedTupleMetaclass
        names = ["name", "age", "height"]

    person1 = PersonTuple(["James", 26, 185])
    person2 = PersonTuple(["Sarah", 24, 170])
    person3 = PersonTuple(dict(name="Tony", age=53, height=192))

    print person1
    for i, name in enumerate(PersonTuple.names):
        print name, ":", person2[i]
    print "%s is %s years old and %s cm tall." % person3

    person3.name = "this will fail"
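
For readers on a later interpreter: Python 2.6 and onward ship `collections.namedtuple`, which covers much the same ground as the recipe. The sketch below redoes the example with it in Python 3 syntax; it uses the standard-library factory, not the metaclass from the post, and the `Person` name is chosen here only for illustration.

```python
from collections import namedtuple

# Field names live on the class, once, rather than on each instance.
Person = namedtuple("Person", ["name", "age", "height"])

person1 = Person("James", 26, 185)
person3 = Person(**dict(name="Tony", age=53, height=192))

print(person1)
print("%s is %s years old and %s cm tall." % person3)

try:
    person3.name = "this will fail"
except AttributeError as exc:
    print("read-only:", exc)
```

Unlike the recipe, a namedtuple class takes its values as separate arguments, so a dict has to be unpacked with `**`.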

Jul 18 '05 #30

Roy Smith

"Donn Cave" <do**@drizzle.com> wrote:

> I expect it's used relatively infrequently, and for different
> reasons. "if len(info) == 5", for example - just from that
> line from a relatively popular Python application, would you
> guess info is a list, or a tuple?

I'd guess it was something which had a __len__ method.

> Maybe the problem is that tuples have too many features already.
> It's sort of silly that they're indexed by number, and if that
> weren't allowed, we would find fewer people trying to make lists
> of them.

If tuples weren't indexed, the only way you'd be able to access the
elements would be to unpack them, which would be rather inconvenient.
Unless of course you had an alternate way to name the elements, but if
you're going to allow named element access, and forbid indexed access,
then you might as well just create a normal class instance.

The more I look into this, the more I realize just how inconsistent the
whole thing is.

For example, tuples can be used as dictionary keys because they are
immutable. Or so it's commonly said. But, that's not true. The real
reason they can be used as keys is because they're hashable. If you try
to use a list as a key, it doesn't complain that it's immutable, it
complains that it's unhashable:

d = {}
d[[1, 2]] = 3

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: list objects are unhashable

Furthermore, regular objects can be used as keys, even though they *are*
mutable. You can do this:

class data:
    pass

key = data ()
key.x = 1
key.y = 2

d = {}
d[key] = None

dictKey = d.keys()[0]
print dictKey.x, dictKey.y

key.x = 42
dictKey = d.keys()[0]
print dictKey.x, dictKey.y

If a mutable class instance object can be used as a dictionary key, then
I don't really see any reason a list shouldn't be usable as a key. How
is a class instance's mutability any less of a disqualifier for key-ness
than a list's mutability?
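
A small sketch of the behaviour being described, in current Python 3. The relevant difference is that a plain class instance hashes and compares by identity by default, so changing its attributes does not change its hash, whereas a list compares by content and refuses to be hashed. The class name `Data` is just an illustration.

```python
class Data:
    """Plain class: instances hash and compare by identity by default."""
    pass


key = Data()
key.x = 1
d = {key: None}

key.x = 42                # mutating the instance leaves its hash unchanged
still_found = key in d

try:
    hash([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(still_found, list_is_hashable)
```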

And, once you allow lists to be keys, then pretty much the whole raison
d'etre for tuples goes away. And if we didn't have tuples, then we
wouldn't have to worry about silly syntax warts like t = (1,) to make a
1-tuple :-)

Jul 18 '05 #31

Benjamin Niemann

Roy Smith wrote:

"Donn Cave" <do**@drizzle.com> wrote:
I expect it's used relatively infrequently, and for different
reasons. "if len(info) == 5", for example - just from that
line from a relatively popular Python application, would you
guess info is a list, or a tuple?

I'd guess it was something which had a __len__ method.

Maybe the problem is that tuples have too many features already.
It's sort of silly that they're indexed by number, and if that
weren't allowed, we would find fewer people trying to make lists
of them.

If tuples weren't indexed, the only way you'd be able to access the
elements would be to unpack them, which would be rather inconvenient.
Unless of course you had an alternate way to name the elements, but if
you're going to allow named element access, and forbid indexed access,
then you might as well just create a normal class instance.

The more I look into this, the more I realize just how inconsistent the
whole thing is.

For example, tuples can be used as dictionary keys because they are
immutable. Or so it's commonly said. But, that's not true. The real
reason they can be used as keys is because they're hashable. If you try
to use a list as a key, it doesn't complain that it's immutable, it
complains that it's unhashable:

d = {}
d[[1, 2]] = 3

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: list objects are unhashable

Furthermore, regular objects can be used as keys, even though they *are*
mutable. You can do this:

class data:
    pass

key = data ()
key.x = 1
key.y = 2

d = {}
d[key] = None

dictKey = d.keys()[0]
print dictKey.x, dictKey.y

key.x = 42
dictKey = d.keys()[0]
print dictKey.x, dictKey.y

If a mutable class instance object can be used as a dictionary key, then
I don't really see any reason a list shouldn't be usable as a key. How
is a class instance's mutability any less of a disqualifier for key-ness
than a list's mutability?

And, once you allow lists to be keys, then pretty much the whole raison
d'etre for tuples goes away. And if we didn't have tuples, then we
wouldn't have to worry about silly syntax warts like t = (1,) to make a
1-tuple :-)

A very handy feature of lists is:

a = [1, 2, 3]
b = [1, 2, 3]
if a == b:
    print "List equality is based on content"

while:

a = data()
a.x = 42
b = data()
b.x = 42
if a != b:
    print "Other objects have an identity that is independent of content"

This special behaviour of lists is implemented by implementing the
__eq__ method. Objects with non-standard __eq__ usually don't have the
expected behaviour when used as keys:

a = [1, 2, 3]
b = [1, 2, 4]
d = {}
d[a] = "first"
d[b] = "second"
a[2] = 4
b[2] = 3
print d[[1, 2, 3]]

Which result would you expect here?
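
Real lists cannot be used as keys, so the question can only be tried with a stand-in. The sketch below (current Python 3) uses a hypothetical `HashableList` subclass that hashes by content, and shows what happens once a key is mutated after insertion.

```python
class HashableList(list):
    """Hypothetical list subclass that hashes by content, like a tuple."""

    def __hash__(self):
        return hash(tuple(self))


a = HashableList([1, 2, 3])
d = {a: "first"}

a[2] = 4  # mutate the key while it sits inside the dict

# The entry was filed under the hash of (1, 2, 3) but now equals [1, 2, 4],
# so a fresh key with either content fails to find it.
found_by_old_content = HashableList([1, 2, 3]) in d
found_by_new_content = HashableList([1, 2, 4]) in d

print(found_by_old_content, found_by_new_content, len(d))
```

The entry is still in the dictionary, but it can no longer be reached by value, which is the kind of surprise the example is pointing at.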

Jul 18 '05 #32

Roy Smith

I asked:

> How is a class instance's mutability any less of a disqualifier for
> key-ness than a list's mutability?

Benjamin Niemann <pi**@odahoda.de> wrote:

> a = [1, 2, 3]
> b = [1, 2, 3]
> if a == b:
>     print "List equality is based on content"

Tuple (and string) equality is based on content too. So what? I can
give my data class an __eq__ method, and then my class instance equality
would also be based on content.

So, to restate my original question, why should my mutable,
content-based-equality class instance be a valid dictionary key, when a
list is not? Which part of a list's behavior makes it inherently
unusable as a key? I'm not asking about design philosophy, I'm asking
about observable behavior.

Jul 18 '05 #33

Benjamin Niemann

Roy Smith wrote:

> I asked:
>
> > How is a class instance's mutability any less of a disqualifier for
> > key-ness than a list's mutability?
>
> Benjamin Niemann <pi**@odahoda.de> wrote:
>
> > a = [1, 2, 3]
> > b = [1, 2, 3]
> > if a == b:
> >     print "List equality is based on content"
>
> Tuple (and string) equality is based on content too. So what?

tuples and strings are immutable.

> I can
> give my data class an __eq__ method, and then my class instance equality
> would also be based on content.
>
> So, to restate my original question, why should my mutable,
> content-based-equality class instance be a valid dictionary key, when a
> list is not? Which part of a list's behavior makes it inherently
> unusable as a key? I'm not asking about design philosophy, I'm asking
> about observable behavior.

The example I provided should have shown this: when you modify an
object that is used as a dictionary key, the dictionary is in effect
modified as well. This is usually an undesired side effect.
Python won't prevent you from doing such things with your own class that
implements __eq__. But it does not do such things for its built-in classes.

Jul 18 '05 #34

Bengt Richter

On Sun, 05 Sep 2004 10:35:43 -0400, Roy Smith <ro*@panix.com> wrote:

> I asked:
>
> > How is a class instance's mutability any less of a disqualifier for
> > key-ness than a list's mutability?
>
> Benjamin Niemann <pi**@odahoda.de> wrote:
>
> > a = [1, 2, 3]
> > b = [1, 2, 3]
> > if a == b:
> >     print "List equality is based on content"
>
> Tuple (and string) equality is based on content too. So what? I can
> give my data class an __eq__ method, and then my class instance equality
> would also be based on content.
>
> So, to restate my original question, why should my mutable,
> content-based-equality class instance be a valid dictionary key, when a
> list is not? Which part of a list's behavior makes it inherently
> unusable as a key? I'm not asking about design philosophy, I'm asking
> about observable behavior.

I don't think a list is _inherently_ unusable, but an immutable sequence
is usable in a different way because of what can be assumed.
I suspect it has something to do with optimizing lookup. For immutables,
equal id should mean equal hash and equal value. If id's are not equal,
hashes only need to be computed once for an immutable, since they can
be cached in the immutable's internal representation (trading a little
space for computation time). If hashes are equal but id's are not, you
either have duplicate tuples/immutables or a rare collision. If you are
forced to compare values in the rare-collision case, I guess you are down
to comparing vectors of pointers, and comparison would then be similar
for tuples and lists. Different lengths would be early out non-equal. Etc.

It does seem like you could allow lists as keys, but it would mean
a performance hit when using them, even if you managed to get type-dependent
dispatching in the internal logic for free. You could still cache a list hash
internally, but you would have to invalidate it on list mutation, which would
add cost to mutation. Optimization tradeoffs ripple in surprising ways, and only
pay off if they are good in real usage patterns. Not easy to get right.

As it is, you could take a big hit and subclass dict to fake it, or you could
subclass list to provide tuple-like hashing and comparing ;-)
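
A rough sketch of the second option, with the cached hash and invalidation-on-mutation mentioned above, in current Python 3. `CachedHashList` is a hypothetical name, and only two mutators are covered here; a usable version would also have to intercept extend, insert, pop, remove, sort, reverse, item deletion and in-place operators.

```python
class CachedHashList(list):
    """List subclass with tuple-like hashing and a cached hash value."""

    def __init__(self, *args):
        super().__init__(*args)
        self._hash = None  # None means "not computed or invalidated"

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(tuple(self))
        return self._hash

    def __setitem__(self, index, value):
        self._hash = None  # every mutation pays this extra bookkeeping
        super().__setitem__(index, value)

    def append(self, value):
        self._hash = None
        super().append(value)


items = CachedHashList([1, 2, 3])
first = hash(items)
items.append(4)
second = hash(items)
print(first == hash((1, 2, 3)), second == hash((1, 2, 3, 4)))
```

The sketch only makes the hash follow the content; it does nothing about a dict that already filed the object under its old hash.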

Regards,
Bengt Richter

Jul 18 '05 #35

Bryan Olson

Peter Hansen wrote:

Consider, for example, that one actually has to build the
tuple in the first place... how can you do that without
having the info in a list to begin with? (I'm sure there
are ways if one is ingenious, but I think the answers
would just go to prove the point I was making.)

x = (1, 2)
y = (3, 4)
print x + y
--
--Bryan

Jul 18 '05 #36

Bryan Olson

Donn Cave wrote:

Maybe the problem is that tuples have too many features already.
It's sort of silly that they're indexed by number, and if that
weren't allowed, we would find fewer people trying to make lists
of them.

Plus lists have the mis-feature that we can mix types.

The talk of how lists and tuples are supposed to be used
suggests that we want lists of any type, but only one type in
each list. Tuples should be indexed by statically known values.

We should treat each kind of tuple as a distinct type, that we
should not mix in a list, so should not be able to justify

[(5, 4.23), ("Hi", [])]

as a list of one type, simply because the type is 'tuple'. Since
the structure of tuples would be statically known, we can dump
the (item,) notation and just make any item the same thing as
the one-tuple holding that item.

Alas, that's ML, not Python. Were that Python's designers'
intent, why isn't it part of Python's design? Why would we want
to live within the confines of static typing, but without the
safety and efficiency advantages of a type-checking compiler?

In Python, tuples are immutable, hashable lists. Deal with it.
--
--Bryan

Jul 18 '05 #37

Alex Martelli

Andrew Durdin <ad*****@gmail.com> wrote:
...

Below is a better (=easier to use) implementation using metaclasses;
I've submitted it to the Cookbook anyway (recipe #303481) despite
being past the deadline. The NamedTuple function is for convenience
(although the example doesn't use it for the sake of explicitness).

Actually I haven't received a snapshot from ActiveState yet, so I
suspect anything posted to them until they open for business on Monday
(assuming Canada doesn't rest on Labor Day) should get in, anyway.
So, thanks!

Remember that a comment on an existing recipe is as good as a whole new
recipe from my POV (better, if it means I don't have to work hard to
merge multiple recipes into one...;-) -- anybody whose material we use
in the printed cookbook gets credited as an author, whether the material
came as a recipe or as a comment!-)
Alex

Jul 18 '05 #38

Alex Martelli

Donn Cave <do**@drizzle.com> wrote:

Quoth al*****@yahoo.com (Alex Martelli):
| Donn Cave <do**@u.washington.edu> wrote:
| ...
|> On the other hand, we normally use tuples for data that
|> is meaningful only when it's intact. The (key, value)
|
| So by this argument len(t) should not work if t is a tuple...

I expect it's used relatively infrequently, and for different
reasons. "if len(info) == 5", for example - just from that
line from a relatively popular Python application, would you
guess info is a list, or a tuple?

No idea -- could just as well be a dict, an array.array, whatever.
That's the very point and the beauty of polymorphism - I don't CARE what
kind of container 'info' is, I know by this snippet that we're testing
if it has exactly 5 items, or not.

| Even for a pair I sometimes like to know if 42 is the key, the value,
| or neither. index is handy for that... but not if the pair is a tuple,
| only if it's a list. Rationalize as you will, it's still a Python wart.

Maybe the problem is that tuples have too many features already.
It's sort of silly that they're indexed by number, and if that
weren't allowed, we would find fewer people trying to make lists
of them.

Hmmm, how would you access a specific item (since tuples currently don't
support item access by name) if not by indexing?

| Pseudotuples with NAMED (as well as indexed) arguments, as modules stat
| and time now return, may be a different issue. Not sure why we never
| made declaring such pseudotuples as usertypes as easy as it should be, a
| custom metaclass in some stdlib module shd be enough. But tuples whose
| items can't be named, just indexed or sliced, just are not a good fit
| for the kind of use case you and Guido use to justify tuple's lack of
| methods, IMHO.

There you go, they shouldn't be indexed or sliced, that's right!
Named attributes would be nice, but otherwise you use pattern
matching (to the extent supported in Python -- key, value = item.)
Makes for more readable code.

Not for sufficiently long tuples. Take the 9-item tuples that the time
module used to use before they grew names: having to unpack the tuple to
access a single item of it would be exceedingly tedious and heavily
boilerplatey to boot.

Besides, I _do_ need to have immutable sequences that are suitable as
dict keys. Today, x=tuple(mylist) performs that role admirably. So,
say I have a dict d, indexed by such tuples -- of different lengths --
and I want all the keys into d that have 23 as their first item. Today
this is a trivial task -- but if I couldn't index tuples I WOULD have a
problem... basically I would need "frozen lists" (and "frozen dicts"
where I today use tuple(d.iteritems())...). Today, tuples serve all of
these roles -- poor man's structs (lacking names), immutable 'frozen'
lists for dict-keying roles, etc. We need at least two new builtin
types to take their place if we want to remove tuple indexing...
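
A sketch of the two uses mentioned above, in current Python 3 (where `d.iteritems()` is spelled `d.items()`). The keys and values are invented for illustration.

```python
mylist = [23, "a", "b"]

# Tuples of different lengths used as dict keys.
d = {
    tuple(mylist): "first",
    (23,): "second",
    (7, 23): "third",
}

# Indexing is what makes "all keys whose first item is 23" trivial.
keys_starting_with_23 = [k for k in d if k and k[0] == 23]

# A dict "frozen" into a tuple of pairs so that it can itself be a key.
settings = {"host": "localhost", "port": 8080}
frozen = tuple(settings.items())
cache = {frozen: "cached result"}

print(keys_starting_with_23, cache[frozen])
```

Note that the frozen form depends on item order, so two equal dicts built in a different order would need sorting first to produce the same key.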
Alex

Jul 18 '05 #39

Alex Martelli

Roy Smith <ro*@panix.com> wrote:

> I asked:
>
> > How is a class instance's mutability any less of a disqualifier for
> > key-ness than a list's mutability?
>
> Benjamin Niemann <pi**@odahoda.de> wrote:
>
> > a = [1, 2, 3]
> > b = [1, 2, 3]
> > if a == b:
> >     print "List equality is based on content"
>
> Tuple (and string) equality is based on content too. So what? I can
> give my data class an __eq__ method, and then my class instance equality
> would also be based on content.

And your class's instances wouldn't then be hashable any more unless
they defined a __hash__ method -- have you tried?
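
In current Python 3 this is enforced directly: a class that defines __eq__ without __hash__ has its __hash__ set to None. A minimal sketch, with hypothetical class names:

```python
class EqOnly:
    """Content-based equality, no __hash__ defined."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.x == other.x


class EqAndHash(EqOnly):
    """Same equality, plus a hash consistent with it."""

    def __hash__(self):
        return hash(self.x)


try:
    {EqOnly(1): "value"}
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False

d = {EqAndHash(1): "value"}
found = d[EqAndHash(1)]  # a different but equal instance finds the entry
print(eq_only_hashable, found)
```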

> So, to restate my original question, why should my mutable,
> content-based-equality class instance be a valid dictionary key, when a
> list is not? Which part of a list's behavior makes it inherently
> unusable as a key? I'm not asking about design philosophy, I'm asking
> about observable behavior.

This was discussed in detail in another thread about 2 days ago, I
believe. That thread started by somebody asking why modules were
hashable (could be keys in a dictionary).
Alex

Jul 18 '05 #40

Paul Rubin

"Donn Cave" <do**@drizzle.com> writes:

I expect it's used relatively infrequently, and for different
reasons. "if len(info) == 5", for example - just from that
line from a relatively popular Python application, would you
guess info is a list, or a tuple?

if I say:

def f(*args):
    print len(args)

I'd certainly expect to receive args as a tuple, and would still want
to be able to find its length.

Jul 18 '05 #41

greg

Alex Martelli wrote:

Roy Smith <ro*@panix.com> wrote:
Tuple (and string) equality is based on content too. So what? I can
give my data class an __eq__ method, and then my class instance equality
would also be based on content.

And your class's instances wouldn't then be hashable any more unless
they defined a __hash__ method -- have you tried?

And even if you did give it a __hash__ method, you wouldn't
be able to get it to work properly as a dict key. It's an
inescapable fact of the way dicts work.

So, to restate my original question, why should my mutable,
content-based-equality class instance be a valid dictionary key,
when a list is not?

It wouldn't be a valid dictionary key. You might be able to fool
Python into accepting it, but it would malfunction.

Greg

Jul 18 '05 #42

Donn Cave

Quoth al*****@yahoo.com (Alex Martelli):
| Donn Cave <do**@drizzle.com> wrote:
|> Quoth al*****@yahoo.com (Alex Martelli):
....
|> | So by this argument len(t) should not work if t is a tuple...
|>
|> I expect it's used relatively infrequently, and for different
|> reasons. "if len(info) == 5", for example - just from that
|> line from a relatively popular Python application, would you
|> guess info is a list, or a tuple?
|
| No idea -- could just as well be a dict, an array.array, whatever.
| That's the very point and the beauty of polymorphism - I don't CARE what
| kind of container 'info' is, I know by this snippet that we're testing
| if it has exactly 5 items, or not.

I can't tell if you're just being playfully obtuse, or you really
don't recognize the pattern. For me, usage like this is a fairly
familiar sight with tuples, where it's part of a kind of poor man's
user defined data type - we get this tuple, and if it's 5 items it
would be last version's info, 6 items is the present version. This
use for len() is totally about the intact tuple and has nothing to
do with sequential access.

|> There you go, they shouldn't be indexed or sliced, that's right!
|> Named attributes would be nice, but otherwise you use pattern
|> matching (to the extent supported in Python -- key, value = item.)
|> Makes for more readable code.
|
| Not for sufficiently long tuples. Take the 9-item tuples that the time
| module used to use before they grew names: having to unpack the tuple to
| access a single item of it would be exceedingly tedious and heavily
| boilerplatey to boot.
|
| Besides, I _do_ need to have immutable sequences that are suitable as
| dict keys. Today, x=tuple(mylist) performs that role admirably. So,
| say I have a dict d, indexed by such tuples -- of different lengths --
| and I want all the keys into d that have 23 as their first item. Today
| this is a trivial task -- but if I couldn't index tuples I WOULD have a
| problem... basically I would need "frozen lists" (and "frozen dicts"
| where I today use tuple(d.iteritems())...). Today, tuples serve all of
| these roles -- poor man's structs (lacking names), immutable 'frozen'
| lists for dict-keying roles, etc. We need at least two new builtin
| types to take their place if we want to remove tuple indexing...

Well, it's not like I'm really proposing to take away the ersatz
list properties that tuples have. It's just that if they were gone,
I think you'd have roughly the tuple that languages of the Lisp family
have had for so long, and apparently been so happy with while discontent
has festered in the Python world. (It has been a long while since I
wrote my last Lisp program, but I'm assuming that's where all the FP
languages got it from.)

Donn

Jul 18 '05 #43

Donn Cave

Quoth Bryan Olson <fa*********@nowhere.org>:
....
| Alas, that's ML, not Python. Were that Python's designers'
| intent, why isn't it part of Python's design? Why would we want
| to live within the confines of static typing, but without the
| safety and efficiency advantages of a type-checking compiler?

That is not what is homogeneous about a list. That would indeed
be an absurd contradiction, so it should be easy to convince you
that it isn't how anyone proposes you should use a list. So, what
do they mean?

A homogeneous sequence, in the sense that makes sense in Python,
is one where any slice has the same functional meaning to the
application. Of course the data is different and that can have
fundamental consequences, but it's different than (key, value)
for example where a[:] is the only slice that preserves its
meaning.

Whether it was a well chosen word for it or not, this notion
of how lists are designed to be used, as opposed to tuples, is
evidently why there is no index() function. That's all.
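
For what it is worth, tuples did later gain both methods: `tuple.index` and `tuple.count` exist from Python 2.6 on. On an interpreter without them, a small helper does the job for any sequence. The sketch below is in Python 3 syntax, and `index_of` is a hypothetical name.

```python
def index_of(seq, value):
    """Return the position of the first item equal to value."""
    for i, item in enumerate(seq):
        if item == value:
            return i
    raise ValueError("%r is not in sequence" % (value,))


pair = ("answer", 42)

position = index_of(pair, 42)
builtin_position = pair.index(42)  # available on tuples since Python 2.6
occurrences = pair.count(42)

print(position, builtin_position, occurrences)
```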

Donn Cave, do**@drizzle.com

Jul 18 '05 #44

Bryan Olson

Donn Cave wrote:

> Quoth Bryan Olson:
> ...
> | Alas, that's ML, not Python. Were that Python's designers'
> | intent, why isn't it part of Python's design? Why would we want
> | to live within the confines of static typing, but without the
> | safety and efficiency advantages of a type-checking compiler?
>
> That is not what is homogeneous about a list. That would indeed
> be an absurd contradiction, so it should be easy to convince you
> that it isn't how anyone proposes you should use a list. So, what
> do they mean?

I can't tell to what you are responding.

> A homogeneous sequence, in the sense that makes sense in Python,
> is one where any slice has the same functional meaning to the
> application. Of course the data is different and that can have
> fundamental consequences, but it's different than (key, value)
> for example where a[:] is the only slice that preserves its
> meaning.

That's pretty much what lists in the ML family offer, with
static but polymorphic typing. Lisp uses lists for both
homogeneous and heterogeneous sequences. Python allows lists or
tuples to be treated either way. The rule to use them based on
the homogeneous/heterogeneous distinctions strikes me more as
programming language naivete than Python expertise.
--
--Bryan

Jul 18 '05 #45

Alex Martelli

greg <gr**@cosc.canterbury.ac.nz> wrote:

Alex Martelli wrote:
Roy Smith <ro*@panix.com> wrote:
Tuple (and string) equality is based on content too. So what? I can
give my data class an __eq__ method, and then my class instance equality
would also be based on content.

And your class's instances wouldn't then be hashable any more unless
they defined a __hash__ method -- have you tried?

And even if you did give it a __hash__ method, you wouldn't
be able to get it to work properly as a dict key. It's an
inescapable fact of the way dicts work.

Well, it depends on how you define 'properly'. Semantics CAN be
respected, it's _performance_ that may disappoint;-).

So, to restate my original question, why should my mutable,
content-based-equality class instance be a valid dictionary key,
when a list is not?

It wouldn't be a valid dictionary key. You might be able to fool
Python into accepting it, but it would malfunction.

It might be made to function, just VERY slowly:

def __hash__(self): return 42

Containing this will make ANY class hashable no matter what. But you'd
better buy a new and very fast machine if you're gonna use multiple
instances of such a class to key into a dictionary...;-)
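
A sketch of what "semantics respected, performance disappointing" could look like, in current Python 3. `Box` is a hypothetical mutable class with content-based equality and the constant hash from above.

```python
class Box:
    """Mutable class with content-based equality and a constant hash."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, Box) and self.items == other.items

    def __hash__(self):
        return 42  # every instance collides with every other one


k = Box([1, 2])
d = {k: "value"}
k.items.append(3)  # mutate the key after insertion

# The hash never changes, so lookup still reaches the entry and then
# decides purely by the current content.
found_by_new_content = d.get(Box([1, 2, 3]))
found_by_old_content = d.get(Box([1, 2]))

print(found_by_new_content, found_by_old_content)
```

With every key landing in the same collision chain, each lookup degrades to a linear scan with an equality test per entry, which is where the very fast machine comes in.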
Alex

Jul 18 '05 #46

Peter Hansen

Bryan Olson wrote:

Peter Hansen wrote:
> Consider, for example, that one actually has to build the
> tuple in the first place... how can you do that without
> having the info in a list to begin with? (I'm sure there
> are ways if one is ingenious, but I think the answers
> would just go to prove the point I was making.)

x = (1, 2)
y = (3, 4)
print x + y

In answer to this and to Alex' response, I should say that
in the context of whether using a tuple for the final
storage saves memory or not, neither response means much.

I said "without having the info in a list" but what I
meant to say was "without having the information already
stored elsewhere before it is put in a tuple". The
above just has two tuples which add up in memory usage
to the same amount as the tuple you end up with, meaning
that just prior to the deletion of the two temporary
tuples (if that even happens) you are using *twice*
the memory you need to use. Clearly that doesn't
help you much if memory is scarce.

Alex shows use of a generator... fine, but how do you
build the tuple without storing up the results of the
generator first somewhere else? You can't preallocate the
space for the tuple if you don't know how long it will
be, but you have to preallocate the space for a tuple
(I believe, in the interpreter anyway, if not at the
programmer level) so you must therefore be storing the
entire output of the generator somewhere just prior
to the tuple creation: same problem as above.

I know Alex knows all this (or has some additional
info that I don't have and which he'll shortly provide),
so I can only assume he was reacting only to my poor
choice of wording with 'list' and/or was ignoring the
context of the discussion (memory usage).

-Peter

Jul 18 '05 #47

Alex Martelli

Peter Hansen <pe***@engcorp.com> wrote:
...

> Alex shows use of a generator... fine, but how do you
> build the tuple without storing up the results of the
> generator first somewhere else? You can't preallocate the

You, programming in Python, can't, but the interpreter, internally, can
-- because function _PyTuple_Resize (not being directly callable from
Python) _CAN_ resize a tuple more efficiently than you imply:

> space for the tuple if you don't know how long it will
> be, but you have to preallocate the space for a tuple
> (I believe, in the interpreter anyway, if not at the
> programmer level) so you must therefore be storing the
> entire output of the generator somewhere just prior
> to the tuple creation: same problem as above.

No problem at all -- if your available memory is in one chunk, the
resize can work in place. (If your available memory is fragmented you
can of course be hosed, since Python can't move allocated blocks and
thus can't compact things up to cure your fragmentation; but even when
memory's tight it's not unusual for it to not be fragmented).

> I know Alex knows all this (or has some additional
> info that I don't have and which he'll shortly provide),

You don't have the sources of the Python interpreter? They're freely
available for download, why would I need to provide them?! Anyway, just
follow, with your C debugger or whatever (I think mere code inspection
will be fine), what happens when you call x = tuple(someiterator()).

Moreover, if the 'someiterator()' iterator can return a reasonable
__len__, I believe Python 2.4 should now be able to use that to estimate
the needed length in advance for greater efficiency, but I haven't
looked deeply into that, yet... indeed, some simple timeit.py tests
suggest to me that 2.4 alpha 3 only implements that optimization to
preallocate lists, not tuples. But even if so, that's clearly a matter
of mere expediency, not one of any intrinsic problem as you make it
sound.
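
For readers on a modern interpreter, here is a minimal sketch, assuming Python 3.4 or later, where operator.length_hint exists (it postdates this thread). The squares() generator is a made-up example. It shows that tuple() consumes an iterator directly, and that some iterators advertise an expected length while plain generators do not. Whether tuple() itself consults such a hint is a CPython implementation detail that may differ between versions.

```python
import operator


def squares(n):
    """Hypothetical generator: yields n values one at a time."""
    for i in range(n):
        yield i * i


# tuple() consumes the iterator directly; no list is built in Python code.
t = tuple(squares(5))

# A range iterator can report how many items remain ...
hint_for_range = operator.length_hint(iter(range(10)))

# ... while a generator offers no hint, so the default of 0 is returned.
hint_for_generator = operator.length_hint(squares(10))

print(t, hint_for_range, hint_for_generator)
```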

> so I can only assume he was reacting only to my poor
> choice of wording with 'list' and/or was ignoring the
> context of the discussion (memory usage).

Claiming that you have to have all info in memory before a tuple can be
built is simply wrong -- your previous claim that the info had to be in
a list was even "wronger", sure, but that doesn't make your current
weaker claims correct in the least.
Alex

Jul 18 '05 #48

Peter Hansen

Alex Martelli wrote:

> Claiming that you have to have all info in memory before a tuple can be
> built is simply wrong -- your previous claim that the info had to be in
> a list was even "wronger", sure, but that doesn't make your current
> weaker claims correct in the least.

So, back in the original context here... would you agree that
use of a tuple is "quite realistic" (i.e. a technique that will
save significant amounts of memory) when "there is a huge amount
of static data that you need to access quickly"?

(Quotations from the OP of this part of the thread.)

-Peter

Jul 18 '05 #49

Mel Wilson

In article <ro***********************@reader1.panix.com>,
Roy Smith <ro*@panix.com> wrote:

> So, to restate my original question, why should my mutable,
> content-based-equality class instance be a valid dictionary key, when a
> list is not? Which part of a list's behavior makes it inherently
> unusable as a key? I'm not asking about design philosophy, I'm asking
> about observable behavior.

a = [1,2,3,4,5]
D = {a:'First five natural numbers'}

The hash of the list should be based on value, so that after

b = [1,2,3,4,5]
c = D[b]

c would be 'First five natural numbers', even though b is
not the same object as a.

Now suppose

a.append(6)

What then happens on

e = D[b]

b no longer equals a, so the existing dictionary item can't
match b as a dictionary key. Moreover, the relevant
dictionary item is filed under hash([1,2,3,4,5]) and not
under hash([1,2,3,4,5,6]), so trying to access D with a key
equal to the appended-to value of a won't hash to the right
place to find a's item. Trouble.

If you roll your own hashable class, it's assumed you have
thought about these issues.
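
The scenario above can be tried out with a minimal sketch, assuming Python 3. HashableList is a hypothetical list subclass with a value-based hash, introduced only for illustration; a plain list would be rejected as a key outright.

```python
class HashableList(list):
    """Hypothetical list subclass hashed by its current contents."""

    def __hash__(self):
        return hash(tuple(self))


a = HashableList([1, 2, 3, 4, 5])
D = {a: 'First five natural numbers'}

b = HashableList([1, 2, 3, 4, 5])
found_before = D.get(b)  # b equals a and hashes alike, so the item is found

a.append(6)  # mutate the key while it is stored in D

# b hashes to where the item was filed, but b no longer equals a.
found_with_old_value = D.get(b)

# c equals the mutated a, but the item was filed under the old hash.
c = HashableList([1, 2, 3, 4, 5, 6])
found_with_new_value = D.get(c)

print(found_before, found_with_old_value, found_with_new_value, len(D))
```

The item is still in the dictionary, yet neither the old value nor the new value retrieves it.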

Regards. Mel.

Jul 18 '05 #50
