delete duplicates in list

hello,
this must have come up before, so i am already sorry for asking but a
quick googling did not give me any answer.

i have a list from which i want a simpler list without the duplicates
an easy but somehow contrived solution would be
>>> a = [1, 2, 2, 3]
>>> d = {}.fromkeys(a)
>>> b = d.keys()
>>> print b
[1, 2, 3]

there should be an easier or more intuitive solution, maybe with a list
comprehension?

something like
>>> b = [x for x in a if x not in b]
>>> print b
[]

does not work though.

thanks for any help
chris
Jul 18 '05 #1
Hi,
> this must have come up before, so i am already sorry for asking but a
> quick googling did not give me any answer.
>
> i have a list from which i want a simpler list without the duplicates
> an easy but somehow contrived solution would be


In python 2.3, this should work:

import sets
l = [1,2,2,3]
s = sets.Set(l)

A set is a container for which the invariant that no object is in it twice
holds. So it suits your needs.
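
For readers on a current interpreter, a minimal sketch of the same idea, assuming Python 3, where the sets module no longer exists and set is a builtin. The variable names are only illustrative, and the sorted() call is there because a set promises nothing about order:

```python
# Assumption: Python 3, where the builtin set replaces sets.Set.
items = [1, 2, 2, 3]

unique = set(items)          # duplicates collapse, order is not guaranteed
as_list = sorted(unique)     # back to a list, with a defined order

print(as_list)
```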

Regards,

Diez
Jul 18 '05 #2
> >>> a = [1, 2, 2, 3]
> >>> d = {}.fromkeys(a)
> >>> b = d.keys()
> >>> print b
> [1, 2, 3]


That, or using a Set (python 2.3+). Actually I seem to recall that
"The Python CookBook" still advises building a dict as the fastest
solution - if your elements can be hashed. See the details at

http://aspn.activestate.com/ASPN/Coo...n/Recipe/52560

Cheers,

Bernard.

Jul 18 '05 #3
christof hoeke wrote:
...
> i have a list from which i want a simpler list without the duplicates

Canonical is:

import sets
simplerlist = list(sets.Set(thelist))

if you're all right with destroying order, as your example solution suggests.
But dict.fromkeys(a).keys() is probably faster. Your assertion:

> there should be an easier or more intuitive solution, maybe with a list
> comprehension?

doesn't seem self-evident to me. A list-comprehension might be, e.g:

[ x for i, x in enumerate(a) if i==a.index(x) ]

and it does have the advantages of (a) keeping order AND (b) not
requiring hashable (nor even inequality-comparable!) elements -- BUT
it has the non-indifferent cost of being O(N*N) while the others
are about O(N). If you really want something similar to your approach:

> >>> b = [x for x in a if x not in b]

you'll have, o horrors!-), to do a loop, so name b is always bound to
"the result list so far" (in the LC, name b is only bound at the end):

b = []
for x in a:
    if x not in b:
        b.append(x)

However, this is O(N*N) too. In terms of "easier or more intuitive",
I suspect only this latter solution might qualify.
Alex
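
A small sketch of the two order-preserving variants from this post, written as functions for Python 3 and run on unhashable elements (lists), which neither the Set nor the dict approach could handle. The function names and the sample data are made up for illustration; both functions stay O(N*N):

```python
def dedupe_by_index(items):
    # Keep an element only at the position of its first occurrence.
    return [x for i, x in enumerate(items) if i == items.index(x)]


def dedupe_by_loop(items):
    # Build the result step by step so "x not in result" sees the list so far.
    result = []
    for x in items:
        if x not in result:
            result.append(x)
    return result


# Lists are unhashable, so they could not be dict keys or set members.
rows = [["b", 2], ["c", 3], ["b", 2], ["a", 1]]
by_index = dedupe_by_index(rows)
by_loop = dedupe_by_loop(rows)
print(by_index)
print(by_loop)
```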

Jul 18 '05 #4
Bernard Delmée wrote:
> >>> a = [1, 2, 2, 3]
> >>> d = {}.fromkeys(a)
> >>> b = d.keys()
> >>> print b
> [1, 2, 3]
>
> That, or using a Set (python 2.3+). Actually I seem to recall that
> "The Python CookBook" still advises building a dict as the fastest
> solution - if your elements can be hashed. See the details at
>
> http://aspn.activestate.com/ASPN/Coo...n/Recipe/52560


Yep, but a Set is roughly as fast as a dict -- it has a small
penalty wrt a dict, but, emphasis on small. Still, if you're
trying to squeeze every last cycle out of a bottleneck, it's
worth measuring, and perhaps moving from Set to dict.
Alex

Jul 18 '05 #5
Diez B. Roggisch wrote:

> > this must have come up before, so i am already sorry for asking but a
> > quick googling did not give me any answer.
> >
> > i have a list from which i want a simpler list without the duplicates
> > an easy but somehow contrived solution would be
>
> In python 2.3, this should work:
>
> import sets
> l = [1,2,2,3]
> s = sets.Set(l)
>
> A set is a container for which the invariant that no object is in it twice
> holds. So it suits your needs.

...except it's not a LIST, which was part of the specifications given
by the original poster. It may, of course, be that you've read his
mind correctly and that, despite his words, he doesn't really care
whether he gets a list or a very different container:-).
Alex

Jul 18 '05 #6
Hi,
> ...except it's not a LIST, which was part of the specifications given
> by the original poster. It may, of course, be that you've read his
> mind correctly and that, despite his words, he doesn't really care
> whether he gets a list or a very different container:-).

You're right - mathematically. However, there is no such thing as a set in
computers - you always end up with some sort of list :)

So - he'll have a list anyway. But whether it respects the order of the list
parameter... <just checking, standby>

... nope:

>>> l = [1,2,3,2]
>>> import sets
>>> sets.Set(l)
Set([1, 2, 3])
>>> l = [2,1,2,3,2]
>>> sets.Set(l)
Set([1, 2, 3])

Which is of course reasonable, as the check for existence in the set might
be performed in O(ln n) instead of O(n).

Regards,

Diez
Jul 18 '05 #7
"Diez B. Roggisch" <de************@web.de> wrote in message news:<bn*************@news.t-online.com>...
> Hi,
> > ...except it's not a LIST, which was part of the specifications given
> > by the original poster. It may, of course, be that you've read his
> > mind correctly and that, despite his words, he doesn't really care
> > whether he gets a list or a very different container:-).
>
> You're right - mathematically. However, there is no such thing as a set in
> computers - you always end up with some sort of list :)

Those are just implementation details. There could be a group of monkeys
emulating Python under the hood and their implementation of a set
would be a neural network instead of any kind of sequence, but you
still wouldn't care as a programmer. The only thing that matters is
whether the interface stays the same. And it doesn't: the items in a set
are no longer accessible by their index (among other differences).

> So - he'll have a list anyway.

So I can't agree with this. You don't know if his Python virtual machine
is a group of monkeys. Python is supposed to be a high level language.

> But whether it respects the order of the list
> parameter... <just checking, standby>
>
> ... nope:
>
> >>> l = [1,2,3,2]
> >>> import sets
> >>> sets.Set(l)
> Set([1, 2, 3])
> >>> l = [2,1,2,3,2]
> >>> sets.Set(l)
> Set([1, 2, 3])
>
> Which is of course reasonable, as the check for existence in the set might
> be performed in O(ln n) instead of O(n).

Actually the Set in sets module has an average lookup of O(1), worst
case O(n) (not 100% sure of worst case, but 99% sure). It's been
implemented with dictionaries, which in turn are hash tables.
Jul 18 '05 #8
Alex Martelli <al***@aleax.it> wrote in
news:L7********************@news1.tin.it:
> Your assertion:
> > there should be an easier or more intuitive solution, maybe with a list
> > comprehension?
>
> doesn't seem self-evident to me. A list-comprehension might be, e.g:
>
> [ x for i, x in enumerate(a) if i==a.index(x) ]
>
> and it does have the advantages of (a) keeping order AND (b) not
> requiring hashable (nor even inequality-comparable!) elements -- BUT
> it has the non-indifferent cost of being O(N*N) while the others
> are about O(N). If you really want something similar to your approach:
> > >>> b = [x for x in a if x not in b]
> you'll have, o horrors!-), to do a loop, so name b is always bound to
> "the result list so far" (in the LC, name b is only bound at the end):
>
> b = []
> for x in a:
>     if x not in b:
>         b.append(x)
>
> However, this is O(N*N) too. In terms of "easier or more intuitive",
> I suspect only this latter solution might qualify.

I dunno about more intuitive, but here's a fairly simple list comprehension
solution which is O(N) and preserves the order. Of course it's back to
requiring hashable elements again:

>>> startList = [5,1,2,1,3,4,2,5,3,4]
>>> d = {}
>>> [ d.setdefault(x,x) for x in startList if x not in d ]
[5, 1, 2, 3, 4]

And for the 'I must do it on one line' freaks, here's the single expression
variant of the above: :^)

>>> [ d.setdefault(x,x) for d in [{}] for x in startList if x not in d ]
[5, 1, 2, 3, 4]
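
The same O(N), order-preserving idea can be sketched without side effects inside a comprehension. This assumes Python 3.7 or later, where dicts are guaranteed to keep insertion order; unique_in_order is a hypothetical helper name, not something from the standard library:

```python
def unique_in_order(items):
    # Remember what has been seen in a set (O(1) average membership test)
    # and keep only the first occurrence of each element.
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


start_list = [5, 1, 2, 1, 3, 4, 2, 5, 3, 4]

via_helper = unique_in_order(start_list)
# Assumption: Python 3.7+, where dict keys keep their insertion order.
via_dict = list(dict.fromkeys(start_list))

print(via_helper)
print(via_dict)
```

Like the setdefault version, both variants still require hashable elements.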

--
Duncan Booth du****@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
Jul 18 '05 #9
Hannu Kankaanpää wrote:
...
> > So - he'll have a list anyway.
>
> So I can't agree with this. You don't know if his Python virtual machine
> is a group of monkeys. Python is supposed to be a high level language.

So, in that case those monkeys would surely be perched up on tall trees.

> > Which is of course reasonable, as the check for existence in the set
> > might be performed in O(ln n) instead of O(n).
>
> Actually the Set in sets module has an average lookup of O(1), worst
> case O(n) (not 100% sure of worst case, but 99% sure). It's been

Hmmm -- could you give an example of that worstcase...? a _full_
hashtable would give such behavior, but Python's dicts always ensure
the underlying hashtables aren't too full...

> implemented with dictionaries, which in turn are hash tables.

Yep - check it out...:

[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(100))' '7 in x'
100000 loops, best of 3: 2.3 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(1000))' '7 in x'
100000 loops, best of 3: 2.3 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(10000))' '7 in x'
100000 loops, best of 3: 2.2 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(100000))' '7 in x'
100000 loops, best of 3: 2.2 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(1000000))' '7 in x'
100000 loops, best of 3: 2.3 usec per loop

see the pattern...?-)
Same when something is NOT there, BTW:

[alex@lancelot bo]$ timeit.py -c -s'import sets' -s'x=sets.Set(xrange(1000000))' 'None in x'
100000 loops, best of 3: 2.3 usec per loop

etc, etc.
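
A rough way to repeat this measurement on a current interpreter, assuming Python 3 with the builtin set and the timeit module used from code rather than from the shell. lookup_time is a hypothetical helper, and the absolute numbers will differ by machine; the point is only that they should stay roughly flat as the set grows:

```python
import timeit


def lookup_time(size, probe, repeats=3, number=20_000):
    # Build a set of the given size, then time "probe in data".
    data = set(range(size))
    best = min(timeit.repeat(lambda: probe in data, repeat=repeats, number=number))
    return best / number  # seconds per single membership test


timings = {size: lookup_time(size, 7) for size in (100, 10_000, 1_000_000)}
missing = lookup_time(1_000_000, None)  # same test for an absent element

for size, seconds in timings.items():
    print(f"{size:>9} elements: {seconds * 1e6:.3f} usec per lookup")
print(f"absent element:    {missing * 1e6:.3f} usec per lookup")
```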
Jul 18 '05 #10
Alex Martelli <al***@aleax.it> writes:
> > Actually the Set in sets module has an average lookup of O(1), worst
> > case O(n) (not 100% sure of worst case, but 99% sure). It's been
>
> Hmmm -- could you give an example of that worstcase...? a _full_
> hashtable would give such behavior, but Python's dicts always ensure
> the underlying hashtables aren't too full...

Try inserting a bunch of instances of

class C:
    def __hash__(self): return 0

into a dictionary.
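
One way to see the effect without relying on wall-clock time is to count equality comparisons instead. This is a sketch for Python 3 (CPython assumed); ConstantHash and comparisons_to_insert are made-up names, and the class defines __eq__ only so the comparisons can be counted:

```python
class ConstantHash:
    """Hypothetical key type: every instance lands in the same hash bucket."""

    eq_calls = 0

    def __hash__(self):
        return 0

    def __eq__(self, other):
        # Count how often the dict has to fall back to comparing keys.
        ConstantHash.eq_calls += 1
        return self is other


def comparisons_to_insert(n):
    # Insert n distinct instances and report how many __eq__ calls it took.
    ConstantHash.eq_calls = 0
    table = {}
    for _ in range(n):
        table[ConstantHash()] = None
    return ConstantHash.eq_calls


for n in (100, 200, 400):
    print(n, comparisons_to_insert(n))
```

With a well-spread hash the count stays near zero; here each insert has to be compared against the keys already stored, so doubling n should roughly quadruple the work.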

Cheers,
mwh

--
Whaaat? That is the most retarded thing I have seen since, oh,
yesterday -- Kaz Kylheku, comp.lang.lisp
Jul 18 '05 #11
Michael Hudson <mw*@python.net> writes:
> Alex Martelli <al***@aleax.it> writes:
> > > Actually the Set in sets module has an average lookup of O(1), worst
> > > case O(n) (not 100% sure of worst case, but 99% sure). It's been
> >
> > Hmmm -- could you give an example of that worstcase...? a _full_
> > hashtable would give such behavior, but Python's dicts always ensure
> > the underlying hashtables aren't too full...
>
> Try inserting a bunch of instances of
>
> class C:
>     def __hash__(self): return 0
>
> into a dictionary.

I've thought about using something like this in production code
to be able to store mutable instances in a dict.
Performance problems aside (since there are only a couple of key/value
pairs in the dict), is it such a bad idea?

Thomas
Jul 18 '05 #12
Thomas Heller <th*****@python.net> writes:
> Michael Hudson <mw*@python.net> writes:
> > Alex Martelli <al***@aleax.it> writes:
> > > > Actually the Set in sets module has an average lookup of O(1), worst
> > > > case O(n) (not 100% sure of worst case, but 99% sure). It's been
> > >
> > > Hmmm -- could you give an example of that worstcase...? a _full_
> > > hashtable would give such behavior, but Python's dicts always ensure
> > > the underlying hashtables aren't too full...
> >
> > Try inserting a bunch of instances of
> >
> > class C:
> >     def __hash__(self): return 0
> >
> > into a dictionary.
>
> I've thought about using something like this in production code
> to be able to store mutable instances in a dict.

Eek!

> Performance problems aside (since there are only a couple of key/value
> pairs in the dict), is it such a bad idea?

Um, I don't know actually. It pretty much defeats the point of using
a hashtable. If your class has some non-mutable parts it'd be an
idea to hash them, of course.

And the performance problems are severe -- inserting just a thousand
instances of the class C() into a dictionary takes noticeable time.

(What *is* a bad idea is giving such classes an __eq__ method that
mutates the dict being inserted into -- I found a bunch of such bugs a
couple years back and I think it's still possible to core Python (via
stack overflow) by being even more devious).

Cheers,
mwh

--
Our Constitution never promised us a good or efficient government,
just a representative one. And that's what we got.
-- http://www.advogato.org/person/mrorg...html?start=109
Jul 18 '05 #13
On Wed, Oct 29, 2003 at 10:02:08PM +0100, christof hoeke wrote:
> there should be an easier or more intuitive solution, maybe with a list
> comprehension?
>
> something like
> >>> b = [x for x in a if x not in b]
> >>> print b
> []
>
> does not work though.

if you want a list comp solution, similar to the one you proposed and valid
also with python < 2.3 you could do:

>>> a = [1, 2, 2, 3]
>>> b=[]
>>> [b.append(x) for x in a if x not in b]
[None, None, None]
>>> b
[1, 2, 3]

or even:

>>> a = [1, 2, 2, 3]
>>> [b.append(x) for b in [[]] for x in a if x not in b]
[None, None, None]
>>> b
[1, 2, 3]

Corrado.
--
Thought is only a flash between two long nights,
but this flash is everything.
(H. Poincaré)
Jul 18 '05 #14
Thomas Heller <th*****@python.net> writes:
> Michael Hudson <mw*@python.net> writes:
> > Try inserting a bunch of instances of
> >
> > class C:
> >     def __hash__(self): return 0
> >
> > into a dictionary.
>
> I've thought about using something like this in production code
> to be able to store mutable instances in a dict.
> Performance problems aside (since there are only a couple of key/value
> pairs in the dict), is it such a bad idea?

IMO it depends on what equality means for instances. E.g. if two
instances are only equal if they're identical, i.e. a == b is equivalent
to a is b, then defining __hash__ can be very useful, because then you
can use them in dictionaries and mutability doesn't really matter
because no change to one instance can make it equal to another
instance.

I'd define __hash__ to return id(self), though, so that the hash values
are different for different instances to reduce collisions.

This also seems to be what class objects in Python do:

>>> class C(object):
...     pass
...
>>> hash(C)
135622420
>>> id(C)
135622420
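
A minimal sketch of that suggestion in Python 3, with a hypothetical Node class whose equality is identity and whose hash is id(self). As far as I know, current CPython's default hash is derived from the object's identity but is not necessarily equal to id(), so the explicit methods below are only there to mirror the post:

```python
class Node:
    """Hypothetical mutable class: two nodes are equal only if identical."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        # Distinct live instances have distinct ids, so collisions are rare.
        return id(self)


a, b = Node("x"), Node("x")
registry = {a: "first", b: "second"}

# Mutating an instance cannot make it equal to another one,
# so it is still found under the same key afterwards.
a.label = "changed"
print(registry[a], registry[b], len(registry))
```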


Bernhard

--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
Thuban http://thuban.intevation.org/
Jul 18 '05 #15
Bernhard Herzog <bh@intevation.de> writes:
> Thomas Heller <th*****@python.net> writes:
> > Michael Hudson <mw*@python.net> writes:
> > > Try inserting a bunch of instances of
> > >
> > > class C:
> > >     def __hash__(self): return 0
> > >
> > > into a dictionary.
> >
> > I've thought about using something like this in production code
> > to be able to store mutable instances in a dict.
> > Performance problems aside (since there are only a couple of key/value
> > pairs in the dict), is it such a bad idea?
>
> IMO it depends on what equality means for instances. E.g. if two
> instances are only equal if they're identical, i.e. a == b is equivalent
> to a is b, then defining __hash__ can be very useful, because then you
> can use them in dictionaries and mutability doesn't really matter
> because no change to one instance can make it equal to another
> instance.

Um. In that case why define __hash__ or __eq__ at all?

Cheers,
mwh

--
Just put the user directories on a 486 with deadrat7.1 and turn the
Octane into the afforementioned beer fridge and keep it in your
office. The lusers won't notice the difference, except that you're
more cheery during office hours. -- Pim van Riezen, asr
Jul 18 '05 #16
Michael Hudson <mw*@python.net> writes:
> Bernhard Herzog <bh@intevation.de> writes:
> > IMO it depends on what equality means for instances. E.g. if two
> > instances are only equal if they're identical, i.e. a == b is equivalent
> > to a is b, then defining __hash__ can be very useful, because then you
> > can use them in dictionaries and mutability doesn't really matter
> > because no change to one instance can make it equal to another
> > instance.
>
> Um. In that case why define __hash__ or __eq__ at all?

Good question. I didn't know that that was the default implementation of
__hash__. It's not documented AFAICT, but it does seem to be the case. I
wonder why, even though it's useful.

The language reference for __hash__ says:

If a class does not define a __cmp__() method it should not define a
__hash__() operation either;

Seems a bit outdated in that it only mentions __cmp__ here and not
__eq__ as well, as it does right after the semicolon, but that would not
indicate to me that hash works for an instance without a hash method. Of
course the sentence above could be taken to mean that if you define
neither method the default does the right thing, but that would seem very
obscure to me.

Anyway, I've found that if you end up comparing objects very often it
can be a substantial performance gain if you do define __eq__ even
though it ends up doing the same as the default comparison. At least for
old style classes, when __eq__ is not defined python tries __cmp__ and
__coerce__, and __getattr__, if defined, is called for all of them.

Then, once you have defined __eq__ you also need to have __hash__ if you
want your objects to be hashable.
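
In Python 3 this last rule is enforced by the language: a class that defines __eq__ without defining __hash__ gets __hash__ set to None and its instances become unhashable. A small sketch, with hypothetical class names OnlyEq and EqAndHash:

```python
class OnlyEq:
    """Defines __eq__ only; Python 3 then sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, OnlyEq) and self.value == other.value


class EqAndHash(OnlyEq):
    """Adds a __hash__ consistent with __eq__, so instances are hashable."""

    def __hash__(self):
        return hash(self.value)


try:
    hash(OnlyEq(1))
    only_eq_hashable = True
except TypeError:
    only_eq_hashable = False

# Equal instances hash alike, so a set removes the duplicate.
deduped = set([EqAndHash(1), EqAndHash(1), EqAndHash(2)])
print(only_eq_hashable, len(deduped))
```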

Bernhard

--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
Thuban http://thuban.intevation.org/
Jul 18 '05 #17
Bernhard Herzog wrote:
...
> Good question. I didn't know that that was the default implementation of
> __hash__. It's not documented AFAICT, but it does seem to be the case. I

I don't know why the official docs don't mention this fact -- I do
mention it in "Python in a Nutshell", of course (p. 132).

> wonder why, even though it's useful.

The reason this is done is because it's useful.

> Anyway, I've found that if you end up comparing objects very often it
> can be a substantial performance gain if you do define __eq__ even
> though it ends up doing the same as the default comparison. At least for
> old style classes, when __eq__ is not defined python tries __cmp__ and
> __coerce__, and __getattr__, if defined, is called for all of them.

Yes, but making your classes newstyle would be a MUCH greater performance
boost in this situation. See:

[alex@lancelot bo]$ timeit.py -c -s'class X:pass' -s'x=X()' 'x==None'
100000 loops, best of 3: 3.5 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'class X:
 def __eq__(self, other): return id(self)==id(other)' -s'x=X()' 'x==None'
100000 loops, best of 3: 2.1 usec per loop

[alex@lancelot bo]$ timeit.py -c -s'class X(object):pass' -s'x=X()' 'x==None'
1000000 loops, best of 3: 0.42 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'class X(object):
 def __eq__(self, other): return id(self)==id(other)' -s'x=X()' 'x==None'
100000 loops, best of 3: 2.1 usec per loop

Instead of saving 40% of the comparison time by defining __eq__, why not
save 80+% of it, instead, just by making the class newstyle? Note that
the comparison time is about equal for both old and new style classes if
you do define an __eq__ -- but while it's a 40% speedup for oldstyle
ones, it's a slowdown by 5 times for newstyle ones.

> Then, once you have defined __eq__ you also need to have __hash__ if you
> want your objects to be hashable.

Another good performance reason to make the class newstyle and thus
avoid having to define __eq__, IMHO.

These days the only case where I use oldstyle classes (except in Q&D
scripts where I simply forget to say '(object)', but that's getting
rare as the good habit becomes ingrained:-) is for exception classes,
which (for reasons that escape me) ARE supposed to be oldstyle. The
exception that proves the rule?-)
Alex

Jul 18 '05 #18
Bernhard Herzog <bh@intevation.de> writes:
> Michael Hudson <mw*@python.net> writes:
> > Bernhard Herzog <bh@intevation.de> writes:
> > > IMO it depends on what equality means for instances. E.g. if two
> > > instances are only equal if they're identical, i.e. a == b is equivalent
> > > to a is b, then defining __hash__ can be very useful, because then you
> > > can use them in dictionaries and mutability doesn't really matter
> > > because no change to one instance can make it equal to another
> > > instance.
> >
> > Um. In that case why define __hash__ or __eq__ at all?
>
> Good question. I didn't know that that was the default implementation of
> __hash__. It's not documented AFAICT, but it does seem to be the case. I
> wonder why, even though it's useful.

Well, I didn't so much know that this was the default implementation
of __hash__ as know that instances of classes that don't do anything
to the contrary compare equal iff identical and are usefully hashable.

(Python Mystery Theatre question: for such a class, when can you have
id(ob) != hash(ob)?)

> The language reference for __hash__ says:
>
> If a class does not define a __cmp__() method it should not define a
> __hash__() operation either;
>
> Seems a bit outdated in that it only mentions __cmp__ here and not
> __eq__ as well, as it does right after the semicolon, but that would not
> indicate to me that hash works for an instance without a hash method.

I hadn't had the distraction of reading this section of the language
reference :-) Did you seriously think that you *had* to define
__hash__ to make instances of your classes hashable?

> Of course the sentence above could be taken to mean that if you
> define neither method the default does the right thing, but that
> would seem very obscure to me.

Well, this is Python after all. I would have read

if it defines __cmp__() or __eq__() but not __hash__(), its
instances will not be usable as dictionary keys.

to imply that in other situations instances would be usable as
dictionary keys.

> Anyway, I've found that if you end up comparing objects very often
> it can be a substantial performance gain if you do define __eq__
> even though it ends up doing the same as the default comparison. At
> least for old style classes, when __eq__ is not defined python tries
> __cmp__ and __coerce__, and __getattr__, if defined, is called for all
> of them.

Old-style classes? What are they? :-) Didn't know about that.

> Then, once you have defined __eq__ you also need to have __hash__ if you
> want your objects to be hashable.

Yes. I'm glad that bit of silliness is now dead (ish).

Cheers,
mwh

--
We've had a lot of problems going from glibc 2.0 to glibc 2.1.
People claim binary compatibility. Except for functions they
don't like. -- Peter Van Eynde, comp.lang.lisp
Jul 18 '05 #19
Alex Martelli <al***@aleax.it> writes:

[about object.__hash__]
> Bernhard Herzog wrote:
> ...
> > Good question. I didn't know that that was the default implementation of
> > __hash__. It's not documented AFAICT, but it does seem to be the case. I
>
> I don't know why the official docs don't mention this fact -- I do
> mention it in "Python in a Nutshell", of course (p. 132).
>
> > wonder why, even though it's useful.
>
> The reason this is done is because it's useful.

Well, Guido seems to disagree. Per his request, when I complained about
the default __hash__() method he asked me to enter a bug about it:
http://www.python.org/sf/660098. But it doesn't seem easy at all to fix
this...

Thomas
Jul 18 '05 #20
Alex Martelli wrote:
> christof hoeke wrote:
> ...
> > i have a list from which i want a simpler list without the duplicates
>
> Canonical is:
>
> import sets
> simplerlist = list(sets.Set(thelist))
>
> if you're all right with destroying order, as your example solution suggests.
> But dict.fromkeys(a).keys() is probably faster. Your assertion:
>
> > there should be an easier or more intuitive solution, maybe with a list
> > comprehension?
>
> doesn't seem self-evident to me. A list-comprehension might be, e.g:
>
> [ x for i, x in enumerate(a) if i==a.index(x) ]
>
> and it does have the advantages of (a) keeping order AND (b) not
> requiring hashable (nor even inequality-comparable!) elements -- BUT
> it has the non-indifferent cost of being O(N*N) while the others
> are about O(N). If you really want something similar to your approach:
>
> > >>> b = [x for x in a if x not in b]
>
> you'll have, o horrors!-), to do a loop, so name b is always bound to
> "the result list so far" (in the LC, name b is only bound at the end):
>
> b = []
> for x in a:
>     if x not in b:
>         b.append(x)
>
> However, this is O(N*N) too. In terms of "easier or more intuitive",
> I suspect only this latter solution might qualify.
> Alex

i was looking at the cookbook site but could not find the solution in
the short time i was looking, so thanks to Bernard for the link.

but thanks to all for the interesting discussion which i was at least partly
able to follow as i am still a novice pythoner.

as for speed i did not really care as my script will not be used regularly.
but it seems i guessed right using python's dictionary functions, it
still seems to be kind of the easiest and fastest way (i should add that the
list i am using consists only of strings, which can be dictionary keys
then).
thanks again
chris

Jul 18 '05 #21
On Wed, 29 Oct 2003 22:02:08 +0100, christof hoeke <cs***@yahoo.com> wrote:
> hello,
> this must have come up before, so i am already sorry for asking but a
> quick googling did not give me any answer.
>
> i have a list from which i want a simpler list without the duplicates
> an easy but somehow contrived solution would be
> >>> a = [1, 2, 2, 3]
> >>> d = {}.fromkeys(a)
> >>> b = d.keys()
> >>> print b
> [1, 2, 3]
>
> there should be an easier or more intuitive solution, maybe with a list
> comprehension?
>
> something like
> >>> b = [x for x in a if x not in b]
> >>> print b
> []
>
> does not work though.

If you want to replace the original list without a temporary new list,
and your original is sorted (or you don't mind having it sorted), then
you could do the following (not tested beyond what you see ;-), which
as an extra benefit doesn't require hashability:

>>> def elimdups(thelist):
...     thelist.sort() # remove if you just want to eliminate adjacent duplicates
...     i = 0
...     for item in thelist:
...         if item==thelist[i]: continue
...         i += 1
...         thelist[i] = item
...     del thelist[i+1:]
...
>>> a = [1, 2, 2, 3]
>>> elimdups(a)
>>> a
[1, 2, 3]
>>> a=[]
>>> elimdups(a)
>>> a
[]
>>> a = [123]
>>> elimdups(a)
>>> a
[123]
>>> a = ['a', ['b', 2], ['c',3], ['b',2], 'd']
>>> a
['a', ['b', 2], ['c', 3], ['b', 2], 'd']
>>> elimdups(a)
>>> a
[['b', 2], ['c', 3], 'a', 'd']
>>> a = list('deaacbb')
>>> elimdups(a)
>>> a
['a', 'b', 'c', 'd', 'e']

Not sure how this was decided, but that's the way it works:

>>> 'a' > ['b', 2]
True

Hm, it would have been nicer to have an optional sort flag as
a second parameter. Oh, well, another time...
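
The optional sort flag mentioned here could look roughly like the sketch below. It assumes Python 3, and the parameter name sort is invented for illustration. Note that Python 3 refuses to order unrelated types, so the mixed string/list example above would raise TypeError with sort=True there:

```python
def elimdups(thelist, sort=True):
    """Remove duplicates from thelist in place.

    With sort=False only adjacent duplicates are removed and the
    original order is kept. 'sort' is a hypothetical parameter name.
    """
    if sort:
        thelist.sort()
    i = 0  # index of the last element kept so far
    for item in thelist:
        if item == thelist[i]:
            continue
        i += 1
        thelist[i] = item
    del thelist[i + 1:]  # drop the leftover tail


a = list("deaacbb")
elimdups(a)
print(a)

b = [1, 1, 2, 2, 1]
elimdups(b, sort=False)
print(b)
```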

Regards,
Bengt Richter
Jul 18 '05 #22
Thomas Heller <th*****@python.net> writes:
> Alex Martelli <al***@aleax.it> writes:
>
> [about object.__hash__]
> > Bernhard Herzog wrote:
> > ...
> > > Good question. I didn't know that that was the default implementation of
> > > __hash__. It's not documented AFAICT, but it does seem to be the case. I
> >
> > I don't know why the official docs don't mention this fact -- I do
> > mention it in "Python in a Nutshell", of course (p. 132).
> >
> > > wonder why, even though it's useful.
> >
> > The reason this is done is because it's useful.
>
> Well, Guido seems to disagree. Per his request, when I complained about
> the default __hash__() method he asked me to enter a bug about it:
> http://www.python.org/sf/660098. But it doesn't seem easy at all to fix
> this...

That's actually something different: new-style classes that define
__eq__ remain hashable. I don't think making new-style classes that
*don't* define __eq__ *un*hashable is on the table, is it? If it is,
let me complain about that :-)

Cheers,
mwh

--
The proponent of the PEP shall be placed in a gladiatorial arena
together with half a dozen hungry lions, and permitted to debate
the merits of the proposal with them.
-- Greg Ewing, comp.lang.python
Jul 18 '05 #23
