Cloning a listiterator

Hello,

I want to build a function which returns the values that appear two or
more times in a list.

So, I decided to write a little example, which doesn't work:
#l = [1, 7, 3, 4, 3, 2, 1]
#i = iter(l)
#for x in i:
#    j = iter(i)
#    for y in j:
#        if x == y:
#            print x

I thought that the statement 'j = iter(i)' creates a new iterator 'j'
based on 'i' (some kind of clone). I wrote this little test, which shows
that 'j = iter(i)' is the same as 'j = i' (that makes me sad):

#l = [1, 7, 3, 4, 2]
#i = iter(l)
#j = iter(i)
#k = i
#i, j, k
(<listiterator object at 0x02167B50>, <listiterator object at
0x02167B50>, <listiterator object at 0x02167B50>)

Just to test, I wrote these little tests:
#l = [1, 7, 3, 4, 2]
#i = iter(l)
#import pickle
#j = pickle.loads(pickle.dumps(i))
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "C:\Python24\lib\pickle.py", line 1386, in dumps
    Pickler(file, protocol, bin).dump(obj)
  File "C:\Python24\lib\pickle.py", line 231, in dump
    self.save(obj)
  File "C:\Python24\lib\pickle.py", line 313, in save
    rv = reduce(self.proto)
  File "C:\Python24\lib\copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle listiterator objects

#import copy
#j = copy.copy(i)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "C:\Python24\lib\copy.py", line 95, in copy
    return _reconstruct(x, rv, 0)
  File "C:\Python24\lib\copy.py", line 320, in _reconstruct
    y = callable(*args)
  File "C:\Python24\lib\copy_reg.py", line 92, in __newobj__
    return cls.__new__(cls, *args)
TypeError: object.__new__(listiterator) is not safe, use listiterator.__new__()

So, I would like to know if there is a way to 'clone' a 'listiterator'
object. I know that is possible in Java, for example...

If it is impossible, do you have better ideas for finding duplicate
entries in a list?

Thanks,

Cyril
Jul 18 '05 #1
Cyril,

Here's some code that (I think) does what you want:

l = [1, 7, 3, 4, 3, 2, 1]
s, dups = set(), set()
for x in l:
    if x in s:
        # x was already seen at least once, so it is a duplicate
        dups.add(x)
    s.add(x)

print dups

I'm sure there are more elegant ways to do it, but this seemed to be the
most straightforward way I could think of.
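
The same idea can be wrapped in a reusable function. This is only a sketch in Python 3 syntax (the thread itself uses Python 2.4); the name find_duplicates is made up for illustration, and like the snippet above it assumes the values are hashable.

```python
def find_duplicates(values):
    """Return the set of values that occur at least twice in `values`."""
    seen = set()
    dups = set()
    for x in values:
        if x in seen:
            # Second or later occurrence: record it as a duplicate.
            dups.add(x)
        seen.add(x)
    return dups


result = find_duplicates([1, 7, 3, 4, 3, 2, 1])
print(sorted(result))  # [1, 3]
```

Because the result is a set, each duplicate is reported once and in no particular order; sorting it, as in the print call, is one way to get stable output.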

Hope this helps,
Alan McIntyre
ESRG LLC
http://www.esrgtech.com
Jul 18 '05 #2
Cyril BAZIN wrote:
Hello,

I want to build a function which returns the values that appear two or
more times in a list.

So, I decided to write a little example, which doesn't work:
#l = [1, 7, 3, 4, 3, 2, 1]
#i = iter(l)
#for x in i:
#    j = iter(i)
#    for y in j:
#        if x == y:
#            print x

Py> i = [1,2,5,4,3,3,6,8,2,2,6]
Py> o = []
Py> for item in i:
....     if i.count(item) > 1 and item not in o:
....         o.append(item)
....
Py> print o
[2, 3, 6]

I thought that the statement 'j = iter(i)' creates a new iterator 'j'
based on 'i' (some kind of clone). I wrote this little test, which shows
that 'j = iter(i)' is the same as 'j = i' (that makes me sad):

#l = [1, 7, 3, 4, 2]
#i = iter(l)
#j = iter(i)
#k = i
#i, j, k
(<listiterator object at 0x02167B50>, <listiterator object at
0x02167B50>, <listiterator object at 0x02167B50>)

Generally speaking, Python sequence objects are already iterable; there is
no need to cast them as in Java.
And yes, iter() does create an iterator object.
Py> i = [1,2,5,4,3,3,6,8,2,2,6]
Py> e = iter(i)
Py> e
<stackless.iterator object at 0x010779E0>
Py> e.next()
1
Py> e.next()
2
Py> e.next()
5
Py> for item in e:
....     print item
....
4
3
3
6
8
2
2
6
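
To tie this back to the original test: as far as I can tell, iter() builds a fresh iterator when it is given the list itself, but hands back the very same object when it is given an iterator. A small sketch in Python 3 syntax (so next(it) instead of it.next()); the variable names are only illustrative.

```python
values = [1, 7, 3, 4, 2]

first = iter(values)
second = iter(values)   # a separate iterator over the same list
alias = iter(first)     # iter() on an iterator returns that same iterator

same_object = alias is first        # True: no clone was made
independent = second is not first   # True: two distinct iterators

next(first)                  # advances first, and therefore alias too
after_first = next(alias)    # 7, because alias shares first's position
from_second = next(second)   # 1, because second has its own position

print(same_object, independent, after_first, from_second)
```

So calling iter(l) again on the list inside the loop, rather than iter(i) on the outer iterator, would at least give the inner loop its own position.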
But if you want a copy of a list, then you can do it in many other ways.
Py> i = [1,2,5,4,3,3,6,8,2,2,6]
Py> l = list(i)
Py> l
[1, 2, 5, 4, 3, 3, 6, 8, 2, 2, 6]
Py> i
[1, 2, 5, 4, 3, 3, 6, 8, 2, 2, 6]
Or,
Py> i = [1,2,5,4,3,3,6,8,2,2,6]
Py> l = i[:]
Py> l
[1, 2, 5, 4, 3, 3, 6, 8, 2, 2, 6]
Py> i
[1, 2, 5, 4, 3, 3, 6, 8, 2, 2, 6]
Or you can even use copy:
Py> import copy
Py> i = [1,2,5,4,3,3,6,8,2,2,6]
Py> l = copy.copy(i)
Py> l
[1, 2, 5, 4, 3, 3, 6, 8, 2, 2, 6]

So, I would like to know if there is a way to 'clone' a 'listiterator'
object. I know that is possible in Java, for example...

Just pickle the list, not the iterator.

If it is impossible, do you have better ideas for finding duplicate
entries in a list?

See the code example above.
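
On the cloning question itself: the standard library's itertools.tee (available since Python 2.4) splits one iterator into several independent ones, which is probably the closest thing to a clone. A minimal sketch in Python 3 syntax; the variable names are only illustrative.

```python
import itertools

values = [1, 7, 3, 4, 2]
original = iter(values)
next(original)  # consume the first element (1) before splitting

# tee() returns independent iterators that each start from the
# current position of the iterator passed in.
a, b = itertools.tee(original)

first_pass = list(a)   # [7, 3, 4, 2]
second_pass = list(b)  # [7, 3, 4, 2], unaffected by exhausting a

print(first_pass, second_pass)
```

One caveat from the itertools documentation: once tee() has made the split, the original iterator should not be advanced directly any more, or the copies may miss elements.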

Hth,
M.E.Farmer

Jul 18 '05 #3
[Cyril BAZIN]
I want to build a function which returns the values that appear two or
more times in a list.

So, I decided to write a little example, which doesn't work:
#l = [1, 7, 3, 4, 3, 2, 1]
#i = iter(l)
#for x in i:
#    j = iter(i)
#    for y in j:
#        if x == y:
#            print x


Here's one way:

s = [1, 7, 3, 4, 3, 2, 1]
seen = []
dups = []
for elem in s:
    if elem in seen:
        if elem not in dups:
            dups.append(elem)
    else:
        seen.append(elem)
print dups

This snippet has the virtue of not outputting the duplicate value more than once
and it does not rely on the values being hashable or ordered. It has the vice
of running at O(n**2) speed. A further property is that the values are output
in the order seen.

If the values are ordered but not hashable, itertools.groupby offers a faster
O(n lg n) solution that outputs the values in sorted order:

import itertools
print [k for k, g in itertools.groupby(sorted(s)) if len(list(g))>1]

If the values are known to be hashable, dictionaries offer an even faster O(n)
solution that outputs the values in arbitrary order:

bag = {}
for elem in s:
    bag[elem] = bag.get(elem, 0) + 1
print [elem for elem, count in bag.iteritems() if count>1]

Raymond Hettinger
P.S. Extra credit problem: make the itertools solution output values in the
order seen.
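
One possible take on the extra credit problem, offered as a sketch rather than the intended answer. It is written in Python 3 syntax, the function name duplicates_in_order_seen is made up, and "order seen" is assumed to mean the order in which each value first turns out to be a duplicate (its second occurrence), which matches the output of the first snippet. It still only needs the values to be ordered, not hashable.

```python
import itertools
from operator import itemgetter


def duplicates_in_order_seen(values):
    """Return duplicated values, ordered by where each second occurrence appears."""
    # Pair each value with its position, then sort by value only.
    # sorted() is stable, so equal values stay in position order.
    indexed = sorted(enumerate(values), key=itemgetter(1))

    found = []
    for value, group in itertools.groupby(indexed, key=itemgetter(1)):
        positions = [index for index, _ in group]
        if len(positions) > 1:
            # positions[1] is where the value was seen for the second time.
            found.append((positions[1], value))

    # Restore the order seen by sorting on that position alone.
    found.sort(key=itemgetter(0))
    return [value for _, value in found]


print(duplicates_in_order_seen([1, 7, 3, 4, 3, 2, 1]))  # [3, 1]
```

The sort keeps this at O(n lg n), the same as the plain groupby version.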
Jul 18 '05 #4
