469,338 Members | 8,638 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,338 developers. It's quick & easy.

unique elements from list of lists

I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone

Feb 9 '07 #1
6 10083
Tekkaman wrote:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item
def uniter(lists):
return iter(set(chain(*lists)))

This avoids the explicit for-loop.

Peter
Feb 9 '07 #2
try something else. im posting the code from a kiosk which has no
python, sooo..... no code. only explanation

if my memory works well there is a function in python that takes a
multidimensional list and returns its values as a one-dimension list.

def main():
list =unknownFunction([['a', 'b', 'd'], ['b', 'c'], ['a', 'c',
'd']) # =a, b, d, b, c, a, c, d
temp = []
for i in list:
check(i, temp, list)
sort(list)

def check(pos, temp, list ):
for i in temp:
if temp[i]== list[pos]
del list[pos]
check(pos, temp, list)
temp.append(list[pos])

im not sure if this should work but the meaning is:
the first three elements will be appended into the list directly
because there was no like this in the temp
while there are elements in the list take a pivot value and check if
they are unique in the list.

check:
while there are elements in the temporary list check if the pivot
exists in the temp. if false then append it to temp.

if true delete the element and go into the recursion on the same index
why?

temp a, b, d
list a, b, d, (b), c, a, c, d

temp a, b, d
list a, b, d, (c) a, c, d

temp a, b, d, c
list a, b, d, c, (a) c, d

temp a, b, d, c
list a, b, d, c, (c), d

temp a, b, d, c
list a, b, d, c, (d)
temp a, b, d, c
list a, b, d, c

list a, b, c, d

one way to do it. this works well if you need the list in the order
they appear. sort it and it works well
but i think that there is the possibility to do it somthing like the
merge sort. maybee it would work. try it. Merge the sublists and
remove the duplicates. (L1,L2,L3) -(L12, L3) -(L123)
this should work pretty well


Tekkaman je napisao/la:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone
Feb 9 '07 #3
tra using the firs sublist (list[1]) as cell.then take zhe second
sublist and take a value from it at once and if the value from list[2]
doesnt exist in list[1] then insert it into list[1] at the correct
place. Something like the insertionsort.

Feb 9 '07 #4
Tekkaman:

If the sublists contain hashable elements you can use this:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
for el in merge:
yield el

data = [['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]
print list(uniter(data))

But often this too may be enough:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
return merge

Bye to the gentle Pegas too,
bearophile

Feb 9 '07 #5
Thanks everybody!
Azrael: your suggestions involve python-level membership testing and
dummy list construction just like my uniter3 example, so I'm afraid
they would not go any faster. There's no built-in sequence flattening
function that I know of btw.
bearophile: your implementation is very similar to my uniter2
(subsequent set unions) but without the additional lambda def
overhead, and in fact it goes faster. Not faster than uniter though.
Peter: your solution is the fastest, removing the explicit for loop
resulted in a 30% speed gain. Looks like using set() is a must.

-- Simone

Feb 9 '07 #6
no heart feelings. i was just throwing ideas. no time to testing it.
On Feb 9, 3:55 pm, "Tekkaman" <simone....@gmail.comwrote:
Thanks everybody!Azrael: your suggestions involve python-level membership testing and
dummy list construction just like my uniter3 example, so I'm afraid
they would not go any faster. There's no built-in sequence flattening
function that I know of btw.
bearophile: your implementation is very similar to my uniter2
(subsequent set unions) but without the additional lambda def
overhead, and in fact it goes faster. Not faster than uniter though.
Peter: your solution is the fastest, removing the explicit for loop
resulted in a 30% speed gain. Looks like using set() is a must.

-- Simone

Feb 11 '07 #7

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

10 posts views Thread by Matthew Sims | last post: by
2 posts views Thread by kevin parks | last post: by
24 posts views Thread by superprad | last post: by
reply views Thread by Alec Smith | last post: by
34 posts views Thread by Adam Hartshorne | last post: by
reply views Thread by Gabriel Genellina | last post: by
3 posts views Thread by =?Utf-8?B?YW1pcg==?= | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by Marylou17 | last post: by
1 post views Thread by Marylou17 | last post: by
1 post views Thread by Marylou17 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.