473,403 Members | 2,183 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,403 software developers and data experts.

unique elements from list of lists

I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone

Feb 9 '07 #1
6 10352
Tekkaman wrote:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item
def uniter(lists):
return iter(set(chain(*lists)))

This avoids the explicit for-loop.

Peter
Feb 9 '07 #2
try something else. im posting the code from a kiosk which has no
python, sooo..... no code. only explanation

if my memory works well there is a function in python that takes a
multidimensional list and returns its values as a one-dimension list.

def main():
list =unknownFunction([['a', 'b', 'd'], ['b', 'c'], ['a', 'c',
'd']) # =a, b, d, b, c, a, c, d
temp = []
for i in list:
check(i, temp, list)
sort(list)

def check(pos, temp, list ):
for i in temp:
if temp[i]== list[pos]
del list[pos]
check(pos, temp, list)
temp.append(list[pos])

im not sure if this should work but the meaning is:
the first three elements will be appended into the list directly
because there was no like this in the temp
while there are elements in the list take a pivot value and check if
they are unique in the list.

check:
while there are elements in the temporary list check if the pivot
exists in the temp. if false then append it to temp.

if true delete the element and go into the recursion on the same index
why?

temp a, b, d
list a, b, d, (b), c, a, c, d

temp a, b, d
list a, b, d, (c) a, c, d

temp a, b, d, c
list a, b, d, c, (a) c, d

temp a, b, d, c
list a, b, d, c, (c), d

temp a, b, d, c
list a, b, d, c, (d)
temp a, b, d, c
list a, b, d, c

list a, b, c, d

one way to do it. this works well if you need the list in the order
they appear. sort it and it works well
but i think that there is the possibility to do it somthing like the
merge sort. maybee it would work. try it. Merge the sublists and
remove the duplicates. (L1,L2,L3) -(L12, L3) -(L123)
this should work pretty well


Tekkaman je napisao/la:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone
Feb 9 '07 #3
tra using the firs sublist (list[1]) as cell.then take zhe second
sublist and take a value from it at once and if the value from list[2]
doesnt exist in list[1] then insert it into list[1] at the correct
place. Something like the insertionsort.

Feb 9 '07 #4
Tekkaman:

If the sublists contain hashable elements you can use this:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
for el in merge:
yield el

data = [['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]
print list(uniter(data))

But often this too may be enough:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
return merge

Bye to the gentle Pegas too,
bearophile

Feb 9 '07 #5
Thanks everybody!
Azrael: your suggestions involve python-level membership testing and
dummy list construction just like my uniter3 example, so I'm afraid
they would not go any faster. There's no built-in sequence flattening
function that I know of btw.
bearophile: your implementation is very similar to my uniter2
(subsequent set unions) but without the additional lambda def
overhead, and in fact it goes faster. Not faster than uniter though.
Peter: your solution is the fastest, removing the explicit for loop
resulted in a 30% speed gain. Looks like using set() is a must.

-- Simone

Feb 9 '07 #6
no heart feelings. i was just throwing ideas. no time to testing it.
On Feb 9, 3:55 pm, "Tekkaman" <simone....@gmail.comwrote:
Thanks everybody!Azrael: your suggestions involve python-level membership testing and
dummy list construction just like my uniter3 example, so I'm afraid
they would not go any faster. There's no built-in sequence flattening
function that I know of btw.
bearophile: your implementation is very similar to my uniter2
(subsequent set unions) but without the additional lambda def
overhead, and in fact it goes faster. Not faster than uniter though.
Peter: your solution is the fastest, removing the explicit for loop
resulted in a 30% speed gain. Looks like using set() is a must.

-- Simone

Feb 11 '07 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
by: Matthew Sims | last post by:
Python Newbie here. This is my first time learning object-oriented programming and trying to break out of the usual Korn/Perl/PHP style of programming. Having some difficulty understand some items....
2
by: kevin parks | last post by:
hi. I've been banging my head against this one a while and have asked around, and i am throwing this one out there in the hopes that some one can shed some light on what has turned out to be a...
24
by: superprad | last post by:
Is there an easy way to grab the Unique elements from a list? For Example: data = what I am looking for is the unique elements 0.4 and 0.9 with their index from the list. Probably something...
0
by: Alec Smith | last post by:
I have two tables as below: CREATE TABLE domain_types ( type_id INT(4) NOT NULL AUTO_INCREMENT, name VARCHAR(10) UNIQUE NOT NULL, description VARCHAR(75), PRIMARY KEY(type_id) ) TYPE=INNODB...
8
by: Generic Usenet Account | last post by:
To settle the dispute regarding what happens when an "erase" method is invoked on an STL container (i.e. whether the element is merely removed from the container or whether it also gets deleted in...
34
by: Adam Hartshorne | last post by:
Hi All, I have the following problem, and I would be extremely grateful if somebody would be kind enough to suggest an efficient solution to it. I create an instance of a Class A, and...
0
by: Gabriel Genellina | last post by:
En Fri, 18 Apr 2008 12:23:08 -0300, Shawn Milochik <Shawn@Milochik.comescribió: A dictionary with keys is perfectly reasonable. But a *list* of values has to be searched linearly for every...
3
by: =?Utf-8?B?YW1pcg==?= | last post by:
Hi, I have a Generic Object that has a private List field item. I populate the List in a different function I use a FOREACH LOOP with a FindAll function to get all items that have a certain...
2
by: antar2 | last post by:
Hello, I am a beginner in python. following program prints the second element in list of lists 4 for the first elements in list 4 that are common with the elements in list 5 list4 = ,,]...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.