http://jaynes.colorado.edu/PythonIdioms.html
"""Use dictionaries for searching, not lists. To find items in common
between two lists, make the first into a dictionary and then look for
items in the second in it. Searching a list for an item is lineartime,
while searching a dict for an item is constant time. This can often let
you reduce search time from quadratic to linear."""
Is this correct?
s = [1,2,3,4,5...]
t = [4,5,6,,8,...]
how to find whether there is/are common item(s) between two list in
lineartime?
how to find the number of common items between two list in lineartime? 8 1181
Per: how to find whether there is/are common item(s) between two list in lineartime?
To find items in common between two lists, make the first into a
dictionary and then look for items in the second in it.

René Pijlman
Wat wil jij leren? http://www.leren.nl
Per wrote: http://jaynes.colorado.edu/PythonIdioms.html
"""Use dictionaries for searching, not lists. To find items in common between two lists, make the first into a dictionary and then look for items in the second in it. Searching a list for an item is lineartime, while searching a dict for an item is constant time. This can often let you reduce search time from quadratic to linear."""
Is this correct? s = [1,2,3,4,5...] t = [4,5,6,,8,...] how to find whether there is/are common item(s) between two list in lineartime? how to find the number of common items between two list in lineartime?
I'm not sure if it works in linear time, but if there are no duplicates
in each list, sets would be the easiest way to do these with. s = set([1,2,3,4,5,6]) t = set([4,5,6,7,8,9]) s.intersection(t)
set([4, 5, 6]) len(s.intersection(t))
3
Cheers,
Ron
Per wrote: how to find the number of common items between two list in lineartime?
Not really sure about lineartime, but you could try the following: a=[1,2,3,4] b=[3,4,5,6] set(a) & set(b)
set([3, 4])
Irmen
Thanks Ron,
surely set is the simplest way to understand the question, to see
whether there is a nonempty intersection. But I did the following
thing in a silly way, still not sure whether it is going to be linear
time.
def foo():
l = [...]
s = [...]
dic = {}
for i in l:
dic[i] = 0
k=0
while k <len(s):
if s[k] in dic:
return True
else: pass
k+=1
if k == len(s):
return False
I am still a rookie, and partly just migrated from Haskell...
I am not clear about how making one of the lists a dictionary is
helpful
Per wrote: Thanks Ron, surely set is the simplest way to understand the question, to see whether there is a nonempty intersection. But I did the following thing in a silly way, still not sure whether it is going to be linear time. def foo(): l = [...] s = [...] dic = {} for i in l: dic[i] = 0 k=0 while k <len(s): if s[k] in dic: return True else: pass k+=1 if k == len(s): return False
I am still a rookie, and partly just migrated from Haskell... I am not clear about how making one of the lists a dictionary is helpful
The dictbased approach is to do something like this: ls1 = [1,3,5,7,9] ls2 = [3,5,6,7,10] d1 = dict.fromkeys(ls1) d1
{1: None, 3: None, 9: None, 5: None, 7: None}
Note the values, are irrelevant  we care only about the keys
Now, membership testing takes linear time:
[item for item in ls2 if item in d1]
[3, 5, 7]
But, as you say, this approach is unnecessary, given sets.
HTH
Michael
Per wrote: Thanks Ron, surely set is the simplest way to understand the question, to see whether there is a nonempty intersection. But I did the following thing in a silly way, still not sure whether it is going to be linear time. def foo(): l = [...] s = [...] dic = {} for i in l: dic[i] = 0 k=0 while k <len(s): if s[k] in dic: return True else: pass k+=1 if k == len(s): return False
I am still a rookie, and partly just migrated from Haskell... I am not clear about how making one of the lists a dictionary is helpful
Lets compare them by checking different length with no overlap which is
the worst case situation.
## is_interstion comparison
def ii_set(a, b):
return len(set(a).intersection(b))>0
def ii_dict(l, s):
dic = {}
for i in l:
dic[i] = 0
for i in s:
if i in dic:
return True
return False
def ii_dict2(l, s):
dic = dict.fromkeys(l)
for i in s:
if i in dic:
return True
return False
import time
foos = [ii_set, ii_dict, ii_dict2]
lsts = [10, 100, 1000, 10000, 100000, 1000000]
for f in foos:
for lst in lsts:
a = range(lst)
b = range(lst, lst*2)
start = time.clock()
assert f(a,b) == False
t = time.clock()start
print f.__name__, lst, t
print
ii_set 10 1.25714301678e005
ii_set 100 2.45841301059e005
ii_set 1000 0.000162031766607
ii_set 10000 0.0020256764477
ii_set 100000 0.0238681173166
ii_set 1000000 0.23067484834
ii_dict 10 2.31873045317e005
ii_dict 100 6.73269926764e005
ii_dict 1000 0.000442234976792
ii_dict 10000 0.0047891561637
ii_dict 100000 0.0502407428877
ii_dict 1000000 0.506360165887
ii_dict2 10 2.70984161395e005
ii_dict2 100 5.55936578532e005
ii_dict2 1000 0.000317358770458
ii_dict2 10000 0.00366638776716
ii_dict2 100000 0.0394256811969
ii_dict2 1000000 0.39200764343
The sets solution seems to be the fastest. And it is usually is when
you can move more of your task into the builtin methods which are
programmed in C.
From what I recently read here (not sure where) in another thread, in
python 2.3 sets were implemented as a python module using dictionaries,
and in 2.4 it was written in C code based on the dictionary C code.
Cheers,
Ron
Ron Adam <rr*@ronadam.com> wrote:
... From what I recently read here (not sure where) in another thread, in python 2.3 sets were implemented as a python module using dictionaries, and in 2.4 it was written in C code based on the dictionary C code.
....and in 2.5 they're due to be further optimized, using their own
specialized code rather than piggybacking on dictionaries'.
Alex
Per wrote: http://jaynes.colorado.edu/PythonIdioms.html
[snip]
Is this correct? s = [1,2,3,4,5...] t = [4,5,6,,8,...] how to find whether there is/are common item(s) between two list in lineartime? how to find the number of common items between two list in lineartime?
A common technique if both lists are sorted is to iterate through both
lists in parallel, advancing the smaller iterator each time. This is
the merge algorithm that is used by a merge sort, and it is O(s+t).
For two lists, the algorithm would go something like:
while not finished:
if s_iter_val < t_iter_val:
move s_iter on
elif s_iter_val > t_iter_val:
move t_iter on
else:
include / yield the value
move both iters on
For more on the standard merge algorithm, see: http://en.wikipedia.org/wiki/Merge_algorithm
For an intersection merge, I hacked the recursive solution from
there...
def merge(a, b):
if len(a) == 0: return []
if len(b) == 0: return []
if a[0] < b[0]: return merge(a[1:], b)
elif a[0] > b[0]: return merge(a, b[1:])
else: return a[0:1] + merge(a[1:], b[1:])
#8<
import unittest
class TestMerge(unittest.TestCase):
def test_merge(self):
self.assertEquals(merge([1,2],[]), [])
self.assertEquals(merge([],[1,2]), [])
self.assertEquals(merge([1,3,5],[2,4,6]), [])
self.assertEquals(merge([1,2,3],[3,4,5]), [3])
self.assertEquals(merge([1,2,3,5,6,7],[3,4,5,7,8]), [3,5,7])
if __name__ == "__main__":
unittest.main()
HTH,
Bruce This discussion thread is closed Replies have been disabled for this discussion. Similar topics
3 posts
views
Thread by David Morgenthaler 
last post: by

9 posts
views
Thread by Dan Perl 
last post: by

12 posts
views
Thread by Frans Englich 
last post: by

12 posts
views
Thread by Siemel Naran 
last post: by

2 posts
views
Thread by Debajit Adhikary 
last post: by

7 posts
views
Thread by Mikhail N. Kupchik 
last post: by

6 posts
views
Thread by Gareth Stockwell 
last post: by

6 posts
views
Thread by tings 
last post: by

9 posts
views
Thread by Christian Hackl 
last post: by

14 posts
views
Thread by Daniel Lidström 
last post: by
          