473,386 Members | 1,803 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Test if list contains another list

Hi there,

I am trying to write something very simple to test if a list
contains another one:

a = [1,2,3]

b = [3,2,1,4]

but 'a in b' returns False. How do I check that a is indeed contained
in b ?

thanks
Sep 8 '08 #1
11 23749
mathieu wrote:
Hi there,

I am trying to write something very simple to test if a list
contains another one:

a = [1,2,3]

b = [3,2,1,4]

but 'a in b' returns False. How do I check that a is indeed contained
in b ?
Use sets:
>>a = [1,2,3]
b = [3,2,1,4]
set(a).issubset(set(b))
True
Christian

Sep 8 '08 #2
Hi,
>>a = [1,2,3]
b = [3,2,1,4]
a = set(a)
b = set(b)
a.intersection(b)
set([1, 2, 3])

Is this what you want ?

cheers
James

On 9/8/08, mathieu <ma***************@gmail.comwrote:
Hi there,

I am trying to write something very simple to test if a list
contains another one:

a = [1,2,3]

b = [3,2,1,4]

but 'a in b' returns False. How do I check that a is indeed contained
in b ?

thanks

--
http://mail.python.org/mailman/listinfo/python-list

--
--
-- "Problems are solved by method"
Sep 8 '08 #3
On Sep 8, 9:32 am, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.frwrote:
mathieu a écrit :
Hi there,
I am trying to write something very simple to test if a list
contains another one:
a = [1,2,3]
b = [3,2,1,4]
but 'a in b' returns False.

Indeed. Lists are not sets, and the fact that all elements of list a
happens to also be part of list b doesn't make the list a itself an
element of list b.
>>a = [1, 2, 3]
>>b = [3,2,1,4]
>>c = [b, a]
>>a in c
True
>>b in c
True
>>c
[[3, 2, 1, 4], [1, 2, 3]]
>>>
How do I check that a is indeed contained
in b ?

But that's what you did - you *did* checked if a was contained in b, and
this is not the case. What you want is to check if *all elements* of a
are contained in b, which is quite another problem. Now the answer to
your question : use sets.
>>set(a).issubset(set(b))
True
>>>
thanks all !
Sep 8 '08 #4
On Thu, Sep 18, 2008 at 03:24:16AM -0700, ga*********@gmail.com wrote:
I looked inside this thread for my query which brought me the
following google search result
"Test if list contains another list - comp.lang.python | Google
Groups"

But then I was disappointed to see the question asked was not exactly
right.
[...]
def findAllMatchingList(mainList, subList):
resultIndex = []
globalIndex = 0
for i in range(len(mainList)):
if i < globalIndex:
continue
globalIndex = i
increment = 0
for j in range(len(subList)):
if mainList[globalIndex] == subList[j]:
globalIndex += 1
increment += 1
if j == (len(subList)-1):
resultIndex.append(globalIndex-increment)
else:
break

return resultIndex
I didn't time them to compare, but how about this instead:
>>def find_needle_in_haystack(needle, haystack):
.... r = []
.... L = len(needle)
.... for i in range(len(haystack)):
.... if haystack[i:i+L] == needle:
.... r.append(i)
.... return r
>># this fails because "3" is not 3...
find_needle_in_haystack([1,2,3], ["a","b",1,2,"3","9"])
[]
>>find_needle_in_haystack([1,2,3], [1,2,3])
[0]
>>find_needle_in_haystack([1,2,3], ["a","b",1,2,3,"9"])
[2]
>>find_needle_in_haystack([1,2,3], ["a","b",1,2,3,"9","q",1,2,3])
[2, 7]

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFI3TcUdjdlQoHP510RAlc0AJ9CaqXxWSN6FeWsgApaNN f973medQCeKJUZ
AAZ4X3LFezSlIXRp3H3W748=
=3OrB
-----END PGP SIGNATURE-----

Sep 26 '08 #5
I suggest Python programmers to fill the holes in the Python std lib
with some debugged & tuned implementations, their "bag of tricks", so
they don't have to re-invent and debug things all the time. This works
well with Psyco:
def issubseq(sub, items):
"""issubseq(sub, items): return true if the sequence 'sub' is a
contiguous
subsequence of the 'items' sequence.
>>issubseq()
Traceback (most recent call last):
...
TypeError: issubseq() takes exactly 2 arguments (0 given)
>>issubseq("abc")
Traceback (most recent call last):
...
TypeError: issubseq() takes exactly 2 arguments (1 given)
>>issubseq(1, [1, 2, 3])
Traceback (most recent call last):
...
TypeError: 'int' object is not iterable
>>isi = lambda s,i: int(issubseq(s,i))
isi([], [])
1
>>isi("a", "")
0
>>isi("", "a")
1
>>isi("", "aaa")
1
>>isi("a", "a")
1
>>isi("ab", "bab")
1
>>isi("ab", "bab")
1
>>isi("ba", "bbb")
0
>>isi("bab", "ab")
0
>>isi(("a", "b"), ("a","a","b"))
1
>>isi(("a", "b"), ("a","a","c"))
0
>>isi([1,2,1], [3,5, 1,2,4, 1,2,1, 6])
1
>>isi([1,2,1], [3,5, 1,2,4, 1,2,3, 6])
0
>>l = [1] * 50 + [1,2,3] + [4] * 50
isi([1,2,3], l)
1
>>l = [1] * 50 + [1,2,4] + [5] * 50
isi([1,2,3], l)
0
"""
if not hasattr(sub, "__getitem__"):
sub = list(sub)

len_sub = len(sub)
if len_sub == 0:
return True

try:
if not len(items) or len(items) < len_sub:
return False
except TypeError:
pass

if sub == items:
return True

table = [0] * (len_sub + 1)

# building prefix-function
m = 0
for i in xrange(1, len_sub):
while m 0 and sub[m] != sub[i]:
m = table[m - 1]
if sub[m] == sub[i]:
m += 1
table[i] = m

# searching
m, i = 0, 0
for x in items:
while m 0 and sub[m] != x:
m = table[m - 1]
if sub[m] == x:
m += 1
if m == len_sub:
return True
i += 1

return False
Usage:
>>from util import issubseq # uses Psyco
from timing import timeit
a = range(1000000) + range(1000) + range(100000)
sub = range(1000)
len(a)
1101000
>>timeit(issubseq, sub, a)
(True, 0.0)
>>a = range(999) * 10000 + range(1000) + range(10000)
sub = range(1000)
timeit(issubseq, sub, a)
(True, 0.20000000000000001)

Bye,
bearophile
Sep 26 '08 #6
bearophile:
# searching
m, i = 0, 0
....
i += 1
The name 'i' can be removed, sorry.

Bye,
bearophile
Sep 26 '08 #7
On Fri, Sep 26, 2008 at 01:39:16PM -0700, be************@lycos.com wrote:
# building prefix-function
m = 0
for i in xrange(1, len_sub):
while m 0 and sub[m] != sub[i]:
m = table[m - 1]
if sub[m] == sub[i]:
m += 1
table[i] = m

# searching
m, i = 0, 0
for x in items:
while m 0 and sub[m] != x:
m = table[m - 1]
if sub[m] == x:
m += 1
if m == len_sub:
return True
i += 1

return False
Quite a lot faster than mine... even without using psyco. Which is
interesting, especially because if I change my search loop to work
like yours, but leave out all your other optimizations, mine runs
roughly as fast as yours (i.e. execution time is negligible to a user
running the program in both cases, even with the large example data
you gave). This leads me to point out two caveats with your version:

1. Guavat posted a version which returns a list of all the indexes
where a match is found... which is what I mimiced. Yours returns only
true or false indicating whether or not it found a match. The main
difference in performance seems to come due to your iteration over the
list items, versus my comparing a sublist-sized slice of the whole to
the sublist. But this alone is an invalid optimization, because it
doesn't produce the same results... If you only want to know if a
match exists, it's great; but if you want to know *where*, or *how
many times*, you lose. That could be fixed though, by counting the
elements as you loop through them... I didn't attempt to determine
the performance hit for doing that, but I assume it's negligible.
I also imagine that that was your original purpose for the unused
variable i...

2. Your secondary optimizations add a great deal of complexity to your
code, making the algorithm much harder to understand. However they
don't appear to buy you much, given that the cases they optimize would
probably be rare, and the difference in execution time gained by the
optimization is not noticable to the user. Unless you're doing lots
and lots of these in your application, or maybe if you know in advance
that your data will contain many instances of the cases you optimized,
I think you're better off leaving the optimizations out, for the sake
of code clarity.

At the very least, if you're going to write complicated optimizations,
you ought to have explained what you were doing in comments... :)

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFI4HJ2djdlQoHP510RAhrKAJ9HhSX5lXuQrX565dv6MF 0T/jMlXgCfaQWe
kJBxLoxYmzRu6/3C7XXHobU=
=CAjG
-----END PGP SIGNATURE-----

Sep 29 '08 #8
Derek Martin:
>Quite a lot faster than mine... even without using psyco.<
It's designed for Psyco.

>However they don't appear to buy you much, given that the cases they optimize would probably be rare, and the difference in execution time gained by the optimization is not noticable to the user.<
http://www-igm.univ-mlv.fr/~lecroq/string/node7.html

>Unless you're doing lots and lots of these in your application,<
I don't agree. That's library code, so it has to be efficient and
flexible, because it's designed to be used in many different
situations (if you don't agree, then you can try to suggest a
replacement of the C code of the string search of CPython written by
effbot with some slower code).

Bye,
bearophile
Sep 29 '08 #9
On Mon, Sep 29, 2008 at 04:12:13AM -0700, be************@lycos.com wrote:
Derek Martin:
Unless you're doing lots and lots of these in your application,<

I don't agree. That's library code, so it has to be efficient and
flexible, because it's designed to be used in many different
situations
That's fair, but lots of folks writing Python code will look at that
and say, "What the %$#@! is this doing?!?" As I already suggested,
code that implements non-obvious algorithms ought to explain what it's
doing in comments, so that the neophyte programmers charged with
maintaining the library aren't tempted to rewrite the code so that
it's easier to understand what it's doing. It can be as simple as:

# Use Morris-Pratt algorithm to search data

Then, anyone not familiar with the algorithm can easily look it up,
and see why it was written that way.

I think it's just as important to do that in code you post on the
list, since a) the person asking the question obviously doesn't know
what you're doing, or they wouldn't have needed to ask the question,
and b) there are lots of other folks reading the list who could
benefit from the same knowledge. :)

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFI4PqudjdlQoHP510RAi3cAJsH344P41yMojDLqVL2z2 9qGodldgCfQ5u8
4N48WpmWHjjnBNwuDx2vbGE=
=OT7l
-----END PGP SIGNATURE-----

Sep 29 '08 #10
Derek Martin:
code that implements non-obvious algorithms ought to explain what it's
doing in comments,
I am sorry, you are right, of course.

In my libs/code there are always docstrings and doctests/tests, and
most times comments too, like you say. When I post code here I often
strip away part of those things, mostly to reduce the length of my
posts -.- I guess that I have to leave more of that meta-
information...

Bye,
bearophile
Sep 29 '08 #11
Ali

Its funny, I just visited this problem last week.

<http://dulceetutile.blogspot.com/200...ooking-python-
statement_17.html>

../Ali
Nov 16 '08 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Rick Hower | last post by:
The 'Web Test Tools List' at http://www.softwareqatest.com/qatweb1.html has been updated as of December 1 2003. It now contains more than 280 web test tools, (with descriptions and links...
12
by: Jan Roland Eriksson | last post by:
I have worked some more on this suggested new mFAQ version trying to get all user input to blend into the text. Another review by the NG would be welcome. ===== Archive-name:...
5
by: Jan Roland Eriksson | last post by:
Some more fine tuning and inclusion of last suggested text from regulars has been done to this test post #4 of the mFAQ. As usual, rip it up anywhere you feel that it's appropriate to do so. ...
9
by: Hudson Reis | last post by:
Hi all, Test, please ignore. Hudson
9
by: TheOne | last post by:
Would anyone please point me to a list of reentrant C library functions? I want to know which C library functions are safe to use inside a signal handler across all platforms. Does GNU C library...
2
by: emily224 | last post by:
Hello, I have been trying to understand this source code, which I retreived from my online course test. I would like to know how to find the answer for the question on the test. Im sure the answer...
4
by: emily224 | last post by:
Hello, I have been trying to understand this source code, which I retreived from my online course test. I would like to know how to find the answer for the question on the test. Im sure the answer...
2
by: =?Utf-8?B?c2lwcHl1Y29ubg==?= | last post by:
I was wonder if there is any protocol or suggestions on how to setup unit testing projects in large solution??? If you have a large solution of say 50 projects and add a project for every unit...
2
by: hcaptech | last post by:
This is my Test.can you help me ? 1.Which of the following statement about C# varialble is incorrect ? A.A variable is a computer memory location identified by a unique name B.A variable's name...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.