finding all sublists of list2 that are identical to list1

Klaus Neuner

Hello,

the function given below returns all indexes of list2 where a sublist
of list2 that is identical to list1 begins.

As I will need this function quite often, I would like to know if more
experienced programmers would agree with the way I defined the
function:

- Is there a more efficient way to do it? (Apart from building
automata.)
- Is there a more elegant/Pythonic way to write the function?

Klaus
def sublist(list1, list2):
if list1 == list2:
return [0]
elif len(list1) > len(list2):
return []
else:
result = []
shift = 0
i = 0
while shift <= len(list2)-len(list1):
if list1[i] == list2[shift+i]:
if i == len(list1)-1:
result.append(shift)
i = 0
shift += 1
else:
i += 1
else:
i = 0
shift += 1
return result

Jul 18 '05 #1

Subscribe Post Reply

1924

Diez B. Roggisch

> - Is there a more efficient way to do it? (Apart from building

automata.)
Look into the shift-and algorithm for string-search- that should be much
faster. Its not trivial so I can't reproduce it right here out of my head,
but google helps.
- Is there a more elegant/Pythonic way to write the function?

Yes. You should work much more with slicing, and compare whole sublists like
this:

-------snip-----------
hs = [0, 1, 5, 3, 2, 4, 5, 3, 2]
ne = [5, 3, 2]

def sublist(needle, haystack):
hs_size = len(haystack)
n_size = len(needle)

if n_size > hs_size:
return []
elif hs_size == n_size and needle == haystack:
return []
else:
res = []
offset = 0
sublist = haystack[offset:offset + n_size]
while len(sublist) == n_size:
if sublist == needle:
res.append(offset)
offset += 1
sublist = haystack[offset:offset + n_size]
return res

if __name__ == "__main__":
print sublist(ne, hs)

-------snip-----------

--
Regards,

Diez B. Roggisch

Jul 18 '05 #2

Dang Griffith

On 9 Feb 2004 06:59:35 -0800, kl************@yahoo.de (Klaus Neuner)
wrote:

Hello,

the function given below returns all indexes of list2 where a sublist
of list2 that is identical to list1 begins.

As I will need this function quite often, I would like to know if more
experienced programmers would agree with the way I defined the
function:

- Is there a more efficient way to do it? (Apart from building
automata.)
- Is there a more elegant/Pythonic way to write the function?

Klaus
def sublist(list1, list2):
if list1 == list2:
return [0]
elif len(list1) > len(list2):
return []
else:
result = []
shift = 0
i = 0
while shift <= len(list2)-len(list1):
if list1[i] == list2[shift+i]:
if i == len(list1)-1:
result.append(shift)
i = 0
shift += 1
else:
i += 1
else:
i = 0
shift += 1
return result

I don't know about it being Pythonic, but list comprehensions are
always fun. This one's 10-20% faster (based on timeit results) than
your code:

def sublist(list1, list2):
len1 = len(list1)
rnge = xrange(len(list2))
return [i for i in rnge if list2[i:len1+i] == list1]

If you need this for a specific purpose, you might be able to
transform the data and try a different technique which would be
executed internally at the "C" level. For example, if your list is
always integers, you might be able to join them into a tab-delimited
string and scan the string with a regular expression (unless you
consider that building automata).

The list doesn't have to be integers for the regex idea to be
used--you just have to be sure the values in list1 contain no regex
metacharacters, and that list1 and list2 don't contain your delimiter
(tab in this example). If list1 contains regex characters, you need
to escape them with a backslash first.

It's possible that if you know something about the size of list1, you
could use this information also, but without more details, I'm
shooting in the dark.

--dang

Jul 18 '05 #3

Diez B. Roggisch

> def sublist(list1, list2):

len1 = len(list1)
rnge = xrange(len(list2))
return [i for i in rnge if list2[i:len1+i] == list1]
Nice one - looks like I optimized too much.
If you need this for a specific purpose, you might be able to
transform the data and try a different technique which would be
executed internally at the "C" level. For example, if your list is
always integers, you might be able to join them into a tab-delimited
string and scan the string with a regular expression (unless you
consider that building automata).
You don't gain anything by that - it would still perform the search inO(n). But your idea makes sense when the created "strings" would then be

used in a better perforiming string searching algorithm like shift-and.

--
Regards,

Diez B. Roggisch

Jul 18 '05 #4

Klaus Neuner

> def sublist(list1, list2):

len1 = len(list1)
rnge = xrange(len(list2))
return [i for i in rnge if list2[i:len1+i] == list1]
Thanks.
but without more details, I'm
shooting in the dark.

Actually, I want to leave open what list1 and list2 may contain. I
want to define a class Matcher, like so:

class Matcher(object):

def match(self, a, b):
if a == b:
return True
else:
return False

def all_matches(self, list1, list2):
len1 = len(list1)
rnge = xrange(len(list2))
return [i for i in rnge if self.match(list2[i:len1+i], list1)]

I use the function "match" instead of "==" here, because I want to
make subclasses that use a different concept of equivalence. One type
of equivalence
could be "equality in the first element".

class FirstElementMatcher(Matcher):

def match(self, a, b):
if a[0] == b[0]:
return True
else:
return False

Jul 18 '05 #5

by: Ponder | last post by:

Okay here's the scenario: I have two big lists of numbers. One number on each line (they're telephone numbers if that helps.) What I want to do, is have VB search through list2.txt for numbers...

Visual Basic 4 / 5 / 6

If stereo WAV file's both channels are identical, change format to mono for all files recursively

by: Raseliarison nirinA | last post by:

hi all, i found an unanswered question at http://www.faqts.com/knowledge_base/index.phtml/fid/538 with possible response below. i've tried to send it at faqt.python but can't figure out how to...

Python

finding the minimum allowed value for a type

by: Adam H. Peterson | last post by:

Is there a standard way to find the minimum value for a data type? I'm thinking along the lines of std::numeric_limits<T>::min(). But that doesn't work for floating point types. I can't use...

C / C++

Finding nodes which hold for multiple conditions

by: Ginu | last post by:

Hi, the task is to identify semantically identical elements where some additional attributes do not match. The XSL-transformation should find a node NAME which @id attribute matches to another...

.NET Framework

Finding multiple matches from a database

by: Jorge | last post by:

Hi Jy In the OleDbDataAdapter.SelectCommand.CommandText = "Select from Table where FieldName like '% " & variable & "%'" Kind Regards Jorge >-----Original Message-----

Visual Basic .NET

Finding a Form Element from an Input

by: kelvin.jones | last post by:

Hi, if I had the ID of an input element, how can I find the input's FORM in javascript? Basically, given a input's dom id, I want to insert something in the onSubmit of the Form that that input...

Javascript

Finding key level in mutidimensional array

by: Manish | last post by:

This is the print_r() for a variable $categories. $categories :: Array ( =Array ( =Array (

PHP

275

Finding the instance reference of an object

by: Astley Le Jasper | last post by:

Sorry for the numpty question ... How do you find the reference name of an object? So if i have this bob = modulename.objectname() how do i find that the name is 'bob'

Python

Finding Data duplication within two fields

by: cincisuzie | last post by:

I'm a beginner, so I will try and convey my questions as best as possible. I have a database of approximately 20,000 companies - I will call this list LIST1, and I purchased a list to with...

Microsoft SQL Server

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

finding all sublists of list2 that are identical to list1

Similar topics