finding items that occur before or after an item in lists

Simon Forman

I've got a function that I'd like to improve.

It takes a list of lists and a "target" element, and it returns the set
of the items in the lists that appear either before or after the target
item. (Actually, it's a generator, and I use the set class outside of
it to collect the unique items, but you get the idea. ;-) )

data = [
['this', 'string', 'is', 'nice'],
['this', 'string', 'sucks'],
['string', 'not', 'good'],
['what', 'a', 'string']
]

def f(target, ListOfLists):
for N in ListOfLists:
try:
i = N.index(target)
except ValueError:
continue

# item before target
if i: yield N[i - 1]

# item after target
try:
yield N[i + 1]
except IndexError:
pass

print set(n for n in f('string', data))

# Prints set(['this', 'not', 'is', 'sucks', 'a'])
Now, this works and normally I'd be happy with this and not try to
improve it unless I found that it was a bottleneck. However, in this
case I know that when a try..except statement fails (i.e. executes the
except part of the statement) it's "slow", *and* I know that the real
list of lists will have many lists in it in which the target item does
not appear.

That means that the first try..except statement will be executing it's
except clause more than half the time, and probably *much* more than
half the time.

This little problem strikes me as one of those fun puzzles that are a
joy to solve in python, so rather than keep it to myself I thought I'd
post it here on c.l.p and see what folks could come up with.

I'm not so much looking for simple improvements like

def f(target, ListOfLists):
for N in ListOfLists:
if target not in N:
continue
i = N.index(target)
# etc...

but rather cool, elegant, efficient, "pythonic" different ways to do
it.

I love these kinds of puzzles but I know the combined "might" of
comp.lang.python can come up with stuff I would never dream of, so, um,
here you go.

About the input data:
1) there are many lists, several hundred or thousand, perhaps even tens
of thousand
2) most of them will have less than 5 items, and almost all of them
less than 20, I don't know the maximum length but it will be less than
50 or so
3) The target occurs at most once in each list. (*no* item is
duplicated within a single list, all elements of a list are unique to
that list but may recur in other lists)
4) *most* of the lists will not contain the target
5) The data are not static, but they do change less often than this
function will be called. I don't know yet, but I'd say that this
function will be called maybe 10 times more often than the list of
lists will be modified.
6) Changes to the data will all be either adding or removing lists, the
lists themselves won't be modified.
7) The form of the data can be other than a list of lists, i.e. a set
of tuples, a list of tuples, a tuple of lists (although that would
require creating a new tuple when the data are modified, see 6)
whatever.
8) Building auxiliary data structures is fine (like a dict mapping
targets to lists they appear in, or to the sets that the function
returns ("memoizing")) but while speed is the main concern, memory
usage shouldn't be flagrant. Changing the data should be fast, so
rebuilding the auxiliary data structures shouldn't be expensive
Let's see, I guess that's about it. Thanks in advance.
Peace,
~Simon
--
"The history of mankind for the last four centuries is rather like that
of an imprisoned sleeper, stirring clumsily and uneasily while the
prison that restrains and shelters him catches fire, not waking but
incorporating the crackling and warmth of the fire with ancient and
incongruous dreams, than like that of a man consciously awake to danger
and opportunity." --H. P. Wells, "A Short History of the World"

Jul 4 '06 #1

Subscribe Post Reply

1667

Simon Forman

Simon Forman wrote:

I've got a function that I'd like to improve.

It takes a list of lists and a "target" element, and it returns the set
of the items in the lists that appear either before or after the target
item. (Actually, it's a generator, and I use the set class outside of
it to collect the unique items, but you get the idea. ;-) )

data = [
['this', 'string', 'is', 'nice'],
['this', 'string', 'sucks'],
['string', 'not', 'good'],
['what', 'a', 'string']
]

def f(target, ListOfLists):
for N in ListOfLists:
try:
i = N.index(target)
except ValueError:
continue

# item before target
if i: yield N[i - 1]

# item after target
try:
yield N[i + 1]
except IndexError:
pass

print set(n for n in f('string', data))

# Prints set(['this', 'not', 'is', 'sucks', 'a'])
Now, this works and normally I'd be happy with this and not try to
improve it unless I found that it was a bottleneck. However, in this
case I know that when a try..except statement fails (i.e. executes the
except part of the statement) it's "slow", *and* I know that the real
list of lists will have many lists in it in which the target item does
not appear.

That means that the first try..except statement will be executing it's
except clause more than half the time, and probably *much* more than
half the time.

Nevermind, it turns out that each list in the list of lists will
*always* contain the target item, making this task much less
interesting.

~Simon

Jul 5 '06 #2

by: kevin parks | last post by:

hi. I've been banging my head against this one a while and have asked around, and i am throwing this one out there in the hopes that some one can shed some light on what has turned out to be a...

Python

Counter for items in lists in lists?

by: Charlotte Henkle | last post by:

Hello; I'm pondering how to count the number of times an item appears in total in a nested list. For example: myList=,,] I'd like to know that a appeared three times, and b appeared twice,...

Python

Styling List Items

by: Peter Collinson | last post by:

Hi... Is there any way to style a List Item a different color and size than the <LI> in an Ordered List? I'd like a red super-script number and a dark blue text in a page's footnotes. And...

HTML / CSS

moving items from one list box to another

by: Kay | last post by:

Hello, I have two list boxes on my form, one initially displays with items and the other displays blank, by clicking a button, it is possible to move items from one box to another and vice a...

ASP.NET

IList items "keeping position"

by: Giovanni Bassi | last post by:

Hello All, I have a class implementing IList and a global object of this class type. I have two different form objects. One has a ListBox and the other has ComboBox. Both of their datasources...

Visual Basic .NET

Referencing Items in a List of Tuples

by: rshepard | last post by:

While working with lists of tuples is probably very common, none of my five Python books or a Google search tell me how to refer to specific items in each tuple. I find references to sorting a list...

Python

finding items that occur more than once in a list

by: Simon Forman | last post by:

Is there a more efficient way to do this? def f(L): '''Return a set of the items that occur more than once in L.''' L = list(L) for item in set(L): L.remove(item) return set(L)

Python

Countif Query - Finding Unique Entries

by: jameswilkinsonfjs | last post by:

Hi All, Ok I have a table - it lists items with a unique reference code; lets say there are 4 items : Item RefCode 1 ABC1 2 ABC2 3 ...

Microsoft Access / VBA

Retrieving Unique Items from a List

by: =?Utf-8?B?YW1pcg==?= | last post by:

Hi, I have a Generic Object that has a private List field item. I populate the List in a different function I use a FOREACH LOOP with a FindAll function to get all items that have a certain...

C# / C Sharp

Wordpress or something else?

by: Faith0G | last post by:

I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

Content Management Systems

Access Europe: Command bars, the Access Shortcut Tool and a simple Audit Log - Wed 3 April

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

General

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

finding items that occur before or after an item in lists

Similar topics