468,740 Members | 2,000 Online

# all pairs of items in a list without indexing?

So I need to do something like:

for i in range(len(l)):
for j in range(i+1, len(l)):
# do something with (l[i], l[j])

where I get all pairs of items in a list (where I'm thinking of pairs
as sets, not tuples, so order doesn't matter). There isn't really
anything wrong with the solution here, but since Python's for-each
construction is so nice, I usually try to avoid range(len(..)) type
calls and list indexing when I can... Is there any way to do this
without indexing, e.g.:

for item1 in ...:
for item2 in ...:
# do something with (item1, item2)

I could do something like:

for i, item1 in enumerate(l):
for j in range(i+1, len(l)):
(item1, l[j])

but that only gets me halfway there... I also thought of something like:

for i, item1 in enumerate(l):
for item2 in l[i+1:]:
(item1, item2)

but that seems like a lot of wasteful list slicing...

Steve
--
You can wordify anything if you just verb it.
- Bucky Katt, Get Fuzzy
Jul 18 '05 #1
12 6623
Steven Bethard wrote:
So I need to do something like:

for i in range(len(l)):
for j in range(i+1, len(l)):
# do something with (l[i], l[j])

where I get all pairs of items in a list (where I'm thinking of pairs
as sets, not tuples, so order doesn't matter). There isn't really
anything wrong with the solution here, but since Python's for-each
construction is so nice, I usually try to avoid range(len(..)) type
calls and list indexing when I can... Is there any way to do this
without indexing

Are you trying to pair each item in a list with every other
item exactly once? Maybe this code does what you want:

while len(L)>0:
item1 = L.pop()
for item2 in L:
print (item1, item2)

pop() removes one item from the list. The inner for loop matches that
item with each of the remaining items. At the end of the outer loop the
list L is empty; you may want to run this loop over a copy of the list
if you want to do other things with the list later.

HTH,

Jim
Jul 18 '05 #2
On Tue, Sep 28, 2004 at 09:47:32PM +0000, Jim Sizelove wrote:
Steven Bethard wrote:
So I need to do something like:

for i in range(len(l)):
for j in range(i+1, len(l)):
# do something with (l[i], l[j])

where I get all pairs of items in a list (where I'm thinking of pairs
as sets, not tuples, so order doesn't matter). There isn't really
anything wrong with the solution here, but since Python's for-each
construction is so nice, I usually try to avoid range(len(..)) type
calls and list indexing when I can... Is there any way to do this
without indexing
Are you trying to pair each item in a list with every other
item exactly once? Maybe this code does what you want:

while len(L)>0:
item1 = L.pop()
for item2 in L:
print (item1, item2)

This looks good, as long as the fact that the "item1"s will come
out of the list backwards is OK.

def all_pairs(L):
while L:
i = L.pop()
for j in L: yield i, j
list(all_pairs(range(5)))

[(4, 0), (4, 1), (4, 2), (4, 3), (3, 0), (3, 1), (3, 2), (2, 0), (2, 1), (1, 0)]

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFBWd8LJd01MZaTXX0RAtSuAJ9b1FjtfXQF/uAruhDukWlcA9ftCwCeJfm+
9r/rzibh0ssDsVjfVA7d51s=
=e1EA
-----END PGP SIGNATURE-----

Jul 18 '05 #3
<jepler <at> unpythonic.net> writes:
def all_pairs(L):
while L:
i = L.pop()
for j in L: yield i, j

Interesting. I hadn't thought of this one -- it's not bad other than
requiring the list copy (since I need to maintain the original list).

Thanks,

Steve

Jul 18 '05 #4
Steven Bethard <st************@gmail.com> wrote:
So I need to do something like:

for i in range(len(l)):
for j in range(i+1, len(l)):
# do something with (l[i], l[j])
[...] I could do something like:

for i, item1 in enumerate(l):
for j in range(i+1, len(l)):
(item1, l[j])

but that only gets me halfway there... I also thought of something like:

for i, item1 in enumerate(l):
for item2 in l[i+1:]:
(item1, item2)

but that seems like a lot of wasteful list slicing...

You could try something like:

for i, item1 in enumerate(l):
for item2 in itertools.islice(l, i+1, None):
(item1, item2)

I'm not sure which version I prefer.

Marc
Jul 18 '05 #5
Steven Bethard <st************@gmail.com> wrote :
So I need to do something like:

for i in range(len(l)):
for j in range(i+1, len(l)):
# do something with (l[i], l[j]) ... Is there any way to do this without indexing

As others have pointed out, you can do this without indexing, but it
may involve overhead of other sorts (list copies, popping, etc.).

I'd just like to point out that, for the problem you are describing
(where you want every possible pair, but order doesn't matter), you
_can_ make your loop perhaps a bit more readable (depending on the
context it is used in), and perhaps a tiny bit faster:

for i in range(1,len(mylist)):
for j in range(i):
# do something with (l[i], l[j])

Regards,
Pat
Jul 18 '05 #6
Steven Bethard <st************@gmail.com> wrote:
...
for i, item1 in enumerate(l):
for item2 in l[i+1:]:
(item1, item2)

but that seems like a lot of wasteful list slicing...

Sorry, I don't get it: what's wasteful about it? Do you mean in terms
of performance?

With slicing:

kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(x,y) for i,x
in enumerate(l) for y in l[i+1:]]'
10 loops, best of 3: 1.43e+05 usec per loop
kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(x,y) for i,x
in enumerate(l) for y in l[i+1:]]'
10 loops, best of 3: 1.41e+05 usec per loop

With xrange(len(...:

kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(l[i],l[j])
for i in xrange(len(l)) for j in xrange(i+1,len(l))]'
10 loops, best of 3: 1.61e+05 usec per loop
kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(l[i],l[j])
for i in xrange(len(l)) for j in xrange(i+1,len(l))]'
10 loops, best of 3: 1.62e+05 usec per loop

You could use itertools.islice(l,i+1,len(l)) instead of l[i+1:], but in
my tests, in this particular case, itertools appears to be slower than
plain old slicing, albeit not by much -- about 1.45 to 1.46 vs plain
slicing's 1.42 or so, and xrange's 1.61 or so.

But maybe you can clarify the 'wasteful' observation better!
Alex
Jul 18 '05 #7
Steven Bethard <st************@gmail.com> wrote:
<jepler <at> unpythonic.net> writes:
def all_pairs(L):
while L:
i = L.pop()
for j in L: yield i, j

Interesting. I hadn't thought of this one -- it's not bad other than
requiring the list copy (since I need to maintain the original list).

If you start with an L=list(L), you can also optionally L.reverse() to
play with the ordering (if significant, issue still not clarified).

Ordering apart, performance is still not quite as good as with the
slicing you consider wasteful, in my measurements. Remember with the
slicing we got about 1.42e+05 microseconds -- with this approach we see:

kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' -s'
def alp(L):
L = list(L)
while L:
it1 = L.pop()
for it2 in L: yield it1, it2
' 'list(alp(l))'
10 loops, best of 3: 1.51e+05 usec per loop
Alex
Jul 18 '05 #8
Alex Martelli <aleaxit <at> yahoo.com> writes:
With slicing:

kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(x,y) for i,x
in enumerate(l) for y in l[i+1:]]'
10 loops, best of 3: 1.43e+05 usec per loop
kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(x,y) for i,x
in enumerate(l) for y in l[i+1:]]'
10 loops, best of 3: 1.41e+05 usec per loop

With xrange(len(...:

kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(l[i],l[j])
for i in xrange(len(l)) for j in xrange(i+1,len(l))]'
10 loops, best of 3: 1.61e+05 usec per loop
kallisti:~/cb alex\$ python2.4 timeit.py -s'l=range(333)' '[(l[i],l[j])
for i in xrange(len(l)) for j in xrange(i+1,len(l))]'
10 loops, best of 3: 1.62e+05 usec per loop
Wow, interesting. Slicling's still faster in a slightly more fair comparison:
python timeit.py -s "l=range(333)" "[(x,i) for i,x in enumerate(l) for y in l [:i+1]]"
10 loops, best of 3: 24.5 msec per looppython timeit.py -s "l=range(333)" "[(x,i) for i,x in enumerate(l) for y in l [:i+1]]"
10 loops, best of 3: 24.4 msec per loop
python timeit.py -s "l=range(333)" "[(x,l[j]) for i,x in enumerate(l) for j in xrange(i+1,len(l))]"
10 loops, best of 3: 28.9 msec per looppython timeit.py -s "l=range(333)" "[(x,l[j]) for i,x in enumerate(l) for j

in xrange(i+1,len(l))]"
10 loops, best of 3: 28.8 msec per loop

So slicing a list and iterating over the slice is faster than iterating over
the indices and indexing the items... Is this generally true? Is it
something I can depend on across implementations?

STeve

Jul 18 '05 #9
Alex Martelli <aleaxit <at> yahoo.com> writes:
If you start with an L=list(L), you can also optionally L.reverse() to
play with the ordering (if significant, issue still not clarified).

The order of the elements isn't crucial, but for debugging purposes it would
be slightly better if they maintained order.

Steve

Jul 18 '05 #10
Alex Martelli <aleaxit <at> yahoo.com> writes:

Steven Bethard <steven.bethard <at> gmail.com> wrote:

but that seems like a lot of wasteful list slicing...

Sorry, I don't get it: what's wasteful about it?

Sorry, perhaps I should have said it has a bad smell. I wasn't the only one
who thought this smelled a little bad; quoting Jeff:

"Something about taking a slice of seq for the inner loop doesn't seem right
to me."

Maybe we need to retrain our noses. ;)

Steve

Jul 18 '05 #11
Steven Bethard <st************@gmail.com> wrote:
...
So slicing a list and iterating over the slice is faster than iterating over
the indices and indexing the items... Is this generally true? Is it
something I can depend on across implementations?

Unfortunately, nothing can forbid a new implementation, or even a new
release of an existing implementation, acquiring some special new
optimization that makes a technique that used to be slower suddenly
faster. And if it _was_ possible to forbid optimizations, you wouldn't
really want to, anyway -- a potential optimization of some _other_
technique wouldn't damage the solution you've chosen, anyway, except in
terms of "relative standing" vs the now-optimized alternative.

Since using the slice is simpler, and at least for now it's also faster
(so it won't become _too_ slow anyway), that's what I would suggest. As
a precaution you might want to use a 'virtual slice' through a function
such as:

def vslice_from(alist, start):
''' returns some iterable x such that "for i in x:" is exactly
equivalent to "for i in alist[start:]:". '''
return alist[start:]

so that you can experiment with alternatives without having to modify
most of your code, just through alterations to vslice_from. E.g.

import itertools
def vslice_from(alist, start, islice=itertools.islice):
return islice(alist, start, len(alist))

or

def vslice_from(alist, start):
for i in xrange(start, len(alist)): yield alist[i]

or even customized pyrex or C implementations if your quest for the
ultimate performance needs them...
Alex
Jul 18 '05 #12
Steven Bethard <st************@gmail.com> wrote:
but that seems like a lot of wasteful list slicing...

Sorry, I don't get it: what's wasteful about it?

Sorry, perhaps I should have said it has a bad smell. I wasn't the only one
who thought this smelled a little bad; quoting Jeff:

"Something about taking a slice of seq for the inner loop doesn't seem right
to me."

Maybe we need to retrain our noses. ;)

Hmmm, I see (or should I say, I sniff). Perhaps a Numeric.array, where
slices share slots rather than making new slots, would smell better and
might indeed perform faster here. But the comparison is with index
loops on xrange(i,len(l)), which is hardly Chanel N. 5 either...;-).
Alex
Jul 18 '05 #13

### This discussion thread is closed

Replies have been disabled for this discussion.