473,396 Members | 1,933 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

pairs from a list

I want to generate sequential pairs from a list.
Here is a way::

from itertools import izip, islice
for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
print x12

(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)

Thanks,
Alan Isaac
Jan 22 '08 #1
18 3768
Alan Isaac <ai****@american.eduwrites:
(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)
You have to try a bunch of different ways and time them. One
idea (untested):

def pairs(seq):
while True:
yield (seq.next(), seq.next())
Jan 22 '08 #2
On Jan 21, 10:20 pm, Alan Isaac <ais...@american.eduwrote:
I want to generate sequential pairs from a list.
Here is a way::

from itertools import izip, islice
for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
print x12

(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)
Look up the timeit module and test yourself the various alternatives;
that's the most reliable way to tell for sure.

George
Jan 22 '08 #3
On Jan 22, 3:20 am, Alan Isaac <ais...@american.eduwrote:
I want to generate sequential pairs from a list.
<<snip>>
What is the fastest way? (Ignore the import time.)
1) How fast is the method you have?
2) How much faster does it need to be for your application?
3) Are their any other bottlenecks in your application?
4) Is this the routine whose smallest % speed-up would give the
largest overall speed up of your application?

- Paddy.
Jan 22 '08 #4
On Jan 22, 12:15 am, Paddy <paddy3...@googlemail.comwrote:
On Jan 22, 3:20 am, Alan Isaac <ais...@american.eduwrote:I want to generate sequential pairs from a list.
<<snip>>
What is the fastest way? (Ignore the import time.)

1) How fast is the method you have?
2) How much faster does it need to be for your application?
3) Are their any other bottlenecks in your application?
4) Is this the routine whose smallest % speed-up would give the
largest overall speed up of your application?
I believe the "what is the fastest way" question for such small well-
defined tasks is worth asking on its own, regardless of whether it
makes a difference in the application (or even if there is no
application to begin with). Just because cpu cycles are cheap these
days is not a good reason to be sloppy. Moreover, often the fastest
pure Python version happens to be among the most elegant and concise,
unlike other languages where optimization usually implies obfuscation.

George
Jan 22 '08 #5
On Jan 22, 3:20*am, Alan Isaac <ais...@american.eduwrote:
I want to generate sequential pairs from a list.
Here is a way::

* * from itertools import izip, islice
* * for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
* * * * print x12

(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)

Thanks,
Alan Isaac
Don't know the fastest, but here's a very concise way:

from itertools import izip

def ipairs(seq):
it = iter(seq)
return izip(it, it)
>>list(pairs(xrange(10)))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
>>list(pairs('hello'))
[('h', 'e'), ('l', 'l')]

--
Arnaud

Jan 22 '08 #6
I suppose my question should have been,
is there an obviously faster way?
Anyway, of the four ways below, the
first is substantially fastest. Is
there an obvious reason why?

Thanks,
Alan Isaac

PS My understanding is that the behavior
of the last is implementation dependent
and not guaranteed.

def pairs1(x):
for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
yield x12

def pairs2(x):
xiter = iter(x)
while True:
yield xiter.next(), xiter.next()

def pairs3(x):
for i in range( len(x)//2 ):
yield x[2*i], x[2*i+1],

def pairs4(x):
xiter = iter(x)
for x12 in izip(xiter,xiter):
yield x12
Jan 22 '08 #7
On Jan 22, 1:19*pm, Alan Isaac <ais...@american.eduwrote:
[...]
PS My understanding is that the behavior
of the last is implementation dependent
and not guaranteed.
[...]
def pairs4(x):
* * xiter = iter(x)
* * for x12 in izip(xiter,xiter):
* * * * yield x12
According to the docs [1], izip is defined to be equivalent to:

def izip(*iterables):
iterables = map(iter, iterables)
while iterables:
result = [it.next() for it in iterables]
yield tuple(result)

This guarantees that it.next() will be performed from left to right,
so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
(4, 3)].

Is there anything else that I am overlooking?

[1] http://docs.python.org/lib/itertools-functions.html

--
Arnaud
Jan 22 '08 #8
On Jan 22, 1:19*pm, Alan Isaac <ais...@american.eduwrote:
I suppose my question should have been,
is there an obviously faster way?
Anyway, of the four ways below, the
first is substantially fastest. *Is
there an obvious reason why?
Can you post your results?

I get different ones (pairs1 and pairs2 rewritten slightly to avoid
unnecessary indirection).

====== pairs.py ===========
from itertools import *

def pairs1(x):
return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
xiter = iter(x)
while True:
yield xiter.next(), xiter.next()

def pairs3(x):
for i in range( len(x)//2 ):
yield x[2*i], x[2*i+1],

def pairs4(x):
xiter = iter(x)
return izip(xiter,xiter)

def compare():
import timeit
for i in '1234':
t = timeit.Timer('list(pairs.pairs%s(l))' % i,
'import pairs; l=range(1000)')
print 'pairs%s: %s' % (i, t.timeit(10000))

if __name__ == '__main__':
compare()
=====================

marigold:python arno$ python pairs.py
pairs1: 0.789824962616
pairs2: 4.08462786674
pairs3: 2.90438890457
pairs4: 0.536775827408

pairs4 wins.

--
Arnaud

Jan 22 '08 #9
Arnaud Delobelle wrote:
According to the docs [1], izip is defined to be equivalent to:

def izip(*iterables):
iterables = map(iter, iterables)
while iterables:
result = [it.next() for it in iterables]
yield tuple(result)

This guarantees that it.next() will be performed from left to right,
so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
(4, 3)].

Is there anything else that I am overlooking?

[1] http://docs.python.org/lib/itertools-functions.html

<URL:http://bugs.python.org/issue1121416>

fwiw,
Alan Isaac
Jan 22 '08 #10
On Jan 22, 4:10*pm, Alan Isaac <ais...@american.eduwrote:
<URL:http://bugs.python.org/issue1121416>

fwiw,
Alan Isaac
Thanks. So I guess I shouldn't take the code snippet I quoted as a
specification of izip but rather as an illustration.

--
Arnaud

Jan 22 '08 #11
On Jan 22, 5:34 am, George Sakkis <george.sak...@gmail.comwrote:
On Jan 22, 12:15 am, Paddy <paddy3...@googlemail.comwrote:
On Jan 22, 3:20 am, Alan Isaac <ais...@american.eduwrote:I want to generate sequential pairs from a list.
<<snip>>
What is the fastest way? (Ignore the import time.)
1) How fast is the method you have?
2) How much faster does it need to be for your application?
3) Are their any other bottlenecks in your application?
4) Is this the routine whose smallest % speed-up would give the
largest overall speed up of your application?

I believe the "what is the fastest way" question for such small well-
defined tasks is worth asking on its own, regardless of whether it
makes a difference in the application (or even if there is no
application to begin with).
Hi George,
You need to 'get it right' first. Micro optimizations for speed
without thought of the wider context is a bad habit to form and a time
waster.
If the routine is all that needs to be delivered and it does not
perform at an acceptable speed then find out what is acceptable and
optimise towards that goal. My questions were set to get posters to
think more about the need for speed optimizations and where they
should be applied, (if at all).

A bit of forethought might justify leaving the routine alone, or
optimising for readability instead.

- Paddy.
Jan 22 '08 #12
On Jan 22, 6:34*pm, Paddy <paddy3...@googlemail.comwrote:
[...]
Hi George,
You need to 'get it right' first. Micro optimizations for speed
without thought of the wider context is a bad habit to form and a time
waster.
If the routine is all that needs to be delivered and it does not
perform at an acceptable speed then find out what is acceptable and
optimise towards that goal. My questions were set to get posters to
think more about the need for speed optimizations and where they
should be applied, (if at all).

A bit of forethought might justify leaving the routine alone, or
optimising for readability instead.
But it's fun!

Some-of-us-can't-help-it'ly yours
--
Arnaud

Jan 22 '08 #13
Steven D'Aprano wrote:
In fact, "fastest" isn't even a meaningful attribute. Does it mean:

* the worst-case is fastest
* the best-case is fastest
* the average-case is fastest
* fastest on typical data
* all of the above

I confess that it did not occur to me that there
might be an interesting distinction among these
cases for the question of how to get sequential
pairs from a list. How would one draw these
distinctions in this case?

Thanks,
Alan Isaac

PS Just for context, the sequential pairs were
needed in a simulation, but my question was
primarily prompted by my surprise that the
approaches I looked at differed as much as they did.
Jan 23 '08 #14
On Jan 23, 4:37 am, Steven D'Aprano
<ste...@REMOVE.THIS.cybersource.com.auwrote:
On Tue, 22 Jan 2008 23:33:00 -0800, George Sakkis wrote:
As I mentioned already, I consider the seeking of the most efficient
solution a legitimate question, regardless of whether a "dumb" solution
is fast enough for an application. Call it a "don't be sloppy" principle
if you wish.

Sure, by why do you limit "efficient" and "don't be sloppy" to mean
"write the fastest executing code you can, regardless of every other
trade-off"?
I explicitly didn't limit sloppiness to inefficiency and mentioned
it's a tradeoff:

"... all else being equal or at least comparable (elegance,
conciseness, readability, etc.). Of course it's a tradeoff;
spending a week to save a few milliseconds on average is usually a
waste for most applications, but being a lazy keyboard banger writing
the first thing that pops into mind is not that good either."
But... do you write list.__len__() instead of len(list) to save a few
nanoseconds?
No, of course not, it's not worth it, but that doesn't mean that being
curious about what's faster and using timeit to find out is totally
worthless.

Another example: avoiding attribute lookups within a loop. I rarely
write
bar = foo.bar
for i in big_list: bar(i)

but it's valuable to know that it can make a difference when I really
need it. Always writing the first thing that "just works" prevents one
from even considering that there might be faster (or more elegant,
more general, etc.) alternatives.

George
Jan 23 '08 #15
On Jan 23, 1:30 pm, Paddy <paddy3...@googlemail.comwrote:
I've heard quality expressed as "meeting requirements", which I think
is apt. Falling short of requirements everyone knows doesn't give a
quality result, but equally 'exceeding' requirements also detracts
from quality (as does not knowing your requirements).
It is good to learn optimization techniques, which may be part of what
you are saying, but part of that is learning when it pays to apply
them and/or search for them; and when it does not.
The OP wanted an answer to a simple question, not a lecture on good
software engineering principles. This whole subthread reminds of a
movie (can't remember which) where someone asks his buddy in the
stadium "what do you want?". His buddy gets it wrong and embarks in a
long diatribe of what he wants in life now, what he wanted as a child,
what's the meaning of one's life and so on. After a couple of minutes
the guy cuts him and asks again:
- "Man, what do you want, burger or hot dog?"
- "Oh, a hot dog".

Sometimes you want to see the tree right in front of you, not the
whole damn forest.

George
Jan 23 '08 #16
On Jan 23, 7:06*pm, George Sakkis <george.sak...@gmail.comwrote:
The OP wanted an answer to a simple question, not a lecture on good
software engineering principles.
I wholeheartedly agree.

--
Arnaud

Jan 23 '08 #17
On Wed, 23 Jan 2008 10:39:25 -0800, George Sakkis wrote:
On Jan 23, 4:37 am, Steven D'Aprano
<ste...@REMOVE.THIS.cybersource.com.auwrote:
>On Tue, 22 Jan 2008 23:33:00 -0800, George Sakkis wrote:
As I mentioned already, I consider the seeking of the most efficient
solution a legitimate question, regardless of whether a "dumb"
solution is fast enough for an application. Call it a "don't be
sloppy" principle if you wish.

Sure, by why do you limit "efficient" and "don't be sloppy" to mean
"write the fastest executing code you can, regardless of every other
trade-off"?

I explicitly didn't limit sloppiness to inefficiency and mentioned it's
a tradeoff:
Of course you did, and I was being sloppy. The "you" was meant more as a
generic you than you yourself. Sorry for the confusion.

As for your other points, I think we're actually very much in agreement,
except for your tolerance of random posters asking what I believe is an
incoherent question: "what's the fastest way to do ...?". It seems to me
you're willing to give them the benefit of the doubt that they've done
their profiling and considered their trade-offs, or at the very worst are
asking from purely intellectual curiosity. Call me cynical if you like,
but I think that in the absence of any direct evidence supporting those
things, the most likely possibility is the opposite.

--
Steven
Jan 23 '08 #18
On Wed, 23 Jan 2008 11:06:44 -0800, George Sakkis wrote:
On Jan 23, 1:30 pm, Paddy <paddy3...@googlemail.comwrote:
>I've heard quality expressed as "meeting requirements", which I think
is apt. Falling short of requirements everyone knows doesn't give a
quality result, but equally 'exceeding' requirements also detracts from
quality (as does not knowing your requirements). It is good to learn
optimization techniques, which may be part of what you are saying, but
part of that is learning when it pays to apply them and/or search for
them; and when it does not.

The OP wanted an answer to a simple question, not a lecture on good
software engineering principles.

(1) It isn't a simple question. There were at least five alternatives,
one of which would have been "fastest" except it failed the correctness
test. The OP went on to ask:

"I suppose my question should have been, is there an obviously faster way?
Anyway, of the four ways below, the first is substantially fastest.
[Note: actually it wasn't.] Is there an obvious reason why?"

Asking *why* something is faster is very different from asking *which* is
faster. I don't believe anyone actually answered that "why" question.

So, for the record, here's my attempt at an answer:

No, there is no "obvious" reason why one solution should be faster than
another. But perhaps the two most general rules of thumb are:

* the more work you can push into built-ins written in C, and the least
in slow code written in Python, the faster;

* lazy iterators are often (but not always) faster than building the
entire list all at once.

Other than that, the details of why one particular technique is faster
than another depends on the implementation of each, and that's not
obvious.

(2) What the OP wanted and what he needed are not necessarily the same.
Answering the OP's direct question is like giving him a fish -- he'll be
hungry tomorrow. Teaching him good engineering principles is like
teaching him to fish.

(3) This is Usenet, and the advice is free. Sometimes you get more than
you expected, sometimes less. Regardless of what the OP wanted, the
thread will go where the thread will go. The discussions of software
engineering were mostly in response to you, not the OP.

--
Steven
Jan 23 '08 #19

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
by: Equis Uno | last post by:
Hi, Assume I'm given a 100k file full of key-value pairs: date,value 26-Feb-04,36.47 25-Feb-04,36.43 24-Feb-04,36.30 23-Feb-04,37.00 20-Feb-04,37.00
12
by: Steven Bethard | last post by:
So I need to do something like: for i in range(len(l)): for j in range(i+1, len(l)): # do something with (l, l) where I get all pairs of items in a list (where I'm thinking of pairs as sets,...
7
by: Woodster | last post by:
Further to a question I posted here recently about coding an xml rpeort parser, I need to use a list of name/value pairs, in this case, a list property names and values for a given tag. Rather...
5
by: Bosconian | last post by:
I need a comma delimited regular expression pattern with the followng restrictions: no leading and trailing white space no trailing comma double quoted numeric/alpha pairs each pair on a...
2
by: Evan | last post by:
Is there a simple way to to identify and remove matching pairs from 2 lists? For example: I have a= b=
4
by: Ronald S. Cook | last post by:
I want to maintain a list of value/pairs in my Windows application. Where is the best way to store this? Do I want to use Application Settings and have a row for each value pair? There will be...
3
by: Gabriel Zachmann | last post by:
Well, could some kind soul please explain to me why the following trivial code is misbehaving? #!/usr/bin/python lst = s =
7
by: tmitt | last post by:
Hi, I missed a couple classes due to being sick this past week and I am having trouble getting steered in the right direction for my programming assignment. I'd ask the professor for help but he has...
15
by: mcjason | last post by:
I saw something interesting about a grid pair puzzle problem that it looks like a machine when you find each that work out the way it does and say with all the others that work out the way they...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.