Bytes IT Community

Profiling, sum-comprehension vs reduce

cnb
This must be because of the implementation, right? Shouldn't reduce be
faster, since it iterates once over the list?
Doesn't sum first construct the list and then sum it?

-----------------------
>>> ================================ RESTART ================================
reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845
>>> ================================ RESTART ================================
reduce with named function: 36.4529584067
reduce with nested, named function: 37.6278529813
reduce with lambda: 38.2629448715
sum comprehension: 26.0197561422
>>>


from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    return reduce(add, lst)

def rednn(lst):
    def add2(x, y):
        return x + y
    return reduce(add2, lst)

def redlambda(lst):
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    return sum(x for x in lst)

s = xrange(101)

t1 = Timer('rednamed(s)', 'from __main__ import rednamed, s')
t2 = Timer('rednn(s)', 'from __main__ import rednn, s')
t3 = Timer('redlambda(s)', 'from __main__ import redlambda, s')
t4 = Timer('com(s)', 'from __main__ import com, s')

print "reduce with named function: ", t1.timeit()
print "reduce with nested, named function: ", t2.timeit()
print "reduce with lambda: ", t3.timeit()
print "sum comprehension: ", t4.timeit()
---------------------------------------
Also, using range instead of xrange doesn't seem to incur a
performance penalty:
>>>
reduce with named function: 36.7560729087
reduce with nested, named function: 38.5393266463
reduce with lambda: 38.3852953378
sum comprehension: 27.9001007111
>>>
Sep 13 '08 #1
7 Replies


On Sat, 13 Sep 2008 01:06:22 -0700, cnb wrote:
This must be because of the implementation, right? Shouldn't reduce be faster
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?
No it doesn't. Why should it?
Also, using range instead of xrange doesn't seem to incur a
performance penalty:
(De)Allocating a list of length 100 isn't very slow. Try some million
elements. And watch the memory consumption too.
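Marc's point can be made concrete with `sys.getsizeof`. This is a rough sketch (written with Python 3's `range`/`iter` in place of `xrange`): `getsizeof` only measures the container itself, not the integers inside it, but the gap between a materialised list and a lazy iterator is already dramatic.

```python
import sys

n = 10 ** 6
big_list = list(range(n))  # materialises a million elements at once
lazy = iter(range(n))      # yields them one at a time

# The list holds a million object pointers; the iterator is a few
# dozen bytes regardless of n.
print(sys.getsizeof(big_list))
print(sys.getsizeof(lazy))
```

The allocation cost Marc mentions scales the same way: the list has to be built and freed, while the iterator never exists all at once.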

Ciao,
Marc 'BlackJack' Rintsch
Sep 13 '08 #2

Doesn't sum first construct the list and then sum it?
def com(lst):
    return sum(x for x in lst)
You construct a generator over an existing list in your code.
Try sum([x for x in lst]) to see the effect of additional list
construction. And while you're at it, try the simple sum(lst).
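A sketch of that comparison with timeit (timings will of course vary by machine; `list(range(...))` is used here so it runs on modern Python as well):

```python
from timeit import Timer

setup = 'lst = list(range(101))'

# Three variants: sum the list directly, sum a generator expression
# over it, and sum a list comprehension (which builds an extra list).
t_plain = Timer('sum(lst)', setup).timeit(number=100000)
t_genexp = Timer('sum(x for x in lst)', setup).timeit(number=100000)
t_listcomp = Timer('sum([x for x in lst])', setup).timeit(number=100000)

print(t_plain, t_genexp, t_listcomp)
```

The plain `sum(lst)` should come out fastest, since both other forms pay for iterating the generator or building the intermediate list.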

Cheers,

- harold -
Sep 13 '08 #3

Bas
On Sep 13, 10:06 am, cnb <circularf...@yahoo.se> wrote:
This must be because of the implementation, right? Shouldn't reduce be
faster since it iterates once over the list?
Doesn't sum first construct the list and then sum it?
No, sum also iterates over the sequence just once and doesn't create a
list. It is probably implemented similarly to

def sum(sequence, start=0):
    it = iter(sequence)
    total = start
    for i in it:
        total += i
    return total

but implemented in C for speed. Reduce is probably implemented pretty
similarly, but with an arbitrary function in place of addition.
Make sure that you understand the difference between generator
expression and list comprehension, and that [f(x) for x in something]
is (almost) equal to list(f(x) for x in something), so you can emulate
a LC by using the list constructor on the equivalent GE.
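Bas's equivalence can be checked directly. A minimal sketch (the helper `f` is just an arbitrary example function):

```python
def f(x):
    return x * x

s = range(5)

lc = [f(x) for x in s]       # list comprehension
ge = list(f(x) for x in s)   # list() applied to the equivalent genexp

# Both build the same list; the LC just does it without a separate
# generator object.
print(lc)
print(ge)
```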

HTH,
Bas
Sep 13 '08 #4

On Sat, 13 Sep 2008 01:06:22 -0700, cnb wrote:
This must be because of the implementation, right? Shouldn't reduce be faster
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?
What makes you think that?

Given the speed of sum(), it sure doesn't look like it's generating a
full list before summing. Why would it?

reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845
If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
>>> Timer('reduce(add, xrange(10000))',
...       'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]
>>> Timer('reduce(add, xrange(10000))',
...       'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]
You probably won't see much (if any) benefit for small lists, so make
sure your test is on a significantly-sized input list.

Of course, sum() is even faster than reduce:
>>> Timer('sum(xrange(10000))').repeat(number=10000)
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]

--
Steven
Sep 13 '08 #5



Steven D'Aprano wrote:
If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
>>> Timer('reduce(add, xrange(10000))',
...       'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]
>>> Timer('reduce(add, xrange(10000))',
...       'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]
....
Of course, sum() is even faster than reduce:
>>> Timer('sum(xrange(10000))').repeat(number=10000)
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]
'Of course', because the irreducible difference between reduce(add, seq)
and sum(seq) is that reduce has to call an add function while sum has
the operation built in, in place of a call.

Sep 13 '08 #6

Terry Reedy <tj*****@udel.edu> writes:
>Of course, sum() is even faster than reduce:
>>> Timer('sum(xrange(10000))').repeat(number=10000)
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]

'Of course', because the irreducible difference between
reduce(add, seq) and sum(seq) is that reduce has to call an add
function while sum has the operation built in, in place of a call.
Note that, despite appearances, it's not as built-in as one might
wish. sum(seq) is still completely generic and works on all
number-like objects (in fact on all objects that define an __add__
operation except strings, which are explicitly forbidden). This means
that it can't just go ahead and execute a C addition, it must properly
call PyNumber_Add (the C API call equivalent to Python's "+"
operator), which will then inspect the objects and invoke the
appropriate implementation of addition. On the other hand,
operator.add is defined to do exactly that -- call PyNumber_Add on its
arguments and return the result.

It's not entirely obvious why summing numbers using the same method,
repeatedly calling PyNumber_Add, would be significantly slower for
reduce than for sum. Admittedly reduce has to execute a Python-level
function call which requires allocating and filling out a tuple, only
to reach PyNumber_Add, which sum calls immediately. But
builtin_reduce() is actually careful to optimize those repeated
function calls, for example by reusing the same tuple across the loop.
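The PyNumber_Add shortcut Hrvoje describes is what `operator.add` gives you. A small sketch (using `functools.reduce`, which is where reduce lives on Python 3; on Python 2 it is a builtin):

```python
import operator
from functools import reduce  # builtin on Python 2; in functools on Python 3

nums = list(range(10000))

# operator.add is a thin C wrapper around PyNumber_Add, so reduce
# skips the Python-level frame a def/lambda adder would need per element.
via_reduce = reduce(operator.add, nums)
via_sum = sum(nums)

print(via_reduce, via_sum)
```

Both compute the same total; the difference the thread is measuring is purely in how the per-element addition gets dispatched.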
Sep 13 '08 #7

On Sep 13, 9:27 pm, Hrvoje Niksic <hnik...@xemacs.org> wrote:
Note that, despite appearances, it's not as built-in as one might
wish. sum(seq) is still completely generic and works on all
number-like objects (in fact on all objects that define an __add__
operation except strings, which are explicitly forbidden). This means
that it can't just go ahead and execute a C addition; it must properly
call PyNumber_Add (the C API call equivalent to Python's "+"
operator), which will then inspect the objects and invoke the
appropriate implementation of addition.
The time machine strikes again! Try using Py2.6.
The built-in sum() function is much smarter and faster.
It does in fact use C addition.
Raymond
Sep 16 '08 #8
