Hi all,
I have a list of items that has contiguous repetitions of values, but
the number and location of the repetitions is not important, so I just
need to strip them out. For example, if my original list is
[0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the code
above?
Thanks,
Alan McIntyre http://www.esrgtech.com
Alan McIntyre wrote: Hi all,
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the code above?
Well, this does about the same thing, but using enumerate and a list
comprehension:
py> lst = [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> [item for i, item in enumerate(lst) if i == 0 or item != lst[i-1]]
[0, 1, 2, 3, 2, 4, 5]
Similar code that doesn't check 'if i == 0' each time through:
py> itr = enumerate(lst)
py> itr.next()
(0, 0)
py> [lst[0]] + [item for i, item in itr if item != lst[i-1]]
[0, 1, 2, 3, 2, 4, 5]
I don't know if either of these is really more elegant though...
Steve
On Fri, 04 Feb 2005 12:43:37 -0500, Alan McIntyre wrote:

def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the code above?
I think that's pretty elegant; I read it and immediately understood what
you were doing. There may be some performance tweaks you could make if you
were doing this to large lists, and my instincts say to re-write it as an
iterator if you use it a lot like:
for item in collapse(yourList):
but other than that which may not even apply, "straightforward" is
generally a *good* thing, don't you think? :-)
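Spelled out, the iterator rewrite Jeremy suggests might look something like this (a sketch in modern Python syntax; `collapse` is the name his `for item in collapse(yourList):` line assumes):

```python
def collapse(items):
    """Yield the first element of each run of equal adjacent values."""
    it = iter(items)
    try:
        last = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    yield last
    for item in it:
        if item != last:
            yield item
            last = item

for item in collapse([0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5]):
    print(item)
```

The advantage over the list-building version is that it never materializes the whole result, so it works on arbitrarily long (or infinite) iterables.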
On Fri, Feb 04, 2005 at 12:43:37PM -0500, Alan McIntyre wrote: Hi all,
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the code above?
If you are using python2.4,

>>> import itertools as it
>>> [x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5])]
[0, 1, 2, 3, 2, 4, 5]

Since this is 2.4 you could also return a generator expression.

>>> def iter_collapse(myList):
...     return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]))
...
>>> i = iter_collapse([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5])
>>> i
<generator object at 0xb7df6b2c>
>>> list(i)
[0, 1, 2, 3, 2, 4, 5]
-Jack
Your first example is along the lines of what I was thinking when I said
"elegant." :) I was looking for something that I could drop into one
or two lines of code; I may not do that if I'm writing code that will
have to be maintained, but it's still nice to know how to do it.
Thanks :)
Alan
Steven Bethard wrote: Well, this does about the same thing, but using enumerate and a list comprehension:
py> lst = [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> [item for i, item in enumerate(lst) if i == 0 or item != lst[i-1]]
[0, 1, 2, 3, 2, 4, 5]
Similar code that doesn't check 'if i == 0' each time through:
py> itr = enumerate(lst)
py> itr.next()
(0, 0)
py> [lst[0]] + [item for i, item in itr if item != lst[i-1]]
[0, 1, 2, 3, 2, 4, 5]
I don't know if either of these is really more elegant though...
Steve
Alan McIntyre wrote: Hi all,
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the code above?
Here's a solution that works for iterables other than lists:
py> def collapse(iterable):
....     enumeration = enumerate(iterable)
....     _, lastitem = enumeration.next()
....     yield lastitem
....     for i, item in enumeration:
....         if item != lastitem:
....             yield item
....             lastitem = item
....
py> lst = [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> list(collapse(lst))
[0, 1, 2, 3, 2, 4, 5]
Again, I'm still not sure I'd call this more elegant...
STeVe
I think you're right; sometimes I'm susceptible to the "I can do that in
one line of code" temptation. :)
Since this current bit of code will probably end up in something that's
going to be maintained, I will probably stick with the straightforward
method just to be nice to the maintainer (especially if it's me!).
Jeremy Bowers wrote: I think that's pretty elegant; I read it and immediately understood what you were doing. There may be some performance tweaks you could make if you were doing this to large lists, and my instincts say to re-write it as an iterator if you use it a lot like:
for item in collapse(yourList):
but other than that which may not even apply, "straightforward" is generally a *good* thing, don't you think? :-)
Jack,
I'm not using 2.4 yet; still back in 2.3x. :) Thanks for the examples,
though - they are clear enough that I will probably use them when I upgrade.
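For when the upgrade happens, Jack's groupby approach with the input parameterized rather than hard-coded (a sketch in modern Python; note the snippet as posted applied `it.groupby` to the sample list instead of `myList`):

```python
from itertools import groupby

def iter_collapse(items):
    # groupby yields one (key, group) pair per run of equal adjacent
    # values; keeping only the keys collapses each run to one element.
    return (key for key, group in groupby(items))

print(list(iter_collapse([0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5])))
```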
Thanks,
Alan
Jack Diederich wrote: If you are using python2.4,
>>> import itertools as it
>>> [x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5])]
[0, 1, 2, 3, 2, 4, 5]
Since this is 2.4 you could also return a generator expression.
>>> def iter_collapse(myList):
...     return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]))
...
>>> i = iter_collapse([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5])
>>> i
<generator object at 0xb7df6b2c>
>>> list(i)
[0, 1, 2, 3, 2, 4, 5]

-Jack
Alan McIntyre wrote:
.... I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
....
Is there an elegant way to do this, or should I just stick with the code above?

>>> def changes( dataset ):
...     last = None
...     for value in dataset:
...         if value != last:
...             yield value
...             last = value
...
>>> print list(changes(data ))
which is quite readable/elegant IMO.
Have fun,
Mike
________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder http://www.vrplumber.com http://blog.vrplumber.com
PyCon is coming...
Mike C. Fletcher wrote: Alan McIntyre wrote: ...
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
...
Is there an elegant way to do this, or should I just stick with the code above?
>>> def changes( dataset ):
...     last = None
...     for value in dataset:
...         if value != last:
...             yield value
...             last = value
...
>>> print list(changes(data ))
which is quite readable/elegant IMO.
But fails if the list starts with None:
py> lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> def changes(dataset):
....     last = None
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[0, 1, 2, 3, 2, 4, 5]
A minor modification that does appear to work:
py> def changes(dataset):
....     last = object()
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[None, 0, 1, 2, 3, 2, 4, 5]
STeVe
Steven Bethard <st************@gmail.com> writes: Mike C. Fletcher wrote:
[...]
>>> def changes( dataset ):
...     last = None
...     for value in dataset:
...         if value != last:
...             yield value
...             last = value
...
>>> print list(changes(data ))
which is quite readable/elegant IMO.
But fails if the list starts with None:
py> lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> def changes(dataset):
....     last = None
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[0, 1, 2, 3, 2, 4, 5]
A minor modification that does appear to work:
py> def changes(dataset):
....     last = object()
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[None, 0, 1, 2, 3, 2, 4, 5]
Unless the first object in the list has a weird __cmp__ (does
happen...). OK, weird __cmp__s are nasty anyway, but still, why
compound it through cleverness when you can write a really plodding
function that *always* does what it says on the tin?
clever-is-evil-ly y'rs,
John
John J. Lee wrote: Steven Bethard <st************@gmail.com> writes:
Mike C. Fletcher wrote:
[...]
>>> def changes( dataset ):
...     last = None
...     for value in dataset:
...         if value != last:
...             yield value
...             last = value
...
>>> print list(changes(data ))
which is quite readable/elegant IMO.
But fails if the list starts with None:
py> lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> def changes(dataset):
....     last = None
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[0, 1, 2, 3, 2, 4, 5]
A minor modification that does appear to work:
py> def changes(dataset):
....     last = object()
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[None, 0, 1, 2, 3, 2, 4, 5]
Unless the first object in the list has a weird __cmp__ (does happen...). OK, weird __cmp__s are nasty anyway, but still, why compound it through cleverness when you can write a really plodding function that *always* does what it says on the tin?
In that case:
Py> def collapsed(iterable):
....     itr = iter(iterable)
....     last = itr.next()
....     yield last
....     for value in itr:
....         if value != last:
....             yield value
....             last = value
....
Py> from Tkinter import _flatten #**
Py> lst = list(_flatten([[i] * 5 for i in range(10)]))
Py> lst
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9]
Py> list(collapsed(lst))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
** Hmm, this function really is quite handy when generating test data. Maybe in
Python 2.5 the line should be spelt "from itertools import flatten"
Cheers,
Nick.
--
Nick Coghlan | nc******@email.com | Brisbane, Australia
--------------------------------------------------------------- http://boredomandlaziness.skystorm.net
Jack Diederich wrote: Since this is 2.4 you could also return a generator expression.
def iter_collapse(myList):
...     return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]))
...
But why write a function that returns a generator expression, when you could
just turn the function itself into a generator?
Py> def iter_collapse(myList):
....     for x in it.groupby(myList):
....         yield x[0]
Cheers,
Nick.
--
Nick Coghlan | nc******@email.com | Brisbane, Australia
--------------------------------------------------------------- http://boredomandlaziness.skystorm.net
Steven Bethard <st************@gmail.com> wrote: Here's a solution that works for iterables other than lists:
py> def collapse(iterable):
...     enumeration = enumerate(iterable)
...     _, lastitem = enumeration.next()
...     yield lastitem
...     for i, item in enumeration:
...         if item != lastitem:
...             yield item
...             lastitem = item
...
py> lst = [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> list(collapse(lst))
[0, 1, 2, 3, 2, 4, 5]
Again, I'm still not sure I'd call this more elegant...
Hmmmm, what role does the enumeration play here? I don't see how you're
using it, at all. Why not just:
def collapse(iterable):
    it = iter(iterable)
    lastitem = it.next()
    yield lastitem
    for item in it:
        if item != lastitem:
            yield item
            lastitem = item
that's basically just the same as your code but without the strangeness
of making an enumerate for the purpose of ignoring what the enumerate
adds to an ordinary iterator.
Alex
Alan McIntyre wrote: Hi all,
I have a list of items that has contiguous repetitions of values, but
the number and location of the repetitions is not important, so I
just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with
[0,1,2,3,2,4,5]. Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed
Is there an elegant way to do this, or should I just stick with the
code above?
Thanks, Alan McIntyre http://www.esrgtech.com
Here is a version using dictionary properties, ie no duplication of
keys.
def condense(l):
    d={}
    for item in l:
        d[item]=1
    l=d.keys()
    return l
Cheers
Tony
Tony,
Actually I only want to remove a certain kind of duplication; if an item
occurs twice - say like this: [1,1,1,2,2,2,1,1,1], then I need to keep
the order and occurrence of the individual values: [1,2,1]. Using a
dict as you proposed loses the order of occurrence, as well as multiple
occurrences of groups of the same item.
If I didn't need those two qualities of the list to be preserved,
though, I think I'd use something like your solution (if I was using a
Python older than 2.3) or Steve Coats' solution posted above using Set.
Thanks!
Alan
Tony wrote: Here is a version using dictionary properties, ie no duplication of keys.
def condense(l):
    d={}
    for item in l:
        d[item]=1
    l=d.keys()
    return l
Cheers Tony
Alex,
Wow, that method turns out to be the fastest so far in a simple
benchmark on Python2.3 (on my machine, of course, YMMV); it takes 14%
less time than the one that I deemed most straightforward. :)
Thanks,
Alan
Alex Martelli wrote: Hmmmm, what role does the enumeration play here? I don't see how you're using it, at all. Why not just:
def collapse(iterable):
    it = iter(iterable)
    lastitem = it.next()
    yield lastitem
    for item in it:
        if item != lastitem:
            yield item
            lastitem = item
that's basically just the same as your code but without the strangeness of making an enumerate for the purpose of ignoring what the enumerate adds to an ordinary iterator.
Alex
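A rough way to reproduce the kind of comparison Alan describes (a sketch using `timeit` in modern Python syntax; the 14% figure was on his machine under 2.3, and absolute numbers will vary by machine and interpreter):

```python
import timeit

def straightforward_collapse(my_list):
    # Alan's original list-building version.
    collapsed = [my_list[0]]
    for n in my_list[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed

def collapse(iterable):
    # Alex's generator version, with next(it) for the era's it.next().
    it = iter(iterable)
    last = next(it)
    yield last
    for item in it:
        if item != last:
            yield item
            last = item

lst = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5]
t_list = timeit.timeit(lambda: straightforward_collapse(lst), number=10000)
t_gen = timeit.timeit(lambda: list(collapse(lst)), number=10000)
print('list version: %.4fs  generator version: %.4fs' % (t_list, t_gen))
```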
On Sat, Feb 05, 2005 at 02:31:08PM +1000, Nick Coghlan wrote: Jack Diederich wrote:Since this is 2.4 you could also return a generator expression.
>def iter_collapse(myList):
...     return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]))
...
But why write a function that returns a generator expression, when you could just turn the function itself into a generator?
Py> def iter_collapse(myList):
....     for x in it.groupby(myList):
....         yield x[0]
Here is where I say, "because it is faster!"
except it isn't. maybe. The results are unexpected, anyway.
import timeit
import itertools as it
def collapse1(myl):
    for x in it.groupby(myl):
        yield x[0]

def collapse2(myl):
    return (x[0] for (x) in it.groupby(myl))
list_str = '[0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]'
print "collapse1", timeit.Timer('collapse1(%s)' % (list_str), 'from __main__ import collapse1').timeit()
print "collapse2", timeit.Timer('collapse2(%s)' % (list_str), 'from __main__ import collapse2').timeit()
print "list(collapse1)", timeit.Timer('list(collapse1(%s))' % (list_str), 'from __main__ import collapse1').timeit()
print "list(collapse2)", timeit.Timer('list(collapse2(%s))' % (list_str), 'from __main__ import collapse2').timeit()
collapse1 1.06855082512
collapse2 3.40627384186
list(collapse1) 8.31489896774
list(collapse2) 9.49903011322
The overhead of creating the generator expression seems much higher than
creating the equivalent function. If you subtract out the setup difference,
actually running through the whole iterator is slightly faster for genexps
than for functions that yield. At a guess it has something to do with how
they handle lexical scope and some byte code differences.
I said guess, right?
-Jack
Alan McIntyre wrote: Tony,
Actually I only want to remove a certain kind of duplication; <snip>
How about this one liner?
def condense(m):
    print [m[0]]+[m[k] for k in range(1,len(m)) if m[k]!=m[k-1]]

b=[1,1,1,2,2,2,1,1,1]
condense(b)
Tony Clarke
Alex Martelli wrote: Steven Bethard <st************@gmail.com> wrote:
Here's a solution that works for iterables other than lists:
py> def collapse(iterable):
...     enumeration = enumerate(iterable)
...     _, lastitem = enumeration.next()
...     yield lastitem
...     for i, item in enumeration:
...         if item != lastitem:
...             yield item
...             lastitem = item
...
py> lst = [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> list(collapse(lst))
[0, 1, 2, 3, 2, 4, 5]
Again, I'm still not sure I'd call this more elegant...
Hmmmm, what role does the enumeration play here?
None =) Sorry, it was an accidental carry-over from the list solution
that looked at lst[i-1]. I thought about correcting it when I realized
this (after posting), but I was too lazy. ;)
Steve
Thanks to everybody that responded; I appreciate all the input, even if
I didn't respond to all of it individually. :)
This is a perfect use case for the good old "reduce" function:
--BEGIN SNAP
a_lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
def straightforward_collapse(lst):
    return reduce(lambda v,e: v[-1]!=e and v+[e] or v, lst[1:], [lst[0]])

def straightforward_collapse_secu(lst):
    return lst and reduce(lambda v,e: v[-1]!=e and v+[e] or v, lst[1:], [lst[0]]) or []

print straightforward_collapse(a_lst)
print straightforward_collapse_secu([])
--END SNAP
Regards
Francis Girard
On Friday 4 February 2005 20:08, Steven Bethard wrote: Mike C. Fletcher wrote: Alan McIntyre wrote: ...
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
...
Is there an elegant way to do this, or should I just stick with the code above?
>>> def changes( dataset ):
...     last = None
...     for value in dataset:
...         if value != last:
...             yield value
...             last = value
...
>>> print list(changes(data ))
which is quite readable/elegant IMO.
But fails if the list starts with None:
py> lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
py> def changes(dataset):
....     last = None
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[0, 1, 2, 3, 2, 4, 5]
A minor modification that does appear to work:
py> def changes(dataset):
....     last = object()
....     for value in dataset:
....         if value != last:
....             yield value
....             last = value
....
py> list(changes(lst))
[None, 0, 1, 2, 3, 2, 4, 5]
STeVe
Francis Girard wrote: This is a perfect use case for the good old "reduce" function:
--BEGIN SNAP
a_lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
def straightforward_collapse(lst):
    return reduce(lambda v,e: v[-1]!=e and v+[e] or v, lst[1:], [lst[0]])
reduce() magically increases the number of function calls by len(a_list).
There will be no good use cases for reduce() until Python can inline
functions.
Apodictically Yours
Peter
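Peter's point about the extra calls can be made concrete by instrumenting the step function (a sketch; `step` and the counter are made up for illustration, standing in for the lambda above):

```python
from functools import reduce  # a builtin in Python 2; moved here in Python 3

calls = 0

def step(acc, e):
    # Counting stand-in for: lambda v,e: v[-1]!=e and v+[e] or v
    global calls
    calls += 1
    return acc + [e] if acc[-1] != e else acc

lst = [0, 0, 1, 1, 1, 2, 2]
result = reduce(step, lst[1:], [lst[0]])
print(result, calls)  # step runs once per element after the first
```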
Alan McIntyre <al***********@esrgtech.com> wrote: Hi all,
I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
Here is the way I'm doing this now:
def straightforward_collapse(myList):
    collapsed = [myList[0]]
    for n in myList[1:]:
        if n != collapsed[-1]:
            collapsed.append(n)
    return collapsed

Is there an elegant way to do this, or should I just stick with the code above?

>>> p=[1,1,1,1,1,4,4,4,8,8,9]
>>> filter(lambda y: y>0, map(lambda x,y: x==y and -1 or x,[0]+p,p+[0]))
[1, 4, 8, 9]
Z powazaniem
Adam Przybyla
Zut !
I'm very sorry that there is no good use case for the "reduce" function in
Python, like Peter Otten pretends. That's an otherwise very useful tool for
many use cases. At least on paper.
Python documentation should say "There is no good use case for the reduce
function in Python and we don't know why we bother you offering it."
Francis Girard
On Monday 7 February 2005 08:24, Peter Otten wrote: Francis Girard wrote: This is a perfect use case for the good old "reduce" function:
--BEGIN SNAP
a_lst = [None,0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]
def straightforward_collapse(lst):
    return reduce(lambda v,e: v[-1]!=e and v+[e] or v, lst[1:], [lst[0]])
reduce() magically increases the number of function calls by len(a_list). There will be no good use cases for reduce() until Python can inline functions.
Apodictically Yours Peter
Francis Girard wrote: Python documentation should say "There is no good use case for the reduce function in Python and we don't know why we bother you offering it."
Submit a patch :-)
Peter
Francis Girard wrote: I'm very sorry that there is no good use case for the "reduce" function in Python, like Peter Otten pretends. That's an otherwise very useful tool for many use cases. At least on paper.
Clarity aside[1], can you give an example of where reduce is as
efficient as the equivalent loop in Python?
Steve
[1] I generally find most reduce solutions much harder to read, but
we'll set aside my personal limitations for the moment. ;)
On Mon, Feb 07, 2005 at 07:07:10PM +0100, Francis Girard wrote: Zut ! I'm very sorry that there is no good use case for the "reduce" function in Python, like Peter Otten pretends. That's an otherwise very useful tool for many use cases. At least on paper. Python documentation should say "There is no good use case for the reduce function in Python and we don't know why we bother you offering it."
I am guessing you are joking, right? I think Peter exaggerates when he
says that "there will be no good use cases for reduce"; it is very
useful, in writing very compact code when it does exactly what you
want (and you have to go through hoops to do it otherwise). It also
can be the fastest way to do something. For example, the fastest way
to get the factorial of a (small enough) number in pure python is
factorial = lambda n: reduce(operator.mul, range(1, n+1))
--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
Laugh and the world thinks you're an idiot.
No. Just wanted to naively use one of the base tools Python had to offer. I
thought it was very usable and very readable. I might be wrong.
Is there someone on this list using this tool and happy with it ? Or is my
mind too much targeted on FP paradigm and most of you really think that all
the functions that apply another function to each and every elements of a
list are bad (like "reduce", "map", "filter") ?
Francis Girard
On Monday 7 February 2005 19:25, Steven Bethard wrote: Francis Girard wrote: I'm very sorry that there is no good use case for the "reduce" function in Python, like Peter Otten pretends. That's an otherwise very useful tool for many use cases. At least on paper.
Clarity aside[1], can you give an example of where reduce is as efficient as the equivalent loop in Python?
Steve
[1] I generally find most reduce solutions much harder to read, but we'll set aside my personal limitations for the moment. ;)
Francis Girard wrote: Is there someone on this list using this tool and happy with it ? Or is my mind too much targeted on FP paradigm and most of you really think that all the functions that apply another function to each and every elements of a list are bad (like "reduce", "map", "filter") ?
I know there are definitely proponents for map and filter, especially
for simple cases like:
map(int, lst)
filter(str.strip, lst)
Note that, unlike reduce, map and filter aren't really going to increase
the number of function calls. Consider the equivalent list comprehensions:
[int(x) for x in lst]
[x for x in lst if str.strip(x)] [1]
The list comprehensions also require the same number of function calls
in these cases. Of course, in cases where a function does not already
exist, map and filter will require more function calls. Compare:
map(lambda x: x**2 + 1, lst)
with
[x**2 + 1 for x in lst]
Where the LC allows you to essentially "inline" the function. (You can
dis.dis these to see the difference if you like.)
As far as my personal preferences go, while the simple cases of map and
filter (the ones using existing functions) are certainly easy enough for
me to read, for more complicated cases, I find things like:
[x**2 + 1 for x in lst]
[x for x in lst if (x**2 + 1) % 3 == 1]
much more readable than the corresponding:
map(lambda x: x**2 + 1, lst)
filter(lambda x: (x**2 + 1) % 3 == 1, lst)
especially since I avoid lambda usage, and would have to write these as:
def f1(x):
    return x**2 + 1
map(f1, lst)

def f2(x):
    return (x**2 + 1) % 3 == 1
filter(f2, lst)
(I actually find the non-lambda code clearer, but still more complicated
than the list comprehensions.)
Given that I use list comprehensions for the complicated cases, I find
it to be more consistent if I use list comprehensions in all cases, so I
even tend to avoid the simple map and filter cases.
As far as reduce goes, I've never seen code that I thought was clearer
using reduce than using a for-loop. I have some experience with FP, and
I can certainly figure out what a given reduce call is doing given
enough time, but I can always understand the for-loop version much more
quickly.
Of course, YMMV.
STeVe
[1] While it's not technically equivalent, this would almost certainly
be written as:
[x for x in lst if x.strip()]
which in fact takes better advantage of Python's duck-typing -- it will
work for unicode objects too.
John Lenton wrote: For example, the fastest way to get the factorial of a (small enough) number in pure python is
factorial = lambda n: reduce(operator.mul, range(1, n+1))
Gah! I'll never understand why people use lambda, which is intended to
create _anonymous_ functions, to create named functions! This code is
basically equivalent to:
def factorial(n):
    return reduce(operator.mul, range(1, n+1))
except that factorial.__name__ is now the meaningful 'factorial',
instead of '<lambda>'. See "Inappropriate use of Lambda" in http://www.python.org/moin/DubiousPython
My personal beef with the inappropriate use of lambda aside, and
ignoring the fact that the reduce solution as written fails for
factorial(0), some timings are in order. Given fact.py:
----------------------------------------------------------------------
import operator
def f1(n):
    result = 1
    for x in xrange(1, n+1):
        result *= x
    return result

def f2(n):
    return reduce(operator.mul, range(1, n+1))

def f3(n):
    return reduce(operator.mul, xrange(1, n+1))
----------------------------------------------------------------------
I get the following timeit results (with Python 2.4):
$ python -m timeit -s "import fact" "[fact.f1(x) for x in xrange(1, 100)]"
100 loops, best of 3: 2.53 msec per loop
$ python -m timeit -s "import fact" "[fact.f2(x) for x in xrange(1, 100)]"
100 loops, best of 3: 2.46 msec per loop
$ python -m timeit -s "import fact" "[fact.f3(x) for x in xrange(1, 100)]"
100 loops, best of 3: 2.17 msec per loop
So it does look like reduce is actually faster in this case, especially
if you use xrange instead of range.
Thanks for the example!
Steve
On Monday 7 February 2005 19:51, John Lenton wrote: On Mon, Feb 07, 2005 at 07:07:10PM +0100, Francis Girard wrote: Zut !
I'm very sorry that there is no good use case for the "reduce" function in Python, like Peter Otten pretends. That's an otherwise very useful tool for many use cases. At least on paper.
Python documentation should say "There is no good use case for the reduce function in Python and we don't know why we bother you offering it." I am guessing you are joking, right?
Of course I am joking. I meant the exact contrary. Only wanted to show that
Peter did exaggerate. Thank you for citing me with the full context.
I think Peter exaggerates when he says that "there will be no good use cases for reduce"; it is very useful, in writing very compact code when it does exactly what you want (and you have to go through hoops to do it otherwise). It also can be the fastest way to do something. For example, the fastest way to get the factorial of a (small enough) number in pure python is
factorial = lambda n: reduce(operator.mul, range(1, n+1))
Great.
Regards
Francis Girard
On Monday 7 February 2005 20:30, Steven Bethard wrote: Francis Girard wrote: Is there someone on this list using this tool and happy with it ? Or is my mind too much targeted on FP paradigm and most of you really think that all the functions that apply another function to each and every elements of a list are bad (like "reduce", "map", "filter") ? I know there are definitely proponents for map and filter, especially for simple cases like:
map(int, lst) filter(str.strip, lst)
Note that, unlike reduce, map and filter aren't really going to increase the number of function calls. Consider the equivalent list comprehensions:
[int(x) for x in lst] [x for x in lst if str.strip(x)] [1]
The list comprehensions also require the same number of function calls in these cases. Of course, in cases where a function does not already exist, map and filter will require more function calls. Compare:
map(lambda x: x**2 + 1, lst)
with
[x**2 + 1 for x in lst]
Where the LC allows you to essentially "inline" the function. (You can dis.dis these to see the difference if you like.)
I see.

As far as my personal preferences go, while the simple cases of map and filter (the ones using existing functions) are certainly easy enough for me to read, for more complicated cases, I find things like:
[x**2 + 1 for x in lst] [x for x in lst if (x**2 + 1) % 3 == 1]
much more readable than the corresponding:
map(lambda x: x**2 + 1, lst) filter(lambda x: (x**2 + 1) % 3 == 1, lst)
I agree.
especially since I avoid lambda usage, and would have to write these as:
Why avoid "lambda" usage ? You find them too difficult to read (I mean in
general) ?

def f1(x):
    return x**2 + 1
map(f1, lst)

def f2(x):
    return (x**2 + 1) % 3 == 1
filter(f2, lst)
(I actually find the non-lambda code clearer, but still more complicated than the list comprehensions.)
Given that I use list comprehensions for the complicated cases, I find it to be more consistent if I use list comprehensions in all cases, so I even tend to avoid the simple map and filter cases.
As far as reduce goes, I've never seen code that I thought was clearer using reduce than using a for-loop. I have some experience with FP, and I can certainly figure out what a given reduce call is doing given enough time, but I can always understand the for-loop version much more quickly.
I think it's a question of habits. But I agree that we should always go with
code we find easy to read and that we think others find the same.
I think I will stop using that function in Python if Python practionners find
it difficult to read in general. There is no point in being cool just for
being cool.
Thank you
Francis Girard

Of course, YMMV.
STeVe
[1] While it's not technically equivalent, this would almost certainly be written as: [x for x in lst if x.strip()] which in fact takes better advantage of Python's duck-typing -- it will work for unicode objects too.
Francis Girard wrote: Le lundi 7 Février 2005 20:30, Steven Bethard a écrit : especially since I avoid lambda usage, and would have to write these as:
Why avoid "lambda" usage ? You find them too difficult to read (I mean in general) ?
Yup, basically a readability thing. I also tend to find that if I
actually declare the function, I can often find a way to refactor things
to make that function useful in more than one place.
Additionally, 'lambda' is on Guido's regrets list, so I'm avoiding its
use in case it gets yanked for Python 3.0. I think most code can be
written now without it, and in most cases code so written is clearer.
Probably worth looking at is a thread I started that went through some
stdlib uses of lambda and how they could be rewritten: http://mail.python.org/pipermail/pyt...er/257990.html
Many were rewritable with def statements, list comprehensions, the
operator module functions, or unbound or bound methods.
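To illustrate the operator-module rewrites mentioned above, itemgetter can often stand in for a key lambda (a small illustrative sketch, not from the linked thread):

```python
import operator

pairs = [(2, 'b'), (1, 'a'), (3, 'c')]

# lambda version:           sorted(pairs, key=lambda p: p[0])
# operator-module version, no lambda needed:
by_first = sorted(pairs, key=operator.itemgetter(0))

assert by_first == [(1, 'a'), (2, 'b'), (3, 'c')]
```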
Steve
John Lenton wrote: On Mon, Feb 07, 2005 at 07:07:10PM +0100, Francis Girard wrote: Darn!
I'm very sorry that there is no good use case for the "reduce" function in Python, as Peter Otten claims. It's otherwise a very useful tool for many use cases. At least on paper.
Python documentation should say "There is no good use case for the reduce function in Python, and we don't know why we bother you by offering it."
I am guessing you are joking, right? I think Peter exaggerates when he says that "there will be no good use cases for reduce"; it is very useful, in writing very compact code when it does exactly what you want (and you have to go through hoops to do it otherwise). It also can be the fastest way to do something. For example, the fastest way to get the factorial of a (small enough) number in pure python is
factorial = lambda n: reduce(operator.mul, range(1, n+1))
I see that you are seduced by the beauty of the expression. Otherwise, if
you would really care for speed:
$ cat factorial.py
#ugly but efficient
factorial = [1]
for i in range(1, 50):
factorial.append(i*factorial[-1])
import operator
def factorial_reduce(n):
return reduce(operator.mul, xrange(1, n+1), 1)
if __name__ == "__main__":
assert factorial[0] == factorial_reduce(0) == 1
assert factorial[5] == factorial_reduce(5) == 1*2*3*4*5
$ python2.4 -m timeit -s'import factorial as f' 'f.factorial_reduce(30)'
100000 loops, best of 3: 13.3 usec per loop
$ python2.4 -m timeit -s'import factorial as f' 'f.factorial[30]'
1000000 loops, best of 3: 0.328 usec per loop
Having factorial raise a UseStirlingsFormulaError for large values is left
as an exercise to the reader :-)
Peter
On Mon, Feb 07, 2005 at 09:39:11PM +0100, Peter Otten wrote: John Lenton wrote: For example, the fastest way to get the factorial of a (small enough) number in pure python is
factorial = lambda n: reduce(operator.mul, range(1, n+1))
I see that you are seduced by the beauty of the expression. Otherwise, if you would really care for speed: [...]
that's cheating: you moved the calculation into the setup. You aren't
timing what you say you are.
--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
I'd rather push my Harley than ride a rice burner.
Le lundi 7 Février 2005 21:21, Steven Bethard a écrit : Francis Girard wrote: Le lundi 7 Février 2005 20:30, Steven Bethard a écrit : especially since I avoid lambda usage, and would have to write these as:
Why avoid "lambda" usage ? You find them too difficult to read (I mean in general) ?
Yup, basically a readability thing. I also tend to find that if I actually declare the function, I can often find a way to refactor things to make that function useful in more than one place.
Additionally, 'lambda' is on Guido's regrets list, so I'm avoiding its use in case it gets yanked for Python 3.0. I think most code can be written now without it, and in most cases code so written is clearer. Probably worth looking at is a thread I started that went through some stdlib uses of lambda and how they could be rewritten:
http://mail.python.org/pipermail/pyt...er/257990.html
Many were rewritable with def statements, list comprehensions, the operator module functions, or unbound or bound methods.
Steve
I see. I personally use them frequently to bind an argument of a function with
some fixed value. Modifying one of the examples in http://mail.python.org/pipermail/pyt...er/257990.html
I frequently have something like:
SimpleXMLRPCServer.py: server.register_1arg-place_function(lambda x: x+2, 'add2')
If Guido doesn't like lambdas, he will have to give me some way to easily do
this. Something like (or similar, to fit Python syntax):
SimpleXMLRPCServer.py: server.register_1arg-place_function(\+2, 'add2')
This would be great.
Regards
Francis Girard
server.register_function(operator.add, 'add')
Francis Girard wrote: I see. I personally use them frequently to bind an argument of a function with some fixed value. Modifying one of the examples in http://mail.python.org/pipermail/pyt...er/257990.html I frequently have something like:
SimpleXMLRPCServer.py: server.register_1arg-place_function(lambda x: x+2, 'add2')
PEP 309 has already been accepted, so I assume it will appear in Python 2.5: http://www.python.org/peps/pep-0309.html
Your code could be written as:
server.register_1arg-place_function(partial(operator.add, 2))
If your task is actually what you describe above -- "to bind an argument
of a function with some fixed value"[1] -- then in general, you should
be able to write this as[2]:
function_with_fixed_value = partial(function, fixed_value)
Steve
[1] As opposed to binding a name to be used in an _expression_ as you do
in your example.
[2] The partial function can also be used to fix multiple argument
values and keyword argument values, if that's necessary for your purposes.
John Lenton wrote: On Mon, Feb 07, 2005 at 09:39:11PM +0100, Peter Otten wrote: John Lenton wrote:
> For example, the fastest way to get the factorial of a (small enough) number in pure python is
> factorial = lambda n: reduce(operator.mul, range(1, n+1))
I see that you are seduced by the beauty of the expression. Otherwise, if you would really care for speed:
[...]
that's cheating: you moved the calculation into the setup. You aren't timing what you say you are.
*You* are cheating when you take a predefined function implemented in C
(operator.mul) and then claim you are using pure Python. This gives you the
extra 50 percent that overcompensate the inefficiency of reduce() -- but
only by a hair's breadth:
$ python2.4 -m timeit -s'mul = lambda x, y: x * y' -s'a=1' 'mul(a, a)'
1000000 loops, best of 3: 0.522 usec per loop
$ python2.4 -m timeit -s'from operator import mul' -s'a=1' 'mul(a, a)'
1000000 loops, best of 3: 0.345 usec per loop
Peter
Le lundi 7 Février 2005 22:53, Steven Bethard a écrit : Francis Girard wrote: I see. I personally use them frequently to bind an argument of a function with some fixed value. Modifying one of the examples in http://mail.python.org/pipermail/pyt...er/257990.html I frequently have something like:
SimpleXMLRPCServer.py: server.register_1arg-place_function(lambda x: x+2, 'add2')
PEP 309 has already been accepted, so I assume it will appear in Python 2.5:
http://www.python.org/peps/pep-0309.html
Your code could be written as:
server.register_1arg-place_function(partial(operator.add, 2))
If your task is actually what you describe above -- "to bind an argument of a function with some fixed value"[1] -- then in general, you should be able to write this as[2]:
function_with_fixed_value = partial(function, fixed_value)
Great ! Great ! Great !
Really, Python is going in the right direction all of the time !
Very good news.
Regards
Francis Girard
Steve
[1] As opposed to binding a name to be used in an _expression_ as you do in your example.
[2] The partial function can also be used to fix multiple argument values and keyword argument values, if that's necessary for your purposes.
Peter Otten wrote: *You* are cheating when you take a predefined function implemented in C (operator.mul) and then claim you are using pure Python.
operator is a standard module in Python. if you have a desperate need
to be pointless in public, consider using another newsgroup.
</F>
Fredrik Lundh wrote: Peter Otten wrote:
*You* are cheating when you take a predefined function implemented in C (operator.mul) and then claim you are using pure Python.
operator is a standard module in Python. if you have a desperate need to be pointless in public, consider using another newsgroup.
</F>
No.
Peter
[Alan McIntyre] I have a list of items that has contiguous repetitions of values, but the number and location of the repetitions is not important, so I just need to strip them out. For example, if my original list is [0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5], I want to end up with [0,1,2,3,2,4,5].
import itertools
[k for k, v in itertools.groupby(lst)]
[0, 1, 2, 3, 2, 4, 5]
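The groupby one-liner above, as a self-contained script:

```python
from itertools import groupby

lst = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5]

# groupby yields a (key, group-iterator) pair for each run of equal values;
# keeping only the keys collapses each run down to a single element
collapsed = [k for k, _ in groupby(lst)]

assert collapsed == [0, 1, 2, 3, 2, 4, 5]
```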
Raymond Hettinger