Hi,
When list comprehension was added to the language, I had a lot of
trouble understanding it but now that I am familiar with it, I am not
sure how I programmed in Python without it.
Now I see that generator expressions have been added to the language
with 2.4 and I question the need for it. I know that it allows for lazy
evaluation which speeds things up for larger lists but why was it
necessary to add it instead of improving list comprehension?
Was there some sort of limitation that prevented list comprehension from
taking over the functionality that generator expressions provide?
I have always liked the fact that Python has limited capabilities for
having more than one way to do it and it seems to me that generator
expressions break that philosophy. It is similar to how you have to use
range or xrange depending on how large the range is.
Can someone please explain the reasoning behind it?
Thanks,
Mahesh
Mahesh Padmanabhan <ma****@privacy.net> wrote: Hi,
When list comprehension was added to the language, I had a lot of trouble understanding it but now that I am familiar with it, I am not sure how I programmed in Python without it.
Oh good.
Now I see that generator expressions have been added to the language with 2.4 and I question the need for it. I know that it allows for lazy evaluation which speeds things up for larger lists but why was it necessary to add it instead of improving list comprehension?
Was there some sort of limitation that prevented list comprehension from taking over the functionality that generator expressions provide?
Sure, and that limitation is: list comprehensions return lists. This
one "limitation" (together with Python's applicative order evaluation,
and you couldn't change THAT without breaking the whole caboodle of
existing programs!) implies everything else.
I have always liked the fact that Python has limited capabilities for having more than one way to do it and it seems to me that generator expressions break that philosophy. It is similar to how you have to use range or xrange depending on how large the range is.
Can someone please explain the reasoning behind it?
Generator comprehensions are wonderful and there is no way Python list
comprehensions can provide the same features, since lists need to be
lists. Sure, list(e(x) for x in foo) IS just the same thing as [e(x)
for x in foo]. We'll remove the redundancy in 3.0 -- not earlier
because it will break backwards compatibility. The only sensible way I
can see right now for 3.0 to remove this redundancy is by removing list
comprehensions and leaving only generator comprehensions, btw.
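Alex's equivalence can be checked directly. A minimal sketch (in modern syntax; the thread itself predates Python 3):

```python
# Wrapping a generator expression in list() yields exactly the same
# result as the corresponding list comprehension.
def e(x):
    return x * x

foo = [1, 2, 3, 4]

via_genexp = list(e(x) for x in foo)
via_listcomp = [e(x) for x in foo]

assert via_genexp == via_listcomp == [1, 4, 9, 16]
```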
Alex
In article <1g****************************@yahoo.com>, al*****@yahoo.com (Alex Martelli) wrote: Sure, and that limitation is: list comprehensions return lists. This one "limitation" (together with Python's applicative order evaluation, and you couldn't change THAT without breaking the whole caboodle of existing programs!) implies everything else.
Is returning a list really a limitation considering that lists can be
transformed quite easily?
What is "Python's applicative order evaluation" and how do generator
expressions get around it?
Generator comprehensions are wonderful and there is no way Python list comprehensions can provide the same features, since lists need to be lists. Sure, list(e(x) for x in foo) IS just the same thing as [e(x) for x in foo]. We'll remove the redundancy in 3.0 -- not earlier because it will break backwards compatibility. The only sensible way I can see right now for 3.0 to remove this redundancy is by removing list comprehensions and leaving only generator comprehensions, btw.
I am still not clear of the advantages of using generator expressions
(other than less memory consumption) instead of list comprehension for
any given class of problems. Can you cite concrete use cases where
generator expressions would be preferred over list comprehension?
Thanks,
Mahesh
Mahesh Padmanabhan <ma****@privacy.net> writes: I am still not clear of the advantages of using generator expressions (other than less memory consumption)
Why do you think using bounded rather than unbounded amounts of memory
is an insignificant advantage?
Mahesh Padmanabhan <ma****@privacy.net> wrote: I am still not clear of the advantages of using generator expressions (other than less memory consumption) instead of list comprehension for any given class of problems. Can you cite concrete use cases where generator expressions would be preferred over list comprehension?
Here's a simple example
linecount = sum(1 for line in file)
linecount = sum([1 for line in file])
The former requires an in-memory list equal to the size of the file.
Consigning "other than memory consumption" to a parenthetical remark
misses its importance.
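Martin's line-counting example can be sketched as follows, using an in-memory StringIO as a stand-in for a real file object (the stand-in is an assumption for the sake of a self-contained example):

```python
# sum() with a generator expression pulls one value at a time from the
# file iterator; no intermediate list the size of the file is built.
from io import StringIO

def count_lines(f):
    return sum(1 for line in f)

f = StringIO("first\nsecond\nthird\n")
linecount = count_lines(f)
assert linecount == 3
```

The bracketed form, sum([1 for line in f]), would first materialize a list with one element per line before summing it.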
martin
Mahesh Padmanabhan wrote: In article <1g****************************@yahoo.com>, al*****@yahoo.com (Alex Martelli) wrote:
Sure, and that limitation is: list comprehensions return lists. This one "limitation" (together with Python's applicative order evaluation, and you couldn't change THAT without breaking the whole caboodle of existing programs!) implies everything else.
Is returning a list really a limitation considering that lists can be transformed quite easily?
What is "Python's applicative order evaluation" and how do generator expressions get around it?
Generator comprehensions are wonderful and there is no way Python list comprehensions can provide the same features, since lists need to be lists. Sure, list(e(x) for x in foo) IS just the same thing as [e(x) for x in foo]. We'll remove the redundancy in 3.0 -- not earlier because it will break backwards compatibility. The only sensible way I can see right now for 3.0 to remove this redundancy is by removing list comprehensions and leaving only generator comprehensions, btw.
I am still not clear of the advantages of using generator expressions (other than less memory consumption) instead of list comprehension for any given class of problems. Can you cite concrete use cases where generator expressions would be preferred over list comprehension?
Thanks, Mahesh
- lower memory usage
- lower CPU usage ( sometimes, you don't need to expand the whole generator,
see below )
- ability to manipulate infinite generators
Is that enough?
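The third point can be sketched with itertools: a generator expression can describe an unbounded stream, something a list comprehension can never materialize.

```python
# A conceptually infinite stream of squares; only the demanded prefix
# is ever computed.
from itertools import count, islice

squares = (n * n for n in count(1))
first_five = list(islice(squares, 5))
assert first_five == [1, 4, 9, 16, 25]
```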
On Sat, 28 Aug 2004 14:09:34 -0600, Mahesh Padmanabhan wrote: Is returning a list really a limitation considering that lists can be transformed quite easily?
http://aspn.activestate.com/ASPN/Coo.../Recipe/190465
It is trivial to generate huge lists and more people need them than you
might think. Not choking the VM system with multi-hundred megabytes in
allocations is a non-trivial performance advantage, especially as the
alternative may be not finishing at all.
You, by the way, have it backwards. It is trivial to convert a generator
into a list without losing anything. The opposite is not possible. That's
why Alex, elsewhere in the newsgroup, is saying list comprehensions might
someday be replaced by list(x for x in whatever), but you're never going
to see him say generator expressions are adequately captured by iter([x
for x in whatever]).
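The asymmetry can be demonstrated with a traced function (a hypothetical `f` introduced here for illustration): iter([f(x) for x in xs]) still evaluates every call up front, while the generator expression defers them.

```python
# iter() over a list comprehension does not make it lazy: the list,
# and every f(x) call, already exists before iteration starts.
calls = []

def f(x):
    calls.append(x)
    return x * 2

xs = [1, 2, 3]

eager = iter([f(x) for x in xs])
assert calls == [1, 2, 3]      # all calls made before any iteration

calls.clear()
lazy = (f(x) for x in xs)
assert calls == []             # nothing evaluated yet
next(lazy)
assert calls == [1]            # evaluated only on demand
```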
In article <ma**************************@individual.net>,
Mahesh Padmanabhan <ma****@privacy.net> wrote: I am still not clear of the advantages of using generator expressions (other than less memory consumption) instead of list comprehension for any given class of problems. Can you cite concrete use cases where generator expressions would be preferred over list comprehension?
Storage economy and the time saved by omitting lots of
object creation and destruction are the main advantages of
generator expressions.
The advantage of lists is that they'll stand still while
you do things to them (things like iterating over them in
many and novel ways.) The mind does boggle at the idea of a
data-processing program that has replaced all containers
with generator expressions, and creates all its information
on-demand without storing anything anywhere.
Regards. Mel.
>>>>> "Mahesh" == Mahesh Padmanabhan <ma****@privacy.net> writes:
Mahesh> Is returning a list really a limitation considering that
Mahesh> lists can be transformed quite easily?
Yes. (1) You lose memory, and (2) you can't use the values as they are
produced, since the list comprehension's hidden loop runs to completion
before anything is handed back.
Let's have some examples.
You lose memory because you have to generate the whole list, which
might be unnecessary because your processing may eventually not need it.
It is the same as the difference between range() and xrange(). E.g.,
you might write
for a in [x*x for x in range(100000)]:
    print a
you have to wait until the whole 100k list is generated, and only then
do you start printing out x*x for each of the values. The middle ground
for a in [x*x for x in xrange(100000)]:
    print a
saves half the memory, but still needs to generate the 100k list, and
you have to wait a long time before you print the first result. Once
you get generator expression, you can say
for a in (x*x for x in xrange(100000)):
    print a
and it will do the same thing, except that you don't need to wait at
the beginning, and at no instant there is a 100000 element list
sitting in memory.
[N.B.: Of course, in this simple example you'll instead write
for x in xrange(100000):
    print x*x
but sometimes things are not as easy, e.g., what would you do if you
had to pass the abstraction "x*x for x in xrange(100000)" into a
function as an argument?]
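Isaac's question has a concrete answer: once the abstraction is an expression, it can be handed to any function that accepts an iterable. A sketch (range here plays the role of Python 2's xrange; `first_over` is a hypothetical consumer):

```python
# The consumer pulls from the stream and stops as soon as it is
# satisfied; the remaining 99,990-odd squares are never computed.
def first_over(iterable, limit):
    for value in iterable:
        if value > limit:
            return value
    return None

result = first_over((x * x for x in range(100000)), 50)
assert result == 64  # 8*8 is the first square above 50
```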
Now let's turn to the second point: with a list comprehension, the
evaluation of an element happens after the evaluation of the previous
elements, but before any of the elements are used. At times
this makes things hard to achieve or even impossible (and at that time
you have to throw up your hands and write a generator function
instead). E.g., suppose you have this:
import time
hash = {}
def process1(x):
    for i in xrange(1, 11):
        hash[x * i] = 1
def process2(x):
    for i in xrange(2, 12):
        hash[x * i] = 1
if time.time() % 2:
    process = process1
else:
    process = process2
Now you have a loop, which you want a neater way to rewrite:
for x in xrange(1000):
    if not hash.has_key(x):
        process(x * x)
in such a way that you don't need to let others specify the exact loop
to run. Intuitively you'd like to write a list comprehension to do
that. So you'd like to write
for y in [x*x for x in xrange(1000) if not hash.has_key(x)]:
    process(y)
and let others pass the list into the function. But this causes every
hash.has_key(x) call to happen before any process(y) is called,
so it breaks completely. With a generator expression, you write:
for y in (x*x for x in xrange(1000) if not hash.has_key(x)):
    process(y)
which does the trick. Note that now "x*x for x in xrange(1000) if not
hash.has_key(x)" is an object, and you can move it out of the
function and ask somebody else to pass it to you -- which you can't do
with the original for loop. I.e., now you can say
def func(gen):
    for x in gen:
        process(x)
and let somebody call
func(x*x for x in xrange(1000) if not hash.has_key(x))
Without generator expressions, to achieve this you must code a
generator function. So generator expressions help you write simple
generators.
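For comparison, here is what that last step looks like without a generator expression, as an explicit generator function. This is a sketch in modern syntax: range stands in for Python 2's xrange, and a `seen` set stands in for the keys of Isaac's `hash` dict.

```python
# The generator function yields the same stream of values that the
# one-line generator expression describes.
seen = {1, 3}   # stand-in for hash.has_key()

def squares_of_unseen(n):
    for x in range(n):
        if x not in seen:
            yield x * x

genexp = (x * x for x in range(5) if x not in seen)

assert list(squares_of_unseen(5)) == list(genexp) == [0, 4, 16]
```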
Regards,
Isaac.
In article <87************@sinken.local.csis.hku.hk>,
Isaac To <ik****@netscape.net> wrote:
[snip]
Thanks Isaac for providing such a detailed explanation. I understand now
why generator expressions make more sense.
I just wish that generator expressions had been designed into the
language without going the way of list comprehension -> generator
expressions.
Mahesh Padmanabhan <ma****@privacy.net> wrote:
... What is "Python's applicative order evaluation" and how do generator expressions get around it?
I see you've received several answers but it doesn't seem that they
address this point, so let me try.
Python, like most programming languages, specifies that evaluation is in
"applicative order": each argument to a function (or operator, etc) is
entirely evaluated before the function executes (Python, like several
other languages, makes a specific ad hoc exception for 'short-circuiting
operators', namely 'and' & 'or', which may entirely omit evaluating one
argument in some cases). This is also known as 'strict' or 'eager'
evaluation and by other names yet (some draw subtle distinctions between
the exact connotation of such names but for simplicity we'll treat them
as synonyms, and similarly for terms introduced in the next paragraph).
Haskell, where the concept of list comprehensions originated, is
different than most languages: Haskell specifies that evaluation is in
"normal order" -- each argument and PART of an argument is evaluated at
the time, if ever, at which that value is needed to compute. This is
also known as 'non-strict' or 'lazy' evaluation. The reason this
feature of Haskell is pretty unusual is that implementing it is somewhat
complicated and costly -- in a lazy-evaluation language's implementation
there are typically plenty of 'thunks' of code being stashed here and
yon for possible future execution. I don't even know of any lazy
evaluation language that does not rely on the crucial simplification of
the computation environment given by 'functional programming' aka FP
(basically, the idea that all objects are immutable -- a program
computes new objects, it never mutates existing objects)... and Python
is not a FP language (it crucially uses mutable data, particularly
mutable lists), so it would be astonishing if Python had the kind of
generalized support for lazy evaluation that you see in Haskell.
So, Python copied Haskell's list comprehensions, but didn't (couldn't,
probably!) copy the lazy evaluation aspect. So, whenever Python
evaluates a LC, it evaluates it fully -- it necessarily materializes the
whole list in memory, say N items' worth. This is intrinsically O(N) in
space and therefore AT BEST can be O(N) in time (it will be worse than
O(N) unless each step is constant-time, i.e. O(1)...!). This doesn't
apply to Haskell, thanks to lazy evaluation -- in Haskell, each item of
a list comprehension will be evaluated only if and when it's needed,
just like any other operand/arguments/etc... but Python can't do that.
Which is why Python 2.4 introduced generator comprehensions -- in a way
a closer match to Haskell's list comprehensions, than Python's own LCs!
That's because generators play the role of 'thunks' -- pieces of code
hanging out there in suspended animation, ready for a bit more execution
if and when needed -- for the specific but crucially important case of
sequential iteration over (some prefix of) the items of a (potentially
unbounded) sequence [[a case which you will also find in other
programming models under such names as 'streams' -- Unix pipes also
implement some part of that, with various processes in a pipeline
playing the role of the thunks in this case...]]. Generators (and more
generally iterators, and more specifically generator comprehensions)
give Python "just enough lazy evaluation to get by" -- without paying
the full conceptual and practical costs in switching to "full laziness"
(and totally immutable data), Python thereby nevertheless gets the
ability to express some (vast and important) subset of those algorithms
which non-strict evaluation makes so elegant to express.
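The thunk-like behavior Alex describes can be observed directly with a traced function (a hypothetical `trace` helper, introduced here only to make the evaluation order visible):

```python
# Each element of a generator expression is computed only when the
# consumer asks for it -- "just enough lazy evaluation to get by".
produced = []

def trace(x):
    produced.append(x)
    return x

stream = (trace(x) for x in range(10))
assert produced == []         # nothing computed at creation time

first_three = [next(stream) for _ in range(3)]
assert first_three == [0, 1, 2]
assert produced == [0, 1, 2]  # only the demanded prefix was evaluated
```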
Alex
Mahesh Padmanabhan <ma****@privacy.net> wrote: In article <87************@sinken.local.csis.hku.hk>, Isaac To <ik****@netscape.net> wrote:
[snip]
Thanks Isaac for providing such a detailed explanation. I understand now why generator expressions make more sense.
I just wish that generator expressions had been designed into the language without going the way of list comprehension -> generator expressions.
It would sure be nice if Python had been born back from day one with all
the neat features it has taken years to develop -- that way we wouldn't
have any issues that are there just because of backwards compatibility.
Unfortunately, this wish is totally unrealistic -- obviously people do
come up with cool ideas such as the iterator protocol, and generators,
after the language has been around for a while. When list
comprehensions got into Python the new iterator protocol wasn't there
yet, so of course generator comprehensions could not have been thought
of. Similarly, dicts now have both (e.g.) a method .items() which
returns a list, and a method .iteritems() which returns an iterator --
no special reason for both to be there, but the former was there first,
before the iterator protocol, so it must stay for backwards
compatibility. One day (Python 3.0) we'll be able to break
compatibility and simplify things, but that sort of thing can be allowed
only very rarely in a language's continuing evolution.
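The items()/iteritems() redundancy Alex mentions was in fact resolved in Python 3, where dict.items() returns a lightweight view rather than a materialized list (and iteritems() is gone):

```python
# dict.items() in Python 3: a view, not a list, and it reflects
# later changes to the dict.
d = {"a": 1, "b": 2}
view = d.items()
assert not isinstance(view, list)
d["c"] = 3
assert ("c", 3) in view
```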
Alex
al*****@yahoo.com (Alex Martelli) writes: It would sure be nice if Python had been born back from day one with all the neat features it has taken years to develop -- that way we wouldn't have any issues that are there just because of backwards compatibility.
Unfortunately, this wish is totally unrealistic -- obviously people do come up with cool ideas such as the iterator protocol, and generators, after the language has been around for a while.
And yet a lot of the unrealisticness comes directly from the Python
culture. Maybe it took a while to think up the iterator protocol, but
what about nested scopes? What about the += operator? How about the
sorted() method on lists? One could go on and on with more examples
like that. These things exist in other languages and had been
requested in Python for years before they got accepted. And every
time some little thing gets added changing the language, that creates
a new mini-dialect that users have to remember for a while and then
forget. The result is "version fatigue"; one gets bleary trying to
remember what's in the language this week. Those features are in
other languages for a reason, and there's been enough experience using
them (in those languages) that their desirability for Python should
never have seriously been in question. So it would have been better
to include them from the beginning, instead of through separate
episodes of prolonged agony for each one.
Paul Rubin <http://ph****@NOSPAM.invalid> writes: Those features are in other languages for a reason, and there's been enough experience using them (in those languages) that their desirability for Python should never have seriously been in question.
While I feel a lot of sympathy towards this statement in principle,
the other side of the coin is extremely important too. One of Python's
strengths is exactly that it does _not_ import many tried and tested
features from other languages (braces as delimiters, access
restriction, static typing, meaningful line noise, etc.) However, it
is also one of its weaknesses (nested scopes, for example, as you
mentioned). It all depends on the actual feature in question, and the
programmer in question.
So it would have been better to include them from the beginning, instead of through separate episodes of prolonged agony for each one.
Python occupies some position in the space of all possible programming
languages. Its position changes, as the language evolves. The language
evolves because the Python community pushes it in some direction that
it considers to be a good one. Judgements about which directions are
good, come from experience with using the language. The language
evolves into one which the community finds more useful, the language
arrives at some local maximum in programming language space. You would
like it to have been placed at the local maximum in the first
place. So would I, but that is not realistically possible.
And remember, the landscape looks very different from different
people's perspectives, what might look like a local maximum to you,
might seem to be in a deep trench from my point of view. Macros,
anyone? Lambda?
Paul Rubin <http://ph****@NOSPAM.invalid> wrote: al*****@yahoo.com (Alex Martelli) writes: It would sure be nice if Python had been born back from day one with all the neat features it has taken years to develop -- that way we wouldn't have any issues that are there just because of backwards compatibility.
Unfortunately, this wish is totally unrealistic -- obviously people do come up with cool ideas such as the iterator protocol, and generators, after the language has been around for a while.
And yet a lot of the unrealisticness comes directly from the Python culture. Maybe it took a while to think up the iterator protocol, but what about nested scopes? What about the += operator? How about the sorted() method on lists? One could go on and on with more examples like that.
....and some of the examples would be wrong, just like yours about "the
sorted method on lists" -- sorted is a new built-in function in 2.4, not
a method on lists, and it accepts any sequence, not just a list.
These things exist in other languages and had been requested in Python for years before they got accepted. And every
A million other features satisfy this description D: they exist in other
languages and have been requested in Python for years. Indeed, it's
hard to think of ANY feature which exists in other languages and has NOT
been requested for Python.
time some little thing gets added changing the language, that creates a new mini-dialect that users have to remember for a while and then forget. The result is "version fatigue"; one gets bleary trying to remember what's in the language this week. Those features are in other languages for a reason, and there's been enough experience using them (in those languages) that their desirability for Python should never have seriously been in question. So it would have been better to include them from the beginning, instead of through separate episodes of prolonged agony for each one.
If Python included every features satisfying description D, then Python
would be the largest, bulkiest, most unwieldy language in the world --
in fact I think even including half the features in D would be plenty to
make Python completely unusable. That a dozen features satisfying D
have indeed been included is therefore a silly argument, considering
that a gross more features satisfying D have not been included and
hopefully never will. Moreover, the specific feature we're discussing,
generator expressions, doesn't really satisfy D (what Haskell has is not
quite the same thing, you know), making your argument even weirder.
All features that are in any programming language are there for a reason
(not necessarily a good one -- some languages may have been designed on
principles close to those implied by this silly criticism you are
expressing, for example) and, for old enough languages, there has been
enough experience using them. Nevertheless it's perfectly proper and
indeed wise to seriously question their desirability for Python. I bet
that if you took a feature satisfying description D, you'd find a chance
of more than 90% that it would be totally silly to have it in Python and
that it will never be there.
Finding out what dozen, out of dozens of dozens, of features satisfying
D, would work well with Python, and which ones wouldn't, is obviously
not something that can be rushed. Sure, introducing some feature in
every release is not ideal, but there is really no alternative that is
even in the least sensible at all. So, I find your criticisms
unfounded.
Alex
A small correction:
Martin DeMello wrote: Here's a simple example
linecount = sum(1 for line in file) linecount = sum([1 for line in file])
The former requires an in-memory list equal to the size of the file.
^^^^^^
I think you meant "the latter"...
Consigning "other than memory consumption" to a parenthetical remark misses its importance.
-Peter
Peter Hansen <pe***@engcorp.com> wrote: A small correction:
Martin DeMello wrote:
Here's a simple example
linecount = sum(1 for line in file) linecount = sum([1 for line in file])
The former requires an in-memory list equal to the size of the file. ^^^^^^ I think you meant "the latter"...
Oops - I did indeed.
martin
On Mon, 30 Aug 2004 01:23:16 -0700, Paul Rubin wrote: These things exist in other languages and had been requested in Python for years before they got accepted. And every time some little thing gets added changing the language, that creates a new mini-dialect that users have to remember for a while and then forget. The result is "version fatigue"; one gets bleary trying to remember what's in the language this week. Those features are in other languages for a reason, and there's been enough experience using them (in those languages) that their desirability for Python should never have seriously been in question.
Along with the other two poster's points, you have a whiff of an implied
argument that somehow, these amorphous "other languages" somehow *didn't*
go through "version fatigue", and are therefore somehow better. This is
false if you take it out of the realm of amorphous, implied claim and look
at it in the light of day. Show me a language that has never suffered from
these problems and anyone actually uses. If you can show me one, show me a
few more.
C changed. C++ changed. Fortran changed. COBOL changed. LISP has changed.
Python changed, Perl changed, Haskell changed, Javascript changed,
everything changed. None of those languages decided that the correct
solution is to simply throw everything in (even Ada, C++, and Perl, all of
which have that reputation, really didn't in the sense you're getting at
here).
The claim that other languages have features, *therefore* we shouldn't
stop to think about whether they are good and they should all be added to
Python without debate boggles the mind when viewed head on. Pure
non sequitur.
(BTW, note levels of debate are proportional to how many people bother to
debate, and nothing else. It is one reason a language can not
effectively grow to the community size of Python without strong leadership
and why I am glad that Python runs under a BDFL system. Most of the debate
you may see here in c.l.p. is really effectively irrelevant, and should be
mentally categorized as such; that debate does not constitute true design
involvement. Thus, pointing out "high levels of debate" in the newsgroup
doesn't convince me of much, since it really means little. To the extent
that it does sometimes matter, the amortized effect of any given message
is simply minuscule; the entire debate of a thousand messages may have no
effect whatsoever on Guido, or manifest itself as a single document of
moderate length that he may *still* reject.)
In article <1g***************************@yahoo.com>, al*****@yahoo.com (Alex Martelli) wrote: Mahesh Padmanabhan <ma****@privacy.net> wrote: I just wish that generator expressions had been designed into the language without going the way of list comprehension -> generator expressions.
It would sure be nice if Python had been born back from day one with all the neat features it has taken years to develop -- that way we wouldn't have any issues that are there just because of backwards compatibility.
Unfortunately, this wish is totally unrealistic -- obviously people do come up with cool ideas such as the iterator protocol, and generators, after the language has been around for a while.
All the cool features Python has adopted existed a long time before they
made their way into Python itself. As I see it, the real issue with
taking up new language features is that the vast majority of work-a-day
programmers seem to be highly reluctant to embrace new tools and ideas,
for one reason or another. I think this explains -- among other things
-- the lingering popularity of languages like BASIC (in its various
incarnations), and Fortran.
In fact, one of the things I love most about Python, both the language
and the community surrounding it, is that here, for the first time, we
see a large community of mainstream programmers who are ready, willing,
and able to adapt to new ideas without losing their heads and retreating
into the ostensibly safe and comfortable world of linear imperative
programming. Instead of insisting everybody forget what they know and
start over, however, Python gives us the old familiar stuff, too, and
the new ideas come in a little at a time.
Frankly, I think if the Lisp world had managed to build the same
friendly and welcoming community Python seems to have, it would have
taken over the world a quarter-century ago.
-M
--
Michael J. Fromberger | Lecturer, Dept. of Computer Science http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA
Michael J. Fromberger <Mi******************@Clothing.Dartmouth.EDU>
wrote:
... Unfortunately, this wish is totally unrealistic -- obviously people do come up with cool ideas such as the iterator protocol, and generators, after the language has been around for a while. All the cool features Python has adopted existed a long time before they made their way into Python itself.
Almost all, not all. As I repeatedly explained, for example, generator
comprehensions are not quite like anything in other languages (though
you can surely find _similarities_, of course).
Frankly, I think if the Lisp world had managed to build the same friendly and welcoming community Python seems to have, it would have taken over the world a quarter-century ago.
I think most people's reaction to (==against) Lisp's parentheses and
prefix syntax is more intense than that to Python's significant
whitespace, which appears quite natural to most non-programmers even
though many programmers inured to block delimiters do react strongly
against it. It's hard enough for Python to keep its community welcoming
and friendly when you hear some clueless dweeb rant against indentation
for the thousandth time, I think the Lisp community would have had to be
made up of saints to be still friendly and welcoming after a _million_
newcomers raised their whining against parentheses.
Alex
Alex Martelli wrote: Michael J. Fromberger <Mi******************@Clothing.Dartmouth.EDU> wrote: ...
Unfortunately, this wish is totally unrealistic -- obviously people do come up with cool ideas such as the iterator protocol, and generators, after the language has been around for a while. All the cool features Python has adopted existed a long time before they made their way into Python itself.
Almost all, not all. As I repeatedly explained, for example, generator comprehensions are not quite like anything in other languages (though you can surely find _similarities_, of course).
Indeed, and the reason I like Python is that it has some aspects of Icon
mixed in with perhaps the most easily-accessible object-oriented
features of any of the programming languages I've learned.
Frankly, I think if the Lisp world had managed to build the same friendly and welcoming community Python seems to have, it would have taken over the world a quarter-century ago.
I think most people's reaction to (==against) Lisp's parentheses and prefix syntax is more intense than that to Python's significant whitespace, which appears quite natural to most non-programmers even though many programmers inured to block delimiters do react strongly against it. It's hard enough for Python to keep its community welcoming and friendly when you hear some clueless dweeb rant against indentation for the thousandth time, I think the Lisp community would have had to be made up of saints to be still friendly and welcoming after a _million_ newcomers raised their whining against parentheses.
.... which it clearly isn't :-).
I do think the Python community is just a little out-of-the-ordinary,
which is what keeps me (and, I suspect, you: great to see you on the
wires again) coming back to c.l.py.
regards
Steve
Op 2004-08-28, Alex Martelli schreef <al*****@yahoo.com>: Mahesh Padmanabhan <ma****@privacy.net> wrote:
Hi,
When list comprehension was added to the language, I had a lot of trouble understanding it but now that I am familiar with it, I am not sure how I programmed in Python without it.
Oh good.
Now I see that generator expressions have been added to the language with 2.4 and I question the need for it. I know that it allows for lazy evaluation which speeds things up for larger lists but why was it necessary to add it instead of improving list comprehension?
Was there some sort of limitation that prevented list comprehension from taking over the functionality that generator expressions provide?
Sure, and that limitation is: list comprehensions return lists. This one "limitation" (together with Python's applicative order evaluation, and you couldn't change THAT without breaking the whole caboodle of existing programs!) implies everything else.
I have always liked the fact that Python has limited capabilities for having more than one way to do it, and it seems to me that generator expressions break that philosophy. It is similar to how you have to use range or xrange depending on how large the range is.
Can someone please explain the reasoning behind it?
Generator comprehensions are wonderful and there is no way Python list comprehensions can provide the same features, since lists need to be lists. Sure, list(e(x) for x in foo) IS just the same thing as [e(x) for x in foo]. We'll remove the redundancy in 3.0 -- not earlier because it will break backwards compatibility. The only sensible way I can see right now for 3.0 to remove this redundancy is by removing list comprehensions and leaving only generator comprehensions, btw.
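To make the redundancy Alex describes concrete, here is a minimal sketch (in current Python): wrapping a generator expression in list() produces exactly the list the comprehension would, while the bare generator expression stays lazy.

```python
# Eager: the list comprehension builds the whole list immediately.
squares_list = [x * x for x in range(5)]

# Lazy: the generator expression computes nothing until iterated.
squares_gen = (x * x for x in range(5))

# list(genexp) is the same thing as the list comprehension:
# this is the redundancy discussed above.
assert list(x * x for x in range(5)) == squares_list

# The generator yields values one at a time, then is exhausted.
first = next(squares_gen)        # 0
remaining = list(squares_gen)    # [1, 4, 9, 16]
```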
The one problem I have with generator comprehensions is that, because
list comprehensions came first, they look like tuple comprehensions.
To avoid this same kind of confusion with newcomers, I think it would
be best if generator comprehensions were explained first, with list
comprehension presented as just syntactic sugar for list(genexp).
--
Antoon Pardon
[Mahesh Padmanabhan] Now I see that generator expressions have been added to the language with 2.4 and I question the need for it. I know that it allows for lazy evaluation which speeds things up for larger lists but why was it necessary to add it instead of improving list comprehension?
FWIW, we improved list comprehensions too. They run much faster now.
Raymond
py****@rcn.com (Raymond Hettinger) wrote in message news:<5d**************************@posting.google.com>... [Mahesh Padmanabhan] Now I see that generator expressions have been added to the language with 2.4 and I question the need for it. I know that it allows for lazy evaluation which speeds things up for larger lists but why was it necessary to add it instead of improving list comprehension?
FWIW, we improved list comprehensions too. They run much faster now.
Raymond
IIRC, list constructors and list comprehensions now use the length of
the underlying sequence (if it defines __len__) to preallocate its
internal array.
Thus, the list comprehension is theoretically more efficient in this
case:
[x for x in range(100000)] vs [x for x in xrange(100000)]
So [x for x in whatever] may be preferable to list(x for x in
whatever) if you don't know what 'whatever' is and it happens to
define __len__.
BTW, I guess this is no longer true with an if clause:
[x for x in range(100000) if x%2==0] may be less efficient than [x for
x in xrange(100000) if x%2==0]
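Rather than reasoning about this in the abstract, the guess above can be checked with the timeit module. A rough sketch follows; note that in modern Python 3 there is no xrange (range itself is lazy), so the comparison becomes list comp vs. list(genexp), and the actual numbers depend on machine and Python version.

```python
import timeit

# Time the two forms being compared; the numbers are only a sketch.
# (In modern Python 3, range() is already lazy, like the old xrange.)
t_listcomp = timeit.timeit(
    "[x for x in range(100000) if x % 2 == 0]", number=20)
t_genexp = timeit.timeit(
    "list(x for x in range(100000) if x % 2 == 0)", number=20)

print("list comp:    %.4fs" % t_listcomp)
print("list(genexp): %.4fs" % t_genexp)
```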
Regards,
Nicolas Lehuen
> > FWIW, we improved list comprehensions too. They run much faster now.
[Nicolas Lehuen] IIRC, list constructors and list comprehensions now use the length of the underlying sequence (if it defines __len__) to preallocate its internal array.
Yes for list().
No for list comps.
Thus, the list comprehension is theoretically more efficient
Aside from the theory being wrong, this is a testable assertion. Use
timeit.
The relative speed performance of genexps vs list comps depends on
length, cache, processor, memory allocator, use of re-usable
containers (tuples, dicts, etc), and other factors. Try this for
example:
dict( (i,i+i) for i in xrange(n) )
My recommendation is to always use list comps when you need a list and
to use genexps whenever values get consumed right away by an enclosing
function.
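Raymond's rule of thumb can be shown in a few lines; this sketch assumes nothing beyond the builtins.

```python
n = 1000

# Need the actual list (indexing, repeated passes): use a list comp.
doubled = [i + i for i in range(n)]

# Value consumed right away by an enclosing function: use a genexp,
# so no intermediate list is ever built.
total = sum(i + i for i in range(n))
mapping = dict((i, i + i) for i in range(n))  # the dict example above

assert doubled[10] == 20
assert total == sum(doubled)
assert mapping[10] == 20
```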
Raymond