I tried to clear a list today (which I do rather rarely, considering
that just doing l = [] works most of the time) and was shocked, SHOCKED
to notice that there is no clear() method. Dicts have it, sets have it,
why do lists have to be second class citizens?
Ville Vainio wrote: I tried to clear a list today (which I do rather rarely, considering that just doing l = [] works most of the time) and was shocked, SHOCKED to notice that there is no clear() method. Dicts have it, sets have it, why do lists have to be second class citizens?
because Python already has a perfectly valid way to clear a list,
perhaps ?
del l[:]
(lists are not mappings, so the duck typing argument doesn't really
apply here.)
</F>
Fredrik Lundh wrote: I tried to clear a list today (which I do rather rarely, considering that just doing l = [] works most of the time) and was shocked, SHOCKED to notice that there is no clear() method. Dicts have it, sets have it, why do lists have to be second class citizens? because Python already has a perfectly valid way to clear a list, perhaps ?
del l[:]
Ok. That's pretty non-obvious but now that I've seen it I'll probably
remember it. I did a stupid "while l: l.pop()" loop myself.
(lists are not mappings, so the duck typing argument don't really apply here.)
I was thinking of list as a "mutable collection", and clear() is
certainly a very natural operation for them.
Ville Vainio wrote: I tried to clear a list today (which I do rather rarely, considering that just doing l = [] works most of the time) and was shocked, SHOCKED to notice that there is no clear() method. Dicts have it, sets have it, why do lists have to be second class citizens?
This gets brought up all the time (search the archives for your
favorite), but your options are basically (renaming your list to lst for
readability) one of::
del lst[:]
lst[:] = []
or if you don't need to modify the list in place,
lst = []
Personally, I tend to go Fredrik's route and use the first.
If you feel really strongly about this though, you might consider
writing up a PEP. It's been contentious enough that there's not much
chance of getting a change without one.
STeVe
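A minimal sketch of the three spellings side by side, assuming Python 3 syntax; the variable names are made up for illustration. Each list gets a second name bound to it so the effect on the shared object is visible:

```python
# 1. del lst[:] empties the existing list object.
first = [1, 2, 3]
first_alias = first
del first[:]

# 2. lst[:] = [] also empties the existing list object.
second = [1, 2, 3]
second_alias = second
second[:] = []

# 3. lst = [] only rebinds the name; the old list is untouched.
third = [1, 2, 3]
third_alias = third
third = []

print(first_alias, second_alias, third_alias)  # [] [] [1, 2, 3]
```

The first two mutate in place, so every other reference sees the empty list; the third leaves any other reference pointing at the old contents.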
On Tue, 2006-04-11 at 10:42 -0600, Steven Bethard wrote: one of::
del lst[:]
lst[:] = []
or if you don't need to modify the list in place,
lst = []
Personally, I tend to go Fredrik's route and use the first.
I love benchmarks, so as I was testing the options, I saw something very
strange:
$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.7 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.35 msec per loop
$ python2.4 -mtimeit 'x = range(100000); x[:] = []'
100 loops, best of 3: 6.36 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x'
100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone
test this, too?
Cheers,
--
Felipe.
Steven Bethard wrote: If you feel really strongly about this though, you might consider writing up a PEP. It's been contentious enough that there's not much chance of getting a change without one.
No strong feelings here, and I'm sure greater minds than me have
already hashed this over sufficiently.
It's just that, when I have an object, and am wondering how I can clear
it, I tend to look what methods it has first and go to google looking
for "idioms" second.
Perhaps "clear" method could be added that raises
PedagogicException("Use del lst[:], stupid!")?
*ducks*
Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
In the first benchmark, you need space for two lists: the old one and
the new one; the other benchmarks you need only a single block of
memory (*). Concluding from here gets difficult - you would have to study
the malloc implementation to find out whether it works better in one
case over the other. Could also be an issue of processor cache: one
may fit into the cache, but the other may not.
Regards,
Martin
(*) plus, you also need the integer objects twice.
Ville Vainio wrote: It's just that, when I have an object, and am wondering how I can clear it, I tend to look what methods it has first and go to google looking for "idioms" second.
I guess del on a list is not that common, so people tend to not know
that it works on lists (and slices!), too. It's too bad that lists have
a pop() method these days, so people can do x.pop() even if they don't
need the value, instead of doing del x[-1]. I don't think I ever needed
to del a slice except for clearing the entire list (and I don't need to
do that often, either - I just throw the list away).
Regards,
Martin
Steven Bethard wrote: lst[:] = [] lst = []
What's the difference here?
On Tue, 2006-04-11 at 17:56 +0000, John Salerno wrote: Steven Bethard wrote:
lst[:] = [] lst = []
What's the difference here?
lst[:] = [] makes the specified slice become []. As we specified ":", it
transforms the entire list into [].
lst = [] assigns the value [] to the variable lst, deleting any previous
one.
This might help:

>>> lst = range(10)
>>> id(lst), lst
(-1210826356, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> lst[:] = []
>>> id(lst), lst
(-1210826356, [])

>>> lst = range(10)
>>> id(lst), lst
(-1210844052, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> lst = []
>>> id(lst), lst
(-1210826420, [])
You see? lst[:] = [] removes all elements from the list that lst refers to,
while lst = [] just creates a new list and discards the old one. The
difference is, for example:
>>> lst = range(3)
>>> x = [lst, lst, lst]
>>> x
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
>>> lst[:] = []
>>> x
[[], [], []]

>>> lst = range(3)
>>> x = [lst, lst, lst]
>>> x
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
>>> lst = []
>>> x
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
HTH,
--
Felipe.
John Salerno wrote: Steven Bethard wrote:
lst[:] = [] lst = []
What's the difference here?
L[:]= modifies the object in place, L=[] binds the variable to a
new object. compare and contrast:

>>> L = ["a", "b", "c"]
>>> M = L
>>> L
['a', 'b', 'c']
>>> M
['a', 'b', 'c']
>>> L is M
True
>>> L[:] = []
>>> L
[]
>>> M
[]
>>> L is M
True

>>> L = ["a", "b", "c"]
>>> M = L
>>> L
['a', 'b', 'c']
>>> M
['a', 'b', 'c']
>>> L = []
>>> L
[]
>>> M
['a', 'b', 'c']
>>> L is M
False
</F>
John Salerno wrote: Steven Bethard wrote:
lst[:] = [] lst = []
What's the difference here?

>>> lst = [1,2,3]
>>> lst2 = lst
>>> lst[:] = []
>>> lst2
[]
>>> lst = [1,2,3]
>>> lst2 = lst
>>> lst = []
>>> lst2
[1, 2, 3]
Duncan
Felipe Almeida Lessa wrote: You see? lst[:] removes all elements from the list that lst refers to, while lst = [] just creates a new list and discard the only one. The difference is, for example:
Thanks, your explanation was great!
Fredrik Lundh wrote: John Salerno wrote:
Steven Bethard wrote:
lst[:] = [] lst = [] What's the difference here?
L[:]= modifies the object in place, L=[] binds the variable to a new object. compare and contrast:
Thanks guys, your explanations are really helpful. I think what had me
confused at first was my understanding of what L[:] does on either side
of the assignment operator. On the left, it just chooses those elements
and edits them in place; on the right, it makes a copy of that list,
right? (Which I guess is still more or less *doing* the same thing, just
for different purposes)
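A small sketch of that left-side versus right-side distinction, assuming Python 3; the names are illustrative only. A slice on the right builds a new list, while a slice on the left replaces elements inside the existing one:

```python
letters = ["a", "b", "c", "d"]
same_object = letters          # a second name for the same list

# Right-hand side: slicing builds a new, independent list.
snapshot = letters[:]

# Left-hand side: slice assignment edits the existing list in place.
letters[1:3] = ["X"]           # replaces "b" and "c" with a single "X"
after_partial = list(same_object)

# The whole-list slice on the left is just the widest case of the same thing.
letters[:] = []
after_full = list(same_object)

print(snapshot)       # ['a', 'b', 'c', 'd']
print(after_partial)  # ['a', 'X', 'd']
print(after_full)     # []
```

The snapshot keeps the original contents because it is a separate object; the other name follows every in-place change.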
John Salerno wrote: Thanks guys, your explanations are really helpful. I think what had me confused at first was my understanding of what L[:] does on either side of the assignment operator. On the left, it just chooses those elements and edits them in place; on the right, it makes a copy of that list, right? (Which I guess is still more or less *doing* the same thing, just for different purposes)
Interestingly, if it was just a "clear" method nobody would be confused.
On Tue, 11 Apr 2006 14:49:04 -0700, Ville Vainio wrote: John Salerno wrote:
Thanks guys, your explanations are really helpful. I think what had me confused at first was my understanding of what L[:] does on either side of the assignment operator. On the left, it just chooses those elements and edits them in place; on the right, it makes a copy of that list, right? (Which I guess is still more or less *doing* the same thing, just for different purposes)
Interestingly, if it was just a "clear" method nobody would be confused.
Even more importantly, you could say help(list.clear) and learn something
useful, instead of trying help(del) and getting a syntax error.

>>> help(del)
  File "<stdin>", line 1
    help(del)
         ^
SyntaxError: invalid syntax
I know Python isn't a purely OOP language, but in my opinion using
statements like del should be avoided when there is an easy and natural OO
way of doing it. Something like name.del() which means "delete the
reference to name in name's namespace" feels wrong, and may not even be
possible with Python's object model, so del name is a natural way to do
it.
But name.clear() meaning "mutate the object referenced by name to the
empty state" is a very natural candidate for a method, and I don't
understand why lists shouldn't have it.
For lists, it would be natural to have a hypothetical clear() method
accept an index or slice as an argument, so that these are equivalent:
del L[:] <=> L.clear()
del L[n] <=> L.clear(n)
del L[a:b] <=> L.clear(slice(a, b))
# untested reference implementation:
class ClearableList(list):
    def clear(self, obj=None):
        if obj is None:
            obj = slice(0, len(self))
        if isinstance(obj, int):
            self.__delitem__(obj)
        elif isinstance(obj, slice):
            self.__delslice__(obj.start, obj.stop)
        else:
            raise TypeError
--
Steven.
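For readers on Python 3, where __delslice__ no longer exists, the same idea might be sketched as below. This is an assumption-laden illustration, not the proposal itself: it relies on del with an int or slice index, and it overrides the list.clear() that Python 3.3 and later already provide, purely to show the hypothetical optional argument.

```python
class ClearableList(list):
    """Illustrative list subclass whose clear() accepts an index or slice."""

    def clear(self, index=None):
        # No argument: clear everything, like del L[:].
        if index is None:
            index = slice(None)
        # Anything other than an int or a slice is rejected up front.
        if not isinstance(index, (int, slice)):
            raise TypeError("clear() expects an int, a slice, or no argument")
        # del handles both single items and slices.
        del self[index]


items = ClearableList(range(6))
items.clear(0)             # like del items[0]
after_index = list(items)
items.clear(slice(1, 3))   # like del items[1:3]
after_slice = list(items)
items.clear()              # like del items[:]
after_all = list(items)

print(after_index, after_slice, after_all)  # [1, 2, 3, 4, 5] [1, 4, 5] []
```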
On Tue, 11 Apr 2006 19:15:18 +0200, Martin v. Löwis wrote: Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
In the first benchmark, you need space for two lists: the old one and the new one;
Er, what new list? I see only one list, x = range(100000), which is merely
created then nothing done to it. Have I missed something?
I understood Felipe to be asking, why does it take longer to just create a
list, than it takes to create a list AND then do something to it?
--
Steven.
On Wed, 2006-04-12 at 11:36 +1000, Steven D'Aprano wrote: On Tue, 11 Apr 2006 19:15:18 +0200, Martin v. Löwis wrote:
Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too? In the first benchmark, you need space for two lists: the old one and the new one;
Er, what new list? I see only one list, x = range(100000), which is merely created then nothing done to it. Have I missed something?
He's talking about the garbage collector.
Talking about the GC, do you want to see something *really* odd?
$ python2.4 -mtimeit -s 'from gc import collect' 'collect(); x = range(100000); '
100 loops, best of 3: 13 msec per loop
$ python2.4 -mtimeit -s 'from gc import collect' 'collect(); x = range(100000); del x[:]'
100 loops, best of 3: 8.19 msec per loop
$ python2.4 -mtimeit -s 'from gc import collect' 'collect(); x = range(100000); x[:] = []'
100 loops, best of 3: 8.16 msec per loop
$ python2.4 -mtimeit -s 'from gc import collect' 'collect(); x = range(100000); del x'
100 loops, best of 3: 8.3 msec per loop
But in this case I got the answer (I think):
- When you explicitly delete the objects, the GC already knows that they
can be collected, so it just throws the objects away.
- When we let the "x" variable continue to survive, the GC has to look
at all the 100001 objects to see if they can be collected -- just to see
that they can't.
Also, IIRC "del x" is slower than "x = []" because removing a name from
the namespace is more expensive than just assigning something else to
it. Right?
I understood Felipe to be asking, why does it take longer to just create a list, than it takes to create a list AND then do something to it?
I see dead people... ;-)
--
Felipe.
Felipe Almeida Lessa <fe**********@gmail.com> writes: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
I get similar behaviour. No idea why.
$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.99 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.49 msec per loop
$ python2.4 -mtimeit 'x = range(100000); x[:] = []'
100 loops, best of 3: 6.47 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x'
100 loops, best of 3: 6.6 msec per loop
Dan
Martin v. Löwis wrote: Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too? In the first benchmark, you need space for two lists: the old one and the new one; the other benchmarks you need only a single block of memory (*).
I don't follow you here :)
Concluding from here gets difficult - you would have to study the malloc implementation to find out whether it works better in one case over the other.
That's not the case here. The following program prints the same
addresses whether you comment or uncomment del
---------------------------------
ids = range(10)
for i in xrange(10):
    x = range(100000)
    #del x[:]
    ids[i] = id(x)
print ids
--------------------------------
Could also be an issue of processor cache: one may fit into the cache, but the other may not.
That's not the reason, since the same time difference happens with
smaller arrays. I think the difference is how items are deallocated in
these two cases.
Calling del invokes list_ass_slice, which deallocates from 0 to the end,
whereas ordinary removal of a list was changed to backward iteration
(the reason is in the comment):
/* Do it backwards, for Christian Tismer.
There's a simple test case where somehow this reduces
thrashing when a *very* large list is created and
immediately deleted. */
Usually iterating from low addresses to higher addresses is better for
the CPU. On my CPU (Pentium M, 1.7 GHz) it's 20% faster:
Here are my results:
C:\py>python -mtimeit "x = range(10000); del x[:]"
1000 loops, best of 3: 213 usec per loop
C:\py>python -mtimeit "x = range(10000); del x"
1000 loops, best of 3: 257 usec per loop
C:\py>python -mtimeit "x = range(10000); "
1000 loops, best of 3: 258 usec per loop
C:\py>python -mtimeit "x = range(1000); del x[:]"
10000 loops, best of 3: 21.4 usec per loop
C:\py>python -mtimeit "x = range(1000); del x"
10000 loops, best of 3: 25.2 usec per loop
C:\py>python -mtimeit "x = range(1000); "
10000 loops, best of 3: 25.6 usec per loop
I don't have a development environment on my computer so I can't test
my thoughts. I could be wrong about the reason.
Martin v. Löwis wrote: Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
In the first benchmark, you need space for two lists: the old one and the new one; the other benchmarks you need only a single block of memory (*). Concluding from here gets difficult - you would have to study the malloc implementation to find out whether it works better in one case over the other. Could also be an issue of processor cache: one may fit into the cache, but the other may not.
Addition to the previous message. Now I follow you :) There are indeed
two arrays and cache seems to be the second reason for slowdown, but
iterating backwards is also contributing to slowdown.
Felipe Almeida Lessa wrote: On Wed, 2006-04-12 at 11:36 +1000, Steven D'Aprano wrote: On Tue, 11 Apr 2006 19:15:18 +0200, Martin v. Löwis wrote:
Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.7 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.35 msec per loop
$ python2.4 -mtimeit 'x = range(100000); x[:] = []'
100 loops, best of 3: 6.36 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x'
100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
In the first benchmark, you need space for two lists: the old one and the new one;
Er, what new list? I see only one list, x = range(100000), which is merely created then nothing done to it. Have I missed something?
He's talking about the garbage collector.
To be exact, the reason for two arrays is timeit.py. It doesn't place the
code to time into a separate namespace but injects it into a for loop,
so the actual code timed is:
for _i in _it:
    x = range(100000)
and that makes two arrays with 100,000 items exist for a short time
starting from the second iteration.
On Wed, 12 Apr 2006 00:33:29 -0700, Serge Orlov wrote: Felipe Almeida Lessa wrote: On Wed, 2006-04-12 at 11:36 +1000, Steven D'Aprano wrote: On Tue, 11 Apr 2006 19:15:18 +0200, Martin v. Löwis wrote:
Felipe Almeida Lessa wrote: I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.7 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.35 msec per loop
$ python2.4 -mtimeit 'x = range(100000); x[:] = []'
100 loops, best of 3: 6.36 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x'
100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
In the first benchmark, you need space for two lists: the old one and the new one;
Er, what new list? I see only one list, x = range(100000), which is merely created then nothing done to it. Have I missed something?
He's talking about the garbage collector.
To be exact the reason for two array is timeit.py. It doesn't place the code to time into a separate namespace but injects it into a for loop, so the actual code timed is: for _i in _it: x = range(100000) and that makes two arrays with 100.000 items exist for a short time starting from second iteration.
But that is precisely the same for the other timeit tests too.
for _i in _it:
    x = range(100000)
    del x[:]
etc.
The question remains -- why does it take longer to do X than it takes to
do X and then Y?
--
Steven.
Steven D'Aprano wrote: But that is precisely the same for the other timeit tests too.
for _i in _it:
    x = range(100000)

Allocate list.
Allocate ob_item array to hold pointers to 100000 objects
Allocate 99900 integer objects
setup list

    del x[:]

Calls list_clear which:
decrements references to 99900 integer objects, freeing them
frees the ob_item array

.... next time round ...

    x = range(100000)

Allocate new list.
Allocate ob_item array probably picks up same memory block as last time
Allocate 99900 integer objects, probably reusing same memory as last time
setup list etc.
The question remains -- why does it take longer to do X than it takes to do X and then Y?

for _i in _it:
    x = range(100000);

Allocate list.
Allocate ob_item array to hold pointers to 100000 objects
Allocate 99900 integer objects
setup list

.... next time round ...

Allocate another list
allocate a second ob_item array
allocate another 99900 integer objects
setup the list
then deletes the original list, decrements and releases original integers,
frees original ob_item array.
Uses twice as much everything except the actual list object. The actual
work done is the same but I guess there are likely to be more cache misses.
Also there is the question whether the memory allocation does or does not
manage to reuse the recently freed blocks. With one large block I expect it
might well end up reusing it, with two large blocks being freed alternately
it might not manage to reuse either (but that is just a guess and maybe
system dependent).
Steven D'Aprano wrote: But name.clear() meaning "mutate the object referenced by name to the empty state" is a very natural candidate for a method, and I don't understand why lists shouldn't have it.
Funny this even comes up, because I was just trying to 'clear' a list
the other day. But it sounds like it's been an issue for a while. :)
Steven D'Aprano wrote: On Tue, 11 Apr 2006 14:49:04 -0700, Ville Vainio wrote:
John Salerno wrote:
Thanks guys, your explanations are really helpful. I think what had me confused at first was my understanding of what L[:] does on either side of the assignment operator. On the left, it just chooses those elements and edits them in place; on the right, it makes a copy of that list, right? (Which I guess is still more or less *doing* the same thing, just for different purposes) Interestingly, if it was just a "clear" method nobody would be confused.
Even more importantly, you could say help(list.clear) and learn something useful, instead of trying help(del) and getting a syntax error.
[snip more reasons to add list.clear()]
I think these are all good reasons for adding a clear method, but being
that it has been so hotly contended in the past, I don't think it will
get added without a PEP. Anyone out there willing to take out the best
examples from this thread and turn it into a PEP?
STeVe
Steven Bethard wrote: I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
What are the usual arguments against adding it?
John Salerno wrote: Steven Bethard wrote:
I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
What are the usual arguments against adding it?
That there should be one obvious way to do it.
Yes, I know that it can be debated whether "del x[:]" is obvious, and
fortunately I'm not the one to decide <wink>.
Georg
Georg Brandl wrote: John Salerno wrote:Steven Bethard wrote:I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
What are the usual arguments against adding it?
That there should be one obvious way to do it.
Yes, I know that it can be debated whether "del x[:]" is obvious, and fortunately I'm not the one to decide <wink>.
It's so "obvious", I couldn't even find it in the tutorial the other day
when this thread came up. It's a wonder I ever learned that one myself,
and I don't know where I saw it. .clear(), on the other hand, would
clearly be easy to add/find in the tutorial... and we'd all save time
answering the question over and over around here (valuable time we would
then be able to use telling people who hadn't to go and read the
tutorial! :-) ).
-Peter
John Salerno wrote: Steven Bethard wrote:I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
What are the usual arguments against adding it?
I think one is that it's very rare to need it. Most of the time, just
rebinding the name to a new empty list is easier, and possibly faster.
The main problem with that approach is that in multithreaded code that
can be a source of subtle bugs. On the other hand, most things can be a
source of subtle bugs in multithreaded code, so maybe that's not a good
reason to add clear().
Saving us time answering the question repeatedly (and trying to justify
the current state of affairs) might be justification enough...
-Peter
John Salerno wrote: Steven Bethard wrote:
I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
What are the usual arguments against adding it?
I think most of the other folks have already given you the answers, but
the main ones I remember:
(1) There are already at least two ways of spelling it. We don't need
another.
(2) Clearing a list is a pretty uncommon operation. Usually just
creating a new list is fine (and easier).
Disclaimer: I'm happy with adding list.clear(), so it's quite possible
I'm misrepresenting the argument here. I guess that's just all the more
reason to have a PEP for it: to lay out all the pros and cons.
STeVe
Steven D'Aprano wrote: $ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too? In the first benchmark, you need space for two lists: the old one and the new one;
Er, what new list? I see only one list, x = range(100000), which is merely created then nothing done to it. Have I missed something?
See Duncan's explanation. This code is run many times, allocating many
lists. However, only two different lists exist at any point in time.
A Python list consists of two memory blocks: the list proper (a few
bytes), plus the "guts", i.e. a variable-sized block of pointers to
the objects. del x[:] frees the guts.
I understood Felipe to be asking, why does it take longer to just create a list, than it takes to create a list AND then do something to it?
Actually, the same code (deallocation of all integers and the list
blocks) appears in either case. However, in the one case it is triggered
explicitly, and before the new allocation; in the other case, the
allocation of the new objects occurs before the old ones are released.
Regards,
Martin
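The ordering described here can be made visible with a small sketch. It assumes CPython, where reference counting frees an object as soon as its last reference goes away; the Item class and the events log are invented for illustration and stand in for the integers held by the list.

```python
events = []  # records allocations and frees in the order they happen


class Item:
    """Stand-in for a list element; logs its own creation and destruction."""

    def __init__(self, tag):
        self.tag = tag
        events.append(("alloc", tag))

    def __del__(self):
        events.append(("free", self.tag))


def rebind_only(rounds):
    # Like timing 'x = range(n)': the old list lives until x is rebound.
    events.clear()
    for i in range(rounds):
        x = [Item(i)]
    del x
    return list(events)


def clear_first(rounds):
    # Like timing 'x = range(n); del x[:]': contents are freed before the next build.
    events.clear()
    for i in range(rounds):
        x = [Item(i)]
        del x[:]
    return list(events)


print(rebind_only(2))  # [('alloc', 0), ('alloc', 1), ('free', 0), ('free', 1)]
print(clear_first(2))  # [('alloc', 0), ('free', 0), ('alloc', 1), ('free', 1)]
```

In the first case the second allocation happens while the first object is still alive, so two sets of objects overlap briefly; in the second case they never do.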
Ville Vainio wrote: Fredrik Lundh wrote:because Python already has a perfectly valid way to clear a list, perhaps ?
del l[:]
Ok. That's pretty non-obvious but now that I've seen it I'll probably remember it. I did a stupid "while l: l.pop()" loop myself.
Actually, it's in the Library Reference (that we keep under
our pillows) section 2.3.6.4 Mutable Sequence Types.
Cheers, Mel.
[Steven Bethard] I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
Something this small doesn't need a PEP. I'll just send a note to
Guido asking for a pronouncement.
Here's a draft list of pros and cons (any changes or suggestions are
welcome):
Pros:
-----
* s.clear() is more obvious in intent
* easier to figure-out, look-up, and remember than either s[:]=[] or
del s[:]
* parallels the api for dicts, sets, and deques (increasing the
expectation that lists will too)
* the existing alternatives are a bit perlish
* the OP is shocked, SHOCKED that python got by for 16 years without
list.clear()
Cons:
-----
* makes the api fatter (there are already two ways to do it)
* expanding the api makes it more difficult to write classes that can
be polymorphically substituted for lists
* learning slices is basic to the language (this lesson shouldn't be
skipped)
* while there are valid use cases for re-using lists, the technique is
already overused for unsuccessful attempts to micro-optimize (creating
new lists is surprisingly fast)
* the request is inane, the underlying problem is trivial, and the
relevant idiom is fundamental (api expansions should be saved for rich
new functionality and not become cluttered with infrequently used
redundant entries)
On Wed, 2006-04-12 at 12:40 -0700, Raymond Hettinger wrote: * the existing alternatives are a bit perlish
I love this argument =D! "perlish"... lol...
Cheers,
--
Felipe.
[Felipe Almeida Lessa] I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); ' 100 loops, best of 3: 6.7 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x[:]' 100 loops, best of 3: 6.35 msec per loop $ python2.4 -mtimeit 'x = range(100000); x[:] = []' 100 loops, best of 3: 6.36 msec per loop $ python2.4 -mtimeit 'x = range(100000); del x' 100 loops, best of 3: 6.46 msec per loop
Why the first benchmark is the slowest? I don't get it... could someone test this, too?
[Dan Christensen] I get similar behaviour. No idea why.
It is an effect of the memory allocator and fragmentation. The first
builds up a list with increasingly larger sizes. It periodically
cannot grow in-place because something is in the way (some other
object) so it needs to move its current entries to another, larger
block and grow from there. In contrast, the other entries are reusing
the previously cleared out large block.
Just for grins, replace the first with:
'x=None; x=range(100000)'
The assignment to None frees the reference to the previous list and
allows it to be cleared so that its space is immediately available to
the new list being formed by range().
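A rough way to rerun the comparison from a script, assuming Python 3 (so range() is wrapped in list() to build a real list) and the standard timeit module. The function name and labels are made up for illustration, and the numbers will vary by machine and interpreter version, so no particular ordering is promised here.

```python
import timeit


def time_statements(size=10000, number=20, repeat=3):
    """Return the best per-loop time in seconds for each variant."""
    build = "x = list(range(%d))" % size
    statements = {
        "build only": build,
        "release first": "x = None; " + build,
        "del x[:]": build + "; del x[:]",
        "x[:] = []": build + "; x[:] = []",
        "del x": build + "; del x",
    }
    results = {}
    for label, stmt in statements.items():
        # timeit runs the statement inside a loop, so x survives between
        # iterations exactly as in the command-line runs quoted above.
        best = min(timeit.repeat(stmt, number=number, repeat=repeat))
        results[label] = best / number
    return results


if __name__ == "__main__":
    for label, seconds in time_statements().items():
        print("%-14s %8.1f usec per loop" % (label, seconds * 1e6))
```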
Raymond Hettinger wrote: Cons: ----- * learning slices is basic to the language (this lesson shouldn't be skipped)
And yet it doesn't appear to be in the tutorial. I could have missed
it, but I've looked in a number of the obvious places, without actually
going through it (again) from start to finish. Also, googling for
"slice site:docs.python.org", you have to go to the *sixth* entry before
you can find the first mention of "del x[:]" and what it does. I think
given the current docs it's possible to learn all kinds of things about
slicing and still not make the non-intuitive leap that "del x[slice]" is
actually how you spell "delete contents of list in-place".
* while there are valid use cases for re-using lists, the technique is already overused for unsuccessful attempts to micro-optimize (creating new lists is surprisingly fast)
Not just valid use-cases, but ones for which "x = []" is entirely buggy,
yet not obviously so, especially to newcomers.
* the request is inane, the underlying problem is trivial, and the relevant idiom is fundamental (api expansions should be saved for rich new functionality and not become cluttered with infrequently used redundant entries)
The first phrase is insulting and untrue, the second merely untrue (as
even your own list of pros and cons shows), and the last completely
valid and highly relevant, yet not overriding.
-Peter
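The kind of case Peter describes, where "x = []" is buggy without looking buggy, might look like the sketch below. The function names reset_buggy and reset_in_place are invented for this example; the behaviour itself is ordinary Python name binding.

```python
def reset_buggy(items):
    # Rebinds only the local name "items"; the caller's list is untouched.
    items = []


def reset_in_place(items):
    # Empties the very list object the caller passed in.
    del items[:]


queue = [1, 2, 3]

reset_buggy(queue)
after_buggy = list(queue)       # snapshot: still [1, 2, 3]

reset_in_place(queue)
after_in_place = list(queue)    # snapshot: []

print(after_buggy, after_in_place)
```

The buggy version raises no error and looks reasonable on the page, which is what makes it easy for a newcomer to miss.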
Raymond Hettinger wrote: * easier to figure-out, look-up, and remember than either s[:]=[] or del s[:]
Easier is an understatement - it's something you figure out
automatically. When I want to do something w/ an object, looking at its
methods (via code completion) is the very first thing.
* the OP is shocked, SHOCKED that python got by for 16 years without list.clear()
I'm sure you realize I was being sarcastic...
* learning slices is basic to the language (this lesson shouldn't be skipped)
Assigning to slices is much less important, and is something I always
never do (and hence forget).
* the request is inane, the underlying problem is trivial, and the relevant idiom is fundamental (api expansions should be saved for rich new functionality and not become cluttered with infrequently used redundant entries)
I understand that these are the main arguments. However, as it stands
there is no one *obvious* way to clear a list in-place. I agree that
it's rare to even need it, but when you do, it's a little bit of a
surprise.
Ville Vainio wrote: Assigning to slices is much less important, and is something I always never do (and hence forget).
ALMOST never, of course.
In article <11**********************@u72g2000cwu.googlegroups .com>,
Raymond Hettinger <py****@rcn.com> wrote: [Steven Bethard] I think these are all good reasons for adding a clear method, but being that it has been so hotly contended in the past, I don't think it will get added without a PEP. Anyone out there willing to take out the best examples from this thread and turn it into a PEP?
Something this small doesn't need a PEP. I'll just send a note to Guido asking for a pronouncement.
Here's a draft list of pros and cons (any changes or suggestions are welcome):
Pros:
-----
* s.clear() is more obvious in intent
Serious question: Should it work more like "s=[]" or more like
"s[:]=[]". I'm assuming the latter, but the fact that there is
a difference is an argument for not hiding this operation behind
some syntactic sugar.
Alan
--
Defendit numerus
On Wed, 12 Apr 2006 12:40:52 -0700, Raymond Hettinger wrote: Something this small doesn't need a PEP. I'll just send a note to Guido asking for a pronouncement.
Raymond, if you're genuinely trying to help get this sorted in the
fairest, simplest way possible, I hope I speak for everyone when
I say thank you, your efforts are appreciated.
But there is this:
* the request is inane, the underlying problem is trivial, and the relevant idiom is fundamental (api expansions should be saved for rich new functionality and not become cluttered with infrequently used redundant entries)
Is this sort of editorialising fair, or just a way of not-so-subtly
encouraging Guido to reject the whole idea, now and forever?
Convenience and obviousness are important for APIs -- that's why lists
have pop, extend and remove methods. The only difference I can see between
a hypothetical clear and these is that clear can be replaced with a
one-liner, while the others need at least two, e.g. for extend:
for item in seq:
    L.append(item)
Here is another Pro for your list:
A list.clear method will make deleting items from a list more OO,
consistent with almost everything else you do to lists, and less
procedural. This is especially true if clear() takes an optional index (or
two), allowing sections of the list to be cleared, not just the entire
list.
--
Steven.
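Steven's suggestion of a clear() that takes an optional index or two could be prototyped as a subclass. This is only a sketch of one possible reading: the class name ClearableList is hypothetical, and treating the two arguments as slice-style start and stop bounds is an assumption, since the post does not spell out the semantics.

```python
class ClearableList(list):
    """Hypothetical list subclass, for illustration only."""

    def clear(self, start=None, stop=None):
        # With no arguments this is "del self[:]"; otherwise only the
        # slice self[start:stop] is removed, in place.
        del self[start:stop]


lst = ClearableList(range(6))
lst.clear(1, 3)
partly_cleared = list(lst)   # [0, 3, 4, 5]

lst.clear()
fully_cleared = list(lst)    # []

print(partly_cleared, fully_cleared)
```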
On Wed, 12 Apr 2006 15:36:47 -0700, Alan Morgan wrote: Serious question: Should it work more like "s=[]" or more like "s[:]=[]". I'm assuming the latter, but the fact that there is a difference is an argument for not hiding this operation behind some syntactic sugar.
Er, I don't see how it can possibly work like s = []. That merely
reassigns a new empty list to the name s, it doesn't touch the existing
list (which may or may not be garbage collected soon/immediately
afterwards).
As far as I know, it is impossible -- or at least very hard -- for an
object to know which namespaces it is in, so it can reassign one name but
not the others. Even if such a thing was possible, I think it is an
absolutely bad idea.
--
Steven.
In article <pa****************************@REMOVETHIScyber.co m.au>,
Steven D'Aprano <st***@REMOVETHIScyber.com.au> wrote: On Wed, 12 Apr 2006 15:36:47 -0700, Alan Morgan wrote:
Serious question: Should it work more like "s=[]" or more like "s[:]=[]". I'm assuming the latter, but the fact that there is a difference is an argument for not hiding this operation behind some syntactic sugar.
Er, I don't see how it can possibly work like s = []. That merely reassigns a new empty list to the name s, it doesn't touch the existing list (which may or may not be garbage collected soon/immediately afterwards).
Right. I was wondering what would happen in this case:
s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??
If you know your basic python it is "obvious" what would happen
if you do s=[] or s[:]=[] instead of s.clear() and I guess it is
equally "obvious" which one s.clear() must mimic. I'm still not
used to dealing with mutable lists.
Alan
--
Defendit numerus
Alan Morgan wrote: Right. I was wondering what would happen in this case:
s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??
If you know your basic python it is "obvious" what would happen if you do s=[] or s[:]=[] instead of s.clear() and I guess it is equally "obvious" which one s.clear() must mimic. I'm still not used to dealing with mutable lists.
If you know your basic python :-), you know that s[:] = [] is doing the
only thing that s.clear() could possibly do, which is changing the
contents of the list which has the name "s" bound to it (and which might
have other names bound to it, just like any object in Python). It
*cannot* be doing the same as "s=[]" which does not operate on the list
but creates an entirely new one and rebinds the name "s" to it.
The only possible answer for your question above is "t is s" and "t ==
[]" because you haven't rebound the names.
-Peter
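Alan's example can be run with the two existing spellings to see the difference Peter describes. A minimal sketch, using s[:] = [] as the stand-in for the proposed s.clear():

```python
# In-place clearing: both names still refer to the same, now empty, list.
s = [1, 2, 3]
t = s
s[:] = []
in_place = (t is s, list(t))    # (True, [])

# Rebinding: s gets a brand-new list, t keeps the old one.
s = [1, 2, 3]
t = s
s = []
rebound = (t is s, list(t))     # (False, [1, 2, 3])

print(in_place, rebound)
```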
Steven D'Aprano wrote: Convenience and obviousness are important for APIs -- that's why lists have pop, extend and remove methods. The only difference I can see between a hypothetical clear and these is that clear can be replaced with a one-liner, while the others need at least two, e.g. for extend:
for item in seq:
    L.append(item)
It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']
Okay, it's not obvious, but I don't think s[:]=[] is really any more
obvious as a way to clear the list.
Clearly .extend() needs to be removed from the language as it is an
unnecessary extension to the API using slicing, which everyone should
already know about...
-Peter
In article <ma***************************************@python. org>,
Peter Hansen <pe***@engcorp.com> wrote: Alan Morgan wrote: Right. I was wondering what would happen in this case:
s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??
If you know your basic python it is "obvious" what would happen if you do s=[] or s[:]=[] instead of s.clear() and I guess it is equally "obvious" which one s.clear() must mimic. I'm still not used to dealing with mutable lists.
If you know your basic python :-), you know that s[:] = [] is doing the only thing that s.clear() could possibly do,
Ah, but if you know your basic python then you wouldn't be looking for
s.clear() in the first place; you'd just use s[:]=[] (or s=[], whichever
is appropriate). IOW, the people most in need of s.clear() are those
least likely to be able to work out what it is actually doing.
Personally, it seems more *reasonable* to me, a novice python user,
for s.clear() to behave like s=[] (or, perhaps, for s=[] and s[:]=[] to
mean the same thing). The fact that it can't might be an argument for
not having it in the first place.
Alan
--
Defendit numerus
"Peter Hansen" <pe***@engcorp.com> wrote in message
news:e1**********@sea.gmane.org... It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']
This is not the same as list.extend because it makes a separate
intermediate list instead of doing the extension completely in place.
However, the following does mimic .extend:
>>> s=range(5)
>>> s[len(s):] = list('abc')
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']
So, at the cost of writing and calling len(s), you are correct that .extend
is not necessary.
Terry Jan Reedy
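The three spellings discussed here can be put side by side. The sketch below assumes Python 3, so list(range(5)) replaces range(5). All three leave the original list object in place; the difference Terry points out is that the first one builds the temporary list a + more before copying it back.

```python
more = list("abc")

a = list(range(5))
a_id = id(a)
a[:] = a + more       # builds a temporary list (a + more), then copies it into a

b = list(range(5))
b_id = id(b)
b[len(b):] = more     # assigns to the empty slice at the end; no a + more temporary

c = list(range(5))
c_id = id(c)
c.extend(more)        # the method the slice assignment above mimics

print(a == b == c, id(a) == a_id, id(b) == b_id, id(c) == c_id)
```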
Raymond Hettinger <py****@rcn.com> writes: Felipe Almeida Lessa writes:
I love benchmarks, so as I was testing the options, I saw something very strange:
$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.7 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.35 msec per loop
Why is the first benchmark the slowest? I don't get it... could someone test this, too?
It is an effect of the memory allocator and fragmentation. The first builds up a list with increasingly larger sizes.
I don't see what you mean by this. There are many lists all of
the same size. Do you mean some list internal to the memory
allocator?
It periodically cannot grow in-place because something is in the way (some other object) so it needs to move its current entries to another, larger block and grow from there. In contrast, the other entries are reusing the previously cleared-out large block.
Just for grins, replace the first with:
'x=None; x=range(100000)'
The assignment to None frees the reference to the previous list and allows it to be cleared so that its space is immediately available to the new list being formed by range().
It's true that this runs at the same speed as the del variants on my
machine. That's not too surprising to me, but I still don't
understand why the del variants are more than 5% faster than the first
version.
Once this is understood, is it something that could be optimized?
It's pretty common to rebind a variable to a new value, and if
this could be improved 5%, that would be cool. But maybe it
wouldn't affect anything other than such micro benchmarks.
Dan
Peter Hansen wrote: * learning slices is basic to the language (this lesson shouldn't be skipped) And yet it doesn't appear to be in the tutorial.
oh, please.
slices are explained in the section on strings, and in the section on lists,
and used to define the behaviour of the list methods in the second section
on lists, ...
I could have missed it, but I've looked in a number of the obvious places
http://docs.python.org/tut/node5.htm...00000000000000
section 3.1.2 contains an example that shows how to remove stuff from a list,
in place.
if you want a clearer example, please consider donating some of your time
to the pytut wiki: http://pytut.infogami.com/
</F>
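For readers who have not seen the idiom, removing items from a list in place with slices looks roughly like this. The sketch is not copied from the tutorial section mentioned above; the variable names and values are made up for illustration.

```python
letters = list("abcdef")

del letters[1:3]            # remove a run from the middle
step1 = list(letters)       # ['a', 'd', 'e', 'f']

letters[:2] = []            # same effect, spelled as slice assignment
step2 = list(letters)       # ['e', 'f']

del letters[:]              # the whole-list slice: clear everything in place
step3 = list(letters)       # []

print(step1, step2, step3)
```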
Peter Hansen wrote: It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']
Okay, it's not obvious, but I don't think s[:]=[] is really any more obvious as a way to clear the list.
Clearly .extend() needs to be removed from the language as it is an unnecessary extension to the API using slicing
you just flunked the "what Python has to do to carry out a certain operation"
part of the "how Python works, intermediate level" certification.
</F>