Hi!
This is a rant against the optimization trend of the Python interpreter.
Sorting a list of 100000 integers in random order takes:
* 0.75 seconds in Python 2.1
* 0.51 seconds in Python 2.2
* 0.46 seconds in Python 2.3
Tim Peters did a great job optimizing list.sort(). If I try with a
simple, non-stable pure Python quicksort implementation, in Python 2.3:
* 4.83 seconds
* 0.21 seconds with Psyco
First step towards world domination of high-level languages :-)
The reason that Psyco manages to outperform the C implementation is not
that gcc is a bad compiler (it is about 10 times better than Psyco's).
The reason is that the C implementation must use a generic '<' operator
to compare elements, while the Psyco version quickly figures out that it
can expect to find ints in the list; it still has to check this
assumption, but this is cheap and then the comparison is done with a
single machine instruction.
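For anyone who wants to reproduce the comparison today, here is a rough sketch in modern Python. The quicksort below is a simple list-comprehension variant (not Hetland's in-place version quoted later in the thread), and absolute timings are machine- and version-dependent; only the ordering is meant to be illustrative.

```python
# Hypothetical re-run of the comparison; timings vary by machine.
import random
import timeit

def quicksort(lst):
    # simple, non-stable pure-Python quicksort
    if len(lst) <= 1:
        return lst
    pivot = lst[len(lst) // 2]
    less = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    greater = [x for x in lst if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

data = list(range(10000))
random.shuffle(data)

t_builtin = timeit.timeit(lambda: sorted(data), number=10)
t_python = timeit.timeit(lambda: quicksort(data), number=10)
print("built-in sorted:       ", t_builtin)
print("pure-Python quicksort: ", t_python)
```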
Similarly, here are some results about the heapq module, which is
rewritten in C in the CVS tree for Python 2.4:
l = [random.random() for x in range(200000)]
heapq.heapify(l)
This code executes on my laptop in:
* 1.96 seconds on Python 2.3 (pure Python)
* 0.18 seconds on Python 2.4cvs (rewritten in C)
* 0.16 seconds on Python 2.3 with Psyco
So this is not so much a plug for Psyco as a rant against the current
trend of rewriting standard modules in C. Premature optimization and
all that.
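The heapify measurement can be sketched like this in modern Python, where heapq's C implementation long ago became the default, so the numbers above are of historical interest; the sketch just verifies what heapify actually establishes.

```python
# Rebuild of the heapify experiment, plus a check of the heap invariant.
import heapq
import random

def is_heap(h):
    # every parent must be <= both of its children
    n = len(h)
    return (all(h[i] <= h[2 * i + 1] for i in range(n) if 2 * i + 1 < n)
            and all(h[i] <= h[2 * i + 2] for i in range(n) if 2 * i + 2 < n))

l = [random.random() for _ in range(200000)]
heapq.heapify(l)
print(is_heap(l))   # True
```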
Worse, and more importantly, the optimization starts to become visible
to the programmer. Iterators, for example, are great in limited cases
but I consider their introduction a significant complication in the
language; before, you could expect that some function from which you
would expect a sequence returned a list. Python was all lists and
dicts, with dicts used as namespaces here and there. Nowadays you have
to be careful.  Moreover, it is harder to explain:

>>> zip([1,2,3], [4,5,6])    # easy to understand and explain
[(1, 4), (2, 5), (3, 6)]

>>> enumerate([6,7,8,9])     # uh ?
<enumerate object at 0x401a102c>
I know you can always do list(_). My point is that this is a
user-visible optimization. enumerate() should return a normal list, and
it should be someone else's job to ensure that it is correctly optimized
away if possible (and I'm not even talking about Psyco, it could be done
in the current Python implementation with a reasonable amount of
effort).
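The point is easy to demonstrate at the prompt; in modern Python the same design has spread even further, since zip() is now lazy as well and list() is the explicit materialization step.

```python
# enumerate() hands back an opaque iterator, not a list.
e = enumerate([6, 7, 8, 9])
print(e)           # <enumerate object at 0x...>, not a list
print(list(e))     # [(0, 6), (1, 7), (2, 8), (3, 9)]
print(list(e))     # [] -- the iterator is already exhausted
```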
Protesting-ly yours,
Armin
Hi !
For the 1st April, it's finish.
Michel Claveau/Hamster wrote: For the 1st April, it's finish.
Check it for yourself. Find yourself an Intel machine and grab Psyco
from http://psyco.sf.net. Here is the source of my test:
# Python Quicksort Written by Magnus Lie Hetland
# http://www.hetland.org/python/quicksort.html
def _partition(list, start, end):
    pivot = list[end]
    bottom = start-1
    top = end
    done = 0
    while not done:
        while not done:
            bottom = bottom+1
            if bottom == top:
                done = 1
                break
            if pivot < list[bottom]:
                list[top] = list[bottom]
                break
        while not done:
            top = top-1
            if top == bottom:
                done = 1
                break
            if list[top] < pivot:
                list[bottom] = list[top]
                break
    list[top] = pivot
    return top

def _quicksort(list, start, end):
    if start < end:
        split = _partition(list, start, end)
        _quicksort(list, start, split-1)
        _quicksort(list, split+1, end)

def quicksort(list):
    if len(list) > 1:
        _quicksort(list, 0, len(list)-1)

# ____________________________________________________________

import random, time, psyco

l = range(100000)
random.shuffle(l)

#TIMEIT = "l.sort()"
#TIMEIT = "quicksort(l)"
TIMEIT = "psyco.proxy(quicksort)(l)"

print TIMEIT, ':',
t = time.time()
exec TIMEIT
print time.time() - t
assert l == range(100000)
Armin Rigo <ar***@tunes.org> writes: enumerate([6,7,8,9]) # uh ?
<enumerate object at 0x401a102c>
I know you can always do list(_). My point is that this is a user-visible optimization. enumerate() should return a normal list, and it should be someone else's job to ensure that it is correctly optimized away if possible (and I'm not even talking about Psyco, it could be done in the current Python implementation with a reasonable amount of effort).
I think enumerate(xrange(1000000000)) returning a normal list would
exhaust memory before some later optimizer had a chance to do anything
with it.
> For the 1st April, it's finish.
That wasn't a joke, psyco does in fact work /really/ well for some things.
- Josiah
In article <40***************@tunes.org>, Armin Rigo wrote: The reason that Psyco manages to outperform the C implementation is not that gcc is a bad compiler (it is about 10 times better than Psyco's). The reason is that the C implementation must use a generic '<' operator to compare elements, while the Psyco version quickly figures out that it can expect to find ints in the list; it still has to check this assumption, but this is cheap and then the comparison is done with a single machine instruction.
Why can't the C implementation do the same thing?
Joe
Hello Paul,
Paul Rubin wrote: I think enumerate(xrange(1000000000)) returning a normal list would exhaust memory before some later optimizer had a chance to do anything with it.
There are two levels: the language specification and its implementation.
My point is that there is no reason why everything that is (at the
language level) a list, should always be implemented as a plain array of
objects. The list returned by range(1000000) doesn't have to use 4MB of
memory when the same information can be encoded in a couple of ints.
The presence of xrange() is the oldest of these user-visible
optimization hacks; the optimization should have been done in the
implementation instead.  Similarly, if you do 'a'*999999999 I can think
of a better way to encode the resulting string than with 100MB of RAM.
On your specific example, we could also argue that a slightly involved
but reasonable amount of effort would allow C functions to internally
receive a context hint, which would tell enumerate() that its result
will only ever be used for iteration when this is the case, so that it
can internally return an iterable instead of a list -- but this would
only be a hack, a workaround for the limits of CPython's internal object
representations.
Armin
Armin Rigo <ar***@tunes.org> writes: There are two levels: the language specification and its implementation. My point is that there is no reason why everything that is (at the language level) a list, should always be implemented as a plain array of objects. The list returned by range(1000000) doesn't have to use 4MB of memory when the same information can be encoded in a couple of ints. The presence of xrange() is the oldest of these user-visible optimization hack that should have been optimized in the implementation instead.
I think you're saying that instead of having xrange return a special
object, range should return that special object instead. I'm not too
clear on the distinction. Also, how can the compiler or interpreter
know whether the program will only access the object sequentially?
It has to predict the behavior of the program, instead of being told
explicitly.
Paul Rubin wrote: I think you're saying that instead of having xrange return a special object, range should return that special object instead. I'm not too clear on the distinction.
No, range should return an object that is a list, as far as you can tell
from Python, but which is represented more efficiently than an array of
objects internally. The distinction is between the language level (it
would be a list, with all operations, etc.) and the implementation
(there is no reason why all lists should be arrays of PyObjects
internally).
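In hindsight, Python 3's range() ended up close to what is described here: a compact object that answers length, indexing, slicing and membership queries from its start/stop/step fields, without materializing anything. It stops short of the full proposal, though, because it is immutable rather than a real list.

```python
# Python 3's range as a lazy sequence; nothing here allocates a
# billion-element array.
r = range(10 ** 9)
print(len(r))         # 1000000000, computed from start/stop/step
print(r[500000])      # 500000, O(1) indexing
print(10 ** 8 in r)   # True, O(1) membership test for ints
```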
Another example would be 'a'*999999999: the result is a string, but
there is no reason that it takes 100MB of memory. Instead, store it
into a C structure that contains a pointer to the original string object
'a' and the repetition counter, but still give this C structure the
Python type str, so that the difference doesn't show up and the Python
language remains simple. (This is a bit difficult to implement
currently in CPython, but not impossible.)
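The repetition-counter idea can be sketched in pure Python. The class name and layout below are illustrative only, not anything CPython does; the key point is that the full string is only built when something actually demands it.

```python
# Sketch of a lazily-repeated string: store the base and a count,
# pay the full memory cost only on materialization.
class LazyRepeat:
    def __init__(self, base, count):
        self.base = base
        self.count = count

    def __len__(self):
        return len(self.base) * self.count

    def __getitem__(self, i):
        # non-negative indices only, for brevity
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.base[i % len(self.base)]

    def __str__(self):
        # only here does the real allocation happen
        return self.base * self.count

s = LazyRepeat('ab', 5)
print(len(s))    # 10 -- no 10-character string exists yet
print(s[3])      # 'b'
print(str(s))    # 'ababababab'
```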
Also, how can the compiler or interpreter know whether the program will only access the object sequentially? It has to predict the behavior of the program, instead of being told explicitly.
Ideally: If you do x=range(100); x[50]='hi' then the interpreter first
builds this optimized range representation and assigns it to x; and when
in the next statement you modify this list x it says 'oops! i cannot do
that with this representation', so it reverts to an array-like
representation (i.e. it creates all 100 elements) and then changes the
50th. No gain here. If on the other hand you only ever do 'easy'
things with your list, like iterate over it or read elements, then it
can all be done with the range representation, without falling back to
the array representation.
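A minimal sketch of this fall-back behaviour, with invented names (a real implementation would live in C and cover the full list API): reads are served from the compact range, and the first mutation triggers the "oops" conversion to an array.

```python
# 'Revert on mutation': a list-like object that starts as a compact
# range and expands only when it has to.
class RangeList:
    def __init__(self, n):
        self._range = range(n)   # compact representation
        self._items = None       # array representation, created lazily

    def _materialize(self):
        if self._items is None:
            self._items = list(self._range)   # 'oops', fall back

    def _backing(self):
        return self._items if self._items is not None else self._range

    def __len__(self):
        return len(self._backing())

    def __getitem__(self, i):
        # reads work on either representation
        return self._backing()[i]

    def __setitem__(self, i, value):
        self._materialize()      # mutation forces the array form
        self._items[i] = value

x = RangeList(100)
print(x[50])    # 50, served straight from the range
x[50] = 'hi'    # now the 100 elements really get created
print(x[50])    # 'hi'
```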
Again I'm not saying it is trivial to implement it, but that not having
to expose for optimization purposes 'xrange' and the whole 'iterator'
part of the language would be worth it, in my opinion.
Armin
Hi,
Joe Mason wrote: The reason is that the C implementation must use a generic '<' operator to compare elements, while the Psyco version quickly figures out that it can expect to find ints in the list; it still has to check this assumption, but this is cheap and then the comparison is done with a single machine instruction.
Why can't the C implementation do the same thing?
You could, if you wanted to optimize specifically lists of integers. If
you did the result would probably be really fast. The problem is that
you can only really special-case so many types: the C code has to deal
with all cases without knowing which cases are likely. The Psyco
version quickly figures out that for this list it pays off to make a
special case for integers; with another list, the machine code would be
different, special-cased differently.
However, in most real examples, you are not sorting a list of integers
but of something more complex anyway, where the built-in sort wins
easily. My message was intended as a long-term hint that maybe, at some
point, the built-in sort will actually be more often faster than the C
one if rewritten in Python. The advantage would then be that (as Psyco
does in a limited fashion) you can specialize the code for the
particular kind of list you are dealing with.
Armin
> No, range should return an object that is a list, as far as you can tell from Python, but which is represented more efficiently than an array of objects internally. The distinction is between the language level (it would be a list, with all operations, etc.) and the implementation (there is no reason why all lists should be arrays of PyObjects internally).
You can implement such a thing already.  In fact, up until recently,
xrange supported basically everything that a list object did, except
for mutations.  The reason it doesn't anymore is that for multiple
versions of Python, such behavior was buggy and poorly supported.  If
you are bored enough, feel free to make your own xrange-like object that
supports mutation. Heck, it can even subclass 'list', though it need
not have any standard list internals.
Another example would be 'a'*999999999: the result is a string, but there is no reason that it takes 100MB of memory. Instead, store it into a C structure that contains a pointer to the original string object 'a' and the repetition counter, but still give this C structure the Python type str, so that the difference doesn't show up and the Python language remains simple. (This is a bit difficult to implement currently in CPython, but not impossible.)
Also, you are free to implement such a thing.  I believe the reason
the current CPython implementation doesn't follow this route (and
other suggested optimizations) is that it would needlessly complicate
the implementation of CPython.
Ideally: If you do x=range(100); x[50]='hi' then the interpreter first builds this optimized range representation and assigns it to x; and when in the next statement you modify this list x it says 'oops! i cannot do that with this representation', so it reverts to an array-like representation (i.e. it creates all 100 elements) and then changes the 50th. No gain here. If on the other hand you only ever do 'easy' things with your list, like iterate over it or read elements, then it can all be done with the range representation, without falling back to the array representation.
Why not instead use a dictionary-based approach for special items? It
would be far more memory efficient, and wouldn't incur the modification
penalty.
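Josiah's dictionary-based variant might look like this (again with invented names, and only a few operations shown): instead of expanding the whole list on the first write, record just the modified slots in an overlay dict.

```python
# Compact range plus a dict of modified slots: mutation costs O(1)
# extra space per changed element, not O(n).
class OverlayRangeList:
    def __init__(self, n):
        self._range = range(n)
        self._overrides = {}     # index -> replacement value

    def __len__(self):
        return len(self._range)

    def __getitem__(self, i):
        return self._overrides.get(i, self._range[i])

    def __setitem__(self, i, value):
        self._range[i]           # bounds check via the range
        self._overrides[i] = value

x = OverlayRangeList(1000000)
x[50] = 'hi'
print(x[50])    # 'hi'
print(x[51])    # 51 -- still no million-element array anywhere
```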
Again I'm not saying it is trivial to implement it, but that not having to expose for optimization purposes 'xrange' and the whole 'iterator' part of the language would be worth it, in my opinion.
I think that one of the desires of offering 'iterator' concepts to
users, both new and seasoned, is that it allows people to think in ways
they didn't before. Allowing people to make those optimizations 'by
hand', I believe (as an educator and computer scientist), allows them to
grow as programmers (or computer scientists, as the case may be).
Don't get me wrong, it would be terribly convenient for Python to
abstract away all the nastiness of optimization, but if/when someone
were to do something that had been /very/ fast before, but was now
awfully slow (think the current Queue.Queue object for small vs. large
numbers of items), they are going to jump to the conclusion that Python
is broken in some fundamental way, maybe come here to complain about
Python being broken, and those who are familiar with the internals
would say, "you are ruining Python's early optimization by this thing
that you do".
Which really gets us to the fact that you are asking for the Python
internals to be optimized. In fact, while simultaneously saying "don't
optimize early", you are optimizing early by saying that range should be
optimized, as should string multiplication, etc. Goodness man, pick a
position and stick with it.
- Josiah
On Sat, 3 Apr 2004, Armin Rigo wrote: This is a rant against the optimization trend of the Python interpreter.
Sorting a list of 100000 integers in random order takes:
* 0.75 seconds in Python 2.1 * 0.51 seconds in Python 2.2 * 0.46 seconds in Python 2.3
Tim Peters did a great job optimizing list.sort(). If I try with a simple, non-stable pure Python quicksort implementation, in Python 2.3:
* 4.83 seconds * 0.21 seconds with Psyco
First step towards world domination of high-level languages :-)
{...}
So this is not so much a plug for Psyco as a rant against the current trend of rewriting standard modules in C. Premature optimization and all that.
{...}
Protesting-ly yours,
While I certainly appreciate pysco and what it can do, I believe your
protest to be unreasonable as it denies better performance on platforms
that psyco doesn't yet (& probably never will) support.
Moreover, your protest about iterators is also unreasonable as people are
benefitting from the reduced memory consumption iterators and their ilk
bring (quite often accompanied by performance gains from not having to
thrash large amounts of RAM through pitiful caches). As such, iterators
are a _different_ optimisation, and I hope that you can come to terms with
this and psycoise them too!
--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370 an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia
Armin Rigo <ar***@tunes.org> writes: Ideally: If you do x=range(100); x[50]='hi' then the interpreter first builds this optimized range representation and assigns it to x; and when in the next statement you modify this list x it says 'oops! i cannot do that with this representation', so it reverts to an array-like representation (i.e. it creates all 100 elements) and then changes the 50th. No gain here. If on the other hand you only ever do 'easy' things with your list, like iterate over it or read elements, then it can all be done with the range representation, without falling back to the array representation.
Maybe there is something to this.
"Armin Rigo" wrote ... Hi!
This is a rant against the optimization trend of the Python interpreter.
Sorting a list of 100000 integers in random order takes:
* 0.75 seconds in Python 2.1 * 0.51 seconds in Python 2.2 * 0.46 seconds in Python 2.3
Tim Peters did a great job optimizing list.sort(). If I try with a simple, non-stable pure Python quicksort implementation, in Python 2.3:
* 4.83 seconds * 0.21 seconds with Psyco
First step towards world domination of high-level languages :-)
The reason that Psyco manages to outperform the C implementation is not that gcc is a bad compiler (it is about 10 times better than Psyco's). The reason is that the C implementation must use a generic '<' operator to compare elements, while the Psyco version quickly figures out that it can expect to find ints in the list; it still has to check this assumption, but this is cheap and then the comparison is done with a single machine instruction.
I think Psyco is great! But Python + Psyco does not outperform
C overall. Try writing a chess program in Python and see how it
performs against C.
Patrick
In article <40***************@tunes.org>, Armin Rigo wrote: Another example would be 'a'*999999999: the result is a string, but there is no reason that it takes 100MB of memory. Instead, store it into a C structure that contains a pointer to the original string object 'a' and the repetition counter, but still give this C structure the Python type str, so that the difference doesn't show up and the Python language remains simple. (This is a bit difficult to implement currently in CPython, but not impossible.)
What this does is makes the interpreter more complicated for features
that not all programs will use. Only a very few programs will have long
strings of repeated characters, and it's reasonable to ask them to
implement their own stringlike class if they really want it.
If this is built into the interpreter, then either it's an optional
feature, in which case all those programs that rely on it to be remotely
memory-efficient aren't portable, or it requires every single
implementation to include it, including the PalmOS port and the one
that's supposed to run on cell phones.
Joe
In article <40***************@tunes.org>, Armin Rigo wrote: Joe Mason wrote: > The reason is that the C implementation must use a generic '<' operator > to compare elements, while the Psyco version quickly figures out that it > can expect to find ints in the list; it still has to check this > assumption, but this is cheap and then the comparison is done with a > single machine instruction.
Why can't the C implementation do the same thing?
You could, if you wanted to optimize specifically lists of integers. If you did the result would probably be really fast. The problem is that you can only really special-case so many types: the C code has to deal with all cases without knowing which cases are likely. The Psyco version quickly figures out that for this list it pays off to make a special case for integers; with another list, the machine code would be different, special-cased differently.
Ah, good point. (In fact, not special-casing lots of things in the C
code is exactly what I was arguing against in my other post.)
Joe
[Armin Rigo] enumerate([6,7,8,9]) # uh ? <enumerate object at 0x401a102c>
This got me thinking about how much I would like to see the contents
of an iterator at the interactive prompt.
I wonder if the read-eval-print loop could be trained to make a better
display:
# rough pseudo-code sketch
while 1:
    command = raw_input()
    result = eval(command)
    if result is None:
        continue
    if is_iterator(result):
        result, copy = itertools.tee(result)
        print "<%s object at 0x%8x:" % \
              (type(result).__name__, id(result)),
        for elem in itertools.islice(copy, 3):
            print repr(elem),
        else:
            print '...',
        print '>'
    else:
        print repr(result)
    _ = result
# intended result

>>> enumerate('abcdefgh')
<enumerate object at 0x401a102c: (0, 'a') (1, 'b') (2, 'c') ...>
>>> list(_)
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'),
 (7, 'h'), (8, 'i'), (9, 'j'), (10, 'k'), (11, 'l'), (12, 'm'), (13,
 'n')]
Raymond Hettinger
>>>>> "Armin" == Armin Rigo <ar***@tunes.org> writes:
Armin> Worse, and more importantly, the optimization starts to
Armin> become visible to the programmer. Iterators, for example,
Armin> are great in limited cases but I consider their
Armin> introduction a significant complication in the language;
Armin> before, you could expect that some function from which you
Armin> would expect a sequence returned a list. Python was all
Armin> lists and
Iterators are an optimization? I always considered them just a more
clean and elegant way to do the same thing :-).
--
Ville Vainio http://tinyurl.com/2prnb
On Sat, Apr 03, 2004 at 09:44:38PM -0800, Raymond Hettinger wrote: [Armin Rigo]>> enumerate([6,7,8,9]) # uh ? <enumerate object at 0x401a102c>
This got me thinking about how much I would like to see the contents of an iterator at the interactive prompt.
I wonder if the read-eval-print loop could be trained to make a better display:
[snip] # intended result enumerate('abcdefgh') <enumerate object at 0x401a102c: (0, 'a') (1, 'b') (2, 'c') ...> list(_)
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'), (7, 'h'), (8, 'i'), (9, 'j'), (10, 'k'), (11, 'l'), (12, 'm'), (13, 'n')]
Yeah!  I love this idea.  It may make not only enumerate() but also
reversed() and itertools' internal objects more interactive-friendly.
Hye-Shik
Joe Mason <jo*@notcharles.ca> wrote in
news:sl****************@gate.notcharles.ca: In article <40***************@tunes.org>, Armin Rigo wrote: The reason that Psyco manages to outperform the C implementation is not that gcc is a bad compiler (it is about 10 times better than Psyco's). The reason is that the C implementation must use a generic '<' operator to compare elements, while the Psyco version quickly figures out that it can expect to find ints in the list; it still has to check this assumption, but this is cheap and then the comparison is done with a single machine instruction.
Why can't the C implementation do the same thing?
It easily could, however sorting a list of ints sounds to me like a
comparatively unusual thing to do in any real application. Sorting strings,
or sorting more complex data structures are much more usual. In other words
it would be kind of pointless to introduce an optimisation that only helps
speedup benchmarks.
Also the builtin sort handles other cases which are important. Not all data
you want to sort is randomly ordered. The builtin sort method should
outperform quicksort in those cases where the input is already largely
sorted. The builtin sort is also stable, although for a list of random
integers you might be hard pushed to tell :-)
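Both properties are easy to check at the prompt; stability, for instance, shows up as soon as the elements carry more than the sort key:

```python
# Stability: equal keys keep their original relative order.
data = [('b', 1), ('a', 2), ('b', 3), ('a', 4)]
by_letter = sorted(data, key=lambda pair: pair[0])
print(by_letter)   # [('a', 2), ('a', 4), ('b', 1), ('b', 3)]
# ('a', 2) stays before ('a', 4), and ('b', 1) before ('b', 3).
```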
Hi !
Yes, but Psyco is writted in C & Python, and it use an C module.
Said "Psyco is faster than C", it's like said "Psyco is faster than Psyco".
@-salutations (and sorry for my *bad english*)
--
Michel Claveau
Hi !
Sorry, but I had let down the C language. I can't compare.
See also my answer to Josiah Carlson.
And Psyco is sufficiently good not to need ambiguous arguments.
@-salutations
--
Michel Claveau
import sys, itertools
from pprint import *

def install_hook():
    def displayhook(v):
        import __main__
        if v is not None:
            if isiterator(v):   # assumes some isiterator() predicate
                print "<%s object at 0x%8x:" % (
                        type(v).__name__, id(v)),
                v, copy = itertools.tee(v)
                for elem in itertools.islice(copy, 3):
                    print saferepr(elem),
                try:
                    copy.next()
                except StopIteration:
                    pass
                else:
                    print '...',
                sys.stdout.softspace = 0
                print ">"
            else:
                pprint(v)
            __main__._ = v
    sys.displayhook = displayhook
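For modern readers, here is the same idea as a standalone Python 3 function; the iterator check and the exact formatting are my own approximations, not Jeff's hook, and a real displayhook would also have to hand the tee'd iterator back to the caller.

```python
# Python 3 sketch: preview the first few values of an iterator
# without visibly consuming it.
import itertools

def preview(obj, limit=3):
    """Return a short repr; iterators show their first few values."""
    if not (hasattr(obj, '__next__') and iter(obj) is obj):
        return repr(obj)
    name = type(obj).__name__
    obj, copy = itertools.tee(obj)           # obj is still fully usable
    head = list(itertools.islice(copy, limit + 1))
    shown = ' '.join(repr(e) for e in head[:limit])
    ellipsis = ' ...' if len(head) > limit else ''
    return '<%s object: %s%s>' % (name, shown, ellipsis)

print(preview(enumerate('abcdefgh')))
print(preview([1, 2]))   # non-iterators fall back to plain repr
```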
Jeff
Armin Rigo: Another example would be 'a'*999999999: the result is a string, but there is no reason that it takes 100MB of memory. Instead, store it into a C structure that contains a pointer to the original string object 'a' and the repetition counter, but still give this C structure the Python type str, so that the difference doesn't show up and the Python language remains simple.
I've written lazy code like this for my own projects, where the
concrete object wasn't fully created until needed. One of the
problems I had with it was error behaviour. In this case, suppose
there isn't enough memory available. Python as is will attempt to
allocate enough space when you request it and raise a MemoryError
if there isn't enough space.
Python as you propose it may allocate the string-like object and
only later (eg, when passing the object to some external C
function which expects a char *) realize the full string. There
isn't enough memory so the allocation fails, raising a MemoryError.
But how is the programmer to realize which operations may raise
a MemoryError? Instead, will everything need a try/except guard
around it?
Andrew da***@dalkescientific.com
Armin Rigo <arigo <at> tunes.org> writes: Again I'm not saying it is trivial to implement it, but that not having to expose for optimization purposes 'xrange' and the whole 'iterator' part of the language would be worth it, in my opinion.
First of all, I'm trying to see whether I can write through this
interface.  As you might have realized, after they fooled me with that
April joke, my site was really lost, and this is a test.

Anyway, I'd like to add that Armin's idea can be extended (as he surely
knows) to special-casing seldom assignments to an algorithmically given
array.  That is, in the case of just a few assignments, a list could
internally augment the expression.  If the number of elements grows, it
could be turned into a preferred dictionary after reaching some
threshold.  And after another threshold, it could be turned into
something like a real list, or just a specialized, typed list, depending
on the type.

In general, I share Armin's impression that iterators are nothing else
but an explicit way to spell optimizations.

While explicit is better than implicit, in the case of optimizations
I believe it is an over-specification, and almost completely in the
wrong direction.  We still have to prove this in a real project.
There is one thing where I think explicit iterator do make some sense,
in a way the reader might not expect.
Let me show:
if you do something like

    for each in sequence:
        try:
            do_something_with(each)
        except SomeThingWrongWithThis:
            # handle exception somehow
            pass

In terms of iterators, we implicitly create an iterator here and consume it.
The explicit notation of iterators gives us this advantage:
instead you can do it this way:

    it = iter(sequence)
    can_continue = 1
    while can_continue:
        try:
            for each in it:
                do_something_with(each)
            can_continue = 0      # iterator exhausted, we are done
        except SomeThingWrongWithThis:
            can_continue = some_recovery(each)
In words: By the help of iterators, it is possible to write exception
handlers for special cases *outside* the loop, repair the error, and
continue iterating in the loop.
I have used this pattern a lot of times now, and I'm quite happy with it.
But I have to admit that this, too, is a bad, explicit optimization, and
a compiler could be smart enough to do this for me, automatically.
cheers - chris
Raymond Hettinger wrote:
[Armin Rigo] >>> enumerate([6,7,8,9]) # uh ? <enumerate object at 0x401a102c>
This got me thinking about how much I would like to see the contents of an iterator at the interactive prompt.
I wonder if the read-eval-print loop could be trained to make a better display:
[snip pseudo-code sketch and intended result]
I thought this myself, but what if the iterator is computationally
intensive?
--
CARL BANKS http://www.aerojockey.com/software
"If you believe in yourself, drink your school, stay on drugs, and
don't do milk, you can get work."
-- Parody of Mr. T from a Robert Smigel Cartoon
Armin Rigo wrote: Paul Rubin wrote: I think you're saying that instead of having xrange return a special object, range should return that special object instead. I'm not too clear on the distinction.
No, range should return an object that is a list, as far as you can tell from Python, but which is represented more efficiently than an array of objects internally. The distinction is between the language level (it would be a list, with all operations, etc.) and the implementation (there is no reason why all lists should be arrays of PyObjects internally).
Another example would be 'a'*999999999: the result is a string, but there is no reason that it takes 100MB of memory. Instead, store it into a C structure that contains a pointer to the original string object 'a' and the repetition counter, but still give this C structure the Python type str, so that the difference doesn't show up and the Python language remains simple. (This is a bit difficult to implement currently in CPython, but not impossible.)
Hmm. What would be really cool is an abstract sequence type with all
kinds of knobs to specify the operations you want to support. In
other words, rather than saying "I want a vector" or "I want a linked
list," you say, "I want fast indexing, fast sorting, no insertion, and
no resizing," or "I want no indexing, no sorting, fast insertion, fast
appending on both ends". Then, let the computer choose the actual
implementation.
Better yet, if the compilation is highly optimized, the compiler could
trace the path of the object (best as it can) through the program, and
maybe use profiling data too, to see for itself the how the object is
used, thus freeing the programmer from even specifying the operations.
The programmer could say, "I want a sequence," use it, and the
compiler could choose the best implementation.
The only thing is, I think that would be prohibitively difficult in an
on-the-fly compiled language like Python. Even having a list with one
or two optimized forms would have drawbacks: it would be hard to
optimize the fast parts with the interpreter second-guessing you all
the time, and it would be hard to write extension modules, since
they'd have to be aware of all the different layouts, or have to
always use an API.
And, frankly, I think having a simple implementation is important,
too. Complex implementations are more likely to have lots of bugs
that could burden users more than some distinct optimized types would.
In my opinion, Python is doing the right thing by having one internal
form per type.
Given that, and keeping in mind that some of these optimizations
really do help, the question to ask is: do the extra power and
efficiency the optimized version gives you justify the extra burden of
complexity on the programmer? I believe that the answer is
occasionally yes.
Consider tuples. Tuples are for the most part an optimized version
of list, but with a few powers list doesn't have, and without a few
powers list has. It's the same with iterators.
IMO, the benefit an iterator gives you is far more than the
benefit a tuple gives you.  I would say 'yes' for an iterator, it's
worth a bit of complexity; and probably 'no' for a tuple (although
it's close).
my-two-cents-ly y'rs,
--
CARL BANKS http://www.aerojockey.com/software
"If you believe in yourself, drink your school, stay on drugs, and
don't do milk, you can get work."
-- Parody of Mr. T from a Robert Smigel Cartoon
[Armin Rigo] Armin> Worse, and more importantly, the optimization starts to Armin> become visible to the programmer. Iterators, for example, Armin> are great in limited cases but I consider their Armin> introduction a significant complication in the language; Armin> before, you could expect that some function from which you Armin> would expect a sequence returned a list. Python was all Armin> lists and
[Ville Vainio]
Iterators are an optimization? I always considered them just a more
clean and elegant way to do the same thing :-).
In the planning discussions, optimization was not mentioned as goal.
In fact, I think everyone was surprised that they happened to run
faster than the __getitem__ hack. It was also a surprise at how much
they unified the language and made the various pieces work together a
little better.
The original goals were much smaller:
* less cumbersome file iteration (eliminating the need for xreadlines
and reducing the need for the fileinput module).
* providing an extensible interface
* making the code for non-sequence collections more concise and
readable
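The first goal is easy to make concrete. A minimal before/after sketch in modern Python spelling (the file is a throwaway temp file, and the `xreadlines` call is Python 2-era so it appears only as a comment):

```python
import os
import tempfile

# Create a small throwaway file to iterate over.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Old style (circa Python 2.1):  for line in f.xreadlines(): ...
# With the iterator protocol, the file object itself yields lines lazily:
lines = []
with open(path) as f:
    for line in f:
        lines.append(line.rstrip("\n"))

os.remove(path)
print(lines)  # ['alpha', 'beta', 'gamma']
```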
There was a thought that the performance of dictionary iteration would
improve. Also, while it was not discussed explicitly, everyone was
aware that iterators were more memory/cache friendly than their list
based counterparts.
Raymond Hettinger
[Jeff Epler]
from pprint import *
def install_hook():
    def displayhook(v):
        import __main__
        if v is not None:
            if isiterator(v):
                print "<%s object at 0x%8x:" % (
                    type(v).__name__, id(v)),
                v, copy = itertools.tee(v)
                for elem in itertools.islice(copy, 3):
                    print saferepr(elem),
                try:
                    copy.next()
                except StopIteration:
                    pass
                else:
                    print '...',
                sys.stdout.softspace = 0
                print ">"
            else:
                pprint(v)
            __main__._ = v
    sys.displayhook = displayhook
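Jeff's hook is Python 2 code; here is a rough modern-Python translation of the same idea, as a sketch only (the crude `iter(v) is v` iterator test and the three-element preview length are assumptions carried over from the original):

```python
import itertools
import sys
from pprint import pprint

def install_hook():
    """Make the interactive prompt preview the first few items of an iterator."""
    def displayhook(v):
        import __main__
        if v is None:
            return
        if hasattr(v, "__next__") and iter(v) is v:  # crude iterator check
            name, addr = type(v).__name__, id(v)
            v, copy = itertools.tee(v)  # keep the original stream usable
            head = " ".join(repr(x) for x in itertools.islice(copy, 3))
            sentinel = object()
            more = " ..." if next(copy, sentinel) is not sentinel else ""
            print("<%s object at 0x%x: %s%s>" % (name, addr, head, more))
        else:
            pprint(v)
        __main__._ = v
    sys.displayhook = displayhook
```

After `install_hook()`, typing `enumerate('abcdefgh')` at the prompt prints a short preview of the first items instead of the bare repr.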
This is very nice.
In my sketch, I used Py2.4's itertools.tee() to handle non-restartable
iterators. For Py2.3, a good alternative is:
copy = list(itertools.islice(v, 3))
v = itertools.chain(copy, v)
Now, copy can be used in the display and "v" produces the same values
as the original iterator.
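The Py2.3 alternative still works unchanged today; a small self-contained check of the trick (the concrete iterator is just an example):

```python
import itertools

v = iter(range(10))                  # some non-restartable iterator
copy = list(itertools.islice(v, 3))  # peel off the first three items
v = itertools.chain(copy, v)         # glue them back in front

print(copy)     # [0, 1, 2] -- usable for display
print(list(v))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] -- the full original stream
```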
Cheers,
Raymond Hettinger
P.S. If some experimentation shows your code to be useful at the
interactive prompt, please submit a patch on SF so it won't be lost.
>>> enumerate('abcdefgh')
<enumerate object at 0x401a102c: (0, 'a') (1, 'b') (2, 'c') ...>
>>> list(_)
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'),
(7, 'h'), (8, 'i'), (9, 'j'), (10, 'k'), (11, 'l'), (12, 'm'), (13, 'n')]
[Carl Banks] I thought this myself, but what if the iterator is computationally intensive?
Since no more than three items are displayed, the interactive prompt
will still handle this much better than something like range(10000000)
which takes a while to compute and display.
Besides, everything you do in Python pays a little performance penalty
just to make sure you can break out of computationally intensive
tasks.
For me, these almost never arise at the interactive prompt.
Likewise, I program in a less hostile world than Andrew Dalke who has
to contend with pandora's box iterators which ruin the lives of mortal
men who think they can call .next() with impunity ;-)
Raymond Hettinger
"Michel Claveau/Hamster" <No********@No.Spam.mclaveau.No.Spam.com> writes:
> Yes, but Psyco is writted in C & Python, and it use an C module. Said
> "Psyco is faster than C", it's like said "Psyco is faster than Psyco".
You appear to be missing the point completely.
The language used to implement the tools you use is irrelevant. What
is important is the language in which you are programming.
Armin's point is that (in certain circumstances) you can achieve a
significant speedup in one of two ways:
- Recode some Python code in C
- import psyco; psyco.full()
Note: in the second case, this is _all_[*] you have to do; it takes
about 2 seconds of work. In the first case it takes many
hours/days/weeks of tedious and error-prone work.
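The two-second route looks like this in practice; the hot loop below is an arbitrary stand-in, and the try/except guard is my addition so the sketch also runs where Psyco (a Python 2.x-era extension) is not installed:

```python
def hot_loop(n):
    # A deliberately naive pure-Python loop that a JIT can specialize.
    total = 0
    for i in range(n):
        total += i * i
    return total

try:
    import psyco      # the historical specializing compiler for CPython 2.x
    psyco.full()      # this really is all you have to do
except ImportError:
    pass              # no Psyco available: run unaccelerated

print(hot_loop(10))  # 285
```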
The point is: why code in C when you can get just as good program
speed performance by coding in Python?
Put another way: Why code in a low-level, tedious, rigid,
error-inducing language, when you can get just as good program speed
performance by writing in a high-level, flexible, dynamic, pleasant
language?
[*] Yes, I'm deliberately keeping things simple in order not to draw
attention away from the real point.
Raymond Hettinger wrote:
...
> P.S. If some experimentation shows your code to be useful at the
> interactive prompt, please submit a patch on SF so it won't be lost.
Why put this behaviour in the interactive prompt rather than in repr()?
Paul Prescod
> Yes, but Psyco is writted in C & Python, and it use an C module. Said
> "Psyco is faster than C", it's like said "Psyco is faster than Psyco".
You probably meant "C is faster than C". That is a reasonable
statement, but users of Psyco don't need to write anything special
beyond a handful of extra statements in Python.
To the user, Psyco is a module that makes things fast. They wrote
everything in Python, so to them, "Python with Psyco is faster than C"
is far more accurate.
- Josiah
Paul Rubin <http://ph****@NOSPAM.invalid> writes:
> Armin Rigo <ar***@tunes.org> writes:
> > Ideally: If you do x=range(100); x[50]='hi' then the interpreter
> > first builds this optimized range representation and assigns it to
> > x; and when in the next statement you modify this list x it says
> > 'oops! i cannot do that with this representation', so it reverts to
> > an array-like representation (i.e. it creates all 100 elements) and
> > then changes the 50th. No gain here. If on the other hand you only
> > ever do 'easy' things with your list, like iterate over it or read
> > elements, then it can all be done with the range representation,
> > without falling back to the array representation.
Maybe there is something to this.
Armin is usually worth listening to :-)
You can find a step towards many of these ideas in PyPy, which should
surprise no one at all...
Cheers,
mwh
--
Indeed, when I design my killer language, the identifiers "foo" and
"bar" will be reserved words, never used, and not even mentioned in
the reference manual. Any program using one will simply dump core
without comment. Multitudes will rejoice. -- Tim Peters, 29 Apr 1998
Raymond Hettinger:
> Likewise, I program in a less hostile world than Andrew Dalke who has
> to contend with pandora's box iterators which ruin the lives of mortal
> men who think they can call .next() with impunity ;-)
I did say "Should be rare though." :)
Andrew da***@dalkescientific.com
(And now you've got me thinking about the Tomb Raider II movie.
Not a bad thing.)
Michel Claveau/Hamster wrote:
> Hi !
> Yes, but Psyco is writted in C & Python, and it use an C module. Said
> "Psyco is faster than C", it's like said "Psyco is faster than Psyco".
This is not so much of a contradiction.
First of all, Psyco generates assembly code. Not the fastest, but it
enters an area where Python itself has no access, yet. So what it does
is produce faster code: not as fast as C could, but faster than the
existing C implementation of Python.
Second, depending on the algorithm you use, even Python can be
faster than C. For example, a few years ago, I implemented
a fast long integer multiplication in Python. At that time,
CPython's long multiplication still had asymptotic behavior of
O(n**2), while the Karatsuba algorithm I used had something
like O(n**1.585).
The difference showed up with several thousands of digits, but
it was remarkable.
Later on, Karatsuba was built into the core, and it became clear
to me that *I* caused this mess, because I had the chance to get
that algorithm into the core a long time ago. I just missed it.
What I'm trying to say: Finally it is always the algorithm.
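Christian's point can be illustrated; the following is a minimal Karatsuba sketch (not his original code) for non-negative integers, doing three recursive multiplications instead of four, which is what yields the roughly O(n**1.585) behaviour:

```python
def karatsuba(x, y):
    """Multiply non-negative integers by Karatsuba's divide-and-conquer."""
    if x < 10 or y < 10:          # small operands: plain multiplication
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << n) - 1
    xh, xl = x >> n, x & mask     # split x = xh * 2**n + xl
    yh, yl = y >> n, y & mask
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    c = karatsuba(xh + xl, yh + yl) - a - b   # cross terms, one multiply
    return (a << (2 * n)) + (c << n) + b

print(karatsuba(12345, 6789))  # 83810205
```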
ciao - chris
--
Christian Tismer :^) <mailto:ti****@stackless.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
On 03 Apr 2004 18:23:11 -0800, Paul Rubin
<http://ph****@NOSPAM.invalid> wrote:
> Armin Rigo <ar***@tunes.org> writes:
> > Ideally: If you do x=range(100); x[50]='hi' then the interpreter
> > first builds this optimized range representation and assigns it to
> > x; and when in the next statement you modify this list x it says
> > 'oops! i cannot do that with this representation', so it reverts to
> > an array-like representation (i.e. it creates all 100 elements) and
> > then changes the 50th. No gain here. If on the other hand you only
> > ever do 'easy' things with your list, like iterate over it or read
> > elements, then it can all be done with the range representation,
> > without falling back to the array representation.
Maybe there is something to this.
The problem is, once you start, where do you stop?
At the moment, Armin suggests optimising a list consisting of a
single repeating item. But what about optimising a repeating list with
one or two special cases, so that the "x[50]='hi'" above doesn't incur
a penalty?
Consider integer lists - what about optimising arithmetic progressions?
Geometric progressions? Progressions with special cases? Progressions
that are the intersection, or union, or whatever of other
progressions?
If the internal implementation doesn't special-case the implementation
of operations on these, all you have is lazy evaluation. But if the
internal implementation adds special-case implementations of
operations, you either end up with an huge number of special case
implementation methods (other parameters can end up with special-case
implementations, not just 'self') or you have to implement what
amounts to a full algebraic solving engine in the Python interpreter.
Or else you can just choose to special case the really important types
and operations, which I believe Python already does to some degree (an
integer is a special case of a long integer, for instance, and an
iterator is a special case of a list with only a subset of the
operations available to a standard list) and provide the programmer
with the tools to easily implement any further special cases that he
may need.
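The "special case until mutation" idea can be sketched with a hypothetical wrapper class (`LazyList` is my name, not anything in CPython): it answers reads from a compact range representation and only materializes a real list when a write forces it to.

```python
class LazyList:
    """List-like object backed by a range until a mutation materializes it."""

    def __init__(self, n):
        self._range = range(n)  # compact representation: O(1) storage
        self._items = None      # real list, created only on first write

    def _materialize(self):
        # The 'oops' step: fall back to an array-like representation.
        if self._items is None:
            self._items = list(self._range)

    def __len__(self):
        return len(self._items if self._items is not None else self._range)

    def __getitem__(self, i):
        src = self._items if self._items is not None else self._range
        return src[i]

    def __setitem__(self, i, value):
        self._materialize()
        self._items[i] = value

    def __iter__(self):
        return iter(self._items if self._items is not None else self._range)

x = LazyList(100)
print(x[50])   # 50, served straight from the range, no 100-element list yet
x[50] = 'hi'   # triggers materialization
print(x[50])   # hi
```

Reads and iteration stay cheap as long as the list is never written; the first `__setitem__` pays the full cost, exactly the trade-off Armin described.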
--
Steve Horne
steve at ninereeds dot fsnet dot co dot uk