
Towards faster Python implementations - theory

Some faster Python implementations are under development.
JPython has been around for a while, and PyPy and ShedSkin
continue to move forward. It's worth thinking about what slows
down Python implementations.

It isn't the dynamism, really. As others have pointed out
in the Python literature, most of the time, the more elaborate
dynamic features aren't being used for any given variable or
object. If code has something like "x = 1.0", the odds are that
"x" is going to stay a floating point number, and not suddenly turn
into a list or an object reference. The type inference of Shed Skin
builds on that assumption, adding some restrictions about changing of
variable types.
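That assumption is easy to picture in code. A minimal sketch (function names invented for illustration): the first function is the common, type-stable case a compiler can unbox; the second is the rare "surprise" Shed Skin rules out.

```python
# In typical Python, a name stays monomorphic: "total" below is always a
# float, so a compiler could store it unboxed as a plain C double.
def mean(values):
    total = 0.0          # "total" starts as a float...
    for v in values:
        total += v       # ...and stays a float on every iteration
    return total / len(values)

# The "surprise" Shed Skin rules out: rebinding a name to a new type.
def surprising():
    x = 1.0
    x = [1, 2, 3]        # legal Python, but defeats simple type inference
    return x
```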

The Shed Skin effort indicates that explicit typing, via 'decorators'
or otherwise, isn't really necessary. What's necessary is the avoidance
of "surprises" in the language. In this context, a "surprise" is
the use of a dynamic feature in a way that can't be seen at compile time.

A typical "surprise" would be the use of "setattr" on an object from
outside the compilation unit that defines the object. Within a module,
"setattr" on an object in that module is no problem; the compiler can see
it and generate the extra machinery needed to make an object dynamically
alterable at run time. But if an object doesn't need that extra machinery
and associated dictionary, it's a big win to discard the excess baggage
and use a simpler fixed-size representation, comparable to a C struct,
for the object.
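A sketch of the distinction (class and attribute names invented; shown in one file for brevity, whereas the hard case is when the `setattr` happens in a different compilation unit):

```python
class Point:
    """Looks like a fixed-shape object: two fields, both set in __init__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)

# The "surprise": code far away from the class definition grows a new
# attribute at runtime.  Because this is always possible, every Point must
# carry a per-instance dict instead of a fixed, C-struct-like layout.
setattr(p, "label", "origin-ish")
```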

On the typing front, the neatest way to express typing is via
initialization. With the Shed Skin restrictions, if all variables are
initialized before use (preferably in __init__), there's no need to
maintain an "undefined" flag for a variable. And, of course, if
the type of a variable is simple and can't change, it doesn't have to
be "boxed" (enclosed in an object), which is a big win.
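A sketch of the idea (a hypothetical class, not from Shed Skin itself): every field is initialized in `__init__` with a fixed, simple type, so no "undefined" flag is needed and both fields could live unboxed as C doubles.

```python
class Vec2:
    # Every field initialized in __init__, always a float: a compiler
    # needs no "undefined" flag and could store x and y unboxed.
    def __init__(self):
        self.x = 0.0
        self.y = 0.0

    def scale(self, k):
        self.x *= k
        self.y *= k

v = Vec2()
v.x, v.y = 3.0, 4.0
v.scale(2.0)
```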

The point here is that we don't need language changes or declarations
to make Python much faster. All we need are a few restrictions that
ensure that, when you're doing something unusual, the compiler can
tell.

John Nagle
May 8 '07 #1
25 Replies


On 8th May, 17:53, John Nagle <n...@animats.com> wrote:
> Some faster Python implementations are under development.
> JPython
Ahem: Jython!
> has been around for a while, and PyPy and ShedSkin
> continue to move forward. It's worth thinking about what slows
> down Python implementations.
It's the dynamicity! ;-) But things like clever memory allocation can
pay dividends, too, and I wouldn't be surprised if this explained
Python's better-than-expected showing when compared to other languages
- such as that comparison you provoked with those claims of superior
JavaScript performance. ;-)
> It isn't the dynamism, really.
In theory, no, but in practice CPython doesn't optimise away the
dynamism. Recent experiments with method caches seem to have shown
modest performance improvements, and I'd guess that such things are
fairly established in other dynamic language implementations.
> As others have pointed out
> in the Python literature, most of the time, the more elaborate
> dynamic features aren't being used for any given variable or
> object. If code has something like "x = 1.0", the odds are that
> "x" is going to stay a floating point number, and not suddenly turn
> into a list or an object reference. The type inference of Shed Skin
> builds on that assumption, adding some restrictions about changing of
> variable types.
The problem here, amongst many others, is knowing for sure whether the
more elaborate features have been avoided. Approaches which attempt to
avoid knowing such things include just-in-time compilation (you avoid
knowing in advance) and type declarations (you give up thinking about
whether it's possible and have the programmer do all the work).
> The Shed Skin effort indicates that explicit typing, via 'decorators'
> or otherwise, isn't really necessary. What's necessary is the avoidance
> of "surprises" in the language. In this context, a "surprise" is
> the use of a dynamic feature in a way that can't be seen at compile time.
I concur with your assessment of the supposed necessity of explicit
typing. However, you might be surprised as to what constitutes a
"surprise" in Python...
> A typical "surprise" would be the use of "setattr" on an object from
> outside the compilation unit that defines the object. Within a module,
> "setattr" on an object in that module is no problem; the compiler can see
> it and generate the extra machinery needed to make an object dynamically
> alterable at run time. But if an object doesn't need that extra machinery
> and associated dictionary, it's a big win to discard the excess baggage
> and use a simpler fixed-size representation, comparable to a C struct,
> for the object.
You don't even need to bring out setattr to make the inference
activity a difficult one. Even straightforward attribute access needs
to be repeatedly checked to see if the target of a normal attribute
assignment or query has changed. Traditionally, people have shown some
trivial function in isolation...

def f(x):
    return x.a

...and said, "We don't know anything! It's all impossible!" But
context is everything, as you know, and whole program analysis is the
only way to go with the aforementioned goals in mind. What if the
parameter to the above itself comes from attribute access?

def g(y):
    return f(y.b)

You can descend into some fairly demanding situations with respect to
keeping track of all the possibilities.
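Fleshing out the f/g pair into a complete program shows what a whole-program analysis has to chase (the surrounding classes are invented for illustration): only by looking at every caller can an analyser learn what `x.a` is.

```python
def f(x):
    return x.a

def g(y):
    return f(y.b)

# Context is everything: the types only become known once the callers
# are visible to the analyser.
class Inner:
    def __init__(self):
        self.a = 42

class Outer:
    def __init__(self):
        self.b = Inner()

result = g(Outer())   # whole-program view: x is an Inner, x.a is an int
```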
> On the typing front, the neatest way to express typing is via
> initialization. With the Shed Skin restrictions, if all variables are
> initialized before use (preferably in __init__), there's no need to
> maintain an "undefined" flag for a variable. And, of course, if
> the type of a variable is simple and can't change, it doesn't have to
> be "boxed" (enclosed in an object), which is a big win.
I'm fairly sure, although my intuition unfortunately doesn't provide a
ready example right now, that typing via initialisation, whilst an
important tool, may either impose unacceptable restrictions if
enforced too rigidly or limit the precision that one might desire in
determining type information in the resulting system. But it is a
worthwhile objective to identify fixed-size structures and unboxed
values, in my opinion.
> The point here is that we don't need language changes or declarations
> to make Python much faster. All we need are a few restrictions that
> ensure that, when you're doing something unusual, the compiler can
> tell.
Agreed. And I don't buy into the negative "lesser Python" labelling of
such work, either. People seem to have forgotten how to use older,
conservative Python features, preferring to show off with metaclasses
and closures even for problems that could be solved using simple
classes and objects. If that's "greater Python" then call me a
minimalist!

Paul

May 8 '07 #2

In <11*********************@p77g2000hsh.googlegroups.com>, Paul Boddie
wrote:
> On the typing front, the neatest way to express typing is via
> initialization. With the Shed Skin restrictions, if all variables are
> initialized before use (preferably in __init__), there's no need to
> maintain an "undefined" flag for a variable. And, of course, if
> the type of a variable is simple and can't change, it doesn't have to
> be "boxed" (enclosed in an object), which is a big win.
A variable? :-)

Maybe that last sentence should start with: And, of course, if the type of
objects bound to a name won't change…
> I'm fairly sure, although my intuition unfortunately doesn't provide a
> ready example right now, that typing via initialisation, whilst an
> important tool, may either impose unacceptable restrictions if
> enforced too rigidly or limit the precision that one might desire in
> determining type information in the resulting system.
I often bind a name to `None` first if it is going to be bound to its
"real" value later on. So this initialization doesn't say anything about
the type(s) that will be bound to it later.
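A sketch of the idiom in question (names invented): the initial binding to `None` tells an init-based inferencer nothing about the eventual type, so it would have to infer an Optional-style union or reject the code.

```python
class Config:
    def __init__(self):
        # Common idiom: bind to None now, to the "real" value later.
        # The initial binding says nothing about the eventual type, so
        # init-based inference must treat this as None-or-str.
        self.path = None

    def load(self, path):
        self.path = path

cfg = Config()
cfg.load("/tmp/app.conf")
```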

I don't see how this type inference for static types will work unless some
of the dynamism of the language gets restricted. But is this really
necessary? Isn't a JIT compiler and maybe type hinting good enough? By
type hinting I really mean just recommendations from the programmer. So
you can say this name should be an `int` and the JIT compiler produces
code in advance, but it's okay to pass another object instead.
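A toy sketch of what "hints as recommendations" might look like; the `hinted` decorator is invented here for illustration, not an existing API. A real JIT would compile a fast path for the recommended type but still accept anything else.

```python
def hinted(**hints):
    """Record recommended types on a function.  A hypothetical JIT would
    compile a fast path for these in advance, while other argument types
    would still work, just unoptimized."""
    def wrap(func):
        func.__hints__ = hints
        return func
    return wrap

@hinted(n="int")
def double(n):
    return n + n

fast = double(21)     # matches the hint: eligible for the fast path
slow = double("ab")   # other types still allowed, merely slower
```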

Ciao,
Marc 'BlackJack' Rintsch
May 8 '07 #3

John Nagle a écrit :
> Some faster Python implementations are under development.
> JPython has been around for a while,
s/JP/J/

And FWIW, I'm not sure Jython is really faster than CPython...
May 8 '07 #4

Marc 'BlackJack' Rintsch wrote:
> I don't see how this type inference for static types will work unless some
> of the dynamism of the language gets restricted. But is this really
> necessary? Isn't a JIT compiler and maybe type hinting good enough?
Not necessarily. One of the more powerful optimizations is to optimize
reference count updates. Often, you can hoist reference count updates
out of loops, which is a big win. But to do that, you need to be sure
that the code executed by the loop won't change unexpectedly.

My point is that while all the dynamism in Python is occasionally
useful, most of the time for most of the variables most of it isn't
being used. If we can discard the excess baggage, unnecessary
dictionary lookups, and unnecessary reference count updates a
large fraction of the time, it's a big win. This requires
"surprise-free" language semantics, so that when something unusual
is going on, it's visibly reflected in the code.

Surprise-free semantics also make code easier to debug.
John Nagle
May 9 '07 #5

Bruno Desthuilliers <bd*****************@free.quelquepart.fr> wrote:
> John Nagle a écrit :
>> Some faster Python implementations are under development.
>> JPython has been around for a while,
>
> s/JP/J/
These days, yes, but it WAS originally called JPython (it was renamed to
Jython later). So, "has been around a while" is an appropriate context
for using the good old name.

> And FWIW, I'm not sure Jython is really faster than CPython...
Definitely not, last I measured. Not IronPython either, overall (though
it's much better than Jython and does get to beat CPython on some
specific benchmarks).
Alex
May 9 '07 #6

On May 9, 1:25 pm, John Nagle <n...@animats.com> wrote:
> Marc 'BlackJack' Rintsch wrote:
>> I don't see how this type inference for static types will work unless some
>> of the dynamism of the language gets restricted. But is this really
>> necessary? Isn't a JIT compiler and maybe type hinting good enough?
>
> Not necessarily. One of the more powerful optimizations is to optimize
> reference count updates. Often, you can hoist reference count updates
> out of loops, which is a big win. But to do that, you need to be sure
> that the code executed by the loop won't change unexpectedly.
The advantage of having a JIT is just that it can record data at
runtime and respond flexibly to it. It doesn't run into the global
static analysis problems mentioned by Paul. A "semi-dynamical"
compromise would be to use a profile of samples of runtime data and
assert that they reflect "typical" behaviour. Then the system needs an
ability to fall back to the usual bytecode interpretation. Psyco does that
by analyzing the next opcode and a very clever dispatch mechanism.

A more code oriented, profile driven approach would factor source into
natively compilable parts and those that have to be bytecode
interpreted. I wonder whether bridges to Pyrex, Boost.Python or the
celerid bridge to D could be used and what the performance penalties
are for all the argument/return value wrapping and unwrapping.
Unfortunately ShedSkin lacks CPython integration. We talked about this
here recently.

Kay

May 9 '07 #7

"John Nagle" <na***@animats.com> wrote:

8<---------------- summary of existing work and thinking ------------------
> The point here is that we don't need language changes or declarations
> to make Python much faster. All we need are a few restrictions that
> ensure that, when you're doing something unusual, the compiler can
> tell.
I am relatively new on this turf, and from what I have seen so far, it
would not bother me at all to tie a name's type to its first use, so that
the name can only be bound to objects of the same type as the type
of the object that it was originally bound to.

But maybe I am missing the point of dynamism.

Would an implementation of the above break lots of stuff in practice?

It seems to me that it could have the effect of a declaration without
the wart of actually doing it.

- Hendrik
May 9 '07 #8

On 9 May, 08:09, "Hendrik van Rooyen" <m...@microcorp.co.za> wrote:
> I am relatively new on this turf, and from what I have seen so far, it
> would not bother me at all to tie a name's type to its first use, so that
> the name can only be bound to objects of the same type as the type
> of the object that it was originally bound to.
But it's interesting to consider the kinds of names you could restrict
in this manner and what the effects would be. In Python, the only kind
of name that can be considered difficult to arbitrarily modify "at a
distance" - in other words, from outside the same scope - are locals,
and even then there are things like closures and perverse
implementation-dependent stack hacks which can expose local namespaces
to modification, although any reasonable "conservative Python"
implementation would disallow the latter.

In a local namespace you could restrict names in this way, although
I'd argue that with the limitations on locals, you don't gain as much
as you would by restricting other names similarly. However, by
restricting other kinds of names (eg. instance attributes) you have to
start thinking about polymorphism: what if attribute x on instances of
class C can have different types? If you aren't careful, you've
introduced interfaces as the primary mechanism permitting some kind of
polymorphism.
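A minimal sketch of that situation (names invented): the same attribute holds different types on different instances, so any per-attribute type restriction must either cope with the union or demand an interface both types satisfy.

```python
class C:
    def __init__(self, x):
        self.x = x

# Attribute x is an int on one instance and a str on another: a
# per-attribute type restriction now needs some notion of interface
# or union type to permit this kind of polymorphism.
objs = [C(1), C("one")]
kinds = {type(o.x).__name__ for o in objs}
```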

Paul

May 9 '07 #9

"Hendrik van Rooyen" <ma**@microcorp.co.za> wrote in message
news:013d01c79210$5e441280$03000080@hendrik...
| I am relatively new on this turf, and from what I have seen so far, it
| would not bother me at all to tie a name's type to its first use, so that
| the name can only be bound to objects of the same type as the type
| of the object that it was originally bound to.
|
| But maybe I am missing the point of dynamism.
|
| Would an implementation of the above break lots of stuff in practice?

For function local variables, if you mean 'originally bound to' in the
current call invocation, that would sometimes be ok (and that is sort of
what Psyco does). But if you mean in the original binding in the first
call invocation, then that would cripple many functions.
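A small illustration of the difference (not Terry's code): per-call specialization handles this fine, but locking a parameter to the type seen in the first call ever made would break the second call.

```python
def twice(x):
    return x + x          # today, x's type may differ on every call

a = twice(2)              # int on this invocation
b = twice([1])            # list on this one: fine per-call (Psyco-style),
                          # but crippled if x were locked to the type
                          # seen in the *first* call ever made
```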
tjr

May 9 '07 #10

Paul Boddie wrote:
> On 9 May, 08:09, "Hendrik van Rooyen" <m...@microcorp.co.za> wrote:
>> I am relatively new on this turf, and from what I have seen so far, it
>> would not bother me at all to tie a name's type to its first use, so that
>> the name can only be bound to objects of the same type as the type
>> of the object that it was originally bound to.
>
> But it's interesting to consider the kinds of names you could restrict
> in this manner and what the effects would be. In Python, the only kind
> of name that can be considered difficult to arbitrarily modify "at a
> distance" - in other words, from outside the same scope - are locals,
> and even then there are things like closures and perverse
> implementation-dependent stack hacks which can expose local namespaces
> to modification, although any reasonable "conservative Python"
> implementation would disallow the latter.
Modifying "at a distance" is exactly what I'm getting at. That's the
killer from an optimizing compiler standpoint. The compiler, or a
maintenance programmer, looks at a block of code, and there doesn't seem
to be anything unusual going on. But, if in some other section of
code, something does a "setattr" to mess with the first block of code,
something unusual can be happening. This is tough on both optimizing
compilers and maintenance programmers.

Python has that capability mostly because it's free in an
"everything is a dictionary" implementation. ("When all you have
is a hash, everything looks like a dictionary".) But that limits
implementation performance. Most of the time, nobody is using
"setattr" to mess with the internals of a function, class, or
module from far, far away. But the cost for that flexibility is
being paid, unnecessarily.

I'm suggesting that the potential for "action at a distance" somehow
has to be made more visible.

One option might be a class "simpleobject", from which other classes
can inherit. ("object" would become a subclass of "simpleobject").
"simpleobject" classes would have the following restrictions:

- New fields and functions cannot be introduced from outside
the class. Every field and function name must explicitly appear
at least once in the class definition. Subclassing is still
allowed.
- Unless the class itself uses "getattr" or "setattr" on itself,
no external code can do so. This lets the compiler eliminate the
object's dictionary unless the class itself needs it.

This lets the compiler see all the field names and assign them fixed slots
in a fixed sized object representation. Basically, this means simple objects
have a C/C++ like internal representation, with the performance that comes
with that representation.

With this, plus the "Shed Skin" restrictions, plus the array features of
"numarray", it should be possible to get computationally intensive code
written in Python up to C/C++ levels of performance. Yet all the dynamic
machinery of Python remains available if needed.

All that's necessary is not to surprise the compiler.

John Nagle
May 9 '07 #11

John Nagle wrote:
> Modifying "at a distance" is exactly what I'm getting at. That's the
> killer from an optimizing compiler standpoint. The compiler, or a
> maintenance programmer, looks at a block of code, and there doesn't seem
> to be anything unusual going on. But, if in some other section of
> code, something does a "setattr" to mess with the first block of code,
> something unusual can be happening. This is tough on both optimizing
> compilers and maintenance programmers.
Agreed. It's tempting to look at some code and say which types one
thinks are being manipulated, but the actual program behaviour can be
quite different. Adding explicit type declarations "anchors" the names
to a very restrictive set of types - typically those permitted through
interfaces or inheritance in many object-oriented languages - and the
challenge, as we've already discussed, is to attempt to "anchor" the
names when they are in a sense "floating free" without any explicit
annotations from the programmer.

Python also presents other challenges. Unlike systems programming
languages, almost nothing is a local operation: each Python function
is mostly a product of function calls and the occasional conditional
or looping construct, themselves wired up using yet more function
calls. Whilst the expectation is that most of these will return sensible
results (eg. a call to the iter method on a sequence returns an
iterator), there isn't any guarantee that this will be the case, and
even then the precise iterator involved can vary. Some might claim
that the class of operations which could be done locally are precisely
the ones which would benefit from identification and optimisation,
namely those involving primitive types, yet some pretty good solutions
for such cases exist already: Pyrex, various numeric packages, and so
on.
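Even the plain for-loop machinery is visible from Python itself: `iter()` is called on the sequence, `__next__()` on whatever `iter()` returned, and the precise iterator type varies with the argument.

```python
# Two iterables, two different iterator types: nothing about a for-loop
# is a "local operation" whose result type is fixed in advance.
it_list = iter([1, 2])
it_gen = iter(x for x in (1, 2))
different = type(it_list) is not type(it_gen)   # list_iterator vs generator
total = sum(it_list) + sum(it_gen)
```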
> Python has that capability mostly because it's free in an
> "everything is a dictionary" implementation. ("When all you have
> is a hash, everything looks like a dictionary".) But that limits
> implementation performance. Most of the time, nobody is using
> "setattr" to mess with the internals of a function, class, or
> module from far, far away. But the cost for that flexibility is
> being paid, unnecessarily.
I've heard claims that virtual machines are evolving to make
dictionary-based access more performant, and I guess Microsoft's DLR
and any JVM improvements might point in that direction, but Shed Skin
shows the way by avoiding such things entirely.
> I'm suggesting that the potential for "action at a distance" somehow
> has to be made more visible.
>
> One option might be a class "simpleobject", from which other classes
> can inherit. ("object" would become a subclass of "simpleobject").
> "simpleobject" classes would have the following restrictions:
>
> - New fields and functions cannot be introduced from outside
> the class. Every field and function name must explicitly appear
> at least once in the class definition. Subclassing is still
> allowed.
I suppose one could mandate that only methods defined inside class
blocks belong to the class and subclasses, and that various kinds of
optimisations can then be applied to the code in those methods such
that the self parameter's type is constrained in a way probably
already imposed by CPython. This would forbid the adding of functions
to classes and instances after the fact, but in my fairly conservative
world of programming, I hardly ever do such things. I do add
attributes to instances outside those instances (by the above
definition) quite often, though.
> - Unless the class itself uses "getattr" or "setattr" on itself,
> no external code can do so. This lets the compiler eliminate the
> object's dictionary unless the class itself needs it.
I think this is a natural progression from determining the access
needs of particular classes and their objects.
> This lets the compiler see all the field names and assign them fixed slots
> in a fixed sized object representation. Basically, this means simple objects
> have a C/C++ like internal representation, with the performance that comes
> with that representation.
Yes, I agree that this would be desirable.
> With this, plus the "Shed Skin" restrictions, plus the array features of
> "numarray", it should be possible to get computationally intensive code
> written in Python up to C/C++ levels of performance. Yet all the dynamic
> machinery of Python remains available if needed.
>
> All that's necessary is not to surprise the compiler.
One thing I had cause to think about again today was the cost of
function invocations. Python not only has pervasive dynamic dispatch,
but there are also such things as *args and **kwargs and the cost of
dealing with them. However, such things are very useful in integrating
components and extending class hierarchies, so I wouldn't want to see
them vanish in their entirety.
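A sketch of the trade-off (the decorator is invented for illustration): `*args`/`**kwargs` make generic wrappers trivial to write, but every call through the wrapper pays to pack the positionals into a tuple and the keywords into a dict, then unpack them again.

```python
def logged(func):
    """Generic wrapper: easy to write precisely because of *args/**kwargs."""
    def wrapper(*args, **kwargs):        # packing happens here...
        wrapper.calls += 1
        return func(*args, **kwargs)     # ...and unpacking here
    wrapper.calls = 0
    return wrapper

@logged
def add(a, b=0):
    return a + b

r = add(1, b=2)
```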

Paul

May 9 '07 #12

On May 9, 10:02 am, John Nagle <n...@animats.com> wrote:
> One option might be a class "simpleobject", from which other classes
> can inherit. ("object" would become a subclass of "simpleobject").
> "simpleobject" classes would have the following restrictions:
>
> - New fields and functions cannot be introduced from outside
> the class. Every field and function name must explicitly appear
> at least once in the class definition. Subclassing is still
> allowed.
> - Unless the class itself uses "getattr" or "setattr" on itself,
> no external code can do so. This lets the compiler eliminate the
> object's dictionary unless the class itself needs it.
>
> This lets the compiler see all the field names and assign them fixed slots
> in a fixed sized object representation. Basically, this means simple objects
> have a C/C++ like internal representation, with the performance that comes
> with that representation.
Hey look, it already exists:
>>> class A(object):
...     __slots__ = 'a b c d'.split()
...
>>> a = A()
>>> a.e = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute 'e'
>>> hasattr(a, '__dict__')
False

-Mike

May 9 '07 #13

"Terry Reedy" <t..y@u,,.edu> wrote:
> "Hendrik van Rooyen" <ma**@microcorp.co.za> wrote in message
> news:013d01c79210$5e441280$03000080@hendrik...
> | I am relatively new on this turf, and from what I have seen so far, it
> | would not bother me at all to tie a name's type to its first use, so that
> | the name can only be bound to objects of the same type as the type
> | of the object that it was originally bound to.
> |
> | But maybe I am missing the point of dynamism.
> |
> | Would an implementation of the above break lots of stuff in practice?
>
> For function local variables, if you mean 'originally bound to' in the
> current call invocation, that would sometimes be ok (and that is sort of
> what Psyco does). But if you mean in the original binding in the first
> call invocation, then that would cripple many functions.
Errr - I was thinking a bit simplistically - I know I can write:

def f(x):
    for y in x:
        print y  # using print here as short for doing something complicated

And that would currently "work" with any iterable, as x could
currently be anything.

It seems that such functions are the problem, as something like this:

x = [1,2,3,4,5]
for y in x:
    print y

does not have the same hassle for x, but can shift the problem to y:

x = [1,2,3,4,(1,2)]
for y in x:
    print y

I can't see an easy way to put the patient back into his straitjacket.

Makes you want to use predefined globals...

- Hendrik

May 10 '07 #14

"John Nagle" <na***@animats.com> wrote:

8<---------------- John Nagle's preceding post (#11) quoted in full; snipped ----------------
> All that's necessary is not to surprise the compiler.
If this is all it takes, I would even be happy to have to declare which things
could be surprising - some statement like:

x can be anything

Would that help?

It kind of inverts the logic - and states that if you want what is now
the default behaviour, you have to ask for it.

- Hendrik

May 10 '07 #15

On May 8, 5:53 pm, John Nagle <n...@animats.com> wrote:
> The point here is that we don't need language changes or declarations
> to make Python much faster. All we need are a few restrictions that
> ensure that, when you're doing something unusual, the compiler can
> tell.
Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
dynamic language can be fast if the implementation is good.

If you look at SBCL and GCL, no code is interpreted. It's all compiled
on the fly to native machine code. The compiler begins with some code
and some input data, and compiles as much as it can. Then the RT
executes the machine code, compiles again, etc. Often long stretches
of code can be compiled without break, and tight performance critical
loops are usually compiled only once. In addition to this, one needs
an efficient system to cache compiled code, in order to do the
compilation work only once. Making a dynamic language fast is not
rocket science.
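The compile-once-per-signature caching scheme can be sketched in a few lines of Python; the "compilation" is faked with a counter and all names are invented, but the cache keyed by runtime argument types is the point.

```python
# Toy of the "compile per argument types, cache, reuse" scheme the Lisp
# systems (and Psyco) use.
specialized = {}          # (func, arg types) -> compiled version
compilations = 0

def specialize(func, args):
    global compilations
    key = (func, tuple(type(a) for a in args))
    if key not in specialized:
        compilations += 1             # a real system would emit machine code
        specialized[key] = func       # stand-in for the compiled form
    return specialized[key]

def call(func, *args):
    return specialize(func, args)(*args)

def add(a, b):
    return a + b

r1 = call(add, 1, 2)       # compiles the (int, int) version
r2 = call(add, 3, 4)       # cache hit: no recompilation
r3 = call(add, "a", "b")   # new type signature: compiles again
```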

We should have something like "GPython", a Python RT on top of a GCC
backend, similar to what the GCL team did for Lisp. There is no
obvious reason as to why Lisp should have better performance than
Python.




May 10 '07 #16

sturlamolden wrote:
> On May 8, 5:53 pm, John Nagle <n...@animats.com> wrote:
>> The point here is that we don't need language changes or declarations
>> to make Python much faster. All we need are a few restrictions that
>> ensure that, when you're doing something unusual, the compiler can
>> tell.

> Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
> dynamic language can be fast if the implementation is good.
>
> If you look at SBCL and GCL, no code is interpreted. It's all compiled
> on the fly to native machine code. The compiler begins with some code
> and some input data, and compiles as much as it can. Then the RT
> executes the machine code, compiles again, etc. Often long stretches
> of code can be compiled without break, and tight performance critical
> loops are usually compiled only once. In addition to this, one needs
> an efficient system to cache compiled code, in order to do the
> compilation work only once. Making a dynamic language fast is not
> rocket science.
>
> We should have something like "GPython", a Python RT on top of a GCC
> backend, similar to what the GCL team did for Lisp. There is no
> obvious reason as to why Lisp should have better performance than
> Python.
I doubt if anyone disputes the gist of what you're
saying[*], viz that Python could be made faster by using
technique (a), (b) or (c) which have been successful
elsewhere. At least that it's worth investigating.

But the relevant bit of your last paragraph is at the start:
"We should...". Unless someone or someones has the time,
inclination, money, backing, wherewithal etc. to implement
this or any other measure of speeding-up, it's all
pie-in-the-sky. Useful, maybe, as discussion of what
options are viable, but a project of this magnitude
doesn't just happen in some developer's lunchbreak.

Many people (and I include myself) are quite
happy with Python's speed. In fact, I'm quite happy
with most things about Python. Others would like to
see it faster. That's great. But unless people
put their money where their mouths are, I don't
see it happening.

TJG
[*] Actually, knowing this community, I'm sure *someone's*
going to!
May 10 '07 #17


"sturlamolden" <st**********@yahoo.no> wrote in message
news:11**********************@y80g2000hsf.googlegroups.com...
| Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
| dynamic language can be fast if the implementation is good.
|
| If you look at SBCL and GCL, no code is interpreted. It's all compiled
| on the fly to native machine code.

Unfortunately, native machine code depends on the machine, or at least the
machine being emulated by the hardware. Fortunately or not, the dominance
of the x86 model makes this less of a problem.

| The compiler begins with some code
| and some input data, and compiles as much as it can. Then the RT
| executes the machine code, compiles again, etc. Often long stretches
| of code can be compiled without break, and tight performance critical
| loops are usually compiled only once. In addition to this, one needs
| an efficient system to cache compiled code, in order to do the
| compilation work only once. Making a dynamic language fast is not
| rocket science.

This sounds somewhat similar to Psyco. If a code section is entered with
different machine types, recompilation is needed. One of the problems with
Psyco is that its efficient caching of multiple versions of compiled code
is space-hungry.

| We should have something like "GPython", a Python RT on top of a GCC
| backend, similar to what the GCL team did for Lisp. There is no
| obvious reason why Lisp should have better performance than
| Python.

In the 1980s, during the Lisp/AI boom, perhaps a billion dollars was put
into making Lisp run faster. Plus decades of work in Computer Science
departments. Other than that, I agree. Now, Python is slowly making
inroads into academia as a subject of research and theses, and the PyPy
group got at least one large EU grant.

Terry Jan Reedy

May 10 '07 #18

On May 9, 5:02 pm, John Nagle <n...@animats.com> wrote:
> Paul Boddie wrote:
>
> Python has that capability mostly because it's free in an
> "everything is a dictionary" implementation. ("When all you have
> is a hash, everything looks like a dictionary".) But that limits
> implementation performance. Most of the time, nobody is using
> "setattr" to mess with the internals of a function, class, or
> module from far, far away. But the cost for that flexibility is
> being paid, unnecessarily.
I've had this idea in the back of my head for a year or so now, but
have been too lazy or busy or both to give the implementation a try.

The idea is to create a special dictionary for python internals that
contains a pointer-to-a-pointer for the value side of the dictionary,
instead of just the standard PyObject*. Then when you start to eval a
code block, you could do a one-time lookup of the pointer-to-the-
pointer for each method, attribute, etc; and store them in pre-
allocated slots. The calls in the block to various GET_ and SET_
functions can then just deal with pointer dereferencing instead of a
dictionary lookup, and the dictionary can still get changed while this
is going on without things blowing up.

I think you'd get a big speed up by changing all those dictionary
function calls to inlined pointer deref code.

Anyone else buy this approach? Or see any fatal flaws?
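The proposal can be mimicked in pure Python with one extra level of indirection. `Box` and `IndirectNamespace` are invented names for the sketch; the real win would of course come from doing this at the C level, inside the interpreter:

```python
class Box:
    """One shared cell per name: the dictionary's value side holds a
    pointer to this box rather than the object itself."""
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

class IndirectNamespace:
    """Toy model of the idea: a name is resolved to its Box once, after
    which each access is a single dereference, not a dict lookup."""
    def __init__(self):
        self._boxes = {}
    def set(self, name, value):
        if name in self._boxes:
            self._boxes[name].value = value   # update the cell in place
        else:
            self._boxes[name] = Box(value)
    def bind(self, name):
        return self._boxes[name]              # the one-time lookup

ns = IndirectNamespace()
ns.set("f", lambda x: x + 1)
slot = ns.bind("f")            # done once when the code block starts
print(slot.value(41))          # fast path: dereference, no dict lookup
ns.set("f", lambda x: x * 2)   # the dict changes while code runs...
print(slot.value(21))          # ...and the cached slot sees the new value
```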

May 10 '07 #19


<ol*****@verizon.net> wrote in message
news:11**********************@w5g2000hsg.googlegroups.com...
| The idea is to create a special dictionary for python internals that
| contains a pointer-to-a-pointer for the value side of the dictionary,
| instead of just the standard PyObject*. Then when you start to eval a
| code block, you could do a one-time lookup of the pointer-to-the-
| pointer for each method, attribute, etc; and store them in pre-
| allocated slots. The calls in the block to various GET_ and SET_
| functions can then just deal with pointer derefencing instead of a
| dictionary lookup, and the dictionary can still get changed while this
| is going on without things blowing up.
|
| I think you'd get a big speed up by changing all those dictionary
| function calls to inlined pointer deref code.
|
| Anyone else buy this approach? Or see any fatal flaws?

Within CPython functions, local variables are usually implemented as
numbered slots in a C array (of PyObject pointers). That is why (a) the
compiler makes a first pass through a function body (to determine what is
local and what is not) and (b) LOAD_FAST is faster than LOAD_GLOBAL.
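The two opcodes are easy to see with the standard library's `dis` module: the local `y` compiles to a LOAD_FAST slot access, while the module-level `x` goes through a LOAD_GLOBAL dictionary lookup:

```python
import dis

x = 10

def f():
    y = 20          # local: numbered slot in the frame's C array
    return x + y    # x is global: resolved via LOAD_GLOBAL

ops = {ins.opname for ins in dis.get_instructions(f)}
print(sorted(ops & {"LOAD_FAST", "LOAD_GLOBAL"}))  # both opcodes appear
```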

Terry Jan Reedy

May 10 '07 #20

On May 10, 4:02 pm, Tim Golden <m...@timgolden.me.uk> wrote:
> But the relevant bit of your last paragraph is at the start:
> "We should...".

Sorry, bad choice of words.

> see it faster. That's great. But unless people
> put their money where their mouths are, I don't

I know, I know. But that doesn't stop me from envying what the Lisp
community has achieved.

Python still sucks if we are using it for scientific simulations,
testing CPU-bound algorithms, etc. Sure it is only 150-200 times
slower than C for these tasks, but that can amount to the difference
between one day and half a year of CPU time. But as strange as it may
seem, it can still be advantageous to use Python. E.g. it may be less
expensive to run the simulation in parallel on 200 CPUs than writing
the code in C instead of Python.
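That trade-off is why the standard library's `multiprocessing` module sees so much use for such runs. A minimal sketch, with `simulate` standing in for a hypothetical CPU-bound model:

```python
from multiprocessing import Pool

def simulate(seed):
    # stand-in for one CPU-bound simulation run (hypothetical workload)
    acc = 0
    for i in range(100_000):
        acc = (acc + seed * i) % 1_000_003
    return acc

if __name__ == "__main__":
    # independent runs farmed out to however many CPUs are available
    with Pool() as pool:
        results = pool.map(simulate, range(8))
    print(len(results))  # 8 results, computed in parallel
```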


May 11 '07 #21

On May 10, 7:18 pm, "Terry Reedy" <tjre...@udel.edu> wrote:
> Unfortunately, native machine code depends on the machine, or at least the
> machine being emulated by the hardware. Fortunately or not, the dominance
> of the x86 model makes this less of a problem.

CMUCL and SBCL depend on the dominance of the x86 architecture.

GCL uses the GCC backend, which supports a wide range of
architectures.

Building a compiler backend is not needed for a Python JIT, one can
accept the GPL license and use GCC as a backend.

Or one could translate between Python and Lisp on the fly, and use a
compiled Lisp (CMUCL, SBCL, Franz, GCL) as runtime backend.

May 11 '07 #22

Tim Golden wrote:
> sturlamolden wrote:
>> On May 8, 5:53 pm, John Nagle <n...@animats.com> wrote:
>>> The point here is that we don't need language changes or declarations
>>> to make Python much faster. All we need are a few restrictions that
>>> ensure that, when you're doing something unusual, the compiler can
>>> tell.

> I doubt if anyone disputes the gist of what you're
> saying[*], viz that Python could be made faster by using
> technique (a), (b) or (c) which have been successful elsewhere. At least
> that it's worth investigating.
>
> But the relevant bit of your last paragraph is at the start:
> "We should...". Unless someone or someones has the time,
> inclination, money, backing, wherewithal etc. to implement
> this or any other measure of speeding-up, it's all
> pie-in-the-sky. Useful, maybe, as discussion of what
> options are viable, but a project of this magnitude
> doesn't just happen in some developer's lunchbreak.
Focusing may help. Between Jython, PyPy, and Shed Skin,
enough effort has been put in to produce something better than
CPython, but none of those efforts resulted in something more
usable than CPython.

There's a "commercial grade Python" from ActiveState, but
that's CPython in a cardboard box, I think.

Another problem is that if the language is defined as
"whatever gets put in CPython", that discourages other
implementations. The language needs to be standards-based.

John Nagle
May 11 '07 #23

On 11 May, 18:04, John Nagle <n...@animats.com> wrote:
> Another problem is that if the language is defined as
> "whatever gets put in CPython", that discourages other
> implementations. The language needs to be standards-based.
Indeed. This was suggested by one of the speakers at last year's
EuroPython with reference to the various proposals to remove map,
reduce, lambda and so on from the language. The opinion was that if
Python implementations change and leave the users either on
unsupported releases or with the work of migrating their code
continuously and/or to features that they don't find as intuitive or
appropriate, some people would rather migrate their code to a language
which is standardised and which can remain agreeable for the
foreseeable future.

Paul

May 11 '07 #24


"sturlamolden" <st**********@yahoo.no> wrote in message
news:11**********************@e65g2000hsc.googlegroups.com...
| On May 10, 4:02 pm, Tim Golden <m...@timgolden.me.uk> wrote:
| I know, I know. But that doesn't stop me from envying what the Lisp
| community has achieved.

But do not let your envy stop you from noticing and appreciating what the
Python community has achieved.

| Python still sucks if we are using it for scientific simulations,

Not if you use extensions compiled from C or Fortran. Doing so is not
cheating, any more than using the C-coded methods of the builtin types.
Leveraging existing code and compilers was part of Python's design.

With the Numeric extensions, produced by people at the US nuke labs,
scientific simulations were, I think, Python's first killer app.

| Sure it is only 150-200 times slower than C for these tasks,

As a general statement, nonsense. A LinPack inversion of a 10k x 10k
matrix takes the same time whether called from Python or a C program. The
minuscule extra overhead of Python is more than made up for by the ability
to call LinPack and other functions interactively.
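The same point can be made with any C-coded builtin in the standard library: once the loop runs in compiled code, interpreter overhead is paid once per call rather than once per element. A small sketch using `timeit` (`via_python` and `via_c` are made-up names):

```python
import timeit

data = list(range(200_000, 0, -1))

def via_python():
    # the loop itself runs as interpreted bytecode, element by element
    total = 0
    for v in data:
        total += v
    return total

def via_c():
    # the same reduction, but the loop runs inside a C-coded builtin
    return sum(data)

t_py = timeit.timeit(via_python, number=20)
t_c = timeit.timeit(via_c, number=20)
print(f"interpreted loop: {t_py:.3f}s  C builtin: {t_c:.3f}s")
```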

The extended buffer protocol, championed by Travis Oliphant and slated for
3.0, will make cooperation between extensions much easier.

Terry Jan Reedy

May 11 '07 #25

sturlamolden <st**********@yahoo.no> writes:
> On May 10, 7:18 pm, "Terry Reedy" <tjre...@udel.edu> wrote:
>
> CMUCL and SBCL depend on the dominance of the x86 architecture.
CMUCL and SBCL run on a variety of architectures, including x86, 64-bit x86,
PowerPC, Sparc, Alpha, and Mips. See

http://www.sbcl.org/platform-table.html

for platform support information.
> Or one could translate between Python and Lisp on the fly, and use a
> compiled Lisp (CMUCL, SBCL, Franz, GCL) as runtime backend.
This has been done by Willem Broekema. CLPython is a Python implementation
that translates Python source into Common Lisp at read time. Under the
covers, the Lisp is compiled into machine code and then run. See

http://trac.common-lisp.net/clpython/

Currently, CLPython uses some non-standard Allegro Common Lisp features, so
it does not run on all the free implementations of ANSI Common Lisp. The
implementation is interesting, in part because it shows how expensive and
complex some Python primitives are.
May 13 '07 #26
