Bytes | Software Development & Data Engineering Community

Please enlighten me about PyPy

Ray
Hello!

I've been reading about PyPy, but there are some things that I don't
understand about it. I hope I can get some enlightenment in this
newsgroup :)

First, the intro:

<excerpt>
"The PyPy project aims at producing a flexible and fast Python
implementation. The guiding idea is to translate a Python-level
description of the Python language itself to lower level languages."
</excerpt>

So the basic idea is that PyPy is an implementation of Python in Python
(i.e.: writing Python's interpreter in Python), and then translate that
into another language such as C or Java? How is it different from
CPython or Jython then?

Also, what does "translation" here mean? Translation as in, say, "Use
Jython to translate PyPy to Java classes"? Or "Use Psyco to translate
PyPy to native exec"?

<excerpt>
"Rumors have it that the secret goal is being faster-than-C which is
nonsense, isn't it?"
</excerpt>

Why is this supposed to be nonsense if it's been translated to C? I
mean, C version of PyPy vs. CPython, both are in C, then why is this
supposed to be nonsense?

It seems that I'm missing a lot of nuances in the word "translation"
here.

Also, this one:

<excerpt>
We have written a Python interpreter in Python, without many references
to low-level details. (Because of the nature of Python, this is already
a complicated task, although not as much as writing it in - say - C.)
Then we use this as a "language specification" and manipulate it to
produce the more traditional interpreters that we want. In the above
sense, we are generating the concrete "mappings" of Python into
lower-level target platforms.
</excerpt>

So the "language specification" in this paragraph _is_ the Python
implementation in Python, a.k.a.: PyPy? Then what does "manipulate it
to produce the more traditional interpreters" mean?

I mean, it seems from what I read that PyPy is more about a translator
that translates Python code into something else rather than
implementing Python in Python. In that case, it could have been any
other project, right? As in implementing X in Python, and then
translate to another language?

Thanks for any pointers!

Dec 22 '05 #1
Hmmm... I know it's complicated, and all these questions can make your
head explode.
I'll tell you what I understand about Pypy and, at the same time, I'll
leave the door open for further explanations or corrections.

As you know, python is a dynamic language.
It means, amongst other things, that the programmer doesn't provide
type information when declaring variables, like in statically typed
languages.
Its code doesn't get translated to machine code through a compiler,
like in C.
Instead, it is "interpreted" by the interpreter, which finds out each
variable's type at run-time.
This interpretation overhead makes dynamic languages like Python much
slower than traditional statically typed languages.
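A trivial illustration of that run-time dispatch: the same function can
receive values of completely different types, and the interpreter only
discovers at run-time what `+` should mean:

```python
def add(a, b):
    # No type declarations: the interpreter inspects what kind of
    # objects a and b actually are and dispatches accordingly.
    return a + b

print(add(1, 2))        # integer addition
print(add("py", "py"))  # string concatenation
```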

Recently, Python got a speed boost via Psyco, which is something like a
proof of concept for a just-in-time compiler. It is a CPython extension
that can improve Python's speed by analyzing run-time information and
generating machine code on the fly.
However, Psyco can only analyze Python code, and as you know, Python
relies on many extensions coded in C for performance.
So its author decided that having a Python implementation written in
Python would lay a much better basis for implementing psyco-like
techniques.

This implementation requires a minimal core, written in a restricted
subset of Python called "RPython". This subset avoids many of the most
dynamic aspects of Python, making it easier to automatically translate
it to C through a tool that uses top-notch type inference techniques.
This auto-translated C version of the RPython interpreter is the basis
of PyPy.
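As a rough illustration of the restriction (a made-up example; RPython's
real rules are more involved): RPython code must keep each variable's
type consistent so a static type inferencer can deduce it, which rules
out idioms that ordinary Python happily allows:

```python
# RPython-friendly style: every variable has one inferable type (int).
def factorial(n):
    result = 1
    i = 2
    while i <= n:
        result *= i
        i += 1
    return result

# NOT translatable in this style: x is bound to values of different
# types on different paths, so no single static type can be inferred.
def not_rpython(flag):
    if flag:
        x = 42
    else:
        x = "forty-two"
    return x
```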

On top of it, new psyco-like and just-in-time techniques will be
implemented to achieve maximum performance.

However, I still doubt that I really understood it...
I'm still not sure if the type inference techniques will be used to
improve the performance of programs running on pypy, or if these
techniques were only intended for getting the rpython interpreter
translated to c.

As far as I know, PyPy is currently about 10 to 20 times slower than
CPython, although many optimizations remain to be done.
And I'm not sure, but I think that its developers rely on the
psyco-like techniques to achieve the big speed boost they're looking for.

Luis

Dec 22 '05 #2
Ray
Hi Luis!

Thanks for your reply :) Some further questions below...
So its author decided that having a Python implementation written in
Python would lay a much better basis for implementing psyco-like
techniques.
OK, so far I get it... I think. So implementing the Python
interpreter in a Python subset called RPython makes it more amenable
to translation with psyco-like techniques. But how is this superior
to CPython? Is it because Psyco is a specializer, which
generates potentially different code for different sets of data? So the
assumption is that the interpreter deals with a very specific set of
data that Psyco will be able to exploit to generate very efficient
machine code?

I still don't get how this can be superior to the hand-coded C version,
though.

Also, this sounds like it involves implementing, in Python, the Python
libraries that are currently implemented in C, so that they can be
translated. Did I get that correctly?
This implementation requires a minimal core, written in a restricted
subset of Python called "RPython". This subset avoids many of the most
dynamic aspects of Python, making it easier to automatically translate
it to C through a tool that uses top-notch type inference techniques.
OK, now I understand this bit about RPython, thanks.
This translated version of the rpython interpreter (which got already
auto-translated to c), is the basis of Pypy.
Now I'm confused again--Psyco translates Python into machine code--so
how does this tie in with the fact that the interpreter written in
Python is translated into another language (in this case C)?
However, I still doubt that I really understood it...
I'm still not sure if the type inference techniques will be used to
improve the performance of programs running on pypy, or if these
techniques were only intended for getting the rpython interpreter
translated to c.

As far as I know, PyPy is currently about 10 to 20 times slower than
CPython, although many optimizations remain to be done.
And I'm not sure, but I think that its developers rely on the
psyco-like techniques to achieve the big speed boost they're looking for.
This is another one I don't get--this approach seems to imply that when
PyPy is reasonably complete, it is expected that it'll be faster than
CPython. I mean, I don't get how something that's translated into C can
be faster than the handcoded C version?

Thanks,
Ray

Luis


Dec 22 '05 #3
Kevin Yuan wrote:


On 21 Dec 2005 19:33:20 -0800, Luis M. González <lu*****@gmail.com> wrote:

... ...
This implementation requires a minimal core, written in a restricted
subset of Python called "RPython". This subset avoids many of the most
dynamic aspects of Python, making it easier to automatically translate
it to C through a tool that uses top-notch type inference techniques.
Why not directly write the minimal core in C?

Because then you'd have to maintain it in C. This way, once you have the
first working translator you can translate it into C to improve its
performance, and use it to translate the *next* working translator, and
so on. Consequently your maintenance work is done on the Python code
rather than hand-translated C.

Fairly standard bootstrapping technique, though it's sometimes difficult
to appreciate the problems involved in writing a compiler for language X
in X itself. Typical is the fact that the compiler for version m has to
be written in version (m-1), for example :-)

used-to-do-that-stuff-for-a-living-ly y'rs - steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

Dec 22 '05 #4
Well, first and foremost, when I said that I leave the door open for
further explanations, I meant explanations by other people more
knowledgeable than me :-)
Now I'm confused again--psyco translates Python into machine code--so
how does this tie in with the fact that the interpreter written in
Python is translated into another language (in this case C?)


No, the psyco-like techniques come later, after the RPython interpreter
is auto-translated to C. They are not used to translate the interpreter
to C (that is done through a tool that uses type inference, flow-graph
analysis, etc.).
Getting the RPython interpreter auto-translated to C was the first goal
of the project (already achieved).
That means having a minimal core, written in a low-level language (C,
for speed), that hasn't been written by hand but auto-translated to C
from the Python source -> much easier to improve and maintain from now on.

Think about this: improving and maintaining a hand-coded C
implementation like CPython is a nightmare. The more complex the code,
the more difficult improvement and experimentation become.
Now they have it all written in Python (RPython) instead, which is
easier, nicer and more flexible. And this Python code can be
automatically translated to C (no hand coding; this is done by the
tool I mentioned above).

Now, this is both a conclusion and a question (because I also have many
doubts about it :-):
At this moment, the translated python-in-python version is, or intends
to be, something more or less equivalent to CPython in terms of
performance. Because it is in essence almost the same thing: another C
Python implementation. The only difference is that while CPython was
written by hand, PyPy was written in Python and auto-translated to C.

What remains to be done now is implementing the psyco-like techniques
for improving speed (amongst many other things, like stackless, etc).

Luis

Dec 22 '05 #5
Steve Holden wrote:
Kevin Yuan wrote:


On 21 Dec 2005 19:33:20 -0800, Luis M. González <lu*****@gmail.com> wrote:

... ...
This implementation requires a minimal core, written in a restricted
subset of Python called "RPython". This subset avoids many of the most
dynamic aspects of Python, making it easier to automatically translate
it to C through a tool that uses top-notch type inference techniques.
Why not directly write the minimal core in C?

Because then you'd have to maintain it in C. This way, once you have the
first working translator you can translate it into C to improve its
performance, and use it to translate the *next* working translator, and
so on. Consequently your maintenance work is done on the Python code
rather than hand-translated C.

Fairly standard bootstrapping technique, though it's sometimes difficult
to appreciate the problems involved in writing a compiler for language X
in X itself. Typical is the fact that the compiler for version m has to
be written in version (m-1), for example :-)

used-to-do-that-stuff-for-a-living-ly y'rs - steve


I am glad someone asked the question about PyPy, because I need the same
enlightenment. Having read what has been written up to now, I would like
to present my current understanding here, so that I can be corrected
where I got something wrong.

Do I understand it right, that :

Translating Python code to C for compilation is the way to avoid the
necessity of writing a Python compiler in hand-coded assembler for
each platform (i.e. operating system / processor combination)?

This hand-coding is already done by the people providing a C compiler
for a platform, and it can be assumed that a C compiler is always
available. So why dig that deep, when in practice it is sufficient
to begin at the abstraction level of a C compiler rather than
hand-coded assembler?

Sticking to ANSI C/C++ makes it possible to become multi-platform
without having to write one's own hand-coded assembler for each
platform, which is usually already done by the people providing that
platform and the C compiler for it.

So, to use PyPy for creating a Python-based operating system where e.g.
IDLE replaces the usual command-line interface and Tkinter becomes the
core of the GUI, it would be sufficient to replace the step of
translating to C code and compiling with a C compiler by a Python
compiler hand-coded in assembler for each specific platform?

The expectation of becoming faster than CPython with the PyPy approach I
understand as a hope that creating another Python engine architecture
(i.e. the hierarchy of software pieces/modules the entire Python
scripting engine consists of) can lead to improvements not possible when
sticking to the architecture of the current CPython implementation. After
it has been demonstrated that PyPy can be faster than the current CPython
implementation, it will surely be possible to totally rewrite the CPython
implementation to achieve the same speed by changing the architecture of
its elementary modules.
Is this maybe what causes confusion about the expectation
that PyPy can come along with a speed improvement over CPython? The
fact that another architecture of elementary modules can lead to a speed
improvement, and the fact that it could then surely also be implemented
in the CPython approach to achieve the same speed, but would need a total
rewrite of CPython, i.e. a duplication of the PyPy effort?

Claudio
Dec 22 '05 #6
Hi!

Luis M. González wrote:
Well, first and foremost, when I said that I leave the door open for
further explanations, I meant explanations by other people more
knowledgeable than me :-)
You did a very good job to describe what PyPy is in this and the
previous mail! I will try to give a justification about why PyPy is done
how it is done.
Now I'm confused again--psyco translates Python into machine code--so
how does this tie in with the fact that the interpreter written in
Python is translated into another language (in this case C?)

No, the psyco-like techniques come later, after the RPython interpreter
is auto-translated to C. They are not used to translate the interpreter
to C (that is done through a tool that uses type inference, flow-graph
analysis, etc.).
Getting the RPython interpreter auto-translated to C was the first goal
of the project (already achieved).
That means having a minimal core, written in a low-level language (C,
for speed), that hasn't been written by hand but auto-translated to C
from the Python source -> much easier to improve and maintain from now on.


Indeed. The fact that the core is written in RPython has a number of
advantages:

The first point is indeed maintainability: Python is a lot more flexible
and more concise than C, so changes and enhancements become much easier.
Another point is that our interpreter can not only be translated, but
also run on top of CPython! This makes testing very fast, because you
don't need to translate the interpreter before testing it -- just
run it on CPython.

The most important advantage of writing the interpreter in Python is
that of flexibility. In CPython a lot of implementation choices are done
rather early: The choice to use C as the platform the interpreter works
on, the choice to use reference counting (which is reflected
everywhere), the choice to have a GIL, the choice to not be stackless.
All these choices are deeply embedded into the implementation and are
rather hard to change. Not so in PyPy. Since the interpreter is written
in Python and then translated, the translation process can change
different aspects of the interpreter while translating it. The
interpreter implementation does not need to concern itself with all
these aspects.

One example of this is that we are not restricted to translating our
interpreter to C. There are currently backends to translate RPython to
C, LLVM (llvm.org) and JavaScript (incomplete), and plans to write a
Smalltalk and a Java backend. That means that we could potentially
generate something that is similar to Jython -- which is not entirely
true, because the interfacing with Java libraries would not work, but
pypy-java would run on the JVM.

Another example is that we can choose at translation time which garbage
collection strategy to use. At the moment we even have two different
garbage collectors implemented: a simple reference-counting one and
one that uses the Boehm garbage collector. We have also started (as part
of my Summer of Code project) an experimental garbage collection
framework which allows us to implement garbage collectors in Python. This
framework is not finished yet and needs to be integrated with the rest
of PyPy.

In a similar manner we hope to make different threading models choosable
at translation time.

[snip]
Now this is both, a conclusion and a question (because I also have many
doubts about it :-):
At this moment, the translated python-in-python version is, or intends
to be, something more or less equivalent to CPython in terms of
performance. Because it is in essence almost the same thing: another C
Python implementation. The only difference is that while CPython was
written by hand, PyPy was written in Python and auto-translated to C.
Yes, at the moment pypy-c is rather similar to CPython, although slower
(a bit better than ten times slower than CPython at the moment), except
that we can already choose between different aspects (see above).
What remains to be done now is implementing the psyco-like techniques
for improving speed (amongst many other things, like stackless, etc).


Stackless is already implemented. In fact, it took around three days to
do this at the Paris sprint :-). It is another aspect that we can choose
at translation time (that means you can also choose to not be stackless
if you want to). With stackless we can support arbitrarily deep
recursion (until the heap is full, that is). We don't export any
task-switching capabilities to the user, yet.

About the psyco-like JIT techniques: we hope to be able to not write the
JIT by hand but to generate it as part of the translation process. But
this is at the moment still quite unclear, in heavy flux and nowhere
near finished yet.

Cheers,

Carl Friedrich Bolz

Dec 22 '05 #7
Thanks Carl for your explanation!
I just have one doubt regarding the way PyPy is supposed to work when
it's finished:

We know that for translating the RPython interpreter to C, the PyPy
team developed a tool that relies heavily on static type inference.

My question is:
Will this type inference also work when running programs on PyPy?
Is type inference a way to speed up programs running on PyPy, or was it
just a means to translate the RPython interpreter to C?

In other words:
Will type inference be applied to running programs to speed them up, or
is this task only carried out by psyco-like techniques?

Dec 22 '05 #8
Hi!

some more pointers in addition to the good stuff that Luis wrote...

Ray wrote:
So the basic idea is that PyPy is an implementation of Python in Python
(i.e.: writing Python's interpreter in Python), and then translate that
into another language such as C or Java? How is it different from
CPython or Jython then?
CPython and Jython both need to implement the Python interpreter. So the
work to capture the Python semantics (which is quite big) is done twice,
once in C and once in Java. In PyPy we hope to do that only once and
then write translator backends for C and Java (which is a minor task,
compared to writing a whole Python interpreter).

Also, what does "translation" here mean? Translation as in, say, "Use
Jython to translate PyPy to Java classes"? Or "Use Psyco to translate
PyPy to native exec"?
It's more like "use the translator (which is another Python program) to
translate PyPy to whatever platform you are targeting".
<excerpt>
"Rumors have it that the secret goal is being faster-than-C which is
nonsense, isn't it?"
</excerpt>

Why is this supposed to be nonsense if it's been translated to C? I
mean, C version of PyPy vs. CPython, both are in C, then why is this
supposed to be nonsense?
The idea is that we hope that /Python code/ becomes faster than C code,
which is of course nonsense, right? :-)

It seems that I'm missing a lot of nuances in the word "translation"
here.

Also, this one:

<excerpt>
We have written a Python interpreter in Python, without many references
to low-level details. (Because of the nature of Python, this is already
a complicated task, although not as much as writing it in - say - C.)
Then we use this as a "language specification" and manipulate it to
produce the more traditional interpreters that we want. In the above
sense, we are generating the concrete "mappings" of Python into
lower-level target platforms.
</excerpt>

So the "language specification" in this paragraph _is_ the Python
implementation in Python, a.k.a.: PyPy? Then what does "manipulate it
to produce the more traditional interpreters" mean?
The "manipulation" means the translation process that is used to
translate it into a target lanugage.
I mean, it seems from what I read that PyPy is more about a translator
that translates Python code into something else rather than
implementing Python in Python. In that case, it could have been any
other project, right? As in implementing X in Python, and then
translate to another language?


In theory, yes. But the translator is really not the main product of the
PyPy project. It is more like a means to an end, that we use to get a
more flexible, better, faster, ... implementation of Python. This means
that the translator at the moment is very much customized for our needs
when writing the interpreter, which makes the translator quite a bit
harder to use for other projects. But in theory it is possible to use
the translator for other projects, as long as those projects adhere to
the necessary staticness conditions.

This fact could be used nicely: If someone writes, say, a Ruby or Perl
interpreter in RPython he will get all the benefits of PyPy for free:
different target platforms, different garbage collectors, stacklessness,
maybe a JIT (which is still unclear at the moment).

Cheers,

Carl Friedrich Bolz

Dec 22 '05 #9
Hi!

Luis M. González wrote:
Thanks Carl for your explanation!
I just have one doubt regarding the way PyPy is supposed to work when
it's finished:

We know that for translating the RPython interpreter to C, the PyPy
team developed a tool that relies heavily on static type inference.

My question is:
Will this type inference also work when running programs on PyPy?
Is type inference a way to speed up programs running on PyPy, or was it
just a means to translate the RPython interpreter to C?

In other words:
Will type inference be applied to running programs to speed them up, or
is this task only carried out by psyco-like techniques?


The static type inference is just a means. It will not be used for the
speeding up of running programs. The problem with the current type
inference is that it is really very static and most python programs are
not static enough for it.

Therefore we will rather use techniques that are similar to Psyco (note
that our JIT work is still in the early beginnings and that my comments
reflect only what we currently think might work :-) ). The idea is that
the JIT looks at the running code and assumes some things it finds there
to be constant (like the type of a variable), inserts a check that this
still holds, and then optimizes the code under this assumption.
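A toy sketch of that guard-and-specialize idea (a made-up illustration,
not actual Psyco or PyPy code; the names are mine): a "specializer"
caches one version of a function per observed argument type, with the
assumption guarded by an explicit type check. A real JIT would emit
machine code with the guard compiled in, rather than a Python closure:

```python
def make_specialized(func):
    """Toy specializer: one cached version per observed argument type,
    guarded by a type check (the 'assumption that still holds')."""
    cache = {}

    def wrapper(x):
        key = type(x)
        if key not in cache:
            # "Compile" a version specialized for this type.
            def specialized(v, expected=key):
                assert type(v) is expected  # the inserted guard
                return func(v)
            cache[key] = specialized
        return cache[key](x)

    return wrapper

@make_specialized
def double(x):
    return x + x

print(double(21))    # creates and uses an int-specialized version
print(double("ab"))  # a separate specialization for str
```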

Cheers,

Carl Friedrich

Dec 22 '05 #10
Carl Friedrich Bolz wrote:
The static type inference is just a means. It will not be used for the
speeding up of running programs. The problem with the current type
inference is that it is really very static and most python programs are
not static enough for it.

Therefore we will rather use techniques that are similar to Psyco (note
that our JIT work is still in the early beginnings and that my comments
reflect only what we currently think might work :-) ). The idea is that
the JIT looks at the running code and assumes some things it finds there
to be constant (like the type of a variable), inserts a check that this
still holds, and then optimizes the code under this assumption.

Thanks!
I think I completely understand the whole thing now :-)

Anyway, I guess it's just a matter of time until we can use this
translation tool to translate other programs, provided they are written
in restricted Python, right?
So we will have two choices:
1) running normal Python programs on PyPy.
2) translating RPython programs to C and compiling them to stand-alone
executables.

Is that correct?

Dec 22 '05 #11
Luis M. González wrote:
Thanks!
I think I completely understand the whole thing now :-)
If only we could say the same :-)
Anyway, I guess it's just a matter of time until we can use this
translation tool to translate other programs, provided they are written
in restricted Python, right?
Yes. This is even possible right now, with one caveat: basically it is
not so hard to write a new program in RPython. RPython is still kind of
nice, and it is testable on CPython, so this is not such a bad task.
There are problems, though: you don't have most of the stdlib in
RPython (mostly only a few functions from os, sys and math work). The
other problem is that it is surprisingly hard to convert /existing/
programs to RPython, because they will most probably not adhere to the
staticness conditions.
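A made-up illustration of the kind of everyday idioms that break those
staticness conditions in existing programs (the class names here are
hypothetical, purely for the example):

```python
class Config:
    pass

# Typical existing-program idioms that defeat static inference:
cfg = Config()
cfg.port = 8080          # attributes invented at run-time...
cfg.name = "server"      # ...with different types

mixed = [1, "two", 3.0]  # one list holding three element types

# A more static-friendly rewrite declares the attributes up front,
# each with one consistent type, and keeps containers homogeneous:
class StaticConfig:
    def __init__(self, port, name):
        self.port = port   # always an int
        self.name = name   # always a str
```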
So we will have two choices:
1) running normal python programs on Pypy.
2) translating rpython programs to C and compiling them to stand-alone
executables.

Is that correct?


Indeed. Another possibility is to write a PyPy extension module in
RPython, have that translated to C and then use this in your pure python
code. Actually, one of our current rather wild ideas (which might not be
followed) is to be able to even use RPython to write extension modules
for CPython.

Cheers,

Carl Friedrich Bolz

Dec 22 '05 #12
Anyway, I guess it's just a matter of time until we can use this
translation tool to translate other programs, provided they are written
in restricted Python, right?
So we will have two choices:
1) running normal Python programs on PyPy.
2) translating RPython programs to C and compiling them to stand-alone
executables.

Is that correct?


Oh, forget this question...
You already made this clear in another post in this thread...
Thanks!
Luis

Dec 22 '05 #13
Carl Friedrich Bolz wrote:
Actually, one of our current rather wild ideas (which might not be
followed) is to be able to even use RPython to write extension modules
for CPython.


I don't think this is a wild idea. In fact, it is absolutely
reasonable.
I'm sure that creating this translation tool was a titanic task, and
now that you have it, why not use it? This is a treasure that opens up
many possibilities...
Even if PyPy ends up not being as fast as intended (I hope not!), the
fact that you guys created this translation tool was well worth the
effort.

Thanks again for your explanations and keep up the good work!
Cheers,
Luis

Dec 22 '05 #14
Luis M. González wrote:
At this moment, the translated python-in-python version is, or intends
to be, something more or less equivalent to CPython in terms of
performance.

Actually, I think here it is more or less equivalent in behavior.

Because it is in essence almost the same thing: another C Python
interpreter implementation. The only difference is that while CPython
was written by hand, PyPy was written in Python and auto-translated to C.

That is not the only difference. It becomes a lot easier to experiment
with alternative implementations of features and run timing tests.

What remains to be done now is implementing the psyco-like techniques
for improving speed (amongst many other things, like stackless, etc).

While the psyco-like tricks for specialization should definitely improve
the interpreter, there is a second trick (watch for exploding heads
here). The big trick is that you can specialize the interpreter for
running _its_ input (a Python program), thus giving you a new
interpreter that only runs your Python program -- a very specialized
interpreter indeed.

--Scott David Daniels
sc***********@acm.org
Dec 22 '05 #15
Hi!

Scott David Daniels wrote:
Luis M. González wrote:

At this moment, the translated python-in-python version is, or intends
to be, something more or less equivalent to CPython in terms of
performance.
Actually, I think here it is more or less equivalent in behavior.


Yes, apart from some minor differences (obscure one: in CPython you
cannot subclass str or tuple while adding slots, for no good reason,
while you can do that in PyPy).

[snip]
While the psyco-like tricks for specialization should definitely improve
the interpreter, there is a second trick (watch for exploding heads
here). The big trick is that you can specialize the interpreter for
running _its_ input (a Python program), thus giving you a new
interpreter that only runs your Python program -- a very specialized
interpreter indeed.


Indeed! And this specialized interpreter can with some justification be
called a compiled version of the user program! That means that an
interpreter together with a specializer is a compiler.

Now it is possible to take that fun game even one step further: you
specialize the _specializer_ for running its input (which is the
interpreter), thus giving you a new specializer which can specialize
only the interpreter for a later-given user program -- a very
specialized specializer indeed. This can then be called a just-in-time
compiler. (Note that this is not quite how PyPy's JIT will look :-)
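The first step of this game can be sketched with a toy example (mine,
not from the thread; a real specializer would unfold and optimize the
interpreter, not merely fix one of its arguments): given a tiny
expression interpreter, "specializing" it to one fixed program yields a
function that runs only that program:

```python
def interpret(program, env):
    """Tiny interpreter for nested tuples like ("add", x, y), where
    leaves are variable names looked up in env."""
    if isinstance(program, str):
        return env[program]
    op, left, right = program
    a = interpret(left, env)
    b = interpret(right, env)
    return a + b if op == "add" else a * b

def specialize(program):
    """Trivial 'specializer': fix the program argument, producing an
    interpreter that runs only this one program."""
    return lambda env: interpret(program, env)

prog = ("mul", ("add", "x", "y"), "x")   # (x + y) * x
compiled = specialize(prog)              # a "compiled" version of prog
print(compiled({"x": 3, "y": 4}))        # (3 + 4) * 3 = 21
```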

recursively-yours,

Carl Friedrich Bolz

Dec 22 '05 #16
Ray
Luis M. González wrote:
Well, first and foremost, when I said that I leave the door open for
further explanations, I meant explanations by other people more
knowlegeable than me :-)


<snip>

Thanks for clearing up some of my confusion with PyPy, Luis!

Cheers,
Ray

Dec 23 '05 #17
Ray
Carl Friedrich Bolz wrote:
Hi!

some more pointers in addition to the good stuff that Luis wrote...


<snip>

Thanks Carl! That solidified my mental picture of PyPy a lot more :)

Warm regards,
Ray

Dec 23 '05 #18
Scott David Daniels wrote:
[snip] The big trick is that you can specialize the interpreter for
running _its_ input (a Python program), thus giving you a new
interpreter that only runs your Python program -- a very specialized
interpreter indeed.

Now THAT will be slick!

What is the current roadmap/timeline for PyPy?

Anyone know if Guido is interested in ever becoming deeply involved in
the PyPy project?
Dec 23 '05 #19
> Thanks for clearing up some of my confusion with PyPy, Luis!

Hey, I'm glad you brought up this topic!
This thread really helped me to understand some dark corners of this
exciting project.

I also want to thank Carl and all the other Pypy developers for their
outstanding work!
I've been quietly following the evolution of PyPy through its mailing
list, and I eagerly await every new announcement they make, but I
never dared to ask any question, fearing that I would look like a fool
amongst these rocket scientists...

Cheers,
Luis

Dec 23 '05 #20
Carl Friedrich Bolz wrote:
Luis M. González wrote:


....
So we will have two choices:
1) running normal python programs on Pypy.
2) translating rpython programs to C and compiling them to stand-alone
executables.

Is that correct?


Indeed. Another possibility is to write a PyPy extension module in
RPython, have that translated to C and then use this in your pure python
code. Actually, one of our current rather wild ideas (which might not be
followed) is to be able to even use RPython to write extension modules
for CPython.


Actually, this wild idea is mine, but PyPy may share. :-)

And there is no need for this to be followed, since
I'm going to do this anyway, because I have good personal
reasons:

There is a lot of demand to get Stackless ported more
easily, and the current way of manually fighting the
ever-growing number of C modules just sucks.

Waiting for PyPy to get mature enough to replace Stackless
is one way, but it takes too long. Waiting for PyPy to be
"ready" to produce something beyond exploratory and
toy interpreters is also not an option, although these are very
nice and great for education.

My alternative is to use translation of RPython to produce
extension modules for CPython. Although this is considered
an "implementation detail" by most of the PyPy core people,
companies that are considering writing extensions in C
would be eager to use such a "detail" instead.

I will use it for Stackless Python as a show-case. As an
example, I want to revert itertools and deques to almost
their pure-Python equivalents and then translate them into C using
PyPy's translator. While this is of no visible worth to
the normal Python user, it gives me the advantage that
these modules will gain support for the Stackless features
automatically, because the base support is built into PyPy.
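To make "reverting deques to their Python equivalent" a bit more concrete,
here is a hedged sketch of the style involved: a minimal pure-Python
deque-like class written in a restricted, statically analyzable way (one
container attribute, stable types, no dynamic tricks) of the kind RPython's
translator can handle. This is illustrative only and is not PyPy's or
CPython's actual deque code:

```python
# Hypothetical sketch: a deque-like class in a restricted,
# RPython-friendly style -- plain methods, a single list attribute,
# no dynamic attribute games.

class SimpleDeque(object):
    def __init__(self):
        self.items = []

    def append(self, x):
        self.items.append(x)

    def appendleft(self, x):
        self.items.insert(0, x)   # O(n); a real deque uses a ring buffer

    def pop(self):
        return self.items.pop()

    def popleft(self):
        return self.items.pop(0)

    def __len__(self):
        return len(self.items)

d = SimpleDeque()
d.append(1)
d.append(2)
d.appendleft(0)
print(d.popleft(), d.pop(), len(d))   # -> 0 2 1
```

The point is not the (naive) algorithm but the style: code like this can be
fed through the translator to produce C, while staying ordinary, testable
Python in the meantime.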

This is not trying to split apart from PyPy, or to short-cut its
goals. I'm completely with PyPy's goals, and it will do much
more than RPython translation ever will, this is out of question.

One problem is that we cannot yet produce a competitive Python
implementation. There is a lot more work involved
in gaining the speed necessary to be taken seriously. On the
other hand, the low-level code produced for builtin objects
is already almost as efficient as hand-written code in many
cases. As a proof of concept, I have used this to turn
an application program into compiled RPython, which became
over 10 times faster and outperformed its highly optimized
Java counterpart.

I just believe that RPython is a piece of gold, a gem created
on the side while building the huge thing, and we should
not leave its potential unused. Sure, it needs some processing
and finishing to make it easier to use and to give it better support
for interfacing with existing CPython objects.

After three years, the PyPy project can really take the chance
to produce a small, useful tool for the ambitious developer.
It won't make his task trivial, as PyPy eventually will, but it
makes it considerably simpler than writing C.

merry christmas -- chris
--
Christian Tismer :^) <mailto:ti****@stackless.com>
tismerysoft GmbH : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9A : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 802 86 56 mobile +49 173 24 18 776 fax +49 30 80 90 57 05
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
Dec 25 '05 #21
Christian Tismer wrote:
This is not trying to split apart from PyPy, or to short-cut its
goals. I'm completely with PyPy's goals, and it will do much
more than RPython translation ever will, this is out of question.


Of course I meant "this is beyond question" :-)

Dec 25 '05 #22

Christian Tismer wrote:
Christian Tismer wrote:
This is not trying to split apart from PyPy, or to short-cut its
goals. I'm completely with PyPy's goals, and it will do much
more than RPython translation ever will, this is out of question.


Hi Christian,

I'd like to know, in your opinion, how far off the goal of making PyPy
complete and fast is.
Regarding the current state of the project, are you confident that the
goals will be met, or do you still have doubts?
(The same questions go for the RPython translator as a
stand-alone tool)...

Luis

Dec 25 '05 #23
Luis M. González wrote:
I'd like to know, in your opinion, how far off the goal of making PyPy
complete and fast is.
Me too :-)

PyPy is doing a great job, that's for sure.

I'm hesitant about making estimates, after I learned what a bad
job I do at extrapolation.

First I thought that we would reach our first self-contained
PyPy much earlier, gaining CPython speed. When I had lost my
faith a little, we suddenly made it. Then we made a lot of
progress in speeding it up, but we are still 7 to 10 times
slower than CPython, and it gets harder and harder.

Now we are aiming at JIT technology, which can accelerate
Python considerably in many cases, even if we should fail to
improve the basic translation reasonably. Of course it would
be nice to reach both aims, and I expect that the things we
learn from writing the JIT will also improve the static
translation.

Completeness? In some aspects, like CPython compatibility,
we are very complete, maybe even more than the original. :-)
Concerning the promises we made to the EU, we will have a hard
time making it all happen on schedule, but we have a chance,
given that the support from external helpers keeps growing.
Concerning everything we ever said about PyPy? That is a never-
ending story and unlimited, as I don't expect PyPy to stop
growing and extending in the near future, just as Python doesn't...
Regarding the current state of the project, are you confident that the
goals will be met, or do you still have doubts?
I no longer have doubts about success. I never really had any, but
my time estimates are less pessimistic than they sometimes were.
I don't really believe that we will outperform CPython with
a translated RPython interpreter by the end of next year.
We probably will, though, in conjunction with a JIT compiler.
For gaining maximum performance, my guess is that another
two years would make a lot of sense.
(The same questions go for the RPython translator as a
stand-alone tool)...


This is a matter of viewpoint. As a developer, I'm able to
create extension modules on demand without any explicit tools.
Enabling/supporting the most-needed features might be doable
in a couple of weeks or months, depending on the expectations.
A simple-to-use, stand-alone tool for building extensions may
not happen at all, unless we get a lot of extra resources.
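For readers wondering what driving the translator "without any explicit
tools" looks like: translation is steered by a small target module that
names an entry point. A minimal hypothetical sketch, following the
toolchain's `target(driver, args)` convention (the file name and details
are illustrative; check PyPy's own documentation for the real invocation):

```python
# targetdemo.py -- hypothetical, minimal target module for PyPy's
# translator. Names are illustrative; see PyPy's docs for the real
# invocation. Untranslated, it is also just plain Python.

def entry_point(argv):
    # The translated executable starts here, like C's main(), and
    # returns the process exit code. Keep the code in the restricted
    # RPython style: static-looking types, no dynamic tricks.
    exit_code = 0
    if len(argv) > 1 and argv[1] == "--fail":
        exit_code = 1
    return exit_code

def target(driver, args):
    # The translation driver calls this hook to obtain the entry point.
    return entry_point, None
```

Because the same file runs unmodified on a normal interpreter, you can test
the logic as ordinary Python before spending time on a translation run.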

I'm expecting something to happen in the first quarter of the year.
It depends on how much we can extend our activities without missing
the promised goals we have to fulfill, and how much
sponsoring we can attract. I believe that by providing just enough
support to make some companies productive in using PyPy, we will
create enough funding for the time after 2006 to keep PyPy alive
for a long time, and creating tools like this will become a
self-running motor for PyPy. A matter of good balancing :-)

merry christmas -- chris
Dec 25 '05 #24
