2.6, 3.0, and truly independent interpreters

Dear Python dev community,

I'm CTO at a small software company that makes music visualization
software (you can check us out at www.soundspectrum.com). About two
years ago we decided to use embedded python in a couple of
our new products, given all the great things about python. We were
close to using lua but for various reasons we decided to go with
python. However, over the last two years, there's been one area of
grief that sometimes makes me think twice about our decision to go
with python...

Some background first... Our software is used for entertainment and
centers around real time, high-performance graphics, so python's
performance, embedded flexibility, and stability are the most
important issues for us. Our software targets a large cross section
of hardware and we currently ship products for Win32, OS X, and the
iPhone and since our customers are end users, our products have to be
robust, have a tidy install footprint, and be foolproof. Basically,
we use embedded python and use it to wrap our high performance C++
class set which wraps OpenGL, DirectX and our own software renderer.
In addition to wrapping our C++ frameworks, we use python to perform
various "worker" tasks on worker threads (e.g. image loading and
processing). However, we require *true* thread/interpreter
independence so python 2 has been frustrating at times, to say the
least. Please don't start with "but really, python supports multiple
interpreters" because I've been there many many times with people.
And, yes, I'm aware of the multiprocessing module added in 2.6, but
that stuff isn't lightweight and isn't suitable at all for many
environments (including ours). The bottom line is that if you want to
perform independent processing (in python) on different threads, using
the machine's multiple cores to the fullest, then you're out of luck
under python 2.

Sadly, the only way we could get truly independent interpreters was to
put python in a dynamic library, have our installer make a *duplicate*
copy of it during the installation process (e.g. python.dll/.bundle ->
python2.dll/.bundle) and load each one explicitly in our app, so we
can get truly independent interpreters. In other words, we load a
fresh dynamic lib for each thread-independent interpreter (you can't
reuse the same dynamic library because the OS will just reference the
already-loaded one).
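
The duplicate-library workaround described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: their host application is C++, whereas this sketch uses Python's `ctypes` as a stand-in loader, and the helper names `make_private_copy` and `load_independent_copies` are hypothetical. The idea is just that each copy gets its own file name, so the OS loader treats it as a distinct library with its own globals.

```python
import ctypes
import os
import shutil
import tempfile


def make_private_copy(lib_path, index, dest_dir=None):
    """Copy the shared library to a uniquely named file and return its path."""
    dest_dir = dest_dir or tempfile.mkdtemp(prefix="private_libs_")
    base, ext = os.path.splitext(os.path.basename(lib_path))
    copy_path = os.path.join(dest_dir, f"{base}_{index}{ext}")
    shutil.copy2(lib_path, copy_path)
    return copy_path


def load_independent_copies(lib_path, count):
    """Load `count` separately named copies of one shared library.

    Loading the same path twice normally hands back the already-loaded
    image, so every copy is given its own file name before it is loaded.
    """
    dest_dir = tempfile.mkdtemp(prefix="private_libs_")
    handles = []
    for index in range(count):
        copy_path = make_private_copy(lib_path, index, dest_dir)
        handles.append(ctypes.CDLL(copy_path))
    return handles
```

Whether the loader really keeps the copies apart depends on the platform, so treat this as a description of the trick rather than a recommendation.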

From what I gather from the python community, the basis for not
offering "real" multi-threaded support is that it'd add too much
internal overhead--and I couldn't agree more. As a high performance C
and C++ guy, I fully agree that thread safety should be at the high
level, not at the low level. BUT, the lack of truly independent
interpreters is what ultimately prevents using python in cool,
powerful ways. This shortcoming alone has caused game developers--
both large and small--to choose other embedded interpreters over
python (e.g. Blizzard chose lua over python). For example, Apple's
QuickTime API is powerful in that high-level instance objects can
leverage performance gains associated with multi-threaded processing.
Meanwhile, the QuickTime API simply lists the responsibilities of the
caller regarding thread safety and that's all it needs to do. In
other words, CPython doesn't need to step in and provide a threadsafe
environment; it just needs to establish the rules and make sure that
its own implementation supports those rules.

More than once, I had actually considered expending company resources
to develop a high performance, truly independent interpreter
implementation of the python core language and modules but in the end
estimated that the size of that project would just be too much, given
our company's current resources. Should such an implementation ever
be developed, it would be very attractive for companies to support,
fund, and/or license. The truth is, we just love python as a
language, but its lack of true interpreter independence (in an
interpreter as well as in a thread sense) remains a *huge* liability.

So, my question becomes: is python 3 ready for true multithreaded
support?? Can we finally abandon our Frankenstein approach of loading
multiple identical dynamic libs to achieve truly independent
interpreters?? I've reviewed all the new python 3 C API module stuff,
and all I have to say is: whew--better late than never!! So, although
that solves modules offering truly independent interpreter support,
the following questions remain:

- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over? Are they all now truly compliant? It will only take
a single static/global state variable in a module to potentially cause
no end of pain in a multiple interpreter environment! Yikes!

- How close is python 3 really to true multithreaded use? The
assumption here is that caller ensures safety (e.g. ensuring that
neither interpreter is in use when serializing data from one to
another).
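
To make the first concern concrete: PEP 3121 lets a C extension module keep its data in per-module state (sized through `PyModuleDef.m_size` and reached with `PyModule_GetState`) instead of in C statics. The sketch below is only a pure-Python analogy of that difference, not C API code; `ModuleState`, `global_tick`, and the two "interpreter" objects are hypothetical names made up for illustration.

```python
# Shared, module-level state: every "interpreter" would see the same counter,
# which is the analogue of a static/global variable in a C extension.
_global_call_count = 0


def global_tick():
    """Increment and return the single shared counter."""
    global _global_call_count
    _global_call_count += 1
    return _global_call_count


class ModuleState:
    """Hypothetical stand-in for per-interpreter module state."""

    def __init__(self):
        self.call_count = 0

    def tick(self):
        """Increment and return this instance's private counter."""
        self.call_count += 1
        return self.call_count


# Two hypothetical interpreters, each holding its own state object.
interp_a = ModuleState()
interp_b = ModuleState()

interp_a.tick()
interp_a.tick()
interp_b.tick()   # interp_b is unaffected by interp_a's calls

global_tick()
global_tick()     # the shared counter is bumped no matter who calls
```

A module written the first way leaks state between interpreters; one written the second way does not, which is why a single leftover global is enough to cause trouble.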

I believe that true python independent thread/interpreter support is
paramount and should become the top priority because this is the key
consideration used by developers when they're deciding which
interpreter to embed in their app. Until there's a hello world that
demonstrates running independent python interpreters on multiple app
threads, lua will remain the clear choice over python. Python 3 needs
true interpreter independence and multi-threaded support!
Thanks,
Andy O'Meara
Oct 22 '08 #1
Andy wrote:
Dear Python dev community,

[...] Basically,
we use embedded python and use it to wrap our high performance C++
class set which wraps OpenGL, DirectX and our own software renderer.
In addition to wrapping our C++ frameworks, we use python to perform
various "worker" tasks on worker threads (e.g. image loading and
processing). However, we require *true* thread/interpreter
independence so python 2 has been frustrating at times, to say the
least.
[...]
>
Sadly, the only way we could get truly independent interpreters was to
put python in a dynamic library, have our installer make a *duplicate*
copy of it during the installation process (e.g. python.dll/.bundle ->
python2.dll/.bundle) and load each one explicitly in our app, so we
can get truly independent interpreters. In other words, we load a
fresh dynamic lib for each thread-independent interpreter (you can't
reuse the same dynamic library because the OS will just reference the
already-loaded one).
Interesting questions you ask.

A random note: py2exe also does something similar for executables built
with the 'bundle = 1' option. The python.dll and .pyd extension modules
in this case are not loaded into the process in the 'normal' way (with
some kind of Windows LoadLibrary() call); instead they are loaded by code
in py2exe that /emulates/ LoadLibrary - the code segments are loaded into
memory, fixups are made for imported functions, and the memory is marked
executable.

The result is that separate COM objects implemented as Python modules and
converted into separate dlls by py2exe do not share their interpreters even
if they are running in the same process. Of course this only works on Windows.
In effect this is similar to using /statically/ linked python interpreters
in separate dlls. Can't you do something like that?
So, my question becomes: is python 3 ready for true multithreaded
support?? Can we finally abandon our Frankenstein approach of loading
multiple identical dynamic libs to achieve truly independent
interpreters?? I've reviewed all the new python 3 C API module stuff,
and all I have to say is: whew--better late than never!! So, although
that solves modules offering truly independent interpreter support,
the following questions remain:

- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over? Are they all now truly compliant? It will only take
a single static/global state variable in a module to potentially cause
no end of pain in a multiple interpreter environment! Yikes!
I don't think this is the case (currently). But you could submit patches
to Python so that at least the 'official' modules (builtin and extensions)
would behave correctly in the case of multiple interpreters. At least
this is a much lighter task than writing your own GIL-less interpreter.

My 2 cents,

Thomas
Oct 22 '08 #2
- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over?
No, none of them.
Are they all now truly compliant? It will only take
a single static/global state variable in a module to potentially cause
no end of pain in a multiple interpreter environment! Yikes!
So you will have to suffer pain.
- How close is python 3 really to true multithreaded use?
Python is as thread-safe as ever (i.e. completely thread-safe).
I believe that true python independent thread/interpreter support is
paramount and should become the top priority because this is the key
consideration used by developers when they're deciding which
interpreter to embed in their app. Until there's a hello world that
demonstrates running independent python interpreters on multiple app
threads, lua will remain the clear choice over python. Python 3 needs
true interpreter independence and multi-threaded support!
So what patches to achieve that goal have you contributed so far?

In open source, pleas have nearly zero effect; code contributions are
what have effect.

I don't think any of the current committers has a significant interest
in supporting multiple interpreters (and I say that as the one who wrote
and implemented PEP 3121). To make a significant change, you need to
start with a PEP, offer to implement it once accepted, and offer to
maintain the feature for five years.

Regards,
Martin
Oct 22 '08 #3

Hi Thomas -

I appreciate your thoughts and time on this subject.
>
The result is that separate COM objects implemented as Python modules and
converted into separate dlls by py2exe do not share their interpreters even
if they are running in the same process. Of course this only works on Windows.
In effect this is similar to using /statically/ linked python interpreters
in separate dlls. Can't you do something like that?
You're definitely correct that homebrew loading and linking would do
the trick. However, because our python stuff makes callbacks into our
C/C++, that complicates the linking process (if I understand you
correctly). Also, then there's the problem of OS X.

- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over? Are they all now truly compliant? It will only take
a single static/global state variable in a module to potentially cause
no end of pain in a multiple interpreter environment! Yikes!

I don't think this is the case (currently). But you could submit patches
to Python so that at least the 'official' modules (builtin and extensions)
would behave correctly in the case of multiple interpreters. At least
this is a much lighter task than writing your own GIL-less interpreter.
I agree -- and I've been considering that (or rather, having our
company hire/pay part of the python dev community to do the work). To
consider that, the question becomes, how many modules are we talking
about do you think? 10? 100? I confess that I'm not familiar enough
with the full C python suite to have a good idea of how much work
we're talking about here.

Regards,
Andy


Oct 22 '08 #4

- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over?

No, none of them.
:^)
>
- How close is python 3 really to true multithreaded use?

Python is as thread-safe as ever (i.e. completely thread-safe).
If you're referring to the fact that the GIL does that, then you're
certainly correct. But if you've got multiple CPUs/cores and actually
want to use them, that GIL means you might as well forget about them.
So please take my use of "true multithreaded" to mean "turning off"
the GIL and pushing the responsibility of object safety to the client/API
level (such as in my QuickTime API example).
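
The "responsibility at the client/API level" contract described here can be sketched in Python terms. This is a minimal illustration under stated assumptions, not how CPython or QuickTime is implemented: `FrameBuffer` is a hypothetical class that does no locking of its own, and its documented rule is that the caller must never use one instance from two threads at once.

```python
import threading


class FrameBuffer:
    """Hypothetical object with no internal locking.

    Documented rule: the caller must not touch one instance from two
    threads at the same time.
    """

    def __init__(self):
        self.pixels_written = 0

    def write(self, count):
        # A read-modify-write that is only safe under the rule above.
        current = self.pixels_written
        self.pixels_written = current + count


def worker(buffer, lock, iterations):
    for _ in range(iterations):
        with lock:  # the caller, not FrameBuffer, enforces exclusive access
            buffer.write(1)


def run(threads=4, iterations=10_000):
    """Run several worker threads against one buffer and return the total."""
    buffer = FrameBuffer()
    lock = threading.Lock()
    pool = [
        threading.Thread(target=worker, args=(buffer, lock, iterations))
        for _ in range(threads)
    ]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return buffer.pixels_written
```

On a GIL build these threads still take turns executing bytecode, so the sketch shows the division of responsibility, not a multi-core speedup.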

I believe that true python independent thread/interpreter support is
paramount and should become the top priority because this is the key
consideration used by developers when they're deciding which
interpreter to embed in their app. Until there's a hello world that
demonstrates running independent python interpreters on multiple app
threads, lua will remain the clear choice over python. Python 3 needs
true interpreter independence and multi-threaded support!

So what patches to achieve that goal have you contributed so far?

In open source, pleas have nearly zero effect; code contributions is
what has effect.
This is just my second email, please be a little patient. :^) But
more seriously, I do represent a company ready, able, and willing to
fund the development of features that we're looking for, so please
understand that I'm definitely not coming to the table empty-handed
here.

I don't think any of the current committers has a significant interest
in supporting multiple interpreters (and I say that as the one who wrote
and implemented PEP 3121). To make a significant change, you need to
start with a PEP, offer to implement it once accepted, and offer to
maintain the feature for five years.
Nice to meet you! :^) Seriously though, thank you for all your work on
3121 and taking the initiative with it! It's definitely the first
step toward what attracts companies like ours to embedding an
interpreted language. Specifically: unrestricted interpreter- and
thread-independent use.

I would *love* for our company to be 10 times larger and be able to
add another zero to what we'd be able to hire/offer the python dev
community for work that we're looking for, but we unfortunately have
limits at the moment. And I would love to see python become the
leading choice when companies look to use an embedded interpreter, and
I offer my comments here to paint a picture of what can make python
more appealing to commercial software developers. Hopefully, the
python dev community doesn't underestimate the dev funding that could
potentially come in from companies if python grew in certain ways!

So, that said, I represent a company willing to fund the development
of features that move python towards thread-independent operation. No
software engineer can deny that we're entering a new era of
multithreaded processing where support frameworks (such as python)
need to be open minded with how they're used in a multi-threaded
environment--that's all I'm saying here.

Anyway, I can definitely tell you and anyone else interested that
we're willing to put our money where our wish-list is. As I mentioned
in my previous post to Thomas, the next step is to get an
understanding of the options available that will satisfy our needs.
We have a budget for this, but it's not astronomical (it's driven by
the cost associated with dropping python and going with lua--or,
making our own pared-down interpreter implementation). Please let me
be clear--I love python (as a language) and I don't want to switch.
BUT, we have to be able to run interpreters in different threads (and
get unhindered/full CPU core performance--i.e. no GIL).

Thoughts? Also, please feel free to email me off-list if you prefer.

Oh, while I'm at it, if anyone in the python dev community (or anyone
that has put real work into python) is interested in our software,
email me and I'll hook you up with a complimentary copy of the
products that use python (music visuals for iTunes and WMP).

Regards,
Andy


Oct 22 '08 #5
I would *love* for our company to be 10 times larger and be able to
add another zero to what we'd be able to hire/offer the python dev
community for work that we're looking for, but we unfortunately have
limits at the moment.
There is another thing about open source that you need to consider:
you don't have to do it all on your own.

It needs somebody to take the lead, start a project, define a plan,
and small steps to approach it. If it's really something that the
community desperately needs, and if you make it clear that you will
just lead, but get nowhere without contributions, then the
contributions will come in.

If there won't be any contributions, then the itch in the
community isn't so strong that it needs scratching.

Regards,
Martin
Oct 22 '08 #6
Andy wrote:
I agree -- and I've been considering that (or rather, having our
company hire/pay part of the python dev community to do the work). To
consider that, the question becomes, how many modules are we talking
about do you think? 10? 100?
In your Python directory, everything in Lib is Python, I believe.
Everything in DLLs is compiled C extensions. I see about 15 for Windows
3.0. These reflect two separate directories in the source tree. Builtin
classes are part of pythonxx.dll in the main directory. I have no idea
if things such as lists (from listobject.c), for instance, are a
potential problem for you.

You could start with the module of most interest to you, or perhaps a
small one, and see if it needs patching (from your viewpoint) and how
much effort it would take to meet your needs.

Terry Jan Reedy

Oct 22 '08 #7
On Wed, Oct 22, 2008 at 12:32 PM, Andy <an****@gmail.com> wrote:
And, yes, I'm aware of the multiprocessing module added in 2.6, but
that stuff isn't lightweight and isn't suitable at all for many
environments (including ours). The bottom line is that if you want to
perform independent processing (in python) on different threads, using
the machine's multiple cores to the fullest, then you're out of luck
under python 2.
So, as the guy-on-the-hook for multiprocessing, I'd like to know what
you might suggest to make it more apt for your environment, and for
others.

Additionally, have you looked at:
https://launchpad.net/python-safethread
http://code.google.com/p/python-safethread/w/list
(by Adam Olsen)

-jesse
Oct 22 '08 #8
Andy wrote:
This is just my second email, please be a little patient. :^)
As a 10-year veteran, I welcome new contributors with new viewpoints and
information.
more appealing to commercial software developers. Hopefully, the
python dev community doesn't underestimate the dev funding that could
potentially come in from companies if python grew in certain ways!
This seems to be something of a chicken-and-egg problem.
So, that said, I represent a company willing to fund the development
of features that move python towards thread-independent operation.
Perhaps you know of and can persuade other companies to contribute to
such focused effort.
No
software engineer can deny that we're entering a new era of
multithreaded processing where support frameworks (such as python)
need to be open minded with how they're used in a multi-threaded
environment--that's all I'm saying here.
The *current* developers seem to be more interested in exploiting
multiple processors with multiprocessing. Note that Google chose that
route for Chrome (as I understood their comic introduction). 2.6 and 3.0
come with a new multiprocessing module that mimics the threading module
api fairly closely. It is now being backported to run with 2.5 and 2.4.

Advances in multithreading will probably require new ideas and
development energy.

Terry Jan Reedy

Oct 22 '08 #9
On Wed, Oct 22, 2008 at 5:34 PM, Terry Reedy <tj*****@udel.edu> wrote:
The *current* developers seem to be more interested in exploiting multiple
processors with multiprocessing. Note that Google chose that route for
Chrome (as I understood their comic introduction). 2.6 and 3.0 come with a
new multiprocessing module that mimics the threading module api fairly
closely. It is now being backported to run with 2.5 and 2.4.
That's not exactly correct. Multiprocessing was added to 2.6 and 3.0
as an *additional* method for parallel/concurrent programming that
allows you to use multiple cores - however, as I noted in the PEP:

" In the future, the package might not be as relevant should the
CPython interpreter enable "true" threading, however for some
applications, forking an OS process may sometimes be more
desirable than using lightweight threads, especially on those
platforms where process creation is fast and optimized."

Multiprocessing is not a replacement for a "free threading" future
(ergo my mentioning Adam Olsen's work) - it is a tool in the
"batteries included" box. I don't want my cheerleading and driving of
this to somehow imply that the rest of Python-Dev thinks this is
the "silver bullet" or final answer in concurrency.
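
For readers who have not used the package, here is a minimal sketch of how multiprocessing spreads CPU-bound work over several cores. It is written for current Python 3, and the function names (`sum_of_squares`, `split_range`, `parallel_sum_of_squares`) are hypothetical, chosen only for this illustration. Each worker is a separate OS process with its own interpreter and its own GIL, which is also why it is heavier than a thread.

```python
import multiprocessing


def sum_of_squares(bounds):
    """CPU-bound work for one chunk; runs inside a worker process."""
    start, stop = bounds
    return sum(n * n for n in range(start, stop))


def split_range(total, parts):
    """Split range(total) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(total, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(total, processes=4):
    """Fan the chunks out to separate processes and combine the results."""
    with multiprocessing.Pool(processes=processes) as pool:
        return sum(pool.map(sum_of_squares, split_range(total, processes)))


# Typical use, kept behind the main guard so that child processes which
# re-import this module do not start pools of their own:
#
# if __name__ == "__main__":
#     print(parallel_sum_of_squares(1_000_000))
```

Arguments and results are pickled and passed between processes, so this pattern suits coarse-grained jobs better than fine-grained sharing of live objects.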

However, a free-threaded python has a lot of implications, and if we
were to do it, it requires we not only "drop" the GIL - it also
requires we consider the ramifications of enabling true threading ala
Java et al - just having "true threads" lying around is great if
you've spent a ton of time learning locking, avoiding shared data/etc,
stepping through and cursing poor debugger support for multiple
threads, etc.

This is why I've been a fan of Adam's approach - enabling free
threading via GIL removal is actually secondary to the project's
stated goal: Enable Safe Threading.

In any case, I've jumped the rails - let's just say there's room in
python for multiprocessing, threading and possibly a concurrent
package ala java.util.concurrent - but it really does have to be
thought out and done right.

Speaking of which: If you wanted "real" threads, you could use a
combination of JCC (http://pypi.python.org/pypi/JCC/) and Jython. :)

-jesse
Oct 22 '08 #10
