Bytes | Software Development & Data Engineering Community

2.6, 3.0, and truly independent interpreters

Dear Python dev community,

I'm CTO at a small software company that makes music visualization
software (you can check us out at www.soundspectrum.com). About two
years ago we made the decision to use embedded python in a couple of
our new products, given all the great things about python. We were
close to using lua, but for various reasons we decided to go with
python. However, over the last two years, there's been one area of
grief that sometimes makes me think twice about our decision to go
with python...

Some background first... Our software is used for entertainment and
centers around real time, high-performance graphics, so python's
performance, embedded flexibility, and stability are the most
important issues for us. Our software targets a large cross section
of hardware and we currently ship products for Win32, OS X, and the
iPhone and since our customers are end users, our products have to be
robust, have a tidy install footprint, and be foolproof. Basically,
we use embedded python and use it to wrap our high performance C++
class set which wraps OpenGL, DirectX and our own software renderer.
In addition to wrapping our C++ frameworks, we use python to perform
various "worker" tasks on worker threads (e.g. image loading and
processing). However, we require *true* thread/interpreter
independence, so python 2 has been frustrating at times, to say the
least. Please don't start with "but really, python supports multiple
interpreters" because I've been there many many times with people.
And, yes, I'm aware of the multiprocessing module added in 2.6, but
that stuff isn't lightweight and isn't suitable at all for many
environments (including ours). The bottom line is that if you want to
perform independent processing (in python) on different threads, using
the machine's multiple cores to the fullest, then you're out of luck
under python 2.

Sadly, the only way we could get truly independent interpreters was to
put python in a dynamic library, have our installer make a *duplicate*
copy of it during the installation process (e.g. python.dll/.bundle ->
python2.dll/.bundle), and load each one explicitly in our app. In
other words, we load a
fresh dynamic lib for each thread-independent interpreter (you can't
reuse the same dynamic library because the OS will just reference the
already-loaded one).

From what I gather from the python community, the basis for not
offering "real" multi-threaded support is that it'd add too much
internal overhead--and I couldn't agree more. As a high performance C
and C++ guy, I fully agree that thread safety should be at the high
level, not at the low level. BUT, the lack of truly independent
interpreters is what ultimately prevents using python in cool,
powerful ways. This shortcoming alone has caused game developers--
both large and small--to choose other embedded interpreters over
python (e.g. Blizzard chose lua over python). For example, Apple's
QuickTime API is powerful in that high-level instance objects can
leverage performance gains associated with multi-threaded processing.
Meanwhile, the QuickTime API simply lists the responsibilities of the
caller regarding thread safety and that's all it needs to do. In
other words, CPython doesn't need to step in and provide a threadsafe
environment; it just needs to establish the rules and make sure that
its own implementation supports those rules.

More than once, I had actually considered expending company resources
to develop a high performance, truly independent interpreter
implementation of the python core language and modules but in the end
estimated that the size of that project would just be too much, given
our company's current resources. Should such an implementation ever
be developed, it would be very attractive for companies to support,
fund, and/or license. The truth is, we just love python as a
language, but its lack of true interpreter independence (in an
interpreter as well as in a thread sense) remains a *huge* liability.

So, my question becomes: is python 3 ready for true multithreaded
support?? Can we finally abandon our Frankenstein approach of loading
multiple identical dynamic libs to achieve truly independent
interpreters?? I've reviewed all the new python 3 C API module stuff,
and all I have to say is: whew--better late than never!! So, although
that solves modules offering truly independent interpreter support,
the following questions remain:

- In python 3, the C module API now supports true interpreter
independence, but have all the modules in the python codebase been
converted over? Are they all now truly compliant? It will only take
a single static/global state variable in a module to potentially cause
no end of pain in a multiple interpreter environment! Yikes!

- How close is python 3 really to true multithreaded use? The
assumption here is that the caller ensures safety (e.g. ensuring that
neither interpreter is in use when serializing data from one to
another).

I believe that true independent thread/interpreter support in python
is paramount and should become the top priority because this is the key
consideration used by developers when they're deciding which
interpreter to embed in their app. Until there's a hello world that
demonstrates running independent python interpreters on multiple app
threads, lua will remain the clear choice over python. Python 3 needs
true interpreter independence and multi-threaded support!
Thanks,
Andy O'Meara
Oct 22 '08
On Oct 24, 9:35 am, sturlamolden <sturlamol...@yahoo.no> wrote:
Instead of "appdomains" (one interpreter per thread), or free
threading, you could use multiple processes. Take a look at the new
multiprocessing module in Python 2.6.
That's mentioned earlier in the thread.
>
There is a fundamental problem with using homebrew loading of multiple
(but renamed) copies of PythonXX.dll that is easily overlooked. That
is, extension modules (.pyd) are DLLs as well.
Tell me about it--there's all kinds of problems and maintenance
liabilities with our approach. That's why I'm here talking about this
stuff.
There are other options as well:

- Use IronPython. It does not have a GIL.

- Use Jython. It does not have a GIL.

- Use pywin32 to create isolated outproc COM servers in Python. (I'm
not sure what the effect of inproc servers would be.)

- Use os.fork() if your platform supports it (Linux, Unix, Apple,
Cygwin, Windows Vista SUA). This is the standard posix way of doing
multiprocessing. It is almost unbeatable if you have a fast copy-on-
write implementation of fork (that is, all platforms except Cygwin).
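(For readers weighing that last option, here is a minimal sketch of the fork approach on a POSIX platform; the squared-sum task and the one-byte exit-status result channel are illustrative toys, not anything from this thread:)

```python
import os

def run_forked(n):
    """Run a toy CPU-bound task in a forked child: a fully
    independent interpreter image with no shared GIL."""
    pid = os.fork()
    if pid == 0:
        # Child: a copy-on-write clone of the parent process.
        result = sum(i * i for i in range(n)) % 256
        os._exit(result)  # the exit status carries one byte back
    return pid  # parent

pids = [run_forked(n) for n in (1000, 2000, 3000)]
# Reap each child and unpack its one-byte exit status.
results = [os.waitpid(pid, 0)[1] >> 8 for pid in pids]
```

Real code would hand results back through a pipe or socket rather than an exit status; the point is only that each child computes in complete isolation from the parent.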
This is discussed earlier in the thread--they're unfortunately all
out.

Oct 24 '08 #21
Terry Reedy wrote:
Everything in DLLs is compiled C extensions. I see about 15 for Windows
3.0.
Ah, weren't those wonderful times back in the days of Win3.0, when DLL-hell was
inhabited by only 15 libraries? *sigh*

.... although ... wait, didn't Win3.0 have more than that already? Maybe you
meant Windows 1.0?

SCNR-ly,

Stefan
Oct 24 '08 #22
On Oct 24, 3:58 pm, "Andy O'Meara" <and...@gmail.com> wrote:
This is discussed earlier in the thread--they're unfortunately all
out.
It occurs to me that tcl is doing what you want. Have you ever thought
of not using Python?

That aside, the fundamental problem is what I perceive as a
fundamental design flaw in Python's C API. In Java JNI, each function
takes a JNIEnv* pointer as its first argument. There is nothing that
prevents you from embedding several JVMs in a process. Python can
create embedded subinterpreters, but it works differently. It swaps
subinterpreters like a finite state machine: only one is concurrently
active, and the GIL is shared. The approach is fine, except it kills
free threading of subinterpreters. The argument seems to be that
Apache's mod_python somehow depends on it (for reasons I don't
understand).

Oct 24 '08 #23
On Oct 24, 2:12 am, greg <g...@cosc.canterbury.ac.nz> wrote:
Andy wrote:
1) Independent interpreters (this is the easier one--and solved, in
principle anyway, by PEP 3121, by Martin v. Löwis

Something like that is necessary for independent interpreters,
but not sufficient. There are also all the built-in constants
and type objects to consider. Most of these are statically
allocated at the moment.
Agreed--I was just trying to speak generally. Or, put another way,
there's no hope for independent interpreters without the likes of PEP
3121. Also, as Martin pointed out, there's the issue of module
cleanup that some guys here may underestimate (and I'm glad Martin pointed
out the importance of it). Without the module cleanup, every time a
dynamic library using python loads and unloads you've got leaks. This
issue is a real problem for us since our software is loaded and
unloaded many many times in a host app (iTunes, WMP, etc). I hadn't
raised it here yet (and I don't want to turn the discussion to this),
but lack of multiple load and unload support has been another painful
issue that we didn't expect to encounter when we went with python.

2) Barriers to "free threading". As Jesse describes, this is simply
just the GIL being in place, but of course it's there for a reason.
It's there because (1) doesn't hold and there was never any specs/
guidance put forward about what should and shouldn't be done in multi-
threaded apps

No, it's there because it's necessary for acceptable performance
when multiple threads are running in one interpreter. Independent
interpreters wouldn't mean the absence of a GIL; it would only
mean each interpreter having its own GIL.
I see what you're saying, but let's note that what you're talking
about at this point is an interpreter that protects against client
code violating the (supposed) direction put forth in python
multithreading guidelines. Glenn Linderman's post really gets at
what's at hand here. It's really important to consider that it's not
a given that python (or any framework) has to be designed against
hazardous use. Again, I refer you to the diagrams and guidelines in
the QuickTime API:

http://developer.apple.com/technotes/tn/tn2125.html

They tell you point-blank what you can and can't do, and it's that
simple. Their engineers can then simply create the implementation
around those specs and not weigh any of the implementation down with
sync mechanisms. I'm in the camp that simplicity and convention wins
the day when it comes to an API. It's safe to say that software
engineers expect and assume that a thread that doesn't have contact
with other threads (except for explicit, controlled message/object
passing) will run unhindered and safely, so I raise an eyebrow at the
GIL (or any internal "helper" sync stuff) holding up a thread's
performance when the app is designed to not need lower-level global
locks.

Anyway, let's talk about solutions. My company is looking to support
a python dev community endeavor that allows the following:

- an app makes N worker threads (using the OS)

- each worker thread makes its own interpreter, pops scripts off a
work queue, and manages exporting (and then importing) result data to
other parts of the app. Generally, we're talking about CPU-bound work
here.

- each interpreter has the essentials (e.g. math support, string
support, re support, and so on -- I realize this is open-ended, but
work with me here).

Let's guesstimate about what kind of work we're talking about here and
if this is even in the realm of possibility. If we find that it *is*
possible, let's figure out what level of work we're talking about.
From there, I can get serious about writing up a PEP/spec, paid
support, and so on.

Regards,
Andy

Oct 24 '08 #24
I'm not finished reading the whole thread yet, but I've got some
things below to respond to this post with.

On Thu, Oct 23, 2008 at 9:30 AM, Glenn Linderman <v+******@g.nevcal.com> wrote:
On approximately 10/23/2008 12:24 AM, came the following characters from the
keyboard of Christian Heimes:
>>
Andy wrote:
>>>
2) Barriers to "free threading". As Jesse describes, this is simply
just the GIL being in place, but of course it's there for a reason.
It's there because (1) doesn't hold and there was never any specs/
guidance put forward about what should and shouldn't be done in multi-
threaded apps (see my QuickTime API example). Perhaps if we could go
back in time, we would not put the GIL in place, strict guidelines
regarding multithreaded use would have been established, and PEP 3121
would have been mandatory for C modules. Then again--screw that, if I
could go back in time, I'd just go for the lottery tickets!! :^)


I've been following this discussion with interest, as it certainly seems
that multi-core/multi-CPU machines are the coming thing, and many
applications will need to figure out how to use them effectively.
>I'm very - not absolutely, but very - sure that Guido and the initial
designers of Python would have added the GIL anyway. The GIL makes Python
faster on single core machines and more stable on multi core machines. Other
language designers think the same way. Ruby recently got a GIL. The article
http://www.infoq.com/news/2007/05/ru...eading-futures explains the
rationales for a GIL in Ruby. The article also holds a quote from Guido
about threading in general.

Several people inside and outside the Python community think that threads
are dangerous and don't scale. The paper
http://www.eecs.berkeley.edu/Pubs/Te...ECS-2006-1.pdf sums it up
nicely. It explains why modern processors are going to cause more and more
trouble with the Java approach to threads, too.

Reading this PDF paper is extremely interesting (albeit somewhat dependent
on understanding abstract theories of computation; I have enough math
background to follow it, sort of, and most of the text can be read even
without fully understanding the theoretical abstractions).

I have already heard people talking about "Java applications are buggy". I
don't believe that general sequential programs written in Java are any
buggier than programs written in other languages... so I had interpreted
that to mean (based on some inquiry) that complex, multi-threaded Java
applications are buggy. And while I also don't believe that complex,
multi-threaded programs written in Java are any buggier than complex,
multi-threaded programs written in other languages, it does seem to be true
that Java is one of the currently popular languages in which to write
complex, multi-threaded programs, because of its language support for
threads and concurrency primitives. These reports were from people that are
not programmers, but are field IT people, that have bought and/or support
software and/or hardware with drivers, that are written in Java, and seem to
have non-ideal behavior, (apparently only) curable by stopping/restarting
the application or driver, or sometimes requiring a reboot.

The paper explains many traps that lead to complex, multi-threaded programs
being buggy, and being hard to test. I have worked with parallel machines,
applications, and databases for 25 years, and can appreciate the succinct
expression of the problems explained within the paper, and can, from
experience, agree with its premises and conclusions. Parallel applications
only have been commercial successes when the parallelism is tightly
constrained to well-controlled patterns that could be easily understood.
Threads, especially in "cooperation" with languages that use memory
pointers, have the potential to get out of control, in inexplicable ways.

>Python *must* gain means of concurrent execution of CPU bound code
eventually to survive on the market. But it must get the right means or we
are going to suffer the consequences.

This statement, after reading the paper, seems somewhat in line with the
author's premise that language acceptability requires that a language be
self-contained/monolithic, and potentially sufficient to implement itself.
That seems to also be one of the reasons that Java is used today for
threaded applications. It does seem to be true, given current hardware
trends, that _some mechanism_ must be provided to obtain the benefit of
multiple cores/CPUs to a single application, and that Python must either
implement or interface to that mechanism to continue to be a viable language
for large scale application development.

Andy seems to want an implementation of independent Python processes
implemented as threads within a single address space, that can be
coordinated by an outer application. This actually corresponds to the model
promulgated in the paper as being most likely to succeed. Of course, it
maps nicely into a model using separate processes, coordinated by an outer
process, also. The differences seem to be:

1) Most applications are historically perceived as corresponding to single
processes. Language features for multi-processing are rare, and such
languages are not in common use.

2) A single address space can be convenient for the coordinating outer
application. It does seem simpler and more efficient to simply "copy" data
from one memory location to another, rather than send it in a message,
especially if the data are large. On the other hand, coordination of memory
access between multiple cores/CPUs effectively causes memory copies from one
cache to the other, and if memory is accessed from multiple cores/CPUs
regularly, the underlying hardware implements additional synchronization and
copying of data, potentially each time the memory is accessed. Being forced
to do message passing of data between processes can actually be more
efficient than access to shared memory at times. I should note that in my
25 years of parallel development, all the systems created used a message
passing paradigm, partly because the multiple CPUs often didn't share the
same memory chips, much less the same address space, and that a key feature
of all the successful systems of that nature was an efficient inter-CPU
message passing mechanism. I should also note that Herb Sutter has a recent
series of columns in Dr Dobbs regarding multi-core/multi-CPU parallelism and
a variety of implementation pitfalls, that I found to be very interesting
reading.

I have noted the multiprocessing module that is new to Python 2.6/3.0 being
feverishly backported to Python 2.5, 2.4, etc... indicating that people
truly find the model/module useful... seems that this is one way, in Python
rather than outside of it, to implement the model Andy is looking for,
although I haven't delved into the details of that module yet, myself. I
suspect that a non-Python application could load one embedded Python
interpreter, and then indirectly use the multiprocessing module to control
other Python interpreters in other processors. I don't know that
multithreading primitives such as described in the paper are available in
the multiprocessing module, but perhaps they can be implemented in some
manner using the tools that are provided; in any case, some interprocess
communication primitives are provided via this new Python module.

There could be opportunity to enhance Python with process creation and
process coordination operations, rather than have it depend on
easy-to-implement-incorrectly coordination patterns or
easy-to-use-improperly libraries/modules of multiprocessing primitives (this
is not a slam of the new multiprocessing module, which appears to be filling
a present need in rather conventional ways, but just to point out that ideas
promulgated by the paper, which I suspect 2 years later are still research
topics, may be a better abstraction than the conventional mechanisms).

One thing Andy hasn't yet explained (or I missed) is why any of his
application is coded in a language other than Python. I can think of a
number of possibilities:

A) (Historical) It existed, then the desire for extensions was seen, and
Python was seen as a good extension language.

B) Python is inappropriate (performance?) for some of the algorithms (but
should they be coded instead as Python extensions, with the core application
being in Python?)

C) Unavailability of Python wrappers for particularly useful 3rd-party
libraries

D) Other?
We develop virtual instrument plugins for music production using
AudioUnit, VST, and RTAS on Windows and OS X. While our dsp engine's
code has to be written in C/C++ for performance reasons, the gui could
have been written in python. But, we didn't because:

1) Our project lead didn't know python, and the project began with
little time for him to learn it.
2) All of our third-party libs (for dsp, plugin-wrappers, etc) are
written in C++, so it would be far easier to write and debug our app if
written in the same language. Could I do it now? Yes. Could we do it
then? No.

** Additionally **, we would have run into this problem, which is very
appropriate to this thread:

3) Adding python as an audio scripting language in the audio thread
would have caused concurrency issues if our GUI had been written in
python, since audio threads are not allowed to make blocking calls
(e.g. acquiring the GIL).

OK, I'll continue reading the thread now :)
>
--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

--
http://mail.python.org/mailman/listinfo/python-list
Oct 24 '08 #25

Glenn, great post and points!
>
Andy seems to want an implementation of independent Python processes
implemented as threads within a single address space, that can be
coordinated by an outer application. This actually corresponds to the
model promulgated in the paper as being most likely to succeed.
Yeah, that's the idea--let the highest levels run and coordinate the
show.
>
It does seem simpler and more efficient to simply "copy"
data from one memory location to another, rather than send it in a
message, especially if the data are large.
That's the rub... In our case, we're doing image and video
manipulation--stuff not good to be messaging from address space to
address space. The same argument holds for numerical processing with
large data sets. The workers handing back huge data sets via
messaging isn't very attractive.
One thing Andy hasn't yet explained (or I missed) is why any of his
application is coded in a language other than Python.
Our software runs in real time (so performance is paramount),
interacts with other static libraries, depends on worker threads to
perform real-time image manipulation, and leverages Windows and Mac OS
API concepts and features. Python's performance hits have generally
been a huge challenge with our animators because they often have to go
back and massage their python code to improve execution performance.
So, in short, there are many reasons why we use python as a part
rather than a whole.

The other area of pain that I mentioned in one of my other posts is
that what we ship, above all, can't be flaky. The lack of module
cleanup (intended to be addressed by PEP 3121), using a duplicate copy
of the python dynamic lib, and namespace black magic to achieve
independent interpreters are all examples that have made using python
for us much more challenging and time-consuming than we ever
anticipated.

Again, if it turns out nothing can be done about our needs (which
appears to be more and more like the case), I think it's important for
everyone here to consider the points raised here in the last week.
Moreover, realize that the python dev community really stands to gain
from making python usable as a tool (rather than a monolith). This
fact alone has caused lua to *rapidly* rise in popularity with
software companies looking to embed a powerful, lightweight
interpreter in their software.

As a python language fan and enthusiast, don't let lua win! (I say
this endearingly of course--I have the utmost respect for both
communities and I only want to see CPython be an attractive pick when
a company is looking to embed a language that won't intrude upon their
app's design).
Andy
Oct 24 '08 #26
We are in the same position as Andy here.

I think that something that would help people like us produce
something in code form is a collection of information outlining the
problem and suggested solutions, appropriate parts of the CPython's
current threading API, and pros and cons of the many various proposed
solutions to the different levels of the problem. The most valuable
information I've found is contained in the many (lengthy!) discussions
like this one, a few related PEP's, and the CPython docs, but has
anyone condensed the state of the problem into a wiki or something
similar? Maybe we should start one?

For example, Guido's post here
http://www.artima.com/weblogs/viewpo...14235 describes some
possible solutions to the problem, like interpreter-specific locks, or
fine-grained object locks, and he also mentions the primary
requirement of not harming the performance of single-threaded
apps. As I understand it, that requirement does not rule out new build
configurations that provide some level of concurrency, as long as you
can still compile python so as to perform as well on single-threaded
apps.

To add to the heap of use cases, the most important thing to us is to
simply have the python language and the sip/PyQt modules available to
us. All we wanted to do was embed the interpreter and language core as
a local scripting engine, so had we patched python to provide
concurrent execution, we wouldn't have cared about all of the other
unsupported extension modules since our scripts are quite
application-specific.

It seems to me that the very simplest move would be to remove global
static data so the app could provide all thread-related data, which
Andy suggests through references to the QuickTime API. This would
suggest compiling python without thread support so as to leave it up
to the application.

Anyway, I'm having fun reading all of these papers and news postings,
but it's true that code talks, and it could be a little easier if the
state of the problems was condensed. This could be an intense and fun
project, but frankly it's a little tough to keep it all in my head. Is
there a wiki or something out there or should we start one, or do I
just need to read more code?

On Fri, Oct 24, 2008 at 6:40 AM, Andy O'Meara <an****@gmail.com> wrote:
On Oct 24, 2:12 am, greg <g...@cosc.canterbury.ac.nz> wrote:
>Andy wrote:
1) Independent interpreters (this is the easier one--and solved, in
principle anyway, by PEP 3121, by Martin v. Löwis

Something like that is necessary for independent interpreters,
but not sufficient. There are also all the built-in constants
and type objects to consider. Most of these are statically
allocated at the moment.

Agreed--I was just trying to speak generally. Or, put another way,
there's no hope for independent interpreters without the likes of PEP
3121. Also, as Martin pointed out, there's the issue of module
cleanup that some guys here may underestimate (and I'm glad Martin pointed
out the importance of it). Without the module cleanup, every time a
dynamic library using python loads and unloads you've got leaks. This
issue is a real problem for us since our software is loaded and
unloaded many many times in a host app (iTunes, WMP, etc). I hadn't
raised it here yet (and I don't want to turn the discussion to this),
but lack of multiple load and unload support has been another painful
issue that we didn't expect to encounter when we went with python.

2) Barriers to "free threading". As Jesse describes, this is simply
just the GIL being in place, but of course it's there for a reason.
It's there because (1) doesn't hold and there was never any specs/
guidance put forward about what should and shouldn't be done in multi-
threaded apps

No, it's there because it's necessary for acceptable performance
when multiple threads are running in one interpreter. Independent
interpreters wouldn't mean the absence of a GIL; it would only
mean each interpreter having its own GIL.

I see what you're saying, but let's note that what you're talking
about at this point is an interpreter that protects against client
code violating the (supposed) direction put forth in python
multithreading guidelines. Glenn Linderman's post really gets at
what's at hand here. It's really important to consider that it's not
a given that python (or any framework) has to be designed against
hazardous use. Again, I refer you to the diagrams and guidelines in
the QuickTime API:

http://developer.apple.com/technotes/tn/tn2125.html

They tell you point-blank what you can and can't do, and it's that
simple. Their engineers can then simply create the implementation
around those specs and not weigh any of the implementation down with
sync mechanisms. I'm in the camp that simplicity and convention wins
the day when it comes to an API. It's safe to say that software
engineers expect and assume that a thread that doesn't have contact
with other threads (except for explicit, controlled message/object
passing) will run unhindered and safely, so I raise an eyebrow at the
GIL (or any internal "helper" sync stuff) holding up a thread's
performance when the app is designed to not need lower-level global
locks.

Anyway, let's talk about solutions. My company is looking to support
a python dev community endeavor that allows the following:

- an app makes N worker threads (using the OS)

- each worker thread makes its own interpreter, pops scripts off a
work queue, and manages exporting (and then importing) result data to
other parts of the app. Generally, we're talking about CPU-bound work
here.

- each interpreter has the essentials (e.g. math support, string
support, re support, and so on -- I realize this is open-ended, but
work with me here).

Let's guesstimate about what kind of work we're talking about here and
if this is even in the realm of possibility. If we find that it *is*
possible, let's figure out what level of work we're talking about.
From there, I can get serious about writing up a PEP/spec, paid
support, and so on.

Regards,
Andy

Oct 24 '08 #27
As a side note to the performance question, we are executing python
code in an audio thread that is used in all of the top-end music
production environments. We have found the language to perform
extremely well when executed at control-rate frequency, meaning we
aren't doing DSP computations, just responding to less-frequent events
like user input and MIDI messages.

So we are sitting on this music platform with unimaginable possibilities
in the music world (of which python does not play a role), but those
little CPU spikes caused by the GIL at low latencies won't let us have
it. AFAIK, there is no music scripting language out there that would
come close, and yet we are sooooo close! This is a big deal.

On Fri, Oct 24, 2008 at 7:42 AM, Andy O'Meara <an****@gmail.com> wrote:
>
Glenn, great post and points!
>>
Andy seems to want an implementation of independent Python processes
implemented as threads within a single address space, that can be
coordinated by an outer application. This actually corresponds to the
model promulgated in the paper as being most likely to succeed.

Yeah, that's the idea--let the highest levels run and coordinate the
show.
>>
It does seem simpler and more efficient to simply "copy"
data from one memory location to another, rather than send it in a
message, especially if the data are large.

That's the rub... In our case, we're doing image and video
manipulation--stuff not good to be messaging from address space to
address space. The same argument holds for numerical processing with
large data sets. The workers handing back huge data sets via
messaging isn't very attractive.
> One thing Andy hasn't yet explained (or I missed) is why any of his
> application is coded in a language other than Python.

Our software runs in real time (so performance is paramount),
interacts with other static libraries, depends on worker threads to
perform real-time image manipulation, and leverages Windows and Mac OS
API concepts and features. Python's performance hits have generally
been a huge challenge with our animators because they often have to go
back and massage their python code to improve execution performance.
So, in short, there are many reasons why we use python as a part
rather than a whole.

The other area of pain that I mentioned in one of my other posts is
that what we ship, above all, can't be flaky. The lack of module
cleanup (intended to be addressed by PEP 3121), using a duplicate copy
of the python dynamic lib, and namespace black magic to achieve
independent interpreters are all examples that have made using python
for us much more challenging and time-consuming than we ever
anticipated.
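The "namespace black magic" pain point can be illustrated with a small sketch: faking independent interpreters by exec-ing code in separate namespace dicts. It isolates top-level variables, but leaks through modules, which are process-global in a single interpreter, which is one reason the workaround is fragile:

```python
# A minimal sketch of the "namespace black magic" workaround: emulating
# independent interpreters with separate namespace dicts.
ns_a, ns_b = {}, {}

exec("x = 1", ns_a)
exec("x = 2", ns_b)
# Top-level names really are isolated:
assert ns_a["x"] == 1 and ns_b["x"] == 2

# ...but imported modules are shared across both "interpreters":
exec("import math; math.tag = 'from_a'", ns_a)
exec("import math", ns_b)
assert ns_b["math"].tag == "from_a"  # state set via ns_a is visible in ns_b
```

True interpreter independence needs separate module state as well (the territory PEP 3121 was aimed at), not just separate global dicts.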

Again, if it turns out nothing can be done about our needs (which
appears more and more to be the case), I think it's important for
everyone here to consider the points raised here in the last week.
Moreover, realize that the python dev community really stands to gain
from making python usable as a tool (rather than a monolith). This
fact alone has caused lua to *rapidly* rise in popularity with
software companies looking to embed a powerful, lightweight
interpreter in their software.

As a python language fan and enthusiast, don't let lua win! (I say
this endearingly of course--I have the utmost respect for both
communities and I only want to see CPython be an attractive pick when
a company is looking to embed a language that won't intrude upon their
app's design).
Andy
Oct 24 '08 #28
Stefan Behnel wrote:
> Terry Reedy wrote:
>> Everything in DLLs is compiled C extensions. I see about 15 for Windows
>> 3.0.
>
> Ah, weren't those wonderful times back in the days of Win3.0, when
> DLL-hell was inhabited by only 15 libraries? *sigh*
>
> ... although ... wait, didn't Win3.0 have more than that already? Maybe
> you meant Windows 1.0?
>
> SCNR-ly,

Is that the equivalent of a smiley? or did you really not understand
what I wrote?

Oct 24 '08 #29

> The Global Interpreter Lock is fundamentally designed to make the
> interpreter easier to maintain and safer: Developers do not need to
> worry about other code stepping on their namespace. This makes things
> thread-safe, inasmuch as having multiple PThreads within the same
> interpreter space modifying global state and variables at once is,
> well, bad. A c-level module, on the other hand, can sidestep/release
> the GIL at will, and go on its merry way and process away.
....Unless part of the C module execution involves the need to do CPU-
bound work on another thread through a different python interpreter,
right? (even if the interpreter is 100% independent, yikes). For
example, have a python C module designed to programmatically generate
images (and video frames) in RAM for immediate and subsequent use in
animation. Meanwhile, we'd like to have a pthread with its own
interpreter with an instance of this module and have it dequeue jobs
as they come in (in fact, there'd be one of these threads for each
excess core present on the machine). As far as I can tell, it seems
CPython's current state can't do CPU-bound parallelization in the same
address space (basically, it seems that we're talking about the
"embarrassingly parallel" scenario raised in that paper). Why does it
have to be in the same address space? Convenience and simplicity--the
same reasons that most APIs let you hang yourself if the app does dumb
things with threads. Also, when the data sets that you need to send
to and from each process are large, using the same address space makes
more and more sense.
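The CPU-bound limitation being described is easy to reproduce. A small sketch (timings vary by machine and CPython build, so the comment makes no hard numeric claim):

```python
# Sketch of the GIL limitation discussed above: pure-Python CPU-bound
# work does not speed up with threads, because the GIL lets only one
# thread execute Python bytecode at a time.
import threading
import time

def spin(n, out):
    # Pure-Python busy work; never releases the GIL for long.
    while n:
        n -= 1
    out.append(True)

N = 2_000_000
done = []

t0 = time.perf_counter()
spin(N, done)
spin(N, done)
serial_s = time.perf_counter() - t0

t0 = time.perf_counter()
threads = [threading.Thread(target=spin, args=(N, done)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_s = time.perf_counter() - t0

# On a GIL build, threaded_s comes out roughly equal to serial_s,
# not serial_s / 2 -- the two threads take turns, they don't overlap.
```

A C extension can release the GIL around its own number crunching, but as soon as the work involves executing Python code, the serialization above applies.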

> So, just to clarify - Andy, do you want one interpreter, $N threads
> (e.g. PThreads) or the ability to fork multiple "heavyweight"
> processes?
Sorry if I haven't been clear, but we're talking the app starting a
pthread, making a fresh/clean/independent interpreter, and then being
responsible for its safety at the highest level (with the payoff of
each of these threads executing without hindrance). No different
than if you used most APIs out there where step 1 is always to make
and init a context object and the final step is always to destroy/take-
down that context object.
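The context-object lifecycle Andy describes can be modeled in plain Python. All names below are invented for illustration; the real proposal is about C-level interpreter handles, and this only shows the shape of the API (create context, use it, destroy it):

```python
# A hypothetical sketch (invented names) of the per-thread context
# lifecycle: each worker creates its own independent context, runs
# jobs against it, and tears it down -- the same shape as most C APIs.
import threading

class InterpreterContext:
    """Stand-in for a per-thread 'independent interpreter' handle."""

    def __init__(self):
        self.namespace = {}   # state private to this context
        self.alive = True

    def run(self, src):
        assert self.alive, "context already destroyed"
        exec(src, self.namespace)

    def destroy(self):
        self.namespace.clear()
        self.alive = False

results = {}  # distinct keys per worker, so no lock is needed here

def worker(name, jobs):
    ctx = InterpreterContext()      # step 1: create/init the context
    for job in jobs:
        ctx.run(job)                # jobs see only this context's state
    results[name] = ctx.namespace.get("total")
    ctx.destroy()                   # final step: tear down the context

t1 = threading.Thread(target=worker, args=("a", ["total = 1 + 2"]))
t2 = threading.Thread(target=worker, args=("b", ["total = 10 * 10"]))
t1.start(); t2.start(); t1.join(); t2.join()
```

In real CPython the two contexts would still share the GIL and module state; the whole thread is about making the real equivalent of this sketch possible.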

I'm a lousy writer sometimes, but I feel bad if you took the time to
describe threads vs processes. The only reason I raised IPC with my
"messaging isn't very attractive" comment was to respond to Glenn
Linderman's points regarding the tradeoffs of shared memory vs. not.
Andy

Oct 24 '08 #30
