Python reliability

Ville Voipio

I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

The piece of software is rather simple, probably a
few hundred lines of code in Python. There is a need
to interact with network using the socket module,
and then probably a need to do something hardware-
related which will get its own driver written in
C.

Threading and other more error-prone techniques can
be left aside, everything can run in one thread with
a poll loop.

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 9 '05 #1

Paul Rubin

Ville Voipio <vv*****@kosh.hut.fi> writes:

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

I would say give the app the heaviest stress testing that you can
before deploying it, checking carefully for leaks and crashes. I'd
say that regardless of the implementation language.

Oct 9 '05 #2

Steven D'Aprano

On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:

I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.
[snip]
The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

$ time python -c pass
real 0m0.164s
user 0m0.021s
sys 0m0.015s

If performance isn't an issue, your users may not even care about ten
times that delay even once an hour. In other words, build your software to
deal gracefully with restarts, and your users won't even notice or care if
it restarts.

I'm not saying that you will need to restart Python once an hour, or even
once a month. But if you did, would it matter? What's more important is
the state of the operating system. (I'm assuming that, with a year's uptime
in the requirements, you aren't even thinking of WinCE.)
--
Steven.

Oct 9 '05 #3

Paul Rubin

Steven D'Aprano <st***@REMOVETHIScyber.com.au> writes:

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

If you have to restart an application, every network peer connected to
it loses its connection. Think of a phone switch. Do you really want
your calls dropped every few hours of conversation time, just because
some lame application decided to restart itself? Phone switches go to
great lengths to keep running through both hardware failures and
software upgrades, without dropping any calls. That's the kind of
application it sounds like the OP is trying to run.

To the OP: besides Python you might also consider Erlang.

Oct 9 '05 #4

George Sakkis

Steven D'Aprano wrote:

On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:
I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

[snip]
The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

You must have missed or misinterpreted the "The software should be
running continuously for practically forever" part. The problem of
restarting python is not the 200 msec lost but putting at stake
reliability (e.g. for health monitoring devices, avionics, nuclear
reactor controllers, etc.) and robustness (e.g. a computation that
takes weeks of cpu time to complete is interrupted without the
possibility to restart from the point it stopped).

George

Oct 9 '05 #5

Neal Norwitz

Ville Voipio wrote:

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

Jp gave you the answer that he has done this.

I've spent quite a bit of time since 2.1 days trying to improve the
reliability. I think it has gotten much better. Valgrind is run on
(nearly) every release. We look for various kinds of problems. I try
to review C code for these sorts of problems etc.

There are very few known issues that can crash the interpreter. I
don't know of any memory leaks. socket code is pretty well tested and
heavily used, so you should be in fairly safe territory, particularly
on Unix.

n

Oct 10 '05 #6

Steven D'Aprano

George Sakkis wrote:

Steven D'Aprano wrote:

On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:

I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

[snip]

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

You must have missed or misinterpreted the "The software should be
running continuously for practically forever" part. The problem of
restarting python is not the 200 msec lost but putting at stake
reliability (e.g. for health monitoring devices, avionics, nuclear
reactor controllers, etc.) and robustness (e.g. a computation that
takes weeks of cpu time to complete is interrupted without the
possibility to restart from the point it stopped).

Er, no, I didn't miss that at all. I did miss that it
needed continual network connections. I don't know if
there is a way around that issue, although mobile
phones move in and out of network areas, swapping
connections when and as needed.

But as for reliability, well, tell that to Buzz Aldrin
and Neil Armstrong. The Apollo 11 moon lander rebooted
multiple times on the way down to the surface. It was
designed to recover gracefully when rebooting unexpectedly:

http://www.hq.nasa.gov/office/pao/Hi...1.1201-pa.html

I don't have an authoritative source for how many times
the computer rebooted during the landing, but it was
measured in the dozens. Calculations were performed in
an iterative fashion, with an initial estimate that was
improved over time. If a calculation was interrupted the
computer lost no more than one iteration.

I'm not saying that this strategy is practical or
useful for the original poster, but it *might* be. In a
noisy environment, it pays to design a system that can
recover transparently from a lost connection.

If your heart monitor can reboot in 200 ms, you might
miss one or two beats, but so long as you pick up the
next one, that's just noise. If your calculation takes
more than a day of CPU time to complete, you should
design it in such a way that you can save state and
pick it up again when you are ready. You never know
when the cleaner will accidentally unplug the computer...
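
As a rough sketch of the save-state idea above: an iterative calculation can write its state to disk after every step and resume from the last checkpoint after a restart. The JSON file format and the helper names (`save_state`, `load_state`, `run`) are assumptions made for illustration only; nothing here is taken from the Apollo design or from the original poster's system.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write state atomically: a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX file systems


def load_state(path, default):
    """Return the last checkpoint, or default if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, steps):
    """Refine an estimate of sqrt(2) by Newton's method, one checkpoint per step."""
    state = load_state(path, {"step": 0, "estimate": 1.0})
    while state["step"] < steps:
        x = state["estimate"]
        state = {"step": state["step"] + 1, "estimate": (x + 2.0 / x) / 2.0}
        save_state(path, state)  # a restart loses at most the step in progress
    return state
```

Calling `run(path, steps)` again after an interruption picks up at the stored step instead of starting over. Writing to a temporary file and renaming it is one common way to avoid a half-written checkpoint.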
--
Steven.

Oct 10 '05 #7

Ville Voipio

In article <7x************@ruckus.brouhaha.com>, Paul Rubin wrote:

I would say give the app the heaviest stress testing that you can
before deploying it, checking carefully for leaks and crashes. I'd
say that regardless of the implementation language.

Goes without saying. But I would like to be confident (or as
confident as possible) that all bugs are mine. If I use plain
C, I think this is the case. Of course, bad memory management
in the underlying platform will wreak havoc. I am planning to
use Linux 2.4.somethingnew as the OS kernel, and there I have
not experienced too many problems before.

Adding the Python interpreter adds one layer of uncertainty.
On the other hand, I am after the simplicity of programming
offered by Python.

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 10 '05 #8

Ville Voipio

In article <pa****************************@REMOVETHIScyber.com.au>,
Steven D'Aprano wrote:

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

Uhhh. Sounds like playing with Microsoft :) I know of a mission-critical
system which was restarted every week due to some memory leaks. If it
wasn't, it crashed after two weeks. Guess which platform...

$ time python -c pass
real 0m0.164s
user 0m0.021s
sys 0m0.015s

This is on the limit of being acceptable. I'd say that a one-second
time lag is the maximum. The system is a safety system after all,
and there will be a hardware watchdog to take care of odd crashes.
The software itself is stateless in the sense that its previous
state does not affect the next round. Basically, it is just checking
a few numbers over the network. Even the network connection is
stateless (single UDP packet pairs) to avoid TCP problems with
partial closings, etc.
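
A minimal sketch of one such stateless round, assuming a single request datagram answered by a single reply datagram. The function name `query`, the 4096-byte buffer and the one-second default timeout are illustrative assumptions, and `select.select` stands in here for whatever poll loop the real program would use.

```python
import select
import socket


def query(sock, addr, payload, timeout_s=1.0):
    """Send one datagram and wait for one reply; return None on timeout."""
    sock.sendto(payload, addr)
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None  # no answer in time: the caller treats this as a failure
    data, _sender = sock.recvfrom(4096)
    return data
```

Because each round is independent, a lost packet or a restart only costs one query. A real system would probably also check the sender address and a sequence number before trusting the reply.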

There are a gazillion things which may go wrong. A stray cosmic
ray may change the state of one bit in the wrong place of memory,
and that's it, etc. So, the system has to be able to recover from
pretty much everything. I will in any case build an independent
process which probes the state of the main process. However,
I hope it is never really needed.
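
One possible shape for such an independent probe, sketched under the assumption that the main process refreshes a heartbeat file once per round and a separate supervisor process checks its age. The file-based mechanism and the names `beat` and `is_stale` are hypothetical, chosen only to illustrate the idea.

```python
import os
import time


def beat(path):
    """Called by the main process once per round: refresh the file's mtime."""
    with open(path, "a"):
        pass  # create the file if it does not exist yet
    os.utime(path, None)


def is_stale(path, max_age_s, now=None):
    """Called by the supervisor: True if the heartbeat is missing or too old."""
    if now is None:
        now = time.time()
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        return True
    return age > max_age_s
```

The supervisor would restart the main process (or stop kicking the hardware watchdog) whenever `is_stale` returns True.
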

I'm not saying that you will need to restart Python once an hour, or even
once a month. But if you did, would it matter? What's more important is
the state of the operating system. (I'm assuming that, with a year's uptime
in the requirements, you aren't even thinking of WinCE.)

Not even in my worst nightmares! The platform will be an embedded
Linux computer running 2.4.somethingnew.

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 10 '05 #9

Paul Rubin

Ville Voipio <vv*****@kosh.hut.fi> writes:

Goes without saying. But I would like to be confident (or as
confident as possible) that all bugs are mine. If I use plain
C, I think this is the case. Of course, bad memory management
in the underlying platform will wreak havoc. I am planning to
use Linux 2.4.somethingnew as the OS kernel, and there I have
not experienced too many problems before.

You might be better off with a 2.6 series kernel. If you use Python
conservatively (be careful with the most advanced features, and don't
stress anything too hard) you should be ok. Python works pretty well
if you use it the way the implementers expected you to. Its
shortcomings are when you try to press it to its limits.

You do want reliable hardware with ECC and all that, maybe with multiple
servers and automatic failover. This site might be of interest:

http://www.linux-ha.org/

Oct 10 '05 #10

Steven D'Aprano

Ville Voipio wrote:

There are a gazillion things which may go wrong. A stray cosmic
ray may change the state of one bit in the wrong place of memory,
and that's it, etc. So, the system has to be able to recover from
pretty much everything. I will in any case build an independent
process which probes the state of the main process. However,
I hope it is never really needed.

If you have enough hardware grunt, you could think
about having three independent processes working in
parallel. They vote on their output, and best out of
three gets reported back to the user. In other words,
only if all three results are different does the device
throw its hands up in the air and say "I don't know!"

Of course, unless you are running each of them on an
independent set of hardware and OS, you really aren't
getting that much benefit. And then there is the
question, can you trust the voting mechanism... But if
this is so critical you are worried about cosmic rays,
maybe it is the way to go.
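
The best-out-of-three vote described above could be sketched as follows. The function name `vote` is an assumption, and the sketch compares results for exact equality, which only suits discrete outputs; measured values would need a tolerance instead.

```python
from collections import Counter


def vote(results):
    """Return the value at least two of three results agree on, else None."""
    if len(results) != 3:
        raise ValueError("expected exactly three results")
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None
```

A return value of None corresponds to the "I don't know!" case where all three results differ.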

If it is not a secret, what are you monitoring with
this device?
--
Steven.

Oct 10 '05 #11

Ville Voipio

In article <43**************@REMOVEMEcyber.com.au>, Steven D'Aprano wrote:

If you have enough hardware grunt, you could think
about having three independent processes working in
parallel. They vote on their output, and best out of
three gets reported back to the user. In other words,
only if all three results are different does the device
throw its hands up in the air and say "I don't know!"

Ok, I will give you a bit more information, so that the
situation is a bit clearer. (Sorry, I cannot tell you
the exact application.)

The system is a safety system which supervises several
independent measurements (two or more). The measurements
are carried out by independent measurement instruments
which have their independent power supplies, etc.

The application communicates with the independent
measurement instruments through the network. Each
instrument is queried for its measurement results and
status information regularly. If the results given
by different instruments differ more than a given
amount, then an alarm is set (relay contacts opened).

Naturally, in case of equipment malfunction, the
alarm is set. This covers a wide range of problems from
errors reported by the instrument to physical failures
or program bugs.
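
A minimal sketch of the comparison rule described above, written so that every doubtful case ends in an alarm. The function name `alarm_needed`, the use of None for an instrument that failed to report, and the treatment of non-finite values are assumptions made for illustration.

```python
import math


def alarm_needed(readings, tolerance):
    """Fail-safe check: True means set the alarm (open the relay contacts).

    readings holds one value per instrument, or None if it did not report.
    """
    try:
        if len(readings) < 2:
            return True  # nothing to compare against
        if any(r is None or not math.isfinite(r) for r in readings):
            return True  # missing or nonsensical data counts as a malfunction
        return not (max(readings) - min(readings) <= tolerance)
    except Exception:
        return True  # any unexpected error also raises the alarm
```

The check is phrased as "not within tolerance" rather than "outside tolerance" so that anything that fails to compare cleanly falls on the alarm side.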

The system has several weak spots. However, the basic
principle is simple: if anything goes wrong, start
yelling. A false alarm is costly, but not giving the
alarm when required is downright impossible.

I am not building a redundant system with independent
instruments voting. At this point I am trying to minimize
the false alarms. This is why I want to know if Python
is reliable enough to be used in this application.

By the postings I have seen in this thread it seems that
the answer is positive. At least if I do not try to
apply any adventurous programming techniques.

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 10 '05 #12

Ville Voipio

In article <7x************@ruckus.brouhaha.com>, Paul Rubin wrote:

You might be better off with a 2.6 series kernel. If you use Python
conservatively (be careful with the most advanced features, and don't
stress anything too hard) you should be ok. Python works pretty well
if you use it the way the implementers expected you to. Its
shortcomings are when you try to press it to its limits.

Just one thing: how reliable is the garbage collecting system?
Should I try to either not produce any garbage or try to clean
up manually?

You do want reliable hardware with ECC and all that, maybe with multiple
servers and automatic failover. This site might be of interest:

Well... Here the uptime benefit from using several servers is
not economically justifiable. I am right now at the phase of
trying to minimize the downtime with given hardware resources.
This is not flying; downtime does not kill anyone. I just want
to avoid choosing tools which belong more to the problem than
to the solution set.

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 10 '05 #13

Paul Rubin

Ville Voipio <vv*****@kosh.hut.fi> writes:

Just one thing: how reliable is the garbage collecting system?
Should I try to either not produce any garbage or try to clean
up manually?

The GC is a simple, manually-updated reference counting system
augmented with some extra contraption to resolve cyclic dependencies.
It's extremely easy to make errors with the reference counts in C
extensions, and either leak references (causing memory leaks) or
forget to add them (causing double-free crashes). The standard
libraries are pretty careful about managing references but if you're
using 3rd party C modules, or writing your own, then watch out.

There is no way you can avoid making garbage. Python conses
everything, even integers (small positive ones are cached). But I'd
say, avoid making cyclic dependencies, be very careful if you use the
less popular C modules or any 3rd party ones, and stress test the hell
out of your app while monitoring memory usage very carefully. If you
can pound it with as much traffic in a few hours as it's likely to see
in a year of deployment, without memory leaks or thread races or other
errors, that's a positive sign.
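
One way to watch memory during such a stress test, sketched with the standard `tracemalloc` module (which did not exist when this thread was written). The helper name `memory_growth` is an assumption, and `tracemalloc` only sees allocations made through Python's allocators, so a leak inside a C library that calls `malloc` directly would not show up here.

```python
import tracemalloc


def memory_growth(func, rounds):
    """Return net bytes of traced allocations retained after repeated calls."""
    func()  # warm-up so one-time caches are not counted as a leak
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(rounds):
            func()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A result that keeps growing in proportion to `rounds` suggests a leak; a result that stays near zero is the positive sign mentioned above.
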

Well... Here the uptime benefit from using several servers is not
economically justifiable. I am right now at the phase of trying to
minimize the downtime with given hardware resources. This is not
flying; downtime does not kill anyone. I just want to avoid choosing
tools which belong more to the problem than to the solution set.

You're probably ok with Python in this case.

Oct 10 '05 #14

Max M

Ville Voipio wrote:

In article <7x************@ruckus.brouhaha.com>, Paul Rubin wrote:

I would say give the app the heaviest stress testing that you can
before deploying it, checking carefully for leaks and crashes. I'd
say that regardless of the implementation language.

Goes without saying. But I would like to be confident (or as
confident as possible) that all bugs are mine. If I use plain
C, I think this is the case. Of course, bad memory management
in the underlying platform will wreak havoc.

Python isn't perfect, but I do believe that it is as good as the best of
the major "standard" systems out there.

You will have *far* greater chances of introducing errors yourself by
coding in C, than you will encounter in Python.

You can see the bugs fixed in recent versions, and see for yourself
whether they would have crashed your system. That should be an indicator:

http://www.python.org/2.4.2/NEWS.html
--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Oct 10 '05 #15

Tom Anderson

On Mon, 10 Oct 2005, it was written:

Ville Voipio <vv*****@kosh.hut.fi> writes:

Just one thing: how reliable is the garbage collecting system? Should I
try to either not produce any garbage or try to clean up manually?

The GC is a simple, manually-updated reference counting system augmented
with some extra contraption to resolve cyclic dependencies. It's
extremely easy to make errors with the reference counts in C extensions,
and either leak references (causing memory leaks) or forget to add them
(causing double-free crashes).

Has anyone looked into using a real GC for python? I realise it would be a
lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
still need some way of figuring out which variables in C-land held
pointers to objects; if anything, that might be harder, unless you want to
impose a horrendous JAI-like bondage-and-discipline interface.

There is no way you can avoid making garbage. Python conses everything,
even integers (small positive ones are cached).

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
Fair enough - the performance gain is nice, but the extra complexity would
be a huge pain, i imagine.

tom

--
Fitter, Happier, More Productive.

Oct 10 '05 #16

Aahz

In article <Pi*******************************@urchin.earth.li>,
Tom Anderson <tw**@urchin.earth.li> wrote:

Has anyone looked into using a real GC for python? I realise it would be a
lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
still need some way of figuring out which variables in C-land held
pointers to objects; if anything, that might be harder, unless you want to
impose a horrendous JAI-like bondage-and-discipline interface.

Bingo! There's a reason why one Python motto is "Plays well with
others".
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur." --Red Adair

Oct 10 '05 #17

Mike Meyer

Tom Anderson <tw**@urchin.earth.li> writes:

Has anyone looked into using a real GC for python? I realise it would
be a lot more complexity in the interpreter itself, but it would be
faster, more reliable, and would reduce the complexity of extensions.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
still need some way of figuring out which variables in C-land held
pointers to objects; if anything, that might be harder, unless you
want to impose a horrendous JAI-like bondage-and-discipline interface.

Wouldn't necessarily be faster, either. I rewrote a program that
built a static data structure of a couple of hundred thousand objects
and then went traipsing through that while generating a few hundred
objects in a compiled language with a real garbage collector. The
resulting program ran about an order of magnitude slower than the
Python version.

Profiling revealed that it was spending 95% of its time in the
garbage collector, marking and sweeping that large data structure.

There's lots of research on dealing with this problem, as my usage
pattern isn't unusual - just a little extreme. Unfortunately, none of
them were applicable to compiled code without a serious performance
impact on pretty much everything. Those could probably be used in
Python without a problem.

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Oct 10 '05 #18

Thomas Bartkus

"Ville Voipio" <vv*****@kosh.hut.fi> wrote in message
news:slrndkka7r.62en.vv*****@kosh.hut.fi...

In article <7x************@ruckus.brouhaha.com>, Paul Rubin wrote:

<snip>

Adding the Python interpreter adds one layer of uncertainty.
On the other hand, I am after the simplicity of programming
offered by Python.

<snip>

I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

<snip>

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

<snip>

All in all, it would seem that the reliability of the Python run time is the
least of your worries. The best multi-tasking operating systems do a good
job of segregating different processes BUT what multitasking operating
system meets the standard you request in that last paragraph? Assuming that
the Python interpreter itself is robust enough to meet that standard, what
about that other 99% of everything else that is competing with your Python
script for cpu, memory, and other critical resources? Under ordinary Linux,
your Python script will be interrupted frequently and regularly by processes
entirely outside of Python's control.

You may not want a multitasking OS at all but rather a single tasking OS
where nothing happens that isn't 100% under your program control. Or if you
do need a multitasking system, you probably want something designed for the
type of rugged use you are demanding. I would google "embedded systems".
If you want to use Python/Linux, I might suggest you search "Embedded
Linux".

And I wouldn't be surprised if some dedicated microcontrollers aren't
showing up with Python capability. In any case, it would seem you need more
control than a Python interpreter would receive when running under Linux.

Good Luck.
Thomas Bartkus

Oct 10 '05 #19

Paul Rubin

Tom Anderson <tw**@urchin.earth.li> writes:

Has anyone looked into using a real GC for python? I realise it would
be a lot more complexity in the interpreter itself, but it would be
faster, more reliable, and would reduce the complexity of extensions.

The next PyPy sprint (this week I think) is going to focus partly on GC.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
still need some way of figuring out which variables in C-land held
pointers to objects; if anything, that might be harder, unless you
want to impose a horrendous JAI-like bondage-and-discipline interface.

I'm not sure what JAI is (do you mean JNI?) but you might look at how
Emacs Lisp does it. You have to call a macro to protect intermediate
heap results in C functions from being GC'd, so it's possible to make
errors, but it cleans up after itself and is generally less fraught
with hazards than Python's method is.

Oct 10 '05 #20

Peter Hansen

Ville Voipio wrote:

I am not building a redundant system with independent
instruments voting. At this point I am trying to minimize
the false alarms. This is why I want to know if Python
is reliable enough to be used in this application.

By the postings I have seen in this thread it seems that
the answer is positive. At least if I do not try to
apply any adventurous programming techniques.

We built a system with similar requirements using an older version of
Python (either 2.0 or 2.1 I believe). A number of systems were shipped
and operate without problems. We did have a memory leak issue in an
early version and spent ages debugging it (and actually implemented the
suggested "reboot when necessary" feature as a stop-gap measure at one
point), before finally discovering it. Then, knowing what to search
for, we quickly found that the problem had been fixed in CVS for the
Python version we were using, and actually released in the subsequent
major revision. (The leak involved extending empty lists, or extending
lists with empty lists, as I recall.)

Other than that, we had no real issues and definitely felt the choice of
Python was completely justified. I have no hesitation recommending it,
other than to caution (as I believe Paul R did) that use of new features
is "dangerous" in that they won't have as wide usage and shouldn't
always be considered "proven" in long-term field use, by definition.

Another suggestion would be to carefully avoid cyclic references (if the
app is simple enough for this to be feasible), allowing you to rely on
reference-counting for garbage collection and the resultant "more
deterministic" behaviour.
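
A small experiment illustrating the point about cycles, assuming CPython (other implementations may reclaim objects at different times). The `Node` class and the helper name `freed_without_gc` are made up for this sketch; the cycle collector is switched off temporarily so that only reference counting is at work.

```python
import gc
import weakref


class Node:
    """A plain object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


def freed_without_gc(make_cycle):
    """Return True if the object is reclaimed by reference counting alone."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # rule out the cycle collector for this experiment
    try:
        a, b = Node(), Node()
        a.other = b
        if make_cycle:
            b.other = a  # a -> b -> a keeps both reference counts above zero
        probe = weakref.ref(a)
        del a, b
        return probe() is None
    finally:
        if gc_was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind
```

Without the cycle the object disappears as soon as the last name is deleted; with the cycle it stays alive until the cycle collector runs, which is the less deterministic behaviour the advice above tries to avoid.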

Also test heavily. We were using test-driven development and had
effectively thousands of hours of run-time by the time the first system
shipped, so we had great confidence in it.

-Peter

Oct 11 '05 #21

Ville Voipio

In article <A_********************@telcove.net>, Thomas Bartkus wrote:

All in all, it would seem that the reliability of the Python run time is the
least of your worries. The best multi-tasking operating systems do a good
job of segregating different processes BUT what multitasking operating
system meets the standard you request in that last paragraph?

Well, let's put it this way. I have seen many computers running
Linux with a high load of this and that (web services, etc.) with
uptimes of years. I have not seen any recent Linux crash without
faulty hardware or drivers.

If using Python does not add significantly to the level of
unreliability, then I can use it. If it adds, then I cannot
use it.

type of rugged use you are demanding. I would google "embedded systems".
If you want to use Python/Linux, I might suggest you search "Embedded
Linux".

I am an embedded system designer by my profession :) Both hardware
and software for industrial instruments. Computers are just a
side effect of nicer things.

But here I am looking into the possibility of making something
with embedded PC hardware (industrial PC/104 cards). The name of
the game is "as good as possible with the given amount of money".
In that respect this is not flying or shooting. If something goes
wrong, someone loses a bunch of dollars, not their life.

I think that in this game Python might be handy when it comes to
maintainability and legibility (vs. C). But choosing a tool which
is known to be bad for the task is not a good idea.

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 11 '05 #22

Ville Voipio

In article <te********************@powergate.ca>, Peter Hansen wrote:

Other than that, we had no real issues and definitely felt the choice of
Python was completely justified. I have no hesitation recommending it,
other than to caution (as I believe Paul R did) that use of new features
is "dangerous" in that they won't have as wide usage and shouldn't
always be considered "proven" in long-term field use, by definition.

Thank you for this information. Of course, we try to be as conservative
as possible. The application fortunately allows for this; cyclic
references and new features can most probably be avoided.

Also test heavily. We were using test-driven development and had
effectively thousands of hours of run-time by the time the first system
shipped, so we had great confidence in it.

Yes, it is usually much nicer to debug the software in the quiet,
air-conditioned lab than somewhere in a jungle on the other side
of the globe with an extremely angry customer next to you...

- Ville

--
Ville Voipio, Dr.Tech., M.Sc. (EE)

Oct 11 '05 #23

John Waycott

Ville Voipio wrote:

In article <A_********************@telcove.net>, Thomas Bartkus wrote:

All in all, it would seem that the reliability of the Python run time is the
least of your worries.

I agree - design of the application, keeping it simple and testing it
thoroughly is more important for reliability than implementation
language. Indeed, I'd argue that in many cases you'd have better
reliability using Python over C because of easier maintainability and
higher-level data constructs.

Well, let's put it this way. I have seen many computers running
Linux with a high load of this and that (web services, etc.) with
uptimes of years. I have not seen any recent Linux crash without
faulty hardware or drivers.

If using Python does not add significantly to the level of
unreliability, then I can use it. If it adds, then I cannot
use it.

I wrote a simple Python program that acts as a buffer between a
transaction network and a database server, writing the transaction logs
to a file that the database reads the next day for billing. The simple
design decoupled the database from the network so it wasn't stressed during
high-volume times. The two systems (one for redundancy) that run the
Python program have been running for six years.

-- John Waycott

Oct 11 '05 #24

Alex Martelli

Tom Anderson <tw**@urchin.earth.li> wrote:
...

Has anyone looked into using a real GC for python? I realise it would be a

If you mean mark-and-sweep, with generational twists, that's what gc
uses for cyclic garbage.

lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

??? It adds no complexity (it's already there), it's slower, it is, if
anything, LESS reliable than reference counting (which is way simpler!),
and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose). Are we talking about the same thing?!

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
Fair enough - the performance gain is nice, but the extra complexity would
be a huge pain, i imagine.

CPython currently is implemented on a strict "minimize all tricks"
strategy. There are several other implementations of the Python
language, which may target different virtual machines -- Jython for JVM,
IronPython for MS-CLR, and (less mature) stuff for the Parrot VM, and
others yet from the pypy project. Each implementation may use whatever
strategy is most appropriate for the VM it targets, of course -- this is
the reason behind Python's refusal to strictly specify GC semantics
(exactly WHEN some given garbage gets collected)... allow such multiple
implementations leeway in optimizing behavior for the target VM(s).

Alex

Oct 11 '05 #25

Paul Rubin

al***@mail.comcast.net (Alex Martelli) writes:

Has anyone looked into using a real GC for python? ...
lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

??? It adds no complexity (it's already there), it's slower, it is, if
anything, LESS reliable than reference counting (which is way simpler!),
and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose). Are we talking about the same thing?!

I've done it both ways and it seems to me that a simple mark/sweep gc
does require a lump of complexity in one place, but Python has that
anyway to deal with cyclic garbage. Once the gc module is there, then
extensions really do seem to be simpler to write. Having extensions
know about the gc is no harder than having them maintain reference
counts, in fact it's easier, they have to register new objects with
the gc (by pushing onto a stack) but can remove them all in one go.
Take a look at how Emacs Lisp does it. Extensions are easy to write.

Oct 11 '05 #26

Jorgen Grahn

On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson <tw**@urchin.earth.li> wrote:

On Mon, 10 Oct 2005, it was written:

....

There is no way you can avoid making garbage. Python conses everything,
even integers (small positive ones are cached).

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?

If the SmallInteger hack is something like this, it does:

>>> a = 42
>>> b = 42
>>> a is b
True
>>> a = 42000
>>> b = 42000
>>> a is b
False

.... which I guess is what is referred to above as "small positive
ones are cached".
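
A quick way to reproduce the caching discussed here, as a minimal sketch. It assumes CPython, where a small range of integers (currently -5 through 256) is preallocated and shared; that is an implementation detail and not something the language promises. The helper name `same_object` is made up for illustration. The values are built with int() at run time so that the compiler cannot merge two equal literals into one constant.

```python
def same_object(text):
    """Build the same integer twice and report whether both results
    are one and the same object (identity, not equality)."""
    a = int(text)
    b = int(text)
    return a is b


# In CPython, 42 comes out of the small-integer cache both times.
small_shared = same_object("42")

# 42000 is outside the cache, so each int() call allocates a new object.
large_shared = same_object("42000")

print(small_shared, large_shared)
```

Equality (`==`) holds in both cases; only identity differs, which is why `is` should not be used to compare numbers in real code.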

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!

Oct 12 '05 #27

Peter Hansen

John Waycott wrote:

I wrote a simple Python program that acts as a buffer between a
transaction network and a database server, writing the transaction logs
to a file that the database reads the next day for billing. The simple
design decoupled the database from the network so it wasn't stressed during
high-volume times. The two systems (one for redundancy) that run the
Python program have been running for six years.

Six years? With no downtime at all for the server? That's a lot of
"9s" of reliability...

Must still be using Python 1.5.2 as well...

-Peter

Oct 12 '05 #28

Tom Anderson

On Wed, 12 Oct 2005, Jorgen Grahn wrote:

On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson <tw**@urchin.earth.li> wrote:
On Mon, 10 Oct 2005, it was written:

...
There is no way you can avoid making garbage. Python conses everything,
even integers (small positive ones are cached).

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?

If the SmallInteger hack is something like this, it does:
>>> a = 42
>>> b = 42
>>> a is b
True
>>> a = 42000
>>> b = 42000
>>> a is b
False

... which I guess is what is referred to above as "small positive
ones are cached".

That's not what i meant.

In both smalltalk and python, every single variable contains a reference
to an object - there isn't the object/primitive distinction you find in
less advanced languages like java.

Except that in smalltalk, this isn't true: in ST, every variable *appears*
to contain a reference to an object, but implementations may not actually
work like that. In particular, SmallTalk 80 (and some earlier smalltalks,
and all subsequent smalltalks, i think) handles small integers (those that
fit in wordsize-1 bits) differently: all variables contain a word, whose
bottom bit is a tag bit; if it's one, the word is a genuine reference, and
if it's zero, the top bits of the word contain a signed integer. The
innards of the VM know about this (where it matters), and do the right
thing. All this means that small (well, smallish - up to a billion!)
integers can be handled with zero heap space and much reduced instruction
counts. Of course, it means that references are more expensive, since they
have to be checked for integerness before dereferencing, but since this is
a few instructions at most, and since small integers account for a huge
fraction of the variables in most programs (as loop counters, array
indices, truth values, etc), this is a net win.

See the section 'Representation of Small Integers' in:

http://users.ipa.net/~dwighth/smallt...ObjectMemory26

The precise implementation is sneaky - the tag bit for an integer is zero,
so in many cases you can do arithmetic directly on the word, with a few
judicious shifts here and there; the tag bit for a pointer is one, and the
pointer is stored in two's-complement form *with the bottom bit in the
same place as the tag bit*, so you can recover a full-length pointer from
the word by complementing the whole thing, rather than having to shift.
Since pointers are word-aligned, the bottom bit is always a zero, so in
the complement it's always a one, so it can also be the status bit!

I think this came from LISP initially (most things do) and was probably
invented by Guy Steele (most things were).

tom

--
That's no moon!

Oct 12 '05 #29

Tom Anderson

On Mon, 10 Oct 2005, it was written:

Tom Anderson <tw**@urchin.earth.li> writes:
Has anyone looked into using a real GC for python? I realise it would
be a lot more complexity in the interpreter itself, but it would be
faster, more reliable, and would reduce the complexity of extensions.

The next PyPy sprint (this week I think) is going to focus partly on GC.

Good stuff!

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
still need some way of figuring out which variables in C-land held
pointers to objects; if anything, that might be harder, unless you want
to impose a horrendous JAI-like bondage-and-discipline interface.

I'm not sure what JAI is (do you mean JNI?)

Yes. Excuse the braino - JAI is Java Advanced Imaging, a component whose
horribleness exceeds even that of JNI, hence the confusion.

but you might look at how Emacs Lisp does it. You have to call a macro
to protect intermediate heap results in C functions from being GC'd, so it's
possible to make errors, but it cleans up after itself and is generally
less fraught with hazards than Python's method is.

That makes a lot of sense.

tom

--
That's no moon!

Oct 12 '05 #30

jepler

On Mon, Oct 10, 2005 at 08:37:03PM +0100, Tom Anderson wrote:

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
Fair enough - the performance gain is nice, but the extra complexity would
be a huge pain, i imagine.

I tried to implement this once. There was not a performance gain for general
code, and I also made the interpreter buggy in the process.

I wrote in 2002:
| Many Lisp interpreters use 'tagged types' to, among other things, let
| small ints reside directly in the machine registers.
|
| Python might wish to take advantage of this by designating pointers to odd
| addresses to stand for integers according to the following relationship:
| p = (i<<1) | 1
| i = (p>>1)
| (due to alignment requirements on all common machines, all valid
| pointers-to-struct have 0 in their low bit) This means that all integers
| which fit in 31 bits can be stored without actually allocating or deallocating
| anything.
|
| I modified a Python interpreter to the point where it could run simple
| programs. The changes are unfortunately very invasive, because they
| make any C code which simply executes
| o->ob_type
| or otherwise dereferences a PyObject* invalid when presented with a
| small int. This would obviously affect a huge amount of existing code in
| extensions, and is probably enough to stop this from being implemented
| before Python 3000.
|
| This also introduces another conditional branch in many pieces of code, such
| as any call to PyObject_TypeCheck().
|
| Performance results are mixed. A small program designed to test the
| speed of all-integer arithmetic comes out faster by 14% (3.38 vs 2.90
| "user" time on my machine) but pystone comes out 5% slower (14124 vs 13358
| "pystones/second").
|
| I don't know if anybody's barked up this tree before, but I think
| these results show that it's almost certainly not worth the effort to
| incorporate this "performance" hack in Python. I'll keep my tree around
| for awhile, in case anybody else wants to see it, but beware that it
| still has serious issues even in the core:
| >>> 0+0j
| Traceback (most recent call last):
| File "<stdin>", line 1, in ?
| TypeError: unsupported operand types for +: 'int' and 'complex'
| >>> (0).__class__
| Segmentation fault
|
|
http://mail.python.org/pipermail/pyt...st/027685.html

Note that the tree where I worked on this is long since lost.
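
The encoding quoted above can be sketched in Python, purely as a simulation. Python has no raw machine words, so a 32-bit word is modelled here as a non-negative int, and the names `tag_int`, `untag_int` and `is_tagged_int` are hypothetical; they are not part of CPython or of the lost patch.

```python
WORD_BITS = 32                      # assumed word size for the simulation
WORD_MASK = (1 << WORD_BITS) - 1
SMALL_MIN = -(1 << (WORD_BITS - 2))     # smallest value that fits in 31 bits
SMALL_MAX = (1 << (WORD_BITS - 2)) - 1  # largest value that fits in 31 bits


def is_tagged_int(word):
    """Aligned pointers have a 0 low bit, so a 1 low bit marks an integer."""
    return word & 1 == 1


def tag_int(i):
    """p = (i << 1) | 1, truncated to one machine word."""
    if not SMALL_MIN <= i <= SMALL_MAX:
        raise OverflowError("value does not fit in a tagged word")
    return ((i << 1) | 1) & WORD_MASK


def untag_int(word):
    """i = p >> 1, using an arithmetic (sign-preserving) shift."""
    if not is_tagged_int(word):
        raise ValueError("word is a pointer, not a tagged integer")
    if word & (1 << (WORD_BITS - 1)):   # reinterpret the word as signed
        word -= 1 << WORD_BITS
    return word >> 1


print(tag_int(21))                  # 43
print(untag_int(tag_int(-3)))       # -3
print(is_tagged_int(0x1000))        # False: looks like an aligned pointer
```

Every consumer of a word has to call something like `is_tagged_int` before dereferencing, which is the extra conditional branch the post complains about.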

Jeff

Oct 12 '05 #31

Tom Anderson

On Tue, 11 Oct 2005, Alex Martelli wrote:

Tom Anderson <tw**@urchin.earth.li> wrote:
...
Has anyone looked into using a real GC for python? I realise it would be a
If you mean mark-and-sweep, with generational twists,

Yes, more or less.

that's what gc uses for cyclic garbage.

Do you mean what python uses for cyclic garbage? If so, i hadn't realised
that. There are algorithms for extending refcounting to cyclic structures
(i forget the details, but you sort of go round and experimentally
decrement an object's count and see if it ends up with a negative count or
something), so i assumed python used one of those. Mind you, those are
probably more complex than mark-and-sweep!

lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

??? It adds no complexity (it's already there), it's slower,

Ah. That would be why all those java, .net, LISP, smalltalk and assorted
other VMs out there, with decades of development, hojillions of dollars
and the serried ranks of some of the greatest figures in computer science
behind them all use reference counting rather than garbage collection,
then.

No, wait ...

it is, if anything, LESS reliable than reference counting (which is way
simpler!),

Reliability is a red herring - in the absence of ill-behaved native
extensions, and with correct implementations, both refcounting and GC are
perfectly reliable. And you can rely on the implementation being correct,
since any incorrectness will be detected very quickly!

and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose).

Lucky those existing C libraries were written to use python's refcounting!

Oh, you have to write a wrapper round the library to interface with the
automatic memory management? Well, as it happens, the stuff you need to do
is more or less identical for refcounting and GC - the extension has to
tell the VM which of the VM's objects it holds references to, so that the
VM knows that they aren't garbage.

Are we talking about the same thing?!

Doesn't look like it, does it?

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
Fair enough - the performance gain is nice, but the extra complexity would
be a huge pain, i imagine.

CPython currently is implemented on a strict "minimize all tricks"
strategy.

A very, very sound principle. If you have the aforementioned decades,
hojillions and serried ranks, an all-tricks-turned-up-to-eleven strategy
can be made to work. If you're a relatively small non-profit outfit like
the python dev team, minimising tricks buys you reliability and agility,
which is, really, what we all want.

tom

--
That's no moon!

Oct 13 '05 #32

Fredrik Lundh

Tom Anderson wrote:

In both smalltalk and python, every single variable contains a reference
to an object - there isn't the object/primitive distinction you find in
less advanced languages like java.

Except that in smalltalk, this isn't true: in ST, every variable *appears*
to contain a reference to an object, but implementations may not actually
work like that.

Python implementations don't have to work that way either. Please don't
confuse "the Python language" with "the CPython implementation" and with
other implementations (existing as well as hypothetical).

(fwiw, switching to tagging in CPython would break most about everything.
might as well start over, and nobody's likely to do that to speed up integer-
dominated programs a little...)

</F>

Oct 13 '05 #33

Paul Rubin

"Fredrik Lundh" <fr*****@pythonware.com> writes:

(fwiw, switching to tagging in CPython would break most about
everything. might as well start over, and nobody's likely to do
that to speed up integer-dominated programs a little...)

Yeah, a change of that magnitude in CPython would be madness, but
the question is well worth visiting for PyPy.

Oct 13 '05 #34

Alex Martelli

Tom Anderson <tw**@urchin.earth.li> wrote:

On Tue, 11 Oct 2005, Alex Martelli wrote:
Tom Anderson <tw**@urchin.earth.li> wrote:
...
Has anyone looked into using a real GC for python? I realise it would be a
If you mean mark-and-sweep, with generational twists,

Yes, more or less.
that's what gc uses for cyclic garbage.

Do you mean what python uses for cyclic garbage? If so, i hadn't realised

Yes, gc (a standard library module) gives you access to the mechanism
(to some reasonable extent).

that. There are algorithms for extending refcounting to cyclic structures
(i forget the details, but you sort of go round and experimentally
decrement an object's count and see if it ends up with a negative count or
something), so i assumed python used one of those. Mind you, those are
probably more complex than mark-and-sweep!

Not sure about that, when you consider the "generational twists", but
maybe.

lot more complexity in the interpreter itself, but it would be faster,
more reliable, and would reduce the complexity of extensions.

??? It adds no complexity (it's already there), it's slower,

Ah. That would be why all those java, .net, LISP, smalltalk and assorted
other VMs out there, with decades of development, hojillions of dollars
and the serried ranks of some of the greatest figures in computer science
behind them all use reference counting rather than garbage collection,
then.

No, wait ...

Not everybody agrees that "practicality beats purity", which is one of
Python's principles. A strategy based on PURE reference counting just
cannot deal with cyclic garbage -- you'd also need the kind of kludges
you refer to above, or a twin-barreled system like Python's. A strategy
based on PURE mark-and-sweep *CAN* be complete and correct... at the
cost of horrid delays, of course, but what's such a practical
consideration to a real purist?-)
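
The "twin-barreled" arrangement can be observed from Python itself. The following is a minimal sketch that assumes CPython's behaviour (immediate reclamation by reference counting, plus the gc module for cycles); the class `Node` and the helper `make_cycle` are invented for the demonstration.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""


def make_cycle():
    """Create two nodes that refer to each other and return only a
    weak reference, so nothing outside the cycle keeps them alive."""
    a = Node()
    b = Node()
    a.other = b
    b.other = a
    return weakref.ref(a)


gc.disable()            # keep the cycle collector from running on its own
try:
    gc.collect()        # start from a clean slate

    # Acyclic garbage: reference counting frees it at once.
    n = Node()
    probe = weakref.ref(n)
    del n
    freed_by_refcount = probe() is None

    # Cyclic garbage: the counts never reach zero by themselves.
    cycle_probe = make_cycle()
    alive_before_collect = cycle_probe() is not None
    found = gc.collect()            # run the cycle detector explicitly
    alive_after_collect = cycle_probe() is not None
finally:
    gc.enable()

print(freed_by_refcount, alive_before_collect, alive_after_collect)
```

With the collector disabled, the cycle would simply stay in memory, which is the leak that pure reference counting cannot avoid.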

In practice, more has probably been written about garbage collection
implementations than about almost every issue in CS (apart from sorting
and searching;-). Good techniques need to be "incremental" -- the need
to "stop the world" for unbounded amounts of time (particularly in a
paged virtual memory world...), typical of pure m&s (even with
generational twists), is simply unacceptable in all but the most "batch"
type of computations, which occupy a steadily narrowing niche.
Reference counting is intrinsically "reasonably incremental"; the
worst-case of very long singly-linked lists (such that a dec-to-0 at the
head causes a cascade of N dec-to-0's all along) is as rare in Python as
it is frequent in LISP (and other languages that go crazy with such
lists -- Haskell, which defines *strings* as single linked lists of
characters, being a particularly egregious example) [[admittedly, the
techniques for amortizing the cost of such worst-cases are well known in
any case, though CPython has not implemented them]].

In any case, if you like Python (which is a LANGUAGE, after all) and
don't like one implementation of it, why not use a different
implementation, which uses a different virtual machine? Jython, for the
JVM, and IronPython, for MSCLR (presumably what you call ".net"), are
quite usable; project pypy is producing others (an implementation based
on Common LISP was one of the first practical results, over a year ago);
not to count Parrot, and other projects yet...

it is, if anything, LESS reliable than reference counting (which is way
simpler!),

Reliability is a red herring - in the absence of ill-behaved native
extensions, and with correct implementations, both refcounting and GC are
perfectly reliable. And you can rely on the implementation being correct,
since any incorrectness will be detected very quickly!

Not necessarily: tiny memory leaks in supposedly "stable" versions of
the JVM, for example, which get magnified in servers operating for
extremely long times and on very large scales, keep turning up. So, you
can't count on subtle and complicated implementations of garbage
collection algorithms being correct, any more than you can count on that
for (for example) subtle and complicated optimizations -- corner cases
can be hidden everywhere.

There are two ways to try to make a software system reliable: make it so
simple that it obviously has no bugs, or make it so complicated that it
has no obvious bugs. RC is definitely tilted towards the first of the
two options (and so would be mark-and-sweep in the pure form, the one
where you may need to stop everything for a LONG time once in a while),
while more sophisticated GC schemes get more and more complicated.

BTW, RC _IS_ a form of GC, just like, say, MS is.

and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose).

Lucky those existing C libraries were written to use python's refcounting!

Oh, you have to write a wrapper round the library to interface with the
automatic memory management? Well, as it happens, the stuff you need to do
is more or less identical for refcounting and GC - the extension has to
tell the VM which of the VM's objects it holds references to, so that the
VM knows that they aren't garbage.

Ah, but there is an obvious difference, when we're comparing reference
counting with mark and sweep in similarly simple incarnations: reference
counting has no "sweep" phase! M&S relies on any memory area not
otherwise accounted for being collectable during the "sweep" part, while
RC will intrinsically and happily leave alone any memory area it does
not know about. Adding sophistication to M&S often makes things even
more ticklish, if there are "random" pieces of memory which must be
hands-off -- the existing C library you're interfacing may give you no
idea, on an API-accessible level, where such internal "random" pieces
might be at any time. E.g., said existing libraries might be returning
to you "opaque handles" -- say you know they're pointers (already you're
having to breach the encapsulation and abstraction of the library you're
interfacing...), but pointers to WHAT? To structures which may
internally hold other pointers yet -- and what do THOSE point to...?

By ``handwaving'' about "the VM's objects" you imply a distinction
between such "objects" and other generic "areas of memory" that may not
be easy to maintain. In RC, no problem: the reference-counting
operations intrinsically discriminate (you don't addref or decref to
anything but such an "object"). In MS, the problem is definitely there;
your VM's allocator needs to be able to control the "sweep", which may
require quite a lot of extra overhead if it can't just assume it "owns"
all of the memory.
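
The difference can be made concrete with a toy collector. This is only a sketch under strong assumptions: the "heap" is a dict mapping made-up object names to the names they reference, and `mark_and_sweep` is a textbook stop-the-world version, not the algorithm of any real VM. The entry called "handle" stands for memory that only an external C library still points to.

```python
def mark_and_sweep(heap, roots):
    """Mark everything reachable from roots, then sweep (delete) the rest.
    Returns the set of names that were swept."""
    marked = set()
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in marked:
            continue
        marked.add(name)
        pending.extend(heap[name])
    swept = set(heap) - marked
    for name in swept:
        del heap[name]
    return swept


def demo_heap():
    # "a" -> "b" is live data; "c" <-> "d" is a garbage cycle;
    # "handle" is referenced only from inside an external library.
    return {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "handle": []}


# The collector was never told about the library's reference.
swept_unregistered = mark_and_sweep(demo_heap(), roots=["a"])

# The wrapper registered the handle as a root.
swept_registered = mark_and_sweep(demo_heap(), roots=["a", "handle"])

print(sorted(swept_unregistered))   # ['c', 'd', 'handle']
print(sorted(swept_registered))     # ['c', 'd']
```

The sweep reclaims anything it cannot account for, so a reference it was never told about becomes a dangling pointer; a reference counter in the same situation would just never touch that memory.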

Are we talking about the same thing?!

Doesn't look like it, does it?

Apparently not. Most of my "production-level" implementations of
garbage collection schemes hark back to the late '70s (as part of my
thesis) and early '80s (working at Texas Instruments to architect a
general purpose CPU with some kind of GC support in hardware); after
leaving the field, when I got back to it, years later, I essentially
found out that the universality of paged virtual memory had changed
every single parameter in the game. I did some work on
pointer-swizzling "incremental sort-of-compacting" collectors, and
conservative M&S a la Boehm, but by that time I was more interested in
real-world applications and none of those efforts ever yielded anything
practical and production-quality -- so, I either used existing libraries
and VMs (and more often than not cursed at them -- and, generally, the
FFIs whose gyrations I had to go through to work with them), or, when I
was implementing GC in my applications, relied on simple and solid
techniques such as variants on reference-counting, arenas, etc etc.

The crusher was a prototype application based on Microsoft's CLR (or
".net", as you call it) which needed to use MSCLR's ``advanced, modern,
sophisticated'' GC for many new parts, and "unmanaged" mode for a lot of
existing "legacy" libraries and subsystems. I won't say that it wasted
a year of my life, because I was leading that effort at about
half-time... so, it only wasted HALF a year of my life!-) Of course,
that was in the bad dark ages of about 5 years ago -- I'm sure that by
now everything is perfect and flawless and the experience of the
previous 25 years is thereby nullified, right?-)

From your tone I assume that your experience in implementing and
cooperating with modern, "advanced" GC techniques is much fresher and
more successful than mine. Personally, I'm just happy that other Python
developers must clearly have scars similar to mine in these respects, so
that Python's implementation is solid and conservative, one whose
correctness you CAN essentially count on.

So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
Fair enough - the performance gain is nice, but the extra complexity would
be a huge pain, i imagine.

CPython currently is implemented on a strict "minimize all tricks"
strategy.

A very, very sound principle. If you have the aforementioned decades,
hojillions and serried ranks, an all-tricks-turned-up-to-eleven strategy
can be made to work.

78% of software projects fail -- and I believe the rate of failures in
large IT departments and software houses is higher than average.

If you're a relatively small non-profit outfit like
the python dev team, minimising tricks buys you reliability and agility,
which is, really, what we all want.

And if you're a relatively large (and tumultuously growing),
pretty-good-profit outfit, minimizing tricks builds you scalability and
solidity. Funny enough, I've found that my attitude that "clarity,
solidity and prudence are THE prime criteria of good implementations",
born of thirty years' worth of scars and "arrows in my back", made me an
"instant cultural fit" for Google (which I joined as Uber Technical Lead
just over six months ago, and where I'm currently happily prospering)...

Alex

Oct 13 '05 #35
