os.wait() losing child?

This may be a silly question, but is it possible for os.wait() to lose track
of child processes? I'm running Python 2.4.4 on Linux kernel 2.6.20
(i686), gcc4.1.1, and glibc-2.5.

Here's what happened in my situation. I first created a few child
processes with Popen, then in a while(True) loop waited for any of the
child processes to exit, then restarted that child process:

import os
from subprocess import Popen

pids = {}

for i in xrange(3):
    p = Popen('sleep 1', shell=True, cwd='/home/user',
              stdout=file(os.devnull, 'w'))
    pids[p.pid] = i

while (True):
    # os.wait() returns a (pid, exit status) tuple
    pid, status = os.wait()
    i = pids[pid]
    del pids[pid]
    print "Child Process %d terminated, restarting" % i
    if (someCondition):  # placeholder for the real exit test
        break
    p = Popen('sleep 1', shell=True, cwd='/home/user',
              stdout=file(os.devnull, 'w'))
    pids[p.pid] = i

Soon after I started running this program, I discovered that some of the
processes stopped showing up, and eventually os.wait() gave an
error saying that there were no more child processes to wait on. Can anyone
tell me what I did wrong?
Jul 11 '07
Nick Craig-Wood wrote:
> The problem you are having is you are letting Popen do half the job
> and doing the other half yourself.

Except that I never wanted Popen to do any thread management for me to
begin with. Popen class has advertised itself as a replacement for
os.popen, popen2, popen4, and etc., and IMHO it should leave the
clean-up to the users, or at least leave it as an option.

> Here is a way which works, done completely with Popen. Polling the
> subprocesses is slightly less efficient than using os.wait() but does
> work. In practice you want to do this anyway to see if your children
> exceed their time limits etc.

I think your polling way works; it seems there is no other way around this
problem other than polling or extending Popen class.

thanks,

Jason
Jul 12 '07 #11
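
Nick's polling code is not quoted in this part of the thread. Purely as an illustration of the general idea, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2.4). The name run_with_polling and the use of sys.executable as a stand-in for 'sleep 1' are assumptions made for the example, not anything from the original posts.

```python
import sys
import time
from subprocess import Popen


def run_with_polling(n_children=3, interval=0.1):
    """Start n_children short-lived children and poll until all have exited."""
    cmd = [sys.executable, "-c", "pass"]      # stand-in for 'sleep 1'
    running = dict((i, Popen(cmd)) for i in range(n_children))
    finished = {}
    while running:
        for i, p in list(running.items()):
            code = p.poll()                   # non-blocking; None while still alive
            if code is not None:
                finished[i] = code
                del running[i]
        if running:
            time.sleep(interval)              # the cost: periodic wake-ups
    return finished


results = run_with_polling()
print(results)
```

Because only Popen.poll() ever reaps the children, nothing competes with subprocess's own bookkeeping; the price is the sleep interval discussed later in the thread.
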
Jason Zheng <Xi*******@jpl.nasa.gov> wrote:
> Nick Craig-Wood wrote:
>> The problem you are having is you are letting Popen do half the job
>> and doing the other half yourself.
>
> Except that I never wanted Popen to do any thread management for me to
> begin with. Popen class has advertised itself as a replacement for
> os.popen, popen2, popen4, and etc., and IMHO it should leave the
> clean-up to the users, or at least leave it as an option.
>
>> Here is a way which works, done completely with Popen. Polling the
>> subprocesses is slightly less efficient than using os.wait() but does
>> work. In practice you want to do this anyway to see if your children
>> exceed their time limits etc.
>
> I think your polling way works; it seems there is no other way around this
> problem other than polling or extending Popen class.

I think polling is probably the right way of doing it...

Internally subprocess uses os.waitpid(pid), just waiting for its own
specific pids. IMHO this is the right way of doing it, rather than
os.wait(), which waits for any pid. os.wait() can reap children that
you weren't expecting (say some library uses os.system())...

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jul 12 '07 #12
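
To make the waitpid-versus-wait distinction above concrete, here is a small sketch in Python 3 syntax (an assumption; the thread uses 2.4). The helpers spawn and wait_for are hypothetical names, and os.spawnv is used instead of Popen only so that no Popen bookkeeping is involved.

```python
import os
import sys


def spawn(exit_code):
    """Start a child Python process that exits with exit_code; return its pid."""
    script = "import sys; sys.exit(%d)" % exit_code
    return os.spawnv(os.P_NOWAIT, sys.executable,
                     [sys.executable, "-c", script])


def wait_for(pid):
    """Block until this particular child exits and return its exit code."""
    _, status = os.waitpid(pid, 0)    # reaps only the pid we ask for
    return os.WEXITSTATUS(status)


first = spawn(3)
second = spawn(5)

# Targeted waits: we decide the order, whichever child happens to exit first.
second_code = wait_for(second)
first_code = wait_for(first)
print(first_code, second_code)
```

With os.wait() the caller would instead receive whichever child exited first, including ones started by code it knows nothing about.
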
Jason Zheng <Xi*******@jpl.nasa.gov> writes:
> greg wrote:
>> Jason Zheng wrote:
>>> Hate to reply to my own thread, but this is the working program
>>> that can demonstrate what I posted earlier:
>>
>> I've figured out what's going on. The Popen class has a
>> __del__ method which does a non-blocking wait of its own.
>> So you need to keep the Popen instance for each subprocess
>> alive until your wait call has cleaned it up.
>>
>> The following version seems to work okay.
>
> It still doesn't work on my machine. I took a closer look at the Popen
> class, and I think the problem is that the __init__ method always
> calls a method _cleanup, which polls every existing Popen
> instance.

Actually, it's not that bad. _cleanup only polls the instances that
are no longer referenced by user code, but still running. If you hang
on to Popen instances, they won't be added to _active, and __init__
won't reap them (_active is only populated from Popen.__del__).

This version is a trivial modification of your code to that effect.
Does it work for you?

#!/usr/bin/python

import os
from subprocess import Popen

pids = {}
counts = [0, 0, 0]

for i in xrange(3):
    p = Popen('sleep 1', shell=True, cwd='/home', stdout=file(os.devnull, 'w'))
    pids[p.pid] = p, i  # keep the Popen instance referenced
    print "Starting child process %d (%d)" % (i, p.pid)

while (True):
    pid, ignored = os.wait()
    try:
        p, i = pids[pid]
    except KeyError:
        # not one of ours
        continue
    del pids[pid]
    counts[i] += 1

    # stop restarting this child once it has run 10 times
    if (counts[i] == 10):
        print "Child Process %d terminated." % i
        # initial value True so that counts[0] is also compared against 10
        if reduce(lambda x, y: x and (y >= 10), counts, True):
            break
        continue

    print "Child Process %d terminated, restarting" % i
    p = Popen('sleep 1', shell=True, cwd='/home', stdout=file(os.devnull, 'w'))
    pids[p.pid] = p, i  # keep the Popen instance referenced
Jul 12 '07 #13
Nick Craig-Wood <ni**@craig-wood.com> writes:
>> I think your polling way works; it seems there is no other way around this
>> problem other than polling or extending Popen class.
>
> I think polling is probably the right way of doing it...

It requires the program to wake up every 0.1s to poll for freshly
exited subprocesses. That doesn't consume excess CPU cycles, but it
does prevent the kernel from swapping it out when there is nothing to
do. Sleeping in os.wait allows the operating system to know exactly
what the process is waiting for, and to move it out of the way until
those conditions are met. (Pedants would also notice that polling
introduces on average 0.1/2 seconds delay between the subprocess dying
and the parent reaping it.)

In general, a program that waits for something should do so in a
single call to the OS. OP's usage of os.wait was exactly correct.

Fortunately the problem can be worked around by hanging on to Popen
instances until they are reaped. If all of them are kept referenced
when os.wait is called, they will never end up in the _active list
because the list is only populated in Popen.__del__.

> Internally subprocess uses os.waitpid(pid), just waiting for its own
> specific pids. IMHO this is the right way of doing it, rather than
> os.wait(), which waits for any pid. os.wait() can reap children
> that you weren't expecting (say some library uses os.system())...

system() calls waitpid immediately after the fork. This can still be a
problem for applications that call wait in a dedicated thread, but the
program can always ignore the processes it doesn't know anything
about.
Jul 12 '07 #14
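
A minimal sketch of the approach argued for above: block in os.wait(), keep every Popen instance referenced until its child has been reaped, and skip pids that are not ours. It is written in Python 3 syntax, which is an assumption (the thread uses 2.4). The name run_and_reap is hypothetical, and assigning p.returncode by hand is a choice made for this example so that the Popen object knows its child is gone; it is not something the posts prescribe.

```python
import os
import sys
from subprocess import Popen


def run_and_reap(n_children=3):
    """Start children, then sleep in os.wait() until each one has exited."""
    cmd = [sys.executable, "-c", "pass"]   # stand-in for 'sleep 1'
    children = {}                          # pid -> (Popen, index); holds the reference
    for i in range(n_children):
        p = Popen(cmd)
        children[p.pid] = (p, i)

    reaped = {}
    while children:
        pid, status = os.wait()            # blocks in the kernel, no polling
        if pid not in children:
            continue                       # not one of ours
        p, i = children.pop(pid)
        code = os.WEXITSTATUS(status)
        p.returncode = code                # assumption: record the result on the Popen object
        reaped[i] = code
    return reaped


reaped = run_and_reap()
print(reaped)
```
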
Hrvoje Niksic wrote:
>> greg wrote:
>
> Actually, it's not that bad. _cleanup only polls the instances that
> are no longer referenced by user code, but still running. If you hang
> on to Popen instances, they won't be added to _active, and __init__
> won't reap them (_active is only populated from Popen.__del__).

Perhaps that's the difference between Python 2.4 and 2.5. In 2.4,
Popen's __init__ always appends self to _active:

def __init__(...):
    _cleanup()
    ...
    self._execute_child(...)
    ...
    _active.append(self)

> This version is a trivial modification of your code to that effect.
> Does it work for you?

Nope, it still doesn't work. I'm running Python 2.4.4, though.

$ python test.py
Starting child process 0 (26497)
Starting child process 1 (26498)
Starting child process 2 (26499)
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated, restarting
Child Process 2 terminated.
Traceback (most recent call last):
  File "test.py", line 15, in ?
    pid, ignored = os.wait()
OSError: [Errno 10] No child processes
Jul 12 '07 #15
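
The effect of the 2.4 __init__ quoted above can be shown with a toy model. This is not the real subprocess source: ToyKernel and ToyPopen24 are made-up classes that mimic only the two facts stated in the post (every __init__ runs _cleanup, and every instance is appended to _active), with no real processes involved. Python 3 syntax is assumed.

```python
class ToyKernel:
    """Stand-in for the OS: tracks child pids that have exited but are not yet reaped."""

    def __init__(self):
        self.exited = []

    def wait_any(self):
        """Like os.wait() when an exited child is available: reap and return one pid."""
        return self.exited.pop(0)

    def poll(self, pid):
        """Like os.waitpid(pid, os.WNOHANG): reap pid if it has exited."""
        if pid in self.exited:
            self.exited.remove(pid)
            return True
        return False


class ToyPopen24:
    """Mimics only the quoted 2.4 behaviour: _cleanup() first, then register self."""
    _active = []

    def __init__(self, kernel, pid):
        for inst in list(ToyPopen24._active):   # _cleanup(): poll every instance
            if kernel.poll(inst.pid):
                ToyPopen24._active.remove(inst)
        self.pid = pid
        ToyPopen24._active.append(self)


kernel = ToyKernel()
for pid in (1, 2, 3):
    ToyPopen24(kernel, pid)          # three children are running

kernel.exited += [1, 2]              # children 1 and 2 exit close together
seen = [kernel.wait_any()]           # the parent's wait reports pid 1 ...
ToyPopen24(kernel, 4)                # ... and the restart's _cleanup() reaps pid 2
lost = [p for p in (1, 2) if p not in seen and p not in kernel.exited]
print("seen by the parent:", seen, "| reaped behind its back:", lost)
```

In this model the parent never hears about pid 2, which matches the output above where fewer and fewer children show up until os.wait() finds none left.
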
Hrvoje Niksic <hn*****@xemacs.org> wrote:
> Nick Craig-Wood <ni**@craig-wood.com> writes:
>>> I think your polling way works; it seems there is no other way around this
>>> problem other than polling or extending Popen class.
>>
>> I think polling is probably the right way of doing it...
>
> It requires the program to wake up every 0.1s to poll for freshly
> exited subprocesses. That doesn't consume excess CPU cycles, but it
> does prevent the kernel from swapping it out when there is nothing to
> do. Sleeping in os.wait allows the operating system to know exactly
> what the process is waiting for, and to move it out of the way until
> those conditions are met. (Pedants would also notice that polling
> introduces on average 0.1/2 seconds delay between the subprocess dying
> and the parent reaping it.)

Sure!

You could get rid of this by sleeping until a SIGCHLD arrived maybe.

> In general, a program that waits for something should do so in a
> single call to the OS. OP's usage of os.wait was exactly correct.

Disagree for the reason below.

>> Internally subprocess uses os.waitpid(pid), just waiting for its own
>> specific pids. IMHO this is the right way of doing it, rather than
>> os.wait(), which waits for any pid. os.wait() can reap children
>> that you weren't expecting (say some library uses os.system())...
>
> system() calls waitpid immediately after the fork.

os.system probably wasn't the best example, but you take my point I
think!

> This can still be a problem for applications that call wait in a
> dedicated thread, but the program can always ignore the processes
> it doesn't know anything about.

Ignoring them isn't good enough because it means that the bit of code
which was waiting for that process to die with os.getpid() will never
get called, causing a deadlock in that bit of code.

What is really required is a select()-like interface to wait which
takes more than one pid. I don't think there is such a thing though,
so polling is your next best option.

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jul 12 '07 #16
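
One way the "sleep until a SIGCHLD arrives" idea could look is sketched below. It uses signal.pthread_sigmask and signal.sigwait, which exist only in Python 3 on Unix and were not available to the 2.4 code in this thread, so treat it as an assumption-laden illustration rather than what anyone here ran. The function name reap_on_sigchld is hypothetical, and the sketch assumes it is called from the main thread of a single-threaded program.

```python
import os
import signal
import sys


def reap_on_sigchld(exit_codes=(0, 1, 2)):
    """Spawn one child per exit code, sleep in sigwait() until SIGCHLD, then reap."""
    # A no-op handler keeps SIGCHLD from being treated as ignored by default.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    # Block SIGCHLD before spawning, so an early exit stays pending for sigwait().
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pending = {}                              # pid -> exit code we asked for
        for code in exit_codes:
            script = "import sys; sys.exit(%d)" % code
            pid = os.spawnv(os.P_NOWAIT, sys.executable,
                            [sys.executable, "-c", script])
            pending[pid] = code

        results = {}
        while pending:
            signal.sigwait({signal.SIGCHLD})      # sleeps; no periodic wake-ups
            # Signals coalesce, so check every child we still know about.
            for pid in list(pending):
                done, status = os.waitpid(pid, os.WNOHANG)
                if done == pid:
                    results[pending.pop(pid)] = os.WEXITSTATUS(status)
        return results
    finally:
        # Restore the handler first so a leftover pending SIGCHLD is not
        # delivered to the temporary no-op handler after unblocking.
        if old_handler is not None:
            signal.signal(signal.SIGCHLD, old_handler)
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


results = reap_on_sigchld()
print(results)
```

Each child is still reaped with a waitpid on its own pid, so this keeps Nick's "only wait for what you started" property while avoiding the 0.1 s polling interval.
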
Jason Zheng <Xi*******@jpl.nasa.gov> wrote:
> Hrvoje Niksic wrote:
>> Actually, it's not that bad. _cleanup only polls the instances that
>> are no longer referenced by user code, but still running. If you hang
>> on to Popen instances, they won't be added to _active, and __init__
>> won't reap them (_active is only populated from Popen.__del__).
>
> Perhaps that's the difference between Python 2.4 and 2.5. In 2.4,
> Popen's __init__ always appends self to _active:

Yes, that changed between 2.4 and 2.5.

Note that if you take a copy of 2.5's subprocess.py, it ought to work
fine with 2.4.

-M-

Jul 12 '07 #17
Nick Craig-Wood wrote:
> Sure!
>
> You could get rid of this by sleeping until a SIGCHLD arrived maybe.

Yah, I could also just dump Popen class and use fork(). But then what's
the point of having an abstraction layer any more?

>> This can still be a problem for applications that call wait in a
>> dedicated thread, but the program can always ignore the processes
>> it doesn't know anything about.
>
> Ignoring them isn't good enough because it means that the bit of code
> which was waiting for that process to die with os.getpid() will never
> get called, causing a deadlock in that bit of code.

Are you talking about something like os.waitpid(os.getpid())? If the
process has completed and been de-zombified by another os.wait() call, I
thought it would just throw an exception; it won't cause a deadlock by
hanging the process.

~Jason
Jul 12 '07 #18
Nick Craig-Wood <ni**@craig-wood.com> writes:
>> This can still be a problem for applications that call wait in a
>> dedicated thread, but the program can always ignore the processes
>> it doesn't know anything about.
>
> Ignoring them isn't good enough because it means that the bit of
> code which was waiting for that process to die with os.getpid() will
> never get called, causing a deadlock in that bit of code.

It won't deadlock, it will get an ECHILD or equivalent error because
it's waiting for a PID that doesn't correspond to a running child
process. I agree that this can be a problem if and when you use
libraries that can call system. (In that case sleeping for SIGCHLD is
probably a good solution.)

> What is really required is a select() like interface to wait which
> takes more than one pid. I don't think there is such a thing
> though, so polling is your next best option.

Except for the problems outlined in my previous message. And the fact
that polling becomes very expensive (O(n) per check) once the number
of processes becomes large. Unless one knows that a library can and
does call system, wait is the preferred solution.
Jul 12 '07 #19
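
The "error, not deadlock" point is easy to check with a short sketch. It is in Python 3 syntax (an assumption; in 2.4 the same situation surfaces as the OSError with errno 10 shown earlier in the thread), and os.spawnv is used only to get a child without involving Popen.

```python
import errno
import os
import sys

pid = os.spawnv(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "pass"])

os.waitpid(pid, 0)                 # the first waiter reaps the child

try:
    os.waitpid(pid, 0)             # a second waiter does not hang ...
    outcome = "reaped twice"
except OSError as exc:             # ... it fails straight away
    outcome = errno.errorcode[exc.errno]

print(outcome)                     # prints "ECHILD"
```

So code that loses its child to someone else's os.wait() gets an immediate error it can handle, although it no longer learns the child's exit status.
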
Jason Zheng <Xi*******@jpl.nasa.gov> writes:
> Hrvoje Niksic wrote:
>>> greg wrote:
>> Actually, it's not that bad. _cleanup only polls the instances that
>> are no longer referenced by user code, but still running. If you hang
>> on to Popen instances, they won't be added to _active, and __init__
>> won't reap them (_active is only populated from Popen.__del__).
>
> Perhaps that's the difference between Python 2.4 and 2.5.
> [...]
> Nope it still doesn't work. I'm running python 2.4.4, tho.

That explains it, then, and also why greg's code didn't work. You
still have the option to try to run 2.5's subprocess.py under 2.4.
Jul 13 '07 #20
