Bytes | Software Development & Data Engineering Community

run function in separate process

Hi everyone,

I have written a function that runs functions in separate processes. I
hope you can help me improve it, and I would like to submit it to
the Python cookbook if its quality is good enough.

I was writing a numerical program (using numpy) which uses huge
amounts of memory, the memory increasing with time. The program
structure was essentially:

for radius in radii:
    result = do_work(params)

where do_work actually uses a large number of temporary arrays. The
variable params is large as well and is the result of computations
before the loop.

After playing with gc for some time, trying to convince it to
release the memory, I gave up. I will be happy, by the way, if
somebody points me to a web page/reference that explains how to call a
function and then reclaim all of its memory in Python.

Meanwhile, the best that I could do is fork a process, compute the
results, and return them to the parent process. I implemented this
in the following function, which is kinda working for me
now, but I am sure it can be much improved. There should be a better
way to return the result than a temporary file, for example. I
actually thought of posting this after noticing that the pypy project
had what I thought was a similar thing in their testing, but they
probably dealt with it differently in the autotest driver [1]; I am
not sure.

Here is the function:

from __future__ import with_statement  # required for "with" on Python 2.5

def run_in_separate_process(f, *args, **kwds):
    from os import tmpnam, fork, waitpid, remove
    from sys import exit
    from pickle import load, dump
    from contextlib import closing
    fname = tmpnam()
    pid = fork()
    if pid > 0: # parent
        waitpid(pid, 0) # should have checked for correct finishing
        with closing(file(fname)) as f:
            result = load(f)
        remove(fname)
        return result
    else: # child
        result = f(*args, **kwds)
        with closing(file(fname, 'w')) as f:
            dump(result, f)
        exit(0)

To be used as:

for radius in radii:
    result = run_in_separate_process(do_work, params)
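
For readers on a current interpreter, here is a minimal sketch of the same temp-file idea in Python 3. It is not the code from this post: os.tmpnam no longer exists there, so the sketch assumes tempfile.mkstemp as a substitute, and the function name run_in_separate_process_tmpfile is made up for illustration. It also checks the child's wait status, which the version above leaves as a comment. It relies on os.fork, so it is Unix-only.

```python
import os
import pickle
import tempfile


def run_in_separate_process_tmpfile(func, *args, **kwds):
    """Run func in a forked child and pass its result back through a temp file."""
    # Assumption: mkstemp replaces os.tmpnam; it creates the file, not just a name.
    fd, fname = tempfile.mkstemp(suffix=".pickle")
    os.close(fd)
    pid = os.fork()
    if pid > 0:  # parent
        try:
            _, status = os.waitpid(pid, 0)
            if status != 0:
                raise RuntimeError("child failed, wait status %d" % status)
            with open(fname, "rb") as stream:
                return pickle.load(stream)
        finally:
            os.remove(fname)
    else:  # child
        exit_code = 1
        try:
            result = func(*args, **kwds)
            with open(fname, "wb") as stream:
                pickle.dump(result, stream, pickle.HIGHEST_PROTOCOL)
            exit_code = 0
        finally:
            # os._exit leaves immediately, skipping cleanup handlers
            # that the child inherited from the parent.
            os._exit(exit_code)
```

It would be called the same way as above, e.g. result = run_in_separate_process_tmpfile(do_work, params). Any exception in the child surfaces in the parent only as a RuntimeError, without details.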

[1] http://codespeak.net/pipermail/pypy-dev/2006q3/003273.html

Regards,

Muhammad Alkarouri

Apr 11 '07 #1
On Apr 11, 9:23 am, malkaro...@gmail.com wrote:
[...]

I found a post on a similar topic that looks like it may give you some
ideas:

http://mail.python.org/pipermail/pyt...er/285400.html
http://www.artima.com/forums/flat.js...&thread=174099
http://www.nabble.com/memory-manage-...-t3386442.html
http://www.thescripts.com/forum/thread620226.html

Mike

Apr 11 '07 #2
<ma********@gmail.com> wrote:
...
> somebody points me to a web page/reference that says how to call a
> function then reclaim the whole memory back in python.
>
> Meanwhile, the best that I could do is fork a process, compute the
> results, and return them back to the parent process. This I

That's my favorite way to ensure that all resources get reclaimed: let
the operating system do the job.

> implemented in the following function, which is kinda working for me
> now, but I am sure it can be much improved. There should be a better
> way to return the result than a temporary file, for example. I

You can use a pipe. I.e. (untested code):

def run_in_separate_process(f, *a, **k):
    import os, sys, cPickle
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid>0:
        os.close(pwrite)
        with os.fdopen(pread, 'rb') as f:
            return cPickle.load(f)
    else:
        os.close(pread)
        result = f(*a, **k)
        with os.fdopen(pwrite, 'wb') as f:
            cPickle.dump(f, -1)
        sys.exit()

Using cPickle instead of pickle, and a negative protocol (on the files
pedantically specified as binary:-), meaning the latest and greatest
available pickling protocol, rather than the default 0, should improve
performance.
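
As a rough illustration of the pipe idea on Python 3 (where cPickle has been folded into pickle), here is a minimal sketch. The name run_in_separate_process_pipe and the choice of os._exit in the child are assumptions made for the sketch, not part of the untested code above. pickle.HIGHEST_PROTOCOL selects the same protocol as passing -1.

```python
import os
import pickle


def run_in_separate_process_pipe(func, *args, **kwds):
    """Run func in a forked child and read its pickled result from a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid > 0:  # parent
        os.close(write_fd)
        try:
            # Read before waiting: a child blocked on a full pipe
            # would never exit if the parent waited first.
            with os.fdopen(read_fd, "rb") as stream:
                return pickle.load(stream)
        finally:
            os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    else:  # child
        os.close(read_fd)
        try:
            result = func(*args, **kwds)
            with os.fdopen(write_fd, "wb") as stream:
                pickle.dump(result, stream, pickle.HIGHEST_PROTOCOL)
        finally:
            os._exit(0)
```

If the child fails before writing anything, the parent sees the pipe close with no data and pickle.load raises EOFError.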
Alex
Apr 11 '07 #3
Thanks, Mike, for your answer. I will use the occasion to add some
comments on the links and on my approach.

I am programming in Python 2.5, mainly to avoid the bug in earlier
versions where memory arenas were never freed.
The program runs on both Mac OS X (Intel) and Linux, so I prefer
portable approaches.

On Apr 11, 3:34 pm, kyoso...@gmail.com wrote:
[...]
> I found a post on a similar topic that looks like it may give you some
> ideas:
>
> http://mail.python.org/pipermail/pyt...er/285400.html

I see the comment about using mmap as valuable. I tried that
with numpy.memmap but I wasn't successful. I don't remember why at
the moment.
The other tricks are problem-dependent, and my case is not like them
(I believe).

> http://www.artima.com/forums/flat.js...&thread=174099

Good ideas. I hope that python will grow a replaceable gc one day. I
think that pypy already has a choice at the moment.

> http://www.nabble.com/memory-manage-...-t3386442.html
> http://www.thescripts.com/forum/thread620226.html

Bingo! This thread actually reaches more or less the same conclusion.
In fact, Alex Martelli describes the exact pattern in
http://mail.python.org/pipermail/pyt...ch/431910.html

I probably got the idea from a previous thread by him or somebody
else. It must have been much earlier than March, though, as my program
has been working since last year.

So, let's say the function I have written is an implementation of
Alex's architectural pattern. Probably makes it easier to get in the
cookbook:)

Regards,

Muhammad

Apr 11 '07 #4
On Apr 11, 3:58 pm, a...@mac.com (Alex Martelli) wrote:
[...]
> That's my favorite way to ensure that all resources get reclaimed: let
> the operating system do the job.

Thanks a lot, Alex, for confirming the basic idea. I will be playing
with your function later today, and will give more feedback.
I think I avoided the pipe in the mistaken belief that pipes cannot be
binary. I know, I should've tested. And I avoided pickle at the time
because I had a structure that was unpicklable (grown by me using a
mixture of python, C, ctypes and pyrex at the time). The structure is
improved now, and I will go for the more standard approach.

Regards,

Muhammad

Apr 11 '07 #5
On Apr 11, 4:36 pm, malkaro...@gmail.com wrote:
[...]
> .. And I avoided pickle at the time
> because I had a structure that was unpicklable (grown by me using a
> mixture of python, C, ctypes and pyrex at the time). The structure is
> improved now, and I will go for the more standard approach..

Sorry, I was speaking about an older version of my code. The code is
already using pickle, and yes, cPickle is better.

Still trying the code. So far, after modifying the line:

    cPickle.dump(f, -1)

to:

    cPickle.dump(result, f, -1)

it is working.

Regards,

Muhammad

Apr 11 '07 #6
After playing with Alex's implementation, and adding some support for
exceptions, this is what I came up with. I hope I am not getting too
clever for my needs:

from __future__ import with_statement  # required for "with" on Python 2.5
import os, cPickle

def run_in_separate_process_2(f, *args, **kwds):
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid > 0:
        os.close(pwrite)
        with os.fdopen(pread, 'rb') as f:
            status, result = cPickle.load(f)
        os.waitpid(pid, 0)
        if status == 0:
            return result
        else:
            raise result
    else:
        os.close(pread)
        try:
            result = f(*args, **kwds)
            status = 0
        except Exception, exc:
            result = exc
            status = 1
        # Pickle to a string first, so that a failed dump cannot leave
        # partial data in the pipe ahead of the fallback message.
        try:
            data = cPickle.dumps((status, result), cPickle.HIGHEST_PROTOCOL)
        except cPickle.PicklingError, exc:
            data = cPickle.dumps((2, exc), cPickle.HIGHEST_PROTOCOL)
        with os.fdopen(pwrite, 'wb') as f:
            f.write(data)
        os._exit(0)

Basically, the function is called in the child process, and a status
code is returned in addition to the result. The status is 0 if the
function returns normally, 1 if it raises an exception, and 2 if the
result is unpicklable. Some cases are deliberately not handled: a
SystemExit or a KeyboardInterrupt in the child shows up as an EOFError
during unpickling in the parent. Some cases are inadvertently not
handled; these are called bugs. And the original exception traceback
is lost. Any comments?
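
One possible way around the lost traceback, sketched here for Python 3 under my own assumptions: format the traceback as text in the child and send the text instead of the exception object. ChildError and run_with_traceback are hypothetical names introduced only for this sketch. The status codes keep the meaning described above (0 = result, 1 = exception in the function, 2 = result could not be pickled).

```python
import os
import pickle
import traceback


class ChildError(Exception):
    """Hypothetical wrapper raised in the parent; its message is the child's traceback."""


def run_with_traceback(func, *args, **kwds):
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid > 0:  # parent
        os.close(write_fd)
        try:
            with os.fdopen(read_fd, "rb") as stream:
                status, payload = pickle.load(stream)
        finally:
            os.waitpid(pid, 0)
        if status == 0:
            return payload
        raise ChildError(payload)
    else:  # child
        os.close(read_fd)
        try:
            try:
                message = (0, func(*args, **kwds))
            except Exception:
                # A traceback string is always picklable, unlike some exceptions.
                message = (1, traceback.format_exc())
            try:
                data = pickle.dumps(message, pickle.HIGHEST_PROTOCOL)
            except Exception:
                data = pickle.dumps((2, traceback.format_exc()),
                                    pickle.HIGHEST_PROTOCOL)
            with os.fdopen(write_fd, "wb") as stream:
                stream.write(data)
        finally:
            os._exit(0)
```

The trade-off is that the parent no longer gets the original exception type to catch, only ChildError with the formatted text.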

Regards,

Muhammad Alkarouri

Apr 11 '07 #7
After playing a little with Alex's function, I got to:

from __future__ import with_statement  # required for "with" on Python 2.5
import os, cPickle

def run_in_separate_process_2(f, *args, **kwds):
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid > 0:
        os.close(pwrite)
        with os.fdopen(pread, 'rb') as f:
            status, result = cPickle.load(f)
        os.waitpid(pid, 0)
        if status == 0:
            return result
        else:
            raise result
    else:
        os.close(pread)
        try:
            result = f(*args, **kwds)
            status = 0
        except Exception, exc:
            result = exc
            status = 1
        # Pickle to a string first, so that a failed dump cannot leave
        # partial data in the pipe ahead of the fallback message.
        try:
            data = cPickle.dumps((status, result), cPickle.HIGHEST_PROTOCOL)
        except cPickle.PicklingError, exc:
            data = cPickle.dumps((2, exc), cPickle.HIGHEST_PROTOCOL)
        with os.fdopen(pwrite, 'wb') as f:
            f.write(data)
        os._exit(0)

It handles exceptions as well, partially. Basically the child process
returns a status code as well as a result. If the status is 0, then
the function returned successfully and its result is returned. If the
status is 1, then the function raised an exception, which will be
raised in the parent. If the status is 2, then the function has
returned successfully but the result is not picklable, and the
PicklingError is raised in the parent.
Exceptions such as SystemExit and KeyboardInterrupt in the child are
not checked and will result in an EOFError in the parent.
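
To make the EOFError case less opaque, the parent could look at the child's wait status. Below is a minimal Python 3 sketch of that idea; run_checked and describe_wait_status are names invented for the sketch, and mapping SystemExit to an exit code is my assumption about what one might want, not something from the function above.

```python
import os
import pickle


def describe_wait_status(status):
    """Turn a raw os.waitpid status into a readable string."""
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown wait status %d" % status


def run_checked(func, *args, **kwds):
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        os.close(read_fd)
        code = 1  # stays 1 unless the result is written or SystemExit says otherwise
        try:
            result = func(*args, **kwds)
            with os.fdopen(write_fd, "wb") as stream:
                pickle.dump(result, stream, pickle.HIGHEST_PROTOCOL)
            code = 0
        except SystemExit as exc:
            # sys.exit() with no argument means success; non-integer codes mean failure.
            if isinstance(exc.code, int):
                code = exc.code
            else:
                code = 0 if exc.code is None else 1
        finally:
            os._exit(code)
    # parent
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as stream:
        data = stream.read()
    _, status = os.waitpid(pid, 0)
    if not data or status != 0:
        raise RuntimeError("child produced no result: "
                           + describe_wait_status(status))
    return pickle.loads(data)
```

With this, a child that calls sys.exit or is killed by a signal produces a RuntimeError that names the exit code or signal instead of a bare EOFError.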

Any comments?

Regards,

Muhammad

Apr 11 '07 #8
