Parallelization with Python: which, where, how?

Dear NG,

I have a (pretty much) "embarrassingly parallel" problem and am looking
for the right toolbox to parallelize it over a cluster of homogeneous
Linux workstations. I don't need automatic loop parallelization or the
like, since I prefer to prepare the work packets "by hand".
I simply need
- to specify a list of clients
- a means of sending a work packet to a free client and receiving the
result (hopefully automatically without need to login to each one)
- optionally a timeout mechanism if a client doesn't respond
- optionally help for debugging of remote clients

So far I've seen scipy's COW (cluster of workstations) package, but
couldn't find documentation or even examples for it (and the small
example in the code crashes...).
I've noticed PYRO as well, but haven't looked into it yet.

Can someone recommend a parallelization approach? Are there examples or
documentation? Does anyone have experience with stability and efficiency?

Thanks a lot,
Mathias

Jul 18 '05 #1
"Mathias" <no_sp@m_please.cc> wrote:
> I have a (pretty much) "embarrassingly parallel" problem and look for the right toolbox to
> parallelize it over a cluster of homogeneous linux workstations. I don't need automatic
> loop-parallelization or the like since I prefer to prepare the work packets "by hand".
> I simply need
> - to specify a list of clients
> - a means of sending a work packet to a free client and receiving the
> result (hopefully automatically without need to login to each one)
> - optionally a timeout mechanism if a client doesn't respond
> - optionally help for debugging of remote clients
>
> So far I've seen scipy's COW (cluster of workstation) package, but couldn't find documentation or
> even examples for it (and the small example in the code crashes...).
> I've noticed PYRO as well, but didn't look too far yet.
>
> Can someone recommend a parallelization approach? Are there examples or documentation? Has someone
> got experience with stability and efficiency?


googling for "parallel python" brings up lots of references; tools like

http://pympi.sourceforge.net/
http://datamining.anu.edu.au/~ole/pypar/

(see https://geodoc.uchicago.edu/climatew...scussPythonMPI for
a comparison)

seem to be commonly used.

</F>

Jul 18 '05 #2
Mathias <no_sp@m_please.cc> writes:
> Can someone recommend a parallelization approach? Are there examples
> or documentation? Has someone got experience with stability and
> efficiency?


In the "persistent objects" thread someone mentioned a very cool package
called POSH:

http://poshmodule.sourceforge.net/posh/html/posh.html
Jul 18 '05 #3
>>>>> "Mathias" == Mathias <no_sp@m_please.cc> writes:
> Dear NG,
> I have a (pretty much) "embarrassingly parallel" problem and look for
> the right toolbox to parallelize it over a cluster of homogeneous linux
> workstations. I don't need automatic loop-parallelization or the like
> since I prefer to prepare the work packets "by hand".
> I simply need
> - to specify a list of clients
> - a means of sending a work packet to a free client and receiving the
> result (hopefully automatically without need to login to each one)
> - optionally a timeout mechanism if a client doesn't respond
> - optionally help for debugging of remote clients


pypvm or pympi? See http://pypvm.sourceforge.net/ and
http://pympi.sourceforge.net/.

Ganesan
Jul 18 '05 #4
Mathias wrote:
> I have a (pretty much) "embarrassingly parallel" problem and look for the
> right toolbox to parallelize it over a cluster of homogeneous linux
> workstations.


We have a >1000-node cluster here and use the commercial Platform LSF to
manage it. My Poly package
<http://www.ebi.ac.uk/~hoffman/software/poly/> makes that trivial to use
from Python and also avoids many of the pitfalls of programming farms
that large, such as accidental distributed denial of service attacks on
your own fileserver ;)

Due to the cost and difficulty of setup, LSF is probably not what you
want, or you would already have it. But MPI is probably not what you
want if you are doing embarrassingly parallel problems. I would
look into OpenPBS <http://www.openpbs.org/>. If you want to write a Poly
plugin for OpenPBS, I would be happy to accept it. ;)
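
To make the batch-queue route a bit more concrete, here is a minimal sketch (not part of Poly, and not anything from the post above) of generating one PBS job script per hand-made work packet. The function name pbs_script, the job name pattern packet-<id> and the worker.py command are made up for illustration; the #PBS directives and $PBS_O_WORKDIR are standard PBS, but check them against your own installation.

```python
def pbs_script(packet_id, command, walltime="00:10:00"):
    """Return the text of a PBS job script that runs one work packet.

    packet_id and command are chosen by the caller; nothing here talks to
    a real queue, it only builds the script text.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -N packet-{packet_id}",    # job name shown in the queue
        f"#PBS -l walltime={walltime}",   # scheduler kills the job after this
        "#PBS -j oe",                     # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',            # start where the job was submitted
        command,
    ]
    return "\n".join(lines) + "\n"
```

Writing each returned string to a file and submitting it with qsub would hand the scheduling, the "free client" selection and the walltime limit over to the queueing system.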
--
Michael Hoffman
Jul 18 '05 #5
On Mon, 20 Dec 2004 14:03:09 +0100, Mathias <no_sp@m_please.cc> wrote:
> Can someone recommend a parallelization approach? Are there examples or
> documentation? Has someone got experience with stability and efficiency?


If you think a light-weight approach of distributing work and collecting
the output afterwards (using ssh/rsh) fits your problem, send me an
email.
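
For readers of the archive, a rough sketch of what such a light-weight ssh approach might look like (this is an illustration using only the Python standard library, not necessarily the scheme meant above). It covers the wish list from the original post: a list of hosts, handing each work packet to whichever host is free, and a timeout. The remote script worker.py, password-less ssh keys and the function names are all assumptions.

```python
import queue
import subprocess
import threading


def ssh_command(host, packet):
    """Build the command line for one work packet.

    Assumes key-based ssh login and a (hypothetical) worker.py on every host.
    """
    return ["ssh", "-o", "BatchMode=yes", host, "python3", "worker.py", str(packet)]


def run_packets(hosts, packets, build_cmd=ssh_command, timeout=60.0):
    """Hand each packet to whichever host is free.

    Returns {packet: output string, or None on timeout/failure}.
    """
    todo = queue.Queue()
    for packet in packets:
        todo.put(packet)
    results = {}
    lock = threading.Lock()

    def serve(host):
        # One thread per host: it fetches the next packet as soon as it is free.
        while True:
            try:
                packet = todo.get_nowait()
            except queue.Empty:
                return
            try:
                done = subprocess.run(build_cmd(host, packet), capture_output=True,
                                      text=True, timeout=timeout, check=True)
                output = done.stdout.strip()
            except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
                output = None  # no answer in time, non-zero exit, or ssh missing
            with lock:
                results[packet] = output

    threads = [threading.Thread(target=serve, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A call such as run_packets(["node01", "node02"], range(100)) (host names invented) would return a dict of results. Packets that came back as None could simply be queued again, and passing a different build_cmd lets you try the whole thing locally before touching the cluster.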

Albert
--
Contrary to popular belief, the .doc format is not an open, publicly available format.
Jul 18 '05 #6
