
python in parallel for pattern discovery in genome data

Hi,

I am new to Python. I am using it on Red Hat Linux 9.
I would like to run Python on a Sun machine (Sun E420R, OS = Solaris)
with 4 CPUs for a pattern discovery/search program on biological
sequences (genomic sequence). I want to write the Python code so that it
utilizes all 4 CPUs. Also, do I need any other libraries?
Kindly advise.

Thanks

Sincerely,

Manoj

--

****************************************************************
Manoj Balyan
Scientist - Bioinformatics
Centre for Cellular and Molecular Biology (CCMB)
Uppal Road,
Hyderabad-500007
Andhra Pradesh, INDIA
Tel: +91-040-27192772, 27160222, 27192777
Fax: +91-040-27160591, 27160311
Email: ma*********@ccmb.res.in,
manoj_balyan@hotmail
WWW: http://www.ccmb.res.in
****************************************************************
If you weep for the setting sun, you will miss the stars: Tagore
****************************************************************


Jul 18 '05 #1
BalyanM wrote:
Hi,

I am new to Python. I am using it on Red Hat Linux 9.
I would like to run Python on a Sun machine (Sun E420R, OS = Solaris)
with 4 CPUs for a pattern discovery/search program on biological
sequences (genomic sequence). I want to write the Python code so that it
utilizes all 4 CPUs. Also, do I need any other libraries?
Kindly advise.

Thanks

Sincerely,

Manoj


A single standard Python interpreter process won't use more than one CPU
at a time for Python code, because of the GIL (Global Interpreter Lock).
Going by your description, the following module might be of use to you:
http://poshmodule.sourceforge.net/
It allows object sharing between different Python processes.
As I have never worked with it, I can't say whether it's any good.

Stephan
Jul 18 '05 #2
BalyanM:
I would like to run Python on a Sun machine (Sun E420R, OS = Solaris)
with 4 CPUs for a pattern discovery/search program on biological
sequences (genomic sequence). I want to write the Python code so that it
utilizes all 4 CPUs.


*oomphh*

There are a lot of details buried in your lines.

It looks like you will be writing your own pattern matching code.
Why? There are plenty of tools for that already. A quick web
search finds http://genome.imb-jena.de/seqanal.html and many
of those tools are freely available.

Okay, suppose you do have the tool or library for it. Do you
want to do high-throughput searches? Then you can just break
your N jobs into 4 batches of about N/4 jobs each, one per CPU. The easiest
way in Python is to run 4 Python programs, each with a little server going
(see the xmlrpc module for an example) and have your code
call them (see Aahz's excellent example of master/slave
programming using threads). Other options for the communications
are Twisted and Pyro.
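
The batch arithmetic can be sketched as below. This is only an illustration: `split_into_batches` is a made-up helper name, and dealing the jobs out round-robin is just one reasonable way to get roughly equal batches.

```python
def split_into_batches(jobs, workers=4):
    """Deal jobs out round-robin so each worker gets about len(jobs) / workers of them."""
    # jobs[i::workers] takes every `workers`-th job, starting at position i.
    return [jobs[i::workers] for i in range(workers)]


if __name__ == "__main__":
    # Ten hypothetical job ids shared among four worker programs.
    for worker_id, batch in enumerate(split_into_batches(list(range(10)))):
        print(worker_id, batch)
```

Each batch would then be handed to one of the four Python programs.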

You will not be able to do this with one Python process because
Python has what's called the "global interpreter lock" that
prevents core Python from effectively using multiple processors.
You can write a C extension which does the search and gives
up the lock, but you seem to want to do this in raw Python.

(The suggestion to look at POSH won't work on your Sun machine - it has
some Intel-specific assembly instructions in the C extension.)

Depending on the type of pattern search, you can instead assign
1/4 of the genome to each process, with overlap if needed. This
will speed up a single search, which is good for interactivity.
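
A minimal sketch of the split-with-overlap idea, under some loud assumptions: the pattern is a literal string (not a regular expression or profile), and the processes are started with the `multiprocessing` module, which only entered the standard library in Python 2.6, after this thread was written. The function names are illustrative. The overlap is `len(pattern) - 1` characters, so a match that straddles a chunk boundary is still seen in full by the chunk it starts in.

```python
from multiprocessing import Pool


def split_with_overlap(sequence, parts, overlap):
    """Return (offset, chunk) pairs; each chunk runs `overlap` characters into the next."""
    size = max(1, -(-len(sequence) // parts))  # ceiling division, at least 1
    return [(start, sequence[start:start + size + overlap])
            for start in range(0, len(sequence), size)]


def find_in_chunk(task):
    """Find every (possibly overlapping) occurrence of a literal pattern in one chunk."""
    offset, chunk, pattern = task
    hits = []
    pos = chunk.find(pattern)
    while pos != -1:
        hits.append(offset + pos)  # report positions relative to the whole sequence
        pos = chunk.find(pattern, pos + 1)
    return hits


def parallel_find(sequence, pattern, workers=4):
    """Search the chunks in separate processes and merge the sorted hit positions."""
    tasks = [(offset, chunk, pattern)
             for offset, chunk in split_with_overlap(sequence, workers, len(pattern) - 1)]
    with Pool(workers) as pool:
        per_chunk = pool.map(find_in_chunk, tasks)
    # The set guards against duplicates if a larger overlap is ever used.
    return sorted({hit for hits in per_chunk for hit in hits})


if __name__ == "__main__":
    genome = "ACGTACGTTAGACGT" * 1000  # stand-in for real sequence data
    print(parallel_find(genome, "TAGACGT")[:3])
```

For patterns of variable length, the overlap would have to cover the longest possible match, which is why this only works "depending on the type of pattern search".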

These work for a single "user" of the code. Might you have
many people trying to do pattern searches? If so, you may
need some way to throttle how many searches are done per
machine. For in-house use this likely isn't a problem - besides,
you should get your code working first.
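
If throttling does turn out to be needed, one simple approach is a counting semaphore inside each server process. The sketch below is an assumption about how that might look, not something from the posts above: `throttled_search` and `MAX_CONCURRENT_SEARCHES` are hypothetical names, and the limit of 4 simply mirrors the CPU count.

```python
import threading

MAX_CONCURRENT_SEARCHES = 4  # assumption: at most one running search per CPU

_slots = threading.BoundedSemaphore(MAX_CONCURRENT_SEARCHES)


def throttled_search(search, *args):
    """Run search(*args), first waiting for a free slot if too many are in progress."""
    with _slots:  # blocks while MAX_CONCURRENT_SEARCHES searches are already running
        return search(*args)
```

Extra requests simply wait their turn; nothing is rejected.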

There are other approaches. You could use shared memory or
CORBA for the communications, or PVM or MPI. Still, given
your experience, you should:
1) get your algorithm working on one machine
2) get it working as a client/server using XML-RPC (see the
SimpleXMLRPCServer and xmlrpclib modules),
3) get your client to work with multiple servers,
using multiple threads in the client
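
Steps 2 and 3 can be sketched roughly as follows. In current Python 3 the modules named above live in `xmlrpc.server` and `xmlrpc.client`, which is what this sketch uses. Everything else is an assumption made for illustration: `count_matches` is a toy stand-in for a real search, and the workers are started as threads on free localhost ports only so the example is self-contained. To actually use all 4 CPUs, each worker would have to be its own Python process.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def count_matches(sequence, pattern):
    """Toy search: count (possibly overlapping) occurrences of a literal pattern."""
    count, pos = 0, sequence.find(pattern)
    while pos != -1:
        count += 1
        pos = sequence.find(pattern, pos + 1)
    return count


def start_worker():
    """Start one XML-RPC worker on a free localhost port; return (server, url)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(count_matches)
    # Demo only: a real worker would be a separate process calling serve_forever().
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, "http://%s:%d" % (host, port)


def run_jobs(urls, jobs):
    """Send (sequence, pattern) jobs to the workers round-robin, using client threads."""
    def call(indexed_job):
        index, (sequence, pattern) = indexed_job
        # One proxy per call, so no connection is shared between threads.
        with ServerProxy(urls[index % len(urls)]) as proxy:
            return proxy.count_matches(sequence, pattern)

    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(call, enumerate(jobs)))  # results keep job order


if __name__ == "__main__":
    workers = [start_worker() for _ in range(4)]
    jobs = [("ACGTACGT", "ACG"), ("TTTT", "TT"), ("GATTACA", "TA")]
    print(run_jobs([url for _, url in workers], jobs))
    for server, _ in workers:
        server.shutdown()
        server.server_close()
```

The client threads spend their time waiting on the network, so the global interpreter lock is not a bottleneck on that side.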

(It's a bit of my experience too - I really should try Pyro
for this sort of work. Well, I need a break so maybe I'll
try it out tonight ;)

There are a lot of skills to learn before it all works, so don't
get too discouraged too quickly.

Andrew
da***@dalkescientific.com
Jul 18 '05 #3
