Threaded for loop


I want to do something like this:

for i in range(N):
    for j in range(N):
        D[i][j] = calculate(i, j)

I would like to now do this using a fixed number of threads, say 10
threads.
What is the easiest way to do the "parfor" in python?

Thanks in advance for your help,
--j

Jan 13 '07 #1

John> I want to do something like this:

John> for i in range(N):
John>     for j in range(N):
John>         D[i][j] = calculate(i, j)

John> I would like to now do this using a fixed number of threads, say
John> 10 threads. What is the easiest way to do the "parfor" in python?

I'd create a queue containing 10 tokens. Pull a token off the queue, invoke
the thread with the parameters for its chunk, have it compute its bit, lock
D, update it, unlock it, then return the token to the token queue. Sketching
(and completely untested):

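import queue
import threading

# Calculate one row of D
def calcrow(i, N, token, Tqueue, Dqueue):
    d = [0.0] * N
    for j in range(N):
        d[j] = calculate(i, j)
    D = Dqueue.get()      # taking D out of its queue acts as the lock
    D[i][:] = d
    Dqueue.put(D)         # putting D back releases it
    Tqueue.put(token)     # return the token so another thread can start

# This queue limits the number of simultaneous threads
Tqueue = queue.Queue()
for i in range(10):
    Tqueue.put(i)

# This queue guards the shared matrix, D
Dqueue = queue.Queue()
D = []
for i in range(N):
    D.append([0.0] * N)
Dqueue.put(D)

for i in range(N):
    token = Tqueue.get()  # blocks while 10 threads are already running
    t = threading.Thread(target=calcrow,
                         args=(i, N, token, Tqueue, Dqueue))
    t.start()

# Wait for the remaining threads by collecting all 10 tokens
for i in range(10):
    Tqueue.get()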

Skip
Jan 13 '07 #2
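
For readers on a current Python 3, the same "at most 10 threads" idea can be sketched with the standard library's thread pool instead of a hand-rolled token queue. This is only a minimal sketch, not the code from the post above: `N` and the body of `calculate` are hypothetical stand-ins, and each task returns a whole row so that no lock on `D` is needed.

```python
from concurrent.futures import ThreadPoolExecutor

N = 4  # hypothetical matrix size


def calculate(i, j):
    # Hypothetical stand-in for the poster's calculate(i, j)
    return i * 10 + j


def calc_row(i):
    # Each task builds one complete row, so no lock on D is needed
    return [calculate(i, j) for j in range(N)]


# max_workers caps the number of simultaneous threads, like the token queue
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() returns rows in submission order, whatever order they finish in
    D = list(pool.map(calc_row, range(N)))
```

As the later replies point out, this limits concurrency but does not by itself make CPU-bound work run in parallel under CPython.
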
"John" <we**********@yahoo.comwrites:
I want to do something like this:

for i = 1 in range(0,N):
for j = 1 in range(0,N):
D[i][j] = calculate(i,j)

I would like to now do this using a fixed number of threads, say 10
threads. What is the easiest way to do the "parfor" in python?
It won't help in terms of actual parallelism. Python only lets one
thread run at a time, even on a multi-cpu computer.
Jan 13 '07 #3

Dennis Lee Bieber wrote:
> On 13 Jan 2007 12:15:44 -0800, "John" <we**********@yahoo.com> declaimed
> the following in comp.lang.python:
>
> > I want to do something like this:
> >
> > for i in range(N):
> >     for j in range(N):
> >         D[i][j] = calculate(i, j)
> >
> > I would like to now do this using a fixed number of threads, say 10
> > threads.
> > What is the easiest way to do the "parfor" in python?
> >
> > Thanks in advance for your help,
> > --j
>
> Don't know if it's the easiest -- and if "calculate" is a CPU-bound
> number cruncher with no I/O or other OS-blocking calls, it won't be
> faster either as the GIL will only let one run at a time, even on
> multi-core processors.

It could still be helpful if you'd like to get as much done as possible
in as short a time as possible, and you suspect that one or two cases
are likely to hold everything up.
Carl Banks

Jan 14 '07 #4
John wrote:
> I want to do something like this:
>
> for i in range(N):
>     for j in range(N):
>         D[i][j] = calculate(i, j)
>
> I would like to now do this using a fixed number of threads, say 10
> threads.
> What is the easiest way to do the "parfor" in python?
>
> Thanks in advance for your help,

As was already mentioned, threads will not help in terms of
parallelism (only one thread will actually be working at a time). If you
want to calculate this in parallel, here is an easy solution:

import ppsmp

# start with 10 processes
srv = ppsmp.Server(10)

f = []

for i in range(N):
    for j in range(N):
        # it might be a little bit more complex if 'calculate' depends on
        # other modules or calls functions
        f.append(srv.submit(calculate, (i, j)))

for i in range(N):
    for j in range(N):
        # assumption: as in the pp API, calling the submitted job blocks
        # until it finishes and returns its result
        D[i][j] = f.pop(0)()

You can get the latest version of ppsmp module here:
http://www.parallelpython.com/

Jan 14 '07 #5
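
The same submit-then-collect pattern can be sketched with nothing but the Python 3 standard library. This is not the ppsmp API; it swaps in `concurrent.futures.ProcessPoolExecutor`, and the body of `calculate`, the helper name `parfor`, and the matrix size are all hypothetical. The helper takes the executor as a parameter so the collection logic stays independent of how the workers are started.

```python
import concurrent.futures


def calculate(i, j):
    # Hypothetical stand-in for a CPU-bound calculate(i, j); it must be a
    # module-level function so it can be pickled and sent to worker processes
    return sum(k * k for k in range(i + j))


def parfor(executor, n):
    # Submit every (i, j) cell, then collect the results into the matrix
    futures = {(i, j): executor.submit(calculate, i, j)
               for i in range(n) for j in range(n)}
    D = [[0] * n for _ in range(n)]
    for (i, j), fut in futures.items():
        D[i][j] = fut.result()  # blocks until that cell is done
    return D


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # the script
    with concurrent.futures.ProcessPoolExecutor(max_workers=10) as pool:
        D = parfor(pool, 5)
    print(D[4][4])
```

Because each cell is a separate task, the per-task overhead can dominate when `calculate` is cheap; submitting one row per task is a common way to reduce it.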

Damn! That is bad news. So even if calculate is independent for (i,j)
and is computable on separate CPUs (parts of it are CPU bound, parts are
IO bound), python can't take advantage of this?

Surprised,
--Tom

Paul Rubin wrote:
> "John" <we**********@yahoo.com> writes:
> > I want to do something like this:
> >
> > for i in range(N):
> >     for j in range(N):
> >         D[i][j] = calculate(i, j)
> >
> > I would like to now do this using a fixed number of threads, say 10
> > threads. What is the easiest way to do the "parfor" in python?
>
> It won't help in terms of actual parallelism. Python only lets one
> thread run at a time, even on a multi-cpu computer.
Jan 14 '07 #6

Damn! That is bad news. So even if calculate is independent for (i,j)
and is computable on separate CPUs (parts of it are CPU bound, parts are
IO bound), python can't take advantage of this?

Surprised,
--j

Paul Rubin wrote:
> "John" <we**********@yahoo.com> writes:
> > I want to do something like this:
> >
> > for i in range(N):
> >     for j in range(N):
> >         D[i][j] = calculate(i, j)
> >
> > I would like to now do this using a fixed number of threads, say 10
> > threads. What is the easiest way to do the "parfor" in python?
>
> It won't help in terms of actual parallelism. Python only lets one
> thread run at a time, even on a multi-cpu computer.
Jan 14 '07 #7


Thanks. Does it matter if I call shell commands (os.system, etc.) in
calculate?

Thanks,
--j

pa************@gmail.com wrote:
> John wrote:
> > I want to do something like this:
> >
> > for i in range(N):
> >     for j in range(N):
> >         D[i][j] = calculate(i, j)
> >
> > I would like to now do this using a fixed number of threads, say 10
> > threads.
> > What is the easiest way to do the "parfor" in python?
> >
> > Thanks in advance for your help,
>
> As it was already mentioned before threads will not help in terms of
> parallelism (only one thread will be actually working). If you want to
> calculate this in parallel here is an easy solution:
>
> import ppsmp
>
> # start with 10 processes
> srv = ppsmp.Server(10)
>
> f = []
>
> for i in range(N):
>     for j in range(N):
>         # it might be a little bit more complex if 'calculate' depends on
>         # other modules or calls functions
>         f.append(srv.submit(calculate, (i, j)))
>
> for i in range(N):
>     for j in range(N):
>         # assumption: as in the pp API, calling the submitted job blocks
>         # until it finishes and returns its result
>         D[i][j] = f.pop(0)()
>
> You can get the latest version of ppsmp module here:
> http://www.parallelpython.com/
Jan 14 '07 #8
"John" <we**********@yahoo.com> writes:
> Damn! That is bad news. So even if calculate is independent for
> (i,j) and is computable on separate CPUs (parts of it are CPU bound,
> parts are IO bound) python can't take advantage of this?

Not at the moment, unless you write C extensions that release the
global interpreter lock (GIL). One of these days. Meanwhile there
are various extension modules that let you use multiple processes,
look up POSH and Pyro.
Jan 14 '07 #9
John wrote:
> Thanks. Does it matter if I call shell commands os.system...etc in
> calculate?
>
> Thanks,
> --j

The os.system command neglects important changes in the environment
(redirected streams) and would not work with the current version of ppsmp.
There is a very simple workaround, though: use
    print(os.popen("yourcommand").read())
instead of
    os.system("yourcommand")
Here is a complete working example of that code:
http://www.parallelpython.com/compon...,29/topic,13.0

Jan 14 '07 #10
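
If most of the time in calculate is spent waiting on an external command, plain threads may already be enough, because the waiting happens in a separate process and the calling thread releases the GIL while it waits. The following is a rough sketch using the standard library's `subprocess` module rather than ppsmp; the child command is just another Python interpreter printing `i * j`, a hypothetical stand-in for a real shell command.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def calculate(i, j):
    # Hypothetical calculate() that shells out; the child here is another
    # Python interpreter printing i * j, standing in for a real command
    result = subprocess.run(
        [sys.executable, "-c", f"print({i} * {j})"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


N = 3
cells = [(i, j) for i in range(N) for j in range(N)]

# At most 4 child processes run at once, one per waiting thread
with ThreadPoolExecutor(max_workers=4) as pool:
    values = list(pool.map(lambda ij: calculate(*ij), cells))

# Reshape the flat result list back into an N x N matrix
D = [values[i * N:(i + 1) * N] for i in range(N)]
```

Passing the command as a list avoids going through a shell at all, which sidesteps quoting problems when the arguments come from data.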

John> Damn! That is bad news. So even if calculate is independent for
John> (i,j) and is computable on separate CPUs (parts of it are CPU
John> bound, parts are IO bound) python can't take advantage of this?

It will help if parts are I/O bound, presuming the threads which block
release the global interpreter lock (GIL).

There is a module in development (processing.py) that provides an API like
the threading module but that uses processes under the covers:

http://mail.python.org/pipermail/pyt...er/069297.html

You might find that an interesting alternative.

Skip

Jan 14 '07 #11
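
The point about I/O-bound work can be seen in a small sketch. It uses `time.sleep` as a hypothetical stand-in for a blocking read or network call (like real blocking I/O, it releases the GIL while waiting), and the function name `fake_io` is made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(i):
    # time.sleep stands in for a blocking read or network call;
    # like real blocking I/O, it releases the GIL while waiting
    time.sleep(0.2)
    return i


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_io, range(10)))
elapsed = time.perf_counter() - start

# Run serially, ten 0.2 s waits would take about 2 s
print(f"{elapsed:.2f} s for 10 waits")
```

The ten waits overlap, so the total is close to one wait rather than ten. Replacing the sleep with a pure-Python number-crunching loop would not show the same gain under CPython.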

John wrote:
> I want to do something like this:
>
> for i in range(N):
>     for j in range(N):
>         D[i][j] = calculate(i, j)
>
> I would like to now do this using a fixed number of threads, say 10
> threads.

Why do you want to run this in 10 threads? Do you have 10 CPUs?

If you are concerned about CPU time, you should not be using threads
(regardless of language) as they are often implemented with the
assumption that they stay idle most of the time (e.g. win32 threads and
pthreads). In addition, CPython has a global interpreter lock (GIL)
that prevents the interpreter from running on several processors in
parallel. It means that python threads are a tool for things like
writing non-blocking i/o and maintaining responsiveness in a GUI. But
that is what threads are implemented to do anyway, so it doesn't
matter. IronPython and Jython do not have a GIL.

In order to speed up computation you should run multiple processes and
do some sort of IPC. Take a look at MPI (e.g. mpi4py.scipy.org) or
'parallel python'. MPI is the de facto industry standard for dealing
with CPU bound problems on systems with multiple processors, whether
the memory is shared or distributed does not matter. Contrary to common
belief, this approach is more efficient than running multiple threads,
sharing memory and synchronizing with mutexes and event objects - even
if you are using a system unimpeded by a GIL.

The number of parallel tasks should be equal to the number of available
CPU units, not more, as you will get excessive context switches if the
number of busy threads or processes exceeds the number of computational
units. If you only have two logical CPUs (e.g. one dual-core processor)
you should only run two parallel tasks - not ten. If you try to
parallelize using additional tasks (e.g. 8 more), you will just waste
time doing more context switches, more cache misses, etc. But if you are
a lucky bastard with access to a 10-way server, sure run 10 tasks in
parallel.

Jan 14 '07 #12
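
The sizing advice above, for CPU-bound work, can be sketched in a few lines. The helper name `worker_count` is hypothetical; `os.cpu_count()` is the standard-library call that reports the number of logical CPUs.

```python
import os


def worker_count(requested):
    # os.cpu_count() can return None when the count cannot be determined
    available = os.cpu_count() or 1
    # Never more workers than logical CPUs, and never fewer than one
    return max(1, min(requested, available))


print(worker_count(10))
```

On a dual-core machine this would give 2 even when 10 are requested. For tasks that mostly wait on I/O the cap is less relevant, since waiting tasks do not compete for a CPU.
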
sk**@pobox.com wrote:
> There is a module in development (processing.py) that provides an API like
> the threading module but that uses processes under the covers:
>
> http://mail.python.org/pipermail/pyt...er/069297.html
>
> You might find that an interesting alternative.

See the promised parallel processing overview on the python.org Wiki
for a selection of different solutions:

http://wiki.python.org/moin/ParallelProcessing

Paul

Jan 14 '07 #13
