469,342 Members | 5,392 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,342 developers. It's quick & easy.

speeding up Python script

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have a 1000 line python script that takes many hours to finish. It is
running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good enough.

How can I dramatically improve speed?

I tried to find some tool that converts Python to C automatically but
couldn't. As I don't know C, I think that weave and PyInline for
example are out of the solution.

I'm using Linux.

Luis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCio0YHn4UHCY8rB8RAoe0AKC4rj/dxE2HGw4xtCMEmVGDgrGEEwCghO1r
+Rkgx8j1BzD0Lxq/iPey23s=
=xv4O
-----END PGP SIGNATURE-----
Jul 19 '05 #1
10 1835
Without seeing any code, it's hard to tell, but it's not a wild guess
that 'six inside for loops' may be replaced by more efficient ways ;-)

Jul 19 '05 #2
You may want to read through this case study by the BDFL.

http://www.python.org/doc/essays/list2str.html

On Tuesday 17 May 2005 05:32 pm, Luis P. Mendes wrote:
Hi,

I have a 1000 line python script that takes many hours to finish. It is
running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good enough.

How can I dramatically improve speed?

I tried to find some tool that converts Python to C automatically but
couldn't. As I don't know C, I think that weave and PyInline for
example are out of the solution.

I'm using Linux.

Luis


--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
Jul 19 '05 #3
"Luis P. Mendes" <lu***********@netvisaoXXX.pt> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have a 1000 line python script that takes many hours to finish.
It is running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good
enough.

How can I dramatically improve speed?


You are looking for better algorithm, not for Python optimisations.

--
http://www.peter.dembinski.prv.pl
Jul 19 '05 #4
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input.

Here's the simplified code:

for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():
var a, \
var b, \
...
var 15 = function1(arg1, \
...
arg20)
~ var a, \
var d, \
var a, \
...
var 14 = function2(arg1, \
...
arg25)
var c, \
...
var 18 = function3(arg1, \
...
arg20)
ia = var1*var2-numPerdidos*sl
result.insert(index, [per,s,t,v,n,l,ia])
index = index + 1

thanks for the replies
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCiy2CHn4UHCY8rB8RAlafAJ4wQ8mSCGFwTTt9EH1bn5 4tTuI40wCfSrn8
KDSrHnsIV2cY/PKmL3ZwLdI=
=39Cm
-----END PGP SIGNATURE-----
Jul 19 '05 #5
Luis P. Mendes wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input.

Here's the simplified code:

for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():


What's the size of each dimension in the parameter space? If you
absolutely need an exhaustive search, then you are going to have to live
with the long runtimes. The fact that psyco didn't help significantly
suggests that Python optimizations aren't going to help much.

You should examine your problem and determine if you really need an
exhaustive search or if you can live with some other search strategy.
Read up on "combinatorial optimzation" and "discrete optimization."

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #6
On Wed, 18 May 2005 12:56:50 +0100,
"Luis P. Mendes" <lu***********@netvisaoXXX.pt> wrote:
The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input. Here's the simplified code: for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():
var a, \
var b, \
...
var 15 = function1(arg1, \
...
arg20)
~ var a, \
var d, \
var a, \
...
var 14 = function2(arg1, \
...
arg25)
var c, \
...
var 18 = function3(arg1, \
...
arg20)
ia = var1*var2-numPerdidos*sl
result.insert(index, [per,s,t,v,n,l,ia])
index = index + 1 thanks for the replies


I don't know how large your search-space is, but if the ranges for s, t,
v, n, and/or l don't change, at the very least you can pre-compute those
ranges instead of creating a new list with range every time:

srange = range( <whatever> )
trange = range( <whatever> )
vrange = range( <whatever> )
nrange = range( <whatever> )
lrange = range( <whatever> )

for per in range( <whatever> ):
for s in srange:
for t in trange:
for v = vrange:
for n in nrange:
for l in nrange:
somevar = somefunction( per, s, t, v, n, l )
result.append( [ somevar, per, s, t, v, n, l ] )

(Appending the results is probably quicker than inserting them, but I
don't know for sure.)

But a better-than-an-exhaustive-search algorithm sounds like a good
idea, too.

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
Jul 19 '05 #7
Quick tip-
Try xrange instead of range. This will use dramatically less memory
if your search space is large, which will speed things up /if/ your
machine is being forced to swap.

Besides that, without seeing the code for your functions, it's hard to
offer more advice. Your algorithm is necessarily going to take a long
time because you're enumerating all of the possible values of your
functions in a six dimensional space!! Recoding the whole program in
C would get you maybe an order of magnitude increase for any given
problem size, but it will still get really slow as the size of your
search space increases.

What are function1, function2, and function3? You might really
consider looking into some numerical methods for optimization if you're
just trying to find the smallest/largest value of ia.

Jul 19 '05 #8
On 2005-05-18, Luis P. Mendes <lu***********@netvisaoXXX.pt> wrote:
I have a 1000 line python script that takes many hours to
finish. It is running with six inside 'for' loops.
[...]
How can I dramatically improve speed?


In probably order of efficacy:

1) Use a better algorithm

2) Replace 'for' loops with vector operations or list
comprehensions

3) Install more RAM

4) Install a faster disk drive

5) Use a faster CPU

6) Rent DVDs to watch while program is running.

Before you do anything, you should profile your program to see
where the time is being spent.

--
Grant Edwards grante Yow! Hand me a pair of
at leather pants and a CASIO
visi.com keyboard -- I'm living
for today!
Jul 19 '05 #9
It looks like your algorithm really does iterate over all values for
six variables and do lots of math.. then you can't do any better than
implementing the inner loop in C. It does look like you have some
functions that are being called that are also in python, and it would
be interesting to see if just one of those was the culprit. If so,
you might be able to replace just one of those calls with something
faster.

I'm no expert, but I have used swig once or twice, I might be able to
help. What do you think?

If you are going through all the data just to find a minimum or
something, then you might be able to get away with not visiting all of
the values of all six variables. Some sort of gradient descent
minimization is probably already out there written in C ready to be
used.

-Jim
On 5/18/05, Grant Edwards <gr****@visi.com> wrote:
On 2005-05-18, Luis P. Mendes <lu***********@netvisaoXXX.pt> wrote:
I have a 1000 line python script that takes many hours to
finish. It is running with six inside 'for' loops.

Jul 19 '05 #10
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I appreciate everyone's help!

I got some ideas that I'll try to put into practice.

Regards,

Luis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCi7QQHn4UHCY8rB8RAkpWAJsGezFRCjDpVvbDiuEN8a 2lPY41AgCgm6w4
3hgtkw5NkxZ6GuzC1k2dIYo=
=Epti
-----END PGP SIGNATURE-----
Jul 19 '05 #11

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

7 posts views Thread by Andrew Chalk | last post: by
7 posts views Thread by Pankaj | last post: by
6 posts views Thread by tatamata | last post: by
7 posts views Thread by hlubenow | last post: by
1 post views Thread by michael.buonomo | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.