Question about unreasonable slowness

[ Warning: I'm new to Python. Don't know it at all really yet, but had
to examine some 3rd party code because of performance problems with it.
]

Here's a code snippet:

i = 0
while (i < 20):
    i = i + 1
    (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")  # for testing, the spawned shell does nothing
    print 'next'
#    for line in shellOut:
#        print line

On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
if I uncomment the two commented lines, which loop over the empty
shellOut array, the program now takes 11 secs. That slowdown seems
very hard to believe. Why should it slow down so much?

John.

Nov 16 '06 #1
al******@mail.northgrum.com wrote:
> i = 0
> while (i < 20):
>     i = i + 1

for i in xrange(20):

>     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")  # for testing, the spawned shell does nothing
>     print 'next'
> #    for line in shellOut:
> #        print line
>
> On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
> loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
> if I uncomment the two commented lines, which loop over the empty
> shellOut array, the program now takes 11 secs. That slowdown seems
> very hard to believe. Why should it slow down so much?

The key fact here is that shellOut isn't an array; it's a living,
breathing file object. If you don't iterate over it, you can run all 20
shell processes in parallel if necessary; but if you do iterate over it,
you're waiting for sh's stdout pipe to reach EOF, which effectively
means you can only run one process at a time.

On my system (OS X 10.4 with Python 2.5 installed), your code runs in
.187 secs with the loop commented out, and in .268 secs otherwise. But I
guess AIX's sh is slower than OS X's.
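
The difference between "spawn and move on" and "spawn and read to EOF" can be sketched in modern terms. This is only an illustration under assumptions that are not in the thread: it uses Python 3 and the subprocess module (os.popen4 no longer exists there), and a trivial Python child stands in for "/bin/sh -c ':'". The helper names are made up for the example.

```python
import subprocess
import sys
import time

# Assumption: a child that exits at once stands in for "/bin/sh -c ':'".
CHILD = [sys.executable, "-c", "pass"]


def spawn_only(n):
    """Start n children without reading their output; they may overlap."""
    return [
        subprocess.Popen(CHILD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        for _ in range(n)
    ]


def spawn_and_read(n):
    """Start n children, draining each pipe to EOF before starting the next."""
    outputs = []
    for _ in range(n):
        proc = subprocess.Popen(
            CHILD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        # Iterating blocks until the write end of the pipe is closed,
        # which normally happens only when the child exits.
        outputs.append([line for line in proc.stdout])
        proc.stdout.close()
        proc.wait()
    return outputs


if __name__ == "__main__":
    start = time.perf_counter()
    procs = spawn_only(5)
    print("spawn only:     %.3f s" % (time.perf_counter() - start))
    for proc in procs:  # reap the children so none are left behind
        proc.communicate()

    start = time.perf_counter()
    spawn_and_read(5)
    print("spawn and read: %.3f s" % (time.perf_counter() - start))
```

With the first helper the children can still be running after the timing stops; with the second, each child's full lifetime is included in the measurement.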
Nov 16 '06 #2
On Thu, 16 Nov 2006 12:45:18 -0800, allenjo5 wrote:
> [ Warning: I'm new to Python. Don't know it at all really yet, but had
> to examine some 3rd party code because of performance problems with it.
> ]
>
> Here's a code snippet:
>
> i = 0
> while (i < 20):
>     i = i + 1

You probably want to change that to:

for i in range(20):

If 20 is just a place-holder, and the real value is much bigger, change
the range() to xrange().

>     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")  # for testing, the spawned shell does nothing
>     print 'next'
> #    for line in shellOut:
> #        print line
>
> On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
> loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
> if I uncomment the two commented lines, which loop over the empty
> shellOut array, the program now takes 11 secs. That slowdown seems
> very hard to believe. Why should it slow down so much?

What are you using to time the code?

Replacing print statements with "pass", I get these results:

>>> import timeit
>>>
>>> def test():
...     i = 0
...     while (i < 20):
...         i = i + 1
...         (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
...         pass # print 'next'
...         for line in shellOut:
...             pass # print line
...
>>> timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)
0.49781703948974609
>>> timeit.Timer("test()", "from __main__ import test\nimport os").timeit(100)
54.894074201583862

About 0.5 second to open and dispose of 20 subshells, even with the "for
line in shellOut" loop.
I think you need some more fine-grained testing to determine whether the
slowdown is actually happening inside the "for line in shellOut" loop or
inside the while loop or when the while loop completes.
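
One way to get that finer-grained picture is to time the spawn and the read separately. The following is a rough sketch, not the code from the thread: it assumes Python 3 with subprocess in place of os.popen4, uses a trivial Python child as a stand-in for the shell, and time_phases is a name invented for the example.

```python
import subprocess
import sys
import time


def time_phases(n=5):
    """Return (total_spawn_seconds, total_read_seconds) over n children."""
    spawn_total = 0.0
    read_total = 0.0
    for _ in range(n):
        t0 = time.perf_counter()
        proc = subprocess.Popen(
            [sys.executable, "-c", "pass"], stdout=subprocess.PIPE
        )
        t1 = time.perf_counter()
        for _line in proc.stdout:  # blocks until the child closes the pipe
            pass
        proc.stdout.close()
        proc.wait()
        t2 = time.perf_counter()
        spawn_total += t1 - t0
        read_total += t2 - t1
    return spawn_total, read_total


if __name__ == "__main__":
    spawn_s, read_s = time_phases()
    print("spawning: %.3f s, reading: %.3f s" % (spawn_s, read_s))
```

If one of the two totals dominates, that narrows down where to look next.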

--
Steven.

Nov 16 '06 #4
Leif K-Brooks wrote:
> al******@mail.northgrum.com wrote:
>> i = 0
>> while (i < 20):
>>     i = i + 1
>
> for i in xrange(20):
>
>>     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")  # for testing, the spawned shell does nothing
>>     print 'next'
>> #    for line in shellOut:
>> #        print line
>>
>> On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
>> loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
>> if I uncomment the two commented lines, which loop over the empty
>> shellOut array, the program now takes 11 secs. That slowdown seems
>> very hard to believe. Why should it slow down so much?
>
> The key fact here is that shellOut isn't an array; it's a living,
> breathing file object. If you don't iterate over it, you can run all 20
> shell processes in parallel if necessary; but if you do iterate over it,
> you're waiting for sh's stdout pipe to reach EOF, which effectively
> means you can only run one process at a time.

Aha! I now notice that with the second loop commented out, I see many
python processes running for a little while after the main program
ends. So that confirms what you stated.

> On my system (OS X 10.4 with Python 2.5 installed), your code runs in
> .187 secs with the loop commented out, and in .268 secs otherwise. But I
> guess AIX's sh is slower than OS X's.

Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
python 2.4.3. So, that's better, but still unreasonably slow. And to
answer another's question, I'm using the ksh builtin 'time' command to
time the overall script.

BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
This naively translated pure shell version of my python test script
completes in .1 secs:

i=1
while ((i<20))
do ((i+=1))
   print next
   print "$shellIn" | /bin/sh -c ':' |
   while read line
   do print $line
   done
done

Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
It seems to be functioning differently (and faster) on Windows and OS X
(though I guess at its heart, OS X is essentially unix).

John.

Nov 17 '06 #5
al******@mail.northgrum.com:
> Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
> shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
> python 2.4.3. So, that's better, but still unreasonably slow. And to
> answer another's question, I'm using the ksh builtin 'time' command to
> time the overall script.
>
> BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
> This naively translated pure shell version of my python test script
> completes in .1 secs:
>
> i=1
> while ((i<20))
> do ((i+=1))
>    print next
>    print "$shellIn" | /bin/sh -c ':' |
>    while read line
>    do print $line
>    done
> done
>
> Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
> It seems to be functioning differently (and faster) on Windows and OS X
> (though I guess at its heart, OS X is essentially unix).
>
> John.

Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686
i386 GNU/Linux

# <code>

import os
import timeit

def test():
    for i in xrange(20):
        (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
        print 'next'
        for line in shellOut:
            print line

print timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)

# </code>

This returns in 0.4 seconds. If I time it to do 50 tests, it returns
after 20.2 - 20.5 seconds. Even if I replace the for i in xrange()
construct with your sh-like while statement. And all that through a
network, with print statements intact. Guess your true Unix box has some
features unavailable on Fedora Core or MacOS X ;-)

Regards,
Łukasz Langa
Nov 17 '06 #6

Łukasz Langa wrote:
> al******@mail.northgrum.com:
>> Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
>> shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
>> python 2.4.3. So, that's better, but still unreasonably slow. And to
>> answer another's question, I'm using the ksh builtin 'time' command to
>> time the overall script.
>>
>> BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
>> This naively translated pure shell version of my python test script
>> completes in .1 secs:
>>
>> i=1
>> while ((i<20))
>> do ((i+=1))
>>    print next
>>    print "$shellIn" | /bin/sh -c ':' |
>>    while read line
>>    do print $line
>>    done
>> done
>>
>> Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
>> It seems to be functioning differently (and faster) on Windows and OS X
>> (though I guess at its heart, OS X is essentially unix).
>>
>> John.
>
> Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686
> i386 GNU/Linux
>
> # <code>
>
> import os
> import timeit
>
> def test():
>     for i in xrange(20):
>         (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
>         print 'next'
>         for line in shellOut:
>             print line
>
> print timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)
>
> # </code>
>
> This returns in 0.4 seconds. If I time it to do 50 tests, it returns
> after 20.2 - 20.5 seconds. Even if I replace the for i in xrange()
> construct with your sh-like while statement. And all that through a
> network, with print statements intact. Guess your true Unix box has some
> features unavailable on Fedora Core or MacOS X ;-)

Yeah, apparently this is an AIX specific issue - perhaps the python
implementation of popen4() needs to do something special for AIX?

I've since tested my script on SunOS 5.9 with Python 2.4.2, and it took
only about 1.5 sec with or without the second for loop, but without it,
there were no extra python processes running in the background when the
main one ends, unlike what I saw on AIX. This might be a clue to
someone who knows more than I do... any Python gurus out there running
AIX?

John.

Nov 17 '06 #7
On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:
>> al******@mail.northgrum.com:
>>> Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
>>> shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
>>> python 2.4.3. So, that's better, but still unreasonably slow. And to
>>> answer another's question, I'm using the ksh builtin 'time' command to
>>> time the overall script.
>>>
>>> BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
>>> This naively translated pure shell version of my python test script
>>> completes in .1 secs:
>>>
>>> i=1
>>> while ((i<20))
>>> do ((i+=1))
>>>    print next
>>>    print "$shellIn" | /bin/sh -c ':' |
>>>    while read line
>>>    do print $line
>>>    done
>>> done
>
> Yeah, apparently this is an AIX specific issue - perhaps the python
> implementation of popen4() needs to do something special for AIX?

This seems likely a more general issue, rather than just a python issue
(despite the huge speed-up from moving to 2.5.x). A
couple of things I'd try:

1. Split the spawn/IO apart, twenty procs. should be fine.
2. Try making the pipe buffer size bigger (optional third argument to
os.popen4).
3. Note that you might well be spawning three processes, and
are definitely doing two shells. Any shell init. slowness is going to be
no fun. Use an array to run /bin/true, and time that.
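
The third suggestion, running a command from an argument list so that no intermediate shell is started, might look roughly like this today. It is a sketch under assumptions that are not in the thread: Python 3 and subprocess instead of os.popen4, and the Python interpreter itself as a stand-in for /bin/true so the example does not depend on where that binary lives. The function names are invented for the example.

```python
import shlex
import subprocess
import sys
import timeit

# Assumption: a do-nothing Python child stands in for /bin/true.
ARGV = [sys.executable, "-c", "pass"]


def run_direct():
    """Start the program straight from an argument list: no shell involved."""
    return subprocess.call(ARGV)


def run_via_shell():
    """Start the same program through /bin/sh -c, adding one shell process."""
    command = shlex.quote(sys.executable) + " -c pass"
    return subprocess.call(command, shell=True)


if __name__ == "__main__":
    print("direct:    %.3f s" % timeit.timeit(run_direct, number=5))
    print("via shell: %.3f s" % timeit.timeit(run_via_shell, number=5))
```

Comparing the two timings gives a rough idea of how much of the cost comes from shell start-up rather than from process creation itself.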

--
James Antill -- ja***@and.org
http://www.and.org/and-httpd/ -- $2,000 security guarantee
http://www.and.org/vstr/
Nov 20 '06 #8

James Antill wrote:
> On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:
>>> al******@mail.northgrum.com:
>>>> Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
>>>> shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
>>>> python 2.4.3. So, that's better, but still unreasonably slow. And to
>>>> answer another's question, I'm using the ksh builtin 'time' command to
>>>> time the overall script.
>>>>
>>>> BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
>>>> This naively translated pure shell version of my python test script
>>>> completes in .1 secs:
>>>>
>>>> i=1
>>>> while ((i<20))
>>>> do ((i+=1))
>>>>    print next
>>>>    print "$shellIn" | /bin/sh -c ':' |
>>>>    while read line
>>>>    do print $line
>>>>    done
>>>> done
>>
>> Yeah, apparently this is an AIX specific issue - perhaps the python
>> implementation of popen4() needs to do something special for AIX?
>
> This seems likely a more general issue, rather than just a python issue
> (despite the huge speed-up from moving to 2.5.x). A
> couple of things I'd try:

With help from c.u.aix, I've discovered the problem. Python (in
popen2.py) is attempting to close file descriptors 3 through 32767
before running the /bin/sh. This is because os.sysconf('SC_OPEN_MAX')
is returning 32767. So far, it looks like SC_OPEN_MAX is being set
correctly to 4 in posixmodule.c, and indeed, os.sysconf_names seems to
also have SC_OPEN_MAX set to 4:

python -c 'import os; print os.sysconf_names'

...
'SC_XOPEN_XCU_VERSION': 109, 'SC_OPEN_MAX': 4, 'SC_PRIORITIZED_IO': 91,
...

In fact, none of the values that sysconf_names has set for the various
constants are being returned by os.sysconf(). For example, the 2
others I just listed:

$ ./python -c 'import os; print os.sysconf("SC_XOPEN_XCU_VERSION")'
4

$ ./python -c 'import os; print os.sysconf("SC_PRIORITIZED_IO")'
-1

This makes no sense to me... unless there is some memory alignment or
endian issue going on here?
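
For anyone reproducing this, the two numbers can be printed side by side. The sketch below assumes Python 3 syntax (the thread uses Python 2) and a Unix-like system where os.sysconf is available; describe is a helper name invented here. The Python documentation describes os.sysconf_names as a mapping from the names accepted by sysconf() to the integer values the host operating system defines for those names, which is a different thing from the value os.sysconf() reports.

```python
import os


def describe(name):
    """Return (entry, value) for one sysconf name.

    entry is the integer stored in os.sysconf_names for the name;
    value is what os.sysconf() reports for it on this system.
    """
    entry = os.sysconf_names[name]
    value = os.sysconf(name)
    return entry, value


if __name__ == "__main__":
    for name in ("SC_OPEN_MAX", "SC_PAGE_SIZE"):
        if name in os.sysconf_names:
            print(name, describe(name))
```

os.sysconf() also accepts the integer entry directly, so os.sysconf(entry) and os.sysconf(name) should report the same value.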

Nov 21 '06 #9

al******@mail.northgrum.com wrote:
> James Antill wrote:
>> On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:
>>>> al******@mail.northgrum.com:
>>>>> Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
>>>>> shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
>>>>> python 2.4.3. So, that's better, but still unreasonably slow. And to
>>>>> answer another's question, I'm using the ksh builtin 'time' command to
>>>>> time the overall script.
>>>>>
>>>>> BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
>>>>> This naively translated pure shell version of my python test script
>>>>> completes in .1 secs:
>>>>>
>>>>> i=1
>>>>> while ((i<20))
>>>>> do ((i+=1))
>>>>>    print next
>>>>>    print "$shellIn" | /bin/sh -c ':' |
>>>>>    while read line
>>>>>    do print $line
>>>>>    done
>>>>> done
>>>
>>> Yeah, apparently this is an AIX specific issue - perhaps the python
>>> implementation of popen4() needs to do something special for AIX?
>>
>> This seems likely a more general issue, rather than just a python issue
>> (despite the huge speed-up from moving to 2.5.x). A
>> couple of things I'd try:
>
> With help from c.u.aix, I've discovered the problem. Python (in
> popen2.py) is attempting to close file descriptors 3 through 32767
> before running the /bin/sh. This is because os.sysconf('SC_OPEN_MAX')
> is returning 32767. So far, it looks like SC_OPEN_MAX is being set
> correctly to 4 in posixmodule.c, and indeed, os.sysconf_names seems to
> also have SC_OPEN_MAX set to 4:
>
> python -c 'import os; print os.sysconf_names'
>
> ...
> 'SC_XOPEN_XCU_VERSION': 109, 'SC_OPEN_MAX': 4, 'SC_PRIORITIZED_IO': 91,
> ...
>
> In fact, none of the values that sysconf_names has set for the various
> constants are being returned by os.sysconf(). For example, the 2
> others I just listed:
>
> $ ./python -c 'import os; print os.sysconf("SC_XOPEN_XCU_VERSION")'
> 4
>
> $ ./python -c 'import os; print os.sysconf("SC_PRIORITIZED_IO")'
> -1
>
> This makes no sense to me... unless there is some memory alignment or
> endian issue going on here?

More info: clearly I had no idea what I was talking about :-)

The numbers associated with the names returned by os.sysconf_names are
the indices to an array that the C sysconf() function uses to return
the value of the name. So, the fact that os.sysconf("SC_OPEN_MAX")
was returning 32767 on AIX is correct. However, the slowness this
causes is still an issue. This is because python is closing all these
file descriptors in python code, not C code - specifically, in
popen2.py:

try:
    MAXFD = os.sysconf('SC_OPEN_MAX')
except (AttributeError, ValueError):
    MAXFD = 256

...

    def _run_child(self, cmd):
        if isinstance(cmd, basestring):
            cmd = ['/bin/sh', '-c', cmd]
        for i in range(3, MAXFD):
            try:
                os.close(i)
            except OSError:
                pass
        try:
            os.execvp(cmd[0], cmd)
        finally:
            os._exit(1)

Any chance the "for i in range(3, MAXFD):" loop could be done in C
instead? Even having, say, an os.rclose(x,y) low level function to
close all file descriptors in range [x,y] would be great.
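
A rough way to compare the per-descriptor Python loop with a single range-closing call is sketched below. Assumptions that are not in the thread: Python 3, and os.closerange(fd_low, fd_high), which later Python releases provide (added in 2.6, to my knowledge) and which closes descriptors from fd_low up to but not including fd_high while ignoring errors. The measurement runs in a throwaway child process so that closing descriptors cannot disturb the interpreter running the example; measure and CHILD_CODE are names invented here.

```python
import subprocess
import sys

# Code executed in a throwaway child process. HIGH mirrors the
# SC_OPEN_MAX value reported in the thread (an illustrative choice).
CHILD_CODE = r"""
import os, time
HIGH = 32767
t0 = time.perf_counter()
for fd in range(3, HIGH):      # one os.close() call per descriptor
    try:
        os.close(fd)
    except OSError:
        pass
t1 = time.perf_counter()
os.closerange(3, HIGH)         # a single call; errors are ignored
t2 = time.perf_counter()
print(t1 - t0, t2 - t1)
"""


def measure():
    """Return (loop_seconds, closerange_seconds) as measured in a child."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE])
    loop_s, range_s = (float(x) for x in out.decode().split())
    return loop_s, range_s


if __name__ == "__main__":
    loop_s, range_s = measure()
    print("python loop:  %.4f s" % loop_s)
    print("closerange:   %.4f s" % range_s)
```

The absolute numbers depend heavily on the machine and the descriptor limit, so they are only meaningful relative to each other on the same system.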

John.

Nov 21 '06 #10
