
Python 2.2.1 and select()

Hi kids!

I've got some code that uses select.select() to capture all the output
of a subprocess (both stdout and stderr, see below). This code works
as expected on a variety of Fedora systems running Python 2.4.0, but
on a Debian Sarge system running Python 2.2.1 it's a no-go. I'm
thinking this is a bug in that particular version of Python, but I'd
like to have confirmation if anyone can provide it.

The behavior I see is this: the call to select() returns:
[<file corresponding to sub-proc's STDOUT>] [] []

If and only if the total amount of output is greater than the
specified buffer size, then reading on this file hangs indefinitely.
For what it's worth, the program whose output I need to capture with
this generates about 17k of output to STDERR, and about 1k of output
to STDOUT, at essentially random intervals. But I also ran it with a
test shell script that generates roughly the same amount of output to
each file object, alternating between STDOUT and STDERR, with the same
results.

Yes, I'm aware that this version of Python is quite old, but I don't
have a great deal of control over that (though if this is indeed a
python bug, as opposed to a problem with my implementation, it might
provide some leverage to get it upgraded)... Thanks in advance for
any help you can provide. The code in question (quite short) follows:

import popen2
import select

def capture(cmd):
    buffsize = 8192
    inlist = []
    inbuf = ""
    errbuf = ""

    io = popen2.Popen3(cmd, True, buffsize)
    inlist.append(io.fromchild)
    inlist.append(io.childerr)
    while True:
        ins, outs, excepts = select.select(inlist, [], [])
        for i in ins:
            x = i.read()
            if not x:
                inlist.remove(i)
            else:
                if i == io.fromchild:
                    inbuf += x
                if i == io.childerr:
                    errbuf += x
        if not inlist:
            break
    if io.wait():
        raise FailedExitStatus, errbuf
    return (inbuf, errbuf)

If anyone would like, I could also provide a shell script and a main
program one could use to test this function...

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

Mar 24 '08 #1
4 Replies


On Mar 24, 2:58 pm, Derek Martin <c...@pizzashack.org> wrote:
> If and only if the total amount of output is greater than the
> specified buffer size, then reading on this file hangs indefinitely.
> For what it's worth, the program whose output I need to capture with
> this generates about 17k of output to STDERR, and about 1k of output
> to STDOUT, at essentially random intervals. But I also ran it with a
> test shell script that generates roughly the same amount of output to
> each file object, alternating between STDOUT and STDERR, with the same
> results.

I think this is more of a limitation of the underlying C library
(stdio). When a subprocess's stdout is a pipe rather than a terminal,
stdio defaults to block buffering instead of line buffering. You can't
change this unless you can recompile the application you are trying to
run in a subprocess, or unless you run your subprocess in a
pseudo-terminal (pty).

Pexpect takes care of this problem. See http://www.noah.org/wiki/Pexpect
for more info.

--
Noah
Mar 25 '08 #2
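
The pty idea from the reply above can be tried with the standard library alone. This is a minimal sketch, not Pexpect and not the original poster's code: it assumes Python 3 on a Unix system, uses `subprocess` in place of `popen2`, and `run_in_pty` is a hypothetical helper name. The child sees a terminal on stdout, so a stdio-based program would normally line-buffer its output.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv with stdout attached to a pseudo-terminal; return its output.

    Hypothetical helper for illustration only.
    """
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdin=subprocess.DEVNULL, stdout=slave_fd)
    finally:
        # The parent must drop its copy of the slave side, otherwise the
        # master never sees end-of-file.
        os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # Linux reports EIO once every slave descriptor is closed.
                break
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
    proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    child = "import sys; print(sys.stdout.isatty())"
    print(run_in_pty([sys.executable, "-c", child]))
```

Note that a terminal translates newlines, so the captured bytes typically end in `\r\n` rather than `\n`.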

On Mon, 24 Mar 2008 23:03:56 -0300, Derek Martin <co**@pizzashack.org>
wrote:
> On Mon, Mar 24, 2008 at 05:52:54PM -0700, Noah wrote:
>> On Mar 24, 2:58 pm, Derek Martin <c...@pizzashack.org> wrote:
>>> If and only if the total amount of output is greater than the
>>> specified buffer size, then reading on this file hangs indefinitely.

You may try using two worker threads to read both streams; this way you
don't care about the blocking issues.

--
Gabriel Genellina

Mar 25 '08 #3
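
The two-thread suggestion above might look like the following. It is a minimal sketch under stated assumptions: Python 3 with `subprocess` and `threading` rather than the `popen2` module used in the thread, and `capture_with_threads` is a hypothetical name.

```python
import subprocess
import sys
import threading


def capture_with_threads(argv):
    """Return (stdout_bytes, stderr_bytes, exit_status) for argv.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    results = {}

    def drain(name, stream):
        # read() blocks until EOF, which is fine here: each stream has its
        # own thread, so a quiet stream cannot stall the other one.
        results[name] = stream.read()
        stream.close()

    threads = [
        threading.Thread(target=drain, args=("out", proc.stdout)),
        threading.Thread(target=drain, args=("err", proc.stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    status = proc.wait()
    return results["out"], results["err"], status


if __name__ == "__main__":
    child = "import sys; sys.stderr.write('e' * 20000); sys.stdout.write('o' * 1000)"
    out, err, status = capture_with_threads([sys.executable, "-c", child])
    print(len(out), len(err), status)
```

Because neither reader depends on the other, no `select()` call is needed, at the cost of one extra thread per stream.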

On Mon, 24 Mar 2008 17:58:42 -0400, Derek Martin wrote:
From your description, it would seem that two things are happening:

- there is actual data to read, but less than bufsize;
- the subsequent read waits (for whatever reason) until a full buffer can
be read, and therefore locks up your program.

Try specifying bufsize=1 or doing read(1). If my guess is correct, you
should not see the problem. I'm not sure that either is a good solution
for you, since both have performance issues.

Anyway, I doubt that the Python library does more than wrap the
system call, so if there is a bug it is probably in the software layers
under Python.

Ciao
----
FB
Mar 25 '08 #4
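
One way to avoid waiting for a full buffer without dropping to `read(1)` is to read from the file descriptor directly. The sketch below assumes Python 3 on Unix and uses `subprocess` instead of `popen2`; `capture_with_os_read` is a hypothetical name. `os.read()` returns as soon as any data is available, up to the requested size, so after `select()` reports a descriptor as readable the call does not block.

```python
import os
import select
import subprocess
import sys


def capture_with_os_read(argv, chunk_size=8192):
    """Return (stdout_bytes, stderr_bytes, exit_status) for argv.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    collected = {out_fd: [], err_fd: []}
    pending = [out_fd, err_fd]

    while pending:
        readable, _, _ = select.select(pending, [], [])
        for fd in readable:
            # Returns whatever is available right now (at most chunk_size
            # bytes); an empty result means the child closed that pipe.
            data = os.read(fd, chunk_size)
            if data:
                collected[fd].append(data)
            else:
                pending.remove(fd)

    proc.stdout.close()
    proc.stderr.close()
    status = proc.wait()
    return b"".join(collected[out_fd]), b"".join(collected[err_fd]), status


if __name__ == "__main__":
    child = "import sys; sys.stderr.write('e' * 17000); sys.stdout.write('o' * 1000)"
    out, err, status = capture_with_os_read([sys.executable, "-c", child])
    print(len(out), len(err), status)
```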

On Wed, Mar 26, 2008 at 09:49:51AM -0700, Noah Spurrier wrote:
> On 2008-03-24 22:03-0400, Derek Martin wrote:
>> That's an interesting thought, but I guess I'd need you to elaborate
>> on how the buffering mode would affect the operation of select(). I
>> really don't see how your explanation can cover this, given the
>> following:
>
> I might be completely off the mark here. I have not tested your code or even
> closely examined it. I don't mean to waste your time. I'm only giving a
> reflex response because your problem seems to exactly match a very common
> situation where someone tries to use select with a pipe to a subprocess
> created with popen and that subprocess uses C stdio.

Yeah, you're right, more or less. I talked to someone much smarter
than I here in the office, who pointed out that the behavior of
Python's read() without a specified size is to attempt to read until
EOF. This will definitely cause the read to block (if there's I/O
waiting from STDERR), if you're allowing I/O to block... :(

The solution is easy though...

import fcntl
import os

def set_nonblock(fd):
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

Then in the function, after calling popen:

    set_nonblock(io.fromchild.fileno())
    set_nonblock(io.childerr.fileno())

Yay for smart people.

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

Mar 27 '08 #5
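
For readers on a current interpreter, the non-blocking fix described above can be put together end to end roughly as follows. This is a sketch under stated assumptions, not the poster's final code: it targets Python 3 on Unix, swaps `popen2` for `subprocess`, reads with `os.read()` instead of the file object's `read()`, and `read_available` and `capture_nonblocking` are hypothetical names.

```python
import fcntl
import os
import select
import subprocess
import sys


def set_nonblock(fd):
    """Set O_NONBLOCK on fd, as in the thread's fix."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_available(fd, chunk_size=8192):
    """Drain what is currently buffered on fd; return (data, reached_eof)."""
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except BlockingIOError:
            # Nothing more to read right now; go back to select().
            return b"".join(chunks), False
        if not data:
            return b"".join(chunks), True
        chunks.append(data)


def capture_nonblocking(argv):
    """Return (stdout_bytes, stderr_bytes, exit_status) for argv."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    set_nonblock(out_fd)
    set_nonblock(err_fd)

    collected = {out_fd: [], err_fd: []}
    pending = [out_fd, err_fd]
    while pending:
        readable, _, _ = select.select(pending, [], [])
        for fd in readable:
            data, reached_eof = read_available(fd)
            collected[fd].append(data)
            if reached_eof:
                pending.remove(fd)

    proc.stdout.close()
    proc.stderr.close()
    status = proc.wait()
    return b"".join(collected[out_fd]), b"".join(collected[err_fd]), status


if __name__ == "__main__":
    child = "import sys; sys.stderr.write('e' * 17000); sys.stdout.write('o' * 1000)"
    out, err, status = capture_nonblocking([sys.executable, "-c", child])
    print(len(out), len(err), status)
```

Reading until `BlockingIOError` drains each pipe fully before returning to `select()`, so one stream that produces much more output than the other, like the 17k on STDERR here, cannot stall the loop.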
