On Friday, 21 May 2004 18:18, Noam Raphael wrote:
I assume that when you write "./program.py < mboxfile.txt", Python knows
that sys.stdin is a regular file, so it can do seek on it (for example,
go back to its beginning).
It's not Python that knows that you can seek on the file. When you redirect
a file into a program with "<", the shell calls filefd = open(file, O_RDONLY);
dup2(filefd, 0) (0 = stdin) after it forks and just before it executes the
program. sys.stdin is always just connected to the file descriptor 0 that was
passed in, which the shell has in turn connected to a regular file, and
regular files are seekable.
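
A minimal sketch of the redirect case, assuming a POSIX system and Python 3
(the function name and the one-line child program are made up for this
example). subprocess installs the open file as fd 0 of the child, much like
the shell does for "<", and the child then reports whether its sys.stdin is
seekable:

```python
import subprocess
import sys

# Child program: report whether its stdin (fd 0) supports seek().
CHILD = "import sys; print(sys.stdin.seekable())"

def child_stdin_seekable_from_file(path):
    """Run a child with stdin redirected from path, like 'program.py < path'."""
    with open(path, "rb") as f:
        # subprocess duplicates f's descriptor onto fd 0 of the child,
        # which is the same effect as the shell's open() + dup2(fd, 0).
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            stdin=f, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()
```

Calling child_stdin_seekable_from_file("mboxfile.txt") on an existing regular
file should give the string "True".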
When you write "cat mboxfile.txt |
program.py", the program cat outputs the file mboxfile.txt into
sys.stdin byte by byte, so you can't do seek on it.
Now, when you pipe the output of one program into another, something
different happens: the shell creates a pipe using readfd, writefd = pipe().
The write end is connected to the stdout of the first program
(dup2(writefd, 1); fd 1 = stdout) when the shell forks to start it, and the
read end is connected to the stdin of the second program (dup2(readfd, 0);
fd 0 = stdin, as before), again when the shell forks to start it. This means
that sys.stdin of the Python program, which again is connected to file
descriptor 0, is now connected to a pipe. Pipes are not seekable, and that's
exactly what the exception is telling you.
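
The same thing can be checked directly at the file-descriptor level. This is
a rough sketch assuming a POSIX system, where seeking on a pipe fails with
ESPIPE ("Illegal seek"); the helper names try_seek and pipe_is_seekable are
invented for the example:

```python
import errno
import os

def try_seek(fd):
    """Return True if fd accepts lseek, False if the kernel reports ESPIPE."""
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:
            return False  # pipes, FIFOs and sockets end up here
        raise

def pipe_is_seekable():
    """Create a pipe, as the shell does for '|', and try to seek on its read end."""
    read_fd, write_fd = os.pipe()
    try:
        return try_seek(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

try_seek(0) inside program.py would tell you which of the two situations
you are in.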
So, what do we learn from this? The mbox parser wants a seekable file
descriptor (err, I guess it wouldn't strictly need this, but who knows, look
at the source, Luke!), so you need to pass it a file-like object which
implements seek (or at least one backed by a seekable file descriptor, which
a pipe is not).
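
One possible workaround, sketched here under the assumption of Python 3 and
a binary input stream: copy a non-seekable stdin into a temporary file first
and hand that to whatever needs seek. ensure_seekable is a hypothetical
helper, not part of any library:

```python
import shutil
import tempfile

def ensure_seekable(stream):
    """Return stream itself if it can seek, otherwise a seekable copy of it."""
    if getattr(stream, "seekable", lambda: False)():
        return stream
    # Keep up to 1 MiB in memory, spill to a real temporary file beyond that.
    spool = tempfile.SpooledTemporaryFile(max_size=1 << 20)
    shutil.copyfileobj(stream, spool)  # reads the pipe until EOF
    spool.seek(0)                      # rewind so the consumer starts at the top
    return spool
```

In program.py this would be used as fp = ensure_seekable(sys.stdin.buffer).
The cost is that the whole input gets read before any parsing starts.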
HTH!
Heiko.