Hi!
Classic situation - I have to process an input stream of unknown length
until I reach its end (EOF, End Of File). How do I check for EOF? The
input stream can be anything from an open file through sys.stdin to a
network socket. And it's binary and potentially huge (gigabytes), thus
"for line in stream.readlines()" isn't really a way to go.
For now I have roughly:
    stream = sys.stdin
    while True:
        data = stream.read(1024)
        process_data(data)
        if len(data) < 1024:  ## (*)
            break
I smell a fragile point at (*) because as far as I know e.g. network
socket streams may return less data than requested even when the socket
is still open.
I'd rather have something like:
    while not stream.eof():
        ...
but there is no eof() method :-(
This is probably a trivial problem but I haven't found a decent solution.
Any hints?
Thanks!
GiBo
On 2007-02-19, GiBo <gi**@gentlemail.com> wrote:
> For now I have roughly:
>     stream = sys.stdin
>     while True:
>         data = stream.read(1024)
          if len(data) == 0:
              break  # EOF
>         process_data(data)
--
Grant Edwards
Grant Edwards wrote:
> On 2007-02-19, GiBo <gi**@gentlemail.com> wrote:
>>     stream = sys.stdin
>>     while True:
>>         data = stream.read(1024)
>          if len(data) == 0:
>              break  # EOF
>>         process_data(data)
Right, not a big difference though. Isn't there a cleaner / more
intuitive way? Like using some wrapper objects around the streams or
something?
GiBo
On Mon, 19 Feb 2007 21:50:11 -0300, GiBo <gi**@gentlemail.com> wrote:
> Right, not a big difference though. Isn't there a cleaner / more
> intuitive way? Like using some wrapper objects around the streams or
> something?
Read the documentation... For a true file object:
    read([size]) ... An empty string is returned when EOF is encountered
    immediately.
All the other "file-like" objects (like StringIO, socket.makefile, etc)
maintain this behavior.
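A minimal sketch of that behaviour, using an in-memory io.BytesIO as a stand-in for any file-like object. It assumes Python 3, where a stream opened in binary mode signals EOF with empty bytes (b"") rather than the empty string the documentation quoted above refers to. The helper name read_sizes is made up for illustration.

```python
import io

def read_sizes(stream, bufsize=4):
    """Return the length of every chunk read() hands back, up to and including EOF."""
    sizes = []
    while True:
        data = stream.read(bufsize)
        sizes.append(len(data))
        if not data:  # an empty result is the EOF signal
            break
    return sizes

# 10 bytes read 4 at a time: two full chunks, one short chunk, then EOF.
sizes = read_sizes(io.BytesIO(b"0123456789"))
print(sizes)
```

The short third chunk is not EOF; only the empty result that follows it is, which is why a test on len(data) < bufsize can stop too early on other kinds of streams.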
So this is the way to check for EOF. If you don't like how it was spelled,
try this:
    if data == "": break
If your data is made of lines of text, you can use the file as its own
iterator, yielding lines:
    for line in stream:
        process_line(line)
--
Gabriel Genellina
On 2007-02-20, GiBo <gi**@gentlemail.com> wrote:
>>     stream = sys.stdin
>>     while True:
>>         data = stream.read(1024)
>          if len(data) == 0:
>              break  # EOF
>>         process_data(data)
> Right, not a big difference though. Isn't there a cleaner /
> more intuitive way?
A file is at EOF when read() returns ''. The above is the
cleanest, simplest, most direct way to do what you specified.
Everybody does it that way, and everybody recognizes what's
being done.
It's also the "standard, Pythonic" way to do it.
> Like using some wrapper objects around the streams or
> something?
You can do that, but then you're mostly just obfuscating
things.
--
Grant Edwards
In article <ma***************************************@python.org>, Gabriel Genellina wrote:
> So this is the way to check for EOF. If you don't like how it was spelled,
> try this:
>     if data=="": break
How about:
    if not data: break
? ;-)
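One possible reason to prefer the truthiness test, sketched under the assumption of Python 3: the empty str and the empty bytes object are both falsy, so the same loop works for text and binary streams, whereas a comparison against "" never matches b"". The function name count_chunks and the in-memory streams are only for illustration.

```python
import io

def count_chunks(stream, bufsize=1024):
    """Count the non-empty chunks read before read() returns an empty value."""
    chunks = 0
    while True:
        data = stream.read(bufsize)
        if not data:  # true for both "" and b""
            break
        chunks += 1
    return chunks

# Binary stream: 2500 bytes arrive as 1024 + 1024 + 452.
binary_chunks = count_chunks(io.BytesIO(b"x" * 2500))
# Text stream: 10 characters arrive as 4 + 4 + 2.
text_chunks = count_chunks(io.StringIO("y" * 10), bufsize=4)
```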
On 2/20/07, Nathan <ne******@gmail.com> wrote:
> Not to beat a dead horse, but I often do this:
>     data = f.read(bufsize)
>     while data:
>         # ... process data.
>         data = f.read(bufsize)
> The only annoying bit is the duplicated line. I find I often follow
> this pattern, and I realize Python doesn't plan to have any sort of
> do-while construct, but even so I prefer this idiom. What's the
> consensus here?
What about creating a standard binary-file iterator:
    def blocks_of(infile, bufsize = 1024):
        data = infile.read(bufsize)
        if data:
            yield data
The use would look like this:
    for block in blocks_of(myfile, bufsize = 2**16):
        process_data(block)  # len(block) <= bufsize...
(ahem), make that iterator something that works, like:
    def blocks_of(infile, bufsize = 1024):
        data = infile.read(bufsize)
        while data:
            yield data
            data = infile.read(bufsize)
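For comparison, here is a sketch of a different spelling of the same iterator that avoids the duplicated read line. It is not the code posted above: it swaps in the two-argument form of the built-in iter(), which keeps calling a function until the result equals a sentinel. It assumes Python 3 and a binary stream, hence the b"" sentinel; a text stream would need "" instead.

```python
import io
from functools import partial

def blocks_of(infile, bufsize=1024):
    """Return an iterator over successive blocks of a binary file-like object."""
    # iter(callable, sentinel) calls infile.read(bufsize) repeatedly and
    # stops as soon as the result equals the sentinel b"".
    return iter(partial(infile.read, bufsize), b"")

# Stand-in for a real file: 10 bytes read 4 at a time.
blocks = list(blocks_of(io.BytesIO(b"abcdefghij"), bufsize=4))
print(blocks)
```

The generator version above is arguably easier to read; this one just shows that the standard library already covers the read-until-empty pattern.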
On Feb 19, 6:58 pm, GiBo <g...@gentlemail.com> wrote:
> Hi!
> Classic situation - I have to process an input stream of unknown length
> until I reach its end (EOF, End Of File). How do I check for EOF? The
> input stream can be anything from an open file through sys.stdin to a
> network socket. And it's binary and potentially huge (gigabytes), thus
> "for line in stream.readlines()" isn't really a way to go.
Could you use xreadlines()? It's a lazily-evaluated stream reader.
> For now I have roughly:
>     stream = sys.stdin
>     while True:
>         data = stream.read(1024)
>         process_data(data)
>         if len(data) < 1024:  ## (*)
>             break
> I smell a fragile point at (*) because as far as I know e.g. network
> socket streams may return less data than requested even when the socket
> is still open.
Well, it depends on a lot of things. Is the stream blocking or
non-blocking (on sockets and some other sorts of streams, you can pick
this yourself)? What are the underlying semantics (reliable-and-blocking
TCP or dropping-and-unordered UDP)? Unfortunately, you really
need to just know what you're working with (and there's really no
better solution; trying to hide the underlying semantics under a
prescribed overlaid set of semantics can only lead to badness in the
long run).
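To make the blocking-stream case concrete, here is a rough sketch assuming Python 3 and a blocking, connected stream socket. On such a socket recv() may return fewer bytes than requested while the connection is still open, and returns b"" only once the peer has closed its end. Non-blocking and UDP sockets behave differently and are not covered. The helper name recv_all is made up, and socket.socketpair() is used only to get two connected sockets without a network.

```python
import socket

def recv_all(sock, bufsize=1024):
    """Collect everything the peer sends until it closes its end."""
    chunks = []
    while True:
        data = sock.recv(bufsize)  # may return fewer than bufsize bytes
        if not data:               # b"" means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)

# Two connected, blocking stream sockets inside one process.
left, right = socket.socketpair()
left.sendall(b"hello, ")
left.sendall(b"world")
left.close()  # closing the sender is what produces EOF on the receiver
received = recv_all(right, bufsize=4)
right.close()
```

Every chunk here is at most 4 bytes long, so a loop that stopped on the first short read could end before the data did. Only the empty result is a reliable end marker.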
> I'd rather have something like:
>     while not stream.eof():
>         ...
> but there is no eof() method :-(
> This is probably a trivial problem but I haven't found a decent solution.
For your case, it's not so hard: http://pyref.infogami.com/EOFError says "read() and readline() methods
of file objects return an empty string when they hit EOF." So you
should assume that anything claiming to be a file-like object
will work this way.
> Any hints?
So:
    stream = sys.stdin
    while True:
        data = stream.read(1024)
        if data == "":
            break
        process_data(data)
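For readers on a newer interpreter, the same loop can be sketched with an assignment expression. This assumes Python 3.8 or later, which postdates this thread. The wrapper process_stream and the list used as a stand-in for process_data are hypothetical.

```python
import io

def process_stream(stream, process_data, bufsize=1024):
    """Feed each non-empty chunk of the stream to process_data until EOF."""
    # Assignment expression (Python 3.8+): read and test in a single step,
    # so the read call is written only once.
    while data := stream.read(bufsize):
        process_data(data)

# Stand-in for real processing: just remember each chunk.
seen = []
process_stream(io.BytesIO(b"abcdefg"), seen.append, bufsize=3)
print(seen)
```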