Iteration on file reading

for line in sys.stdin:

Does this statement cause all of stdin to be read before the loop begins?

I may need to read several GB and I do not want to swamp the machine's
memory.
Jul 18 '05 #1
Paul Watson:
> for line in sys.stdin:
>
> Does this statement cause all of stdin to be read before the loop begins?


No. It will read a block of text at a time and break that block
into lines. This gives great performance and scales to
large files (so long as you can afford to keep that extra
block around). However, it's lousy for interactive work, since a
typed line may not reach the loop until its whole block has been read.

Andrew
da***@dalkescientific.com
Jul 18 '05 #2
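To illustrate the answer above, here is a minimal sketch (Python 3 syntax; the helper name `summarize` and the sample text are made up for the illustration, not taken from the thread). It loops over a stream and keeps only running totals, so memory use does not grow with the size of the input:

```python
import io


def summarize(stream):
    """Count lines and characters while holding only one line at a time."""
    line_count = 0
    char_count = 0
    for line in stream:  # lines are produced lazily, not read all up front
        line_count += 1
        char_count += len(line)
    return line_count, char_count


# A small in-memory stream stands in for sys.stdin so the sketch runs
# anywhere; a real script would call summarize(sys.stdin) instead.
demo = io.StringIO("alpha\nbeta\ngamma\n")
totals = summarize(demo)
print(totals)
```

The same loop works unchanged on `sys.stdin` or on a file opened with `open()`.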
Try a generator. This will just read a line at a time.
-- Paul

<code>
from sys import stdin

def lineReader( strm ):
    while 1:
        line = strm.readline()
        if not line:  # readline() returns "" at end of input
            break
        yield line.rstrip("\n")

for f in lineReader( stdin ):
    print(">>> " + f)
</code>

"Paul Watson" <pw*****@redlinec.com> wrote in message
news:3f**********@themost.net...
> for line in sys.stdin:
>
> Does this statement cause all of stdin to be read before the loop begins?
>
> I may need to read several GB and I do not want to swamp the machine's
> memory.

Jul 18 '05 #3
Paul McGuire:
> def lineReader( strm ):
>     while 1:
>         line = strm.readline()
>         if not line:  # readline() returns "" at end of input
>             break
>         yield line.rstrip("\n")
>
> for f in lineReader( stdin ):
>     print(">>> " + f)


You can simplify that with the iter builtin.

for f in iter(stdin.readline, ""):
    print(">>> " + f)

(Hmm... maybe I should test it? Naaaaahhh.)

Andrew
da***@dalkescientific.com
Jul 18 '05 #4
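Since the snippet above was posted untested, here is a small check of the two-argument form of `iter` (a sketch in Python 3; the in-memory `stream` and its contents are assumptions standing in for `sys.stdin`):

```python
import io

# An in-memory stream stands in for sys.stdin (an assumption for the demo).
stream = io.StringIO("first\nsecond\nthird\n")

# iter(callable, sentinel) calls stream.readline() repeatedly and stops
# as soon as the call returns the sentinel "", which readline() does at EOF.
lines = list(iter(stream.readline, ""))

print(lines)
```

Each item still carries its trailing newline, because `readline()` does not strip it.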
In article <3f**********@themost.net>,
"Paul Watson" <pw*****@redlinec.com> wrote:
> for line in sys.stdin:
>
> Does this statement cause all of stdin to be read before the loop begins?


Nope.

Just
Jul 18 '05 #5
Andrew Dalke wrote:
> Paul McGuire:
>> def lineReader( strm ):
>>     while 1:
>>         line = strm.readline()
>>         if not line:  # readline() returns "" at end of input
>>             break
>>         yield line.rstrip("\n")
>>
>> for f in lineReader( stdin ):
>>     print(">>> " + f)
>
> You can simplify that with the iter builtin.
>
> for f in iter(stdin.readline, ""):
>     print(">>> " + f)
>
> (Hmm... maybe I should test it? Naaaaahhh.)


There is a difference in behavior: the readline method
returns a line WITH a trailing \n, which then gets
printed, giving a "double-spaced" effect. Sure, you
can strip the \n in the loop body, but if you always
want a sequence of newline-stripped lines, that is
somewhat repetitious. If the use of readline is
mandated (i.e., no direct looping on the file for one
reason or another), my favourite way of expression is:

def linesof(somefile):
    for line in iter(somefile.readline, ''):
        yield line.rstrip('\n')

not as concise as either of the above, but, I think,
a wee little bit clearer.
Alex

Jul 18 '05 #6
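The difference described above can be seen side by side in a short sketch (Python 3; `linesof` is copied from the post, while the sample `text` and the in-memory streams are assumptions for the demo):

```python
import io


def linesof(somefile):
    """Yield each line of somefile with its trailing newline removed."""
    for line in iter(somefile.readline, ''):
        yield line.rstrip('\n')


text = "one\n\nthree"  # includes a blank line and no final newline

# Plain readline-based iteration keeps the newlines.
raw = list(iter(io.StringIO(text).readline, ''))

# The generator strips them.
stripped = list(linesof(io.StringIO(text)))

print(raw)
print(stripped)
```

Note that a blank line in the input comes back from `readline()` as `'\n'`, not as the sentinel `''`, so it does not end the loop early.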
"Paul Watson" <pw*****@redlinec.com> wrote in message news:<3f**********@themost.net>...
> for line in sys.stdin:
>
> Does this statement cause all of stdin to be read before the loop begins?
>
> I may need to read several GB and I do not want to swamp the machine's
> memory.


Have you considered simply inputting this into an interactive
interpreter and seeing if it swamps the machine's memory?

Jeremy
Jul 18 '05 #7
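One way to run the experiment suggested above without watching a process monitor is the standard `tracemalloc` module. The following is only a sketch in Python 3: the helper name `peak_memory_while_iterating`, the file size, and the line contents are arbitrary choices for the demo, and `tracemalloc` reports only memory allocated through Python's allocators, so the figure is an indication rather than an exact measurement.

```python
import os
import tempfile
import tracemalloc


def peak_memory_while_iterating(path):
    """Return the peak traced allocation (bytes) while looping over a file."""
    tracemalloc.start()
    try:
        with open(path) as f:
            for line in f:
                pass  # a real script would process the line here
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Build a throwaway file of a few megabytes (size chosen arbitrarily).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(100000):
        tmp.write("line %06d of some sample input text\n" % i)
    path = tmp.name

try:
    size = os.path.getsize(path)
    peak = peak_memory_while_iterating(path)
    print("file size: %d bytes, peak traced memory: %d bytes" % (size, peak))
finally:
    os.remove(path)  # clean up the temporary file
```

If the loop were pulling the whole file into memory first, the peak would be at least the size of the file; a peak far below that is consistent with block-at-a-time reading.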
Alex:
> There is a difference in behavior: the readline method
> returns a line WITH a trailing \n, which then gets
> printed, giving a "double-spaced" effect. Sure, you
> can strip the \n in the loop body, ....


Quite true.

As it turns out, the OP wanted to know about

for line in sys.stdin:

The post to which I replied changed the spec to
remove the newline, but the main point was to
use a generator ... which could, if desired, do extra
work to get rid of the "\n". It could just as
easily have converted everything to uppercase or done
rot13 conversion on the text.

My reply meant to point out that the iter builtin
can be used to turn a function that "returns the next
object each time it's called and a sentinel when
it's done" into an iterable. I just left out the extra
work his code did since it wasn't needed by the OP.

Andrew
da***@dalkescientific.com
Jul 18 '05 #8
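As a sketch of the point above, the per-line work can be passed in as a function, so the same generator strips newlines, uppercases, or applies rot13 (Python 3; the names `transformed_lines` and `rot13` and the sample input are hypothetical, introduced only for this illustration):

```python
import codecs
import io


def transformed_lines(stream, transform):
    """Lazily yield transform(line) for each line read via readline()."""
    for line in iter(stream.readline, ""):
        yield transform(line.rstrip("\n"))


def rot13(text):
    # "rot_13" is a text-to-text codec in the standard library.
    return codecs.encode(text, "rot_13")


# In-memory streams stand in for sys.stdin in this demo.
upper = list(transformed_lines(io.StringIO("hello\nworld\n"), str.upper))
rotated = list(transformed_lines(io.StringIO("hello\nworld\n"), rot13))

print(upper)
print(rotated)
```

Because the generator is lazy, each line is transformed only when the consuming loop asks for it.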
