Making a file-like object for manipulating a large file

This should be a relatively simple problem, but I haven't quite got
the idea of how to go about it. I have a VERY large file that I would
like to load a line at a time, do some manipulations on it, and then
make it available as a file-like object for use as input to a
database module (psycopg2) that wants a file-like object (with read
and readlines methods). I could write the manipulated file out to
disk and then read it back in, but that seems wasteful. So, it seems
like I need a buffer, a way to fill the buffer and a way to have read
and readlines use the buffer. What I can't do is to load the ENTIRE
file into a StringIO object, as the file is much too large. Any
suggestions?

Thanks,
Sean

Aug 24 '07 #1
2 Replies


Sean Davis wrote:
This should be a relatively simple problem, but I haven't quite got
the idea of how to go about it. I have a VERY large file that I would
like to load a line at a time, do some manipulations on it, and then
make it available to as a file-like object for use as input to a
database module (psycopg2) that wants a file-like object (with read
and readlines methods). I could write the manipulated file out to
disk and then read it back in, but that seems wasteful. So, it seems
like I need a buffer, a way to fill the buffer and a way to have read
and readlines use the buffer. What I can't do is to load the ENTIRE
file into a stringio object, as the file is much too large. Any
suggestions?

The general approach would be something like the following (untested) code:

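def filter_lines(f):
    # Yield only the lines we want; nothing else is held in memory.
    for line in f:
        if to_be_included(line):  # to_be_included() is your own predicate
            yield line

fil = open("somefile.big.txt", "r")
filegen = filter_lines(fil)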

You can then iterate over the filegen generator, or write your own class
that makes it file-like. At least the generator manages to throw away
the unwanted content without buffering the whole file in memory.
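
To make the "write your own class" idea concrete, here is a minimal sketch of a read-only wrapper that exposes read() and readline() on top of any iterator of lines. The class name LineIteratorFile is made up for illustration, the code assumes Python 3 text strings, and it has only been checked against the small case shown. As far as I know, psycopg2's copy_from wants read() and readline(), but check which methods your version actually calls.

```python
import io


class LineIteratorFile:
    """Hypothetical read-only file-like wrapper around an iterator of lines."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self._buffer = ""  # text pulled from the iterator but not yet returned

    def _fill(self, size):
        # Pull lines until the buffer holds `size` characters (everything
        # when size is negative) or the iterator runs dry.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer += next(self._lines)
            except StopIteration:
                break

    def read(self, size=-1):
        self._fill(size)
        if size < 0:
            data, self._buffer = self._buffer, ""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def readline(self):
        # Make sure the buffer contains a complete line if one is available.
        while "\n" not in self._buffer:
            try:
                self._buffer += next(self._lines)
            except StopIteration:
                break
        end = self._buffer.find("\n") + 1  # 0 means no newline is left
        if end == 0:
            end = len(self._buffer)
        line, self._buffer = self._buffer[:end], self._buffer[end:]
        return line


# Usage: a StringIO stands in for the big file; the generator expression
# plays the role of filter_lines() plus a per-line manipulation.
source = io.StringIO("keep 1\ndrop 2\nkeep 3\n")
wrapped = LineIteratorFile(
    line.upper() for line in source if line.startswith("keep")
)
first = wrapped.readline()
chunk = wrapped.read(4)
rest = wrapped.read()
```

Only as much text as the caller asks for is pulled from the generator, so memory use stays bounded as long as the consumer reads in chunks or lines. Calling read() with no size still drains everything.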

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Aug 24 '07 #2

In message <11********************@q4g2000prc.googlegroups.com>, Sean Davis wrote:
I have a VERY large file that I would
like to load a line at a time, do some manipulations on it, and then
make it available to as a file-like object for use as input to a
database module (psycopg2) that wants a file-like object (with read
and readlines methods). I could write the manipulated file out to
disk and then read it back in, but that seems wasteful.

If your consumer doesn't need to seek, how about having it read from a pipe?
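
One way the pipe idea might look, as a rough sketch rather than a tested recipe: a background thread writes the manipulated lines into the write end of an os.pipe(), and the consumer gets an ordinary file object on the read end. The function name transformed_pipe and its transform argument are invented for this example, and Python 3 is assumed.

```python
import os
import tempfile
import threading


def transformed_pipe(path, transform):
    """Return a readable file object fed by a background thread (sketch)."""
    read_fd, write_fd = os.pipe()

    def producer():
        # Leaving the with-block closes the write end, which the reader
        # sees as end-of-file.
        with os.fdopen(write_fd, "w") as sink, open(path, "r") as source:
            for line in source:
                sink.write(transform(line))

    threading.Thread(target=producer, daemon=True).start()
    return os.fdopen(read_fd, "r")


# Usage: a small temporary file stands in for the very large one.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\n")

with transformed_pipe(tmp.name, str.upper) as stream:
    first_line = stream.readline()
    remainder = stream.read()

os.unlink(tmp.name)
```

The operating system's pipe buffer does the buffering here: the producer blocks when the pipe is full and resumes as the consumer reads, so only a small window of the file is in memory at any time. The returned object supports read(), readline() and readlines(), but not seek().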
Aug 26 '07 #3
