Copying data between file-like objects

Hi,

another question: What's the most efficient way of copying data between
two file-like objects?

f1.write(f2.read()) doesn't seem to me as efficient as it might be, as a
string containing all the contents of f2 will be created and thrown away.
In the case of two StringIO objects, this means there's a point when the
contents is held in memory three times.

Reading and writing a series of short blocks to avoid a large copy buffer
seems ugly to me, and string objects will be created and thrown away all
the time. Do I have to live with that?

(In C, I would do the same thing, only without having to create and throw
away anything while overwriting a copy buffer, and being used to doing
everything the pedestrian way, anyway.)

--
Thomas
Jul 18 '05 #1
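
A minimal sketch of the block-wise copy described above, written so that no new string or bytes object is created per block. It assumes Python 3 and binary file-like objects that support readinto(); the helper name copy_blocks and the 64 KiB block size are illustrative choices, not something from this thread.

```python
import io

def copy_blocks(src, dst, block_size=64 * 1024):
    """Copy binary data from src to dst, reusing one preallocated buffer."""
    buf = bytearray(block_size)   # allocated once, overwritten on every pass
    view = memoryview(buf)
    total = 0
    while True:
        n = src.readinto(buf)     # fills buf in place, returns the byte count
        if not n:                 # 0 means there is no more data
            break
        dst.write(view[:n])       # slicing a memoryview does not copy the bytes
        total += n
    return total

src = io.BytesIO(b"x" * 200_000)
dst = io.BytesIO()
copied = copy_blocks(src, dst)
```

This is close to the C approach of overwriting a single copy buffer. Whether it is measurably faster than plain read() calls depends on the objects involved, so it is worth timing before relying on it.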
5 Replies


Thomas Lotze wrote:
> another question: What's the most efficient way of copying data between
> two file-like objects?
>
> f1.write(f2.read()) doesn't seem to me as efficient as it might be, as a
> string containing all the contents of f2 will be created and thrown away.

You could try f1.writelines(f2).

Steve
Jul 18 '05 #2
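
A short sketch of the writelines() suggestion, assuming Python 3's io.StringIO in place of the Python 2 StringIO module used in this thread. The sample text is made up for illustration.

```python
import io

f2 = io.StringIO("first line\nsecond line\nno trailing newline")
f1 = io.StringIO()

# Iterating a file-like object yields one line at a time, so writelines()
# never needs the whole content as a single string.
f1.writelines(f2)
```

Note that this is line-oriented: if the data contains no newlines, the first "line" is the entire content, and the memory saving disappears.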

Thomas Lotze wrote:
> f1.write(f2.read()) doesn't seem to me as efficient as it might be, as a
> string containing all the contents of f2 will be created and thrown away.

if f2 isn't too large, reading lots of data in one operation is often the most
efficient way (trust me, the memory system is a lot faster than your disk)

if you don't know how large f2 can be, use shutil.copyfileobj:

>>> help(shutil.copyfileobj)
Help on function copyfileobj in module shutil:

copyfileobj(fsrc, fdst, length=16384)
    copy data from file-like object fsrc to file-like object fdst

> In the case of two StringIO objects, this means there's a point when the
> contents is held in memory three times.

to copy stringio objects, you can use f1 = StringIO(f2.getvalue()). why you
would want/need to do this is more than I can figure out, though...

</F>

Jul 18 '05 #3
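
A minimal sketch of shutil.copyfileobj with an explicit chunk size, assuming Python 3 and in-memory io.BytesIO objects standing in for real files. The data and the 16 KiB chunk size are arbitrary.

```python
import io
import shutil

src = io.BytesIO(b"a" * 100_000)
dst = io.BytesIO()

# length is the size of each read() call, not a cap on the total amount
# copied: copyfileobj keeps going until src is exhausted.
shutil.copyfileobj(src, dst, length=16 * 1024)
```

Only one chunk is held in memory at a time, so the peak memory use is bounded by the chunk size rather than by the size of the source.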

Fredrik Lundh wrote:
> if f2 isn't too large, reading lots of data in one operation is often the most
> efficient way (trust me, the memory system is a lot faster than your disk)

Sure.

> if you don't know how large f2 can be, use shutil.copyfileobj:
>
> >>> help(shutil.copyfileobj)
> Help on function copyfileobj in module shutil:
>
> copyfileobj(fsrc, fdst, length=16384)
>     copy data from file-like object fsrc to file-like object fdst

This sounds like what I was looking for. Thanks for the pointer.
However, the following doesn't seem like anything is being copied:

>>> from StringIO import StringIO
>>> from shutil import copyfileobj
>>> s = StringIO()
>>> s2 = StringIO()
>>> s.write('asdf')
>>> copyfileobj(s, s2)
>>> s2.getvalue()
''

> to copy stringio objects, you can use f1 = StringIO(f2.getvalue()).

But this should have the same problem as using read(): a string will be
created on the way which contains all the content.

> why you
> would want/need to do this is more than I can figure out, though...

Because I want to manipulate a copy of the data and be able to compare it
to the original afterwards.

Another thing I'd like to do is copy parts of a StringIO object's content
to another object. This doesn't seem possible with any shutil method. Any
idea on that?

What one can really wonder, I admit, is why the difference between holding
data two or three times in memory matters that much, especially if the
latter is only for a short time. But as I'm going to use the code that
handles the long string as a core component to some application, I'd like
to make it behave as well as possible.

--
Thomas
Jul 18 '05 #4

Thomas Lotze wrote:
>> if you don't know how large f2 can be, use shutil.copyfileobj:
>>
>> >>> help(shutil.copyfileobj)
>> Help on function copyfileobj in module shutil:
>>
>> copyfileobj(fsrc, fdst, length=16384)
>>     copy data from file-like object fsrc to file-like object fdst
>
> This sounds like what I was looking for. Thanks for the pointer.
> However, the following doesn't seem like anything is being copied:
>
> >>> from StringIO import StringIO
> >>> from shutil import copyfileobj
> >>> s = StringIO()
> >>> s2 = StringIO()
> >>> s.write('asdf')
> >>> copyfileobj(s, s2)
> >>> s2.getvalue()
> ''

copyfileobj copies from the current location, and write leaves the file
pointer at the end of the file. a s.seek(0) before the copy fixes that.

>> to copy stringio objects, you can use f1 = StringIO(f2.getvalue()).
>
> But this should have the same problem as using read(): a string will be
> created on the way which contains all the content.

getvalue() returns the contents of the f2 file as a string, and f1 will use that
string as the buffer. there's no extra copying.

> Because I want to manipulate a copy of the data and be able to compare it
> to the original afterwards.

why not just use a plain string (or a list of strings)? your focus on StringIO sounds
like a leftover from some C library you've been using in an earlier life ;-)

> Another thing I'd like to do is copy parts of a StringIO object's content
> to another object. This doesn't seem possible with any shutil method. Any
> idea on that?

use a plain string and slicing. (if you insist on using StringIO, use seek and read)

> What one can really wonder, I admit, is why the difference between holding
> data two or three times in memory matters that much, especially if the
> latter is only for a short time. But as I'm going to use the code that
> handles the long string as a core component to some application, I'd like
> to make it behave as well as possible.

use plain strings, so you know what you're doing.

</F>

Jul 18 '05 #5
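
A small sketch of the seek(0) fix, assuming Python 3's io.StringIO rather than the Python 2 StringIO module from the original session. The names before and after are only there to capture the two results.

```python
import io
from shutil import copyfileobj

s = io.StringIO()
s2 = io.StringIO()
s.write('asdf')       # leaves the position at the end of the buffer

copyfileobj(s, s2)    # reads from the current position: nothing left to copy
before = s2.getvalue()

s.seek(0)             # rewind to the start
copyfileobj(s, s2)
after = s2.getvalue()
```

Here before is the empty string and after is 'asdf', which matches the behaviour reported in the thread.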

Fredrik Lundh wrote:
> copyfileobj copies from the current location, and write leaves the file
> pointer at the end of the file. a s.seek(0) before the copy fixes that.

Damn, this cannot be read from the documentation, and combined with the
fact that there's no length parameter for a portion to copy either, I
thought copying would mean copying all.

> getvalue() returns the contents of the f2 file as a string, and f1 will
> use that string as the buffer. there's no extra copying.

Oh, good to know. Then StringIO(f2.getvalue()) or StringIO(f2.read())
would be the way to go.

>> Because I want to manipulate a copy of the data and be able to compare
>> it to the original afterwards.
>
> why not just use a plain string (or a list of strings)? your focus on
> StringIO sounds like a leftover from some C library you've been using in
> an earlier life ;-)

Because the data can be a lot, and modifying long strings means a lot of
slicing and copying partial strings around, if I understand right.
Modifying a StringIO buffer is possible in-place. Plus, it's easier to
teach an algorithm that works on a StringIO to use a file instead, so I
may be able to avoid reading stuff into memory altogether in certain
places without worrying about special cases.

> use a plain string and slicing. (if you insist on using StringIO, use
> seek and read)

OK.

--
Thomas
Jul 18 '05 #6
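
A possible sketch of the "seek and read" approach for copying only part of a file-like object, since copyfileobj has no parameter for that. The helper copy_range and its parameters are hypothetical, not part of shutil. It assumes Python 3 and a seekable binary source; for text files, seek() offsets are opaque values from tell(), so arbitrary character offsets are not safe there.

```python
import io

def copy_range(src, dst, start, length, block_size=16 * 1024):
    """Copy up to `length` bytes from src, beginning at offset `start`, into dst."""
    src.seek(start)
    remaining = length
    while remaining > 0:
        # Never read past the requested range, and never more than one block.
        chunk = src.read(min(block_size, remaining))
        if not chunk:             # source ended before `length` bytes were read
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return length - remaining     # number of bytes actually copied

src = io.BytesIO(b"0123456789abcdef")
dst = io.BytesIO()
copied = copy_range(src, dst, start=4, length=6)
```

After this runs, dst holds b"456789". The same loop works unchanged on a real file opened in binary mode, which fits the goal of switching between in-memory buffers and files.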
