Why does StringIO discard its initial value?

When StringIO gets an initial value passed to its constructor, it seems
to discard it after the first call to .write(). For instance:
>>> from StringIO import StringIO
>>> buffer = StringIO('foo')
>>> buffer.getvalue()
'foo'
>>> buffer.write('bar')
>>> buffer.getvalue()
'bar'
>>> buffer.write('baz')
>>> buffer.getvalue()
'barbaz'

The obvious workaround is to call buffer.write() with the initial value
instead of passing it to StringIO's constructor, so this issue doesn't
bother me very much, but I'm still curious about it. Is this the
expected behavior, and if so, why isn't it mentioned in the docs?
Jul 18 '05 #1
Maybe this short interactive session can give you an idea why.
>>> from StringIO import StringIO
>>> b = StringIO("123456789")
>>> b.tell()
0
>>> b.write("abc")
>>> b.getvalue()
'abc456789'
>>> b.tell()
3

StringIO seems to operate like a file opened with "r+" (if I've got my modes
right): it is opened for reading and writing, and positioned at the beginning.
In my example, the write of 3 bytes overwrites the first 3 bytes of the file
and leaves the rest intact. In your example, your first write overwrote the
whole initial contents of the file, so you couldn't notice this effect.
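
For readers on Python 3, here is a minimal sketch of the same effect. It assumes io.StringIO as the stand-in for the old StringIO module (the session above is Python 2). It also shows one way to append instead of overwrite: seek to the end before writing.

```python
import io

# Python 3 sketch: io.StringIO replaces the Python 2 StringIO module.
buf = io.StringIO("123456789")
start = buf.tell()            # position right after construction

buf.write("abc")              # overwrites the first three characters
after_overwrite = buf.getvalue()

buf.seek(0, io.SEEK_END)      # move to the end so the next write appends
buf.write("xyz")
after_append = buf.getvalue()

print(start, after_overwrite, after_append)
```

The initial value is never discarded; writes simply start wherever the current position happens to be.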

Jeff

Jul 18 '05 #2
[Leif K-Brooks]
> The obvious workaround is to call buffer.write() with the initial value
> instead of passing it to StringIO's constructor,

More than just a workaround, it is the preferred approach.
That makes it easier to switch to cStringIO, where initialized objects are
read-only.

> Is this the
> expected behavior

Yes.

> , and if so, why isn't it mentioned in the docs?

Per your request, the docs have been updated.
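
A minimal sketch of the "create empty, then write" approach, written for Python 3 with io.StringIO as an assumption (cStringIO does not exist there, and io.StringIO objects stay writable even when given an initial value):

```python
import io

buffer = io.StringIO()    # start empty, as with a freshly created file
buffer.write("foo")       # the "initial value" is simply the first write
buffer.write("bar")
buffer.write("baz")
result = buffer.getvalue()

print(result)
```

Because the position advances with every write, nothing is overwritten and the pieces end up concatenated.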

Raymond Hettinger
Jul 18 '05 #3
Raymond Hettinger wrote:
> [Leif K-Brooks]
>> The obvious workaround is to call buffer.write() with the initial value
>> instead of passing it to StringIO's constructor,
>
> More than just a workaround, it is the preferred approach.
> That makes it easier to switch to cStringIO, where initialized objects are
> read-only.

Others may find this helpful; it's a pure Python wrapper for cStringIO
that makes it behave like StringIO in that initialized objects are not
read-only. Would it be an idea to extend cStringIO like this in the
standard library? It shouldn't lose performance if used like a standard
cStringIO, but it prevents frustration :-)

David

import cStringIO

class StringIO:
    """Writable wrapper around cStringIO that accepts an initial value."""

    def __init__(self, buf=''):
        if not isinstance(buf, (str, unicode)):
            buf = str(buf)
        self.len = len(buf)
        # Start from an empty (writable) cStringIO, write the initial
        # value into it, then rewind so reads and writes begin at 0.
        self.buf = cStringIO.StringIO()
        self.buf.write(buf)
        self.buf.seek(0)
        self.pos = 0
        self.closed = 0

    def __iter__(self):
        return self

    def next(self):
        if self.closed:
            raise StopIteration
        r = self.readline()
        if not r:
            raise StopIteration
        return r

    def close(self):
        """Free the memory buffer."""
        if not self.closed:
            self.closed = 1
            del self.buf, self.pos

    def isatty(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        return False

    def seek(self, pos, mode=0):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        self.buf.seek(pos, mode)
        self.pos = self.buf.tell()

    def tell(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        return self.pos

    def read(self, n=None):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        if n is None:
            r = self.buf.read()
        else:
            r = self.buf.read(n)
        self.pos = self.buf.tell()
        return r

    def readline(self, length=None):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        if length is not None:
            r = self.buf.readline(length)
        else:
            # No limit given: do not pass None through to cStringIO.
            r = self.buf.readline()
        self.pos = self.buf.tell()
        return r

    def readlines(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        lines = self.buf.readlines()
        self.pos = self.buf.tell()
        return lines

    def truncate(self, size=None):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        if size is None:
            # Truncate at the current position, as StringIO does.
            self.buf.truncate()
        else:
            self.buf.truncate(size)
        # Re-measure the buffer, then restore the position.
        self.pos = self.buf.tell()
        self.buf.seek(0, 2)
        self.len = self.buf.tell()
        self.buf.seek(self.pos)

    def write(self, s):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        origpos = self.buf.tell()
        self.buf.write(s)
        self.pos = self.buf.tell()
        # Only re-measure when the write extended past the old end.
        if origpos + len(s) > self.len:
            self.buf.seek(0, 2)
            self.len = self.buf.tell()
            self.buf.seek(self.pos)

    def writelines(self, lines):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        self.buf.writelines(lines)
        self.pos = self.buf.tell()
        self.buf.seek(0, 2)
        self.len = self.buf.tell()
        self.buf.seek(self.pos)

    def flush(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        self.buf.flush()

    def getvalue(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")
        return self.buf.getvalue()
Jul 18 '05 #4
[David Fraser]
> Others may find this helpful; it's a pure Python wrapper for cStringIO
> that makes it behave like StringIO in not having initialized objects
> read-only. Would it be an idea to extend cStringIO like this in the
> standard library? It shouldn't lose performance if used like a standard
> cStringIO, but it prevents frustration :-)

IMO, that would be a step backwards. Initializing the object and then
writing to it is not a good practice. The cStringIO API needs to be as
file-like as possible. With files, we create an empty object and then
start writing (the append mode for existing files is a different story).
Good code ought to maintain that parallelism so that it is easier to
substitute a real file for a writeable cStringIO object.
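
To illustrate the parallelism argument, here is a rough Python 3 sketch. The helper write_report and the file name report.txt are hypothetical, and io.StringIO stands in for cStringIO. The same function writes to an in-memory buffer or to a real file, because both start empty and are only written to.

```python
import io
import os
import tempfile

def write_report(out, rows):
    """Write rows to any writable text file-like object (hypothetical helper)."""
    for name, value in rows:
        out.write("%s=%s\n" % (name, value))

rows = [("a", 1), ("b", 2)]

# In-memory target: create it empty, then write.
memory = io.StringIO()
write_report(memory, rows)
from_memory = memory.getvalue()

# Real file target: the same call works unchanged.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")
    with open(path, "w", newline="") as f:
        write_report(f, rows)
    with open(path, newline="") as f:
        from_file = f.read()

print(from_memory == from_file)
```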

This whole thread (except for the documentation issue which has been
fixed) is about fighting the API rather than letting it be a guide to good
code.

If there were something wrong with the API, Guido would have long
since fired up the time machine and changed the timeline so that all
would be as right as rain ;-)
Raymond Hettinger
Jul 18 '05 #5
Raymond Hettinger wrote:
> [David Fraser]
>> Others may find this helpful; it's a pure Python wrapper for cStringIO
>> that makes it behave like StringIO in not having initialized objects
>> read-only. Would it be an idea to extend cStringIO like this in the
>> standard library? It shouldn't lose performance if used like a standard
>> cStringIO, but it prevents frustration :-)
>
> IMO, that would be a step backwards. Initializing the object and then
> writing to it is not a good practice. The cStringIO API needs to be as
> file-like as possible. With files, we create an empty object and then
> start writing (the append mode for existing files is a different story).
> Good code ought to maintain that parallelism so that it is easier to
> substitute a real file for a writeable cStringIO object.
>
> This whole thread (except for the documentation issue which has been
> fixed) is about fighting the API rather than letting it be a guide to good
> code.
>
> If there were something wrong with the API, Guido would have long
> since fired up the time machine and changed the timeline so that all
> would be as right as rain ;-)


But surely the whole point of files is that you can do more than either
creating a new file or appending to an existing one (seek, write?)

The reason I wrote this was to enable manipulating zip files inside zip
files, in memory. This is on translate.sourceforge.net - I wanted to
manipulate Mozilla XPI files, and replace file contents etc. within the
XPI. The XPI files are zip format that contains jars inside (also zip
format). I needed to alter the contents of files within the inner zip files.

The zip classes in Python can handle adding files but not replacing
them. cStringIO has the limitation described above: objects created with
an initial value are read-only.

So I created extensions to the zipfile.ZipFile class that allow it to
delete existing files, and add them again with new contents (thus
replacing them).
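
This is not the ZipFile extension described here, only a rough Python 3 sketch of one way to get the same result with the standard zipfile module: copy every member into a fresh in-memory archive and substitute the one that should change. The function name replace_member and the member names are made up for illustration.

```python
import io
import zipfile

def replace_member(zip_bytes, name, new_data):
    """Return new archive bytes with member `name` replaced by `new_data`."""
    src = io.BytesIO(zip_bytes)
    dst = io.BytesIO()
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename == name:
                zout.writestr(name, new_data)        # substitute new contents
            else:
                zout.writestr(info, zin.read(info.filename))  # copy as-is
    return dst.getvalue()

# Build a small archive in memory to try it on.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as z:
    z.writestr("keep.txt", "unchanged")
    z.writestr("edit.txt", "old contents")

updated = replace_member(original.getvalue(), "edit.txt", "new contents")
with zipfile.ZipFile(io.BytesIO(updated)) as z:
    names = z.namelist()
    kept = z.read("keep.txt")
    edited = z.read("edit.txt")

print(names, kept, edited)
```

Rewriting the whole archive is simpler than deleting a member in place, at the cost of copying every other member.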

And I created wStringIO (the wrapper posted earlier) so that I could do
this all in place on the existing zip files.

This all required some extra hacking because of the dual-layer zip files.
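
As a rough illustration of the two layers, assuming Python 3 and io.BytesIO, with archive and member names invented for the example: an inner zip is stored as a member of an outer zip, and both are read back without touching the disk.

```python
import io
import zipfile

# Build an inner archive (standing in for a .jar) entirely in memory.
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner:
    inner.writestr("locale/messages.properties", "greeting=hello")

# Store the inner archive's bytes as a member of an outer archive
# (standing in for the XPI).
outer_buf = io.BytesIO()
with zipfile.ZipFile(outer_buf, "w") as outer:
    outer.writestr("chrome/example.jar", inner_buf.getvalue())

# Read back through both layers.
with zipfile.ZipFile(io.BytesIO(outer_buf.getvalue())) as outer:
    jar_bytes = outer.read("chrome/example.jar")
with zipfile.ZipFile(io.BytesIO(jar_bytes)) as inner:
    text = inner.read("locale/messages.properties").decode("utf-8")

print(text)
```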

But all this as far as I see would have been really tricky using the
existing zipfile and cStringIO classes, which both assume (conceptually)
that files are either readable or new or merely appendable (for zipfile).

The problem for me was not that cStringIO classes are too similar to
files, it was that they are too dissimilar. All of this would work with
either StringIO (but too slow) or real files (but I needed it in memory
because of the zipfiles being inside other zip files).

Am I missing something?

David
Jul 19 '05 #6
