I've always found the string-building idiom
temp_list = []
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_list.append(v)
final_string = ''.join(temp_list)
completely repulsive. As an alternative I suggest
temp_buf = StringIO()
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_buf += v
final_string = temp_buf.getvalue()
here, "temp_buf += v" is supposed to be the same as "temp_buf.write(v)".
So the suggestion is to add an __iadd__ method to StringIO and cStringIO.
Any thoughts?
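Concretely, the suggestion amounts to something like this sketch (a hypothetical subclass, with the modern io.StringIO standing in for StringIO here):

```python
import io

class IAddStringIO(io.StringIO):
    # Hypothetical subclass: make "buf += s" mean "buf.write(s)".
    def __iadd__(self, s):
        self.write(s)
        return self  # __iadd__ must return the object to rebind

temp_buf = IAddStringIO()
for v in ("spam", " and ", "eggs"):
    temp_buf += v
final_string = temp_buf.getvalue()
assert final_string == "spam and eggs"
```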
Also, I wonder if it's now ok to eliminate the existing StringIO
module (make it an alias for cStringIO) now that new-style classes
permit extending cStringIO.StringIO.
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
...
temp_buf = StringIO()
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_buf += v
final_string = temp_buf.getvalue()
here, "temp_buf += v" is supposed to be the same as "temp_buf.write(v)". So the suggestion is to add a __iadd__ method to StringIO and cStringIO.
What's the added value of spelling x.write(v) as x += v? Is it worth
the utter strangeness of having a class which allows += and not + (the
only one in the std library, I think it would be)...?
Any thoughts?
I think that the piece of code you like and I just quoted is just fine,
simply by changing the += to a write.
Also, I wonder if it's now ok to eliminate the existing StringIO module (make it an alias for cStringIO) now that new-style classes permit extending cStringIO.StringIO.
I love having a pure-Python version of any C-coded standard library
module (indeed, I wish I had more!-) for all sorts of reasons, including
easing the burden of porting Python to weird platforms. In StringIO's
case, it's nice to be able to use the above idiom to concatenate Unicode
strings just as easily as plain ones, for example -- cStringIO (like
file objects) wants plain bytestrings.
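(A sketch of that split, using the later io module's StringIO for the text role and BytesIO for the bytestring one, as stand-ins for StringIO and cStringIO respectively:)

```python
import io

# Text buffer: accepts str, including non-ASCII, polymorphically.
tbuf = io.StringIO()
tbuf.write(u"caf")
tbuf.write(u"\u00e9")
assert tbuf.getvalue() == u"caf\u00e9"

# Byte buffer: like cStringIO and file objects, wants plain bytes only.
bbuf = io.BytesIO()
bbuf.write(b"abc")
assert bbuf.getvalue() == b"abc"
```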
It would be nice (in Py3k, when backwards compatibility can be broken)
to make the plain-named, "default" modules those coded in C, since
they're used more often, and find another convention to indicate pure
Python equivalents -- e.g., pickle/pypickle and StringIO/pyStringIO
rather than the current cPickle/pickle and cStringIO/StringIO. But I
hope the pure-python "reference" modules stay around (and, indeed, I'd
love for them to _proliferate_, maybe by adopting some of the work of
the pypy guys at some point;).
Alex
al***@mail.comcast.net (Alex Martelli) writes: What's the added value of spelling x.write(v) as x += v? Is it worth the utter strangeness of having a class which allows += and not + (the only one in the std library, I think it would be)...?
Sure, + can also be supported. Adding two StringIO's, or a StringIO to a
string, results in a StringIO with the obvious contents.
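Something like this hedged sketch (a hypothetical subclass; io.StringIO stands in for StringIO):

```python
import io

class AddableStringIO(io.StringIO):
    # Hypothetical: adding two buffers, or a buffer and a string,
    # yields a new buffer with the obvious (concatenated) contents.
    def __add__(self, other):
        text = other.getvalue() if isinstance(other, io.StringIO) else other
        return AddableStringIO(self.getvalue() + text)

a = AddableStringIO("foo")
b = AddableStringIO("bar")
assert (a + b).getvalue() == "foobar"
assert (a + "baz").getvalue() == "foobaz"
```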
In StringIO's case, it's nice to be able to use the above idiom to concatenate Unicode strings just as easily as plain ones, for example -- cStringIO (like file objects) wants plain bytestrings.
I wasn't aware of that limitation--maybe cStringIO could be extended
to take Unicode. You'd use an encode or decode method to get a
bytestring out. Or there could be a mutable-string class separate
from cStringIO, to be used for this purpose (of getting rid of the
list.append kludge).
But I hope the pure-python "reference" modules stay around (and, indeed, I'd love for them to _proliferate_, maybe by adopting some of the work of the pypy guys at some point;).
Maybe the standard versions of some of these things can be written in
RPython under PyPy, so they'll compile to fast machine code, and then
the C versions won't be needed. But with CPython I think we need the
C versions.
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
... In StringIO's case, it's nice to be able to use the above idiom to concatenate Unicode strings just as easily as plain ones, for example -- cStringIO (like file objects) wants plain bytestrings. I wasn't aware of that limitation--maybe cStringIO could be extended to take Unicode. You'd use an encode or decode method to get a bytestring out.
But why can't I have perfectly polymorphic "append a bunch of strings
together", just like I can now (with ''.join of a list of strings, or
StringIO), without caring whether the strings are Unicode or
bytestrings?
Or there could be a mutable-string class separate from cStringIO, to be used for this purpose (of getting rid of the list.append kludge).
StringIO works just fine. Developing (and having to document, learn,
teach, ...) a separate interface just in order to remove StringIO does
not seem worth it. As for extending cStringIO.write I guess that's
possible, but not without breaking compatibility (code that now uses
that write with unicode strings assuming that they'll get encoded into
bytestrings by the default encoding, and similarly assumes that getvalue
always returns a bytestring, when called on a cStringIO instance); you'd
need instead to add another couple of methods, or wait for Py3k.
But I hope the pure-python "reference" modules stay around (and, indeed, I'd love for them to _proliferate_, maybe by adopting some of the work of the pypy guys at some point;).
Maybe the standard versions of some of these things can be written in RPython under PyPy, so they'll compile to fast machine code, and then the C versions won't be needed. But with CPython I think we need the C versions.
By all means, the C versions are welcome, I just don't want to lose the
Python versions either (and making them less readable by recoding them
in RPython would interfere with didactical use).
Alex
al***@mail.comcast.net (Alex Martelli) writes: But why can't I have perfectly polymorphic "append a bunch of strings together", just like I can now (with ''.join of a list of strings, or StringIO), without caring whether the strings are Unicode or bytestrings?
I see that 'a' + u'b' = u'ab', which makes sense. I don't use Unicode
much so haven't paid much attention to such things. Is there some
sound reason cStringIO acts differently from StringIO? I'd expect
them to both do the same thing.
As for extending cStringIO.write I guess that's possible, but not without breaking compatibility ... you'd need instead to add another couple of methods, or wait for Py3k.
We're already discussing adding another method, namely __iadd__.
Maybe that's the place to put it.
Paul Rubin wrote: I've always found the string-building idiom
temp_list = []
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_list.append(v)
final_string = ''.join(temp_list)
completely repulsive. As an alternative I suggest
temp_buf = StringIO()
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_buf += v
final_string = temp_buf.getvalue()
here, "temp_buf += v" is supposed to be the same as "temp_buf.write(v)". So the suggestion is to add a __iadd__ method to StringIO and cStringIO.
Any thoughts?
Why? StringIO/cStringIO have file-like interfaces, not sequences.
--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
The opinion of the strongest is always the best.
-- Jean de la Fontaine
Alex Martelli wrote: It would be nice (in Py3k, when backwards compatibility can be broken) to make the plain-named, "default" modules those coded in C, since they're used more often, and find another convention to indicate pure Python equivalents -- e.g., pickle/pypickle and StringIO/pyStringIO
How about something like a package py for all such python-coded modules
so you use py.StringIO (which I hope gets renamed to stringio in the
Py3K shift).
--
-Scott David Daniels sc***********@acm.org
Paul Rubin <http://ph****@NOSPAM.invalid> wrote: al***@mail.comcast.net (Alex Martelli) writes: But why can't I have perfectly polymorphic "append a bunch of strings together", just like I can now (with ''.join of a list of strings, or StringIO), without caring whether the strings are Unicode or bytestrings?
I see that 'a' + u'b' = u'ab', which makes sense. I don't use Unicode much so haven't paid much attention to such things. Is there some sound reason cStringIO acts differently from StringIO? I'd expect them to both do the same thing.
I believe that cStringIO tries to optimize, while StringIO doesn't and
is thereby more general.
As for extending cStringIO.write I guess that's possible, but not without breaking compatibility ... you'd need instead to add another couple of methods, or wait for Py3k.
We're already discussing adding another method, namely __iadd__. Maybe that's the place to put it.
Still need another method to 'getvalue' which can return a Unicode
string (currently, cStringIO.getvalue returns plain strings only, and it
might break something if that guarantee was removed).
That being said, if the only way to use a StringIO was to call += or
__iadd__ on it, I would switch my recommendation away from it and
towards "just join the sequence of strings". Taking your example:
temp_buf = StringIO()
for x in various_pieces_of_output():
    v = go_figure_out_some_string()
    temp_buf += v
final_string = temp_buf.getvalue()
it's just more readable to me to express it
final_string = ''.join(go_figure_out_some_string()
                       for x in various_pieces_of_output())
Being able to use temp_buf.write(v) [like today, but with StringIO, not
cStringIO] would still have me recommending it to newbies, but having to
explain that extra += just tips the didactical balance. It's already
hard enough to jump ahead to a standard library module in the middle of
an explanation of strings, just to explain how to concatenate a bunch...
Yes, I do understand your performance issues:
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from StringIO import StringIO' 's=StringIO(); s.writelines(str(i) for i in range(33)); x=s.getvalue()'
1000 loops, best of 3: 337 usec per loop
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from cStringIO import StringIO' 's=StringIO(); s.writelines(str(i) for i in range(33)); x=s.getvalue()'
10000 loops, best of 3: 98.1 usec per loop
Nimue:~/pynut alex$ python2.4 -mtimeit 's=list(); s.extend(str(i) for i in range(33)); x="".join(s)'
10000 loops, best of 3: 99 usec per loop
but using += instead of writelines [[actually, how WOULD you express the
writelines equivalent???]] or abrogating plain-Python StringIO would not
speed up the cStringIO use (which is already just as fast as the ''.join
use).
Alex
Scott David Daniels <sc***********@acm.org> wrote: Alex Martelli wrote: It would be nice (in Py3k, when backwards compatibility can be broken) to make the plain-named, "default" modules those coded in C, since they're used more often, and find another convention to indicate pure Python equivalents -- e.g., pickle/pypickle and StringIO/pyStringIO How about something like a package py for all such python-coded modules so you use py.StringIO (which I hope gets renamed to stringio in the Py3K shift).
Sounds good to me, indeed better than 'name mangling'!-)
Alex
al***@mail.comcast.net (Alex Martelli) writes: Is there some sound reason cStringIO acts differently from StringIO? I'd expect them to both do the same thing. I believe that cStringIO tries to optimize, while StringIO doesn't and is thereby more general.
I'm not sure what optimizations make sense. I'd thought the most
important difference was the ability to subclass StringIO, before
new-style classes arrived. It's really ugly that .getvalue does
different things for StringIO and cStringIO, something that I didn't
realize and which amazes me. I'd go as far as to say maybe .getvalue
should be deprecated in both modules, and replaced by .getstring
(returns regular or unicode string depending on contents) and
.getbytes (always returns a byte string).
We're already discussing adding another method, namely __iadd__. Maybe that's the place to put it.
Still need another method to 'getvalue' which can return a Unicode string (currently, cStringIO.getvalue returns plain strings only, and it might break something if that guarantee was removed).
Yeah, replacing getvalue with explicit methods is preferable. "Explicit
is better than implicit."
That being said, if the only way to use a StringIO was to call += or __iadd__ on it, I would switch my recommendation away from it and towards "just join the sequence of strings".
Fixing getvalue takes care of it. The ''join idiom is IMO a total
monstrosity and should die, die, die, die, die.
it's just more readable to me to express it final_string = ''.join(go_figure_out_some_string() for x in various_pieces_of_output())
OK for that example, maybe not for a more complex one. Anyway I like
sum(...) even better (where sum promises to be O(n) in the number of
bytes), but clpy had THAT discussion a few days ago.
Being able to use temp_buf.write(v) [like today, but with StringIO, not cStringIO] would still have me recommending it to newbies, but having to explain that extra += just tips the didactical balance.
I just can't for the life of me see += as harder to explain than the
''.join horror. But yeah, the real problem is the incompatible
definitions of .getvalue between the two classes, so that should be
fixed, and .write would do the right thing.
but using += instead of writelines [[actually, how WOULD you express the writelines equivalent???]] or abrogating plain-Python StringIO would not speed up the cStringIO use (which is already just as fast as the ''.join use).
''.join with a list (rather than a generator) arg may be plain worse
than python StringIO. Imagine building up a megabyte string one
character at a time, which means making a million-element list and a
million temporary one-character strings before joining them.
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
... ''.join with a list (rather than a generator) arg may be plain worse than python StringIO. Imagine building up a megabyte string one character at a time, which means making a million-element list and a million temporary one-character strings before joining them.
Absolutely wrong: ''.join takes less for a million items than StringIO
takes for 100,000. It's _so_ easy to measure...!
Nimue:~/pynut alex$ python2.4 -mtimeit 's=["x" for i in xrange(999999)]; x="".join(s)'
10 loops, best of 3: 422 msec per loop
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from StringIO import StringIO' 's=StringIO()' 'for i in xrange(99999): s.write("x")' 'x=s.getvalue()'
10 loops, best of 3: 688 msec per loop
After all, how do you think StringIO is implemented internally? A list
of strings and a ''.join at the end are the best way that comes to mind,
and of course there's going to be overhead (although I'm surprised to
see that the overhead is quite as bad as this). BTW, cStringIO isn't
very good here either:
Nimue:~/pynut alex$ python2.4 -mtimeit -s'from cStringIO import StringIO' 's=StringIO()' 'for i in xrange(999999): s.write("x")' 'x=s.getvalue()'
10 loops, best of 3: 1.28 sec per loop
three times as slow as the ''.join you hate so much -- if it's to take
its place, it clearly needs a lot of work.
As for sum, you'll recall I was its original proponent, and my first
implementation did specialcase strings (delegating right to ''.join).
But that left O(N**2) behavior in many other cases (lists, tuples) and
eventually was whittled down to "summing *numbers*", at least as far as
the intention goes. Perhaps there's space for a "sumsequences" that's
something like itertools.chain but specialcases crucial cases such as
strings (plain and Unicode) and lists? Good luck getting it approved on
python-dev -- I'll gladly implement it, if you can get it past that
hurdle (chatting about it here is entertaining, but unless you can get
BDFL blessing it's in the end futile, and that requires python-dev...).
Alex
al***@mail.comcast.net (Alex Martelli) writes: Absolutely wrong: ''.join takes less for a million items than StringIO takes for 100,000.
That depends on how much ram you have. You could try a billion items.
It's _so_ easy to measure...!
Yes but the result depends on your specific hardware and may be
different for someone else.
After all, how do you think StringIO is implemented internally? A list of strings and a ''.join at the end are the best way that comes to mind,
I'd have used the array module.
As for sum, you'll recall I was its original proponent, and my first implementation did specialcase strings (delegating right to ''.join).
You could imagine a really dumb implementation of ''.join that used
a quadratic algorithm, and in fact http://docs.python.org/lib/string-methods.html
doesn't guarantee that join is linear. Therefore, the whole ''.join
idiom revolves around the programmer knowing some undocumented
behavior of the implementation (i.e. that ''.join is optimized). This
reliance on undocumented behavior seems totally bogus to me, but if
it's ok to optimize join, I'd think it's ok to also optimize sum, and
document both.
But that left O(N**2) behavior in many other cases (lists, tuples) and eventually was whittled down to "summing *numbers*", at least as far as the intention goes. Perhaps there's space for a "sumsequences" that's something like itertools.chain but specialcases crucial cases such as strings (plain and Unicode) and lists?
How about making [].join(bunch_of_lists) analogous to ''.join, with a
documented guarantee that both are linear?
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
... Absolutely wrong: ''.join takes less for a million items than StringIO takes for 100,000. That depends on how much ram you have. You could try a billion items.
Let's see you try it -- I have better things to do than to trash around
checking assertions which I believe are false and that you're too lazy
to check yourself.
After all, how do you think StringIO is implemented internally? A list of strings and a ''.join at the end are the best way that comes to mind,
I'd have used the array module.
....and would that support plain byte strings and Unicode smoothly and
polymorphically? You may recall a few posts ago expressing wonder at
what optimizations cStringIO might have that stop it from doing just
this...
As for sum, you'll recall I was its original proponent, and my first implementation did specialcase strings (delegating right to ''.join).
You could imagine a really dumb implementation of ''.join that used a quadratic algorithm, and in fact
http://docs.python.org/lib/string-methods.html
doesn't guarantee that join is linear. Therefore, the whole ''.join idiom revolves around the programmer knowing some undocumented behavior of the implementation (i.e. that ''.join is optimized). This
No more than StringIO.write "revolves around" the programmer knowing
exactly the same thing about the optimizations in StringIO: semantics
are guaranteed, performance characteristics are not.
reliance on undocumented behavior seems totally bogus to me, but if
So I assume you won't be using StringIO.write any more, nor ANY other
way to join sequences of strings? Because the performance of ALL of
them depend on such "undocumented behavior".
Personally, I don't consider depending on "undocumented behavior" *for
speed* to be bogus at all, particularly when there are no approaches
whose performance characteristics ARE documented and guaranteed.
Besides C++'s standard library, very few languages like to pin
themselves down by ensuring any performance guarantee;-).
How making [].join(bunch_of_lists) analogous to ''.join, with a documented guarantee that both are linear?
I personally have no objection to adding a join method to lists or other
sequences, but of course the semantics should be similar to:
def join(self, *others):
    result = list()
    for other in others[:-1]:
        result.extend(other)
        result.extend(self)
    result.extend(others[-1])
    return self.__class__(result)
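As a standalone sketch (the same separator-between-pairs semantics as str.join, recast as a plain function for illustration; `seq_join` is a hypothetical name, shown here for lists):

```python
def seq_join(sep, *others):
    # Join the sequences in `others`, inserting `sep` between each
    # adjacent pair; linear in the total number of items.
    result = list()
    for other in others[:-1]:
        result.extend(other)
        result.extend(sep)
    result.extend(others[-1])
    return type(sep)(result)

assert seq_join([0], [1], [2], [3]) == [1, 0, 2, 0, 3]
```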
As for performance guarantees, I don't think we have them now even for
list.append, list.extend, dict.__getitem__, and other similarly
fundamental methods. I assume any such guarantees would have to
weaselword regarding costs of memory allocation (including, possibly,
garbage collection), since such allocation may of course be needed and
its performance can easily be out of Python's control; and similarly,
costs of iterating on the items of 'others', cost of indexing it, and so
on (e.g.: for list.sort, cost of comparisons; for dict.__getitem__, cost
of hash on the key; and so on, and so forth).
I don't think it's worth my time doing weaselwording for this purpose,
but if any sealawyers otherwise idle want to volunteer (starting with
the existing methods of existing built-in types, I assume, rather than
by adding others), the offer might be welcome on python-dev (I assume
that large effort will have to be devoted to examining the actual
performance characteristics of at least the reference implementation, in
order to prove that the purported guarantees are indeed met).
Alex
Paul Rubin <http://ph****@NOSPAM.invalid> writes: After all, how do you think StringIO is implemented internally? A list of strings and a ''.join at the end are the best way that comes to mind,
I'd have used the array module.
I just checked the implementation and it uses ''.join combined with
some bogo-optimizations to cache the result of the join when you do a
seek or write. That is, .seek can take linear time instead of
constant time, a pretty bogus situation if you ask me, though maybe
the amortized time isn't so bad over multiple calls. I didn't check
how cStringIO does it.
al***@mail.comcast.net (Alex Martelli) writes: That depends on how much ram you have. You could try a billion items. Let's see you try it
If you want me to try timing it with a billion items on your computer,
you'd have to set up a suitable account and open a network connection,
etc., probably not worth the trouble. Based on examining StringIO.py,
on my current computer (512MB ram), with 100 million items, it looks
like using a bunch of writes interspersed with seeks will be much
faster than just using writes. I wouldn't have guessed THAT. With a
billion items it will thrash no matter what.
I'd have used the array module. ...and would that support plain byte strings and Unicode smoothly and
Actually, I see that getvalue is supposed to raise an error if you
mix unicode with 8-bit ascii:
The StringIO object can accept either Unicode or 8-bit strings,
but mixing the two may take some care. If both are used, 8-bit
strings that cannot be interpreted as 7-bit ASCII (that use the
8th bit) will cause a UnicodeError to be raised when getvalue()
is called.
This is another surprise, I'd have thought it could just convert to
unicode as soon as it saw a unicode string. I think I understand the
idea. The result is that with StringIO (Python 2.4.1),
s = StringIO() # ok
s.write('\xc3') # ok
s.write(u'a') # ok
s.seek(0,2) # raises UnicodeDecodeError
Raising the error at the second s.write doesn't seem like a big
problem. The StringIO doc already doesn't mention that seek can raise
a Unicode exception, so it needs to be fixed either way.
the whole ''.join idiom revolves around the programmer knowing some undocumented behavior of the implementation...
No more than StringIO.write "revolves around" the programmer knowing exactly the same thing about the optimizations in StringIO: semantics are guaranteed, performance characteristics are not.
I think having either one use quadratic time is bogus (something like
n log n might be ok). reliance on undocumented behavior seems totally bogus to me, but if
So I assume you won't be using StringIO.write any more, nor ANY other way to join sequences of strings? Because the performance of ALL of them depend on such "undocumented behavior".
I'll keep using them but it means that the program's complexity
(i.e. that it's O(n) and not O(n**2)) depends on the interpreter
implementation, which is bogus. Do you really want a language
designed by computer science geniuses to be so underspecified that
there's no way to tell whether a straightforward program's running
time is linear or quadratic?
Besides C++'s standard library, very few languages like to pin themselves down by ensuring any performance guarantee;-).
I seem to remember Scheme guarantees tail recursion optimization.
This is along the same lines. It's one thing for the docs to not want
to promise that .sort() uses at most 3.827*n*(lg(n)+14.7) comparisons
or something like that. That's what I'd consider to be pinning down a
performance guarantee. Promising that .sort() is O(n log n) just says
that the implementation is reasonable. The C library's "qsort" doc
even specifies the precise algorithm, or used to. Python's heapq
module doc similarly specifies heapq's algorithm.
Even that gets far afield. The real objection here (about ''.join) is
that every Python user is expected to learn a weird, pervasive idiom,
but the reason for the idiom cannot be deduced from the language
reference. That is just bizarre.
As for performance guarantees, I don't think we have them now even for list.append, list.extend, dict.__getitem__, and other similarly fundamental methods. I assume any such guarantees would have to weaselword regarding costs of memory allocation...
I think it's enough to state the amortized complexity of these
operations to within a factor of O(log(N)). That should be easy
enough to do with the standard implementations (dicts using hashing,
etc) while still leaving the implementation pretty flexible. That
should allow determining the running speed of a user's program to
within a factor of O(log(N)), a huge improvement over not being able
to prove anything about it. Even without such guarantees it's enough
to say that these operations work in the obvious ways ("dicts use
hashing..."), and even without saying that, relying on the behavior
isn't so terrible, because the code that you write is the obvious code
for using operations that work in the obvious ways. That's not the
case for ''.join, the use of which is not obvious at all.
I do see docs for the built-in hash function and __hash__ method http://docs.python.org/lib/built-in-funcs.html#l2h-34 http://docs.python.org/ref/customization.html#l2h-195
indicating that dictionary lookup uses hashing.
Paul Rubin <http://ph****@NOSPAM.invalid> writes: etc., probably not worth the trouble. Based on examining StringIO.py, on my current computer (512MB ram), with 100 million items, it looks
Better make that 200 million.
[Paul Rubin] here, "temp_buf += v" is supposed to be the same as "temp_buf.write(v)". So the suggestion is to add a __iadd__ method to StringIO and cStringIO.
Any thoughts?
The StringIO API needs to closely mirror the file object API.
Do you want to change everything that is filelike to have +=
as a synonym for write()?
In for a penny; in for a pound.
Raymond
"Raymond Hettinger" <py****@rcn.com> writes: The StringIO API needs to closely mirror the file object API. Do you want to change everything that is filelike to have += as a synonym for write()?
Why would they need that? StringIO objects have getvalue() but other
file-like objects don't. What's wrong with __iadd__ being another
StringIO-specific operation?
And is making += a synonym for write() on other file objects really
that bad an idea? It would be like C++'s use of << for file objects
and could make some code nicer if you like that kind of thing.
What I was really aiming for was something like
java.lang.StringBuffer, if that wasn't obvious, but using an
already-existing class (StringIO). java.lang.StringBuffer supports a
bunch of other operations too, so maybe there's something to be said
for adding something like it to Python and using that instead of
StringIO for this purpose.
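A minimal sketch of such a StringBuffer-like class (hypothetical names, not an existing API; it just wraps the list-accumulate-then-join approach behind a friendlier interface):

```python
class StringBuilder:
    # Hypothetical minimal analogue of java.lang.StringBuffer:
    # accumulate pieces in a list, join once on demand.
    def __init__(self, initial=""):
        self._parts = [initial]
    def append(self, s):
        self._parts.append(s)
        return self  # allow chaining, StringBuffer-style
    def __iadd__(self, s):
        return self.append(s)
    def __str__(self):
        return "".join(self._parts)

sb = StringBuilder("a")
sb += "b"
assert str(sb.append("c")) == "abc"
```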
I also now notice that the StringBuffer doc describes how the Java
compiler is supposed to handle adding multiple String objects by using
a temporary StringBuffer: http://java.sun.com/j2se/1.4.2/docs/...ingBuffer.html
That could be seen as a performance specification.
Paul Rubin <http://ph****@NOSPAM.invalid> wrote: And is making += a synonym for write() on other file objects really that bad an idea? It would be like C++'s use of << for file objects and could make some code nicer if you like that kind of thing.
Not really: <<'s point is to allow chaining, f<<a<<b<<c. += would have
no such "advantage" (or disadvantage, as the case may be).
Alex
al***@mail.comcast.net (Alex Martelli) writes: Not really: <<'s point is to allow chaining, f<<a<<b<<c. += would have no such "advantage" (or disadvantage, as the case may be).
Hmm, ok. I've always found << repulsive in that context though, so
won't suggest it for Python.
On Sun, 29 Jan 2006, Alex Martelli wrote: Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
Maybe the standard versions of some of these things can be written in RPython under PyPy, so they'll compile to fast machine code, and then the C versions won't be needed.
By all means, the C versions are welcome, I just don't want to lose the Python versions either (and making them less readable by recoding them in RPython would interfere with didactical use).
Is RPython really that bad? Lack of generators seems like the only serious
issue to me.
But with CPython I think we need the C versions.
Unless we use Shed Skin to translate the RPython into C++. Or maybe we
could write the code in Pyrex, generate C from that for CPython, then have
a python script which strips out the type definitions to generate pure
python for PyPy.
tom
--
Don't trust the laws of men. Trust the laws of mathematics.
[Raymond Hettinger] The StringIO API needs to closely mirror the file object API. Do you want to change everything that is filelike to have += as a synonym for write()?
[Paul Rubin] Why would they need that?
Polymorphism
StringIO objects have getvalue() but other file-like objects don't. What's wrong with __iadd__ being another StringIO-specific operation?
The getvalue() method isn't a synonym for another method in the file
object API. Its design capitalizes on properties unique to StringIO
(direct access to the full buffer without rewinding the file pointer).
In contrast, the __iadd__() method is a synonym for write(). Its use
cases are the same for both StringIO and file objects.