cStringIO unicode weirdness

Python 2.5 (r25:51908, Oct 6 2006, 15:24:43)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import StringIO, cStringIO
>>> StringIO.StringIO('a').getvalue()
'a'
>>> cStringIO.StringIO('a').getvalue()
'a'
>>> StringIO.StringIO(u'a').getvalue()
u'a'
>>> cStringIO.StringIO(u'a').getvalue()
'a\x00\x00\x00'
>>>
I would have thought StringIO and cStringIO would return the
same result for this ascii-encodeable string. Worse:
>>> StringIO.StringIO(u'a').getvalue().encode('utf-8').decode('utf-8')
u'a'

does the right thing, but
>>> cStringIO.StringIO(u'a').getvalue().encode('utf-8').decode('utf-8')
u'a\x00\x00\x00'

looks bogus. Am I misunderstanding something?
Jun 18 '07 #1
On Jun 19, 8:56 am, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> Python 2.5 (r25:51908, Oct 6 2006, 15:24:43)
> [GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import StringIO, cStringIO
> >>> StringIO.StringIO('a').getvalue()
> 'a'
> >>> cStringIO.StringIO('a').getvalue()
> 'a'
> >>> StringIO.StringIO(u'a').getvalue()
> u'a'
> >>> cStringIO.StringIO(u'a').getvalue()
> 'a\x00\x00\x00'
> >>>
>
> I would have thought StringIO and cStringIO would return the
> same result for this ascii-encodeable string.

Looks like a bug to me.

> Worse:
> >>> StringIO.StringIO(u'a').getvalue().encode('utf-8').decode('utf-8')
> u'a'
>
> does the right thing, but
> >>> cStringIO.StringIO(u'a').getvalue().encode('utf-8').decode('utf-8')
> u'a\x00\x00\x00'
>
> looks bogus. Am I misunderstanding something?

Not worse, and no more bogus than before. Note that an explicit design
feature of UTF-8 is that ASCII characters (ord(c) < 128) are unchanged
by the transformation.
>>> 'a\x00\x00\x00'.encode('utf-8')
# IMPLICIT conversion to unicode (effectively .decode('ascii')), then
# encoding as utf8
'a\x00\x00\x00' # no change to original buggy result
>>> 'a\x00\x00\x00'.decode('utf-8')
u'a\x00\x00\x00' # as expected
>>>
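
For anyone wondering where the three NUL bytes come from: one plausible reading (an assumption on my part, not something confirmed in this thread) is that the interpreter is a UCS-4 build on a little-endian machine and that cStringIO copies the unicode object's raw internal buffer. The sketch below reproduces the same bytes with an explicit UTF-32-LE encode, then checks that a UTF-8 round trip leaves them alone because every byte is below 128. It uses b'' and u'' literals so it should run on Python 2.6+ and 3.3+; the variable names are mine.

```python
# Assumption: a UCS-4, little-endian build stores u'a' as these four bytes.
raw = u'a'.encode('utf-32-le')
print(repr(raw))

# All four bytes are < 128, so decoding them as UTF-8 maps each byte
# to exactly one code point...
decoded = raw.decode('utf-8')
# ...and encoding back to UTF-8 gives the identical byte string.
round_tripped = decoded.encode('utf-8')

print(repr(decoded))
print(round_tripped == raw)
```

So the encode/decode step neither causes nor hides the problem; the odd bytes are already there when getvalue() returns.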
Jun 18 '07 #2
Paul Rubin wrote:
> Python 2.5 (r25:51908, Oct 6 2006, 15:24:43)
> [GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import StringIO, cStringIO
> >>> StringIO.StringIO('a').getvalue()
> 'a'
> >>> cStringIO.StringIO('a').getvalue()
> 'a'
> >>> StringIO.StringIO(u'a').getvalue()
> u'a'
> >>> cStringIO.StringIO(u'a').getvalue()
> 'a\x00\x00\x00'
> >>>
>
> I would have thought StringIO and cStringIO would return the
> same result for this ascii-encodeable string. Worse:

You would be wrong. The behavior of StringIO and cStringIO is
different under certain circumstances, and those differences are
intended. Among them is when they are confronted with unicode, as you
saw. Another is when provided with an initializer...
>>> cs = cStringIO.StringIO('a')
>>> cs.write('b')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'cStringIO.StringI' object has no attribute 'write'
>>> s = StringIO.StringIO('a')
>>> s.write('b')

There is a Summer of Code project that is working towards making them
behave the same, but the results will need to wait until Python 2.6
and/or 3.0. Note that there are a few "closed, won't fix" bug reports
regarding these exact same issues in the Python bug tracker at SourceForge.
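
If you need a cStringIO object that starts with some content and is still writable, one common workaround is to create it empty and write the initial data yourself. A minimal sketch, assuming byte strings and Python 2.6 or later for the b'' literals; the helper name writable_buffer is made up for illustration, and the import fallback lets the same code run on Python 3, where io.BytesIO fills this role:

```python
try:
    from cStringIO import StringIO as ByteBuffer  # Python 2: the C implementation
except ImportError:
    from io import BytesIO as ByteBuffer  # Python 3 stand-in for byte data


def writable_buffer(initial=b''):
    """Return a writable in-memory buffer pre-filled with `initial`.

    Hypothetical helper, not part of any library.
    """
    buf = ByteBuffer()   # no initializer, so the object supports write()
    buf.write(initial)   # position is now at the end of the initial data
    return buf


buf = writable_buffer(b'a')
buf.write(b'b')
print(repr(buf.getvalue()))
```

Because the position sits after the initial data, later writes append to it rather than overwrite it.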

- Josiah
Jun 19 '07 #3
Josiah Carlson <jo************@sbcglobal.net> writes:
> You would be wrong. The behavior of StringIO and cStringIO is
> different under certain circumstances, and those differences are
> intended. Among them is when they are confronted with unicode, as you
> saw. Another is when provided with an initializer...

The doc says there's only supposed to be a difference if the unicode
can't be represented as ascii. That is not the case with the example
I posted.

> There is a Summer of Code project that is working towards making them
> behave the same, but the results will need to wait until Python 2.6
> and/or 3.0. Note that there are a few "closed, won't fix" bug
> reports regarding these exact same issues in the Python bug tracker at
> SourceForge.

Thanks, this helps. At minimum the 2.5 docs should be updated to
explain the issues.
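
In the meantime, the safest approach I can think of is to never hand a unicode object to cStringIO at all and to encode it explicitly first. A rough sketch: byte_buffer_from_text is a hypothetical helper, UTF-8 is only an assumed default, and io.BytesIO is used so the example runs on Python 2.6+ and 3. On 2.5 the same idea should apply by passing the encoded str to cStringIO.StringIO instead.

```python
import io


def byte_buffer_from_text(text, encoding='utf-8'):
    """Encode `text` explicitly, then wrap the resulting bytes in a buffer.

    Hypothetical helper, not part of any library.
    """
    # The encoding is chosen here, not left to the buffer implementation.
    data = text.encode(encoding)
    return io.BytesIO(data)


buf = byte_buffer_from_text(u'a')
print(repr(buf.getvalue()))
print(repr(buf.getvalue().decode('utf-8')))
```

With the encode step made explicit, the buffer only ever sees bytes, so the result no longer depends on how the interpreter stores unicode internally.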
Jun 19 '07 #4
