cPickle.dumps differs from pickle.dumps; looks like a bug.

Hello list,

I've found the following strange behavior of cPickle. Do you think
it's a bug, or is it by design?

Best regards,
Victor.

from pickle import dumps
from cPickle import dumps as cdumps

print dumps('1001799') == dumps(str(1001799))
print cdumps('1001799') == cdumps(str(1001799))

outputs

True
False
vicbook:~ victor$ python
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
vicbook:~ victor$ uname -a
Darwin vicbook 8.9.1 Darwin Kernel Version 8.9.1: Thu Feb 22 20:55:00
PST 2007; root:xnu-792.18.15~1/RELEASE_I386 i386 i386

May 16 '07 #1
If you unpickle, though, will the results be the same? I suspect they
will be. That should matter most of all (unless you plan to compare
objects' identity based on their pickled version).

Remember that by default pickle and cPickle create the longer
ASCII representation (protocol 0); for a binary representation use a
higher pickle protocol, such as 2.
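
The check suggested above can be sketched as follows. This is a minimal sketch written for Python 3, where the C accelerator is used by `pickle` automatically and there is no separate `cPickle` module, so it does not reproduce the Python 2.5 byte-level difference from the original post; the helper name `round_trips_equal` is made up for illustration.

```python
import pickle

literal = '1001799'
computed = str(1001799)


def round_trips_equal(x, y, protocol):
    """Return True if x and y unpickle to equal values under the given protocol."""
    restored_x = pickle.loads(pickle.dumps(x, protocol))
    restored_y = pickle.loads(pickle.dumps(y, protocol))
    return restored_x == restored_y


# Protocol 0 is the text protocol; protocol 2 is a binary one.
for protocol in (0, 2):
    # The byte streams are an implementation detail; the restored values are what count.
    print(protocol, round_trips_equal(literal, computed, protocol))
```

Comparing the restored objects rather than the pickled strings sidesteps any difference in how the two inputs happen to be serialized.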

Hope that helps,
-Nick Vatamaniuc

May 16 '07 #2
> If you unpickle though will the results be the same? I suspect they
> will be. That should matter most of all (unless you plan to compare
> objects' identity based on their pickled version.)

The OP was not comparing identity but equality. So it looks like a
real bug, I think the following should be True for any function f:

if a == b: f(a) == f(b)

or not?

Daniel
May 16 '07 #3
I might have found the culprit: see http://svn.python.org/projects/pytho...ules/cPickle.c
The function static int put2(...) contains the following block:

---------cPickle.c-----------
int p;
....
if ((p = PyDict_Size(self->memo)) < 0) goto finally;
/* Make sure memo keys are positive! */
/* XXX Why?
 * XXX And does "positive" really mean non-negative?
 * XXX pickle.py starts with PUT index 0, not 1.  This makes for
 * XXX gratuitous differences between the pickling modules.
 */
p++;
-------------------------------

p++ will cause the difference. It seems the developers are not quite
sure why it's there, or whether memo keys can start at 0 or have to
start at 1.

Here is the corresponding section of the Python version (pickle.py), taken
from Python 2.5:
---------pickle.py----------
def memoize(self, obj):
    """Store an object in the memo."""

    # The Pickler memo is a dictionary mapping object ids to 2-tuples
    # that contain the Unpickler memo key and the object being memoized.
    # The memo key is written to the pickle and will become
    # the key in the Unpickler's memo.  The object is stored in the
    # Pickler memo so that transient objects are kept alive during
    # pickling.

    # The use of the Unpickler memo length as the memo key is just a
    # convention.  The only requirement is that the memo values be unique.
    # But there appears no advantage to any other scheme, and this
    # scheme allows the Unpickler memo to be implemented as a plain (but
    # growable) array, indexed by memo key.
    if self.fast:
        return
    assert id(obj) not in self.memo
    memo_len = len(self.memo)
    self.write(self.put(memo_len))
    self.memo[id(obj)] = memo_len, obj

# Return a PUT (BINPUT, LONG_BINPUT) opcode string, with argument i.
def put(self, i, pack=struct.pack):
    if self.bin:
        if i < 256:
            return BINPUT + chr(i)
        else:
            return LONG_BINPUT + pack("<i", i)
    return PUT + repr(i) + '\n'
------------------------------------------

In memoize, memo_len plays the role of 'int p' in the C version. In
pickle.py the memo size is initially 0 and is used as the key as is,
while in the C version the size is initially 0 but is then incremented
with p++, so the first key written is 1.
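
The numbering difference described above can be modelled with a toy sketch. It is not the real implementation: the helper names `text_put`, `pickle_py_key` and `cpickle_key` are hypothetical, and only the text-mode branch of `put()` quoted above is imitated.

```python
def text_put(index):
    """Build a text-mode PUT opcode string, as pickle.py's put() does when not in binary mode."""
    return "p" + repr(index) + "\n"


def pickle_py_key(memo):
    # pickle.py: the memo key is simply the current memo size.
    return len(memo)


def cpickle_key(memo):
    # cPickle.c put2(): the size is read and then incremented (p++) before use.
    return len(memo) + 1


empty_memo = {}
print(repr(text_put(pickle_py_key(empty_memo))))  # 'p0\n'
print(repr(text_put(cpickle_key(empty_memo))))    # 'p1\n'
```

Under this model the first memoized object gets key 0 from pickle.py and key 1 from cPickle, which is the "gratuitous difference" the C comment refers to.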

Any developers that know more about this?

-Nick Vatamaniuc

May 16 '07 #4
Daniel Nogradi wrote:
> The OP was not comparing identity but equality. So it looks like a
> real bug, I think the following should be True for any function f:
>
> if a == b: f(a) == f(b)
>
> or not?

In [74]: def f(x):
   ....:     return x / 2
   ....:

In [75]: a = 5

In [76]: b = 5.0

In [77]: a == b
Out[77]: True

In [78]: f(a) == f(b)
Out[78]: False

And `f()` doesn't even use something like `random()` or `time()` here. ;-)
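
The session above relies on Python 2 integer division; under Python 3, where `/` is true division, that particular `f` returns 2.5 for both inputs. A small sketch of the same point that also holds in Python 3, using `repr` and `pickle.dumps` as the function:

```python
import pickle

a, b = 5, 5.0

print(a == b)                              # True: the values compare equal
print(repr(a) == repr(b))                  # False: '5' versus '5.0'
print(pickle.dumps(a) == pickle.dumps(b))  # False: an int and a float are pickled differently
```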

Ciao,
Marc 'BlackJack' Rintsch
May 16 '07 #5
On 5/16/07, Daniel Nogradi <no*****@gmail.com> wrote:
> The OP was not comparing identity but equality. So it looks like a
> real bug, I think the following should be True for any function f:
>
> if a == b: f(a) == f(b)
>
> or not?

Obviously not, in the general case. random.random() is the most
obvious example, but there are any number of functions which don't return
the same value for equal inputs. Take file() or open(): since you get
a new file object with new state, it obviously will not be equal even
if it's the same file path.

For certain inputs, cPickle doesn't print the memo information that is
used to support recursive and shared data structures. I'm not sure how
it tells the difference, perhaps it has something to do with
refcounts. In any case, it's an optimization of the pickle output, not
a bug.
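
Whatever rule cPickle uses to decide, a PUT that no later GET refers to is redundant, which is why leaving it out does not change what is unpickled. A minimal Python 3 sketch using the standard library's `pickletools.optimize`, which removes unused PUT opcodes; the helper `opcode_names` is made up for illustration:

```python
import pickle
import pickletools

data = pickle.dumps('1001799', protocol=0)
stripped = pickletools.optimize(data)  # drops PUT opcodes that are never fetched by a GET


def opcode_names(pickled):
    """List the opcode names found in a pickle byte string."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(pickled)]


print(opcode_names(data))
print(opcode_names(stripped))
print(pickle.loads(data) == pickle.loads(stripped))  # True: same value either way
```

The two byte strings may differ while both unpickle to the same string, which matches the behaviour reported in the original post.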
May 16 '07 #6
> Obviously not, in the general case. random.random() is the most
> obvious example, but there are any number of functions which don't return
> the same value for equal inputs. Take file() or open(): since you get
> a new file object with new state, it obviously will not be equal even
> if it's the same file path.

Right, sorry about that, posted too quickly :)
I was thinking for a while about a deterministic

> For certain inputs, cPickle doesn't print the memo information that is
> used to support recursive and shared data structures. I'm not sure how
> it tells the difference, perhaps it has something to do with
> refcounts. In any case, it's an optimization of the pickle output, not
> a bug.

Caching?

>>> from cPickle import dumps
>>> dumps('0') == dumps(str(0))
True
>>> dumps('1') == dumps(str(1))
True
>>> dumps('2') == dumps(str(2))
True
.........
.........
>>> dumps('9') == dumps(str(9))
True
>>> dumps('10') == dumps(str(10))
False
>>> dumps('11') == dumps(str(11))
False

Daniel
May 16 '07 #7
Daniel Nogradi wrote:
> Caching?
>
> >>> from cPickle import dumps
> >>> dumps('0') == dumps(str(0))
> True
> >>> dumps('1') == dumps(str(1))
> True
> >>> dumps('2') == dumps(str(2))
> True
> ........
> ........
> >>> dumps('9') == dumps(str(9))
> True
> >>> dumps('10') == dumps(str(10))
> False
> >>> dumps('11') == dumps(str(11))
> False

All strings of length 0 (there is 1) and 1 (there are 256) are interned.

- Josiah
May 17 '07 #8
On Thu, 17 May 2007 02:09:02 -0300, Josiah Carlson
<jo************@sbcglobal.net> wrote:
> All strings of length 0 (there is 1) and 1 (there are 256) are interned.

I thought it was the case too, but not always:

py> a = "a"
py> b = "A".lower()
py> a==b
True
py> a is b
False
py> a is intern(a)
True
py> b is intern(b)
False
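
A rough Python 3 counterpart of the session above, where the builtin `intern` has moved to `sys.intern`. Whether `a is b` holds is an implementation detail that varies between interpreter versions, so the sketch makes no claim about it; the only guarantee used is that `sys.intern` returns one canonical object per string value.

```python
import sys

a = "a"
b = "A".lower()

print(a == b)   # True: the values are equal
print(a is b)   # implementation-dependent; do not rely on the result
print(sys.intern(a) is sys.intern(b))  # True: equal strings intern to the same object
```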

--
Gabriel Genellina

May 17 '07 #9
