473,387 Members | 1,529 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

cPickle asymptotic performance?

Hello,

I've done some benchmarking while attempting to serialize my (large)
graph data structure with cPickle; I'm seeing superlinear performance
(plotting it seems to suggest n^2 where n is the number of nodes of my
graph), in the duration of the pickle.dump calls and I can't quite
figure out why. The connectivity of the graph is such that the number of
nodes is ~ number of edges, so I don't think this is a problem of edge
count secretly growing O(n). Is cPickle's behavior known to be O(n^2)?
Does anyone have any generic tips for speeding up cPickle?

Thanks,
...Eric
Jun 27 '08 #1
3 1472
Eric Jonas <jo***@MIT.EDUwrites:
I've done some benchmarking while attempting to serialize my (large)
graph data structure with cPickle; I'm seeing superlinear performance
(plotting it seems to suggest n^2 where n is the number of nodes of my
graph), in the duration of the pickle.dump calls and I can't quite
figure out why.
Try gc.disable() before loading the pickle, and gc.enable() after.
Is cPickle's behavior known to be O(n^2)?
No, but the garbage collector's sometimes is.
Jun 27 '08 #2

On Thu, 2008-06-12 at 20:57 +0200, Hrvoje Niksic wrote:
Eric Jonas <jo***@MIT.EDUwrites:
I've done some benchmarking while attempting to serialize my (large)
graph data structure with cPickle; I'm seeing superlinear performance
(plotting it seems to suggest n^2 where n is the number of nodes of my
graph), in the duration of the pickle.dump calls and I can't quite
figure out why.

Try gc.disable() before loading the pickle, and gc.enable() after.
Is cPickle's behavior known to be O(n^2)?

No, but the garbage collector's sometimes is.

Wow, that totally fixed it -- we went from 1200 seconds to 60 seconds.
I'm somewhat shocked -- is the occasionally-abysmal behavior of the GC
documented anywhere?
...Eric

Jun 27 '08 #3
Eric Jonas <jo***@MIT.EDUwrites:
>Try gc.disable() before loading the pickle, and gc.enable() after.
Is cPickle's behavior known to be O(n^2)?

No, but the garbage collector's sometimes is.

Wow, that totally fixed it -- we went from 1200 seconds to 60
seconds.
60 seconds is still a long time. How many objects are you
deserializing? Is the time now approximately O(n)?
I'm somewhat shocked -- is the occasionally-abysmal behavior of the
GC documented anywhere?
I don't think so. It's somewhat of an FAQ on the list, though. The
question tends to arise when someone tries to construct a large list
of non-trivial objects, which takes quadratic time because object
allocation regularly triggers GC, which traverses the growing list.
Jun 27 '08 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Richard Kessler | last post by:
I am attempting a GUI using BOA Constructor. I have some simple code to pickle an object, but for some reason when I use cPickle it hangs the system, but pickle works just fine. I do not have a...
2
by: sh | last post by:
Hi guys, Well, I have a (maybe dumb) question. I want to write my own little blog using Python (as a fairly small but doable project for myself to learn more deaply Python in a web context). ...
5
by: Alex Polite | last post by:
I need to put recursive data structures on disc and found out that cPickle doesn't like recursion. What are my options? alex -- Alex Polite http://polite.se
1
by: A.B., Khalid | last post by:
I wonder if someone can explain what is wrong here. I am pickling a list of dictionaries (see code attached) and unpickling it back using the HIGHEST_PROTOCOL of pickle and cPickle. I am getting an...
5
by: Marcus Lowland | last post by:
Hello, I'm fairly new to python and have read about and wanted to begin experimenting with cpickle. As I understand, this should be a native module in the python library. I have python 2.3 and now...
4
by: Mingus Tsai | last post by:
Hello- please help with unpickling problem: I am using Python version 2.3.4 with IDLE version 1.0.3 on a Windows XPhome system. My problem is with using cPickle to deserialize my pickled...
1
by: Carl J. Van Arsdall | last post by:
Hey everyone, cPickle is raising an ImportError that I just don't quite understand. Before I paste the code, let me explain the application. Basically the part of the application that failed is a...
5
by: Victor Kryukov | last post by:
Hello list, The following behavior is completely unexpected. Is it a bug or a by- design feature? Regards, Victor. -----------------
0
by: Calvin Spealman | last post by:
If you are getting to the point where your data is large enough to really care about the speed of cPickle, then maybe its time you moved past pickles for your storage format? 2.5 includes sqlite,...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.