pickle: huge memory consumption *during* pickling

Dear all,

I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the amount of data
grows, pickling slows down rapidly.

This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> the machine was swapping all the time).

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

System: Python 2.3.4, Win XP, 1.X GHz class PC, 512 MB RAM

Best regards
Hans Georg
Jul 18 '05 #1

FWIW, we pickle data extracted from large log files. The number of
pickled objects is about 1500, the size of the pickle file is 55+ MB, and
it takes about 3 minutes to generate that file**.

This is cPickle, using protocol 2 (!), and all strings to be pickled are
interned when initially created.

/Jean Brouwers
ProphICy Semiconductor, Inc.

**) On a dual 2.4 GHz Xeon machine with 2 GB of memory running RedHat
Linux 8.0.

Jul 18 '05 #2
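
As a rough illustration of why interning the strings can help: pickle keeps a memo of objects it has already written, keyed by object identity, so a string object that occurs many times is written once and referenced afterwards, while equal but separate string objects are each written in full. The sketch below assumes a current Python, where `sys.intern` replaces the old built-in `intern` and plain `pickle` replaces `cPickle`; the label strings are made up for the example.

```python
import pickle
import sys

# Build 1000 label strings at run time. Each "".join() call creates a
# separate string object, even though only four distinct values occur.
# (The labels are hypothetical stand-ins for real measurement metadata.)
labels = ["".join(["probe_position_", str(i % 4)]) for i in range(1000)]

# Interning maps equal strings onto one shared object per distinct value.
interned = [sys.intern(s) for s in labels]

plain_bytes = pickle.dumps(labels, 2)
interned_bytes = pickle.dumps(interned, 2)

# Both pickles restore the same list; the interned one is smaller because
# repeated objects are written as short memo references.
print(len(plain_bytes), len(interned_bytes))
```

Whether this matters for a given data set depends on how many repeated strings it actually contains.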
Are you sure you are using cPickle as opposed to pickle?

regards,
Gerrit Holl.

Jul 18 '05 #3
Hans Georg Krauthaeuser <hg*@et.uni-magdeburg.de> wrote:
> I have a long-running application (electromagnetic compatibility
> measurements in mode-stirred chambers over GPIB) that uses pickle
> (cPickle) to autosave a class instance with all the measured data from
> time to time.
>
> At the beginning, pickling is quite fast, but as the amount of data
> grows, pickling slows down rapidly.
>
> This morning we reached the point where it took 6 hours to pickle
> the class instance. The pickle file was then approx. 92 MB (this is OK).
> During pickling, the memory consumption of the Python process was up to
> 450 MB (512 MB RAM -> the machine was swapping all the time).

You've probably got lots of instances of a single class... We managed
to cut the memory requirements to 1/3 in a similar situation by using
new-style classes (inherit from object) and defining __slots__ for just a
single class!

> My class uses data types taken from a C++ class via SWIG. I don't know if
> that is important...

This may be important, I don't know!

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jul 18 '05 #4
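
A minimal sketch of the __slots__ suggestion above, assuming a small pure-Python data class. The class names, the attributes `frequency` and `field`, and the helper `per_instance_bytes` are invented for illustration and are not taken from the original application. The size helper is only a rough estimate: it counts the instance and, if present, its attribute dictionary, but not the attribute values themselves.

```python
import sys


class PlainSample(object):
    """Ordinary new-style class: every instance carries a __dict__."""

    def __init__(self, frequency, field):
        self.frequency = frequency
        self.field = field


class SlottedSample(object):
    """Same data, but attributes live in fixed slots instead of a dict."""

    __slots__ = ("frequency", "field")

    def __init__(self, frequency, field):
        self.frequency = frequency
        self.field = field


def per_instance_bytes(obj):
    """Rough per-instance overhead: the object plus its __dict__, if any."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


plain = PlainSample(1.0e9, 3.2)
slotted = SlottedSample(1.0e9, 3.2)
print(per_instance_bytes(plain), per_instance_bytes(slotted))
```

The saving per instance is small, so it only pays off when there are very many instances of the class, as suspected above.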

Good point. Double check first that you use

- import cPickle

- cPickle.dump(<obj>, <file>, 2) # note, protocol 2

/Jean Brouwers
ProphICy Semiconductor, Inc.

Jul 18 '05 #5
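
Spelled out a little further, the check above could look like the sketch below. It is written for a current Python, where plain `pickle` already uses the C implementation that `cPickle` provided in Python 2. The function names `autosave` and `restore`, the file name, and the sample dictionary are assumptions for the example. One detail worth noting on Windows: protocols 1 and 2 are binary formats, so the file should be opened in binary mode.

```python
import os
import pickle
import tempfile


def autosave(obj, path, protocol=2):
    """Write obj to path using the given pickle protocol."""
    # "wb": binary mode, needed for the binary pickle protocols.
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol)


def restore(path):
    """Read back an object written by autosave()."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in data; the real application would pass its measurement object.
data = {"frequency": [1.0e9, 2.0e9], "field": [0.5, 0.25]}
path = os.path.join(tempfile.mkdtemp(), "autosave.pkl")
autosave(data, path)
restored = restore(path)
```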
The tutorial books I read (including the Python Bible, I think) said
that pickle shouldn't be used for large objects, so I try to limit it
to smaller objects in small applications. I always wondered what they
meant by large objects; maybe this is an illustration of that? :)

Would it not be possible to save your data as a plain file on your disk
(or use a class method to dump the stored data to a file)? You could
always reload it from there for further use. Or split the class into
several smaller ones, each of which might pickle more efficiently?

Regards

Tony Clarke
Jul 18 '05 #6
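
One possible reading of the "split it up" idea, sketched under assumptions: instead of re-pickling the whole object at every autosave, each new measurement is appended to the file as its own small pickle, and all records are read back one after another until the end of the file. The function names and the record layout are hypothetical, and the code targets a current Python.

```python
import os
import pickle
import tempfile


def append_record(path, record, protocol=2):
    """Append one measurement record to the autosave file."""
    # "ab": append in binary mode; earlier records are left untouched.
    with open(path, "ab") as f:
        pickle.dump(record, f, protocol)


def load_records(path):
    """Read back every record appended so far, in order."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                # No more pickles in the file.
                break
    return records


path = os.path.join(tempfile.mkdtemp(), "measurements.pkl")
append_record(path, {"frequency": 1.0e9, "field": [0.5, 0.7]})
append_record(path, {"frequency": 2.0e9, "field": [0.4, 0.6]})
records = load_records(path)
```

With this layout the cost of one autosave depends on the size of the new record rather than on everything measured so far. Objects shared between records are written again in each record, though, so it suits data that splits into independent pieces.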
Yes, I'm sure that I'm using cPickle.

But, I don't use protocol 2. I will try that and post the difference.

Thanks for the hint.

Hans Georg

Jul 18 '05 #7
Nick Craig-Wood wrote:
> ....
> You've probably got lots of instances of a single class... We managed
> to cut the memory requirements to 1/3 in a similar situation by using
> new-style classes (inherit from object) and defining __slots__ for just a
> single class!

You are right. My data are class objects taken from a C++ class (via SWIG).


Interesting, I hadn't noticed __slots__ before.

In the SWIG wrapper I found that my objects inherit from _object,
which is defined as follows:

    import types
    try:
        _object = types.ObjectType
        _newclass = 1
    except AttributeError:
        class _object : pass
        _newclass = 0
    del types

So these are new-style classes.

Now I have to see how to get SWIG to generate __slots__ ...

Thanks,

Hans Georg
Jul 18 '05 #8

Using __slots__ will reduce memory usage quite a bit. Like the OP, we
found significant improvements in both memory usage and speed; more
details here:

<http://mail.python.org/pipermail/python-list/2004-May/220513.html>

But __slots__ do have restrictions, see

<http://docs.python.org/ref/slots.html>

Also, it is unclear whether SWIG can generate a class with __slots__ at
all.

/Jean Brouwers
ProphICy Semiconductor, Inc.

Jul 18 '05 #9
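
Two of the restrictions mentioned above can be seen in a few lines. This is only a sketch with invented class names, not SWIG-generated code: attributes that are not listed in __slots__ cannot be added to an instance, and a subclass that does not define __slots__ itself gets a per-instance __dict__ again, which gives up the memory saving.

```python
class Sample(object):
    __slots__ = ("frequency", "field")

    def __init__(self, frequency, field):
        self.frequency = frequency
        self.field = field


class LooseSample(Sample):
    # No __slots__ here, so instances of the subclass have a __dict__.
    pass


s = Sample(1.0e9, 3.2)
try:
    s.comment = "not allowed"  # "comment" is not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

loose = LooseSample(1.0e9, 3.2)
loose.comment = "allowed"  # stored in the subclass instance's __dict__
```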
