
pickle: huge memory consumption *during* pickling

Dear all,

I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.

This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

System: Python 2.3.4, Win XP, 1.X GHz class PC, 512 MB RAM

Best regards
Hans Georg
Jul 18 '05 #1

FWIW, we pickle data extracted from large log files. The number of
pickled objects is about 1500, the size of the pickle file is 55+ MB, and
it takes about 3 minutes to generate that file**.

This is cPickle, using protocol 2 (!), and all strings to be pickled are
interned when initially created.

/Jean Brouwers
ProphICy Semiconductor, Inc.

**) On a dual 2.4 GHz Xeon machine with 2 GB of memory running RedHat
Linux 8.0.
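
A minimal sketch of why interning the strings can matter for pickling. It assumes Python 3, where the `pickle` module stands in for `cPickle` and `sys.intern` replaces the Python 2 builtin `intern()`; the channel labels are made up for illustration. The pickler remembers objects it has already written, so a string object that occurs many times is written once and referenced afterwards, while equal but distinct string objects are each written in full.

```python
import pickle
import sys

# Hypothetical data: 1000 labels built at run time, only 3 distinct values.
# Each formatting operation creates a new string object.
labels = ["channel-%d" % (i % 3) for i in range(1000)]

# sys.intern maps equal strings onto one shared object.
interned = [sys.intern(label) for label in labels]

plain_blob = pickle.dumps(labels, protocol=2)
interned_blob = pickle.dumps(interned, protocol=2)

# Both pickles restore the same list; only their sizes differ.
print(len(plain_blob), len(interned_blob))
```

Whether this helps in a given application depends on how many repeated strings the data actually contains.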

Jul 18 '05 #2

Hans Georg Krauthaeuser wrote:
This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?


Are you sure you are using cPickle as opposed to pickle?

regards,
Gerrit Holl.

--
Weather in Twenthe, Netherlands 11/11 19:25:
-1.0°C wind 0.4 m/s None (57 m above NAP)
--
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
-Dwight David Eisenhower, January 17, 1961
Jul 18 '05 #3

Hans Georg Krauthaeuser <hg*@et.uni-magdeburg.de> wrote:
I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.

This morning we reached the point where it took 6 hours to pickle
the class instance. The pickle file was then approx. 92 MB (this is OK).
During pickling, the memory consumption of the Python process was up to
450 MB (512 MB RAM -> machine was swapping all the time).

You've probably got lots of instances of a single class... We managed
to cut the memory requirements to 1/3 in a similar situation by using
new-style classes (inherit from object) and defining __slots__ for just
a single class!
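
A rough sketch of the `__slots__` idea above, assuming Python 3 and its `tracemalloc` module for the measurement. The class names and the `freq`/`level` attributes are hypothetical stand-ins for the measured data, not taken from the original application. The numbers it prints vary between Python versions, so treat them as an indication only.

```python
import tracemalloc


class SampleDict(object):
    """Ordinary class: every instance can carry its own __dict__."""

    def __init__(self, freq, level):
        self.freq = freq
        self.level = level


class SampleSlots(object):
    """Same data, but attributes live in fixed slots instead of a __dict__."""

    __slots__ = ("freq", "level")

    def __init__(self, freq, level):
        self.freq = freq
        self.level = level


def allocated_bytes(cls, count=50000):
    """Return the bytes traced while building `count` instances of cls."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    items = [cls(float(i), -float(i)) for i in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    assert len(items) == count  # keep the instances alive until measured
    return after - before


dict_bytes = allocated_bytes(SampleDict)
slots_bytes = allocated_bytes(SampleSlots)
print("with __dict__ :", dict_bytes, "bytes")
print("with __slots__:", slots_bytes, "bytes")
```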

My class uses data types taken from a C++ class via SWIG. I don't know if
that is important...

This may be important, I don't know!

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Jul 18 '05 #4


Good point. Double check first that you use

- import cPickle

- cPickle.dump(<obj>, <file>, 2) # note, protocol 2
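
To give a feel for the difference, here is a small sketch comparing protocol 0 with protocol 2. It assumes Python 3, where `pickle` uses the C implementation automatically and `cPickle` no longer exists; the list of floats is a made-up stand-in for the measured data.

```python
import math
import os
import pickle
import tempfile

# Hypothetical stand-in for measured data: 10000 full-precision floats.
data = [math.sin(i) for i in range(10000)]

text_blob = pickle.dumps(data, protocol=0)    # ASCII protocol, the Python 2 default
binary_blob = pickle.dumps(data, protocol=2)  # binary protocol suggested above
print(len(text_blob), len(binary_blob))

# Binary protocols need the file opened in binary mode ("wb"/"rb"),
# which matters on Windows in particular.
path = os.path.join(tempfile.mkdtemp(), "autosave.pkl")
with open(path, "wb") as fh:
    pickle.dump(data, fh, 2)
with open(path, "rb") as fh:
    restored = pickle.load(fh)
```

How much time and memory this saves for SWIG-wrapped objects is not something the sketch can show; it only illustrates the size difference for plain floats.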

/Jean Brouwers
ProphICy Semiconductor, Inc.

Jul 18 '05 #5

Hans Georg Krauthaeuser <hg*@et.uni-magdeburg.de> wrote in message news:<cn**********@fuerst.cs.uni-magdeburg.de>...
Dear all,

I have a long-running application (electromagnetic compatibility
measurements in mode-stirred chambers over GPIB) that uses pickle
(cPickle) to autosave a class instance with all the measured data from
time to time.

At the beginning, pickling is quite fast, but as the data grows,
pickling slows down rapidly.
(Snip)
My feeling is that I'm doing something wrong, but my Python knowledge is
not deep enough to see what that is.

Is there another way to perform an autosave of a class instance? Shelve?

System: Python 2.3.4, Win XP, 1.X GHz class PC, 512 MB RAM

Best regards
Hans Georg


The tutorial books I read (including the Python Bible, I think) said
that pickle shouldn't be used for large objects, so I try to limit it
to smaller objects in small applications. I always wondered what they
meant by large objects; maybe this is an illustration of that? :)

Would it not be possible to save your data to a file on disk (or use a
class method to write the stored data to a file)? You could always
reload it from there for further use. Or split the class into several
smaller ones, each of which might be more efficient to pickle?
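
One way to read the suggestion above is to stop re-pickling the whole instance on every autosave and append each new measurement to the file as its own small pickle. A minimal sketch, assuming Python 3; the function names, the file name and the record layout are hypothetical.

```python
import os
import pickle
import tempfile


def append_record(path, record):
    """Append one record to the file as an independent pickle."""
    with open(path, "ab") as fh:
        pickle.dump(record, fh, protocol=2)


def load_records(path):
    """Read back every record appended so far, in order."""
    records = []
    with open(path, "rb") as fh:
        while True:
            try:
                records.append(pickle.load(fh))
            except EOFError:
                break  # reached the end of the file
    return records


path = os.path.join(tempfile.mkdtemp(), "autosave.pkl")
for step in range(3):
    # Each autosave only writes the new data, not everything measured so far.
    append_record(path, {"step": step, "samples": [step * 0.5, step * 1.5]})

restored = load_records(path)
print(restored)
```

With this layout the cost of one autosave depends on the size of the new record rather than on the total amount of data, at the price of having to rebuild the full object when loading.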

Regards

Tony Clarke
Jul 18 '05 #6

Yes, I'm sure that I'm using cPickle.

But, I don't use protocol 2. I will try that and post the difference.

Thanks for the hint.

Hans Georg

Jul 18 '05 #7

Nick Craig-Wood wrote:
....
You've probably got lots of instances of a single class...

You are right. My data are class objects taken from a C++ class (via SWIG).

We managed to cut the memory requirements to 1/3 in a similar situation
by using new-style classes (inherit from object) and defining __slots__
for just a single class!

Interesting, I hadn't noticed __slots__ before.

In the SWIG wrapper I found that my objects inherit from _object,
which is defined as

import types
try:
    _object = types.ObjectType
    _newclass = 1
except AttributeError:
    class _object: pass
    _newclass = 0
del types

So these are new-style classes.

Now I have to see how to get SWIG to generate __slots__ ...

Thanks,

Hans Georg
Jul 18 '05 #8


Using __slots__ will reduce memory usage quite a bit. Like the OP, we
found significant improvements in both memory usage and speed; more
details here:

<http://mail.python.org/pipermail/python-list/2004-May/220513.html>

But __slots__ does have restrictions; see

<http://docs.python.org/ref/slots.html>
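
As a small illustration of one such restriction, here is a sketch assuming Python 3; the `Reading` class and its attribute names are made up. An instance of a class with `__slots__` has no per-instance `__dict__`, so it only accepts the attributes that were declared.

```python
class Reading(object):
    # The only attributes an instance may have.
    __slots__ = ("freq", "level")


reading = Reading()
reading.freq = 1.0e9  # fine: declared in __slots__

try:
    reading.unit = "dBm"  # not declared in __slots__
except AttributeError:
    extra_attribute_allowed = False
else:
    extra_attribute_allowed = True

has_instance_dict = hasattr(reading, "__dict__")
print(extra_attribute_allowed, has_instance_dict)
```

Code that attaches extra attributes to instances at run time would therefore need changes before `__slots__` can be used.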

Also, it is unclear whether SWIG can generate a class with __slots__ at
all.

/Jean Brouwers
ProphICy Semiconductor, Inc.

Jul 18 '05 #9
