Bytes | Software Development & Data Engineering Community

Python Memory Usage

I am using Python to process particle data from a physics simulation.
There are about 15 MB of data associated with each simulation, but
there are many simulations. I read the data from each simulation into
Numpy arrays and do a simple calculation on them that involves a few
eigenvalues of small matrices and quite a number of temporary
arrays. I had assumed that generating lots of temporary arrays
would make my program run slowly, but I didn't think that it would
cause the program to consume all of the computer's memory, because I'm
only dealing with 10-20 MB at a time.
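A minimal sketch of the kind of per-simulation calculation described above (the function name, the intermediate arrays, and the formulas are invented for illustration; the original code is not shown in the post):

```python
import numpy as np

def analyze(particles):
    # Toy stand-in for the per-simulation calculation: a few
    # temporary arrays plus eigenvalues of a small (3x3) matrix.
    centered = particles - particles.mean(axis=0)                        # temporary
    scaled = centered / (1.0 + np.sum(centered ** 2, axis=1))[:, None]   # another temporary
    inertia = np.dot(scaled.T, scaled)   # small symmetric matrix
    return np.linalg.eigvalsh(inertia)   # eigenvalues in ascending order

vals = analyze(np.random.rand(1000, 3))
```

Every intermediate expression here allocates a fresh array, which is the pattern the post is describing: many short-lived temporaries per call, but only tens of MB live at once.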

So, I have a function that reliably increases the virtual memory usage
by ~40 MB each time it's run. I'm measuring memory usage by looking
at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
Ubuntu (edgy) system. This seems strange because I only have 15 MB of
data in memory at any one time.
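For reference, the measurement can be done from inside the process itself; here is a small helper that parses the same VmSize/VmRSS lines from /proc (Linux-only, values are in kB):

```python
import os

def vm_usage(pid="self"):
    # Parse the VmSize and VmRSS lines (values in kB) from
    # /proc/<pid>/status; returns {} where procfs is unavailable.
    usage = {}
    path = "/proc/%s/status" % pid
    if not os.path.exists(path):
        return usage
    with open(path) as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value, unit = line.split()   # e.g. "VmRSS:  1234 kB"
                usage[key.rstrip(":")] = int(value)
    return usage

print(vm_usage())
```

Calling this before and after the leaky function makes the ~40 MB per-call growth easy to log.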

I started looking at the difference between what gc.get_objects()
returns before and after my function. I expected to see zillions of
temporary Numpy arrays that I was somehow unintentionally maintaining
references to. However, I found that only 27 additional objects were
in the list that comes from get_objects(), and all of them look
small. A few strings, a few small tuples, a few small dicts, and a
Frame object.
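The before/after comparison can be sketched like this (comparing by id() is approximate, since a freed object's id can be reused, but it is enough to spot zillions of leaked arrays if they were there):

```python
import gc

def new_objects(func, *args):
    # Snapshot gc.get_objects() before calling func, then return
    # (result, objects newly tracked afterwards) -- the manual diff
    # described above.
    before = set(id(o) for o in gc.get_objects())
    result = func(*args)
    after = [o for o in gc.get_objects() if id(o) not in before]
    return result, after

result, extra = new_objects(lambda n: [i * i for i in range(n)], 5)
```

Note that gc only tracks container objects, so this would catch leaked lists, dicts, and (boxed) arrays, but not raw memory held outside the collector.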

I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
which seems to be able to give useful information about memory usage
in Python. This seemed to confirm what I found from manual
inspection: only a few new objects are allocated by my function, and
they're small.

I found Evan Jones article about the Python 2.4 memory allocator never
freeing memory in certain circumstances: http://evanjones.ca/python-memory.html.
This sounds a lot like what's happening to me. However, his patch was
applied in Python 2.5, which is the version I'm using, so that fix
should already be in place. Nevertheless, it looks an awful lot like
Python doesn't think it's holding on to the
memory, but doesn't give it back to the operating system, either. Nor
does Python reuse the memory, since each successive call to my
function consumes an additional 40 MB. This continues until the VM
usage reaches gigabytes and I get a MemoryError.

I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3. I'm also
using a few routines from scipy 0.5.2, but for this part of the code
it's just the eigenvalue routines.

It seems that the standard advice when someone has a bit of Python
code that progressively consumes all memory is to fork a process. I
guess that's not the worst thing in the world, but it certainly is
annoying. Given that others seem to have had this problem, is there a
slick package to do this? I envision:
value = call_in_separate_process(my_func, my_args)
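One way to implement a helper with that signature, assuming a POSIX system (the name matches the sketch above; the implementation is just one possibility, using os.fork and a pipe):

```python
import os
import pickle

def call_in_separate_process(func, *args):
    # Run func(*args) in a forked child, so that ALL memory it
    # allocates goes back to the OS when the child exits.
    # POSIX-only; the result must be picklable.
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                      # child: compute, ship result back
        os.close(r)
        result = func(*args)
        with os.fdopen(w, "wb") as f:
            pickle.dump(result, f)
        os._exit(0)                   # skip normal interpreter shutdown
    os.close(w)                       # parent: read the pickled result
    with os.fdopen(r, "rb") as f:
        result = pickle.load(f)
    os.waitpid(pid, 0)
    return result

value = call_in_separate_process(lambda x: x * x, 7)
```

Because fork copies the parent's memory, the function itself never needs to be pickled, and the parent's footprint stays flat no matter how much the child allocates. (Error handling is omitted: if the child raises, the parent's pickle.load will fail with EOFError.)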

Suggestions about how to proceed are welcome. Ideally I'd like to
know why this is going on and fix it. Short of that, workarounds that
are more clever than the "separate process" one are also welcome.


Jun 20 '07 #1
