Bytes | Software Development & Data Engineering Community

Python memory handling

Greets,

I'm having some trouble getting Python to free memory; how can I force
it to release the memory?
I've tried del and gc.collect() with no success.
Here is a code sample, parsing an XML file under Linux with Python 2.4
(the same problem occurs with Python 2.5 on Windows, tried with the
first example):
#Python interpreter memory usage: 1.1 Mb private, 1.4 Mb shared
#Using http://www.pixelbeat.org/scripts/ps_mem.py to get memory information
import cElementTree as ElementTree #meminfo: 2.3 Mb private, 1.6 Mb shared
import gc #no memory change

et=ElementTree.parse('primary.xml') #meminfo: 34.6 Mb private, 1.6 Mb shared
del et #no memory change
gc.collect() #no memory change

So how can I free the 32.3 Mb taken by ElementTree?
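For what it's worth, the usual way to keep ElementTree's footprint down is to avoid building the whole tree at once: iterparse() yields elements as they complete, and clearing each one lets its memory be reused immediately. A minimal sketch (the <package>/<name> structure is made up for illustration; a real primary.xml would be streamed from disk instead of a BytesIO):

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# A tiny in-memory stand-in for primary.xml.
xml_data = b"<packages>" + b"".join(
    b"<package><name>pkg%d</name></package>" % i for i in range(1000)
) + b"</packages>"

names = []
# iterparse yields each element when its end tag is seen, so the
# full tree never has to be resident in memory at once.
for event, elem in ET.iterparse(BytesIO(xml_data), events=("end",)):
    if elem.tag == "package":
        names.append(elem.find("name").text)
        elem.clear()  # discard the element's children right away

print(len(names))  # 1000
```

One caveat: the root element still accumulates the cleared children as empty placeholders, so for very large files the root should be cleared periodically as well.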

The same problem occurs with a simple file.readlines():
#Python interpreter memory usage: 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
del data #meminfo: 11.5 Mb private, 1.4 Mb shared
gc.collect() # no memory change

But it works fine with file.read():
#Python interpreter memory usage: 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.read() #meminfo: 7.3 Mb private, 1.4 Mb shared
del data #meminfo: 1.1 Mb private, 1.4 Mb shared
gc.collect() # no memory change
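As an aside, the readlines() spike can usually be avoided entirely by iterating over the file object itself, which reads one line at a time instead of materializing a list of per-line string objects. A small sketch (it writes its own throwaway sample file rather than assuming primary.xml exists):

```python
import os
import tempfile

# Write a small sample file standing in for primary.xml.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    for i in range(10000):
        f.write("line %d\n" % i)

# Iterating the file object keeps only one line alive at a time,
# so no large list of small string objects is ever built.
count = 0
with open(path) as f:
    for line in f:
        count += 1

print(count)  # 10000
```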

So as far as I can see, Python maintains a memory pool for lists.
In my first example, if I reparse the XML file, the memory doesn't
grow very much (0.1 Mb, precisely), so I think I'm right about the
memory pool.

But is there a way to force Python to release this memory?!

Regards,
FP

May 31 '07 #1
In <11**********************@h2g2000hsg.googlegroups.com>, frederic.pica
wrote:
So as I can see, python maintain a memory pool for lists.
In my first example, if I reparse the xml file, the memory doesn't
grow very much (0.1 Mb precisely)
So I think I'm right with the memory pool.

But is there a way to force python to release this memory ?!
AFAIK not. But why is this important as long as the memory consumption
doesn't grow constantly? The virtual memory management of the operating
system usually takes care that only actually used memory is in physical
RAM.

Ciao,
Marc 'BlackJack' Rintsch
May 31 '07 #2
On 31 May, 14:16, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
In <1180611604.247696.149...@h2g2000hsg.googlegroups.com>, frederic.pica
wrote:
[...]
But is there a way to force python to release this memory ?!

AFAIK not. But why is this important as long as the memory consumption
doesn't grow constantly? The virtual memory management of the operating
system usually takes care that only actually used memory is in physical
RAM.

Ciao,
Marc 'BlackJack' Rintsch
Because I'm an adherent of "small is beautiful"; of course the OS will
swap the unused memory out if needed.
If I daemonize this application I will have a constant 40 Mb in use,
not free for other applications. If another application needs this
memory, the OS will have to swap and lose time on behalf of the other
application... And I'm not sure that the system will swap this unused
memory first; it could also swap another application first... AFAIK.
And these 40 Mb are only for a 7 Mb XML file, so what about parsing a
big one, like 50 Mb?

I would have preferred to have the choice of manually freeing this
unused memory, or of manually setting the size of the memory pool.

Regards,
FP

May 31 '07 #3
Hello,

fr***********@gmail.com wrote:
I'm having some trouble getting Python to free memory; how can I force
it to release the memory?
I've tried del and gc.collect() with no success.
[...]
The same problem occurs with a simple file.readlines():
#Python interpreter memory usage: 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
del data #meminfo: 11.5 Mb private, 1.4 Mb shared
gc.collect() # no memory change

But it works fine with file.read():
#Python interpreter memory usage: 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.read() #meminfo: 7.3 Mb private, 1.4 Mb shared
del data #meminfo: 1.1 Mb private, 1.4 Mb shared
gc.collect() # no memory change

So as far as I can see, Python maintains a memory pool for lists.
In my first example, if I reparse the XML file, the memory doesn't
grow very much (0.1 Mb, precisely), so I think I'm right about the
memory pool.

But is there a way to force Python to release this memory?!
This is from the 2.5 series release notes
(http://www.python.org/download/relea...5.1/NEWS.txt):

"[...]

- Patch #1123430: Python's small-object allocator now returns an arena to
the system ``free()`` when all memory within an arena becomes unused
again. Prior to Python 2.5, arenas (256KB chunks of memory) were never
freed. Some applications will see a drop in virtual memory size now,
especially long-running applications that, from time to time, temporarily
use a large number of small objects. Note that when Python returns an
arena to the platform C's ``free()``, there's no guarantee that the
platform C library will in turn return that memory to the operating
system.
The effect of the patch is to stop making that impossible, and in
tests it appears to be effective at least on Microsoft C and gcc-based
systems.
Thanks to Evan Jones for hard work and patience.

[...]"

So with 2.4 under Linux (as you tested) you will indeed not always get
the used memory back when lots of small objects are collected.

The difference you see between f.read() and f.readlines() is therefore
(I think) that the former reads in the whole file as one large string
object (i.e. not a small object), while the latter returns a list of
lines, where each line is a Python object.
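Paul's explanation can be made concrete with sys.getsizeof (added in Python 2.6, so slightly newer than the interpreters discussed in this thread): a single large string pays one object header, while a list of lines pays a header per line plus the list itself. A rough sketch:

```python
import sys

# The same content as one big string and as many small strings.
data = ("x" * 79 + "\n") * 10000      # one 800,000-character string
lines = data.splitlines(True)         # 10,000 separate string objects

one_object = sys.getsizeof(data)
many_objects = sys.getsizeof(lines) + sum(sys.getsizeof(l) for l in lines)

# The list-of-lines form costs noticeably more in total because of
# the per-object overhead on every line.
print(many_objects > one_object)  # True
```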

I wonder how 2.5 would work out on linux in this situation for you.

Paul
May 31 '07 #4
On 31 May, 16:22, Paul Melis <p...@science.uva.nl> wrote:
[...]
I wonder how 2.5 would work out on linux in this situation for you.

Paul

Hello,

I will try later with Python 2.5 under Linux, but as far as I can see,
it's the same problem under my Windows Python 2.5.
After reading this document:
http://evanjones.ca/memoryallocator/python-memory.pdf

I think it's because lists or dictionaries are used by the parser, and
Python uses an internal memory pool (not pymalloc) for them...

Regards,
FP

May 31 '07 #5
If the memory usage is that important to you, you could break this out
into 2 programs, one that starts the jobs when needed, the other that
does the processing and then quits.
As long as the python startup time isn't an issue for you.
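That two-process idea can be sketched with the subprocess module: a short-lived worker does the heavy parsing, prints only the summary the parent needs, and returns every byte it allocated to the OS when it exits, regardless of what the interpreter's internal pools are doing. (The inline XML and the single-number result are made up for illustration.)

```python
import subprocess
import sys

# Worker script: parse some XML and print a single summary value.
worker = r"""
import xml.etree.ElementTree as ET
root = ET.fromstring("<root>" + "<item/>" * 500 + "</root>")
print(len(root))
"""

# When the child exits, all of its memory goes back to the operating
# system; the parent only holds the small result string.
result = subprocess.run(
    [sys.executable, "-c", worker],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())  # 500
```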
On 31 May 2007 04:40:04 -0700, fr***********@gmail.com
<fr***********@gmail.com> wrote:
Greets,

I've some troubles getting my memory freed by python, how can I force
it to release the memory ?
[...]
May 31 '07 #6
On 31 May, 17:29, "Josh Bloom" <joshbl...@gmail.com> wrote:
If the memory usage is that important to you, you could break this out
into 2 programs, one that starts the jobs when needed, the other that
does the processing and then quits.
As long as the python startup time isn't an issue for you.

[...]

Yes, it's a solution, but I don't think it's a good way; I didn't want
to use bad hacks to work around a Python-specific problem.
And the problem is everywhere, for every Python program that has to
manage big files.
I've tried xml.dom.minidom with a 66 Mb XML file: 675 Mb of memory
that will never be freed. But that time I got many unreachable
objects when running gc.collect().
Using the same file with cElementTree took 217 Mb, with no
unreachable objects.
For me it's not good behavior; letting the system swap this unused
memory instead of freeing it is not a good way to go.
I think a memory pool is a really good idea for performance reasons,
but why is there no 'free block' limit?
Python is a really good language that can do many things in a clear,
easy and efficient way, I think. It has always fit all my needs. But I
can't imagine there is no good solution to this problem: limiting the
free-block pool size, or better, letting the user specify this limit,
and even better, letting the user free it completely (along with
manually specifying the limit).

Like:
import pool
pool.free()
pool.limit(size in megabytes)

Why not let the user choose that? Why not give the user more
flexibility?
I will try later under Linux with the latest stable Python.
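For completeness: Python has never grown a pool.free()/pool.limit() API, but on glibc-based Linux you can at least ask the C allocator to hand its free pages back to the kernel through malloc_trim(3) via ctypes. This is a platform-specific sketch, not a documented Python facility, and it cannot touch blocks that Python's own object allocator is still holding (which, before the 2.5 arena fix, was the real problem):

```python
import ctypes
import ctypes.util

# Locate and load the C library; this only works where a libc
# exposing malloc_trim is available, e.g. glibc on Linux.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# malloc_trim(0) asks the allocator to release unused heap memory
# back to the OS; it returns 1 if anything was released, else 0.
released = libc.malloc_trim(0)
print(released in (0, 1))  # True
```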

Regards,
FP

May 31 '07 #7
Like:
import pool
pool.free()
pool.limit(size in megabytes)
[...]
The idea that memory allocated to a process but not being used is a
"cost" is really a fallacy, at least on modern virtual memory systems.
It matters more for fully GCed languages, where the entire working set
needs to be scanned, but Python frees most objects by reference
counting, and its cyclic GC only scans container objects to break
reference cycles; it doesn't need to scan the entire memory space.

There are some corner cases where it matters, and that's why it was
addressed for 2.5, but in general it's not something that you need to
worry about.
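To illustrate that division of labor: reference counting frees acyclic objects the moment their count drops to zero, and the cyclic collector is only needed when objects keep each other alive in a cycle. A small sketch:

```python
import gc

class Node:
    pass

# Build a reference cycle: a -> b -> a. Reference counting alone can
# never free these, because each keeps the other's count above zero.
a, b = Node(), Node()
a.partner = b
b.partner = a

del a, b              # the cycle is now unreachable, but still alive
found = gc.collect()  # the cyclic collector frees it

# At least the two Node objects (plus their __dict__s) were collected.
print(found >= 2)  # True
```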
May 31 '07 #8
* (31 May 2007 06:15:18 -0700)
On 31 May, 14:16, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
In <1180611604.247696.149...@h2g2000hsg.googlegroups.com>, frederic.pica
wrote:
And I'm not sure that the system will swap this unused memory first;
it could also swap another application first... AFAIK.
Definitely not, this is the principal function of virtual memory in
every Operating System.
May 31 '07 #9
* Chris Mellon (Thu, 31 May 2007 12:10:07 -0500)
[...]

The idea that memory allocated to a process but not being used is a
"cost" is really a fallacy, at least on modern virtual memory systems.
It matters more for fully GCed languages, where the entire working set
needs to be scanned, but Python frees most objects by reference
counting, and its cyclic GC only scans container objects to break
reference cycles; it doesn't need to scan the entire memory space.

There are some corner cases where it matters, and that's why it was
addressed for 2.5, but in general it's not something that you need to
worry about.
If it's swapped to disk then this is a big concern. If your Python app
allocates 600 MB of RAM and stops using 550 MB of it after one minute,
and this unused memory gets pushed into the page file, then the
operating system has to allocate and write 550 MB onto your hard disk.
Big deal.

Thorsten
May 31 '07 #10
