Memory problem with Python



I am developing a distributed environment at my college using Python. I
am using threads in the client to download web pages. Even though I
reuse the threads, memory usage keeps increasing, and I don't know why.
I am using Berkeley DB for the URL queue and BeautifulSoup for parsing
the web pages.

Any ideas for reducing the memory usage? Please tell me.

I want my program to run in bounded memory.

Jun 18 '07 #1
On Jun 17, 8:51 pm, Squzer Crawler <Squ...@gmail.com> wrote:
> I am developing a distributed environment at my college using Python. I
> am using threads in the client to download web pages. Even though I
> reuse the threads, memory usage keeps increasing, and I don't know why.
> I am using Berkeley DB for the URL queue and BeautifulSoup for parsing
> the web pages.

Isn't the increased memory the result of storing the already
processed pages?

Look first at all places where your code instantiates new
objects - and make sure you don't keep references to such objects that
are not needed anymore.

Also, reusing threads has nothing to do with saving memory - but
with saving on thread creation time, if I understand your problem
description.

Jun 18 '07 #2
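
To illustrate the point about lingering references, here is a minimal sketch. The `Page` class, the `summarize` helper, and the `processed` and `results` containers are hypothetical names invented for this example; they are not taken from the original poster's crawler. It uses `weakref.ref` only as a probe to observe when the page object is actually freed.

```python
import weakref


class Page:
    """Hypothetical stand-in for a downloaded, parsed web page."""

    def __init__(self, url, html):
        self.url = url
        self.html = html


def summarize(page):
    # Placeholder for real parsing work; returns only a small result.
    return len(page.html)


processed = []   # leaky pattern: every page object stays reachable
results = {}     # bounded pattern: keep only the small result per URL

page = Page("http://example.com/", "<html>" + "x" * 100_000 + "</html>")
probe = weakref.ref(page)   # weak reference: does not keep the page alive

processed.append(page)      # a forgotten reference like this pins the page
results[page.url] = summarize(page)

del page
print(probe() is not None)  # True while the list still holds the page

processed.clear()           # drop the last strong reference
print(probe() is None)      # True in CPython: freed once the refcount hits zero
```

If a crawler keeps something like `processed` around (a list, a dict cache, or an attribute on a long-lived thread object), memory grows with every page no matter how the threads are reused.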
On Jun 18, 11:06 am, "sor...@gmail.com" <sor...@gmail.com> wrote:
> On Jun 17, 8:51 pm, Squzer Crawler <Squ...@gmail.com> wrote:
>> I am developing a distributed environment at my college using Python. I
>> am using threads in the client to download web pages. Even though I
>> reuse the threads, memory usage keeps increasing, and I don't know why.
>> I am using Berkeley DB for the URL queue and BeautifulSoup for parsing
>> the web pages.
>
> Isn't the increased memory the result of storing the already
> processed pages?
>
> Look first at all places where your code instantiates new
> objects - and make sure you don't keep references to such objects that
> are not needed anymore.
>
> Also, reusing threads has nothing to do with saving memory - but
> with saving on thread creation time, if I understand your problem
> description.

What about cyclic references? Can I use the GC in my program?

If so, please tell me how to implement it. I am calling gc.collect()
at the end of the fetching. Will it reduce my program's speed? If it
does, how else could I call it?

Please tell me.

Jun 18 '07 #3
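
As a rough illustration of the question above, the sketch below builds a reference cycle and then calls `gc.collect()` explicitly. The `Node` class and `make_cycle` function are hypothetical and exist only for the demonstration. The automatic collector is switched off here purely to make the demo deterministic; a real program would normally leave it enabled.

```python
import gc


class Node:
    """Hypothetical object used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    a, b = Node(), Node()
    # a and b refer to each other, so their reference counts never
    # drop to zero on their own once the function returns.
    a.partner, b.partner = b, a


gc.disable()                 # demo only: stop automatic collections
gc.collect()                 # start from a clean state
make_cycle()
unreachable = gc.collect()   # returns the number of unreachable objects found
gc.enable()

print(unreachable >= 2)      # True: at least the two Node objects were found
print(gc.garbage)            # objects the collector found but could not free
```

A full `gc.collect()` has to examine every container object the collector tracks, so its cost grows with the size of the program's live data. Calling it once per batch of pages rather than once per page is likely to be cheaper, but that is an assumption worth measuring in the actual crawler.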
Squzer Crawler wrote:
> On Jun 18, 11:06 am, "sor...@gmail.com" <sor...@gmail.com> wrote:
>> On Jun 17, 8:51 pm, Squzer Crawler <Squ...@gmail.com> wrote:
>>> I am developing a distributed environment at my college using Python. I
>>> am using threads in the client to download web pages. Even though I
>>> reuse the threads, memory usage keeps increasing, and I don't know why.
>>> I am using Berkeley DB for the URL queue and BeautifulSoup for parsing
>>> the web pages.
>>
>> Isn't the increased memory the result of storing the already
>> processed pages?
>>
>> Look first at all places where your code instantiates new
>> objects - and make sure you don't keep references to such objects that
>> are not needed anymore.
>>
>> Also, reusing threads has nothing to do with saving memory - but
>> with saving on thread creation time, if I understand your problem
>> description.
>
> What about cyclic references? Can I use the GC in my program?
>
> If so, please tell me how to implement it. I am calling gc.collect()
> at the end of the fetching. Will it reduce my program's speed? If it
> does, how else could I call it?

Garbage collection should happen automatically as long as you are
deleting references to objects you no longer need. If gc.garbage isn't
empty, then you have unbreakable reference cycles. It seems more
likely, as soring@gmail says, that you are keeping copies of the things
you already parsed in memory.

What you can do (if you aren't able to find the bug) is have a wrapper
program that repeatedly starts up your url fetcher via os.system().
Then have your url fetcher close itself down every few hours.

- Josiah
Jun 18 '07 #4
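
The wrapper idea from the last reply could be sketched roughly as follows. This version uses `subprocess.run` instead of `os.system()`, which is a substitution made for this example and not what the reply proposed. The `supervise` function, its parameters, and the `fetcher.py` script name in the comment are all hypothetical.

```python
import subprocess
import sys
import time


def supervise(cmd, max_runs=None, pause_seconds=1.0):
    """Run cmd repeatedly, starting a fresh process each time it exits.

    Returns the number of completed runs. max_runs=None loops forever.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        # Each run is a brand-new process, so whatever memory the
        # previous run accumulated goes back to the operating system.
        completed = subprocess.run(cmd)
        runs += 1
        print("run %d exited with code %d" % (runs, completed.returncode))
        time.sleep(pause_seconds)   # avoid a tight loop if the child fails at once
    return runs


# Hypothetical usage, assuming the crawler lives in fetcher.py and
# exits by itself after a few hours of work:
# supervise([sys.executable, "fetcher.py"])
```

This only bounds the symptom: the URL queue has to live on disk (as it does with Berkeley DB in the original post) so that a restarted fetcher can pick up where the previous one stopped.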
