
Memory limit to dict?

I was wondering whether certain data structures in Python, e.g. dict,
might have limits as to the amount of memory they're allowed to take up.
Is there any documentation on that?

Why am I asking? I'm reading 3.6 GB worth of BLAST output files into a
nested dictionary structure (dict within dict ...). Looks something like
this:

{ GenomeID:
{ ProteinID:
{ GenomeID:
{ ProteinID, Score, PercentValue, EValue } } } }
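
In actual code the loading looks roughly like this (the names are made
up for illustration; the real field names come out of my BLAST parser):

hits = {}

def add_hit(query_genome, query_protein, subject_genome,
            subject_protein, score, percent, evalue):
    hits.setdefault(query_genome, {}) \
        .setdefault(query_protein, {}) \
        .setdefault(subject_genome, {})[subject_protein] = \
            (score, percent, evalue)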

Now, the thing is: Even on a machine with 16 GB RAM, the program
terminates with a MemoryError, obviously long before the machine's RAM
is used up.

I've no idea how far the Windows task manager's resource monitor can be
trusted -- probably not as far as I could throw a heavy-set St Bernard
-- but the crash seems to happen roughly when that monitor records a
swap file size of 2.2 GB.

Barring any revamping of the code itself, which I will have to do
anyway, is there anything so far that would indicate a problem inherent
to Python?

(I can post the relevant code too, of course, if that would help.)

TIA!

--
Peter
Apr 11 '06 #1
Peter Beattie <Pe***********@web.de> writes:

<snip>
I've no idea how far the Windows task manager's resource monitor can be
trusted -- probably not as far as I could throw a heavy-set St Bernard
-- but the crash seems to happen roughly when that monitor records a
swap file size of 2.2 GB.


I'm not a Windows expert at all, but I would assume that on 32-bit
Windows each process gets a 4 GB virtual address space, of which only
about 2 GB is usable by the process itself; the rest is reserved for
kernel mappings. (32-bit Linux splits the space similarly, though its
default split leaves 3 GB to the process.) So it looks like you might
just be hitting the maximum address space allowed for a single process,
not a limit in Python.
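
You can check what you're dealing with from inside Python itself; this
prints 32 on a 32-bit build and 64 on a 64-bit one:

import struct
print(struct.calcsize('P') * 8)   # size of a C pointer, in bits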

You should partition your data into hierarchical chunks on disk and let
Python swap them in and out for you -- see the sketch below. Although
you have 16 gigs (I have to put a holy crap after that!), you will
always run into per-process limits, at least until true 64-bit OSes are
in vogue.
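
A minimal sketch of the idea, using the standard shelve module (the
filename and key scheme are made up; adapt them to your parser):

import shelve

# One on-disk shelf keyed by GenomeID; only the genomes you actually
# touch are held in memory at any one time.
db = shelve.open('blast_hits.db')

def add_hit(query_genome, inner_key, record):
    per_genome = db.get(query_genome, {})  # load this genome's sub-dict
    per_genome[inner_key] = record         # record = (score, percent, evalue)
    db[query_genome] = per_genome          # write it back to disk

db.close()

In practice you would batch all the updates for one genome and write
its sub-dict back once, rather than on every hit.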

--
burton samograd kruhft .at. gmail
kruhft.blogspot.com www.myspace.com/kruhft metashell.blogspot.com
Apr 11 '06 #2
On Tue, 2006-04-11 at 19:45 +0200, Peter Beattie wrote:
I was wondering whether certain data structures in Python, e.g. dict,
might have limits as to the amount of memory they're allowed to take up.
Is there any documentation on that?

Why am I asking? I'm reading 3.6 GB worth of BLAST output files into a
nested dictionary structure (dict within dict ...). Looks something like
this:

{ GenomeID:
{ ProteinID:
{ GenomeID:
{ ProteinID, Score, PercentValue, EValue } } } }


I don't have the answer to your question, so let me ask a new one
instead: isn't the overhead (performance and memory) of creating dicts
too large for them to be used at this scale? Each dict carries its own
hash table, which is a lot of baggage for the handful of values in your
innermost level.

I'm just speculating, but I *think* that using lists, tuples or small
objects for the innermost records may be better -- see the sketch below.
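
For instance (names are purely illustrative; __slots__ avoids the
per-instance dict that ordinary class instances carry):

class Hit(object):
    __slots__ = ('protein_id', 'score', 'percent', 'evalue')

    def __init__(self, protein_id, score, percent, evalue):
        self.protein_id = protein_id
        self.score = score
        self.percent = percent
        self.evalue = evalue

hit = Hit('P12345', 532.0, 87.3, 1e-42)   # made-up values

# ...or a plain tuple, which is smaller still:
hit = ('P12345', 532.0, 87.3, 1e-42)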

My 2 cents,

--
Felipe.

Apr 11 '06 #3
An alternative is to use ZODB. For example, you could use the BTree
class for the outermost layers of the nested dict and a regular dict
for the innermost layer. If broken up properly, you can store an
apparently unlimited amount of data with reasonable performance.

Just remember not to iterate over the entire collection of objects
without aborting the transaction regularly; otherwise ZODB's object
cache will hold on to everything you have touched.
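
A rough sketch of that layout (the filename and record layout are made
up, and untested against real data):

import transaction
from ZODB import FileStorage, DB
from BTrees.OOBTree import OOBTree

storage = FileStorage.FileStorage('blast.fs')
db = DB(storage)
conn = db.open()
root = conn.root()

if 'hits' not in root:
    root['hits'] = OOBTree()   # outermost layer: a disk-backed BTree
hits = root['hits']

def store_hit(genome_id, protein_id, record):
    per_genome = hits.get(genome_id)
    if per_genome is None:
        per_genome = hits[genome_id] = OOBTree()
    per_genome[protein_id] = record   # innermost layer: a plain dict

# ... call store_hit() while parsing, and every few thousand hits:
transaction.commit()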

Apr 11 '06 #4
