Bytes | Software Development & Data Engineering Community
Re: Memory error while saving dictionary of size 65000X50 using pickle

I didn't have the problem with dumping as a string. When I tried to
save this object to a file, a memory error popped up.

I am sorry for the confusion about the dictionary's size. What I meant
by 65000X50 is that it has 65000 keys and each key has a list of 50
tuples.

I was able to save a dictionary object with 65000 keys and a list of
15-tuples as each value, but I could not do the same with a list of
25-tuples per key.

Your example works just fine on my side.

Thank you,
Nagu
Jul 7 '08 #1
> I didn't have the problem with dumping as a string. When I tried to
> save this object to a file, a memory error popped up.

That's not what the traceback says. The traceback says that the error
occurs inside pickle.dumps() (and it is consistent with the functions
being called, so it's plausible).

> I am sorry for the confusion about the dictionary's size. What I meant
> by 65000X50 is that it has 65000 keys and each key has a list of 50
> tuples.
> [...]
> Your example works just fine on my side.
I can get the program

import pickle

d = {}
for i in xrange(65000):
    d[i] = [(x,) for x in range(50)]

print "Starting dump"
s = pickle.dumps(d)

to complete successfully as well; however, it consumes a lot
of memory. I can reduce memory usage slightly by
a) dumping directly to a file, and
b) using cPickle instead of pickle,
i.e.

import cPickle as pickle

d = {}
for i in xrange(65000):
    d[i] = [(x,) for x in range(50)]

print "Starting dump"
pickle.dump(d, open("/tmp/t.pickle", "wb"))
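A side note for readers on current Python: in Python 3, cPickle was folded into pickle as a built-in C accelerator, and dumping straight to a file object still avoids building the whole byte string in memory. A scaled-down sketch of the same idea; the temporary path and the 1000-key size are illustrative, not from the thread:

```python
import os
import pickle  # in Python 3, the old cPickle is the built-in C accelerator
import tempfile

# Scaled-down stand-in for the 65000x50 dictionary in the thread.
d = {i: [(x,) for x in range(50)] for i in range(1000)}

# pickle.dump() streams directly into the file object, so the complete
# pickle byte string never has to exist in memory at once (unlike dumps()).
path = os.path.join(tempfile.mkdtemp(), "t.pickle")
with open(path, "wb") as f:
    pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    restored = pickle.load(f)
```

Using a binary protocol also produces a much smaller file than the default ASCII protocol 0 did in Python 2.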

Most of the memory consumed comes from the memo table that the pickler
maintains to detect shared references. If you are certain that no
object sharing occurs in your graph, you can do
import cPickle as pickle

d = {}
for i in xrange(65000):
    d[i] = [(x,) for x in range(50)]

print "Starting dump"
p = pickle.Pickler(open("/tmp/t.pickle", "wb"))
p.fast = True
p.dump(d)

With that, I see no additional memory usage, and pickling completes
really fast.
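For readers on Python 3: the same fast-mode trick still works, since pickle.Pickler keeps a (deprecated but functional) `fast` attribute that disables the shared-reference memo. A scaled-down sketch using an in-memory buffer rather than /tmp; the 1000-key size is illustrative:

```python
import io
import pickle

# Scaled-down stand-in for the dictionary in the thread.
d = {i: [(x,) for x in range(50)] for i in range(1000)}

buf = io.BytesIO()
p = pickle.Pickler(buf, protocol=pickle.HIGHEST_PROTOCOL)
p.fast = True  # skip the memo; only safe for data with no shared refs or cycles
p.dump(d)

data = buf.getvalue()
restored = pickle.loads(data)
```

Note that fast mode recurses forever on cyclic data, because cycle detection relies on the same memo it disables.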

Regards,
Martin
Jul 7 '08 #2
