Memory problem

Hi,

I need to read a large amount of data into a list, so I am trying to
see whether I'll have any memory problems. When I do
x=range(2700*2700*3) I get the following message:

Traceback (most recent call last):
File "<stdin>", line 1, in ?
MemoryError

Is there any way to get around this problem? I have a machine with 4 GB of
memory. The total number of data points (floats) that I need to read is on
the order of 200-300 million.

Thanks.

Aug 14 '06 #1
Yi Xing wrote:
I need to read a large amount of data into a list. So I am trying to
see if I'll have any memory problem. When I do
x=range(2700*2700*3) I got the following message:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
MemoryError
Any way to get around this problem? I have a machine of 4G memory. The
total number of data points (float) that I need to read is in the order
of 200-300 millions.
If you know that you need floats only, then you can use a typed array
(an array.array) instead of an untyped array (a Python list):

import array
a = array.array("f")

You can also try a numerical library like scipy; it may support arrays
up to 2 GB long.

Bye,
bearophile

Aug 14 '06 #2
Yi Xing wrote:
Hi,

I need to read a large amount of data into a list. So I am trying to
see if I'll have any memory problem. When I do
x=range(2700*2700*3) I got the following message:

Traceback (most recent call last):
File "<stdin>", line 1, in ?
MemoryError

Any way to get around this problem? I have a machine of 4G memory. The
total number of data points (float) that I need to read is in the order
of 200-300 millions.
2700*2700*3 is only about 22M. Your computer shouldn't have raised a sweat,
let alone a MemoryError. Ten times that got me a MemoryError on a 1GB
machine.

A raw Python float takes up 8 bytes. On a 32-bit machine a float object
will have another 8 bytes of overhead (type, refcount). Instead of a list,
you probably need to use an array.array (which works on homogeneous
contents, so it costs 8 bytes per float, not 16), or perhaps
numeric/numpy/scipy/...
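
A rough way to see the difference is to measure it. The sketch below assumes Python 3 on CPython and uses a small stand-in size (`n` is an illustrative value, not the real data set). On a modern 64-bit build the per-object overhead is larger than the 32-bit figures quoted above, but the comparison comes out the same way.

```python
import sys
from array import array

n = 100_000  # small stand-in for the 200-300 million values in the question

# A list holds one pointer per element, and each float is a separate object.
as_list = [float(i) for i in range(n)]
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)

# An array stores the raw 8-byte doubles back to back.
as_array = array("d", as_list)
array_bytes = as_array.itemsize * len(as_array)

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```

The array figure counts only the element storage; the array object itself adds a small fixed header on top.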

HTH,
John

Aug 14 '06 #3

be************@lycos.com wrote:
If you know that you need floats only, then you can use a typed array
(an array.array) instead of an untyped array (a Python list):

import array
a = array.array("f")
Clarification: typecode 'f' stores a Python float (64 bits, equivalent
to a C double) as a 32-bit FP number (equivalent to a C float). Apart
from the obvious loss of precision, a little extra time is required to
convert to and fro. You may consider the trade-off worthwhile.
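
A small sketch of that trade-off, assuming Python 3: the same value is stored under typecode 'f' and typecode 'd', and only the 'd' copy compares equal to the original.

```python
from array import array

value = 0.1
single = array("f", [value])  # stored as a 4-byte C float
double = array("d", [value])  # stored as an 8-byte C double

print(single.itemsize, double.itemsize)
print(single[0] == value)  # False: rounded to single precision on the way in
print(double[0] == value)  # True: round-trips exactly
```

So 'f' halves the memory per element, at the cost of keeping roughly 7 significant decimal digits instead of roughly 16.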

Cheers,
John

Aug 14 '06 #4
Yi Xing wrote:
Hi,

I need to read a large amount of data into a list. So I am trying to see
if I'll have any memory problem. When I do
x=range(2700*2700*3) I got the following message:

Traceback (most recent call last):
File "<stdin>", line 1, in ?
MemoryError

Any way to get around this problem? I have a machine of 4G memory. The
total number of data points (float) that I need to read is in the order
of 200-300 millions.

Thanks.
On my 1 GB machine this worked just fine, no memory error.

-Larry Bates
Aug 14 '06 #5
On a related question: how do I initialize a list or an array with a
pre-specified number of elements, something like
int p[100] in C? I can call append() 100 times, but this looks silly...

Thanks.

Yi Xing

Aug 14 '06 #6

Yi Xing wrote:
On a related question: how do I initialize a list or an array with a
pre-specified number of elements, something like
int p[100] in C? I can do append() for 100 times but this looks silly...

Thanks.

Yi Xing
You seldom need to do that in python, but it's easy enough:

new_list = [0 for notused in xrange(100)]

or if you already have a list:

my_list.extend(0 for notused in xrange(100))

HTH,
~Simon

Aug 14 '06 #7

Yi Xing wrote:
On a related question: how do I initialize a list or an array with a
pre-specified number of elements, something like
int p[100] in C? I can do append() for 100 times but this looks silly...

Thanks.

Yi Xing
Use [0]*100 for a list.
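
The same repetition trick also works for a typed array. This is a minimal sketch assuming Python 3; the names `zeros_list` and `zeros_array` are only illustrative.

```python
from array import array

zeros_list = [0] * 100                 # list of 100 ints, like int p[100]
zeros_array = array("d", [0.0]) * 100  # typed array of 100 doubles

# Both are now indexable and assignable without any append() calls.
zeros_list[5] = 7
zeros_array[5] = 2.5

print(len(zeros_list), len(zeros_array))
```

Repeating a list of immutable values such as 0 is safe; repeating a list that contains other lists copies references only, not the inner lists.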

THN

Aug 14 '06 #8
Yi Xing wrote:
On a related question: how do I initialize a list or an array with a
pre-specified number of elements, something like
int p[100] in C? I can do append() for 100 times but this looks silly...

Thanks.

Yi Xing
Unlike other languages this is seldom done in Python. I think you should
probably be looking at http://numeric.scipy.org/ if you want to have
"traditional" arrays of floats.

-Larry
Aug 14 '06 #9
Thanks! I just found that I have no problem with
x=[[10.0]*2560*2560]*500, but x=range(1*2560*2560*30) doesn't work.

-Yi
Aug 14 '06 #10

Yi Xing wrote:
On a related question: how do I initialize a list or an array with a
pre-specified number of elements, something like
int p[100] in C? I can do append() for 100 times but this looks silly...

Thanks.

Yi Xing
In the case of an array, you may wish to consider the fromfile()
method.
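
A minimal round-trip sketch of fromfile(), assuming Python 3 and assuming the data is raw binary doubles in the machine's native byte order (a text file would need parsing instead). The scratch file and sample values here are made up for illustration.

```python
import os
import tempfile
from array import array

# Write some sample doubles to a scratch file (a stand-in for the real data file).
original = array("d", [1.5, 2.5, 3.5, 4.5])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    original.tofile(tmp)
    path = tmp.name

# Read them back: fromfile(f, n) appends n items read as machine values.
loaded = array("d")
with open(path, "rb") as f:
    count = os.path.getsize(path) // loaded.itemsize
    loaded.fromfile(f, count)

os.remove(path)
print(loaded)
```

Because fromfile() fills the array directly from the file, no intermediate list of Python float objects is ever built.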

Cheers,
John

Aug 14 '06 #11
Yi Xing wrote:
Thanks! I just found that I have no problem with
x=[[10.0]*2560*2560]*500, but x=range(1*2560*2560*30) doesn't work.
That's no surprise. In the first case, try

x[0][0] = 20.0
print x[1][0]

You have the very same (identical) list of 2560*2560 values in x
500 times.

To create such a structure correctly, do

x = [None] * 500
for i in range(500):
    x[i] = [10.0]*2560*2560
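
The sharing is easy to see at a small scale. This sketch assumes Python 3 and uses tiny, made-up dimensions (`rows`, `cols`) in place of 500 and 2560*2560.

```python
rows, cols = 3, 4  # tiny sizes so the effect is easy to see

shared = [[10.0] * cols] * rows  # every row is the same list object
shared[0][0] = 20.0
print(shared[1][0])              # 20.0, the change shows up in every row

independent = [[10.0] * cols for _ in range(rows)]  # a fresh list per row
independent[0][0] = 20.0
print(independent[1][0])         # 10.0
```

Only the second form actually allocates rows * cols slots, which is why it is the one that shows the real memory cost.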

In any case, check ulimit(1).

Regards,
Martin
Aug 14 '06 #12
Yi Xing wrote:
Thanks! I just found that I have no problem with
x=[[10.0]*2560*2560]*500, but x=range(1*2560*2560*30) doesn't work.
range(1*2560*2560*30) is creating a list of 196M *unique* ints.
Assuming 32-bit ints and pointers: that's 4 bytes each for the value, 4
for the type pointer, 4 for the refcount and 4 for the actual list
element (a pointer to the 12-byte object). So that's one chunk of
4x196M = 784MB of contiguous list, plus 196M chunks each of whatever size
gets allocated for a request of 12 bytes. Let's guess at 16. So the
total memory you need is 3920M.

Now let's look at [[10.0]*2560*2560]*500.
Firstly that creates a tiny list [10.0]. Then you create a list that
contains 2560*2560 = 6.5M references to that *one* object containing
10.0. That's 26MB. Then you make a list of 500 references to that big
list. This new list costs you 2000 bytes. Total required: about 26.2MB.
The minute you start having non-unique numbers instead of 10.0, this
all falls apart.
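
The arithmetic above can be reproduced in a few lines. This sketch takes the post's assumptions as given (4-byte pointers on a 32-bit build and a guessed 16-byte allocation per int object); the variable names are illustrative. It uses the exact element count rather than the rounded 196M, so the first total comes out slightly above 3920M.

```python
n = 2560 * 2560 * 30  # elements in range(1*2560*2560*30)
pointer = 4           # bytes per list slot on a 32-bit build
int_alloc = 16        # guessed allocation for each 12-byte int object

range_total = n * pointer + n * int_alloc

inner = 2560 * 2560 * pointer  # 6.5M references to one float
outer = 500 * pointer          # 500 references to the inner list
shared_total = inner + outer

print(n, range_total / 1e6, shared_total / 1e6)
```

That is roughly 3.9 GB against roughly 26 MB, a factor of about 150 between the two expressions.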

In any case, your above comparison has nothing at all to do with the
solution that you need, which as already explained will involve
array.array or numpy.

What you now need to do is answer the questions about your pagefile
etc.

Cheers,
John

Aug 14 '06 #13
I used the array module and loaded all the data into an array.
Everything works fine now.
Aug 15 '06 #14
