hard memory limits

Hi,

I think I've hit a system limit in python when I try to construct a list
of 200,000 elements. My error is

malloc: vm_allocate (size = 2400256) failed......

Just wondering, is this specific to my system or what? Will adding more
RAM help in this case?

Thanks and cheers
Maurice
Jul 19 '05 #1
Maurice LING wrote:
I think I've hit a system limit in python when I try to construct a list
of 200,000 elements.
there's no such limit in Python.
My error is

malloc: vm_allocate (size = 2400256) failed......

Just wondering, is this specific to my system or what?
that doesn't look like a Python error (Python usually raises
MemoryError exceptions when it runs out of memory), and
there's no sign of any vm_allocate function in the Python
sources, so yes, it's a system-specific problem.
Will adding more RAM help in this case?


probably. more swap space might also help. or you could use a
smarter malloc package. posting more details on your platform,
toolchain, python version, and list building approach might also
help.

(are you perhaps building multiple lists piece by piece, interleaved
with other object allocations? if so, it's probably a fragmentation
problem. to check this, watch the process size. if it grows at a
regular rate, and then explodes just before you get the above error,
you may need to reconsider the design a bit).
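
(a rough sketch of what "watch the process size" can look like from inside
the script, assuming the standard resource module is available on your
platform; note that the units of ru_maxrss differ between systems, e.g.
bytes on Mac OS X but kilobytes on Linux:)

    import resource

    def report(label):
        # peak resident set size so far; units are platform-dependent
        usage = resource.getrusage(resource.RUSAGE_SELF)
        print label, "peak RSS:", usage.ru_maxrss

    items = []
    for i in range(200000):
        items.append(float(i))
        if i % 50000 == 0:
            report("after %d appends" % i)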

</F>

Jul 19 '05 #2
"Fredrik Lundh" <fr*****@pythonware.com> writes:
Maurice LING wrote:
Will adding more RAM help in this case?


probably. more swap space might also help. or you could use a
smarter malloc package. posting more details on your platform,
toolchain, python version, and list building approach might also
help.


Without platform information, it's hard to say. On a modern Unix
system, you only run into system resource limits when the system is
heavily loaded. Otherwise, you're going to hit per-process limits. In
the latter case, adding RAM or swap won't help at all. Raising the
per-process limits is the solution.
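
(A minimal sketch of inspecting and raising the per-process limit from
inside Python, assuming the standard resource module; on some systems the
relevant limit is RLIMIT_AS rather than RLIMIT_DATA, and only root can
raise the hard limit:)

    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    print "data segment limit: soft=%s hard=%s" % (soft, hard)

    # raise the soft limit as far as the hard limit allows
    resource.setrlimit(resource.RLIMIT_DATA, (hard, hard))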

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 19 '05 #3
On 5/6/05, Mike Meyer <mw*@mired.org> wrote:
"Fredrik Lundh" <fr*****@pythonware.com> writes:
Maurice LING wrote:
Will adding more RAM help in this case?
probably. more swap space might also help. or you could use a
smarter malloc package. posting more details on your platform,
toolchain, python version, and list building approach might also
help.


Without platform information, it's hard to say. On a modern Unix
system, you only run into system resource limits when the system is
heavily loaded. Otherwise, you're going to hit per-process limits. In
the latter case, adding RAM or swap won't help at all. Raising the
per-process limits is the solution.


A quick google shows it to be Mac OS X, and a pretty frequent error message.

http://www.google.com/search?hl=en&q...=Google+Search

Peace
Bill Mill
bill.mill at gmail.com
<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Jul 19 '05 #4
Mike Meyer wrote:
Without platform information, it's hard to say. On a modern Unix
system, you only run into system resource limits when the system is
heavily loaded. Otherwise, you're going to hit per-process limits. In
the latter case, adding RAM or swap won't help at all. Raising the
per-process limits is the solution.


does Mac OS X ship with memory limits set by default? isn't that
a single-user system?

</F>

Jul 19 '05 #5
On Friday 06 May 2005 10:29 am, Fredrik Lundh wrote:
Mike Meyer wrote:
Without platform information, it's hard to say. On a modern Unix
system, you only run into system resource limits when the system is
heavily loaded. Otherwise, you're going to hit per-process limits. In
the latter case, adding RAM or swap won't help at all. Raising the
per-process limits is the solution.


does Mac OS X ship with memory limits set by default? isn't that
a single-user system?

</F>


Dear original poster or whoever is interested in OS X:

OS X is not a single-user system. It is BSD-based Unix. And it's fv@king
sweeeeeeeeeeeeeet! (Though I'm using only Linux right now :o/ )

If configurable memory limits are a problem and if running python from the
shell, do:

% unlimit

You can also change this in your .cshrc, .tcshrc, .bashrc, .k[whatever for
korn], etc. if you run a custom shell.

Your shell settings for each user are in NetInfo Manager.

If you are completely clueless as to what the hell I'm talking about, then
stop and try this:

1. start a Terminal
2. type this at the prompt:

% echo unlimit >> .bashrc

3. Quit that terminal.
4. Start a new terminal.
5. Start python and make your list.

Hope it works.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
Jul 19 '05 #6
James Stroud wrote:
does Mac OS X ship with memory limits set by default? isn't that
a single-user system?


Dear original poster or whoever is interested in OS X:

OS X is not a single user system. It is BSD based unix. And its fv@king
sweeeeeeeeeeeeeet! (Though I'm using only Linux right now :o/


Well, Apple's marketing materials contain no signs whatsoever that the
systems Apple sells are designed for massive numbers of users, compared
to systems from RedHat, Sun, HP, etc. (if you look at apple.com in this
very moment, it talks a lot about "your mac" and "your desktop" and "your
computer", not "the mac/desktop/computer you share with hundreds of
other users").

So why would Apple insist on setting unusably low process limits, when
the others don't?

</F>

Jul 19 '05 #7
On Friday 06 May 2005 11:27 am, Fredrik Lundh wrote:
James Stroud wrote:
does Mac OS X ship with memory limits set by default? isn't that
a single-user system?


Dear original poster or whoever is interested in OS X:

OS X is not a single user system. It is BSD based unix. And its fv@king
sweeeeeeeeeeeeeet! (Though I'm using only Linux right now :o/


Well, Apple's marketing materials contain no signs whatsoever that the
systems Apple sells are designed for massive numbers of users, compared
to systems from RedHat, Sun, HP, etc. (if you look at apple.com in this
very moment, it talks a lot about "your mac" and "your desktop" and "your
computer", not "the mac/desktop/computer you share with hundreds of
other users").

So why would Apple insist on setting unusably low process limits, when
the others don't?

</F>


I think that two different markets exist for this computer:

1. Joe user who has never seen a command line interface. These people need a
nice, cozy little user environment that takes as little understanding as
possible to use. They also buy the most computers and are probably most
responsive to fluffy advertising campaigns. Hence the targeted advertising on
apple.com. In this case, my guess is that memory allocation, etc., is left to
the application. For Cocoa apps it is the Objective-C runtime handling this
kind of thing, and for Carbon apps it is probably tacked on during the
process of carbonizing. But I should say that I really don't know much about
the low-level workings of either.

2. Scientists/Engineers/Programmer types. These people configure their own
limits instinctively and probably forgot that they ever put that unlimit in
their rc files (like I did) and the dozens of other customizations they did
to get their OS X boxes just so for unix use. When Joe User makes the
crossover, such customizations don't seem very intuitive. Plus, I remember
having to ulimit my IRIX account on SGIs I used back in the day--so other
*nixes seem to have similar requirements.

To answer your question, my guess is that no one has complained yet.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
Jul 19 '05 #8
"Fredrik Lundh" <fr*****@pythonware.com> writes:
James Stroud wrote:
> does Mac OS X ship with memory limits set by default? isn't that
> a single-user system?


Dear original poster or whoever is interested in OS X:

OS X is not a single user system. It is BSD based unix. And its fv@king
sweeeeeeeeeeeeeet! (Though I'm using only Linux right now :o/


So why would Apple insist on setting unusably low process limits, when
the others don't?


You're making an unwarranted assumption here - that the OP wasn't
creating a large process of some kind. IIRC, all we ever saw was the
size of the request that triggered the error, with no indication of
the total process size.

FWIW, OS X has a Mach kernel. The failing vm_allocate call listed in the
OP is a Mach call, not a Unix call. These days, the userland code is
largely FreeBSD. It used to include "the best of" OpenBSD, NetBSD and
FreeBSD at a relatively small level, but that headache was dropped in
favor of tracking one external system. The legacy of the mixed
heritage is utilities from NetBSD and OpenBSD that aren't in
FreeBSD. shlock comes to mind as an obvious example.

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 19 '05 #9
On Fri, 06 May 2005 18:24:21 +1000, Maurice LING <ma*********@acm.org>
wrote:
Hi,

I think I've hit a system limit in python when I try to construct a list
of 200,000 elements. My error is

malloc: vm_allocate (size = 2400256) failed......

Just wondering, is this specific to my system or what? Will adding more
RAM help in this case?


Not if it's an OS limit (see other posts). Not if you are doing
something so weird or drastic that you will use up the extra RAM and
still get the same message.

If you were to reply to Fredrik's question (HOW are you creating your
list), and this one: WHAT is an "element", we might be able to help
you avoid a trip to the Apple dealer.

As a bit of a reality check for you:
[numbers based on a 32-bit machine, CPython]

An extra list of 200000 ints will take up 800000 bytes (plus small
change) if the ints are all in range(-1, 101) and thus cached --
that's a 4-byte pointer (i.e. PyObject *) each.

If all the ints are outside that range, and are distinct, they'll take
(worst case) 16 bytes each (assuming you aren't using a debug build of
Python). The additional 12 bytes are for the int object: a reference
counter, a pointer to the type object, and the actual value. So you're
looking at approx 3.2MB.

In general, reckon on each element taking up 12 + sizeof(element_value) bytes.

I'd suspect that your "elements" are not quite as elementary as ints,
and/or you are doing a whole lot of *other* memory allocation. Confess
all ...
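
(If you want to check arithmetic like the above yourself, sys.getsizeof
reports per-object sizes - but note it only appeared in Python 2.6, so it
won't be there on the 2.3 install discussed in this thread:)

    import sys

    xs = range(200000)
    # the list object itself: header plus the array of PyObject* pointers
    print "list object:", sys.getsizeof(xs), "bytes"
    # one int object: reference count + type pointer + value
    print "one int:", sys.getsizeof(xs[0]), "bytes"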

Cheers,
John


Jul 19 '05 #10
Hi everyone,

thanks for your help.

Yes, I'm using Mac OS X 10.3 with 256 MB RAM. Each element in the list is a
float. The list is actually the retrieved results of document IDs from a
SOAP interface. And Mac OS X does not have an 'unlimit' command, as shown:

Maurice-Lings-Computer:~ mauriceling$ unlimit
-bash: unlimit: command not found
Maurice-Lings-Computer:~ mauriceling$ which unlimit
Maurice-Lings-Computer:~ mauriceling$ sh unlimit
unlimit: unlimit: No such file or directory
Maurice-Lings-Computer:~ mauriceling$
Cheers
Maurice
Jul 19 '05 #11
Maurice LING <ma*********@acm.org> writes:
Hi everyone,

thanks for your help.

Yes, I'm using Mac OS X 10.3 with 256 MB RAM. Each element in the list is
a float. The list is actually the retrieved results of document IDs from a
SOAP interface. And Mac OS X does not have an 'unlimit' command, as shown:

Maurice-Lings-Computer:~ mauriceling$ unlimit
-bash: unlimit: command not found
Maurice-Lings-Computer:~ mauriceling$ which unlimit
Maurice-Lings-Computer:~ mauriceling$ sh unlimit
unlimit: unlimit: No such file or directory
Maurice-Lings-Computer:~ mauriceling$


This is a shell builtin command, not an OS X command. For bash, the
command is "ulimit", not "unlimit". You'll need to read the bash man
page for exact details on how to raise your process's memory limits.

Note that OS X may have a hard limit that you can't exceed except
as root. This can be raised, but it's a global system configuration,
and you'll have to get someone who knows more about OS X than I do to
tell you how to do that.

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 19 '05 #12
Mike Meyer wrote:
So why would Apple insist on setting unusably low process limits, when
the others don't?
You're making an unwarranted assumption here - that the OP wasn't
creating a large process of some kind.


You need a special license to create large processes on a Mac?

I clicked on the google link that Bill posted, and noted that it wasn't
exactly something that only affected a single Python user. If something
causes problems for many different applications that run fine on other
Unix systems, it's pretty obvious that the default OS X configuration
isn't quite as Unixy as one would expect.
FWIW, OS X has a Mach kernel. The failing vm_allocate call listed in the
OP is a Mach call, not a Unix call.


So has tru64. I've done some serious Python stuff on that platform
(stuff that included some really large processes ;-), and I never had
any allocation problems. But of course, that system was designed
by DEC people...

</F>

Jul 19 '05 #13
Sorry Maurice, apparently in bash it's "ulimit" (no n). I don't use bash, so I
don't know all of the differences offhand. Try that.

James

On Friday 06 May 2005 03:02 pm, Maurice LING wrote:
Hi everyone,

thanks for your help.

Yes, I'm using Mac OS X 10.3 with 256 MB RAM. Each element in the list is a
float. The list is actually the retrieved results of document IDs from a
SOAP interface. And Mac OS X does not have an 'unlimit' command, as shown:

Maurice-Lings-Computer:~ mauriceling$ unlimit
-bash: unlimit: command not found
Maurice-Lings-Computer:~ mauriceling$ which unlimit
Maurice-Lings-Computer:~ mauriceling$ sh unlimit
unlimit: unlimit: No such file or directory
Maurice-Lings-Computer:~ mauriceling$
Cheers
Maurice


--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
Jul 19 '05 #14
James Stroud wrote:
Sorry Maurice, apparently in bash its "ulimit" (no n). I don't use bash, so I
don't know all of the differences offhand. Try that.


The only shells I know of that use unlimit are csh & tcsh.. bleh.. :)
FWIW, I've had the same problem on OpenBSD; while ulimit will fix your
problem temporarily, you'll probably want to edit your default user class
in /etc/login.conf. In response to someone earlier, I think it's Linux here
that is un-Unix-like: I do not think that, characteristically, a (non-admin)
user is allowed unlimited access to RAM in most varieties of Unix, among
other things and for good reason.
Jul 19 '05 #15
James Stroud wrote:
Sorry Maurice, apparently in bash its "ulimit" (no n). I don't use bash, so I
don't know all of the differences offhand. Try that.

James


Thanks guys,

It doesn't seem to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP, but allocating a 1 million element list is fine in
Python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
>>> a = range(1, 100000, 5)
>>> b = range(0, 1000000)
>>> c = []
>>> for i in b:
...     if i not in a: c.append(i)
...

This takes forever to complete. Is there any way to optimize this?

Thanks in advance

Cheers
Maurice
Jul 19 '05 #16
"Fredrik Lundh" <fr*****@pythonware.com> writes:
Mike Meyer wrote:
> So why would Apple insist on setting unusably low process limits, when
> the others don't?
You're making an unwarranted assumption here - that the OP wasn't
creating a large process of some kind.


You need a special license to create large processes on a Mac?


No more so than on any other OS. What does that have to do with my
pointing out that the OP may have been creating a process that
exceeded the size normally allowed for non-administrator processes?
Any modern OS should have different groups of users with different
sets of possible maximum resource allocation - which only an
administrator should be allowed to change.
I clicked on the google link that Bill posted, and noted that it wasn't
exactly something that only affected a single Python user. If something
causes problems for many different applications that run fine on other
Unix systems, it's pretty obvious that the default OS X configuration
isn't quite as Unixy as one would expect.


I didn't follow that link - I formulated my own google query. While I
saw lots of things about vm_malloc, trying vm_malloc python turns up
nothing. Which seems to indicate that this is a relatively rare thing
for python users.
FWIW, OS X has a Mach kernel. The failing vm_allocate call listed in the
OP is a Mach call, not a Unix call.

So has tru64. I've done some serious Python stuff on that platform
(stuff that included some really large processes ;-), and I never had
any allocation problems. But of course, that system was designed
by DEC people...


Oddly enough, google just turned up a vm_malloc for VMS as well. I
wonder if that's where tru64 got it from - and if so how it got into
Mach?

There is something very non-unixy going on here, though. Why is
vm_malloc exiting with an error message, instead of returning a
failure to the calling application? I've seen other applications
include a FOSS malloc implementation to work around bugs in the
system's malloc. Maybe Python should do that on the Mac?

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 19 '05 #17
On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org> wrote:
James Stroud wrote:
Sorry Maurice, apparently in bash its "ulimit" (no n). I don't use bash, so I
don't know all of the differences offhand. Try that.

James


Thanks guys,

It doesn't seems to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP but allocating a 1 million element list is fine in
python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
a = range(1, 100000, 5)
b = range(0, 1000000)
c = []
for i in b:

... if i not in a: c.append(i)
...

This takes forever to complete. Is there anyway to optimize this?

Checking whether something is in a list means, on average, checking equality
against half the elements of the list. Checking for membership in a set should
be much faster for any significant-size set/list. I.e., just changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time problem?
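
(Concretely, something like the following - the built-in set type needs
Python 2.4 or later; on 2.3 you can get the same effect with
"from sets import Set as set":)

    a = set(range(1, 100000, 5))        # O(1) average-time membership tests
    b = range(0, 1000000)
    c = [i for i in b if i not in a]    # one pass over b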

Regards,
Bengt Richter
Jul 19 '05 #18
On Sat, 07 May 2005 02:29:48 GMT, bo**@oz.net (Bengt Richter) wrote:
On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org> wrote:

It doesn't seems to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP but allocating a 1 million element list is fine in
python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
>>> a = range(1, 100000, 5)
>>> b = range(0, 1000000)
>>> c = []
>>> for i in b:

... if i not in a: c.append(i)
...

This takes forever to complete. Is there anyway to optimize this?

Checking whether something is in a list may average checking equality with
each element in half the list. Checking for membership in a set should
be much faster for any significant size set/list. I.e., just changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time problem?


Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_allocate message and staggers on somehow ...

Regards,
John
Jul 19 '05 #19
John Machin wrote:
On Sat, 07 May 2005 02:29:48 GMT, bo**@oz.net (Bengt Richter) wrote:

On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org> wrote:
It doesn't seems to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP but allocating a 1 million element list is fine in
python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
>>a = range(1, 100000, 5)
>>b = range(0, 1000000)
>>c = []
>>for i in b:

... if i not in a: c.append(i)
...

This takes forever to complete. Is there anyway to optimize this?


Checking whether something is in a list may average checking equality with
each element in half the list. Checking for membership in a set should
be much faster for any significant size set/list. I.e., just changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time problem?

Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...

Regards,
John


This is the exact error message:

*** malloc: vm_allocate(size=9203712) failed (error code=3)
*** malloc[489]: error: Can't allocate region

Nothing else. No stack trace, NOTHING.

maurice
Jul 19 '05 #20
Hi,

Why don't you catch the exception and print the trace?
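
Something along these lines (fetch_ids here is just a hypothetical
stand-in for whatever call actually fails):

    import traceback

    try:
        result = fetch_ids()   # hypothetical stand-in for the failing SOAP call
    except MemoryError:
        traceback.print_exc()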

Regards,

Philippe

Maurice LING wrote:
John Machin wrote:
On Sat, 07 May 2005 02:29:48 GMT, bo**@oz.net (Bengt Richter) wrote:

On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org>
wrote:

It doesn't seems to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP but allocating a 1 million element list is fine in
python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
>>>a = range(1, 100000, 5)
>>>b = range(0, 1000000)
>>>c = []
>>>for i in b:

... if i not in a: c.append(i)
...

This takes forever to complete. Is there anyway to optimize this?
Checking whether something is in a list may average checking equality
with each element in half the list. Checking for membership in a set
should be much faster for any significant size set/list. I.e., just
changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this
file-to-file, keeping a in memory? Perhaps page-file thrashing is part of
the time problem?

Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...

Regards,
John


This is the exact error message:

*** malloc: vm_allocate(size=9203712) failed (error code=3)
*** malloc[489]: error: Can't allocate region

Nothing else. No stack trace, NOTHING.

maurice


Jul 19 '05 #21
Philippe C. Martin wrote:
Hi,

Why don't you catch the exception and print the trace ?
I don't think a Python exception is ever raised. The error message
quoted below comes from the system, not Python.
Regards,

Philippe

Maurice LING wrote:

This is the exact error message:

*** malloc: vm_allocate(size=9203712) failed (error code=3)
*** malloc[489]: error: Can't allocate region

Nothing else. No stack trace, NOTHING.


--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 19 '05 #22
"Maurice LING" <ma*********@acm.org> wrote in message
news:d5**********@domitilla.aioe.org...
Hi,

I think I've hit a system limit in python when I try to construct a list
of 200,000 elements. My error is

malloc: vm_allocate (size = 2400256) failed......

Just wondering, is this specific to my system or what? Will adding more RAM
help in this case?

Thanks and cheers
Maurice


malloc (which is the memory manager Python uses when it
runs out of its own heap memory) is trying to get another
2.4 megabyte block of memory from the operating system
so it can expand the heap.

The operating system is refusing to fill the request. There are
a lot of reasons why this might happen, ranging from system
limits (too little swap space, too little real memory), to an
inability to find a 2.4 meg block in the user part of the
address space, etc.

I'd check swap space first, and then weigh redesigning the
application so it doesn't need such large blocks of memory against
spending money on more hardware memory, which might not
solve the problem.
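
(If malloc does hand back NULL, Python turns it into a MemoryError that
you can catch, so a sketch like the following - with a made-up list size -
at least fails gracefully instead of leaving only the system message:)

    try:
        results = [0.0] * 10000000   # one large contiguous request, roughly like the OP's list
    except MemoryError:
        # the allocator refused the block; fall back to smaller chunks or stream the data
        print "could not allocate the list in one piece"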

John Roth


Jul 19 '05 #23
On Sat, 07 May 2005 14:03:34 +1000, Maurice LING <ma*********@acm.org> wrote:
John Machin wrote:
On Sat, 07 May 2005 02:29:48 GMT, bo**@oz.net (Bengt Richter) wrote:

On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org> wrote:

It doesn't seems to help. I'm thinking that it might be a SOAPpy
problem. The allocation fails when I grab a list of more than 150k
elements through SOAP but allocating a 1 million element list is fine in
python.

Now I have a performance problem...

Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
'a' into 'c'...
>>>a = range(1, 100000, 5)
>>>b = range(0, 1000000)
>>>c = []
>>>for i in b:

... if i not in a: c.append(i)
...

This takes forever to complete. Is there anyway to optimize this?
Checking whether something is in a list may average checking equality with
each element in half the list. Checking for membership in a set should
be much faster for any significant size set/list. I.e., just changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time problem?

Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...

Regards,
John


This is the exact error message:

*** malloc: vm_allocate(size=9203712) failed (error code=3)
*** malloc[489]: error: Can't allocate region

Nothing else. No stack trace, NOTHING.

1. Can you post minimal exact code that produces the above exact error message?
2. Will you? ;-)

Regards,
Bengt Richter
Jul 19 '05 #24
Bengt Richter wrote:
On Sat, 07 May 2005 14:03:34 +1000, Maurice LING <ma*********@acm.org> wrote:

John Machin wrote:
On Sat, 07 May 2005 02:29:48 GMT, bo**@oz.net (Bengt Richter) wrote:

On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <ma*********@acm.org> wrote:
>It doesn't seems to help. I'm thinking that it might be a SOAPpy
>problem. The allocation fails when I grab a list of more than 150k
>elements through SOAP but allocating a 1 million element list is fine in
>python.
>
>Now I have a performance problem...
>
>Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
>them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
>'a' into 'c'...
>
>
>
>>>>a = range(1, 100000, 5)
>>>>b = range(0, 1000000)
>>>>c = []
>>>>for i in b:
>
>... if i not in a: c.append(i)
>...
>
>This takes forever to complete. Is there anyway to optimize this?
>

Checking whether something is in a list may average checking equality with
each element in half the list. Checking for membership in a set should
be much faster for any significant size set/list. I.e., just changing to

a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time problem?
Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it might be a good idea if you gave us a little bit
more detail. You haven't even posted the actual *PYTHON* error message
and stack trace that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...

Regards,
John


This is the exact error message:

*** malloc: vm_allocate(size=9203712) failed (error code=3)
*** malloc[489]: error: Can't allocate region

Nothing else. No stack trace, NOTHING.


1. Can you post minimal exact code that produces the above exact error message?
2. Will you? ;-)

Regards,
Bengt Richter


I've re-tried the minimal code mimicking the error in interactive mode
and got this:
>>> from SOAPpy import WSDL
>>> serv = WSDL.Proxy('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v1.1/eutils.wsdl')
>>> result = serv.run_eSearch(db='pubmed', term='mouse', retmax=500000)
*** malloc: vm_allocate(size=9121792) failed (error code=3)
*** malloc[901]: error: Can't allocate region
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 453, in
__call__
return self.__r_call(*args, **kw)
File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 475, in
__r_call
self.__hd, self.__ma)
File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 347, in
__call
config = self.config)
File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 212, in
call
data = r.getfile().read(message_len)
File "/sw/lib/python2.3/socket.py", line 301, in read
data = self._sock.recv(recv_size)
MemoryError

When I changed retmax to 150000, it worked nicely.
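
(A possible workaround, sketched on the assumption that eSearch honours the
usual retstart/retmax paging parameters and that SOAPpy exposes the result's
IdList.Id the way the eSearch response suggests, is to pull the IDs in
smaller slices:)

    from SOAPpy import WSDL

    serv = WSDL.Proxy('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v1.1/eutils.wsdl')

    ids = []
    batch = 100000                     # stays under the size that triggered the vm_allocate failure
    for start in range(0, 500000, batch):
        chunk = serv.run_eSearch(db='pubmed', term='mouse',
                                 retstart=start, retmax=batch)
        ids.extend(chunk.IdList.Id)    # assumption: the result exposes IdList.Id as in eSearch responses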

Jul 19 '05 #25
Mike Meyer wrote:
There is something very non-unixy going on here, though. Why is
vm_malloc exiting with an error message, instead of returning a
failure to the calling application? I've seen other applications
include a FOSS malloc implementation to work around bugs in the
system's malloc. Maybe Python should do that on the Mac?


from what I can tell (by reading the google hits), the malloc implementation
prints a message, but returns NULL as usual (you can find reports of this
message preceding a MemoryError traceback).

</F>

Jul 19 '05 #26
