segmentation fault in scipy?

I'm running operations on large arrays of floats, approx. 25,000 x 80.
Python (scipy) does not seem to come close to using 4GB of wired mem,
but segfaults at around a gig. Everything works fine on smaller batches
of data around 10,000 x 80 and uses a max of ~600MB of mem. Any ideas?
Is this just too much data for scipy?

Thanks Conor

Traceback (most recent call last):
  File "C:\Temp\CR_2\run.py", line 68, in ?
    net.rProp(1.2, .5, .000001, 50.0, input, output, 1)
  File "/Users/conorrob/Desktop/CR_2/Network.py", line 230, in rProp
    print scipy.trace(error*scipy.transpose(error))
  File "D:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 149, in __mul__
    return N.dot(self, other)
MemoryError


May 10 '06 #1
co************@gmail.com wrote:
I'm running operations on large arrays of floats, approx. 25,000 x 80.
Python (scipy) does not seem to come close to using 4GB of wired mem,
but segfaults at around a gig. Everything works fine on smaller batches
of data around 10,000 x 80 and uses a max of ~600MB of mem. Any ideas?
Is this just too much data for scipy?

Thanks Conor

Traceback (most recent call last):
  File "C:\Temp\CR_2\run.py", line 68, in ?
    net.rProp(1.2, .5, .000001, 50.0, input, output, 1)
  File "/Users/conorrob/Desktop/CR_2/Network.py", line 230, in rProp
    print scipy.trace(error*scipy.transpose(error))
  File "D:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 149, in __mul__
    return N.dot(self, other)
MemoryError


You should ask this question on the numpy-discussion list for better
feedback.
Does it actually segfault or give you this Memory Error?
Temporary arrays that need to be created could be the source of the
extra memory.
Generally, you should be able to use all the memory on your system
(unless you are on a 64-bit system and are not using Python 2.5).

-Travis

May 10 '06 #2
If I run it from the shell (unix) I get: Segmentation fault and see a
core dump in my processes. If I run it in the python shell I get as
above:
  File "D:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 149, in __mul__
    return N.dot(self, other)
MemoryError

In your experience as one of the devs of scipy, is this too much data?

thank you

May 10 '06 #3
co************@gmail.com wrote:
I'm running operations on large arrays of floats, approx. 25,000 x 80.
Python (scipy) does not seem to come close to using 4GB of wired mem,
but segfaults at around a gig. Everything works fine on smaller batches
of data around 10,000 x 80 and uses a max of ~600MB of mem. Any ideas?
Is this just too much data for scipy?

Thanks Conor

Traceback (most recent call last):
  File "C:\Temp\CR_2\run.py", line 68, in ?
    net.rProp(1.2, .5, .000001, 50.0, input, output, 1)
  File "/Users/conorrob/Desktop/CR_2/Network.py", line 230, in rProp
    print scipy.trace(error*scipy.transpose(error))
  File "D:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 149, in __mul__
    return N.dot(self, other)
MemoryError


This is not a segfault. Is this the only error you see? Or are you actually
seeing a segfault somewhere?

If error.shape == (25000, 80), then dot(error, transpose(error)) will be
returning an array of shape (25000, 25000). Assuming double precision floats,
that array will take up about 4768 megabytes of memory, more than you have. The
memory usage doesn't go up near 4 gigabytes because the allocation of the very
large returned array fails, so the large chunk of memory never gets allocated.

There are two possibilities:

1. As Travis mentioned, numpy won't create the array because it is still
32-bit-limited due to the Python 2.4 C API. This has been resolved with Python 2.5.

2. The default build of numpy uses plain-old malloc(3) to allocate memory, and
it may be failing to create such large chunks of memory.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

May 10 '06 #4
Good point. Finding the SSE using an absolute error matrix of (25000 x
1) is insane. I pulled out the error function (for now) and I'm back
in business. Thanks for all the great advice.

May 10 '06 #5
co************@ gmail.com wrote:
Good point. Finding the SSE using an absolute error matrix of (25000 x
1) is insane. I pulled out the error function (for now) and I'm back
in business. Thanks for all the great advice.


Could you go back for a second and describe your problem a little bit more? It
sounds like you were doing the wrong operation. By SSE, you mean "Sum of Squared
Errors" of the 25000 length-80 vectors, right? In that case, using matrix
multiplication won't give you that. That will, in essence, calculate the
dot-product of each of the 25000 length-80 vectors with *each* of the other
25000 length-80 vectors in addition to themselves. It seems to me like you want
something like this:

SSE = sum(error * error, axis=-1)

Then SSE.shape == (25000,).
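
A small runnable version of the same idea, as a sketch only: it uses current NumPy, and a random (1000, 80) array stands in for the poster's data, which is not available.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the (25000, 80) error array; the values are made up.
error = rng.standard_normal((1000, 80))

# Square element-wise, then sum along the last axis: one SSE per row.
sse_per_row = np.sum(error * error, axis=-1)
print(sse_per_row.shape)  # (1000,)
```

Each entry equals the dot product of one row with itself, which is exactly the diagonal of the huge matrix product, computed without ever forming the off-diagonal terms.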

--
Robert Kern

May 10 '06 #6
I'm using rprop (not dependent on the error function in this case, i.e.
standard rprop vs. irprop or arprop) for an MLP tanh, sigmoid nnet as
part of a hybrid model. I guess I was using a little Matlab thought
when I wrote the SSE function. My batches are about 25,000 x 80, so my
absolute error (diff between net outputs and desired outputs) when
using *one* output unit is shape(~25000,). Am I wrong to assume
trace(error*transpose(error)) is the sum of the squared errors, which
should be a shape(1,)? I'm just now starting to dig a little deeper
into scipy, and I need to get the full doc.

Thanks for all your input.

May 11 '06 #7
co************@gmail.com wrote:
I'm using rprop (not dependent on the error function in this case, i.e.
standard rprop vs. irprop or arprop) for an MLP tanh, sigmoid nnet as
part of a hybrid model. I guess I was using a little Matlab thought
when I wrote the SSE function. My batches are about 25,000 x 80, so my
absolute error (diff between net outputs and desired outputs) when
using *one* output unit is shape(~25000,). Am I wrong to assume
trace(error*transpose(error)) is the sum of the squared errors, which
should be a shape(1,)?


I'm afraid you're using terminology (and abbreviations!) that I can't follow.
Let me try to restate what's going on and you can correct me as I screw up. You
have a neural net that has 80 output units. You have 25000 observations that you
are using to train the neural net. Each observation vector (and consequently,
each error vector) has 80 elements.

Judging by your notation, you are using the matrix subclass of array to change *
to matrix multiplication. In my message you are responding to (btw, please quote
the emails you respond to so we can maintain some context), I gave an answer for
you using regular arrays which have * as elementwise multiplication. The matrix
object's behavior gets in the way of the most natural way to do these
calculations, so I do recommend avoiding the matrix object and learning to use
the dot() function to do matrix multiplication instead. But if you want to
continue using matrix objects, then you can use the multiply() function to do
element-wise multiplication.
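
The difference between the two types can be seen on a 2 x 2 example. This is a sketch against current NumPy, where np.matrix still exists but is no longer recommended; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
m = np.matrix(a)  # matrix subclass: * means matrix multiplication

elementwise = a * a                # arrays: * multiplies element by element
matmul = np.dot(a, a)              # arrays: dot() is the matrix product
m_star = m * m                     # matrix: * is the matrix product
m_elementwise = np.multiply(m, m)  # matrix: multiply() is element-wise

print(elementwise)
print(matmul)
```

With plain arrays the element-wise form is the default, which is why the sum-of-squares expressions in this thread read more naturally without the matrix object.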

The answer I gave also used the wrong name for the result. It seems that you
want the sum of the squared errors across all of the observations. In this case,
you can use axis=None to specify that every element should be summed:

SSE = sum(multiply(error, error), axis=None)

trace(dot(error, transpose(error))) wastes a *huge* amount of time and memory
since you are calculating (if your machine was capable of it) a gigantic matrix,
then throwing away all of the off-diagonal elements. The method I give above
wastes a little memory; there is one temporary matrix the size of error.
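
To see that the two expressions agree, here is a sketch on a deliberately small array so the full product fits in memory. It assumes current NumPy; the einsum line is an extra option not mentioned in the thread, included because it avoids the element-wise temporary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small stand-in: 200 rows instead of 25000, so the (200, 200) product is cheap.
error = rng.standard_normal((200, 80))

# Builds the full (200, 200) product, then keeps only its diagonal.
wasteful = np.trace(np.dot(error, np.transpose(error)))

# One temporary the size of error, then a sum over every element.
direct = np.sum(np.multiply(error, error), axis=None)

# Same sum of squares, accumulated without the element-wise temporary.
no_temp = np.einsum("ij,ij->", error, error)

print(wasteful, direct, no_temp)
```

The three values match up to floating-point rounding. At 25000 rows only the last two are practical, since the first needs the multi-gigabyte intermediate.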

--
Robert Kern

May 11 '06 #8
> If I run it from the shell (unix) I get: Segmentation fault and see a
core dump in my processes. If I run it in the python shell I get as
above:
File "D:\Python24\Lib\site-packages\numpy\core\defmatrix.py", line 149, in


That's a Windows path... Does Windows even make full use of
4GB?

I'm developing on a unix machine. However, you do have a sharp eye...
So you win the blue ribbon and a free roundtable pizza! I'm actually
running experiments on an HP workstation while I work (emailed myself
the error), that's one of the beauties of python. Yes, Dennis, there is
a max of 4GB of addressable mem on that machine and 2 cpus. This is
what a unix path looks like. Can you tell me what OS and shell I'm
running (both are obvious from the prompt) for the bonus prize?

[gremlins:~/desktop/CR_2] conorrob% python run.py
AL
fold: 0
Done Loading
Input Ready
Network Training
Initial SSE:
Segmentation fault
[gremlins:~/desktop/CR_2] conorrob%

May 11 '06 #9
>I'm afraid you're using terminology (and abbreviations!) that I can't follow.
Let me try to restate what's going on and you can correct me as I screw up. You
have a neural net that has 80 output units. You have 25000 observations that you
are using to train the neural net. Each observation vector (and consequently,
each error vector) has 80 elements.
First, sorry for not quoting and using abbreviations. Second, your
observation is correct, except that I have *one* output unit, not 80. I
have ~80 input units + bias (for each of the 25000 observations), x +
bias number of hidden units and one output unit, leaving me with an
output array/matrix of shape (25000,), as well as my desired output
having the same shape.

RPROP = Resilient Backpropagation; it uses changes in the error gradient
and ignores the magnitude of the gradient, which can be harmful. See "A
Direct Adaptive Method for Faster Backpropagation Learning: The RPROP
Algorithm" by Martin Riedmiller and Heinrich Braun; just google it and
read Section D, it's well written.
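
For readers who have not met the idea, a simplified sign-based update in that spirit might look like the sketch below. It is not the poster's Network.rProp and not a faithful copy of the paper: the function name, argument names and the toy problem are all hypothetical. The defaults mirror the first four arguments of the net.rProp call in the traceback, but that correspondence is an assumption.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One sign-based update; all names and defaults here are illustrative."""
    same = grad * prev_grad
    # Gradient kept its sign: grow the per-weight step, capped at step_max.
    step = np.where(same > 0, np.minimum(step * eta_plus, step_max), step)
    # Gradient flipped sign: we overshot, so shrink the step, floored at step_min.
    step = np.where(same < 0, np.maximum(step * eta_minus, step_min), step)
    # After a flip, skip this weight's update by zeroing its gradient.
    grad = np.where(same < 0, 0.0, grad)
    # Only the sign of the gradient is used, never its magnitude.
    w = w - np.sign(grad) * step
    return w, grad, step

# Toy problem: minimise f(w) = sum(w**2), whose gradient is 2*w.
w = np.array([3.0, -2.0])
prev_grad = np.zeros_like(w)
step = np.full_like(w, 0.1)
for _ in range(100):
    w, prev_grad, step = rprop_step(w, 2.0 * w, prev_grad, step)
print(w)
```

Because only the sign of each partial derivative enters the update, the error value itself is never needed for training, which is consistent with the SSE being used only as a check here.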

tanh = hyperbolic tangent function, y values in [-1, 1]; oftentimes better
for its steeper derivative and wider range.

sigmoid function = popular for its [0, 1] range; it can calculate
posterior probabilities when using the cross-entropy error function,
which I have commented out since it takes more time to process. I'm
not using my error function in this specific case at this point in
time, so SSE is not really needed; however, I'd like to use it as a
check. It is also popular for its easily calculated derivative.
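
The ranges and derivatives mentioned above can be checked numerically. This is a sketch with hypothetical helper names, using the standard definitions of the logistic sigmoid and tanh.

```python
import numpy as np

def sigmoid(x):
    # Logistic function, outputs between 0 and 1.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    # The derivative reuses the function value: s * (1 - s).
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_deriv(x):
    # The derivative of tanh also reuses the function value: 1 - tanh(x)**2.
    return 1.0 - np.tanh(x) ** 2

x = np.linspace(-5.0, 5.0, 11)
print(sigmoid(x).min(), sigmoid(x).max())
print(np.tanh(x).min(), np.tanh(x).max())
print(sigmoid_deriv(0.0), tanh_deriv(0.0))
```

At zero the tanh slope is 1 while the sigmoid slope is 0.25, which is the "steeper derivative" referred to above.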

The answer I gave also used the wrong name for the result. It seems that you
want the sum of the squared errors across all of the observations. In this case,
you can use axis=None to specify that every element should be summed:

SSE = sum(multiply(error, error), axis=None)

trace(dot(error, transpose(error))) wastes a *huge* amount of time and memory
since you are calculating (if your machine was capable of it) a gigantic matrix,
then throwing away all of the off-diagonal elements. The method I give above
wastes a little memory; there is one temporary matrix the size of error.


This is great advice and much appreciated. It was the answer to my
problem, thank you. However, isn't this faster...
scipy.sum(scipy.array(scipy.matrix(error)*scipy.matrix(error)), axis=None)
as you explained in my other posting?

Thanks again.

May 11 '06 #10
