scipy.stats.itemfreq: overflow with add.reduce

Hi All,

I was playing with scipy.stats.itemfreq when I observed the following
overflow:

In [119]:for i in [254,255,256,257,258]:
   .....:    l=[0]*i
   .....:    print i, stats.itemfreq(l), l.count(0)
   .....:
254 [ [ 0 254]] 254
255 [ [ 0 255]] 255
256 [ [0 0]] 256
257 [ [0 1]] 257
258 [ [0 2]] 258

itemfreq is pretty small (in stats.py):

----------------------------------------------------------------------
def itemfreq(a):
    """
    Returns a 2D array of item frequencies. Column 1 contains item values,
    column 2 contains their respective counts. Assumes a 1D array is passed.

    Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies)
    """
    scores = _support.unique(a)
    scores = sort(scores)
    freq = zeros(len(scores))
    for i in range(len(scores)):
        freq[i] = add.reduce(equal(a,scores[i]))
    return array(_support.abut(scores, freq))
----------------------------------------------------------------------

It seems that add.reduce is the source for the overflow:

In [116]:from scipy import *

In [117]:for i in [254,255,256,257,258]:
   .....:    l=[0]*i
   .....:    print i, add.reduce(equal(l,0))
   .....:
254 254
255 255
256 0
257 1
258 2

Is there any possibility to avoid the overflow?

BTW:
Python 2.3.5 (#2, Aug 30 2005, 15:50:26)
[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2

scipy_version.scipy_version --> '0.3.2'
Thanks and best regards
Hans Georg Krauthäuser
Dec 21 '05 #1

After some further investigation:

In [150]:add.reduce(array(equal([0]*256,0),typecode='l'))
Out[150]:256

In [151]:add.reduce(equal([0]*256,0))
Out[151]:0

The problem occurs with arrays with typecode 'b' (as returned by equal).

A workaround patch for itemfreq is obvious, but is this a bug or a feature?

regards
Hans Georg
Dec 21 '05 #2
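
The numbers reported above (256 -> 0, 257 -> 1, 258 -> 2) are what one would expect if the sum is accumulated in an 8-bit unsigned integer, which is consistent with the typecode 'b' observation. The following is only a pure-Python sketch of that arithmetic, not the scipy 0.3.2 code path; the function name add_reduce_uint8 is hypothetical and exists just for this illustration.

```python
def add_reduce_uint8(values):
    """Sum values as an 8-bit unsigned accumulator would: modulo 256.

    Hypothetical helper for illustration; it only mimics the wraparound.
    """
    total = 0
    for v in values:
        # Keep only the low 8 bits after every addition.
        total = (total + v) & 0xFF
    return total


for n in [254, 255, 256, 257, 258]:
    matches = [1] * n  # stand-in for the result of equal([0]*n, 0)
    # Prints n, the wrapped sum, and the true count.
    print(n, add_reduce_uint8(matches), sum(matches))
```

The wrapped column reproduces 254, 255, 0, 1, 2, while the plain Python sum keeps the true count because Python integers do not overflow.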

I feel a bit lonely here, but nevertheless, a further remark:

The problem comes directly from the ufunc 'add' for typecode 'b'. In
contrast to 'multiply', the typecode is not upcast:

In [178]:array(array([1],'b')*2)
Out[178]:array([2],'i')

In [179]:array(array([1],'b')+array([1],'b'))
Out[179]:array([2],'b')

So, for an array a with typecode 'b', it follows that

a+a != a*2

At the moment, I don't have the time to try the new scipy_core. It would
be nice to hear whether the problem is known, or even already fixed.

Regards
Hans Georg Krauthäuser
Dec 22 '05 #3
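
Since 'add' does not upcast typecode 'b', one way to sidestep the overflow is to count matches in a type that cannot wrap at 256. The sketch below does this in plain Python, where integers are unbounded. It is an assumption-laden stand-in, not the patch to stats.py: the name itemfreq_wide is hypothetical, and it returns a list of [value, count] rows instead of an array.

```python
def itemfreq_wide(a):
    """Return [value, count] rows sorted by value.

    Hypothetical replacement for illustration: counts are plain Python
    ints, so they cannot wrap around at 256.
    """
    scores = sorted(set(a))
    table = []
    for score in scores:
        count = 0
        for x in a:
            if x == score:
                count += 1
        table.append([score, count])
    return table


for n in [254, 255, 256, 257, 258]:
    # Prints n and the frequency table for a list of n zeros.
    print(n, itemfreq_wide([0] * n))
```

With the array-based original, the equivalent idea would be to convert the result of equal() to a wider typecode before calling add.reduce, as the typecode='l' experiment earlier in the thread does. In present-day NumPy, numpy.unique(a, return_counts=True) provides values and counts directly.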
