
Computing correlations with SciPy

I want to compute the correlation between two sequences X and Y, and
tried using SciPy to do so without success. Here's what I have; how
can I correct it?
>>> X = [1, 2, 3, 4, 5]
>>> Y = [5, 4, 3, 2, 1]
>>> import scipy
>>> scipy.corrcoef(X,Y)
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python24\Lib\site-packages\numpy\lib\function_base.py", line 671, in corrcoef
    d = diag(c)
  File "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py", line 80, in diag
    raise ValueError, "Input must be 1- or 2-d."
ValueError: Input must be 1- or 2-d.


Thanks in advance

Thomas Philips

Mar 16 '06 #1
On Thu, 2006-03-16 at 07:49 -0800, tk****@hotmail.com wrote:
> I want to compute the correlation between two sequences X and Y, and
> tried using SciPy to do so without success. Here's what I have, how
> can I correct it?


$ python2.4
Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = [1,2,3,4,5]
>>> y = [5,4,3,2,1]
>>> import scipy
>>> scipy.corrcoef(x, y)
array([[ 1., -1.],
       [-1.,  1.]])

Looks fine to me...
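
For anyone reproducing this on a newer install, here is a minimal sketch that calls NumPy directly. It assumes a current NumPy where the function is reachable as `numpy.corrcoef`; the traceback in the original post shows `scipy.corrcoef` resolving to NumPy's implementation in `function_base.py`, so this should exercise the same code path.

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

# corrcoef treats x and y as two variables and returns a 2x2 matrix:
# the diagonal holds each variable's correlation with itself, and the
# off-diagonal entries hold the correlation between x and y.
r = np.corrcoef(x, y)
print(r)
```

If this raises the same ValueError, the installed NumPy probably predates the fix discussed later in the thread.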

Mar 16 '06 #2
>>>>> "tkpmep" == tkpmep <tk****@hotmail.com> writes:

tkpmep> I want to compute the correlation between two sequences X
tkpmep> and Y, and tried using SciPy to do so without success.
tkpmep> Here's what I have, how can I correct it?
tkpmep> >>> X = [1, 2, 3, 4, 5]
tkpmep> >>> Y = [5, 4, 3, 2, 1]
tkpmep> >>> import scipy
tkpmep> >>> scipy.corrcoef(X,Y)
tkpmep> Traceback (most recent call last):
tkpmep>   File "<interactive input>", line 1, in ?
tkpmep>   File "C:\Python24\Lib\site-packages\numpy\lib\function_base.py", line 671, in corrcoef
tkpmep>     d = diag(c)
tkpmep>   File "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py", line 80, in diag
tkpmep>     raise ValueError, "Input must be 1- or 2-d."
tkpmep> ValueError: Input must be 1- or 2-d.


Hmm, this may be a bug in scipy. matplotlib also defines a corrcoef
function, which you may want to use until this problem gets sorted out:

In [9]: matplotlib.mlab.corrcoef(X,Y)

In [10]: X = [1, 2, 3, 4, 5]

In [11]: Y = [5, 4, 3, 2, 1]

In [12]: matplotlib.mlab.corrcoef(X,Y)
Out[12]:
array([[ 1., -1.],
       [-1.,  1.]])
Mar 16 '06 #3
tk****@hotmail.com wrote:
> I want to compute the correlation between two sequences X and Y, and
> tried using SciPy to do so without success. Here's what I have, how
> can I correct it?


This was a bug in NumPy (inherited from Numeric actually). The fix is
in SVN of NumPy.

Here are the new versions of those functions that should work as you
wish (again, these are in SVN, but perhaps you have a binary install).

These functions belong in <site-packages>/numpy/lib/function_base.py

def cov(m,y=None, rowvar=1, bias=0):
    """Estimate the covariance matrix.

    If m is a vector, return the variance. For matrices return the
    covariance matrix.

    If y is given it is treated as an additional (set of)
    variable(s).

    Normalization is by (N-1) where N is the number of observations
    (unbiased estimate). If bias is 1 then normalization is by N.

    If rowvar is non-zero (default), then each row is a variable with
    observations in the columns, otherwise each column
    is a variable and the observations are in the rows.
    """

    X = asarray(m,ndmin=2)
    if X.shape[0] == 1:
        rowvar = 1
    if rowvar:
        axis = 0
        tup = (slice(None),newaxis)
    else:
        axis = 1
        tup = (newaxis, slice(None))
    if y is not None:
        y = asarray(y,ndmin=2)
        X = concatenate((X,y),axis)

    X -= X.mean(axis=1-axis)[tup]
    if rowvar:
        N = X.shape[1]
    else:
        N = X.shape[0]

    if bias:
        fact = N*1.0
    else:
        fact = N-1.0

    if not rowvar:
        return (dot(X.transpose(), X.conj()) / fact).squeeze()
    else:
        return (dot(X,X.transpose().conj())/fact).squeeze()

def corrcoef(x, y=None, rowvar=1, bias=0):
    """The correlation coefficients
    """
    c = cov(x, y, rowvar, bias)
    try:
        d = diag(c)
    except ValueError: # scalar covariance
        return 1
    return c/sqrt(multiply.outer(d,d))
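
To try the logic outside of function_base.py, here is a minimal standalone sketch of the same computation. It is not the code that went into SVN: the names `cov_sketch` and `corrcoef_sketch` are hypothetical, only the rows-are-variables case is handled, the inputs are copied and converted to float (an added assumption, so the in-place subtraction is safe on integer lists), and the result is not squeezed.

```python
import numpy as np

def cov_sketch(m, y=None, bias=False):
    """Covariance matrix with each row a variable (hypothetical helper)."""
    X = np.array(m, dtype=float, ndmin=2)        # copy, force float and 2-d
    if y is not None:
        Y = np.array(y, dtype=float, ndmin=2)
        X = np.concatenate((X, Y), axis=0)       # stack y as extra variable(s)
    X -= X.mean(axis=1)[:, np.newaxis]           # centre each variable
    n = X.shape[1]                               # number of observations
    fact = float(n) if bias else n - 1.0         # N or N-1 normalization
    return (X @ X.T.conj()) / fact

def corrcoef_sketch(x, y=None, bias=False):
    """Correlation matrix: covariance scaled by the outer product of std devs."""
    c = cov_sketch(x, y, bias)
    d = np.diag(c)                               # variances of each variable
    return c / np.sqrt(np.multiply.outer(d, d))

X = [1, 2, 3, 4, 5]
Y = [5, 4, 3, 2, 1]
print(cov_sketch(X, Y))
print(corrcoef_sketch(X, Y))
```

Because the sketch always keeps the covariance two-dimensional, `diag` never sees a scalar, which is the situation the `try`/`except` in the posted `corrcoef` guards against.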

Mar 17 '06 #4
Tested it and it works like a charm! Thank you very much for fixing
this. Not knowing what an SVN is, I simply copied the code into the
appropriate library files and it works perfectly well.

May I suggest a simple enhancement: modify corrcoef so that if it is
fed two one-dimensional arrays, it returns a scalar. cov does something
similar for covariances: if you feed it just one vector, it returns a
scalar, and if you feed it two, it returns the covariance matrix, i.e.:
>>> x = [1, 2, 3, 4, 5]
>>> z = [5, 4, 3, 2, 1]
>>> scipy.cov(x,z)
array([[ 2.5, -2.5],
       [-2.5,  2.5]])
>>> scipy.cov(x)
2.5

I suspect that the majority of users use corrcoef to obtain point
estimates of the correlation of two vectors, and relatively few will
estimate a full correlation matrix, as this method tends not to be
robust to the presence of noise and/or errors in the data.
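
Until something like that exists in the library, the scalar can be pulled out of the matrix by hand. A minimal sketch, assuming `numpy.corrcoef` returns the 2x2 matrix shown earlier in the thread; `corr_scalar` is a hypothetical helper name, not an existing NumPy or SciPy function:

```python
import numpy as np

def corr_scalar(x, y):
    """Correlation of two 1-d sequences as a plain float (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or y.ndim != 1:
        raise ValueError("corr_scalar expects two 1-d sequences")
    # The off-diagonal entry of the 2x2 matrix is the x-y correlation.
    return float(np.corrcoef(x, y)[0, 1])

print(corr_scalar([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))
```

Keeping this as a separate wrapper leaves corrcoef's return type the same for every input, which may be less surprising than switching between scalar and matrix.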

Thomas Philips

Mar 19 '06 #5
