Computing correlations with SciPy

tkpmep

I want to compute the correlation between two sequences X and Y, and
tried using SciPy to do so without success.l Here's what I have, how
can I correct it?

X = [1, 2, 3, 4, 5]
Y = [5, 4, 3, 2, 1]
import scipy
scipy.corrcoef(X,Y) Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "C:\Python24\Lib\site-packages\numpy\lib\function_base.py", line
671, in corrcoef
d = diag(c)
File "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py", line
80, in diag
raise ValueError, "Input must be 1- or 2-d."
ValueError: Input must be 1- or 2-d.

Thanks in advance

Thomas Philips

Mar 16 '06 #1

Subscribe Post Reply

5630

Felipe Almeida Lessa

Em Qui, 2006-03-16 Ã*s 07:49 -0800, tk****@hotmail.com escreveu:

I want to compute the correlation between two sequences X and Y, and
tried using SciPy to do so without success.l Here's what I have, how
can I correct it?

$ python2.4
Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

x = [1,2,3,4,5]
y = [5,4,3,2,1]
import scipy
scipy.corrcoef(x, y) array([[ 1., -1.],
[-1., 1.]]) # Looks fine for me...

Mar 16 '06 #2

John Hunter

>>>>> "tkpmep" == tkpmep <tk****@hotmail.com> writes:

tkpmep> I want to compute the correlation between two sequences X
tkpmep> and Y, and tried using SciPy to do so without success.l
tkpmep> Here's what I have, how can I correct it?

X = [1, 2, 3, 4, 5] Y = [5, 4, 3, 2, 1] import scipy
scipy.corrcoef(X,Y) tkpmep> Traceback (most recent call last): File "<interactive
tkpmep> input>", line 1, in ? File
tkpmep> "C:\Python24\Lib\site-packages\numpy\lib\function_base.py",
tkpmep> line 671, in corrcoef d = diag(c) File
tkpmep> "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py",
tkpmep> line 80, in diag raise ValueError, "Input must be 1- or
tkpmep> 2-d." ValueError: Input must be 1- or 2-d.

Hmm, this may be a bug in scipy. matplotlib also defines a corrcoef
function, which you may want to use until this problem gets sorted out

In [9]: matplotlib.mlab.corrcoef(X,Y)

In [10]: X = [1, 2, 3, 4, 5]

In [11]: Y = [5, 4, 3, 2, 1]

In [12]: matplotlib.mlab.corrcoef(X,Y)
Out[12]:
array([[ 1., -1.],
[-1., 1.]])

Mar 16 '06 #3

Travis Oliphant

tk****@hotmail.com wrote:

I want to compute the correlation between two sequences X and Y, and
tried using SciPy to do so without success.l Here's what I have, how
can I correct it?

This was a bug in NumPy (inherited from Numeric actually). The fix is
in SVN of NumPy.

Here are the new versions of those functions that should work as you
wish (again, these are in SVN, but perhaps you have a binary install).

These functions belong in <site-packages>/numpy/lib/function_base.py

def cov(m,y=None, rowvar=1, bias=0):
"""Estimate the covariance matrix.

If m is a vector, return the variance. For matrices return the
covariance matrix.

If y is given it is treated as an additional (set of)
variable(s).

Normalization is by (N-1) where N is the number of observations
(unbiased estimate). If bias is 1 then normalization is by N.

If rowvar is non-zero (default), then each row is a variable with
observations in the columns, otherwise each column
is a variable and the observations are in the rows.
"""

X = asarray(m,ndmin=2)
if X.shape[0] == 1:
rowvar = 1
if rowvar:
axis = 0
tup = (slice(None),newaxis)
else:
axis = 1
tup = (newaxis, slice(None))
if y is not None:
y = asarray(y,ndmin=2)
X = concatenate((X,y),axis)

X -= X.mean(axis=1-axis)[tup]
if rowvar:
N = X.shape[1]
else:
N = X.shape[0]

if bias:
fact = N*1.0
else:
fact = N-1.0

if not rowvar:
return (dot(X.transpose(), X.conj()) / fact).squeeze()
else:
return (dot(X,X.transpose().conj())/fact).squeeze()

def corrcoef(x, y=None, rowvar=1, bias=0):
"""The correlation coefficients
"""
c = cov(x, y, rowvar, bias)
try:
d = diag(c)
except ValueError: # scalar covariance
return 1
return c/sqrt(multiply.outer(d,d))

Mar 17 '06 #4

tkpmep

Tested it and it works like a charm! Thank you very much for fixing
this. Not knowing what an SVN is, I simply copied the code into the
appropriate library files and it works perfectly well.

May I suggest a simple enhancement: modify corrcoef so that if it is
fed two 1 dimensional arrays, it returns a scalar. cov does something
similar for covariances: if you feed it just one vector, it returns a
scalar, and if you feed it two, it returns the covariance matrix i.e:

x = [1, 2, 3, 4, 5] z = [5, 4, 3, 2, 1] scipy.cov(x,z) array([[ 2.5, -2.5],
[-2.5, 2.5]])
scipy.cov(x)

2.5

I suspect that the majority of users use corrcoef to obtain point
estimates of the covariance of two vectors, and relatively few will
estimate a covariance matrix, as this method tends not to be robust to
the presence of noise and/or errors in the data.

Thomas Philips

Mar 19 '06 #5

Similar topics

scipy for Python 2.3 on win?

by: Markus von Ehr | last post by:

Hi, is there already a scipy windows binary for Py2.3? Thanks for any hint, Markus

Python

Scientific Computing with NumPy

by: mclaugb | last post by:

Has anyone recompiled the Scientific Computing package using NumPy instead of Numeric? I need a least squares algorithm and a Newton Rhaphson algorithm which is contained in Numeric but all the...

Python

Getting started with Scipy/NumPy

by: tkpmep | last post by:

I installed SciPy and NumPy (0.9.5, because 0.9.6 does not work with the current version of SciPy), and had some teething troubles. I looked around for help and observed that the tutorial is dated...

Python

problems installling scipy.weave on windows

by: Julien Fiore | last post by:

Hi, I have problems trying to install the scipy.weave package. I run Python 2.4 on windows XP and my C compiler is MinGW. Below is the output of scipy.weave.test(). I read that the tests should...

Python

Scientific computing and data visualization.

by: Fie Pye | last post by:

Hallo I would like to have a high class open source tools for scientific computing and powerful 2D and 3D data visualisation. Therefore I chosepython, numpy and scipy as a base. Now I am in...

Python

latest numpy & scipy - incompatible ?

by: robert | last post by:

I'm using latest numpy & scipy. What is this problem ? : RuntimeError: module compiled against version 1000002 of C-API but this version of numpy is 1000009 Traceback (most recent call last):...

Python

numpy/scipy: correlation

by: robert | last post by:

Is there a ready made function in numpy/scipy to compute the correlation y=mx+o of an X and Y fast: m, m-err, o, o-err, r-coef,r-coef-err ? Or a formula to to compute the 3 error ranges? ...

Python

Automating Calculation of Lagged Cross Correlations between Variables

by: LucasLondon | last post by:

Hi there, First of all apologies for the long post. Hope someone can offer some advice. I have about 200 columns of time series data that I need to perform a correlation analysis on in terms...

Visual Basic 4 / 5 / 6

sage vs enthought for sci computing

by: jadamwilson2 | last post by:

Hello, I have recently become interested in using python for scientific computing, and came across both sage and enthought. I am curious if anyone can tell me what the differences are between the...

Python

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General