
# mean and std dev of an array?

import array
a = array.array('f', [1,2,3])

print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?

Oct 23 '06 #1
print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array data?
Well, you could use numpy or whatever. If you want to calculate directly,
you could do something like (untested):

from math import sqrt

n = len(a)
mean = sum(a) / n
sd = sqrt(sum((x-mean)**2 for x in a) / n)
Oct 23 '06 #2
The simplest approach I see is to do it manually.

If your array data is numeric:

mean = sum(a)/len(a)

As for the standard deviation, it depends on the nature of your data, for
example whether it is a whole population or a sample. Check out
http://en.wikipedia.org/wiki/Standard_deviation for info on that, but a
for loop with a few calculations in it should be enough.

I know of no "standard" way to do this in Python.

I have not tested the above code, but it should work.

Éric
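
A note for readers on newer interpreters: since Python 3.4 the standard library includes a `statistics` module, which did not exist when this thread was written. A minimal sketch, assuming Python 3.4 or later and the same `array` data as in the question:

```python
import array
import statistics

a = array.array('f', [1, 2, 3])

mean = statistics.mean(a)         # arithmetic mean
pop_sd = statistics.pstdev(a)     # population standard deviation (divides by n)
sample_sd = statistics.stdev(a)   # sample standard deviation (divides by n - 1)

print(mean, pop_sd, sample_sd)
```

Which of the two deviations is appropriate depends on whether the data is the whole population or a sample drawn from it.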

Oct 23 '06 #3
import array
a = array.array('f', [1,2,3])
print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?
No, you don't have to...though there are likely some stats
modules floating around. However, they're pretty simple to
implement:
>>> import array
>>> import math
>>> a = array.array('f', [1,2,3])
>>> mean = lambda a: sum(a)/len(a)
>>> mean(a)
2.0
>>> a
array('f', [1.0, 2.0, 3.0])
>>> def stdev(a):
...     m = mean(a)
...     return math.sqrt(sum((x-m) ** 2 for x in a) / len(a))
...
>>> stdev(a)
0.81649658092772603
Pretty much a no-brainer implementation of the algorithm
described at http://en.wikipedia.org/wiki/Standard_deviation

-tkc
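
The `stdev` above divides by `len(a)`, which gives the population standard deviation. Dividing by `len(a) - 1` gives the sample standard deviation instead, and the two are easy to mix up when comparing results from different libraries. A small sketch of both, where the names `pstdev` and `sstdev` are made up for illustration:

```python
import math

def mean(data):
    # float() keeps the division exact-valued on Python 2 with integer input
    return sum(data) / float(len(data))

def pstdev(data):
    """Population standard deviation: divide by n."""
    m = mean(data)
    return math.sqrt(sum((x - m) ** 2 for x in data) / len(data))

def sstdev(data):
    """Sample standard deviation: divide by n - 1 (needs at least two values)."""
    m = mean(data)
    return math.sqrt(sum((x - m) ** 2 for x in data) / (len(data) - 1))

data = [1, 2, 3, 4, 5, 6]
print(pstdev(data))
print(sstdev(data))
```

For `[1, 2, 3, 4, 5, 6]` the sample version is the square root of 3.5, about 1.8708, while the population version is about 1.7078.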

Oct 23 '06 #4

Paul Rubin wrote:
print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array data?

Well, you could use numpy or whatever. If you want to calculate directly,
you could do something like (untested):

n = len(a)
mean = sum(a) / n
sd = sqrt(sum((x-mean)**2 for x in a) / n)
This last line looks like it will be very slow, especially if the array
is large, so speed is the real issue here.
Is there a faster way, such as transferring the array to a different
container class? If so, which one?

Oct 24 '06 #5
> n = len(a)
> mean = sum(a) / n
> sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
> If there is a faster way... like transferring the array to a
> different container class... but what?

Perhaps:

>>> import scipy
>>> print scipy.mean([1,2,3,4,5,6])
3.5
>>> print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Oct 24 '06 #6
<sk**@pobox.com> wrote in message
news:ma***************************************@python.org...
>
>n = len(a)
>mean = sum(a) / n
>sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
>If there is a faster way... like transferring the array to a
>different container class... but what?

Perhaps:
>>import scipy
>>print scipy.mean([1,2,3,4,5,6])
3.5
>>print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Can scipy work with an iterator/generator? If you can only make one pass
through the data, you can try this:

import random
from functools import reduce
from math import sqrt

lst = (random.gauss(0,10) for i in range(1000))
# compute n, sum(x), and sum(x**2) with a single pass through the data
n,sumx,sumx2 = reduce(lambda a,b:(a[0]+b[0],a[1]+b[1],a[2]+b[2]),
                      ((1,x,x*x) for x in lst))
sd = sqrt( (sumx2 - (sumx*sumx/n))/(n-1) )

-- Paul
Oct 24 '06 #7
Paul McGuire wrote:
<sk**@pobox.com> wrote in message
news:ma***************************************@python.org...
> >n = len(a)
>mean = sum(a) / n
>sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
> >If there is a faster way... like transferring the array to a
>different container class... but what?

Perhaps:
> >>import scipy
>>print scipy.mean([1,2,3,4,5,6])
3.5
> >>print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Note that those are also in numpy, now.
Can scipy work with an iterator/generator?
There is a fromiter() constructor for 1D arrays. The basic array() constructor
(which is used by the other functions like mean() and std() to create an array
from a non-array sequence) doesn't quite have enough information to consume
generic iterators. The mean() and std() algorithms operate on arrays, not
generic iterators.
If you can only make one pass
through the data, you can try this:

lst = (random.gauss(0,10) for i in range(1000))
# compute n, sum(x), and sum(x**2) with single pass through list
n,sumx,sumx2 = reduce(lambda a,b:(a[0]+b[0],a[1]+b[1],a[2]+b[2]), ((1,x,x*x)
for x in lst) )
sd = sqrt( (sumx2 - (sumx*sumx/n))/(n-1) )
Don't do that. If you *must* restrict yourself to one pass, there are better
algorithms.

http://en.wikipedia.org/wiki/Algorit...ating_variance

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
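
One of the better-behaved single-pass methods usually cited for this problem is Welford's online algorithm, which updates a running mean and a running sum of squared deviations instead of accumulating `sum(x)` and `sum(x**2)`. A minimal sketch, assuming the sample (n - 1) form is wanted; the function name `running_stats` is made up for illustration:

```python
import math

def running_stats(iterable):
    """Return (count, mean, sample standard deviation) in a single pass."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in iterable:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the already-updated mean
    if n < 2:
        raise ValueError("need at least two values")
    return n, mean, math.sqrt(m2 / (n - 1))

# Works on a generator, so the data is only traversed once.
n, mean, sd = running_stats(x for x in [1, 2, 3, 4, 5, 6])
print(n, mean, sd)
```

Because it never subtracts two large, nearly equal sums, this form is much less prone to the cancellation errors that the sum-of-squares shortcut suffers from when the values are large relative to their spread.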

Oct 24 '06 #8
import array
a = array.array('f', [1,2,3])

print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?

I quickly fished this out of my functions toolbox. There have got to be
faster functions in scipy, though.

Frederic

(Disclaimer: If you build an airliner or an ocean liner with this and
the wings fall off at thirty thousand feet or it turns upside down in
the middle of an ocean, respectively of course, I expect a bunch of
contingency lawyers lining up at my door wanting to sue you on my behalf.)
import math

def standard_deviation(values):
    """
    Takes a sequence and returns mean, variance and standard deviation.
    Non-values (None) are skipped.
    """
    _sum_values_squared = _sum_values = 0.0
    item_count = 0
    for value in values:
        if value is not None:
            _sum_values += value
            _sum_values_squared += value * value
            item_count += 1

    if item_count < 2:  # too few values left after skipping all Nones
        return None, None, None

    mean = _sum_values / item_count

    # Sample variance: divides by item_count - 1
    variance = ((_sum_values_squared - item_count * mean * mean)
                / (item_count - 1))

    # Rounding errors can cause minute negative values which would crash
    # the sqrt
    if variance < 0.0:
        variance = 0.0

    standard_deviation = math.sqrt(variance)

    return mean, variance, standard_deviation
Oct 25 '06 #9
