mean and std dev of an array?

import array
a = array.array('f', [1,2,3])

print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?

Oct 23 '06 #1
"SpreadTooThin" <bj********@gmail.com> writes:
> print a.mean()
> print a.std_dev()
>
> Is there a way to calculate the mean and standard deviation on array data?
Well, you could use numpy or whatever. If you want to calculate directly,
you could do something like (untested):

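from math import sqrt

n = len(a)
mean = sum(a) / float(n)  # float() guards against integer division on int data
sd = sqrt(sum((x-mean)**2 for x in a) / n)  # population form: divides by n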
Oct 23 '06 #2
The simplest way I see is to do it manually.

If your array data is numeric:

mean = sum(a)/len(a)

As for the standard deviation, it depends on the nature of your data.
Check out http://en.wikipedia.org/wiki/Standard_deviation for info on
that. All in all, a for loop with a few calculations inside should be
enough.
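
For what it's worth, the for-loop approach described here could look roughly like the sketch below. It assumes Python 3 and the population form of the standard deviation (dividing by n); the function name mean_and_stdev is made up for illustration.

```python
import array
import math


def mean_and_stdev(values):
    """Return (mean, population standard deviation) using plain for loops."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")

    # First pass: accumulate the total to get the mean.
    total = 0.0
    for x in values:
        total += x
    mean = total / n

    # Second pass: accumulate squared deviations from the mean.
    squared_dev = 0.0
    for x in values:
        squared_dev += (x - mean) ** 2

    return mean, math.sqrt(squared_dev / n)


a = array.array('f', [1, 2, 3])
m, sd = mean_and_stdev(a)
print(m, sd)
```

Dividing by n - 1 instead of n in the last step would give the sample form.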

I know of no "standard" way to do this in Python.

I have not tested the code above, but it should work.

Éric

SpreadTooThin wrote:
import array
a = array.array('f', [1,2,3])

print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?

Oct 23 '06 #3
import array
a = array.array('f', [1,2,3])
print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?
No, you don't have to...though there are likely some stats
modules floating around. However, they're pretty simple to
implement:
>>> import array
>>> import math
>>> a = array.array('f', [1,2,3])
>>> mean = lambda a: sum(a)/len(a)
>>> mean(a)
2.0
>>> a
array('f', [1.0, 2.0, 3.0])
>>> def stdev(a):
...     m = mean(a)
...     return math.sqrt(sum((x-m) ** 2 for x in a) / len(a))
...
>>> stdev(a)
0.81649658092772603
Pretty much a no-brainer implementation of the algorithm
described at http://en.wikipedia.org/wiki/Standard_deviation
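
On the "stats modules floating around" point: Python 3.4 and later ship a statistics module in the standard library, which postdates this thread. A minimal sketch, assuming Python 3; pstdev is the population form (divide by n) that matches the stdev() session above, and stdev is the sample form (divide by n - 1).

```python
import array
import statistics

a = array.array('f', [1, 2, 3])

mean = statistics.mean(a)        # arithmetic mean
pop_sd = statistics.pstdev(a)    # population standard deviation (divides by n)
sample_sd = statistics.stdev(a)  # sample standard deviation (divides by n - 1)

print(mean, pop_sd, sample_sd)
```

Both functions accept any sequence of numbers, so the array.array does not need converting first.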

-tkc

Oct 23 '06 #4

Paul Rubin wrote:
> "SpreadTooThin" <bj********@gmail.com> writes:
>> print a.mean()
>> print a.std_dev()
>>
>> Is there a way to calculate the mean and standard deviation on array data?
>
> Well, you could use numpy or whatever. If you want to calculate directly,
> you could do something like (untested):
>
> n = len(a)
> mean = sum(a) / n
> sd = sqrt(sum((x-mean)**2 for x in a) / n)
This last line looks like it will be very slow, especially if the
array is large. So speed is the real issue here.
Is there a faster way, like transferring the array to a different
container class? If so, which one?

Oct 24 '06 #5
> n = len(a)
> mean = sum(a) / n
> sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
> If there is a faster way... like transferring the array to a
> different container class... but what?

Perhaps:

>>> import scipy
>>> print scipy.mean([1,2,3,4,5,6])
3.5
>>> print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Oct 24 '06 #6
<sk**@pobox.com> wrote in message
news:ma***************************************@python.org...
>
>n = len(a)
>mean = sum(a) / n
>sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
>If there is a faster way... like transferring the array to a
>different container class... but what?

Perhaps:
>>import scipy
>>print scipy.mean([1,2,3,4,5,6])
3.5
>>print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Can scipy work with an iterator/generator? If you can only make one pass
through the data, you can try this:

import random
from functools import reduce
from math import sqrt

lst = (random.gauss(0,10) for i in range(1000))
# compute n, sum(x), and sum(x**2) with a single pass through the data
n,sumx,sumx2 = reduce(lambda a,b:(a[0]+b[0],a[1]+b[1],a[2]+b[2]),
                      ((1,x,x*x) for x in lst))
sd = sqrt( (sumx2 - (sumx*sumx/n))/(n-1) )

-- Paul
Oct 24 '06 #7
Paul McGuire wrote:
<sk**@pobox.com> wrote in message
news:ma***************************************@python.org...
> >n = len(a)
>mean = sum(a) / n
>sd = sqrt(sum((x-mean)**2 for x in a) / n)
...
> >If there is a faster way... like transferring the array to a
>different container class... but what?

Perhaps:
> >>import scipy
>>print scipy.mean([1,2,3,4,5,6])
3.5
> >>print scipy.std([1,2,3,4,5,6])
1.87082869339

Skip
Note that those are also in numpy, now.
Can scipy work with an iterator/generator?
There is a fromiter() constructor for 1D arrays. The basic array() constructor
(which is used by the other functions like mean() and std() to create an array
from a non-array sequence) doesn't quite have enough information to consume
generic iterators. The mean() and std() algorithms operate on arrays, not
generic iterators.
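
A minimal sketch of the fromiter() route, assuming a current numpy is installed. The dtype has to be given explicitly because a generator carries no type information, and the generator gen is just illustrative data.

```python
import numpy as np

# A generator can only be consumed once, so materialize it as a 1D array first.
gen = (float(i) for i in range(1, 7))
arr = np.fromiter(gen, dtype=float)

mean = arr.mean()
pop_sd = arr.std()             # ddof=0 by default: divides by n
sample_sd = arr.std(ddof=1)    # divides by n - 1

print(mean, pop_sd, sample_sd)
```

For this data the n - 1 form works out to the 1.87082869339 figure quoted above, while numpy's default std() divides by n and gives a smaller value.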
If you can only make one pass
through the data, you can try this:

lst = (random.gauss(0,10) for i in range(1000))
# compute n, sum(x), and sum(x**2) with single pass through list
n,sumx,sumx2 = reduce(lambda a,b:(a[0]+b[0],a[1]+b[1],a[2]+b[2]), ((1,x,x*x)
for x in lst) )
sd = sqrt( (sumx2 - (sumx*sumx/n))/(n-1) )
Don't do that. If you *must* restrict yourself to one pass, there are better
algorithms.

http://en.wikipedia.org/wiki/Algorit...ating_variance
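
One such algorithm is usually credited to Welford: it keeps a running mean and a running sum of squared deviations, which avoids subtracting two large, nearly equal numbers. A rough single-pass sketch, assuming Python 3 and the sample form (divide by n - 1) used in the quoted code; the function name running_mean_sd is hypothetical.

```python
import math
import random


def running_mean_sd(iterable):
    """Single-pass mean and sample standard deviation (Welford's method)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in iterable:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # second factor uses the updated mean
    if n < 2:
        raise ValueError("need at least two values")
    return mean, math.sqrt(m2 / (n - 1))


# Works on a generator, which can only be traversed once.
data = (random.gauss(0, 10) for _ in range(1000))
mean, sd = running_mean_sd(data)
print(mean, sd)
```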

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Oct 24 '06 #8
SpreadTooThin wrote:
import array
a = array.array('f', [1,2,3])

print a.mean()
print a.std_dev()

Is there a way to calculate the mean and standard deviation on array
data?

Do I need to import it into a Numeric Array to do this?

I quickly fished this out of my functions toolbox. There have got to be
faster functions in scipy, though.

Frederic

(Disclaimer: If you build an airliner or an ocean liner with this and
the wings fall off at thirty thousand feet or it turns upside down in
the middle of an ocean, respectively of course, I expect a bunch of
contingency lawyers lining up at my door wanting to sue you on my behalf.)
def standard_deviation (values):

    """
    Takes a sequence and returns mean, variance and standard deviation.
    Non-values (None) are skipped

    """

    import math

    mean = _sum_values_squared = _sum_values = 0.0

    l = len (values)
    i = 0
    item_count = 0
    while i < l:
        value = values [i]
        if value is not None:
            _sum_values += value
            _sum_values_squared += value * value
            item_count += 1
        i += 1

    if item_count < 2: # fewer than two values left after skipping Nones
        return None, None, None

    mean = _sum_values / item_count

    variance = ((_sum_values_squared - item_count * mean * mean) /
                (item_count - 1))

    # Rounding errors can cause minute negative values which would crash
    # the sqrt
    if variance < 0.0: variance = 0.0

    standard_deviation = math.sqrt (variance)

    return mean, variance, standard_deviation
Oct 25 '06 #9
