# time series calculation in list comprehension?

Is there a way I can do time series calculation, such as a moving
average in list comprehension syntax? I'm new to python but it looks
like list comprehension's 'head' can only work at a value at a time. I
also tried using the reduce function and passed in my list and another
function which calculates a moving average outside the list comp. ...
but I'm not clear how to do it. Any ideas? Thanks.

Mar 10 '06 #1
falcon wrote:
> Is there a way I can do time series calculation, such as a moving
> average in list comprehension syntax? I'm new to python but it looks
> like list comprehension's 'head' can only work at a value at a time. I
> also tried using the reduce function and passed in my list and another
> function which calculates a moving average outside the list comp. ...
> but I'm not clear how to do it. Any ideas? Thanks.

I suggest that statistical data, including time series, be stored and
processed in arrays, such as the one found in NumPy. You can compute
averages using the "sum" function and array slices.

Mar 10 '06 #2

"falcon" <sh******@gmail.com> wrote in message
> Is there a way I can do time series calculation, such as a moving
> average in list comprehension syntax? I'm new to python but it looks
> like list comprehension's 'head' can only work at a value at a time. I
> also tried using the reduce function and passed in my list and another
> function which calculates a moving average outside the list comp. ...
> but I'm not clear how to do it. Any ideas? Thanks.

Write explicit for loops, possibly with nested if conditionals, that do
exactly what you want. The functional expressions are abbreviations for
certain patterns of induction. Except as an educational exercise, I do not
think it worthwhile to go through contortions to force fit a problem to a
pattern it does not really fit.

Terry Jan Reedy
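
As an illustration of the explicit-loop advice above, here is a minimal sketch of a moving average written with plain for loops. The function name `moving_average_loop` and the sample data are made up for this example, and the code assumes Python 3 (where `/` is true division); it is one possible reading of the suggestion, not code from the poster.

```python
def moving_average_loop(values, n):
    """Return the average of every run of n consecutive values."""
    if n <= 0:
        raise ValueError("n must be positive")
    averages = []
    # One window per valid starting index; too-short input gives no windows.
    for start in range(len(values) - n + 1):
        total = 0.0
        for value in values[start:start + n]:
            total += value
        averages.append(total / n)
    return averages


print(moving_average_loop([4, 8, 15, 16, 23, 42], 3))
# prints [9.0, 13.0, 18.0, 27.0]
```

The nested loops make the cost visible: every window is summed from scratch, so the work grows with both the list length and the window size.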

Mar 10 '06 #3
falcon wrote:
> Is there a way I can do time series calculation, such as a moving
> average in list comprehension syntax? I'm new to python but it looks
> like list comprehension's 'head' can only work at a value at a time. I
> also tried using the reduce function and passed in my list and another
> function which calculates a moving average outside the list comp. ...
> but I'm not clear how to do it. Any ideas? Thanks.

I agree with others that reduce is not the best way to do this. But,
to satisfy your curiosity, I offer this horribly inefficient way to use
"reduce" to calculate the average of a list:

from __future__ import division

def reduceaverage(acc, x):
    # acc holds [running sum, item count, running average]
    return [acc[0] + x, acc[1] + 1, (acc[0] + x) / (acc[1] + 1)]

numbers = [4, 8, 15, 16, 23, 42]
print reduce(reduceaverage, numbers, [0, 0, 0])

...basically, the idea is to write a function that takes as its first
argument the accumulated values, and as its second argument the next
value in the list. In Python, this is almost always the wrong way to
do something, but it is kind of geeky and LISP-ish.
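
For readers on Python 3, the same accumulator idea can be sketched as follows. This assumes Python 3, where `reduce` lives in `functools`; the name `running_mean_step` and the choice of a `(total, count)` pair as accumulator are illustrative, not part of the original post.

```python
from functools import reduce  # reduce is not a builtin on Python 3


def running_mean_step(acc, x):
    """Fold one more value into a (total, count) accumulator."""
    total, count = acc
    return (total + x, count + 1)


numbers = [4, 8, 15, 16, 23, 42]
total, count = reduce(running_mean_step, numbers, (0, 0))
print(total / count)
# prints 18.0
```

The first argument is always the accumulated state and the second is the next list item. Carrying only the sum and the count, and dividing once at the end, avoids recomputing the average at every step.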

Mar 10 '06 #4
"falcon" <sh******@gmail.com> writes:
> Is there a way I can do time series calculation, such as a moving
> average in list comprehension syntax? I'm new to python but it looks
> like list comprehension's 'head' can only work at a value at a time. I
> also tried using the reduce function and passed in my list and another
> function which calculates a moving average outside the list comp. ...
> but I'm not clear how to do it. Any ideas? Thanks.

Do you mean something like this?

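# the upper bound is len(ts) + 1 so that the final window,
# which ends at the last element, is included
for i in xrange(5, len(ts) + 1):
    # compute and print the moving average of ts[i-5] .. ts[i-1]
    print i, sum(ts[i-5:i]) / 5.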
Mar 10 '06 #5
Well, you could iterate over an index into the list:

from __future__ import division

def moving_average(sequence, n):
    return [sum(sequence[i:i+n])/n
            for i in xrange(len(sequence)-n+1)]

Of course, that's hardly efficient. You really want to use the value
calculated for the i_th term in the (i+1)th term's evaluation. While
it's not easy (or pretty) to store state between iterations in a list
comprehension, this is the perfect use for a generator:

def generator_to_list(f):
    # decorator: collect everything the generator yields into a list
    return lambda *args, **keywords: list(f(*args, **keywords))

@generator_to_list
def moving_average(sequence, n):
    assert len(sequence) >= n and n > 0
    average = sum(sequence[:n]) / n
    yield average
    for i in xrange(1, len(sequence)-n+1):
        # slide the window: add the entering item, drop the leaving one
        average += (sequence[i+n-1] - sequence[i-1]) / n
        yield average

Mar 10 '06 #6
Lonnie Princehouse wrote:
> You really want to use the value calculated for the i_th term in the
> (i+1)th term's evaluation.
It may sometimes be necessary to recalculate the average for every iteration
to avoid error accumulation. Another tradeoff with your optimization is
that it becomes harder to switch the accumulation function from average to
max, say.
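
The tradeoff described above can be sketched like this: if each window is handed whole to a function, changing the aggregate from average to max is a one-argument change, at the cost of redoing the work for every window. The names `windows` and `moving` are hypothetical helpers invented for this sketch, and the code assumes Python 3.

```python
from collections import deque
from itertools import islice


def windows(items, n):
    """Yield each run of n consecutive items as a tuple."""
    it = iter(items)
    w = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        w.append(item)  # maxlen=n discards the oldest item automatically
        yield tuple(w)


def moving(func, items, n):
    """Apply func to every window of n items and collect the results."""
    return [func(w) for w in windows(items, n)]


data = [4, 8, 15, 16, 23, 42]
print(moving(max, data, 3))
# prints [15, 16, 23, 42]
print(moving(lambda w: sum(w) / len(w), data, 3))
# prints [9.0, 13.0, 18.0, 27.0]
```

Because every result is computed from the window itself, rounding error from earlier steps is not carried forward, which is the other point raised in the paragraph above.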
> While it's not easy (or pretty) to store state between iterations in a
> list comprehension, this is the perfect use for a generator:
>
> def generator_to_list(f):
>     return lambda *args, **keywords: list(f(*args, **keywords))
>
> @generator_to_list
> def moving_average(sequence, n):
>     assert len(sequence) >= n and n > 0
>     average = sum(sequence[:n]) / n
>     yield average
>     for i in xrange(1, len(sequence)-n+1):
>         average += (sequence[i+n-1] - sequence[i-1]) / n
>         yield average

Here are two more that work with arbitrary iterables:

from __future__ import division

from itertools import islice, tee, izip
from collections import deque

def window(items, n):
    it = iter(items)
    w = deque(islice(it, n-1))
    for item in it:
        w.append(item)
        yield w  # for a robust implementation:
                 # yield tuple(w)
        w.popleft()

def moving_average1(items, n):
    return (sum(w)/n for w in window(items, n))

def moving_average2(items, n):
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n-1))
    for first, last in izip(first_items, last_items):
        accu += last
        yield accu/n
        accu -= first

While moving_average1() is even slower than your inefficient variant,
moving_average2() seems to be a tad faster than the efficient one.
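
The code in this thread targets Python 2 (`izip`, `xrange`, the `print` statement). As a rough sketch of how `moving_average2()` might look on Python 3, assuming only that `izip` becomes the builtin `zip` and that `/` is already true division, consider the following. The name `moving_average2_py3` is made up here to keep it distinct from the original.

```python
from itertools import islice, tee


def moving_average2_py3(items, n):
    """Lazily yield the average of each run of n consecutive items."""
    # Two independent iterators over the same data: last_items runs
    # n-1 items ahead of first_items.
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n - 1))
    for first, last in zip(first_items, last_items):
        accu += last      # item entering the window
        yield accu / n
        accu -= first     # item leaving the window


print(list(moving_average2_py3([4, 8, 15, 16, 23, 42], 3)))
# prints [9.0, 13.0, 18.0, 27.0]
```

Since it only calls `iter`-style operations on its input, it also accepts one-shot iterables such as generators.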

Peter

Mar 11 '06 #7
In article <11**********************@v46g2000cwv.googlegroups.com>,
falcon <sh******@gmail.com> wrote:
> Is there a way I can do time series calculation, such as a moving
> average in list comprehension syntax? I'm new to python but it looks
> like list comprehension's 'head' can only work at a value at a time. I
> also tried using the reduce function and passed in my list and another
> function which calculates a moving average outside the list comp. ...
> but I'm not clear how to do it. Any ideas? Thanks.

I used the following to return an array of the average of the last n
values. It's not particularly pretty, but it works:

import random

# set number of values to average (4 matches the sample output below)
weighting = 4

# an array of values we want to calculate a running average on
ratings = []
# an array of running averages
running_avg = []

# some routine to fill ratings with the values
r = random.Random()
for i in range(0, 20):
    ratings.append(float(r.randint(0, 99)))

for i in range(1, 1 + len(ratings)):
    if i < weighting:
        # not enough history yet: use the raw value
        running_avg.append(ratings[i - 1])
    else:
        running_avg.append(reduce(lambda s, a: s + a,
                                  ratings[i - weighting : i]) /
                           len(ratings[i - weighting : i]))

for i in range(0, len(ratings)):
    print "%3d: %3d %5.2f" % (i, ratings[i], running_avg[i])

Sample output:
0: 34 34.00
1: 28 28.00
2: 58 58.00
3: 16 34.00
4: 74 44.00
5: 32 45.00
6: 74 49.00
7: 21 50.25
8: 78 51.25
9: 28 50.25
10: 32 39.75
11: 93 57.75
12: 2 38.75
13: 7 33.50
14: 8 27.50
15: 30 11.75
16: 1 11.50
17: 8 11.75
18: 40 19.75
19: 8 14.25

For all but the first 3 rows, the third column is the average of the
values in the 2nd column for this and the preceding 3 rows.

--
Jim Segrave (je*@jes-2.demon.nl)

Mar 12 '06 #8
[Peter Otten]
> from __future__ import division
>
> from itertools import islice, tee, izip
> . . .
> def moving_average2(items, n):
>     first_items, last_items = tee(items)
>     accu = sum(islice(last_items, n-1))
>     for first, last in izip(first_items, last_items):
>         accu += last
>         yield accu/n
>         accu -= first
>
> While moving_average1() is even slower than your inefficient variant,
> moving_average2() seems to be a tad faster than the efficient one.

This is nicely done and scales up well. Given an n-average of m items,
it has O(n) memory consumption and O(m) running time. In contrast, the
other variants do more work than necessary by pulling the whole
sequence into memory or by re-summing all n items at every step,
resulting in O(m) memory consumption and O(m*n) running time.
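
To make the memory and running-time claim above concrete, here is a separate sketch (not the recipe under discussion) that keeps at most n + 1 items in a deque and does one addition and one subtraction per input item. Running it on an endless stream shows that it never needs the whole sequence in memory. The name `running_average` is hypothetical and the code assumes Python 3.

```python
from collections import deque
from itertools import count, islice


def running_average(items, n):
    """Yield n-item moving averages while storing at most n + 1 items."""
    window = deque()
    total = 0
    for item in items:
        window.append(item)
        total += item
        if len(window) > n:
            total -= window.popleft()  # forget the item that fell out
        if len(window) == n:
            yield total / n


# count(1) never ends, so this only works because evaluation is lazy.
print(list(islice(running_average(count(1), 3), 4)))
# prints [2.0, 3.0, 4.0, 5.0]
```

A variant that slices a list or re-sums each window could not be used this way, since it needs the full sequence up front or repeats n additions per step.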

This recipe gets my vote for the best solution.
Raymond

Mar 12 '06 #9
Wow, thanks for all the responses. Very helpful!

Mar 13 '06 #10
