472,139 Members | 1,729 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,139 software developers and data experts.

Problem calculating moving average

Firstly, sorry for the long post, but I've included a fair bit of
sample data.

Im doing a comparision of 10yr Bond prices and CPI adjustments, with
an 18 week moving average of the CPI. I'm using a nested sub-query to
calculate the moving average, but I'm having difficulty selecting
exactly 18 data points (ie When I include the 'HAVING COUNT(C1.Closes)
= 18' line, I get no results).

Can anyone help?
-- Some sample data:
CREATE TABLE Bond10 (
Closes [datetime] NOT NULL ,
Prices [smallmoney] NOT NULL ,
)
INSERT INTO Bond10
SELECT '1994-01-14', 6.57 UNION
SELECT '1994-01-21', 6.53 UNION
SELECT '1994-01-28', 6.44 UNION
SELECT '1994-02-04', 6.51 UNION
SELECT '1994-02-11', 6.54 UNION
SELECT '1994-02-18', 6.89 UNION
SELECT '1994-02-25', 7.18 UNION
SELECT '1994-03-04', 7.43 UNION
SELECT '1994-03-11', 7.43 UNION
SELECT '1994-03-18', 7.44 UNION
SELECT '1994-03-25', 7.66 UNION
SELECT '1994-04-01', 7.96 UNION
SELECT '1994-04-08', 8.07 UNION
SELECT '1994-04-15', 8.24 UNION
SELECT '1994-04-22', 8.23 UNION
SELECT '1994-04-29', 8.45 UNION
SELECT '1994-05-06', 8.82 UNION
SELECT '1994-05-13', 8.86 UNION
SELECT '1994-05-20', 8.44 UNION
SELECT '1994-05-27', 8.75 UNION
SELECT '1994-06-03', 8.79 UNION
SELECT '1994-06-10', 8.77 UNION
SELECT '1994-06-17', 9.24 UNION
SELECT '1994-06-24', 9.63 UNION
SELECT '1994-07-01', 9.66 UNION
SELECT '1994-07-08', 9.59 UNION
SELECT '1994-07-15', 9.41 UNION
SELECT '1994-07-22', 9.56 UNION
SELECT '1994-07-29', 9.58 UNION
SELECT '1994-08-05', 9.31
CREATE TABLE AvgCPI (
Closes [datetime] NOT NULL ,
AvgCPI [smallmoney] NOT NULL ,
)
INSERT INTO AvgCPI
SELECT '1994-01-14', 2.04 UNION
SELECT '1994-01-21', 2.04 UNION
SELECT '1994-01-28', 2.04 UNION
SELECT '1994-02-04', 2.04 UNION
SELECT '1994-02-11', 2.04 UNION
SELECT '1994-02-18', 2.04 UNION
SELECT '1994-02-25', 2.04 UNION
SELECT '1994-03-04', 1.51 UNION
SELECT '1994-03-11', 1.51 UNION
SELECT '1994-03-18', 1.51 UNION
SELECT '1994-03-25', 1.51 UNION
SELECT '1994-04-01', 1.51 UNION
SELECT '1994-04-08', 1.51 UNION
SELECT '1994-04-15', 1.51 UNION
SELECT '1994-04-22', 1.51 UNION
SELECT '1994-04-29', 1.51 UNION
SELECT '1994-05-06', 1.51 UNION
SELECT '1994-05-13', 1.51 UNION
SELECT '1994-05-20', 1.51 UNION
SELECT '1994-05-27', 1.51 UNION
SELECT '1994-06-03', 1.80 UNION
SELECT '1994-06-10', 1.80 UNION
SELECT '1994-06-17', 1.80 UNION
SELECT '1994-06-24', 1.80 UNION
SELECT '1994-07-01', 1.80 UNION
SELECT '1994-07-08', 1.80 UNION
SELECT '1994-07-15', 1.80 UNION
SELECT '1994-07-22', 1.80 UNION
SELECT '1994-07-29', 1.80 UNION
SELECT '1994-08-05', 1.80

-- My query so far:
SELECT A1.Closes, A1.Prices, B1.AvgCPI, SUM(C1.AvgCPI) AS MovSumCPI,
AVG(C1.AvgCPI) AS MovAvgCPI, COUNT(C1.AvgCPI) AS Counter
FROM (
SELECT Closes, Prices FROM Bond10
) A1
LEFT JOIN (
SELECT Closes, AvgCPI FROM AvgCPI
) B1 ON A1.Closes = B1.Closes
LEFT JOIN (
SELECT Closes, AvgCPI FROM AvgCPI
) C1 ON C1.Closes >= A1.Closes AND DATEADD(Week,-18,C1.Closes) <
A1.Closes
GROUP BY A1.Closes, A1.Prices, B1.AvgCPI, C1.AvgCPI
-- HAVING COUNT(C1.Closes) = 18
ORDER BY A1.Closes

DROP TABLE Bond10
DROP TABLE AvgCPI
Expected Results

Closes Bon10 AvCPI MovAvg
========== ==== ==== ====
14-Jan-94 6.57 2.04
14-Jan-94 6.57 2.04
21-Jan-94 6.53 2.04
21-Jan-94 6.53 2.04
28-Jan-94 6.44 2.04
28-Jan-94 6.44 2.04
4-Feb-94 6.51 2.04
4-Feb-94 6.51 2.04
4-Feb-94 6.51 2.04
11-Feb-94 6.54 2.04
11-Feb-94 6.54 2.04
11-Feb-94 6.54 2.04
18-Feb-94 6.89 2.04
18-Feb-94 6.89 2.04
18-Feb-94 6.89 2.04
25-Feb-94 7.18 2.04
25-Feb-94 7.18 2.04
25-Feb-94 7.18 2.04 2.04
4-Mar-94 7.43 1.51 2.01
4-Mar-94 7.43 1.51 1.98
11-Mar-94 7.43 1.51 1.95
11-Mar-94 7.43 1.51 1.92
18-Mar-94 7.44 1.51 1.89
18-Mar-94 7.44 1.51 1.86
25-Mar-94 7.66 1.51 1.83
25-Mar-94 7.66 1.51 1.80
1-Apr-94 7.96 1.51 1.78
1-Apr-94 7.96 1.51 1.75
8-Apr-94 8.07 1.51 1.72
8-Apr-94 8.07 1.51 1.69
15-Apr-94 8.24 1.51 1.66
15-Apr-94 8.24 1.51 1.63
22-Apr-94 8.23 1.51 1.60
22-Apr-94 8.23 1.51 1.57
29-Apr-94 8.45 1.51 1.54
29-Apr-94 8.45 1.51 1.51
6-May-94 8.82 1.51 1.51
6-May-94 8.82 1.51 1.51
13-May-94 8.86 1.51 1.51
13-May-94 8.86 1.51 1.51
20-May-94 8.44 1.51 1.51
20-May-94 8.44 1.51 1.51
27-May-94 8.75 1.51 1.51
27-May-94 8.75 1.51 1.51
3-Jun-94 8.79 1.8 1.53
10-Jun-94 8.77 1.8 1.54
17-Jun-94 9.24 1.8 1.56
24-Jun-94 9.63 1.8 1.57
1-Jul-94 9.66 1.8 1.59
8-Jul-94 9.59 1.8 1.61
15-Jul-94 9.41 1.8 1.62
22-Jul-94 9.56 1.8 1.64
29-Jul-94 9.58 1.8 1.66
5-Aug-94 9.31 1.8 1.67

Thanks,

Stephen
Jul 20 '05 #1
6 9371
I'm not sure I understand your expected result. Why do you want to see
multiple rows for each week? How are you calculating the MovAvg column? How
can you have a 18 week MovAvg when you don't have 18 weeks of data until
1994-05-13?

Here's how I would have expected to do it:

SELECT B.Closes, B.Prices,
MIN(CASE A.closes WHEN B.closes THEN A.AvgCPI END) AS AvgCPI,
CASE MIN(A.closes) WHEN DATEADD(DAY,-119,B.closes)
THEN AVG(A.AvgCPI) END AS MovAvgCPI,
COUNT(A.AvgCPI) AS Counter
FROM Bond10 AS B
LEFT JOIN AvgCPI AS A
ON A.Closes BETWEEN DATEADD(DAY,-119,B.closes) AND B.Closes
GROUP BY B.Closes, B.Prices
ORDER BY B.Closes

Rather different to what you asked for but maybe it helps.

--
David Portas
------------
Please reply only to the newsgroup
--
Jul 20 '05 #2
[posted and mailed, please reply in news]

Stephen Miller (js******@hotmail.com) writes:
Im doing a comparision of 10yr Bond prices and CPI adjustments, with
an 18 week moving average of the CPI. I'm using a nested sub-query to
calculate the moving average, but I'm having difficulty selecting
exactly 18 data points (ie When I include the 'HAVING COUNT(C1.Closes)
= 18' line, I get no results).


I would not say that completly understand the logic, as it is a bit
complex. But one thing caught my eye in the query:

GROUP BY A1.Closes, A1.Prices, B1.AvgCPI, C1.AvgCPI

That C1.AvgCPI seems out of place. When I remove it from the GROUP
BY, the Counter gets the value 18 for all the early rows. It does
not really look like your expected results. Then again, the
expected results looked very similar to the results of the query
you posted.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3
Like the other reviewers, I was unsure why you had multiple rows for a
given date and I was uncertain as to the what data was to be included
in the average.

The code below will return the average CPI for the previous 18 weeks
(including the current week). This handles cases where there are less
than 18 weeks of data and will be handle missing weeks and weeks that
have more than one entry.
select
b.closes
,b.prices
,a.AvgCPI
,(select sum(a2.AvgCPI)/count(a2.avgCPI) from AvgCPI a2 where
a2.closes <= a.closes AND A2.closes >= dateadd(week, -18,
a.closes))
from
bond10 b
,AvgCPI a
where
a.closes = b.closes
I note that this gives different results to the other posting.
Jul 20 '05 #4
"David Portas" <RE****************************@acm.org> wrote in message news:<IO********************@giganews.com>...
I'm not sure I understand your expected result. Why do you want to see
multiple rows for each week? How are you calculating the MovAvg column? How
can you have a 18 week MovAvg when you don't have 18 weeks of data until
1994-05-13?

Here's how I would have expected to do it:

SELECT B.Closes, B.Prices,
MIN(CASE A.closes WHEN B.closes THEN A.AvgCPI END) AS AvgCPI,
CASE MIN(A.closes) WHEN DATEADD(DAY,-119,B.closes)
THEN AVG(A.AvgCPI) END AS MovAvgCPI,
COUNT(A.AvgCPI) AS Counter
FROM Bond10 AS B
LEFT JOIN AvgCPI AS A
ON A.Closes BETWEEN DATEADD(DAY,-119,B.closes) AND B.Closes
GROUP BY B.Closes, B.Prices
ORDER BY B.Closes

Rather different to what you asked for but maybe it helps.


David,

I must apologise! I'm not sure how it happened (although I do recall
having a dozen spreadsheets open at the time), but I posted the wrong
result set.

Your analysis however was spot on! The results I was trying to archive
(and your query correctly returns) are:
Date 10BOND CPI MovAvg
========== ====== ====== ======
1994-01-14 6.5700 2.0400 NULL
1994-01-21 6.5300 2.0400 NULL
1994-01-28 6.4400 2.0400 NULL
1994-02-04 6.5100 2.0400 NULL
1994-02-11 6.5400 2.0400 NULL
1994-02-18 6.8900 2.0400 NULL
1994-02-25 7.1800 2.0400 NULL
1994-03-04 7.4300 1.5100 NULL
1994-03-11 7.4300 1.5100 NULL
1994-03-18 7.4400 1.5100 NULL
1994-03-25 7.6600 1.5100 NULL
1994-04-01 7.9600 1.5100 NULL
1994-04-08 8.0700 1.5100 NULL
1994-04-15 8.2400 1.5100 NULL
1994-04-22 8.2300 1.5100 NULL
1994-04-29 8.4500 1.5100 NULL
1994-05-06 8.8200 1.5100 NULL
1994-05-13 8.8600 1.5100 1.7161
1994-05-20 8.4400 1.5100 1.6866
1994-05-27 8.7500 1.5100 1.6572
1994-06-03 8.7900 1.8000 1.6438
1994-06-10 8.7700 1.8000 1.6305
1994-06-17 9.2400 1.8000 1.6172
1994-06-24 9.6300 1.8000 1.6038
1994-07-01 9.6600 1.8000 1.5905
1994-07-08 9.5900 1.8000 1.6066
1994-07-15 9.4100 1.8000 1.6227
1994-07-22 9.5600 1.8000 1.6388
1994-07-29 9.5800 1.8000 1.6550
1994-08-05 9.3100 1.8000 1.6711

Thanks and again, I apologise for posting the wrong information.

Regards,

Stephen
Jul 20 '05 #5
Stephen,

This isn't going to be of immediate help because SQL Server currently
doesn't support the syntax, but I'm posting it anyway to ask if anyone
knows if it will be available in the Yukon release. I ran this in DB2.
The syntax is described as relational OLAP or Windowing functionality
and is specified in the SQL-2003 draft. The following is a link:

http://www.jtc1sc32.org/sc32/jtc1sc3...f?OpenDocument

You can achieve a similar result by using a select statement within
the select clause (as has already been posted), but over large result
sets this performs very poorly.

I got slightly different rounding results, but it's basically as you
posted...

Displaying result for:
---------------------
select "Date"
,"10Bond"
,"CPI"
, CASE
WHEN "RowNum" < 18 then
NULL
else
"MovAvg"
end as "MovAvg"
from (
select ROW_NUMBER() OVER (ORDER BY b.closes) AS "RowNum"
,b.CLOSES as "Date"
,b.PRICES as "10Bond"
,a.AVGCPI as "CPI"
,DECIMAL (ROUND(avg(a.AvgCPI) over (order by b.Closes
rows between 17 preceding
and current row
), 4), 7, 4) as "MovAvg"
from bond10 b
, avgCPI a
where b.closes = a.closes
) as t

Date 10Bond CPI MovAvg
---------- --------- --------- ---------
1994-01-14 6.5700 2.0400 NULL
1994-01-21 6.5300 2.0400 NULL
1994-01-28 6.4400 2.0400 NULL
1994-02-04 6.5100 2.0400 NULL
1994-02-11 6.5400 2.0400 NULL
1994-02-18 6.8900 2.0400 NULL
1994-02-25 7.1800 2.0400 NULL
1994-03-04 7.4300 1.5100 NULL
1994-03-11 7.4300 1.5100 NULL
1994-03-18 7.4400 1.5100 NULL
1994-03-25 7.6600 1.5100 NULL
1994-04-01 7.9600 1.5100 NULL
1994-04-08 8.0700 1.5100 NULL
1994-04-15 8.2400 1.5100 NULL
1994-04-22 8.2300 1.5100 NULL
1994-04-29 8.4500 1.5100 NULL
1994-05-06 8.8200 1.5100 NULL
1994-05-13 8.8600 1.5100 1.7161
1994-05-20 8.4400 1.5100 1.6867
1994-05-27 8.7500 1.5100 1.6572
1994-06-03 8.7900 1.8000 1.6439
1994-06-10 8.7700 1.8000 1.6306
1994-06-17 9.2400 1.8000 1.6172
1994-06-24 9.6300 1.8000 1.6039
1994-07-01 9.6600 1.8000 1.5906
1994-07-08 9.5900 1.8000 1.6067
1994-07-15 9.4100 1.8000 1.6228
1994-07-22 9.5600 1.8000 1.6389
1994-07-29 9.5800 1.8000 1.6550
1994-08-05 9.3100 1.8000 1.6711

30 Row(s) affected

Christian.
Jul 20 '05 #6
Christian Maslen wrote:
I ran this in DB2.
The syntax is described as relational OLAP or Windowing
functionality and is specified in the SQL-2003 draft.
.
You can achieve a similar result by using a select
statement within the select clause (as has already been
posted), but over large result sets this performs very poorly


Very nice query and it's ashame sql99 functionality isn't
available in server.
But you can avoid the performance of hit due to the
number of comparisons based on the inequality operator
in either a subquery or join that is used as a general
solution to running sums in <=sql92.The RAC utility
computes running sums without resort to multiple comparisons.
Once runs are computed you can obtain moving runs/averages
by subtracting prior runs from the current run and thus
again eliminate the comparisons based on inequality operators.
This example using RAC duplicates your query.Unions are used
to more clearly show what is going on.Of course a single
Select with a case statement could also be used:).

Exec Rac
@transform='sum(AvgCPI) as CPI',
@rows='a.Closes(date) as [Date] & PRICES as [10Bond] ',
@pvtcol='Report Mode',
@from='##Bond10 as a inner join ##AvgCPI as b
on a.closes=b.closes',
@grand_totals='n',
@racheck='y',
-- Compute the runs of CPI over the dates.
@rowruns='(CPI)',
-- At this point CPI runs over dates have been computed
-- in the work table 'rac'.The 'rd' column is a Rac by product
-- that is a counter of the date rows.A union query produces the
-- moving averages.
@select=
'select a.rd,a.[Date],a.[10Bond],a.CPI,null as MovAvg
from rac as a
where rd<18
union
select a.rd,a.[Date],a.[10Bond],a.CPI,
cast( cast(a.runs as decimal(7,4))/18.0 as decimal(7,4))
from rac as a
where rd=18
union
/* Here a join is used to get a prior run value that is subtracted from
the current run to obtain the correct moving average for N (18) rows. */
select a.rd,a.[Date],a.[10Bond],a.CPI,
cast( (cast(a.runs as decimal(7,4)) - cast(b.runs as decimal(7,4)))/18.0
as decimal(7,4))
from rac as a inner join rac as b
on a.rd-18=b.rd
order by a.rd'

RAC v2.2 and QALite @
www.rac4sql.net


*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #7

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

3 posts views Thread by GregM | last post: by
3 posts views Thread by paeh | last post: by
1 post views Thread by Michel | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.