By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,149 Members | 1,016 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,149 IT Pros & Developers. It's quick & easy.

Aggregate query for multiple records

P: n/a
Hello, I am new to the list, my apology if this question is beyond the
scope or charter of this list.

My questions is:
What is the best method to perform an aggregate query to calculate
sum() values for each distinct wid as in the example below, but except
for all wid's (not just WHERE wid='01/1-6-1-30w1/0').

Also, performance wise, would it be better to build a function for this
query. The table has 9 million records and these aggregate queries
take hours.
SELECT
SUM(oil) as sumoil, SUM(hours) as sumhours,
FROM
(SELECT oil, hours prd_data WHERE wid='01/1-6-1-30w1/0'
ORDER BY date LIMIT 6) subtable
;
Table description:
Table "prd_data"
Column | Type | Modifiers
--------+-----------------------+-----------
date | integer |
hours | real |
oil | real |
gas | real |
water | real |
pwid | integer |
wid | character varying(20) |
year | smallint |
Indexes: wid_index6
Actual table (prd_data), 9 million records:

date | hours | oil | gas | water | pwid | wid | year
--------+-------+-------+------+-------+------+-----------------+------
196507 | 360 | 159.4 | 11.3 | 40.9 | 413 | 01/1-1-1-31w1/0 | 1965
196508 | 744 | 280 | 20 | 27.2 | 413 | 01/1-1-1-31w1/0 | 1965
196509 | 360 | 171.1 | 11.4 | 50.4 | 413 | 01/1-1-1-31w1/0 | 1965
196510 | 744 | 202.1 | 25 | 89.8 | 413 | 01/1-1-1-31w1/0 | 1965
196512 | 744 | 201.3 | 23.8 | 71.9 | 413 | 01/1-1-1-31w1/0 | 1965
196511 | 720 | 184 | 17.6 | 78.9 | 413 | 01/1-1-1-31w1/0 | 1965
196610 | 744 | 99.8 | 15.4 | 53.7 | 413 | 01/1-1-1-31w1/0 | 1966
196612 | 744 | 86 | 12.8 | 36.1 | 413 | 01/1-1-1-31w1/0 | 1966
196611 | 720 | 86 | 12.6 | 41.7 | 413 | 01/1-1-1-31w1/0 | 1966
196601 | 744 | 191.6 | 22.6 | 50.7 | 413 | 01/1-1-1-31w1/0 | 1966
200301 | 461 | 68.8 | 0 | 186.3 | 47899 | 9G/6-1-50-24w3/0 | 2003
200310 | 740 | 446.3 | 0 | 563.1 | 47899 | 9G/6-1-50-24w3/0 | 2003
200306 | 667 | 92.1 | 0 | 968.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
200304 | 0 | 0 | 0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
200308 | 457 | 100.7 | 0 | 82.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
200307 | 574 | 78 | 0 | 752 | 47899 | 9G/6-1-50-24w3/0 | 2003
200312 | 582 | 360.9 | 0 | 569 | 47899 | 9G/6-1-50-24w3/0 | 2003
200311 | 681 | 260.8 | 0 | 563.9 | 47899 | 9G/6-1-50-24w3/0 | 2003
200305 | 452 | 0 | 0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
200309 | 637 | 244.6 | 0 | 193.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
(20 rows)

Thanks,

--
Scott A. Gerhardt, P.Geo.
Gerhardt Information Technologies
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #1
Share this Question
Share on Google+
1 Reply


P: n/a

Scott Gerhardt <sc***@g-it.ca> writes:
Hello, I am new to the list, my apology if this question is beyond the scope or
charter of this list.
Not only is this on-charter, but this specific question comes up fairly often.
My questions is:
What is the best method to perform an aggregate query to calculate sum() values
for each distinct wid as in the example below, but except for all wid's (not
just WHERE wid='01/1-6-1-30w1/0').


This type of "top 6" or in this case "first 6" query is pretty tricky to do in
SQL. In fact the best solution anyone's proposed here uses non-standard
postgres extensions to define an aggregate that keeps an accumulation in an
array.

Something like (but I suppose you need reals, not integers):

test=> create or replace function first_6_accum (integer[], integer) returns integer[]
language sql immutable as
'select case when array_upper($1,1)>=6 then $1 else $1||$2 end';

test=> create function sum_6(integer[]) returns integer
immutable language sql as
'select $1[1]+$1[2]+$1[3]+$1[4]+$1[5]+$1[6]';

test=> create aggregate sum_first_6
(basetype=integer, sfunc=first_6_accum, stype=integer[],initcond='{}',finalfunc=sum_6);

test=> select sum_first_6(i) from (select i from (select 1 as i union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8) as x order by i asc) as x;
sum_first_6
-------------
21
(1 row)

You'll need to select from a subquery that guarantees the correct ordering.
And then you'll need to do a GROUP BY wid. And then you should be aware that
some versions of postgres didn't always use a sorting method for the group by
that guaranteed the ordering of the subquery was preserved. I think you're
safe in 7.4 but you would have to test it.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.