GROUP BY problem in conjunction with a SUM()

Rudolf Bargholz

Perhaps some kind soul could help me out with an SQL I have been trying all
day to get to work, where one colum is just not summing up the way I want it
to.

I have the following data

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

An item is part of a group. The items are sold in specified numbers and are
associated with persons.

The first result I want to acheive is list each item and the number of items
used by the number of persons associated with the item:

select
GRP_SEQ
,ITEM_SEQ
,sum(ITEM_NR)/count(ITEM_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by ITEM_SEQ,GRP_SEQ
order by 2,1

The result is

1 A 2 3
1 B 3 2
2 C 1 2

This works fine.

The next SQL is the one I am having problems with. I want to group the items
by their group and return the total number of items sold for that group of
items and count the number of persons that used the items of that group.
What I have come up with is the following:

select
GRP_SEQ
,???
,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

The result I am expecting to acheive is

1 5 5
2 1 2

but I am have not been able to work out how I can I can sum the ITEM_NR for
each item only once for each item. I could of course modify the SQL to

select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the costly
subselect. Somehow I think there is event something in the first SQL that I
am overseeing that would allow me to replace the
"sum(ITEM_NR)/count(ITEM_SEQ)" with something more elegant.

Perhaps some kind soul could help me out ....

Regards

Rudolf Bargholz

Feb 18 '07 #1

Subscribe Reply

9967

Knut Stolze

Rudolf Bargholz wrote:

Perhaps some kind soul could help me out with an SQL I have been trying
all day to get to work, where one colum is just not summing up the way I
want it to.

I have the following data

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

An item is part of a group. The items are sold in specified numbers and
are associated with persons.

The first result I want to acheive is list each item and the number of
items used by the number of persons associated with the item:

From your description of the query I don't see any reason to operate on
GRP_SEQ. Also, do you want to count the number of items and the number of
persons independently? Or do you want to calculate some sort of average,
i.e. 5 persons ordered (on average) 1.3 times item A?

select
GRP_SEQ
,ITEM_SEQ
,sum(ITEM_NR)/count(ITEM_SEQ)

I guess that NR_ITEMS is the same as ITEM_NR? If so, and if ITEM_SEQ can't
be NULL, then a simple AVG(ITEM_NR) would be fine here as well.

,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by ITEM_SEQ,GRP_SEQ
order by 2,1

Your query doesn't match at all with the schema and sample data you provided
above. What are the ITEM and GROUP tables?

The result is

1 A 2 3
1 B 3 2
2 C 1 2

This works fine.

The next SQL is the one I am having problems with. I want to group the
items
by their group and return the total number of items sold for that group
of items and count the number of persons that used the items of that
group. What I have come up with is the following:

Again, assuming that you want to have independent counts of persons and
items, how about this:

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

select
GRP_SEQ
,???
,count(PERSONS_SEQ)
from
GRP
inner join ITEM on ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=ITEM_SEQ
where
ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

The result I am expecting to acheive is

1 5 5
2 1 2

Why the "5" and "1" in the 2nd column?

but I am have not been able to work out how I can I can sum the ITEM_NR
for each item only once for each item. I could of course modify the SQL to

select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the
costly subselect.

Why do you think that a subselect is "costly"? The idea of SQL is to tell
the DBMS "what" you want and the DBMS figures out the best way on "how" to
get there. So it would be a good idea to first have a look at the
optimized plans and also to do some performance measurements before saying
something is expensive or not.

Anyway, the biggest problem I have is that the semantics of your queries are
not clear at all...

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Feb 18 '07 #2

Rudolf Bargholz

Hi Knut,

Again, assuming that you want to have independent counts of persons and
items, how about this:

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

Here the original data set:

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

If I group by the GRP_SEQ, the sum of the NR_ITEMS for the GROUP 1 would be
2 + 2 + 2 + 3 + 3 = 12. The result I am looking for is 2 + 3 = 3.

There were two items (A and B) of the GROUP 1 sold. The first item (A) is
associated with tthree persons (aa, bb, and cc). The second item (B) is
associated with two persons (dd and ee).

>

Why the "5" and "1" in the 2nd column?

Perhaps it was a little dumb having the columns NR_ITEMS and PERSONS add up
to the same number. A better example would be

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ
1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 1 dd
1 B 1 ee
2 C 1 ff
2 C 1 gg

The number of items sold for the GROUP 1 is 2 + 1 = 3
The number of persons that travelled using GROUP 1 is 5
The number of items sold for the GROUP 2 is 1
The number of persons that travelled using GROUP 1 is 2

Does this make the more sense?

Regards

Rudolf Bargholz

Feb 18 '07 #3

Rudolf Bargholz

Hi Knut,

>select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the
costly subselect.

Why do you think that a subselect is "costly"? The idea of SQL is to tell
the DBMS "what" you want and the DBMS figures out the best way on "how" to
get there. So it would be a good idea to first have a look at the
optimized plans and also to do some performance measurements before saying
something is expensive or not.

Anyway, the biggest problem I have is that the semantics of your queries
are
not clear at all...

The perormance issue is just a guess of mine. I was always under the
impression that databases handled sets and the grouping of sets better than
when using subselects in columns. It is just a matter of experience for me
that wherever I use selects in columns of an SQL the performance drops, and
when I rely on set functions the SQLs speed up.

My example is very clear to me who has been trying to find a solution to
this problem all day. To someone else, I now see it makes less sense.
Perhaps the following will help:

create table DB2ADMIN.GRP
(
GRP_SEQ INTEGER not null,
GRP_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.GRP(GRP_SEQ, GRP_NAME) values (1,'Hotel
Adlon'),(2,'Marriot');;

create table DB2ADMIN.ITEM
(
ITEM_SEQ INTEGER not null,
ITEM_GRPSEQ INTEGER not null,
ITEM_NAME VARCHAR(20) not null,
ITEM_NR INTEGER
);
insert into DB2ADMIN.ITEM(ITEM_SEQ, ITEM_GRPSEQ, ITEM_NAME, ITEM_NR)
values (1,1,'Triple',2),(2,1,'Double',1),(3,2,'Single',1) ;

create table DB2ADMIN.PERSON
(
PERSON_SEQ INTEGER not null,
PERSON_ITEMSEQ INTEGER not null,
PERSON_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.PERSON(PERSON_SEQ, PERSON_ITEMSEQ, PERSON_NAME)
values
(1,1,'Rudolf'),(2,1,'Sandra'),(3,1,'Alyssa'),(3,2, 'Hans'),(4,2,'Diana'),(5,3,'Eduard');

"Hotel Adlon" has 3 sold rooms with a total of 5 persons
"Mariott" has 1 sold room with a total of 1 person

Note, there were two tripple rooms sold to three persons.

The following SQL

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

will return the following incorect result:

1, 8, 5
2, 1, 1
The Hotel Adlon sold eight rooms for five persons.

What I am looking for is

1, 3, 5
2, 1, 1
The Hotel Adlon sold three rooms for five persons
Regards

Rudolf Bargholz

Feb 18 '07 #4

Knut Stolze

Rudolf Bargholz wrote:

Hi Knut,

>Again, assuming that you want to have independent counts of persons and
items, how about this:

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

Here the original data set:

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ

1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 3 dd
1 B 3 ee
2 C 1 ff
2 C 1 gg

If I group by the GRP_SEQ, the sum of the NR_ITEMS for the GROUP 1 would
be 2 + 2 + 2 + 3 + 3 = 12. The result I am looking for is 2 + 3 = 3.

2 + 3 = 5

There were two items (A and B) of the GROUP 1 sold. The first item (A) is
associated with tthree persons (aa, bb, and cc). The second item (B) is
associated with two persons (dd and ee).

Could you explain what exactly your query should return? Because I don't
know how the "2" and "3", which you want to add, come to pass. Do you
simply want to count the number of persons in each group? That's done by
the 3rd expression in the select list already.

>Why the "5" and "1" in the 2nd column?

Perhaps it was a little dumb having the columns NR_ITEMS and PERSONS add
up to the same number. A better example would be

GRP_SEQ ITEM_SEQ NR_ITEMS PERSONS_SEQ
1 A 2 aa
1 A 2 bb
1 A 2 cc
1 B 1 dd
1 B 1 ee
2 C 1 ff
2 C 1 gg

The number of items sold for the GROUP 1 is 2 + 1 = 3

How's that? I see that the item A in group 1 has been sold 3x 2 times. Or
is "NR_ITEMS" something like a room number?

The number of persons that travelled using GROUP 1 is 5

Ok, I can understand that.

The number of items sold for the GROUP 2 is 1
The number of persons that travelled using GROUP 1 is 2

Does this make the more sense?

No, not really... I guess the problem is that you are showing a joined table
here, whereas your base tables store the data in a different schema, right?

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Feb 18 '07 #5

Knut Stolze

Rudolf Bargholz wrote:

Hi Knut,

>select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the
costly subselect.

Why do you think that a subselect is "costly"? The idea of SQL is to
tell the DBMS "what" you want and the DBMS figures out the best way on
"how" to
get there. So it would be a good idea to first have a look at the
optimized plans and also to do some performance measurements before
saying something is expensive or not.

Anyway, the biggest problem I have is that the semantics of your queries
are
not clear at all...

The perormance issue is just a guess of mine. I was always under the
impression that databases handled sets and the grouping of sets better
than when using subselects in columns. It is just a matter of experience
for me that wherever I use selects in columns of an SQL the performance
drops, and when I rely on set functions the SQLs speed up.

Forget about performance. First you have to get the semantics of the query
right, and _then_ you can think about improving it.

My example is very clear to me who has been trying to find a solution to
this problem all day. To someone else, I now see it makes less sense.
Perhaps the following will help:

create table DB2ADMIN.GRP
(
GRP_SEQ INTEGER not null,
GRP_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.GRP(GRP_SEQ, GRP_NAME) values (1,'Hotel
Adlon'),(2,'Marriot');;

create table DB2ADMIN.ITEM
(
ITEM_SEQ INTEGER not null,
ITEM_GRPSEQ INTEGER not null,
ITEM_NAME VARCHAR(20) not null,
ITEM_NR INTEGER
);
insert into DB2ADMIN.ITEM(ITEM_SEQ, ITEM_GRPSEQ, ITEM_NAME, ITEM_NR)
values (1,1,'Triple',2),(2,1,'Double',1),(3,2,'Single',1) ;

create table DB2ADMIN.PERSON
(
PERSON_SEQ INTEGER not null,
PERSON_ITEMSEQ INTEGER not null,
PERSON_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.PERSON(PERSON_SEQ, PERSON_ITEMSEQ, PERSON_NAME)
values
(1,1,'Rudolf'),(2,1,'Sandra'),(3,1,'Alyssa'),(3,2, 'Hans'),(4,2,'Diana')

(5,3,'Eduard');
What are the primary keys on your tables? I thought it would be the _SEQ
columns, but those don't have unique values in the PERSON table.

Just a different, more compact form:

GRP:
====
GRP_SEQ GRP_NAME
1 Hotel Adlon
2 Marriot

ITEM:
=====
ITEM_SEQ ITEM_GRPSEQ ITEM_NAME ITEM_NR
1 1 Triple 2
2 1 Double 1
3 2 Single 1

PERSON:
=======
PERSON_SEQ PERSON_ITEMSEQ PERSON_NAME
1 1 Rudolf
2 1 Sandra
3 1 Alyssa
3 2 Hans
4 2 Diana
5 3 Eduard

I think you should use speaking names for your tables and columns, i.e.
something like ROOMS and "NUMBER_OF_ROOMS" instead of "ITEM" and "ITEM_NR".
Or what exactly is the meaning of the "ITEM_NR" column?

"Hotel Adlon" has 3 sold rooms with a total of 5 persons

How do you come to 3 rooms here? I see a "Triple" rooms and a "Double".
The triple is occupied by Rudolf, Sandra, and Alyssa, and the double room
is taken by Hans and Diana.

"Mariott" has 1 sold room with a total of 1 person

Ok, I get that.

The following SQL

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

will return the following incorect result:

1, 8, 5
2, 1, 1
The Hotel Adlon sold eight rooms for five persons.

What I am looking for is

1, 3, 5
2, 1, 1
The Hotel Adlon sold three rooms for five persons

As stated above, I can't figure out why there would be 3 rooms. But I
believe I get what you want to achieve. First, I have a question: do you
want to sum up something or do you just want to count the number of
persons/rooms? Again, the semantics are not clear. I assume you just want
to count the number of rooms and persons.

SELECT i.item_grpseq,
COUNT(DISTINCT i.item_seq) AS number_of_rooms,
COUNT(*) AS number_of_persons
FROM item AS i JOIN person AS p ON
( i.item_seq = p.person_itemseq )
GROUP BY i.item_grpseq

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Feb 18 '07 #6

Rudolf Bargholz

Hi Knut,

"Knut Stolze" <st****@de.ibm.comschrieb im Newsbeitrag
news:er**********@lc03.rz.uni-jena.de...

Rudolf Bargholz wrote:

>Hi Knut,

> >select
GRP_SEQ
,(select sum(i2.ITEM_NR) from ITEM i2 where i2.ITEM_GRPSEQ=GRP_SEQ)
,count(PERSONS_SEQ)
from
GRP
inner join ITEM i1 on i1.ITEM_GRPSEQ=GRP_SEQ
left outer join PERSONS on PERSONS_ITEMSEQ=i1.ITEM_SEQ
where
i1.ITEM_GRPSEQ=GRP_SEQ
group by GRP_SEQ
order by 1

but there must be some method to achieve the result without using the
costly subselect.

Why do you think that a subselect is "costly"? The idea of SQL is to
tell the DBMS "what" you want and the DBMS figures out the best way on
"how" to
get there. So it would be a good idea to first have a look at the
optimized plans and also to do some performance measurements before
saying something is expensive or not.

Anyway, the biggest problem I have is that the semantics of your queries
are
not clear at all...

The perormance issue is just a guess of mine. I was always under the
impression that databases handled sets and the grouping of sets better
than when using subselects in columns. It is just a matter of experience
for me that wherever I use selects in columns of an SQL the performance
drops, and when I rely on set functions the SQLs speed up.

Forget about performance. First you have to get the semantics of the
query
right, and _then_ you can think about improving it.

>My example is very clear to me who has been trying to find a solution to
this problem all day. To someone else, I now see it makes less sense.
Perhaps the following will help:

create table DB2ADMIN.GRP
(
GRP_SEQ INTEGER not null,
GRP_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.GRP(GRP_SEQ, GRP_NAME) values (1,'Hotel
Adlon'),(2,'Marriot');;

create table DB2ADMIN.ITEM
(
ITEM_SEQ INTEGER not null,
ITEM_GRPSEQ INTEGER not null,
ITEM_NAME VARCHAR(20) not null,
ITEM_NR INTEGER
);
insert into DB2ADMIN.ITEM(ITEM_SEQ, ITEM_GRPSEQ, ITEM_NAME, ITEM_NR)
values (1,1,'Triple',2),(2,1,'Double',1),(3,2,'Single',1) ;

create table DB2ADMIN.PERSON
(
PERSON_SEQ INTEGER not null,
PERSON_ITEMSEQ INTEGER not null,
PERSON_NAME VARCHAR(20) not null
);
insert into DB2ADMIN.PERSON(PERSON_SEQ, PERSON_ITEMSEQ, PERSON_NAME)
values
(1,1,'Rudolf'),(2,1,'Sandra'),(3,1,'Alyssa'),(3,2 ,'Hans'),(4,2,'Diana')
(5,3,'Eduard');
What are the primary keys on your tables? I thought it would be the _SEQ
columns, but those don't have unique values in the PERSON table.

Ok, now I feel really dumb :-) It ought to have been

(1,1,'Rudolf'),(2,1,'Sandra'),(3,1,'Alyssa'),(4,2, 'Hans'),(5,2,'Diana'),(6,3,'Eduard');

GRP_SEQ, ITEM_SEQ and PERSON_SEQ are all primary keys
ITEM_GRPSEQ and PERSON_ITEMSEQ are foreign keys.

>
Just a different, more compact form:

GRP:
====
GRP_SEQ GRP_NAME
1 Hotel Adlon
2 Marriot

ITEM:
=====
ITEM_SEQ ITEM_GRPSEQ ITEM_NAME ITEM_NR
1 1 Triple 2
2 1 Double 1
3 2 Single 1

PERSON:
=======
PERSON_SEQ PERSON_ITEMSEQ PERSON_NAME
1 1 Rudolf
2 1 Sandra
3 1 Alyssa
3 2 Hans
4 2 Diana
5 3 Eduard

I think you should use speaking names for your tables and columns, i.e.
something like ROOMS and "NUMBER_OF_ROOMS" instead of "ITEM" and
"ITEM_NR".
Or what exactly is the meaning of the "ITEM_NR" column?

All of our tables and columns have german names, some of the table and
column names are acronyms or abbreviations, and the original programmer has
a slight problem with dyslexia. I was trying to simplify the table
structure, which is a lot more detailed than I have explained above. In out
database there are many more tables that play a role in this SQL.

>
>"Hotel Adlon" has 3 sold rooms with a total of 5 persons

How do you come to 3 rooms here? I see a "Triple" rooms and a "Double".
The triple is occupied by Rudolf, Sandra, and Alyssa, and the double room
is taken by Hans and Diana.

>"Mariott" has 1 sold room with a total of 1 person

Ok, I get that.

>The following SQL

SELECT grp_seq, SUM(nr_items), COUNT(DISTINCT person_seq)
FROM ...
GROUP BY grp_seq

will return the following incorect result:

1, 8, 5
2, 1, 1
The Hotel Adlon sold eight rooms for five persons.

What I am looking for is

1, 3, 5
2, 1, 1
The Hotel Adlon sold three rooms for five persons

As stated above, I can't figure out why there would be 3 rooms. But I
believe I get what you want to achieve. First, I have a question: do you
want to sum up something or do you just want to count the number of
persons/rooms? Again, the semantics are not clear. I assume you just
want
to count the number of rooms and persons.

I need to sum the number of rooms. the problem is, in our database you can
book two double rooms, or three single rooms, and I need to find the number
of rooms sold. Most customers have one item identical to one sold room, but
other customers associate four persons to two sold rooms, i.e. they sell one
room item with the NUMBER_OF_ROOMS column having the value 2.

>
SELECT i.item_grpseq,
COUNT(DISTINCT i.item_seq) AS number_of_rooms,
COUNT(*) AS number_of_persons
FROM item AS i JOIN person AS p ON
( i.item_seq = p.person_itemseq )
GROUP BY i.item_grpseq

The count(distinct xxx) will only work if you always associate the room with
only one sold item. As the NUMBER_OF_ROOMS/ITEM_NR is not always 1, the
above does not work for me. What I would need is something like

sum(NUMBER_OF_ROOMS_SOLD) over (distinct ITEM_SEQ)

>
--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Regards

Rudolf Bargholz

Feb 18 '07 #7

Knut Stolze

Rudolf Bargholz wrote:

>GRP:
====
GRP_SEQ GRP_NAME
1 Hotel Adlon
2 Marriot

ITEM:
=====
ITEM_SEQ ITEM_GRPSEQ ITEM_NAME ITEM_NR
1 1 Triple 2
2 1 Double 1
3 2 Single 1

PERSON:
=======
PERSON_SEQ PERSON_ITEMSEQ PERSON_NAME
1 1 Rudolf
2 1 Sandra
3 1 Alyssa
4 2 Hans
5 2 Diana
6 3 Eduard

>>What I am looking for is

1, 3, 5
2, 1, 1
The Hotel Adlon sold three rooms for five persons

As stated above, I can't figure out why there would be 3 rooms. But I
believe I get what you want to achieve. First, I have a question: do you
want to sum up something or do you just want to count the number of
persons/rooms? Again, the semantics are not clear. I assume you just
want
to count the number of rooms and persons.

I need to sum the number of rooms. the problem is, in our database you can
book two double rooms, or three single rooms, and I need to find the
number of rooms sold. Most customers have one item identical to one sold
room, but other customers associate four persons to two sold rooms, i.e.
they sell one room item with the NUMBER_OF_ROOMS column having the value
2.

So in ITEM you keep track of something like "bookings"?

>SELECT i.item_grpseq,
COUNT(DISTINCT i.item_seq) AS number_of_rooms,
COUNT(*) AS number_of_persons
FROM item AS i JOIN person AS p ON
( i.item_seq = p.person_itemseq )
GROUP BY i.item_grpseq

The count(distinct xxx) will only work if you always associate the room
with only one sold item.

That would be mandatory anyways due to the foreign key relationship
established between PERSON and ITEM. However, an item is not the same as a
room, i.e. it can be one or more rooms, right?

As the NUMBER_OF_ROOMS/ITEM_NR is not always 1,
the above does not work for me. What I would need is something like

sum(NUMBER_OF_ROOMS_SOLD) over (distinct ITEM_SEQ)

You can do exactly that. Search for "column-function" on this site:
http://publib.boulder.ibm.com/infoce...n/r0000736.htm

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Feb 19 '07 #8

Tonkuma

On Feb 19, 4:36 am, "Rudolf Bargholz" <bargholz.remove...@spamcop.net>
wrote:

"Hotel Adlon" has 3 sold rooms with a total of 5 persons
"Mariott" has 1 sold room with a total of 1 person

Note, there were two tripple rooms sold to three persons.

What I am looking for is

1, 3, 5
2, 1, 1
The Hotel Adlon sold three rooms for five persons

-------------------- Commands Entered ------------------------------
select GRP_SEQ
, SUM(DISTINCT ITEM_NR) AS "number of items sold"
, COUNT(PERSON_SEQ) AS "number of persons"
from GRP
inner join
ITEM i
on i.ITEM_GRPSEQ=GRP_SEQ
left outer join
PERSON
on PERSON_ITEMSEQ=i.ITEM_SEQ
group by GRP_SEQ
order by 1;
--------------------------------------------------------------------

GRP_SEQ number of items sold number of persons
----------- -------------------- -----------------
1 3 5
2 1 1

2 record(s) selected.

Feb 19 '07 #9

Tonkuma

This may be better. Because there is a possibility different room to
be sold same number.
-------------------- Commands Entered ------------------------------
select GRP_SEQ
, SUM(ITEM_SOLD) AS "number of items sold"
, SUM(PERSON_USED) AS "number of persons"
from (select GRP_SEQ
, MAX(ITEM_NR) AS ITEM_SOLD
, COUNT(PERSON_SEQ) AS PERSON_USED
from GRP
inner join
ITEM i
on i.ITEM_GRPSEQ=GRP_SEQ
left outer join
PERSON2
on PERSON_ITEMSEQ=i.ITEM_SEQ
group by GRP_SEQ, ITEM_SEQ
) R
group by GRP_SEQ
order by 1;
--------------------------------------------------------------------

GRP_SEQ number of items sold number of persons
----------- -------------------- -----------------
1 3 5
2 1 1

2 record(s) selected.

Feb 19 '07 #10

Tonkuma

Another idea.
------------------- Commands Entered ------------------------------
select GRP_SEQ
, MOD(SUM(DISTINCT ITEM_SEQ*1000+ITEM_NR),1000) AS "number of
items sold"
, count(PERSON_SEQ) AS "number of persons"
from GRP
inner join
ITEM i
on i.ITEM_GRPSEQ=GRP_SEQ
left outer join
PERSON2
on PERSON_ITEMSEQ=i.ITEM_SEQ
group by GRP_SEQ
order by 1;
-------------------------------------------------------------------

GRP_SEQ number of items sold number of persons
----------- -------------------- -----------------
1 3 5
2 1 1

2 record(s) selected.

Feb 19 '07 #11

GROUP BY problem in conjunction with a SUM()

Similar topics