Suppose I have two tables:
CREATE TABLE Tab1 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)
CREATE TABLE Tab2 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)
I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1
united with the portion of Tab2 for which the A1 and A2 values do not
exist in Tab1. Is there a nice TSQL way to put it into one and
possibly small query? 5 1187
On 20 Oct 2004 02:22:58 0700, Alexander Korovyev wrote: Suppose I have two tables:
CREATE TABLE Tab1 ( [A1] [int] NOT NULL, [A2] [int] NOT NULL, [B1] [int] NOT NULL, [B2] [int] NOT NULL)
CREATE TABLE Tab2 ( [A1] [int] NOT NULL, [A2] [int] NOT NULL, [B1] [int] NOT NULL, [B2] [int] NOT NULL)
I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1 united with the portion of Tab2 for which the A1 and A2 values do not exist in Tab1. Is there a nice TSQL way to put it into one and possibly small query?
Hi Alexander,
From the narrative, it's hard to understand what you want. Please provide
some illustrative sample data (in the form of INSERT statements, so that I
can copy and paste for testing purposes) and the output you expect for
that sample data.
Also, provide some explanation about the real world problem you're trying
to solve. This problem looks like a homework assignment; if it is I'll
gladly give you some pointers, but not a complete solution. If it's a real
world problem you're facing in your job, I'll be more tempted to give a
complete solution.
Best, Hugo

(Remove _NO_ and _SPAM_ to get my email address)
Hugo Kornelis <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message news:<r2********************************@4ax.com>. .. On 20 Oct 2004 02:22:58 0700, Alexander Korovyev wrote:
Suppose I have two tables:
CREATE TABLE Tab1 ( [A1] [int] NOT NULL, [A2] [int] NOT NULL, [B1] [int] NOT NULL, [B2] [int] NOT NULL)
CREATE TABLE Tab2 ( [A1] [int] NOT NULL, [A2] [int] NOT NULL, [B1] [int] NOT NULL, [B2] [int] NOT NULL)
I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1 united with the portion of Tab2 for which the A1 and A2 values do not exist in Tab1. Is there a nice TSQL way to put it into one and possibly small query?
Hi Alexander,
From the narrative, it's hard to understand what you want. Please provide some illustrative sample data (in the form of INSERT statements, so that I can copy and paste for testing purposes) and the output you expect for that sample data.
Also, provide some explanation about the real world problem you're trying to solve. This problem looks like a homework assignment; if it is I'll gladly give you some pointers, but not a complete solution. If it's a real world problem you're facing in your job, I'll be more tempted to give a complete solution.
Hello Hugo,
Here is my illustrative sample data:
INSERT INTO Tab1 VALUES(0, 0, 10, 20)
INSERT INTO Tab1 VALUES(0, 1, 15, 5)
INSERT INTO Tab1 VALUES(0, 1, 25, 15)
INSERT INTO Tab1 VALUES(1, 0, 35, 10)
INSERT INTO Tab2 VALUES(0, 1, 40, 0)
INSERT INTO Tab2 VALUES(1, 0, 15, 15)
INSERT INTO Tab2 VALUES(1, 1, 25, 30)
The following is what the result should look like:
0, 0, 10, 20
0, 1, 40, 10
1, 0, 35, 10
1, 1, 25, 30
I have come up with this perverted code so far:
SELECT A1, A2, SUM(B1), SUM(B2) FROM (
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1
GROUP BY A1, A2
UNION
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2
GROUP BY A1, A2
) AS t GROUP BY A1, A2 HAVING SUM(p)=0
UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY
A1, A2 ORDER BY A1, A2
It shows bad performance and looks ugly. There certainly must be a
better (=more concise) way of expressing it. Thank you.
P.S. Since you asked, this is not a homework problem.. but for the
sake of sparing you the full detail (which is long and not very
interesting) it can be considered so :) (i.e. I don't mind pointers
only).
On 20 Oct 2004 12:24:33 0700, Alexander Korovyev wrote:
(snip) I have come up with this perverted code so far:
SELECT A1, A2, SUM(B1), SUM(B2) FROM ( SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1 GROUP BY A1, A2 UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2 GROUP BY A1, A2 ) AS t GROUP BY A1, A2 HAVING SUM(p)=0 UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY A1, A2 ORDER BY A1, A2
It shows bad performance and looks ugly. There certainly must be a better (=more concise) way of expressing it. Thank you.
Hi Alexander,
I've got two alternatives that look better and have a "cleaner" execution
plan, but show worse performance on the limited set of data you posted
here. Try them against your database to see if they start behaving better
when there's more data to work on:
SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab2
WHERE NOT EXISTS
(SELECT *
FROM Tab1
WHERE Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2)
GROUP BY A1, A2
or
SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT Tab2.A1, Tab2.A2, SUM(Tab2.B1), SUM(Tab2.B2)
FROM Tab2
LEFT JOIN Tab1
ON Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2
WHERE Tab1.A1 IS NULL
GROUP BY Tab2.A1, Tab2.A2
Another approach gave ma a query that is (in my opinion) not so ugly as
yours, but uglier than my first two, but that gives better performance (on
your 7row testset) than all others is this one:
SELECT COALESCE(t1.A1, t2.A1), COALESCE(t1.A2, t2.A2),
COALESCE(t1.B1, t2.B1), COALESCE(t1.B2, t2.B2)
FROM (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab2
GROUP BY A1, A2) AS t2
FULL JOIN (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab1
GROUP BY A1, A2) AS t1
ON t1.A1 = t2.A1
AND t1.A2 = t2.A2
A final hint: if you do keep your version of the query, change UNION to
UNIOAN ALL. It should not change the results, but UNION ALL is generally
faster then UNION (no removal of duplicates needed  might save you a sort
and will definitely save you some processing logic)
Best, Hugo

(Remove _NO_ and _SPAM_ to get my email address)
On 20 Oct 2004 12:24:33 0700, Alexander Korovyev wrote: It shows bad performance and (...)
Hi Alexander,
Forgot to add this on my previous reply:
If all alternatives perform bad as well, you probably need to have a look
at the indexes available for the tables used in your query.
Best, Hugo

(Remove _NO_ and _SPAM_ to get my email address)
Hugo Kornelis <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message news:<t4********************************@4ax.com>. .. On 20 Oct 2004 12:24:33 0700, Alexander Korovyev wrote:
(snip)I have come up with this perverted code so far:
SELECT A1, A2, SUM(B1), SUM(B2) FROM ( SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1 GROUP BY A1, A2 UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2 GROUP BY A1, A2 ) AS t GROUP BY A1, A2 HAVING SUM(p)=0 UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY A1, A2 ORDER BY A1, A2
It shows bad performance and looks ugly. There certainly must be a better (=more concise) way of expressing it. Thank you.
Hi Alexander,
I've got two alternatives that look better and have a "cleaner" execution plan, but show worse performance on the limited set of data you posted here. Try them against your database to see if they start behaving better when there's more data to work on:
SELECT A1, A2, SUM(B1), SUM(B2) FROM Tab1 GROUP BY A1, A2 UNION ALL SELECT A1, A2, SUM(B1), SUM(B2) FROM Tab2 WHERE NOT EXISTS (SELECT * FROM Tab1 WHERE Tab1.A1 = Tab2.A1 AND Tab1.A2 = Tab2.A2) GROUP BY A1, A2
or
SELECT A1, A2, SUM(B1), SUM(B2) FROM Tab1 GROUP BY A1, A2 UNION ALL SELECT Tab2.A1, Tab2.A2, SUM(Tab2.B1), SUM(Tab2.B2) FROM Tab2 LEFT JOIN Tab1 ON Tab1.A1 = Tab2.A1 AND Tab1.A2 = Tab2.A2 WHERE Tab1.A1 IS NULL GROUP BY Tab2.A1, Tab2.A2
Another approach gave ma a query that is (in my opinion) not so ugly as yours, but uglier than my first two, but that gives better performance (on your 7row testset) than all others is this one:
SELECT COALESCE(t1.A1, t2.A1), COALESCE(t1.A2, t2.A2), COALESCE(t1.B1, t2.B1), COALESCE(t1.B2, t2.B2) FROM (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2 FROM Tab2 GROUP BY A1, A2) AS t2 FULL JOIN (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2 FROM Tab1 GROUP BY A1, A2) AS t1 ON t1.A1 = t2.A1 AND t1.A2 = t2.A2
A final hint: if you do keep your version of the query, change UNION to UNIOAN ALL. It should not change the results, but UNION ALL is generally faster then UNION (no removal of duplicates needed  might save you a sort and will definitely save you some processing logic)
Best, Hugo
Thanks a lot! This discussion thread is closed Replies have been disabled for this discussion. Similar topics
2 posts
views
Thread by lawrence 
last post: by

9 posts
views
Thread by netpurpose 
last post: by

6 posts
views
Thread by paii 
last post: by

3 posts
views
Thread by pw 
last post: by

7 posts
views
Thread by K. Crothers 
last post: by

3 posts
views
Thread by google 
last post: by

reply
views
Thread by ward 
last post: by

10 posts
views
Thread by L. R. Du Broff 
last post: by

7 posts
views
Thread by Rnykster 
last post: by

3 posts
views
Thread by pbd22 
last post: by
          