By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
443,918 Members | 1,852 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 443,918 IT Pros & Developers. It's quick & easy.

Need help with a query

P: n/a
Suppose I have two tables:

CREATE TABLE Tab1 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

CREATE TABLE Tab2 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1
united with the portion of Tab2 for which the A1 and A2 values do not
exist in Tab1. Is there a nice T-SQL way to put it into one and
possibly small query?
Jul 20 '05 #1
Share this Question
Share on Google+
5 Replies


P: n/a
On 20 Oct 2004 02:22:58 -0700, Alexander Korovyev wrote:
Suppose I have two tables:

CREATE TABLE Tab1 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

CREATE TABLE Tab2 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1
united with the portion of Tab2 for which the A1 and A2 values do not
exist in Tab1. Is there a nice T-SQL way to put it into one and
possibly small query?


Hi Alexander,

From the narrative, it's hard to understand what you want. Please provide
some illustrative sample data (in the form of INSERT statements, so that I
can copy and paste for testing purposes) and the output you expect for
that sample data.

Also, provide some explanation about the real world problem you're trying
to solve. This problem looks like a homework assignment; if it is I'll
gladly give you some pointers, but not a complete solution. If it's a real
world problem you're facing in your job, I'll be more tempted to give a
complete solution.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #2

P: n/a
Hugo Kornelis <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message news:<r2********************************@4ax.com>. ..
On 20 Oct 2004 02:22:58 -0700, Alexander Korovyev wrote:
Suppose I have two tables:

CREATE TABLE Tab1 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

CREATE TABLE Tab2 (
[A1] [int] NOT NULL,
[A2] [int] NOT NULL,
[B1] [int] NOT NULL,
[B2] [int] NOT NULL)

I need to sum B1 and B2 values grouped by A1 and A2 values from Tab1
united with the portion of Tab2 for which the A1 and A2 values do not
exist in Tab1. Is there a nice T-SQL way to put it into one and
possibly small query?


Hi Alexander,

From the narrative, it's hard to understand what you want. Please provide
some illustrative sample data (in the form of INSERT statements, so that I
can copy and paste for testing purposes) and the output you expect for
that sample data.

Also, provide some explanation about the real world problem you're trying
to solve. This problem looks like a homework assignment; if it is I'll
gladly give you some pointers, but not a complete solution. If it's a real
world problem you're facing in your job, I'll be more tempted to give a
complete solution.


Hello Hugo,

Here is my illustrative sample data:

INSERT INTO Tab1 VALUES(0, 0, 10, 20)
INSERT INTO Tab1 VALUES(0, 1, 15, -5)
INSERT INTO Tab1 VALUES(0, 1, 25, 15)
INSERT INTO Tab1 VALUES(1, 0, 35, 10)

INSERT INTO Tab2 VALUES(0, 1, 40, 0)
INSERT INTO Tab2 VALUES(1, 0, 15, 15)
INSERT INTO Tab2 VALUES(1, 1, 25, 30)

The following is what the result should look like:

0, 0, 10, 20
0, 1, 40, 10
1, 0, 35, 10
1, 1, 25, 30

I have come up with this perverted code so far:

SELECT A1, A2, SUM(B1), SUM(B2) FROM (
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1
GROUP BY A1, A2
UNION
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2
GROUP BY A1, A2
) AS t GROUP BY A1, A2 HAVING SUM(p)=0
UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY
A1, A2 ORDER BY A1, A2

It shows bad performance and looks ugly. There certainly must be a
better (=more concise) way of expressing it. Thank you.

P.S. Since you asked, this is not a homework problem.. but for the
sake of sparing you the full detail (which is long and not very
interesting) it can be considered so :) (i.e. I don't mind pointers
only).
Jul 20 '05 #3

P: n/a
On 20 Oct 2004 12:24:33 -0700, Alexander Korovyev wrote:

(snip)
I have come up with this perverted code so far:

SELECT A1, A2, SUM(B1), SUM(B2) FROM (
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1
GROUP BY A1, A2
UNION
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2
GROUP BY A1, A2
) AS t GROUP BY A1, A2 HAVING SUM(p)=0
UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY
A1, A2 ORDER BY A1, A2

It shows bad performance and looks ugly. There certainly must be a
better (=more concise) way of expressing it. Thank you.


Hi Alexander,

I've got two alternatives that look better and have a "cleaner" execution
plan, but show worse performance on the limited set of data you posted
here. Try them against your database to see if they start behaving better
when there's more data to work on:

SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab2
WHERE NOT EXISTS
(SELECT *
FROM Tab1
WHERE Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2)
GROUP BY A1, A2

or

SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT Tab2.A1, Tab2.A2, SUM(Tab2.B1), SUM(Tab2.B2)
FROM Tab2
LEFT JOIN Tab1
ON Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2
WHERE Tab1.A1 IS NULL
GROUP BY Tab2.A1, Tab2.A2
Another approach gave ma a query that is (in my opinion) not so ugly as
yours, but uglier than my first two, but that gives better performance (on
your 7-row testset) than all others is this one:

SELECT COALESCE(t1.A1, t2.A1), COALESCE(t1.A2, t2.A2),
COALESCE(t1.B1, t2.B1), COALESCE(t1.B2, t2.B2)
FROM (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab2
GROUP BY A1, A2) AS t2
FULL JOIN (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab1
GROUP BY A1, A2) AS t1
ON t1.A1 = t2.A1
AND t1.A2 = t2.A2

A final hint: if you do keep your version of the query, change UNION to
UNIOAN ALL. It should not change the results, but UNION ALL is generally
faster then UNION (no removal of duplicates needed - might save you a sort
and will definitely save you some processing logic)
Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #4

P: n/a
On 20 Oct 2004 12:24:33 -0700, Alexander Korovyev wrote:
It shows bad performance and (...)


Hi Alexander,

Forgot to add this on my previous reply:

If all alternatives perform bad as well, you probably need to have a look
at the indexes available for the tables used in your query.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 20 '05 #5

P: n/a
Hugo Kornelis <hugo@pe_NO_rFact.in_SPAM_fo> wrote in message news:<t4********************************@4ax.com>. ..
On 20 Oct 2004 12:24:33 -0700, Alexander Korovyev wrote:

(snip)
I have come up with this perverted code so far:

SELECT A1, A2, SUM(B1), SUM(B2) FROM (
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 1 as p FROM Tab1
GROUP BY A1, A2
UNION
SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2, 0 as p FROM Tab2
GROUP BY A1, A2
) AS t GROUP BY A1, A2 HAVING SUM(p)=0
UNION SELECT A1, A2, SUM(B1) as B1, SUM(B2) as B2 FROM Tab1 GROUP BY
A1, A2 ORDER BY A1, A2

It shows bad performance and looks ugly. There certainly must be a
better (=more concise) way of expressing it. Thank you.


Hi Alexander,

I've got two alternatives that look better and have a "cleaner" execution
plan, but show worse performance on the limited set of data you posted
here. Try them against your database to see if they start behaving better
when there's more data to work on:

SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab2
WHERE NOT EXISTS
(SELECT *
FROM Tab1
WHERE Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2)
GROUP BY A1, A2

or

SELECT A1, A2, SUM(B1), SUM(B2)
FROM Tab1
GROUP BY A1, A2
UNION ALL
SELECT Tab2.A1, Tab2.A2, SUM(Tab2.B1), SUM(Tab2.B2)
FROM Tab2
LEFT JOIN Tab1
ON Tab1.A1 = Tab2.A1
AND Tab1.A2 = Tab2.A2
WHERE Tab1.A1 IS NULL
GROUP BY Tab2.A1, Tab2.A2
Another approach gave ma a query that is (in my opinion) not so ugly as
yours, but uglier than my first two, but that gives better performance (on
your 7-row testset) than all others is this one:

SELECT COALESCE(t1.A1, t2.A1), COALESCE(t1.A2, t2.A2),
COALESCE(t1.B1, t2.B1), COALESCE(t1.B2, t2.B2)
FROM (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab2
GROUP BY A1, A2) AS t2
FULL JOIN (SELECT A1, A2, SUM(B1) AS B1, SUM(B2) AS B2
FROM Tab1
GROUP BY A1, A2) AS t1
ON t1.A1 = t2.A1
AND t1.A2 = t2.A2

A final hint: if you do keep your version of the query, change UNION to
UNIOAN ALL. It should not change the results, but UNION ALL is generally
faster then UNION (no removal of duplicates needed - might save you a sort
and will definitely save you some processing logic)
Best, Hugo


Thanks a lot!
Jul 20 '05 #6

This discussion thread is closed

Replies have been disabled for this discussion.