
case, joins and NULL trouble

Hi

Consider two tables
id1         code1
----------- -----
1           a
2           b
3           c

id2         code2 value
----------- ----- -----------
1           a     0
2           a     1
3           b     1

They are joined on the code field.

For each code, I want the maximum corresponding value. If the value
doesn't exist (corresponding code in second table doesn't exist), I want
a NULL field returned.

The result should look like this:
code2 value
----- -----------
a     1
b     1
c     NULL

I can't get it to include the NULL row.
While there are unique IDs in this example, the real-life example uses a
horrible four-field compound key.
Any help would be appreciated.

Ger.

The above example can be recreated by the following script.

DROP table #temp1
DROP table #temp2

SELECT 1 AS 'id1', 'a' AS 'code1'
INTO #temp1
UNION
SELECT 2, 'b'
UNION
SELECT 3, 'c'

SELECT 1 AS 'id2', 'a' AS 'code2', 0 AS value
INTO #temp2
UNION
SELECT 2, 'a', 1
UNION
SELECT 3, 'b', 1

SELECT code2, value
FROM #temp1 t1
LEFT JOIN #temp2 t2 ON t1.code1 = t2.code2
WHERE CASE
WHEN t2.value IS NULL THEN 1
WHEN t2.value = 0 THEN 2
WHEN t2.value = 1 THEN 3
END = (
SELECT
MAX( CASE
WHEN value IS NULL THEN 1
WHEN value = 0 THEN 2
WHEN value = 1 THEN 3
END )
FROM
#temp2
WHERE
code2 = t2.code2
)

Jul 23 '05 #1
5 Replies


Hi

You may want to try something like:

SELECT t1.code1 AS Code2,
       (SELECT MAX(value) FROM T2 WHERE T2.code2 = t1.code1) AS Value
FROM T1

John

Jul 23 '05 #2

Geremy (v@x.com) writes:
SELECT code2, value
FROM #temp1 t1
LEFT JOIN #temp2 t2 ON t1.code1 = t2.code2
WHERE CASE
WHEN t2.value IS NULL THEN 1
WHEN t2.value = 0 THEN 2
WHEN t2.value = 1 THEN 3
END = (
SELECT
MAX( CASE
WHEN value IS NULL THEN 1
WHEN value = 0 THEN 2
WHEN value = 1 THEN 3
END )
FROM
#temp2
WHERE
code2 = t2.code2
)


There is a "classic" error here. Don't feel ashamed; it took me
some time before I learnt the subtle difference between WHERE and
ON.

When you say "t1 LEFT JOIN t2 ON ... ", you tell SQL Server to take
all rows in t1, and the matching rows in t2. For the rows in t1 where
there is no match in t2, the columns from t2 are NULL.

This resulting table is then used as input to the next JOIN clause.
(Conceptually, that is; the optimizer may evaluate the query in a
different order.) Finally, the WHERE clause is applied as a filter
on the resulting table. This means that for these non-matching rows,
t2.code2 will be NULL on the next-to-last line. The subquery then finds
no rows, MAX returns NULL, and the comparison with 1 is not true, so
the row for 'c' is filtered out.

If you change WHERE to AND (and code2 to code1 in the SELECT
list), the query gives the correct result, because the condition
then becomes part of the join: it decides which rows in t2 match,
instead of removing rows after the join.
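
For anyone who wants to see the WHERE/AND difference without a SQL Server at hand, here is a minimal sketch. It is an assumption-laden stand-in, not the original setup: it uses Python's sqlite3 module, and plain in-memory tables named temp1 and temp2 replace #temp1 and #temp2. The query text is otherwise the one from the question, with code1 in the SELECT list and an ORDER BY added so the output is stable.

```python
import sqlite3

# Assumption: SQLite in-memory tables stand in for the #temp1/#temp2
# temp tables of the original SQL Server script.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# The same query twice; only the keyword in front of the CASE changes.
QUERY = """
SELECT t1.code1, t2.value
FROM temp1 t1
LEFT JOIN temp2 t2 ON t1.code1 = t2.code2
{keyword} CASE
        WHEN t2.value IS NULL THEN 1
        WHEN t2.value = 0 THEN 2
        WHEN t2.value = 1 THEN 3
    END = (SELECT MAX(CASE
                        WHEN value IS NULL THEN 1
                        WHEN value = 0 THEN 2
                        WHEN value = 1 THEN 3
                      END)
           FROM temp2
           WHERE code2 = t2.code2)
ORDER BY t1.code1
"""

# WHERE filters after the outer join, so the NULL-extended row is lost.
where_rows = conn.execute(QUERY.format(keyword="WHERE")).fetchall()
# AND makes the condition part of the join, so the row for 'c' survives.
and_rows = conn.execute(QUERY.format(keyword="AND")).fetchall()

print(where_rows)  # [('a', 1), ('b', 1)]
print(and_rows)    # [('a', 1), ('b', 1), ('c', None)]
```

Python shows the SQL NULL as None. The sketch only checks the logic of the two placements; it says nothing about how SQL Server plans either query.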

However, this is a much simpler version of the query:

SELECT t1.code1, t2.value
FROM #temp1 t1
LEFT JOIN (SELECT code2, value = MAX(value)
FROM #temp2
GROUP BY code2) AS t2 ON t1.code1 = t2.code2

Here I am using a derived table, which conceptually is a query whose
result is treated as a table within the outer query. Again,
conceptually only: the optimizer may recast the computation order.
In fact, my experience is that a solution like this one performs a
lot better than the correlated subquery that John suggested.
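
As a rough check that the derived-table query and the correlated-subquery form return the same rows on the sample data, here is a small sketch. The assumptions are mine: Python's sqlite3 module with in-memory tables temp1 and temp2 in place of #temp1 and #temp2, `MAX(value) AS value` instead of the T-SQL `value = MAX(value)` alias form (SQLite would read that as a comparison), and the correlated query written against those table names.

```python
import sqlite3

# Assumption: SQLite in-memory tables stand in for #temp1/#temp2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# Derived table: aggregate temp2 once per code, then outer-join to it.
derived = conn.execute("""
    SELECT t1.code1, t2.value
    FROM temp1 t1
    LEFT JOIN (SELECT code2, MAX(value) AS value
               FROM temp2
               GROUP BY code2) AS t2 ON t1.code1 = t2.code2
    ORDER BY t1.code1
""").fetchall()

# Correlated subquery: look up the maximum separately for each row of temp1.
correlated = conn.execute("""
    SELECT t1.code1,
           (SELECT MAX(value) FROM temp2 t2 WHERE t2.code2 = t1.code1) AS value
    FROM temp1 t1
    ORDER BY t1.code1
""").fetchall()

print(derived)     # [('a', 1), ('b', 1), ('c', None)]
print(correlated)  # [('a', 1), ('b', 1), ('c', None)]
```

This only confirms that the two forms agree on three rows of test data. The performance difference has to be measured on the real tables in SQL Server.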
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #3

I cannot bring myself to use that proprietary temp table syntax and
lose a primary key.

DROP TABLE Foobar1, Foobar2;

CREATE TABLE Foobar1
(t_id INTEGER NOT NULL PRIMARY KEY, code1 CHAR(1) NOT NULL);
INSERT INTO Foobar1 VALUES (1, 'a');
INSERT INTO Foobar1 VALUES (2, 'b');
INSERT INTO Foobar1 VALUES (3, 'c');

CREATE TABLE Foobar2
(t_id INTEGER NOT NULL PRIMARY KEY, code2 CHAR(1) NOT NULL,
 value INTEGER NOT NULL);
INSERT INTO Foobar2 VALUES (1, 'a', 0);
INSERT INTO Foobar2 VALUES (2, 'a', 1);
INSERT INTO Foobar2 VALUES (3, 'b', 1);

If you did not need the NULLs, this would do it:

SELECT T2.code2, MAX(T2.value)
FROM Foobar2 AS T2
WHERE EXISTS
(SELECT *
FROM Foobar1 AS T1
WHERE T1.code1 = T2.code2)
GROUP BY T2.code2;

With the NULLs, use this:

SELECT T1.code1, MAX(T2.value)
FROM Foobar1 AS T1
LEFT OUTER JOIN
Foobar2 AS T2
ON T1.code1 = T2.code2
GROUP BY T1.code1;

This should be faster than correlated subqueries.
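
The outer-join-plus-GROUP BY version works because MAX skips NULLs and returns NULL when a group has nothing else. A minimal sketch of that, assuming Python's sqlite3 module as a stand-in for SQL Server (the Foobar1/Foobar2 definitions are the ones above; the ORDER BY is added only to make the output stable):

```python
import sqlite3

# Assumption: an SQLite in-memory database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Foobar1
    (t_id INTEGER NOT NULL PRIMARY KEY, code1 CHAR(1) NOT NULL);
    CREATE TABLE Foobar2
    (t_id INTEGER NOT NULL PRIMARY KEY, code2 CHAR(1) NOT NULL,
     value INTEGER NOT NULL);
    INSERT INTO Foobar1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO Foobar2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# Code 'c' has no match, so its group holds one NULL-extended row
# and MAX over that group is NULL.
grouped = conn.execute("""
    SELECT T1.code1, MAX(T2.value)
    FROM Foobar1 AS T1
    LEFT OUTER JOIN Foobar2 AS T2 ON T1.code1 = T2.code2
    GROUP BY T1.code1
    ORDER BY T1.code1
""").fetchall()

print(grouped)  # [('a', 1), ('b', 1), ('c', None)]
```

Whether it really is faster than a correlated subquery is something to measure on the real data; the sketch checks the result only.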

Jul 23 '05 #4

es****@sommarskog.se says...

There is a "classical" error here. And don't feel ashamed, it took me
some time, before I learnt the subtle difference between WHERE and
ON.


Cheers mate, if I could kiss you I would. Thanks not only for the
solution, but also for a fantastic explanation. I've learned something
today.

There's a cold pint waiting for you in Australia if you ever make it
down here.

Thanks again.
Ger
Jul 23 '05 #5

In article <11**********************@o13g2000cwo.googlegroups .com>,
jc*******@earthlink.net says...
I cannot bring myself to use that proprietary temp table syntax and
lose a primary key.

Thanks for your reply. While I didn't use it, I do appreciate the time
spent to answer, and the solution is now in the archive for anyone to
try.

Ger
Jul 23 '05 #6
