case, joins and NULL trouble - Microsoft SQL Server

Geremy

Hi

Consider two tables
id1 code1
----------- -----
1 a
2 b
3 c

id2 code2 value
----------- ----- -----------
1 a 0
2 a 1
3 b 1

They are joined on the code field.

For each code, I want the maximum corresponding value. If the value
doesn't exist (corresponding code in second table doesn't exist), I want
a NULL field returned.

The result should look like this:
code2 value
----- -----------
a 1
b 1
c NULL

I can't get it to include the NULL row.
While there are unique IDs in this example, the real-life example uses a
horrible four-field compound key.
Any help would be appreciated.

Ger.

The above example can be recreated by the following script.

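-- Drop the temp tables only if they exist, so the script also runs the first time
IF OBJECT_ID('tempdb..#temp1') IS NOT NULL DROP TABLE #temp1
IF OBJECT_ID('tempdb..#temp2') IS NOT NULL DROP TABLE #temp2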

SELECT 1 AS 'id1', 'a' AS 'code1'
INTO #temp1
UNION
SELECT 2, 'b'
UNION
SELECT 3, 'c'

SELECT 1 AS 'id2', 'a' AS 'code2', 0 AS value
INTO #temp2
UNION
SELECT 2, 'a', 1
UNION
SELECT 3, 'b', 1

SELECT code2, value
FROM #temp1 t1
LEFT JOIN #temp2 t2 ON t1.code1 = t2.code2
WHERE CASE
WHEN t2.value IS NULL THEN 1
WHEN t2.value = 0 THEN 2
WHEN t2.value = 1 THEN 3
END = (
SELECT
MAX( CASE
WHEN value IS NULL THEN 1
WHEN value = 0 THEN 2
WHEN value = 1 THEN 3
END )
FROM
#temp2
WHERE
code2 = t2.code2
)

Jul 23 '05 #1

John Bell

Hi

You may want to try something like:

SELECT t1.code1 AS code2,
(SELECT MAX(t2.value) FROM #temp2 t2 WHERE t2.code2 = t1.code1) AS value
FROM #temp1 t1
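
A minimal sketch of the correlated-subquery idea above, run against the sample data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and ordinary tables named temp1 and temp2 in place of #temp1 and #temp2. Neither choice comes from the original posts.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server,
# and temp1/temp2 stand in for the #temp1/#temp2 temp tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# The scalar subquery is evaluated per row of temp1. When no row in temp2
# matches, MAX over the empty set yields NULL, which is the wanted result.
rows = conn.execute("""
    SELECT t1.code1 AS code2,
           (SELECT MAX(t2.value)
              FROM temp2 AS t2
             WHERE t2.code2 = t1.code1) AS value
      FROM temp1 AS t1
     ORDER BY t1.code1
""").fetchall()

for code, value in rows:
    print(code, value)
```

Because every row of temp1 is returned regardless of what the subquery finds, no outer join is needed in this form.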

John

Jul 23 '05 #2

Erland Sommarskog

Geremy (v@x.com) writes:

SELECT code2, value
FROM #temp1 t1
LEFT JOIN #temp2 t2 ON t1.code1 = t2.code2
WHERE CASE
WHEN t2.value IS NULL THEN 1
WHEN t2.value = 0 THEN 2
WHEN t2.value = 1 THEN 3
END = (
SELECT
MAX( CASE
WHEN value IS NULL THEN 1
WHEN value = 0 THEN 2
WHEN value = 1 THEN 3
END )
FROM
#temp2
WHERE
code2 = t2.code2
)

There is a "classic" error here. And don't feel ashamed, it took me
some time before I learnt the subtle difference between WHERE and
ON.

When you say "t1 LEFT JOIN t2 ON ... ", you tell SQL Server to take
all rows in t1, and the matching rows in t2. For the rows in t1 where
there is no match in t2, the columns from t2 are NULL.

This resulting table is then used as input to the next JOIN clause,
if there is one. (Conceptually, that is; the optimizer may evaluate the
query in a different order.) Finally, the WHERE clause is applied as a
filter on the resulting table. This means that for these non-matching rows,
t2.code2 will be NULL on the next-to-last line. The subquery then finds
no rows, MAX returns NULL, and since 1 = NULL is not true, the row for
code c is filtered away.
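
To make the two conceptual steps concrete, here is a rough sketch that runs the LEFT JOIN on its own and then with the WHERE clause from the original query. It assumes SQLite (via Python's sqlite3 module) in place of SQL Server and plain tables temp1 and temp2 in place of the temp tables; the helper name RANK is made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; temp1/temp2 replace #temp1/#temp2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# The ranking expression from the original query (hypothetical helper name).
RANK = "CASE WHEN {v} IS NULL THEN 1 WHEN {v} = 0 THEN 2 WHEN {v} = 1 THEN 3 END"
outer_rank = RANK.format(v="t2.value")
inner_rank = RANK.format(v="value")

# Step 1: the LEFT JOIN alone keeps code 'c', with NULLs in the t2 columns.
joined = conn.execute("""
    SELECT t1.code1, t2.code2, t2.value
      FROM temp1 AS t1
      LEFT JOIN temp2 AS t2 ON t1.code1 = t2.code2
     ORDER BY t1.code1, t2.value
""").fetchall()

# Step 2: the WHERE clause filters the joined rows. For code 'c' the
# subquery compares code2 with NULL, finds nothing and returns NULL,
# so the comparison 1 = NULL is not true and the row is dropped.
filtered = conn.execute(f"""
    SELECT t1.code1, t2.value
      FROM temp1 AS t1
      LEFT JOIN temp2 AS t2 ON t1.code1 = t2.code2
     WHERE {outer_rank} = (SELECT MAX({inner_rank})
                             FROM temp2
                            WHERE code2 = t2.code2)
     ORDER BY t1.code1
""").fetchall()

print(joined)
print(filtered)
```

The row for code c is present after the join and gone after the filter, which is the behaviour described above.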

If you change WHERE to AND (and code2 to code1 in the SELECT
list), the query gives the correct result, because the condition then
becomes part of the ON clause: it only decides which rows from t2
match, and rows from t1 without a match are still kept.
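
A small sketch of that WHERE-to-AND change, under the same assumptions as before: SQLite through Python's sqlite3 module stands in for SQL Server, temp1 and temp2 replace the temp tables, and RANK is a hypothetical helper name holding the CASE expression from the original query.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; temp1/temp2 replace #temp1/#temp2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# The ranking expression from the original query (hypothetical helper name).
RANK = "CASE WHEN {v} IS NULL THEN 1 WHEN {v} = 0 THEN 2 WHEN {v} = 1 THEN 3 END"
outer_rank = RANK.format(v="t2.value")
inner_rank = RANK.format(v="value")

# The comparison now sits in the ON clause, so it only restricts which
# temp2 rows count as a match. temp1 rows with no match survive with NULLs.
fixed = conn.execute(f"""
    SELECT t1.code1, t2.value
      FROM temp1 AS t1
      LEFT JOIN temp2 AS t2
        ON t1.code1 = t2.code2
       AND {outer_rank} = (SELECT MAX({inner_rank})
                             FROM temp2
                            WHERE code2 = t2.code2)
     ORDER BY t1.code1
""").fetchall()

print(fixed)
```

Selecting t1.code1 rather than t2.code2 matters here, since t2.code2 is NULL on the row for code c.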

However, this is a much simpler version of the query:

SELECT t1.code1, t2.value
FROM #temp1 t1
LEFT JOIN (SELECT code2, value = MAX(value)
FROM #temp2
GROUP BY code2) AS t2 ON t1.code1 = t2.code2

Here I am using a derived table, which conceptually is a query
within the query. Again, conceptually only: the optimizer
may recast the computation order. In fact, my experience is that a solution
like this one performs a lot better than the correlated subquery
that John suggested.
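
For anyone who wants to try the derived-table version outside SQL Server, here is a minimal sketch assuming SQLite through Python's sqlite3 module, with temp1 and temp2 standing in for the temp tables. SQLite does not accept the T-SQL alias form value = MAX(value), so the sketch writes MAX(value) AS value instead.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; temp1/temp2 replace #temp1/#temp2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id1 INTEGER, code1 TEXT);
    CREATE TABLE temp2 (id2 INTEGER, code2 TEXT, value INTEGER);
    INSERT INTO temp1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO temp2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# The derived table reduces temp2 to one row per code (its maximum value).
# The LEFT JOIN then attaches that row, or NULL when the code has none.
rows = conn.execute("""
    SELECT t1.code1, t2.value
      FROM temp1 AS t1
      LEFT JOIN (SELECT code2, MAX(value) AS value
                   FROM temp2
                  GROUP BY code2) AS t2
        ON t1.code1 = t2.code2
     ORDER BY t1.code1
""").fetchall()

for code, value in rows:
    print(code, value)
```

With a compound key, the same shape should carry over by grouping on, and joining on, all key columns.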
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Jul 23 '05 #3

--CELKO--

I cannot bring myself to use that proprietary temp table syntax and
lose a primary key.

DROP TABLE Foobar1, Foobar2;

CREATE TABLE Foobar1
(t_id INTEGER NOT NULL PRIMARY KEY, code1 CHAR(1) NOT NULL);
INSERT INTO Foobar1 VALUES (1, 'a');
INSERT INTO Foobar1 VALUES (2, 'b');
INSERT INTO Foobar1 VALUES (3, 'c');

CREATE TABLE Foobar2
(t_id INTEGER NOT NULL PRIMARY KEY, code2 CHAR(1) NOT NULL,
value INTEGER NOT NULL);
INSERT INTO Foobar2 VALUES (1, 'a', 0);
INSERT INTO Foobar2 VALUES (2, 'a', 1);
INSERT INTO Foobar2 VALUES (3, 'b', 1);

If you did not need the NULLs, this would do it:

SELECT T2.code2, MAX(T2.value)
FROM Foobar2 AS T2
WHERE EXISTS
(SELECT *
FROM Foobar1 AS T1
WHERE T1.code1 = T2.code2)
GROUP BY T2.code2;

With the NULLs, use this:

SELECT T1.code1, MAX(T2.value)
FROM Foobar1 AS T1
LEFT OUTER JOIN
Foobar2 AS T2
ON T1.code1 = T2.code2
GROUP BY T1.code1;

This should be faster than correlated subqueries.
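
Both queries can be checked side by side with a short sketch. It assumes SQLite through Python's sqlite3 module rather than SQL Server; the table definitions and data are the ones given above.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; schema and rows are as posted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Foobar1
    (t_id INTEGER NOT NULL PRIMARY KEY, code1 CHAR(1) NOT NULL);
    INSERT INTO Foobar1 VALUES (1, 'a'), (2, 'b'), (3, 'c');

    CREATE TABLE Foobar2
    (t_id INTEGER NOT NULL PRIMARY KEY, code2 CHAR(1) NOT NULL,
     value INTEGER NOT NULL);
    INSERT INTO Foobar2 VALUES (1, 'a', 0), (2, 'a', 1), (3, 'b', 1);
""")

# Without the NULLs: only codes that actually occur in Foobar2 appear.
without_nulls = conn.execute("""
    SELECT T2.code2, MAX(T2.value)
      FROM Foobar2 AS T2
     WHERE EXISTS (SELECT *
                     FROM Foobar1 AS T1
                    WHERE T1.code1 = T2.code2)
     GROUP BY T2.code2
     ORDER BY T2.code2
""").fetchall()

# With the NULLs: grouping on the Foobar1 side keeps code 'c', and MAX
# over a group whose only value is NULL returns NULL.
with_nulls = conn.execute("""
    SELECT T1.code1, MAX(T2.value)
      FROM Foobar1 AS T1
      LEFT OUTER JOIN Foobar2 AS T2
        ON T1.code1 = T2.code2
     GROUP BY T1.code1
     ORDER BY T1.code1
""").fetchall()

print(without_nulls)
print(with_nulls)
```

The ORDER BY clauses are only there to make the printed output deterministic; they are not part of the queries as posted.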

Jul 23 '05 #4

Geremy

es****@sommarskog.se says...

Cheers mate, if I could kiss you I would. Thanks not only for the
solution, but also for a fantastic explanation. I've learned something
today.

There's a cold pint waiting for you in Australia if you ever make it
down here.

Thanks again.
Ger

Jul 23 '05 #5

Geremy

In article <11**********************@o13g2000cwo.googlegroups .com>,
jc*******@earthlink.net says...

Thanks for your reply. While I didn't use it, I do appreciate the time
spent to answer, and the solution is now in the archive for anyone to
try.

Ger

Jul 23 '05 #6
