470,831 Members | 1,474 Online

# Difference on condition in JOIN & WHERE

I appreciate how difficult it is to resolve a problem without all the
information but maybe someone has come across a similar problem.

I have an 'extract' table which has 1853 rows when I ask for all rows where
period_ = 3. The allocation table for info has 210 rows.

I have two scripts below. The first script where I specify a period on a
join, brings back 1853 lines and works. The second script where I specify
the period in the where clause only brings back 1844 rows. I have located
the missing 9 rows and they don't look any different to the other 1844 rows.

Can someone educate me as to the difference between specifying a condition
on a join and a condition in a where clause.

SELECT
a.costcentre_,
b.nett_,
a.*,
b.*

FROM extract a

LEFT OUTER JOIN
allocation b

ON a.e_reg_ = b.reg_no_
AND b.period_ = 3
WHERE
a.period_ = 3
--------------
SELECT
a.costcentre_,
b.nett_,
a.*,
b.*

FROM extract a

LEFT OUTER JOIN
allocation b

ON a.e_reg_ = b.reg_no_
WHERE
a.period_ = 3
AND b.period_ = 3
May 3 '07 #1
4 13582
Jane T (ja***@nospam.net) writes:
Can someone educate me as to the difference between specifying a condition
on a join and a condition in a where clause.
As long as you are doing only inner joins it does not matter. When you
do outer joins it matters a lot.

Let's look at this:
FROM extract a
LEFT OUTER JOIN allocation b ON a.e_reg_ = b.reg_no_
AND b.period_ = 3
WHERE a.period_ = 3
You take all rows from extract. Then we add the columns from the
allocation table and fill data into these where there is a match
on the condition

a.e_reg_ = b.reg_no_ AND b.period_ = 3

For non-matching rows, you leave those columns NULL. Finally, you filter
this with the WHERE clause.
FROM extract a
LEFT OUTER JOIN allocation b ON a.e_reg_ = b.reg_no_
WHERE a.period_ = 3
AND b.period_ = 3

Again, you take all rows from extract, and add the columns from allocation.
Now you fill in the data for the rows that match the condition:

a.e_reg_ = b.reg_no_

and leave NULL in the columns from allocation where there is no match. Now
you filter with the WHERE clause. But the WHERE clause has the condition:

b.period_ = 3

And the rows with NULL in b.period_ does not fulfil this condition,
and thus rows are filtered out.

More generally, the FROM JOIN clauses builds a table, and then the
WHERE clause filters that table.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 3 '07 #2
Jane T wrote:
I have an 'extract' table which has 1853 rows when I ask for all rows where
period_ = 3. The allocation table for info has 210 rows.

I have two scripts below. The first script where I specify a period on a
join, brings back 1853 lines and works. The second script where I specify
the period in the where clause only brings back 1844 rows. I have located
the missing 9 rows and they don't look any different to the other 1844 rows.

Can someone educate me as to the difference between specifying a condition
on a join and a condition in a where clause.
For equal joins, there's no difference. For left outer joins,
specifying a condition on the join simply causes it to return
rows with data from the left side and NULLs from the right side,
whereas specifying a condition on the WHERE causes it to filter
the output of the JOIN.

Basically, the query is processed in this order:

* FROM/JOIN
* SELECT
* WHERE
* GROUP BY
* HAVING
* ORDER BY
May 3 '07 #3
"Ed Murphy" <em*******@socal.rr.comwrote in message
Jane T wrote:
Basically, the query is processed in this order:

* FROM/JOIN
* SELECT
* WHERE
* GROUP BY
* HAVING
* ORDER BY
Ed,

Wouldn't SELECT be much further down in the list, between HAVING and ORDER
BY? Otherwise one might be tempted to write:

SELECT col1 AS mycol
FROM Table
WHERE mycol = 'X';

:)

Plamen Ratchev
http://www.SQLStudio.com
May 4 '07 #4
Here is how OUTER JOINs work in SQL-92. Assume you are given:

Table1 Table2
a b a c
====== ======
1 w 1 r
2 x 2 s
3 y 3 t
4 z

and the outer join expression:

Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a <== join condition
AND Table2.c = 't'; <== single table condition

We call Table1 the "preserved table" and Table2 the "unpreserved
table" in the query. What I am going to give you is a little
different, but equivalent to the ANSI/ISO standards.

1) We build the CROSS JOIN of the two tables. Scan each row in the
result set.

2) If the predicate tests TRUE for that row, then you keep it. You
also remove all rows derived from it from the CROSS JOIN

3) If the predicate tests FALSE or UNKNOWN for that row, then keep the
columns from the preserved table, convert all the columns from the
unpreserved table to NULLs and remove the duplicates.

So let us execute this by hand:

Let @ = passed the first predicate
Let * = passed the second predicate

Table1 CROSS JOIN Table2
a b a c
=========================
1 w 1 r @
1 w 2 s
1 w 3 t *
2 x 1 r
2 x 2 s @
2 x 3 t *
3 y 1 r
3 y 2 s
3 y 3 t @* <== the TRUE set
4 z 1 r
4 z 2 s
4 z 3 t *

Table1 LEFT OUTER JOIN Table2
a b a c
=========================
3 y 3 t <= only TRUE row
-----------------------
1 w NULL NULL Sets of duplicates
1 w NULL NULL
1 w NULL NULL
-----------------------
2 x NULL NULL
2 x NULL NULL
2 x NULL NULL
3 y NULL NULL <== derived from the TRUE set - Remove
3 y NULL NULL
-----------------------
4 z NULL NULL
4 z NULL NULL
4 z NULL NULL

the final results:

Table1 LEFT OUTER JOIN Table2
a b a c
=========================
1 w NULL NULL
2 x NULL NULL
3 y 3 t
4 z NULL NULL

The basic rule is that every row in the preserved table is represented
in the results in at least one result row.

There are limitations and very serious problems with the extended
products. Consider the two Chris Date tables

Suppliers SupParts
supno supno partno qty
========= ==============
S1 S1 P1 100
S2 S1 P2 250
S3 S2 P1 100
S2 P2 250

and let's do an extended equality outer join like this:

SELECT *
FROM Supplier, SupParts
WHERE Supplier.supno *= SupParts.supno
AND qty < 200;

If I do the outer first, I get:

Suppliers LOJ SupParts
supno supno partno qty
=======================
S1 S1 P1 100
S1 S1 P2 250
S2 S2 P1 100
S2 S2 P2 250
S3 NULL NULL NULL

Then I apply the (qty < 200) predicate and get

Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100

Doing it in the opposite order

Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
S3 NULL NULL NULL

Sybase does it one way, Oracle does it the other and Centura (nee
Gupta) lets you pick which one -- the worst of both non-standard
worlds! In SQL-92, you have a choice and can force the order of
execution. Either do the predicates after the join ...

SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
WHERE qty < 200;

... or do it in the joining:

SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
AND qty < 200;

Another problem is that you cannot show the same table as preserved
and unpreserved in the extended equality version, but it is easy in
SQL-92. For example to find the students who have taken Math 101 and
might have taken Math 102:

SELECT C1.student, C1.math, C2.math
FROM (SELECT * FROM Courses WHERE math = 101) AS C1
LEFT OUTER JOIN
(SELECT * FROM Courses WHERE math = 102) AS C2
ON C1.student = C2.student;

May 4 '07 #5

### This discussion thread is closed

Replies have been disabled for this discussion.