469,331 Members | 5,388 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,331 developers. It's quick & easy.

Equivalence of join and exists

Hello all,

We are seeing strange behaviour for queries on a table we need to
convert data from. We try to find out whether table A(B_CONV_ID)
contains data that does not exists in table B(CONV_ID). We expect that
there is data missing in table B, thus finding a positive number of
records.

There are two tables, A and B.

A: ID, ..., B_CONV_ID
B: ID, ..., CONV_ID

The queries we are trying to execute and their results are the
following:

select count(*)
from A as AA
join B as BB on BB.CONV_ID = AA.B_CONV_ID
where BB.CONV_ID is null;

Result = 0;

select count(*)
from A as AA
where not exists (select 1 from B as BB where BB.CONV_ID =
AA.B_CONV_ID);

Result = 6273

A colleague confirmed me that the two queries were supposed to
retrieve the same result. Can anyone give us a clue of why the result
can be this different.
Nov 20 '07 #1
2 2027
On Nov 20, 3:50 pm, frederikenge...@gmail.com wrote:
Hello all,

We are seeing strange behaviour for queries on a table we need to
convert data from. We try to find out whether table A(B_CONV_ID)
contains data that does not exists in table B(CONV_ID). We expect that
there is data missing in table B, thus finding a positive number of
records.

There are two tables, A and B.

A: ID, ..., B_CONV_ID
B: ID, ..., CONV_ID

The queries we are trying to execute and their results are the
following:

select count(*)
from A as AA
join B as BB on BB.CONV_ID = AA.B_CONV_ID
where BB.CONV_ID is null;

Result = 0;

select count(*)
from A as AA
where not exists (select 1 from B as BB where BB.CONV_ID =
AA.B_CONV_ID);

Result = 6273

A colleague confirmed me that the two queries were supposed to
retrieve the same result. Can anyone give us a clue of why the result
can be this different.
You need to do an outer join. I assume A join B is the same A inner
join B, and in that case:

from A as AA join B as BB
on BB.CONV_ID = AA.B_CONV_ID
where BB.CONV_ID is null;

is a contradiction. Try:

select count(*)
from A as AA
left outer join B as BB
on BB.CONV_ID = AA.B_CONV_ID
where BB.CONV_ID is null;

/Lennart
Nov 20 '07 #2
fr*************@gmail.com wrote:
Hello all,

We are seeing strange behaviour for queries on a table we need to
convert data from. We try to find out whether table A(B_CONV_ID)
contains data that does not exists in table B(CONV_ID). We expect that
there is data missing in table B, thus finding a positive number of
records.

There are two tables, A and B.

A: ID, ..., B_CONV_ID
B: ID, ..., CONV_ID

The queries we are trying to execute and their results are the
following:

select count(*)
from A as AA
join B as BB on BB.CONV_ID = AA.B_CONV_ID
where BB.CONV_ID is null;

Result = 0;

select count(*)
from A as AA
where not exists (select 1 from B as BB where BB.CONV_ID =
AA.B_CONV_ID);

Result = 6273

A colleague confirmed me that the two queries were supposed to
retrieve the same result. Can anyone give us a clue of why the result
can be this different.
The queries do not have the same semantics, especially if you consider
NULLs. What happens is this:

The join predicate of the first query will evaluate to "unknown" if either
BB.CONV_ID or AA.B_CONV_ID happen to be NULL. "unknown" is interpreted
like "false", so all joined rows where either of the attributes is NULL
will be discarded. Adding "WHERE bb.conv_id IS NULL" doesn't change
anything at all because the relevant rows where already discarded before.

In the second query, you have pretty much the same. Your subselect does not
identify any qualifying rows in table B. But now you have a NOT EXISTS,
which leads to a "true" result for the predicate, and that qualifies the
row in table A.
What you can do to fix that is to correct the join condition in query 1 like
this:

ON ( bb.conv_id = aa.b_conv_id OR
bb.conv_id IS NULL OR
aa.b_conv_id IS NULL )

or maybe this, depending on what you want to achieve:

ON ( bb.conv_id = aa.b_conv_id OR
( bb.conv_id IS NULL AND
aa.b_conv_id IS NULL ) )

(You wouldn't need any of the parenthesis, but it makes things much
clearer.)

Alternatively, you can add "OR bb.conv_id IS NULL" to the WHERE clause in
the subselect of the 2nd query.

--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany
Nov 20 '07 #3

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

8 posts views Thread by red | last post: by
10 posts views Thread by Xah Lee | last post: by
3 posts views Thread by MaxPenguin | last post: by
5 posts views Thread by Alejandrina | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by suresh191 | last post: by
reply views Thread by Purva khokhar | last post: by
1 post views Thread by Marylou17 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.