By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
455,364 Members | 1,147 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 455,364 IT Pros & Developers. It's quick & easy.

An optimum query..?

P: n/a
Hello friends,
I have one simple question. I have two tables. 1 ( Table A ) has about
2.5 million rows and second one ( Table B ) has about 1 million. There
are common ID fields in both tables. I want join them on ID field and
get all rows of Table A which are not in there Table B.

When I ran following two queries, I got same result set, but time it
took was very different.

Following query took 1:35 minutes
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT exists (
SELECT 1 from [Table B] Tbl2 WHERE
Tbl1.UID = Tbl2.UID )

vs this one took .45 seconds.

SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 WHERE ON Tbl1.UID =
Tbl2.UID
WHERE Tbl2.UID IS NULL

Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.

I appreciate all help from the community.
JB
Apr 3 '08 #1
Share this Question
Share on Google+
4 Replies


P: n/a
Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.
I would have expected both the NOT EXISTS and the LEFT JOIN...WHERE..IS NULL
to yield the same plan and performance. However, the second query you
posted is invalid (has a WHERE in the JOIN clause). Can you post the actual
table DDL and queries?

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

<ds**********@gmail.comwrote in message
news:19**********************************@m73g2000 hsh.googlegroups.com...
Hello friends,
I have one simple question. I have two tables. 1 ( Table A ) has about
2.5 million rows and second one ( Table B ) has about 1 million. There
are common ID fields in both tables. I want join them on ID field and
get all rows of Table A which are not in there Table B.

When I ran following two queries, I got same result set, but time it
took was very different.

Following query took 1:35 minutes
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT exists (
SELECT 1 from [Table B] Tbl2 WHERE
Tbl1.UID = Tbl2.UID )

vs this one took .45 seconds.

SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 WHERE ON Tbl1.UID =
Tbl2.UID
WHERE Tbl2.UID IS NULL

Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.

I appreciate all help from the community.
JB
Apr 3 '08 #2

P: n/a
On Apr 2, 10:21*pm, "Dan Guzman" <guzma...@nospam-
online.sbcglobal.netwrote:
Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.

I would have expected both the NOT EXISTS and the LEFT JOIN...WHERE..IS NULL
to yield the same plan and performance. *However, the second query you
posted is invalid (has a WHERE in the JOIN clause). *Can you post the actual
table DDL and queries?

--
Hope this helps.

Dan Guzman
SQL Server MVPhttp://weblogs.sqlteam.com/dang/

<dsdevonso...@gmail.comwrote in message

news:19**********************************@m73g2000 hsh.googlegroups.com...
Hello friends,
I have one simple question. I have two tables. 1 ( Table A ) has about
2.5 million rows and second one ( Table B ) has about 1 million. There
are common ID fields in both tables. I want join them on ID field and
get all rows of Table A which are not in there Table B.
When I ran following two queries, I got same result set, but time it
took was very different.
Following query took 1:35 minutes
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT exists (
SELECT *1 from [Table B] Tbl2 WHERE
Tbl1.UID = Tbl2.UID )
vs this one took .45 seconds.
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 WHERE ON Tbl1.UID =
Tbl2.UID
WHERE Tbl2.UID IS NULL
Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.
I appreciate all help from the community.
JB- Hide quoted text -

- Show quoted text -
Oh,
second query is just without WHERE clause. It was an honest mistake. I
checked on client statistics and execution plan. In Left Outer join
query, Client processing time 67858, vs in NOT EXISTS query Client
processing time 72074 .

Thanks Dan..
Apr 3 '08 #3

P: n/a
On Apr 2, 8:06 pm, dsdevonso...@gmail.com wrote:
Oh,
second query is just without WHERE clause. It was an honest mistake. I
checked on client statistics and execution plan. In Left Outer join
query, Client processing time 67858, vs in NOT EXISTS query Client
processing time 72074 .
What do the query plans look like? Like Dan, I would expect that these
queries would produce the same plan.

Did you run the queries multiple times? If you ran each one only once,
the difference may be due to caching.
Apr 3 '08 #4

P: n/a
Here is another alternative, if you have 2005 or 2008. Since you did
not use very good data element names, you might not know about SAN
("Standard Address Number") which is used in many industries for
address data. On Oracle and other DBs, the EXCEPT is quite fast.

SELECT san, last_name, first_name, house_nbr, street_name, city_name,
state_code
FROM Addresses
WHERE san
IN
(SELECT san FROM Addresses
EXCEPT
SELECT san FROM Foobar); -- not a great table name ..
Apr 3 '08 #5

This discussion thread is closed

Replies have been disabled for this discussion.