By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,829 Members | 673 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,829 IT Pros & Developers. It's quick & easy.

SQL performance: Nested SELECT vs. INNER JOIN

P: n/a
Hello All -

I am wondering if anyone has any thoughts on which is better from a
performance perspective: a nested Select statement or an Inner Join.
For example, I could do either of the following:

SELECT supplier_name
FROM supplier
WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
product_ID =22);

VS.

SELECT supplier.supplier_name
FROM supplier INNER JOIN products ON supplier.supplier_ID =
products.supplier_ID
WHERE products.product_ID = 22;
Which is faster?

Thanks.
Brian
Nov 12 '05 #1
Share this Question
Share on Google+
23 Replies


P: n/a
On 25 Sep 2003 11:37:05 -0700 in comp.databases.ms-access,
br*********@yahoo.com (Brian) wrote:
Hello All -

I am wondering if anyone has any thoughts on which is better from a
performance perspective: a nested Select statement or an Inner Join.
For example, I could do either of the following:


(air code below, be sure to post your findings :)

Dim strSQL(1 to 2) As String
Dim i as long, j as long
Dim varStart As variant
Dim rs As Recordset
Dim db As Database

strSQL(1)="SELECT supplier_name " & _
"FROM supplier " & _
"WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE " & _
"product_ID =22);"

strSQL(2)="SELECT supplier.supplier_name " & _
"FROM supplier INNER JOIN products ON supplier.supplier_ID = " & _
"products.supplier_ID " & _
"WHERE products.product_ID = 22; "

set db = currentdb
For i = 1 to 2
varStart=now()
for j=1 to 1000
set rs=db.openrecordset(strSQL(i),dbopensnapshot)
rs.close
set rs=nothing
next j
debug.print "Method " & i & " " & DateDiff("s",varStart,Now())
Next i
set db=nothing

--
A)bort, R)etry, I)nfluence with large hammer.
Nov 12 '05 #2

P: n/a
"Brian" <br*********@yahoo.com> wrote in message
news:be**************************@posting.google.c om...
Hello All -

I am wondering if anyone has any thoughts on which is better from a
performance perspective: a nested Select statement or an Inner Join.
For example, I could do either of the following:

SELECT supplier_name
FROM supplier
WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
product_ID =22);

VS.

SELECT supplier.supplier_name
FROM supplier INNER JOIN products ON supplier.supplier_ID =
products.supplier_ID
WHERE products.product_ID = 22;
Which is faster?

Try them and see! A clever optimizer should produce identical query plans.
However, a mistake that I've seen very often is where the subquery generates
multiple rows for each supplier, (for example, if you wanted to see all
suppliers that sold a certain product type). In this case query 1 is
superior, because the query processor can stop searching for rows as soon as
it finds the first one. <rant>Some beginners would use query 2 and put a
DISTINCT clause in to eliminate duplicates. Obviously, this is a bad, bad
thing to do, yet I've seen it done many times. Not only by beginners, but by
supposedly experienced developers.</rant>

BTW, I would, in any case, use EXISTS instead of IN, so query 1 would
become:

SELECT s.supplier_name
FROM supplier AS s
WHERE EXISTS
(
SELECT * FROM products AS p
WHERE p.product_ID =22
AND p.Supplier_ID = s.Supplier_ID
)

but, as I said, query 2 would be just as good.



Nov 12 '05 #3

P: n/a
"John Winterbottom" <jo***************@hotmail.com> wrote in message
news:bk************@ID-185006.news.uni-berlin.de...
"Brian" <br*********@yahoo.com> wrote in message
news:be**************************@posting.google.c om...
Hello All -

I am wondering if anyone has any thoughts on which is better from a
performance perspective: a nested Select statement or an Inner Join.
For example, I could do either of the following:

SELECT supplier_name
FROM supplier
WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
product_ID =22);

VS.

SELECT supplier.supplier_name
FROM supplier INNER JOIN products ON supplier.supplier_ID =
products.supplier_ID
WHERE products.product_ID = 22;
Which is faster?


Just to add to what I was syaing earlier, these two examples both use the
Northwind database to find suppliers that sell seafood products.

The right way
-------------------------------------
select s.CompanyName
from Suppliers as s
where exists
(
select * from products p
inner join categories c on p.CategoryID = c.CategoryID
where c.CategoryName = "Seafood"
and p.SupplierID=s.SupplierID
)
----------------------------------------
The wrong way
------------------------------------------------
SELECT DISTINCT Suppliers.CompanyName
FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
Categories.CategoryID = Products.CategoryID) ON Suppliers.SupplierID =
Products.SupplierID
WHERE (((Categories.CategoryName)="Seafood"))
------------------------------------------------
Nov 12 '05 #4

P: n/a

John,

On Thu, 25 Sep 2003 15:59:53 -0400, "John Winterbottom"
<jo***************@hotmail.com> wrote in comp.databases.ms-access:

Just to add to what I was syaing earlier, these two examples both use the
Northwind database to find suppliers that sell seafood products.

The right way
-------------------------------------
select s.CompanyName
from Suppliers as s
where exists
(
select * from products p
inner join categories c on p.CategoryID = c.CategoryID
where c.CategoryName = "Seafood"
and p.SupplierID=s.SupplierID
)
----------------------------------------
The wrong way
------------------------------------------------
SELECT DISTINCT Suppliers.CompanyName
FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
Categories.CategoryID = Products.CategoryID) ON Suppliers.SupplierID =
Products.SupplierID
WHERE (((Categories.CategoryName)="Seafood"))
------------------------------------------------


I find it very interesting that you would say this.

I thought it was well understood and accepted that Jet is very good at
optimizing multi-table sql statements that utilize joins, and very
poor at handling subqueries. I would kindly suggest that you test
your two example queries, because I would hazard a guess that you'll
find that the performance is the exact opposite of what you expect.

Of course, with rdbms' more generally, subqueries are fine and often
preferable in situations like this. But its not that way with Jet.

Peter Miller
__________________________________________________ __________
PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
Free quotes, Guaranteed lowest prices and best results
www.pksolutions.com 1.800.987.7716 1.619.839.3900
Nov 12 '05 #5

P: n/a
"Peter Miller" <pm*****@pksolutions.com> wrote in message
news:o7********************************@4ax.com...

John,

On Thu, 25 Sep 2003 15:59:53 -0400, "John Winterbottom"
<jo***************@hotmail.com> wrote in comp.databases.ms-access:

Just to add to what I was syaing earlier, these two examples both use the
Northwind database to find suppliers that sell seafood products.

The right way
-------------------------------------
select s.CompanyName
from Suppliers as s
where exists
(
select * from products p
inner join categories c on p.CategoryID = c.CategoryID
where c.CategoryName = "Seafood"
and p.SupplierID=s.SupplierID
)
----------------------------------------
The wrong way
------------------------------------------------
SELECT DISTINCT Suppliers.CompanyName
FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
Categories.CategoryID = Products.CategoryID) ON Suppliers.SupplierID =
Products.SupplierID
WHERE (((Categories.CategoryName)="Seafood"))
------------------------------------------------


I find it very interesting that you would say this.

I thought it was well understood and accepted that Jet is very good at
optimizing multi-table sql statements that utilize joins, and very
poor at handling subqueries. I would kindly suggest that you test
your two example queries, because I would hazard a guess that you'll
find that the performance is the exact opposite of what you expect.

Of course, with rdbms' more generally, subqueries are fine and often
preferable in situations like this. But its not that way with Jet.

Right now I can only test in SQL Server - and it's as I said; query 1 has a
lower cost than query 2. I'd be very surprised if Jet was different. The
reason is because of how a select works, (or how it shold work). Joe Celko
has posted this several times - I'll see if I can find it if you like.

Basically, when you use SELECT DISTINCT...WHERE, (or a GROUP BY clause), the
query engine first builds a working table with all the rows satisfying the
WHERE condition. Only then can it aggregate them. This is more expensive. In
the first example, (using EXISTS), the processor only needs to test for the
existence of a single row. In other words, if you have 5000 seafood products
from the same supplier, all it needs to do is find one of them and the
EXISTS condition is satisfied. The processor can move on to the next
supplier.




Nov 12 '05 #6

P: n/a

On Thu, 25 Sep 2003 16:56:36 -0400, "John Winterbottom"
<jo***************@hotmail.com> wrote in comp.databases.ms-access:
Right now I can only test in SQL Server - and it's as I said; query 1 has a
lower cost than query 2. I'd be very surprised if Jet was different. The
reason is because of how a select works, (or how it shold work). Joe Celko
has posted this several times - I'll see if I can find it if you like.
Hey, Joe's the man, but your statement doesn't follow...

To wit:
I'd be very surprised if Jet was different.
Understood.
The reason is because of how a select works, (or how it shold work).
But that's precisely the point I was making. Jet is fine at joins,
and weak at subqueries. It's got nothing to do with what 'should' be
the case. It's been clear through the history of Jet that it is weak
in this area.
Basically, when you use SELECT DISTINCT...WHERE, (or a GROUP BY clause), the
query engine first builds a working table with all the rows satisfying the
WHERE condition. Only then can it aggregate them. This is more expensive. In
the first example, (using EXISTS), the processor only needs to test for the
existence of a single row. In other words, if you have 5000 seafood products
from the same supplier, all it needs to do is find one of them and the
EXISTS condition is satisfied. The processor can move on to the next
supplier.


....yes, all well understood. But What you'll find with jet is that
EXISTS clauses don't work well, because they rely on subqueries, and
subqueries are poorly handled by Jet.

For example, take your 'correct' example, and break it down again by
using another exists statement (ie, no joins at all, just a query with
a subquery and a nested subquery beneath the subquery.

It is my understanding that Jet actually works the full subquery (or
subqueries in this case) up from the bottom, and essentially does the
join implicitly only once the subquery has been processed (to the
extent possible). At the same time, DISTINCT is nicely optimized to
an effective TOP 1 / GROUP BY combo (ie, no processing occurs beyond
finding the first matching instance at each level). So you see little
penalty in Jet by using DISTINCT and a large penalty for using the
subquery.

As I'm sure Celko would agree, despite the obvious benefits of generic
sql that's 100% standards compliant, certain sql implementations will
have certain nuances that simply can't be ignored in real-world
database applications. Jet's poor handling of subqueries is a case in
point.

Peter Miller
__________________________________________________ __________
PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
Free quotes, Guaranteed lowest prices and best results
www.pksolutions.com 1.800.987.7716 1.619.839.3900
Nov 12 '05 #7

P: n/a
With all due respect to Joe Celko, he has never had
much time for JET...

(david)
Nov 12 '05 #8

P: n/a
"Peter Miller" <pm*****@pksolutions.com> wrote in message
news:oj********************************@4ax.com...

...yes, all well understood. But What you'll find with jet is that
EXISTS clauses don't work well, because they rely on subqueries, and
subqueries are poorly handled by Jet.

For example, take your 'correct' example, and break it down again by
using another exists statement (ie, no joins at all, just a query with
a subquery and a nested subquery beneath the subquery.

It is my understanding that Jet actually works the full subquery (or
subqueries in this case) up from the bottom, and essentially does the
join implicitly only once the subquery has been processed (to the
extent possible). At the same time, DISTINCT is nicely optimized to
an effective TOP 1 / GROUP BY combo (ie, no processing occurs beyond
finding the first matching instance at each level). So you see little
penalty in Jet by using DISTINCT and a large penalty for using the
subquery.


Peter, you're absolutely right. I tested this and there's no difference
whatsoever in execution time for either query running against a Jet
database. I haven't looked at the query plans to see if they are the same.
I'm pretty amazed at this. If you run the two in SQL Server you'll see a
major improvement for query 1.

Live and learn. Thanks for the correction Pater, and the explanation. That's
twice in one day I've made a claim that's incorrect, so now it's time for me
to go and have a beer.




Nov 12 '05 #9

P: n/a

"david epsom dot com dot au" <david@epsomdotcomdotau> wrote in message
news:uT**************@TK2MSFTNGP10.phx.gbl...
With all due respect to Joe Celko, he has never had
much time for JET...


True. This may be one of the reasons why!

I had just assumed, since the designers of Jet had chosen to implement the
exists clause, that they would have optimized it as well, (as do Oracle, DB2
and SQL Server). Otehrwise what's the point? Oh well, as I said to Peter,
live and learn.
Nov 12 '05 #10

P: n/a
On Thu, 25 Sep 2003 20:06:04 -0400, "John Winterbottom"
<jo***************@hotmail.com> wrote in comp.databases.ms-access:
I tested this and there's no difference
whatsoever in execution time for either query running against a Jet
database.
Hmm. I tested it here too, but found a 33% performance benefit using
query 2 over query 1. Did you iterate enough times to get meaningful
results? I found that using a 20,000 iteration loop on query 1
against the Northwind tables, I saw 29 seconds for query 1 and 19
seconds for query 2. If I broke query 1 down into two exists/subquery
clauses instead of just one and a join, the time required escalated to
38 seconds for 20,000 iterations.

In other words, there's very much a difference, and its decidedly
faster to join that to use subqueries.
I'm pretty amazed at this. If you run the two in SQL Server you'll see a
major improvement for query 1.


As you would with Oracle, DB2 or Sybase. Just not Jet.

Peter Miller
__________________________________________________ __________
PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
Free quotes, Guaranteed lowest prices and best results
www.pksolutions.com 1.800.987.7716 1.619.839.3900
Nov 12 '05 #11

P: n/a
I like to think that, at least originally, Jet was optimised
for naive users...
(david)

"John Winterbottom" <jo***************@hotmail.com> wrote in message
news:bl************@ID-185006.news.uni-berlin.de...

"david epsom dot com dot au" <david@epsomdotcomdotau> wrote in message
news:uT**************@TK2MSFTNGP10.phx.gbl...
With all due respect to Joe Celko, he has never had
much time for JET...

True. This may be one of the reasons why!

I had just assumed, since the designers of Jet had chosen to implement the
exists clause, that they would have optimized it as well, (as do Oracle,

DB2 and SQL Server). Otehrwise what's the point? Oh well, as I said to Peter,
live and learn.

Nov 12 '05 #12

P: n/a
"david epsom dot com dot au" wrote
With all due respect to Joe Celko, he
has never had much time for JET...


Joe and I have crossed words (not swords) in the past, and then I decided he
was an "elitist" who didn't seem to have either much time or respect for
anything he considered a mere desktop database. He's certainly not the only
one in that category, of course.
Nov 12 '05 #13

P: n/a
rkc

"Larry Linson" <bo*****@localhost.net> wrote in message
news:Lk*******************@nwrddc01.gnilink.net...
"david epsom dot com dot au" wrote
> With all due respect to Joe Celko, he
> has never had much time for JET...
Joe and I have crossed words (not swords) in the past, and then I decided

he was an "elitist" who didn't seem to have either much time or respect for
anything he considered a mere desktop database. He's certainly not the only one in that category, of course.

It would be pointless to spend time learning the idiosyncrasies of something
you
have no use for. That's not elitist. That's sensible.


Nov 12 '05 #14

P: n/a
It is "elitist" to denigrate a category of products _in your field_ without
understanding them and imply that because they aren't something else (in
this case, server databases) that they couldn't be worth anyone's time.

There are situations in which a server database is the proper solution;
there are other situations in which a simple desktop database is the proper
solution. If you are '"in the database business", you really ought to be a
pragmatist and understand both kinds of tool sufficiently well to make an
informed decision as to which one is appropriate.

And, you aren't nearly as likely to be thought an insufferable egotist
rather than just elitist, either.

On the other hand, it might well be pointless for someone in the database
business to spend the time learning the idiosyncracies of image processing
software.

Larry

"rkc" <rk*@yabba.dabba.do.rochester.rr.com> wrote in message
news:k1*******************@twister.nyroc.rr.com...

"Larry Linson" <bo*****@localhost.net> wrote in message
news:Lk*******************@nwrddc01.gnilink.net...
"david epsom dot com dot au" wrote
> With all due respect to Joe Celko, he
> has never had much time for JET...
Joe and I have crossed words (not swords) in the past, and then I

decided he
was an "elitist" who didn't seem to have either much time or respect for
anything he considered a mere desktop database. He's certainly not the only
one in that category, of course.

It would be pointless to spend time learning the idiosyncrasies of

something you
have no use for. That's not elitist. That's sensible.

Nov 12 '05 #15

P: n/a
rkc

"Larry Linson" <bo*****@localhost.net> wrote in message
news:sh*******************@nwrddc01.gnilink.net...
It is "elitist" to denigrate a category of products _in your field_ without understanding them and imply that because they aren't something else (in
this case, server databases) that they couldn't be worth anyone's time.

There are situations in which a server database is the proper solution;
there are other situations in which a simple desktop database is the proper solution. If you are '"in the database business", you really ought to be a
pragmatist and understand both kinds of tool sufficiently well to make an
informed decision as to which one is appropriate.

And, you aren't nearly as likely to be thought an insufferable egotist
rather than just elitist, either.

On the other hand, it might well be pointless for someone in the database
business to spend the time learning the idiosyncracies of image processing
software.


Of course you are right. Anyone in the database business should be
intimately familiar with every database system on the planet. It's
unthinkable
that there may be database professionals that never work at the desktop
level.

Nov 12 '05 #16

P: n/a
"rkc" <rk*@yabba.dabba.do.rochester.rr.com> wrote in
news:EL*******************@twister.nyroc.rr.com:
Of course you are right. Anyone in the database business should be
intimately familiar with every database system on the planet. It's
unthinkable
that there may be database professionals that never work at the desktop
level.


I always work at the desktop level except when I put the keyboard on the
floor and type with my toes.

--
Lyle
(for e-mail refer to http://ffdba.com/contacts.htm)
Nov 12 '05 #17

P: n/a
rk*@yabba.dabba.do.rochester.rr.mom (rkc) wrote in
<EL*******************@twister.nyroc.rr.com>:

"Larry Linson" <bo*****@localhost.net> wrote in message
news:sh*******************@nwrddc01.gnilink.net.. .
It is "elitist" to denigrate a category of products _in your
field_

without
understanding them and imply that because they aren't something
else (in this case, server databases) that they couldn't be
worth anyone's time.

There are situations in which a server database is the proper
solution; there are other situations in which a simple desktop
database is the

proper
solution. If you are '"in the database business", you really
ought to be a pragmatist and understand both kinds of tool
sufficiently well to make an informed decision as to which one
is appropriate.

And, you aren't nearly as likely to be thought an insufferable
egotist rather than just elitist, either.

On the other hand, it might well be pointless for someone in the
database business to spend the time learning the idiosyncracies
of image processing software.


Of course you are right. Anyone in the database business should
be intimately familiar with every database system on the planet.
It's unthinkable
that there may be database professionals that never work at the
desktop level.


It's fine to not know something.

It's *not* fine to badmouth something you don't know, since your
lack of understanding may be the source of your bad impression of
the product more than any actual deficiencies in the product.

Celko is an arrogant blowhard, so far as I can see.

But then, so am I!

--
David W. Fenton http://www.bway.net/~dfenton
dfenton at bway dot net http://www.bway.net/~dfassoc
Nov 12 '05 #18

P: n/a
RC

In regards to performance there isn't really any difference between the
two, it's more of a question on what method your more comfortable with,
SQL Server has a very power tool known as the Query Optimizer that's
incharge on how it handles a request and uses the best execution plan
for a query, so if you write select statement, the Query optimizer might
use an entirly different execution plan. Of course you can force SQL
Server to use your plan using the SET FORCEPLAN statement, but it's a
good idea to let SQL Server decide with plan to use. To test this for
your self execute the two Queries, On the Query menu, click Show
Execution Plan, and you will see that SQL Server used the same plan for
both.


*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 12 '05 #19

P: n/a
"rkc" wrote
It is "elitist" to denigrate a category of
products _in your field_ without
understanding them and imply that because
they aren't something else (in this case,
server databases) that they couldn't be
worth anyone's time.

There are situations in which a server
database is the proper solution; there
are other situations in which a simple
desktop database is the proper
solution. If you are '"in the database
business", you really ought to be a
pragmatist and understand both kinds
of tool sufficiently well to make an
informed decision as to which one is appropriate.

And, you aren't nearly as likely to be
thought an insufferable egotist
rather than just elitist, either.

On the other hand, it might well be
pointless for someone in the database
business to spend the time learning
the idiosyncracies of image processing
software.
Of course you are right. Anyone in the
database business should be intimately
familiar with every database system on
the planet. It's unthinkable that there may
be database professionals that never work
at the desktop level.


Are you just being obnoxious or do you really think that is what I said? If
the latter, then you really ought to re-read it. I suspect the former, and,
if that is so, you aren't impressing anybody by deliberately
misinterpreting, you know.

I am not, of course, "intimately familiar with every database system on the
planet", but I am in the database business and I make it my business to know
enough about server systems in general (and a bit more about some of the
more frequently-used ones) so that I can discuss them intelligently. And, if
I did not know enough to discuss them intelligently, I'd certainly have
enough common sense not to dismiss them out of hand.
Nov 12 '05 #20

P: n/a
"Larry Linson" <bo*****@localhost.net> wrote in
news:ss*******************@nwrddc01.gnilink.net:
Are you just being obnoxious or do you really think that is what I said?
If the latter, then you really ought to re-read it. I suspect the
former, and, if that is so, you aren't impressing anybody by
deliberately misinterpreting, you know.


Why not? It works for you.

--
Lyle
(for e-mail refer to http://ffdba.com/contacts.htm)
Nov 12 '05 #21

P: n/a
"Lyle Fairfield" wrote
Are you just being obnoxious or do
you really think that is what I said?
If the latter, then you really ought to
re-read it. I suspect the former, and,
if that is so, you aren't impressing any-
body by deliberately misinterpreting,
you know.


Why not? It works for you.


Lyle, if you feel _compelled_ to start an argument, go try to bait someone
else into it.
Nov 12 '05 #22

P: n/a
rkc

"Larry Linson" <bo*****@localhost.net> wrote in message
news:ss*******************@nwrddc01.gnilink.net...
"rkc" wrote
> > On the other hand, it might well be
> > pointless for someone in the database
> > business to spend the time learning
> > the idiosyncracies of image processing
> > software.

> Of course you are right. Anyone in the
> database business should be intimately
> familiar with every database system on
> the planet. It's unthinkable that there may
> be database professionals that never work
> at the desktop level.


Are you just being obnoxious or do you really think that is what I said?

If the latter, then you really ought to re-read it. I suspect the former, and, if that is so, you aren't impressing anybody by deliberately
misinterpreting, you know.


Oh my. Are we crossing words or swords here?

I wouldn't have replied at all if it weren't for your ridiculous image
software
paragraph.

Moving on.


Nov 12 '05 #23

P: n/a
Oh, and another point: Complex 'Nested Select' queries sometimes don't work
with tables linked to an ODBC datasource. The Jet queryplan builder fails
to build a valid queryplan.

(david)

"Brian" <br*********@yahoo.com> wrote in message
news:be**************************@posting.google.c om...
Hello All -

I am wondering if anyone has any thoughts on which is better from a
performance perspective: a nested Select statement or an Inner Join.
For example, I could do either of the following:

SELECT supplier_name
FROM supplier
WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
product_ID =22);

VS.

SELECT supplier.supplier_name
FROM supplier INNER JOIN products ON supplier.supplier_ID =
products.supplier_ID
WHERE products.product_ID = 22;
Which is faster?

Thanks.
Brian

Nov 12 '05 #24

This discussion thread is closed

Replies have been disabled for this discussion.