469,283 Members | 2,158 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,283 developers. It's quick & easy.

optimization question (indexes and joins)

I am building a query to select contacts and filter by category. I was
wondering if anyone could explain why the first select statement takes 950
times as long to execute compared to the second. I'm sure it has something
to do with the way the join is processed but I'm still curious what the
engine actually does in each case. In my test the first statement executed
in 0.03 seconds while the second took

Some additional info- In the tests there were 3911 contact records and 17164
contact_category records. The following fields have BTREE indexes:
contact.id, UNIQUE(contact_category.contact_id,
contact_category.category_id)

SELECT DISTINCT contacts.id FROM contacts
LEFT JOIN contact_category ON contacts.id=contact_category.contact_id
WHERE category_id IN (1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

SELECT DISTINCT contacts.id FROM contacts, contact_category
WHERE contacts.id=contact_category.contact_id AND category_id IN (1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

Any tips or clarification on the queries or suggestions how to improve them
or the indexes would be greatly appreciated.

Alex
Jul 23 '05 #1
3 1290
The primary difference is the type of joins being performed. The first
query is (obviously) a left join, using indexed columns from each table
as the join condition. The second, slower, query is an inner join,
which, according to the docs:

INNER JOIN and , (comma) are semantically equivalent in the absence of
a join condition: both produce a Cartesian product between the specified
tables (that is, each and every row in the first table are joined to all
rows in the second table).

which is then (presumably) joining on some non-indexed columns, and,
save the limit clause, would return 67,000,000+ records as opposed to
3911 in the first query.

An EXPLAIN SELECT is very helpful in determining where index and query
optimization should take place.

Alex Glass wrote:
I am building a query to select contacts and filter by category. I was
wondering if anyone could explain why the first select statement takes 950
times as long to execute compared to the second. I'm sure it has something
to do with the way the join is processed but I'm still curious what the
engine actually does in each case. In my test the first statement executed
in 0.03 seconds while the second took

Some additional info- In the tests there were 3911 contact records and 17164
contact_category records. The following fields have BTREE indexes:
contact.id, UNIQUE(contact_category.contact_id,
contact_category.category_id)

SELECT DISTINCT contacts.id FROM contacts
LEFT JOIN contact_category ON contacts.id=contact_category.contact_id
WHERE category_id IN (1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

SELECT DISTINCT contacts.id FROM contacts, contact_category
WHERE contacts.id=contact_category.contact_id AND category_id IN (1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

Any tips or clarification on the queries or suggestions how to improve them
or the indexes would be greatly appreciated.

Alex

Jul 23 '05 #2
Thanks for the reply, but if you re-read what I wrote the bizarre thing that
is happening is that the INNER JOIN (Cartesian product) is faster that the
LEFT JOIN which is what I don't quite understand. It seems counter-intuitive
that this be the case...

"hoonew" <ho****@ino.now> wrote in message
news:tlzEe.116931$gc6.54828@okepread04...
The primary difference is the type of joins being performed. The first
query is (obviously) a left join, using indexed columns from each table as
the join condition. The second, slower, query is an inner join, which,
according to the docs:

INNER JOIN and , (comma) are semantically equivalent in the absence of a
join condition: both produce a Cartesian product between the specified
tables (that is, each and every row in the first table are joined to all
rows in the second table).

which is then (presumably) joining on some non-indexed columns, and, save
the limit clause, would return 67,000,000+ records as opposed to 3911 in
the first query.

An EXPLAIN SELECT is very helpful in determining where index and query
optimization should take place.

Alex Glass wrote:
I am building a query to select contacts and filter by category. I was
wondering if anyone could explain why the first select statement takes
950 times as long to execute compared to the second. I'm sure it has
something to do with the way the join is processed but I'm still curious
what the engine actually does in each case. In my test the first
statement executed in 0.03 seconds while the second took

Some additional info- In the tests there were 3911 contact records and
17164 contact_category records. The following fields have BTREE indexes:
contact.id, UNIQUE(contact_category.contact_id,
contact_category.category_id)

SELECT DISTINCT contacts.id FROM contacts
LEFT JOIN contact_category ON contacts.id=contact_category.contact_id
WHERE category_id IN (1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

SELECT DISTINCT contacts.id FROM contacts, contact_category
WHERE contacts.id=contact_category.contact_id AND category_id IN
(1,2,3,4)
LIMIT 0,500;

----------------------------------------------------------------------------------------

Any tips or clarification on the queries or suggestions how to improve
them or the indexes would be greatly appreciated.

Alex

Jul 24 '05 #3
Indeed, I need to read more carefully... What does EXPLAIN SELECT return?

Alex Glass wrote:
Thanks for the reply, but if you re-read what I wrote the bizarre thing that
is happening is that the INNER JOIN (Cartesian product) is faster that the
LEFT JOIN which is what I don't quite understand. It seems counter-intuitive
that this be the case...

Jul 24 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

reply views Thread by Jesse Sheidlower | last post: by
2 posts views Thread by ensnare | last post: by
3 posts views Thread by Ike | last post: by
5 posts views Thread by Kenneth Courville | last post: by
19 posts views Thread by nospammmer | last post: by
10 posts views Thread by Troels Arvin | last post: by
22 posts views Thread by Zamdrist | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.