Bytes | Software Development & Data Engineering Community
How to optimize a join query across three tables?

Hi,
I need to optimize a join query that joins three tables, say table1, table2, and table3, each holding a huge volume of records. The query is:
    SELECT table1.id, table2.time, table3.Status
    FROM table1
    LEFT JOIN table2 USING (id)
    LEFT JOIN table3 USING (id)
    LIMIT 10000
This runs efficiently when the limit is small, but it becomes a problem on a huge data set, when each table contains millions of records. Any suggestions for optimizing it, or another way to do this? Thanks in advance. I am using MySQL 4.1.
Dec 22 '07 #1
pbmods
Heya, Bravo.

Your experience is quite common; joins get slow on large data sets.

You'd be better served in this case by either breaking it up into three separate queries, or by using subqueries.

You could also try rewriting it as an implicit (comma) join; sometimes that helps:
    SELECT table1.id, table2.time, table3.Status
    FROM table1, table2, table3
    WHERE table2.id = table1.id AND table3.id = table2.id
    LIMIT 10000
Dec 23 '07 #2
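One caveat worth noting: the comma-join form above is an inner join rather than an outer join, so ids missing from table2 or table3 disappear from the output entirely. The difference can be sketched with SQLite and hypothetical toy data standing in for the real MySQL 4.1 tables:

```python
import sqlite3

# Toy stand-ins for the thread's tables (hypothetical data; the real
# tables hold millions of rows and live in MySQL 4.1, not SQLite).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, time TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO table1 VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO table2 VALUES (2,'0:0'),(3,'0:1');
    INSERT INTO table3 VALUES (3,'true');
""")

# LEFT JOIN keeps every table1 row, padding missing matches with NULL.
left_rows = cur.execute("""
    SELECT table1.id, table2.time, table3.Status
    FROM table1
    LEFT JOIN table2 USING (id)
    LEFT JOIN table3 USING (id)
    ORDER BY table1.id
""").fetchall()

# The comma join is an inner join: only ids present in all three survive.
inner_rows = cur.execute("""
    SELECT table1.id, table2.time, table3.Status
    FROM table1, table2, table3
    WHERE table2.id = table1.id AND table3.id = table2.id
    ORDER BY table1.id
""").fetchall()

print(left_rows)   # [(1, None, None), (2, '0:0', None), (3, '0:1', 'true')]
print(inner_rows)  # [(3, '0:1', 'true')]
```

Whether that matters depends on whether the unmatched rows are needed in the result.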
bravo
Thanks. As per your suggestions I tried subqueries, but unfortunately I'm still not getting any fruitful result, and I can't break it into separate queries for other reasons. Say the tables are:
Table1
id | name
1 | a
2 | b
3 | c


Table2
id | time
2 | 0:0
3 | 0:1

Table3
id | status
3 | true

I need the result set as
id | time | status
1 | null | null
2 | 0:0 | null
3 | 0:1 | true

I am not getting that result through subqueries. I do get the right set using joins, but they are not efficient on a huge data set.
Please help me out, I am stuck.
Dec 24 '07 #3
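For reference, one subquery form that does reproduce the NULL-padded result above is a pair of correlated scalar subqueries. This works only because id is unique in table2 and table3, so each subquery yields at most one row. A sketch with SQLite standing in for MySQL 4.1, using the sample rows from this post:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, time TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO table1 VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO table2 VALUES (2,'0:0'),(3,'0:1');
    INSERT INTO table3 VALUES (3,'true');
""")

# Each scalar subquery returns the matching value or NULL, mimicking
# the LEFT JOIN's NULL padding for ids absent from table2/table3.
rows = cur.execute("""
    SELECT t1.id,
           (SELECT t2.time   FROM table2 t2 WHERE t2.id = t1.id) AS time,
           (SELECT t3.Status FROM table3 t3 WHERE t3.id = t1.id) AS Status
    FROM table1 t1
    ORDER BY t1.id
""").fetchall()

print(rows)  # [(1, None, None), (2, '0:0', None), (3, '0:1', 'true')]
```

Whether this is faster than the joins on millions of rows would need to be measured; it mainly shows that subqueries can produce the desired NULL-padded shape.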
pbmods
Heya, bravo.

What do you get if you run this query:
    EXPLAIN SELECT table1.id, table2.time, table3.Status
    FROM table1
    LEFT JOIN table2 USING (id)
    LEFT JOIN table3 USING (id)
    LIMIT 10000
Dec 24 '07 #4
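In MySQL's EXPLAIN output, type = ALL means a full table scan, which is usually the smoking gun on large tables. The same before-and-after effect of an index can be sketched with SQLite's EXPLAIN QUERY PLAN (the syntax and output format differ from MySQL 4.1's EXPLAIN, but the idea carries over):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, v TEXT)")

# Without an index, the plan is a full scan of t.
before = cur.execute("EXPLAIN QUERY PLAN SELECT * FROM t WHERE id = 1").fetchall()

cur.execute("CREATE INDEX idx_t_id ON t(id)")

# With the index, the plan becomes an index search.
after = cur.execute("EXPLAIN QUERY PLAN SELECT * FROM t WHERE id = 1").fetchall()

print(before[-1][-1])  # e.g. "SCAN t"
print(after[-1][-1])   # e.g. "SEARCH t USING INDEX idx_t_id (id=?)"
```

The exact plan text varies between SQLite versions, but the SCAN-to-SEARCH transition is the point.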
bravo
This is what I get when I run the above query:

id | select_type | table  | type | possible_keys | key  | key_len | ref            | rows   | Extra
 1 | SIMPLE      | table1 | ALL  | NULL          | NULL | NULL    | NULL           | 149292 |
 1 | SIMPLE      | table2 | ALL  | NULL          | NULL | NULL    | NULL           | 100000 |
 1 | SIMPLE      | table3 | ref  | id            | id   | 25      | host.table2.id | 10     |
Dec 25 '07 #5
pbmods
Heya, Bravo.

OK. What we're going to do here is turn table1 into a derived table using a subquery. That immediately cuts the number of records MySQL has to work with from roughly 150,000 down to the 10,000 you need:

    SELECT dTable1.id, table2.time, table3.Status
    FROM (
        SELECT id
        FROM table1
        LIMIT 10000
    ) AS dTable1
    LEFT JOIN table2 USING (id)
    LEFT JOIN table3 USING (id)
    LIMIT 10000
Dec 25 '07 #6
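The derived-table rewrite can be sanity-checked on a toy data set (SQLite here, not MySQL 4.1). One caveat: with no ORDER BY inside the subquery, which 10,000 ids you get is undefined, so this sketch adds one for determinism:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, time TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO table1 VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO table2 VALUES (2,'0:0'),(3,'0:1');
    INSERT INTO table3 VALUES (3,'true');
""")

# LIMIT the driving table first, then join only those ids outward.
rows = cur.execute("""
    SELECT dTable1.id, table2.time, table3.Status
    FROM (SELECT id FROM table1 ORDER BY id LIMIT 2) AS dTable1
    LEFT JOIN table2 USING (id)
    LEFT JOIN table3 USING (id)
    ORDER BY dTable1.id
""").fetchall()

print(rows)  # [(1, None, None), (2, '0:0', None)]
```

The NULL padding is preserved because the outer joins still start from every row the derived table produced.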
bravo
Hi, thanks again.
I am not an expert, but I have tried a lot of techniques to optimize this query and got the same result; even the query above runs in the same time as the previous one.
One more thing I noticed: if I implement paging, the time grows as a multiple of the offset, e.g. the first 100 records take 10 seconds, the next 100 take 20 seconds, and so on. So joins are not going to help me.
With subqueries I don't get the NULL fields for non-matching records, as mentioned previously, and they are just as slow as the joins.

An optimized subquery for the query above might help, but I can't get it right.

I am confused about what to do now. There must be some way out of such situations in MySQL.
Dec 26 '07 #7
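The paging pattern described here (each later page slower than the last) is expected with LIMIT plus a growing offset: the server has to generate and discard all the skipped rows on every page. One standard alternative, not mentioned elsewhere in the thread, is keyset (seek) pagination, which remembers the last id seen and filters past it. A sketch in SQLite with toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, time TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO table1 VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO table2 VALUES (2,'0:0'),(3,'0:1');
    INSERT INTO table3 VALUES (3,'true');
""")

def fetch_page(cur, last_id, page_size):
    # Seek past the last id already shown instead of using OFFSET,
    # so each page costs about the same regardless of how deep it is.
    return cur.execute("""
        SELECT table1.id, table2.time, table3.Status
        FROM table1
        LEFT JOIN table2 USING (id)
        LEFT JOIN table3 USING (id)
        WHERE table1.id > ?
        ORDER BY table1.id
        LIMIT ?
    """, (last_id, page_size)).fetchall()

page1 = fetch_page(cur, 0, 2)             # [(1, None, None), (2, '0:0', None)]
page2 = fetch_page(cur, page1[-1][0], 2)  # [(3, '0:1', 'true')]
```

This relies on the id index for the range seek, so it should work on MySQL 4.1 as well, assuming id is indexed as stated below.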
pbmods
Heya, Bravo.

Is each table indexed on the id column?
Dec 27 '07 #8
bravo

Yes, every table is indexed on the id column; id is unique and the same in each table. Is that affecting performance?

I also have a doubt about indexing: if updates, insertions, and deletions happen frequently on a table, does indexing hurt performance, i.e. does it have a negative impact?

And a humble request: please reply soon, as I don't think anybody else in this forum is interested in this.
Dec 27 '07 #9
pbmods
Heya, Bravo.

Generally, indexing is a good thing. Especially for large record sets, the benefits of an indexed table far outweigh the overhead of maintaining those indexes.

At this point, I think your best bet is a summary table: create a table that contains the results of your query, then run a periodic script to update it.

For maximum efficiency, you might want to create some MySQL triggers to keep that summary table auto-updated.
Dec 27 '07 #10
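A sketch of that summary-table idea, with one trigger keeping one column fresh (SQLite trigger syntax here; note that MySQL only gained triggers in 5.0, so on 4.1 the periodic refresh script is the realistic option):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, time TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO table1 VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO table2 VALUES (2,'0:0'),(3,'0:1');
    INSERT INTO table3 VALUES (3,'true');

    -- Precompute the join once; readers hit this table instead.
    CREATE TABLE summary (id INTEGER PRIMARY KEY, time TEXT, Status TEXT);
    INSERT INTO summary
        SELECT t1.id,
               (SELECT time   FROM table2 WHERE table2.id = t1.id),
               (SELECT Status FROM table3 WHERE table3.id = t1.id)
        FROM table1 t1;

    -- Keep summary.time fresh when table2 gets a new row (one of the
    -- several triggers a full solution would need).
    CREATE TRIGGER table2_ins AFTER INSERT ON table2 BEGIN
        UPDATE summary SET time = NEW.time WHERE id = NEW.id;
    END;
""")

cur.execute("INSERT INTO table2 VALUES (1, '0:2')")
row = cur.execute("SELECT time FROM summary WHERE id = 1").fetchone()
print(row)  # ('0:2',)
```

A complete version would also need insert/update/delete triggers on table1 and table3; this only illustrates the mechanism.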
bravo
Thanks a lot for your help. I will try it and post an update if I succeed somehow; maybe someone else will benefit.
Dec 27 '07 #11
pbmods
Heya, Bravo.

Good luck, and if it gives you any trouble, post back anytime :)
Dec 30 '07 #12
