By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,834 Members | 2,265 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,834 IT Pros & Developers. It's quick & easy.

left outer join terrible slow compared to inner join

P: n/a
Hi,

i've a speed problem withe the following statement:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p
LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

This is terrible slow compared to the inner join:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p, ot_kat_prod AS pz
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37'
AND p.p_id = pz.p_id;

These are the EXPLAIN ANALYZE output of both statements on
postgres 7.2.4:

EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ot_adresse AS a, ot_produkt AS p LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
NOTICE: QUERY PLAN:

Unique (cost=22061.28..22061.30 rows=1 width=272) (actual time=13332.01..13332.97 rows=11 loops=1)
-> Sort (cost=22061.28..22061.28 rows=2 width=272) (actual time=13332.00..13332.03 rows=46 loops=1)
-> Nested Loop (cost=21627.92..22061.27 rows=2 width=272) (actual time=13303.51..13328.98 rows=46 loops=1)
-> Index Scan using o_adresse_id_uidx on o_adresse (cost=0.00..5.96 rows=1 width=34) (actual time=0.16..0.19 rows=1 loops=1)
-> Materialize (cost=21900.98..21900.98 rows=12347 width=238) (actual time=13071.53..13111.92 rows=51394 loops=1)
-> Merge Join (cost=21627.92..21900.98 rows=12347 width=238) (actual time=11724.45..12908.46 rows=51394 loops=1)
-> Sort (cost=16815.61..16815.61 rows=6640 width=68) (actual time=4283.02..4307.07 rows=26049 loops=1)
-> Seq Scan on o_produkt (cost=0.00..16394.06 rows=6640 width=68) (actual time=0.06..1126.96 rows=26049 loops=1)
-> Sort (cost=4812.31..4812.31 rows=40851 width=170) (actual time=7441.36..7481.73 rows=51521 loops=1)
-> Subquery Scan pz (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.14..1161.81 rows=40896 loops=1)
-> Seq Scan on o_kat_prod (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.13..419.07 rows=40896 loops=1)
Total runtime: 13377.02 msec

EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ot_adresse AS a, ot_produkt AS p, ot_kat_prod AS pz WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37' AND p.p_id = pz.p_id;
NOTICE: QUERY PLAN:

Unique (cost=41.29..41.31 rows=1 width=272) (actual time=6.67..7.64 rows=11 loops=1)
-> Sort (cost=41.29..41.29 rows=2 width=272) (actual time=6.67..6.71 rows=46 loops=1)
-> Nested Loop (cost=0.00..41.28 rows=2 width=272) (actual time=0.68..3.73 rows=46 loops=1)
-> Nested Loop (cost=0.00..23.80 rows=1 width=102) (actual time=0.46..0.87 rows=11 loops=1)
-> Index Scan using o_adresse_id_uidx on o_adresse (cost=0.00..5.96 rows=1 width=34) (actual time=0.16..0.17 rows=1 loops=1)
-> Index Scan using o_produkt_a_id_idx on o_produkt (cost=0.00..17.83 rows=1 width=68) (actual time=0.29..0.65 rows=11 loops=1)
-> Index Scan using o_kat_prod_p_id_idx on o_kat_prod (cost=0.00..17.42 rows=5 width=170) (actual time=0.16..0.24 rows=4 loops=11)
Total runtime: 7.96 msec

Do i've any chance to get the indexes used in the OUTER JOIN?

Thanks for any hints!
-tb
--
Thomas Beutin tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #1
Share this Question
Share on Google+
6 Replies


P: n/a
Thomas Beutin <ty****@laokoon.IN-Berlin.DE> writes:
i've a speed problem withe the following statement:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p
LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';


In 7.2 (and 7.3), this syntax forces the planner to join ot_produkt to
ot_kat_prod first, which is terribly inefficient because the WHERE
constraints don't constrain that join at all. You could work around
this by writing instead

FROM (ot_adresse AS a CROSS JOIN ot_produkt AS p)
LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

See http://www.postgresql.org/docs/7.3/s...cit-joins.html
for discussion. (Note: 7.4 will be less rigid about this issue.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #2

P: n/a
On Thu, Aug 28, 2003 at 11:42:00AM -0400, Tom Lane wrote:
Thomas Beutin <ty****@laokoon.IN-Berlin.DE> writes:
i've a speed problem withe the following statement:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p
LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';


In 7.2 (and 7.3), this syntax forces the planner to join ot_produkt to
ot_kat_prod first, which is terribly inefficient because the WHERE
constraints don't constrain that join at all. You could work around
this by writing instead

FROM (ot_adresse AS a CROSS JOIN ot_produkt AS p)
LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

Thanks for the suggestion, but the result is close to the original outer
join without the explicit cross join but far away from the speed of the
inner join.

This uses the index o_produkt_a_id_idx on o_produkt, but the index
o_kat_prod_p_id_idx on o_kat_prod is still not used:
EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt AS p) LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
NOTICE: QUERY PLAN:

Unique (cost=2217.96..2217.98 rows=1 width=272) (actual time=6776.21..6777.17 rows=11 loops=1)
-> Sort (cost=2217.96..2217.96 rows=2 width=272) (actual time=6776.20..6776.24 rows=46 loops=1)
-> Nested Loop (cost=0.00..2217.95 rows=2 width=272) (actual time=721.82..6773.09 rows=46 loops=1)
-> Nested Loop (cost=0.00..23.80 rows=1 width=102) (actual time=0.69..1.74 rows=11 loops=1)
-> Index Scan using o_adresse_id_uidx on o_adresse (cost=0.00..5.96 rows=1 width=34) (actual time=0.29..0.31 rows=1 loops=1)
-> Index Scan using o_produkt_a_id_idx on o_produkt (cost=0.00..17.83 rows=1 width=68) (actual time=0.38..1.31 rows=11 loops=1)
-> Subquery Scan pz (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917 loops=11)
-> Seq Scan on o_kat_prod (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77 rows=40917 loops=11)
Total runtime: 6777.55 msec

Is there any chance to use an index on the joined table o_kat_prod?
Thanks for any hints!
-tb
--
Thomas Beutin tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #3

P: n/a
Thomas Beutin <ty****@laokoon.IN-Berlin.DE> writes:
Thanks for the suggestion, but the result is close to the original outer
join without the explicit cross join but far away from the speed of the
inner join. EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt AS p) LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
NOTICE: QUERY PLAN: -> Subquery Scan pz (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917 loops=11)
-> Seq Scan on o_kat_prod (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77 rows=40917 loops=11)


Hmm, I don't understand why ot_kat_prod is being treated as a subquery
here. It isn't a view or something is it?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #4

P: n/a
On Thu, Aug 28, 2003 at 02:10:15PM -0400, Tom Lane wrote:
Thomas Beutin <ty****@laokoon.IN-Berlin.DE> writes:
Thanks for the suggestion, but the result is close to the original outer
join without the explicit cross join but far away from the speed of the
inner join.

EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt AS p) LEFT OUTER JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
NOTICE: QUERY PLAN:

-> Subquery Scan pz (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917 loops=11)
-> Seq Scan on o_kat_prod (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77 rows=40917 loops=11)


Hmm, I don't understand why ot_kat_prod is being treated as a subquery
here. It isn't a view or something is it?

Sorry, You're correct, actually this is a view of an other table called
o_kat_prod defined as
SELECT * FROM o_kat_prod;
but supposed to be in my next step something like
SELECT * FROM o_kat_prod WHERE <a_single_field> IS NOT NULL;

And doing the explicit cross join statement on o_kat_prod instead of
ot_kat_prod gives the expected performance to me ( 7.42 msec instead
of 7324.49 msec with EXPLAIN ANALYZE).

Do i've any chance to get the same performance on the view?

Thanks for any help!
-tb
--
Thomas Beutin tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #5

P: n/a
Thomas Beutin <ty****@laokoon.IN-Berlin.DE> writes:
Hmm, I don't understand why ot_kat_prod is being treated as a subquery
here. It isn't a view or something is it?
Sorry, You're correct, actually this is a view of an other table called
o_kat_prod defined as
SELECT * FROM o_kat_prod;
but supposed to be in my next step something like
SELECT * FROM o_kat_prod WHERE <a_single_field> IS NOT NULL;
Still curious, since I'd expect 7.2 to flatten a simple view into the
upper query.
Do i've any chance to get the same performance on the view?


Try 7.3, it may be better. Or consider 7.4 beta.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #6

P: n/a
> And doing the explicit cross join statement on o_kat_prod instead of
ot_kat_prod gives the expected performance to me ( 7.42 msec instead
of 7324.49 msec with EXPLAIN ANALYZE).

Do i've any chance to get the same performance on the view?


I've had this problem and it was due to improper typing - It was a join on
an int8 column, which has to be explicitly cast if it's a parameter to the
query.

Jon
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #7

This discussion thread is closed

Replies have been disabled for this discussion.