Slow queries in PL/PGSQL function

Jim Crate

I have a couple queries in a PL/PGSQL function which execute very slowly (around
one minute each) which execute in .5 second when not executed from within the
function. Is there any way to determine why this is happening? I couldn't
figure out how to run EXPLAIN ANALYZE from within the function. The queries
aren't very complicated.

Here is an example query and plan. About 240K rows in x_rbl_ips, 7.5M rows in
filter_ips.

explain analyze SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
AND filter_ips.i_filter_ip IS NOT NULL
AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer
QUERY PLAN
Unique (cost=5703.70..5703.81 rows=23 width=4) (actual time=524.439..524.439
rows=0 loops=1)
-> Sort (cost=5703.70..5703.75 rows=23 width=4) (actual
time=524.438..524.438 rows=0 loops=1)
Sort Key: x_rbl_ips.i_ip
-> Nested Loop (cost=0.00..5703.18 rows=23 width=4) (actual
time=524.028..524.028 rows=0 loops=1)
-> Seq Scan on x_rbl_ips (cost=0.00..5616.56 rows=23 width=4)
(actual time=412.738..415.842 rows=1738 loops=1)
Filter: (dts_last_modified > '2004-02-18
22:44:15.901689'::timestamp without time zone)
-> Index Scan using filter_ips_source_ip on filter_ips
(cost=0.00..3.75 rows=1 width=4) (actual time=0.061..0.061 rows=0 loops=1738)
Index Cond: ("outer".i_ip = filter_ips.i_filter_ip)
Filter: ((i_filter_ip IS NOT NULL) AND ((i_filter_type_flags
& 33554432) <> 33554432))
Total runtime: 524.868 ms

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 22 '05 #1

Subscribe Post Reply

5224

Richard Huxton

On Thursday 19 February 2004 23:00, Jim Crate wrote:

I have a couple queries in a PL/PGSQL function which execute very slowly
(around one minute each) which execute in .5 second when not executed from
within the function. Is there any way to determine why this is happening?
I couldn't figure out how to run EXPLAIN ANALYZE from within the function.
You can't - hmm, looking here: http://developer.postgresql.org/todo.php
I can't even see a TODO. I'll suggest it on the hackers list.
explain analyze SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
AND filter_ips.i_filter_ip IS NOT NULL
AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

I'm guessing that the values in your query are variables/parameters in the
plpgsql function? The problem is that the plan is compiled when the function
is first run, so it doesn't know what values you will use. You might tend to
use values that make sense to index, but it can't tell.

Try rephrasing this query as an EXECUTE ''query-string'' and see if that makes
the problem go away.
--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #2

Tom Lane

Richard Huxton <de*@archonet.com> writes:

On Thursday 19 February 2004 23:00, Jim Crate wrote:
explain analyze SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
AND filter_ips.i_filter_ip IS NOT NULL
AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer
I'm guessing that the values in your query are variables/parameters in the
plpgsql function? The problem is that the plan is compiled when the function
is first run, so it doesn't know what values you will use. You might tend to
use values that make sense to index, but it can't tell.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint. If it's always "dts_last_modified >
some-time-in-the-recent-past" then the range of values scanned is going
to be small enough to make an indexscan worthwhile. But if what the
planner sees is "dts_last_modified > variable" then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.

What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

WHERE x_rbl_ips.dts_last_modified > variable
AND x_rbl_ips.dts_last_modified <= now()
AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval. It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.

BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE. For
example

PREPARE myq(timestamptz) AS
...
WHERE x_rbl_ips.dts_last_modified > $1
...

EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

This allows you to exactly reproduce the conditions that the planner has
to work under when planning a query from a plpgsql function.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #3

Gary Doades

I have a very similar problem. I put the following SQL into a
function:

SELECT VS.*,VL.TEL1,SC.CONTRACT_ID,SC.CONTRACT_REF, SC.MAX_HOURS,
SC.MIN_HOURS,
(SELECT COUNT(*) FROM TIMESHEET_DETAIL
JOIN MAIN_ORDER ON (MAIN_ORDER.ORDER_ID = TIMESHEET_DETAIL.ORDER_ID
AND MAIN_ORDER.CLIENT_ID = $3)
WHERE TIMESHEET_DETAIL.CONTRACT_ID = SC.CONTRACT_ID) AS VISITS,
(SELECT (SUM(R.DURATION+1))/60.0 FROM ORDER_REQT R
JOIN STAFF_BOOKING B ON (B.REQT_ID = R.REQT_ID)
JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID)
WHERE B.CONTRACT_ID = SC.CONTRACT_ID
AND BP.BOOKING_DATE BETWEEN $1 AND $2) AS RHOURS
FROM VSTAFF VS
JOIN STAFF_CONTRACT SC ON (SC.STAFF_ID = VS.STAFF_ID)
JOIN VLOCATION VL ON (VL.LOCATION_ID = VS.LOCATION_ID)
JOIN SEARCH_REQT_RESULT SR ON (SR.STAFF_ID = VS.STAFF_ID)
WHERE SR.SEARCH_ID = $4
AND SC.CONTRACT_ID IN
(SELECT C.CONTRACT_ID FROM STAFF_PRODUCT P,STAFF_CONTRACT C
WHERE P.CONTRACT_ID=C.CONTRACT_ID AND C.STAFF_ID = VS.STAFF_ID AND
P.PRODUCT_ID IN (SELECT PRODUCT_ID FROM SEARCH_ORDER_REQT WHERE
SEARCH_ID = $4)
AND C.AVAIL_DATE_FROM <= $1 AND C.AVAIL_DATE_TO >= $2 GROUP BY
C.CONTRACT_ID
HAVING (COUNT(C.CONTRACT_ID) = (SELECT COUNT(DISTINCT PRODUCT_ID)
FROM SEARCH_ORDER_REQT WHERE SEARCH_ID = $4)));

When executed from the client and substituting constants for the
variables ($1,$2,$3) it takes 1 second. When executed from within a
function using exactly the same values it takes 30 seconds! Using your
method to get the execution plan with variables I can see it has
switched to a sequential scan on the BOOKING_PLAN table (250K records)
instead of the index it uses otherwise. I have spent a while
optimising this and tested with quite a range of date values. It
always uses the indexes when executed from the client. The date ranges
given are usually quite small.

I am trying to migrate from MS SQLSever to PG and this has now stopped
me for a while. I could transfer all the stored procedures into client
code, but that seems very inefficient.

Is there someway to force the use of an index. Or at least get the
backend to substitue the parameters in a function before doing the
first query plan so it has more typical values to work with?

Regards,
Gary.

On Fri, 20 Feb 2004 09:20:24 -0500, tg*@sss.pgh.pa.us (Tom Lane)
wrote:

Richard Huxton <de*@archonet.com> writes:
On Thursday 19 February 2004 23:00, Jim Crate wrote:
explain analyze SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
AND filter_ips.i_filter_ip IS NOT NULL
AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

I'm guessing that the values in your query are variables/parameters in the
plpgsql function? The problem is that the plan is compiled when the function
is first run, so it doesn't know what values you will use. You might tend to
use values that make sense to index, but it can't tell.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint. If it's always "dts_last_modified >
some-time-in-the-recent-past" then the range of values scanned is going
to be small enough to make an indexscan worthwhile. But if what the
planner sees is "dts_last_modified > variable" then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.

What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

WHERE x_rbl_ips.dts_last_modified > variable
AND x_rbl_ips.dts_last_modified <= now()
AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval. It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.

BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE. For
example

PREPARE myq(timestamptz) AS
...
WHERE x_rbl_ips.dts_last_modified > $1
...

EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

This allows you to exactly reproduce the conditions that the planner has
to work under when planning a query from a plpgsql function.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #4

Tom Lane

Gary Doades <gp*@cwcom.net> writes:

Is there someway to force the use of an index. Or at least get the
backend to substitue the parameters in a function before doing the
first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #5

Gary Doades

Thank for the reply.

I will move this onto the perfromance group, except that for some odd reason I cant see a
performance group on usenet. I will subscribe to the mailing list....

I will gladly upload the schema (script) or anything else that anyone wants, but it may not be
appropriate for a newsgroup or mailing list. If you want anything else please let me know where to
send it.

This is not the only query that is causing me a problem. I have about 30 stored procedures to move
from SQLServer. 15 of the 21 so far moved over to PG have similar problems. Very fast when submitted
as a complete SQL statement with constants. Very slow when used as functions.

Here is the explain for ths query using constants:

QUERY PLAN
Hash Join (cost=629.34..213908.73 rows=34 width=151) (actual time=110.000..1518.000 rows=98
loops=1)
Hash Cond: ("outer".staff_id = "inner".staff_id)
Join Filter: (subplan)
InitPlan
-> Seq Scan on wruserarea (cost=1.46..3.29 rows=1 width=4) (actual time=0.000..0.000 rows=1
loops=1)
Filter: ((uid = $4) AND (area_id = 1))
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000
rows=1 loops=1)
Filter: ((username)::name = "current_user"())
-> Seq Scan on staff_contract sc (cost=0.00..10.35 rows=335 width=34) (actual time=0.000..1.000
rows=335 loops=1)
-> Hash (cost=625.88..625.88 rows=66 width=125) (actual time=10.000..10.000 rows=0 loops=1)
-> Nested Loop (cost=56.55..625.88 rows=66 width=125) (actual time=6.000..10.000 rows=98
loops=1)
-> Merge Join (cost=56.55..73.06 rows=101 width=111) (actual time=6.000..7.000
rows=98 loops=1)
Merge Cond: ("outer".staff_id = "inner".staff_id)
-> Index Scan using staff_pkey on staff (cost=7.74..21.90 rows=332 width=107)
(actual time=4.000..4.000 rows=332 loops=1)
Filter: ((hashed subplan) OR $5)
SubPlan
-> Seq Scan on staff_area (cost=3.16..7.52 rows=88 width=4) (actual
time=0.000..0.000 rows=18 loops=1)
Filter: ((hashed subplan) OR (area_id = 1))
SubPlan
-> Seq Scan on wruserarea (cost=1.46..3.16 rows=3 width=4)
(actual time=0.000..0.000 rows=1 loops=1)
Filter: (uid = $1)
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4)
(actual time=0.000..0.000 rows=1 loops=1)
Filter: ((username)::name = "current_user"())
-> Sort (cost=48.81..49.06 rows=101 width=4) (actual time=2.000..2.000 rows=98
loops=1)
Sort Key: sr.staff_id
-> Seq Scan on search_reqt_result sr (cost=0.00..45.45 rows=101 width=4)
(actual time=0.000..2.000 rows=98 loops=1)
Filter: (search_id = 143)
-> Index Scan using location_pkey on "location" (cost=0.00..5.46 rows=1 width=18)
(actual time=0.000..0.000 rows=1 loops=98)
Index Cond: ("location".location_id = "outer".location_id)
Filter: ((area_id = 1) OR (subplan))
SubPlan
-> Seq Scan on wruserarea (cost=1.46..3.44 rows=2 width=4) (never executed)
Filter: ((uid = $6) AND ((area_id = 1) OR (area_id = $7)))
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4) (never
executed)
Filter: ((username)::name = "current_user"())
SubPlan
-> GroupAggregate (cost=3.10..639.23 rows=1 width=4) (actual time=1.765..1.765 rows=1
loops=98)
Filter: (count(contract_id) = $9)
InitPlan
-> Aggregate (cost=1.55..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1
loops=1)
-> Seq Scan on search_order_reqt (cost=0.00..1.55 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Filter: (search_id = 143)
-> Nested Loop IN Join (cost=1.55..637.67 rows=1 width=4) (actual time=1.439..1.765
rows=1 loops=98)
Join Filter: ("outer".product_id = "inner".product_id)
-> Nested Loop (cost=0.00..631.93 rows=186 width=8) (actual time=0.347..1.378
rows=245 loops=98)
-> Index Scan using staff_contract_pkey on staff_contract c
(cost=0.00..15.77 rows=1 width=4) (actual time=0.255..0.449 rows=1 loops=98)
Filter: ((staff_id = $8) AND (avail_date_from <= '2003-06-12'::date) AND
(avail_date_to >= '2003-06-18'::date))
-> Index Scan using staff_product_contract_id_key on staff_product p
(cost=0.00..613.80 rows=189 width=8) (actual time=0.061..0.571 rows=245 loops=98)
Index Cond: (p.contract_id = "outer".contract_id)
-> Materialize (cost=1.55..1.56 rows=1 width=4) (actual time=0.000..0.000 rows=1
loops=23972)
-> Seq Scan on search_order_reqt (cost=0.00..1.55 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Filter: (search_id = 143)
-> Aggregate (cost=2252.27..2252.27 rows=1 width=2) (actual time=12.673..12.673 rows=1
loops=98)
-> Nested Loop (cost=2049.93..2252.19 rows=28 width=2) (actual time=8.959..12.612
rows=13 loops=98)
-> Hash Join (cost=2049.93..2167.47 rows=28 width=4) (actual time=8.908..12.337
rows=13 loops=98)
Hash Cond: ("outer".booking_id = "inner".booking_id)
-> Index Scan using booking_plan_idx2 on booking_plan bp (cost=0.00..98.15
rows=3822 width=4) (actual time=0.755..6.520 rows=4693 loops=98)
Index Cond: ((booking_date >= '2003-06-12'::date) AND (booking_date <=
'2003-06-18'::date))
-> Hash (cost=2048.37..2048.37 rows=624 width=8) (actual time=2.694..2.694
rows=0 loops=98)
-> Index Scan using staff_book_idx5 on staff_booking b
(cost=0.00..2048.37 rows=624 width=8) (actual time=0.041..2.031 rows=358 loops=98)
Index Cond: (contract_id = $0)
-> Index Scan using order_reqt_pkey on order_reqt r (cost=0.00..3.01 rows=1
width=6) (actual time=0.014..0.014 rows=1 loops=1312)
Index Cond: ("outer".reqt_id = r.reqt_id)
-> Aggregate (cost=3387.38..3387.38 rows=1 width=0) (actual time=0.806..0.806 rows=1 loops=98)
-> Nested Loop (cost=0.00..3387.37 rows=2 width=0) (actual time=0.765..0.786 rows=3
loops=98)
-> Index Scan using main_order_idx on main_order (cost=0.00..5.86 rows=2 width=4)
(actual time=0.041..0.051 rows=1 loops=98)
Index Cond: (client_id = 1011001947)
-> Index Scan using timesheet_detail_idx on timesheet_detail (cost=0.00..1690.72
rows=3 width=4) (actual time=0.714..0.735 rows=3 loops=98)
Index Cond: ("outer".order_id = timesheet_detail.order_id)
Filter: (contract_id = $0)
Total runtime: 1530.000 ms

-------------------------------------------------------------------------------------------------------------------------------------------------

and here is the explain for the same query with parameters:
Hash Join (cost=651.46..450445.01 rows=35 width=151) (actual time=304.000..23814.000 rows=98
loops=1)
Hash Cond: ("outer".staff_id = "inner".staff_id)
Join Filter: (subplan)
InitPlan
-> Seq Scan on wruserarea (cost=1.46..3.29 rows=1 width=4) (actual time=0.000..0.000 rows=1
loops=1)
Filter: ((uid = $4) AND (area_id = 1))
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000
rows=1 loops=1)
Filter: ((username)::name = "current_user"())
-> Seq Scan on staff_contract sc (cost=0.00..10.35 rows=335 width=34) (actual time=0.000..1.000
rows=335 loops=1)
-> Hash (cost=648.00..648.00 rows=68 width=125) (actual time=8.000..8.000 rows=0 loops=1)
-> Nested Loop (cost=56.71..648.00 rows=68 width=125) (actual time=4.000..8.000 rows=98
loops=1)
-> Merge Join (cost=56.71..73.28 rows=105 width=111) (actual time=3.000..6.000
rows=98 loops=1)
Merge Cond: ("outer".staff_id = "inner".staff_id)
-> Index Scan using staff_pkey on staff (cost=7.74..21.90 rows=332 width=107)
(actual time=1.000..2.000 rows=332 loops=1)
Filter: ((hashed subplan) OR $5)
SubPlan
-> Seq Scan on staff_area (cost=3.16..7.52 rows=88 width=4) (actual
time=0.000..0.000 rows=18 loops=1)
Filter: ((hashed subplan) OR (area_id = 1))
SubPlan
-> Seq Scan on wruserarea (cost=1.46..3.16 rows=3 width=4)
(actual time=0.000..0.000 rows=1 loops=1)
Filter: (uid = $1)
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4)
(actual time=0.000..0.000 rows=1 loops=1)
Filter: ((username)::name = "current_user"())
-> Sort (cost=48.97..49.24 rows=105 width=4) (actual time=2.000..2.000 rows=98
loops=1)
Sort Key: sr.staff_id
-> Seq Scan on search_reqt_result sr (cost=0.00..45.45 rows=105 width=4)
(actual time=0.000..2.000 rows=98 loops=1)
Filter: (search_id = $4)
-> Index Scan using location_pkey on "location" (cost=0.00..5.46 rows=1 width=18)
(actual time=0.010..0.010 rows=1 loops=98)
Index Cond: ("location".location_id = "outer".location_id)
Filter: ((area_id = 1) OR (subplan))
SubPlan
-> Seq Scan on wruserarea (cost=1.46..3.44 rows=2 width=4) (never executed)
Filter: ((uid = $6) AND ((area_id = 1) OR (area_id = $7)))
InitPlan
-> Seq Scan on wruser (cost=0.00..1.46 rows=1 width=4) (never
executed)
Filter: ((username)::name = "current_user"())
SubPlan
-> GroupAggregate (cost=3.10..652.99 rows=1 width=4) (actual time=1.847..1.847 rows=1
loops=98)
Filter: (count(contract_id) = $9)
InitPlan
-> Aggregate (cost=1.55..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1
loops=1)
-> Seq Scan on search_order_reqt (cost=0.00..1.55 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Filter: (search_id = $4)
-> Nested Loop IN Join (cost=1.55..651.43 rows=1 width=4) (actual time=1.449..1.837
rows=1 loops=98)
Join Filter: ("outer".product_id = "inner".product_id)
-> Nested Loop (cost=0.00..645.69 rows=186 width=8) (actual time=0.429..1.306
rows=245 loops=98)
-> Index Scan using staff_contract_pkey on staff_contract c
(cost=0.00..17.45 rows=1 width=4) (actual time=0.276..0.398 rows=1 loops=98)
Filter: ((staff_id = $8) AND ((avail_date_from)::timestamp without time
zone <= $1) AND ((avail_date_to)::timestamp without time zone >= $1))
-> Index Scan using staff_product_contract_id_key on staff_product p
(cost=0.00..625.85 rows=191 width=8) (actual time=0.133..0.582 rows=245 loops=98)
Index Cond: (p.contract_id = "outer".contract_id)
-> Materialize (cost=1.55..1.56 rows=1 width=4) (actual time=0.000..0.001 rows=1
loops=23972)
-> Seq Scan on search_order_reqt (cost=0.00..1.55 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
Filter: (search_id = $4)
-> Aggregate (cost=8760.23..8760.24 rows=1 width=2) (actual time=236.245..236.245 rows=1
loops=98)
-> Nested Loop (cost=2009.53..8760.21 rows=9 width=2) (actual time=226.378..236.194
rows=14 loops=98)
-> Hash Join (cost=2009.53..8732.97 rows=9 width=4) (actual time=226.286..235.755
rows=14 loops=98)
Hash Cond: ("outer".booking_id = "inner".booking_id)
-> Seq Scan on booking_plan bp (cost=0.00..6716.98 rows=1274 width=4)
(actual time=209.684..229.684 rows=4704 loops=98)
Filter: (((booking_date)::timestamp without time zone >= $1) AND
((booking_date)::timestamp without time zone <= $2))
-> Hash (cost=2008.02..2008.02 rows=606 width=8) (actual time=3.357..3.357
rows=0 loops=98)
-> Index Scan using staff_book_idx5 on staff_booking b
(cost=0.00..2008.02 rows=606 width=8) (actual time=0.163..3.061 rows=358 loops=98)
Index Cond: (contract_id = $0)
-> Index Scan using order_reqt_pkey on order_reqt r (cost=0.00..3.01 rows=1
width=6) (actual time=0.024..0.026 rows=1 loops=1326)
Index Cond: ("outer".reqt_id = r.reqt_id)
-> Aggregate (cost=3443.91..3443.91 rows=1 width=0) (actual time=4.745..4.745 rows=1 loops=98)
-> Nested Loop (cost=0.00..3443.90 rows=2 width=0) (actual time=4.531..4.724 rows=3
loops=98)
-> Index Scan using main_order_idx on main_order (cost=0.00..5.87 rows=2 width=4)
(actual time=0.143..0.153 rows=1 loops=98)
Index Cond: (client_id = $3)
-> Index Scan using timesheet_detail_idx on timesheet_detail (cost=0.00..1718.97
rows=3 width=4) (actual time=4.378..4.571 rows=3 loops=98)
Index Cond: ("outer".order_id = timesheet_detail.order_id)
Filter: (contract_id = $0)
Total runtime: 23853.000 ms

The only real difference I can see is the booking_plan table using a sequential scan.

Any help appreciated

Thanks,
Gary.
On Sat, 21 Feb 2004 11:15:50 -0500, tg*@sss.pgh.pa.us (Tom Lane) wrote:

Gary Doades <gp*@cwcom.net> writes:
Is there someway to force the use of an index. Or at least get the
backend to substitue the parameters in a function before doing the
first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #6

Gary Doades

It's OK, I'm an idiot.

I read soooo many times that you need to get the data types in the query the same as the column
types or indexes won't work. So I go and get it wrong!

I had defined the parameter to the function as timestamp (equivalent to SQLServer datetime), but the
column data type is date. SQLServer does not have a date type so the datetime parameters work, but
straight conversion causes PG not to use indexes.

I guess that many might be put off PG because of this without bothering to find out why. I think I
read somewhere that there are plans to improve PGs behaviour in this area, is this so?
Cheers,
Gary.
On Sat, 21 Feb 2004 11:15:50 -0500, tg*@sss.pgh.pa.us (Tom Lane) wrote:

Gary Doades <gp*@cwcom.net> writes:
Is there someway to force the use of an index. Or at least get the
backend to substitue the parameters in a function before doing the
first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #7

Jim Crate

Thanks for the help with this. I followed Richard's advice and changed the
function to EXECUTE the query so it would be planned with the actual values at
the time the query would be run, and it worked as expected.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint. If it's always "dts_last_modified >
some-time-in-the-recent-past" then the range of values scanned is going
to be small enough to make an indexscan worthwhile. But if what the
planner sees is "dts_last_modified > variable" then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.
Actually, dts_last_modified was not indexed. A sequential scan on
dts_last_modified in x_rbl_ips was very fast even with 250K records, around .5
second for the entire query. The only reason I could think that the query would
take up to several minutes when run from the function is that it was joining the
entire 250K rows of x_rbl_ips with the 7M rows of filter_ips before filtering on
x_rbl_ips.dts_last_modified.
What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

WHERE x_rbl_ips.dts_last_modified > variable
AND x_rbl_ips.dts_last_modified <= now()
AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval. It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.
Running the query with FOR loop_rec IN EXECUTE turned out to be so easy and fast
that I didn't try this. However, as I mentioned, x_rbl_ips.dts_last_modified
wasn't indexed in the first place so I don't know if it would have helped.
BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE. For

This will be very useful for future testing. I spend quite a bit of time
looking through the docs, and hadn't found this yet.

Thanks for the help.

--
Jim Crate
Deep Sky Technologies, Inc.

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #8

Slow queries in PL/PGSQL function

Similar topics