At 10:11 PM +0100 11/6/04, Pierre-Frédéric Caillaud wrote:
explain select notificationID from NOTIFICATION n, ITEM i where
n.itemID = i.itemID;
QUERY PLAN
------------------------------------------------------------------------
------
Hash Join (cost=47162.85..76291.32 rows=223672 width=44)
Hash Cond: ("outer".itemid = "inner".itemid)
-> Seq Scan on notification n (cost=0.00..12023.71 rows=223671
width=48)
-> Hash (cost=42415.28..42415.28 rows=741028 width=4)
-> Seq Scan on item i (cost=0.00..42415.28 rows=741028
width=4)
This query takes about 20 seconds to run.
Well, you're joining the entire two
tables, so yes, the seq scan might be faster.
Try your query with enable_seqscan=0 so
it'll use an index scan and compare the times.
You may be surprised to find that the
planner has indeed made the right choice.
This query selects 223672 rows, are you surprised it's slow ?
I'm not a SQL guru by any stretch but would a
constrained sub-select be appropriate here?
e.g. a simple test setup where each record in
table test1 has a FK referenced to an entry in
test:
joels=# \d test
Table "public.test"
Column | Type | Modifiers
--------+--------------+-----------
id | integer | not null
foo | character(3) |
Indexes:
"test_pkey" primary key, btree (id)
joels=# \d test1
Table "public.test1"
Column | Type | Modifiers
---------+---------+-----------
id | integer | not null
test_id | integer |
Indexes:
"test1_pkey" primary key, btree (id)
"test1_test_id_idx" btree (test_id)
Foreign-key constraints:
"$1" FOREIGN KEY (test_id) REFERENCES test(id) ON DELETE CASCADE
joels=# select count(*) from test;
count
-------
10001
(1 row)
joels=# select count(*) from test1;
count
-------
10001
(1 row)
joels=# explain select test_id from test1 t1, test t where t1.test_id =t.id;
QUERY PLAN
------------------------------------------------------------------------
Hash Join (cost=170.01..495.05 rows=10002 width=4)
Hash Cond: ("outer".test_id = "inner".id)
-> Seq Scan on test1 t1 (cost=0.00..150.01 rows=10001 width=4)
-> Hash (cost=145.01..145.01 rows=10001 width=4)
-> Seq Scan on test t (cost=0.00..145.01 rows=10001 width=4)
(5 rows)
joels=# explain select test_id from test1 t1
where test_id in (select id from test where id =
t1.test_id);
QUERY PLAN
------------------------------------------------------------------------------
Seq Scan on test1 t1 (cost=0.00..15269.02 rows=5001 width=4)
Filter: (subplan)
SubPlan
-> Index Scan using test_pkey on test (cost=0.00..3.01 rows=2 width=4)
Index Cond: (id = $0)
(5 rows)
So with the subselect the query planner would use
the primary key index on test when finding
referencing records in the test1 table.
Pierre, I seen the advice to use an additional
where condition in certain cases to induce an
index scan; how is this done?
my 1.2 pennies,
-Joel
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings