Very slow query

Rory Campbell-Lange

The following query on some small datasets takes over a second to run.
I'd be grateful for some help in understanding the explain output, and
to remake the code.

Looks like the sort is using up quite a bit of resources. I made an
index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
was not recorded in explain analyze. (see "testindex" below).

I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
in a psql function (I've changed variables to hard coded integers
below). The db is vacuumed every night.

Thanks for any help;
Rory
SELECT DISTINCT
b.n_id as id,
b.n_type,
CASE b.n_type WHEN 0 THEN 'personal'
WHEN 1 THEN 'private'
WHEN 2 THEN 'blog'
ELSE 'public'
END as type,
b.t_name as title
FROM
boards b, people p, objects o
WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3
OR
b.n_creator = 71
OR
( b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0
)
)
ORDER BY
b.n_type, b.n_id ASC, b.t_name;

trial=> \d boards
Table "public.boards"
Column | Type | Modifiers
---------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.boards_n_id_seq'::text)
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_mobile | character varying(15) |
t_email | character varying(50) |
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | character varying(500) |
n_id_photo | integer |
n_bg_colour | integer | default 0
Indexes:
"boards_pkey" primary key, btree (n_id)
"boards_t_email_key" unique, btree (t_email)
"boards_t_mobile_key" unique, btree (t_mobile)
"testindex" btree (n_type, n_id, t_name)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d people
Table "public.people"
Column | Type | Modifiers
-------------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.people_n_id_seq'::text)
n_object_id | integer |
n_objects_counter | integer | default 0
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_nickname | character varying(20) | not null
t_firstname | character varying(20) | not null
t_surname | character varying(25) | not null
t_mobile | character varying(15) |
t_email | character varying(50) |
t_password | character varying(15) | not null
b_registered | boolean | default false
n_email_status | smallint | default 0
n_txt_status | smallint | default 0
b_work_hours | boolean | default false
b_confirmations | boolean | default true
Indexes:
"people_pkey" primary key, btree (n_id)
"people_t_email_key" unique, btree (t_email)
"people_t_mobile_key" unique, btree (t_mobile)
"people_t_nickname_key" unique, btree (t_nickname)
Foreign-key constraints:
"object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d objects
Table "public.objects"
Column | Type | Modifiers
---------------+-----------------------------+-----------------------------------------------------------
n_id | integer | not null default nextval('public.objects_n_id_seq'::text)
t_text_id | character varying(25) | not null
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | text |
t_postcode | character varying(10) |
n_id_photo | integer |
n_board_id | integer | not null
n_joined | smallint |
b_trigger | boolean | default true
Indexes:
"objects_pkey" primary key, btree (n_id)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
"$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE

Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
-> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN 'private'::te
xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
-> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1)
-> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
-> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67)
-> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1)
Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) A
ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR ("inner".n_type = 3
) OR ("inner".n_creator = 71)))
-> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372 rows=787 loops=
1)
-> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102 loops=787)
-> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual time=0.017..0.352 rows=102 lo
ops=1)
Filter: (b_hidden = false)
Total runtime: 13051.065 ms

--
Rory Campbell-Lange
<ro**@campbell-lange.net>
<www.campbell-lange.net>

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #1

Subscribe Post Reply

1674

Rory Campbell-Lange

Sorry for replying to my own post, but I'm anxious for an answer. Should
I provide other information?

Thanks
Rory

On 10/05/04, Rory Campbell-Lange (ro**@campbell-lange.net) wrote:

The following query on some small datasets takes over a second to run.
I'd be grateful for some help in understanding the explain output, and
to remake the code.

Looks like the sort is using up quite a bit of resources. I made an
index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
was not recorded in explain analyze. (see "testindex" below).

I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
in a psql function (I've changed variables to hard coded integers
below). The db is vacuumed every night.

Thanks for any help;
Rory
SELECT DISTINCT
b.n_id as id,
b.n_type,
CASE b.n_type WHEN 0 THEN 'personal'
WHEN 1 THEN 'private'
WHEN 2 THEN 'blog'
ELSE 'public'
END as type,
b.t_name as title
FROM
boards b, people p, objects o
WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3
OR
b.n_creator = 71
OR
( b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0
)
)
ORDER BY
b.n_type, b.n_id ASC, b.t_name;

trial=> \d boards
Table "public.boards"
Column | Type | Modifiers
---------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.boards_n_id_seq'::text)
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_mobile | character varying(15) |
t_email | character varying(50) |
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | character varying(500) |
n_id_photo | integer |
n_bg_colour | integer | default 0
Indexes:
"boards_pkey" primary key, btree (n_id)
"boards_t_email_key" unique, btree (t_email)
"boards_t_mobile_key" unique, btree (t_mobile)
"testindex" btree (n_type, n_id, t_name)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d people
Table "public.people"
Column | Type | Modifiers
-------------------+-----------------------------+----------------------------------------------------------
n_id | integer | not null default nextval('public.people_n_id_seq'::text)
n_object_id | integer |
n_objects_counter | integer | default 0
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
t_nickname | character varying(20) | not null
t_firstname | character varying(20) | not null
t_surname | character varying(25) | not null
t_mobile | character varying(15) |
t_email | character varying(50) |
t_password | character varying(15) | not null
b_registered | boolean | default false
n_email_status | smallint | default 0
n_txt_status | smallint | default 0
b_work_hours | boolean | default false
b_confirmations | boolean | default true
Indexes:
"people_pkey" primary key, btree (n_id)
"people_t_email_key" unique, btree (t_email)
"people_t_mobile_key" unique, btree (t_mobile)
"people_t_nickname_key" unique, btree (t_nickname)
Foreign-key constraints:
"object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d objects
Table "public.objects"
Column | Type | Modifiers
---------------+-----------------------------+-----------------------------------------------------------
n_id | integer | not null default nextval('public.objects_n_id_seq'::text)
t_text_id | character varying(25) | not null
b_hidden | boolean | default false
dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
n_creator | integer | not null
n_type | smallint | not null default 0
t_name | character varying(100) | not null
t_description | text |
t_postcode | character varying(10) |
n_id_photo | integer |
n_board_id | integer | not null
n_joined | smallint |
b_trigger | boolean | default true
Indexes:
"objects_pkey" primary key, btree (n_id)
Foreign-key constraints:
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
"$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE

Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
-> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN 'private'::te
xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
-> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1)
-> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
-> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67)
-> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1)
Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) A
ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR ("inner".n_type = 3
) OR ("inner".n_creator = 71)))
-> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372 rows=787 loops=
1)
-> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102 loops=787)
-> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual time=0.017..0.352 rows=102 lo
ops=1)
Filter: (b_hidden = false)
Total runtime: 13051.065 ms

--
Rory Campbell-Lange
<ro**@campbell-lange.net>
<www.campbell-lange.net>

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

--
Rory Campbell-Lange
<ro**@campbell-lange.net>
<www.campbell-lange.net>

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #2

Nick Barr

[snip]
Rory Campbell-Lange wrote:

SELECT DISTINCT
b.n_id as id,
b.n_type,
CASE b.n_type WHEN 0 THEN 'personal'
WHEN 1 THEN 'private'
WHEN 2 THEN 'blog'
ELSE 'public'
END as type,
b.t_name as title
FROM
boards b, people p, objects o
WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3
OR
b.n_creator = 71
OR
( b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0
)
)
ORDER BY
b.n_type, b.n_id ASC, b.t_name;

You don't seem to be joining the people table to the boards or objects
table. In fact the people table is never referenced anywhere but the
FROM clause. This might be why it is not running so well. People seems
to be referenced by both boards and objects, so does there need to be a
join between all 3 tables?

[snip]
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
[snip]
"$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
It is this "missing" join which seems to cause 158254 rows to appear in
this nested loop, when PG only reckons it is going to see 2738 rows
-> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1)
I cant remember what the correct term is, but basically all rows from
the seq scan combines with all rows from the materialize, which is what
is causing the 150,000 odd rows to appear (is that called a cross
product anyone?)

i.e

67 rows from the seq x 2362 rows from the materialize = 158254 output
rows for the nested loop to chug through.
-> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
-> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67)
I reckon fix the unconstrained join and your query might run a little
faster.
-> Nested Loop (cost=3.27..3442.79 rows=41

width=18) (actual time=0.216..273.709 rows=2362 loops=1)

The estimate for this nested loop seems a little off, note the
guesstimate in the first set of brackets of 41, and compare with the
actual result in the second set of 2362. Have you vacuum analyzed recently?

HTH
Nick
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #3

Paul Thomas

On 10/05/2004 22:30 Rory Campbell-Lange wrote:

Sorry for replying to my own post, but I'm anxious for an answer. Should
I provide other information?

Look carefully at your column types. I can see several smallint columns in
there WHERE clause which are not expicitely typed as such.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #4

Nick Barr

Rory Campbell-Lange wrote:

Wow, this is an education! My php profiler now shows:

0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867024
0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867024
0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908328
1.24 pg_result_status C: /var/www/trial/php/db.php:240 M: 908392
1.24 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908416
1.24 db->result_checker C: /var/www/trial/php/db.php:243 M: 867472
1.24 is_resource C: /var/www/trial/php/db.php:1317 M: 867472
1.24 pg_numrows C: /var/www/trial/php/db.php:248 M: 867568
1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867600
1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867760

and now shows:

C: /var/www/trial/php/core.php:151 M: 867040
0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867040
0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867040
0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908344
0.14 pg_result_status C: /var/www/trial/php/db.php:240 M: 908408
0.14 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908432
0.14 db->result_checker C: /var/www/trial/php/db.php:243 M: 867488
0.14 is_resource C: /var/www/trial/php/db.php:1317 M: 867488
0.14 pg_numrows C: /var/www/trial/php/db.php:248 M: 867584
0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867616
0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867776

This is a saving of 1.1 seconds!

Thats a bit better ;-)

On 10/05/2004 22:30 Rory Campbell-Lange wrote:

Sorry for replying to my own post, but I'm anxious for an answer. Should
I provide other information?

Look carefully at your column types. I can see several smallint columns in
there WHERE clause which are not expicitely typed as such.

I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is
smallint not implied?

Thanks
Rory

Not quite. Explicit casts are needed when you have any numbers in the
WHERE condition and the columns are not of type integer/int4. For
example I have tweaked your query.

WHERE
b.b_hidden = 'f'
AND
(
b.n_type = 3::smallint
OR
b.n_creator = 71
OR
(
b.n_id = o.n_board_id
AND
o.n_creator = 71
AND
o.n_joined > 0::smallint
)
)

Note that b.n_creator and o.n_creator do not need explicit casts because
they are both of type integer anyway. You could of course put them in
for clarity. PG only casts the numbers to integer's, and not to smallint
or bigint, which basically means it does not use any indexes on that
column. This is fixed in 7.4 I believe, which you seem to be running
anyway so you might not be affected.

HTH

Nick
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #5

Rory Campbell-Lange

On 11/05/04, Nick Barr (ni***@chuckie.co.uk) wrote:

Rory Campbell-Lange wrote:
Look carefully at your column types. I can see several smallint
columns in there WHERE clause which are not expicitely typed as
such.
I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is
smallint not implied?

Not quite. Explicit casts are needed when you have any numbers in the
WHERE condition and the columns are not of type integer/int4. For
example I have tweaked your query. .... Note that b.n_creator and o.n_creator do not need explicit casts because
they are both of type integer anyway. You could of course put them in
for clarity. PG only casts the numbers to integer's, and not to smallint
or bigint, which basically means it does not use any indexes on that
column. This is fixed in 7.4 I believe, which you seem to be running
anyway so you might not be affected.

Mmm. Seems like I should try profiling with and without the cast. Maybe
there is little value in defining a column as a smallint, other than
checking the length of input.

Thanks again for your help,
Rory
--
Rory Campbell-Lange
<ro**@campbell-lange.net>
<www.campbell-lange.net>

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #6

Tom Lane

Rory Campbell-Lange <ro**@campbell-lange.net> writes:

The following query on some small datasets takes over a second to run.
I'd be grateful for some help in understanding the explain output, and
to remake the code.

I think the problem is you have an unconstrained join against "people
p", which is wasting a bunch of time generating duplicate records
(and then sorting them, and then discarding them in the DISTINCT step...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #7

Similar topics