469,271 Members | 1,718 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,271 developers. It's quick & easy.

explicit casting required for index use

Here's the basic issue: PostgreSQL doesn't use indexes unless a query
criterion is of exactly the same type as the index type. This occurs
even when a cast would enable the use of an index and greatly improve
performance. I understand that casting is needed to use an index and
will therefore affect performance -- the part I don't understand is why
postgresql doesn't automatically cast query arguments to the column
type, thereby enabling indexes on that column.
I have a table that looks like this (extra cols, indexes, and fk
constraints removed):

unison@csb=# \d paprospect2
Table "unison.paprospect2"
Column | Type | Modifiers
-------------+---------+-------------------------------------------------------------------
pseq_id | integer | not null
run_id | integer | not null
pmodel_id | integer | not null
svm | real |
Indexes: paprospect2_search1 btree (pmodel_id, run_id, svm),

I often search for pseq_ids based on all of pmodel_id, run_id, and svm
threshold as below, hence the multi-column index.

Without an explicit cast of the svm criterion:

unison@csb=# explain select pseq_id from paprospect2 where pmodel_id=8210 and run_id=1 and svm>=11;
Index Scan using paprospect2_search2 on paprospect2 (cost=0.00..43268.93 rows=2 width=4)
Index Cond: ((pmodel_id = 8210) AND (run_id = 1))
Filter: (svm >= 11::double precision)

And with an explicit cast to real (the same as the column type and
indexed type):

unison@csb=# explain select pseq_id from paprospect2 where pmodel_id=8210 and run_id=1 and svm>=11::real;
Index Scan using paprospect2_search1 on paprospect2 (cost=0.00..6.34 rows=2 width=4)
Index Cond: ((pmodel_id = 8210) AND (run_id = 1) AND (svm >= 11::real))
Note two things above: 1) The explicit cast greatly reduces the
predicted (and actual) cost. 2) The uncasted query eventually casts svm
to double precision, which seems odd since the column itself is real
(that is, it eventually does cast, but to the "wrong" type).

For small queries (returning ~10 rows), this is worth 100x in speed (9ms
v. 990ms... in absolute terms, no big deal). For larger result sets
(~200 rows), I've seen more like 1000x speed increases by using an
explicit cast. For the larger queries, this can mean seconds versus many
minutes.

Having to explicitly cast criterion is very non-intuitive. Moreover, it
seems quite straightforward that PostgreSQL might incorporate casts (and
perhaps even function calls like upper() for functional indexes) into
its query strategy optimization. (I suppose functional indexes would
apply only to immutable fx only, but that's fine.)

Thanks,
Reece

--
Reece Hart, Ph.D. rk*@gene.com, http://www.gene.com/
Genentech, Inc. 650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93 http://www.in-machina.com/~reece/
South San Francisco, CA 94080-4990 re***@in-machina.com, GPG: 0x25EC91A0

Nov 12 '05 #1
0 1571

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

231 posts views Thread by Brian Blais | last post: by
11 posts views Thread by Steven Bartley | last post: by
2 posts views Thread by rajivpopat | last post: by
2 posts views Thread by John Richardson | last post: by
8 posts views Thread by mfc | last post: by
4 posts views Thread by Xavier Roche | last post: by
17 posts views Thread by sophia.agnes | last post: by
9 posts views Thread by Taras_96 | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.