
Performance with different index types

Hello,

Are there any experiences regarding the performance of indexes
with different data types?

How does the performance of an index built on

- an integer field
- a varchar() field
- a text field

differ?

Is it a waste of memory/performance to make a text field
primary key?

Thanks,

Hans
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #1
1 Reply

On Mon, 18 Aug 2003, Johann Uhrmann wrote:
> Hello,
>
> Are there any experiences regarding the performance of indexes
> with different data types?
>
> How does the performance of an index built on
>
> - an integer field
> - a varchar() field
> - a text field
>
> differ?
It's not so much about the field type as what you are doing with it.
Btree indexes are the default, and often are your best choice. Rtree
indexes are often used for spatial comparisons (i.e. "is this point inside
this polygon" stuff). Hash indexes should be a win for certain types of
equality lookups, but their implementation is pretty slow in postgresql,
so you're often still better off with a btree index.

GiST seems like it may replace Rtree indexes at some future date, but
they're still kind of in development.

If you need multi-column indexes, you have to use either btree or gist.

The standard rule of thumb is, when in doubt, use btree. :-)
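To make the choice of index method concrete, here's a minimal sketch
(table, column, and index names are invented for illustration):

```sql
-- btree is the default access method, so USING can be omitted:
create index abc_id_idx on tableabc (id_field);

-- other methods are chosen with USING; hash only helps with equality:
create index abc_name_hash on tableabc using hash (name_field);

-- btree (and gist) also support multi-column indexes:
create index abc_multi_idx on tableabc (id_field, name_field);
```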

Also, look into partial / functional indexes. For instance, if you have a
bool column in a table with 3 million rows, and fewer than 100 of those
rows have the bool set to true while the others are all false, it can make
sense to create a partial index on that field for when you want one of
those ~100 rows with the field set to true:

create index abc123 on tableabc (bool_field) where bool_field IS TRUE;
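A functional index works similarly, but on an expression rather than a
bare column; a sketch (again with invented names):

```sql
-- index the result of lower() so case-insensitive lookups can use it:
create index abc_lower_idx on tableabc (lower(name_field));

-- this WHERE clause matches the indexed expression, so the planner
-- can consider an index scan instead of a seq scan:
select * from tableabc where lower(name_field) = 'hans';
```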

The other issue folks run into when they start using postgresql is that it
sometimes tends to seq scan when you think it should be using the index.
It may well be that a seq scan is the better choice, but often it's not, and
the query planner just doesn't have enough information to know that.

So, you need to vacuum and analyze, and possibly edit your postgresql.conf
file's random_page_cost, effective_cache_size, and a few other settings to
give the planner a kick in the right direction.
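For example, a postgresql.conf fragment along these lines (the values are
purely illustrative, not recommendations; note that on versions of this
era effective_cache_size is measured in 8k disk pages, not bytes):

```
random_page_cost = 2          # default is 4; lower it if data is mostly cached
effective_cache_size = 65536  # roughly 512MB of OS cache, in 8k pages
```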

The final issue is the one of type mismatch. If you've got a field of
type int8, and you do this:

select * from table where int8field=123;

the planner may not use your index on int8field, since the literal 123 is
typed as int4. You need to cast the 123 to int8 in one of a few ways:

select * from table where int8field=cast (123 as int8); <- SQL spec way
select * from table where int8field=123::int8;
select * from table where int8field='123';
> Is it a waste of memory/performance to make a text field
> primary key?

Well, that depends. If the text field is the NATURAL key, and you'll
likely want to refer to it from other tables, then it's often a good
choice, semantically at least, to use it as a pk.

Sometimes, though, you need better performance, and then you can use an
artificial pk, like a serial column, and create a unique index on the
"natural" key column (i.e. the text field) to make sure it stays unique,
but use the serial column for all table joins and such.
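That pattern can be sketched like this (table and column names are
invented for illustration):

```sql
create table authors (
    id serial primary key,       -- small artificial key, used for all joins
    name text not null unique    -- "natural" key, kept unique by an index
);

create table books (
    id serial primary key,
    author_id int references authors (id),  -- join on the int, not the text
    title text
);
```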


Nov 11 '05 #2
