473,386 Members | 1,795 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Performance with different index types

Hello,

are there any experiences about the performance of indices
with different data types.

How do the performance of an index that consists of

- an integer field
- a varchar() field
- a text field

differ?

Is it a waste of memory/performance to make a text field
primary key?

Thanks,

Hans
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #1
1 3398
On Mon, 18 Aug 2003, Johann Uhrmann wrote:
Hello,

are there any experiences about the performance of indices
with different data types.

How do the performance of an index that consists of

- an integer field
- a varchar() field
- a text field

differ?
It's not so much about the field type as what you are doing with it.
Btree indexes are the default, and often are your best choice. Rtree
indexes are often used for spatial comparisons (i.e. is this point inside
this polygon stuff). Hash indexes should be a win for certain types of
problems, but their implementation is pretty slow in postgresql, so you're
often still better off with an Rtree index.

GiST seems like it may replace Rtree indexes at some future date, but
they're still kind of in development.

If you need multi-column indexes, you have to use either btree or gist.

The standard rule of thumb is, when in doubt, use btree. :-)

also, look into partial / functional indexes. For instance, if you have a
column that's a bool with 3 million rows, and <100 of those rows have the
bool set to true, while the others are all false, it might make sense to
create a partial index on that field for when you want one of those 100
rows with that field set to true:

create index abc123 on tableabc (bool_field) where bool_field IS TRUE.

the other issue folks have when they start using postgresql is that it
sometimes tends to seq scan when you think it should be using the index.
It may well be that a seq scan is a better choice, but often it's not, and
the query planny just doesn't have enough information to know that.

so, you need to vacuum, analyze, and possibly edit your postgresql.conf
file's random_page_cost, effective_cache_size, and a few other fields to
give the planner a kick in the right direction.

the final issue is the one of type mismatch. If you've got a field with
an int8, and you do this:

select * from table where int8field=123;

the planner may not use your index on int8field, since 123 gets coerced to
int4. You need to cast the 123 to int in one of a few ways:

select * from table where int8field=cast (123 as int8); <- SQL spec way
select * from table where int8field=123::int8;
select * from table where int8field='123';
Is it a waste of memory/performance to make a text field
primary key?


Well, that depends. If the text field is the NATURAL key, and you'll
likely want to refer to it from other tables, then it's often a good
choice, semantically at least, to use it as a pk.

Sometimes, though, you need better performance, and then you can use an
artificial pk, like a serial column, and create a unique index on the
"natural" key column (i.e. the text field) to make sure it stays unique,
but use the serial column for all table joins and such.

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

18
by: zebi | last post by:
hello, What's your opinion : The best type of temporary tablespace (DB2V7.2 SP 7 in AIX 4.3 / multiprocessor ) is DMS ou SMS ? Thanks ZEB
115
by: Mark Shelor | last post by:
I've encountered a troublesome inconsistency in the C-language Perl extension I've written for CPAN (Digest::SHA). The problem involves the use of a static array within a performance-critical...
12
by: wxs | last post by:
Many times we have a bunch of enums we have from either different enums or the same enum that will have various numeric values assigned. Rarely will there be collisions in numbering between the...
19
by: Tom Jastrzebski | last post by:
Hello, I was just testing VB.Net on Framework.Net 2.0 performance when I run into the this problem. This trivial code attached below executed hundreds, if not thousand times faster in VB 6.0...
48
by: Alex Chudnovsky | last post by:
I have come across with what appears to be a significant performance bug in ..NET 2.0 ArrayList.Sort method when compared with Array.Sort on the same data. Same data on the same CPU gets sorted a...
2
by: rAinDeEr | last post by:
Hi, I have a table with the following definition. The primary key is west_nme and west_eff_tms and i have created unique index on them. The foreign keys which reference other parent tables are...
10
by: shsandeep | last post by:
The ETL application loaded around 3000 rows in 14 seconds in a Development database while it took 2 hours to load in a UAT database. UAT db is partitioned. Dev db is not partitioned. the...
9
by: HC | last post by:
Hello, all, I started out thinking my problems were elsewhere but as I have worked through this I have isolated my problem, currently, as a difference between MSDE and SQL Express 2005 (I'll just...
13
by: atlaste | last post by:
Hi, I'm currently developing an application that uses a lot of computational power, disk access and memory caching (to be more exact: an information retrieval platform). In these kind of...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.