473,803 Members | 3,752 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Not using index

Hi,

I have a table with about 100K rows, on which I have created a btree index
of the type table_name(int, int, int, timestamp).

At first postgres was using it for my AND query on all four columns, but
after dropping it and creating different ones and testing, it suddenly
stopped using it. Vaccuuming, reindexing, recreating the table and even
recreating the database all didn't help.

Then I discovered "set enable_seqscan to off". It started using the index
again. This is certainly not a case of it not being beneficial to use the
index; using the index drops the query time from 260ms to 36ms!

In most references I have seen, setting enable_seqscan to off is something
only to be done for testing. But obviously if Postgres doesn't start
behaving properly, I can't go to production without it. Is this the case
or do many use this feature in production?

Can you give Postgres index hints like you can do in Oracle and Sybase?

Any clues?

Thanks,
Bas.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #1
5 17329
Bas Scheffers wrote:
Then I discovered "set enable_seqscan to off". It started using the index
again. This is certainly not a case of it not being beneficial to use the
index; using the index drops the query time from 260ms to 36ms!

In most references I have seen, setting enable_seqscan to off is something
only to be done for testing. But obviously if Postgres doesn't start
behaving properly, I can't go to production without it. Is this the case
or do many use this feature in production?

Can you give Postgres index hints like you can do in Oracle and Sybase?


What is the exact query and table schema?

Are you sure it is not problem of cross data-types in where clause and indexes?

Postgresql uses indexes if it sees fit. Usually it is a better choice to let it
decide what it wants.

Shridhar

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 22 '05 #2
On Thu, 12 Feb 2004, Bas Scheffers wrote:
Hi,

I have a table with about 100K rows, on which I have created a btree index
of the type table_name(int, int, int, timestamp).

At first postgres was using it for my AND query on all four columns, but
after dropping it and creating different ones and testing, it suddenly
stopped using it. Vaccuuming, reindexing, recreating the table and even
recreating the database all didn't help.

Then I discovered "set enable_seqscan to off". It started using the index
again. This is certainly not a case of it not being beneficial to use the
index; using the index drops the query time from 260ms to 36ms!

In most references I have seen, setting enable_seqscan to off is something
only to be done for testing. But obviously if Postgres doesn't start
behaving properly, I can't go to production without it. Is this the case
or do many use this feature in production?

Can you give Postgres index hints like you can do in Oracle and Sybase?


Ok, there are two cases where postgresql doesn't use an index but should.

One is where, due to type mismatch, it can't, the other is where the
planner thinks it will be slower, but in fact would be faster.

the set enable_seqscan = off trick allows you to see which it is. Since
postgresql then used the index, it must be capable, but just doesn't
realize it should.

There are a few settings that tell postgresql how to decide which to use,
they are:

(All the following are in terms of comparison to a sequencial page access)

random_page_cos t:
This one tells the server how much a random access costs versus a
sequential access. default of 4. On machines with lots of cache / fast
drives / lots of drives in a RAID array / RAID array with cache (i.e.
things that speed up random access) you may want to set this lower. I run
mine at 1.4. anything below 1.0 is unrealistic, but may be necessary to
force the right plan sometimes. As a global setting, I'd recommend
something between 1 and 2 for most servers.

effective_cache _size:
This tells the planner how big the kernel cache being used for caching
postgresql is. If postgresql is the only thing on the machine, then it is
likely that all the kernel cache is being used for postgresql. higher
favors index scans, since it's more likely the data will be in memory,
where random and seq cost the same, 1.0
cpu_tuple_cost < each tuple retrieved
cpu_index_tuple _cost < each tuple's (additional?) cost for an index
Lowering this favors index scans.
cpu_operator_co st < not sure, I think it's stuff like nestloop loop cost
and such.

So, to start with, try changing random page cost. you can change it for
the current session only for testing, and try to find the "breakover
point" where it forces the planner to make the right decision.

Also, keep a large table around you can do a select * from bigtable to
clear the caches and then run the original query, and compare the
performance of seq versus index. you'll often find that a query that
screams when the caches are full of your data is quite slow when the cache
is empty.
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 22 '05 #3
Hi Scot,

scott.marlowe said:
So, to start with, try changing random page cost. you can change it for As "unrealisti c" as it should be, I need <1 before Postgres takes the
bait. Initialy 0.7, to be exact, but later It also worked at a little
higher setting of 1. I have given PG 96Mb of memory to play with, so
likely all my data will be in cache. So no very fast disk (6MB/sec reads),
but loads of RAM.

Should I try tweaking any of the other parameters?
performance of seq versus index. you'll often find that a query that
screams when the caches are full of your data is quite slow when the cache
is empty.

True, but as this single query is going to be the work horse of the web
service I am developing, it is likely all data will always be in memory,
even if I'd have to stick several gigs of ram in.

Thanks,
Bas.
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 22 '05 #4
On Thu, 12 Feb 2004, Bas Scheffers wrote:
Hi Scot,

As "unrealisti c" as it should be, I need <1 before Postgres takes the
bait. Initialy 0.7, to be exact, but later It also worked at a little
higher setting of 1. I have given PG 96Mb of memory to play with, so
likely all my data will be in cache. So no very fast disk (6MB/sec reads),
but loads of RAM.

Should I try tweaking any of the other parameters?


Yes. drop cpu_tuple_index _cost by a factor of 100 or so

cpu_index_tuple _cost = 0.001
to
cpu_index_tuple _cost = 0.0001
or
cpu_index_tuple _cost = 0.00001

Also up effective_cache _size. It's measured in 8k blocks, so for a
machine with 1 gig of ram, and 700 meg of that in kernel cache, you'd want
approximately 90000 for that. Note that this is not an exact measure, and
it's ok if you like and make it even larger to ensure the database
thinks we have gobs of RAM.
performance of seq versus index. you'll often find that a query that
screams when the caches are full of your data is quite slow when the cache
is empty.

True, but as this single query is going to be the work horse of the web
service I am developing, it is likely all data will always be in memory,
even if I'd have to stick several gigs of ram in.


Note that rather than "set enable_seqscan= off" for the whole database, you
can always set it for just this session / query.

When you run explain analyze <query> are any of the estimates of rows way
off versus the real number of rows? If so, you may need to analyze more
often or change the column's stat target to get a good number. and some
query plans just don't have any way of knowing, so they just guess, and
there's no way to change what they're guessing, so setting random page
cost to <1 may be the only answer.
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #5
scott.marlowe said:
Yes. drop cpu_tuple_index _cost by a factor of 100 or so No effect.
Also up effective_cache _size. It's measured in 8k blocks, so for a That's better, set to 9000, which seems reasonable for my current setup,
it will start using the index when RANDOM_PAGE_COS T <= 1.5.
Note that rather than "set enable_seqscan= off" for the whole database, you
can always set it for just this session / query. Considering how rare a case it is that a table scan is more efficient than
using proper indexes, that might not be a bad idea.
When you run explain analyze <query> are any of the estimates of rows way
off versus the real number of rows? If so, you may need to analyze more

They are actualy depending on what stage it is in, it is execting a factor
20 to 100 rows more than actualy are returned. That sounds way off to me.

Here's what's happening: first there is the index scan, which would return
about 5000 rows (the planner is expecting 3700). But it doesn't return
that, as there is another filter happening (circle ~ point) which reduces
the actual number of rows to 242. That number is then further reduced to
32 by a tsearch2 query, but the planner is still expecting 3700 rows by
that stage.

I tried upping the statistics for the columns I am searching on and
running analyze on the table, but without results.

So I guess I am stuck with setting the effective_cache _size to a sane
value and lowering the random_page_cos t value to something not much higher
than 1. Hey, as long as it works!

Thanks,
Bas.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 22 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
1885
by: Dynamo | last post by:
Not sure if this is the right group for this question but here goes anyway. Does using index.php instead of index.htm affect the way that your site is crawled by major search engines? Reason I ask is that I have 2 sites. The site using index.php has no other pages listed against it when you click on the "More pages from this site" link within Yahoo and Lycos. The site using index.htm is correctly listed. Apologies if this is the wrong...
5
5140
by: Boo | last post by:
Can someone explain to me why this simple query will not use an index on the field confirm_date? select * from comments where confirm_date != 0 confirm_date is an integer, and I have a regular index on it. When I use EXPLAIN it shows all 1233 rows being searched with NULL for the possible keys. Thanks for your help.
2
2949
by: sridharg.rao | last post by:
Hi, I have a table t1 with columns a,b,c,d,e,f,g,h,i,j,k,l I have created a clustered index on a,b,d,e which forms the primary key. I have created a covering index on all the columns of t1. There are 1 million rows in this table. My query chooses the TOP20 rows based on some filter conditions. When I use an "ORDER BY 1", it uses the clustered index and I get the result
6
8228
by: talfano | last post by:
Hello, We are having a very strange problem. We have a table with about 5 million rows in it. The problem is with one of the non clustered indexes. I have noticed that sometimes in query analyzer, when doing an execution plan, the optimizer is NOT doing an index seek, or a bookmark lookup when the query should. It sometimes will do a full clustered index scan on the primary key, which takes much longer. For example:
1
1897
by: Rudy Koento | last post by:
Hi, I've created an index but it's not being used by postgresql when doing a query. But doing an "explain analyze" shows that with index, it's faster. Here's the output: ------------------------ SET enable_seqscan = off; EXPLAIN ANALYZE SELECT S.* FROM sales S, staff ST WHERE S.staff_no=ST.staff_no AND ST.name='Rudy';
14
2247
by: Craig O'Shannessy | last post by:
Hi all, Just thought I'd mention that I really think this problem needs to be fixed. I I'm patching the 7.4RC1 JDBC drivers as we speak due to this optimiser bug, and it's the third time I've had to do this. I would think this bug causes quite a lot of people to evaluate postgres and decide it has awful primary key performance! I love postgres, and hate to think that this could be happening.
6
2049
by: Jack Orenstein | last post by:
Suppose I have a table as follows: testdb=> \d person Table "public.person" Column | Type | Modifiers ------------+-------------------------+----------- id | integer | not null age | integer | other_info | character varying(1000) | Indexes: person_pkey primary key btree (id),
1
18725
by: sksksk | last post by:
I want to achieve the following process in the smarty for $item one i should be able to get the value using loop.index, but without any luck. any help is appreciated. <?php for ($i = 1; $i <= 30; $i++) : ?> <tr> <th><?= $i ?></th>
1
3217
by: Steffen Stellwag | last post by:
Truely is often better to scan a table in full passing by an index , but if you can force the optimizer to use an index via a hint for testing and comparing the results. But the index in the above example is not used , because the hint is malformed, if tables in a Select statment are named by aliases you have to specify the alias name in the hint statment , not the table name /*+ INDEX (ICWOIMP PK_ICWOIMP) */ change to /*+ INDEX (A...
0
10317
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10300
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9127
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7607
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6844
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5503
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4277
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3802
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2974
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.