473,568 Members | 2,964 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Optimizer isn't perfect


Hm, here's a query where the optimizer is choosing the wrong plan by far. I
think it boils down to it guessing wrong on how selective an rtree index is,
which I guess would be hard to predict.

Except if it guesses wrong by assuming it isn't selective it would be maybe
50% slower doing lots of index lookups instead of a more efficient full table
scan and join. If it guesses wrong by assuming it'll be very selective as it
is in this case then it's 1000% slower doing repeated full table scans.

This is the same thing someone else pointed out a while ago. The consequences
of guessing wrong and favouring a full table scan instead of index lookups
with nested loops are simply much more severe than the reverse.

Even if the full table scan was faster with one dataset I might prefer to use
the slower index lookup and live with the lower performance but be able to
guarantee that a sudden shift in data distribution or user behaviour wouldn't
suddenly cause a complete outage. I'm wondering if it doesn't make sense to go
into production with enable_seqscan = off.
slo=> explain analyze
SELECT (select count(distinct xx_id) from xx_thing where thing_id in (select thing_id from thing where group_id = group.group_id) ) as num_xxs
FROM group
WHERE geom2 @ make_box(-79.3885,43.6438 ,65)
AND earth_dist(geom , -79.3885,43.6438 ) < 65

;
slo-> slo-> slo-> slo-> slo-> slo->
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_group_geom on group (cost=0.00..130 .62 rows=1 width=4) (actual time=69.53..741 0.31 rows=125 loops=1)
Index Cond: (geom2 @ '(-78.580156635050 8,44.2287532037 437),(-80.196843364949 2,43.0588467962 563)'::box)
Filter: (sqrt((pow((80. 4113732090646:: double precision * (geom[0] - -79.3885::double precision)), 2::double precision) + pow((111.12::do uble precision * (geom[1] - 43.6438::double precision)), 2::double precision))) < 65::double precision)
SubPlan
-> Aggregate (cost=127.44..1 27.44 rows=1 width=4) (actual time=59.11..59. 11 rows=1 loops=125)
-> Hash IN Join (cost=19.00..12 7.42 rows=7 width=4) (actual time=36.63..58. 66 rows=11 loops=125)
Hash Cond: ("outer".thing_ id = "inner".thing_i d)
-> Seq Scan on xx_thing (cost=0.00..81. 23 rows=5423 width=8) (actual time=0.03..32.6 4 rows=5423 loops=125)
-> Hash (cost=18.97..18 .97 rows=15 width=4) (actual time=0.59..0.59 rows=0 loops=125)
-> Index Scan using idx_thing_group on thing (cost=0.00..18. 97 rows=15 width=4) (actual time=0.21..0.42 rows=8 loops=125)
Index Cond: (group_id = $0)
Total runtime: 7424.11 msec
(12 rows)

slo=> slo=> set enable_seqscan = off;
SET

slo=> set enable_mergejoi n = off;
SET

slo=> explain analyze
SELECT (select count(distinct xx_id) from xx_thing where thing_id in (select thing_id from thing where group_id = group.group_id) ) as num_xxs
FROM group
WHERE geom2 @ make_box(-79.3885,43.6438 ,65)
AND earth_dist(geom , -79.3885,43.6438 ) < 65

;
slo-> slo-> slo-> slo-> slo-> slo-> QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_group_geom on group (cost=0.00..149 .93 rows=1 width=4) (actual time=0.74..239. 90 rows=125 loops=1)
Index Cond: (geom2 @ '(-78.580156635050 8,44.2287532037 437),(-80.196843364949 2,43.0588467962 563)'::box)
Filter: (sqrt((pow((80. 4113732090646:: double precision * (geom[0] - -79.3885::double precision)), 2::double precision) + pow((111.12::do uble precision * (geom[1] - 43.6438::double precision)), 2::double precision))) < 65::double precision)
SubPlan
-> Aggregate (cost=146.75..1 46.75 rows=1 width=4) (actual time=1.81..1.81 rows=1 loops=125)
-> Nested Loop (cost=19.00..14 6.73 rows=7 width=4) (actual time=0.65..1.60 rows=11 loops=125)
-> HashAggregate (cost=19.00..19 .00 rows=15 width=4) (actual time=0.36..0.63 rows=8 loops=125)
-> Index Scan using idx_thing_group on thing (cost=0.00..18. 97 rows=15 width=4) (actual time=0.09..0.22 rows=8 loops=125)
Index Cond: (group_id = $0)
-> Index Scan using idx_xx_thing_lo c on xx_thing (cost=0.00..8.4 3 rows=7 width=8) (actual time=0.05..0.08 rows=1 loops=1055)
Index Cond: (xx_thing.thing _id = "outer".thing_i d)
Total runtime: 252.60 msec
(12 rows)

slo=> >

--
greg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #1
2 1398
Greg Stark <gs*****@mit.ed u> writes:
This is the same thing someone else pointed out a while ago. The consequences
of guessing wrong and favouring a full table scan instead of index lookups
with nested loops are simply much more severe than the reverse.


It is easy to demonstrate cases where this isn't true.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #2
On Fri, Aug 29, 2003 at 10:37:32AM -0400, Greg Stark wrote:
Except if it guesses wrong by assuming it isn't selective it would be maybe
50% slower doing lots of index lookups instead of a more efficient full table
scan and join. If it guesses wrong by assuming it'll be very selective as it
is in this case then it's 1000% slower doing repeated full table scans.


Well, there are operators that need some work. I am seeing less than
ideal behaviour with OVERLAPS, but I think it isn't even documented...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"In a specialized industrial society, it would be a disaster
to have kids running around loose." (Paul Graham)

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
2392
by: Nick Mudge | last post by:
Hi, Does anybody know the performance difference between having your PHP code cached and just running your code with the zend optimizer? Is there much difference? Nick
9
4738
by: hemal | last post by:
I came across a very strange situation at work. There is an order of magnitude difference in execution time for the following two queries (10 v/s ~130 msec): select count(*) from table_name where column_name = 'value' or 1 = 0 select count(*) from table_name where column_name = 'value' I do not want to go into the reason why the...
2
1719
by: Daniel Roy | last post by:
Guys, what I need is a tool which gives details on the choice of an execution plan by the SQL Server. For example, the cost for a hash join might be 200 and 100 for a nested loop, and therefore a nested loop is used. Same thing for the access paths for each table/view involved. In Oracle, we turn on event 100053 to see this kind of info. ...
3
1585
by: Philip Yale | last post by:
I'm very puzzled by the choice of NC index being made by the optimizer in this example. I don't actually think it should use an NC index at all. I have: Table: CustomerStatus_T Single data page 19 records Clustered Index on CustomerStatusID:
14
1954
by: Bupp Phillips | last post by:
I have a customer table that has the field CUSTOMER_ID as the primary key (cust_pkkey), the table has 102,834 records in it. The following select statement works fine: select * from customer order by customer_id; QUERY PLAN: Index Scan using cust_pkkey on customer (cost=0.00..5175.17 rows=102834 width=724)
3
2081
by: Peter Arrenbrecht | last post by:
Hi all We ran into a very annoying optimizer problem recently. We had added a new key and self-join relation to a large production table. That key will be filled very rarely and having just being added was practically never filled in when the first user tried to delete a row from the table. Now, the optimizer tried to enforce RI on the...
9
2794
by: Andrea | last post by:
Hi, I've read the former postings but was not able to solve my problem: I have a Summary Table (or MQT as you like it) and the query optimizer does not seem to use the summary table. I run the following commands: set current query optimization 9;
2
3900
by: boa sema | last post by:
Way back when, and at least in version 7 IIRC, the query optimizer gave up when the where clause in a statement contained more than 4 search conditions. Does anyone know if such a limitation still exist in MS SQL 2005? The BOL seems to be silent on the issue. Boa
3
2369
by: aj | last post by:
DB2 LUW v8.2 FP 14 RHAS 2.1 Sorry if these are newbie questions. Optimizer stuff is black magic to me. For both of these, assume stats are current and an even distribution of data. ------------------------- Lets say I have a table FOO1 that has, among other columns, a column named A. There is a non-unique index on A that has medium...
0
7693
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7604
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
7916
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
0
8117
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
5498
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5217
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3651
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
1
2101
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
0
932
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.