473,385 Members | 1,356 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Problem with query plan

I have come up with a simple query that runs horribly depending on the
number of columns selected.

select order_lines.*
from orders, order_lines
where orders.merchant_order_id = '11343445' and
order_lines.order_id=orders.order_id;

merchant_order_id is indexed.
order_id is indexed.
Tables are analyzed.

I get the following plan:

---------------------------------------------------------
Merge Join (cost=nan..nan rows=3 width=1257)
Merge Cond: ("outer".order_id = "inner".order_id)
-> Sort (cost=5.33..5.33 rows=2 width=4)
Sort Key: orders.order_id
-> Index Scan using ak_po_number on orders (cost=0.00..5.32
rows=2 width=4)
Index Cond: ((merchant_order_id)::text =
'11343445'::text)
-> Sort (cost=nan..nan rows=2023865 width=1257)
Sort Key: order_lines.order_id
-> Seq Scan on order_lines (cost=0.00..83822.65 rows=2023865
width=1257)

If I restrict the columns (i.e., select 1 from ...), it works great.

I can add columns and it seems that once I get a width of more than
~610, it executes a Merge Join of cost nan that takes forever to return.

If I reduce the columns returned to slightly below this, I get a much
nicer plan:

----------------------------------------------------------
Nested Loop (cost=0.00..16.60 rows=4 width=606)
-> Index Scan using ak_po_number on orders (cost=0.00..5.69 rows=3
width=4)
Index Cond: ((merchant_order_id)::text = '11343445'::text)
-> Index Scan using ak_order_line_doid on order_lines
(cost=0.00..3.61 rows=2 width=610)
Index Cond: (order_lines.order_id = "outer".order_id)

Is this possibly just an overflow that causes a NaN that isn't properly
handled by the optimizer?

This is on Redhat 3.0 AS U3 x86 with the RPMs from postgresql.org.

Thanks!


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #1
8 1659
Cott Lang <co**@internetstaff.com> writes:
-> Sort (cost=nan..nan rows=2023865 width=1257)


What PG version is this? My recollection is we fixed such a thing quite
some time ago ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #2

Oops, sorry - guess I left that out - 7.4.5. :)
On Fri, 2004-10-22 at 12:28, Tom Lane wrote:
Cott Lang <co**@internetstaff.com> writes:
-> Sort (cost=nan..nan rows=2023865 width=1257)


What PG version is this? My recollection is we fixed such a thing quite
some time ago ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #3
Cott Lang <co**@internetstaff.com> writes:
Oops, sorry - guess I left that out - 7.4.5. :)


Hmm ... I can't duplicate any misbehavior here. Are you using
nondefault values for any planner parameters? (particularly sort_mem,
random_page_cost, effective_cache_size)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #4
shared_buffers = 16384
sort_mem = 8192
random_page_cost = 2
effective_cache_size = 3932160
On Fri, 2004-10-22 at 13:32, Tom Lane wrote:
Cott Lang <co**@internetstaff.com> writes:
Oops, sorry - guess I left that out - 7.4.5. :)


Hmm ... I can't duplicate any misbehavior here. Are you using
nondefault values for any planner parameters? (particularly sort_mem,
random_page_cost, effective_cache_size)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #5
Cott Lang <co**@internetstaff.com> writes:
sort_mem = 8192
random_page_cost = 2
effective_cache_size = 3932160


effective_cache_size 30Gb ? Seems a tad high ;-)

However, I set up a dummy test case on 7.4.5 and don't see any overflow.

regression=# create table z1(f1 char(1253));
CREATE TABLE
regression=# update pg_class set reltuples=2023865, relpages=65000 where relname = 'z1';
UPDATE 1
regression=# set sort_mem = 8192;
SET
regression=# set random_page_cost = 2;
SET
regression=# set effective_cache_size = 3932160;
SET
regression=# explain select * from z1 order by f1;
QUERY PLAN
---------------------------------------------------------------------
Sort (cost=2200533.17..2205592.83 rows=2023865 width=1257)
Sort Key: f1
-> Seq Scan on z1 (cost=0.00..85238.65 rows=2023865 width=1257)
(3 rows)

Can you try this exact test case and see if you get a NAN?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #6
On Fri, 2004-10-22 at 14:19, Tom Lane wrote:
Cott Lang <co**@internetstaff.com> writes:
sort_mem = 8192
random_page_cost = 2
effective_cache_size = 3932160
effective_cache_size 30Gb ? Seems a tad high ;-)


It's a 32GB machine with nothing else running on it except PG, buffers
hover around 31GB :)
However, I set up a dummy test case on 7.4.5 and don't see any overflow.
Can you try this exact test case and see if you get a NAN?


I don't. After a bounce, I also can't repeat my original case; it now
returns 16.60.

Fiddling with the above values, only setting sort_mem absurdly large
easily causes NAN.

My guess is there was a wonky setting for sort_mem that disappeared
after I bounced.
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #7
Cott Lang <co**@internetstaff.com> writes:
Fiddling with the above values, only setting sort_mem absurdly large
easily causes NAN.


Ah. I see an overflow case for sort_mem exceeding 1Gb; that's probably
what you tickled.

I've fixed this in HEAD, but it doesn't seem worth back-patching.
If you care, the change in HEAD is

*** src/backend/optimizer/path/costsize.c.orig Sun Aug 29 01:06:43 2004
--- src/backend/optimizer/path/costsize.c Fri Oct 22 20:02:39 2004
***************
*** 566,572 ****
if (nbytes > work_mem_bytes)
{
double npages = ceil(nbytes / BLCKSZ);
! double nruns = nbytes / (work_mem_bytes * 2);
double log_runs = ceil(LOG6(nruns));
double npageaccesses;

--- 566,572 ----
if (nbytes > work_mem_bytes)
{
double npages = ceil(nbytes / BLCKSZ);
! double nruns = (nbytes / work_mem_bytes) * 0.5;
double log_runs = ceil(LOG6(nruns));
double npageaccesses;
but the variable names have changed since 7.4 so this won't apply
cleanly.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #8
Tom Lane wrote:
Cott Lang <co**@internetstaff.com> writes:
Fiddling with the above values, only setting sort_mem absurdly large
easily causes NAN.

Ah. I see an overflow case for sort_mem exceeding 1Gb; that's probably
what you tickled.

I've fixed this in HEAD, but it doesn't seem worth back-patching.
If you care, the change in HEAD is

*** src/backend/optimizer/path/costsize.c.orig Sun Aug 29 01:06:43 2004
--- src/backend/optimizer/path/costsize.c Fri Oct 22 20:02:39 2004
***************
*** 566,572 ****
if (nbytes > work_mem_bytes)
{
double npages = ceil(nbytes / BLCKSZ);
! double nruns = nbytes / (work_mem_bytes * 2);
double log_runs = ceil(LOG6(nruns));
double npageaccesses;

--- 566,572 ----
if (nbytes > work_mem_bytes)
{
double npages = ceil(nbytes / BLCKSZ);
! double nruns = (nbytes / work_mem_bytes) * 0.5;
double log_runs = ceil(LOG6(nruns));
double npageaccesses;
but the variable names have changed since 7.4 so this won't apply
cleanly.


If somebody care about apply this for 7.4, here there is the equivalent change:
--- costsize.c.orig 2004-10-23 11:17:38.000000000 +0200
+++ costsize.c 2004-10-23 11:19:04.000000000 +0200
@@ -548,7 +548,7 @@
if (nbytes > sortmembytes)
{
double npages = ceil(nbytes / BLCKSZ);
- double nruns = nbytes / (sortmembytes * 2);
+ double nruns = ( nbytes / sortmembytes ) * 0.5 ;
double log_runs = ceil(LOG6(nruns));
double npageaccesses;



Regards
Gaetano Mendola



Nov 23 '05 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Brian Oster | last post by:
After applying security patch MS03-031 (Sql server ver 8.00.818) a query that used to execute in under 2 seconds, now takes over 8 Minutes to complete. Any ideas on what the heck might be going...
0
by: apb18 | last post by:
A bit of query plan strangeness. Suppose you have an inheritance tree such that the columns 'ID' and 'field' appear in the top level table, call that table XXX. tables YYY and ZZZ both inherit...
8
by: Együd Csaba | last post by:
Hi All, how can I improve the query performance in the following situation: I have a big (4.5+ million rows) table. One query takes approx. 9 sec to finish resulting ~10000 rows. But if I run...
6
by: Ryan | last post by:
I came across a situation that I've been unable to explain and was hoping somebody had an answer: I had written an update query which was taking about 8 seconds to run and considered it too...
20
by: Development - multi.art.studio | last post by:
Hello everyone, i just upgraded my old postgres-database from version 7.1 to 7.4.2. i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file with copy-commands and to one file using...
15
by: Együd Csaba | last post by:
Hi All, I've a problem with the perfprmance of the production environment. I've two db servers. One on my laptop computer (2Ghz, 1GB, WinXP, Cygwin, Postgres 7.3.4) and one on a production server...
7
by: stig | last post by:
hi. coming from postgresql, i am used to textual references to most of the things i do with the database. i feel a little lost with all the graphical. i have few questions regarding MS SQL 2000...
6
by: Not4u | last post by:
Hello Config : SQL 2000 on WIN 2000 (IIS 5.0) In my ASP page for some queries i have this error : Microsoft OLE DB Provider for SQL Server error '80040e31' Timeout expired
6
by: billmiami2 | last post by:
I'm experiencing a strange problem that I believe is related to ADO.NET but I can't say for sure. I have a simple ASP.NET reporting interface to a SQL Server 2000 database. One report that we...
9
by: HC | last post by:
Hello, all, I started out thinking my problems were elsewhere but as I have worked through this I have isolated my problem, currently, as a difference between MSDE and SQL Express 2005 (I'll just...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.