I have a customer table with CUSTOMER_ID as the primary key (index
cust_pkkey); the table has 102,834 records in it.
The following select statement works fine:
select * from customer order by customer_id;
QUERY PLAN:
Index Scan using cust_pkkey on customer (cost=0.00..5175.17 rows=102834 width=724)
Total runtime: 5999.47 msec
but...
select * from customer order by customer_id, first_name;
QUERY PLAN:
Sort (cost=142028.25..142028.25 rows=102834 width=724)
  -> Seq Scan on customer (cost=0.00..4617.34 rows=102834 width=724)
Total runtime: 19999.81 msec
It seems that the optimizer should be able to detect (in this instance at
least) that the first order by field is a primary key and should not
consider the other fields because it's pointless... the resultset will be in
<primary key> order.
NOTE: I'm testing this on PostgreSQL 7.2 for Windows, so this may have
already been dealt with.
Thanks and keep up the great work!!
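For the record, the poster's premise is easy to verify: with a unique leading sort key, a trailing sort key can never affect the output order. A minimal sketch using SQLite from Python (an illustration only, not the original 102,834-row PostgreSQL setup):

```python
import sqlite3

# Because customer_id is unique, no two rows can tie on it, so adding
# first_name to the ORDER BY can never change the row order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(3, "Carol"), (1, "Alice"), (2, "Bob")])

by_pk = conn.execute(
    "SELECT * FROM customer ORDER BY customer_id").fetchall()
by_both = conn.execute(
    "SELECT * FROM customer ORDER BY customer_id, first_name").fetchall()

assert by_pk == by_both        # the extra sort key is provably redundant
print(by_pk)                   # [(1, 'Alice'), (2, 'Bob'), (3, 'Carol')]
```

The question in the thread is whether the planner should spend cycles proving this for you.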
"Bupp Phillips" <he***@noname.com> writes:
> but...
> select * from customer order by customer_id, first_name;
> QUERY PLAN:
> Sort (cost=142028.25..142028.25 rows=102834 width=724)
>   -> Seq Scan on customer (cost=0.00..4617.34 rows=102834 width=724)
> Total runtime: 19999.81 msec
Actually in this case the optimizer would likely still use a sequential scan
even if it had an index it thought it could use. If you're going to be reading
the whole table anyway, it'll be faster to read it in order than to jump all
around, even if you have to sort it afterwards.
However you do have a point. In this case I don't think postgres even
considers using the index. Even if it would decide not to use it in this case
there could conceivably be cases where it would want to use it.
However I'm not sure I see a lot of cases where this would come up. Even in
automatically generated code, which is the usual cause of redundant things
like this, I don't think I've seen this particular combination ever come up.
--
greg
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Greg Stark <gs*****@mit.edu> writes:
> "Bupp Phillips" <he***@noname.com> writes:
>> select * from customer order by customer_id, first_name;
>> [ where customer_id is the primary key ]

> However you do have a point. In this case I don't think postgres even
> considers using the index.

It will not, since the index does not appear to provide the correct sort
order.
> However I'm not sure I see a lot of cases where this would come up.
Yes, that's the real crux of the matter. Should the optimizer spend
cycles on *every* query to detect cases where the user has written
useless sort keys? I've got grave doubts that it's a win. ISTM such
an optimization penalizes the folk who write their queries well to
benefit those who are careless.
regards, tom lane
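For what it's worth, the check being debated — dropping any sort key that follows a prefix already covered by a unique constraint — is small enough to sketch in a few lines (a hypothetical standalone helper, not actual PostgreSQL planner code):

```python
def prune_sort_keys(sort_keys, unique_key_sets):
    """Drop ORDER BY keys that follow a provably-unique prefix.

    sort_keys: ordered list of column names from the ORDER BY clause.
    unique_key_sets: column sets carrying a uniqueness guarantee
    (primary keys, unique constraints).
    """
    seen = set()
    for i, key in enumerate(sort_keys):
        seen.add(key)
        # Once the leading keys cover a unique constraint, every tie is
        # impossible, so later keys can never affect the order.
        if any(u <= seen for u in unique_key_sets):
            return sort_keys[:i + 1]
    return sort_keys

print(prune_sort_keys(["customer_id", "first_name"], [{"customer_id"}]))
# ['customer_id']
```

Whether even this small per-query cost is worth paying on every plan is exactly the question at issue.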
Well, it's unfortunate that you feel that way, because SQL Server handles it
correctly.
"Tom Lane" <tg*@sss.pgh.pa.us> wrote in message
news:43*************@sss.pgh.pa.us...
> Yes, that's the real crux of the matter. Should the optimizer spend
> cycles on *every* query to detect cases where the user has written
> useless sort keys? I've got grave doubts that it's a win. ISTM such
> an optimization penalizes the folk who write their queries well to
> benefit those who are careless.
Tom Lane <tg*@sss.pgh.pa.us> writes:
>> However I'm not sure I see a lot of cases where this would come up.
>
> Yes, that's the real crux of the matter. Should the optimizer spend
> cycles on *every* query to detect cases where the user has written
> useless sort keys? I've got grave doubts that it's a win. ISTM such
> an optimization penalizes the folk who write their queries well to
> benefit those who are careless.
Well I'm sure the same arguments were made 30 years ago about optimizing
compilers. But thankfully the optimizer-geeks won the argument. As a result
these days we can more or less write our code in whatever form is most
readable and flexible. We can spend our time worrying about algorithmic
improvements and design abstraction, and not worry that the extra layer of
abstraction will introduce redundant code or inefficient expressions.
Already today we see database-independent toolkits that write SQL queries for
you. They introduce needless subqueries and other inefficiencies willy-nilly.
I argue that if the performance of the database is important down to a
sub-second constant term per query, then you're talking about an OLTP system
where all the queries ought to be prepared and the plans cached. If you're
talking about a system where queries are constructed ad-hoc for every
execution then you should be talking about a DSS system running batch jobs
where an extra few seconds spent optimizing could save you hours.
All that said I'm not sure the case at hand is a great example. I don't think
it would be a big performance loss, but the added code complexity for nothing
might be annoying. I don't see how even automatically generated code is likely
to generate this situation.
--
greg
Greg Stark <gs*****@mit.edu> writes:
> Tom Lane <tg*@sss.pgh.pa.us> writes:
>> Yes, that's the real crux of the matter. Should the optimizer spend
>> cycles on *every* query to detect cases where the user has written
>> useless sort keys? I've got grave doubts that it's a win.
>
> Well I'm sure the same arguments were made 30 years ago about optimizing
> compilers. But thankfully the optimizer-geeks won the argument.
Um ... I *am* an optimizer-geek. You can find my name in the credits
for Bliss/11, which was the best optimizing compiler on the planet about
thirty years ago. I stand by my comment that there's a tradeoff between
the potential gain from an optimization and the time spent to find it.
PG is at a disadvantage compared to typical compilation scenarios, in
that a compiler assumes its output will be executed many times, while
SQL queries often are planned and then executed but once. There's been
some talk of working harder when planning a "prepared statement", but
so far I've not seen very many places where I'd really want to alter
the planner's behavior on that basis.
regards, tom lane
On Thu, 4 Sep 2003, Tom Lane wrote:
> PG is at a disadvantage compared to typical compilation scenarios, in
> that a compiler assumes its output will be executed many times, while
> SQL queries often are planned and then executed but once. There's been
> some talk of working harder when planning a "prepared statement", but
> so far I've not seen very many places where I'd really want to alter
> the planner's behavior on that basis.
An interesting point. Perhaps storing some stats on Views would
help. Maybe adding a cache facility for views would speed some things up.
I don't really see anything against storing stats on "Prepared
Statements" and Views like we do on Tables.
Maybe indexes on Views would be useful, but keeping them up to date
would be a hazard.
Peter Childs
Tom Lane <tg*@sss.pgh.pa.us> writes:
> Um ... I *am* an optimizer-geek.

That explains so much :)

> I stand by my comment that there's a tradeoff between the potential gain
> from an optimization and the time spent to find it.
Well, there are always tradeoffs in engineering. I'm just trying to push a
little bit in one direction and make you rethink your assumptions.
In the past postgres users were largely OLTP systems (mostly web sites)
running ad-hoc queries that are parsed and planned on every execution, with
parameters interpolated into the query string. That's crap. With the new FE
binary protocol, and a bit of a push to the driver writers, a good
high-performance (and secure) system ought to be able to run entirely
prepared queries that are prepared once per backend process and executed
thousands of times.
> PG is at a disadvantage compared to typical compilation scenarios, in
> that a compiler assumes its output will be executed many times, while
> SQL queries often are planned and then executed but once. There's been
> some talk of working harder when planning a "prepared statement", but
> so far I've not seen very many places where I'd really want to alter
> the planner's behavior on that basis.
I think that's backwards actually. The queries where you would want to work
super extra hard, spending a few seconds or even minutes checking possible
plans, are the ones that will run for hours. Those are more likely to be
unprepared, ad-hoc DSS queries that will be executed only once.
For OLTP queries I think postgres can afford to not worry about small
(subsecond) constant-time optimizations even if they're unlikely to return big
benefits because OLTP systems should be running entirely prepared queries.
--
greg
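The workflow Greg describes can be sketched with any driver that separates statement text from parameter values (SQLite's Python module here purely for illustration; with PostgreSQL one would use PREPARE/EXECUTE or a driver built on the binary FE protocol):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")

# Interpolating the parameter into the SQL string produces a different
# statement text per value, forcing a fresh parse/plan every time (and
# inviting SQL injection):
#   conn.execute(f"SELECT ... WHERE customer_id = {cid}")
#
# A parameterized query keeps one constant statement text, so it can be
# planned once and executed many times with different bindings.
stmt = "SELECT first_name FROM customer WHERE customer_id = ?"
for cid in (1, 1, 1):
    row = conn.execute(stmt, (cid,)).fetchone()

print(row)  # ('Alice',)
```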
On Wed, 3 Sep 2003, Bupp Phillips wrote:
> Well, it's unfortunate that you feel that way, because SQL Server handles
> it correctly.
For some definition of correctly. If you're on a system that pays a
.001-second planning penalty on every query that uses a multi-column
order by, you run 100 million queries this optimization doesn't apply to,
and the one query it does apply to saves you 30 seconds, is that correct?
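Stephan's hypothetical is easy to put numbers on (his illustrative figures, not measurements):

```python
# A tiny per-plan penalty paid by every query, versus a one-time saving
# on the single query the optimization actually helps.
queries = 100_000_000
penalty_per_plan = 0.001       # seconds added to planning each query
saved_by_the_one_query = 30.0  # seconds

total_cost = queries * penalty_per_plan
print(f"paid {total_cost:,.0f} s (~{total_cost / 3600:.1f} h) "
      f"to save {saved_by_the_one_query:.0f} s")
# paid 100,000 s (~27.8 h) to save 30 s
```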
What the heck are you talking about?
SQL Server returns the query immediately (milliseconds) whereas PGSQL
takes 6 seconds!!
So, yes, I would say that SQL Server did it correctly in this situation. The
optimizer did what it was supposed to do... find the quickest way to get me
the data.
Now if you guys here at PGSQL don't believe it's necessary to change the
optimizer for this purpose, that's fine, but don't start putting other DBMSs
down because they DO handle this situation.