I have a customer table that has the field CUSTOMER_ID as the primary key
(cust_pkkey); the table has 102,834 records in it.
The following select statement works fine:
select * from customer order by customer_id;
QUERY PLAN:
Index Scan using cust_pkkey on customer (cost=0.00..5175.17 rows=102834
width=724)
Total runtime: 5999.47 msec
but...
select * from customer order by customer_id, first_name;
QUERY PLAN:
Sort (cost=142028.25..142028.25 rows=102834 width=724)
  ->  Seq Scan on customer (cost=0.00..4617.34 rows=102834 width=724)
Total runtime: 19999.81 msec
It seems that the optimizer should be able to detect (in this instance at
least) that the first ORDER BY field is a primary key and should not
consider the other fields, because doing so is pointless... the result set
will already be in <primary key> order.
NOTE: I'm testing this on PostgreSQL 7.2 for Windows, so this may have
already been dealt with.
Thanks and keep up the great work!!
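For what it's worth, the equivalence the poster is relying on is easy to demonstrate outside PostgreSQL. Here is a minimal sketch using Python's sqlite3 module (illustrative schema and data, not the actual 102,834-row customer table):

```python
import sqlite3

# When the leading sort column is unique, trailing ORDER BY
# columns cannot change the order of the result set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(3, "Ann"), (1, "Zoe"), (2, "Bob")],
)

by_pk = conn.execute(
    "SELECT * FROM customer ORDER BY customer_id"
).fetchall()
by_both = conn.execute(
    "SELECT * FROM customer ORDER BY customer_id, first_name"
).fetchall()

# Because customer_id is unique, the two orderings are identical,
# so the second sort key is redundant.
assert by_pk == by_both
print(by_pk)  # [(1, 'Zoe'), (2, 'Bob'), (3, 'Ann')]
```

An optimizer that proved this equivalence could satisfy the second query with the same index scan it uses for the first.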
Stephan Szabo <ss****@megazone.bigpanda.com> writes:
> On Wed, 3 Sep 2003, Bupp Phillips wrote:
> > Well, it's unfortunate that you feel that way, because SQL Server
> > handles it correctly.
> For some definition of correctly. If you're in a system which gets
> penalized .001 seconds for each query planning that uses a multi-column
> order by, and you do 100 million of them that this doesn't apply to, and
> one that it does which saves you 30 seconds, is that correct?

Uhm. Yes. Absolutely.
For OLTP systems like a web site if there's a single query anywhere in the
system that takes 30s the system is broken. Every time that query happens to
be hit a few times in a row the OLTP system will simply break. This isn't a
performance issue, it's outright broken.
Whereas a 1ms performance hit on every query plan will make virtually no
difference at all in performance and most importantly, it will work. At worst
it means provisioning a 1% faster machine. Even then only if my system isn't
preparing all these queries in advance, in which case I have bigger
performance and security issues than a 1ms per query hit.
For DSS systems handling queries that take minutes or hours to run the case is
even clearer. The 1ms hit is even less of an issue, and the 30s gain will
balloon and turn into minutes or hours of speedup.
I'm pretty sure this particular case was not one of the cases where people
said it just wasn't worth doing. This was just too hard. Solving it in a way
that integrates cleanly with postgres's aggregates will be very hard and
require someone who understands lots of different parts of postgres to spend
an awful lot of time thinking about the problem.
[This is one of the strengths of free software. There's no marketing
department providing checklists that have to be met even if there's no good
solution today. So instead of a bad solution that sticks around for a long
time, we'll one day (hopefully) have a good solution when someone figures out
how to do it.]
--
greg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)
On 5 Sep 2003, Greg Stark wrote:
> Stephan Szabo <ss****@megazone.bigpanda.com> writes:
> > On Wed, 3 Sep 2003, Bupp Phillips wrote:
> > > Well, it's unfortunate that you feel that way, because SQL Server
> > > handles it correctly.
> > For some definition of correctly. If you're in a system which gets
> > penalized .001 seconds for each query planning that uses a multi-column
> > order by, and you do 100 million of them that this doesn't apply to, and
> > one that it does which saves you 30 seconds, is that correct?
> Uhm. Yes. Absolutely.
Even if all the people that don't write queries with redundant sort
columns pay the cost for those that do? In some ways it'd be nice if we
could control the planner more so that individual systems could make
determinations of whether this was useful or not.
From the below, I'm not sure we're talking about the same case any longer.
:)

> I'm pretty sure this particular case was not one of the cases where people
> said it just wasn't worth doing. This was just too hard. Solving it in a
> way that integrates cleanly with postgres's aggregates will be very hard
> and require someone who understands lots of different parts of postgres to
> spend an awful lot of time thinking about the problem.
I'm not sure how finding redundant sort columns due to unique constraints
really integrates with aggregates at all. Did we maybe cross conversation
with one of the aggregate discussions on min/max/count?
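The check Szabo describes, finding redundant sort columns from unique constraints, is mechanical once the constraint metadata is in hand. A minimal sketch of the pruning rule (hypothetical names, not PostgreSQL's actual planner code):

```python
def prune_sort_keys(sort_cols, unique_keys):
    """Truncate an ORDER BY list once a prefix of it covers a unique key.

    sort_cols:   ordered list of column names from the ORDER BY clause
    unique_keys: iterable of sets, one per unique/primary-key constraint

    Once the columns sorted so far include every column of some unique
    key, each row's position is fully determined, so any remaining sort
    keys are redundant and can be dropped.
    """
    seen = set()
    for i, col in enumerate(sort_cols):
        seen.add(col)
        if any(key <= seen for key in unique_keys):
            return sort_cols[: i + 1]
    return sort_cols

# ORDER BY customer_id, first_name with customer_id as the primary key:
print(prune_sort_keys(["customer_id", "first_name"], [{"customer_id"}]))
# ['customer_id']
```

The open question in the thread is not whether this rule is correct but whether looking up and testing the constraints is cheap enough to run on every query that has a multi-column ORDER BY.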
On Thu, 4 Sep 2003, Relaxin wrote:
> What the heck are you talking about?

Whether or not trying every optimization is worthwhile in every potential
situation where it might apply, in theory.
> SQLServer returns the query immediately (milliseconds) whereas PGSQL takes
> 6 seconds!! So, yes, I would say that SQLServer did it correctly in this
> situation. The optimizer did what it was supposed to do... find the
> quickest way to get me the data.
*If* you had other queries that took 6 seconds rather than milliseconds
because it was trying to work out if the optimization applied, would you
still think it was correct to be trying the optimization on those queries?
What about if it added 50 ms to every query you did with an order by, and
you never used redundant sort columns?
> Now if you guys here at PGSQL don't believe it's necessary to change the
> optimizer for this purpose, that's fine, but don't start putting other
> DBMSs down because they DO handle this situation.
Where did I put down another DBMS? (I've left my text below). I said (not
in so many words), some optimizations take time, always doing every
available optimization may not be "correct" because it may affect the
running time of other queries adversely.
I don't really know if this particular case warrants doing; I don't know
whether the time involved in checking it is negligible or not for
complicated queries. I do know that there are people who use ORDER BY
without redundant sort columns, and if there is a non-negligible cost,
they're the ones that are really going to be paying for it. Tom believed
the cost was non-negligible for its benefit; if someone else wants to do
it and gets numbers showing the costs are negligible, it's likely to get
put in.
Regardless of your argument, all I know is that SQL Server handles it
correctly (giving me the data in the fastest possible way) and postgresql
doesn't.
And your argument about how it will slow other queries is pointless to
me. Will you still stand by this argument when the PG folks find a
situation where the optimizer creates a terrible plan and the only way to
fix it is to add additional logic? Based on what you are saying, that's
not worth doing either, because it could add .001 seconds of additional
processing.
Again, like I said, if the PG folks don't see a need for this type of
optimization, then that's fine. What I'm stating is that SQLServer is a very
fast and robust database that also handles this, as you may call it, odd
situation.
Thanks
On Fri, 5 Sep 2003, Relaxin wrote:
> And your argument about how it will increase other queries is pointless to
> me. Will you still stand by this argument when the PG folks find a
> situation where the optimizer creates a terrible plan and the only way to
> fix is to add additional logic, based on what you are saying, it's not
> worth it because it could add .001 seconds of additional processing.
I've tried to say that you need to balance the cost of the optimization
against its potential gain in the cases it helps, the frequency of those
cases, the ability to solve the problems other ways and against the
development time that is used in it. There isn't a push button with two
states, best and not best.
When you want to argue about an optimization you should think about:
a) What does the optimization actually mean (actual specification, not
vague words)
b) What cases is the optimization legal for (doesn't change results,
doesn't violate spec, etc)
c) What cases does the optimization help (as separate from b in that some
cases may not actually be faster but the optimization is legal)
d) What is the gain over the cases that the optimization helps
e) What is the penalty over the cases that the optimization does not help
If someone feels strongly about it or feels that they have the time,
they can (and generally should be encouraged to) attempt the optimization.
If the optimization is inexpensive, or has better-than-expected gains or
good side effects, it's likely to get accepted.