473,659 Members | 3,592 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

bug in query planning?

I have a query which does not use column indexes that it should use. I
have discovered some interesting behaviors of Postgres which may
indicate a bug in the database's query planning.

Take a look at the query below. There is a btree index on both
m.account_id and a.account_id. Query (1) does not use the index on the
messages table, instead opting for a full table scan, thus killing
performance. The messages table can contain potentially hundreds of
thousands or millions of rows. Even at 50,000, it's murder.

Query (2) below is the same query, but we reverse the order of the
tables. It's obviously not quite the same query semantically, even
though in my case it should always produce the same result. It is
interesting to note that it uses the indexes tho.

Finally, query (3) below uses traditional joining (non-ANSI). Indexes
are correctly used in that query. The suggestion is that Postgres does
not correctly analyze queries using ANSI joins. Indexes are
occasionally skipped when they should be used. This seems like a bug
in Postgres. I'm using version 7.3.4 of Postgres.

Thanks in advance for any comments...
steve
Query (1)
=========
defender=# explain analyze
defender-# select count(message_i d)
defender-# from messages m
defender-# left join accounts a
defender-# on m.account_id::b igint = a.account_id::b igint
defender-# where a.email = 's******@neosyn apse.net';
QUERY
PLAN
------------------------------------------------------------------------
--------------------------------------------------------------------
Aggregate (cost=20461.10. .20461.10 rows=1 width=47) (actual
time=1420.09..1 420.09 rows=1 loops=1)
-> Hash Join (cost=30.77..20 334.38 rows=50687 width=47) (actual
time=0.51..1319 .69 rows=51419 loops=1)
Hash Cond: ("outer".accoun t_id = "inner".account _id)
Filter: ("inner".ema il = 's******@neosyn apse.net'::text )
-> Seq Scan on messages m (cost=0.00..192 89.87 rows=50687
width=16) (actual time=0.06..703. 89 rows=52541 loops=1)
-> Hash (cost=30.76..30 .76 rows=3 width=31) (actual
time=0.40..0.40 rows=0 loops=1)
-> Index Scan using accounts_pkey on accounts a
(cost=0.00..30. 76 rows=3 width=31) (actual time=0.17..0.38 rows=3
loops=1)
Total runtime: 1420.25 msec
(8 rows)
Query (2)
=========
defender=# explain analyze
defender-# select count(message_i d)
defender-# from accounts a
defender-# left join messages m
defender-# on a.account_id::b igint = m.account_id::b igint
defender-# where a.email = 's******@neosyn apse.net';

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------
Aggregate (cost=6806.54.. 6806.54 rows=1 width=24) (actual
time=792.14..79 2.14 rows=1 loops=1)
-> Nested Loop (cost=0.00..676 4.30 rows=16896 width=24) (actual
time=0.38..718. 12 rows=51419 loops=1)
-> Index Scan using accounts_email on accounts a
(cost=0.00..8.9 8 rows=1 width=8) (actual time=0.22..0.25 rows=1
loops=1)
Index Cond: (email = 's******@neosyn apse.net'::text )
-> Index Scan using messages_accoun t_id on messages m
(cost=0.00..654 4.13 rows=16896 width=16) (actual time=0.15..593. 15
rows=51419 loops=1)
Index Cond: ("outer".accoun t_id = m.account_id)
Total runtime: 792.33 msec
(7 rows)
Query (3)
=========
defender=# explain analyze
defender-# select count(message_i d)
defender-# from messages m, accounts a
defender-# where m.account_id::b igint = a.account_id::b igint
defender-# and a.email = 's******@neosyn apse.net';

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------
Aggregate (cost=6806.54.. 6806.54 rows=1 width=24) (actual
time=782.30..78 2.30 rows=1 loops=1)
-> Nested Loop (cost=0.00..676 4.30 rows=16896 width=24) (actual
time=0.33..708. 52 rows=51422 loops=1)
-> Index Scan using accounts_email on accounts a
(cost=0.00..8.9 8 rows=1 width=8) (actual time=0.15..0.18 rows=1
loops=1)
Index Cond: (email = 's******@neosyn apse.net'::text )
-> Index Scan using messages_accoun t_id on messages m
(cost=0.00..654 4.13 rows=16896 width=16) (actual time=0.15..578. 23
rows=51422 loops=1)
Index Cond: (m.account_id = "outer".account _id)
Total runtime: 782.46 msec
(7 rows)
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #1
6 1907
"Steven D.Arnold" <st*****@neosyn apse.net> writes:
Query (2) below is the same query, but we reverse the order of the
tables. It's obviously not quite the same query semantically, even
though in my case it should always produce the same result.
Since it is in fact not the same query, I'm unclear on why you expect
it to produce the same plan.
I'm using version 7.3.4 of Postgres.


FWIW, I believe that 7.4 will recognize that (1) and (3) are
semantically equivalent.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 12 '05 #2

On Dec 21, 2003, at 11:47 PM, Tom Lane wrote:
"Steven D.Arnold" <st*****@neosyn apse.net> writes:
Query (2) below is the same query, but we reverse the order of the
tables. It's obviously not quite the same query semantically, even
though in my case it should always produce the same result.
Since it is in fact not the same query, I'm unclear on why you expect
it to produce the same plan.


What I expect is for both queries to use the index on the messages
table! Why is it not doing that?
FWIW, I believe that 7.4 will recognize that (1) and (3) are
semantically equivalent.


I will try 7.4 and report back.

steve
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #3
The queries are listed here for the referentially (yes that's a pun)
challenged.

Query 1:
SELECT COUNT(message_i d)
FROM messages m
LEFT JOIN accounts a
ON m.account_id::b igint = a.account_id::b igint
WHERE a.email = 's******@neosyn apse.net';

Query 2:
SELECT COUNT(message_i d)
FROM accounts a
LEFT JOIN messages m
ON a.account_id::b igint = m.account_id::b igint
WHERE a.email = 's******@neosyn apse.net';

Query 3:
SELECT COUNT(message_i d)
FROM messages m, accounts a
WHERE m.account_id::b igint = a.account_id::b igint
AND a.email = 's******@neosyn apse.net';

From what I can see they are not the same query and therefore shouldn't
use the same plan.

The first query is saying go get all the messages (best done with a seq
scan since there is no where to limit the results of the message table
[using an index scan would just add the overhead of reading the pages
for the index, the computational time to resolve the index entries, and
turn the table access into a random sector read instead of sequential
without actually limiting what gets returned]) match that with as many
accounts as you can and return a row for all of the messages (note the
LEFT JOIN). Next filter all of the results on the account email (which
only eliminates 1100 messages out of 52000). Now count how many
messages are left which should return 51419.

The second query is saying get all of the accounts filter by email
address (it can get this from the where this time) giving 1 row. Now
match that to every message for this account_id and return at least one
row even if there are no messages for this account (note again the LEFT
JOIN) (which uses the index scan because it expects the index
selectivity to be a approximately 1/4 of the full table [it's wrong]).
Now count how many messages I have which returns 51419.

The third query is saying give me all of the messages for the accounts
where my email = 's******@neosyn apse.net' and I don't care where you
start from. The optimizer, after going through consideration of
various possible plans, is then smart enough to realize the email =
'blah' is indexed and it's selectivity is 1 row which means that we now
return to the situation in query 2 with one small change if there are no
messages for the account in question you would get no row returned,
leading to a more efficient aggregation step.

Steven D.Arnold wrote:

On Dec 21, 2003, at 11:47 PM, Tom Lane wrote:
"Steven D.Arnold" <st*****@neosyn apse.net> writes:
Query (2) below is the same query, but we reverse the order of the
tables. It's obviously not quite the same query semantically, even
though in my case it should always produce the same result. You are correct the queries produce the same results, but they are
telling the planner to do completely different things. The query
doesn't show it bu if the behavior you are desiring happened in postgres
(unless show the relational algebra that makes it work), I would have to
start looking for a new database (that's a disturbing thought).

Since it is in fact not the same query, I'm unclear on why you expect
it to produce the same plan.

What I expect is for both queries to use the index on the messages
table! Why is it not doing that?


Because of the table ordering and the left join in 7.3.x
Because of the left join in 7.4
FWIW, I believe that 7.4 will recognize that (1) and (3) are
semantically equivalent.

I will try 7.4 and report back.


I don't believe the optimiser (in any database that cares about giving
you the correct results) can determine that a non-constrained primary
table in a left join can be rewritten as either of your other two
queries (but there are smarter people than me working on Postgres, so I
could be wrong).

steve


My suggestion would be to place the more selective table first in a
JOIN, and get rid of the LEFT JOIN's unless that's exactly what you
want. For more information about the different JOIN methods RTFM.

I would also suggest that you might want to tune your random page cost
toward 1, because obviously random access is being over estimated for
your hardware. (You might just want to look at tuning your parameters
in general.)

And in the future you should run a query at least one extra time to note
the different caching makes (the second run for an explain analyze is
usually quite different than the first for tables of this size).

DeJuan
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #4
The queries are listed here for the referentially (yes that's a pun)
challenged.

Query 1:
SELECT COUNT(message_i d)
FROM messages m
LEFT JOIN accounts a
ON m.account_id::b igint = a.account_id::b igint
WHERE a.email = 's******@neosyn apse.net';

Query 2:
SELECT COUNT(message_i d)
FROM accounts a
LEFT JOIN messages m
ON a.account_id::b igint = m.account_id::b igint
WHERE a.email = 's******@neosyn apse.net';

Query 3:
SELECT COUNT(message_i d)
FROM messages m, accounts a
WHERE m.account_id::b igint = a.account_id::b igint
AND a.email = 's******@neosyn apse.net';

From what I can see they are not the same query and therefore shouldn't
use the same plan.

The first query is saying go get all the messages (best done with a seq
scan since there is no where to limit the results of the message table
[using an index scan would just add the overhead of reading the pages
for the index, the computational time to resolve the index entries, and
turn the table access into a random sector read instead of sequential
without actually limiting what gets returned]) match that with as many
accounts as you can and return a row for all of the messages (note the
LEFT JOIN). Next filter all of the results on the account email (which
only eliminates 1100 messages out of 52000). Now count how many
messages are left which should return 51419.

The second query is saying get all of the accounts filter by email
address (it can get this from the where this time) giving 1 row. Now
match that to every message for this account_id and return at least one
row even if there are no messages for this account (note again the LEFT
JOIN) (which uses the index scan because it expects the index
selectivity to be a approximately 1/4 of the full table [it's wrong]).
Now count how many messages I have which returns 51419.

The third query is saying give me all of the messages for the accounts
where my email = 's******@neosyn apse.net' and I don't care where you
start from. The optimizer, after going through consideration of
various possible plans, is then smart enough to realize the email =
'blah' is indexed and it's selectivity is 1 row which means that we now
return to the situation in query 2 with one small change if there are no
messages for the account in question you would get no row returned,
leading to a more efficient aggregation step.

Steven D.Arnold wrote:

On Dec 21, 2003, at 11:47 PM, Tom Lane wrote:
"Steven D.Arnold" <st*****@neosyn apse.net> writes:
Query (2) below is the same query, but we reverse the order of the
tables. It's obviously not quite the same query semantically, even
though in my case it should always produce the same result. You are correct the queries produce the same results, but they are
telling the planner to do completely different things. The query
doesn't show it bu if the behavior you are desiring happened in postgres
(unless show the relational algebra that makes it work), I would have to
start looking for a new database (that's a disturbing thought).

Since it is in fact not the same query, I'm unclear on why you expect
it to produce the same plan.

What I expect is for both queries to use the index on the messages
table! Why is it not doing that?


Because of the table ordering and the left join in 7.3.x
Because of the left join in 7.4
FWIW, I believe that 7.4 will recognize that (1) and (3) are
semantically equivalent.

I will try 7.4 and report back.


I don't believe the optimiser (in any database that cares about giving
you the correct results) can determine that a non-constrained primary
table in a left join can be rewritten as either of your other two
queries (but there are smarter people than me working on Postgres, so I
could be wrong).

steve


My suggestion would be to place the more selective table first in a
JOIN, and get rid of the LEFT JOIN's unless that's exactly what you
want. For more information about the different JOIN methods RTFM.

I would also suggest that you might want to tune your random page cost
toward 1, because obviously random access is being over estimated for
your hardware. (You might just want to look at tuning your parameters
in general.)

And in the future you should run a query at least one extra time to note
the different caching makes (the second run for an explain analyze is
usually quite different than the first for tables of this size).

DeJuan
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #5
DeJuan Jackson <dj******@speed fc.com> writes:
Query 1:
SELECT COUNT(message_i d)
FROM messages m
LEFT JOIN accounts a
ON m.account_id::b igint = a.account_id::b igint
WHERE a.email = 's******@neosyn apse.net'; Query 2:
SELECT COUNT(message_i d)
FROM accounts a
LEFT JOIN messages m
ON a.account_id::b igint = m.account_id::b igint
WHERE a.email = 's******@neosyn apse.net'; Query 3:
SELECT COUNT(message_i d)
FROM messages m, accounts a
WHERE m.account_id::b igint = a.account_id::b igint
AND a.email = 's******@neosyn apse.net'; From what I can see they are not the same query and therefore shouldn't
use the same plan.


Actually, queries 1 and 3 are equivalent, and I believe PG 7.4 will
recognize them as such. The reason is that the WHERE clause "a.email =
'something'" cannot succeed when a.email is NULL; therefore, there is no
point in the JOIN being a LEFT JOIN --- any null-extended rows added by
the left join will be thrown away again by the WHERE clause. We may as
well reduce the LEFT JOIN to a plain inner JOIN, whereupon query 1 is
obviously the same as query 3. PG 7.4's optimizer can make exactly this
sequence of deductions. The bit of knowledge it needs for this is that
the '=' operator involved is STRICT, ie, yields NULL for NULL input.
All the standard '=' operators are strict and are so marked in the
catalogs. (If you are defining a user-defined type, don't forget to
mark your operators strict where applicable.)

I believe that query 2 is really equivalent to the others as well, but
proving it is more subtle. The reason is that COUNT(message_i d) does
not count rows where message_id is NULL, and so any null-extended rows
added by the LEFT JOIN won't be counted, and so we might as well reduce
the LEFT JOIN to a plain inner JOIN. PG's optimizer will not recognize
this, however. Possibly it could if anyone wanted to figure out how.
Right now we make very few assumptions about the behavior of aggregate
functions, but I think you could prove that this is safe based on the
behavior of nodeAgg.c for strict transition functions. Next question
is whether the case would come up often enough to be worth testing
for ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 12 '05 #6
Thanks to all for the detailed replies. I just wanted to let everyone
know -- for future google searches as much as anything else -- that
dumping the database, upgrading to 7.4.1 and reloading did solve the
problem. All the queries I mentioned now use the available indices,
except for understandable cases such as the number of rows in a table
being really small.

Thanks for the tip and thanks for the improvements in 7.4.1 that fixed
this problem.

steve
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 12 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1606
by: Dave | last post by:
Hi all, I have a problem with a query. (well, I actually have a few problems...) Here is my query. select FOCUS.SiteName, FOCUS.URL, OTWAdCars.* , REGION.SiteName as RegionSite, REGION.URL as RegionUrl from OTWSite as FOCUS right join OTWSite as REGION
9
5489
by: (Pete Cresswell) | last post by:
I've got some SQL that works as far as returning a recordset from a series of UNION statements. viz: SELECT whatever UNION this UNION that UNION other
3
1734
by: Myron | last post by:
I'm trying to create a query that will tell me which requests took longer than 10 days to move one from particular state to another state. The query I've created returns the correct requests, but not always the correct 'NextActionDate'/'NextStatus'/'NextState'. I'm sure I'm missing something easy, but I can't figure out what it might be. Any help is appreciated! Thanks, Myron -- remove SPAM-KILL from address to reply by email --
2
3546
by: Pete | last post by:
Before I get started with the question, does anyone have a (single) good book recommendation for database design? Not an Access-specific book, but something geared toward helping me figure out *what the user wants*. I've had brief formal education about data flow diagramming, but I'm looking for ... more, now that I'm actually running into problems I think stem from the fact that my users can't explain what they need done, compounded by...
2
15215
by: Robert | last post by:
when using the following function to create a pass through query is there a way to set the query property, "Returns Rows" to no. The default is yes. Since we are planning to create the pass through with new parameters in the where clause we need to set it each time. Thanx in advnance. Function CreateSPT(SPTQueryName As String, strSQL As String) Dim cat As ADOX.Catalog Dim cmd As ADODB.Command
2
720
by: Zygo Blaxell | last post by:
I have a table with a few million rows of temperature data keyed by timestamp. I want to group these rows by timestamp intervals (e.g. every 32 seconds), compute aggregate functions on the columns, and ultimately feed the result into a graph-drawing web thingy. I'm trying a few different ways to get what seems to be the same data, and seeing some odd behavior from the query planner. The table looks like this:
18
2357
by: simonmarkjones | last post by:
Hi all, I create a report to act as a receipt to customers. The report displays all the customer payment details and then i print this. This works fine. However, i now want to add some more information to this page. But i cant seem to do it. The report uses a query to pull in the data from the form.I have a drop down on my form called costcode1
1
2181
by: keri | last post by:
I would like to have a combo box on a form that shows the results of a query, however the query is variable and i am unsure how to do this. I have NO knowledge of code so very basc instructions would be appreciated! I have a table "appointments" which contains an "appointment date" field. I have a form "planning" (based on another table) with a field for "date to plan". I would like a combo box to show any "appointments" already made...
1
413
by: desai.rohit27 | last post by:
Hi. i have one table which contains employee name, date of joining, payment date and salary for the month. the employee name and date of joining together form the primary key. i need to create a query which will filter each employee and perform the necessary calculations on the salaries of that employee. please note that this has to be done automatically for all employees. can this be done?
0
8337
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
8748
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
7359
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6181
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5650
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4335
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2754
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
1978
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
2
1739
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.