473,238 Members | 1,441 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,238 software developers and data experts.

Any way to Convince postgres to push join clause inside subquery aggregate?


I find I often want to be able to do joins against views where the view are
aggregates on a column that has an index. Ie, something like

SELECT a.*, v.n
FROM a
JOIN (select a_id,count(*) as n group by a_id) as v USING (a_id)

where there's an index on b.a_id. assume there are other tables being joined
so I can't just move the aggregate to the outermost layer.

These queries often come about because I often have a fully normalized
structure with many-to-many relationships but sometimes only want to display
an aggregated view of the data. For example, a list of products with the
number in stock, number on order, number sold recently, etc.

It would be really great if postgres could notice that the index is still
useful and push the index lookup inside the aggregate. Working around this is
really awkward and makes it impossible to use real views to hide the details
from the queries. Usually working around it actually means bulding
denormalized tables to act as "materialized views" and building indexes on
those tables.

The frustrating part is that it seems like postgres is almost capable of doing
it. In a simple query it is fully capable of pushing a where clause inside the
aggregate.

But in a join query it doesn't. (I put the "limit 1" on the second table to
ensure it was the best table to use to drive the join.)

Is there some key detail I'm missing that would allow it to push the join
clause inside the aggregate and use the index? If so I can simplify a lot of
design by creating views for these aggregate views instead of creating
denormalized tables and hacking complex queries everywhere.

slo=> explain select * from (select foo_id,count(*) as n from foo_bar group by foo_id) as x where foo_id = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Subquery Scan x (cost=0.00..25.24 rows=1 width=12)
-> GroupAggregate (cost=0.00..25.23 rows=1 width=4)
-> Index Scan using idx_foo_bar_foo on foo_bar (cost=0.00..25.08 rows=30 width=4)
Index Cond: (foo_id = 1)
(4 rows)

Time: 226.15 ms

slo=> explain
select *
from (select foo_id,count(*) as n from foo_bar group by foo_id) as x
join (select * from foo limit 1) as foo using (foo_id)
;

QUERY PLAN
---------------------------------------------------------------------------------------
Hash Join (cost=108.46..111.28 rows=1 width=227)
Hash Cond: ("outer".foo_id = "inner".foo_id)
-> Subquery Scan x (cost=108.35..110.36 rows=161 width=12)
-> HashAggregate (cost=108.35..108.75 rows=161 width=4)
-> Seq Scan on foo_bar (cost=0.00..81.23 rows=5423 width=4)
-> Hash (cost=0.11..0.11 rows=1 width=219)
-> Subquery Scan foo (cost=0.00..0.11 rows=1 width=219)
-> Limit (cost=0.00..0.10 rows=1 width=625)
-> Seq Scan on foo (cost=0.00..17.82 rows=182 width=625)
(9 rows)
--
greg
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #1
2 3637
Greg Stark <gs*****@mit.edu> writes:
slo=> explain
select *
from (select foo_id,count(*) as n from foo_bar group by foo_id) as x
join (select * from foo limit 1) as foo using (foo_id)
;


Why not put the subselect in the output list, if that's the kind of plan
you want?

regression=# explain select foo.*, (select count(*) from foo_bar where foo_id = foo.foo_id) as n from foo;
QUERY PLAN
------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..17102.50 rows=1000 width=8)
SubPlan
-> Aggregate (cost=17.08..17.08 rows=1 width=0)
-> Index Scan using foobi on foo_bar (cost=0.00..17.07 rows=5 width=0)
Index Cond: (foo_id = $0)
(5 rows)
regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #2
Tom Lane <tg*@sss.pgh.pa.us> writes:
Greg Stark <gs*****@mit.edu> writes:
slo=> explain
select *
from (select foo_id,count(*) as n from foo_bar group by foo_id) as x
join (select * from foo limit 1) as foo using (foo_id)
;


Why not put the subselect in the output list, if that's the kind of plan
you want?


Actually that's the way the query is right now. The problem arises because I
want to add a second column without duplicating the whole thing.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Dave | last post by:
I have 2 tables, one with names, and another with addresses, joined by their CIVICID number (unique to the ADDRESSINFO table) in Oracle. I need to update a field in the NAMEINFO table for a...
4
by: Anthony Robinson | last post by:
I was actually just wondering if someone could possibly take a look and tell me what I may be doing wrong in this query? I keep getting ambiguous column errors and have no idea why...? Thanks in...
23
by: Brian | last post by:
Hello All - I am wondering if anyone has any thoughts on which is better from a performance perspective: a nested Select statement or an Inner Join. For example, I could do either of the...
7
by: deko | last post by:
SELECT tblTxAcct.TxAcctName, tblTxType.TxTypeName, Nz(qryTxAcctListCt.TxCount, 0) AS TxCt FROM (tblTxAcct INNER JOIN tblTxType ON tblTxAcct.TxType_ID=tblTxType.TxType_ID) LEFT JOIN qryTxAcctListCt...
6
by: jao | last post by:
My company has a product in beta which uses Postgres 7.4.3. We expect to have a code freeze for our 1.0 product in March 2005. I'd really like to use Postgres 8.x in our 1.0 product. We're...
26
by: Jeff | last post by:
Ok gang. Here is something complicated, well, at least to me anyway. Using Access DB I have a table in my DB called members. In that table, I have 2 tables I will be using "username" and...
10
by: Hank | last post by:
We have just recently migrated the data from our Access 2000 backend to Postgres. All forms and reports seem to run correctly but, in many cases, very slowly. We do not want to switch over until...
9
by: shanevanle | last post by:
I have two tables that are pretty big. I need about 10 rows in the left table and the right table is filtered to 5 rows as well. It seems when I join the tables in the FROM clause, I have to...
2
theGeek
by: theGeek | last post by:
I always wonder which one of join or subquery should I be using to solve a particuar problem so that I get the correct result set in minimum time. It usually happens that I can write a query quickly...
3
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 3 Jan 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). For other local times, please check World Time Buddy In...
0
by: abbasky | last post by:
### Vandf component communication method one: data sharing ​ Vandf components can achieve data exchange through data sharing, state sharing, events, and other methods. Vandf's data exchange method...
0
by: stefan129 | last post by:
Hey forum members, I'm exploring options for SSL certificates for multiple domains. Has anyone had experience with multi-domain SSL certificates? Any recommendations on reliable providers or specific...
0
Git
by: egorbl4 | last post by:
Скачал я git, хотел начать настройку, а там вылезло вот это Что это? Что мне с этим делать? ...
1
by: davi5007 | last post by:
Hi, Basically, I am trying to automate a field named TraceabilityNo into a web page from an access form. I've got the serial held in the variable strSearchString. How can I get this into the...
0
by: MeoLessi9 | last post by:
I have VirtualBox installed on Windows 11 and now I would like to install Kali on a virtual machine. However, on the official website, I see two options: "Installer images" and "Virtual machines"....
0
by: DolphinDB | last post by:
The formulas of 101 quantitative trading alphas used by WorldQuant were presented in the paper 101 Formulaic Alphas. However, some formulas are complex, leading to challenges in calculation. Take...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.