I have a non-performing query and am in need of tip or two to melt the
brain freeze I'm currently experiencing to get around it.
I have a large parent table, with an even larger child table (approx 5
rows per parent item). I want to join parent data with only one row
(the most current) from child table. The overall predicate for the
join is highly selective, but is very verbose and involves lots of
tables. I tried materializing the child's data several ways, but
always end up with a big table scan.
In its simplest form
TABLE PARENT
parent_id int (PK),
someinfo varchar(100)
TABLE CHILD
parent_id int (PK),
ts timestamp (PK)
otherinfo varchar(100)
select p.parent_id, p.someinfo, c.ts, c.otherinfo
from parent p
join (
select c.parent_id, max(c.ts) as max_ts
from child c
group by c.parent_id
) as max_child on max_child.parent_id = p.parent_id
join child c on max_child.parent_id = c.parent_id and max_child.max_ts
= c.ts
join <A BUNCH OF OTHER TABLES)
WHERE <LARGE PREDICATE>
The optimizer chooses a table scan for the inline table expression.
I've tried altering this same expression to use the rownumber() OLAP
function instead of the max() with similar slow results. I thought I
had seen, in past behavior, the optimizer's ability to push the
predicate down into the table expression. The only way I have found to
make this happen is to manually push the joins and predicates into the
table expression wherein the performance is restored.
So technically my problem is solved, but it makes for highly
unreadable verbose SQL. Isn't there another way represent that
functionality without the verbosity?
Thanks,
Evan
AIX 5.3 + DB2 8.2 FP16