Raj wrote:
We have a batch process that deletes about 7-8 million records every
day
& takes about 30-35 mins to complete. Is there a way to optimize
deletes? Below is the statement
delete from fact where (key,dt) in (select key,dt from dim1 union
select key,dt from dim2)
Fact table has composite index on key,someothecolumn,dt.
From the access plan, i see that the index is being used.
Can we implement parallel deletes or tune any other db paramaeter ? We
are on db2 udb v8.2 DPF on 10 logical nodes.
Thanks
Raj
First i'd changes the UNION to a UNION ALL. Probably don't want to run
a DISTINCT before it does the DELETE. Or, just run it in two
statements.
Second, i'd try EXISTS over IN(). Which such a large list IIUC, EXISTS
will utilize the INDEX:
DELETE FROM Fact WHERE EXISTS
(SELECT * FROM FROM Dim1 Dim WHERE Dim.Key = Fact.Key AND Dim.Dt =
Fact.Dt)
DELETE FROM Fact WHERE EXISTS
(SELECT * FROM FROM Dim2 Dim WHERE Dim.Key = Fact.Key AND Dim.Dt =
Fact.Dt)
Another idea is that DELETEing in batches is usually faster. Simplest
way would be to create another table and store all records to be
DELETEd there. Then, DELETE perhaps 10k rows at a time.
CREATE TABLE Fact_Delete(Key INT, Dt INT);
INSERT INTO Fact_Delete
(SELECT Key, Dt FROM Dim1 UNION SELECT Key, Dt FROM Dim2)
DELETE FROM Fact WHERE (Key,Dt) IN
(SELECT Key, Dt FROM Fact_Delete ORDER BY Key FETCH FIRST 10000 ROWS
ONLY)
DELETE FROM Fact_Delete ORDER BY Key FETCH FIRST 10000 ROWS ONLY
Just some ideas.
B.