Slow deletes

Raj
We have a batch process that deletes about 7-8 million records every
day and takes about 30-35 minutes to complete. Is there a way to optimize
the deletes? Below is the statement:

delete from fact where (key,dt) in (select key,dt from dim1 union
select key,dt from dim2)

The fact table has a composite index on key, someothercolumn, dt.
From the access plan, I see that the index is being used.

Can we implement parallel deletes or tune any other db parameter? We
are on DB2 UDB v8.2 DPF with 10 logical nodes.
Thanks
Raj

Nov 30 '06 #1
First, I'd change the UNION to a UNION ALL. You probably don't want it
to run a DISTINCT before it does the DELETE. Or just run it as two
statements.
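
A sketch of the first suggestion applied to the original statement. IN only tests membership, so duplicates in the subquery result do not change which rows get deleted. Whether dropping the duplicate elimination actually helps depends on the plan, so it is worth comparing both versions in the explain output.

```sql
-- Same statement as the original post, with UNION replaced by UNION ALL
-- so the subquery result does not have to be de-duplicated first.
DELETE FROM fact
WHERE (key, dt) IN (SELECT key, dt FROM dim1
                    UNION ALL
                    SELECT key, dt FROM dim2);
```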

Second, I'd try EXISTS instead of IN(). With such a large list, IIUC,
EXISTS will utilize the index:

DELETE FROM Fact WHERE EXISTS
(SELECT * FROM Dim1 Dim WHERE Dim.Key = Fact.Key AND Dim.Dt = Fact.Dt);

DELETE FROM Fact WHERE EXISTS
(SELECT * FROM Dim2 Dim WHERE Dim.Key = Fact.Key AND Dim.Dt = Fact.Dt);

Another idea is that DELETEing in batches is usually faster. Simplest
way would be to create another table and store all records to be
DELETEd there. Then, DELETE perhaps 10k rows at a time.

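CREATE TABLE Fact_Delete(Key INT, Dt INT);
INSERT INTO Fact_Delete
(SELECT Key, Dt FROM Dim1 UNION SELECT Key, Dt FROM Dim2);

-- Repeat the next two statements, committing after each pair, until
-- Fact_Delete is empty. Ordering by both columns keeps the two
-- statements on the same 10000 rows.
DELETE FROM Fact WHERE (Key,Dt) IN
(SELECT Key, Dt FROM Fact_Delete ORDER BY Key, Dt FETCH FIRST 10000 ROWS
ONLY);
-- DELETE itself takes no FETCH FIRST clause in DB2 v8, so the batch is
-- removed through a fullselect (worth verifying on your fix pack).
DELETE FROM
(SELECT 1 FROM Fact_Delete ORDER BY Key, Dt FETCH FIRST 10000 ROWS ONLY);
COMMIT;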

Just some ideas.

B.

Nov 30 '06 #2
Raj
Thanks!! I'll try the frequent commit option.
I tried EXISTS, but the plan shows much higher timerons than with IN.

Will a parallel delete work, i.e. running the following statements in parallel?

delete from fact where (key,dt) in (select key,dt from dim1 union
select key,dt from dim2) and PARTITION(key)= 1

delete from fact where (key,dt) in (select key,dt from dim1 union
select key,dt from dim2) and PARTITION(key)= 2
...
dim1 and dim2 are small tables, so I will replicate them on all nodes.

Brian Tkatch wrote:

Nov 30 '06 #3
Raj wrote:
> Thanks!! I'll try the frequent commit option,
> I tried exists but plan has very high timerons vs in.

If you have the time, it still might be worth trying the EXISTS. The
timerons aren't always 100% accurate - at least, not in my experience.

Another possible option to speed up deletes is to use MDC
(multidimensional clustering) on the fact table. If you cluster on key
and dt and turn on MDC Rollout, then the deletes should go much, much
faster (and cause much less logging overhead). Of course, there are a
lot of other considerations to take into account before doing this,
such as disk space usage and the fact that you would have to completely
reload your table(s). Also, I'm basing this suggestion solely on the
literature about MDC - I haven't (yet) taken advantage of this myself,
so I can't give a definite thumbs up.
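
A rough sketch of what an MDC definition could look like. The table name fact_mdc and the INT column types are placeholders, not Raj's real schema. Keep in mind that every distinct (key, dt) combination gets at least one block of its own, which is where the disk space concern comes from when key has many distinct values.

```sql
-- Hypothetical MDC version of the fact table (names and types assumed).
-- ORGANIZE BY DIMENSIONS clusters the rows into blocks per (key, dt) cell.
CREATE TABLE fact_mdc (
    key             INT NOT NULL,
    someothercolumn INT NOT NULL,
    dt              INT NOT NULL
)
ORGANIZE BY DIMENSIONS (key, dt);
```

The existing data would then have to be reloaded into the new table, as noted above. How rollout is switched on is not part of the DDL, so check the documentation for your fix pack level.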

-Chris

Nov 30 '06 #4
Raj wrote:
> Thanks!! I'll try the frequent commit option,

Neat. That's what I did at one point when I had the same issue. :)

Note that if you need a TABLE, you don't actually have to CREATE a
TABLE, a GLOBAL TEMPORARY TABLE should work just as well.
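
For illustration, a minimal sketch of the temporary-table variant. It assumes a user temporary table space already exists (DB2 needs one for declared temporary tables), and the INT types simply follow the placeholder types used for Fact_Delete earlier in the thread.

```sql
-- Declared temporary tables live in the SESSION schema and disappear
-- when the connection ends. Column types are assumed.
DECLARE GLOBAL TEMPORARY TABLE fact_delete (key INT, dt INT)
    ON COMMIT PRESERVE ROWS
    NOT LOGGED;

-- The table must be referenced with the SESSION qualifier.
INSERT INTO session.fact_delete
    SELECT key, dt FROM dim1
    UNION
    SELECT key, dt FROM dim2;
```

ON COMMIT PRESERVE ROWS matters here: without it the staged rows would be gone after the first intermediate COMMIT of a batched delete.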
> I tried exists but plan has very high timerons vs in.
>
> Will a parallel delete work i.e. running the following in parallel;

Way out of my league. Never used the stuff.

B.

Nov 30 '06 #5
The DELETE should parallelize on its own.
Check the explain.
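
One way to capture the plan, assuming the explain tables have already been created (DB2 ships an EXPLAIN.DDL script for that). The statement is only explained, not executed.

```sql
-- Populates the explain tables with the access plan for the DELETE.
EXPLAIN PLAN FOR
DELETE FROM fact
WHERE (key, dt) IN (SELECT key, dt FROM dim1
                    UNION
                    SELECT key, dt FROM dim2);
```

The captured plan can then be formatted with db2exfmt. In a DPF plan, the thing to look at is where the DELETE operator sits relative to the table queue (TQ) operators, which should show whether the work runs on the partitions or at the coordinator.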

FETCH FIRST and parallelism don't go well together...
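
A possible way to batch without FETCH FIRST is to number the staged rows up front and delete one numbered batch at a time. This is only a sketch: the table fact_delete_batches and its batch_no column are made up for the example, and the INT types are assumed.

```sql
-- Hypothetical staging table with a precomputed batch number.
CREATE TABLE fact_delete_batches (key INT, dt INT, batch_no INT);

-- Number the distinct (key, dt) pairs and bucket them 10000 per batch.
INSERT INTO fact_delete_batches (key, dt, batch_no)
SELECT key, dt,
       (ROW_NUMBER() OVER (ORDER BY key, dt) - 1) / 10000
FROM (SELECT key, dt FROM dim1
      UNION
      SELECT key, dt FROM dim2) AS d;

-- Run once per batch_no value (0, 1, 2, ...), committing after each.
DELETE FROM fact
WHERE (key, dt) IN (SELECT key, dt
                    FROM fact_delete_batches
                    WHERE batch_no = 0);
COMMIT;
```

Each DELETE is then a plain predicate-driven statement, so the optimizer is free to run it on all partitions, and the commit interval still keeps the log usage per unit of work bounded.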

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

WAIUG Conference
http://www.iiug.org/waiug/present/Fo...Forum2006.html
Nov 30 '06 #6
If this were z/OS, we would use REORG with DISCARD, or UNLOAD/SORT with
INCLUDE followed by LOAD REPLACE, to get rid of the unwanted rows.

The thing to remember is that DELETE and INSERT are logged, and that
slows the process down. Imagine logging 8 million rows: even with
frequent commits (which slow it down further), you have still logged the
before-images of 8 million records. Utilities like LOAD and REORG avoid
logging and save heaps of resources. Also, after deleting so many rows,
you would need to REORG anyway.

HTH,
Ven

Dec 2 '06 #7
