Bytes IT Community

Simple SQL question

I want to delete some rows from a table (TAB1), but only those rows that
have a match in another table (TAB2). Something like this:

TAB1: col1, col2, ...
TAB2: col1, col2, ...
There are indexes on col1 in both tables.

The SQL would be something like this:

delete from tab1
where col1 in (select col1 from TAB2)

When I run this it takes 'forever' to finish. I have tried to rewrite it
but cannot find a version that runs fast.

There are 1.5 million rows in TAB1 and 50000 rows in TAB2. And, yes, I have
run runstats on both tables.

Anyone got a good solution to this?

Regards
Odd A
Aug 18 '06 #1
9 Replies


Odd Bjørn Andersen wrote:
> [original question snipped]
I don't know why the DELETE is misbehaving without seeing the plan.
But you can try this:
MERGE INTO TAB1 USING (SELECT DISTINCT c1 FROM TAB2) AS TAB2
ON TAB1.c1 = TAB2.c1
WHEN MATCHED THEN DELETE

How many rows do you expect to be deleted?
If c1 is unique it should be limited to 50000, which is nothing.
But if c1 is not unique, it depends on how you define "forever".

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 18 '06 #2


"Serge Rielau" <sr*****@ca.ibm.comwrote in message
news:4k************@individual.net...
> Odd Bjørn Andersen wrote:
>> [original question snipped]
> I don't know why the DELETE is misbehaving without seeing the plan.
> But you can try this:
> MERGE INTO TAB1 USING (SELECT DISTINCT c1 FROM TAB2) AS TAB2
> ON TAB1.c1 = TAB2.c1
> WHEN MATCHED THEN DELETE
>
> How many rows do you expect to be deleted?
> If c1 is unique it should be limited to 50000, which is nothing.
> But if c1 is not unique, it depends on how you define "forever".
c1 is unique in TAB1 (primary key), but not in TAB2. I expect a little fewer
than 50000 rows to be deleted - I don't have the exact number, but it will be
approximately 45000, which, as you say, is nothing. I tried this today
and it ran for 4 hours before I cancelled the SQL.

I will try your suggestion and see what happens.

I am trying this in a test environment now, but when the time comes to run it
in production there will be almost 1 million rows to delete. So I must get
better performance first.

Regards
Odd A




Aug 18 '06 #3

Odd Bjørn Andersen wrote:
"Serge Rielau" <sr*****@ca.ibm.comwrote in message
news:4k************@individual.net...
>Odd Bjørn Andersen wrote:
>>I want to delete some rows from a table (TAB1), but only those rows which
has a match in antother table (TAB2). Something like this:

TAB1: col1, col2, .....
TAB2, col1, col2,......
There are indexes on col1 in both the tables.

The sql would be something like this

delete from tab1
where col1 in (select col1 from TAB2)

when I run this it takes 'for ever' to finish. I have tried to rewrite
this but cannot find a solution that runs fast.

There are 1.5 million rows in TAB1 and 50000 rows in TAB2. And , yes, i
have run runstats on both tables.

Anyone got a good solution to this?
I don't know why the DELETE is misbehaving without seeing the plan.
But you can try this:
MERGE INTO TAB1 USING (SELECT DISTINCT c1 FROM TAB2) AS TAB1
ON TAB1.c1 = TAB2.c1
WHEN MATCHED DELETE

How many rows do you expect to be deleted?
If c1 is unique it should be limited to 50000, which is nothing.
But if c1 is not unique it depends on how you define "forever"

c1 is unique in TAB1 (primary key), but not in TAB2. I expect a little less
than 50000 should be deleted - don't have the exact number but it will be
approximately 45000, which as you say is nothing. I have tried this today
and it ran for 4 hours before I cancelled the sql......

I will try your suggestion and see what happens.

I try this now in a test environment, but when time comes to run this in
production there will be almost 1 million rows to delete. So I must get a
better performance first.
If the MERGE gives you any trouble, please post the db2exfmt output:

db2 "EXPLAIN PLAN FOR MERGE ...."

and then:

db2exfmt -d <db> -o merge.exfmt -1

Run sqllib/misc/EXPLAIN.DDL if EXPLAIN PLAN complains that the explain
tables are missing.
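For readers following along, the whole sequence above can be sketched as the commands below; the database name MYDB is a placeholder, not from the thread, so substitute your own:

```shell
# Create the explain tables if they do not exist yet
db2 connect to MYDB
db2 -tf sqllib/misc/EXPLAIN.DDL

# Capture the plan for the statement, then format the most recent explain
db2 "EXPLAIN PLAN FOR DELETE FROM TAB1 WHERE col1 IN (SELECT col1 FROM TAB2)"
db2exfmt -d MYDB -o delete.exfmt -1
```

The `-1` flag tells db2exfmt to use defaults for the remaining options and pick the latest explained statement.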

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 18 '06 #4

Odd Bjørn Andersen wrote:
> c1 is unique in TAB1 (primary key), but not in TAB2. I expect a little less
Besides what Serge suggested, another thing to check is whether the table
has any large child tables with a cascade-delete relationship
(especially without a supporting index), and/or delete triggers. These
don't show up in explain, but they do have tangible overhead.
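Those dependencies can be checked in the DB2 system catalog. A minimal sketch, assuming the table lives in schema MYSCHEMA (a placeholder; adjust to your own):

```sql
-- Foreign keys that reference TAB1, and their delete rule
-- (C = CASCADE, R = RESTRICT, A = NO ACTION, N = SET NULL)
SELECT tabname, constname, deleterule
FROM syscat.references
WHERE reftabschema = 'MYSCHEMA' AND reftabname = 'TAB1';

-- Delete triggers defined on TAB1
SELECT trigname, trigtime
FROM syscat.triggers
WHERE tabschema = 'MYSCHEMA' AND tabname = 'TAB1' AND trigevent = 'D';
```

A child table with DELETERULE = 'C' and no index on its foreign-key column is a classic cause of slow deletes on the parent.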

Regards

P Adhia
Aug 18 '06 #5

"Odd Bjørn Andersen" <ob****@online.nowrote in message
news:44***********************@news.sunsite.dk...
>I want to delete some rows from a table (TAB1), but only those rows which
has a match in antother table (TAB2). Something like this:

TAB1: col1, col2, .....
TAB2, col1, col2,......
There are indexes on col1 in both the tables.

The sql would be something like this

delete from tab1
where col1 in (select col1 from TAB2)

when I run this it takes 'for ever' to finish. I have tried to rewrite
this but cannot find a solution that runs fast.

There are 1.5 million rows in TAB1 and 50000 rows in TAB2. And , yes, i
have run runstats on both tables.

Anyone got a good solution to this?

Regards
Odd A
I am not 100% sure, but I think this may be what you want:

delete from tab1 A
where col1 in (select B.col1 from TAB2 B where B.col1 = A.col1)

If the above is not correct, this may be what you want:

delete from tab1 A
where col1 in (select distinct col1 from TAB2)

Aug 18 '06 #6

How about this way?

delete from tab1
where exists (select 1 from TAB2 where tab1.col1 = TAB2.col1);

And create an index on TAB2:
CREATE INDEX Tab2_col1_IX ON TAB2 (col1);
(Oh, you already wrote that you have an index on col1, so this may not be
necessary.)

Aug 19 '06 #7


"P Adhia" <pa****@spamnot.yahoo.comwrote in message
news:53***************************@ALLTEL.NET...
Odd Bjørn Andersen wrote:
>c1 is unique in TAB1 (primary key), but not in TAB2. I expect a little
less

Besides what Serge suggested, other things to check would be to see if the
table has any large child tables with cascade delete relationship
(specially without supporting index), and/or delete triggers. These don't
show up in explain, but do have tangible overhead.

Regards

P Adhia
Thanks for all the responses, but none have helped so far. There is something
strange going on here, and I don't think it's the SQL itself that causes
the problem.

It seems that the job stops after deleting some of the rows, and then
nothing seems to happen. The job does not continue or finish.

In TAB1 there are 1680554 rows, and I tried with 9465 rows (9295 unique
values in col1) in TAB2. When I tried the delete command like this:

delete from TAB1
where col1 in (select distinct col1 from TAB2)

the job deleted 308 rows and then STOPPED!

When I tried the merge command:

MERGE INTO TAB1 USING (SELECT DISTINCT col1 FROM TAB2) AS TAB2
ON TAB1.col1= TAB2.col1
WHEN MATCHED THEN DELETE

Then the job deleted 9244 rows and then stopped.

In both cases it took less than a second before the job stopped.

There is RI involved here, but I have deleted all the records in all the
tables underlying TAB1 (just checked!). So there should be no overhead
from cascade deletes.

I set diaglevel to 4 to see if there were any messages about this, but
the only thing I got was:

2006-08-21-10.43.03.440000+120 E18248H389 LEVEL: Info (OS)
PID : 3960 TID : 1052 PROC : db2cmnclp.exe
INSTANCE: DB2 NODE : 000
FUNCTION: DB2 UDB, oper system services, sqlodelete, probe:100
CALLED : OS, -, unspecified_system_function
OSERR : 2 "The system cannot find the file specified."
DATA #1 : File name, 6 bytes
\s3ro.

but I cannot see how this is relevant to my problem.

So now I am even more confused, if possible, than when I started out...

Odd B
Aug 21 '06 #8

Could it be DB2 is waiting on a lock?
Try db2pd and see what it reports for your application id.
(Figuring out db2pd shall be homework ;-)
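To save some of that homework: a minimal sketch of the relevant db2pd calls, assuming the database is named MYDB (a placeholder; use your own):

```shell
# Show current locks, their modes, and which application holds or waits on them
db2pd -db MYDB -locks

# Show in-flight transactions and their application handles
db2pd -db MYDB -transactions

# Map application handles back to application ids and client names
db2pd -db MYDB -applications
```

If the DELETE's application handle shows up in the -locks output with a wait status, the row that stops at "308 deleted" is almost certainly blocked behind another transaction's lock rather than doing work.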

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Aug 21 '06 #9

Odd Bjørn Andersen wrote:
> It seems that the job stops after deleting some of the rows, and then
> nothing seems to happen. The job does not continue or finish.
That is indeed strange. Could it be that DB2 is trying to allocate
secondary logs and that is taking too long for some reason? Try (not in
production, of course) setting the NOT LOGGED INITIALLY attribute on the
table and see if that helps.
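A minimal sketch of how that could look, assuming TAB1 was originally created with the NOT LOGGED INITIALLY option (DB2 only allows reactivating the attribute on such tables) and autocommit is off:

```sql
-- NOT LOGGED INITIALLY suppresses logging for the current unit of work only.
-- WARNING: if the unit of work fails, the table is marked unavailable and
-- must be restored from backup - hence "not in production" above.
ALTER TABLE TAB1 ACTIVATE NOT LOGGED INITIALLY;

DELETE FROM TAB1
WHERE col1 IN (SELECT col1 FROM TAB2);

COMMIT;  -- normal logging resumes once the unit of work ends
```

The ALTER and the DELETE must run in the same unit of work, since the attribute deactivates at COMMIT or ROLLBACK.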

P Adhia

Aug 22 '06 #10

This discussion thread is closed

Replies have been disabled for this discussion.