Avoiding Table Scan on Update with correlated subselect

Mike L. Bell

Query:

update table1 t1
set end_time = (
    select end_time
    from table2 t2
    where t2.key1 = t1.key1
    and t2.key2 = t1.key2
)
where exists
(
    select 1
    from table2 t2
    where t2.key1 = t1.key1
    and t2.key2 = t2.key2
)

T1 and T2 share the same primary key. T2 is a volatile table that has
zero or hundreds of rows in it. T1 is a large table with millions of
rows.

The DB2 explain facility shows a table scan on the large table. I need
this to be fast to avoid concurrency/locking issues. I'm using the full
primary key in the first correlated subselect, and in the "exists"
clause, and am confused why the optimizer would choose to do a table
scan.

Is there a better way to write this to avoid table scans?

Platform is 7.2 Fixpack 11.

Thanks,
Mike

Nov 12 '05 #1

Philip Sherman

Greatest concurrency occurs with minimal locking. Locking is minimized
when only a single row at a time is locked. Since you are updating only
"a few hundred rows" and T2 is "a volatile table" I'd assume that you
will be deleting the contents of T2 when finished and that T2 can't be
updated during your processing. You don't state where you are interested
in minimizing locking - T1 or T2.

A stored procedure with the following logic will do this.

1. Construct a parameterized statement to update T1.
2. Construct a cursor to read T2 - WITH HOLD clause will be needed.
3. Lock T2.
4. For each row read from T2:
   a. Update T1.
   b. Commit the update to T1.
5. Delete rows from T2.

The updates will be done using the primary key which should lock a
single row of T1. Commits will release the locks immediately after the
update, minimizing T1's locked time. The lock on T2 will prevent updates
during your use of the table. Taking a commit after each update of T1 is
not the best performer but does maximize concurrency on T1. A compromise
of locking and performance is to maintain a rows updated counter and
commit after every n updates.
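
The logic above can be sketched in runnable form. This is only an illustration under stated assumptions, not a DB2 stored procedure: it uses Python's sqlite3 module as a stand-in, the function name apply_pending_updates and the batch_size parameter are hypothetical, and SQLite has neither LOCK TABLE nor WITH HOLD cursors, so the rows of T2 are simply read into memory before the loop starts.

```python
import sqlite3


def apply_pending_updates(conn, batch_size=1):
    """Apply table2's end_time values to table1 one primary-key row at a time.

    Returns the number of table1 rows updated.
    """
    # table2 holds at most a few hundred rows, so read them up front.
    # Assumption: this stands in for the WITH HOLD cursor, which would stay
    # open across commits in DB2.
    pending = conn.execute(
        "SELECT key1, key2, end_time FROM table2"
    ).fetchall()

    updated = 0
    since_commit = 0
    for key1, key2, end_time in pending:
        # Full primary key in the WHERE clause, so one row is targeted.
        cur = conn.execute(
            "UPDATE table1 SET end_time = ? WHERE key1 = ? AND key2 = ?",
            (end_time, key1, key2),
        )
        updated += cur.rowcount
        since_commit += 1
        if since_commit >= batch_size:
            conn.commit()  # end the transaction so table1 locks are released
            since_commit = 0

    # Last step: clear the processed rows and commit whatever is pending.
    conn.execute("DELETE FROM table2")
    conn.commit()
    return updated


def demo():
    """Build two tiny tables in memory and run the loop once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (key1 INTEGER, key2 INTEGER, end_time TEXT,
                             PRIMARY KEY (key1, key2));
        CREATE TABLE table2 (key1 INTEGER, key2 INTEGER, end_time TEXT,
                             PRIMARY KEY (key1, key2));
        INSERT INTO table1 VALUES (1, 1, 'old'), (1, 2, 'old'), (2, 1, 'old');
        INSERT INTO table2 VALUES (1, 1, 'new'), (2, 1, 'new');
        """
    )
    count = apply_pending_updates(conn, batch_size=50)
    rows = conn.execute(
        "SELECT key1, key2, end_time FROM table1 ORDER BY key1, key2"
    ).fetchall()
    left = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
    return count, rows, left
```

With batch_size=1 the sketch commits after every update, which is the maximum-concurrency case; a larger value gives the "commit after every n updates" compromise. Note that the final DELETE removes every row in T2, which is only safe if nothing else can insert into T2 while the loop runs; in DB2 that is the job of the table lock in step 3.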

Phil Sherman

Nov 12 '05 #2
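
A note on the query as posted: the "exists" subselect compares t2.key2 with itself (t2.key2 = t2.key2) rather than with t1.key2. This may only be a typo in the post, and nothing here shows that it caused the table scan. If the statement really ran that way, though, the "exists" test is correlated on key1 alone, so it also selects T1 rows that have no full-key match in T2. The sketch below illustrates the effect. It uses Python's sqlite3 module as a stand-in for DB2, the helper name run_update is hypothetical, and the target table is referenced by name instead of through the t1 alias.

```python
import sqlite3

UPDATE_TEMPLATE = """
UPDATE table1
SET end_time = (
    SELECT t2.end_time
    FROM table2 t2
    WHERE t2.key1 = table1.key1
    AND t2.key2 = table1.key2
)
WHERE EXISTS (
    SELECT 1
    FROM table2 t2
    WHERE t2.key1 = table1.key1
    AND t2.key2 = {key2_rhs}
)
"""


def run_update(key2_rhs):
    """Run the update with the given right-hand side for the key2 test.

    Returns the resulting contents of table1, ordered by key.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (key1 INTEGER, key2 INTEGER, end_time TEXT,
                             PRIMARY KEY (key1, key2));
        CREATE TABLE table2 (key1 INTEGER, key2 INTEGER, end_time TEXT,
                             PRIMARY KEY (key1, key2));
        INSERT INTO table1 VALUES (1, 1, 'old'), (1, 2, 'old'), (2, 1, 'old');
        INSERT INTO table2 VALUES (1, 1, 'new');
        """
    )
    conn.execute(UPDATE_TEMPLATE.format(key2_rhs=key2_rhs))
    conn.commit()
    return conn.execute(
        "SELECT key1, key2, end_time FROM table1 ORDER BY key1, key2"
    ).fetchall()


# As posted: key2 is compared with itself inside the EXISTS subselect.
as_posted = run_update("t2.key2")

# Corrected: key2 is correlated to the outer table, like key1.
corrected = run_update("table1.key2")
```

In this sketch the as-posted form also touches row (1, 2): it passes the "exists" test on key1 alone, the SET subselect finds no matching row, and end_time becomes NULL. The corrected form changes only row (1, 1). Whether DB2 7.2 picks an index for the corrected statement would still need to be confirmed with explain.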
