
Performance problems when inserting into a large table

Hi all,

First, apologies if this question looks the same as another one I recently
posted. It's a different thing but for the same scenario :-).

We are having performance problems when inserting/deleting rows from a large
table.
My scenario:

Table (let's call it FACT1) with 1000 million rows distributed over 12
partitions (3 physical hosts with 4 logical partitions each).
Overall size of the table is 350 GB. Each night 1.5 million new rows are
added and approximately the same number of old records are deleted
(roll in/roll out with SQL INSERT/DELETE).
The table is stored in an SMS tablespace with 16K page size and an extent
size of 64 pages.
The tablespace has 6 containers on each partition. Each container is on a
separate IBM ESS array.
Prefetch size is 384 (6 containers * 64 pages). Prefetch behaves very well
with these settings (DB2_PARALLEL_IO is set).
DB2 is V8.1 ESE (DPF) FP5 and runs on AIX.

It takes 7 hours to insert 1.5 million rows into FACT1 and up to 7 hours to
delete the same number.
The insert is done via INSERT INTO FACT1 ... SELECT * FROM STAGING_TABLE.
Both the fact and the staging table are in tablespaces in the same nodegroup
and have the same partitioning key.

On a similar table (let's call it FACT2) with a comparable amount of
data/rows and a nearly identical configuration, the same process takes only
5 minutes.
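
For scale, here is a quick back-of-the-envelope comparison of the two runtimes. It is only a sketch: it takes "7 hours" and "5 minutes" at face value and assumes the rows are spread evenly over the 12 partitions, which the post does not state.

```python
# Rough insert throughput for FACT1 vs FACT2, using the figures from the post.
# Assumption (not stated in the post): rows are evenly spread over 12 partitions.
ROWS = 1_500_000
PARTITIONS = 12


def rows_per_second(rows, seconds):
    """Average rows inserted per second over the whole run."""
    return rows / seconds


fact1 = rows_per_second(ROWS, 7 * 3600)  # 7 hours
fact2 = rows_per_second(ROWS, 5 * 60)    # 5 minutes

print(f"FACT1: {fact1:.1f} rows/s overall, {fact1 / PARTITIONS:.1f} rows/s per partition")
print(f"FACT2: {fact2:.1f} rows/s overall, {fact2 / PARTITIONS:.1f} rows/s per partition")
print(f"FACT1 is about {fact2 / fact1:.0f}x slower")
```

Under those assumptions FACT1 manages roughly 60 rows per second in total, against 5000 rows per second for FACT2.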

The main difference between these two tables is that FACT1 has 7 indexes
defined on it and FACT2 only 4.
In each case one of the indexes is unique and the others are not (all are
type 2).
There is no clustering index and the APPEND attribute is set to ON.
I'm aware of the pseudo-delete mechanism of type-2 indexes and the
correspondingly longer search time for inserts in the index leaf pages.
But taking an exclusive lock on the table before inserting/deleting does not
change the runtime.
(And the docs say that with an X lock on the table pseudo-deletes will not
happen.)
Also, after a reorg of the table and indexes the insert runtime is the same
as before.

Is it possible that the additional index maintenance for FACT1 leads to such
a longer runtime?
What exactly happens internally for index maintenance (I searched the docs
but did not find any internals)?
Has anyone seen similar behaviour?

I can post additional information if required (table and index definitions,
statistics ...) but wanted to keep the posting small in the first place.

TIA for any comments
Joachim

PS: Feel free to send comments by email to joklassen at web dot de
PPS: In parallel we are investigating MDC tables, using smaller tables
(and combining them with a UNION ALL view), and the use of LOAD FROM CURSOR
instead of INSERT.
Nov 12 '05 #1
3 Replies


Joachim Klassen wrote:
> Is it possible that the additional index maintenance for FACT1 leads to such
> a longer runtime?
> What exactly happens internally for index maintenance (searched the docs -
> but did not find internals)?

I'm not privy to index maintenance internals, but could it be that the 7
indexes cause a spill of some heap? Maybe the sort heap? Have you checked
the snapshots?
Have you verified that the plans are good? You shouldn't see any TQs
(table queues).
Also, are you sure you don't have any other complicating factors (SQL
functions, triggers, check or RI constraints)? The plans will show.

> PPS: We are parallel investigating in MDC tables, using smaller tables (and
> combining them with a UNION ALL view) and the use of LOAD FROM CURSOR
> instead of INSERT

Be careful with LOAD FROM CURSOR: the cursor is a bottleneck. To do
that in a scalable fashion you would fire up concurrent LOADs on each
node, filtering the source by DBPARTITION.
You shouldn't need UNION ALL.

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #2

Serge,
again thanks for your quick reply :-)

I will try to get snapshot information in the next few days. (The problem is
that "get snapshot for all" runs for 1 hour on production and once crashed
the instance :-) That problem is fixed in FP7, which will be applied in the
near future.)

> Have you verified that the plans are good? You shouldn't see any TQs.
> Also are you sure you don't have any other complicating factors (SQL
> Functions, Triggers, check or RI constraints) (The plans will show).

The plan looks good (to me). Maybe you can comment on it:

Section Code Page = 819

Estimated Cost = 31926.718750
Estimated Cardinality = 75608.000000

Coordinator Subsection - Main Processing:
   (-----) Distribute Subsection #1
           |  Broadcast to Node List
           |  |  Nodes = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
           |  |          11, 12

Subsection #1:
   (    3) Access Table Name = DTMP1T.STAGING  ID = 411,121
           |  #Columns = 24
           |  Volatile Cardinality
           |  Relation Scan
           |  |  Prefetch: Eligible
           |  Lock Intents
           |  |  Table: Intent Share
           |  |  Row  : Next Key Share
   (    2) Insert:  Table Name = DPERMT.FACT1  ID = 1714,2

End of section

Optimizer Plan:

        INSERT
        (   2)
       /----/ \
  TBSCAN     Table:
  (   3)     DPERMT
    |        F7KB_F_A_T_Q_B_K
  Table:
  DTMP1T
  F7KB_F_A_T_Q_B_K

> Be careful with LOAD FROM CURSOR, the cursor is a bottle neck. To do that
> in a scalable fashion you would fire up concurrent LOADs on each node
> filtering the source by DBPARTITION.

Does that mean

DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2

and so on?

Thanks
Joachim
Nov 12 '05 #3

Joachim Klassen wrote:
> Optimizer Plan:
>
>         INSERT
>         (   2)
>        /----/ \
>   TBSCAN     Table:
>   (   3)     DPERMT
>     |        F7KB_F_A_T_Q_B_K
>   Table:
>   DTMP1T
>   F7KB_F_A_T_Q_B_K

Doesn't get easier than that...

>> Be careful with LOAD FROM CURSOR, the cursor is a bottle neck. To do that
>> in a scalable fashion you would fire up concurrent LOADs on each node
>> filtering the source by DBPARTITION.
>
> Does that mean

Connect to node 1:
> DECLARE C1 CURSOR for select * from stage where dbpartitionnum(column) = 1
> LOAD FROM C1 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 1
Connect to node 2:
> DECLARE C2 CURSOR for select * from stage where dbpartitionnum(column) = 2
> LOAD FROM C2 OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS 2
Connect to node "and so on":
> and so on


Basically you are your own splitter.
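
A minimal sketch of that "be your own splitter" idea: a small Python script that only prints the per-partition command pairs quoted above, one pair for each of partitions 1 to 12. The table names (stage, FACT1), the `column` placeholder and the elided `...` options are copied from the thread as-is; the function name `splitter_commands` is made up for illustration, and the script does not connect to DB2 or execute anything.

```python
# Prints one DECLARE CURSOR / LOAD pair per database partition.
# The command text mirrors the thread: "..." is left elided on purpose, and
# "column" stands for the partitioning key column (not named in the thread).
def splitter_commands(partitions):
    """Return the list of command strings, two per partition number."""
    commands = []
    for p in partitions:
        commands.append(
            f"DECLARE C{p} CURSOR for select * from stage "
            f"where dbpartitionnum(column) = {p}"
        )
        commands.append(
            f"LOAD FROM C{p} OF CURSOR INSERT INTO FACT1 ... OUTPUT_DBPARTNUMS {p}"
        )
    return commands


if __name__ == "__main__":
    # Partitions 1..12, matching the node list in the explain output.
    for line in splitter_commands(range(1, 13)):
        print(line)
```

Each pair would then be run in its own session connected to the matching node, so the loads can proceed concurrently.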

This, btw is a great way to do batch processing with procedures.

Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #4
