Merge Query Help

Sam Durai

Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.

I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.

Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)

Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE
Reorgchk stats of big table is as below
Table statistics:
SCHEMA NAME CARD OV NP FP ACTBLK
TSIZE F1 F2 F3 REORG
----------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VIN_VEH_OPTNS 4.6e+09 0 3e+07 3e+07 - 4.96e
+11 0 97 97 ---
----------------------------------------------------------------------------------------

Index statistics:

SCHEMA NAME CARD LEAF ELEAF LVLS ISIZE NDEL
KEYS F4 F5 F6 F7 F8 REORG
-------------------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VVO_IX1 5e+09 7e+06 6916 4 10 13755
1000 100 80 0 0 0 -----
CARD VVO_IX2 5e+09 7e+06 287 4 12 8e+05 1e
+07 100 78 0 0 0 -----
SYSIBM SQL061028160336900 5e+09 6e+07 0 5 27 84 5e
+09 98 68 1 0 0 -----
-------------------------------------------------------------------------------------------------

Indexes on Big table:
CARD VVO_IX1
D 1
+DWH_EFCTV_TIMSTM
CARD VVO_IX2
D 2 +DWH_UPD_TIMSTM
+VEH_OPTN_CD_CSI
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Access Plan:
Optimizer Plan:

INSERT
( 2)
/ \
DTQ Table:
( 3) CARD
| VIN_VEH_OPTNS
TBSCAN
( 4)
|
TEMP
( 5)
|
FILTER
( 6)
|
FILTER
( 7)
|
TBSCAN
( 8)
|
SORT
( 9)
|
NLJOIN
( 10)
/ \
TBSCAN TBSCAN
( 11) ( 12)
| |
Table: TEMP
ETL ( 13)
STG_NEWS_VIN_VEH_OPTNS2 |
BTQ
( 14)
|
TBSCAN
( 15)
|
Table:
CARD
VIN_VEH_OPTNS
Please share your thoughts.

Feb 20 '07 #1

Subscribe Post Reply

3556

Serge Rielau

Sam Durai wrote:

Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.

I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.

Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)

Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE
Reorgchk stats of big table is as below
Table statistics:
SCHEMA NAME CARD OV NP FP ACTBLK
TSIZE F1 F2 F3 REORG
----------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VIN_VEH_OPTNS 4.6e+09 0 3e+07 3e+07 - 4.96e
+11 0 97 97 ---
----------------------------------------------------------------------------------------

Index statistics:

SCHEMA NAME CARD LEAF ELEAF LVLS ISIZE NDEL
KEYS F4 F5 F6 F7 F8 REORG
-------------------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VVO_IX1 5e+09 7e+06 6916 4 10 13755
1000 100 80 0 0 0 -----
CARD VVO_IX2 5e+09 7e+06 287 4 12 8e+05 1e
+07 100 78 0 0 0 -----
SYSIBM SQL061028160336900 5e+09 6e+07 0 5 27 84 5e
+09 98 68 1 0 0 -----
-------------------------------------------------------------------------------------------------

Indexes on Big table:
CARD VVO_IX1
D 1
+DWH_EFCTV_TIMSTM
CARD VVO_IX2
D 2 +DWH_UPD_TIMSTM
+VEH_OPTN_CD_CSI
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Access Plan:
Optimizer Plan:

INSERT
( 2)
/ \
DTQ Table:
( 3) CARD
| VIN_VEH_OPTNS
TBSCAN
( 4)
|
TEMP
( 5)
|
FILTER
( 6)
|
FILTER
( 7)
|
TBSCAN
( 8)
|
SORT
( 9)
|
NLJOIN
( 10)
/ \
TBSCAN TBSCAN
( 11) ( 12)
| |
Table: TEMP
ETL ( 13)
STG_NEWS_VIN_VEH_OPTNS2 |
BTQ
( 14)
|
TBSCAN
( 15)
|
Table:
CARD
VIN_VEH_OPTNS
Please share your thoughts.

Do/Can you have dups in the source?
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B

assuming you don't have a UNIQUE key on VEH_IDENT_NBR, OPTN_CD
I recommend SELECT VEH_IDENT_NBR, OPTN_CD, MAX(ERR_FLAG)
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y')
GROUP BY VEH_IDENT_NBR, OPTN_CD) AS B

Do you have statistics on the small table?

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 20 '07 #2

Sam Durai

On Feb 20, 12:44 pm, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.

I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.

Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)

Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE

Reorgchk stats of big table is as below
Table statistics:
SCHEMA NAME CARD OV NP FP ACTBLK
TSIZE F1 F2 F3 REORG
----------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VIN_VEH_OPTNS 4.6e+09 0 3e+07 3e+07 - 4.96e
+11 0 97 97 ---
----------------------------------------------------------------------------------------

Index statistics:

SCHEMA NAME CARD LEAF ELEAF LVLS ISIZE NDEL
KEYS F4 F5 F6 F7 F8 REORG
-------------------------------------------------------------------------------------------------
Table: CARD.VIN_VEH_OPTNS
CARD VVO_IX1 5e+09 7e+06 6916 4 10 13755
1000 100 80 0 0 0 -----
CARD VVO_IX2 5e+09 7e+06 287 4 12 8e+05 1e
+07 100 78 0 0 0 -----
SYSIBM SQL061028160336900 5e+09 6e+07 0 5 27 84 5e
+09 98 68 1 0 0 -----
-------------------------------------------------------------------------------------------------

Indexes on Big table:
CARD VVO_IX1
D 1
+DWH_EFCTV_TIMSTM
CARD VVO_IX2
D 2 +DWH_UPD_TIMSTM
+VEH_OPTN_CD_CSI
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Access Plan:
Optimizer Plan:

INSERT
( 2)
/ \
DTQ Table:
( 3) CARD
| VIN_VEH_OPTNS
TBSCAN
( 4)
|
TEMP
( 5)
|
FILTER
( 6)
|
FILTER
( 7)
|
TBSCAN
( 8)
|
SORT
( 9)
|
NLJOIN
( 10)
/ \
TBSCAN TBSCAN
( 11) ( 12)
| |
Table: TEMP
ETL ( 13)
STG_NEWS_VIN_VEH_OPTNS2 |
BTQ
( 14)
|
TBSCAN
( 15)
|
Table:
CARD
VIN_VEH_OPTNS

Please share your thoughts.

Do/Can you have dups in the source?
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B

assuming you don't have a UNIQUE key on VEH_IDENT_NBR, OPTN_CD
I recommend SELECT VEH_IDENT_NBR, OPTN_CD, MAX(ERR_FLAG)
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y')
GROUP BY VEH_IDENT_NBR, OPTN_CD) AS B

Do you have statistics on the small table?

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Thanks for looking into the issue. We cannot have duplicates on
VEH_IDENT_NBR, OPTN_CD, ERR_FLAG. I do have updated stats on the small
table and most of the time this table would be empty so I'm not sure
whether Unique Index would be of help.

Feb 20 '07 #3

Serge Rielau

Sam Durai wrote:

Thanks for looking into the issue. We cannot have duplicates on
VEH_IDENT_NBR, OPTN_CD, ERR_FLAG. I do have updated stats on the small
table and most of the time this table would be empty so I'm not sure
whether Unique Index would be of help.

Not on all three, but that doesn't help. Imporatnt is whether you could
have two (VEH_IDENT_NBR, OPTN_CD) with different ERR_FLAG.
because you join on (VEH_IDENT_NBR, OPTN_CD).
Anyway: try my rewrite and see what it does to the plan. It should
become much simpler. All the goo above the JOIN should go away (and
hopefully in the process remove the tablescan).
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 20 '07 #4

Sam Durai

On Feb 20, 1:18 pm, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
Thanks for looking into the issue. We cannot have duplicates on
VEH_IDENT_NBR, OPTN_CD, ERR_FLAG. I do have updated stats on the small
table and most of the time this table would be empty so I'm not sure
whether Unique Index would be of help.

Not on all three, but that doesn't help. Imporatnt is whether you could
have two (VEH_IDENT_NBR, OPTN_CD) with different ERR_FLAG.
because you join on (VEH_IDENT_NBR, OPTN_CD).
Anyway: try my rewrite and see what it does to the plan. It should
become much simpler. All the goo above the JOIN should go away (and
hopefully in the process remove the tablescan).

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Here is the latest plan as per modified query.
INSERT
( 2)
/ \
DTQ Table:
( 3) CARD
| VIN_VEH_OPTNS
TBSCAN
( 4)
|
TEMP
( 5)
|
FILTER
( 6)
|
NLJOIN
( 7)
/--/ \--\
GRPBY TBSCAN
( 8) ( 12)
| |
TBSCAN TEMP
( 9) ( 13)
| |
SORT BTQ
( 10) ( 14)
| |
TBSCAN TBSCAN
( 11) ( 15)
| |
Table: Table:
ETL CARD
STG_NEWS_VIN_VEH_OPTNS2 VIN_VEH_OPTNS
Query :
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, MAX(ERR_FLAG)
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y')
GROUP BY VEH_IDENT_NBR, OPTN_CD)AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE

The initial query which I posted earlier uses the PK index on other
DPF enabled environment but not on this. Not sure what the reason
eventhough the stat looks good :-(

Feb 20 '07 #5

mirof007

Hi Sam, what version of DB2 are you using?

Miro

Feb 21 '07 #6

Sam Durai

On Feb 20, 8:01 pm, "mirof007" <mirof...@gmail.comwrote:

Hi Sam, what version of DB2 are you using?

Miro

It is DB2 UDB V8.1 FP 10
DB21085I Instance "cardp1in" uses "64" bits and DB2 code release
"SQL08023"
with level identifier "03040106".
Informational tokens are "DB2 v8.1.1.96", "s050811", "U803920", and
FixPak
"10".
Product is installed at "/usr/opt/db2_08_01".

As I said it is a logically partitioned db on AIX 5.3 having 8 logical
nodes. The big table is split across 7 partitions while the other
small table is on partition 1.

Is there is something which I need to enable to make the optimizer use
the index.

Feb 21 '07 #7

Tonkuma

On Feb 21, 2:07 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.

I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.

Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)

Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE

I thought that it is worth to try following index.
Because, to check MATCHED or NOT, data of columns A.VEH_IDENT_NBR,
A.OPTN_CD, A.VEH_OPTN_CD_CSI from Table "CARD.VIN_VEH_OPTNS AS A" are
necessary. If there is no appropriate index, DB2 need to see table
itself to acquire these values. Consequentry, tablespace scan may be
inevitable.
CREATE INDEX CARD_IDNBR_CD_CSI ON CARD.VIN_VEH_OPTNS
(VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI)

Feb 21 '07 #8

Sam Durai

On Feb 20, 9:32 pm, "Tonkuma" <tonk...@jp.ibm.comwrote:

On Feb 21, 2:07 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.

I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.

Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)

Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE

I thought that it is worth to try following index.
Because, to check MATCHED or NOT, data of columns A.VEH_IDENT_NBR,
A.OPTN_CD, A.VEH_OPTN_CD_CSI from Table "CARD.VIN_VEH_OPTNS AS A" are
necessary. If there is no appropriate index, DB2 need to see table
itself to acquire these values. Consequentry, tablespace scan may be
inevitable.
CREATE INDEX CARD_IDNBR_CD_CSI ON CARD.VIN_VEH_OPTNS
(VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI)

Thats the PK in big table.
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Feb 21 '07 #9

Serge Rielau

Sam Durai wrote:

On Feb 20, 9:32 pm, "Tonkuma" <tonk...@jp.ibm.comwrote:
>On Feb 21, 2:07 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

>>Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.
I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.
Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)
Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE
I thought that it is worth to try following index.
Because, to check MATCHED or NOT, data of columns A.VEH_IDENT_NBR,
A.OPTN_CD, A.VEH_OPTN_CD_CSI from Table "CARD.VIN_VEH_OPTNS AS A" are
necessary. If there is no appropriate index, DB2 need to see table
itself to acquire these values. Consequentry, tablespace scan may be
inevitable.
CREATE INDEX CARD_IDNBR_CD_CSI ON CARD.VIN_VEH_OPTNS
(VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI)

Thats the PK in big table.
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Sam woudl you mind doing an EXPLAIN PLAN followed by db2exfmt rather
than db2expl.
db2exfmt produces has a lot more information. db2expln merely
disassembles the runtime into a plan.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 21 '07 #10

Sam Durai

On Feb 21, 7:55 am, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
On Feb 20, 9:32 pm, "Tonkuma" <tonk...@jp.ibm.comwrote:
On Feb 21, 2:07 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

>Hello, I need to merge a small table (of rows less than 100,sometimes
even 0 rows) to a big table (of rows around 4 billion). I used the PK
of the big table as merge key but merge does a table scan so it runs
for ever.
I checked the table and PK statistics of the big table and it looks
good. Please let me know if I need to check for something else.
Here are more details
Small table - Non Partitioned ( Node 0)
Big table - Partitioned across 7 logical nodes (Node 1 - 8)
Query:
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
2007-02-20 10:39:53', '2007-02-20 10:39:53')
ELSE IGNORE
I thought that it is worth to try following index.
Because, to check MATCHED or NOT, data of columns A.VEH_IDENT_NBR,
A.OPTN_CD, A.VEH_OPTN_CD_CSI from Table "CARD.VIN_VEH_OPTNS AS A" are
necessary. If there is no appropriate index, DB2 need to see table
itself to acquire these values. Consequentry, tablespace scan may be
inevitable.
CREATE INDEX CARD_IDNBR_CD_CSI ON CARD.VIN_VEH_OPTNS
(VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI)

Thats the PK in big table.
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Sam woudl you mind doing an EXPLAIN PLAN followed by db2exfmt rather
than db2expl.
db2exfmt produces has a lot more information. db2expln merely
disassembles the runtime into a plan.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Here it is

******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.02.3
SOURCE_NAME: SQLC2E06
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2007-02-21-09.51.40.619898
EXPLAIN_REQUESTER: CARDUSER

Database Context:
----------------
Parallelism: Inter-Partition Parallelism
CPU Speed: 4.251098e-07
Comm Speed: 100
Buffer Pool size: 494056
Sort Heap size: 3968
Database Heap size: 1200
Lock List size: 4000
Maximum Lock List: 10
Average Applications: 1
Locks Available: 25600

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Unrecognized
Updatable: Not Applicable
Deletable: Not Applicable
Query Degree: 1

Original Statement:
------------------
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR
AND A.OPTN_CD =B.OPTN_CD AND A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD,
13, '', '', '', 34, Current Timestamp, '2007-02-20 10:39:53',
'2007-02-20 10:39:53')
ELSE IGNORE
Optimized Statement:
-------------------
INSERT INTO CARD.VIN_VEH_OPTNS AS Q9
SELECT NULL, '2007-02-20-10.39.53.000000',
'2007-02-20-10.39.53.000000',
CURRENT TIMESTAMP, 34, ' ', ' ', ' ',
13,
Q7.$C1, $INTERNAL_FUNC$(),
CASE
WHEN Q7.$C2 IS NULL
THEN 1
ELSE 0 END
FROM
(SELECT Q6.$C0, Q6.$C1
FROM
(SELECT Q5.$C0, Q5.$C1, ROWNUMBER() OVER (PARTITION BY
Q5.$C2)
FROM
(SELECT Q4.$C0, Q4.$C1
FROM
(SELECT Q2.$C0, Q2.$C1
FROM
(SELECT Q1.VEH_IDENT_NBR, Q1.OPTN_CD
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2 AS Q1
WHERE Q1.ERR_FLAG IN ('N', 'Y')) AS Q2 LEFT OUTER JOIN
CARD.VIN_VEH_OPTNS AS Q3 ON (Q3.VEH_OPTN_CD_CSI
= 13)
AND (Q3.OPTN_CD = Q2.$C1) AND (Q3.VEH_IDENT_NBR
=
Q2.$C0)) AS Q4) AS Q5) AS Q6
WHERE (1 =
CASE
WHEN ((Q6.$C3 1) AND Q6.$C2 IS NOT NULL)
THEN $INTERNAL_FUNC$()
ELSE 1 END SELECTIVITY 1.000000)) AS Q7
WHERE (CASE
WHEN Q7.$C2 IS NULL
THEN 1
ELSE 0 END = 1)

Access Plan:
-----------
Total Cost: 12.9695
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
0
INSERT
( 2)
12.9695
1
/---+---\
0 6.61723e+08
DTQ TABLE: CARD
( 3) VIN_VEH_OPTNS
12.9695
1
|
0
TBSCAN
( 4)
12.9162
1
|
0
TEMP
( 5)
12.89
1
|
0
FILTER
( 6)
12.8762
1
|
0
FILTER
( 7)
12.8762
1
|
0
TBSCAN
( 8)
12.8742
1
|
0
SORT
( 9)
12.8736
1
|
0
NLJOIN
( 10)
12.8722
1
/----+----\
0 0.00100551
TBSCAN TBSCAN
( 11) ( 12)
12.8722 9.36425e+06
1 4.57754e+06
| |
0 4.63206e+06
TABLE: ETL TEMP
STG_NEWS_VIN_VEH ( 13)
9.1837e+06
4.5255e+06
|
4.63206e+06
BTQ
( 14)
8.51225e+06
4.47345e+06
|
661723
TBSCAN
( 15)
8.51139e+06
4.47345e+06
|
6.61723e+08
TABLE: CARD
VIN_VEH_OPTNS

Please let me know if you want the rest of the output used as well

Feb 21 '07 #11

Sam Durai

On Feb 21, 9:54 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

On Feb 21, 7:55 am, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
On Feb 20, 9:32 pm, "Tonkuma" <tonk...@jp.ibm.comwrote:
>On Feb 21, 2:07 am, "Sam Durai" <reachsamdu...@gmail.comwrote:

>>Hello, I need to merge a small table (of rows less than 100,sometimes
>>even 0 rows) to a big table (of rows around 4 billion). I used the PK
>>of the big table as merge key but merge does a table scan so it runs
>>for ever.
>>I checked the table and PK statistics of the big table and it looks
>>good. Please let me know if I need to check for something else.
>>Here are more details
>>Small table - Non Partitioned ( Node 0)
>>Big table - Partitioned across 7 logical nodes (Node 1 - 8)
>>Query:
>> MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
>> SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
>> FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
>> WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
>> B.VEH_IDENT_NBR AND A.OPTN_CD =B.OPTN_CD AND
>> A.VEH_OPTN_CD_CSI =13)
>> WHEN NOT MATCHED
>> THEN
>> INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
>> PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
>> DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
>> B.OPTN_CD, 13, '', '', '', 34, Current Timestamp, '
>> 2007-02-20 10:39:53', '2007-02-20 10:39:53')
>> ELSE IGNORE
>I thought that it is worth to try following index.
>Because, to check MATCHED or NOT, data of columns A.VEH_IDENT_NBR,
>A.OPTN_CD, A.VEH_OPTN_CD_CSI from Table "CARD.VIN_VEH_OPTNS AS A" are
>necessary. If there is no appropriate index, DB2 need to see table
>itself to acquire these values. Consequentry, tablespace scan may be
>inevitable.
>CREATE INDEX CARD_IDNBR_CD_CSI ON CARD.VIN_VEH_OPTNS
> (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI)

Thats the PK in big table.
SYSIBM SQL061028160336900
P 3 +VEH_IDENT_NBR+OPTN_CD
+VEH_OPTN_CD_CSI

Sam woudl you mind doing an EXPLAIN PLAN followed by db2exfmt rather
than db2expl.
db2exfmt produces has a lot more information. db2expln merely
disassembles the runtime into a plan.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Here it is

******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.02.3
SOURCE_NAME: SQLC2E06
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2007-02-21-09.51.40.619898
EXPLAIN_REQUESTER: CARDUSER

Database Context:
----------------
Parallelism: Inter-Partition Parallelism
CPU Speed: 4.251098e-07
Comm Speed: 100
Buffer Pool size: 494056
Sort Heap size: 3968
Database Heap size: 1200
Lock List size: 4000
Maximum Lock List: 10
Average Applications: 1
Locks Available: 25600

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Unrecognized
Updatable: Not Applicable
Deletable: Not Applicable
Query Degree: 1

Original Statement:
------------------
MERGE INTO CARD.VIN_VEH_OPTNS AS A USING (
SELECT VEH_IDENT_NBR, OPTN_CD, ERR_FLAG
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2
WHERE ERR_FLAG IN ('N', 'Y'))AS B ON (A.VEH_IDENT_NBR =
B.VEH_IDENT_NBR
AND A.OPTN_CD =B.OPTN_CD AND A.VEH_OPTN_CD_CSI =13)
WHEN NOT MATCHED
THEN
INSERT (VEH_IDENT_NBR, OPTN_CD, VEH_OPTN_CD_CSI, OPTN_INSTLT_TYP_CD,
PART_SERIAL_NBR, BROADCAST_CD, SOURCE_SYSTEM_CD, SOURCE_DT,
DWH_EFCTV_TIMSTM, DWH_UPD_TIMSTM)VALUES (B.VEH_IDENT_NBR,
B.OPTN_CD,
13, '', '', '', 34, Current Timestamp, '2007-02-20 10:39:53',
'2007-02-20 10:39:53')
ELSE IGNORE

Optimized Statement:
-------------------
INSERT INTO CARD.VIN_VEH_OPTNS AS Q9
SELECT NULL, '2007-02-20-10.39.53.000000',
'2007-02-20-10.39.53.000000',
CURRENT TIMESTAMP, 34, ' ', ' ', ' ',
13,
Q7.$C1, $INTERNAL_FUNC$(),
CASE
WHEN Q7.$C2 IS NULL
THEN 1
ELSE 0 END
FROM
(SELECT Q6.$C0, Q6.$C1
FROM
(SELECT Q5.$C0, Q5.$C1, ROWNUMBER() OVER (PARTITION BY
Q5.$C2)
FROM
(SELECT Q4.$C0, Q4.$C1
FROM
(SELECT Q2.$C0, Q2.$C1
FROM
(SELECT Q1.VEH_IDENT_NBR, Q1.OPTN_CD
FROM ETL.STG_NEWS_VIN_VEH_OPTNS2 AS Q1
WHERE Q1.ERR_FLAG IN ('N', 'Y')) AS Q2 LEFT OUTER JOIN
CARD.VIN_VEH_OPTNS AS Q3 ON (Q3.VEH_OPTN_CD_CSI
= 13)
AND (Q3.OPTN_CD = Q2.$C1) AND (Q3.VEH_IDENT_NBR
=
Q2.$C0)) AS Q4) AS Q5) AS Q6
WHERE (1 =
CASE
WHEN ((Q6.$C3 1) AND Q6.$C2 IS NOT NULL)
THEN $INTERNAL_FUNC$()
ELSE 1 END SELECTIVITY 1.000000)) AS Q7
WHERE (CASE
WHEN Q7.$C2 IS NULL
THEN 1
ELSE 0 END = 1)

Access Plan:
-----------
Total Cost: 12.9695
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
0
INSERT
( 2)
12.9695
1
/---+---\
0 6.61723e+08
DTQ TABLE: CARD
( 3) VIN_VEH_OPTNS
12.9695
1
|
0
TBSCAN
( 4)
12.9162
1
|
0
TEMP
( 5)
12.89
1
|
0
FILTER
( 6)
12.8762
1
|
0
FILTER
( 7)
12.8762
1
|
0
TBSCAN
( 8)
12.8742
1
|
0
SORT
( 9)
12.8736
1
|
0
NLJOIN
( 10)
12.8722
1
/----+----\
0 0.00100551
TBSCAN TBSCAN
( 11) ( 12)
12.8722 9.36425e+06
1 4.57754e+06
| |
0 4.63206e+06
TABLE: ETL TEMP
STG_NEWS_VIN_VEH ( 13)
9.1837e+06
4.5255e+06
|
4.63206e+06
BTQ
( 14)
8.51225e+06
4.47345e+06
|
661723
TBSCAN
( 15)
8.51139e+06
4.47345e+06
|
6.61723e+08
TABLE: CARD
VIN_VEH_OPTNS

Please let me know if you want the rest of the output used as well

The same merge query ran today but with some volume of records
(714466) on the small table. It did use the index and completed in 20
seconds. But if the small table has 0 rows (as yesterday) it goes for
tablescan.
Pasted below the access plan for today's run.
Rows
RETURN
( 1)
Cost
I/O
|
0.0114286
INSERT
( 2)
11663.4
6215
/----+---\
0.0114286 6.61723e+08
DTQ TABLE: CARD
( 3) VIN_VEH_OPTNS
11650.5
6214
|
0.08
TBSCAN
( 4)
11650.5
6214
|
0.08
TEMP
( 5)
11650.5
6214
|
0.08
FILTER
( 6)
11650.4
6214
|
2
FILTER
( 7)
11650.4
6214
|
2
MDTQ
( 8)
11650.4
6214
|
0.285714
TBSCAN
( 9)
11650.4
6214
|
0.285714
SORT
( 10)
11650.4
6214
|
0.285714
NLJOIN
( 11)
11650.4
6214
/----+---\
0.285714 2.62669e-06
DTQ IXSCAN
( 12) ( 14)
11637.5 51.4469
6213 4
| |
2 6.61723e+08
TBSCAN INDEX: SYSIBM
( 13) SQL0610281603369
11637.4
6213
|
714466
TABLE: ETL
STG_NEWS_VIN_VEH

Feb 21 '07 #12

Serge Rielau

This is the original explain before my proposal to use GROUP BY
Note the ROW_NUMBER() which is used to find updates in the source.
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 21 '07 #13

Serge Rielau

Sam Durai wrote:

The same merge query ran today but with some volume of records
(714466) on the small table. It did use the index and completed in 20
seconds. But if the small table has 0 rows (as yesterday) it goes for
tablescan.

If the small table has0 rows, then it doesn't matter that the inner is a
table scan because it will never execute...

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 21 '07 #14

Sam Durai

On Feb 21, 1:25 pm, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
The same merge query ran today but with some volume of records
(714466) on the small table. It did use the index and completed in 20
seconds. But if the small table has 0 rows (as yesterday) it goes for
tablescan.

If the small table has0 rows, then it doesn't matter that the inner is a
table scan because it will never execute...

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

But with zero rows on small table the merge ran for around 1 hour to
complete but if the small table has records it completed in few
seconds. Please let me know if I'm missing something.

Feb 21 '07 #15

Serge Rielau

Sam Durai wrote:

On Feb 21, 1:25 pm, Serge Rielau <srie...@ca.ibm.comwrote:
>Sam Durai wrote:
>>The same merge query ran today but with some volume of records
(714466) on the small table. It did use the index and completed in 20
seconds. But if the small table has 0 rows (as yesterday) it goes for
tablescan.
If the small table has0 rows, then it doesn't matter that the inner is a
table scan because it will never execute...

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

But with zero rows on small table the merge ran for around 1 hour to
complete but if the small table has records it completed in few
seconds. Please let me know if I'm missing something.

Yes, you are missing something, but I don't know what....
Can you open a PMR?

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Feb 22 '07 #16

Sam Durai

On Feb 21, 7:35 pm, Serge Rielau <srie...@ca.ibm.comwrote:

Sam Durai wrote:
On Feb 21, 1:25 pm, Serge Rielau <srie...@ca.ibm.comwrote:
Sam Durai wrote:
The same merge query ran today but with some volume of records
(714466) on the small table. It did use the index and completed in 20
seconds. But if the small table has 0 rows (as yesterday) it goes for
tablescan.
If the small table has0 rows, then it doesn't matter that the inner is a
table scan because it will never execute...

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

But with zero rows on small table the merge ran for around 1 hour to
complete but if the small table has records it completed in few
seconds. Please let me know if I'm missing something.

Yes, you are missing something, but I don't know what....
Can you open a PMR?

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

ok, I will. Thanks for your comments.

Feb 22 '07 #17

Similar topics