By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
439,978 Members | 1,362 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 439,978 IT Pros & Developers. It's quick & easy.

runstats sampling oddities

P: n/a
I've got a large table on db2 8.2.1 that I rarely perform runstats on.
It has about 600 million rows organized in a single MDC time dimension
on a non-dpf warehouse.

Anyhow, we recently ran runstats on it with a 20% sampling. After this
was performed the highly selective queries using the time dimension
slowed down drastically (8000% increase in duration). Earlier today I
marked the table 'volatile' and queries went back to normal. Tomorow
I'll do a non-sampling runstats - it'll take hours, and remove the
volatile flag.

I'm not sure where the delay was exactly - visual explain wasn't
helpful, nmon showed fine io throughput, but application snapshots
showed rows read & logical/physical reads just crawling.

What's going on here? What was db2 up to - acquiring incorrect blocks
of data & throwing results away? Is this a bug? Or am I missing some
limitations or issues with runstats (or runstats & MDC)?

And yes, I will be applying fp10 as soon as fp11 comes out. :-)

Thanks in advance,

Ken Farmer

Jan 26 '06 #1
Share this Question
Share on Google+
4 Replies


P: n/a
"kenfar" <ke****@gmail.com> wrote in message
news:11**********************@g47g2000cwa.googlegr oups.com...
I've got a large table on db2 8.2.1 that I rarely perform runstats on.
It has about 600 million rows organized in a single MDC time dimension
on a non-dpf warehouse.

Anyhow, we recently ran runstats on it with a 20% sampling. After this
was performed the highly selective queries using the time dimension
slowed down drastically (8000% increase in duration). Earlier today I
marked the table 'volatile' and queries went back to normal. Tomorow
I'll do a non-sampling runstats - it'll take hours, and remove the
volatile flag.

I'm not sure where the delay was exactly - visual explain wasn't
helpful, nmon showed fine io throughput, but application snapshots
showed rows read & logical/physical reads just crawling.

What's going on here? What was db2 up to - acquiring incorrect blocks
of data & throwing results away? Is this a bug? Or am I missing some
limitations or issues with runstats (or runstats & MDC)?

And yes, I will be applying fp10 as soon as fp11 comes out. :-)

Thanks in advance,

Ken Farmer


On a table that large, once you execute runstats and everything seems to be
running as expected, then there is usually no good reason to run it again.
If you take that approach, then doing the non-sampling runstats should not
be a burden.
Jan 26 '06 #2

P: n/a
kenfar wrote:
I've got a large table on db2 8.2.1 that I rarely perform runstats on.
It has about 600 million rows organized in a single MDC time dimension
on a non-dpf warehouse.

Anyhow, we recently ran runstats on it with a 20% sampling. After this
was performed the highly selective queries using the time dimension
slowed down drastically (8000% increase in duration). Earlier today I
marked the table 'volatile' and queries went back to normal. Tomorow
I'll do a non-sampling runstats - it'll take hours, and remove the
volatile flag.

I'm not sure where the delay was exactly - visual explain wasn't
helpful, nmon showed fine io throughput, but application snapshots
showed rows read & logical/physical reads just crawling.

What's going on here? What was db2 up to - acquiring incorrect blocks
of data & throwing results away? Is this a bug? Or am I missing some
limitations or issues with runstats (or runstats & MDC)?

And yes, I will be applying fp10 as soon as fp11 comes out. :-)

There is an easy way to find out what happened: db2exfmt.
Note to know change its good to know the before picture as well.
There MUST have been a plan change.
Also since you are obviously not playing with a toy system I recommend
that you use db2exfmt. Visual explain is nice, but db2exfmt leaves thie
toy in the dust.
EXPLAIN PLAN FOR querystillgood
db2exfmt -d <dbname> -o good.exfmt -1
do your runstats (or flip of volatile, or whatever)
EXPLAIN PLAN FOR querynowbad
db2exmft -d <dbname> -o bad.exfmt -1

Then let's have a look. The problem will be staring in our faces.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
DB2 UDB for Linux, Unix, Windows
IBM Toronto Lab
Jan 26 '06 #3

P: n/a
I've got two scenarios then for this simple query:
good:
- volatile
- stats collected with 20% sampling
- takes about 35 seconds to export 1.9 m rows
bad:
- non-volatile
- stats collected with 20% sampling
- takes about 60 minutes to export 1.9 m rows

************************************************** ****************
Here's the good explain:
******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.02.1
SOURCE_NAME: SQLC2E06
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2006-01-26-16.55.05.287940
EXPLAIN_REQUESTER: DBA

Database Context:
----------------
Parallelism: Intra-Partition Parallelism
CPU Speed: 5.668131e-07
Comm Speed: 2
Buffer Pool size: 17000
Sort Heap size: 10000
Database Heap size: 8000
Lock List size: 100
Maximum Lock List: 10
Average Applications: 1
Locks Available: 640

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: -1

Original Statement:
------------------
SELECT 1
FROM sware.mf_trm_nid_3
WHERE sensor_time_id = 1485
Optimized Statement:
-------------------
SELECT 1
FROM SWARE.MF_TRM_NID_3 AS Q1
WHERE (Q1.SENSOR_TIME_ID = 1485)

Access Plan:
-----------
Total Cost: 1.25461e+07
Query Degree: 4

Rows
RETURN
( 1)
Cost
I/O
|
853509
LTQ
( 2)
1.25461e+07
508785
|
853509
FETCH
( 3)
1.25457e+07
508785
/---+---\
236.513 6.38425e+08
IXSCAN TABLE: SWARE
( 4) MF_TRM_NID_3
25.0285
1
|
6.38425e+08
INDEX: SYSIBM
SQL0402251807242


1) RETURN: (Return Result)
Cumulative Total Cost: 1.25461e+07
Cumulative CPU Cost: 6.17343e+09
Cumulative I/O Cost: 508785
Cumulative Re-Total Cost: 1.25457e+07
Cumulative Re-CPU Cost: 5.5809e+09
Cumulative Re-I/O Cost: 508785
Cumulative First Row Cost: 78.0764
Estimated Bufferpool Buffers: 508786

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.1.80 : s041221
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
5) From Operator #2

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
2) TQ : (Table Queue)
Cumulative Total Cost: 1.25461e+07
Cumulative CPU Cost: 6.17343e+09
Cumulative I/O Cost: 508785
Cumulative Re-Total Cost: 1.25457e+07
Cumulative Re-CPU Cost: 5.5809e+09
Cumulative Re-I/O Cost: 508785
Cumulative First Row Cost: 78.0764
Estimated Bufferpool Buffers: 508786

Arguments:
---------
LISTENER: (Listener Table Queue type)
FALSE
TQ TYPE : (Table queue type)
LOCAL
TQDEGREE: (Degree of Intra-Partition parallelism)
4
TQMERGE : (Merging Table Queue flag)
FALSE
TQREAD : (Table Queue Read type)
READ AHEAD
UNIQUE : (Uniqueness required flag)
FALSE

Input Streams:
-------------
4) From Operator #3

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
Output Streams:
--------------
5) To Operator #1

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
3) FETCH : (Fetch)
Cumulative Total Cost: 1.25457e+07
Cumulative CPU Cost: 5.58093e+09
Cumulative I/O Cost: 508785
Cumulative Re-Total Cost: 1.25457e+07
Cumulative Re-CPU Cost: 5.5809e+09
Cumulative Re-I/O Cost: 508785
Cumulative First Row Cost: 77.9839
Estimated Bufferpool Buffers: 508786

Arguments:
---------
BLKLOCK : (Block Lock intent)
INTENT SHARE
MAXPAGES: (Maximum pages for prefetch)
1
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Input Streams:
-------------
2) From Operator #4

Estimated number of rows: 236.513
Number of columns: 0
Subquery predicate ID: Not
Applicable

3) From Object SWARE.MF_TRM_NID_3

Estimated number of rows:
6.38425e+08
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$
Output Streams:
--------------
4) To Operator #2

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
4) IXSCAN: (Index Scan)
Cumulative Total Cost: 25.0285
Cumulative CPU Cost: 50245.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00957301
Cumulative Re-CPU Cost: 16889.2
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 25.0149
Estimated Bufferpool Buffers: 2

Arguments:
---------
BLKLOCK : (Block Lock intent)
INTENT SHARE
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
SCANGRAN: (Intra-Partition Parallelism Scan
Granularity)
1
SCANTYPE: (Intra-Partition Parallelism Scan Type)
LOCAL PARALLEL
TABLOCK : (Table Lock intent)
INTENT SHARE
VOLATILE: (Volatile type)
CARDINALITY

Predicates:
----------
2) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.0013369

Predicate Text:
--------------
(Q1.SENSOR_TIME_ID = 1485)

2) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.0013369

Predicate Text:
--------------
(Q1.SENSOR_TIME_ID = 1485)
Input Streams:
-------------
1) From Object SYSIBM.SQL040225180724200

Estimated number of rows:
6.38425e+08
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$BLOCKID$+Q1.SENSOR_TIME_ID
Output Streams:
--------------
2) To Operator #3

Estimated number of rows: 236.513
Number of columns: 0
Subquery predicate ID: Not
Applicable
Objects Used in Access Plan:
---------------------------

Schema: SYSIBM
Name: SQL040225180724200
Type: Index
Time of creation:
2004-02-25-18.07.24.131547
Last statistics update:
2006-01-17-19.15.49.725655
Number of columns: 1
Number of rows: 638424927
Width of rows: -1
Number of buffer pool pages: 5661216
Distinct row values: No
Tablespace name: TS_INDEXES
Tablespace overhead: 24.100000
Tablespace transfer rate: 0.900000
Source for statistics: Single Node
Prefetch page count: 192
Container extent page count: 32
Index clustering statistic: 100.000000
Index leaf pages: 40
Index tree levels: 2
Index full key cardinality: 747
Index first key cardinality: 747
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: 17
Index page density: 47
Index avg sequential pages: 17
Index avg gap between sequences:0
Index avg random pages: 20
Fetch avg sequential pages: -1
Fetch avg gap between sequences:-1
Fetch avg random pages: -1
Index RID count: 176914
Index deleted RID count: 2
Index empty leaf pages: 0
Base Table Schema: SWARE
Base Table Name: MF_TRM_NID_3
Columns in index:
SENSOR_TIME_ID

Schema: SWARE
Name: MF_TRM_NID_3
Type: Table
Time of creation:
2004-02-25-18.07.24.131547
Last statistics update:
2006-01-17-19.15.49.725655
Number of columns: 38
Number of rows: 638424927
Width of rows: 20
Number of buffer pool pages: 5661216
Distinct row values: No
Tablespace name: TS_FACT1
Tablespace overhead: 24.100000
Tablespace transfer rate: 0.900000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Table overflow record count: 374897346
Table Active Blocks: 176912


************************************************** ****************
and here's the bad one
******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.02.1
SOURCE_NAME: SQLC2E06
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2006-01-26-17.40.19.135670
EXPLAIN_REQUESTER: DBA

Database Context:
----------------
Parallelism: Intra-Partition Parallelism
CPU Speed: 5.668131e-07
Comm Speed: 2
Buffer Pool size: 17000
Sort Heap size: 10000
Database Heap size: 8000
Lock List size: 100
Maximum Lock List: 10
Average Applications: 1
Locks Available: 640

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: -1

Original Statement:
------------------
SELECT 1
FROM sware.mf_trm_nid_3
WHERE sensor_time_id = 1485
Optimized Statement:
-------------------
SELECT 1
FROM SWARE.MF_TRM_NID_3 AS Q1
WHERE (Q1.SENSOR_TIME_ID = 1485)

Access Plan:
-----------
Total Cost: 9.53175e+06
Query Degree: 4

Rows
RETURN
( 1)
Cost
I/O
|
853509
LTQ
( 2)
9.53175e+06
5.66119e+06
|
853509
TBSCAN
( 3)
9.53141e+06
5.66119e+06
|
6.38425e+08
TABLE: SWARE
MF_TRM_NID_3


1) RETURN: (Return Result)
Cumulative Total Cost: 9.53175e+06
Cumulative CPU Cost: 3.05352e+11
Cumulative I/O Cost: 5.66119e+06
Cumulative Re-Total Cost: 9.53141e+06
Cumulative Re-CPU Cost: 3.0476e+11
Cumulative Re-I/O Cost: 5.66119e+06
Cumulative First Row Cost: 64.8634
Estimated Bufferpool Buffers: 5.66118e+06

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.1.80 : s041221
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
3) From Operator #2

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
2) TQ : (Table Queue)
Cumulative Total Cost: 9.53175e+06
Cumulative CPU Cost: 3.05352e+11
Cumulative I/O Cost: 5.66119e+06
Cumulative Re-Total Cost: 9.53141e+06
Cumulative Re-CPU Cost: 3.0476e+11
Cumulative Re-I/O Cost: 5.66119e+06
Cumulative First Row Cost: 64.8634
Estimated Bufferpool Buffers: 5.66118e+06

Arguments:
---------
LISTENER: (Listener Table Queue type)
FALSE
TQ TYPE : (Table queue type)
LOCAL
TQDEGREE: (Degree of Intra-Partition parallelism)
4
TQMERGE : (Merging Table Queue flag)
FALSE
TQREAD : (Table Queue Read type)
READ AHEAD
UNIQUE : (Uniqueness required flag)
FALSE

Input Streams:
-------------
2) From Operator #3

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
Output Streams:
--------------
3) To Operator #1

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
3) TBSCAN: (Table Scan)
Cumulative Total Cost: 9.53141e+06
Cumulative CPU Cost: 3.0476e+11
Cumulative I/O Cost: 5.66119e+06
Cumulative Re-Total Cost: 9.53141e+06
Cumulative Re-CPU Cost: 3.0476e+11
Cumulative Re-I/O Cost: 5.66119e+06
Cumulative First Row Cost: 64.7709
Estimated Bufferpool Buffers: 5.66118e+06

Arguments:
---------
BLKLOCK : (Block Lock intent)
INTENT SHARE
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
SEQUENTIAL
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
SCANGRAN: (Intra-Partition Parallelism Scan
Granularity)
32
SCANTYPE: (Intra-Partition Parallelism Scan Type)
LOCAL PARALLEL
SCANUNIT: (Intra-Partition Parallelism Scan Unit)
PAGE
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Predicates:
----------
2) Block Sarg Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.0013369

Predicate Text:
--------------
(Q1.SENSOR_TIME_ID = 1485)
Input Streams:
-------------
1) From Object SWARE.MF_TRM_NID_3

Estimated number of rows:
6.38425e+08
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$+Q1.SENSOR_TIME_ID
Output Streams:
--------------
2) To Operator #2

Estimated number of rows: 853509
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
Objects Used in Access Plan:
---------------------------

Schema: SWARE
Name: MF_TRM_NID_3
Type: Table
Time of creation:
2004-02-25-18.07.24.131547
Last statistics update:
2006-01-17-19.15.49.725655
Number of columns: 38
Number of rows: 638424927
Width of rows: 20
Number of buffer pool pages: 5661216
Distinct row values: No
Tablespace name: TS_FACT1
Tablespace overhead: 24.100000
Tablespace transfer rate: 0.900000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Table overflow record count: 374897346
Table Active Blocks: 176912

Jan 26 '06 #4

P: n/a
I can see a major difference just looking at the access plans. (This
doesn't include a full analysis of the detail information available.)
Your "good" run used an index to locate rows that satisfied the
predicate. The "bad" one used a tablespace scan to read the entire table
looking for qualifying rows.

This difference should have been easily seen in a visual explain. If you
still have the two statements explained, then I'd go back to the visual
tool and see if you can spot the difference.

The optimizer's decision to scan or use an index uses, as one of its
factors, its estimate of the total number of I/Os needed to satisfy the
query. Scanning always reads multiple pages with each I/O; indexes can
be assumed to read one data page per row. The optimizer incorrectly
(based on your actual results) assumed around 850k rows would be
retrieved. I'd guess that the partial statistics you gathered tipped the
optimizer into deciding that the tablespace scan would be faster. The
VOLATILE table setting will cause the optimizer to use the index even
though it would prefer to use a tablespace scan. (See ALTER TABLE in SQL
Reference V2.)

My experience with large tables and this type of query is that index
access becomes almost a certainty when the table is clustered to match
the index. The statistics must also indicate that the table is currently
clustered and not in need of reorganization.

Phil Sherman

kenfar wrote:
I've got two scenarios then for this simple query:
good:
- volatile
- stats collected with 20% sampling
- takes about 35 seconds to export 1.9 m rows
bad:
- non-volatile
- stats collected with 20% sampling
- takes about 60 minutes to export 1.9 m rows

************************************************** ****************
Here's the good explain:
******************** EXPLAIN INSTANCE ********************
Access Plan:
-----------
Total Cost: 1.25461e+07
Query Degree: 4

Rows
RETURN
( 1)
Cost
I/O
|
853509
LTQ
( 2)
1.25461e+07
508785
|
853509
FETCH
( 3)
1.25457e+07
508785
/---+---\
236.513 6.38425e+08
IXSCAN TABLE: SWARE
( 4) MF_TRM_NID_3
25.0285
1
|
6.38425e+08
INDEX: SYSIBM
SQL0402251807242



************************************************** ****************
and here's the bad one
******************** EXPLAIN INSTANCE ********************


Optimized Statement:
-------------------
SELECT 1
FROM SWARE.MF_TRM_NID_3 AS Q1
WHERE (Q1.SENSOR_TIME_ID = 1485)

Access Plan:
-----------
Total Cost: 9.53175e+06
Query Degree: 4

Rows
RETURN
( 1)
Cost
I/O
|
853509
LTQ
( 2)
9.53175e+06
5.66119e+06
|
853509
TBSCAN
( 3)
9.53141e+06
5.66119e+06
|
6.38425e+08
TABLE: SWARE
MF_TRM_NID_3

Jan 28 '06 #5

This discussion thread is closed

Replies have been disabled for this discussion.