By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
458,191 Members | 1,599 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 458,191 IT Pros & Developers. It's quick & easy.

Question about MIN/MAX optimization

P: n/a
Hi all,

Db2 v8 FP15 LUW .

create table T (ID varchar (24), ABC timestamp)

There is an index for (ID, ABC), allowing reverse Scans.

My application needs to determine MIN and MAX(ABC) for a given ID. We
are currently using a simple statement:

select MIN(abc), MAX(abc) from T where ID = ? for read only

Table T has 100+ million rows, and several other applications are
reading/deleting data from it. The statement above runs with UR
isolation, however it takes a very long time to complete (5-10
minutes, or more).

I have studied the access plan, and it looks OK:

Access Plan:
-----------

Total Cost: 25.6855
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
25.6855
3.99557
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
12.8262 12.8587
1.99557 2
| |
46.1442 46.1442
IXSCAN IXSCAN
( 4) ( 6)
12.8544 12.8544
2 2
| |
973678 973678
INDEX: RTM INDEX: RTM
IPSSTAT_WIN IPSSTAT_WIN
I am looking for some magic SQL or hint that will allow me to improve
this rather 'simple' query.

PS: Yes, table and indexes do have updated statistics.

Thanks in advance,

-Michel
Jun 27 '08 #1
Share this Question
Share on Google+
3 Replies


P: n/a
Hi.
>
Try this. It will probably generate one indexs can instead of two.

with temp (abc) as
* * ( select abc from T were id *= ?)
select *max (abc), min(abc) from temp

/dg

DG, DB2 optimized the original statement into two index scans.

Original Statement:
------------------
with temp (COLLECT_TIME) as
(select COLLECT_TIME
from RTM.TBL_COLLECT_PSSTAT_WIN_RTM
where MACHINE_ID= ? and COLLECT_TIME ?)
select MIN(COLLECT_TIME), MAX(COLLECT_TIME)
from temp
Optimized Statement:
-------------------
SELECT Q6.$C0, Q3.$C0
FROM
(SELECT MAX(Q2.$C0)
FROM
(SELECT Q1.COLLECT_TIME
FROM RTM.TBL_COLLECT_PSSTAT_WIN_RTM AS Q1
WHERE (:? < Q1.COLLECT_TIME) AND (Q1.MACHINE_ID = :?)) AS Q2) AS
Q3,
(SELECT MIN(Q5.$C0)
FROM
(SELECT Q4.COLLECT_TIME
FROM RTM.TBL_COLLECT_PSSTAT_WIN_RTM AS Q4
WHERE (:? < Q4.COLLECT_TIME) AND (Q4.MACHINE_ID = :?)) AS Q5) AS
Q6

Access Plan:
-----------
Total Cost: 51.2774
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
51.2774
7.9774
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
25.6383 25.6383
3.98869 3.98871
| |
448.726 448.726
IXSCAN IXSCAN
( 4) ( 6)
87.5883 87.5879
13.6267 13.6267
| |
1.48278e+06 1.48278e+06
INDEX: RTM INDEX: RTM
IPSSTAT_WIN IPSSTAT_WIN

The overall cost is higher than the original statement :(

Thanks
Jun 27 '08 #2

P: n/a
This is the optimal plan we want. Can you reproduce it on DB2 V8 with
this DDL and DML?

Here is the plan with MIN/MAX:

Database Context:
----------------
Parallelism: None
CPU Speed: 3.778754e-07
Comm Speed: 0
Buffer Pool size: 165240
Sort Heap size: 1024
Database Heap size: 1024
Lock List size: 100
Maximum Lock List: 10
Average Applications: 1
Locks Available: 1020

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: 1

Original Statement:
------------------
select MIN(ABC), MAX(ABC)
from T
where ID=?
Optimized Statement:
-------------------
SELECT Q6.$C0, Q3.$C0
FROM
(SELECT MAX(Q2.$C0)
FROM
(SELECT Q1.ABC
FROM DB2INST1.T AS Q1
WHERE (Q1.ID = :?)) AS Q2) AS Q3,
(SELECT MIN(Q5.$C0)
FROM
(SELECT Q4.ABC
FROM DB2INST1.T AS Q4
WHERE (Q4.ID = :?)) AS Q5) AS Q6

Access Plan:
-----------
Total Cost: 12.8453
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
12.8453
2
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
6.42335 6.42127
1 1
| |
2.92 2.92
IXSCAN IXSCAN
( 4) ( 6)
6.42298 6.42298
1 1
| |
73 73
INDEX: DB2INST1 INDEX: DB2INST1
I I


Extended Diagnostic Information:
--------------------------------

Diagnostic Identifier: 1
Diagnostic Details: EXP0022W Index has no statistics. The index
"DB2INST1"."I" has not had runstats run on it.
This
can lead to poor cardinality and predicate
filtering estimates.

Plan Details:
-------------
1) RETURN: (Return Result)
Cumulative Total Cost: 12.8453
Cumulative CPU Cost: 124849
Cumulative I/O Cost: 2
Cumulative Re-Total Cost: 0.0064907
Cumulative Re-CPU Cost: 17176.8
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 12.8441
Estimated Bufferpool Buffers: 3

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.2.136 : special_19546
HEAPUSE : (Maximum Statement Heap Usage)
56 Pages
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
7) From Operator #2

Estimated number of rows: 1
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q7.$C1+Q7.$C0
2) NLJOIN: (Nested Loop Join)
Cumulative Total Cost: 12.8453
Cumulative CPU Cost: 124849
Cumulative I/O Cost: 2
Cumulative Re-Total Cost: 0.0064907
Cumulative Re-CPU Cost: 17176.8
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 12.8441
Estimated Bufferpool Buffers: 3

Arguments:
---------
EARLYOUT: (Early Out flag)
NONE
FETCHMAX: (Override for FETCH MAXPAGES)
IGNORE
ISCANMAX: (Override for ISCAN MAXPAGES)
IGNORE

Input Streams:
-------------
3) From Operator #3

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0

6) From Operator #5

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C0
Output Streams:
--------------
7) To Operator #1

Estimated number of rows: 1
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q7.$C1+Q7.$C0
3) GRPBY : (Group By)
Cumulative Total Cost: 6.42335
Cumulative CPU Cost: 61783.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00503426
Cumulative Re-CPU Cost: 13322.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.4226
Estimated Bufferpool Buffers: 2

Arguments:
---------
AGGMODE : (Aggregration Mode)
COMPLETE
GROUPBYC: (Group By columns)
FALSE
GROUPBYN: (Number of Group By columns)
0
JN INPUT: (Join input leg)
OUTER
ONEFETCH: (One Fetch flag)
FALSE

Input Streams:
-------------
2) From Operator #4

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
Output Streams:
--------------
3) To Operator #2

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0
4) IXSCAN: (Index Scan)
Cumulative Total Cost: 6.42298
Cumulative CPU Cost: 60803.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00466394
Cumulative Re-CPU Cost: 12342.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42108
Estimated Bufferpool Buffers: 2

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
2) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.ID = :?)

2) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.ID = :?)
Input Streams:
-------------
1) From Object DB2INST1.I

Estimated number of rows: 73
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$+Q1.ID+Q1.ABC

Output Streams:
--------------
2) To Operator #3

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
5) GRPBY : (Group By)
Cumulative Total Cost: 6.42127
Cumulative CPU Cost: 61285.6
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.000783828
Cumulative Re-CPU Cost: 2074.3
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42117
Estimated Bufferpool Buffers: 2

Arguments:
---------
AGGMODE : (Aggregration Mode)
COMPLETE
GROUPBYC: (Group By columns)
FALSE
GROUPBYN: (Number of Group By columns)
0
JN INPUT: (Join input leg)
INNER
ONEFETCH: (One Fetch flag)
TRUE

Input Streams:
-------------
5) From Operator #6

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q5.$C0(A)
Output Streams:
--------------
6) To Operator #2

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C0
6) IXSCAN: (Index Scan)
Cumulative Total Cost: 6.42298
Cumulative CPU Cost: 60803.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00466394
Cumulative Re-CPU Cost: 12342.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42108
Estimated Bufferpool Buffers: 2

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
3) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q4.ID = :?)

3) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q4.ID = :?)
Input Streams:
-------------
4) From Object DB2INST1.I

Estimated number of rows: 73
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.ABC(A)+Q4.$RID$+Q4.ID
Output Streams:
--------------
5) To Operator #5

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q5.$C0(A)
Objects Used in Access Plan:
---------------------------

Schema: DB2INST1
Name: T
Type: Table (reference only)

Schema: DB2INST1
Name: I
Type: Index
Time of creation:
2008-04-26-12.06.48.096459
Last statistics update:
Number of columns: 2
Number of rows: 73
Width of rows: -1
Number of buffer pool pages: 1
Distinct row values: No
Tablespace name:
IOSTATDATIDX
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Index clustering statistic: 80.000000
Index leaf pages: 2
Index tree levels: 2
Index full key cardinality: 25
Index first key cardinality: 25
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: 2
Index page density: 100
Index avg sequential pages: -1
Index avg gap between sequences:-1
Index avg random pages: -1
Fetch avg sequential pages: -1
Fetch avg gap between sequences:-1
Fetch avg random pages: -1
Index RID count: 0
Index deleted RID count: 0
Index empty leaf pages: 0
Base Table Schema: DB2INST1
Base Table Name: T
Columns in index:
ID
ABC

Base Table For Index Not Already Shown:
---------------------------------------

Schema: DB2INST1
Name: T
Time of creation:
2008-04-26-12.06.12.993169
Last statistics update:
Number of columns: 2
Number of rows: -1
Number of pages: -1
Number of pages with rows: -1
Tablespace name: IOSTATDATIDX
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Prefetch page count: 128
Container extent page count: 32
Table overflow record count: -1

Both plans are very similar. In terms of performance (CPU/Disk IO),
how do they compare ?

Thanks,
Jun 27 '08 #3

P: n/a
Michel Esber wrote:
>This is the optimal plan we want. Can you reproduce it on DB2 V8 with
this DDL and DML?

Here is a v8 plan using your solution. I will post another message
with the plan for MIN/MAX and group by.
Total Cost: 12.846
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
12.846
2
/-----+-----\
1 1
IXSCAN IXSCAN
( 3) ( 4)
6.42298 6.42298
1 1
| |
73 73
INDEX: DB2INST1 INDEX: DB2INST1
I I
OK, so now the question is where is the difference between your original
scenario (not the best plan) and mine (best plan).
I recommend morphing it step by step and see where you loose your way.
Obviosuly the first step is to add data and statistics.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Jun 27 '08 #4

This discussion thread is closed

Replies have been disabled for this discussion.