Question about MIN/MAX optimization

Michel Esber

Hi all,

Db2 v8 FP15 LUW .

create table T (ID varchar (24), ABC timestamp)

There is an index for (ID, ABC), allowing reverse Scans.

My application needs to determine MIN and MAX(ABC) for a given ID. We
are currently using a simple statement:

select MIN(abc), MAX(abc) from T where ID = ? for read only

Table T has 100+ million rows, and several other applications are
reading/deleting data from it. The statement above runs with UR
isolation, however it takes a very long time to complete (5-10
minutes, or more).

I have studied the access plan, and it looks OK:

Access Plan:
-----------

Total Cost: 25.6855
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
25.6855
3.99557
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
12.8262 12.8587
1.99557 2
| |
46.1442 46.1442
IXSCAN IXSCAN
( 4) ( 6)
12.8544 12.8544
2 2
| |
973678 973678
INDEX: RTM INDEX: RTM
IPSSTAT_WIN IPSSTAT_WIN
I am looking for some magic SQL or hint that will allow me to improve
this rather 'simple' query.

PS: Yes, table and indexes do have updated statistics.

Thanks in advance,

-Michel

Jun 27 '08 #1

Subscribe Post Reply

2925

Michel Esber

Hi.

>
Try this. It will probably generate one indexs can instead of two.

with temp (abc) as
* * ( select abc from T were id *= ?)
select *max (abc), min(abc) from temp

/dg

DG, DB2 optimized the original statement into two index scans.

Original Statement:
------------------
with temp (COLLECT_TIME) as
(select COLLECT_TIME
from RTM.TBL_COLLECT_PSSTAT_WIN_RTM
where MACHINE_ID= ? and COLLECT_TIME ?)
select MIN(COLLECT_TIME), MAX(COLLECT_TIME)
from temp
Optimized Statement:
-------------------
SELECT Q6.$C0, Q3.$C0
FROM
(SELECT MAX(Q2.$C0)
FROM
(SELECT Q1.COLLECT_TIME
FROM RTM.TBL_COLLECT_PSSTAT_WIN_RTM AS Q1
WHERE (:? < Q1.COLLECT_TIME) AND (Q1.MACHINE_ID = :?)) AS Q2) AS
Q3,
(SELECT MIN(Q5.$C0)
FROM
(SELECT Q4.COLLECT_TIME
FROM RTM.TBL_COLLECT_PSSTAT_WIN_RTM AS Q4
WHERE (:? < Q4.COLLECT_TIME) AND (Q4.MACHINE_ID = :?)) AS Q5) AS
Q6

Access Plan:
-----------
Total Cost: 51.2774
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
51.2774
7.9774
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
25.6383 25.6383
3.98869 3.98871
| |
448.726 448.726
IXSCAN IXSCAN
( 4) ( 6)
87.5883 87.5879
13.6267 13.6267
| |
1.48278e+06 1.48278e+06
INDEX: RTM INDEX: RTM
IPSSTAT_WIN IPSSTAT_WIN

The overall cost is higher than the original statement :(

Thanks

Jun 27 '08 #2

Michel Esber

This is the optimal plan we want. Can you reproduce it on DB2 V8 with

this DDL and DML?

Here is the plan with MIN/MAX:

Database Context:
----------------
Parallelism: None
CPU Speed: 3.778754e-07
Comm Speed: 0
Buffer Pool size: 165240
Sort Heap size: 1024
Database Heap size: 1024
Lock List size: 100
Maximum Lock List: 10
Average Applications: 1
Locks Available: 1020

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability

---------------- STATEMENT 1 SECTION 203 ----------------
QUERYNO: 1
QUERYTAG:
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: 1

Original Statement:
------------------
select MIN(ABC), MAX(ABC)
from T
where ID=?
Optimized Statement:
-------------------
SELECT Q6.$C0, Q3.$C0
FROM
(SELECT MAX(Q2.$C0)
FROM
(SELECT Q1.ABC
FROM DB2INST1.T AS Q1
WHERE (Q1.ID = :?)) AS Q2) AS Q3,
(SELECT MIN(Q5.$C0)
FROM
(SELECT Q4.ABC
FROM DB2INST1.T AS Q4
WHERE (Q4.ID = :?)) AS Q5) AS Q6

Access Plan:
-----------
Total Cost: 12.8453
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
12.8453
2
/-----+-----\
1 1
GRPBY GRPBY
( 3) ( 5)
6.42335 6.42127
1 1
| |
2.92 2.92
IXSCAN IXSCAN
( 4) ( 6)
6.42298 6.42298
1 1
| |
73 73
INDEX: DB2INST1 INDEX: DB2INST1
I I

Extended Diagnostic Information:
--------------------------------

Diagnostic Identifier: 1
Diagnostic Details: EXP0022W Index has no statistics. The index
"DB2INST1"."I" has not had runstats run on it.
This
can lead to poor cardinality and predicate
filtering estimates.

Plan Details:
-------------
1) RETURN: (Return Result)
Cumulative Total Cost: 12.8453
Cumulative CPU Cost: 124849
Cumulative I/O Cost: 2
Cumulative Re-Total Cost: 0.0064907
Cumulative Re-CPU Cost: 17176.8
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 12.8441
Estimated Bufferpool Buffers: 3

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.2.136 : special_19546
HEAPUSE : (Maximum Statement Heap Usage)
56 Pages
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
7) From Operator #2

Estimated number of rows: 1
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q7.$C1+Q7.$C0
2) NLJOIN: (Nested Loop Join)
Cumulative Total Cost: 12.8453
Cumulative CPU Cost: 124849
Cumulative I/O Cost: 2
Cumulative Re-Total Cost: 0.0064907
Cumulative Re-CPU Cost: 17176.8
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 12.8441
Estimated Bufferpool Buffers: 3

Arguments:
---------
EARLYOUT: (Early Out flag)
NONE
FETCHMAX: (Override for FETCH MAXPAGES)
IGNORE
ISCANMAX: (Override for ISCAN MAXPAGES)
IGNORE

Input Streams:
-------------
3) From Operator #3

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0

6) From Operator #5

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C0
Output Streams:
--------------
7) To Operator #1

Estimated number of rows: 1
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q7.$C1+Q7.$C0
3) GRPBY : (Group By)
Cumulative Total Cost: 6.42335
Cumulative CPU Cost: 61783.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00503426
Cumulative Re-CPU Cost: 13322.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.4226
Estimated Bufferpool Buffers: 2

Arguments:
---------
AGGMODE : (Aggregration Mode)
COMPLETE
GROUPBYC: (Group By columns)
FALSE
GROUPBYN: (Number of Group By columns)
0
JN INPUT: (Join input leg)
OUTER
ONEFETCH: (One Fetch flag)
FALSE

Input Streams:
-------------
2) From Operator #4

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
Output Streams:
--------------
3) To Operator #2

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.$C0
4) IXSCAN: (Index Scan)
Cumulative Total Cost: 6.42298
Cumulative CPU Cost: 60803.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00466394
Cumulative Re-CPU Cost: 12342.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42108
Estimated Bufferpool Buffers: 2

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
2) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.ID = :?)

2) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q1.ID = :?)
Input Streams:
-------------
1) From Object DB2INST1.I

Estimated number of rows: 73
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$+Q1.ID+Q1.ABC

Output Streams:
--------------
2) To Operator #3

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$C0
5) GRPBY : (Group By)
Cumulative Total Cost: 6.42127
Cumulative CPU Cost: 61285.6
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.000783828
Cumulative Re-CPU Cost: 2074.3
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42117
Estimated Bufferpool Buffers: 2

Arguments:
---------
AGGMODE : (Aggregration Mode)
COMPLETE
GROUPBYC: (Group By columns)
FALSE
GROUPBYN: (Number of Group By columns)
0
JN INPUT: (Join input leg)
INNER
ONEFETCH: (One Fetch flag)
TRUE

Input Streams:
-------------
5) From Operator #6

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q5.$C0(A)
Output Streams:
--------------
6) To Operator #2

Estimated number of rows: 1
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C0
6) IXSCAN: (Index Scan)
Cumulative Total Cost: 6.42298
Cumulative CPU Cost: 60803.5
Cumulative I/O Cost: 1
Cumulative Re-Total Cost: 0.00466394
Cumulative Re-CPU Cost: 12342.5
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 6.42108
Estimated Bufferpool Buffers: 2

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE

Predicates:
----------
3) Start Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q4.ID = :?)

3) Stop Key Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04

Predicate Text:
--------------
(Q4.ID = :?)
Input Streams:
-------------
4) From Object DB2INST1.I

Estimated number of rows: 73
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.ABC(A)+Q4.$RID$+Q4.ID
Output Streams:
--------------
5) To Operator #5

Estimated number of rows: 2.92
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q5.$C0(A)
Objects Used in Access Plan:
---------------------------

Schema: DB2INST1
Name: T
Type: Table (reference only)

Schema: DB2INST1
Name: I
Type: Index
Time of creation:
2008-04-26-12.06.48.096459
Last statistics update:
Number of columns: 2
Number of rows: 73
Width of rows: -1
Number of buffer pool pages: 1
Distinct row values: No
Tablespace name:
IOSTATDATIDX
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Source for statistics: Single Node
Prefetch page count: 128
Container extent page count: 32
Index clustering statistic: 80.000000
Index leaf pages: 2
Index tree levels: 2
Index full key cardinality: 25
Index first key cardinality: 25
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: 2
Index page density: 100
Index avg sequential pages: -1
Index avg gap between sequences:-1
Index avg random pages: -1
Fetch avg sequential pages: -1
Fetch avg gap between sequences:-1
Fetch avg random pages: -1
Index RID count: 0
Index deleted RID count: 0
Index empty leaf pages: 0
Base Table Schema: DB2INST1
Base Table Name: T
Columns in index:
ID
ABC

Base Table For Index Not Already Shown:
---------------------------------------

Schema: DB2INST1
Name: T
Time of creation:
2008-04-26-12.06.12.993169
Last statistics update:
Number of columns: 2
Number of rows: -1
Number of pages: -1
Number of pages with rows: -1
Tablespace name: IOSTATDATIDX
Tablespace overhead: 6.000000
Tablespace transfer rate: 0.400000
Prefetch page count: 128
Container extent page count: 32
Table overflow record count: -1

Both plans are very similar. In terms of performance (CPU/Disk IO),
how do they compare ?

Thanks,

Jun 27 '08 #3

Serge Rielau

Michel Esber wrote:

>This is the optimal plan we want. Can you reproduce it on DB2 V8 with
this DDL and DML?

Here is a v8 plan using your solution. I will post another message
with the plan for MIN/MAX and group by.
Total Cost: 12.846
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
12.846
2
/-----+-----\
1 1
IXSCAN IXSCAN
( 3) ( 4)
6.42298 6.42298
1 1
| |
73 73
INDEX: DB2INST1 INDEX: DB2INST1
I I

OK, so now the question is where is the difference between your original
scenario (not the best plan) and mine (best plan).
I recommend morphing it step by step and see where you loose your way.
Obviosuly the first step is to add data and statistics.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Jun 27 '08 #4

Similar topics

frozenset question

by: Will McGugan | last post by:

Hi, Are there any benefits in using a frozenset over a set, other than it being immutable? Will McGugan -- http://www.willmcgugan.com "".join({'*':'@','^':'.'}.get(c,0) or...

Python

Where to include header files / A question of style?

by: Exits Funnel | last post by:

Hello, I'm a little confused about where I should include header files and was wondering whether there was some convention. Imagine I've written a class foo and put the definition in foo.h and...

C / C++

Performance Question: Variablize or reference direct???

by: Randell D. | last post by:

Folks, I have a Javascript performance question that I might have problems explaining... In PHP, better performance can be obtained dealing directly with a variable, as opposed to an element...

Javascript

question on const

by: user | last post by:

Hello, Here is the program #include stdio int main(void) { const int num = 100; int *ip;

C / C++

C++ compiler "return" behavior (guru question ;)

by: Axel Bock | last post by:

Hi all, I am trying to get my head around what happens if I return a class object from a function. It seems C++ (MinGW) does not invoke the copy constructor if I do something like this: ...

C / C++

Double increment question.

by: Robbie Hatley | last post by:

Hello, group. I've been doing too much C++ programming lately, and I'm starting to become rusty at some aspects of the C way of doing things, esp. efficient low-level data copies. ...

C / C++

question about .NET Runtime Optimization & ASP.NET State services

by: Adam Sandler | last post by:

Hello, Using VWD 2005 here... I've noticed I've got .NET services on my system here: the .NET Runtime Optimization service and ASP.NET State service. I've noticed when the ASP.NET State service...

ASP.NET

A Python 3000 Question

by: brad | last post by:

Will len(a_string) become a_string.len()? I was just reading http://docs.python.org/dev/3.0/whatsnew/3.0.html One of the criticisms of Python compared to other OO languages is that it isn't OO...

Python

simple code performance question

by: galiorenye | last post by:

Hi, Given this code: A** ppA = new A*; A *pA = NULL; for(int i = 0; i < 10; ++i) { pA = ppA; //do something with pA

C / C++

another very basic question :-)

by: sagi | last post by:

Hello erveryone,I am a newcomer here and the word of c. Here I have a question confused me a lot that when I read codes I found some declaration like that: "int regcomp(regex_t *restrict...

C / C++

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing