Bytes IT Community

Query tuning

Raj
I have an MDC index on the Big_A.Dt column. The following query always does a
table scan.

SELECT Key,
Cd,
Dt,
SUM(Big_A [20 - other columns] )
FROM
( SELECT Big_A.Key ,
small_3.Cd,
small_1.Dt,
Big_A [20 - other columns]
FROM D.SALES AS Big_A
INNER JOIN S.tab2 AS small_1
ON Big_A.Key = small_1.Ke_y
AND Big_A.Q_Cd = small_1.QCd
AND Big_A.Dt BETWEEN small_1.From_Dt AND
small_1.End_Dt
AND small_1.Store_Ind = 'Y'
LEFT OUTER JOIN S.tab3 AS small_2
ON small_1.Key = small_2.Key
AND Big_A.U_Key = small_2.U_Key
AND small_1.Q_Cd = small_2.Q_Cd
AND small_1.Dt = small_2.Dt
INNER JOIN D.tab4 AS small_3
ON Big_A.U_Key = small_3.U_Key
INNER JOIN D.tab5 AS small_4
ON small_1.Chain_Cd = small_4.Chain_Cd
AND small_1.Qty_Qual_Cd = small_4.Qty_Qual_Cd
) AS SAndP
GROUP BY
Key,
Cd,
Dt

1. The Big_A table has 80 months of data (10 billion records).

2. The condition
AND Big_A.Dt BETWEEN small_1.From_Dt AND small_1.End_Dt
translates to
AND Big_A.Dt BETWEEN 05 AND 07 [only 2 months of data in the current year]

The query is only selecting 2 months of data, so why isn't the optimizer
using the MDC on the Big_A.Dt column?
If I add a WHERE predicate with the following condition, the optimizer uses
the index on the date column and the query executes much faster:
Big_A.Dt >= 05
Can I rewrite the query to make use of the index?

The SUM in the outer query results in a sort; how can I avoid it?

Any help greatly appreciated!!
Thanks,
Raj

Jul 31 '06 #1
10 Replies


Hi Raj,

Where are you appending the dummy (for lack of a better term)
predicate?
Is it in the outer query or in the inner one?
Also, why do you think the SUM is causing the sort? I have a feeling it's
the GROUP BY, not the SUM, that's causing it.

Just wondering - is this an online transactional system or an OLAP/data
warehouse kind of app? What is the "maximum query degree of parallelism",
and have you enabled "intra-partition parallelism"?

It might help to attach/send the query explain plan and the table DDLs.

-- Jayesh

Jul 31 '06 #2

You'll need to post an explain. You also might include approximate row
counts for all the tables.

It's not surprising you're getting a tablespace scan. The optimizer uses
information in the query to determine the access plan, and there are no
predicates in the query that could be used to limit the rows retrieved. Your
addition of the date predicate, and the use of an index when it's present,
is verification of this.

You'll want to perform all of the table joins except D.SALES before
accessing the large table. Make sure your statistics are current for all of
the small tables, and see if optimizing for a small number of rows retrieved
makes a difference. Try examining the result of the small-table joins to
help determine what you need to look for in the large table.
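That approach - resolving the small-table joins first and only then probing the large table - can be sketched with a toy example. The schema below is illustrative (SQLite via Python), not the real DDL from this thread:

```python
import sqlite3

# Illustrative miniature of the thread's shape: one big fact table,
# two small dimension tables. Names are made up, not the real DDL.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE big_a (key INTEGER, dt INTEGER, amt INTEGER)")
cur.execute("CREATE TABLE small_1 (ke_y INTEGER, from_dt INTEGER, end_dt INTEGER)")
cur.execute("CREATE TABLE small_4 (ke_y INTEGER, chain_cd TEXT)")
cur.executemany("INSERT INTO big_a VALUES (?, ?, ?)",
                [(k, m, m * 10) for k in (1, 2, 3) for m in range(1, 13)])
cur.executemany("INSERT INTO small_1 VALUES (?, ?, ?)", [(1, 5, 7), (2, 6, 6)])
cur.executemany("INSERT INTO small_4 VALUES (?, ?)", [(1, "A"), (2, "B")])

# Original shape: the big table participates in every join directly.
direct = cur.execute("""
    SELECT b.key, b.dt
    FROM big_a b
    JOIN small_1 s1 ON b.key = s1.ke_y
                   AND b.dt BETWEEN s1.from_dt AND s1.end_dt
    JOIN small_4 s4 ON s1.ke_y = s4.ke_y
    ORDER BY b.key, b.dt""").fetchall()

# Rewritten shape: join the small tables first (a CTE here; in DB2 this
# could instead be a materialized temp table), then probe the big table once.
prejoined = cur.execute("""
    WITH small AS (
        SELECT s1.ke_y, s1.from_dt, s1.end_dt
        FROM small_1 s1
        JOIN small_4 s4 ON s1.ke_y = s4.ke_y)
    SELECT b.key, b.dt
    FROM big_a b
    JOIN small s ON b.key = s.ke_y
                AND b.dt BETWEEN s.from_dt AND s.end_dt
    ORDER BY b.key, b.dt""").fetchall()

assert direct == prejoined  # same answer; only the join shape changes
```

Whether the optimizer actually exploits such a rewrite depends on statistics and costing; the point here is only that the two shapes are equivalent.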

I also don't see the purpose of S.tab3. You are doing a left outer join
of the prior result with it (keeping all existing rows) but not keeping
any columns from it. The following inner joins with tab4 and tab5 also
don't use any columns from tab3.

Phil Sherman

Jul 31 '06 #3

Raj
Jayesh,
Thanks for the reply. I have the WHERE predicate in the inner query.
This is a data warehouse system with intra- and inter-partition parallelism.
Here is the output from explain:

******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 08.02.1

Database Context:
----------------
Parallelism: Inter-Partition Parallelism
CPU Speed: 1.133626e-06
Comm Speed: 100
Buffer Pool size: 68432
Sort Heap size: 2500
Database Heap size: 12288
Lock List size: 1024
Maximum Lock List: 20
Average Applications: 10
Locks Available: 13107
Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability
_________________________________________________________________________

Access plan (each operator listed with its estimated output rows, total
cost, and I/O; children are indented under their parent):

RETURN (1)
  DTQ (2)   rows 4.82085e+08   cost 3.07501e+07   I/O 4.54084e+06
    GRPBY (3)   rows 9.6417e+07   cost 3.00837e+07   I/O 4.54084e+06
      TBSCAN (4)   rows 1.54259e+08   cost 3.004e+07   I/O 4.54084e+06
        SORT (5)   rows 1.54259e+08   cost 2.765e+07   I/O 3.62263e+06
          HSJOIN (6)   rows 1.54259e+08   cost 1.58716e+07   I/O 2.70442e+06
            HSJOIN (7)   rows 1.54259e+08   cost 1.58246e+07   I/O 2.70438e+06
              HSJOIN (8)   rows 1.54259e+08   cost 2.26877e+06   I/O 669715
                HSJOIN (9)   rows 1.73541e+08   cost 2.20989e+06   I/O 669714
                  TBSCAN (10)   rows 1.58698e+08   cost 1.33192e+06   I/O 667282
                    TABLE: DW.SALE2004   (1.58698e+08 rows)
                  DTQ (11)   rows 25316.6   cost 5660.97   I/O 2432
                    TBSCAN (12)   rows 126583   cost 5589.72   I/O 2432
                      TABLE: S.tab2   (126583 rows)
                BTQ (13)   rows 336   cost 8.8796   I/O 1
                  FETCH (14)   rows 56   cost 8.52482   I/O 1
                    IXSCAN (15)   rows 56   cost 0.154681   I/O 0
                      INDEX: DPPK   (56)
                    TABLE: S.tab2
              DTQ (16)   rows 1.20128e+06   cost 131499   I/O 54144
                TBSCAN (17)   rows 6.0064e+06   cost 128426   I/O 54144
                  TABLE: S.tab3   (6.0064e+06 rows)
            BTQ (18)   rows 17094   cost 66.8375   I/O 40
              TBSCAN (19)   rows 2849   cost 57.8263   I/O 40
                TABLE: S.tab5   (2849 rows)

_________________________________________________________________________
10) TBSCAN: (Table Scan)
Cumulative Total Cost: 1.33192e+06
Cumulative CPU Cost: 1.36721e+11
Cumulative I/O Cost: 667282
Cumulative Re-Total Cost: 1.33192e+06
Cumulative Re-CPU Cost: 1.36721e+11
Cumulative Re-I/O Cost: 667282
Cumulative First Row Cost: 25.9173
Cumulative Comm Cost: 0
Cumulative First Comm Cost: 0
Estimated Bufferpool Buffers: 667280

Arguments:
---------
BLKLOCK : (Block Lock intent)
INTENT SHARE
JN INPUT: (Join input leg)
OUTER
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
SEQUENTIAL
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Input Streams:
-------------
1) From Object D.Big_A

Estimated number of rows:
1.58698e+08
Partition Map ID: 5
Partitioning: (MULT )

Multiple Partitions
Number of columns: 6
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q3.$RID$+Q3.SALE_AMT+Q3.SALE_QTY
+Q3.U_KEY+Q3.DT+Q3.KEY

Partition Column Names:
----------------------
+1: Q3.KEY
Output Streams:
--------------
2) To Operator #9

Estimated number of rows:
1.58698e+08
Partition Map ID: 5
Partitioning: (MULT )

Multiple Partitions
Number of columns: 6
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q3.$RID$+Q3.SALE_AMT+Q3.SALE_QTY
+Q3.U_KEY+Q3.DT+Q3.KEY

Partition Column Names:
----------------------
+1: Q3.KEY
_________________________________________________________________________

Thanks,
Raj

Jul 31 '06 #4


Raj
Phil,
Thanks for the reply; I attached the explain in my previous post.
All the tables have updated statistics (with distribution and detailed
indexes). The cardinalities are:
D.SALES AS Big_A - 793,491,665
S.tab2 AS small_1 - 126,583
S.tab3 AS small_2 - 6,006,402
D.tab4 AS small_3 - 17,417
D.tab5 AS small_4 - 127

Shouldn't the optimizer use the BETWEEN clause to filter the big table?
Both of the following statements result in the exact same counts, but the
second query uses the index and is much faster...

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
FROM D.SALES AS Big_A
INNER JOIN S.tab2 AS small_1
ON Big_A.Key = small_1.Ke_y
AND Big_A.Q_Cd = small_1.QCd
AND Big_A.Dt BETWEEN small_1.From_Dt AND
small_1.End_Dt
AND small_1.Store_Ind = 'Y'
LEFT OUTER JOIN S.tab3 AS small_2
ON small_1.Key = small_2.Key
AND Big_A.U_Key = small_2.U_Key
AND small_1.Q_Cd = small_2.Q_Cd
AND small_1.Dt = small_2.Dt
INNER JOIN D.tab4 AS small_3
ON Big_A.U_Key = small_3.U_Key
INNER JOIN D.tab5 AS small_4
ON small_1.Chain_Cd = small_4.Chain_Cd
AND small_1.Qty_Qual_Cd = small_4.Qty_Qual_Cd
and
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
FROM D.SALES AS Big_A
INNER JOIN S.tab2 AS small_1
ON Big_A.Key = small_1.Ke_y
AND Big_A.Q_Cd = small_1.QCd
AND Big_A.Dt BETWEEN small_1.From_Dt AND
small_1.End_Dt
AND small_1.Store_Ind = 'Y'
LEFT OUTER JOIN S.tab3 AS small_2
ON small_1.Key = small_2.Key
AND Big_A.U_Key = small_2.U_Key
AND small_1.Q_Cd = small_2.Q_Cd
AND small_1.Dt = small_2.Dt
INNER JOIN D.tab4 AS small_3
ON Big_A.U_Key = small_3.U_Key
INNER JOIN D.tab5 AS small_4
ON small_1.Chain_Cd = small_4.Chain_Cd
AND small_1.Qty_Qual_Cd = small_4.Qty_Qual_Cd
where Big_A.Dt >= 05

Thanks,
Raj

Jul 31 '06 #6


Your second query supplies a "fixed" starting point for the scan which
is why it's running faster.

Big_A.Dt = 05 indicates you are probably storing a month number in the
column. This gives you a cardinality of 12, which, when combined with the
BETWEEN clause and the probability of multiple rows, should make the
optimizer think it will retrieve enough of the sales table that scanning it
would be faster than retrieving individual rows. When you provide the
Dt >= 5 predicate, you automatically restrict out 1/3 of the rows, which
makes using the index a good deal. If you use a BETWEEN predicate and
supply data values, it should further restrict the scan and show additional
improvement.
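As a back-of-the-envelope check of the 1/3 figure (assuming, as above, that the column holds month numbers 1-12 spread roughly uniformly):

```python
# Hypothetical uniform spread of month values 1..12 in the Dt column.
months = list(range(1, 13))

removed = [m for m in months if m < 5]     # filtered out by Dt >= 05
kept = [m for m in months if 5 <= m <= 7]  # kept by BETWEEN 05 AND 07

print(len(removed) / len(months))  # a third of the rows are excluded
print(len(kept) / len(months))     # the BETWEEN keeps only a quarter
```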

You haven't said anything about the environment in which this query is
run. If it's from an application program, then you may be able to
preselect the date limits and provide them as predicates in the query.

Are the tab2.From_Dt and tab2.End_Dt values of 5 and 7 a happenstance of
today's data, or is the date range always limited to just a few months? If
it is, you could try this (no guarantee that it'll improve anything):

where Big_A.dt between (select min(From_Dt) from tab2)
                   and (select max(End_Dt) from tab2)

Where predicates and join predicates can be evaluated at different
points in the internal processing of the query. That's why adding the
WHERE predicate improved the performance.
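To see that appending a derived WHERE predicate of this kind cannot change the result set - only the access path - here is a toy check (illustrative SQLite schema, not the thread's real DDL):

```python
import sqlite3

# Miniature of the thread's join: big fact table, small dimension with a
# date range. Names and data are illustrative.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE big_a (key INTEGER, dt INTEGER, amt INTEGER)")
cur.execute("CREATE TABLE small_1 (ke_y INTEGER, from_dt INTEGER, end_dt INTEGER)")
cur.executemany("INSERT INTO big_a VALUES (?, ?, ?)",
                [(k, m, 10 * m) for k in (1, 2) for m in range(1, 13)])
cur.execute("INSERT INTO small_1 VALUES (1, 5, 7)")

join_only = cur.execute("""
    SELECT b.key, b.dt, b.amt
    FROM big_a b
    JOIN small_1 s ON b.key = s.ke_y
                  AND b.dt BETWEEN s.from_dt AND s.end_dt
    ORDER BY b.dt""").fetchall()

# Same join plus a redundant, locally derived WHERE predicate on the fact
# table -- the technique suggested above to hand the optimizer a sargable
# starting point on the clustering column.
with_where = cur.execute("""
    SELECT b.key, b.dt, b.amt
    FROM big_a b
    JOIN small_1 s ON b.key = s.ke_y
                  AND b.dt BETWEEN s.from_dt AND s.end_dt
    WHERE b.dt >= (SELECT MIN(from_dt) FROM small_1)
    ORDER BY b.dt""").fetchall()

assert join_only == with_where  # the extra predicate filters nothing away
```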
Phil Sherman

Jul 31 '06 #8

Raj
Hi Phil,
Thanks a lot for the reply,
Big_A.Dt is a date column

The WHERE predicate actually was (it has a card of 60):
where month(Big_A.Dt) >= 05

I tried the following:
where Big_A.dt between (select min(From_Dt) from tab2)
                   and (select max(End_Dt) from tab2)
and
where Big_A.dt >= (select min(From_Dt) from tab2)

These modifications did not make any difference to the optimizer.

Would it be better to rewrite the query using a subquery to filter the
sales table before the left outer join?
Thanks,
Raj

Aug 1 '06 #9

Hi Raj,

I agree with Phil's point that we need an entry point to "filter" out data
in the BIG_A table. However, the min() and max() most likely did not help
because those values were probably the same as the complete range of dates
in BIG_A. The optimizer knows this from the (detailed) stats that you have
collected, and therefore will not use your MDC index.

From the portion of the explain plan that you have attached, it seems that
the driving table is small_2 or tab_3, since it seems to have joins to
BIG_A, tab_2 and tab_5.

Is there any way for you to try other combinations for the
cluster/dimension columns for BIG_A? I drew a rough query graph and have a
feeling that BIG_A.key or BIG_A.Q_CD might be a better column to cluster
on. Also, I would add the following redundant join predicates to the query:
1) BIG_A.key = small_2.key (since BIG_A.key = small_1.key and
small_1.key = small_2.key)
2) BIG_A.Q_CD = small_2.Q_CD (since BIG_A.Q_CD = small_1.Q_CD and
small_1.Q_CD = small_2.Q_CD)
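The redundancy is safe because both predicates are implied by transitivity through small_1; here is a toy check (again with an illustrative schema, not the real DDL) that spelling them out does not change the answer, even across the left outer join:

```python
import sqlite3

# Miniature of the thread's join chain: big_a joins small_1, which left
# outer joins small_2 on the same key/code pair. Data is illustrative.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE big_a (key INTEGER, q_cd INTEGER, amt INTEGER)")
cur.execute("CREATE TABLE small_1 (key INTEGER, q_cd INTEGER)")
cur.execute("CREATE TABLE small_2 (key INTEGER, q_cd INTEGER, note TEXT)")
cur.executemany("INSERT INTO big_a VALUES (?, ?, ?)",
                [(1, 10, 100), (2, 20, 200)])
cur.executemany("INSERT INTO small_1 VALUES (?, ?)", [(1, 10), (2, 20)])
cur.execute("INSERT INTO small_2 VALUES (1, 10, 'x')")

base = cur.execute("""
    SELECT b.key, s2.note
    FROM big_a b
    JOIN small_1 s1 ON b.key = s1.key AND b.q_cd = s1.q_cd
    LEFT JOIN small_2 s2 ON s1.key = s2.key AND s1.q_cd = s2.q_cd
    ORDER BY b.key""").fetchall()

# Same query with the transitively implied predicates spelled out; the
# results are identical, but an optimizer now also sees a direct join
# condition between big_a and small_2.
redundant = cur.execute("""
    SELECT b.key, s2.note
    FROM big_a b
    JOIN small_1 s1 ON b.key = s1.key AND b.q_cd = s1.q_cd
    LEFT JOIN small_2 s2 ON s1.key = s2.key AND s1.q_cd = s2.q_cd
                        AND b.key = s2.key AND b.q_cd = s2.q_cd
    ORDER BY b.key""").fetchall()

assert base == redundant  # the extra ON conditions change nothing
```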
Aug 1 '06 #10

Raj
Hi Jayesh,

Thanks for the reply,

BIG_A.key and BIG_A.Q_CD have very high cardinality, about 1 million, and I
did not have any luck reducing the cardinality using a generated column.
The Dt column has a cardinality of 600, and I think [I might be wrong] it
is a good choice.

I modified the query as follows and it seems to be using the index (but it
has higher estimated timerons than the hard-coded Dt query posted
earlier...):

FROM
(
select * from
D.SALES
where Dt >= (select min(From_Dt) from S.tab2)
) as Big_A
INNER JOIN S.tab2 AS small_1
ON Big_A.Key = small_1.Ke_y
AND Big_A.Q_Cd = small_1.QCd
AND Big_A.Dt BETWEEN small_1.From_Dt AND
small_1.End_Dt

Thanks,
Raj

Aug 1 '06 #11
