============================================
Collection of Statistics for Cost-Based Optimization/DBMS_STATS vs. ANALYZE
The cost-based approach relies on statistics and if the cost-based
Approach is used , then statistics should be gernerated for all tables, clusters, and all types of indexes accessed by SQL statements. If the size and data distribution of your tables change frequently, then generate statistics regularly to ensure the statistics accurately represent the data in the tables.
Select * from test partition;
This uses global statistics but no predicate
select * from test s where s.amount_of_shift > 1000;
This uses a predicate for more than one partition and may use global statistics
select * from test partition (sep2009) s where s.amount_of_shift > 1000;
This uses global statistics and predicate to one partition.
Gathering global statistics with the DBMS_STATS package is more useful because ANALYZE always runs serially. DBMS_STATS can run in serial or parallel. Whenever possible, DBMS_STATS calls a parallel query to gather statistics with the specified degree of parallelism; otherwise, it calls a serial query or the ANALYZE statement. Index statistics are not gathered in parallel.
ANALYZE gathers statistics for the individual partitions and then calculates the global statistics from the partition statistics. DBMS_STATS can gather separate statistics for each partition as well as global statistics for the entire table or index. Depending on the SQL statement being optimized, the optimizer may choose to use either the partition (or subpartition) statistics or the global statistics.
Expand|Select|Wrap|Line Numbers
- CREATE TABLE PARTTAB(
- ordid NUMBER,
- PARTCOL DATE,
- DETAILS NUMBER,
- AMOUNT NUMBER)
- PARTITION BY RANGE(PARTCOL)
- SUBPARTITION BY HASH(DETAILS) SUBPARTITIONS 2
- (PARTITION q1 VALUES LESS THAN(TO_DATE('01-04-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE1,
- PARTITION q2 VALUES LESS THAN(TO_DATE('01-07-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE2,
- PARTITION q3 VALUES LESS THAN(TO_DATE('01-10-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE3,
- PARTITION q4 VALUES LESS THAN(TO_DATE('01-12-2009','DD-MM-YYYY')) TABLESPACE TBLSPCE4
- );
- A local non prefixed index will be associated with it:
- CREATE INDEX IDX_PARTTAB ON PARTTAB (ordid) LOCAL;
- The PARTTAB table has been populated before to start the following examples.
- GATHER_TABLE_STATS
- -------------------
- Collects table, column, and index statistics.
- Compute, serial mode, without histograms, Default granularity.
- ============================================================
- SQL> execute dbms_stats.gather_table_stats(-
- >ownname => 'test',-
- >tabname => 'PARTTAB',-
- >partname => null,- --> Gather stats on all partitions.
- >estimate_percent => null,- --> Compute mode
- >block_sample => false,- --> Default value. No Sense in Compute mode
- >method_opt => 'FOR ALL COLUMNS SIZE 1',- --> Table and columns statistics. No histogram generated
- >degree => null,- --> default parallel degree based on DOP set on PARTTAB.
- >granularity => 'default',- --> Gather Global and Partition statistics
- >cascade => true ,- --> with index stats generated
- >stattab => null,- --> The statistics will be stored in the dictionary.
- >statid => null,-
- >statown => null);
- PL/SQL procedure successfully completed.
- Index Statistics won't be calculated by default if CASCADE=>TRUE .
- SQL> select table_name, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, AVG_ROW_LEN
- , GLOBAL_STATS, USER_STATS, sample_size from user_tables
- where table_name = 'PARTTAB';
- TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN GLOBAL_STATS USER_STATS SAMPLE_SIZE
- ---------- -------- ------ ------------ --------- ----------- ------------ ---------- -----------
- PARTTAB 400 8 0 0 11 YES NO 400
- Now that the statistics have been updated.
- The column GLOBAL_STATS has been also initialized.
- SQL> select partition_name "Partition", NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, AVG_ROW_LEN,
- SAMPLE_SIZE, global_stats, user_stats
- from user_tab_partitions
- where table_name = 'PARTTAB'
- order by partition_position
- /
- Partition NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN GLOBAL_STATS USER_STATS SAMPLE_SIZE
- ---------- -------- ------ ------------ ---------- ----------- ------------ ---------- -----------
- Q1 100 2 0 0 11 YES NO 100
- Q2 100 2 0 0 11 YES NO 100
- Q3 100 2 0 0 11 YES NO 100
- Q4 100 2 0 0 11 YES NO 100
- The statistics are again obtained at the table level with the GLOBAL_STATS .
- SQL> select partition_name "Partition", subpartition_name "Subpartition", NUM_ROWS, BLOCKS, EMPTY_BLOCKS
- SAMPLE_SIZE, global_stats, user_stats
- from user_tab_subpartitions
- where table_name = 'PARTTAB'
- order by partition_name, subpartition_position
- /
- Partition Subpartition NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN SAMPLE_SIZE GLOBAL_STATS USER_STATS
- ---------- ------------- -------- ------ ------------ --------- ----------- ----------- ------------ ----------
- Q1 SYS_SUBP10365 NO NO
- Q1 SYS_SUBP10366 NO NO
- Q2 SYS_SUBP10367 NO NO
- Q2 SYS_SUBP10368 NO NO
- Q3 SYS_SUBP10369 NO NO
- Q3 SYS_SUBP10370 NO NO
- Q4 SYS_SUBP10371 NO NO
- Q4 SYS_SUBP10372 NO NO
- The statistics aren't computed at the subpartition level which is in phase
- with the 'DEFAULT' granularity.
- SQL>select COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED from
- user_tab_col_statistics where table_name = 'PARTTAB'
- /
- COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED
- --------------- ------------ ---------- ---------- ----------- -------------
- ORDID 0 0 400 1 12-DEC-02
- PARTCOL 4 .25 0 1 12-DEC-02
- DETAILS 100 .01 0 1 12-DEC-02
- AMOUNT 0 0 400 1 12-DEC-02
- The NUM_BUCKETS is set to 1 as there is no histogram generation. but, the column
- statistics are well initialized
- The same result is showed below on each partition columns:
- SQL>select partition_name, COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED from user_part_col_statistics
- where table_name = 'PARTTAB'
- /
- PARTITION_ COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED
- ---------- --------------- ------------ ---------- ---------- ----------- -------------
- Q1 ORDID 0 0 100 1 12-DEC-07
- Q1 PARTCOL 1 1 0 1 12-DEC-07
- Q1 DETAILS 100 .01 0 1 12-DEC-07
- Q1 AMOUNT 0 0 100 1 12-DEC-07
- Q2 ORDID 0 0 100 1 12-DEC-07
- Q2 PARTCOL 1 1 0 1 12-DEC-07
- Q2 DETAILS 100 .01 0 1 12-DEC-07
- Q2 AMOUNT 0 0 100 1 12-DEC-07
- Q3 ORDID 0 0 100 1 12-DEC-07
- Q3 PARTCOL 1 1 0 1 12-DEC-07
- Q3 DETAILS 100 .01 0 1 12-DEC-07
- Q3 AMOUNT 0 0 100 1 12-DEC-07
- Q4 ORDID 0 0 100 1 12-DEC-07
- Q4 PARTCOL 1 1 0 1 12-DEC-07
- Q4 DETAILS 100 .01 0 1 12-DEC-07
- Q4 AMOUNT 0 0 100 1 12-DEC-07
- the statistics loaded for subpartitions of the PARTTAB table are displayed below:
- SQL> select subpartition_name "Subpartition", COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_NULLS,
- NUM_BUCKETS from dba_subpart_col_statistics where table_name = 'PARTTAB'
- order by column_name
- /
- Subpartition COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
- --------------- --------------- ------------ ---------- ---------- -----------
- SYS_SUBP10365 PARTCOL
- SYS_SUBP10365 ORDID
- SYS_SUBP10365 DETAILS
- SYS_SUBP10365 AMOUNT
- SYS_SUBP10366 PARTCOL
- SYS_SUBP10366 ORDID
- SYS_SUBP10366 DETAILS
- SYS_SUBP10366 AMOUNT
- SYS_SUBP10367 PARTCOL
- SYS_SUBP10367 ORDID
- SYS_SUBP10367 DETAILS
- SYS_SUBP10367 AMOUNT
- SYS_SUBP10368 PARTCOL
- SYS_SUBP10368 ORDID
- SYS_SUBP10368 DETAILS
- SYS_SUBP10368 AMOUNT
- SYS_SUBP10369 PARTCOL
- SYS_SUBP10369 ORDID
- SYS_SUBP10369 DETAILS
- SYS_SUBP10369 AMOUNT
- SYS_SUBP10370 PARTCOL
- SYS_SUBP10370 ORDID
- SYS_SUBP10370 DETAILS
- SYS_SUBP10370 AMOUNT
- SYS_SUBP10371 PARTCOL
- SYS_SUBP10371 ORDID
- SYS_SUBP10371 DETAILS
- SYS_SUBP10371 AMOUNT
- SYS_SUBP10372 PARTCOL
- SYS_SUBP10372 ORDID
- SYS_SUBP10372 DETAILS
- SYS_SUBP10372 AMOUNT
Partitioned objects can contain more than one sets of statistics. This is because statistics can be generated for the entire object, partition, or subpartition.
Thanks & Regards,
Vinod Sadanandan
Oracle DBA