Oracle Histograms Simplified: Part 2

In my previous post Oracle Histograms Simplified: Part 1, I focused on what is histogram & why it is used. In this post I will explain different ways to create & drop histograms.

HISTOGRAM CREATION:

Creation of histogram is always depends on parameter method_opt that you specify during gathering statstics.

Typical syntax for method_opt is as follows:

First column part of clause has different options as given below:


1. ALL COULMNS :

As name suggests oracle considers all columns on table for histogram creation.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>’,TABNAME=>'<TABLE_NAME>’,METHOD_OPT => ‘FOR ALL COLUMNS SIZE AUTO’,CASCADE = TRUE);


2. ALL INDEXED COLUMNS :

Oracle will consider only columns with indexes for histogram creation.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE AUTO',CASCADE = TRUE);

3. ALL HIDDEN COLUMNS :

Oracle will cosider only hidden columns for histogram creation.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL HIDDEN COLUMNS SIZE AUTO',CASCADE = TRUE);

4. COLUMNS <COLUMN NAMES SEPERATED BY COMMA> :

Oracle will consider only given columns for histogram creation.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR COLUMNS COLUMN1,COLUMN2 SIZE AUTO',CASCADE = TRUE);

Moving to next portion of syntax, which lets you controls histogram creation and its type.

Size clause has different options as given below:


1. INTEGER:

You can specify number of buckets. It must be in range of 1 to 254. 1 denotes no histogram with base column statistics (i.e. minimal information required for the Optimizer) while 254 is highest allowable value.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL COLUMNS SIZE 10',CASCADE = TRUE);

2. AUTO:

This most commonly used size in real time scenarios. The basic process is as follows:

  • You run DBMS_STATS for a table with the METHOD_OPT parameter set to the default SIZE AUTO.
  • You query the table.
  • The database keeps track of the predicates in the preceding query and updates the data dictionary table SYS.COL_USAGE$.
  • Then you run DBMS_STATS again, DBMS_STATS takes into consideration data in SYS.COL_USAGE$ to determine which columns require histograms based on queries ran on table.

This method has following drawbacks:

  • If you gather stats with auto option & do not query table at all, then Oracle will not create any histogram on column.
  • If queries are constantly being changed in particular environment, so even data in table almost the same, can cause changes in execution plans for the queries.
EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO',CASCADE = TRUE);

3. REPEAT :

It will just collect histograms on columns which already have histograms.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL COLUMNS SIZE REPEAT',CASCADE = TRUE);

4. SKEWONLY :

Oracle collects histograms on the columns with only skewed data.

EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY',CASCADE = TRUE);

HISTOGRAM DELETION:

There are different methods for dropping histograms on table columns:


To completely delete all types of column statistics plus histograms for a column including minimal statistics:

EXEC DBMS_STATS.DELETE_COLUMN_STATS(OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>','<COLUMN_NAME>');

This is generally not recommended as the optimizer needs at least the minimal (size 1) information for creating optimal execution plans.


To delete histograms from a column without touching base column statistics i.e. minimal information required for the Optimizer (This is recommended method for pre 11g databases).

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',METHOD_OPT=>'FOR COLUMNS <COLUMN_NAME> SIZE 1');

From Oracle 11g onwards, there is new syntax to delete the histogram for a column while still leaving behind the base column statistics:

EXEC DBMS_STATS.DELETE_COLUMN_STATS(OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',COLNAME=>'<COLUMN_NAME>',COL_STAT_TYPE=>'HISTOGRAM');

Additionally we can delete histograms for a partition:

EXEC DBMS_STATS.DELETE_COLUMN_STATS('<OWNER>',TABNAME=>'<TABLE_NAME>',COLNAME=>'<COLUMN_NAME>',PARTNAME=>'<PARTITION_NAME>', COL_STAT_TYPE=>'HISTOGRAM');

and to delete column histograms for the table and all its partitions:

 EXEC DBMS_STATS.DELETE_COLUMN_STATS(OWNNAME=>'<OWNER>',TABNAME=>'<TABLE_NAME>',COLNAME=>'<COLUMN_NAME>',CASCADE_PARTS=>TRUE,COL_STAT_TYPE=>'HISTOGRAM');

Hope so u will find this post very useful:-)

Visit Oracle Histograms Simplified: Part 1, in which I focussed on what is histogram & why it is used.

Cheers

Regards,
Adityanath

Advertisements

Oracle Histograms Simplified: Part 1

Oracle Histogram is very nice feature to help CBO to make right decision while constructing execution plan. But many of DBA’s do not use it, assuming it is complex concept to use. In next few posts, I will try to simplify it, so that we can achieve maximum benefit from it.

What it Histogram & what is its purpose:

A histogram is a special type of column statistic that provides more detailed information about the data distribution in a table column. In simple words, it is a feature in CBO and it helps optimizer to determine how data are distributed/skewed within column. Predicates with skewed data are good candidates for histogram creation.

Basically CBO assumes uniform data distribution of data inside column across the distinct values. For the columns that contain skewed data, histogram helps to generate more accurate execution plans.

Below example will explain you significance of histograms:


I have a table ADI.SALES_ORDER with 100000 rows with index SALES_ORDER_N1 on column GENERATED.

GENERATED column has 4 distinct values: A, B, Y, N.  CBO will assume uniform data distribution across these values. So any query with column GENERATED as predicate will go for FTS, as CBO will assume it has to select 25% data from table.

But in reality data under column generated in skewed as given below:

I gathered fresh statistics without histogram on table as shown below:

Now will compare execution plan with different values for predicate:

As expected this query is going for FTS, as it has to select approx. 80% of data.

But I will expect query to pick index in case of GENERATED=’N’ as it will just select less than 1% of rows from table.

But NO its still going for FTS 😦 & it is due to CBO’s default behavior.

Now here Histogram should come into picture:

Now I will once again compare plans:

As expected following query is going for FTS, as it has to select approx. 80% of data.

But following query has picked up index due to creation of histogram. 🙂

In this post, we focused on what is histogram & why it is used.

Hope so u will find this post very useful:-)

Stay tuned for Oracle Histograms Simplified: Part 2, in which I will explain what are different ways to create & drop histogram on particular column.

Cheers

Regards,
Adityanath