Oracle Query Tuning – Example 2

Day by day technology is upgrading and every organization tries to make easy way of communication for their customers. In our day-today life when we make any payment, Recharge Mobiles etc, we get notification on mobile phone via SMS. These organization sends SMS to you for every transaction. In this post, I am going to show you how queries of such transaction performs badly because of the skewed data distribution in the table and which possible query tuning technique help us . I have seen this problem with many clients. I have created sample table SMS_PROCESS_TABLE for our demonstration which has skewed data distribution.

SQL> desc SMS_PROCESS_TABLE
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 ID                                                 NUMBER(20)
 MESSAGE                                            VARCHAR2(160)
 MOBILE_NO                                          VARCHAR2(20)
 PROCESS                                            VARCHAR2(1)

SQL>

This table contains 4 columns. 
ID is for unique id of message in the table.
Message : Message for the customer.
MOBILE_NO : Mobile number of customer
PROCESS : Flag Y/N to identify whether SMS has sent or not

Generally there are scheduled jobs which runs almost every minute. These jobs picks the data which is not processed that is with flag N and mark flag to Y after processing. This is continuous process. So there are very less number of rows with the flag N which is not processed. Table SMS_PROCESS_TABLE is having same type of data distribution.

SQL> select count(1),process from SMS_PROCESS_TABLE group by process;

  COUNT(1) P
---------- -
     99900 Y
       100 N

SQL> 

SQL> select column_name,num_distinct,num_nulls,histogram 
from dba_tab_columns where table_name='SMS_PROCESS_TABLE';

COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM
-------------------- ------------ ---------- ---------------
PROCESS                         2          0 NONE
MOBILE_NO                   99768          0 NONE
MESSAGE                    100000          0 NONE
ID                         100000          0 NONE

We can see that column PROCESS is having only 2 distinct values. Check the number of rows present with each flag. Flag N has only 100 rows while flag Y has 99900 rows. Also there is an index created on PROCESS column. Now I am generating an execution plan for the PROCESS FLAG N;

 

Example2 - Execution Plan Pre

From the above execution plan we can see that table SMS_PROCESS_TABLE is going for Full Table Scan even we have an index on PROCESS column. Plan also shows that query will return 50000 rows. But we know that table contains only 100 rows for FLAG N.

Why Oracle has not used an Index here ?

There are only two distinct values present for column PROCESS and there are 100000 rows in the table. Optimizer divides these number of rows with the distinct values which gives 50000 which is 50% of data. To fetch the 50% data using an index is costlier than Full Table Scan as index range scan performs single block read where Full Table Scan performs multi-block read.
But we know that there are only 100 rows for flag N. Due to this skewed data distribution query is going for Full Table Scan.

Resolution:

1. We can add INDEX hint in the query for using the index. If we have multiple statement with such filter then we need to add hint for every statement.
2. We can create Histogram for PROCESS column. A histogram is special type of column statistics that provide more detailed information about the data distribution in a table column. Use METHOD_OPT parameter while statistics gathering to create histograms.

SQL> exec dbms_stats.gather_table_stats('VISHAL','SMS_PROCESS_TABLE',method_opt=>'FOR COLUMNS PROCESS SIZE 254');

PL/SQL procedure successfully completed.

SQL>

SQL> select column_name,num_distinct,num_nulls,histogram
  2  from dba_tab_columns where table_name='SMS_PROCESS_TABLE';

COLUMN_NAME          NUM_DISTINCT  NUM_NULLS HISTOGRAM
-------------------- ------------ ---------- ---------------
PROCESS                         2          0 FREQUENCY
MOBILE_NO                   99768          0 NONE
MESSAGE                    100000          0 NONE
ID                         100000          0 NONE

SQL>

The above output shows that column PROCESS has FREQUENCY histogram created on it. Now again I am generating an execution plan for the same statement.

Example2 - Execution Plan Post

This time optimizer chooses to use an index for the execution of the query. Also you can see that plan shows that it is going to return 100 rows from the query.

What if I want to execute this query for PROCESS flag Y. As query will return around 99% of data from the table it should go with Full Table Scan. Let’s Check it.

Example2 - Execution Plan Post2

This time query is performing FULL TABLE SCAN which is correct.

Note:

Think twice before created histograms. If you are creating histogram test all the functionalities which used that table as sometimes oracle generated bad plans due to Histograms.

Related Articles:

Oracle Query Tuning – Example 1

Oracle Query Tuning – Example 2
Tagged on:     

2 thoughts on “Oracle Query Tuning – Example 2

Leave a Reply