Main Page Content
Use Oracle S Explain Plan To Tune Your Queries
Introduction
Query speed is a perpetual challenge for anyone using a SQL database. Many
newer Oracle Users are unaware of the issues involved with designing a good query and those who have been at the game for some time want to make sure they are not needlessly burdening the database server. If you use Oracle, Explain Plan is a great way to tune your queries. As a bonus for using Explain Plan, you will learn more about how the DBMS works "behind the scenes", enabling you to write efficient queries the first time aroundUsing Explain Plan
What does Explain Plan do? Explain Plan executes your query and records the
"plan" that Oracle devises to execute your query. By examining this plan, you can find out if Oracle is picking the right indexes and joining your tables in the most efficient manner. There are a few different ways to utilize Explain Plan. I will focus on using it through SQL*Plus since most Oracle programmers will have access to SQL*Plus.The first thing you will need to do is make sure you have a table called plan_table
available in your schema. The following SQL will create it for you if you don't have it:CREATE TABLE plan_table ( statement_id VARCHAR2(30), timestamp DATE, remarks VARCHAR2(80), operation VARCHAR2(30), options VARCHAR2(30), object_node VARCHAR2(128), object_owner VARCHAR2(30), object_name VARCHAR2(30), object_instance NUMBER, object_type VARCHAR2(30), optimizer VARCHAR2(255), search_columns NUMBER, id NUMBER, parent_id NUMBER, position NUMBER, other LONG )
Next, you can run the following script to get a list of the steps that Oracle
will perform in order to execute your query:set echo ondelete from plan_table
where statement_id = 'MINE';commit;COL operation FORMAT A30COL options FORMAT A15COL object_name FORMAT A20EXPLAIN PLAN set statement_id = 'MINE' for/* ------ Your SQL here ------*/select * from scott.salgrade/*----------------------------*/
/set echo off
select operation, options, object_name
from plan_table where statement_id = 'MINE'start with id = 0connect by prior id=parent_id and prior statement_id = statement_id;set echo on
Making Use of Indexes
Making sure your query is using indexes to find rows faster is the most basic
use of Explain Plan. We will examine this process first. When the aforementioned script is run with the query "select * from scott.salgrade", this is the output we will see:OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTTABLE ACCESS FULL SALGRADE
What this tells us is that in order to execute the SELECT STATEMENT, Oracle
will access the table SALGRADE using a FULL table scan. In other words the DBMS will read every record in SALGRADE. You would expect this result for a query with no where clause. What if we want to look for the record for salary grade #1. We will limit our results using the GRADE column in SALGRADE and our query will look like this:select *
from scott.salgrade
where grade = 1
But our plan looks the same. Oracle still has to read every row in the table
to find all rows with the grade = 1 because there is no index on that column. Since, GRADE should be the primary key of the SALGRADE table, we will create a primary key constraint on that column. After creating the primary key constraint (which creates a related unique index), and running the plan script again, our plan looks like this:OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTTABLE ACCESS BY INDEX ROWID SALGRADEINDEX UNIQUE SCAN SALGRADE_PK
This time Oracle ran a unique scan on the SALGRADE_PK index and then went directly
to the row we wanted in the SALGRADE table using the ROWID. An index scan is much faster than a full table scan so the result of our query is sure to come back nearly instantaneously, even if our table has millions of rows.What if we had simply created a non-unique index instead of a primary (unique)
key? Our plan would have looked like this:OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTTABLE ACCESS BY INDEX ROWID SALGRADEINDEX RANGE SCAN SALGRADE_IDX1
This is a slightly slower path for Oracle to take. Notice that the OPTIONS
column says RANGE SCAN instead of UNIQUE SCAN. This is because it has to scan for multiple records with the same value in the index in case there is more than one row. If this was the case, our query speed would depend on the uniqueness of our data. Assuming a fairly large table size, we can illustrate this dependance using two extreme cases. In the first extreme case, GRADE is totally unique (even though the index is non-unique). Because of this uniqueness, Oracle will be able to find the row it needs quite quickly and bring back the results. In the second extreme case, there are only a few different values for GRADE distributed among many rows. Because of this non-uniqueness, Oracle will have to scan through many values in the index to find the ones it wants, finalizing the results some time later.Most of the time, when you write your queries, the data model will have already
been built to match the purpose of the application. To maximize your application's speed, use criteria that will allow Oracle to use the fastest (most unique) indexes on a given table.Joining Tables Efficiently
Sometimes we use slower (less unique) indexes in small to medium sized tables
and don't notice that our application is slower than it should be. However, when joining tables, you will notice a difference if you are not using the optimal query plan. The key to making joins faster is making your query select from the table that will return the least number of records first and then use the information gained from the first table to make subsequent searches more unique. The following query retrieves the employee and salary records for employee #1:select *
from emp e, salgrade s
where empid = 1
and s.grade = e.grade
Given a unique index on EMPID in the EMP table and on GRADE in the SALGRADE
table, Oracle's plan will look like this:OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS BY INDEX ROWID EMPINDEX UNIQUE SCAN EMP_PKTABLE ACCESS BY INDEX ROWID SALGRADEINDEX UNIQUE SCAN SALGRADE_PK
The NESTED LOOPS operation indicates that Oracle will look for rows in GRADE
for each row in EMP that it finds. We have an efficient query because Oracle is searching for our EMPID = 1 first and then looking for 1's GRADE. If Oracle had looked in SALGRADE first, it would have had to read all rows in that table because it wouldn't know what the GRADE of EMPID = 1 was. In this case, it would be hard to write an inefficient query because we are searching on criteria that are by definition unique (and fast). What if we had to search on fields with a low level of uniqueness? Examine the following query, which searches for employees with low salaries (LOSAL) between 10000 and 30000:select * from emp e, salgrade s where s.grade = e.grade and s.losal between 10000 and 30000
We will get this plan from Oracle:
OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS FULL EMPTABLE ACCESS BY INDEX ROWID SALGRADEINDEX UNIQUE SCAN SALGRADE_PK
Under the circumstances, this is the best we could have done. Oracle scans
the employee table (all of it) because it can narrow that table the most using the given criteria. Hypothetically, there are a couple of ways to speed up this query. One way is to use some criteria on the EMP table that is indexed. if we assume a non-unique index on EMP.NAME (EMP_IDX2), and have some sort of criteria we can use for NAME, we could narrow our results from EMP more quickly. Let's get the employee and salary records for all employees whose names start with 'SM' and whose low salaries are between 10000 and 30000, like so:select *
from emp e, salgrade s
where e.name like 'SM%' and s.grade = e.grade
and s.losal between 10000 and 30000
We will get this plan from Oracle:
OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS BY INDEX ROWID EMPINDEX RANGE SCAN EMP_IDX2TABLE ACCESS BY INDEX ROWID SALGRADEINDEX UNIQUE SCAN SALGRADE_PK
The criteria on NAME could narrow our search fairly quickly. The actual speed
will depend on the uniqueness of the NAME column. If all of our employees are named SMITH and SMYTHE and SMALL, the criteria will not help much. Given an even distribution of names, the criteria will help a lot.What if we didn't have the option to narrow down the result set by NAME? We
might want to consider creating an index on the SALGRADE.LOSAL and another index on EMP.GRADE if this query is used often. Given a non-unique index on SALGRADE.LOSAL, a non-unique index on EMP.GRADE and this query:select *
from emp e, salgrade s
where s.grade = e.grade
and s.losal between 10000 and 30000
Oracle will produce this plan:
OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS BY INDEX ROWID SALGRADEINDEX RANGE SCAN SALGRADE_IDX2TABLE ACCESS BY INDEX ROWID EMPINDEX RANGE SCAN EMP_IDX3
Oracle now will search for the appropriate salary grades first and then all
employees with that grade.Using the fastest indexes in a Join
Up until now, we have been striving to use any avaliable index. More complicated
issues arise when trying to decide which of two or three indexes is faster. In the case that are looking for names starting with 'SM' and low salaries between 10000 and 30000 and all the indexes given up to this point, we will probably have to tell Oracle which table to search first, in order to get results as quickly as possible. We can do that by changing the order of the tables in the from clause. This query:select *
from emp e, salgrade s
where e.name like 'SM%'
and s.grade = e.grade
and s.losal between 10000 and 30000
Will produce this plan:
OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS BY INDEX ROWID SALGRADEINDEX RANGE SCAN SALGRADE_IDX2TABLE ACCESS BY INDEX ROWID EMPINDEX RANGE SCAN EMP_IDX3
The plan indicates that Oracle looks at SALGRADE.LOSAL first and EMP.NAME second.
The choice is made because SALGRADE comes last in the from clause. If Oracle can't decide which table is better to go with first, it will read the from clause from right to left. This is great if SALGRADE.LOSAL is more unique than EMP.NAME. However, what if searching for EMP.NAME starting with 'SM' will return fewer rows than searching for SALGRADE.LOSALs between 10000 and 30000? Simply switch the tables in the from clause to produce the opposite result: select * from salgrade s, emp e where e.name like 'SM%' and s.grade = e.grade and s.losal between 10000 and 30000
This query produces this plan:
OPERATION OPTIONS OBJECT_NAME------------------------------ --------------- --------------------SELECT STATEMENTNESTED LOOPSTABLE ACCESS BY INDEX ROWID EMPINDEX RANGE SCAN EMP_IDX2TABLE ACCESS BY INDEX ROWID SALGRADEINDEX UNIQUE SCAN SALGRADE_PK
EMP.NAME is queried first because we instructed Oracle to do so.
Utilizing Multiple Column Indexes
Multiple-column indexes will eventually make their way into your database applications.
In order to use the index, we have to search on the first column before searching on the second column in the index and the criteria for the second column will not help us if our criteria for the first column are not very unique. Also keep in mind that in order to get a fully unique search on a unique multi-column index, we have to limit all columns in the index.Conclusion
Remember that by utilizing Explain Plan you can explore the differences that
subtle changes in your query make in the way Oracle executes your query. In a short time you will be writing very fast queries. By Adam Patrick