Oracle SQL and Informatica Performance Tuning

Why SQL Tuning ?

A SQL statement becomes a problem when it fails to perform according to a predetermined and measureable standard.

The 2 (Two) goals in performance tuning:

Reduce user response time, which means decreasing the time between when a user issues a statement and receives a response.
Improve throughput, which means using the least amount of resources necessary to process all rows accessed by a statements.

There are 2 (Two) concepts below:

SOFT parse

During the parse, the database performs a shared pool check to see if an existing parsed statement already exists and then determines whether it can skip resource-intensive steps of statement processing.

Hard parse

If Oracle Database cannot reuse existing code, then it must build a new executable version of the application code. This operation is known as a hard parse.

What is Oracle Cost Based Optimization ?

Oracle cost-based optimizer is designed to determine the most efficient path to access the data stored in the table by SQL statements, it generates an execution plan for a SQL statement, Oracle gathers statistics about the data in the tables and indexes in order to generate the most optimal execution plan; this is known as cost-based optimization. Comparing the execution plans generated by the optimizer will allow to judge the relative cost of one SQL statement versus another. These results can be use to make SQL statements optimal.

To collect the statistics, there are two methods you can use:

The ANALYZE statement can generate statistics for cost-based optimization:

ANALYZE TABLE sh.sales COMPUTE STATISTICS;

Using the DBMS_STATS Package

The PL/SQL package DBMS_STATS lets you generate and manage statistics for cost-based optimization:

exec dbms_stats.gather_table_stats (user,'SALES',cascade=>true);

Execution Plan :

An execution plan is a series of steps occurred in order to execute SQL Statements. see below:

The most complex SQL statement can be broken down into a series of basic operation.
To execute any SQL statement Oracle has to derive an execution plan.
Execution plan contains the series of steps and relationship between them.

SQL Tuning Tools:

AUTOTRACE :

The AUTOTRACE command is SQL*Plus generates the execution plan and statistics
about the performance of a query. This command provides statistics such as disk
reads and memory reads.

EXPLAIN PLAN

This SQL statement enables you to view the execution plan that the optimizer
would use to execute a SQL statement without actually executing the statement.

V$SQL_PLAN

These views contain information about executed SQL statements, and their
execution plans, that are still in the shared pool.
You can use the DBMS_XPLAN package methods to display the execution plan
generated by the EXPLAIN PLAN command and query of V$SQL_PLAN.

Running Explain plan:

EXPLAIN PLAN FOR
SELECT * FROM SH.COUNTRIES
WHERE
COUNTRY_ID = 52790;

SELECT * FROM TABLE (DBMS_XPLAIN.DISPLAY) ;

Types of Table Accesses:

Oracle can retrieve data in a number of ways. The most common techniques for retrieving table data are :

Read the entire table using a full table
Access a specific row using ROWID
Using an Index to locate the rows

TABLE ACCESS FULL:

A full table Scan sequentially reads each row from the table.
Oracle reads multiple blocks during each database read.
If there are million rows in a table, it will read all million rows it is reading table.

TABLE ACCESS BY ROWID:

Oracle reads rows by their ROWID pseudo column.

ROWID contains the physical location of the row.

Oracle uses INDEX to correlate data values with ROWID and help in improving the performance of a query.

The database finds the ROWID that go with these values and follows them to the precise table blocks, and rows within those blocks, that satisfy the query.

TABLE INDEX UNIQUE SCAN:

The scan searches the index in order for the specified key. An index unique scan stops processing as soon as it finds the first record because no second record is possible.
The database obtains the ROWID from the index entry, and then retrieves the row specified by the ROWID.

TABLE INDEX RANGE SCAN:

An index range scan is used when there is a NON UNIQUE index or you are searching for a range of values.

TABLE Choosing between FULL Table scan and Indexed Access:

Indexed reads examine only a small part of each block, the rows you want, instead of every row in the block, saving CPU time.
Indexed reads usually scale better as a table grows, giving stable performance, whereas a full table scan becomes steadily worse,
even while it might start out a little better than the indexed plan.
Index reads are almost always cached.

What should we look from Execution Plan:

Cardinality        - Looks for Estimate of the number of rows coming out of each of the operations.
Access Method - The way in which the data is being accessed, via either a table scan or index access.
Join Method      - The method (e.g., has, sort-merge, etc) used to join tables with each other.
Join Type          - The type of join (e.g., outer, anti semi, etc).
Join Order         - The order in which the tables are joined to each other.
Partition Pruning - Are only the necessary partitions being accessed to answer the query ?
Parallel Execution - In care of parallel execution, is each operation in the plan being conducted in parallel ? Is the right data redistribution method being used?

What is COST ?

The Oracle Optimizer is a cost-based optimizer.
The optimizer's cost model accounts for the IO, CPU, and network resources that will be used by the query.
The Optimizer selects the execution plan with the lowest cost, where cost represents the estimated resource usage for that plan.

SELECT consideration:

The simplest tuning rule is 'select the data only need, don't select all the data'

SELECT * FROM HR.EMPLOYEES E;

SELECT COLUMN_1, COLUMN_2, COLUMN_N
FROM HR.EMPLOYEES E ;

Use Table Aliases:

Adding table aliases and prefixing all column names by their aliases is a good rule to follow.
This is very important when there is more than 1 table involved in the query.
Mentioning table aliases reduces the parse time.

BAD

SELECT
ORDER_ID
, SALES_ID
, TOTAL_AMOUNT
, PRODUCT_NAME
, PRODUCT_CATEGORY
, COLOR

FROM

SALES_HISROTY, PRODUCT

WHERE
PRODUCT_ID = PRODUCT_ID

;

GOOD

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
, P.PRODUCT_CATEGORY
, P.COLOR

FROM

SALES_HISTORY S, PRODUCT P

WHERE
S.PRODUCT_ID = P.PRODUCT_ID

;

Use WHERE rather than HAVING:

HAVING clause filters the rows only after all rows have been fetched, grouped and sorted. So when ever possible try to filter as much as data using WHERE rather than HAVING. So when ever possible try to filter as much as data using WHERE rather than HAVING.
This reduces lot of overhead on the database.

BAD

SELECT
ORDER_ID
, PRODUCT_NAME
, SUM(TOTAL_AMOUNT)
FROM
SALES_HISTORY s
GROUP BY
ORDER_ID
, PRODUCT_NAME
HAVING ORDER_ID = 1267
;

GOOD

SELECT
S.ORDER_ID
, S.PRODUCT_NAME
, SUM(S.TOTAL_AMOUNT) AS TOTAL_AMOUNT
FROM
SALES_HISTORY s
WHERE
S.ORDER_ID = 1267
GROUP BY
S.ORDER_ID
, S.PRODUCT_NAME
;

Use Index Suppression Reasons:

There are many reasons why an Index might not be used even though you have an index on the column you are searching for.

Use of <> operator on an indexed column
Use of SUBSTR function on an indexed column
Use of Arithmetic operators(+,-,/,*) on an indexed column
Use of TRUNC function on an indexed column
Use of || operator on an indexed column
Data type conversion on an indexed column
Use of IS NULL/NOT NULL on an indexed column

Accidental Index Suppression:

Indexes can tell you what is in a table, but not what is not in a table.
!= and <> disables index usage.

BAD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
ORDER_ID <> 12345
;

GOOD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
ORDER_ID > 12345 AND
ORDER_ID < 12345 ;

SUBSTR function disables the index:

BAD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
SUBSTR(PRODUCT_NAME, 1, 3) = ‘SAM’
;

GOOD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
PRODUCT_NAME LIKE ‘SAM%’
;

function - , + , / , * operators disables the indexes:

BAD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
ORDER_ID + 1000 < 5000
;

GOOD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
ORDER_ID < 4000
;

function TRUNC function disables the index:

BAD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
TRUNC(SALES_DATE,'MON') = '01-JAN-2015'
;

GOOD

Use between instead of TRUNC function

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
SALES_DATE BETWEEN '01-JAN-2015' AND '31-JAN-2015'
;

function  || operator disables the index:

BAD

SELECT
*
FROM
PRODUCT
WHERE
COLOR || PRODUCT_CATEGORY = ‘BLACKMobile’ ;

GOOD

Use AND operator

SELECT
*
FROM
PRODUCT
WHERE
COLOR = ‘BLACK’             AND
PRODUCT_CATEGORY = ‘Mobile’ ;

Comparing a character column to a number value:

BAD

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
PRODUCT_NAME = 123 ;

GOOD

Use AND operator

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
PRODUCT_NAME = '123' ;

Function based Indexes:

Where ever possible try to avoid using functions on an indexed column, But if there is a business requirement which you can not avoid, then instead of using functions on the column, try to use function based Indexes.

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
SUBSTR(PRODUCT_NAME, 1, 3 ) = 'SAM' ;

SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY) ;

CREATE INDEX product_s_idx ON SALES_HISROTY (SUBSTR(PRODUCT_NAME,1,3));

Exec dbms_stats.gather_table_stats(user,’SALES_HISTORY’, cascade=>true);

Use UNION instead of OR:

Consider using UNION ALL when you have to use OR condition on 2 index columns.

SELECT
ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
SALES_HISTORY s
WHERE
PRODUCT_NAME = 'SAMSUNG' OR
SALES_DATE = '01-JAN-2015' ;

GOOD

SELECT
S.ORDER_ID
,S.SALES_DATE
,S.TOTAL_AMOUNT
,S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
S.ORDER_ID
,S.SALES_DATE
,S.TOTAL_AMOUNT
,S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.SALES_DATE = '01-JAN-2015' ;

Use UNION ALL instead of UNION:

Avoid Unnecessary sorts !!!
Don’t sort if order is not import.
UNION performs an extra operation called SORT UNIQUE, which is really expensive.

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.PRODUCT_NAME = 'SAMSUNG'

UNION

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
S.SALES_HISTORY S
WHERE
S.SALES_DATE = '01-JAN-2015' ;

Better

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.SALES_DATE = '01-JAN-2015' ;

Use Function based Indexes:

Where ever possible try to avoid using functions on an indexed column, But if there is a business requirement which you can not avoid, then instead of using functions on the column, try to use function based Indexes.

CREATE INDEX product_s_idx ON SALES_HISROTY (SUBSTR(PRODUCT_NAME,1,3));

Exec dbms_stats.gather_table_stats(user,’SALES_HISTORY’, cascade=>true);

EXPLAIN PLAN FOR
SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
SUBSTR(S.PRODUCT_NAME, 1, 3 ) = ‘SAM’ ;

SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY) ;

Use UNION instead of OR:

Consider using UNION when you have to use OR condition on 2 index columns.

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.PRODUCT_NAME = 'SAMSUNG' OR
S.SALES_DATE = '01-JAN-2015' ;

GOOD

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
SALES_HISTORY S
WHERE
S.SALES_DATE = '01-JAN-2015' ;

Oracle SQL and Informatica Performance Tuning

Tuesday, August 30, 2016

Informatica Power Center - Performance Tunning Details

Wednesday, August 10, 2016

Oracle Optimizer and its Execution Plan

Using the DBMS_STATS Package