Tuesday, August 30, 2016

Informatica Power Center - Performance Tunning Details

The goal of performance tuning is to optimize session performance by eliminating performance bottlenecks. To tune session performance, first identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until you are satisfied with the session performance. You can use the test load option to run sessions when you tune session performance.


If we tune all the bottlenecks, we can further optimize session performance by increasing the number of pipeline partitions in the session. Adding partitions can improve performance by utilizing more of the system hardware while processing the session.

Because determining the best way to improve performance can be complex, change one variable at a time, and time the session both before and after the change.

Complete the following tasks to improve session performance:
  1. Optimize the target. Enables the Integration Service to write to the targets efficiently.
  2. Optimize the source. Enables the Integration Service to read source data efficiently.
  3. Optimize the mapping. Enables the Integration Service to transform and move data efficiently.
  4. Optimize the transformation. Enables the Integration Service to process transformations in a mapping efficiently.
  5. Optimize the session. Enables the Integration Service to run the session more quickly.
  6. Optimize the grid deployments. Enables the Integration Service to run on a grid with optimal performance.
  7. Optimize the PowerCenter components. Enables the Integration Service and Repository Service to function optimally.
  8. Optimize the system. Enables PowerCenter service processes to run more quickly.
The first step in performance tuning is to identify performance bottlenecks. Performance bottlenecks can occur in the source and target databases, the mapping, the session, and the system. The strategy is to identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until you are satisfied with the performance.

Look for performance bottlenecks in the following order:

1. Target
2. Source
3. Mapping
4. Session
5. System

How to know where the bottleneck is:

Using Thread Statistics:

There are 3 types of threads:
  1. Reader
  2. Writer
  3. Transformation
Note: The more busy a thread is causing higher bottleneck.

Use Writer Thread Statistics to find out target bottleneck:

Solution:

  • Configure Indexes and Key constraints on target Database
  • Use Bulk option
  • Increase Commit Intervals to reduce the number of checkpoints
  • Increase the connection group size to decrease chances of Deadlocks
  • keep a dedicated storage for Target Flat Files in the server/native directory
  • Increase the DB network packet size

Wednesday, August 10, 2016

Oracle Optimizer and its Execution Plan


Why SQL Tuning ?

A SQL statement becomes a problem when it fails to perform according to a predetermined and measureable standard.

The 2 (Two) goals in performance tuning:

  • Reduce user response time, which means decreasing the time between when a user issues a statement and receives a response.
  • Improve throughput, which means using the least amount of resources necessary to process all rows accessed by a statements.

There are 2 (Two) concepts below:

SOFT parse

During the parse, the database performs a shared pool check to see if an existing parsed statement already exists and then determines whether it can skip resource-intensive steps of statement processing.

Hard parse

If Oracle Database cannot reuse existing code, then it must build a new executable version of the application code. This operation is known as a hard parse.


What is Oracle Cost Based Optimization ?

Oracle cost-based optimizer is designed to determine the most efficient path to access the data stored in the table by SQL statements, it generates an execution plan for a SQL statement, Oracle gathers statistics about the data in the tables and indexes in order to generate the most optimal execution plan; this is known as cost-based optimization. Comparing the execution plans generated by the optimizer will allow to judge the relative cost of one SQL statement versus another. These results can be use to make SQL statements optimal.

To collect the statistics, there are two methods you can use:


The ANALYZE statement can generate statistics for cost-based optimization:


ANALYZE TABLE sh.sales COMPUTE STATISTICS;

Using the DBMS_STATS Package


The PL/SQL package DBMS_STATS lets you generate and manage statistics for cost-based optimization:
 
exec dbms_stats.gather_table_stats (user,'SALES',cascade=>true);

Execution Plan :

An execution plan is a series of steps occurred in order to execute SQL Statements. see below:

The most complex SQL statement can be broken down into a series of basic operation.
To execute any SQL statement Oracle has to derive an execution plan.
Execution plan contains the series of steps and relationship between them.









SQL Tuning Tools:


AUTOTRACE :

The AUTOTRACE command is SQL*Plus generates the execution plan and statistics
about the performance of a query. This command provides statistics such as disk
reads and memory reads.

EXPLAIN PLAN

This SQL statement enables you to view the execution plan that the optimizer
would use to execute a SQL statement without actually executing the statement.

V$SQL_PLAN

These views contain information about executed SQL statements, and their
execution plans, that are still in the shared pool.
You can use the DBMS_XPLAN package methods to display the execution plan
generated by the EXPLAIN PLAN command and query of V$SQL_PLAN.

Running Explain plan:


EXPLAIN PLAN FOR
SELECT * FROM SH.COUNTRIES
WHERE
COUNTRY_ID = 52790;

SELECT * FROM TABLE (DBMS_XPLAIN.DISPLAY) ;






Types of Table Accesses:

Oracle can retrieve data in a number of ways. The most common techniques for retrieving table data are :

  • Read the entire table using a full table
  • Access a specific row using ROWID
  • Using an Index to locate the rows


TABLE ACCESS FULL:

  • A full table Scan sequentially reads each row from the table.
  • Oracle reads multiple blocks during each database read.
  • If there are million rows in a table, it will read all million rows it is reading table.


TABLE ACCESS BY ROWID:

  • Oracle reads rows by their ROWID pseudo column.
  • ROWID contains the physical location of the row.
  • Oracle uses INDEX to correlate data values with ROWID and help in improving the performance of a query.
  • The database finds the ROWID that go with these values and follows them to the precise table blocks, and rows within those blocks, that satisfy the query.

TABLE INDEX UNIQUE SCAN:

  • The scan searches the index in order for the specified key. An index unique scan stops processing as soon as it finds the first record because no second record is possible.
  • The database obtains the ROWID from the index entry, and then retrieves the row specified by the ROWID.

TABLE INDEX RANGE SCAN:

  • An index range scan is used when there is a NON UNIQUE index or you are searching for a range of values.

TABLE Choosing between FULL Table scan and Indexed Access:

  • Indexed reads examine only a small part of each block, the rows you want, instead of every row in the block, saving CPU time.
  • Indexed reads usually scale better as a table grows, giving stable performance, whereas a full table scan becomes steadily worse,
  • even while it might start out a little better than the indexed plan.
  • Index reads are almost always cached.

What should we look from Execution Plan:

Cardinality        - Looks for Estimate of the number of rows coming out of each of the operations.
Access Method - The way in which the data is being accessed, via either a table scan or index access.
Join Method      - The method (e.g., has, sort-merge, etc) used to join tables with each other.
Join Type          - The type of join (e.g., outer, anti semi, etc).
Join Order         - The order in which the tables are joined to each other.
Partition Pruning - Are only the necessary partitions being accessed to answer the query ?
Parallel Execution - In care of parallel execution, is each operation in the plan being conducted in parallel ? Is the right data redistribution method being used?



What is COST ?

The Oracle Optimizer is a cost-based optimizer.
The optimizer's cost model accounts for the IO, CPU, and network resources that will be used by the query.
The Optimizer selects the execution plan with the lowest cost, where cost represents the estimated resource usage for that plan.

SELECT consideration:

The simplest tuning rule is 'select the data only need, don't select all the data'


SELECT * FROM HR.EMPLOYEES E;

SELECT COLUMN_1, COLUMN_2, COLUMN_N
 FROM HR.EMPLOYEES E ;

Use Table Aliases:

  • Adding table aliases and prefixing all column names by their aliases is a good rule to follow.
  • This is very important when there is more than 1 table involved in the query.
  • Mentioning table aliases reduces the parse time.
BAD
 
SELECT
  ORDER_ID
, SALES_ID
, TOTAL_AMOUNT
, PRODUCT_NAME
, PRODUCT_CATEGORY
, COLOR
FROM
  SALES_HISROTY, PRODUCT
WHERE
  PRODUCT_ID = PRODUCT_ID
;
 
GOOD

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
, P.PRODUCT_CATEGORY
, P.COLOR
FROM
  SALES_HISTORY S, PRODUCT P
WHERE
  S.PRODUCT_ID = P.PRODUCT_ID
;

Use WHERE rather than HAVING:

HAVING clause filters the rows only after all rows have been fetched, grouped and     sorted. So when ever possible try to filter as much as data using WHERE rather than HAVING. So when ever possible try to filter as much as data using WHERE rather than HAVING.
This reduces lot of overhead on the database.

 
BAD

SELECT
  ORDER_ID
, PRODUCT_NAME
, SUM(TOTAL_AMOUNT)
 FROM
   SALES_HISTORY s
 GROUP BY
 ORDER_ID
,  PRODUCT_NAME
 HAVING ORDER_ID = 1267
;

GOOD

SELECT
  S.ORDER_ID
, S.PRODUCT_NAME
, SUM(S.TOTAL_AMOUNT) AS TOTAL_AMOUNT
FROM
  SALES_HISTORY s
WHERE
  S.ORDER_ID = 1267
GROUP BY
  S.ORDER_ID
, S.PRODUCT_NAME
;


Use Index Suppression Reasons:

There are many reasons why an Index might not be used even though you have an index on the column you are searching for.
  • Use of <> operator on an indexed column
  • Use of SUBSTR function on an indexed column
  • Use of Arithmetic operators(+,-,/,*) on an indexed column
  • Use of TRUNC function on an indexed column
  • Use of || operator on an indexed column
  • Data type conversion on an indexed column
  • Use of IS NULL/NOT NULL on an indexed column

Accidental Index Suppression:

Indexes can tell you what is in a table, but not what is not in a table.
!= and <> disables index usage.

BAD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  ORDER_ID <> 12345
;

GOOD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
 FROM
  SALES_HISTORY s
WHERE
  ORDER_ID >  12345 AND      
  ORDER_ID < 12345 ;

SUBSTR function disables the index:

BAD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  SUBSTR(PRODUCT_NAME, 1, 3) = ‘SAM’
;

GOOD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  PRODUCT_NAME LIKE ‘SAM%’
;


function  - , + , / , * operators disables the indexes:

BAD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
ORDER_ID + 1000 < 5000
;

GOOD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
ORDER_ID < 4000
;

function  TRUNC function disables the index:

BAD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
TRUNC(SALES_DATE,'MON') = '01-JAN-2015'
;

GOOD

Use between instead of TRUNC function

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
SALES_DATE BETWEEN '01-JAN-2015' AND '31-JAN-2015'
;

function  || operator disables the index: 

BAD

SELECT
 *
FROM
PRODUCT
WHERE
COLOR || PRODUCT_CATEGORY = ‘BLACKMobile’ ;

GOOD

Use AND operator

SELECT
 *
FROM
PRODUCT
WHERE
COLOR = ‘BLACK’             AND
PRODUCT_CATEGORY = ‘Mobile’ ;

Comparing a character column to a number value:

BAD

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  PRODUCT_NAME = 123 ;

GOOD

Use AND operator

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  PRODUCT_NAME = '123' ;

Function based Indexes:

Where ever possible try to avoid using functions on an indexed column, But if there is a business requirement which you can not avoid, then instead of using functions on the column, try to use function based Indexes.

SELECT
 ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  SUBSTR(PRODUCT_NAME, 1, 3 ) = 'SAM' ;

SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY) ;

CREATE INDEX product_s_idx ON SALES_HISROTY (SUBSTR(PRODUCT_NAME,1,3));

Exec dbms_stats.gather_table_stats(user,’SALES_HISTORY’, cascade=>true);


Use UNION instead of OR:


Consider using UNION ALL when you have to use OR condition on 2 index columns.

SELECT
  ORDER_ID
, SALES_DATE
, TOTAL_AMOUNT
, PRODUCT_NAME
FROM
  SALES_HISTORY s
WHERE
  PRODUCT_NAME = 'SAMSUNG' OR
  SALES_DATE = '01-JAN-2015' ;


GOOD

SELECT
 S.ORDER_ID
,S.SALES_DATE
,S.TOTAL_AMOUNT
,S.PRODUCT_NAME
 FROM
SALES_HISTORY S
WHERE
 S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
S.ORDER_ID
,S.SALES_DATE
,S.TOTAL_AMOUNT
,S.PRODUCT_NAME
FROM
 SALES_HISTORY S
WHERE
 S.SALES_DATE = '01-JAN-2015' ;

Use UNION ALL instead of UNION:

  • Avoid Unnecessary sorts !!!
  • Don’t sort if order is not import.
  • UNION performs an extra operation called SORT UNIQUE, which is really expensive.

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.PRODUCT_NAME = 'SAMSUNG'

UNION

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  S.SALES_HISTORY S
WHERE
  S.SALES_DATE = '01-JAN-2015' ;

Better

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.SALES_DATE = '01-JAN-2015' ;

Use Function based Indexes:

Where ever possible try to avoid using functions on an indexed column, But if there is a business requirement which you can not avoid, then instead of using functions on the column, try to use function based Indexes.

 CREATE INDEX product_s_idx ON SALES_HISROTY (SUBSTR(PRODUCT_NAME,1,3));

Exec dbms_stats.gather_table_stats(user,’SALES_HISTORY’, cascade=>true);

EXPLAIN PLAN FOR
SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  SUBSTR(S.PRODUCT_NAME, 1, 3 ) = ‘SAM’ ;

SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY) ;


Use UNION instead of OR:

Consider using UNION when you have to use OR condition on 2 index columns.

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.PRODUCT_NAME = 'SAMSUNG' OR
  S.SALES_DATE = '01-JAN-2015' ;


GOOD

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.PRODUCT_NAME = 'SAMSUNG'

UNION ALL

SELECT
  S.ORDER_ID
, S.SALES_DATE
, S.TOTAL_AMOUNT
, S.PRODUCT_NAME
FROM
  SALES_HISTORY S
WHERE
  S.SALES_DATE = '01-JAN-2015' ;