Informatica Performance Tuning and Optimization Techniques

Name: Informatica Performance Tuning and Optimization Techniques
Rating: 4.2 (339 reviews)

Partitions, Pushdown Optimization and other Optimization Techniques

Created bySid Inf

Last updated 3/2019

English

What you'll learn

Push Down Optimization.
Partitions in Informatica.
Dynamic Session Partitions.
Concurrent Workflow Execution.
Load Balancing for Workflow Distribution on GRID.
Steps to determine the best way to improve performance of the existing Informatica Objects.
Iterative methods of identifying one performance bottleneck at a time and eliminate it.
Optimization techniquies for Flat File Targets.
Optimization techniquies for Flat File Sources.
Optimization techniquies for Mappings.
Optimization techniquies for Datatype Conversions.
Optimization techniquies for Expression Transformation.
Optimization techniquies for Aggregator Transformation.
Optimization techniquies for Custom Transformation.
Optimization techniquies for Joiner Transformation.
Optimization techniquies for Lookup Transformation.
Optimization techniquies for Normalizer Transformation.
Optimization techniquies for Sequence Generator Transformation.
Optimization techniquies for Sorter Transformation.
Optimization techniquies for Source Qualifier Transformation.
Optimization techniquies for SQL Transformation.
Optimization techniquies for the Sessions.
Optimization techniquies for Grid Deployment.
Optimization techniquies for the (Operating Systems) Systems.
Optimizing Buffer Block Sizes.
Optimizing Cache Sizes for Cache using transformations.

Course content

9 sections • 48 lectures • 3h 31m total length

Introduction1:12
Welcome to the Informatica Power Center course!
What is the goal of the Performance Tuning exercise?3:51
What should be the goal of any Performance Tuning exercise? is discussed in this lecture.
Iterative Approach and Tracking the Progress8:58
Performance Tuning is always an iterative approach. In this lecture, we see how the iterative process works and how to track the numbers.

How important is Business Process for solving the Performance Tuning issues?4:46
In this lecture, we understand the importance of Business process in resolving the Performance Tuning issues.
How will the Technology Landscape of the application help?3:40
In this lecture, we understand the importance of Technology Landscape in resolving the Performance Tuning issues.
Development and Design Process of the application1:09
In this lecture, we discuss about the impact of development and design process on the performance tuning exercise.
Issues, Concerns and User expectations1:16
In this lecture, we see how the known issues, concerns and the expectations of multiple stakeholders will help resolve lot of questions on the non functional requirements and fix the performance tuning issues.

The back stage artists - Session's Anatomy1:57
What really happens once the workflow is triggered is discussed in this lecture.
What are Source Bottlenecks?3:49
How does a Source Bottleneck occur is discussed in this lecture.
What are Target Bottlenecks?1:34
How does a Target Bottleneck occur is discussed in this lecture.
What are Transformation Bottlenecks?3:03
How does a Transformation Bottleneck occur is discussed in this lecture.
What are Session Bottlenecks?3:24
How does a Session Bottleneck occur?
What are System Bottlenecks?2:53
In this lecture, we discuss about the System Bottlenecks.
Summary of the Bottlenecks2:42
A quick review on all the types of bottlenecks.

What are Partitions?7:01
A partition is a pipeline stage that executes in a single reader, transformation, or Writer thread. The number of partitions in any pipeline stage equals the number of Threads in the stage. By default, the Integration Service creates one partition in every pipeline stage.
What are the different situations in which Partitions should be considered?5:01
In this lecture,we talk about the different situations in which Partitions can be used.
What are Partition Points?7:55
By default, IS sets partition points at various transformations in the pipeline. Partition points mark thread boundaries and divide the pipeline into stages.A stage is a section of a pipeline between any two partition points.
What are different types of Partitions in Informatica?2:46
In this lecture, we talk about the different types of partitions available in Informatica.
Database Partitions2:58
In this lecture, the Database Partitions are discussed.
What is the difference between Database partitions and Informatica Partitions?0:55
This difference between the Database partitions and Informatica partitions are discussed in this lecture.
What happens at the back-end when the Database partitions are enabled?2:43
What happens at the back end of the database and how Informatica gets the database partition information is discussed in this lecture.
Example of Database Partitions6:43
In this lecture, an example of the Database Partitions is shown.
Passthrough Partitions5:35
This lecture details the Passthrough partitions with an example.
Round Robin Partitions4:16
The PowerCenter Integration Service distributes blocks of data to one or more partitions. Use round-robin partitioning so that each partition processes rows based on the number and size of the blocks.

What are Dynamic Partitions and different types of Dynamic Partitions?10:03
Dynamic Partitioning:

If the volume of data grows or you add more CPUs, session run time increases. In dynamic partitioning, Integration Service determines the number of partitions to create at run time based on factors such as source database partitions or the number of nodes in a grid.

Types of Dynamic Partitioning:

1.    Based on Source Partitioning - Determines the number of partitions using database partition information. The number of partitions is the maximum of the number of partitions at the source.

2.    Based on number of CPUs - Sets the number of partitions equal to the number of CPUs on the node that prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of nodes in the grid.

3.    Based on number of Nodes in Grid - Sets the partitions to the number of nodes in the grid running the session. If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a message in the session log.

4.    Based on No. of Partitions - Sets the partitions to a number that you define in the Number of Partitions attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
Dynamic Partitions - Example8:29
In this lecture, we talk about the Dynamic Partitions with the option of 'Based on Source Partitions'

What is Push Down Optimization?2:45
Pushdown optimization technique in inIormatica pushes the part or complete transformation logic to the source or target database. The integration service translates the transformation logic into SQL queries and sends those SQL statements to the database. Then the source or target database runs those SQL queries to process the transformation logic.

The amount of transformation logic you can push to the database depends on the database, transformation logic, mapping and session configuration. The Integration Service processes all transformation logic that it cannot push to a database.

To preview the SQL statements and mapping logic that the integration service pushes to the source or target database, use the pushdown optimization viewer.
How does Push Down Optimization work?3:47
The Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the source or the target database which executes the SQL queries to process the transformations. The amount of transformation logic one can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service analyzes the transformation logic it can push to the database and executes the SQL statement generated against the source or target tables, and it processes any transformation logic that it cannot push to the database.
Different Types of PDO and Source PDO5:31
There are three types of pushdown optimization techniques that you can configure in a session. They are:
Source-side Pushdown Optimization
Target-side Pushdown Optimization
Full Pushdown Optimization
Source-Side Pushdown Optimization:

The Integration Service first analyzes the mapping from the source to the target or until it reaches a downstream transformation that it cannot push to the source database. Then the integration service generates the SELECT statement and executes the SQL in the source database. Then it reads the result rows and process the remaining transformations.
Target Push Down Optimization1:45
Target-Side Pushdown Optimization:

The Integration Service first analyzes the mapping from the target to the source or until it reaches an upstream transformation that it cannot push to the target database. Then the integration service generates the Insert, Update or Delete statements. The integration service processes the transformation logic up to the point that it can push the transformation logic to the database. Then, it executes the generated SQLs on the target database.
Full Push Down Optimization2:19
Full Pushdown Optimization:

You can use full pushdown optimization only when the source and target databases are in the same relational database management system.

The Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. If the Integration Service cannot push all transformation logic to the target database, it tries to push all transformation logic to the source database.

If it cannot push all transformation logic to the source or target, the Integration Service pushes as much transformation logic to the source database, processes intermediate transformations that it cannot push to any database, and then pushes the remaining transformation logic to the target database. The Integration Service generates and executes an INSERT SELECT, DELETE, or UPDATE statement for each database to which it pushes transformation logic.
Session Properties for Push Down Optimization2:49
In this lecture, we discuss the different options available at session level for PDO.
How is Push Down Optimization performed?7:47
Example of the PDO.
Different Databases which support PDO2:18
In this lecture, we see the list of different databases which can be used for Push Down Optimization.
Error Handling for PDO4:37
This lecture, has the details of Error handling while performing the PDO.

Source Qualifier Transformation Best Practices8:30
Only use SQL Overrides if there is a substantial performance gain or complexity decrease. SQL Overrides need to be maintained manually and any changes to the data structure will result in rewriting or modifying the SQL Override.
Do use the WHERE condition and SORTED ports in the Source Qualifier if possible, rather than adding a filter or a sorter transformation.
Delete unused ports / only connect what is used. Reducing the number of records used throughout the mapping provides better performance by minimizing the amount of data moved.
Tune source qualifier queries to return only the data you need.
Perform large lookups in the Source Qualifier instead of through a traditional lookup.
When applicable, generate the default SQL in the Source Qualifier and use the ‘Validate’ option to verify that the resulting SQL is valid.
Expression Transformation Best Practices6:34
Calculate once, use many times. Avoid calculating or testing the same value over and over. Calculate it once in an expression, and set a true/false flag. Within an expression, use variables to calculate a value used several times.
Create an anchor expression transformation that will map the source table to an intermediary transformation using the source column names. Do simple processes (LTRIM/RTRIM, string/numeric conversions, testing for NULL, etc.) in this transformation. This will enable an easier transition if the source table changes in the future.
Watch your data types. The engine will automatically convert compatible types. Sometimes conversion is excessive and happens on every transformation which slows the mapping.
Expression names should begin with "EXP" followed by descriptive words
Do not propagate ports out of an Expression transformation if they are not used in the mapping going forward.
Group input-outputs ports first, followed by variable ports and then by output ports. Incorrectly ordering the ports in an Expression transformation can lead to errors and/or inaccurate results.
Lookup Transformation - Best Practices13:21
Avoid using the $Source and $Target variables in the Lookup Connection Information. Connection names have been set up to be generic across Production, Test. If possible, set the Connection Information in the Lookup transformation to one of these non-level specific connection names.
Set the connections in the session for ease of migration.
Do not include any more ports in the Lookup other than necessary. Reducing the amount of data processed provides better performance.
Avoid date time comparisons in lookup; replace with string.
Not all sources and targets treat strings with leading or trailing blanks the same. It may be necessary to RTRIM and LTRIM string data prior to using it in a Lookup.
Lookups on small tables (<10,000) records can be cached and use ODBC. Lookups on large tables should be cached As a general practice, do not use uncached lookups.
In place of lookups, tables can be joined in the source qualifier. However, often necessitates left joins, which can complicate source qualifiers. Weigh performance vs ease of maintenance when deciding between source qualifiers and lookups.
Sequence Generator Transformation - Best Practices2:40
Try to use reusable sequence generator than using separate sequence generator if you are using it for generating unique primary key.
Aggregator Transformation - Best Practices4:29
Do not use Aggregators for simple sorting; use the sorter transformation or the SORTED ports option of the Source Qualifier.
Minimize aggregate function calls by using “group by”.
Do place Aggregators as early in the mapping as possible, as they reduce the number of records being processed, thereby improving performance.
Wherever possible, sort incoming data to an Aggregated use the ‘Sorted input’ option to improve performance.
Filter Transformation - Best Practices4:15
Place Filters as early in the mapping as possible, as they reduce the number of records being processed, thereby improving performance.
Use a Filter to screen rows that would be rejected by an update strategy. (Rejected rows from an update strategy are logged to the bad file, decreasing performance.)
If you have Aggregator transformation in mapping, use filter before aggregation to avoid unnecessary aggregation.
Router Transformation - Best Practices2:50
Routers may not be the best choice if load order of the target(s) is important since it is not possible to control the load order of the legs from a router.
The target load method(s) must be carefully chosen when using routers, especially if the data is loading to the same target, in order to avoid table locks and ensure that the data is loaded in the correct order.
Joiner Transformation - Best Practices2:54
Whenever possible, perform joins in the database i.e. the source qualifier itself.
Whenever possible, sort incoming data to a Joiner transformation and use the ‘Sorted input’ option to improve performance.
To improve performance of an unsorted Joiner transformation, designate the source with fewer s as the ‘Master’.
Update Strategy Transformation - Best Practices2:54
Do not codes update strategies when all rows to the target insert.
DO include an update strategy when all rows to the target are update, unless a proof of concept shows that there is performance degradation. This adds clarity to the mapping for future developers.
Rejected rows from an update strategy are logged to the bad file. Consider filtering if retaining these rows isn’t critical due to the performance hit caused by logging.
Avoid loading to the same target between different data flows where possible.
Sorter Transformation - Best Practices1:50
Whenever possible, sort source data in the database i.e. the source qualifier itself.

Requirements

You should know Informatica Power Center
ETL Methodologies using Informatica Power Center

Description

The purpose of this course is to walk you through the steps to identify performance bottlenecks in Informatica Power Center and different optimization techniques to fix them. After completing this course, you will have a working understanding of the different performance bottlenecks, how to identify them based on the iterative approach and the thread statistics. Different best practices and optimization techniques which should be consideration while designing the mapping or fine tuning the existing ones.

Who this course is for:

ETL Developers
ETL Testers/QA Analysts
Informtaica Administrators
UNIX Administrators managing the Data Warehouse Enviornments
Relational DBAs managing the Data Warehouse Enviornments
Windows Administrators managing the Data Warehouse Enviornments

Informatica Performance Tuning and Optimization Techniques

What you'll learn

Explore related topics

Course content

Introduction3 lectures • 14min

Pre-requisuts for Performance Tuning4 lectures • 11min

Bottlenecks7 lectures • 19min

How to identify different Bottlenecks?2 lectures • 18min

Partitions in Informatica10 lectures • 46min

Dynamic Partitions2 lectures • 19min

Push Down Optimization (PDO)9 lectures • 34min

Transformation Optimization Techniques and Best Practices10 lectures • 50min

New Topics are added every week1 lecture • 1min

Requirements

Description

Who this course is for: