Informatica Performance Tuning and Optimization Techniques

Partitions, Pushdown Optimization and other Optimization Techniques
4.2 (179 ratings)
874 students enrolled
Created by Sid Inf
Last updated 5/2019
English
English [Auto-generated]
Current price: $139.99 Original price: $199.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 3.5 hours on-demand video
  • 3 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Push Down Optimization.
  • Partitions in Informatica.
  • Dynamic Session Partitions.
  • Concurrent Workflow Execution.
  • Load Balancing for Workflow Distribution on GRID.
  • Steps to determine the best way to improve performance of existing Informatica objects.
  • Iterative methods of identifying one performance bottleneck at a time and eliminating it.
  • Optimization techniques for Flat File Targets.
  • Optimization techniques for Flat File Sources.
  • Optimization techniques for Mappings.
  • Optimization techniques for Datatype Conversions.
  • Optimization techniques for Expression Transformation.
  • Optimization techniques for Aggregator Transformation.
  • Optimization techniques for Custom Transformation.
  • Optimization techniques for Joiner Transformation.
  • Optimization techniques for Lookup Transformation.
  • Optimization techniques for Normalizer Transformation.
  • Optimization techniques for Sequence Generator Transformation.
  • Optimization techniques for Sorter Transformation.
  • Optimization techniques for Source Qualifier Transformation.
  • Optimization techniques for SQL Transformation.
  • Optimization techniques for the Sessions.
  • Optimization techniques for Grid Deployment.
  • Optimization techniques for the Operating System.
  • Optimizing Buffer Block Sizes.
  • Optimizing Cache Sizes for transformations that use caches.
Requirements
  • You should know Informatica Power Center
  • ETL Methodologies using Informatica Power Center
Description

The purpose of this course is to walk you through the steps to identify performance bottlenecks in Informatica Power Center and the different optimization techniques to fix them. After completing this course, you will have a working understanding of the different performance bottlenecks, how to identify them using the iterative approach and the thread statistics, and the best practices and optimization techniques to consider when designing new mappings or fine-tuning existing ones.

Who this course is for:
  • ETL Developers
  • ETL Testers/QA Analysts
  • Informatica Administrators
  • UNIX Administrators managing the Data Warehouse Environments
  • Relational DBAs managing the Data Warehouse Environments
  • Windows Administrators managing the Data Warehouse Environments
Course content
49 lectures 03:32:48
+ Introduction
3 lectures 14:01

Welcome to the Informatica Power Center course!

Preview 01:12

This lecture discusses what the goal of any Performance Tuning exercise should be.

Preview 03:51

Performance Tuning is always an iterative approach. In this lecture, we see how the iterative process works and how to track the numbers. 

Preview 08:58
+ Prerequisites for Performance Tuning
4 lectures 10:51

In this lecture, we understand the importance of Business process in resolving the Performance Tuning issues. 

How important is Business Process for solving the Performance Tuning issues?
04:46

In this lecture, we understand the importance of Technology Landscape in resolving the Performance Tuning issues. 

How will the Technology Landscape of the application help?
03:40

In this lecture, we discuss the impact of the development and design process on the performance tuning exercise.

Development and Design Process of the application
01:09

In this lecture, we see how the known issues, concerns and the expectations of multiple stakeholders help resolve a lot of questions about the non-functional requirements and fix the performance tuning issues.

Issues, Concerns and User expectations
01:16
+ Bottlenecks
7 lectures 19:22

What really happens once the workflow is triggered is discussed in this lecture. 

The back stage artists - Session's Anatomy
01:57

How a Source Bottleneck occurs is discussed in this lecture.

What are Source Bottlenecks?
03:49

How a Target Bottleneck occurs is discussed in this lecture.

What are Target Bottlenecks?
01:34

How a Transformation Bottleneck occurs is discussed in this lecture.

What are Transformation Bottlenecks?
03:03

How does a Session Bottleneck occur? 

What are Session Bottlenecks?
03:24

In this lecture, we discuss System Bottlenecks.

What are System Bottlenecks?
02:53

A quick review of all the types of bottlenecks.

Summary of the Bottlenecks
02:42
+ How to identify different Bottlenecks?
2 lectures 18:28

Thread statistics give run-time information from the reader, transformation and writer threads. The session log provides enough run-time thread statistics to help us understand and identify the performance bottleneck.
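As a rough illustration (the thread names and timings below are invented, not taken from the course), the busy percentage reported for each thread can be reproduced from its run time and idle time, and the busiest thread points to the bottleneck stage:

# Sketch in Python: busy % = (run time - idle time) / run time * 100
threads = {
    "READER_1_1_1": {"run_time": 300.0, "idle_time": 275.0},   # hypothetical values
    "TRANSF_1_1_1": {"run_time": 300.0, "idle_time": 240.0},
    "WRITER_1_*_1": {"run_time": 300.0, "idle_time": 10.0},
}
for name, t in threads.items():
    busy_pct = (t["run_time"] - t["idle_time"]) / t["run_time"] * 100
    print(f"{name}: busy {busy_pct:.1f}%")
# The thread with the highest busy percentage (here the writer) is the first
# candidate bottleneck; the mostly idle reader and transformation threads are
# simply waiting on it.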

What are Thread Statistics and how would these help?
12:43
What are Performance Counters and how are these used to identify the bottlenecks?
05:45
+ Partitions in Informatica
10 lectures 45:53
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any pipeline stage equals the number of threads in the stage. By default, the Integration Service creates one partition in every pipeline stage.
What are Partitions?
07:01

In this lecture, we talk about the different situations in which Partitions can be used. 

What are the different situations in which Partitions should be considered?
05:01

By default, the Integration Service sets partition points at various transformations in the pipeline. Partition points mark thread boundaries and divide the pipeline into stages. A stage is a section of a pipeline between any two partition points.

What are Partition Points?
07:55

In this lecture, we talk about the different types of partitions available in Informatica.

What are different types of Partitions in Informatica?
02:46

In this lecture, the Database Partitions are discussed. 

Database Partitions
02:58

The difference between Database partitions and Informatica partitions is discussed in this lecture.

What is the difference between Database partitions and Informatica Partitions?
00:55

What happens at the back end of the database and how Informatica gets the database partition information is discussed in this lecture.

What happens at the back-end when the Database partitions are enabled?
02:43

In this lecture, an example of the Database Partitions is shown. 

Example of Database Partitions
06:43

This lecture details the Passthrough partitions with an example.

Passthrough Partitions
05:35

The PowerCenter Integration Service distributes blocks of data to one or more partitions. Use round-robin partitioning so that each partition processes rows based on the number and size of the blocks.
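A toy sketch of the idea in Python, assuming nothing about PowerCenter internals (the block labels and partition count are invented): blocks of rows are dealt out to partitions in turn, so each partition ends up with roughly the same amount of work.

# Round-robin distribution of row blocks across partitions (illustrative only)
from itertools import cycle

def round_robin(blocks, num_partitions):
    partitions = [[] for _ in range(num_partitions)]
    for partition, block in zip(cycle(partitions), blocks):
        partition.append(block)
    return partitions

blocks = [f"block_{i}" for i in range(10)]
for i, part in enumerate(round_robin(blocks, 3), start=1):
    print(f"partition {i}: {part}")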

Round Robin Partitions
04:16
+ Dynamic Partitions
2 lectures 18:32

Dynamic Partitioning:

If the volume of data grows or you add more CPUs, you may need to adjust partitioning so that session run time does not increase. With dynamic partitioning, the Integration Service determines the number of partitions to create at run time based on factors such as source database partitions or the number of nodes in a grid.

Types of Dynamic Partitioning:

1.    Based on Source Partitioning - Determines the number of partitions using database partition information. The number of partitions is the maximum of the number of partitions at the source.

2.    Based on number of CPUs - Sets the number of partitions equal to the number of CPUs on the node that prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of nodes in the grid.

3.    Based on number of Nodes in Grid - Sets the partitions to the number of nodes in the grid running the session. If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a message in the session log.

4.    Based on No. of Partitions - Sets the partitions to a number that you define in the Number of Partitions attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
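A small sketch of how each option resolves to a partition count at run time; the function name and parameters below are illustrative assumptions, not PowerCenter settings or APIs.

# Illustrative only: resolving the dynamic partition count per option
def resolve_partitions(method, *, source_partitions=1, cpus_per_node=1,
                       nodes_in_grid=1, configured_number=1, on_grid=False):
    if method == "source":   # Based on Source Partitioning
        return source_partitions
    if method == "cpus":     # Based on number of CPUs (times nodes when on a grid)
        return cpus_per_node * (nodes_in_grid if on_grid else 1)
    if method == "nodes":    # Based on number of Nodes in Grid (one partition off-grid)
        return nodes_in_grid if on_grid else 1
    if method == "fixed":    # Based on Number of Partitions ($DynamicPartitionCount or a number > 1)
        return configured_number
    raise ValueError(f"unknown method: {method}")

print(resolve_partitions("cpus", cpus_per_node=8, nodes_in_grid=3, on_grid=True))   # 24
print(resolve_partitions("nodes", nodes_in_grid=3, on_grid=False))                  # 1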

What are Dynamic Partitions and different types of Dynamic Partitions?
10:03

In this lecture, we walk through an example of Dynamic Partitions using the 'Based on Source Partitioning' option.

Dynamic Partitions - Example
08:29
+ Push Down Optimization (PDO)
9 lectures 33:38

The pushdown optimization technique in Informatica pushes part or all of the transformation logic to the source or target database. The Integration Service translates the transformation logic into SQL queries and sends those SQL statements to the database. The source or target database then runs those SQL queries to process the transformation logic. 

The amount of transformation logic you can push to the database depends on the database, transformation logic, mapping and session configuration. The Integration Service processes all transformation logic that it cannot push to a database. 

To preview the SQL statements and mapping logic that the integration service pushes to the source or target database, use the pushdown optimization viewer. 

What is Push Down Optimization?
02:45

The Integration Service translates the transformation logic into SQL queries and sends the SQL queries to the source or the target database which executes the SQL queries to process the transformations. The amount of transformation logic one can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service analyzes the transformation logic it can push to the database and executes the SQL statement generated against the source or target tables, and it processes any transformation logic that it cannot push to the database.
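As a purely illustrative sketch (the table, columns and logic are hypothetical, not taken from the course), this is the kind of single SQL statement the Integration Service could send to the source database when a Filter and a simple Expression are pushed down instead of being processed row by row inside PowerCenter:

# Hypothetical example of SQL generated for source-side pushdown
pushed_source_sql = """
SELECT
    CUST_ID,
    UPPER(CUST_NAME)  AS CUST_NAME,       -- Expression logic pushed to the database
    ORDER_AMT * 0.9   AS DISCOUNTED_AMT   -- Expression logic pushed to the database
FROM CUSTOMERS
WHERE ORDER_AMT > 1000                    -- Filter logic pushed to the database
"""
print(pushed_source_sql)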

How does Push Down Optimization work?
03:47

There are three types of pushdown optimization techniques that you can configure in a session. They are: 

  • Source-side Pushdown Optimization
  • Target-side Pushdown Optimization
  • Full Pushdown Optimization

Source-Side Pushdown Optimization

The Integration Service first analyzes the mapping from the source to the target or until it reaches a downstream transformation that it cannot push to the source database. Then the Integration Service generates the SELECT statement and executes the SQL in the source database. It then reads the result rows and processes the remaining transformations. 

Different Types of PDO and Source PDO
05:31

Target-Side Pushdown Optimization

The Integration Service first analyzes the mapping from the target to the source or until it reaches an upstream transformation that it cannot push to the target database. Then the integration service generates the Insert, Update or Delete statements. The integration service processes the transformation logic up to the point that it can push the transformation logic to the database. Then, it executes the generated SQLs on the target database. 

Target Push Down Optimization
01:45

Full Pushdown Optimization

You can use full pushdown optimization only when the source and target databases are in the same relational database management system. 

The Integration Service analyzes the mapping from the source to the target or until it reaches a downstream transformation it cannot push to the target database. If the Integration Service cannot push all transformation logic to the target database, it tries to push all transformation logic to the source database. 

If it cannot push all transformation logic to the source or target, the Integration Service pushes as much transformation logic as possible to the source database, processes the intermediate transformations that it cannot push to any database, and then pushes the remaining transformation logic to the target database. The Integration Service generates and executes an INSERT SELECT, DELETE, or UPDATE statement for each database to which it pushes transformation logic. 
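A hypothetical illustration (all names invented) of the kind of single statement full pushdown can collapse a mapping into when the source and target live in the same database:

# Hypothetical example of SQL generated for full pushdown
full_pushdown_sql = """
INSERT INTO SALES_SUMMARY (REGION, TOTAL_AMT)
SELECT REGION, SUM(ORDER_AMT)
FROM SALES
WHERE ORDER_DATE >= DATE '2019-01-01'
GROUP BY REGION
"""
print(full_pushdown_sql)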

Full Push Down Optimization
02:19

In this lecture, we discuss the different options available at session level for PDO.

Session Properties for Push Down Optimization
02:49

An example of PDO is shown in this lecture. 

How is Push Down Optimization performed?
07:47

In this lecture, we see the list of different databases which can be used for Push Down Optimization. 

Different Databases which support PDO
02:18

This lecture has the details of error handling while performing PDO.

Error Handling for PDO
04:37
+ Transformation Optimization Techniques and Best Practices
10 lectures 50:17
  • Only use SQL Overrides if there is a substantial performance gain or complexity decrease.  SQL Overrides need to be maintained manually and any changes to the data structure will result in rewriting or modifying the SQL Override.
  • Do use the WHERE condition and SORTED ports in the Source Qualifier if possible, rather than adding a filter or a sorter transformation.
  • Delete unused ports / only connect what is used.  Reducing the number of records used throughout the mapping provides better performance by minimizing the amount of data moved.
  • Tune source qualifier queries to return only the data you need. 
  • Perform large lookups in the Source Qualifier instead of through a traditional lookup.
  • When applicable, generate the default SQL in the Source Qualifier and use the ‘Validate’ option to verify that the resulting SQL is valid.
Source Qualifier Transformation Best Practices
08:30
  • Calculate once, use many times.  Avoid calculating or testing the same value over and over.  Calculate it once in an expression, and set a true/false flag (see the short sketch after this list).  Within an expression, use variables to calculate a value used several times.
  • Create an anchor Expression transformation that maps the source table to an intermediary transformation using the source column names.  Do simple processing (LTRIM/RTRIM, string/numeric conversions, testing for NULL, etc.) in this transformation.  This will enable an easier transition if the source table changes in the future.
  • Watch your data types.  The engine automatically converts compatible types.  Sometimes the conversion is excessive and happens in every transformation, which slows the mapping.
  • Expression names should begin with "EXP" followed by descriptive words.
  • Do not propagate ports out of an Expression transformation if they are not used further on in the mapping.
  • Group input/output ports first, followed by variable ports and then output ports. Incorrectly ordering the ports in an Expression transformation can lead to errors and/or inaccurate results.
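A minimal Python analogy (not Informatica expression syntax; the column name and derived ports are made up) of the "calculate once, use many times" variable-port pattern: evaluate the shared test once and reuse the flag, instead of repeating the test in every output expression.

# Repeating the same test in several output expressions (what to avoid):
def derive_outputs_repeated(row):
    return {
        "status_desc": "ACTIVE" if row["status_code"] in ("A", "R") else "INACTIVE",
        "bill_flag":   "Y" if row["status_code"] in ("A", "R") else "N",
    }

# Evaluating it once (the equivalent of a v_IS_ACTIVE variable port) and reusing it:
def derive_outputs_once(row):
    is_active = row["status_code"] in ("A", "R")
    return {
        "status_desc": "ACTIVE" if is_active else "INACTIVE",
        "bill_flag":   "Y" if is_active else "N",
    }

print(derive_outputs_once({"status_code": "A"}))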


Expression Transformation Best Practices
06:34
  • Avoid using the $Source and $Target variables in the Lookup Connection Information. Connection names have been set up to be generic across environments (e.g. Production, Test).  If possible, set the Connection Information in the Lookup transformation to one of these non-level-specific connection names.
  • Set the connections in the session for ease of migration. 
  • Do not include any more ports in the Lookup than necessary.  Reducing the amount of data processed provides better performance.
  • Avoid date/time comparisons in lookups; replace them with strings.
  • Not all sources and targets treat strings with leading or trailing blanks the same.  It may be necessary to RTRIM and LTRIM string data prior to using it in a Lookup.
  • Lookups on small tables (<10,000 records) can be cached and use ODBC.  Lookups on large tables should also be cached; as a general practice, do not use uncached lookups. 
  • In place of lookups, tables can be joined in the Source Qualifier.  However, this often necessitates left joins, which can complicate source qualifiers.  Weigh performance vs. ease of maintenance when deciding between source qualifiers and lookups.
Lookup Transformation - Best Practices
13:21

Try to use a reusable Sequence Generator rather than separate Sequence Generators when generating unique primary keys.

Sequence Generator Transformation - Best Practices
02:40
  • Do not use Aggregators for simple sorting; use the Sorter transformation or the SORTED ports option of the Source Qualifier. 
  • Minimize aggregate function calls by using “group by”. 
  • Do place Aggregators as early in the mapping as possible, as they reduce the number of records being processed, thereby improving performance.
  • Wherever possible, sort incoming data to an Aggregator and use the ‘Sorted input’ option to improve performance.

Aggregator Transformation - Best Practices
04:29
  • Place Filters as early in the mapping as possible, as they reduce the number of records being processed, thereby improving performance.
  • Use a Filter to screen rows that would be rejected by an update strategy.  (Rejected rows from an update strategy are logged to the bad file, decreasing performance.) 
  • If you have an Aggregator transformation in the mapping, use a Filter before the aggregation to avoid unnecessary aggregation.
Filter Transformation - Best Practices
04:15
  • Routers may not be the best choice if the load order of the target(s) is important, since it is not possible to control the load order of the legs from a Router. 
  • The target load method(s) must be carefully chosen when using Routers, especially if the data is loading to the same target, in order to avoid table locks and ensure that the data is loaded in the correct order.
Router Transformation - Best Practices
02:50
  • Whenever possible, perform joins in the database, i.e. in the Source Qualifier itself.
  • Whenever possible, sort incoming data to a Joiner transformation and use the ‘Sorted input’ option to improve performance.
  • To improve performance of an unsorted Joiner transformation, designate the source with fewer rows as the ‘Master’.
Joiner Transformation - Best Practices
02:54
  • Do not code update strategies when all rows to the target are inserts.  
  • Do include an update strategy when all rows to the target are updates, unless a proof of concept shows that there is performance degradation.  This adds clarity to the mapping for future developers. 
  • Rejected rows from an update strategy are logged to the bad file.  Consider filtering them out if retaining these rows isn’t critical, due to the performance hit caused by logging.
  • Avoid loading to the same target between different data flows where possible.
Update Strategy Transformation - Best Practices
02:54

Whenever possible, sort source data in the database, i.e. in the Source Qualifier itself.

Sorter Transformation - Best Practices
01:50
+ New Topics are added every week
1 lecture 00:09

New lectures will be added on a weekly basis. 

Upcoming lectures
00:09
+ Bonus Section
1 lecture 01:37

This lecture will provide you with the coupons for other courses.

Direct links to other courses
01:37