Informatica Transformations are repository objects which can read, modify or pass data to the defined target structures like tables, files, or any other targets required. A Transformation is basically used to represent a set of rules, which define the data flow and how the data is loaded into the targets. Informatica PowerCenter provides multiple transformations, each serving a particular functionality. Plus, with Informatica leading today’s market in the data integration platform, Informatica Transformations come as a crucial concept required for Informatica Certification.
To understand Informatica Transformations better, let us first understand what is mapping? A mapping is a collection of source and target objects linked together by a set of transformations. Hence transformations in a mapping represent the operations that the integration service will perform on the data during the execution of the workflow. To get a better understanding of workflow, you can check out our blog Informatica Tutorial: Workflow management
What are the Various Informatica Transformations?
Informatica Transformations can be mainly classified into two categories. Firstly based on the connectivity(Linking in mapping) of the transformations with each other and the second is based on the change in the overall no of rows between the source and target. Let’s start by taking a look at the Informatica transformations based on connectivity.
1) Types of transformations in Informatica based on connectivity:
- Connected Transformations.
- Unconnected Transformations.
In Informatica, those transformations which are connected to one or more transformations are called as Connected transformations.
The connected transformations are used when for every input row, a transformation is called and is expected to return a value. For example, we can use a connected lookup transformation to know the names of every employee working a specific department by specifying the Department ID in the lookup expression.
Some of the Major connected Informatica transformations are Aggregator, Router, Joiner, Normalizer, etc.
Those transformations that are not connected to any other transformations are called Unconnected transformations. Their functionality is used by calling them inside other transformations like Expression transformation. These transformations are not part of the mapping pipeline.
The unconnected transformations are used when their functionality is only required based upon certain conditions. For example, As a programmer you wish to perform a complicated operation on the data, however you do not wish to use Informatica transformations like expression or filter transformations to perform this operation. In such a case, you can create an external DLL or UNIX shared library with the codes to perform the operation and call them in the External procedure transformation.
There are 3 Informatica transformations viz. External Procedure, Lookup, and Stored Procedure which can be unconnected in a valid mapping (A mapping which the Integration Service can execute).
2) Types of Informatica transformations based on the change in no of rows
- Active Transformations
- Passive Transformations
Active Transformations: – An active transformation can perform any of the following actions:
- Change the number of rows that passes through the transformation: For instance, the Filter transformation is active because it removes rows that do not meet the filter condition.
- Change the transaction boundary: A transaction boundary is a boundary that encloses all the transactions before a commit is called or between two commit calls. For e.g., During a transactional operation, the user feels that after certain transactions a commit is required and calls the commit command to create a savepoint and by doing so the user changes the default transaction boundary. By default, the transaction boundary lies between the start of the file to auto commit point or EOF.
- Change the rowtype attribute: Rowtype attribute is a record type that represents a row in a table. The record can store an entire row of data selected from the table or fetch from a pointer or pointer variable. For e.g., The Update Strategy transformation flags rowstype as 0 for inserting values, 1 for update, 2 for delete or 3 for reject.
- Aggregator, Filter, Joiner, Normalizer, etc. are a few examples of Active transformation.
Passive Transformation: A passive transformation is one which will satisfy all these conditions:
- The number of rows before and after transformation is the same.
- Maintains the transaction boundary .
- Maintains the rowtype attribute.
- Expression, ExternalProcedure, HTTP, etc. are a few examples of Passive transformation.
In the passive transformation, no new rows are created, or existing rows are dropped.
You must be wondering why passive transformations are used for if they do not change the number of rows. They are generally used to update values, calling an external procedure from a shared library and to define the input and output of maplets. A maplet is a collection of only the transformations from the mapping. For e.g., For a student database we wish to update the values of marks column to percentile instead of the percentage, this can be done by using an expression transformation which will convert the values and update in the same columns keeping the overall number of rows same after the transformations.
There is no restriction that if a transformation is being used as a passive transformation, it cannot be used later as active transformation. Similarly, an unconnected transformation can be used as a connected transformation as per needs. All possible combinations can be formed between these categories and this is the magic of Informatica transformations. You will get a better idea later in this blog about the possible types a transformation can belong to.
Now that we have gotten an understanding of the various types of Informatica transformations, let’s begin exploring them. Below are a few major types of Informatica transformations:
Transformation | Type | Description |
Aggregator | Active Connected | Performs aggregate calculations. |
Expression | Passive Connected | Calculates a value. |
Java | Active Connected or Passive Connected | Executes user logic coded in Java. The bytecode for the user logic is stored in the repository |
Joiner | Active Connected | Joins data from different databases or flat file systems. |
Lookup | Active Connected or Passive Connected or Active Unconnected or Passive Unconnected | Lookup and return data from a flat file, relational table, view, or synonym. |
Normalizer | Active Connected | Used in the pipeline to normalize data from relational or flat file sources. |
Rank | Active Connected | Limits records to a top or bottom range. |
Router | Active Connected | Routes data into multiple transformations based on group conditions. |
SQL | Active Connected or Passive Connected | Executes SQL queries against a database. |
Union | Active Connected | Merges data from different databases or flat file systems. |
XML Generator | Active Connected | Reads data from one or more input ports and outputs XML through a single output port. |
XML Parser | Active Connected | Reads XML from one input port and outputs data to one or more output ports. |
XML Source Qualifier | Active Connected | Represents the rows that the Integration Service reads from an XML source when it runs a session. |
Let us now start looking at the transformations one by one.
Aggregator Transformation
Aggregator transformation is an Active and Connected transformation. This Informatica transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups). For example, to calculate the total number of daily sales or to calculate the average of monthly or yearly sales. Aggregate functions such as AVG, FIRST, COUNT, PERCENTILE, MAX, SUM, etc., can be used in aggregate transformation.
Lookup Transformation
Lookup transformation is the most popular and widely used Informatica transformation. Based on the requirement of the user has, the lookup transformation can be used as a Connected or Unconnected transformation combining it as an Active or Passive transformation. It is used to mainly look up the details from a source, source qualifier, or target in order to get relevant required data. You can also look up a ‘flat file’, ‘relational table’, ‘view’ or ‘synonym’. One can use multiple lookup transformations in a mapping.
The lookup transformation is created with the following type of ports(Logical points for transfer of information):
- Input port (I)
- Output port (O)
- Look up Ports (L)
- Return Port (R) (Only in case of Unconnected lookup)
Differences between Connected and UnConnected Lookup Transformation:
- Connected lookup receives input values directly from mapping pipeline, whereas UnConnected lookup receives values from the lookup expression from another transformation. A mapping in Informatica may contain Source, Transformations and Targets connected together are considered as a pipeline.
- Connected lookup returns multiple columns from the same row as they have multiple return ports, whereas UnConnected lookup has only one return port and returns one column from each row. For e.g., If we use a connected lookup on an employee database for a specific department id as a parameter, we can get all the details related to the employees of that department like their Names, Employee ID number, Address, etc., whereas with an Unconnected lookup we can get only one attribute of the employee like their Name or Employee Id number or any attribute specified by the user.
- Connected lookup caches all lookup columns, whereas UnConnected lookup caches only the lookup output and lookup conditions.
- Connected lookup supports user-defined default values, whereas UnConnected lookup does not support user defined values. For e.g., If you wish to change all values of a certain column to NULL after lookup, you can set the default value of those columns to NULL in the lookup expressions. This feature is however not possible in case of UnConnected lookup.
Let’s say from a customer database, I wish to know the details of customers who have more than 1 non-cancelled invoice. To obtain this data, We can use a lookup transformation.
Here are the steps.
- Begin by loading the Invoice table as the source into the mapping designer. In case you are not clear on how to load source data into the Designer, Click here.
- Let us now filter out the Invoices which are not cancelled. To do this Create a new filter named fil_ODS_CUSTOMER_ACTIVE to the Source Qualifier with the property NOT (ISNULL (DATE_CLOSED)) AND CANCELED = 0.
- Now Add a lookup transformation in the designer as seen below with name as lkp_CUSTOMER :
- Specify the lookup table as the customer table.
- Double click on the header of lkp_CUSTOMER to open the edit menu. Under Condition tab set the lookup condition as CUST_ID = CUST_NO.
- In the Properties tab change the Connection Information to $Source and click on OK to save the transformation:
- Link the lkp_CUSTOMER ports to ODS_CUSTOMER_ACTIVE ports to complete the required transformation where ODS_CUSTOMER_ACTIVE is the required target file:
- The final iconic map including the lookup transformation should be as below:
Expression Transformation
Expression transformation is a Passive and Connected Informatica transformation. Expression transformations are used for row-wise manipulation. For any type of manipulation you wish to perform on an individual record, use an Expression transformation. The Expression transformation accepts the row-wise data, manipulates it, and passes it to the target. For example, to calculate the discount for each product or to concatenate first and last names or to convert dates to a string field.
Joiner Transformation
- Normal
- Master Outer
- Detail Outer
- Full Outer
We cannot join more than two sources using a single joiner. To join three sources, we need to have two joiner transformations.
Let’s say, we want to join three tables – Employees, Departments and Locations – using Joiner. We will need two joiners. Joiner-1 will join, Employees and Departments and Joiner-2 will join, the output from the Joiner-1 and Locations table.
Here are the steps:
- Bring three sources into the mapping designer.
- Create the Joiner -1 to join Employees and Departments using Department_ID.
- Create the next joiner, Joiner-2. Take the Output from Joiner-1 and ports from Locations Table and bring them to Joiner-2. Join these two data sources using Location_ID.
- The last step is to send the required ports from the Joiner-2 to the target or via an expression transformation to the target table.
Union Transformation
The Union Transformation is an Active and Connected Informatica transformation. It is used to merge multiple datasets from various streams or pipelines into one dataset. This Informatica transformation works similar to the UNION ALL command in SQL but, it does not remove any duplicate rows. It is recommended to use an aggregator to remove duplicates which are not expected at the target.
Normalizer Transformation
Normalizer Transformation is an Active and Connected Informatica transformation. It is one of the most widely used Informatica transformations mainly with COBOL sources where most of the time data is stored in de-normalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data.
Let’s try to load a comma separated data flat file from a flat file/Cobol Source.
Here are the steps:
- Start by loading the Store (flat file) with the store name and Quarterly revenue:
- Create a new Normalizer transformation named NRM_STORE_EXP with two ports Store and Quarter(Repeats 4 times because we have data for 4 quarters) as seen below:
- The ports tab should be as seen below:
- Copy/Link the following columns and connect to Normalizer Transformation.
Store
Quarter1
Quarter2
Quarter3
Quarter4
The mapping should look as follows: - Create a new Expression Transformation with exp_STORE. Copy/Link the following columns and connect to Expression Transformation as seen below:
Store
Quarter
GK_QUARTER
GCID_QUARTER - Link the expression to the final target to complete the mapping using Normalization transformation.
XML transformation
XML transformations is an Active and Connected Informatica transformation. In Informatica transformations, XML transformation is mainly used when the source file is of XML type or data is of XML type. XML transformation can mainly be classified into 3 transformations:
- XML Source Qualifier Transformation.
- XML Parser Transformation.
- XML Generator Transformation.
XML Source Qualifier Transformation: XML Source Qualifier is an Active and Connected transformation. XML Source Qualifier is used only with an XML source definition. It represents the data elements that the Informatica Server reads when it executes a session with XML sources. XML Source Qualifier has one input or output port for every column in the source. If you remove an XML source definition from a mapping, the Designer also removes the corresponding XML Source Qualifier transformation.
XML Parser Transformation : XML Parser Transformation is an Active and Connected transformation. XML Parser transformation is used to extract XML inside a pipeline and then pass this to the target. The XML is extracted from the source systems such as files or databases. The XML Parser transformation reads XML data from a single input port and writes data to one or more output ports.
XML Generator Transformation : XML Generator is an Active and Connected transformation. XML Generator transformation is used to create XML inside a pipeline. XML Generator Transformation reads data from one or more input ports and outputs XML through a single output port.
Rank Transformation
Rank transformation is an Active and Connected transformation. It is an Informatica transformations that helps you in selecting the top or bottom rank of data. For example, to select top 10 Regions where the sales volume was very high or to select 10 lowest priced products.
Consider you wish to load the first and last record into a target table from my employee database.The idea behind this is to add a sequence number to the records and then take the Top 1 rank and Bottom 1 Rank from the records.
- Drag and drop ports from source qualifier to two rank transformations.
- Create a reusable sequence generator having start value 1 and connect the next value to both rank transformations.
- Set rank properties as follows. The newly added sequence port should be chosen as Rank Port. No need to select any port as Group by Port.Rank – 1
- Rank – 2
- Make two instances of the target. Connect the output port to target.
Router Transformation
Router is an Active and Connected transformation. It is similar to filter transformation. The only difference is, filter transformation drops the data that do not meet the condition whereas router has an option to capture the data that do not meet the condition. It is useful to test multiple conditions. It has input, output and default groups.
Let’s say you wish to separate the odd and even records of a table, this can be done by using a router transformation.
The idea is to add a sequence number to the records and then divide the record number by 2. If it is divisible, then move it to even target and if not then move it to odd target.
- Drag the source and connect to an expression transformation.
- Add the next value of a sequence generator to expression transformation.
- In expression transformation make two port, one is “odd” and another “even”.
- Write the expression as below
- Connect a router transformation to expression.
- Make two group under the router transformation.
- Give condition as below
- Then send the two group to different targets. This is the entire flow.
I hope this Informatica Transformation blog was helpful to build your understanding on the various Informatica transformation and has created enough interest to learn more about Informatica.
If you found this blog helpful, you can also check out our Informatica Tutorial blog series What is Informatica: A Beginner Tutorial of Informatica PowerCenter and Informatica Tutorial: Understanding Informatica ‘Inside Out’ . In case if you are looking for details on Informatica Certification, you can check our blog Informatica Certification: All there is to know.