In the last blog, we learnt what Informatica is and how it is applied in real life. In this Informatica Tutorial blog, let us dive deeper into Informatica, its architecture and a use case. Informatica certification is one of the most sought-after skills in today’s market, because Informatica is a unique and neutral data integration platform that interoperates across a broad range of disparate standards, systems, and applications. As discussed in the last blog, Informatica PowerCenter is the flagship product of Informatica, and the two names are often used interchangeably. Just to recap, Informatica PowerCenter is a single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed. It is an ETL (Extract, Transform and Load) tool, and its main advantages over other ETL tools are as follows:
- It is robust and can be used on both Windows and UNIX-based systems
- It performs well, yet it is simple to develop with, maintain and administer
Informatica Tutorial: Understanding Informatica PowerCenter
To understand how Informatica works in practice, we need to understand its architecture and its other components in depth. By the end of this Informatica Tutorial blog, you will be able to understand the following:
- What is Informatica Architecture?
- Client Component of Informatica
  - Informatica PowerCenter Repository Manager
  - Informatica PowerCenter Designer
  - PowerCenter Workflow Manager
  - PowerCenter Workflow Monitor
  - Administrator Console
- Server Component of Informatica
  - Repository Service
  - Integration Service
  - SAP BW Service
  - Web Services Hub
- Flow of data in Informatica
- Informatica Domain & Nodes
- Informatica Services & Service Manager
- Use Case: How to load product dimension table using SCD
What is Informatica Architecture?
The architecture of Informatica PowerCenter is based on the Service Oriented Architecture (SOA) concept. A service-oriented architecture can be defined as a group of services that communicate with each other. The communication involves either simple data transfer or two or more services coordinating the same activity.
Development in Informatica is based on component-based development techniques. Component-based development is a technique where predefined components or functional units, each with a specific functionality, are used to assemble the final product. PowerCenter follows this methodology by letting you build a data flow from a source to a target using different components (called transformations) and linking them to each other as required. A good way to proceed is to first understand the components of Informatica, and then learn how to apply Informatica to solve a typical business problem through a use case.
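To make the idea of linking transformations into a data flow more concrete, here is a minimal Python sketch. It is not PowerCenter code (mappings are built graphically in the Designer); the function names `filter_active`, `uppercase_name` and `run_mapping`, as well as the sample rows, are made up purely for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transformation = Callable[[Iterable[Row]], Iterator[Row]]


def filter_active(rows: Iterable[Row]) -> Iterator[Row]:
    """Filter-style component: pass through only rows flagged as active."""
    for row in rows:
        if row["active"]:
            yield row


def uppercase_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Expression-style component: derive a new value for the name column."""
    for row in rows:
        yield {**row, "name": str(row["name"]).upper()}


def run_mapping(source: Iterable[Row], transformations: List[Transformation]) -> List[Row]:
    """Link the components in order and collect the rows that reach the target."""
    rows: Iterable[Row] = source
    for transform in transformations:
        rows = transform(rows)
    return list(rows)


# Hypothetical source rows
source_rows = [
    {"id": 1, "name": "mary", "active": True},
    {"id": 2, "name": "hannah", "active": False},
]

target_rows = run_mapping(source_rows, [filter_active, uppercase_name])
print(target_rows)  # [{'id': 1, 'name': 'MARY', 'active': True}]
```

Each function is a small, reusable unit, and the data flow is simply the order in which the units are linked. That is the same idea a mapping expresses, at a much smaller scale.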
The Informatica PowerCenter tool consists of two components:
- Client component
- Server component
Client Components of Informatica PowerCenter:
PowerCenter Repository Manager:
Repository Manager is used to administer repositories and to manage users and groups. We can create, delete, and edit repository users and user groups. We can also assign and revoke repository privileges and folder permissions.
The Repository Manager has the following windows:
- Navigator: It displays all objects that you create in the Repository Manager, the Designer, and the Workflow Manager. It is organized first by repository and then by folder.
- Main: It provides properties of the object selected in the Navigator. The columns in this window change depending on the object selected in the Navigator.
- Output: It provides the output of tasks executed within the Repository Manager.
Informatica PowerCenter Designer
The PowerCenter Designer is the client where we specify how to move data between various sources and targets. This is where we implement the business requirements, by passing the data through different PowerCenter components called transformations. The Designer is used to create source definitions, target definitions, and transformations, which can then be used to develop mappings.
Informatica PowerCenter Workflow Manager
A workflow is an ordered set of one or more sessions and other tasks, designed to accomplish an overall operational purpose. It executes a series of mappings (as sessions) and other tasks.
The Workflow Manager is the PowerCenter application that enables designers to build and run workflows. It can be opened as follows:
- Can be launched from the Designer by clicking the “W” icon
- Can be opened independently from the path Start > All Programs > Informatica PowerCenter 9.6.1 > Client > PowerCenter Client > PowerCenter Workflow Manager
- Can be opened from the Workflow Designer, the tool you use to create workflow objects
The Workflow Manager displays the following windows to help you create and organize workflows:
- Navigator: You can connect to and work in multiple repositories and folders. In the Navigator, the Workflow Manager displays a red icon over invalid objects.
- Workspace: You can create, edit, and view tasks, workflows, and worklets.
- Output: It contains tabs to display different types of output messages. The Output window contains the following tabs:
  - Save: Displays messages when you save a workflow, worklet, or task. The Save tab displays a validation summary when you save a workflow or a worklet.
  - Fetch Log: Displays messages when the Workflow Manager fetches objects from the repository.
  - Validate: Displays messages when you validate a workflow, worklet, or task.
  - Copy: Displays messages when you copy repository objects.
  - Server: Displays messages from the Integration Service.
  - Notifications: Displays messages from the Repository Service.
Informatica Workflow Designer
It maps the execution order and dependencies of sessions, tasks and worklets for the Informatica Server.
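As a rough illustration of what "execution order and dependencies" means, the sketch below models a workflow as a set of tasks joined by links and derives a valid run order with Python's standard `graphlib` module (Python 3.9 or later). This is not how the Integration Service is implemented; the task names (`s_load_staging`, `s_load_dimension`, `email_on_success`) are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical workflow: each task maps to the set of tasks it depends on.
# A link from task A to task B is recorded as "B depends on A".
links: Dict[str, Set[str]] = {
    "Start": set(),
    "s_load_staging": {"Start"},
    "s_load_dimension": {"s_load_staging"},
    "email_on_success": {"s_load_dimension"},
}


def execution_order(task_links: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every task runs after the tasks it depends on."""
    return list(TopologicalSorter(task_links).static_order())


order = execution_order(links)
print(order)
# ['Start', 's_load_staging', 's_load_dimension', 'email_on_success']
```

If the links formed a loop, `static_order()` would raise `graphlib.CycleError`, which mirrors the intuition that a workflow needs a well-defined order of execution.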
Task Developer
It creates Session, Shell Command and Email tasks. Tasks created in the Task Developer are reusable.
Worklet Designer
It creates objects that represent a set of tasks. Worklet objects are reusable.
The Workflow Manager also displays a status bar that shows the status of the operation you perform.
The following figure illustrates what a typical workflow looks like, including the Start task, Link, and Session task components.
Informatica PowerCenter Workflow Monitor
The Workflow Monitor, a PowerCenter tool, is used to monitor the execution of workflows and tasks.
Workflow Monitor can be used to:
- View details about a workflow or task run in Gantt chart view or task view
- Run, stop, abort, and resume workflows or tasks
The Workflow Monitor displays workflows that have run at least once. It continuously receives information from the Integration Service and Repository Service, and it also fetches information from the repository to display historic information.
How to Open Informatica Workflow Monitor:
To open the Workflow Monitor, go to:
Start > All Programs > Informatica PowerCenter 9.6.1 > Client > PowerCenter Client > PowerCenter Workflow Monitor
The monitor can also be opened:
- From the Workflow Manager Navigator
- Automatically when a workflow is run from the Workflow Manager, if the Workflow Manager is configured to do so
- From Tools > Workflow Monitor in the Designer, Workflow Manager, or Repository Manager
- From the Workflow Monitor icon on the Tools toolbar
Informatica Administrator Console
Informatica Administrator console (the Administrator tool) is used to administer the Informatica domain and Informatica security. It is available once Informatica has been installed.
The Administrator Console performs the following tasks in the domain:
- Managing application services: It manages all application services in the domain, including the integration service and repository service.
- Configuring nodes: It configures node properties including backup directory and resources. It allows the nodes to be shut down and then restarted as well when required.
- Managing domain objects: It creates as well as manages objects such as services, nodes, licenses, and folders.
- Viewing and editing domain object properties: It allows properties for all objects in the domain to be viewed as well as edited within it.
- Security administrative tasks: Manage users, groups, roles, and privileges.
- Viewing log events: It uses the log viewer to view log events of domain, integration service, SAP BW service, web services hub, as well as repository service.
So, in a nutshell, the client side of Informatica comprises five components: Informatica Repository Manager, Informatica PowerCenter Designer, Informatica Workflow Manager, Informatica Workflow Monitor and Informatica Administrator Console. Together they form the framework of the entire tool. Let’s now try to understand the server components of Informatica PowerCenter.
Server Components of Informatica PowerCenter
The PowerCenter server components comprise the following services:
- Repository service: The Repository service manages the repository. It retrieves, inserts, and updates metadata into the repository database tables.
- Integration service: The Integration service runs sessions and workflows.
- SAP BW service: The SAP BW service listens for RFC requests from SAP BW and initiates workflows to extract data from, or load data into, SAP BW.
- Web services hub: The Web services hub receives requests from web service clients and exposes PowerCenter workflows as services.
Now that we have understood both the client and server components of Informatica, the following infographic explains the flow of data in Informatica, i.e. how data is processed:
At this point it makes sense to understand the other fundamental units in Informatica, such as Domain & Node and Service & Service Manager. So let’s take a moment to understand them before we do a hands-on exercise with Informatica.
Informatica Domain & Nodes:
The salient features of a Domain are as follows:
- A Domain is a logical collection or set of nodes and services
- The PowerCenter Domain is the fundamental administrative unit of PowerCenter
- A Domain can be a single PowerCenter installation, or it can consist of multiple PowerCenter installations
The salient features of a node are as follows:
- A node is a logical representation of a physical machine. It has physical attributes such as a hostname and a port number
- Each node runs a service manager which is responsible for the application and core services
- A node can be a gateway node or a worker node, but it can belong to only one Domain
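The relationship between a domain and its nodes can be pictured with a small data model. The sketch below is only a mental model written in Python, not an Informatica API; the class names, node names, hostnames and port numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Logical representation of a physical machine."""
    name: str
    hostname: str
    port: int
    is_gateway: bool = False            # gateway node or worker node
    domain_name: Optional[str] = None   # a node belongs to at most one domain


@dataclass
class Domain:
    """Logical collection of nodes (services are left out of this sketch)."""
    name: str
    nodes: List[Node] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        # Enforce the rule that a node can belong to only one domain
        if node.domain_name is not None:
            raise ValueError(f"{node.name} already belongs to {node.domain_name}")
        node.domain_name = self.name
        self.nodes.append(node)

    def gateway_names(self) -> List[str]:
        return [node.name for node in self.nodes if node.is_gateway]


# Hypothetical domain with one gateway node and one worker node
domain = Domain("Domain_Dev")
gateway = Node("node01", "etl-host-1", 6005, is_gateway=True)
worker = Node("node02", "etl-host-2", 6005)
domain.add_node(gateway)
domain.add_node(worker)

print(domain.gateway_names())  # ['node01']
```

Trying to add `worker` to a second `Domain` raises a `ValueError`, which is the "only one Domain" rule from the list above expressed as a check.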
Informatica Services & Service Manager:
A service is a resource that provides specialized functions. All PowerCenter processes run as services on a node.
Informatica PowerCenter has two types of services:
- Application Services represent server based functions including Repository and Integration Services.
- Core Services represent functions that manage and maintain the environment in which PowerCenter operates and include services like Log Service, Licensing Service, and Domain Service amongst many others.
Service Manager
- The Service Manager is a service that manages all Domain operations and runs on each node within a Domain
- On the gateway node, the Service Manager is responsible for the following:
  - Controlling the Domain
  - Managing the services running on the Domain
  - Providing service lookup
- On all nodes, the Service Manager is meant to control the core services and application services
How different components of PowerCenter interact:
Use Case: How to load a Product Dimension Table using SCD
Problem statement: Our aim is to load a Product Dimension table as a Slowly Changing Dimension (SCD) Type 2, using effective dates.
Given a customer source system which contains the Customer ID, Name, City, State and Country details of the customers, we need to create a new entry in the target dimension table every time a customer comes in with a different value.
To understand this better: if a customer returns with a different value for state or city compared to the value already present in the target dimension table, a new entry has to be created with the updated value. This is achieved by using an SCD-based target table.
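Before walking through the PowerCenter steps, it may help to see the SCD Type 2 rule itself in a few lines of code. The sketch below uses Python with an in-memory SQLite database instead of PowerCenter and Oracle, so it is an illustration of the logic only. The table name `scd_customer_target` matches the query used later in this walkthrough; the column names (including `start_date` and `end_date`), the dates, and the city and state values are assumptions.

```python
import sqlite3


def apply_scd2(conn, source_rows, load_date):
    """Close changed rows and insert new versions (SCD Type 2, effective dates)."""
    cur = conn.cursor()
    for customer_id, name, city, state, country in source_rows:
        # Look up the current (open-ended) version of this customer, if any
        cur.execute(
            "SELECT city, state, country FROM scd_customer_target "
            "WHERE customer_id = ? AND end_date IS NULL",
            (customer_id,),
        )
        current = cur.fetchone()
        if current == (city, state, country):
            continue  # nothing changed, keep the current version
        if current is not None:
            # Expire the old version instead of overwriting it
            cur.execute(
                "UPDATE scd_customer_target SET end_date = ? "
                "WHERE customer_id = ? AND end_date IS NULL",
                (load_date, customer_id),
            )
        # Insert the new version, effective from the load date
        cur.execute(
            "INSERT INTO scd_customer_target "
            "(customer_id, name, city, state, country, start_date, end_date) "
            "VALUES (?, ?, ?, ?, ?, ?, NULL)",
            (customer_id, name, city, state, country, load_date),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scd_customer_target ("
    "customer_id INTEGER, name TEXT, city TEXT, state TEXT, country TEXT, "
    "start_date TEXT, end_date TEXT)"
)

# Initial load, then a second load in which Mary's city and state have changed
apply_scd2(conn, [(1, "Mary", "Austin", "TX", "USA"),
                  (2, "Hannah", "Boston", "MA", "USA")], "2024-01-01")
apply_scd2(conn, [(1, "Mary", "Denver", "CO", "USA"),
                  (2, "Hannah", "Boston", "MA", "USA")], "2024-02-01")

history = conn.execute(
    "SELECT customer_id, state, start_date, end_date "
    "FROM scd_customer_target ORDER BY customer_id, start_date"
).fetchall()
for row in history:
    print(row)
# (1, 'TX', '2024-01-01', '2024-02-01')
# (1, 'CO', '2024-02-01', None)
# (2, 'MA', '2024-01-01', None)
```

Mary ends up with two rows: the old one closed with an end date and a new one that is still open, while Hannah, whose details did not change, keeps a single row. In PowerCenter the same decision is typically built from transformations in a mapping rather than written as code.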
Below is a step-by-step process of loading the product dimension table using SCD.
Step 1: Open PowerCenter Designer.
Step 2: Connect to the repository
Step 3: Launch the Designer
Step 4: Load the source from Database
Step 5: Connect to Database
Step 6: Select SCD_INPUT_DATA table
Step 7: Similarly load target set from database
Step 8: Design a workflow to perform the required operation as seen below
Step 9: Launch Oracle SQL Developer and load SCD_CUSTOMER table
Step 10: Modify the values of state for customers Mary and Hannah
Step 11: Launch Workflow monitor and execute the workflow
Step 12: Execute the query below to view the data in the target table
- select * from scd_customer_target
Step 13: Product Dimension table output
To conclude, the loaded product table contains the historical values of the data, including the changes made to the existing values, and this is achieved using Informatica PowerCenter.
I hope this Informatica Tutorial blog has helped you build a foundation in Informatica and has created enough interest for you to learn more about it.