It's all about Business Intelligence

July 21, 2010

Crash Course on Informatica part 2

Crash Course on Informatica part 2

In this article we will see how to use the PowerCenter client tools and creating a sample transformation.

PowerCenter Client tools:

Client tools will help us in Creating, Maintaining & Running the ETL programs.It has totally five components.

Tool Usage
Repository Server Administration Console Creating Repositories
Repository Manager Managing Repositories
Designer Creating Data Tranformations
WorfFlow Manager Creating flows which runs the transformations
WorkFlow Monitor Running and monitoring the workflows

Repository Server Administration Console:

Access Path:

We can create, delete, start, and stop repositories through the console.

Repository Manager:

Access Path:

We can use Repository manager for the following things.

i. Importing ETL metadata/objects from XML file

ii. Creating Folders which are containers for ETL objects

iii. Managing users & security related things

iv. Versioning the Repository

v. Analyzing dependencies between ETL objects

Designer:

Access Path:

This is the tool which is used for creating the ETL designs.It has five sub components in it.

Component Usage
Source Analyzer Contains metadata about the data sources
Warehouse Designer Contains metadata for warehouse(typically) where data has to be loaded
Transformation Developer Place where we can design different types of data transformations
Mapplet Designer Place where we can create the mappings and resuse them.Typical use case will be dimension lookups will be performed while loading every fact.So for common dimensions we can write the lookup once and use it in all fact loading places.
Mapping Designer Place where we can combine the source, tranformation and the target.One end to end transformation

We can navigate to each component by going to tool and clicking the appropriate tool.

Now let us see each component with its screenshot and their fntionalities.

Source analyzer:

We can define source as a database,file or XML file.Now we can see a sample of how to define the source as databae.

Step 1:

Define the ODBC for the corresponding database.In my case it is 11g database.

Step 2:

Click on Import from Database and fill the database credentials and click connect.

Step 3:

After selecting the table click ok.Now we had defined the source.

Warehouse Designer:

Now let us create the defintion for the target (warehouse).Let us say I have a table called w_countries where I have to load the data.

Step 1:

In mycase source and target I use the same ODBC for example purpose.If your target is a different database then you have to configure another ODBC for the target database.

Click on Import from database and select the table and then click ok.

Now we had defined the target.

Transformation Designer:

Already we had defined the source & targets.Now let us build a simple transformation just to propogate the data from source to target.

Let’s choose expression as the type and give the transformation name

Double click on the tranformation and enter the port information.Just to explain the usage of input, output and variable ports I had created 4 extra ports.

Mapplet Designer:

Even we have a simple use case of loading the source data to target, Just to explain the usage of mapplet I had created the simple mapplet.Its mandatory for a mapplet to have mapplet input and mapplet output transformations which acts like an entry and exit for a mapplet which will useful when we use this mapplet in a mapping.

I had simply created two extra transformations (mapplet input & mapplet output) and connected them with our transformation.Now we will how to use this mapplet in out mapping.

Mapping Designer:

Mapping is the place where the source & target are connected using the transformation and this mapping will be used to run the actual ETL.In our case we will reuse the mapplet which is created in the previous step. Drag and drop the sources, targets and the mapplet and connect them.As you see in the mapplet you will see only the input and output ports.Actual definition on the mapplet is hidden.you can open it by right clicking and click on open mapplet.We have to create one more transformation for transactional control.Create a transformation with type “Transactional Control” and set the properties as in the below screenshot.This will ensure the data loading gets committed.

Now we are done with creating the mapping.Next we will see how to run this by learning workflow manager and workflow monitor.

WorkFlow Manager:

Access Path

This tool is used to create workflows which contains tasks which runs the mapping we designed using the designer.It has three sub components

Component Usage
Tasks Simply defining task is a unit of work like running a mapping etc.There are many types of tasks.In my next posting I will explain purpose of each task.
Worklets Worklets are reusable tasks similar to Mapplets which helps us to have reusability and maintainability.
Workflows Workflow combines many tasks and run it in a flow.

Tasks:

Let’s create a task for our sample case. Open the workflow manager and connect to the repository.

Click on the task developer from the menu

Click on the Tasks menu and click create

For now choose the type as session and give a name for the task and click create.

Choose our mapping from the list mapping available from the repository.

Now the task is created.After creating the task we need to create the source and target database (or flat file) connections which will be used by the workflow to run them.

Go to connections and click on Relational (as mine is Oracle 11g)

Choose Oracle and enter the database details.

In my case both source and target uses the same database otherwise we have to configure one more for the target data source.

After configuring the connection click on the task and go to its properties, mapping tab.Enter the connection details for both source and target as shown below.

Then click ok.Now we are done with the task and we will proceed to worklet.

Worklets:

Worklets are reusable components similar to the mapplets which contains tasks.We can create both reusable and non reusable worklets based on the requirements.

For our case we will create non reusable worklet.

Go to tool and choose Worklet Designer

Then click on creating a new worklet

After creating the worklet, insert the session type task which we created already and link them using a link.

Everytime after completing your work while saving make sure that you don’t get any errors through the output window.

WorkFlow Designer:

Access Path:

Similar to worklet menu, click on creating new workflow

Insert the worklet and connect them using the links.

Now save the work and make sure that you don’t have any compile error.

After creating our workflows we need to assign one server to this workflow which will run this workflow everytime its called.

Let us see how to create and assign the informatica server to this workflow.

Server Configuration:

Follow the screenshot and click on server configuration

Enter the server details which is same as given during Informatica Server Setup.

After finishing the server configuration, Disconnect from the repository and connect again.This will ensure that all folders in the repository are closed and you wont get any error during assigning workflows (may be try the next step without doing thisJ)

Right click on the server and choose “Assign to Workflows”

Select the corresponding workflow and click assign.

Now we had completed the full cycle.Next lets go to worlflow monitor and test what we had developed till now.

WorkFlow Monitor:

This tool helps us in monitor the workflows we have.

Access Path:

Connect to the repository as below

Start the server and make sure its running.

Now let’s go to the workflow manager and start the workflow which we had designed.

Once you click on start, you will be able to see the status in monitor.on successful completion you will see a window similar one below.

Wow we are done.First ETL run went on successfully.

If you go to the task view tab, you can get statistics information for each task like number of rows loaded etc.

We are pretty much done with some basics of Informatica.In my next next post lets see about types of transformations available in Informatica and Types of tasks available with examples.

Please provide us your feedback on the content..

Thanks..byee

Blog at WordPress.com.