It's all about Business Intelligence

March 21, 2011

Transformations

Transformations

In this posting we will discuss about Transformation in Informatica.  Basically the term “Transformation” refers to transforming the data which comes from Source and loads into the Target.

Transformation is a core capability of any ETL product and particularly when it comes to Informatica it provides 360 degree transformation. It means virtually we can transform the data as we want it to be.

We have classifications in transformation based on the way that they work.

Based on their participation in changing the number of rows we can classify them in two categories.

1.       Active – changes the number of rows like Aggregator Transformation

2.       Passive – Doesn’t change the number of rows like Expression Transformation

Based on their participation in data flow it can be classified into two categories

1.       Connected – Used in the flow with other transformation/source/target

2.       Unconnected – Used inside another transformation similar lookup transformation

For example to merge the data horizontally (joining two tables like in SQL) we can use the joiner transformation and to merge data vertically (like a union all in SQL) we can use union transformation.

Below is the list of transformations that Informatica 8.x version supports.

Transformation Type Description
Aggregator Active/ 

Connected

Used to perform aggregate calculations similar to Group by in SQL(SUM,AVG etc)
Source Qualifier Active/ 

Connected

Provides the number of rows read from the source
Expression Passive/ 

Connected

Used to perform calculations on each row
External Procedure Passive/ 

Connected or Unconnected

Calls a procedure in a shared library
Filter Active/ 

Connected

Filters data.
HTTP Passive/ 

Connected

Connects to an HTTP server to read or update data.
Java Active or Passive/ 

Connected

Executes user logic coded in Java. The byte code for the user logic is stored in the repository.
Joiner Active/ 

Connected

Joins data from different databases or flat file systems similar to joins in SQL
Lookup Passive/ 

Connected or Unconnected

Can perform lookup operations
Normalizer Active/ 

Connected

Source qualifier for COBOL sources. Can also use in the pipeline to normalize (transforming rows from horizontal to vertical) data from relational or flat file sources.
Rank Active/ 

Connected

Limits records to a top or bottom range.
Router Active/ 

Connected

Routes data into multiple transformations based on group conditions.
Sequence Generator Passive/ 

Connected

Generates primary keys.
Sorter Active/ 

Connected

Sorts data based on a sort key.
SQL Active or Passive/ 

Connected

Executes SQL queries against a database.
Stored Procedure Passive/ 

Connected or Unconnected

Calls a stored procedure.
Transaction Control Active/ 

Connected

Defines commit and rollback transactions.
Union Active/ 

Connected

Merges data from different databases or flat file systems.
Unstructured Data Active or Passive/ 

Connected

Transforms data in unstructured and semi-structured formats.
Update Strategy Active/ 

Connected

Determines whether to insert, delete, update, or reject rows.

In upcoming posts let us have a detailed discussion on all these transformations.

BFN

Create a free website or blog at WordPress.com.