Learning Informatica PowerCenter 9.x
上QQ阅读APP看书,第一时间看更新

SCD2 (version number) – I need my ancestors!

To implement SCD2 using the wizard, perform the following steps:

  1. In the Designer, navigate to Tools | Mapping Designer | Mapping | Wizard | Slowly Changing Dimensions, as shown in the following screenshot:
    SCD2 (version number) – I need my ancestors!

    A new window will pop up, asking you the name of the new SCD mapping (m_SCD2_VERSION_NUMBER). Also, select the type of SCD you wish to implement. Select Type 2 Dimension - keep a full history of the changes in the target, as we are implementing SCD2 using a version number. Click on Next.

    SCD2 (version number) – I need my ancestors!
  2. The next screen will ask you to select the source. Select a source from the drop-down list. We are using EMP_FILE.txt as the source file for our reference. Also, specify the name of the target you wish to create. We will name the target EMPLOYEE_SCD2_VERSION_NUMBER in this book for our reference. Then, click on Next.
    SCD2 (version number) – I need my ancestors!
  3. In the next window, select EMPLOYEE_ID as Logical Key Field. Also, add LOCATION under Fields to compare the changes, and then click on Next.
    SCD2 (version number) – I need my ancestors!
  4. The next screen asks you to choose the option to maintain the history in the target. Select Keep the 'version' number in separate column and click on Finish.
    SCD2 (version number) – I need my ancestors!
  5. The wizard creates a complete mapping in your Mapping Designer Workspace. Make the necessary changes in the mapping, if required.
    SCD2 (version number) – I need my ancestors!
  6. Change the target data type from the flat file to Oracle table, as shown in the following screenshot:
SCD2 (version number) – I need my ancestors!

When we create an SCD2 mapping using a version number, the wizard creates two additional columns in the target table:

  • PM_PRIMARY_KEY: The wizard generates the primary key for each row to be inserted into target. Note that EMPLOYEE_ID will not be the primary key in the table.
  • PM_VERSION_NUMBER: The wizard generates a version number for each row inserted into table; this allows us to differentiate between current and historical records.

The Informatica PowerCenter SCD2 mapping uses a 0 lookup transformation to look up the data in the target table and uses the EXP_Detect_Changes expression transformation to compare the target data with the source data. Based on the comparison, the expression transformation marks a record as NewFlag or ChangedFlag using a flag. The mapping is divided into the following two flows:

  • The FIL_InsertNewRecord filter transformation allows only the NewFlag record to pass further and filters the ChangedFlag record from the first flow. It passes new records to UPD_ForceInserts, which inserts these records into the target. The sequence generator, which is SEQ_generateKeys, generates the primary key for each NewFlag record. The EXP_KeyProcessing_InsertNew expression transformation multiplies the primary key value by 1000 and loads 0 as the version number for each new row into the target, which is EMPLOYEE_SCD2_VERSION_NUMBER.
  • The FIL_InsertChangedRecord filter transformation allows only the ChangedFlag record to pass further and filters the records marked as NewFlag from second flow. It passes the changed records to UPD_ChangedUpdate, which replaces existing rows in the target to reflect the latest changes. The expression transformation, which is EXP_KeyProcessing_InsertChanged, increments both the primary key and version number by 1 and loads them into the target instance, which is EMPLOYEE_SCD2_VERSION_NUMBER1.

Let's work through each transformation that is used in the SCD2 mapping:

  • Source qualifier (SQ_EMP_FILE): This extracts the data from the file or table that you used as a source in the mapping. It passes data to the downstream transformations, that is, lookup, expression, and filter transformation.
  • Lookup (LKP_GetData): This is used to look up the target table. It caches the existing data from EMPLOYEE_SCD2_VERSION_NUMBER. The EMPLOYEE_ID=IN_EMPLOYEE_ID condition will compare the data with the source table and target table. It passes the data based on the comparison with the expression transformation.
  • Expression (EXP_DetectChanges): This receives the data from an upstream transformation and based on that, it creates two flags, which are NewFlag and ChangedFlag:
    • Condition for NewFlag: IIF(ISNULL(PM_PRIMARYKEY), TRUE, FALSE)
    • Condition for ChangedFlag: IIF(NOT ISNULL(PM_PRIMARYKEY) AND (DECODE(LOCATION,PM_PREV_LOCATION,1,0)=0), TRUE, FALSE)

    It passes the data to downstream filter transformations.

  • Filter (FIL_InsertNewRecord): This filters the records that come from an upstream expression transformation and are marked as ChangedFlag; it only allows records with NewFlag to get passes to the UPD_ForceInserts update strategy.
  • Filter (FIL_UpdateChangedRecord): This filters the records that come from an upstream expression transformation and are marked as NewFlag; it only allows records with ChangedFlag to get passed to the UPD_ChangedInserts update strategy.
  • Update strategy (UPD_ForceInserts): This uses the DD_INSERT condition to insert data into the target, which is EMPLOYEE_SCD2_VERSION_NUMBER.
  • Update strategy (UPD_ChangedInserts): This uses the DD_UPDATE condition to overwrite existing LOCATION value into the target instance, which is EMPLOYEE_SCD2_VERSION_NUMBER1.
  • Sequence generator (SEQ_GenerateKeys): This generates a sequence of values for each new row marked as NewFlag, which incrementally comes into the target by 1. It populates the value into the PM_PRIMARYKEY column in the EMPLOYEE_SCD2_VERSION_NUMBER target.
  • Expression (EXP_KeyProcessing_InsertNew): This multiplies NEXTVAL generated by the sequence generator by 1000 using the NEXTVAL*1000 condition. Note that you can change this number as per your requirement. Using 1000 here means that we can maintain a 1000 history of a particular record.
  • Expression (EXP_KeyProcessing_InsertChanged): This is used to increment the primary key by 1 and also increment the version number by 1 for every changed record.
  • Target (EMPLOYEE_SCD2_VERSION_NUMBER): This is the target table instance that accepts new records into the target table.
  • Target (EMPLOYEE_SCD2_VERSION_NUMBER1): This is the target table instance that accepts changed records into the target table.