Reference Data Stage Pipeline (Non-Historized)

How to design a reference table?

Table of type SL - Volatile Zone As always, a table of type Volatile Zone is required to load the…

Table of type SL - Volatile Zone

As always, a table of type Volatile Zone is required to load the data 1:1 from the source into the Stage Layer. For the Reference Data Stage Pipeline there are no special requirements to consider. The following picture shows an example of a Volatile Zone table.

Reference Data Stage Source Table My Reference TableReference Data Stage Source Table My Reference Table

Table of type SL - Reference Zone

To populate all data, which are stored in the previous shown Volatile Zone table to a non-historized reference data table create a new table with stereotype SL - Reference Zone.

A table of type SL - Reference Zone stores (insert, update and delete) all current reference data which were available within the full load. There is no versioning of the data.

This table can contain as many or fewer columns compared to the Volatile Zone table. Depends on the design decision of the data modeler and the requirements for the reference table.

Two additional (metadata) columns are mandatory:

  • Audit Trail Id (Name and Code as configured)
  • Status CDC

The metadata column Status CDC was already introduced in section Change Data Capture Pipeline. It is reused here because it serves the exact same purpose. To document the change. This column shows whether the row in the reference table is the initial insert (CDCStatus = 'I') or has already been updated (CDCStatus = 'U').

The primary key correspond to Compare Key described in section Change Data Capture Pipeline.

Reasons to populate data into the reference table are and identified by

  • New data: Primary key does not exist in reference table. Data will be inserted into reference table.
  • Changed (context) data: Changed dates are identified within all other, non metadata- and non-pk-, columns. Data will be updated in reference table.
  • Deleted data: Primary key does not exist anymore in Volatile Zone table. Data will be deleted in reference table.

 

Hint: For historized reference data it is recommended to use Data Vault patterns in the Reference Zone of the Stage Layer.

The following picture shows two examples of a reference table. One with a composite primary key including the column Valid From to detect bitemporal reference data changes as well as reference table with a single column primary key.

Reference Data Stage Target TableReference Data Stage Target Table

Posted 1 year agoby Dirk.Lerner

How to design mappings for a reference table?

Mapping table of type SL - Volatile Zone to SL - Reference Zone The mapping of the columns from the…

Mapping table of type SL - Volatile Zone to SL - Reference Zone

The mapping of the columns from the table of type SL - Volatile Zone to type SL - Reference Zone is kept simple. Connect the columns to be loaded by drag & drop, give the mapping a suitable name and that's it.

The two additional metadata columns must not be mapped. The Load Process Step takes care of the loading of the two columns:

  • Audit Trail Id (Name and Code as configured)
  • Status CDC

The following two pictures show exemplary mappings for a primary key with one column as well as with several columns (composite primary key).

Reference Data Stage Mapping to Ref Single Column KeyReference Data Stage Mapping to Ref Single Column Key

Reference Data Stage Mapping to Ref Composite KeyReference Data Stage Mapping to Ref Composite Key

Posted 1 year agoby Dirk.Lerner