CDC Data Pipeline (General)

Which Stereotypes for CDC are available?

Extended table Type: Volatile Zone - 1:1 copy of source Column stereotype: CDC - Compare Key (Mandatory) This stereotype identifies…

Extended table Type: Volatile Zone - 1:1 copy of source

  • Column stereotype: CDC - Compare Key (Mandatory)

    This stereotype identifies one or many columns which drive the compare to identify changes and capture them.

  • Column stereotype: CDC - Ignore (Optional) 

    This stereotype identifies one or many columns which will be ignored in diff-columns which are used to detect changes. This means that the Load Process Step ignores changes in the data in these columns when comparing for differences.
    Nevertheless, the Load Process Step copies these data into the comparison tables (ODLB or HiZ) and CDC tables.
    If the data modeller subsequently removes or adds this stereotype to columns, note that the Hash-Diff CDC column in the comparison tables (ODLB or HiZ) must be recalculated.

Extended view Type: Volatile Zone (View) - Partial data flow of Volatile Zone

  • Column stereotype: CDC - Compare Key (Mandatory)

    This stereotype identifies one or many columns which drive the compare to identify changes and capture them.

  • Column stereotype: CDC - Ignore (Optional) 

    This stereotype identifies one or many columns which will be ignored in diff-columns which are used to detect changes. This means that the Load Process Step ignores changes in the data in these columns when comparing for differences.
    Nevertheless, the Load Process Step copies these data into the comparison tables (ODLB or HiZ) and CDC tables.
    If the data modeller subsequently removes or adds this stereotype to columns, note that the Hash-Diff CDC column in the comparison tables (ODLB or HiZ) must be recalculated.

Table type: One Data Load Before (ODLB) - Contains the previous full load to which the current full load should be compared.

  • Table stereotype: SL - CDC One Data Load Before

    This table store all data which were available one data load before the current load. The ODLB table is used to execute the change data capture.
    Hint: Alternatively an existing history table can be used.

  • Mandatory column: Hash-Diff CDC (Code: HashDiffCDC)

    This column stores a hash, generated with all non CDC Compare Key columns. It increases speed to identify changes.
    To ensure data quality it is recommended to use a domain for the Hash-Diff CDC column (see also the following topics).

Table type: Change Data Capture (CDC) - Stores all detected changes

  • Table stereotype: SL - CDC Complete

    This table stores all data identified by the CDC Data Pipeline: I,U,D

  • Table stereotype: SL - CDC Partially

    This table stores exclusively deleted data identified by the CDC Data Pipeline: D

  • Mandatory column: Status CDC (Code: StatusCDC)

    This column stores a flag, generated by CDC: I,U,D,X
    To ensure data quality it is recommended to use a domain for the Status CDC column (see also the following topics).

  • Column stereotype: MD - Audit Trail Id (Optional)

    This stereotype identifies one column which will store the Audit Trail Id provided by ExMeX Logging Framework.

Extended table Type: All Satellites (with CDC enabled)

  • Mandatory column: Status CDC (Code: StatusCDC)

    This column stores a flag, generated by CDC: I,U,D,X
    To ensure data quality it is recommended to use a domain for the Status CDC column (see also the following topics).

  • Column stereotype: CDC - Complete (Mandatory) or CDC - Partially (Mandatory)

    This stereotype identifies a column which drives the Satellite either to load in Full or Partially CDC Mode.
Posted 2 years agoby Dirk.Lerner

How to design a domain Status CDC?

To ensure data quality it is recommended to use a domain for the Status CDC column. Set up a domain…

To ensure data quality it is recommended to use a domain for the Status CDC column. Set up a domain with the following characteristics.

  • Data type: CHAR(1)
  • Mandatory: Yes
  • List of values, set complete = yes:
    I = Insert (new rows)
    U = Update (existing rows)
    D = Delete (existing rows)
    X = Undefined (only within partially CDC, which only detects Delete)

The domain ensures that only the values listed above are allowed and possible to insert. NULL is not possible.

Example within PowerDesigner

CDC Data Pipeline General Domain Status CDC Data typeCDC Data Pipeline General Domain Status CDC Data type
CDC Data Pipeline General Domain Status CDC List of valuesCDC Data Pipeline General Domain Status CDC List of values

 

Posted 1 year agoby Dirk.Lerner

How to design a domain Hash-Diff CDC?

To ensure data quality it is recommended to use a domain for the Hash-Diff CDC column in stage ODLB tables.…

To ensure data quality it is recommended to use a domain for the Hash-Diff CDC column in stage ODLB tables. Set up a domain with the following characteristics.

  • Data type: VARBINARY(20)
  • Mandatory: Yes

The domain ensures that always a Hash-Diff CDC value is provided. NULL is not possible.

Example within PowerDesigner

CDC Data Pipeline General Domain Hash Diff CDC Data typeCDC Data Pipeline General Domain Hash Diff CDC Data type

Posted 1 year agoby Dirk.Lerner