Housekeeping

In the topic Housekeeping all questions about housekeeping are covered. Housekeeping is introduced with version 2.10.0 of the TEDAMOH ExMeX Framework.

How to set the amount of data kept in history tables, aka housekeeping?

Purpose If the data pipeline into the data warehouse stores data in the History Zone as simple historicized tables, this…
Purpose

If the data pipeline into the data warehouse stores data in the History Zone as simple historicized tables, this data should only be stored for a limited period of time or number of loading runs.

Motivation

The motivation behind the so-called housekeeping is to keep the amount of data in the History Zone to a minimum. If it is necessary to store data for a longer period of time, then an Active Archive Zone is a good choice.

Applicability

The housekeeping is designed to keep only a certain amount of data (days, months or loading slices) in a table. In this way, housekeeping enables the amount of data to be kept as small as possible but as large as necessary, e.g. in the Hist-tables or a PSA.

Content

Housekeeping is a two-stage procedure.

First Stage - Default in Framework

A generally valid default for housekeeping is to be made in the global variable configuration settings of the ExMeX Framework Core. For more information on how to change global variables go to section Getting Started -> Configuration.

The possible values and global variables are listed in the following table. You can get the current settings of these variables with the shown DML.

SELECT *
FROM MetadataZone.ConfigExMeXFramework
WHERE VariableName IN (
     'HousekeepingHiZUnit'
    ,'HousekeepingHiZAmount'
    ,'HousekeepingHiZAttributeCode'
)
Variable Name Value
 HousekeepingHiZUnit

Possible valid units are:

'DAY'
'WEEK'
'MONTH'
'QUARTER'
'YEAR'
'ProcessStepRun'

 HousekeepingHiZAmount

Possible values are positive integer numbers:

= 0: keep history to infinity.
> 0: keep amount of history. Remove older data.

 HousekeepingHiZAttributeCode  On which global available column the housekeeping should be done.

Example

Configuration

HousekeepingHiZAmount: 3
HousekeepingHiZUnit: ProcessStepRun
HousekeepingHiZAttributeCode: AuditTrailId

This means that

Keep last 3 load slices (aka Load Process Steps) and use column AuditTrailId. Delete all older load slices:

AuditTrailId
17799
17720
17641
17562
17483
17404
17327

Second Stage - Ovveride defaults

It is possible to overwrite the default housekeeping settings for a table individually. The easiest way to do this is to use the TEDAMOH ExMeX Companion App in the filter settings.

ExMeX Companion App Filter configuration Window Housekeeping

Or with SQL by using the usp uspEntityClassAttributeFilter. For more details about the Stored Procedure uspEntityClassAttributeFilter go to section Framework Core In Depth -> Stored Procedures.

Example

Configuration

Amount: 5 (Override existing default value)
Unit: ProcessStepRun
Table: StgCleverHiZ
Column: AuditTrailId

SQL to call usp

-- Default (old) setting
SELECT TargetModelCode
      ,TargetDatabaseName
      ,TargetSchemaName
      ,TargetEntityClassCode
      ,TargetAttributeCode
      ,HousekeepingAmount
      ,HousekeepingUnit
FROM MetadataZone.MetadataForGenerateSource2StageLayer
WHERE TargetEntityClassCode = 'StgCleverHiZ'
;

-- Override settings
DECLARE @EntityClassFilterCondition VARCHAR(2048); EXEC MetadataZone.uspEntityClassAttributeFilter @ModelCode = 'StageLayerHiZ' ,@EntityClassCode = 'StgCleverHiZ' ,@AttributeCode = 'AuditTrailId' ,@EntityClassAttributeFilterCondition = 'LAST' ,@EntityClassAttributeFilterUnit = 'ProcessStepRun' ,@EntityClassAttributeFilterValue = '5' ,@IsHousekeeping = 1 ,@EntityClassFilterCondition = @EntityClassFilterCondition OUTPUT ;
SELECT @EntityClassFilterCondition;
-- New setting SELECT TargetModelCode ,TargetDatabaseName ,TargetSchemaName ,TargetEntityClassCode ,TargetAttributeCode ,HousekeepingAmount ,HousekeepingUnit FROM MetadataZone.MetadataForGenerateSource2StageLayer WHERE TargetEntityClassCode = 'StgCleverHiZ'

This means that

Keep last 5 load slices (aka Load Process Steps) in table StgCleverHiZ and use column AuditTrailId. Delete all older load slices:

AuditTrailId
17799
17720
17641
17562
17483
17404
17327

Considerations and consequences

It is possible to set housekeeping definitions on more than one column. This ist NOT recommended. But if used, than make sure the housekeeping process can handle more than one column for housekeeping.

How to use and implement the housekeeping definitions, see the next How-To.

Posted 2 years agoby Dirk.Lerner

How to use DAPI information for a housekeeping process?

See also previous How-To: How to set the amount of data kept in history tables, aka housekeeping? Purpose If the…

See also previous How-To: How to set the amount of data kept in history tables, aka housekeeping?

Purpose

If the data pipeline into the data warehouse stores data in the History Zone as simple historicized tables, this data should only be stored for a limited period of time or number of loading runs.

Motivation

The motivation behind the so-called housekeeping is to keep the amount of data in the History Zone to a minimum. If it is necessary to store data for a longer period of time, then an Active Archive Zone is a good choice.

Applicability

The housekeeping is designed to keep only a certain amount of data (days, months or loading slices) in a table. In this way, housekeeping enables the amount of data to be kept as small as possible but as large as necessary, e.g. in the Hist-tables or a PSA.


This question focus on how to use the information of housekeeping within a housekeeping process.

Content

To use information for housekeeping the DAPI MetadataForGenerateSource2StageLayer in Schema MetadataZone has to be used. All information about which and how many data to keep are stored in two additional columns of the DAPI:

  • HousekeepingAmount - How many data to keep
  • HousekeepingUnit - Unit of data to keep
HousekeepingUnit

Possible valid units are:

'DAY'
'WEEK'
'MONTH'
'QUARTER'
'YEAR'
'ProcessStepRun'

HousekeepingAmount

Possible values are positive integer numbers:

= 0: keep history to infinity.
> 0: keep amount of history. Remove older data.

The values are associated to the tablecolumn which drives the housekeeping.

Example (ProcessStepRun)

Doing housekeeping for

Table: StgCleverHiZ
Column: AuditTrailId
Unit: ProcessStepRun

SQL to lookup on DAPI

-- Default (old) setting
SELECT TargetModelCode
      ,TargetDatabaseName
      ,TargetSchemaName
      ,TargetEntityClassCode
      ,TargetAttributeCode
      ,HousekeepingAmount
      ,HousekeepingUnit
FROM MetadataZone.MetadataForGenerateSource2StageLayer
WHERE TargetEntityClassCode = 'StgCleverHiZ'
;

HowToUseHousekeepingInformation for Housekeeping - Here: Keep the last 3 ProcessStepRuns, aka load slices.

This means that

Keep last 3 load slices (aka Load Process Steps) in table StgCleverHiZ and use column AuditTrailId. Delete all older load slices:

AuditTrailId
17799
17720
17641
17562
17483
17404
17327

Example (DAY)

Doing housekeeping for

Table: StgCleverHiZ
Column: AuditTrailId (contains now a timestamp)
Unit: DAY

SQL to lookup on DAPI

-- Default (old) setting
SELECT TargetModelCode
      ,TargetDatabaseName
      ,TargetSchemaName
      ,TargetEntityClassCode
      ,TargetAttributeCode
      ,HousekeepingAmount
      ,HousekeepingUnit
FROM MetadataZone.MetadataForGenerateSource2StageLayer
WHERE TargetEntityClassCode = 'StgCleverHiZ'
;

HowToUseHousekeepingInformation for Housekeeping - Here: Keep the last 10 days.

This means that

Keep last 10 days in table StgCleverHiZ and use column AuditTrailId, which contains a timestamp, to identify data for housekeeping. Delete all older dates:
The following applies NOW() = 2020-07-30

AuditTrailId
2020-07-30
2020-07-29
2020-07-[...]
2020-07-22
2020-07-21
2020-07-20
2020-07-19
2020-07-18
2020-07-[...]

Considerations and consequences

It is possible to set housekeeping definitions on more than one column. This is NOT recommended. But if used, than make sure the housekeeping process can handle more than one column for housekeeping.

 

Posted 2 years agoby Dirk.Lerner