CDC Data Pipeline (Batch)

How to set up a batch for a CDC Data Pipeline in Stage?

ExMeX Companion App Starting with version 0.2.0 of the ExMeX Companion App, the data modeler can create a new batch…

ExMeX Companion App

Starting with version 0.2.0 of the ExMeX Companion App, the data modeler can create a new batch for the Reference Data Stage Pipeline in the ExMeX Companion App.

ExMeX Framework Core Scripts

As always, the data modeler must add a new batch to the \etc\config\ConfigLoadBatchLogistic.sql file. The following SQL example shows an sample batch for the CDC Data Pipeline.

-- Example Batch for CDC Data Pipeline
MERGE INTO MetadataZone.LoadBatch AS target
	USING (	SELECT 'CDC Data Pipeline'
				  ,LoadBatchId
			FROM MetadataZone.LoadBatch
			WHERE BatchLabel = 'BigOne'
		   ) AS source (
				 BatchLabel
				,ParentLoadBatchId
				)
	ON (target.BatchLabel = source.BatchLabel)
WHEN MATCHED THEN
	UPDATE SET 
		target.ParentLoadBatchId = source.ParentLoadBatchId
WHEN NOT MATCHED THEN
	INSERT (
		 BatchLabel
		,ParentLoadBatchId
		)
	VALUES (
			 source.BatchLabel
			,source.ParentLoadBatchId
			)
;

Next, the data modeler must place the new batch in the dependencies to other batches (file \etc\config\ConfigLoadBatchLogistic.sql). In the following example SQLs the batch is arranged after the Stage Layer is loaded (Batch 'Load Stage with SSIS') and before the Batch 'Default - Hub'.

-- Stage to CDC Stage
MERGE INTO MetadataZone.LoadBatchDependency AS target
	USING (	SELECT prelb.LoadBatchId
				  ,suclb.LoadBatchId
			FROM MetadataZone.LoadBatch prelb
			CROSS JOIN MetadataZone.LoadBatch suclb
			WHERE prelb.BatchLabel = 'Load Stage with SSIS'
			AND   suclb.BatchLabel = 'CDC Data Pipeline'
		   ) AS source (
				 PredecessorLoadBatchId
				,SuccessorLoadBatchId)
	ON (target.PredecessorLoadBatchId = source.PredecessorLoadBatchId
	AND target.SuccessorLoadBatchId = source.SuccessorLoadBatchId)
WHEN NOT MATCHED THEN
	INSERT 
	VALUES ( source.PredecessorLoadBatchId
			,source.SuccessorLoadBatchId)
;

-- Stage to Core
MERGE INTO MetadataZone.LoadBatchDependency AS target
	USING (	SELECT prelb.LoadBatchId
				  ,suclb.LoadBatchId
			FROM MetadataZone.LoadBatch prelb
			CROSS JOIN MetadataZone.LoadBatch suclb
			WHERE prelb.BatchLabel = 'CDC Data Pipeline'
			AND   suclb.BatchLabel = 'Default - Hub'
		   ) AS source (
				 PredecessorLoadBatchId
				,SuccessorLoadBatchId)
	ON (target.PredecessorLoadBatchId = source.PredecessorLoadBatchId
	AND target.SuccessorLoadBatchId = source.SuccessorLoadBatchId)
WHEN NOT MATCHED THEN
	INSERT 
	VALUES ( source.PredecessorLoadBatchId
			,source.SuccessorLoadBatchId)
;

In the next step, the data modeler must add the Load Process Steps respectively the tables to the batch. The assignment is done in the file \etc\config\ConfigLoadBatchProcessStepLogistic.sql. The following two SQL statements show as an example the assignment of all tables of type SL - CDC One Data Load Before, SL - CDC Complete and SL - CDC Partially of the data model STAGE_LAYER to the new batch 'CDC Data Pipeline'.

-- Delete Load Process Steps from Batch
DELETE FROM MetadataZone.LoadBatchProcessStep
	WHERE LoadBatchId = (SELECT LoadBatchId
						 FROM MetadataZone.LoadBatch
						 WHERE BatchLabel = 'CDC Data Pipeline'
						)
;

-- DML to assign all CDC Load Process Steps to a batch
INSERT INTO MetadataZone.LoadBatchProcessStep (LoadBatchId, ProcessStepId)
		SELECT LoadBatchId, ProcessStepId
		FROM (SELECT LoadBatchId 
                FROM MetadataZone.LoadBatch 
                WHERE BatchLabel = 'CDC Data Pipeline') lb
	    CROSS JOIN (SELECT ProcessStepId
					FROM MetadataZone.EntityClass iec
					INNER JOIN MetadataZone.Mapping im
						ON  iec.EntityClassCode    = im.TargetEntityClassCode
						AND iec.ModelCode          = im.TargetModelCode
                    INNER JOIN MetadataZone.LoadProcessStep ilps
                        ON  ilps.MappingGUID       = im.MappingGUID
                    INNER JOIN (
                            SELECT ProcessStepTypeId
                            FROM MetadataZone.ProcessStepType
                            WHERE ProcessStepType = 'DEF'
                            ) pst
                        ON ilps.ProcessStepTypeId = pst.ProcessStepTypeId
                    -- Assign Data Models to batch
 					WHERE iec.ModelCode IN (
                                             'STAGE_LAYER'
                                            --,'StageLayerHiZ'
                                            )
                    -- Assign Stereotypes to batch
					  AND iec.Stereotype IN (
										    -- 'Volatile Zone'
                                            --,'History Zone'
                                            --'Reference Zone'
                                            ,'CDC Complete'
                                            ,'CDC One Data Load Before'
                                            ,'CDC Partially'
					  					    )
                    -- Assign tables (according to naming conventions) to batch
                    -- e.g. %edSat% for EdDating-Satellites
                      AND iec.EntityClassCode     LIKE '%%'
                    -- Include or exclude mappings according to naming conventions
                    -- NoRun% -> e.g. not used integrity runs
                    -- TestRun% -> mappings not ready to run on quality or production
                    --             but available in test-batches
                      AND im.MappingName NOT LIKE 'NoRun%'
                      AND im.MappingName NOT LIKE 'TestRun%'
                      AND im.MappingName NOT LIKE 'BusinessRun%'
					) lps
;
Posted 1 year agoby admin.tedamoh