Some time ago a customer asked me how to quickly and easily load some (test) data into their database XYZ (choose the one of your choice and replace XYZ) to test their newly developed Data Vault logistic processes.
The point was: they didn’t want all the ETL-tool and IT-process overhead just to run some small tests in their own environment. Is this a good idea from a data governance perspective? Well, that’s not part of this blog post. Just do this kind of thing only in your development environment.
All databases, as far as I know, come with their own loading utilities, e.g. FastLoad for Teradata or BULK INSERT for MS SQL Server. So before considering some weird loading practice like data to Excel to Access via ODBC to database XYZ (be sure, these techniques are out there), you can simply use the toolset provided by vendor XYZ.
Let me give you a small example using BULK INSERT with MS SQL Server.
Let’s say my (fictitious) customer FastChangeCo wants to load some data from its source system Fibonacci. The interface contains three attributes (ID, Fibonacci sequence [1] and Lucas sequence [2]), which should be loaded into a Hub and a Satellite.
First, create a simple table in your Staging Layer:
CREATE TABLE FibonacciSource (
     ID INTEGER
    ,FibonacciSequence BIGINT
    ,LucasSequence BIGINT
);
Then put the Fibonacci source file into a folder of your choice (btw. I provide all shown scripts via download); let’s assume c:\FibonacciSource.
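For reference, the source file could look like the snippet below: a header row (which is why the load script uses FIRSTROW = 2) followed by semicolon-delimited values. The numbers are simply the first few elements of the Fibonacci and Lucas sequences.
ID;FibonacciSequence;LucasSequence
1;1;1
2;1;3
3;2;4
4;3;7
5;5;11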
If you execute the following simple script, your data will be loaded. That’s it.
-- set date format
SET LANGUAGE British;
-- clear target table
DELETE FROM FibonacciSource;
BULK INSERT FibonacciSource
FROM 'c:\FibonacciSource\FibonacciSource.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ';', -- CSV field delimiter
ROWTERMINATOR = '\n', -- newline terminates each row
ERRORFILE = 'c:\FibonacciSource\FibonacciSourceError.csv',
TABLOCK
);
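-- check the loaded rows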
SELECT *
FROM FibonacciSource
;
Now FastChangeCo can easily test their Hub and Satellite data logistic processes.
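A first test of the Hub load could then look like the sketch below. Note that the hub table HubFibonacci, the MD5 hash key and the metadata columns are just my assumptions for illustration; adapt the pattern to your own Data Vault conventions.
-- Minimal sketch of a hub load from the staging table
-- (hub name, hash function and metadata columns are assumptions)
INSERT INTO HubFibonacci (FibonacciHashKey, LoadDate, RecordSource, ID)
SELECT
     HASHBYTES('MD5', CAST(src.ID AS VARCHAR(20))) -- hash of the business key
    ,GETDATE()                                     -- load date
    ,'Fibonacci'                                   -- record source
    ,src.ID                                        -- business key
FROM FibonacciSource AS src
WHERE NOT EXISTS ( -- insert only business keys not yet in the hub
    SELECT 1
    FROM HubFibonacci AS hub
    WHERE hub.FibonacciHashKey = HASHBYTES('MD5', CAST(src.ID AS VARCHAR(20)))
);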
So long,
Dirk