Data Vault on EXASOL - Modeling and Implementation
If you use the EXASOL Community Edition, the scale factor of the TPC-H benchmark should not exceed 10, i.e. roughly 10 GB of raw data. Hint: In general, the data volume generated for the chosen scale factor should not exceed your database RAM.
The load statements for the Data Vault are deliberately simple and only cover the single load used in this example. Don’t just copy them for your own data warehouse!
Depending on the scale factor you chose when generating the TPC-H data, you will have to adjust the filter values in the statements so that the queries actually return rows.
All provided SQL files are designed for a single-node EXASOL database, e.g. the Community Edition. If you run them on an EXASOL cluster, please define proper distribution keys.
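As a minimal sketch of what defining distribution keys could look like on a cluster: the schema, table and column names below are hypothetical and are not taken from the provided SQL files. The idea is to distribute tables that are frequently joined with each other by their common join column, so that the join can be computed locally on each node.

```sql
-- Hypothetical Data Vault objects; replace them with the names from your own model.
-- Distribute a hub and its satellite by the shared hash key,
-- so rows that join with each other are stored on the same node.
ALTER TABLE dv.hub_order DISTRIBUTE BY order_hk;
ALTER TABLE dv.sat_order DISTRIBUTE BY order_hk;

-- A link table can only follow one of its join partners;
-- here the (assumed) most frequent join column is chosen.
ALTER TABLE dv.link_order_customer DISTRIBUTE BY order_hk;
```

Which column is the best choice depends on the joins your queries actually perform, so treat the selection above as an assumption to verify rather than a recommendation.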
All queries are executed twice. The first run makes the database create all required indexes, statistics etc., so the data required by a query will definitely reside in RAM for the second run. The second run is marked with the word "HOT" in the comment at the beginning of the statement.
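The pattern can be pictured roughly as follows. This is only an illustration: the query runs against the plain TPC-H table LINEITEM instead of a Data Vault structure, the exact wording of the comments is an assumption, and the filter value may need adjusting for your scale factor.

```sql
/* first run: indexes and statistics are created, data is loaded into RAM */
SELECT COUNT(*), SUM(l_extendedprice)
FROM lineitem
WHERE l_shipdate >= DATE '1995-01-01';

/* HOT: second run of the same statement, served from RAM */
SELECT COUNT(*), SUM(l_extendedprice)
FROM lineitem
WHERE l_shipdate >= DATE '1995-01-01';
```

Only the second, "HOT" execution is meant to be used for comparisons, since the first run includes one-off costs.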
The check queries that are provided are designed to compare the "HOT" executions and require auditing to be enabled in the EXASOL database. Auditing is enabled in EXAoperation (the administration frontend of EXASOL clusters). Simply shut down your database, click on the database where you want auditing to be enabled, click “Edit” and tick the box for auditing. Start the database afterwards. Further information on auditing can be found here.
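Once auditing is enabled, the "HOT" executions can be looked up in the audit table. The following is a minimal sketch and not one of the provided check queries; it assumes that your user is allowed to read the DBA statistics tables and that the word "HOT" appears only in the comments of the benchmark statements.

```sql
-- Make sure recently collected statistics are written to the system tables.
FLUSH STATISTICS;
COMMIT;

-- List the "HOT" runs with their duration in seconds.
SELECT start_time,
       duration,
       row_count,
       SUBSTR(sql_text, 1, 100) AS sql_text_start
FROM exa_statistics.exa_dba_audit_sql
WHERE command_name = 'SELECT'
  AND sql_text LIKE '%HOT%'
  -- exclude this lookup itself, which also contains the word HOT
  AND sql_text NOT LIKE '%exa_dba_audit_sql%'
ORDER BY start_time;
```

If the result is empty, check that auditing was enabled before the benchmark queries were run; statements executed earlier are not recorded.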