Data Warehouse

  • In April 2013 I was once again at the Matter programme, Data Vault Architecture, in the Netherlands, where I had the pleasure of meeting Tom Breur.

    During a lively discussion about automating data warehousing with Data Vault, and about how well project methods suit it, Tom invited Oliver Cramer and me to visit one of his customers: BinckBank.

    Tom Breur: “The best Agile BI shops I have ever seen.”

    On 24 September 2013 the day had finally come. Together with Tom we visited BinckBank in Amsterdam and took a look at the agile data warehouse, which was built with Data Vault. We met the BICC team to talk about how it came into being, the implementation, the challenges and the success factors.

  • All articles I wrote about data warehousing, Data Vault, data modeling and more.

    Enjoy reading; your comments are welcome.

  • Time and again the question comes up in projects, or rather the discussion, whether physical constraints in the database make sense or not. Usually there are directives from DBAs or assertive ETL developers, who seem to have a general aversion to constraints, stating that constraints are not wanted. OK, this week the opposite was proven to me once again. But as the saying goes: exceptions...

    At the #WWDVC and in the Advanced Data Vault 2.0 Boot Camp we also talked about this phenomenon. It seems to exist worldwide. Shortly after the #WWDVC, Kent Graziano also wrote a blog post about it. There were a number of comments on it on LinkedIn.

    So, how do you argue the case best, or rather: what actually are the advantages and disadvantages of using constraints?

  • Objective review and data quality goals of data models

    Have you ever asked yourself what score your data model would achieve? Could you imagine 90%, 95% or even 100% across 10 categories of objective criteria?


    Either way, whether you answered “no” or “yes”, I recommend using something to test the quality of your data model(s). For years there have been methods to test and ensure quality in software development, like ISTQB, IEEE, RUP, ITIL, COBIT and many more. In data warehouse projects I have seen test methods that test everything: loading processes (ETL), data quality, organizational processes, security, …
    But data models? Never! But why?

  • DMZone2015Flyer

    Do you want to learn something about data modelling with Steve Hoberman? Do you want to explore new methods like Data Vault 2.0, Anchor Modeling, Data Design, DMBOK and many more? For example, a keynote where Dan Linstedt, Lars Rönnbäck and Hans Hultgren talk together, and another one with Bill Inmon?

  • As already described in my blog post Modellierung oder Business Rule, data modeling requires giving some thought to business objects, the value chain, business details and the modeling methodology.

    Or does it? Can I simply get started with Data Vault? After all, Data Vault looks quite simple at first glance: three objects (HUBs, LINKs and SATs, i.e. satellites), a simple process model and just a few rules. Do I still need data modeling for Data Vault?

  • Read the full article I wrote in BI-Spektrum 05/2014.

    So long

    Data Vault im Einsatz beim Gutenberg Rechenzentrum

  • This year’s European Data Modeling Zone (DMZ) will take place in the wonderful German capital, Berlin, and I’m very happy to be a speaker again at this great event! This year I’ll speak about how to start with a conceptual model, use a logical model and finally model the physical Data Vault. During this session we will do some exercises (no, no push-ups!!) to get our brains up and running for modeling.

  • Months ago I talked to Stephan Volkmann, the student I mentor, about options for writing a seminar paper. One suggestion was to write about Information Modeling, namely FCO-IM, ORM2 and NIAM, siblings of the Fact-Oriented Modeling (FOM) family. In my opinion, FOM is the most powerful technique for building conceptual information models, as I wrote in a previous blogpost.

  • FCO-IM - Data Modeling by Example

    Do you want to attend a presentation about Fully Communication Oriented Information Modeling (FCO-IM) in Frankfurt?
    I’m very proud that we, the board of the TDWI Roundtable FFM, could win Marco Wobben to speak about FCO-IM. In my opinion, it’s one of the most powerful techniques for building conceptual information models. And the best part is that such models can be automatically transformed into ERM, UML, relational or dimensional models and much more. So we can gain more wisdom in data modeling overall.

    But what is information modeling? Information modeling is making a model of the language used to communicate about some specific domain of business in a more or less fixed way. This involves not only the words used but also typical phrases and patterns that combine these words into meaningful standard statements about the domain [3].

  • In July 2016 Mathias Brink and I gave a webinar on how to implement Data Vault on an EXASOL database. Read more about it in my previous blogpost or watch the recording on YouTube.

    Afterwards I received a lot of questions about our webinar. I’ll now answer all the questions I have received so far. If you have further questions, feel free to ask via my contact page, via Twitter, or write a comment right here.

  • As already mentioned in my previous blogpost, I will give a talk on the first day of the Data Modeling Zone 2017 about temporal data in the data warehouse.

    Another interesting talk will take place on the third day of the DMZ 2017: Martijn Evers will give a full day session about Full Scale Data Architects.

    Ahead of this session there will be a kickoff event sponsored by I-Refact, data42morrow and TEDAMOH: at 6 pm on Tuesday, 24 October, after the second day of the Data Modeling Zone 2017, everyone interested can meet up and join the launch of the German chapter of Full Scale Data Architects.

  • Several times I have needed large data sets: for Data Vault tests at a customer site, for writing a blogpost, for a demo or a webinar and more. And sometimes I need data for performance or data usage tests on different databases. Because of my work with EXASOL I focused on the TPC-H tool DBGen to generate gigabytes of data.

    To share my experience of generating large data sets with DBGen, I wrote this blogpost as a step-by-step guide.

  • You may have received an e-mail invitation from EXASOL or from ITGAIN inviting you to our forthcoming webinar, such as this:

    Do you have difficulty incorporating different data sources into your current database? Would you like an agile development environment? Or perhaps you are using Data Vault for data modeling and are facing performance issues?
    If so, then attend our free webinar entitled “Data Vault Modeling with EXASOL: High performance and agile data warehousing.” The 60-minute webinar takes place on July 15 from 10:00 to 11:00 am CEST.

  • Over the last few weeks, Mathias Brink and I have worked hard on the topic of Data Vault on EXASOL.

    Our (simple) question: How does EXASOL perform with Data Vault?

    First, we had to decide what kind of data to run performance tests against in order to get a feeling for the power of this combination. And we decided to use the well-known TPC-H benchmark created by the non-profit organisation TPC.

    Second, we built a (simple) Data Vault model and loaded 500 GB of data into the installed model. And to be honest, it was not the best model. On top of it we built a virtual TPC-H data model to execute the TPC-H SQL queries in order to analyse performance.

  • Some time ago a customer asked me how to load some (test) data quickly and easily into their database XYZ (pick the one of your choice and replace XYZ) to test their newly developed Data Vault logistic processes.
    The point was: they didn’t want to take on all the overhead of ETL tools and IT processes just to run some small tests in their own environment. Is this a good idea from a data governance perspective? Well, that’s not part of this blogpost. Just do this kind of thing only in your development environment.


