Friday: Data integration @ Roche with DV: MAY BE CHANGED

Data integration @ Roche using Data Hubs and Data Vault

NOTE: Due to the coronavirus pandemic, Roche has issued a travel ban. We are working with Pawel to see whether he can present remotely or whether the session can be pre-recorded for playback. We reserve the right to alter the schedule if speakers cannot attend.

I will present our two-year journey designing and implementing a Data Vault that integrates 200 entities (and growing) in a regulated pharmaceutical industry environment. We followed the Data Vault 2.0 insert-only architecture patterns, and our design was recently reviewed positively by Kent Graziano.

We combined MarkLogic Data Hubs with a Vault on Teradata and REST data services that expose the data. The implementation process is model-driven: the Vault and the ETL processes that load it are generated using erwin Data Intelligence Suite (formerly Analytix DS).

Testing is heavily automated with Python scripts that leverage erwin metadata. Last year we started migrating the whole solution to the cloud.

Topics Covered:

  • real practical experience implementing a Data Vault
  • lessons learned, DOs and DON'Ts, pitfalls to avoid
  • special considerations when implementing systems that might affect people’s lives and are subject to industry regulations

Full Conference
Location: Pinnacle Room
Date: May 28, 2021
Time: 8:00 am - 8:55 am
Speaker: Pawel Banasik