Copyright © 2013 Varigence, Inc. Auto-generate a Data Vault Series Peter Avenant and Michael Buller

Presentation transcript:

Auto-generate a Data Vault Series
Peter Avenant and Michael Buller
Part One: Converting AdventureWorksLT2012
Copyright 2014 Varigence, Inc.

Auto-generate a Data Vault Series
Converting AdventureWorksLT2012
- Generating a Data Vault using an Offline Schema and Metadata Model.
- Populating the Data Vault Staging environment using Biml.
- Populating the Historical Staging environment using Biml.
- Populating Hubs using Biml.
- Populating Satellites using Biml.
- Populating Links using Biml.
- Populating Reference Tables using Biml.
- Translate Raw DV into Business DV using Biml.
- Generate a Star Schema from DW using Biml.
- Generate OLAP Cube from Star Schema using Biml.
- Generate Tabular Cube from Star Schema using Biml.

Product Overview

High Level Overview Without Data Vault
[Diagram. Labels: Operational (ERP, Accounting, Sales, CRM); Extract; Stage; Transform; Load; Kimball Data Warehouse (BDW); Cubes, OLAP, Tabular; Documentation]

Why do we need Data Vault?
When data vault modeling is applied, the resulting data warehouse will:
- More readily absorb changes (improved agility)
- Respond well to new subject areas (incremental build)
- Innately manage historical time slices of data (historization)
- Provide full traceability back to source feeds (auditability)
- Grow and adapt with minimal impact, no silos (lower TCO)
- Integrate, align & reconcile data (enterprise integration)
- Track, manage and report on exceptions (provides feedback loop)
Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Locations ). New Hamilton. Kindle Edition.

Layer Analysis

Operational              | Data Warehouse            | Data Mart
Data Capture             | Data Integration          | Data Delivery
Departmental             | Enterprise Wide           | Demand Driven
Transactional Processing | Integration Historization | Online Analytical Processing
Business Function        | Core Business Concept     | Fact Based Analysis
Accuracy                 | Completeness              | Flexibility
Speed                    | Auditability              | Usability
System of Record         | All Data All Time         | Right Data Right Time
Business Operations      | Enterprise Knowledge      | Specific Analytics
Capture and Log          | Historize and Time Slice  | Prepare and Deliver
Running of Operations    | All Data Over Time        | Presentation and Analysis
Firm                     | Agility                   | Respond and Deliver

Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Location 699). New Hamilton. Kindle Edition.

Data Vault – The Hub
The Hub represents a Core Business Concept such as Customer, Vendor, Sale or Product. The Hub table is formed around the Business Key of this concept and is established the first time a new instance of that business key is introduced to the EDW. A Hub may require a multiple part key to assure an enterprise wide unique key; however, the cardinality of the Hub must be 1:1 with a single instance of the business concept. The Hub contains no descriptive information and contains no FKs. The Hub consists of the business key only, with a data warehouse sequence id, a load date/time stamp and a record source.
Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Locations ). New Hamilton. Kindle Edition.
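To make the Hub description concrete, here is a minimal in-memory sketch in Python. It is not the Biml or SQL the series generates; the names `HubRow`, `Hub` and `load` are hypothetical and only illustrate the rule that a business key is recorded once, the first time it is seen.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HubRow:
    sequence_id: int       # data warehouse sequence id
    business_key: tuple    # one or more parts, unique enterprise wide
    load_dts: datetime     # load date/time stamp
    record_source: str     # source in which the key was first seen


class Hub:
    """In-memory stand-in for a hub table (hypothetical, for illustration)."""

    def __init__(self):
        self._by_key = {}

    def load(self, business_key, record_source, load_dts=None):
        """Insert the key if it is new; otherwise return the existing row."""
        key = tuple(business_key)
        row = self._by_key.get(key)
        if row is None:
            row = HubRow(
                sequence_id=len(self._by_key) + 1,
                business_key=key,
                load_dts=load_dts or datetime.now(timezone.utc),
                record_source=record_source,
            )
            self._by_key[key] = row
        return row

    def rows(self):
        return list(self._by_key.values())
```

Loading the same business key from a second source returns the row created by the first load, so the hub keeps one row per business concept instance and no descriptive columns.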

Data Vault – The Link
The Link construct is used to represent all relationships in a data vault model. Each Link is based on a unique, specific, natural business relationship. In this way the Link is very much like a Hub. It captures only the existence of a relationship the same way that a Hub captures the existence of a business key. The Link contains no descriptive information and does not have its own Business Key. The Link consists of the sequence ids of the concepts it is relating, with a warehouse machine Link sequence id, a load date/time stamp and a record source. The Link captures the first time this relationship was seen in the data warehouse, so any subsequent references to the same keyed relationship will be ignored by the Link.
Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Locations ). New Hamilton. Kindle Edition.
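The Link behaviour described above can be sketched in a few lines of Python. This is an illustration under assumptions, not generated output: `load_link` and the dictionary field names are hypothetical, and a plain list stands in for the link table.

```python
from datetime import datetime, timezone


def load_link(link_rows, hub_sequence_ids, record_source, load_dts=None):
    """Append a link row only if this combination of hub sequence ids is new."""
    key = tuple(hub_sequence_ids)
    for row in link_rows:
        if row["hub_sequence_ids"] == key:
            return row  # relationship already captured; later sightings are ignored
    row = {
        "link_sequence_id": len(link_rows) + 1,   # warehouse-generated link id
        "hub_sequence_ids": key,                  # ids of the related hubs
        "load_dts": load_dts or datetime.now(timezone.utc),
        "record_source": record_source,
    }
    link_rows.append(row)
    return row
```

The row holds only the related sequence ids plus load metadata, which mirrors the statement that a Link carries no descriptive information and no business key of its own.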

Data Vault – The Satellite
The Satellite construct is perhaps the hardest working construct in data vault modeling. The Satellite tracks all context and all time-slice history in the data warehouse. The Satellite contains all descriptive information for both core business concepts and their relationships. The Satellite does not have its own Business Key but manages all information and history concerning a Hub or Link by inheriting the Sequence ID from that Hub or Link. The two-part primary key for each Satellite is the inherited Sequence ID plus the Date/Time Stamp. In this way the Satellite can track History in the same manner as a Type-2 Dimension (a Dimension designed to track history in the dimensional modeling approach). The Satellite is the only construct in data vault modeling that uses the Date/Time Stamp as part of the key. For this reason it is the only construct in data vault modeling capable of tracking history. The Satellite consists of the Sequence ID of the Hub or Link that it is describing, combined with a load date/time stamp to form a primary key, a record source and then a set of Context Attributes that depend on the Sequence ID.
Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Locations ). New Hamilton. Kindle Edition.
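A small Python sketch may help show how the two-part key gives the Satellite its history. The function names `load_satellite` and `as_of` are hypothetical. Two assumptions go beyond the slide text: a new time slice is written only when the attributes changed, and loads arrive in chronological order.

```python
from datetime import datetime


def as_of(sat, parent_id, when):
    """Return the attributes in effect for parent_id at time `when`, or None."""
    stamps = [dts for (pid, dts) in sat if pid == parent_id and dts <= when]
    if not stamps:
        return None
    return sat[(parent_id, max(stamps))]["attributes"]


def load_satellite(sat, parent_id, load_dts, record_source, attributes):
    """Add a time slice keyed by (parent sequence id, load date/time stamp).

    Assumption: unchanged attributes do not produce a new slice.
    Returns True if a row was written.
    """
    if as_of(sat, parent_id, datetime.max) == attributes:
        return False
    sat[(parent_id, load_dts)] = {
        "record_source": record_source,
        "attributes": dict(attributes),
    }
    return True
```

Here `sat` is a plain dictionary whose keys are `(parent sequence id, load date/time stamp)` pairs, so every change to a Hub's or Link's context becomes a new row and earlier rows stay queryable through `as_of`.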

High Level Overview With Data Vault
High Level Overview Without Data Vault
[Diagram. Labels: Operational (ERP, Accounting, Sales, CRM); Extract; Stage; Transform; Load; EDW; BDW; Kimball Data Warehouse; Data Warehouse; Data Mart (ERP, Accounting, Sales, CRM); Cubes, OLAP, Tabular; Documentation]

High Level Overview
[Diagram. Labels: Stage; Enterprise Data Warehouse; Data Marts; Raw; BDW]
Hultgren, Hans ( ). Modeling the Agile Data Warehouse with Data Vault (Kindle Locations ). New Hamilton. Kindle Edition.

What will we cover
- Step 1, Analyze
- Step 2, Review
- Step 3, Generate Model
- Step 4, Quick Preview of Package Generation

AdventureWorksLT Source

Step 1, Analyze
- Mark potential satellite tables: a table is a candidate to become a satellite if no foreign key references it, it has exactly one foreign key, every column of that foreign key is also a primary key column, and the primary key contains no other columns.
- Mark peg leg links: a table is a candidate to become a peg leg link if no foreign key references it, it has exactly one foreign key, every column of that foreign key is also a primary key column, but the primary key is wider than the foreign key.
- Mark links: a table is a candidate to become a link if no foreign key references it, it has more than one foreign key, and every foreign key column is also a primary key column.
- Mark hubs: every table that does not fit any of the categories above becomes a hub.
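The four rules of Step 1 can be sketched as a single Python function. This is one reading of the slide, not the analyzer's actual implementation; `Table`, `classify` and the sample tables are hypothetical and are not the analyzer's output for AdventureWorksLT.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    primary_key: frozenset     # primary key column names
    foreign_keys: tuple = ()   # pairs of (frozenset of FK columns, referenced table)


def classify(tables):
    """Return {table name: 'hub' | 'link' | 'peg leg link' | 'satellite'}."""
    # Tables that are the target of at least one foreign key.
    referenced = {target for t in tables for _, target in t.foreign_keys}
    result = {}
    for t in tables:
        fk_cols = [cols for cols, _ in t.foreign_keys]
        # True when the table has foreign keys and all their columns sit in the primary key.
        all_in_pk = bool(fk_cols) and all(cols <= t.primary_key for cols in fk_cols)
        if t.name in referenced or not all_in_pk:
            kind = "hub"                  # fits none of the other categories
        elif len(fk_cols) > 1:
            kind = "link"
        elif fk_cols[0] == t.primary_key:
            kind = "satellite"            # primary key has no extra columns
        else:
            kind = "peg leg link"         # primary key is wider than the foreign key
        result[t.name] = kind
    return result


# Hypothetical sample schema, for illustration only.
SAMPLE_TABLES = [
    Table("Customer", frozenset({"CustomerID"})),
    Table("Address", frozenset({"AddressID"})),
    Table("CustomerAddress", frozenset({"CustomerID", "AddressID"}),
          ((frozenset({"CustomerID"}), "Customer"),
           (frozenset({"AddressID"}), "Address"))),
    Table("CustomerProfile", frozenset({"CustomerID"}),
          ((frozenset({"CustomerID"}), "Customer"),)),
    Table("CustomerPhone", frozenset({"CustomerID", "PhoneType"}),
          ((frozenset({"CustomerID"}), "Customer"),)),
    Table("Order", frozenset({"OrderID"}),
          ((frozenset({"CustomerID"}), "Customer"),)),
]
```

Calling `classify(SAMPLE_TABLES)` marks `Customer`, `Address` and `Order` as hubs, `CustomerAddress` as a link, `CustomerProfile` as a satellite and `CustomerPhone` as a peg leg link. `Order` falls through to hub because its foreign key column is not part of its primary key.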

AdventureWorksLT Post Analyze

Step 1, Analyze: Demonstration

Step 2, Review
The analyst might not agree with every decision of the analyzer and can overrule its output, but not every change is possible. The possible changes are:
- Satellite -> Hub: every table marked as a satellite can equally well be a hub.
- Peg leg link -> Multi-active satellite: every table marked as a peg leg link can be a multi-active satellite.
- Peg leg link -> Hub: every table marked as a peg leg link can be a hub.
- Link -> Hub: every table marked as a link can also be a hub.
In our sample model we do not apply any changes to the result of the analyzer.
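The list of permitted changes can be encoded as a lookup table, as in the following Python sketch. The names `ALLOWED_OVERRIDES` and `apply_override` are hypothetical. Treating a hub as having no permitted change is an assumption drawn from the fact that the slide lists none.

```python
# Permitted reclassifications, taken from the list of possible changes.
ALLOWED_OVERRIDES = {
    "satellite": {"hub"},
    "peg leg link": {"multi-active satellite", "hub"},
    "link": {"hub"},
    "hub": set(),  # assumption: the slide lists no change away from hub
}


def apply_override(classification, table, new_kind):
    """Return a copy of the classification with one table reclassified.

    Raises ValueError when the change is not in the permitted list.
    """
    current = classification[table]
    if new_kind != current and new_kind not in ALLOWED_OVERRIDES.get(current, set()):
        raise ValueError(f"{table}: cannot change {current!r} to {new_kind!r}")
    updated = dict(classification)
    updated[table] = new_kind
    return updated
```

Returning a copy leaves the analyzer's original output untouched, so the reviewed and unreviewed classifications can be compared.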

Step 3, Generate
- Create a hub and a satellite for each table marked as hub.
- Create links from the relationships of tables marked as hubs.
- Create links from tables marked as links.
- Create satellites based on tables marked as satellites.
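The generation rules can be sketched in Python as follows. This only produces object names, not Biml or DDL. The `H_`, `L_` and `S_` prefixes, the function name `generate` and the input shapes are hypothetical, and peg leg links are skipped because the slide does not say how they are generated.

```python
def generate(classification, foreign_keys):
    """Derive data vault object names from a reviewed classification.

    classification: {table: kind}
    foreign_keys:   {table: [referenced table, ...]}
    """
    hubs, links, satellites = [], [], []
    for table, kind in classification.items():
        targets = foreign_keys.get(table, [])
        if kind == "hub":
            # A hub and a satellite for each table marked as hub.
            hubs.append(f"H_{table}")
            satellites.append((f"S_{table}", f"H_{table}"))
            # A link for each relationship of a table marked as hub.
            for target in targets:
                links.append((f"L_{table}_{target}", [f"H_{table}", f"H_{target}"]))
        elif kind == "link":
            # A link connecting the hubs of all referenced tables.
            links.append((f"L_{table}", [f"H_{t}" for t in targets]))
        elif kind == "satellite":
            # A satellite attached to the hub of the single referenced table.
            satellites.append((f"S_{table}", f"H_{targets[0]}"))
    return {"hubs": hubs, "links": links, "satellites": satellites}
```

The sketch relies on a property of the Step 1 rules: any table referenced by a foreign key has been marked as a hub, so every link and satellite it emits points at a hub that is also generated.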

Our Results

Step 3, Generate Model: Demonstration

Step 4, Quick Preview of Package Generation: Demonstration

Biml Resources
- Twitter
- LinkedIn
- Biml User Group
- Varigence Mist
- BimlScript
- CodePlex
- Biml Documentation

Data Vault Resources
- Dan Linstedt: Book, "Super Charge Your Data Warehouse"
- Hans Hultgren: Book, "Modeling The Agile Data Warehouse with Data Vault"

Upcoming Events

Thank You