Master Data Theory & Reality Swagatika Sarangi (Jazz), MDM Expert
Swagatika Sarangi, MDM Lead and MDM/MDS/DW Consultant. Tweetable: “Master Data is the ‘Reticular Activating System’ of EDW!” - Jazz /sarangiswagatika @gatorjazz12 swagatika.sarangi
Why are we here? Simon Sinek, author of “Start with Why” startwithwhy.com To begin, let’s start with a very important question: Why? Why are we here? …a question asked by one of my favorite authors…
Why MDM? The purpose of the data warehouse is to provide a single version of the truth for business reporting. Ralph Kimball, founder of the Kimball Group and author of “The Data Warehouse Toolkit”, kimballgroup.com. Why do we need Master Data Management? One of the most noted thought leaders of modern data warehouse principles is Ralph Kimball. In his books, papers, and lectures, he says that the purpose of the data warehouse is to provide a single version of the truth for business reporting. How can we do this when data is constantly changing?
Real-time Use Cases
Agenda: Data Types, SCD Types; Master Data Management; Master Data Services; MDS Architecture; MDS Data Flow; Entity Based Staging; Domain Based Attributes; Hierarchy Types; MDS Web UI; Security; Q & A
Data Types, SCD Types: Data types are broadly categorized as Master, Transactional, Reference, and Analytical. There are 3 types of slowly changing dimension (SCD): SCD Type 1 (overwrites the existing value; no row or column is added, so no history is kept). SCD Type 2 (adds a new row to the existing table for each change; rows are timestamped via LUD). SCD Type 3 (adds 3 new columns to track changes to 1 existing column; includes a timestamp via LUD).
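The three SCD behaviours can be compared side by side in a small sketch. This is only an illustration on in-memory rows, not how MDS or any warehouse implements them: the column names `valid_from`, `valid_to`, `is_current`, `previous_<column>`, and `lud` are assumptions made for the example, and the Type 3 variant shown keeps just the previous value plus a timestamp.

```python
from copy import deepcopy
from datetime import date


def scd_type1(rows, key, column, new_value):
    """Type 1: overwrite the value in place; no history is kept."""
    rows = deepcopy(rows)
    for row in rows:
        if row["key"] == key:
            row[column] = new_value
    return rows


def scd_type2(rows, key, column, new_value, changed_on):
    """Type 2: expire the current row and append a new row for the change."""
    rows = deepcopy(rows)
    current = next(r for r in rows if r["key"] == key and r["is_current"])
    new_row = dict(current)  # copy before expiring the old version
    current["is_current"] = False
    current["valid_to"] = changed_on
    new_row.update({column: new_value, "valid_from": changed_on,
                    "valid_to": None, "is_current": True})
    rows.append(new_row)
    return rows


def scd_type3(rows, key, column, new_value, changed_on):
    """Type 3: keep the prior value in an extra column on the same row."""
    rows = deepcopy(rows)
    for row in rows:
        if row["key"] == key:
            row["previous_" + column] = row[column]
            row[column] = new_value
            row["lud"] = changed_on  # assumed timestamp column
    return rows


if __name__ == "__main__":
    customers = [{"key": 1, "city": "Tampa", "valid_from": date(2015, 1, 1),
                  "valid_to": None, "is_current": True}]
    for fn in (scd_type2, scd_type3):
        print(fn.__name__, fn(customers, 1, "city", "Miami", date(2017, 6, 1)))
```

Type 1 leaves one row with the new value, Type 2 leaves two rows (one expired, one current), and Type 3 leaves one row that carries both the new and the previous value.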
Master Data Management: Master Data Management provides a single hub repository for multiple source systems. It finds and stores the “Golden Record” via Match (Exact, Fuzzy), Merge, and Consolidation processes, and pushes data to the Data Warehouse.
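To make the match and merge steps concrete, here is a minimal sketch using only the Python standard library. It is not the matching engine of any MDM product: the similarity threshold of 0.85, the "first non-empty value wins" survivorship rule, and the field names are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def is_match(a, b, threshold=0.85):
    """Exact match on the normalized name, otherwise fuzzy match by ratio."""
    x, y = a.strip().lower(), b.strip().lower()
    if x == y:
        return True
    return SequenceMatcher(None, x, y).ratio() >= threshold


def merge(records):
    """Consolidate matched records: the first non-empty value per field wins."""
    golden = {}
    for rec in records:
        for field, value in rec.items():
            if value and not golden.get(field):
                golden[field] = value
    return golden


def golden_records(records, name_field="name"):
    """Group records whose names match, then merge each group into one record."""
    groups = []
    for rec in records:
        for group in groups:
            if is_match(group[0][name_field], rec[name_field]):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return [merge(group) for group in groups]


if __name__ == "__main__":
    sources = [
        {"name": "Acme Corporation", "phone": ""},
        {"name": "acme corporation ", "phone": "555-0100"},
        {"name": "Acme Corporaton", "phone": ""},
        {"name": "Globex", "phone": "555-0199"},
    ]
    for record in golden_records(sources):
        print(record)
```

The three Acme variants (one exact after normalization, one fuzzy) collapse into a single golden record that picks up the phone number from the second source; Globex stays separate.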
Master Data Services: Keeps a single version of the truth for dimension members. Gives business users two ways to interact with the data: the Excel add-in and the MDS Web UI. Exposes subscription views for consumption by downstream systems that use SQL Server as their in-house database. For ETL, SSIS packages cleanse, transfer, match, and merge data from the pre-staging area into the staging area, from which it is finally pushed to the UI.
MDS Architecture (diagram components): Web UI, Excel Add-in, Pre-Staging, IIS Service, Workflow/Notification, MDS Service, DWH, Subscription Views, DQS, Entity Based Staging Tables, Excel, External System, MDS Database
MDS Data Flow, Scenarios to Use: Data-in: SSIS; Excel sheet; SQL INSERT statements; manually at the UI. Data-out: Excel sheet (via exporting it from the MDS Web UI); MDS Web UI; Subscription views.
Entity Based Staging: Batch Processing via SSIS
Domain Based Attributes (DBAs): Creating DBAs and their interaction with Staging Tables: via the Excel add-in; via the Web UI.
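Because a domain-based attribute takes its values from the members of another entity, it behaves much like a foreign key. The sketch below shows that idea only: it checks staged members against the codes of the referenced entity before they would be loaded. The function name, the `Code` and `City` fields, and the sample codes are hypothetical and do not come from MDS itself.

```python
def validate_domain_attribute(staged_members, attribute, domain_codes):
    """Split staged members by whether their DBA value exists in the domain entity."""
    valid, rejected = [], []
    for member in staged_members:
        if member.get(attribute) in domain_codes:
            valid.append(member)
        else:
            rejected.append(member)
    return valid, rejected


if __name__ == "__main__":
    city_codes = {"NYC", "TPA"}  # codes of members in the referenced entity
    staged_customers = [
        {"Code": "C1", "Name": "Acme", "City": "TPA"},
        {"Code": "C2", "Name": "Globex", "City": "XXX"},  # unknown city code
    ]
    ok, bad = validate_domain_attribute(staged_customers, "City", city_codes)
    print("valid:", ok)
    print("rejected:", bad)
```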
MDS Terms
MDS TERMS      RELATIONAL DB EQUIVALENT TERMS
Model          Database
Entity         Table
Attribute      Column
Member         Record
Let’s get to the incremental SSIS package design/architecture and demo: (1) Create truncate-and-load pre-staging SSIS packages to bring data from the source into the pre-staging area. (2) Create an incremental load to bring data from the pre-staging area into the MDS staging area. (3) Run the stored procedure to push values from the MDS entity-based staging table to the MDS Web UI. (4) Create an SCD Type 2 subscription view in the MDS Web UI (further discussed in Slides 13 & 14).
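The first three steps above can be sketched outside SSIS to show the data movement. This is a simplified stand-in that uses an in-memory SQLite database rather than SQL Server: the table names `prestage_customer`, `stage_customer`, and `master_customer` are hypothetical, `process_batch` only imitates what the MDS staging stored procedure does, and change detection is reduced to comparing a single `name` column.

```python
import sqlite3


def make_db():
    """Create the three hypothetical tables used by the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE prestage_customer (code TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE stage_customer (code TEXT, name TEXT, batch_tag TEXT);
        CREATE TABLE master_customer (code TEXT PRIMARY KEY, name TEXT);
    """)
    return conn


def truncate_and_load(conn, source_rows):
    """Step 1: empty pre-staging, then reload it fully from the source."""
    conn.execute("DELETE FROM prestage_customer")
    conn.executemany(
        "INSERT INTO prestage_customer (code, name) VALUES (?, ?)", source_rows)


def incremental_load(conn, batch_tag):
    """Step 2: stage only rows that are new or changed versus the master data."""
    conn.execute(
        """
        INSERT INTO stage_customer (code, name, batch_tag)
        SELECT p.code, p.name, ?
        FROM prestage_customer AS p
        LEFT JOIN master_customer AS m ON m.code = p.code
        WHERE m.code IS NULL OR m.name <> p.name
        """,
        (batch_tag,))
    count = conn.execute(
        "SELECT COUNT(*) FROM stage_customer WHERE batch_tag = ?", (batch_tag,))
    return count.fetchone()[0]


def process_batch(conn, batch_tag):
    """Step 3 (stand-in for the stored procedure): apply the staged batch."""
    conn.execute(
        "INSERT OR REPLACE INTO master_customer (code, name) "
        "SELECT code, name FROM stage_customer WHERE batch_tag = ?",
        (batch_tag,))


if __name__ == "__main__":
    db = make_db()
    truncate_and_load(db, [("C1", "Acme"), ("C2", "Globex")])
    print("staged:", incremental_load(db, "batch-1"))
    process_batch(db, "batch-1")
    truncate_and_load(db, [("C1", "Acme"), ("C2", "Globex Inc"), ("C3", "Initech")])
    print("staged:", incremental_load(db, "batch-2"))
    process_batch(db, "batch-2")
    print(db.execute("SELECT * FROM master_customer ORDER BY code").fetchall())
```

On the second run only the changed and the new customer are staged; the unchanged one is skipped, which is the point of making the load incremental.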
Import Types: Credits: https://technet.microsoft.com/en-us/library/ee633854(v=sql.110).aspx
MDS DATA FLOW (diagram components): FLAT FILES, DQS/PRESTAGING TABLES, MDS WEB UI, STAGE/LEAF TABLES, MDS SYSTEM TABLES, SUBSCRIPTION VIEWS
Hierarchy Types: User-Defined Hierarchy; Derived Hierarchy; Parent-Child Hierarchy; Explicit Hierarchy
MDS Web UI
Explorer Capabilities: Enables business users to visualize the members of an entity. Explorer allows members to be added and deleted. Users can view the history of a member and can access change sets.
System Administration Capabilities: This section facilitates the creation of Models, Entities, Attributes, Hierarchies, and Business Rules. In this section a developer can define a Domain Based Attribute.
Integration Management Capabilities: Developers can run batch jobs. Developers can create Subscription Views.
Version Management Capabilities: This section lets administrators create and manage admin groups and users, and assign security roles to them.
User & Group Permissions Capabilities: Allows a user to create Active Directory groups and add users to them. This section helps enable security access to specific models.
Security
Summary: Master Data Management holds the single version of the truth. Master Data Services is Microsoft’s MDM solution, compatible with SQL Server. MDS Architecture is a WCF-based, fault-tolerant architecture. MDS Data Flow provides multiple ways of moving data in and out among cross-functional systems. Entity Based Staging allows a leaf-based, two-way staging process. Domain Based Attributes are Master Data’s foreign key implementation across entities. Hierarchy Types: Derived Hierarchies deliver the most value. MDS Web UI is a user-friendly web UI built specially for data stewards.
Learn more from Swagatika Sarangi (Jazz): @gatorjazz12, Swagatika.innova@gmail.com