INTELLIGENT DATA SOLUTIONS OM
Data Modeling
SQL Saturday BI Edition, Atlanta
Talking Points for Delora J Bradish, Sr. Consultant
January 9, 2016
Agenda
- BI Fundamental Review
- EDW Modeling
- Additional Modeling Considerations
- Migrating to MS BI
BI Fundamentals
Purpose of BI
Purpose of BI = Reporting & Analytics
Reporting Components
Business Intelligence Components
6. Reporting: Initiated by your development team and grown through self-service BI.
5. SharePoint Services: One-time configuration by your SharePoint administrator.
4. SharePoint, O-365 & SSAS Security: Coordinated with Active Directory, roles and groups.
3. SQL Server Analysis Services (SSAS): Deployed multidimensional cubes or tabular models.
2. SQL Server Data Warehouse: Star schema EDW (Enterprise Data Warehouse).
1. Information Data Store (IDS): Consolidated and cleansed 3NF data warehouse.
Take time to build a strong foundation!
BI Blueprint
- Data model considerations
- Reporting Sources
IDS vs EDW
Information Data Store            Enterprise Data Warehouse
OLTP                              OLAP
Production Reporting              Analytics
3NF or Snowflake                  2NF Star Schema
Optimized for Data Integration    Optimized for Data Delivery
MDM & DQS                         NO Data Cleansing!
Base Data                         Business Logic / Analytics
Bill Inmon                        Ralph Kimball
Reporting vs Analytics
Reporting                         Analytics
Production Reporting              Analysis of Business Processes
A table                           A subject area
Normalized                        Denormalized
Parent – Child                    Combined Parent with Child
PRODUCT                           (combined) DimProduct
PRODUCT_SUBCATEGORY
PRODUCT_CATEGORY
VISIT                             (combined) FactVisitLineDetail
VISIT_LINE                        DimVisitLineDetail
VISIT_LINE_DETAIL
Return on Investment
- Cube Design
- Scalability
- Optimized Query Performance
- Uncluttered GUI
Modeling
Talking Points
1. Dimensions vs Facts
2. Slowly Changing Dimensions
3. Deleted Rows
4. Denormalization
5. Degenerate Dimensions
6. Many-to-Many
7. Predictive Analytics
Dimensions vs Facts
Dimension                         Fact
A Set of Nouns                    A Set of Verbs
Strings                           Numeric
An Entity                         A Process
Attributes                        Measures
Group / Slice & Filter            Aggregate
Primary Keys & Business Keys*     Foreign Keys Only
Dimension types: Regular, Junk, Degenerate, Slowly Changing, Role Playing
Fact types: Transactional, Accumulating Snapshot, Periodic Snapshot
* Business keys are often stored on disk in the fact table, but exposed in the cube as a dimension.
Slowly Changing Dimensions
- Attribute “at time of fact”
- Type 1 – no history
- Type 2 – multiple rows
- Type 3 – multiple columns
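The three types listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout, the column names (customer_key, customer_id, valid_from, valid_to, is_current) and the function names are hypothetical and do not come from the slides.

```python
from datetime import date


def scd_type1(row, attr, new_value):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    updated = dict(row)
    updated[attr] = new_value
    return updated


def scd_type2(rows, business_key, attr, new_value, change_date):
    """Type 2: expire the current row and add a new row with a new surrogate key."""
    next_key = max(r["customer_key"] for r in rows) + 1
    result = []
    for r in rows:
        if r["customer_id"] == business_key and r["is_current"]:
            # Close the existing version instead of overwriting it.
            result.append(dict(r, valid_to=change_date, is_current=False))
            new_row = dict(r, customer_key=next_key, valid_from=change_date,
                           valid_to=None, is_current=True)
            new_row[attr] = new_value
            result.append(new_row)
        else:
            result.append(r)
    return result


def scd_type3(row, attr, new_value):
    """Type 3: keep the prior value in an extra 'previous_<attr>' column."""
    updated = dict(row)
    updated["previous_" + attr] = row[attr]
    updated[attr] = new_value
    return updated


# Hypothetical dimension row used for all three variants.
current = {"customer_key": 1, "customer_id": "C100", "city": "Atlanta",
           "valid_from": date(2015, 1, 1), "valid_to": None, "is_current": True}

type1 = scd_type1(current, "city", "Macon")
type2 = scd_type2([current], "C100", "city", "Macon", date(2016, 1, 9))
type3 = scd_type3(current, "city", "Macon")
```

With Type 2, fact rows loaded before the change keep pointing at surrogate key 1, which is one way to read the "at time of fact" bullet: the fact still sees the attribute value that was in effect when it occurred.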
Deleted Rows
- “Is Deleted” Flag
- Deleted Schema
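A minimal sketch of the two options, assuming that "Deleted Schema" means moving rows that have disappeared from the source into a separate holding area. That reading, the customer_id column and both function names are assumptions made for illustration, not something the slides spell out.

```python
def flag_deleted(warehouse_rows, source_keys):
    """Option 1: keep every row and set is_deleted when its key left the source."""
    return [dict(r, is_deleted=r["customer_id"] not in source_keys)
            for r in warehouse_rows]


def move_deleted(warehouse_rows, source_keys):
    """Option 2: split rows into active and deleted sets (stand-in for a deleted schema)."""
    active = [r for r in warehouse_rows if r["customer_id"] in source_keys]
    deleted = [r for r in warehouse_rows if r["customer_id"] not in source_keys]
    return active, deleted


# Hypothetical data: C200 no longer exists in the source system.
warehouse = [{"customer_id": "C100"}, {"customer_id": "C200"}]
source_keys = {"C100"}

flagged = flag_deleted(warehouse, source_keys)
active, deleted = move_deleted(warehouse, source_keys)
```

The flag keeps history queryable in one place at the cost of a filter on every query; the separate set keeps the main tables clean but needs its own reporting path.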
Denormalization
- Snowflake (Parent-child Related) tables
- Role Playing Dimensions
- Degenerate Dimensions
[Figure not recovered; panel labels read: YES, No, YES]
Denormalization Illustrated
Groups:
GR1    Group 1 Name
GR2    Group 2 Name
GR3    Group 3 Name
Categories:
GR1    CAT1    Cat 1 Name
GR1    CAT2    Cat 2 Name
GR1    CAT3    Cat 3 Name
GR2    CAT4    Cat 4 Name
Patients:
CAT1   Pat1    Pat 1 Name
CAT1   Pat2    Pat 2 Name
CAT2   Pat3    Pat 3 Name
CAT3   Pat4    Pat 4 Name
Facts:
Fact1  Pat1
Fact2  Pat1
Fact3  Pat2
Fact4  Pat3
Patient.DIM:
GR1    Group 1 Name    CAT1    Cat 1 Name    Pat1    Pat 1 Name
GR1    Group 1 Name    CAT1    Cat 1 Name    Pat2    Pat 2 Name
GR1    Group 1 Name    CAT2    Cat 2 Name    Pat3    Pat 3 Name
GR1    Group 1 Name    CAT3    Cat 3 Name    Pat4    Pat 4 Name
GR2    Group 2 Name    CAT4    Cat 4 Name    NULL    NULL
(Key labels on the slide: PK, FK, PK)
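The flattening shown on this slide can be reproduced with an in-memory SQLite database. This is a sketch rather than the presenter's method: the table and column names (grp, category, patient, grp_id and so on) are hypothetical, and the join order is an assumption chosen so that the result matches the Patient.DIM rows on the slide, including the CAT4 row with no patient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grp (grp_id TEXT PRIMARY KEY, grp_name TEXT);
CREATE TABLE category (cat_id TEXT PRIMARY KEY,
                       grp_id TEXT REFERENCES grp, cat_name TEXT);
CREATE TABLE patient (pat_id TEXT PRIMARY KEY,
                      cat_id TEXT REFERENCES category, pat_name TEXT);
INSERT INTO grp VALUES ('GR1','Group 1 Name'), ('GR2','Group 2 Name'),
                       ('GR3','Group 3 Name');
INSERT INTO category VALUES ('CAT1','GR1','Cat 1 Name'), ('CAT2','GR1','Cat 2 Name'),
                            ('CAT3','GR1','Cat 3 Name'), ('CAT4','GR2','Cat 4 Name');
INSERT INTO patient VALUES ('Pat1','CAT1','Pat 1 Name'), ('Pat2','CAT1','Pat 2 Name'),
                           ('Pat3','CAT2','Pat 3 Name'), ('Pat4','CAT3','Pat 4 Name');
""")

# Flatten the snowflake: every category picks up its group, and a LEFT JOIN
# keeps categories that have no patient yet (patient columns become NULL).
patient_dim = conn.execute("""
SELECT g.grp_id, g.grp_name, c.cat_id, c.cat_name, p.pat_id, p.pat_name
FROM category AS c
JOIN grp AS g ON g.grp_id = c.grp_id
LEFT JOIN patient AS p ON p.cat_id = c.cat_id
ORDER BY c.cat_id, p.pat_id
""").fetchall()

for row in patient_dim:
    print(row)
```

Because each patient has exactly one category and each category exactly one group, the flattened dimension still has one row per patient, so facts keyed on the patient join to it without duplication.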
Denormalization Illustrated
Categories:
CAT1   GRP1    Cat 1 Name
CAT1   GRP2    Cat 1 Name
CAT1   GRP3    Cat 1 Name
CAT2   GRP1    Cat 2 Name
CAT2   GRP3    Cat 2 Name
Patients:
Pat1   CAT1    Pat 1 Name
Pat1   CAT2    Pat 1 Name
Pat2   CAT1    Pat 2 Name
Pat2   CAT2    Pat 2 Name
Groups:
GR1    Group 1 Name
GR2    Group 2 Name
GR3    Group 3 Name
Patient.DIM:
GR1    Group 1 Name    CAT1    Cat 1 Name    Pat1    Pat 1 Name
GR2    Group 2 Name    CAT1    Cat 1 Name    Pat1    Pat 1 Name
GR3    Group 3 Name    CAT1    Cat 1 Name    Pat1    Pat 1 Name
GR1    Group 1 Name    CAT2    Cat 2 Name    Pat1    Pat 1 Name
GR3    Group 3 Name    CAT2    Cat 2 Name    Pat1    Pat 1 Name
Fact:
PK     FK      AMT
Fact1  Pat1    $10
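This second illustration appears to show what happens when the parent-child relationships are many-to-many: Pat1 ends up on five Patient.DIM rows, so a plain join repeats the single $10 fact five times. The sketch below reproduces that effect with SQLite; the table and column names are hypothetical, and the interpretation of the slide is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dim (grp_id TEXT, cat_id TEXT, pat_id TEXT);
INSERT INTO patient_dim VALUES
    ('GR1','CAT1','Pat1'), ('GR2','CAT1','Pat1'), ('GR3','CAT1','Pat1'),
    ('GR1','CAT2','Pat1'), ('GR3','CAT2','Pat1');
CREATE TABLE fact (fact_id TEXT PRIMARY KEY, pat_id TEXT, amt INTEGER);
INSERT INTO fact VALUES ('Fact1', 'Pat1', 10);
""")

# The amount actually recorded in the fact table.
true_total = conn.execute("SELECT SUM(amt) FROM fact").fetchone()[0]

# Joining to the flattened dimension fans the fact out to one copy per
# dimension row for the patient, inflating the total.
joined_total = conn.execute("""
SELECT SUM(f.amt)
FROM fact AS f
JOIN patient_dim AS d ON d.pat_id = f.pat_id
""").fetchone()[0]

print(true_total, joined_total)  # 10 50
```

The inflated total is the reason a many-to-many relationship is usually modeled with a bridge table rather than flattened into the dimension.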
Degenerate Dimensions
- 1-1 with a Fact
- Natural Keys
- ‘Fact’ Cube Relationship
Degenerate Dimensions Illustrated
Many-to-Many
- Factless Fact aka Bridge Table
- Contains FKs Only
- Requires “Intermediate Measure Group”
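A bridge table holding only foreign keys can be sketched with SQLite as follows. The table names (fact_visit, bridge_patient_category) and the sample data are hypothetical; in SSAS the bridge would sit in the intermediate measure group, whereas here plain SQL stands in for the cube.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_visit (visit_key INTEGER PRIMARY KEY,
                         patient_key INTEGER, amt INTEGER);
INSERT INTO fact_visit VALUES (1, 1, 10), (2, 2, 5);

-- Factless fact / bridge: foreign keys only, no measures.
CREATE TABLE bridge_patient_category (patient_key INTEGER, category_key INTEGER);
INSERT INTO bridge_patient_category VALUES (1, 1), (1, 2), (2, 1);
""")

# Slicing by category goes through the bridge, so a patient in two
# categories contributes to both slices.
by_category = conn.execute("""
SELECT b.category_key, SUM(f.amt)
FROM fact_visit AS f
JOIN bridge_patient_category AS b ON b.patient_key = f.patient_key
GROUP BY b.category_key
ORDER BY b.category_key
""").fetchall()

# The grand total is taken from the fact rows themselves, each counted once.
grand_total = conn.execute("""
SELECT SUM(amt)
FROM fact_visit
WHERE patient_key IN (SELECT patient_key FROM bridge_patient_category)
""").fetchone()[0]

print(by_category, grand_total)  # [(1, 15), (2, 10)] 15
```

The per-category amounts add up to 25 while the grand total stays at 15, which is the behavior expected of a many-to-many relationship: slices may overlap, but the total is not double counted.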
PA & Migration
Predictive Analytics in SSAS
- Consuming Unstructured Data
- Snapshot Facts
- Completely flat Excel-type Dataset
Additional Considerations
- Grain
- Agile Methodologies
- Indexing
- ABC (Audit, Balance & Control)
Migration
- Modeling for MS BI
- Tool Selection
- Migration or Replacement?
- User Expectations
“If you would hit the mark, you must aim a little above it; every arrow that flies feels the attraction of earth.”
~ Henry W. Longfellow
Supporting Material
Fundamentals of Cube Design
- Understand Your Data
- Use a star schema
- Design for SSAS
- Denormalize!
- Model many-to-many with a bridge
- Key using integer data types
- Remove NULL values
- Use role playing attributes (-1, etc.)
- Use role playing dimensions
- Create at least one hierarchy / dimension
- Push business logic back to the EDW, not the DSV or MDX
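Two of the bullets above, integer keys and removing NULL values, can be illustrated together. The sketch assumes that "(-1, etc.)" refers to a reserved placeholder member such as an "Unknown" row keyed -1; that interpretation, and every name below, is an assumption for illustration only.

```python
UNKNOWN_KEY = -1  # assumed placeholder member for missing or unmatched keys

# Hypothetical dimension keyed on integers, including the placeholder row.
dim_product = {UNKNOWN_KEY: "Unknown", 1: "Widget", 2: "Gadget"}

# Hypothetical lookup from source business key to integer surrogate key.
key_lookup = {"W-01": 1, "G-02": 2}


def resolve_key(business_key):
    """Return the integer surrogate key, never NULL.

    Missing or unmatched business keys fall back to the unknown member.
    """
    if business_key is None:
        return UNKNOWN_KEY
    return key_lookup.get(business_key, UNKNOWN_KEY)


# Staged fact rows: one good key, one NULL, one key absent from the dimension.
staged = [("W-01", 10.0), (None, 4.0), ("X-99", 2.5)]
fact_rows = [(resolve_key(bk), amt) for bk, amt in staged]

print(fact_rows)  # [(1, 10.0), (-1, 4.0), (-1, 2.5)]
```

Every fact row now joins to exactly one dimension member, so no measure value drops out of the cube because of a NULL foreign key.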
DIM Review Checklist – Required (Non-Negotiable)
- Dimensions are indicative of a complete subject area.
- The same PK that is defined in the DSV is used as the PK in the dimension.
- There are fewer than 25 or 30 dimensions in a cube.
- Dimensions are related to multiple measure groups.
- No dimension is a copy of another.
- Every dimension attribute name is unique across dimensions.
- Dimension attribute names are not measure group specific.
- Role playing dimensions have been used for multiple date or geography keys found in a single measure group. (There is only one DATE.DIM and one GEOGRAPHY.DIM.)
- Attribute relationships are showing no warnings.
- Parent-child hierarchies have been avoided.
- Each dimension passes a BIDS Helper Dimension Health Check.
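The role playing item in this checklist (a single date dimension serving several date keys) can be sketched with SQLite by joining the same table under two aliases. The table names, column names and sample rows are hypothetical; in SSAS the equivalent is adding the one date dimension to the cube once per role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One physical date dimension.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
INSERT INTO dim_date VALUES (20160104, '2016-01-04'), (20160109, '2016-01-09');

-- One fact table with two date foreign keys pointing at that dimension.
CREATE TABLE fact_order (order_key INTEGER PRIMARY KEY,
                         order_date_key INTEGER REFERENCES dim_date,
                         ship_date_key INTEGER REFERENCES dim_date,
                         amt REAL);
INSERT INTO fact_order VALUES (1, 20160104, 20160109, 25.0);
""")

# The same dimension plays two roles via aliases: order date and ship date.
row = conn.execute("""
SELECT od.calendar_date AS order_date, sd.calendar_date AS ship_date, f.amt
FROM fact_order AS f
JOIN dim_date AS od ON od.date_key = f.order_date_key
JOIN dim_date AS sd ON sd.date_key = f.ship_date_key
""").fetchone()

print(row)  # ('2016-01-04', '2016-01-09', 25.0)
```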
DIM Review Checklist – Good Idea
1. Every dimension has multiple attributes.
2. Every dimension has a hierarchy.
3. Degenerate dimensions are used in the cube and generally do not contain a hierarchy (sometimes you will find an 'Order Number' + 'Order Line' hierarchy).
4. Attributes used in a hierarchy are not exposed as independent dimension attributes.
5. All PKs and FKs are hidden in the dimension attribute properties, not in a perspective.
6. Naming conventions have been implemented for a cleaner user experience.
7. Integer keys (attribute property) are in use whenever possible, with the NameColumn pointing to the varchar() value.
Reporting Phases
Waterfall vs. Agile