Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis Rafal Lukawiecki Strategic Consultant, Project Botticelli Ltd rafal@projectbotticelli.com.

Presentation transcript:


Objectives
- Explain the basics of:
  - Data Warehousing
  - ETL
  - OLAP/Multidimensional Data
- Relate the theory to SQL Server 2008 SSAS and SSIS
This seminar is based on a number of sources, including a few dozen Microsoft-owned presentations, used with permission. Thank you to Marin Bezic, Kathy Sabourin, Aydin Gencler, Bryan Bredehoeft, and Chris Dial for all the support. Thank you to Maciej Pilecki for assistance with demos.
The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation. Portions © 2009 Project Botticelli Ltd & entire material © 2009 Microsoft Corp. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.

1. Data Warehouse

Rich Connectivity
Data providers:
- ODBC
- SQL Server
- SAP NetWeaver BI
- SQL Server Report Server Models
- SQL Server Integration Services
- Teradata
- XML
- OLE DB
- DB2
- MySAP
- SQL Server Data Mining Models
- Oracle
- SQL Server Analysis Services
- Hyperion Essbase

Let’s Store the Intelligence: DW
- The SQL Server Analysis Services server is a logical endpoint for data being aggregated with SSIS
  - But do not store the actual data in it
- The data physically rests in another database, called a Data Warehouse
  - You can manipulate it directly, or build it in parallel with OLTP processing
- Modelling the data stored in the DW and analysed using SSAS is at the heart of good Data Warehouse design

Star Schema

Star Schema Benefits
- Transforms normalized data into a simpler model
- Delivers high-performance queries
- Delivers higher performing queries using Star Join Query Optimization
- Uses mature modeling techniques that are widely supported by many BI tools
- Requires low maintenance as the data warehouse design evolves
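
To make the "simpler model" concrete, here is a minimal sketch of a star schema and a star-join query. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table names, column names, and rows (dim_product, dim_date, fact_sales) are invented for illustration rather than taken from the presentation.

```python
import sqlite3

# In-memory database; all names and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table sits at the centre of the star and points at each dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Road Bike', 'Bikes'), (2, 'Helmet', 'Accessories');
INSERT INTO dim_date VALUES (20060925, 2006, 9), (20061001, 2006, 10);
INSERT INTO fact_sales VALUES (1, 20060925, 1000.0), (2, 20060925, 50.0), (1, 20061001, 1200.0);
""")

# A star join: one fact table joined directly to each dimension, then grouped.
star_join = """
SELECT p.category, d.year, SUM(f.sales_amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_key = f.product_key
JOIN dim_date AS d ON d.date_key = f.date_key
GROUP BY p.category, d.year
ORDER BY p.category
"""
sales_by_category = conn.execute(star_join).fetchall()
print(sales_by_category)
```

Every dimension is one join away from the fact table, which is what keeps such queries short and predictable.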

Snowflake Dimension Tables
- Define hierarchies using multiple dimension tables
- Support fact tables with varying granularity
- Simplify consolidation of data from multiple sources
- Potential for slower query performance in relational reporting
- No difference in performance in Analysis Services database

Hierarchies
- Benefits
  - View of data at different levels of summarization
  - Path to drill down or drill up
- Implementation
  - Denormalized star schema dimension
  - Normalized snowflake dimension
  - Self-referencing relationship

Fact Table Fundamentals
- Collection of measurements associated with a specific business process
- Specific column types
  - Foreign keys to dimensions
  - Measures – numeric and additive
  - Metadata and lineage
- Consistent granularity – the most atomic level by which the facts can be defined

Fact Table Examples
- Reseller sales data (Day Grain) by:
  - Product
  - Order Date
  - Reseller
  - Employee
  - Sales Territory
- Sales quota data (Quarter Grain) by:
  - Employee
  - Time

Date Dimension Table
- Most common dimension used in analysis (aka Time dimension)
- Used consistently with all facts for efficient and flexible analysis
- Useful common attributes – Year, Quarter, Month, Day
- Time series analysis support
- Navigation and summarization enabled with hierarchies, such as calendar or fiscal
- Single table design (typically not snowflake design)
- Tip: Format the key of the dimension as yyyymmdd (e.g. 20060925) to make it readily understandable
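
The yyyymmdd tip can be sketched in a few lines of Python. The function name build_date_dimension and the column names are assumptions made for this example; only the key format and the Year, Quarter, Month, Day attributes come from the slide.

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Return one row per calendar day between start and end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Integer key in yyyymmdd form, e.g. 20060925
            "date_key": int(current.strftime("%Y%m%d")),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,  # calendar quarter, 1 to 4
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows

dates = build_date_dimension(date(2006, 9, 24), date(2006, 9, 26))
for row in dates:
    print(row)
```

A fiscal calendar would add further columns to the same single table instead of snowflaking it.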

Parent-Child Hierarchy
- A dimension that contains a parent attribute
- A parent attribute describes a self-referencing relationship, or a self-join, within a dimension table
- Common examples
  - Organizational charts
  - General Ledger structures
  - Bill of Materials

Parent-Child Hierarchy Example
Organizational chart (diagram) with the members: Brian, Amy, Stacia, Stephen, Shu, Michael, Peter, José, Syed
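
A self-referencing relationship can be sketched as a mapping from each member to its parent. The names below come from the slide, but the reporting lines are invented for illustration, because the chart itself is not part of the transcript.

```python
# Assumed reporting lines: employee -> manager (None marks the top of the chart).
# Only the names come from the slide; who reports to whom is hypothetical.
manager_of = {
    "Brian": None,
    "Amy": "Brian",
    "Stephen": "Brian",
    "Stacia": "Amy",
    "Shu": "Amy",
    "Michael": "Stephen",
    "Peter": "Stephen",
    "José": "Michael",
    "Syed": "Michael",
}

def path_to_top(employee):
    """Drill up: follow the self-reference from an employee to the root."""
    path = [employee]
    while manager_of[path[-1]] is not None:
        path.append(manager_of[path[-1]])
    return path

def all_reports(manager):
    """Drill down: collect everyone below a manager, at any depth."""
    found = []
    for employee, boss in manager_of.items():
        if boss == manager:
            found.append(employee)
            found.extend(all_reports(employee))
    return found

print(path_to_top("Syed"))
print(all_reports("Stephen"))
```

In a dimension table the same idea appears as a parent key column that refers back to the table's own key.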

Slowly Changing Dimensions
- Support primary role of data warehouse to describe the past accurately
- Maintain historical context as new or changed data is loaded into dimension tables
- Implement changes by Slowly Changing Dimension (SCD) type
  - Type 1: Overwrite the existing dimension record
  - Type 2: Insert a new ‘versioned’ dimension record
  - Type 3: Track limited history with attributes

SCD Type 1
- Existing record is updated
- History is not preserved
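
A minimal sketch of a Type 1 change, using Python's sqlite3 module in place of SQL Server. The dim_reseller table, its columns, and the phone numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical dimension table with a surrogate key and a business key.
conn.execute(
    "CREATE TABLE dim_reseller (reseller_key INTEGER PRIMARY KEY, business_id TEXT, phone TEXT)"
)
conn.execute("INSERT INTO dim_reseller VALUES (1, 'R-100', '555-0100')")

def apply_type1_change(conn, business_id, new_phone):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    conn.execute(
        "UPDATE dim_reseller SET phone = ? WHERE business_id = ?",
        (new_phone, business_id),
    )

apply_type1_change(conn, "R-100", "555-0199")
rows = conn.execute("SELECT reseller_key, business_id, phone FROM dim_reseller").fetchall()
print(rows)
```

The row count and the surrogate key stay the same, so existing facts now report against the new value only.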

SCD Type 2
- Existing record is ‘expired’ and new record inserted
- History is preserved
- Most common form of SCD
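
A minimal sketch of a Type 2 change, again with sqlite3 standing in for SQL Server. The dim_employee layout, the start_date and end_date columns, and the convention that a NULL end_date marks the current version are assumptions for this example; real designs often use a current-row flag as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE dim_employee (
    employee_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, one per version
    employee_id TEXT,                                -- business key, shared by versions
    sales_territory TEXT,
    start_date TEXT,
    end_date TEXT                                    -- NULL marks the current version
)""")
conn.execute(
    "INSERT INTO dim_employee (employee_id, sales_territory, start_date, end_date) "
    "VALUES ('E-7', 'Northwest', '2005-01-01', NULL)"
)

def apply_type2_change(conn, employee_id, new_territory, change_date):
    """Type 2: expire the current version, then insert a new one."""
    conn.execute(
        "UPDATE dim_employee SET end_date = ? WHERE employee_id = ? AND end_date IS NULL",
        (change_date, employee_id),
    )
    conn.execute(
        "INSERT INTO dim_employee (employee_id, sales_territory, start_date, end_date) "
        "VALUES (?, ?, ?, NULL)",
        (employee_id, new_territory, change_date),
    )

apply_type2_change(conn, "E-7", "Southwest", "2006-09-25")
history = conn.execute(
    "SELECT employee_key, sales_territory, start_date, end_date "
    "FROM dim_employee ORDER BY employee_key"
).fetchall()
print(history)
```

Facts loaded before the change keep pointing at the old surrogate key, which is how the past stays described accurately.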

SCD Type 3
- Existing record is updated
- Limited history is preserved
- Implementation is rare
- Diagram callout: SalesTerritoryKey update to 10
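
A minimal sketch of a Type 3 change with sqlite3. The update of SalesTerritoryKey to 10 echoes the slide; the previous_sales_territory_key column and the starting value of 4 are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one extra column holds the single prior value.
conn.execute("""
CREATE TABLE dim_employee (
    employee_key INTEGER PRIMARY KEY,
    sales_territory_key INTEGER,
    previous_sales_territory_key INTEGER
)""")
conn.execute("INSERT INTO dim_employee VALUES (1, 4, NULL)")

def apply_type3_change(conn, employee_key, new_territory_key):
    """Type 3: keep the old value in a 'previous' column, then overwrite."""
    # Both assignments read the row as it was before the UPDATE,
    # so the 'previous' column receives the old territory key.
    conn.execute(
        "UPDATE dim_employee "
        "SET previous_sales_territory_key = sales_territory_key, "
        "    sales_territory_key = ? "
        "WHERE employee_key = ?",
        (new_territory_key, employee_key),
    )

apply_type3_change(conn, 1, 10)
row = conn.execute("SELECT * FROM dim_employee").fetchone()
print(row)
```

A second change would overwrite the previous column, which is why the history kept this way is limited.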

Let’s Get the Data
We would like to populate facts and dimensions in our Data Warehouse from OLTP data...

2. Integration and ETL

Let’s do ETL with SSIS
- SQL Server Integration Services (SSIS) service
- SSIS object model
- Two distinct runtime engines:
  - Control flow
  - Data flow
- 32-bit and 64-bit editions

The Package
- The basic unit of work, deployment, and execution
- An organized collection of:
  - Connection managers
  - Control flow components
  - Data flow components
  - Variables
  - Event handlers
  - Configurations
- Can be designed graphically or built programmatically
- Saved in XML format to the file system or SQL Server

Control Flow
- Control flow is a process-oriented workflow engine
- A package contains a single control flow
- Control flow elements
  - Containers
  - Tasks
  - Precedence constraints
  - Variables

Data Flow
- The Data Flow Task
  - Encapsulates the data flow engine
  - Exists in the context of an overall control flow
  - Performs traditional ETL in addition to other extended scenarios
  - Is fast and scalable
- Data Flow Components
  - Extract data from Sources
  - Load data into Destinations
  - Modify data with Transformations
- Service Paths
  - Connect data flow components
  - Create the pipeline

Data Flow Sources
Sources extract data from:
- Relational tables and views
- Files
- Analysis Services databases

Data Flow Destinations Microsoft BI Voyage Data Flow Destinations Destinations load data to Relational tables and views Files Analysis Services databases and objects DataReaders and Recordsets Enterprise Edition only

Populating Fact Tables Microsoft BI Voyage Populating Fact Tables Fact source Transform Lookup dimension key Repeat for each dimension key Lookup failed? Y Insert new dimension record N Insert new record

Populating Dimension Tables Microsoft BI Voyage Populating Dimension Tables Dimension source Transform Correlate records New record? Y N Type 1 change? Y Update changed column(s) N Type 2 change? Y Expire existing record Insert new record

Data Flow Transformations
- Aggregate, merge, distribute, or modify data
- Include error outputs in some cases
- Transformation Categories
  - Row
  - Rowset
  - Split and Join
  - Business Intelligence (BI)
  - Script
  - Other

Row Transformations
- Update column values or create new columns
- Transform each row in the pipeline input

Rowset Transformations
- Create new rowsets that can include
  - Aggregated values
  - Sorted values
  - Sample rowsets
  - Pivoted or unpivoted rowsets
- Are the heavyweight components of SSIS
- Are also called asynchronous components
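
The difference between a row transformation and a rowset (asynchronous) transformation can be sketched with Python generators. This is an analogy for the behaviour, not SSIS code, and the function names and sample rows are hypothetical.

```python
from collections import defaultdict

def derive_total(rows):
    """Row transformation: each input row yields one output row straight away."""
    for row in rows:
        yield {**row, "total": row["quantity"] * row["unit_price"]}

def aggregate_by(rows, key, value):
    """Rowset transformation: must consume all input before emitting anything."""
    sums = defaultdict(float)
    for row in rows:
        sums[row[key]] += row[value]
    # Output is a new rowset with a different shape from the input.
    for group, total in sorted(sums.items()):
        yield {key: group, value: total}

source = [
    {"category": "Bikes", "quantity": 2, "unit_price": 500.0},
    {"category": "Accessories", "quantity": 1, "unit_price": 50.0},
    {"category": "Bikes", "quantity": 1, "unit_price": 1200.0},
]
result = list(aggregate_by(derive_total(source), "category", "total"))
print(result)
```

The aggregate has to buffer its input, which is one reason such components weigh more heavily on memory and throughput than row-by-row ones.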

Split and Join Transformations
- Distribute rows to different outputs
- Create copies of the transformation inputs
- Join multiple inputs into one output
- Perform lookup operations

Demo
Using SQL Server Integration Services for Aggregating and Deriving Data
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

3. OLAP/Multidimensional Data

SQL Server 2008 Analysis Services
- OLAP component
  - Aggregates and organizes data from business data sources
  - Performs calculations difficult to perform using relational queries
  - Supports advanced business intelligence, such as Key Performance Indicators
- Data mining component
  - Discovers patterns in both relational and OLAP data
  - Enhances the OLAP component with discovered results

Cube = Unified Dimensional Model
- Multidimensional data
- Combination of measures and dimensions as one conceptual model
- Measures are sourced from fact tables
- Dimensions are sourced from dimension tables

Dimensions
- Members from tables/views in a data source view (based on a Data Warehouse)
- Contain attributes matching dimension columns
- Organize attributes as hierarchies
  - One All level and one leaf level
  - User hierarchies are multi-level combinations of attributes
  - Can be placed in display folders
- Used for slicing and dicing by attribute
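
A toy illustration of a cube as measures organized by dimensions, with an "All" member on each dimension. This is a conceptual sketch in Python, not how Analysis Services stores or computes cells, and the facts are invented.

```python
from collections import defaultdict
from itertools import product

# Hypothetical facts: (product category, year, sales amount)
facts = [
    ("Bikes", 2006, 1000.0),
    ("Bikes", 2007, 1200.0),
    ("Accessories", 2006, 50.0),
]

cube = defaultdict(float)
for category, year, amount in facts:
    # Each fact contributes to its leaf cell and to every "All" rollup cell.
    for category_member, year_member in product((category, "All"), (year, "All")):
        cube[(category_member, year_member)] += amount

print(cube[("Bikes", 2006)])   # a leaf cell
print(cube[("Bikes", "All")])  # sliced by category, all years
print(cube[("All", "All")])    # grand total
```

Slicing and dicing then amounts to choosing one member, or All, on each dimension and reading the cell.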

Hierarchy Defined in Analysis Services
- Ordered collection of attributes into levels
- Navigation path through dimensional space
- Very important to get right!
- Customers by Geography: Country, State, City, Customer
- Customers by Demographics: Marital, Gender, Customer
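
Drilling up and down the Customers by Geography hierarchy can be sketched as summarizing the same rows at different levels. The level order follows the slide; the customer rows, sales figures, and the summarize function are hypothetical.

```python
from collections import defaultdict

# Levels of the hierarchy, from most summarized to leaf.
LEVELS = ["country", "state", "city", "customer"]

# Hypothetical customer rows with one measure.
rows = [
    {"country": "USA", "state": "WA", "city": "Seattle", "customer": "C1", "sales": 100.0},
    {"country": "USA", "state": "WA", "city": "Redmond", "customer": "C2", "sales": 250.0},
    {"country": "USA", "state": "OR", "city": "Portland", "customer": "C3", "sales": 75.0},
    {"country": "Canada", "state": "BC", "city": "Vancouver", "customer": "C4", "sales": 40.0},
]

def summarize(rows, level):
    """Total sales for every member at the given level, keyed by its path from the top."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        path = tuple(row[name] for name in LEVELS[:depth])
        totals[path] += row["sales"]
    return dict(totals)

by_country = summarize(rows, "country")  # drilled all the way up
by_state = summarize(rows, "state")      # one level down
print(by_country)
print(by_state)
```

Keying each member by its full path keeps two cities with the same name in different states apart.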

Measure Group
- Group of measures with same dimensionality
- Analogous to a fact table
- Cube can contain more than one measure group
  - E.g. Sales, Inventory, Finance
- Defined by dimension relationships

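Measure Group
Matrix of measure groups against dimensions (diagram; an X marks each dimension that a measure group uses):
- Measure groups: Sales, Inventory, Finance
- Dimensions: Customers, Products, Time, Promotions, Warehouse, Department, Account, Scenario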

Dimension Relationships
- Define interaction between dimensions and measure groups
- Relationship types
  - Regular
  - Reference
  - Fact (Degenerate)
  - Many-to-many
  - Data mining

Dimension Model
Diagram with two panels, Attributes and Hierarchies, built from the customer attributes: Customer, City, State, Country, Gender, Marital, Age

Calculations
- Expressions evaluated at query time for values that cannot be stored in fact table
- Types of calculations
  - Calculated members
  - Named sets
  - Scoped assignments
- Calculations are defined using MDX
  - MDX = MultiDimensional EXpressions
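
One reason some values cannot be stored in a fact table is that they are not additive. The sketch below uses Python rather than MDX to show a ratio (margin) that has to be evaluated at query time from aggregated measures. The facts and the margin function are invented for illustration.

```python
# Hypothetical facts with two additive measures.
facts = [
    {"product": "Road Bike", "sales": 1000.0, "cost": 600.0},
    {"product": "Helmet", "sales": 100.0, "cost": 90.0},
]

def margin(rows):
    """Calculated value: aggregate the additive measures first, then divide."""
    sales = sum(row["sales"] for row in rows)
    cost = sum(row["cost"] for row in rows)
    return (sales - cost) / sales

# Evaluated at query time over whatever slice is requested.
overall_margin = margin(facts)

# Storing a margin per row and averaging it gives a different, misleading number.
naive_average = sum(margin([row]) for row in facts) / len(facts)

print(round(overall_margin, 4), round(naive_average, 4))
```

In a cube this kind of expression would be a calculated member defined in MDX over the Sales and Cost measures.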

Demo
1. Using BIDS to Review Dimension Design
2. Cube Design and Functionality

Summary
- As a platform for enterprise Business Intelligence you should consider three things:
  - A Data Warehouse
  - Process of Data Integration (incl. ETL)
  - Multidimensional Analysis (OLAP)
  - = SQL Server 2008 Engine, SSIS, and SSAS
- Now you can support decision making and performance management through:
  - Reports, dashboards, Excel integration, data mining, and better business software

© 2009 Microsoft Corporation & Project Botticelli Ltd. All rights reserved.