Datawarehouse & Datamart; OLAPs vs. OLTPs; Dimensional Modeling; Creating Physical Design Using SQL Mgt. Studio. Module II: Designing Datamarts


1 Module II: Designing Datamarts
- Datawarehouse & Datamart
- OLAPs vs. OLTPs
- Dimensional Modeling
- Creating Physical Design Using SQL Mgt. Studio

2 BI System Components
- Data Source: Flat Files, Transactions DB (OLTP), XML Files, Excel Files, etc.
- Data Repository: Datamart, Data Warehouse
- OLAP System: Multidimensional Database (Cubes)
- Data Analysis: Visualization, Cube Browsing, Reporting, Dashboards, Data Mining
Course modules mapped to the components:
- Module 1: Delivering BI (Chapters 1, 2, 10, 18, Larson Book): Creating KPI, Creating Reports, Excel and Tableau
- Module 2: Design a Datamart (Chapters 3 & 6, Larson Book): Requirement Analysis, Creating a Schema, SS DB Engine
- Module 3: Business Analytics (Chapters 4, 9, 10, Larson Book): Build an OLAP/Cube, SSA Services
- Module 4: Populate a DataMart (Chapters 7 & 8, Larson Book): ETL Process, SSI Services

3 Outline
- Data Warehouse Concept
- OLAPs vs. OLTPs (fundamental differences that suggest the need for different design approaches)
- Dimensional Modeling
- Creating Physical Design Using SQL Mgt. Studio

4 Datawarehouse & Datamart: Concept and Characteristics

5 Data Warehouse
- A data warehouse is a "central" repository for all or significant parts of the data that an enterprise's various business systems collect.
- A warehouse is a collection of data that is subject-oriented, integrated, time-variant, and non-volatile.
- It provides a consolidated view of enterprise data, optimized for reporting and analysis.
- It is a physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format.
- Data marts are smaller versions of warehouses.

6 OLAP vs OLTP

7 OLAP vs. OLTP
- Online Transaction Processing Systems (OLTP): Systems that process transactions (e.g., order processing) by inserting, updating, and deleting the appropriate records in a database at the end of each transaction.
- Online Analytical Processing Systems (OLAP): Systems that summarize and analyze a collection of transaction data.
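The contrast between the two workloads can be sketched in a few lines. This is only an illustration: it uses Python's built-in sqlite3 module rather than SQL Server, and the `orders` table, its columns, and the `record_order` helper are hypothetical names that do not come from the slides.

```python
import sqlite3

# In-memory database standing in for a transactional system (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)


def record_order(customer, amount):
    """OLTP-style work: one small write, committed at the end of the transaction."""
    with conn:  # the context manager commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )


for customer, amount in [("Ann", 20.0), ("Bob", 35.0), ("Ann", 15.0)]:
    record_order(customer, amount)

# OLAP-style work: a read-only query that summarizes many transactions.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

In a real BI system the summary query would normally run against a separate data mart loaded from the transactional database, not against the OLTP tables themselves.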

8 OLTP vs OLAP
- Relationship between OLTP and OLAP?
- Structural/Design differences?
- Purpose/Function difference?
- Difference in the type of data or information stored
- Size (users, data stored)
- Performance metric?

9 OLTP vs OLAP
Relationship between OLTP and OLAP?
- OLTP is a data source for OLAP
Structural/Design differences?
- ER Modeling vs. Dimensional Modeling
- ER design vs. star or snowflake design
- ER design follows well-structured steps that have been used and tested for decades; star and snowflake designs have been widely used for only a decade, are still unstructured, and their "rules" are not well established
- Application oriented vs. subject oriented

10 OLTP vs OLAP
Purpose/Function difference?
- OLTP processes transactions vs. OLAP conducts analysis (performance, gaining insight)
- OLTP focuses on transaction-processing efficiency vs. OLAP eases data retrieval in a way that is cognitively less overloading (allows "chunks" or "cubes" of data to be viewed)
- OLTP processes repetitive transactions (insert, delete) and conducts simple manipulations (select, update) vs. OLAP examines (mostly read-only) many data items and complex relationships, and focuses on aggregates
- OLTP views detailed, flat transactions vs. OLAP views multidimensional data and aggregates

11 OLTP vs OLAP
Difference in the type of data or information stored
- OLTP data is current and isolated vs. OLAP data is historic and consolidated
- OLTP stores data specific to a transaction vs. OLAP stores data specific to performance
Size
- Users: OLTP has thousands of users vs. OLAP has hundreds or fewer
- Data stored: OLTP stores 100s of MB to GB vs. OLAP stores 100s of GB to TB
Performance Metric?
- OLTP transaction throughput vs. OLAP query throughput
- Data quality: "dirty" data is a major issue for OLAP

12 Dimensional Modeling: Modeling technique used to design data warehouses and data marts

13 ER Modeling vs. Dimensional Modeling
ER Modeling
- Transaction capture
- Reduces data redundancy: highly normalized tables
- Hard for end users to understand and remember
- Not query friendly
- All the attributes for an entity, categorical as well as numeric, belong to the entity table
- Well-defined, theory-driven process
Dimensional Modeling
- Data retrieval
- Intuitive, with high query performance
- Categorical data goes in a 'dimension' entity, and the 'fact' entity has mostly numeric attributes; the only categorical (non-fact) fields in the fact table are the keys to dimension tables
- Process is ill-defined, more of an art

14 Dimensional Modeling – Benefits
1. Produce database structures that are easy for end users to understand and write queries against.
2. Optimize query performance (as opposed to update performance).
3. Scalability: dimensional models are scalable and "easily" accommodate unexpected new data.

15 Designing a Data Mart
- Identify the information that the decision makers need: measures, dimensions, hierarchies, and attributes. (Group Deliverable I)
- Build the database structure for the data mart using either a star or snowflake schema. (Group Deliverable II)

16 Requirement Analysis – Decision Makers' Needs (GD#1)
- Business intelligence design must start with the decision makers
- What foundational and feedback information do they need?
- How do they need that information sliced and diced for proper analysis?
- More specifically:
  - What facts, figures, statistics, and so forth do you need for effective decision making? (measures)
  - How should this information be sliced and diced for analysis? (dimensions)
  - What additional information can aid in decision making? (attributes)

17 Data Mart – Structure
A data mart's structure consists of the following two types of data objects:
- Performance Measures (also referred to as facts)
- Dimensions
  - Hierarchies
  - Attributes

18 Data Mart – Structure
- Performance Measures: A measure is a numeric quantity expressing some aspect of the organization's performance. The information represented by this quantity is used to support or evaluate the decision making and performance of the organization. A measure can also be called a fact. Example: Total Sales.
- Information needed during the design process:
  1. Name of the measure
  2. Which fields should be used to supply the data (source)
  3. Data type (money, integer, decimal)
  4. Formula used to calculate the measure (if there is one)
- Measures define what the decision makers want to see.

19 Data Mart – Structure
- Dimensions (Slicers): A dimension is a categorization used to spread out an aggregate measure to reveal its constituent parts. Example: "total sales by sales person by year".
- Key words that signal a dimension: "by," "for each," or "for every"
- Information needed during the design process:
  - Name of the dimension
  - Which fields should be used to supply the data (source)
  - Data type of the dimension's key (the code that uniquely identifies each member of the dimension)
  - Name of the parent dimension (if there is one)
- The dimensions and hierarchies define how the decision maker wants to view the data.
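The request "total sales by sales person by year" from the dimensions slide can be sketched as a query: the measure is summed, and each "by" becomes a grouping column. This is a minimal illustration using Python's sqlite3 module; the table and column names (`dim_salesperson`, `fact_sales`, `sale_year`, `sales_amount`) and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_salesperson (
        salesperson_key INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE fact_sales (
        salesperson_key INTEGER REFERENCES dim_salesperson (salesperson_key),
        sale_year INTEGER,
        sales_amount REAL
    );
    INSERT INTO dim_salesperson VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO fact_sales VALUES
        (1, 2022, 100.0), (1, 2023, 150.0), (2, 2023, 80.0), (2, 2023, 20.0);
    """
)

# The aggregate measure on its own: a single number.
grand_total = conn.execute("SELECT SUM(sales_amount) FROM fact_sales").fetchone()[0]

# The same measure "by sales person by year": the dimensions spread it out.
by_person_year = conn.execute(
    """
    SELECT d.name, f.sale_year, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_salesperson AS d ON d.salesperson_key = f.salesperson_key
    GROUP BY d.name, f.sale_year
    ORDER BY d.name, f.sale_year
    """
).fetchall()

print(grand_total)
print(by_person_year)
```

The parts always add back up to the aggregate, which is what "reveal its constituent parts" means in practice.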

20 Data Mart – Structure
- Hierarchy (Slicers; Drill Down): A hierarchy is a structure made up of two or more levels of related dimensions. A dimension at an upper level of the hierarchy completely contains one or more dimensions from the next lower level of the hierarchy. Example: Time Dimension (Month, Quarter, Year).
- Hierarchies are used to organize dimensions into various levels.
- Hierarchies let you "roll up cities into sales regions" or "drill down from year into quarter".
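Rolling up and drilling down along the time hierarchy amounts to grouping the same facts at a different level. The sketch below assumes a hypothetical `dim_month` table that carries month, quarter, and year columns, plus a `total_by` helper; none of these names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_month (
        month_key INTEGER PRIMARY KEY,
        month_name TEXT,
        quarter TEXT,
        year INTEGER
    );
    CREATE TABLE fact_sales (
        month_key INTEGER REFERENCES dim_month (month_key),
        sales_amount REAL
    );
    INSERT INTO dim_month VALUES
        (202301, 'Jan', 'Q1', 2023),
        (202302, 'Feb', 'Q1', 2023),
        (202304, 'Apr', 'Q2', 2023);
    INSERT INTO fact_sales VALUES
        (202301, 10.0), (202302, 20.0), (202304, 5.0), (202304, 15.0);
    """
)

LEVELS = ("month_name", "quarter", "year")  # lowest to highest level


def total_by(level):
    """Sum the sales measure at one level of the time hierarchy."""
    if level not in LEVELS:  # only known column names are interpolated
        raise ValueError(f"unknown level: {level}")
    sql = (
        f"SELECT m.{level}, SUM(f.sales_amount) "
        "FROM fact_sales AS f JOIN dim_month AS m ON m.month_key = f.month_key "
        f"GROUP BY m.{level} ORDER BY MIN(m.month_key)"
    )
    return conn.execute(sql).fetchall()


print(total_by("year"))        # rolled up to the top level
print(total_by("quarter"))     # drill down from year into quarter
print(total_by("month_name"))  # drill down again into month
```

Each level's totals sum to the level above it because every month belongs to exactly one quarter and every quarter to exactly one year.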

21 Data Mart – Structure
- Attributes: An attribute is an additional piece of information pertaining to a dimension member that is not the unique identifier or the description of the member. Examples: a regional manager's information, customers' gender and age.
- Provides more contextual information about a dimension.
- Information needed during the design process:
  - Name of the attribute
  - Which fields should be used to supply the data (source)
  - Data type
  - Name of the dimension to which it applies
- Allows decision makers to filter data.
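Filtering on an attribute can be sketched as a WHERE clause on a dimension column. The example below uses the customer age attribute mentioned on the slide; the `dim_customer` and `fact_sales` tables and their sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,  -- unique identifier
        customer_name TEXT,                -- description of the member
        gender TEXT,                       -- attribute
        age INTEGER                        -- attribute
    );
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        sales_amount REAL
    );
    INSERT INTO dim_customer VALUES
        (1, 'Ann', 'F', 34), (2, 'Bob', 'M', 52), (3, 'Cy', 'M', 27);
    INSERT INTO fact_sales VALUES (1, 40.0), (2, 25.0), (3, 10.0), (3, 5.0);
    """
)

# Attributes do not change the measure; they narrow which members contribute.
sales_under_40 = conn.execute(
    """
    SELECT SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    WHERE c.age < 40
    """
).fetchone()[0]

print(sales_under_40)
```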

22 Dimensional Design – The Schema
- Key principle: A dimensional schema physically separates the measures that quantify a subject's performance (e.g., student, business, team, process) from the descriptive elements (a.k.a. dimensions) that summarize and categorize the performance.
- Two types of schema:
  - Star schema
  - Snowflake schema

23 Data Mart's Data Objects: various measures and dimensions, how to configure?
[Figure: Measures, Dimensions, Hierarchies]

24 The main idea underlying this design
[Figure: a Measure Group linked to Dim 1 through Dim 6]

25 The Star Schema

26 The Snow Flake Schema

27 The Tables
- Measures: All the measures are placed in a single table, called the fact table, in the schema.
- Each dimension is placed in its own table.
- In the star schema, all the information for a hierarchy is stored in the same table. The information for the parent (or grandparent or great-grandparent, and so forth) dimension is added to the table containing the dimension at the lowest level of the hierarchy.
- The snowflake schema works a bit differently. In the snowflake schema, each level in the dimensional hierarchy has its own table. The dimension tables are linked together with foreign key relationships to form the hierarchy.
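The two layouts described above can be compared side by side. The sketch below builds a city-to-region hierarchy both ways and runs the same "sales by region" question against each. It uses Python's sqlite3 module instead of SQL Server, and every table and column name (`star_dim_city`, `snow_dim_city`, `snow_dim_region`, `fact_sales`) is a hypothetical choice for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star: the whole city -> region hierarchy lives in one dimension table.
    CREATE TABLE star_dim_city (
        city_key INTEGER PRIMARY KEY,
        city_name TEXT,
        region_name TEXT
    );

    -- Snowflake: one table per hierarchy level, linked by a foreign key.
    CREATE TABLE snow_dim_region (
        region_key INTEGER PRIMARY KEY,
        region_name TEXT
    );
    CREATE TABLE snow_dim_city (
        city_key INTEGER PRIMARY KEY,
        city_name TEXT,
        region_key INTEGER REFERENCES snow_dim_region (region_key)
    );

    -- A single fact table holds the measures in either design.
    CREATE TABLE fact_sales (city_key INTEGER, sales_amount REAL);

    INSERT INTO star_dim_city VALUES
        (1, 'Austin', 'South'), (2, 'Dallas', 'South'), (3, 'Boston', 'East');
    INSERT INTO snow_dim_region VALUES (10, 'South'), (20, 'East');
    INSERT INTO snow_dim_city VALUES
        (1, 'Austin', 10), (2, 'Dallas', 10), (3, 'Boston', 20);
    INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0), (3, 70.0);
    """
)

# Star schema: one join from the fact table reaches every hierarchy level.
star_by_region = conn.execute(
    """
    SELECT c.region_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN star_dim_city AS c ON c.city_key = f.city_key
    GROUP BY c.region_name
    ORDER BY c.region_name
    """
).fetchall()

# Snowflake schema: walk the foreign keys up the hierarchy, one join per level.
snow_by_region = conn.execute(
    """
    SELECT r.region_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN snow_dim_city AS c ON c.city_key = f.city_key
    JOIN snow_dim_region AS r ON r.region_key = c.region_key
    GROUP BY r.region_name
    ORDER BY r.region_name
    """
).fetchall()

print(star_by_region)
print(snow_by_region)
```

Both designs answer the question identically. The star version repeats the region name on every city row but needs fewer joins, while the snowflake version stores each region once at the cost of an extra join per level.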

28 A Four Step Dimensional Modeling Process (not in the book; see http://www.kimballgroup.com/)
- Step 1: Describe the business process that the datamart supports and identify the sources of measurement. Key concept: measurement events
- Step 2: Declare the fact table grain. Key concept: fact table data views
- Step 3: Choose the dimensions. Key concept: cardinalities and hierarchies
- Step 4: Choose the facts. Key concept: their relationship with the measurement events and the grain
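One way to read Step 2 is that the grain states exactly what one row of the fact table represents, which in turn fixes how measurement events map to rows. The sketch below is an illustration under that reading, not the Kimball Group's own procedure; the event fields and the `load_facts` helper are hypothetical.

```python
from collections import defaultdict

# Step 1 (assumed): each measurement event is one sale line at a store.
events = [
    {"date": "2023-01-05", "product": "A", "store": "S1", "amount": 10.0},
    {"date": "2023-01-05", "product": "A", "store": "S2", "amount": 5.0},
    {"date": "2023-01-05", "product": "B", "store": "S1", "amount": 7.0},
    {"date": "2023-01-06", "product": "A", "store": "S1", "amount": 3.0},
]


def load_facts(events, grain):
    """Build fact rows keyed by the declared grain, summing the amount measure."""
    facts = defaultdict(float)
    for event in events:
        key = tuple(event[column] for column in grain)
        facts[key] += event["amount"]
    return dict(facts)


# Grain "one row per product per day": store-level detail is no longer available.
daily_product = load_facts(events, ("date", "product"))

# A finer grain "one row per product per store per day" keeps that detail.
daily_product_store = load_facts(events, ("date", "product", "store"))

print(len(daily_product), len(daily_product_store))
```

The choice of grain limits Steps 3 and 4: with the coarser grain above, a store dimension can no longer be attached to the facts.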

29 Dimension Modeling Details - Steps and Examples: Refer to the Class Handout and LBD#1 for this section

30 Converting Logical Design to Physical Design Using SQL Mgt. Studio: Refer to LBD#2 for this section

31 Summary
- Overview of the Data Warehouse concept: a data source for OLAPs
- OLTP vs OLAP: compare and contrast
- Dimensional Modeling
  - Benefits
  - Data objects
  - Data structures
  - Schemas: logical and physical

