Module II: Designing Datamarts
Topics: Datawarehouse & Datamart; OLAPs vs. OLTPs; Dimensional Modeling; Creating Physical Design Using SQL Mgt. Studio

BI System Components
- Data Source: Flat Files, Transactions DB (OLTP), XML Files, Excel Files, etc.
- Data Repository: Datamart, Data Warehouse
- OLAP System: Multidimensional Database (Cubes)
- Data Analysis: Visualization, Cube Browsing, Reporting, Dashboards, Data Mining
Course modules mapped to these components:
- Module 1: Delivering BI (Chapters 1, 2, 10, 18, Larson book): Creating KPI, Creating Reports, Excel and Tableau
- Module 2: Design a Datamart (Chapters 3 & 6, Larson book): Requirement Analysis, Creating a Schema, SS DB Engine
- Module 3: Business Analytics (Chapters 4, 9, 10, Larson book): Build an OLAP/Cube, SSA Services
- Module 4: Populate a DataMart (Chapters 7 & 8, Larson book): ETL Process, SSI Services

Outline
- Data Warehouse Concept
- OLAPs vs. OLTPs (fundamental differences that suggest the need for different design approaches)
- Dimensional Modeling
- Creating Physical Design Using SQL Mgt. Studio

Datawarehouse & Datamart: Concept and Characteristics

Data Warehouse
- A data warehouse is a “central” repository for all or significant parts of the data that an enterprise's various business systems collect.
- A warehouse is a collection of data that is subject-oriented, integrated, time-variant, and non-volatile.
- It provides a consolidated view of enterprise data, optimized for reporting and analysis.
- It is a physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format.
- Data marts are smaller versions of warehouses.

OLAP vs OLTP

OLAP vs. OLTP
- Online Transaction Processing Systems (OLTP): systems that process transactions (e.g., order processing), inserting, updating, or deleting the appropriate records in a database at the end of each transaction.
- Online Analytical Processing Systems (OLAP): systems that summarize and analyze a collection of transaction data.
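
The contrast between the two workloads can be sketched in a few lines. This is a minimal illustration, not part of the course material: it uses Python's built-in sqlite3 module as a stand-in for any relational database, and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical order table; SQLite stands in for any relational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# OLTP-style work: one small write per business transaction,
# each committed on its own (the "with" block commits on success).
with conn:
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("Ann", 120.0))
with conn:
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("Bob", 80.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (100.0, 2))

# OLAP-style work: a read-only query that summarizes many transactions.
order_count, total_amount = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM orders"
).fetchone()
print(order_count, total_amount)
```

In practice the analytical query would run against a separate data mart loaded from the transactional system, not against the OLTP tables themselves.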

OLTP vs OLAP: questions to consider
- Relationship between OLTP and OLAP?
- Structural/design differences?
- Purpose/function difference?
- Difference in the type of data or information stored
- Size: users, data stored
- Performance metric?

OLTP vs OLAP
Relationship between OLTP and OLAP?
- OLTP is a data source for OLAP.
Structural/design differences?
- ER modeling vs. dimensional modeling
- ER design vs. star or snowflake design
- ER design follows well-structured steps that have been used and tested for decades; star and snowflake design has been widely used for only about a decade, is still unstructured, and its “rules” are not well established.
- Application oriented vs. subject oriented

OLTP vs OLAP
Purpose/function difference?
- OLTP processes transactions; OLAP conducts analysis (to assess performance and gain insight).
- OLTP focuses on transaction-processing efficiency; OLAP eases data retrieval in a way that is less cognitively overloading (it allows “chunks” or “cubes” of data to be viewed).
- OLTP processes repetitive transactions (insert, delete) and conducts simple manipulations (select, update); OLAP examines many data items and complex relationships (mostly read-only) and focuses on aggregates.
- OLTP views detailed, flat transactions; OLAP views multidimensional, aggregated data.

OLTP vs OLAP
Difference in the type of data or information stored
- OLTP data is current and isolated; OLAP data is historic and consolidated.
- OLTP stores data specific to a transaction; OLAP stores data specific to performance.
Size
- Users: OLTP has thousands of users; OLAP has hundreds or fewer.
- Data stored: OLTP stores 100s of MB to GB; OLAP stores 100s of GB to TB.
Performance metric?
- OLTP: transaction throughput; OLAP: query throughput.
Data quality
- “Dirty” data is a major issue for OLAP.

Dimensional Modeling
The modeling technique used to design data warehouses and data marts

ER Modeling vs. Dimensional Modeling
ER Modeling
- Purpose: transaction capture
- Reduces data redundancy through highly normalized tables
- Hard for end users to understand and remember
- Not query friendly
- All the attributes for an entity, categorical as well as numeric, belong to the entity table
- Well-defined, theory-driven process
Dimensional Modeling
- Purpose: data retrieval
- Intuitive, with high query performance
- Categorical data goes in a 'dimension' entity, and the 'fact' entity has mostly numeric attributes
- The only categorical (non-fact) fields in the fact table are the keys to dimension tables
- Process is ill-defined, more of an art

Dimensional Modeling: Benefits
1. Produce database structures that are easy for end users to understand and write queries against.
2. Optimize query performance (as opposed to update performance).
3. Scalability: dimensional models are scalable and “easily” accommodate unexpected new data.

Designing a Data Mart
- Identify the information that the decision makers need: measures, dimensions, hierarchies, and attributes. (Group Deliverable I)
- Build the database structure for the data mart using either a star or snowflake schema. (Group Deliverable II)

Requirement Analysis: Decision Makers' Needs (GD#1)
- Business intelligence design must start with the decision makers.
- What foundational and feedback information do they need?
- How do they need that information sliced and diced for proper analysis?
- More specifically:
  - What facts, figures, statistics, and so forth do you need for effective decision making? (measures)
  - How should this information be sliced and diced for analysis? (dimensions)
  - What additional information can aid in decision making? (attributes)

Data Mart: Structure
A data mart's structure consists of the following two types of data objects:
- Performance Measures (also referred to as facts)
- Dimensions, together with their Hierarchies and Attributes

Data Mart: Structure
Performance Measures: A measure is a numeric quantity expressing some aspect of the organization's performance. The information represented by this quantity is used to support or evaluate the decision making and performance of the organization. A measure can also be called a fact. Example: Total Sales.
Information needed during the design process:
1. Name of the measure
2. What fields should be used to supply the data (source)
3. Data type (money, integer, decimal)
4. Formula used to calculate the measure (if there is one)
Measures define what the decision makers want to see.

Data Mart: Structure
Dimensions (Slicers): A dimension is a categorization used to spread out an aggregate measure to reveal its constituent parts. Example: “total sales by sales person by year”.
Key words that signal a dimension: “by”, “for each”, or “for every”.
Information needed during the design process:
- Name of the dimension
- What fields should be used to supply the data (source)
- Data type of the dimension's key (the code that uniquely identifies each member of the dimension)
- Name of the parent dimension (if there is one)
The dimensions and hierarchies define how the decision maker wants to view the data.
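
The slide's example, “total sales by sales person by year”, can be sketched as a grouped query. This is an illustrative sketch only: sqlite3 stands in for the course's database engine, and the `sales` table, its columns, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: one measure (amount) and two dimensions.
conn.execute("CREATE TABLE sales (sales_person TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ann", 2023, 100.0),
        ("Ann", 2023, 50.0),
        ("Ann", 2024, 70.0),
        ("Bob", 2024, 30.0),
    ],
)

# The aggregate measure on its own: a single number.
total_sales = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Each "by" becomes a GROUP BY column that spreads the total into its parts.
sales_by_person_by_year = conn.execute(
    "SELECT sales_person, year, SUM(amount) FROM sales "
    "GROUP BY sales_person, year ORDER BY sales_person, year"
).fetchall()

print(total_sales)
print(sales_by_person_by_year)
```

The parts always add back up to the undivided total, which is a quick sanity check when validating a dimension.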

Data Mart: Structure
Hierarchy (Slicers; Drill Down): A hierarchy is a structure made up of two or more levels of related dimensions. A dimension at an upper level of the hierarchy completely contains one or more dimensions from the next lower level of the hierarchy. Example: Time Dimension with Month, Quarter, Year.
- Hierarchies are used to organize dimensions into various levels.
- Typical phrasing: “roll up cities into sales regions” or “drill down from year into quarter”.
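
Rolling up and drilling down along the time hierarchy amounts to grouping the same facts at a coarser or finer level. The sketch below assumes a hypothetical `dim_time` dimension and `fact_sales` fact table, with sqlite3 standing in for the course's database engine; the `sales_by` helper is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY, month INTEGER, quarter INTEGER, year INTEGER
);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key), amount REAL
);
INSERT INTO dim_time VALUES (1, 1, 1, 2024), (2, 2, 1, 2024), (3, 4, 2, 2024);
INSERT INTO fact_sales VALUES (1, 10.0), (2, 20.0), (3, 5.0), (3, 15.0);
""")

# Each hierarchy level maps to a fixed, trusted column list for GROUP BY.
LEVELS = {
    "year": "t.year",
    "quarter": "t.year, t.quarter",
    "month": "t.year, t.quarter, t.month",
}

def sales_by(level):
    """Aggregate the sales measure at one level of the time hierarchy."""
    cols = LEVELS[level]  # raises KeyError for an unknown level
    sql = (
        f"SELECT {cols}, SUM(f.amount) FROM fact_sales f "
        f"JOIN dim_time t ON t.time_key = f.time_key "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()

by_year = sales_by("year")        # rolled up
by_quarter = sales_by("quarter")  # drilled down one level
by_month = sales_by("month")      # drilled down two levels
print(by_year, by_quarter, by_month)
```

Because each year completely contains its quarters and each quarter its months, the finer results always sum to the coarser ones.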

Data Mart: Structure
Attributes: An attribute is an additional piece of information pertaining to a dimension member that is not the unique identifier or the description of the member. Examples: a regional manager's information; customers' gender and age.
- Provides more contextual information about a dimension
- Allows decision makers to filter data
Information needed during the design process:
- Name of the attribute
- What fields should be used to supply the data (source)
- Data type
- Name of the dimension to which it applies
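
Filtering on an attribute can be sketched as a WHERE clause on a dimension column that is neither the key nor the member's description. The `dim_customer` and `fact_sales` tables and their rows are hypothetical, and sqlite3 is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- customer_key identifies the member, name describes it;
-- gender and age are attributes.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY, name TEXT, gender TEXT, age INTEGER
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key), amount REAL
);
INSERT INTO dim_customer VALUES (1, 'Ann', 'F', 34), (2, 'Bob', 'M', 27), (3, 'Cy', 'M', 45);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 40.0), (3, 60.0), (1, 25.0);
""")

# The attribute filters which dimension members contribute to the measure.
sales_to_customers_over_30 = conn.execute(
    "SELECT SUM(f.amount) FROM fact_sales f "
    "JOIN dim_customer c ON c.customer_key = f.customer_key "
    "WHERE c.age > 30"
).fetchone()[0]
print(sales_to_customers_over_30)
```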

Dimensional Design: The Schema
Key principle: a dimensional schema physically separates the measures that quantify a subject's performance (e.g., student, business, team, process) from the descriptive elements (a.k.a. dimensions) that summarize and categorize the performance.
Two types of schema:
- Star schema
- Snowflake schema

Data Mart Data Objects: how should the various measures and dimensions be configured?
Diagram labels: Measures, Dimensions, Hierarchies

The main idea underlying this design
Diagram labels: Measure Group; Dim 1, Dim 2, Dim 3, Dim 4, Dim 5, Dim 6

The Star Schema

The Snowflake Schema

The Tables
- Measures: all the measures are placed in a single table in the schema, called the fact table.
- Each dimension is placed in its own table.
- In the star schema, all the information for a hierarchy is stored in the same table. The information for the parent (or grandparent or great-grandparent, and so forth) dimension is added to the table containing the dimension at the lowest level of the hierarchy.
- The snowflake schema works a bit differently. Each level in the dimensional hierarchy has its own table, and the dimension tables are linked together with foreign key relationships to form the hierarchy.
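
The difference between the two layouts can be sketched with a two-level product hierarchy (product and its parent category). Everything here is an assumption made for illustration: the table and column names are invented, and sqlite3 stands in for the SQL Server engine the course uses, so the DDL would need adapting there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the whole hierarchy lives in one dimension table; the parent
-- level (category) is a column on the lowest-level (product) table.
CREATE TABLE star_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT
);

-- Snowflake: each level has its own table, linked by a foreign key.
CREATE TABLE snow_dim_category (
    category_key INTEGER PRIMARY KEY, category_name TEXT
);
CREATE TABLE snow_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT,
    category_key INTEGER REFERENCES snow_dim_category(category_key)
);

-- A single fact table holds the measure, keyed by the dimension.
CREATE TABLE fact_sales (product_key INTEGER, sales_amount REAL);

INSERT INTO star_dim_product VALUES
    (1, 'Hammer', 'Tools'), (2, 'Saw', 'Tools'), (3, 'Paint', 'Decor');
INSERT INTO snow_dim_category VALUES (10, 'Tools'), (20, 'Decor');
INSERT INTO snow_dim_product VALUES (1, 'Hammer', 10), (2, 'Saw', 10), (3, 'Paint', 20);
INSERT INTO fact_sales VALUES (1, 10.0), (2, 15.0), (3, 7.5), (1, 5.0);
""")

# Star: one join reaches the category level.
star_rollup = conn.execute(
    "SELECT p.category_name, SUM(f.sales_amount) FROM fact_sales f "
    "JOIN star_dim_product p ON p.product_key = f.product_key "
    "GROUP BY p.category_name ORDER BY p.category_name"
).fetchall()

# Snowflake: one extra join per hierarchy level.
snow_rollup = conn.execute(
    "SELECT c.category_name, SUM(f.sales_amount) FROM fact_sales f "
    "JOIN snow_dim_product p ON p.product_key = f.product_key "
    "JOIN snow_dim_category c ON c.category_key = p.category_key "
    "GROUP BY c.category_name ORDER BY c.category_name"
).fetchall()

print(star_rollup, snow_rollup)
```

Both layouts answer the same question. The star version repeats the category name on every product row, while the snowflake version stores it once and pays for that with an additional join.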

A Four-Step Dimensional Modeling Process (not in the book)
- Step 1: Describe the business process that the datamart supports and identify the sources of measurement. Key concept: measurement events.
- Step 2: Declare the fact table grain. Key concept: fact table data views.
- Step 3: Choose the dimensions. Key concept: cardinalities and hierarchies.
- Step 4: Choose the facts. Key concept: their relationship with the measurement events and the grain.
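
One way to keep the four decisions together is to record them in a small structure and check sample fact rows against the declared grain. This is a sketch under assumptions, not the procedure from the class handout: the `DataMartDesign` class, the `rows_violating_grain` helper, and the retail example are all hypothetical, and the grain is interpreted here as the combination of dimension keys that should identify exactly one fact row.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class DataMartDesign:
    business_process: str  # Step 1: the process being measured
    grain: tuple           # Step 2: keys that identify one fact row
    dimensions: tuple      # Step 3: how the facts are sliced
    facts: tuple           # Step 4: numeric measures at that grain

def rows_violating_grain(design, rows):
    """Return the grain-key combinations that occur in more than one row."""
    counts = Counter(tuple(row[key] for key in design.grain) for row in rows)
    return [combo for combo, n in counts.items() if n > 1]

design = DataMartDesign(
    business_process="retail sales",
    grain=("date_key", "store_key", "product_key"),
    dimensions=("date", "store", "product"),
    facts=("sales_amount", "units_sold"),
)

rows = [
    {"date_key": 1, "store_key": 1, "product_key": 7, "sales_amount": 9.5, "units_sold": 1},
    {"date_key": 1, "store_key": 1, "product_key": 8, "sales_amount": 4.0, "units_sold": 2},
    {"date_key": 1, "store_key": 1, "product_key": 7, "sales_amount": 3.0, "units_sold": 1},
]

duplicates = rows_violating_grain(design, rows)
print(duplicates)
```

A non-empty result suggests either that the source rows need aggregating before loading or that the declared grain is coarser than the measurement events actually are.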

Dimension Modeling Details: Steps and Examples
Refer to the Class Handout and LBD#1 for this section.

Converting Logical Design to Physical Design Using SQL Mgt. Studio
Refer to LBD#2 for this section.

Summary
- Overview of the data warehouse concept: a data source for OLAPs
- OLTP vs OLAP: compare and contrast
- Dimensional modeling: benefits, data objects, data structures
- Schemas: logical and physical