Alternative Database topology: The star schema

Topics: data warehousing (D.W.), OLAP, data mining

The Atomic Schema

Tables and columns:
  Customer: Customer ID, Status Date, Cust Addr, State, Cust ZIP Code, Customer Type, Customer Status, ...
  Cust Purchases: Customer ID, Activity Date, Product Code, Product Name, Sales Rep ID, Qty Purchased, Total Dollars, Promotion Flag
  Product Ref: Product Code, ProdRef Eff. Date, ProdRef End Date, Product Name, Unit Price, Product Category, Product Type, Product Sub Type
  Cust Averages: Customer ID, Cust Average Date, Cust Avg. End Date, Cust Avg. Rev., Cust Longevity
  Outlet Reference: Store ID, Store Name, Store Location, Distribution Channel
  Sales Rep Ref: Sales Rep ID, Sales Person Name, Store ID

- Atomic level data structured to support a wide variety of informational requirements across the organization
- As a result, atomic data too normalized to be easily accessed or understood by most end users
- Data consistently needs to be aggregated into the same categories (dimensions)
- Multidimensional processing capabilities provide users with tremendous flexibility for most of their analysis requirements

The Star Schema

  Fact Table: Dimension Key 1, Dimension Key 2, Dimension Key 3, Dimension Key 4, Fact 1, Fact 2, Fact 3, Fact 4, ..., Fact n
  Dimension Table 1: Dimension Key 1, Description 1, Aggregation Lvl 1.1, Aggregation Lvl 1.2, ..., Aggregation Lvl 1.n
  Dimension Table 2: Dimension Key 2, Description 2, Aggregation Lvl 2.1, Aggregation Lvl 2.2, ..., Aggregation Lvl 2.n
  Dimension Table 3: Dimension Key 3, Description 3, Aggregation Lvl 3.1, Aggregation Lvl 3.2, ..., Aggregation Lvl 3.n
  Dimension Table 4: Dimension Key 4, Description 4, Aggregation Lvl 4.1, Aggregation Lvl 4.2, ..., Aggregation Lvl 4.n

Dimension Table

  Dimension Table 1: Dimension Key 1, Description 1, Aggregation Lvl 1.1, Aggregation Lvl 1.2, ..., Aggregation Lvl 1.n

- Describes the data that has been organized in the Fact Table
- Key should either be the most detailed aggregation level necessary (e.g. country vs. county), if possible, or...
- Surrogate keys may be necessary, but will decrease the natural value of the key
- Manageable number of aggregation levels

Fact Table

  Fact Table: Dimension Key 1, Dimension Key 2, Dimension Key 3, Dimension Key 4, Fact 1, Fact 2, Fact 3, Fact 4, ..., Fact n

- Quantifies the data that has been described by the Dimension Tables
- Key made up of unique combination of values of dimension keys
- ALWAYS contains date or date dimension
- Fact values should be additive
- Aggregations of quantities or amounts from atomic level
- No percentages or ratios
- May be non-additive, time-variant data
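The "no percentages or ratios" point can be illustrated with a small sketch. The rows and names below are made up for illustration and are not taken from the slides; the idea is only that quantities roll up by summation, while a ratio has to be derived after aggregation rather than stored and averaged.

```python
# Hypothetical weekly fact rows: (week, total_quantity, returned_qty).
rows = [
    ("W1", 100, 10),   # 10% of units returned
    ("W2", 300, 90),   # 30% of units returned
]

# Additive facts roll up by simple summation.
total_quantity = sum(qty for _, qty, _ in rows)
returned_qty = sum(ret for _, _, ret in rows)

# Derive the ratio from the aggregated additive facts.
return_rate = returned_qty / total_quantity

# Averaging per-row ratios ignores how many units each row represents,
# so it gives a different (misleading) figure.
avg_of_rates = sum(ret / qty for _, qty, ret in rows) / len(rows)

print(total_quantity, returned_qty)
print(return_rate, avg_of_rates)
```

Here the rate derived from the summed facts is 0.25, while the average of the two stored rates comes out near 0.2, which is why a ratio stored in the fact table would not aggregate correctly.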

For Example: Purchases 1

  Purchases 1 (fact table)
    Keys: Sales Rep ID, Product Code, Cust ZIP Code, Customer Type, Week Ending Date
    Facts: Days of Activity, Unit Price, Total Quantity, Total Dollars, Returned Qty, Returned Dollars, Promotion Qty
  Customer Location: Cust ZIP Code, City, State/Province, Country
  Selling Responsibility: Sales Rep ID, Sales Rep Name, Store ID, Store Name, Store Location, Sales Channel
  Customer Type: Customer Type, Cust Type Desc
  Product: Product Code, Product Name, Prod. Category, Product Type, Prod Sub Type
  Date Information: Week Ending Date, Month, Quarter, Year
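As a rough sketch, the Purchases 1 star could be declared as follows using Python's built-in sqlite3 module. The column types, the underscore naming, and the choice of SQLite are assumptions made for illustration; the slide gives only table and column names. "Prod. Category" is spelled Product_Category here to match the name used in the deck's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one key plus descriptive / aggregation-level columns.
CREATE TABLE Customer_Location (
    Cust_ZIP_Code TEXT PRIMARY KEY, City TEXT, State_Province TEXT, Country TEXT);
CREATE TABLE Customer_Type (
    Customer_Type TEXT PRIMARY KEY, Cust_Type_Desc TEXT);
CREATE TABLE Product (
    Product_Code TEXT PRIMARY KEY, Product_Name TEXT, Product_Category TEXT,
    Product_Type TEXT, Prod_Sub_Type TEXT);
CREATE TABLE Selling_Responsibility (
    Sales_Rep_ID TEXT PRIMARY KEY, Sales_Rep_Name TEXT, Store_ID TEXT,
    Store_Name TEXT, Store_Location TEXT, Sales_Channel TEXT);
CREATE TABLE Date_Information (
    Week_Ending_Date TEXT PRIMARY KEY, Month TEXT, Quarter TEXT, Year TEXT);

-- Fact table: the key is the combination of the dimension keys.
CREATE TABLE Purchases_1 (
    Sales_Rep_ID     TEXT REFERENCES Selling_Responsibility(Sales_Rep_ID),
    Product_Code     TEXT REFERENCES Product(Product_Code),
    Cust_ZIP_Code    TEXT REFERENCES Customer_Location(Cust_ZIP_Code),
    Customer_Type    TEXT REFERENCES Customer_Type(Customer_Type),
    Week_Ending_Date TEXT REFERENCES Date_Information(Week_Ending_Date),
    Days_of_Activity INTEGER, Unit_Price REAL, Total_Quantity INTEGER,
    Total_Dollars REAL, Returned_Qty INTEGER, Returned_Dollars REAL,
    Promotion_Qty INTEGER,
    PRIMARY KEY (Sales_Rep_ID, Product_Code, Cust_ZIP_Code,
                 Customer_Type, Week_Ending_Date));
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The composite primary key on Purchases_1 mirrors the "key made up of unique combination of values of dimension keys" rule, with Week_Ending_Date serving as the date dimension key.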

Star Schema Query

    SELECT E.Month, B.Customer_Type, C.Product_Type, D.Store_Location,
           SUM(A.Total_Quantity)
    FROM Purchases_1 A, Customer_Type B, Product C,
         Selling_Responsibility D, Date_Information E
    WHERE B.Customer_Type = A.Customer_Type
      AND C.Product_Code = A.Product_Code
      AND D.Sales_Rep_ID = A.Sales_Rep_ID
      AND E.Week_Ending_Date = A.Week_Ending_Date
      AND E.Year = '1996'
      AND C.Product_Category = 'V'
    GROUP BY E.Month, B.Customer_Type, C.Product_Type, D.Store_Location;
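To see the query produce a result, it can be run against a few made-up rows with Python's sqlite3 module. This is only a sketch: the sample data, the column types, and the trimmed-down tables (only the columns the query touches) are assumptions. The joins are written with explicit JOIN ... ON clauses, which is equivalent to the comma-separated FROM list in the slide, and the two filter values are passed as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Type (Customer_Type TEXT PRIMARY KEY);
CREATE TABLE Product (Product_Code TEXT PRIMARY KEY,
                      Product_Category TEXT, Product_Type TEXT);
CREATE TABLE Selling_Responsibility (Sales_Rep_ID TEXT PRIMARY KEY,
                                     Store_Location TEXT);
CREATE TABLE Date_Information (Week_Ending_Date TEXT PRIMARY KEY,
                               Month TEXT, Year TEXT);
CREATE TABLE Purchases_1 (Sales_Rep_ID TEXT, Product_Code TEXT,
                          Customer_Type TEXT, Week_Ending_Date TEXT,
                          Total_Quantity INTEGER);

-- Made-up sample rows, for illustration only.
INSERT INTO Customer_Type VALUES ('RETAIL');
INSERT INTO Product VALUES ('P1', 'V', 'Video'), ('P2', 'A', 'Audio');
INSERT INTO Selling_Responsibility VALUES ('R1', 'North');
INSERT INTO Date_Information VALUES
    ('1996-01-06', '01', '1996'),
    ('1996-01-13', '01', '1996'),
    ('1997-01-04', '01', '1997');
INSERT INTO Purchases_1 VALUES
    ('R1', 'P1', 'RETAIL', '1996-01-06', 5),
    ('R1', 'P1', 'RETAIL', '1996-01-13', 7),
    ('R1', 'P2', 'RETAIL', '1996-01-06', 3),
    ('R1', 'P1', 'RETAIL', '1997-01-04', 9);
""")

# Each dimension joins to the fact table on its own key; filters and
# grouping use only dimension attributes, and the measure is summed.
query = """
SELECT E.Month, B.Customer_Type, C.Product_Type, D.Store_Location,
       SUM(A.Total_Quantity)
FROM Purchases_1 A
JOIN Customer_Type B          ON B.Customer_Type = A.Customer_Type
JOIN Product C                ON C.Product_Code = A.Product_Code
JOIN Selling_Responsibility D ON D.Sales_Rep_ID = A.Sales_Rep_ID
JOIN Date_Information E       ON E.Week_Ending_Date = A.Week_Ending_Date
WHERE E.Year = ? AND C.Product_Category = ?
GROUP BY E.Month, B.Customer_Type, C.Product_Type, D.Store_Location
"""
result = conn.execute(query, ("1996", "V")).fetchall()
print(result)  # [('01', 'RETAIL', 'Video', 'North', 12)]
```

Only the two 1996 rows for the category 'V' product survive the filters, so the two weekly quantities (5 and 7) are summed into a single monthly row.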

Answer: Distinct Time Period Fact Tables

  Weekly fact table key: Date, D1, D2, D3, D4
  Monthly fact table key: Date, D1, D2, D3, D4

- Create separate fact tables to account for different time periods
- Date still part of each fact table key
- Same dimension tables used by both fact tables
- Improves overall performance (loading and accessing) for each time period
- Will not increase amount of managed redundancy
- Different time periods (weekly, monthly, accounting period, billing cycle) required for different analysis purposes
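A minimal sketch of the idea, assuming both fact tables are loaded directly from atomic purchase rows: the table names Purchases_Weekly and Purchases_Monthly, the sample data, and the Saturday week-ending convention are all hypothetical. SQLite's date() and strftime() functions derive the two period keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Atomic-level rows (made up); Product_Code is the shared dimension key.
CREATE TABLE Cust_Purchases (Activity_Date TEXT, Product_Code TEXT,
                             Qty_Purchased INTEGER);
INSERT INTO Cust_Purchases VALUES
    ('1996-01-29', 'P1', 4),
    ('1996-02-01', 'P1', 6),
    ('1996-02-12', 'P1', 5);

-- Weekly fact table: key is (week ending Saturday, product).
CREATE TABLE Purchases_Weekly AS
SELECT date(Activity_Date, 'weekday 6') AS Week_Ending_Date,
       Product_Code,
       SUM(Qty_Purchased) AS Total_Quantity
FROM Cust_Purchases
GROUP BY date(Activity_Date, 'weekday 6'), Product_Code;

-- Monthly fact table: key is (month, product).
CREATE TABLE Purchases_Monthly AS
SELECT strftime('%Y-%m', Activity_Date) AS Month,
       Product_Code,
       SUM(Qty_Purchased) AS Total_Quantity
FROM Cust_Purchases
GROUP BY strftime('%Y-%m', Activity_Date), Product_Code;
""")

weekly = conn.execute(
    "SELECT * FROM Purchases_Weekly ORDER BY Week_Ending_Date").fetchall()
monthly = conn.execute(
    "SELECT * FROM Purchases_Monthly ORDER BY Month").fetchall()
print(weekly)   # [('1996-02-03', 'P1', 10), ('1996-02-17', 'P1', 5)]
print(monthly)  # [('1996-01', 'P1', 4), ('1996-02', 'P1', 11)]
```

In this sample the week ending 1996-02-03 contains purchases from both January and February, so the monthly totals cannot be recovered from the weekly rows. That may be one practical reason to load each time period into its own fact table from the atomic data.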