Retail Sales is used to illustrate a first dimensional model

Presentation transcript:

Chapter 2: Retail Sales is used to illustrate a first dimensional model
- Design process
- Case study: POS example
- Star schema
- Facts
- Dimensions
- Creating the schema in SQL Server
- Factless facts
- Degenerate dimensions
- Extensibility
- Snowflaking
- Outriggers
January 2004 91.4904 Ron McFadyen

The Dimensional Design Process
4 Step Dimensional Design Process:
1. Select the business process. Examples: invoicing, orders, inventory, general ledger, …
2. Declare the grain: determine exactly what an individual fact table row represents. Examples: a line item on an order, a boarding pass to get on a flight, a student's course registration, a monthly snapshot for a bank account.
3. Choose the dimensions that apply to the facts, that is, what describes each fact. Examples: customer dimension, student dimension, course dimension, day dimension.
4. Identify the numeric facts that appear in the rows of the fact table.

Case Study
- The business process: POS retail sales
- Grain of the fact table: individual line items on a POS transaction
- The dimensions: date, product, store, promotion
- The facts: sales quantity, cost dollar amount, sales dollar amount, gross profit dollar amount (derivable)

Case Study Schema
[Diagram: the Sales facts table linked to the Date, Product, Store, and Promotion dimensions.]
A typical drawing seen in practice, in articles, …

Case Study Schema in Peter Chen Notation
[Diagram: Date, Product, Store, Promotion, and Sales Transaction each relate to Sales facts as one-to-many, with 1 on the dimension side and n on the Sales facts side.]
Note: Sales transaction does not appear in the text. Later in the chapter it is discussed as a degenerate dimension.

Case Study Fact Table
Sales facts (all additive):
- Sales quantity
- Sales dollar amount
- Cost dollar amount
- Gross profit dollar amount
Facts can be described as additive, semi-additive, or non-additive.
- Additive: can be meaningfully summed across all dimensions
- Semi-additive: can be meaningfully summed across some dimensions
- Non-additive: can't be …
The text discusses some non-additive facts that might be included in such a fact table: gross margin, unit price.
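To see why gross margin is treated as non-additive, here is a minimal sketch. It assumes gross margin is defined as gross profit divided by sales dollar amount, and it uses two made-up line items; neither the numbers nor the variable names come from the slides.

```python
# Hypothetical line items: (sales_dollar_amount, cost_dollar_amount)
rows = [(10.0, 6.0), (200.0, 180.0)]

# Additive facts: summing row by row gives a meaningful total.
total_sales = sum(sales for sales, _ in rows)
total_cost = sum(cost for _, cost in rows)
total_profit = sum(sales - cost for sales, cost in rows)

# Gross margin (assumed here to be gross profit / sales) is a ratio.
row_margins = [(sales - cost) / sales for sales, cost in rows]

# Summing the per-row ratios does not give the margin of the total.
summed_margins = sum(row_margins)

# The margin of the total is the ratio of the summed additive facts.
overall_margin = total_profit / total_sales

print(summed_margins, overall_margin)
```

The usual design consequence is to store the additive numerator and denominator in the fact table and compute the ratio in the query or reporting tool after aggregation.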

Case Study Fact Table
The physical table, Sales facts (the diagram marks the PK):
- Date key (FK)
- Product key (FK)
- Store key (FK)
- Promotion key (FK)
- POS Transaction Number (degenerate dimension)
- Sales quantity
- Sales dollar amount
- Cost dollar amount
- Gross profit dollar amount
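The outline mentions creating the schema in SQL Server. As a rough, runnable stand-in, the sketch below creates the same star schema in an in-memory SQLite database through Python's sqlite3 module. The table and column names are illustrative spellings of the slide attributes, only a few dimension attributes are included, and the choice of (POS transaction number, product key) as the fact table's primary key is an assumption, not something the slides state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_key     INTEGER PRIMARY KEY,   -- surrogate key
    full_date    TEXT,
    day_of_week  TEXT
);
CREATE TABLE product_dim (
    product_key            INTEGER PRIMARY KEY,  -- surrogate key
    sku_number             TEXT,                 -- natural key
    product_description    TEXT,
    brand_description      TEXT,
    category_description   TEXT,
    department_description TEXT
);
CREATE TABLE store_dim (
    store_key    INTEGER PRIMARY KEY,   -- surrogate key
    store_number TEXT,                  -- natural key
    store_name   TEXT,
    store_city   TEXT
);
CREATE TABLE promotion_dim (
    promotion_key  INTEGER PRIMARY KEY, -- surrogate key
    promotion_name TEXT
);
CREATE TABLE sales_facts (
    date_key       INTEGER NOT NULL REFERENCES date_dim(date_key),
    product_key    INTEGER NOT NULL REFERENCES product_dim(product_key),
    store_key      INTEGER NOT NULL REFERENCES store_dim(store_key),
    promotion_key  INTEGER NOT NULL REFERENCES promotion_dim(promotion_key),
    pos_transaction_number     TEXT NOT NULL,  -- degenerate dimension
    sales_quantity             INTEGER,
    sales_dollar_amount        REAL,
    cost_dollar_amount         REAL,
    gross_profit_dollar_amount REAL,
    -- Assumed key: one row per product per POS transaction.
    PRIMARY KEY (pos_transaction_number, product_key)
);
""")

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

SQL Server syntax would differ in details such as data types, so treat this only as an outline of the table structure.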

Case Study Date Dimension
- Very descriptive
- Easy to set criteria for queries
- Easy to get headings for reports
- One row for each day (this is the grain of the Date dimension)
- PK is a surrogate key
- Used in every star schema
- Hierarchies are present
- Not normalized
- Attribute hierarchies: Calendar week -> …, Fiscal week -> …
Attributes: Date key (PK), Date, Full date description, Day of week, Day number in epoch, Week number in epoch, Month number in epoch, Day number in calendar month, …, Last day in week indicator, Holiday indicator, Weekday indicator, SQL date stamp, …, Calendar week, Calendar month, Calendar year, Fiscal week, Fiscal month, Fiscal year
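Because the Date dimension has exactly one row per day, it can be generated ahead of time rather than loaded from an operational system. The following is a minimal sketch of that idea; the function name `date_dimension_rows`, the chosen subset of attributes, and the sequential surrogate key are all assumptions made for illustration.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def date_dimension_rows(start, end):
    """Build one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    key = 1  # surrogate key, assigned sequentially in this sketch
    while day <= end:
        rows.append({
            "date_key": key,
            "date": day.isoformat(),
            "day_of_week": DAY_NAMES[day.weekday()],
            "day_number_in_calendar_month": day.day,
            "calendar_month": day.month,
            "calendar_year": day.year,
            "weekday_indicator": "Weekday" if day.weekday() < 5 else "Weekend",
        })
        day += timedelta(days=1)
        key += 1
    return rows

january = date_dimension_rows(date(2004, 1, 1), date(2004, 1, 31))
print(len(january), january[0]["day_of_week"])
```

Fiscal attributes and holiday indicators depend on the business calendar, so they are left out here.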

Case Study Product Dimension
- Very descriptive
- Easy to set criteria for queries
- Easy to get headings for reports
- One row for each product for sale, or ever sold, by the company
- PK is a surrogate key. We do not use the operational PK here. Over time it may not be unique: the business may re-use keys, companies merge, …
- Not normalized
- An attribute hierarchy: Brand -> category -> department
Attributes: Product key (PK), Product description, SKU number, Brand description, Category description, Department description, Package type description, Package size, Fat content, Diet type, Weight, Weight units of measure, …

Case Study Drilling Down/Up
- Run a query to generate: Department, sales amount, sales quantity
- Now, add another attribute at a 'lower' level such as brand: Department, brand, sales amount, sales quantity
- What is meant by row-headers (in the text)?
Product attributes: Product key (PK), Product description, SKU number (natural key), Brand description, Category description, Department description, Package type description, Package size, Fat content, Diet type, Weight, Weight units of measure, …
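Drilling down amounts to adding a dimension attribute to the SELECT and GROUP BY lists of the same query. The sketch below illustrates this with SQLite and a handful of invented rows; the table names, column names, and the helper `report` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY,
    brand_description TEXT,
    department_description TEXT
);
INSERT INTO product_dim VALUES
    (1, 'Brand A', 'Bakery'),
    (2, 'Brand B', 'Bakery'),
    (3, 'Brand C', 'Frozen');

CREATE TABLE sales_facts (
    product_key INTEGER,
    sales_dollar_amount REAL,
    sales_quantity INTEGER
);
INSERT INTO sales_facts VALUES
    (1, 10.0, 2), (2, 5.0, 1), (1, 3.0, 1), (3, 7.0, 1);
""")

def report(group_by_columns):
    """Sum the facts, grouped by the given dimension attributes."""
    cols = ", ".join(group_by_columns)
    sql = f"""
        SELECT {cols}, SUM(f.sales_dollar_amount), SUM(f.sales_quantity)
        FROM sales_facts f
        JOIN product_dim p ON p.product_key = f.product_key
        GROUP BY {cols}
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()

# Department level, then drill down by adding brand.
by_department = report(["p.department_description"])
by_department_brand = report(["p.department_description", "p.brand_description"])
print(by_department)
print(by_department_brand)
```

Drilling up is the reverse: remove the brand column and the same query returns the department-level totals again.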

Case Study Store Dimension
- Very descriptive
- Easy to set criteria for queries
- Easy to get headings for reports
- One row for each store
- PK is a surrogate key
- Not normalized
- Attribute hierarchies: city, county, state, zip; district, region
- How does the text handle the "First open date" attribute?
Attributes: Store key (PK), Store name, Store number (natural key), Store street address, Store city, Store county, Store state, Store zip code, …, Total square footage, First open date

Case Study Promotion Dimension
- Very descriptive
- Easy to set criteria for queries
- Easy to get headings for reports
- One row for each promotion
- PK is a surrogate key
- Need a special row for "no promotion in effect". Why?
Attributes: Promotion key (PK), Promotion name, Price reduction type, Promotion media type, Ad type, …

Factless Fact Tables
- A factless fact table has no measurement metrics.
- These types of fact tables record occurrences of events.
- In retail sales, one might ask "What products were on promotion, but did not sell?"
- The Sales facts table only records sales, and so it alone is not enough to answer this question.

Factless Fact Tables
[Diagram: the Coverage table linked to the Date, Product, Store, and Promotion dimensions.]
- The Coverage table has one row for each promotion of a product at some store on a certain date.
- How many rows are in Coverage? How many rows in Sales Facts?

Factless Fact Tables
[Diagram: the Coverage table linked to the Date, Product, Store, and Promotion dimensions.]
What is the SQL to determine: "What products were on promotion, but did not sell?"
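One possible answer to the question above, offered as a sketch rather than the course's solution: take the set of (date, product, store, promotion) combinations in Coverage and subtract the combinations that appear in Sales facts. The example runs against SQLite with invented rows; the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_description TEXT);
INSERT INTO product_dim VALUES (1, 'Bread'), (2, 'Ice cream');

-- Factless fact table: what was on promotion, where, and when.
CREATE TABLE coverage (
    date_key INTEGER, product_key INTEGER,
    store_key INTEGER, promotion_key INTEGER
);
INSERT INTO coverage VALUES (1, 1, 1, 1), (1, 2, 1, 1);

-- Sales facts: only the bread actually sold.
CREATE TABLE sales_facts (
    date_key INTEGER, product_key INTEGER,
    store_key INTEGER, promotion_key INTEGER,
    sales_quantity INTEGER
);
INSERT INTO sales_facts VALUES (1, 1, 1, 1, 3);
""")

promoted_not_sold = conn.execute("""
    SELECT p.product_description
    FROM (
        SELECT date_key, product_key, store_key, promotion_key FROM coverage
        EXCEPT
        SELECT date_key, product_key, store_key, promotion_key FROM sales_facts
    ) AS unsold
    JOIN product_dim p ON p.product_key = unsold.product_key
    ORDER BY p.product_description
""").fetchall()
print(promoted_not_sold)
```

A NOT EXISTS subquery would be an equivalent formulation; in practice the query would also be restricted to a particular date range or store.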

Degenerate Dimensions
- A degenerate dimension is one where the only attribute of interest is the natural key. As a result, there is no physical dimension in the data warehouse.
- Example: Transaction number in Retail Sales. Transaction can be shown as a dimension in the logical design, but there is no Transaction table in the physical design.
- The fact table has a transaction number (instead of a surrogate key to a Transaction dimension).

Degenerate Dimensions
Sales facts (the diagram marks the PK):
- Date key (FK)
- Product key (FK)
- Store key (FK)
- Promotion key (FK)
- POS Transaction Number (DD, the degenerate dimension)
- Sales quantity
- Sales dollar amount
- Cost dollar amount
- Gross profit dollar amount

Degenerate Dimensions
- Very common in star schema designs: orders, invoices, …
- In many systems where there are "line items", there is some interesting operational key that can tie the facts back to the operational systems: order#, invoice#, …

Extensibility of Star Schema Designs
In many cases we can add the following without changing existing applications:
- New dimension tables
- New fact tables
- New aggregates
- New dimension attributes
- New measurement metrics

Extensibility of Star Schema Designs
- In many cases we can add dimensions to an existing design and database.
- Consider Retail Sales and the new dimensions: Frequent Shopper, Clerk, Time of Day
  - Is the Frequent Shopper concept valid?
  - Is knowing who the clerk is reasonable?
  - Do we know the time of day for a sale?
- Any way of describing a fact that is single-valued for all existing facts in the fact table could become a dimension.
- What is required in the database environment to accomplish this?

Extensibility of Star Schema Designs
What is required in the database environment to extend a star schema with a new dimension?
- Alter table …: may be complex; at the least we are adding an attribute
- Create table …: create a new dimension
- Load the new dimension …
- Populate the new foreign key in the altered fact table
- Create an ETL process for the new dimension
- Modify the ETL process for the fact table
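The database-side steps listed above can be sketched as follows, using SQLite and the Clerk dimension as the example. Everything specific here is an assumption for illustration: the table and column names, the sample rows, and in particular the placeholder 'Unknown' clerk row used as the default for facts whose clerk is not known. The ETL changes are outside the scope of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Existing fact table (reduced to a few columns for the sketch).
CREATE TABLE sales_facts (
    pos_transaction_number TEXT,
    product_key INTEGER,
    sales_quantity INTEGER
);
INSERT INTO sales_facts VALUES ('T1', 1, 2), ('T2', 3, 1);

-- Create and load the new dimension.
CREATE TABLE clerk_dim (clerk_key INTEGER PRIMARY KEY, clerk_name TEXT);
INSERT INTO clerk_dim VALUES (0, 'Unknown'), (1, 'Clerk A');

-- Alter the fact table: add the new foreign key column.
ALTER TABLE sales_facts ADD COLUMN clerk_key INTEGER NOT NULL DEFAULT 0;

-- Populate the new foreign key where the clerk is known.
UPDATE sales_facts SET clerk_key = 1 WHERE pos_transaction_number = 'T2';
""")

# A query written before the change still runs unchanged.
total_quantity = conn.execute(
    "SELECT SUM(sales_quantity) FROM sales_facts"
).fetchone()[0]

# New queries can use the new dimension.
by_clerk = conn.execute("""
    SELECT c.clerk_name, SUM(f.sales_quantity)
    FROM sales_facts f
    JOIN clerk_dim c ON c.clerk_key = f.clerk_key
    GROUP BY c.clerk_name
    ORDER BY c.clerk_name
""").fetchall()
print(total_quantity, by_clerk)
```

Existing queries keep working because they never reference the new column, which is the sense in which the design is extensible.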

Snowflaking
If a dimension is normalized, we say it is a snowflaked design. Consider the Product dimension, and suppose we have the following functional dependencies:

Snowflaking
Product attributes: Product key, SKU number, Product description, Brand key, Brand description, Category key, Category description, Department key, Department description
The Product dimension is in _____________ normal form.

Snowflaking
[Diagram: Sales facts linked to Date, Product, Store, and Promotion, with Product further normalized into Brand, Category, and Department tables.]
Now, the Product dimension is in _____________ normal form.
The general problem is that this complicates the user's view of data, complicates the underlying SQL, defeats the usefulness of bit vectors, decreases space requirements only minimally, and slows query execution.
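To illustrate the point about more complicated SQL, the sketch below answers the same question (sales by department) against a flat Product dimension and against a snowflaked one. It uses SQLite with invented rows; the table names `product_flat` and `product_snow` and all sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat (not normalized) product dimension.
CREATE TABLE product_flat (
    product_key INTEGER PRIMARY KEY,
    product_description TEXT,
    department_description TEXT
);
INSERT INTO product_flat VALUES (1, 'Bread', 'Bakery'), (2, 'Ice cream', 'Frozen');

-- Snowflaked product dimension: Product -> Brand -> Category -> Department.
CREATE TABLE department (department_key INTEGER PRIMARY KEY, department_description TEXT);
CREATE TABLE category (category_key INTEGER PRIMARY KEY, category_description TEXT,
                       department_key INTEGER);
CREATE TABLE brand (brand_key INTEGER PRIMARY KEY, brand_description TEXT,
                    category_key INTEGER);
CREATE TABLE product_snow (product_key INTEGER PRIMARY KEY, product_description TEXT,
                           brand_key INTEGER);
INSERT INTO department VALUES (1, 'Bakery'), (2, 'Frozen');
INSERT INTO category VALUES (1, 'Loaves', 1), (2, 'Desserts', 2);
INSERT INTO brand VALUES (1, 'Brand A', 1), (2, 'Brand C', 2);
INSERT INTO product_snow VALUES (1, 'Bread', 1), (2, 'Ice cream', 2);

CREATE TABLE sales_facts (product_key INTEGER, sales_dollar_amount REAL);
INSERT INTO sales_facts VALUES (1, 4.0), (1, 6.0), (2, 5.0);
""")

# One join with the flat dimension.
flat_result = conn.execute("""
    SELECT p.department_description, SUM(f.sales_dollar_amount)
    FROM sales_facts f
    JOIN product_flat p ON p.product_key = f.product_key
    GROUP BY p.department_description
    ORDER BY p.department_description
""").fetchall()

# Four joins with the snowflaked dimension for the same answer.
snowflake_result = conn.execute("""
    SELECT d.department_description, SUM(f.sales_dollar_amount)
    FROM sales_facts f
    JOIN product_snow p ON p.product_key = f.product_key
    JOIN brand b ON b.brand_key = p.brand_key
    JOIN category c ON c.category_key = b.category_key
    JOIN department d ON d.department_key = c.department_key
    GROUP BY d.department_description
    ORDER BY d.department_description
""").fetchall()
print(flat_result, snowflake_result)
```

Both queries return the same totals; the difference is the number of joins the user or the query tool has to express.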

Outriggers
[Diagram: Sales facts linked to Date, Product, Store, and Promotion, with Store also linked to Date.]
- Date is called an outrigger table for Store.
- Note there is only one Date table.
- Store was shown to have two dates: First open date and Last remodel date.
- Instead of being attribute values from the Date domain, these can be foreign keys to the Date dimension.

Outriggers
[Diagram: Sales facts linked to Date, Product, Store, and Promotion; Store is linked to a second Date box labeled "Outrigger". Both Date boxes are the same table.]
- A fact joins to one row of Date and to one row of Store, and that Store row in turn joins to a row of Date. These two rows of Date are usually different rows.
- Outriggers are an acceptable variation on normalized dimensions. They are justified because they add a great deal to the expressive capability of queries.
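In SQL, using Date as an outrigger means joining the single Date table twice under different aliases: once from the fact table and once from Store. A minimal sketch with SQLite follows; the column `first_open_date_key`, the aliases, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_key INTEGER PRIMARY KEY,
    full_date TEXT,
    calendar_year INTEGER
);
INSERT INTO date_dim VALUES
    (1, '1998-05-01', 1998),
    (2, '2004-01-15', 2004),
    (3, '2004-01-16', 2004);

-- Store holds a foreign key to Date instead of a plain date attribute.
CREATE TABLE store_dim (
    store_key INTEGER PRIMARY KEY,
    store_name TEXT,
    first_open_date_key INTEGER REFERENCES date_dim(date_key)
);
INSERT INTO store_dim VALUES (1, 'Store A', 1);

CREATE TABLE sales_facts (date_key INTEGER, store_key INTEGER, sales_dollar_amount REAL);
INSERT INTO sales_facts VALUES (2, 1, 10.0), (3, 1, 5.0);
""")

# The same date_dim table appears twice, with two roles.
sales_by_open_year = conn.execute("""
    SELECT sale_date.calendar_year,
           open_date.calendar_year,
           SUM(f.sales_dollar_amount)
    FROM sales_facts f
    JOIN date_dim sale_date ON sale_date.date_key = f.date_key
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN date_dim open_date ON open_date.date_key = s.first_open_date_key
    GROUP BY sale_date.calendar_year, open_date.calendar_year
""").fetchall()
print(sales_by_open_year)
```

With the outrigger in place, every Date attribute (fiscal period, holiday indicator, and so on) becomes available for describing when a store first opened, which is the added expressive capability the slide refers to.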