Dimensional Modeling Chapter 2

The Dimensional Data Model
- An alternative to the normalized data model
- Present information as simply as possible (easier to understand)
- Return query results as quickly as possible (efficient for queries)
- Track the underlying business processes (process focused)

The Dimensional Data Model
- Contains the same information as the normalized model
- Has far fewer tables
- Grouped in coherent business categories
- Pre-joins hierarchies and lookup tables, resulting in fewer join paths and fewer intermediate tables
- Normalized fact table with denormalized dimension tables

GB Video E-R Diagram
- Customer: #Cust No, F Name, L Name, Ads1, Ads2, City, State, Zip, Tel No, CC No, Expire
- Rental: #Rental No, Date, Clerk No, Pay Type, CC No, Expire, CC Approval
- Line: #Line No, Due Date, Return Date, OD charge, Pay type
- Video: #Video No, One-day fee, Extra days, Weekend
- Title: #Title No, Name, Vendor No, Cost
- Relationship labels: Requestor of, Owner of, Name for, Holder of

GB Video Data Mart
- Customer: CustID, Cust No, F Name, L Name
- Rental: RentalID, Rental No, Clerk No, Store, Pay Type
- Line: LineID, OD Charge, OneDayCharge, ExtraDaysCharge, WeekendCharge, DaysReserved, DaysOverdue, CustID, AddressID, RentalId, VideoID, TitleID, RentalDateID, DueDateID, ReturnDateID
- Video: VideoID, Video No
- Title: TitleID, TitleNo, Name, Cost, Vendor Name
- Rental Date: RentalDateID, SQLDate, Day, Week, Quarter, Holiday
- Due Date: DueDateID, SQLDate, Day, Week, Quarter, Holiday
- Return Date: ReturnDateID, SQLDate, Day, Week, Quarter, Holiday
- Address: AddressID, Address1, Address2, City, State, Zip, AreaCode, Phone
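To make the data mart layout concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It builds a cut-down version of the Line table with two of its surrounding tables (Title and Rental Date) and runs a typical "charges by title by quarter" query. The column subset, the sample rows, and the integer date keys are assumptions made for illustration; they are not taken from the slides.

```python
import sqlite3

# In-memory database holding a reduced version of the GB Video data mart.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Title (
    TitleID INTEGER PRIMARY KEY, TitleNo TEXT, Name TEXT, Cost REAL, VendorName TEXT
);
CREATE TABLE RentalDate (
    RentalDateID INTEGER PRIMARY KEY, SQLDate TEXT, Quarter INTEGER, Holiday INTEGER
);
-- Central table: numeric measures plus foreign keys to the surrounding tables.
CREATE TABLE Line (
    LineID INTEGER PRIMARY KEY,
    OneDayCharge REAL,
    ExtraDaysCharge REAL,
    DaysOverdue INTEGER,
    TitleID INTEGER REFERENCES Title(TitleID),
    RentalDateID INTEGER REFERENCES RentalDate(RentalDateID)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Title VALUES (?,?,?,?,?)", [
    (1, "T-100", "Alpha", 12.0, "Vendor A"),
    (2, "T-200", "Beta", 15.0, "Vendor B"),
])
conn.executemany("INSERT INTO RentalDate VALUES (?,?,?,?)", [
    (20060926, "2006-09-26", 3, 0),
    (20061225, "2006-12-25", 4, 1),
])
conn.executemany("INSERT INTO Line VALUES (?,?,?,?,?,?)", [
    (1, 3.0, 0.0, 0, 1, 20060926),
    (2, 3.0, 2.0, 1, 1, 20061225),
    (3, 4.0, 0.0, 0, 2, 20061225),
])

# One join per descriptive table, then group by descriptive attributes.
rows = conn.execute("""
    SELECT t.Name, d.Quarter, SUM(l.OneDayCharge + l.ExtraDaysCharge) AS Charges
    FROM Line l
    JOIN Title t ON t.TitleID = l.TitleID
    JOIN RentalDate d ON d.RentalDateID = l.RentalDateID
    GROUP BY t.Name, d.Quarter
    ORDER BY t.Name, d.Quarter
""").fetchall()

for name, quarter, charges in rows:
    print(name, quarter, charges)
```

Every descriptive attribute is reachable with a single join from the central table, which is the "fewer join paths" property the model aims for.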

Fact Table
- Measurements associated with a specific business process
- Grain: level of detail of the table
- Process events produce fact records
- Facts (attributes) are usually
  - Numeric
  - Additive
- Derived facts included
- Foreign (surrogate) keys refer to dimension tables (entities)
- Classification values help define subsets

Dimension Tables
- Entities describing the objects of the process
- Conformed dimensions cross processes
- Attributes are descriptive
  - Text
  - Numeric
- Surrogate keys
- Less volatile than facts (1:m with the fact table)
- Null entries
- Date dimensions
- Produce "by" questions

Bus Architecture
- An architecture that permits aggregating data across multiple marts
- Conformed dimensions and attributes
- Drill Down vs. Drill Across
- Bus matrix
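As an illustration of drilling across, the sketch below (Python with sqlite3) summarizes two fact tables separately to the same grain and then merges the results on an attribute of a shared, conformed Product dimension. The table names SalesFact and ReturnsFact, the Category attribute, and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conformed dimension shared by both fact tables.
CREATE TABLE Product (ProductKey INTEGER PRIMARY KEY, Category TEXT);
-- Two fact tables from two different business processes.
CREATE TABLE SalesFact (ProductKey INTEGER, Amount REAL);
CREATE TABLE ReturnsFact (ProductKey INTEGER, Amount REAL);
""")
conn.executemany("INSERT INTO Product VALUES (?,?)",
                 [(1, "Drama"), (2, "Drama"), (3, "Comedy")])
conn.executemany("INSERT INTO SalesFact VALUES (?,?)",
                 [(1, 10.0), (2, 5.0), (3, 8.0), (3, 2.0)])
conn.executemany("INSERT INTO ReturnsFact VALUES (?,?)",
                 [(1, 1.0), (2, 2.0)])

# Drill across: aggregate each fact table on its own, then join the two
# summaries on the conformed attribute. Joining the raw fact tables to each
# other directly would multiply rows and inflate the totals.
rows = conn.execute("""
    WITH s AS (
        SELECT p.Category, SUM(f.Amount) AS Sales
        FROM SalesFact f JOIN Product p USING (ProductKey)
        GROUP BY p.Category
    ),
    r AS (
        SELECT p.Category, SUM(f.Amount) AS Returned
        FROM ReturnsFact f JOIN Product p USING (ProductKey)
        GROUP BY p.Category
    )
    SELECT s.Category, s.Sales, COALESCE(r.Returned, 0) AS Returned
    FROM s LEFT JOIN r ON r.Category = s.Category
    ORDER BY s.Category
""").fetchall()

for category, sales, returned in rows:
    print(category, sales, returned)
```

The merge only works because both fact tables use the same product keys and the same Category values, which is what conforming the dimension provides.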

Keys and Surrogate Keys
- A surrogate key is a unique identifier for data warehouse records that replaces source primary keys (business/natural keys)
- Protect against changes in source systems
- Allow integration from multiple sources
- Enable rows that do not exist in source data
- Track changes over time (e.g. new customer instances when addresses change)
- Replace text keys with integers for efficiency
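A small Python sketch of surrogate key assignment follows. The class name SurrogateKeyMap, the source system names, and the reserved key 0 for an "unknown" row are assumptions chosen for illustration; real ETL tools usually keep this mapping in a lookup table instead of in memory.

```python
from itertools import count

UNKNOWN_KEY = 0  # reserved surrogate key for a row that has no source record


class SurrogateKeyMap:
    """Assigns small integer surrogate keys to (source system, natural key) pairs."""

    def __init__(self):
        self._next = count(1)  # 0 is reserved for the unknown row
        self._keys = {}

    def key_for(self, source, natural_key):
        """Return the existing surrogate key, or assign the next free integer."""
        pair = (source, natural_key)
        if pair not in self._keys:
            self._keys[pair] = next(self._next)
        return self._keys[pair]


keys = SurrogateKeyMap()
billing_key = keys.key_for("billing", "C-0042")      # text key from one source
web_key = keys.key_for("web", "jane@example.com")    # different key format, other source
billing_again = keys.key_for("billing", "C-0042")    # same pair, same surrogate key

print(billing_key, web_key, billing_again)
```

Because the warehouse key is independent of the source key format, a source system can renumber its customers or a second source can be added without changing the fact table's foreign keys.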

Slowly Changing Dimensions
- Attributes in a dimension that change more slowly than the fact granularity
- Type 1: Current only
- Type 2: All history
- Type 3: Most recent few (rare)
- Note: rapidly changing dimensions usually indicate the presence of a business process that should be tracked as a separate dimension or as a fact table

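Dimension:
CustKey | BKCustID | CustName   | CommDist | Gender | HomOwn?
…       | …        | Jane Rider | 3        | F      | N

Fact Table:
Date   | CustKey | ProdKey | Item Count | Amount
1/7/…  | …       | …       | …          | …
…/2/…  | …       | …       | …          | …
…/7/…  | …       | …       | …          | …
…/21/… | …       | …       | …          | …

Dimension with a slowly changing attribute:
Cust Key | BKCust ID | Cust Name  | Comm Dist | Gender | Hom Own? | Eff      | End
…        | …         | Jane Rider | 3         | F      | N        | 1/7/2004 | 1/1/…
…        | …         | Jane Rider | 31        | F      | N        | 1/2/2006 | 12/31/9999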
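The effective/end-date pattern in the example above can be sketched in Python as follows. The helper names, the surrogate key values 1 and 2, and the business key 7001 are hypothetical, since the slide's key values are not recoverable. Closing the old row on the day before the new row takes effect is also an assumption of this sketch.

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)  # end date that marks the current row


def apply_type2_change(rows, bk_cust_id, changes, effective):
    """Close the current row for a customer and append a new version of it."""
    current = next(r for r in rows
                   if r["BKCustID"] == bk_cust_id and r["End"] == OPEN_END)
    new_row = {
        **current,
        **changes,
        "CustKey": max(r["CustKey"] for r in rows) + 1,  # new surrogate key
        "Eff": effective,
        "End": OPEN_END,
    }
    current["End"] = effective - timedelta(days=1)  # old version ends the day before
    rows.append(new_row)
    return new_row


def key_as_of(rows, bk_cust_id, when):
    """Find the surrogate key that was valid for a customer on a given date."""
    return next(r["CustKey"] for r in rows
                if r["BKCustID"] == bk_cust_id and r["Eff"] <= when <= r["End"])


customer_dim = [{
    "CustKey": 1, "BKCustID": 7001, "CustName": "Jane Rider",
    "CommDist": 3, "Gender": "F", "HomOwn": "N",
    "Eff": date(2004, 1, 7), "End": OPEN_END,
}]

# Jane Rider's CommDist changes from 3 to 31, effective 2 January 2006.
apply_type2_change(customer_dim, 7001, {"CommDist": 31}, date(2006, 1, 2))

for row in customer_dim:
    print(row["CustKey"], row["CommDist"], row["Eff"], row["End"])
```

Fact rows loaded before the change keep the old surrogate key and fact rows loaded afterwards use the new one, so each fact stays associated with the attribute values that were current when it occurred.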

Date Dimensions
- One row for every day for which you expect to have data for the fact table (perhaps generated in a spreadsheet and imported)
- Usually use a meaningful integer surrogate key in yyyymmdd form (such as 20060926 for Sep. 26, 2006). Note: this order sorts correctly.
- Include rows for missing or future dates to be added later.
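As an alternative to a spreadsheet, the rows can be generated with a few lines of Python. This is a sketch only: the column names loosely follow the Rental Date table earlier in the deck, the function names are hypothetical, and the Holiday column is left out because it needs a calendar source.

```python
from datetime import date, timedelta


def date_key(d):
    """Integer key in yyyymmdd form, so numeric order matches calendar order."""
    return d.year * 10000 + d.month * 100 + d.day


def build_date_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    d = start
    while d <= end:
        rows.append({
            "DateKey": date_key(d),
            "SQLDate": d.isoformat(),
            "DayOfWeek": d.isoweekday(),      # Monday = 1 ... Sunday = 7
            "Week": d.isocalendar()[1],       # ISO week number
            "Quarter": (d.month - 1) // 3 + 1,
        })
        d += timedelta(days=1)
    return rows


date_dim = build_date_dimension(date(2006, 9, 1), date(2006, 9, 30))
sep_26 = next(r for r in date_dim if r["DateKey"] == date_key(date(2006, 9, 26)))
print(sep_26)
```

Extending the end date well past the current load date covers future dates without a later schema change.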

Degenerate Dimensions
- Dimensions without attributes (such as a transaction number or order number).
- Put the attribute value into the fact table even though it is not an additive fact.

Snowflaking (Outrigger Dimensions or Reference Dimensions)
- Connects entities to dimension tables rather than the fact table
- Complicates coding and requires additional processing for retrievals
- Makes type 2 slowly changing dimensions harder to maintain
- Useful for seldom used lookups

M:N Multivalued Dimensions
- Fact to Dimension
- Dimension to Dimension
- Try to avoid these. Solutions can be very misleading.

Multivalued Dimensions
- ORDERS (FACT): SalesRepKey, ProductKey, SalesRepGrpKey, CustomerKey, OrderQty
- SALESREP: SalesRepKey, Name, Address
- SALESREP-ORDER-BRIDGE: SalesRepKey, SalesrepGroupKey, Weight = (1/NumReps)
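The effect of the Weight = 1/NumReps column can be seen in a short sqlite3 sketch in Python. The table and column names are simplified from the slide, and the representatives, groups, and quantities are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesRep (SalesRepKey INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (SalesRepGrpKey INTEGER, OrderQty INTEGER);
-- Bridge: one row per (group, rep); Weight is 1 / number of reps in the group.
CREATE TABLE SalesRepOrderBridge (
    SalesRepGrpKey INTEGER, SalesRepKey INTEGER, Weight REAL
);
""")
conn.executemany("INSERT INTO SalesRep VALUES (?,?)", [(1, "Rep A"), (2, "Rep B")])
conn.executemany("INSERT INTO SalesRepOrderBridge VALUES (?,?,?)", [
    (1, 1, 0.5), (1, 2, 0.5),   # group 1: both reps share the order
    (2, 1, 1.0),                # group 2: Rep A alone
])
conn.executemany("INSERT INTO Orders VALUES (?,?)", [(1, 10), (2, 4)])

rows = conn.execute("""
    SELECT s.Name,
           SUM(o.OrderQty * b.Weight) AS WeightedQty,   -- allocated share
           SUM(o.OrderQty)            AS UnweightedQty  -- counts shared orders in full
    FROM Orders o
    JOIN SalesRepOrderBridge b ON b.SalesRepGrpKey = o.SalesRepGrpKey
    JOIN SalesRep s ON s.SalesRepKey = b.SalesRepKey
    GROUP BY s.Name
    ORDER BY s.Name
""").fetchall()

total_qty = conn.execute("SELECT SUM(OrderQty) FROM Orders").fetchone()[0]
weighted_total = sum(r[1] for r in rows)
unweighted_total = sum(r[2] for r in rows)
print(rows, total_qty, weighted_total, unweighted_total)
```

The weighted column adds back up to the true order total, while the unweighted column counts the shared order once per representative. Which of the two a report shows is easy to get wrong, which is one reason such results can be misleading.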

Hierarchies
- Group data within dimensions: SalesRep, Region, State, County, Neighborhood
- Problem structures
  - Variable depth
  - Frequently changing

Heterogeneous Products
- Several different kinds of entry with different attributes for each (the sub-class problem)

Aggregate Dimensions
- Dimensions that represent data at different levels of granularity
- Remove a dimension
- Roll up the hierarchy (provide a new shrunken dimension with a new surrogate key that represents the rolled-up data)

Junk Dimensions
- Miscellaneous attributes that don't belong to another entity, usually representing processing levels
  - Flags
  - Categories
  - Types

Fact Tables
- Transaction: track processes at discrete points in time when they occur
- Periodic snapshot: cumulative performance over specific time intervals
- Accumulating snapshot: constantly updated over time. May include multiple dates representing stages.

Aggregates
- Precalculated summary tables
- Improve performance
- Record data at a coarser granularity
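A precalculated summary table can be sketched with sqlite3 in Python as shown below. The DailySales and MonthlySales tables, the yyyymmdd date keys, and the sample amounts are assumptions for illustration. The month key is derived by integer division of the date key by 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Detailed fact table at daily grain.
CREATE TABLE DailySales (DateKey INTEGER, ProductKey INTEGER, Amount REAL);
""")
conn.executemany("INSERT INTO DailySales VALUES (?,?,?)", [
    (20060925, 1, 10.0),
    (20060926, 1, 5.0),
    (20060926, 2, 2.5),
    (20061001, 1, 7.0),
])

# Aggregate table at monthly grain, built once and then queried repeatedly.
conn.execute("""
    CREATE TABLE MonthlySales AS
    SELECT DateKey / 100 AS MonthKey,   -- 20060926 / 100 = 200609 (integer division)
           ProductKey,
           SUM(Amount) AS Amount
    FROM DailySales
    GROUP BY DateKey / 100, ProductKey
""")

monthly = conn.execute(
    "SELECT MonthKey, ProductKey, Amount FROM MonthlySales "
    "ORDER BY MonthKey, ProductKey"
).fetchall()
detail_total = conn.execute("SELECT SUM(Amount) FROM DailySales").fetchone()[0]
aggregate_total = conn.execute("SELECT SUM(Amount) FROM MonthlySales").fetchone()[0]
print(monthly, detail_total, aggregate_total)
```

Monthly questions can then be answered from the smaller table. The aggregate has to be rebuilt or incrementally updated whenever new detail rows are loaded, so that its totals continue to match the detail.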