1 Data Warehousing Lecture-14 Process of Dimensional Modeling Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.

Presentation transcript:

1 Data Warehousing Lecture-14 Process of Dimensional Modeling Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research National University of Computers & Emerging Sciences, Islamabad

2 Process of Dimensional Modeling

3 The Process of Dimensional Modeling
Four-Step Method from ER to DM
1. Choose the Business Process
2. Choose the Grain
3. Choose the Facts
4. Choose the Dimensions

4 Step-1: Choose the Business Process
- A business process is a major operational process in an organization.
- Typically supported by a legacy system (database) or an OLTP system.
- Examples: Orders, Invoices, Inventory etc.
- Business processes are often termed data marts, which is why many people criticize DM as being data-mart oriented.

5 Step-1: Separating the Process
[Figure: Star-1, Star-2, Snow-flake]

6 Step-2: Choosing the Grain
- Grain is the fundamental, atomic level of data to be represented.
- Grain is also termed the unit of analysis.
- Example grain statements
- Typical grains
  - Individual transactions
  - Daily aggregates (snapshots)
  - Monthly aggregates
- Relationship between grain and expressiveness.
- Grain vs. hardware trade-off.

7 Step-2: Relationship b/w Grain
- Daily aggregates: 6 x 4 = 24 values
- Four aggregates per week: 4 x 4 = 16 values
- Two aggregates per week: 2 x 4 = 8 values
[Figure labels: LOW Granularity, HIGH Granularity]

8 The case FOR data aggregation
- Works well for repetitive queries.
- Follows the known thought process.
- Justifiable if used for the maximum number of queries.
- Provides a “big picture” or macroscopic view.
- Application dependent, usually inflexible to business changes (remember the lack of absoluteness of conventions).

9 The case AGAINST data aggregation
- Aggregation is irreversible.
  - Can create monthly sales data from weekly sales data, but the reverse is not possible.
- Aggregation limits the questions that can be answered.
  - What, when, why, where, what-else, what-next
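
The irreversibility point can be illustrated with a small sketch. The weekly figures below are hypothetical (they are not from the lecture); the idea is only that two different weekly patterns can roll up to the same monthly total, so the weekly detail cannot be recovered from the monthly value.

```python
# Hypothetical weekly sales for one month (illustrative numbers only).
weekly_sales = [120, 80, 150, 50]

# Rolling up from weekly to monthly is a simple sum.
monthly_sales = sum(weekly_sales)

# A different weekly pattern that rolls up to the same monthly total.
other_weekly_sales = [100, 100, 100, 100]

# Both patterns give the same monthly figure, so the weekly values
# cannot be reconstructed from the monthly aggregate alone.
same_total = sum(other_weekly_sales) == monthly_sales

print(monthly_sales, same_total)
```

Storing only `monthly_sales` would make the two months indistinguishable, which is why a finer grain keeps more questions answerable.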

10 The case AGAINST data aggregation
- Aggregation can hide crucial facts.
  - The average of 100 & 100 is the same as the average of 150 & 50.
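
A minimal sketch of the same point, using the two pairs named on the slide. The "spread" (maximum minus minimum) is an added illustration, not part of the lecture; it simply shows one piece of information that the average discards.

```python
from statistics import mean

# The two pairs of values mentioned on the slide.
pair_a = [100, 100]
pair_b = [150, 50]

# The averages are identical...
avg_a = mean(pair_a)
avg_b = mean(pair_b)

# ...but the spread (max - min) is very different, and the
# average alone gives no hint of that.
spread_a = max(pair_a) - min(pair_a)
spread_b = max(pair_b) - min(pair_b)

print(avg_a == avg_b, spread_a, spread_b)
```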

11 Aggregation hides crucial facts
Example
[Table: rows Zone-1 to Zone-4 and Average; columns Week-1 to Week-4 and Average. The only values that survive are 100 in the Zone-1 row and 100 in the Average row.]
Just looking at the averages i.e. aggregate

12 Aggregation hides crucial facts
[Chart]
- Z1: Sale is constant (need to work on it)
- Z2: Sale went up, then fell (cause for concern)
- Z3: Sale is on the rise, why?
- Z4: Sale dropped sharply, need to look deeply.
- W2: Static sale

13 Step 3: Choose Facts statement
“We need monthly sales volume and Rs. by week, product and Zone”
- Facts: sales volume and Rs.
- Dimensions: week, product and Zone

14 Step 3: Choose Facts
- Choose the facts that will populate each fact table record.
- Remember that the best facts are numeric, continuously valued and additive.
- Example: Quantity Sold, Amount etc.
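
To see why additivity matters, the sketch below sums two hypothetical fact rows (the product name and numbers are made up for illustration). Quantity Sold and Amount can be summed across rows; a unit price, which is an assumption added here as a contrast and is not mentioned on the slide, cannot be summed meaningfully.

```python
# Hypothetical fact rows: (product, quantity_sold, amount_rs).
fact_rows = [
    ("tea", 10, 500.0),
    ("tea", 30, 1200.0),
]

# Additive facts: summing across rows gives a meaningful total.
total_quantity = sum(qty for _, qty, _ in fact_rows)
total_amount = sum(amount for _, _, amount in fact_rows)

# Unit price is derived per row; adding unit prices together
# does not produce a meaningful number.
unit_prices = [amount / qty for _, qty, amount in fact_rows]
summed_unit_price = sum(unit_prices)

# The meaningful overall unit price comes from the additive facts.
overall_unit_price = total_amount / total_quantity

print(total_quantity, total_amount, summed_unit_price, overall_unit_price)
```

Keeping the additive quantities in the fact table and deriving ratios at query time avoids this problem.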

15 Step 4: Choose Dimensions
- Choose the dimensions that apply to each fact in the fact table.
- Typical dimensions: time, product, geography etc.
- Identify the descriptive attributes that explain each dimension.
- Determine hierarchies within each dimension.

16 Step-4: How to Identify a Dimension?
- The single-valued attributes during recording of a transaction are dimensions.
[Figure: fact table with columns Calendar_Date, Time_of_Day, Account_No, ATM_Location, Transaction_Type, Transaction_Rs; labels "Fact Table" and "Dim"]
- Time_of_day: Morning, Mid Morning, Lunch Break etc.
- Transaction_Type: Withdrawal, Deposit, Check balance etc.
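
The ATM example can be sketched in a few lines. The column names come from the slide; treating Transaction_Rs as the fact and the other five columns as dimensions is an assumption, and the rows and the helper `total_by` are hypothetical.

```python
from collections import defaultdict

# Column names from the slide. Treating Transaction_Rs as the fact
# (measure) and the rest as dimensions is an assumption.
DIMENSIONS = ("Calendar_Date", "Time_of_Day", "Account_No",
              "ATM_Location", "Transaction_Type")
FACT = "Transaction_Rs"

# Hypothetical transactions: each one has exactly one value per dimension.
transactions = [
    {"Calendar_Date": "2024-01-01", "Time_of_Day": "Morning",
     "Account_No": "A-1", "ATM_Location": "Islamabad",
     "Transaction_Type": "Withdrawal", "Transaction_Rs": 5000},
    {"Calendar_Date": "2024-01-01", "Time_of_Day": "Lunch Break",
     "Account_No": "A-2", "ATM_Location": "Islamabad",
     "Transaction_Type": "Deposit", "Transaction_Rs": 2000},
    {"Calendar_Date": "2024-01-02", "Time_of_Day": "Morning",
     "Account_No": "A-1", "ATM_Location": "Lahore",
     "Transaction_Type": "Withdrawal", "Transaction_Rs": 1000},
]

def total_by(dimension, rows):
    """Sum the fact for each distinct value of one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[FACT]
    return dict(totals)

print(total_by("Transaction_Type", transactions))
print(total_by("Time_of_Day", transactions))
```

Because every transaction carries a single value for each dimension, the fact can be grouped by any of them without ambiguity.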

17 Step-4: Can Dimensions be Multi-valued?
- Are dimensions ALWAYS single-valued?
  - Not really
  - What are the problems? And how to handle them?
- Calendar_Date (of inspection)
- Reg_No
- Technician
- Workshop
- Maintenance_Operation
- How many maintenance operations are possible?
  - Few
  - Maybe more for old cars.

18 Step-4: Dimensions & Grain
- Several grains are possible as per business requirement.
- For some aggregations certain descriptions do not remain atomic.
- Example: Time_of_Day may change several times during a daily aggregate, but not during a transaction.
- Choose the dimensions that are applicable within the selected grain.
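
One way to check whether a dimension still applies at a chosen grain is to count how many distinct values it takes within each aggregate. The sketch below does this for Time_of_Day at a daily grain; the rows and the helper `values_per_day` are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions: (calendar_date, time_of_day, transaction_rs).
transactions = [
    ("2024-01-01", "Morning", 5000),
    ("2024-01-01", "Lunch Break", 2000),
    ("2024-01-02", "Morning", 1000),
]

def values_per_day(rows):
    """Collect the distinct Time_of_Day values seen on each date."""
    seen = defaultdict(set)
    for date, time_of_day, _ in rows:
        seen[date].add(time_of_day)
    return dict(seen)

time_values = values_per_day(transactions)

# At transaction grain each row has one Time_of_Day. At daily grain the
# attribute is usable only if every day has a single distinct value.
usable_at_daily_grain = all(len(v) == 1 for v in time_values.values())

print(time_values, usable_at_daily_grain)
```

Here one day contains two Time_of_Day values, so the attribute is not single-valued at the daily grain and would be dropped from a daily aggregate.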