MIS 451 Building Business Intelligence Systems Logical Design (1)

2 Project Planning, Requirements Analysis, Physical Design, Logical Design, Data Staging, Data Analysis (OLAP)

3 Introduction to Dimensional Modeling
Dimensional modeling is a data warehouse (DW) logical design technique that presents data in a standard framework that is intuitive to query and allows high-performance data access.
    Intuitive: SQL is easy to write.
    High performance: the resulting SQL runs efficiently.

4 ER Model and Dimensional Model (Star Schema)
For detailed information, please refer to handout 1.

5 Introduction to Dimensional Modeling
Analytical report: a two-dimensional January sales report by customer state and product category.
Query: list January sales by customer state and product category.

6 Introduction to Dimensional Modeling
Query based on ER Model:
    Select State, PCName, SUM(Price*Quantity)
    From OrderLine OL, Customer C, Product_Category PC, Product P, Order O
    Where OL.OID = O.OID
      and OL.PID = P.PID
      and O.CID = C.CID
      and to_char(O.OrderDate,'MON') = 'JAN'
      and P.PCID = PC.PCID
    Group by State, PCName
Join: 5 tables
Query based on Dimensional Model:
    Select State, PCName, SUM(Sales)
    From Sales S, Customer C, Product P, Time T
    Where S.Time_Key = T.Time_Key
      and S.Product_Key = P.Product_Key
      and S.Customer_Key = C.Customer_Key
      and T.Month = 'JAN'
    Group by State, PCName
Join: 4 tables
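
The dimensional query on slide 6 can be tried end to end on a small in-memory database. This is only a sketch: the table and key names follow the slide, but the extra columns (CName, PName, Day), the sample rows, and the use of SQLite through Python instead of Oracle are assumptions made for illustration.

```python
import sqlite3

# Tiny star schema in memory. Table and key names follow slide 6;
# the other columns and all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Customer_Key INTEGER PRIMARY KEY, CName TEXT, State TEXT);
CREATE TABLE Product  (Product_Key  INTEGER PRIMARY KEY, PName TEXT, PCName TEXT);
CREATE TABLE Time     (Time_Key     INTEGER PRIMARY KEY, Day TEXT, Month TEXT);
CREATE TABLE Sales    (Time_Key INTEGER, Product_Key INTEGER,
                       Customer_Key INTEGER, Sales REAL);

INSERT INTO Customer VALUES (1, 'Jon', 'Arizona'), (2, 'Ann', 'Utah');
INSERT INTO Product  VALUES (1, 'Laptop', 'Computers'), (2, 'Desk', 'Furniture');
INSERT INTO Time     VALUES (1, '2003-01-14', 'JAN'), (2, '2003-02-03', 'FEB');
INSERT INTO Sales    VALUES (1, 1, 1, 1000.0), (1, 1, 1, 500.0),
                            (1, 2, 2, 200.0),  (2, 1, 2, 900.0);
""")

# The slide's dimensional query: one fact table joined to three dimensions.
rows = conn.execute("""
    SELECT C.State, P.PCName, SUM(S.Sales)
    FROM Sales S, Customer C, Product P, Time T
    WHERE S.Time_Key = T.Time_Key
      AND S.Product_Key = P.Product_Key
      AND S.Customer_Key = C.Customer_Key
      AND T.Month = 'JAN'
    GROUP BY C.State, P.PCName
    ORDER BY C.State, P.PCName
""").fetchall()

for state, category, total in rows:
    print(state, category, total)
```

The February sale is filtered out by the Time dimension, so only January totals per state and category are reported.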

7 Fact and Dimension (figure labels: Fact table, Dimension table)

8 Fact and Dimension
There are two types of tables in dimensional modeling:
    Fact table: its attributes are the measurements for analysis, that is, the contents of reports.
    Dimension table: its attributes are the constraints on those measurements, that is, the headers of reports.
(Figure labels: Dimensions, Facts)

9 Facts and Dimensions

Criteria       Fact Attributes                     Dimension Attributes
Purpose        Measurements for analysis           Constraints for the measurements
Reporting use  Report content                      Row or column report headers
Data type      Mostly numeric and additive; some   Textual, descriptive
               are semi-additive or non-additive
Size           Larger number of records            Smaller number of records

10 Facts and Dimensions
How to identify facts and dimensions?
Requirements analysis:
    Analytical requirements: marketing managers want to know the sales performance of different product categories in different states.
    Information requirements: quantity of product sold, sales amount, product category, and customer state.
ER model

11

12 F1: Calculation
F: refers to special considerations for fact tables or to special types of fact table.

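13 F1: Calculation
Normalization in a relational database (RDB): 1NF, 2NF, 3NF.
Because data in a data warehouse is non-volatile (it is loaded and read rather than updated in place), the update anomalies that normalization guards against matter much less. DW design can therefore forgo normalization to improve query performance.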

14 D1: Slowly changing dimension
D: refers to special considerations for dimension tables or to special types of dimension table.

15 D1: Slowly changing dimension
Values of attributes in dimension tables may change over time. For example, a customer may move from one city to another.

CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson
103  Mark   Arizona  Phoenix

Tom moved from Tucson to Phoenix.

16 D1: Slowly changing dimension
There are three ways to handle a slowly changing dimension.
Method 1: Overwrite old values with new values.

Before:
CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson
103  Mark   Arizona  Phoenix

After:
CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Phoenix
103  Mark   Arizona  Phoenix

17 D1: Slowly changing dimension
Drawbacks of method 1: historical information is lost entirely. We can no longer tell that customer 102 used to live in Tucson. Moreover, when sales are listed by city, all sales to customer 102 are counted as Phoenix sales, including those made while 102 lived in Tucson.
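
The effect of method 1 on a sales-by-city report can be reproduced with a few rows. The sketch below is illustrative only: the Customer rows come from the slides, while the Sales table, its amounts, and the helper name sales_by_city are hypothetical, and SQLite through Python stands in for the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CID INTEGER PRIMARY KEY, CName TEXT, State TEXT, City TEXT);
CREATE TABLE Sales (CID INTEGER, Sales REAL);
INSERT INTO Customer VALUES (101, 'Jon',  'Arizona', 'Tucson'),
                            (102, 'Tom',  'Arizona', 'Tucson'),
                            (103, 'Mark', 'Arizona', 'Phoenix');
-- Hypothetical sales, all made while Tom (102) still lived in Tucson.
INSERT INTO Sales VALUES (101, 100.0), (102, 250.0), (103, 80.0);
""")

def sales_by_city(conn):
    """Return {city: total sales}, using each customer's city as stored now."""
    query = """
        SELECT C.City, SUM(S.Sales)
        FROM Sales S JOIN Customer C ON S.CID = C.CID
        GROUP BY C.City
    """
    return dict(conn.execute(query).fetchall())

before = sales_by_city(conn)

# Method 1: overwrite the old city with the new one.
conn.execute("UPDATE Customer SET City = 'Phoenix' WHERE CID = 102")
after = sales_by_city(conn)

print("before:", before)
print("after: ", after)
```

After the overwrite, Tom's earlier Tucson purchases are reported under Phoenix, which is the distortion the slide describes.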

18 D1: Slowly changing dimension
Method 2: Add a new attribute to record the current value of the changing attribute.

Before:
CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson
103  Mark   Arizona  Phoenix

After:
CID  CName  State    Original City  Current City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson         Phoenix
103  Mark   Arizona  Phoenix

19 D1: Slowly changing dimension
Drawbacks of method 2: only partial historical information (original and current) is kept. If customer 102 moved from Tucson to Flagstaff and then to Phoenix, the record for customer 102 would contain only Tucson and Phoenix; the stay in Flagstaff would be lost.

20 D1: Slowly changing dimension
Method 3: Add a record whenever a dimension attribute changes.

CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson
103  Mark   Arizona  Phoenix

21 D1: Slowly changing dimension
Method 3 keeps all the information. However, is there any problem?

22 D1: Slowly changing dimension
Method 4: warehouse key + method 3.
A warehouse key is a sequence of non-negative integers that serves as the primary key of a table in the data warehouse.

CID  CName  State    City
101  Jon    Arizona  Tucson
102  Tom    Arizona  Tucson
103  Mark   Arizona  Phoenix

(Figure label: Warehouse key)

23 D1: Slowly changing dimension
Why is a warehouse key needed in a data warehouse?
    It solves the slowly changing dimension problem.
    Compared with natural keys (i.e., primary keys of tables in the RDB, such as CID in the customer table), warehouse keys give better join performance.

24 D1: Slowly changing dimension
Warehouse key
    Primary keys in dimension tables are warehouse keys.
    The primary key of a fact table is a combination of the warehouse keys of all or some of its associated dimensions.
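
Method 4 (a warehouse key combined with method 3) can be sketched as follows. The Customer rows follow the slides and Customer_Key follows the query on slide 6; the sales amounts and the use of SQLite through Python are assumptions. SQLite's INTEGER PRIMARY KEY is used here as a simple stand-in for a warehouse key generator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Customer_Key INTEGER PRIMARY KEY,   -- warehouse key
    CID INTEGER,                        -- natural key from the source system
    CName TEXT, State TEXT, City TEXT);
CREATE TABLE Sales (Customer_Key INTEGER, Sales REAL);
INSERT INTO Customer VALUES (1, 101, 'Jon',  'Arizona', 'Tucson'),
                            (2, 102, 'Tom',  'Arizona', 'Tucson'),
                            (3, 103, 'Mark', 'Arizona', 'Phoenix');
-- Hypothetical sale made while Tom lived in Tucson.
INSERT INTO Sales VALUES (2, 250.0);
""")

# Tom moves: add a new record (method 3). CID repeats, so it cannot be the
# primary key any more; the new row gets its own warehouse key instead.
cur = conn.execute(
    "INSERT INTO Customer (CID, CName, State, City) "
    "VALUES (102, 'Tom', 'Arizona', 'Phoenix')")
new_key = cur.lastrowid

# A later (hypothetical) sale references the new warehouse key.
conn.execute("INSERT INTO Sales VALUES (?, ?)", (new_key, 40.0))

by_city = dict(conn.execute("""
    SELECT C.City, SUM(S.Sales)
    FROM Sales S JOIN Customer C ON S.Customer_Key = C.Customer_Key
    GROUP BY C.City
""").fetchall())

history = [city for (city,) in conn.execute(
    "SELECT City FROM Customer WHERE CID = 102 ORDER BY Customer_Key")]

print(by_city)
print(history)
```

Each sale stays attached to the version of the customer that was current when it was made, so sales by city remain correct and the full move history of CID 102 is still available.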

25 D1: Slowly changing dimension Notation: Primary key

26 D2: Time Dimension
D: refers to special considerations for dimension tables or to special types of dimension table.

27 D2: Time Dimension
A data warehouse needs an explicit time dimension table instead of just a time attribute (e.g., ORDERDATE). Besides the time attribute, the time dimension table includes the following additional attributes:
    Day_of_week (1-7); Day_number_in_month (1-31); Day_number_in_year (1-366)
    Week_number (1-53); Month (1-12); Quarter (1-4)
    Holiday_flag (y/n)
    Fiscal_quarter, Fiscal_year
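
The calendar attributes above can be precomputed once per day when the time dimension is loaded. The following is a minimal sketch using Python's standard datetime module; the function names, the time_key column, the ISO conventions for day of week and week number, and the holiday set are all assumptions. Fiscal attributes are left out because they depend on each organization's fiscal calendar.

```python
from datetime import date, timedelta

def time_dimension_row(d, holidays=frozenset()):
    """Build the calendar attributes for one date d."""
    return {
        "date": d.isoformat(),
        "day_of_week": d.isoweekday(),          # ISO: 1 = Monday ... 7 = Sunday
        "day_number_in_month": d.day,
        "day_number_in_year": d.timetuple().tm_yday,
        "week_number": d.isocalendar()[1],      # ISO week number
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "holiday_flag": "y" if d in holidays else "n",
    }

def build_time_dimension(start, end, holidays=frozenset()):
    """Return one row per day from start to end inclusive, keyed 1, 2, 3, ..."""
    rows = []
    d, key = start, 1
    while d <= end:
        row = {"time_key": key}
        row.update(time_dimension_row(d, holidays))
        rows.append(row)
        d += timedelta(days=1)
        key += 1
    return rows

# Example: January 2003 with New Year's Day marked as a holiday.
january = build_time_dimension(date(2003, 1, 1), date(2003, 1, 31),
                               holidays={date(2003, 1, 1)})
print(january[13])
```

With the rows stored in a table, a report can filter on quarter or holiday_flag directly instead of deriving them from a raw date in every query.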

28 D2: Time Dimension
A time dimension can:
    Save computation effort and improve query performance.
    Hide complex calendar calculations from end users of the data warehouse.

29 D3: Snowflake
D: refers to special considerations for dimension tables or to special types of dimension table.

30 D3: Snowflake Snowflake structure

31 D3: Snowflake
A snowflake structure should be avoided in data warehouse design.
Tradeoff of avoiding a snowflake:
    Advantage: improved query performance.
    Disadvantage: more storage space required.
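
The tradeoff can be illustrated by modeling the same product dimension both ways. In this sketch the names Product_Category, PCID and PCName follow the ER query on slide 6, while Product_Snow, Product_Star, PName and all rows are hypothetical; SQLite through Python is used only to make the comparison runnable, and no timing is measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflake: the product dimension reaches the category through a second table.
CREATE TABLE Product_Category (PCID INTEGER PRIMARY KEY, PCName TEXT);
CREATE TABLE Product_Snow (Product_Key INTEGER PRIMARY KEY, PName TEXT, PCID INTEGER);
-- Star: the category name is stored redundantly in every product row.
CREATE TABLE Product_Star (Product_Key INTEGER PRIMARY KEY, PName TEXT, PCName TEXT);
CREATE TABLE Sales (Product_Key INTEGER, Sales REAL);

INSERT INTO Product_Category VALUES (1, 'Computers'), (2, 'Furniture');
INSERT INTO Product_Snow VALUES (1, 'Laptop', 1), (2, 'Tablet', 1), (3, 'Desk', 2);
INSERT INTO Product_Star VALUES (1, 'Laptop', 'Computers'),
                                (2, 'Tablet', 'Computers'),
                                (3, 'Desk',   'Furniture');
INSERT INTO Sales VALUES (1, 1000.0), (2, 300.0), (3, 200.0);
""")

# Snowflake: two joins are needed to reach the category name.
snowflake = conn.execute("""
    SELECT PC.PCName, SUM(S.Sales)
    FROM Sales S
    JOIN Product_Snow P ON S.Product_Key = P.Product_Key
    JOIN Product_Category PC ON P.PCID = PC.PCID
    GROUP BY PC.PCName ORDER BY PC.PCName
""").fetchall()

# Star: one join, because the category name lives in the dimension itself.
star = conn.execute("""
    SELECT P.PCName, SUM(S.Sales)
    FROM Sales S
    JOIN Product_Star P ON S.Product_Key = P.Product_Key
    GROUP BY P.PCName ORDER BY P.PCName
""").fetchall()

print(snowflake)
print(star)
```

Both queries return the same totals. The star version needs one join fewer, at the cost of repeating the text 'Computers' in every computer product row.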