Data Warehousing DSCI 4103 Dr. Mennecke Chapter 2.

Steps in Designing a DW
• Choose a business process model (BPM)
• Choose the grain of the business process
• Choose the dimensions
• Choose the measured facts that will populate each record of the fact table

Choose a Business Process Model
• A BPM is a view of the organization that considers the operational processes for which the operational system captures data.
• Business processes are typically things like orders, shipments, inventory, and sales.

Choose the Grain of the Business Process
• The grain is the fundamental level of data to be represented in the fact table for the business process that is being modeled.
• Grains can be as detailed as individual transactions or as broad as periodic summaries.
  – Detailed grains are preferable to highly summarized grains because they offer more flexibility: detail can always be summarized later, but a summary cannot be broken back down.
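The flexibility point can be sketched in a few lines of Python. The rows, column order, and the `roll_up` helper are hypothetical and only illustrate the idea: facts stored at transaction grain can be summarized to any coarser level on demand.

```python
from collections import defaultdict

# Hypothetical fact rows at transaction-line grain:
# (date, store, product, quantity_sold)
transaction_facts = [
    ("2003-01-15", "S1", "P1", 2),
    ("2003-01-15", "S1", "P2", 1),
    ("2003-01-15", "S2", "P1", 4),
    ("2003-01-16", "S1", "P1", 3),
]

def roll_up(rows, key_positions):
    """Sum quantity_sold (the last column) over the chosen key columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        totals[key] += row[-1]
    return dict(totals)

# The detailed grain supports either summary; neither summary
# could be used to rebuild the transaction rows.
daily_by_store = roll_up(transaction_facts, (0, 1))
daily_total = roll_up(transaction_facts, (0,))
```

Had only `daily_total` been stored, a question about a single store or product could no longer be answered.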

Choose the Dimensions
• Dimensions often fall out of the fields included in the fact table. However, selecting the right dimensions and dimensional fields is critical to the success of a DW.
  – If adding a dimension would require more records in the fact table, it violates the grain of the fact table and is suspect.
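One way to picture the grain test is to count how many fact rows would be needed once a candidate dimension is attached. This is only a sketch: the grain, the candidate dimensions, and the `rows_after_adding` helper are all assumptions made for illustration.

```python
# Declared grain (assumed for illustration): one fact row per POS transaction line.
fact_rows = [
    {"pos_transaction": "T-1001", "product": "Cereal", "quantity_sold": 1},
    {"pos_transaction": "T-1001", "product": "Milk", "quantity_sold": 2},
]

def rows_after_adding(fact_rows, candidate_values):
    """Count the fact rows needed if the candidate dimension were attached.

    candidate_values maps a product to the list of dimension values that
    apply to it; more than one value forces extra fact rows.
    """
    return sum(len(candidate_values[row["product"]]) for row in fact_rows)

# Hypothetical candidates: one promotion per product, but several ingredients.
promotion = {"Cereal": ["Coupon"], "Milk": ["No promotion"]}
ingredient = {"Cereal": ["Corn", "Sugar"], "Milk": ["Milk"]}

promotion_fits_grain = rows_after_adding(fact_rows, promotion) == len(fact_rows)
ingredient_fits_grain = rows_after_adding(fact_rows, ingredient) == len(fact_rows)
```

Here the promotion keeps the row count unchanged, while the ingredient would multiply rows and so does not belong at this grain.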

Choose the Facts
• Usually measured facts are numeric additive fields like quantity sold.
  – Non-additive facts are fields such as unit price, ratios, and percentages.
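A small sketch of why the distinction matters, using made-up rows with prices in cents: additive facts can simply be summed, while a non-additive fact such as unit price has to be derived from additive components.

```python
# Hypothetical sales fact rows: (product, quantity_sold, unit_price_cents)
rows = [
    ("P1", 2, 150),
    ("P1", 10, 120),
]

# Additive facts can be summed across any set of rows.
total_quantity = sum(qty for _, qty, _ in rows)
total_amount_cents = sum(qty * price for _, qty, price in rows)

# Summing unit price produces a number with no business meaning.
meaningless_sum = sum(price for _, _, price in rows)

# An average price is derived from the additive totals instead.
average_unit_price_cents = total_amount_cents / total_quantity
```

Storing the extended amount (quantity times price) as its own fact keeps the fact table additive and lets ratios be computed at query time.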

Points to Consider
• Normalizing the fact table is important:
  – The fact table is the largest part of the DW (it can hold hundreds of millions of records), so it should include as few fields as possible.
• Normalizing dimension tables is a waste of time:
  – Dimension tables are minuscule when compared to the fact table.
  – Normalizing dimension tables reduces the users' ability to browse the data.

Date Dimensions
• A date dimension is important because DWs are time variant: they capture data as of a moment in time. As a result, date must usually be part of a DW query.

Date fields
• For example…
  – Date key
  – Date
  – Full date description
  – Day of week
  – Day of epoch
  – Week number of epoch
  – Month number of epoch
  – Day number in calendar month
  – Day number in calendar year
  – Day number in fiscal month
  – Day number in fiscal year
  – Selling season
  – Weekend indicator
  – Holiday indicator
  – Calendar quarter
  – Etc…
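Most of these fields can be computed once, ahead of time, for every calendar day. The sketch below builds a subset of them. The epoch date, the field names, and the holiday set are assumptions, and fiscal fields and selling season are left out because they depend on the organization's own calendar.

```python
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)  # assumed first day covered by the warehouse
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def date_dimension_row(d, holidays=frozenset()):
    """Build one date dimension record for calendar date d."""
    day_of_epoch = (d - EPOCH).days + 1
    return {
        "date_key": day_of_epoch,  # plain sequential integer
        "date": d.isoformat(),
        "day_of_week": DAY_NAMES[d.weekday()],
        "day_of_epoch": day_of_epoch,
        "day_number_in_calendar_month": d.day,
        "day_number_in_calendar_year": d.timetuple().tm_yday,
        "calendar_quarter": f"Q{(d.month - 1) // 3 + 1}",
        "weekend_indicator": "Weekend" if d.weekday() >= 5 else "Weekday",
        "holiday_indicator": "Holiday" if d in holidays else "Non-Holiday",
    }

def build_date_dimension(start, end, holidays=frozenset()):
    """Return one record per day from start to end, inclusive."""
    rows = []
    d = start
    while d <= end:
        rows.append(date_dimension_row(d, holidays))
        d += timedelta(days=1)
    return rows

date_dim = build_date_dimension(
    date(2003, 1, 1), date(2003, 1, 31),
    holidays=frozenset({date(2003, 1, 1)}),
)
```

Because every attribute is stored as a column, a query can filter on, say, weekend or holiday without any date arithmetic.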

Degenerate dimensions
• Fact table fields that are included in the table even though they are not linked to a dimension table.
  – Occurs when the grain of the fact table corresponds to an individual transaction.
  – For example, the POS transaction # is important to the transaction, but everything that describes it is already represented elsewhere in the fact table, so it has no dimension table of its own.
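A minimal sketch with Python's built-in `sqlite3` module shows a degenerate dimension in use. The table and column names are hypothetical. The POS transaction number sits in the fact table with no dimension table behind it, yet it still groups the line items of one basket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE sales_fact (
    date_key        INTEGER,
    product_key     INTEGER REFERENCES product_dim(product_key),
    pos_transaction TEXT,      -- degenerate dimension: no matching dimension table
    quantity_sold   INTEGER
);
INSERT INTO product_dim VALUES (1, 'Cereal'), (2, 'Milk');
INSERT INTO sales_fact VALUES
    (1097, 1, 'T-1001', 1),
    (1097, 2, 'T-1001', 2),
    (1097, 2, 'T-1002', 1);
""")

# Group line items by transaction number to describe each basket.
basket_sizes = conn.execute("""
    SELECT pos_transaction, COUNT(*) AS line_items, SUM(quantity_sold) AS units
    FROM sales_fact
    GROUP BY pos_transaction
    ORDER BY pos_transaction
""").fetchall()
```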

Drilling Up and Down
• To drill down into a DW, more dimensions are added to the query.
  – In other words, more fields are added to the query, which results in more records being included in the dynaset.
• To drill up, dimensions are removed from the query.
  – In other words, fewer fields are part of the query, which results in fewer records being included in the dynaset.
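Drilling can be sketched as nothing more than changing the list of grouping fields. The example uses `sqlite3` and a hypothetical, already-joined `sales` table, and the `summarize` helper is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (brand TEXT, store TEXT, quantity_sold INTEGER);
INSERT INTO sales VALUES
    ('Acme', 'North', 5), ('Acme', 'South', 3),
    ('Zest', 'North', 2), ('Zest', 'South', 7), ('Zest', 'South', 1);
""")

def summarize(*group_fields):
    """Total quantity_sold grouped by the given fields.

    The field names are interpolated into the SQL text, so they must come
    from trusted code, never from user input.
    """
    cols = ", ".join(group_fields)
    sql = (f"SELECT {cols}, SUM(quantity_sold) FROM sales "
           f"GROUP BY {cols} ORDER BY {cols}")
    return conn.execute(sql).fetchall()

by_brand = summarize("brand")                    # drilled up: 2 rows
by_brand_store = summarize("brand", "store")     # drilled down: 4 rows
```

Adding `store` to the grouping returns more rows, and removing it returns fewer, while the grand total stays the same.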

Normalizing Dimensions
• Snowflaking is the process of normalizing denormalized dimensional tables.
(Diagram labels: Product Dimension, Brand Dimension, Package Type Dimension, Storage Type Dimension)

Normalization should be resisted
• Presentation is more difficult with snowflake dimensions
• Snowflake dimensions create difficulties for query optimizers
• Disk space savings are insignificant
• Snowflake dimensions inhibit a user’s ability to browse and query dimensions
• Snowflake dimensions reduce index efficiency (DWs are highly indexed)
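The browsing argument can be illustrated by asking the same question of a denormalized product dimension and of a snowflaked one. All table names, column names, and rows below are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized (star) product dimension: brand and package type are plain columns.
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY, description TEXT,
    brand TEXT, package_type TEXT);
INSERT INTO product_dim VALUES
    (1, 'Corn Flakes 12oz', 'Acme', 'Box'),
    (2, 'Corn Flakes 24oz', 'Acme', 'Box'),
    (3, 'Orange Juice 1qt', 'Zest', 'Carton');

-- Snowflaked version: brand and package type moved into their own tables.
CREATE TABLE brand_dim (brand_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE package_type_dim (package_type_key INTEGER PRIMARY KEY, package_type TEXT);
CREATE TABLE product_snow (
    product_key INTEGER PRIMARY KEY, description TEXT,
    brand_key INTEGER REFERENCES brand_dim(brand_key),
    package_type_key INTEGER REFERENCES package_type_dim(package_type_key));
INSERT INTO brand_dim VALUES (1, 'Acme'), (2, 'Zest');
INSERT INTO package_type_dim VALUES (1, 'Box'), (2, 'Carton');
INSERT INTO product_snow VALUES
    (1, 'Corn Flakes 12oz', 1, 1),
    (2, 'Corn Flakes 24oz', 1, 1),
    (3, 'Orange Juice 1qt', 2, 2);
""")

# Star: browsing the dimension is a single-table query.
star_rows = conn.execute("""
    SELECT description FROM product_dim
    WHERE brand = 'Acme' AND package_type = 'Box'
    ORDER BY description
""").fetchall()

# Snowflake: the same question needs two extra joins.
snow_rows = conn.execute("""
    SELECT p.description
    FROM product_snow AS p
    JOIN brand_dim AS b ON b.brand_key = p.brand_key
    JOIN package_type_dim AS t ON t.package_type_key = p.package_type_key
    WHERE b.brand = 'Acme' AND t.package_type = 'Box'
    ORDER BY p.description
""").fetchall()
```

Both queries return the same products. The snowflaked form only adds joins that the user, the query tool, and the optimizer all have to handle.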

Surrogate keys
• Every join between dimensions and the fact table in the data warehouse should be based on meaningless integer surrogates.
  – Avoid using natural codes or keys
  – Keys should not be smart
• Over time, smart keys and natural codes may change and lose their meaning.
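A rough sketch of surrogate key assignment follows. The `SurrogateKeyMap` class and the sample product codes are hypothetical. The point is that the warehouse key is a sequential integer that stays fixed even when the source system changes its natural code.

```python
class SurrogateKeyMap:
    """Hands out meaningless sequential integers for natural keys."""

    def __init__(self):
        self._keys = {}
        self._next = 1

    def key_for(self, natural_key):
        """Return the surrogate for a natural key, assigning one if new."""
        if natural_key not in self._keys:
            self._keys[natural_key] = self._next
            self._next += 1
        return self._keys[natural_key]

    def recode(self, old_natural_key, new_natural_key):
        """The source system renamed a code; the surrogate stays the same."""
        self._keys[new_natural_key] = self._keys.pop(old_natural_key)

product_keys = SurrogateKeyMap()
k1 = product_keys.key_for("SKU-CEREAL-12OZ")
k2 = product_keys.key_for("SKU-MILK-1GAL")

# The operational system later replaces its smart code with a new one.
product_keys.recode("SKU-CEREAL-12OZ", "A-000417")
k1_after = product_keys.key_for("A-000417")
```

Fact rows that reference surrogate 1 remain valid after the recode, because nothing in the fact table depends on the natural code.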