Modeling Issues for Data Warehouses CMPT 455/826 - Week 7, Day 1 (based on Trujillo) Sept-Dec 2009 – w7d1

This is a tough paper
This is the toughest paper that we’ve dealt with so far
It introduces
– a number of concepts that are very important
– in ways that are often difficult to follow
– with a combination of standard and homemade terms
So, for today
– rather than concentrate on critique items
– we need to concentrate on the concepts

Multidimensional modeling
Ties together the concepts of:
– a data warehouse
– a multidimensional database (MDB)
– online analytical processing (OLAP)
What are dimensions?
What are
– data warehouses?
– multidimensional databases (MDBs)?
– online analytical processing (OLAP)?

Multidimensional modeling
Structures information into
– facts
– dimensions
Facts are described by a set of attributes called measures or fact attributes, which
– can be atomic or derived
– are contained in cells or points within the data cube
We base this set of measures on a set of dimensions that derive from the granularity chosen for representing the facts.
These dimensions thus present the context for analyzing the facts.
Dimension attributes
– provide the specifics that characterize dimensions
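The vocabulary above can be made concrete with a minimal sketch. It assumes a hypothetical sales fact with three dimensions (product, store, day), two atomic measures, and one derived measure; none of these names come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleFact:
    # Dimension coordinates: together they locate one cell of the data cube.
    product: str
    store: str
    day: str
    # Atomic measures (hypothetical names).
    quantity: int
    unit_price: float

    @property
    def total(self) -> float:
        # Derived measure: computed from the atomic measures.
        return self.quantity * self.unit_price


# Illustrative data only.
facts = [
    SaleFact("cola", "store-A", "2009-10-19", 3, 2.0),
    SaleFact("bread", "store-A", "2009-10-19", 1, 3.0),
    SaleFact("cola", "store-B", "2009-10-20", 2, 2.0),
]


def cell(product: str, store: str, day: str) -> list:
    """Return the facts stored at one point (cell) of the cube."""
    return [f for f in facts if (f.product, f.store, f.day) == (product, store, day)]


# Analyzing a measure in the context of one dimension (product).
cola_total = sum(f.total for f in facts if f.product == "cola")
```

The dimensions give the context (which product, where, when) and the measures are the values being analyzed within that context.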

Multidimensional modeling
Facts
– many-to-many relationships between all dimensions
– many-to-one relationships between the fact and every particular dimension
  e.g. a product sale is related to only one product that is sold in one store to one customer at one time
– can represent many-to-many relationships between particular dimensions
  e.g. one sales slip can contain many products, and one product can be on many sales slips
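A small sketch of these two kinds of relationship, assuming hypothetical sale lines keyed by sales slip. Each line points at exactly one member of every dimension (many-to-one), while slips and products end up in a many-to-many relationship.

```python
from collections import defaultdict

# Each fact row names exactly one product, store, customer, and day
# (many-to-one from the fact to every dimension). Data is illustrative only.
sale_lines = [
    ("slip-1", "cola", "store-A", "cust-1", "2009-10-19"),
    ("slip-1", "bread", "store-A", "cust-1", "2009-10-19"),
    ("slip-2", "cola", "store-B", "cust-2", "2009-10-20"),
]

# Derive the many-to-many relationship between sales slips and products.
products_on_slip = defaultdict(set)
slips_with_product = defaultdict(set)
for slip, product, _store, _customer, _day in sale_lines:
    products_on_slip[slip].add(product)
    slips_with_product[product].add(slip)
```

Here `products_on_slip["slip-1"]` holds two products and `slips_with_product["cola"]` holds two slips, which is the many-to-many case from the slide.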

Multidimensional modeling
The additivity / summarizability concept
– A measure (fact attribute) is additive along a dimension if we can use the SUM operator to aggregate attribute values along all hierarchies defined on that dimension
The aggregation of some fact attributes
– called roll-up in OLAP terminology
– might not be semantically meaningful for all measures along all dimensions
e.g. number of clients
– estimated by counting the number of purchase receipts for a given product, customer, day, and store
– is not additive along the product dimension. Because the same ticket can include other products, adding up the number of clients for two or more products would lead to inconsistent results.
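The number-of-clients example can be checked with a few lines of code. This is a sketch with made-up receipt data: quantity sold behaves additively along the product dimension, while the receipt count does not, because one receipt can list several products.

```python
# (receipt_id, product, quantity) - illustrative data only.
lines = [
    ("r1", "cola", 2),
    ("r1", "chips", 1),
    ("r2", "cola", 1),
]


def quantity_for(products: set) -> int:
    """Quantity sold: a measure that can be summed across products."""
    return sum(q for _, p, q in lines if p in products)


def clients_for(products: set) -> int:
    """Number of clients, estimated as the number of distinct receipts."""
    return len({r for r, p, _ in lines if p in products})


# Additive: summing per-product values equals the value for both products.
quantity_is_additive = (
    quantity_for({"cola"}) + quantity_for({"chips"}) == quantity_for({"cola", "chips"})
)

# Not additive: receipt r1 is counted once for cola and once for chips.
summed_clients = clients_for({"cola"}) + clients_for({"chips"})
actual_clients = clients_for({"cola", "chips"})
```

With this data the per-product client counts sum to 3, but only 2 distinct receipts exist, so rolling the measure up with SUM along the product dimension overcounts.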

Multidimensional modeling
The strictness concept
– an object at a hierarchy’s lower level
– belongs to only one higher level object
– e.g. a province can only relate to one country
The completeness concept
– all members belong to one higher-class object and
– that object consists of those members only
– e.g. only the recorded provinces can form a country. In a “complete” classification hierarchy between the country and province levels, all the recorded provinces form the country, and all the provinces that form the country have been recorded
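Both properties can be phrased as simple checks over a hierarchy. The sketch below assumes a hierarchy is stored as a mapping from each lower-level member to its set of parents, and that a trusted reference list of a parent's real members is available; both representations and all names are assumptions made for illustration.

```python
def is_strict(child_to_parents: dict) -> bool:
    """Strict: every lower-level member belongs to exactly one parent."""
    return all(len(parents) == 1 for parents in child_to_parents.values())


def is_complete(recorded_members: set, reference_members: set) -> bool:
    """Complete: the recorded members are exactly the members that form the parent."""
    return recorded_members == reference_members


# Hypothetical province -> country hierarchy.
strict_hierarchy = {"P1": {"X"}, "P2": {"X"}, "P3": {"Y"}}
non_strict_hierarchy = {"P1": {"X", "Y"}, "P2": {"X"}}

# Hypothetical reference data: country X really consists of P1, P2, and P4.
recorded_for_x = {p for p, parents in strict_hierarchy.items() if "X" in parents}
reference_for_x = {"P1", "P2", "P4"}
```

In this data `strict_hierarchy` passes the strictness check, while the country X level is not complete because P4 was never recorded.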

Multidimensional modeling
Categorization of dimensions
– some attributes are normally valid for all elements within a dimension
– while others are only valid for a subset of elements
– e.g. the attributes alcohol percentage and volume would only be valid for drink products and would be null for food products
A proper multidimensional data model
– should consider attributes only when necessary,
– depending on the categorization of dimensions
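One way to picture categorization is as specialization of the dimension class, so that subset-only attributes exist only where they apply instead of being null elsewhere. The class and field names below are hypothetical; this is an illustration, not the paper's notation.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Valid for every element of the product dimension.
    name: str


@dataclass
class Drink(Product):
    # Valid only for the drink category.
    alcohol_percentage: float
    volume_ml: int


@dataclass
class Food(Product):
    # No drink-specific attributes, so nothing needs to be null here.
    pass


beer = Drink("beer", 5.0, 355)
bread = Food("bread")
```

A `Food` instance simply has no `alcohol_percentage` attribute, which mirrors the idea of considering attributes only when necessary.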

Multidimensional modeling
Recommended modeling approach
– Clearly separate the structure of a multidimensional model into
  facts
  dimensions
– Fact classes are composite classes “in a shared-aggregation relationship of n dimension classes”
  i.e. they relate instances from all dimensions
– A fact object instance is always related to object instances from all dimensions

Multidimensional modeling
Given the basics of their modeling approach
– they then go on to explain how they can annotate
  derived measures (with a “/”)
  table-specific components of the table’s primary key / object ID (“OID”)
  attributes that function as descriptors (“D”)
  constraints on additivity (between braces near the fact table)
  additivity and derivation rules (separate from the diagram)
  that a dimension is a directed acyclic graph (“DAG”)
– they also use various other UML notations
Is this perhaps a little much semantic loading?
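The DAG annotation says that the roll-up paths of a dimension may branch but must never loop back. A minimal check, assuming the hierarchy is stored as a mapping from each level to the levels it rolls up to (the level names are hypothetical), can use the standard-library `graphlib` module available from Python 3.9:

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(rolls_up_to: dict) -> bool:
    """Return True if the level graph contains no cycle."""
    try:
        # static_order() is lazy, so list() forces the cycle check to run.
        list(TopologicalSorter(rolls_up_to).static_order())
        return True
    except CycleError:
        return False


# A time dimension with two alternative roll-up paths from day.
time_dimension = {
    "day": {"week", "month"},
    "week": set(),
    "month": {"quarter"},
    "quarter": {"year"},
    "year": set(),
}

# The same levels with an erroneous edge from year back to day.
broken_dimension = dict(time_dimension, year={"day"})
```

The first mapping is a valid DAG even though day has two parents; the second is rejected because day, month, quarter, and year form a cycle.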

Multidimensional modeling
Regardless of how we model these various concepts
– it is important that they be considered
– in the design of data warehouses

Dimensional Modeling (based on Jones)

Characteristics for using Patterns
– The problem that the pattern addresses is identified, recognized, and defined from real world situations.
– A pattern provides an approach for formulating a solution to a real world problem.
– The approach must be defined with respect to the real world context from which the problem emanates.
– The approach is reusable because it has been successfully used to solve recurring real world problems.
– A pattern endures over time.

Dimensional Data Patterns (DDPs)
involve a commonly known & recognized mental model
– with the intent of increasing the practitioner's ability to understand, remember, and apply the DDPs
facilitate the identification of commonly used entities
– thereby providing a greater potential for improving design correctness with the initial model
are common across many dimensional models
– thus reusability is improved and design time may be decreased

Mental Models for DDPs
Using a story as the basis for Domain DDPs
– Who: the characters involved in the story
– What: the important entities and the ideas for those entities
– When: a particular time frame involved
– Where: the location / setting of the story
– Why: the motivation or the reasons behind the story

Domain DDPs
A high-level set of domains can then be constructed:
– temporal (when)
– location (where)
– stakeholder (who)
– action (what is done or accomplished)
– object (what)
– qualifier (why)
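As a rough illustration of how the domains might be used as a checklist, the sketch below tags the candidate dimensions of a hypothetical retail model with a domain and reports which domains have no dimension yet. The dimension names and their assignments are assumptions, not taken from the source.

```python
from enum import Enum


class Domain(Enum):
    TEMPORAL = "when"
    LOCATION = "where"
    STAKEHOLDER = "who"
    ACTION = "what is done or accomplished"
    OBJECT = "what"
    QUALIFIER = "why"


# Hypothetical candidate dimensions for a retail sales model.
candidate_dimensions = {
    "date": Domain.TEMPORAL,
    "store": Domain.LOCATION,
    "customer": Domain.STAKEHOLDER,
    "product": Domain.OBJECT,
    "promotion": Domain.QUALIFIER,
}


def uncovered_domains(assignments: dict) -> set:
    """Domains of the story that no candidate dimension addresses yet."""
    return set(Domain) - set(assignments.values())


missing = uncovered_domains(candidate_dimensions)
```

For this model only the action domain is left uncovered, which prompts the designer to ask whether an action dimension (for example, the kind of transaction) belongs in the design.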

Commonality of DDPs
The basic domains can apply to any story
Experience across stories will recognize commonalities
Individual stories may contain unique components
– however, many of these components will take on similar patterns
– despite the components having different names