Agenda – 04/02/2013
- Discuss class schedule and deliverables.
- Discuss project. Design due on 04/18.
- Discuss data mart design. Use class exercise to design a data mart.
  - Did stakeholder analysis and BI data analysis last class.
  - Presented some of the analyses.
  - Try data mart designs for class exercise.

Database designs
- Transaction database.
  - May be many; may be incomplete for BI needs; usually only internal data. ERD.
- Reconciled database (data warehouse design).
  - Integrated transaction databases.
  - Usually third normal form; designed to last over time as the single version of truth. ERD.
  - Encompasses time in the design – slowly changing dimensions.
  - May encompass external data.
- Data mart.
  - Designed to support a set of timely, urgent decisions.
  - Will not last over time; will change as the BI needs change.

Data mart considerations
- Focused on a specific subject and fairly specific decisions.
- Data is usually stored permanently in the reconciled data model and then replicated into the data mart.
- Data marts are deleted when no longer useful for decision making.

Contents of a data mart
- Internal and external data.
  - Organizations don’t usually store external data permanently in the reconciled data model. They store and use it as necessary for a given decision, and may integrate it into the data mart.
- Data set is limited.
  - Must decide what data is necessary to support decision making.
  - Think Excel pivot table format.
  - Must be usable by people who may have limited knowledge of data structures or SQL-style programming.
- Contains facts and dimensions.

Facts
- “Fact” means data related to a set of decisions.
  - A “fact” is measured by numeric values.
  - A “measure” is a numerical property of a fact.
  - A set of numeric measurements is stored in a “fact table”.
  - The measurements should be capable of being aggregated and manipulated mathematically.
  - A fact table contains measures and keys. That’s it.
- Examples of measurements stored in a fact table:
  - Sales related: sales $ of a given transaction, quantity sold, unit price, unit weight.
  - Service call: duration in time increments, satisfaction measure.
  - Manufacturing: qty items produced, qty items accepted, qty items rejected.
  - Human resources: qty people hired, qty people fired, qty people trained.
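As a rough illustration of "measures and keys, that's it", here is a minimal sketch using Python's built-in sqlite3 module. The table name `sales_fact`, its columns, and the sample rows are all made up for this example; they are not from the class exercise.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (
    date_key      INTEGER NOT NULL,  -- key pointing to a date dimension
    product_key   INTEGER NOT NULL,  -- key pointing to a product dimension
    store_key     INTEGER NOT NULL,  -- key pointing to a store dimension
    quantity_sold INTEGER NOT NULL,  -- measure
    sales_amount  REAL    NOT NULL   -- measure
);
-- Hypothetical sample rows: one row per sales transaction line.
INSERT INTO sales_fact VALUES
    (20130401, 1, 10, 2, 20.0),
    (20130401, 2, 10, 1, 5.5),
    (20130402, 1, 11, 3, 30.0);
""")

# Measures are numeric, so they can be aggregated directly.
total_qty, total_sales = conn.execute(
    "SELECT SUM(quantity_sold), SUM(sales_amount) FROM sales_fact"
).fetchone()
print(total_qty, total_sales)
```

Everything descriptive (product names, store addresses, calendar labels) is left out of the fact table on purpose; it belongs in the dimensions.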

Dimensions
- A dimension is a property of a fact.
  - Think of it as the “by” property: qty of units rejected by employee, by manufacturing process, by production line, by plant, by product, by product type, by week, by month, by year.
- Dimensions are filled with helpful data.
  - May contain long, descriptive data: stored data should be long enough and descriptive enough to be understandable. Avoid storing coded data.
  - Should be complete – no null values. Should contain understandable messages rather than null values.
  - Should be accurate – no misspellings, obsolete data, or incorrect/impossible values.
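The "by" idea can be sketched as a join plus GROUP BY. This is only an illustration under assumed names: `plant_dim`, `production_fact`, the plant names, and the placeholder text 'Region not recorded' are all hypothetical. The placeholder shows one way to store an understandable message instead of a null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plant_dim (
    plant_key  INTEGER PRIMARY KEY,
    plant_name TEXT NOT NULL,                              -- descriptive, not coded
    region     TEXT NOT NULL DEFAULT 'Region not recorded' -- message instead of NULL
);
CREATE TABLE production_fact (
    plant_key    INTEGER NOT NULL REFERENCES plant_dim(plant_key),
    qty_produced INTEGER NOT NULL,
    qty_rejected INTEGER NOT NULL
);
INSERT INTO plant_dim (plant_key, plant_name, region) VALUES
    (1, 'Reno Assembly Plant', 'West'),
    (2, 'Sparks Casting Plant', 'West');
-- Region unknown for this plant: the default message is stored.
INSERT INTO plant_dim (plant_key, plant_name) VALUES (3, 'Elko Finishing Plant');
INSERT INTO production_fact VALUES (1, 100, 4), (1, 120, 6), (2, 80, 1), (3, 50, 5);
""")

# "Qty of units rejected BY plant": join the fact to the dimension and group.
rejected_by_plant = conn.execute("""
    SELECT d.plant_name, d.region, SUM(f.qty_rejected)
    FROM production_fact AS f
    JOIN plant_dim AS d ON d.plant_key = f.plant_key
    GROUP BY d.plant_name, d.region
    ORDER BY d.plant_name
""").fetchall()
for row in rejected_by_plant:
    print(row)
```

Each additional "by" (product, week, employee) would be another dimension table joined in the same way.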

[Figure: Sample “star” basic data mart (crow’s foot ERD). Labels: fact table, dimension tables.]

[Figure: Sample basic data mart using dimensional modeling notation.]

Issues in designing a fact table
- Granularity: the level of detail.
  - Think about when a row is created in the fact table. That will be the “grain” of the fact table.
  - For example, is a row created every time a sales transaction occurs in each and every store? Or every ten minutes in each and every store? Or every ten minutes for all stores together?
- The grain must be consistent for all measures and all rows.
  - One transaction per row.
  - Must be able to aggregate data in the rows (sum, count, max, min, avg). Must be able to perform consistent mathematical operations.
- Fact table keys.
  - The example on the previous slide uses a surrogate key.
  - Frequently the key is a concatenated key composed of all the foreign keys.
- Can have a “factless” fact table (meaning measureless).
  - The fact serves as the intersection among the dimensions.
  - The measure is a count of the incidences of intersection.
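A factless fact table can be sketched as follows. This is a hypothetical example (the `attendance_fact` table and its keys are invented for illustration): the table holds only foreign keys, its primary key is the concatenation of those keys, and the "measure" is simply a count of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Grain: one row per student, per class, per day attended.
CREATE TABLE attendance_fact (
    date_key    INTEGER NOT NULL,
    student_key INTEGER NOT NULL,
    class_key   INTEGER NOT NULL,
    PRIMARY KEY (date_key, student_key, class_key)  -- concatenated key
);
INSERT INTO attendance_fact VALUES
    (20130402, 1, 100),
    (20130402, 2, 100),
    (20130402, 1, 200),
    (20130404, 2, 100);
""")

# No stored measure: counting the intersections is the measurement.
attendance_by_class = conn.execute("""
    SELECT class_key, COUNT(*)
    FROM attendance_fact
    GROUP BY class_key
    ORDER BY class_key
""").fetchall()
print(attendance_by_class)
```

Because every row is at the same grain, the count can be taken by any of the three keys without mixing levels of detail.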

Issues in designing dimension tables
- Hierarchy:
  - Dimensions usually have hierarchical relationships within the dimension.
  - There are frequently multiple 1:m relationships between data in a single dimension.
  - These are called “snowflake” (“snowflaked” or “snowflaking”) dimensions.
- Sometimes there are relationships between dimensions.
  - These are called “cross-dimensional” attributes.
  - Common cross-dimensional attributes are location (city, county, state, country) and date (day, week, month, quarter, year).
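A snowflaked dimension can be sketched by splitting one level of the hierarchy into its own table. The names below (`product_type_dim`, `product_dim`, `sales_fact`) and the sample data are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1:m hierarchy inside the product dimension: one product type, many products.
CREATE TABLE product_type_dim (
    product_type_key  INTEGER PRIMARY KEY,
    product_type_name TEXT NOT NULL
);
CREATE TABLE product_dim (
    product_key      INTEGER PRIMARY KEY,
    product_name     TEXT NOT NULL,
    product_type_key INTEGER NOT NULL
        REFERENCES product_type_dim(product_type_key)
);
CREATE TABLE sales_fact (
    product_key   INTEGER NOT NULL REFERENCES product_dim(product_key),
    quantity_sold INTEGER NOT NULL
);
INSERT INTO product_type_dim VALUES (1, 'Beverages'), (2, 'Snacks');
INSERT INTO product_dim VALUES
    (10, 'Orange juice', 1), (11, 'Cola', 1), (12, 'Pretzels', 2);
INSERT INTO sales_fact VALUES (10, 5), (11, 7), (12, 3), (10, 2);
""")

# Roll up from product to product type by walking the hierarchy.
qty_by_type = conn.execute("""
    SELECT t.product_type_name, SUM(f.quantity_sold)
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_key = f.product_key
    JOIN product_type_dim AS t ON t.product_type_key = p.product_type_key
    GROUP BY t.product_type_name
    ORDER BY t.product_type_name
""").fetchall()
print(qty_by_type)
```

The trade-off is an extra join for each level of the hierarchy; a plain star design would instead repeat the product type name on every product row.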

[Figure: Sample snowflake design using crow’s foot ERD notation.]

[Figure: Sample snowflake using dimensional modeling notation (overview).]

[Figure: Sample snowflake using dimensional modeling notation (more detail).]

Incorporating external data
- Consider grain.
  - Must align data with appropriate grain.
  - Usually does not align with fact table because grain is not congruent.
- Consider dimensional relationship.
  - Is the external data a “by”? Do you want to look at the measure in the fact table “by” something that is external?
  - Is the external data an attribute that can be contained within an existing dimension?
  - Is the external data a separate table that can be related to an existing dimension? Maybe a cross-dimensional relationship?

[Figure: Incorporating external data as a separate table.]

External data
- Most often stored as a separate table.
- Can be descriptive attributes within an existing table.
- Usually related to time or to location (or both).
- External data as measures:
  - If they fit with the grain of a fact table, then they can be measures within the same fact table as internal data.
  - If they don’t fit with the grain, then consider creating a separate fact table with external measures. Relate that fact table with the appropriate dimensions.
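The last case (external measures whose grain does not match) might look like the sketch below. All names and numbers are invented for illustration, including the unemployment rates: internal sales are stored per day and per store, the external measure arrives per month and per state, so it gets its own fact table and the two are compared only after sales are rolled up to the coarser grain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim  (date_key INTEGER PRIMARY KEY, month TEXT NOT NULL);
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, state TEXT NOT NULL);

-- Internal fact: grain is one row per store per day.
CREATE TABLE sales_fact (
    date_key     INTEGER NOT NULL REFERENCES date_dim(date_key),
    store_key    INTEGER NOT NULL REFERENCES store_dim(store_key),
    sales_amount REAL NOT NULL
);
-- External fact: grain is one row per state per month (made-up values).
CREATE TABLE unemployment_fact (
    month             TEXT NOT NULL,
    state             TEXT NOT NULL,
    unemployment_rate REAL NOT NULL,
    PRIMARY KEY (month, state)
);

INSERT INTO date_dim VALUES
    (20130401, '2013-04'), (20130402, '2013-04'), (20130501, '2013-05');
INSERT INTO store_dim VALUES (1, 'NV'), (2, 'CA');
INSERT INTO sales_fact VALUES
    (20130401, 1, 100.0), (20130402, 1, 50.0),
    (20130402, 2, 70.0),  (20130501, 1, 80.0);
INSERT INTO unemployment_fact VALUES
    ('2013-04', 'NV', 9.7), ('2013-04', 'CA', 9.0), ('2013-05', 'NV', 9.5);
""")

# Roll sales up to month and state, then line them up with the external measure.
sales_vs_external = conn.execute("""
    SELECT d.month, s.state, SUM(f.sales_amount), e.unemployment_rate
    FROM sales_fact AS f
    JOIN date_dim  AS d ON d.date_key  = f.date_key
    JOIN store_dim AS s ON s.store_key = f.store_key
    JOIN unemployment_fact AS e ON e.month = d.month AND e.state = s.state
    GROUP BY d.month, s.state, e.unemployment_rate
    ORDER BY d.month, s.state
""").fetchall()
for row in sales_vs_external:
    print(row)
```

Putting the monthly rate directly on each daily sales row would repeat it many times and make sums or averages of it misleading, which is why the separate fact table is the safer choice here.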