Data Warehouse IMS5024 – presented by Eder Tsang.

Slides:



Advertisements
Similar presentations
Dimensional Modeling.
Advertisements

CHAPTER OBJECTIVE: NORMALIZATION THE SNOWFLAKE SCHEMA.
Cognos 8 Training Session
An overview of Data Warehousing and OLAP Technology Presented By Manish Desai.
BY LECTURER/ AISHA DAWOOD DW Lab # 2. LAB EXERCISE #1 Oracle Data Warehousing Goal: Develop an application to implement defining subject area, design.
Copyright © Starsoft Inc, Data Warehouse Architecture By Slavko Stemberger.
Data Warehousing M R BRAHMAM.
Data Warehouse Architecture Sakthi Angappamudali Data Architect, The Oregon State University, Corvallis 16 th May, 2005.
Dimensional Modeling Business Intelligence Solutions.
ICS 421 Spring 2010 Data Warehousing (1) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/18/20101Lipyeow.
Dimensional Modeling CS 543 – Data Warehousing. CS Data Warehousing (Sp ) - Asim LUMS2 From Requirements to Data Models.
Introduction to Data Warehousing
Data Warehousing - 3 ISYS 650. Snowflake Schema one or more dimension tables do not join directly to the fact table but must join through other dimension.
Data Warehousing ISYS 650. What is a data warehouse? A data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data.
DATA WAREHOUSE (Muscat, Oman).
Data Warehousing DSCI 4103 Dr. Mennecke Introduction and Chapter 1.
Designing a Data Warehouse
An Overview of Data Warehousing and OLTP Technology Presenter: Parminder Jeet Kaur Discussion Lead: Kailang.
Defining Data Warehouse Concepts and Terminology.
Basic Concepts of Datawarehousing An Overview Prasanth Gurram.
Data Warehouse & Data Mining
Data Warehousing Seminar Chapter 5. Data Warehouse Design Methodology Data Warehousing Lab. HyeYoung Cho.
Data Warehousing & Business Intelligence Introduction What do you think of when you hear the words Data Warehousing ? Prithwis Mukerjee, Ph.D.
Data Warehouse Concepts Transparencies
Data Warehouse Architecture. Inmon’s Corporate Information Factory The enterprise data warehouse is not intended to be queried directly by analytic applications,
AN OVERVIEW OF DATA WAREHOUSING
Program Pelatihan Tenaga Infromasi dan Informatika Sistem Informasi Kesehatan Ari Cahyono.
Data Warehouse and Business Intelligence Dr. Minder Chen Fall 2009.
DIMENSIONAL MODELLING. Overview Clearly understand how the requirements definition determines data design Introduce dimensional modeling and contrast.
1 Data Warehouses BUAD/American University Data Warehouses.
2 Copyright © Oracle Corporation, All rights reserved. Defining Data Warehouse Concepts and Terminology.
Data Warehouse Design Xintao Wu University of North Carolina at Charlotte Nov 10, 2008.
The Data Warehouse “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of “all” an organisation’s data in support.
Data Warehousing.
1 Reviewing Data Warehouse Basics. Lessons 1.Reviewing Data Warehouse Basics 2.Defining the Business and Logical Models 3.Creating the Dimensional Model.
CISB594 – Business Intelligence
October 28, Data Warehouse Architecture Data Sources Operational DBs other sources Analysis Query Reports Data mining Front-End Tools OLAP Engine.
Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,
Ch3 Data Warehouse Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
UNIT-II Principles of dimensional modeling
Chapter 5 DATA WAREHOUSING Study Sections 5.2, 5.3, 5.5, Pages: & Snowflake schema.
Business Intelligence Transparencies 1. ©Pearson Education 2009 Objectives What business intelligence (BI) represents. The technologies associated with.
June 08, 2011 How to design a DATA WAREHOUSE Linh Nguyen (Elly)
CISB594 – Business Intelligence Data Warehousing Part I.
Two-Tier DW Architecture. Three-Tier DW Architecture.
Advanced Database Concepts
Data Warehousing 4 Definition of Data Warehouse 4 Architecture of Data Warehouse 4 Different Data Warehousing Tools 4 Summary.
Copyright© 2014, Sira Yongchareon Department of Computing, Faculty of Creative Industries and Business Lecturer : Dr. Sira Yongchareon ISCG 6425 Data Warehousing.
Acct 6910 Building Business Intelligence Systems An Introduction to Data Warehouse.
Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 5: Data Warehousing.
Data Warehouse – Your Key to Success. Data Warehouse A data warehouse is a  subject-oriented  Integrated  Time-variant  Non-volatile  Restructure.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 9: DATA WAREHOUSING.
2 Copyright © 2006, Oracle. All rights reserved. Defining Data Warehouse Concepts and Terminology.
1 Data Warehousing Data Warehousing. 2 Objectives Definition of terms Definition of terms Reasons for information gap between information needs and availability.
Data Warehouse/Data Mart It’s all about the data.
Data Mining and Data Warehousing: Concepts and Techniques What is a Data Warehouse? Data Warehouse vs. other systems, OLTP vs. OLAP Conceptual Modeling.
Advanced Applied IT for Business 2
Data warehouse.
Decision Support System by Simulation Model (Ajarn Chat Chuchuen)
Data warehouse and OLAP
Chapter 13 The Data Warehouse
Data Warehouse.
Data Warehouse and OLAP
Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009
Data Warehousing Data Model –Part 1
Introduction of Week 9 Return assignment 5-2
Data Warehousing Concepts
Data Warehouse and OLAP
Data Warehouse and OLAP Technology
Presentation transcript:

Data Warehouse IMS5024 – presented by Eder Tsang

Data Warehouse  A data warehouse is a system consisting of processes and databases used to provide the “data infrastructure” for EIS and DSS  “… a subject-oriented, integrated, timevariant, and non- volatile collection of data in support of management’s decisions” Inmon and Hackathorn (1994 )

Data warehouse - subject oriented  The data warehouse is organised by “data subjects” that are relevant to the organisation. – Customer, claim, shipment, product  This may be contrasted with the process orientation of many OLTP systems

Data warehouse - integrated  Data in the warehouse is structured based on a corporate- wide model, spanning the functional boundaries of legacy systems  This includes naming standards, units of measurement and periodicity

Data warehouse - time variant  Data is the data warehouse is characterised by the time- series nature of historical data  The data consists of a series of “snapshots” which are time-stamped and record values at a moment in time  This supports trend analysis of the data

Data warehouse - non volatile  The data warehouse is not continuously updated (inserts, eletes and changes) like data in an OLTP system  Data in a data warehouse is periodically up-loaded at a scheduled time intervals (say daily)

Motivations for data warehousing  Demands on OLTP data bases for query processing would be too great  Data warehousing is designed for efficient retrieval  Data in legacy systems is frequently inconsistent, of poor quality and stored in different formats  Reduce costs in providing data for decision making

Motivations for data warehousing  Support for focus on complete business processes (BPR)  Support for new initiatives – CRM, Balanced Scorecard  Industry sources quote ROI’s averaging 401% over 3 years  Remain competitive

Typical Data Warehouse Architecture

An Actual Data Warehouse

Data warehouse development  Requirements identification  Logical design, data modelling  Data extract, transform and load (ETL)  Warehouse architecture, technology and tools  Physical database design  Delivery systems  Operational policies

Designing a data warehouse – data design There are two main approaches to data modelling or data warehouse design – entity relationship modelling and normalisation – dimensional modelling

The design of databases using a traditional E-R approach  Entities and relationships  Normalisation 3NF

Entity relationship schema

Why do we normalise data?  Normalisation is a process for converting complex data structures into simple, stable data structures  Normalisation protects integrity of database by avoiding anomalies (update, delete, create) Normalised data models are: robust and stable have minimum redundancy

Dimensional Modeling (star schema)

Components of dimensional model: –Fact Tables : contain measurements of business eg. Sales, purchase order, shipment – Dimension Tables : store the descriptions of the dimensions of the business eg. Product, customer, vendor, store

Dimensional Modeling (star schema)  Each dimension table has a single primary key that corresponds exactly to one of the components of the multipart key in the fact table.  A fact table always expresses a many to many relationship (the key is composed of foreign keys  The most useful facts in a fact table are numeric and additive ( typically values are added up)

Snowflake schema  Snowflake schema –all the tables are normalised  Star schemas are preferable to snowflake – fewer joins for information retrieval

Dimensional Modelling vs E-R modelling  the purpose of dimensional modelling structure data for easy and efficient analysis  E-R modelling creates a single required to support organisation’s Whereas  DM creates individual models for business/decision interest eg. model for sales info model for Inventory info

Entity relationship schema (3NF)

Corresponding Star schema

Corresponding snowflake schema

Dimensional Modelling vs E-R modelling (Con’t)  OLTP and DW have different purpose: – operational vs informational  Normalisation protects integrity of database by avoiding anomalies (update, delete, create)  Data models for data warehouse do not have to be normalised – In contrast, data in DW does not change often – periodic additions of new data

DM vs. E-R modeling debate (Kimball’s view)  OLTP systems are volatile – high rates of update transactions  In normalised models the goal is to reduce data redundancy and prevent update anomalies  Data in a data warehouse does not need to be normalised because it is periodically refreshed not updated by user transactions