IST722 Data Warehousing An Introduction to Data Warehousing Michael A. Fudge, Jr.

What is the most important asset of any organization?

Answer: DATA. Why?

Without data: Do you know your customers? Understand their needs? Can you figure out what products to put on sale? Which ones to discontinue? Do you know your expenses? Your Profitability?

NOPE

This reminds me of a story…

The Informational Needs of an Organization…

Each level of an organization has different informational needs and requirements. Organizational Hierarchy: Non-Management, Operational Management, Tactical Management, Strategic Management. Examples: “Do you want fries with that?” “How many fries did I sell this week?” “Demand for fries in our China locations is up 200%.” “Customers who purchase fries are also likely to buy milkshakes.”

Data like this goes into a… The Technology Behind It All…

Starts with the Transactional Database (a.k.a. Operational Database). Stored in a relational database or files. Highly normalized (data stored as efficiently as possible, lots of tables). Optimized for processing speed and handling the “now”. Designed for capturing data, not for reporting on it. Designed to support the operational needs of the org.

Transactional Databases Are Complex. Adventure Works, a fictitious bicycle manufacturer: 72 tables. Blackboard Learning Management System: 592 tables. SU’s Oracle PeopleSoft ERP implementation: 40,000+ tables.

Example: A Query of “iSchool Students”. Students in the current term with GPA, demographics, major, minor, program of study, etc. Either enrolled in one of our programs or taking one of our courses.

Issues Reporting with Transactional Databases. Difficult, time-consuming and error-prone: many joins and sub-selects, due to the vast number of tables. How do you know your query is correct? Resource-intensive: the database is not optimized for this purpose, and multi-table joins are RAM and CPU hogs. Impossible: transactional systems are flushed or archived frequently to maintain performance, and you can’t query data you no longer have.
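To make the "many joins" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables are a tiny, Northwind-like stand-in with made-up rows (table and column names are assumptions for illustration, not the actual systems named in the slides). Even a simple question such as "sales by category and country" already needs four joins across five normalized tables.

```python
import sqlite3

# Tiny normalized (OLTP-style) schema with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, CategoryID INTEGER);
CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, Country TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT, OrderDate TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, UnitPrice REAL, Quantity INTEGER);
INSERT INTO Categories VALUES (1, 'Beverages');
INSERT INTO Products VALUES (1, 'Chai', 1), (2, 'Chang', 1);
INSERT INTO Customers VALUES ('ALFKI', 'Germany'), ('ANATR', 'Mexico');
INSERT INTO Orders VALUES (10248, 'ALFKI', '1996-07-04'), (10249, 'ANATR', '1996-07-05');
INSERT INTO OrderDetails VALUES (10248, 1, 18.0, 2), (10248, 2, 19.0, 1), (10249, 1, 18.0, 5);
""")

# One reporting question, four joins.
REPORT = """
SELECT c.CategoryName, cu.Country, SUM(od.UnitPrice * od.Quantity) AS Sales
FROM OrderDetails AS od
JOIN Orders AS o ON o.OrderID = od.OrderID
JOIN Products AS p ON p.ProductID = od.ProductID
JOIN Categories AS c ON c.CategoryID = p.CategoryID
JOIN Customers AS cu ON cu.CustomerID = o.CustomerID
GROUP BY c.CategoryName, cu.Country
ORDER BY cu.Country
"""

rows = conn.execute(REPORT).fetchall()
join_count = REPORT.upper().count("JOIN ")
for row in rows:
    print(row)
print("joins needed:", join_count)
```

With tens or hundreds of tables, the join list grows accordingly, which is where the correctness and resource concerns come from.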

Solution? The Data Warehouse. Designed to support an organization’s informational needs. Data is restructured to be conducive to reporting and analytic applications. Transactional databases are data sources for the Data Warehouse. Data grows over time; existing data in the warehouse very seldom changes.

Characteristics of the Data Warehouse. Time-Variant: flow of data through time; projected data. Non-Volatile: data never removed; always growing; copy of source data. Integrated: centralized; holds data retrieved from the entire organization. Subject-Oriented: optimized to give answers to diverse questions; used by all functional areas.

ETL: For Populating the Data Warehouse (diagram labels: Payroll, Sales, Purchasing)

The Data Mart. Single-subject subset of the data warehouse. Provides decision support to a small group. Addresses local or departmental needs.

The Evolution of the DW (diagram labels: Business Intelligence, Improved Decision Making, Data Warehouse)

Business Intelligence: Analytical and decision-support capabilities of the data warehouse. The “Glitz and Glam” of data warehousing.

Data Warehouse or Business Intelligence? Is the data warehouse a component of business intelligence? Or is business intelligence a component of the data warehouse?

But how does this work? Here’s a hyper-abridged example…

#1: We Have the Northwind OLTP Database. Insufficient reporting capabilities. Can only report “in the now”. Complex queries to get questions answered.

#2: Identify business process to model. Business Process & Grain: Orders (products sold to customers over time, by sale). One row per product order (product on the order). Dimensions: Products, Employees (Sales), Time (Order Date), Customer. Facts: Order Quantity, Order Amount. This represents our Data Mart in the DW.

#3: Create Northwind Orders Star Schema. Build the data mart in the data warehouse. Fact table + outer dimensions. No data (yet). Fields are based on what’s available in the source data.
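A rough sketch of what such an empty star schema could look like, using Python's built-in sqlite3 module. The table names follow the source-to-target map slide (OrderFact, TimeDim, EmployeeDim, CustomerDim, ProductDim); the column names are assumptions for illustration, since the slides do not list the actual fields.

```python
import sqlite3

# Hypothetical columns: the slides name the tables, not their fields.
DDL = """
CREATE TABLE ProductDim  (ProductKey  INTEGER PRIMARY KEY, ProductName  TEXT);
CREATE TABLE CustomerDim (CustomerKey INTEGER PRIMARY KEY, CompanyName  TEXT);
CREATE TABLE EmployeeDim (EmployeeKey INTEGER PRIMARY KEY, EmployeeName TEXT);
CREATE TABLE TimeDim     (TimeKey     INTEGER PRIMARY KEY, OrderDate    TEXT);

-- Fact table in the middle: one foreign key per dimension, plus the two facts.
CREATE TABLE OrderFact (
    ProductKey    INTEGER REFERENCES ProductDim(ProductKey),
    CustomerKey   INTEGER REFERENCES CustomerDim(CustomerKey),
    EmployeeKey   INTEGER REFERENCES EmployeeDim(EmployeeKey),
    TimeKey       INTEGER REFERENCES TimeDim(TimeKey),
    OrderQuantity INTEGER,
    OrderAmount   REAL
);
"""

def create_star_schema(conn):
    """Create the empty fact and dimension tables."""
    conn.executescript(DDL)

conn = sqlite3.connect(":memory:")
create_star_schema(conn)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
fact_rows = conn.execute("SELECT COUNT(*) FROM OrderFact").fetchone()[0]
print(tables)
print("rows in OrderFact:", fact_rows)
```

The schema exists but holds no rows at this point; loading is a separate step.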

#4: Create Northwind Source to Target Map. How does the OLTP align with OLAP? Helps us define the ETL process. Fact Table: OrderFact. Dimensions: TimeDim, EmployeeDim, CustomerDim, ProductDim.
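One simple way to picture a source-to-target map is as a lookup from each target column to the source table and expression that feeds it. The sketch below is illustrative only: the target table names come from the slide, while the target columns and the Northwind source columns shown are assumptions, and `Source` and `source_tables` are hypothetical helpers.

```python
from typing import NamedTuple

class Source(NamedTuple):
    """Where a target column comes from in the OLTP database."""
    table: str
    expression: str

# Hypothetical map: target table -> target column -> source.
SOURCE_TO_TARGET = {
    "ProductDim": {
        "ProductKey": Source("Products", "ProductID"),
        "ProductName": Source("Products", "ProductName"),
    },
    "CustomerDim": {
        "CustomerKey": Source("Customers", "CustomerID"),
        "CompanyName": Source("Customers", "CompanyName"),
    },
    "EmployeeDim": {
        "EmployeeKey": Source("Employees", "EmployeeID"),
        "EmployeeName": Source("Employees", "FirstName || ' ' || LastName"),
    },
    "TimeDim": {
        "OrderDate": Source("Orders", "OrderDate"),
    },
    "OrderFact": {
        "OrderQuantity": Source("Order Details", "Quantity"),
        "OrderAmount": Source("Order Details", "UnitPrice * Quantity"),
    },
}

def source_tables(mapping):
    """List the source tables each target table reads from."""
    return {
        target: sorted({src.table for src in columns.values()})
        for target, columns in mapping.items()
    }

lineage = source_tables(SOURCE_TO_TARGET)
for target, sources in lineage.items():
    print(f"{target} <- {', '.join(sources)}")
```

Writing the map down first makes the ETL step mostly mechanical: each entry becomes one column in a load query.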

#5: Populate targets with ETL. Dimensions before facts. Need a strategy to handle changes to data. Tooling exists to assist with the process. (Diagram labels: Products Source, ProductsDim Data)
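A minimal sketch of the "dimensions before facts" ordering, using Python's built-in sqlite3 module with made-up source rows. Table and column names are assumptions, and only one dimension is shown. The dimension is loaded first so that the fact load can look up each row's dimension key. This sketch only inserts products it has not seen before; it does not address how to handle changed source rows, which the slide flags as needing its own strategy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Tiny stand-in for the OLTP source (hypothetical rows).
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, UnitPrice REAL, Quantity INTEGER);
INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang');
INSERT INTO OrderDetails VALUES (10248, 1, 18.0, 2), (10248, 2, 19.0, 1), (10249, 1, 18.0, 5);

-- Targets in the warehouse.
CREATE TABLE ProductDim (
    ProductKey  INTEGER PRIMARY KEY AUTOINCREMENT,  -- warehouse-assigned key
    ProductID   INTEGER UNIQUE,                     -- key from the source system
    ProductName TEXT
);
CREATE TABLE OrderFact (
    ProductKey    INTEGER REFERENCES ProductDim(ProductKey),
    OrderID       INTEGER,
    OrderQuantity INTEGER,
    OrderAmount   REAL
);
""")

def load_product_dim(conn):
    """Step 1: add products not yet present in the dimension."""
    conn.execute("""
        INSERT INTO ProductDim (ProductID, ProductName)
        SELECT ProductID, ProductName FROM Products
        WHERE ProductID NOT IN (SELECT ProductID FROM ProductDim)
    """)

def load_order_fact(conn):
    """Step 2: one fact row per product on an order, keyed via the dimension."""
    conn.execute("""
        INSERT INTO OrderFact (ProductKey, OrderID, OrderQuantity, OrderAmount)
        SELECT d.ProductKey, s.OrderID, s.Quantity, s.UnitPrice * s.Quantity
        FROM OrderDetails AS s
        JOIN ProductDim AS d ON d.ProductID = s.ProductID
    """)

load_product_dim(conn)   # dimensions first
load_order_fact(conn)    # then facts

dim_count = conn.execute("SELECT COUNT(*) FROM ProductDim").fetchone()[0]
fact_count = conn.execute("SELECT COUNT(*) FROM OrderFact").fetchone()[0]
total_amount = conn.execute("SELECT SUM(OrderAmount) FROM OrderFact").fetchone()[0]
print(dim_count, fact_count, total_amount)
```

If the fact load ran first, the join against an empty ProductDim would produce no fact rows, which is why the order matters.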

#6: Visualize with a BI Tool. You can easily query star schemas in SQL or, better yet, use a BI tool like Excel or Tableau.
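To illustrate why star schemas are easy to query in SQL, here is a small sketch using Python's built-in sqlite3 module. The table names echo the Northwind example, but the columns and rows are assumptions made up for illustration. The pattern is always the same: join the fact table to the dimensions you care about, group by dimension attributes, and aggregate the facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, already-populated star schema (two dimensions shown).
CREATE TABLE ProductDim (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE TimeDim (TimeKey INTEGER PRIMARY KEY, OrderYear INTEGER);
CREATE TABLE OrderFact (ProductKey INTEGER, TimeKey INTEGER, OrderQuantity INTEGER, OrderAmount REAL);
INSERT INTO ProductDim VALUES (1, 'Chai'), (2, 'Chang');
INSERT INTO TimeDim VALUES (1, 1996), (2, 1997);
INSERT INTO OrderFact VALUES (1, 1, 2, 36.0), (2, 1, 1, 19.0), (1, 2, 5, 90.0);
""")

# Fact joined to dimensions, grouped by dimension attributes.
QUERY = """
SELECT p.ProductName, t.OrderYear,
       SUM(f.OrderQuantity) AS Quantity,
       SUM(f.OrderAmount)   AS Amount
FROM OrderFact AS f
JOIN ProductDim AS p ON p.ProductKey = f.ProductKey
JOIN TimeDim AS t ON t.TimeKey = f.TimeKey
GROUP BY p.ProductName, t.OrderYear
ORDER BY p.ProductName, t.OrderYear
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

A BI tool generates essentially this kind of query when you drag a dimension attribute onto rows or columns and a fact onto values.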

Demo: Visualizing Adventure Works Internet Orders with Excel

The Fathers of Data Warehousing: W.H. Inmon and Ralph Kimball
The “Father” of… Inmon: Data Warehousing. Kimball: Business Intelligence.
Million Dollar Idea: Inmon: “Corporate Information Factory”. Kimball: “Kimball Lifecycle”.
“Data Warehouse” Definition: Inmon: Strict. Subject-oriented summarized data. Kimball: Loose. Any queryable data.
Approach (how is the data warehouse built?): Inmon: As a whole, over time (Waterfall, Top-down). Kimball: In parts, by business process (Iterative, Bottom-up).

Your Textbooks. “What”: Inmon. “How To”: Kimball. We’ll use the Inmon definitions and apply the Kimball approach.

Inmon’s Corporate Information Factory A reference architecture for an “Information Ecosystem”

The Kimball Lifecycle

This Course is About: 1. Understand the CIF/DW/BI components 2. Requirements Gathering / Analysis 3. Dimensional Modeling and Design 4. Physical Design 5. ETL: Moving Data Around 6. Business Intelligence 7. Technical Architecture, Data Governance, Master Data Management

The Informational Needs of an Organization, In Summary… Organizational Hierarchy: Non-Management, Operational Management, Tactical Management, Strategic Management. Operational Data in Transactional Databases. Decision-Support Data in the Data Warehouse.

Relational Philosophies, In Summary… OLTP: Highly normalized. One or more tables per business entity. Supports the operational needs of the organization. Lots of tables. OLAP: Denormalized. Just star schemas (dimension and fact tables). Supports the analytical needs of the organization. Data mart in the data warehouse.

In Summary… Data is an organization’s most important asset. The transactional systems we use to collect and manage data are not suitable for analysis and reporting. The data warehouse is a subject-oriented, time-variant, non-volatile collection of operational data. The data mart supports the decision-support needs of a group or department within the organization. Business intelligence is the use of information to improve decision making. Inmon’s Corporate Information Factory is a model for business intelligence. The Kimball Lifecycle is a methodology for creating data warehousing solutions.
