Data Warehousing - 2 ISYS 650. Data Warehouse Design - Star Schema - Dimension tables – contain descriptions about the subjects of the business such as.

Data Warehousing - 2 ISYS 650

Data Warehouse Design - Star Schema
- Dimension tables contain descriptions of the subjects of the business, such as customers, employees, locations, products, and time periods.
- The fact table contains detailed business data with links to the dimension tables.

Star schema example
- The fact table provides sales statistics broken down by the product, period, and store dimensions.
- Dimension tables contain descriptions of the subjects of the business.
Note: What is the key of the fact table?

Star schema with sample data

On-Line Analytical Processing (OLAP) Tools
A set of graphical tools that gives users multidimensional views of their data and lets them analyze the data using simple windowing techniques.
OLAP Operations:
- Cube slicing: come up with a 2-D view of the data
- Drill-down: go from summary to more detailed views
- Roll-up: the opposite direction of drill-down
- Reaggregation: rearrange the order of dimensions
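The operations above can be sketched on a tiny in-memory cube. This is only an illustration: the sample data, the dimension names, and the helper functions `slice_cube` and `roll_up` are hypothetical and do not come from the slides or from any OLAP product.

```python
from collections import defaultdict

# Hypothetical sample cube: (product, period, store) -> units sold.
DIMS = ("product", "period", "store")
CUBE = {
    ("Shoes", "2003Q1", "S1"): 10,
    ("Shoes", "2003Q2", "S1"): 5,
    ("Shoes", "2003Q1", "S2"): 7,
    ("Hats", "2003Q1", "S1"): 3,
    ("Hats", "2003Q2", "S2"): 4,
}


def slice_cube(cube, dim, value):
    """Fix one dimension to a single value, leaving a 2-D view of the others."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def roll_up(cube, keep):
    """Keep only the dimensions listed in `keep` and sum over the rest."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in idx)] += value
    return dict(totals)


store_s1 = slice_cube(CUBE, "store", "S1")             # slice: one store only
by_product = roll_up(CUBE, ["product"])                # roll-up: summary view
by_product_period = roll_up(CUBE, ["product", "period"])  # drill-down: add detail
```

Drill-down and roll-up are the same aggregation run with more or fewer dimensions kept, which is why a single helper serves both here.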

Slicing a data cube

Example of drill-down
- Summary report
- Drill-down with color added
Starting with summary data, users can obtain details for particular cells.

Excel's Pivot Table
Insert/Pivot Table or Pivot Chart:
- Drill down, roll up, and reaggregation
- Pivot: change the dimensional orientation of a report or an ad hoc query-page display
- Filter
Pivot Chart:
- Filter
- Drill down, roll up, and reaggregation

Data Warehouse Lifecycle
- Requirement gathering: determine the reports that the DW is supposed to support.
- Identify data sources and model the data based on user requirements.
- Extract data and populate the staging area with the data extracted from transactional sources.
- Build and populate a dimensional database.
- Build Extraction, Transformation, and Loading (ETL) routines to populate the dimensional database regularly.
- Build reports and analytical views.
- Maintain the warehouse by adding/changing supported features and reports.

Example: Transaction Database
- Customer: CID, Cname, City, Rating
- Order: OID, ODate, SalesPerson
- Product: PID, Pname, Price
- Customer (1) Has (M) Order
- Order (M) to (M) Product, with Qty recorded on the relationship

Analyze Sales Data: Detailed Business Data
Total sales:
- By product: Sum(Qty*Price), where Qty*Price is computed for each detail line
- Detailed business data: Qty*Price
Total quantity sold:
- By product: Sum(Qty)
- Detailed business data: Qty
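As a rough sketch, the two totals can be computed with one grouped query. The example uses Python's built-in sqlite3 module rather than the tools named in the slides. The `OrderDetail` table (the order/product relationship carrying Qty) and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PID TEXT PRIMARY KEY, Pname TEXT, Price REAL);
-- Hypothetical detail-line table: one row per product on an order.
CREATE TABLE OrderDetail (OID TEXT, PID TEXT, Qty INTEGER);
INSERT INTO Product VALUES ('P1', 'Pen', 2.0), ('P2', 'Book', 10.0);
INSERT INTO OrderDetail VALUES ('O1', 'P1', 3), ('O1', 'P2', 1), ('O2', 'P1', 5);
""")

# Qty*Price is the detailed business data; SUM aggregates it per product.
sales_by_product = conn.execute("""
SELECT p.PID,
       SUM(d.Qty * p.Price) AS TotalSales,
       SUM(d.Qty)           AS TotalQty
FROM OrderDetail d
JOIN Product p ON p.PID = d.PID
GROUP BY p.PID
ORDER BY p.PID
""").fetchall()

print(sales_by_product)
```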

Dimensions for Data Analysis: Factors relevant to the business data
- Analyze sales by Product
- Analyze sales related to Customer:
  - Location: Sales by City
  - Customer type: Sales by Rating
- Analyze sales related to Time:
  - Quarterly, monthly, yearly sales
- Analyze sales related to Employee:
  - Sales by SalesPerson

Data Warehouse Design - Star Schema
- Dimension tables contain descriptions of the subjects of the business, such as customers, employees, locations, products, and time periods.
- The fact table contains detailed business data with links to the dimension tables.

Star Schema
- FactTable: LocationCode, PeriodCode, Rating, PID, Qty, Amount
- Location Dimension: LocationCode, State, City
- CustomerRating Dimension: Rating, Description
- Product Dimension: PID, Pname, Category
- Period Dimension: PeriodCode, Year, Quarter
Can group by State, City
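A minimal sketch of this schema, again using Python's sqlite3 module as a stand-in for the actual warehouse. The column types, the sample rows, and the composite primary key on the fact table are assumptions; the key shown is one common choice (the combination of the dimension keys), not something the slides state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (LocationCode TEXT PRIMARY KEY, State TEXT, City TEXT);
CREATE TABLE CustomerRating (Rating TEXT PRIMARY KEY, Description TEXT);
CREATE TABLE Product (PID TEXT PRIMARY KEY, Pname TEXT, Category TEXT);
CREATE TABLE Period (PeriodCode TEXT PRIMARY KEY, Year INTEGER, Quarter INTEGER);
CREATE TABLE FactTable (
    LocationCode TEXT REFERENCES Location(LocationCode),
    PeriodCode   TEXT REFERENCES Period(PeriodCode),
    Rating       TEXT REFERENCES CustomerRating(Rating),
    PID          TEXT REFERENCES Product(PID),
    Qty          INTEGER,
    Amount       REAL,
    -- Assumed key: one fact row per combination of dimension values.
    PRIMARY KEY (LocationCode, PeriodCode, Rating, PID)
);
-- Hypothetical sample rows.
INSERT INTO Location VALUES ('L1', 'California', 'San Francisco'),
                            ('L2', 'California', 'Los Angeles');
INSERT INTO FactTable VALUES ('L1', '20031', 'A', 'P1', 3, 6.0),
                             ('L1', '20034', 'A', 'P2', 1, 10.0),
                             ('L2', '20031', 'B', 'P1', 5, 10.0);
""")

# Group by State, City: join the fact table to the Location dimension.
by_city = conn.execute("""
SELECT l.State, l.City, SUM(f.Amount)
FROM FactTable f JOIN Location l ON l.LocationCode = f.LocationCode
GROUP BY l.State, l.City
ORDER BY l.City
""").fetchall()

# Rolling up to State only needs a shorter GROUP BY.
by_state = conn.execute("""
SELECT l.State, SUM(f.Amount)
FROM FactTable f JOIN Location l ON l.LocationCode = f.LocationCode
GROUP BY l.State
""").fetchall()
```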

Define Location Dimension
Location:
- In the transaction database: City
- In the data warehouse we define Location to be State, City:
  San Francisco -> California, San Francisco
  Los Angeles -> California, Los Angeles
- Define Location Code:
  California, San Francisco -> L1
  California, Los Angeles -> L2
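One way to picture this step is as a lookup that assigns a code to each distinct (State, City) pair. In the sketch below, the `CITY_TO_STATE` table and the `build_location_dimension` helper are hypothetical. The slides do not say how the state is obtained for a city, so a hand-maintained lookup is assumed.

```python
# Hypothetical lookup; in practice the state would come from reference data.
CITY_TO_STATE = {
    "San Francisco": "California",
    "Los Angeles": "California",
}


def build_location_dimension(cities):
    """Return {(state, city): code}, assigning L1, L2, ... in first-seen order."""
    dimension = {}
    for city in cities:
        key = (CITY_TO_STATE[city], city)
        if key not in dimension:
            dimension[key] = f"L{len(dimension) + 1}"
    return dimension


# Cities as they appear in the transaction data (duplicates are expected).
location_dim = build_location_dimension(
    ["San Francisco", "Los Angeles", "San Francisco"]
)

# Reverse lookup used when transforming fact rows: City -> LocationCode.
city_to_code = {city: code for (_state, city), code in location_dim.items()}
```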

Define Period Dimension
Period:
- In the transaction database: Odate
- In the data warehouse we define Period to be Year, Quarter:
  Odate 11/2/2003 -> 2003, 4
  Odate 2/28/2003 -> 2003, 1
- Define Period Code:
  2003, 4 -> 20034
  2003, 1 -> 20031
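The date-to-period mapping is simple arithmetic on the month. The sketch below assumes order dates arrive as M/D/YYYY text, as in the examples, and that the period code is the year followed by the quarter digit. The function names `period_of` and `period_code` are illustrative.

```python
from datetime import datetime


def period_of(odate):
    """Return (year, quarter) for an order date written as M/D/YYYY."""
    parsed = datetime.strptime(odate, "%m/%d/%Y")
    quarter = (parsed.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, ...
    return parsed.year, quarter


def period_code(odate):
    """Concatenate year and quarter into a code such as '20031'."""
    year, quarter = period_of(odate)
    return f"{year}{quarter}"


print(period_of("11/2/2003"), period_code("11/2/2003"))
print(period_of("2/28/2003"), period_code("2/28/2003"))
```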

The ETL Process
- Capture/Extract
- Transform: scrub (data cleansing) and derive
  Example: City -> LocationCode, State, City
  OrderDate -> PeriodCode, Year, Quarter
- Load and Index

From SalesDB to MyDataWarehouse
Extract data from SalesDB:
- Create a query, FactData, to get the fact data
- Download to MyDataWarehouse
Transform:
- Transform City to Location
- Transform Odate to Period
- Query: FactDataScrubing
Load data to FactTable
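The extraction query might look roughly like the join below, run here against an in-memory SQLite database through Python's sqlite3 module. The slides do not show the actual FactData query, so the join, the table names `Orders` and `OrderDetail`, the foreign-key columns, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for SalesDB.
CREATE TABLE Customer (CID TEXT PRIMARY KEY, Cname TEXT, City TEXT, Rating TEXT);
CREATE TABLE Orders (OID TEXT PRIMARY KEY, ODate TEXT, CID TEXT, SalesPerson TEXT);
CREATE TABLE Product (PID TEXT PRIMARY KEY, Pname TEXT, Price REAL);
CREATE TABLE OrderDetail (OID TEXT, PID TEXT, Qty INTEGER);
INSERT INTO Customer VALUES ('C1', 'Ann', 'San Francisco', 'A'),
                            ('C2', 'Bob', 'Los Angeles', 'B');
INSERT INTO Orders VALUES ('O1', '11/2/2003', 'C1', 'Peter'),
                          ('O2', '2/28/2003', 'C2', 'Mary');
INSERT INTO Product VALUES ('P1', 'Pen', 2.0), ('P2', 'Book', 10.0);
INSERT INTO OrderDetail VALUES ('O1', 'P1', 3), ('O1', 'P2', 1), ('O2', 'P1', 5);
""")

# Flatten the transaction tables into one row per detail line. City and ODate
# are still raw here; the transform step maps them to Location and Period.
fact_data = conn.execute("""
SELECT c.City, o.ODate, c.Rating, d.PID, d.Qty, d.Qty * p.Price AS Amount
FROM Orders o
JOIN Customer c    ON c.CID = o.CID
JOIN OrderDetail d ON d.OID = o.OID
JOIN Product p     ON p.PID = d.PID
ORDER BY o.OID, d.PID
""").fetchall()

for row in fact_data:
    print(row)
```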

Performing Analysis
Analyze sales:
- By Location
- By Location and Customer Type
- By Location and Period
- By Period and Product
Pivot Table:
- Drill down, roll up, reaggregation
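A pivot table is essentially a cross-tabulation of the fact rows: one dimension on the rows, another on the columns, and a summed measure in each cell. The sketch below shows the idea in plain Python. The fact rows and the `pivot` helper are hypothetical and are not how Excel implements it.

```python
from collections import defaultdict

# Hypothetical fact rows already joined to Location: (State, City, PeriodCode, Amount).
FACTS = [
    ("California", "San Francisco", "20031", 100.0),
    ("California", "San Francisco", "20034", 50.0),
    ("California", "Los Angeles", "20031", 70.0),
    ("California", "Los Angeles", "20031", 30.0),
]


def pivot(rows, row_key, col_key, value):
    """Cross-tabulate: table[row][col] = sum of value over the matching rows."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row_key(row)][col_key(row)] += value(row)
    return {r: dict(cols) for r, cols in table.items()}


# Sales by Location (City) and Period.
by_city_period = pivot(FACTS, lambda r: r[1], lambda r: r[2], lambda r: r[3])

# Roll up the row dimension from City to State.
by_state_period = pivot(FACTS, lambda r: r[0], lambda r: r[2], lambda r: r[3])

# Reaggregation: swap rows and columns to view Period by City instead.
by_period_city = pivot(FACTS, lambda r: r[2], lambda r: r[1], lambda r: r[3])
```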

HR Database
Historical data:
- Job_History: each record in this table tracks the start date and end date of an employee's time in a job in a department.

We may study:
- The average number of days an employee stays in an assigned job.
- The average number of days employees stay in a specific job_id.
- Whether departments differ in how long employees stay in a job.
- Whether the starting year affects how long employees stay in a job.
Basic measurement:
- DaysOnJob: End_Date - Start_Date
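The measurement and the averages above can be sketched as follows. The Job_History rows are invented sample data, and the `days_on_job` and `average_days_by` helpers are illustrative names rather than anything defined in the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (employee_id, start_date, end_date, job_id, department_id).
JOB_HISTORY = [
    (101, date(2001, 1, 1), date(2001, 1, 31), "AC_ACCOUNT", 110),
    (102, date(2002, 3, 1), date(2002, 3, 11), "IT_PROG", 60),
    (200, date(2002, 6, 1), date(2002, 6, 21), "IT_PROG", 60),
]


def days_on_job(start, end):
    """DaysOnJob = End_Date - Start_Date, in whole days."""
    return (end - start).days


def average_days_by(rows, key):
    """Average DaysOnJob for each group produced by `key(row)`."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(days_on_job(row[1], row[2]))
    return {k: sum(v) / len(v) for k, v in groups.items()}


avg_by_job = average_days_by(JOB_HISTORY, lambda r: r[3])
avg_by_department = average_days_by(JOB_HISTORY, lambda r: r[4])
avg_by_start_year = average_days_by(JOB_HISTORY, lambda r: r[1].year)
```

Each question on the slide corresponds to a different grouping key applied to the same measure, which is what the dimensions of the star schema provide.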

Star Schema
- FactTable: Employee_ID, StartedYear, Job_ID, Department_ID, City, DaysOnJob
- City Dimension: City, Country_Name
- Employee Dimension: Employee_ID, FullName
- Department Dimension: Department_ID, Department_Name
- StartYear Dimension: StartedYear

Define Dimensions
- Employee dimension: Employee_ID, FullName
  FullName = First_Name || ' ' || Last_Name
- Job dimension: Job_ID, Job_Title
- City dimension: City, Country_Name
  Join Locations and Countries
- Department dimension: Department_ID, Department_Name
- StartYear dimension: StartedYear
  extract(year from start_date)

Create DWHR Using Access
- Each dimension is defined as a view in the HR database.
- Access communicates with Oracle through ODBC.
- In Access, we can import an Oracle view to create a table.

Create View to Retrieve Fact Data
The FactData view is a join of Job_History, Departments, and Locations.

Transform Fact Data
select employee_id,
       extract(year from start_date) as StartedYear,
       job_id,
       department_id,
       city,
       end_date - start_date as DaysOnJob
from factdata;
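To try the view and the transform without an Oracle instance, the sketch below reproduces them in SQLite through Python's sqlite3 module. SQLite has no `extract(year from ...)` and does not subtract date strings, so `strftime` and `julianday` are swapped in. The cut-down table definitions, the join columns, and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the HR tables (dates as ISO text).
CREATE TABLE job_history (employee_id INTEGER, start_date TEXT, end_date TEXT,
                          job_id TEXT, department_id INTEGER);
CREATE TABLE departments (department_id INTEGER, department_name TEXT,
                          location_id INTEGER);
CREATE TABLE locations (location_id INTEGER, city TEXT);
INSERT INTO job_history VALUES (101, '2001-01-01', '2001-01-31', 'AC_ACCOUNT', 110);
INSERT INTO departments VALUES (110, 'Accounting', 1700);
INSERT INTO locations VALUES (1700, 'Seattle');

-- FactData: join of Job_History, Departments and Locations.
CREATE VIEW factdata AS
SELECT jh.employee_id, jh.start_date, jh.end_date,
       jh.job_id, jh.department_id, l.city
FROM job_history jh
JOIN departments d ON d.department_id = jh.department_id
JOIN locations l   ON l.location_id = d.location_id;
""")

# SQLite equivalents of extract(year from start_date) and end_date - start_date.
fact_rows = conn.execute("""
SELECT employee_id,
       CAST(strftime('%Y', start_date) AS INTEGER) AS StartedYear,
       job_id, department_id, city,
       CAST(julianday(end_date) - julianday(start_date) AS INTEGER) AS DaysOnJob
FROM factdata
""").fetchall()

print(fact_rows)
```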

Reference: us/library/aa902672(SQL.80).aspx#sql_dwdesign_tool