CS 157B: Database Management Systems II March 20 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron Mak.


Department of Computer Science Spring 2013: March 20 CS 157B: Database Management Systems II © R. Mak

Unofficial Field Trip
- Computer History Museum in Mt. View
- Experience a fully restored IBM 1401 mainframe computer from the early 1960s in operation.
  - General info:
  - My summer seminar:
  - Restoration: http://ed-thelen.org/1401Project/1401RestorationPage.html
  - Private demos at 11:45 and at 2:00.
- See a life-size working model of Charles Babbage's Difference Engine in operation, a hand-cranked mechanical computer designed in the early 1800s.
  - Public demo at 1:00.
- Saturday, March 23. Meet in the museum lobby at 11:15 AM.

Extra Credit!
- There will be extra credit if you participate in the unofficial field trip to the Computer History Museum.
  - Up to 10 points added to your midterm score.
  - To be decided:
    - a quiz (via Desire2Learn)
    - or an essay

Extract, Transform, and Load (ETL)

Extract, Transform, and Load (ETL)
- You want only high-quality data in your data warehouse.
- What is high-quality data?
  - correct
  - unambiguous
  - consistent
  - complete
- The transform phase of ETL produces high-quality data.
  - Cleaning the data.
  - Conforming data from multiple sources.

Extract, Transform, and Load (ETL)
- In the real world, data is often dirty.
  - Therefore, the ETL process must clean the source data when the data is being copied into the data warehouse.
- Cleaning operations
  - Remove or correct corrupted data.
  - Remove or correct invalid or inconsistent data.
    - unexpected null values
    - missing data
    - values out of range
    - misspellings
    - referential integrity violations
    - business rule violations
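A few of the cleaning operations listed above can be sketched in Python. This is only an illustration, not the course's ETL code: the field names (`store_id`, `city`, `units_sold`), the set of known store IDs, the spelling corrections, and the range limit are all assumptions invented for the example.

```python
from typing import Optional

# Hypothetical reference data and limits, invented for this illustration.
KNOWN_STORE_IDS = {1, 2, 3}
CITY_CORRECTIONS = {"Torotno": "Toronto", "Vancuver": "Vancouver"}
MAX_UNITS = 10_000


def clean_row(row: dict) -> Optional[dict]:
    """Return a cleaned copy of row, or None if the row must be rejected."""
    # Unexpected nulls / missing data: reject rows lacking a required field.
    for field in ("store_id", "city", "units_sold"):
        if row.get(field) is None:
            return None

    cleaned = dict(row)

    # Misspellings: trim whitespace and correct known bad spellings.
    city = cleaned["city"].strip()
    cleaned["city"] = CITY_CORRECTIONS.get(city, city)

    # Values out of range: reject negative or implausibly large counts.
    if not 0 <= cleaned["units_sold"] <= MAX_UNITS:
        return None

    # Referential integrity: the store must exist in the store dimension.
    if cleaned["store_id"] not in KNOWN_STORE_IDS:
        return None

    return cleaned


source_rows = [
    {"store_id": 1, "city": "Torotno", "units_sold": 12},      # misspelled city
    {"store_id": 2, "city": "Vancouver", "units_sold": None},  # unexpected null
    {"store_id": 9, "city": "Toronto", "units_sold": 5},       # unknown store
    {"store_id": 3, "city": " Vancouver ", "units_sold": -4},  # out of range
]
clean_rows = [c for c in map(clean_row, source_rows) if c is not None]
print(clean_rows)
```

Whether a bad row is rejected or corrected is a policy decision; this sketch corrects misspellings and rejects everything else.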

Extract, Transform, and Load (ETL)
- Data from multiple sources may need to be conformed to be usable together in the data warehouse.
  - Type conversion
    - Example: Convert a user ID in a data source from a string to a long integer to match with the user ID in other data sources.
  - Format conversion
    - Example: Dates and times, names
  - Align field and attribute names
    - Examples: customer_name vs. name_of_client, store vs. retail_outlet
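The three kinds of conforming above might look like this in Python. The field names `customer_name`, `name_of_client`, `store`, and `retail_outlet` come from the slide; the `user_id` and `date` field names, the two accepted date formats, and the choice of ISO dates as the warehouse format are assumptions made for this sketch.

```python
from datetime import datetime

# Map each source's field names onto the warehouse's names.
FIELD_ALIASES = {
    "name_of_client": "customer_name",
    "retail_outlet": "store",
}

# Hypothetical source date formats; a real source may use others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def conform_date(text: str) -> str:
    """Format conversion: return the date as an ISO string (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def conform(record: dict) -> dict:
    """Return a copy of record that uses warehouse names, types, and formats."""
    # Align field and attribute names.
    out = {FIELD_ALIASES.get(key, key): value for key, value in record.items()}
    # Type conversion: string user ID to integer.
    out["user_id"] = int(out["user_id"])
    # Format conversion: normalize the date.
    out["date"] = conform_date(out["date"])
    return out


source_record = {
    "user_id": "00042",
    "name_of_client": "Ann Lee",
    "retail_outlet": "Store 7",
    "date": "03/20/2013",
}
conformed = conform(source_record)
print(conformed)
```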

ETL: Semantic Mappings
- Unit conversions
  - Example: feet vs. yards, miles vs. kilometers
- Structural mappings
  - Example: federal → state → city → district vs. kingdom → region → parish
- Temporal mappings
  - Example: One data source has a measure taken once an hour, another data source has the same measure taken daily.
- Spatial mappings
  - Example: street addresses vs. GIS coordinates (latitude + longitude) vs. political boundaries (cities, districts, counties, etc.)
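Two of these mappings are simple enough to sketch in Python: a unit conversion (miles to kilometers) and a temporal mapping that brings an hourly measure to the daily grain of another source. The function names and sample readings are invented, and summing is an assumption that suits additive measures such as counts; an average or a last value may be the right choice for other measures.

```python
from collections import defaultdict
from datetime import datetime

KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def miles_to_km(miles: float) -> float:
    """Unit conversion: miles to kilometers."""
    return miles * KM_PER_MILE


def hourly_to_daily(readings):
    """Temporal mapping: sum (timestamp, value) hourly readings per day.

    Assumes the measure is additive.
    """
    totals = defaultdict(float)
    for timestamp, value in readings:
        totals[timestamp.date().isoformat()] += value
    return dict(totals)


# Made-up hourly readings spanning two days.
hourly = [
    (datetime(2013, 3, 20, 9), 2.0),
    (datetime(2013, 3, 20, 10), 3.0),
    (datetime(2013, 3, 21, 9), 4.0),
]
daily = hourly_to_daily(hourly)
print(daily)
print(miles_to_km(10))
```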

ETL: Semantic Mappings
- Spatio-temporal mappings
  - Locations in space-time
- And even more complex mappings
  - May require the use of ontologies.
    - shared vocabularies
    - knowledge structures
    - models of reality
    - etc.

Dimensional Modeling
- Fact tables
  - Contain values that are measures, usually numeric.
    - Example: the number of sales
- Dimension tables
  - Contain the context for the measures.
    - Examples: time, location, product
  - Dimensions are usually grouped and hierarchical.
    - Example: western locations, eastern locations
    - Example: yearly, quarterly, monthly, weekly, daily, hourly
  - Often denormalized for query performance.
    - Many queries, few updates.

Dimensional Modeling
- Design criteria
  - What are the facts? What are we measuring?
    - Example: number of sales
  - What is the grain, or granularity, of the facts?
    - Determined by the dimensions.
    - All measurements in a fact table must be at the same grain.
    - Example: sales figures collected at the point of sale
  - What are the dimensions? What context do we need to provide for the measures in the fact table?
    - Examples: stores, dates, products

Dimensional Modeling
- Implementation
  - Star schema
- Measures: number of units sold
- Dimensions: date, store, product
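A star schema with this measure and these three dimensions could be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are assumptions chosen for illustration; the slide names only the measure (units sold) and the dimensions (date, store, product).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the context for the measures (hypothetical columns).
CREATE TABLE date_dim    (date_key INTEGER PRIMARY KEY, full_date TEXT, quarter TEXT);
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: one row per date, store, and product, holding the measure.
CREATE TABLE sales_fact (
    date_key    INTEGER REFERENCES date_dim(date_key),
    store_key   INTEGER REFERENCES store_dim(store_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    units_sold  INTEGER
);

-- Made-up sample data.
INSERT INTO date_dim VALUES (1, '2013-01-15', 'Q1'), (2, '2013-04-15', 'Q2');
INSERT INTO store_dim VALUES (1, 'Toronto', 'Canada'), (2, 'Chicago', 'USA');
INSERT INTO product_dim VALUES (1, 'Laptop', 'computer'), (2, 'TV', 'home entertainment');
INSERT INTO sales_fact VALUES (1, 1, 1, 10), (1, 2, 2, 5), (2, 1, 2, 7), (2, 2, 1, 3);
""")

# A typical star join: aggregate the measure by attributes of two dimensions.
rows = conn.execute("""
    SELECT d.quarter, s.country, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN date_dim  AS d ON d.date_key  = f.date_key
    JOIN store_dim AS s ON s.store_key = f.store_key
    GROUP BY d.quarter, s.country
    ORDER BY d.quarter, s.country
""").fetchall()
print(rows)
```

The fact table sits at the center and joins to each dimension table through a single key, which gives the schema its star shape.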

Online Analytical Processing (OLAP)
- A common type of business analysis.
  - Also used to analyze scientific data.
- Visualize data in a multidimensional manner.
  - Analytical processes that involve manipulating data along different dimensions.
  - The OLAP cube.
- "What happened recently, and why?"

Online Analytical Processing (OLAP)
- OLAP operations
  - slice and dice
  - drill up, drill down
  - drill across, drill through
  - pivot

Online Analytical Processing (OLAP)
- Slice
  - View or manipulate the data along a subset of the dimensions.
  - Consider only data from the first quarter.
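As a rough illustration, a slice can be written over a tiny in-memory cube in Python. The cube layout (a list of tuples), the function name `slice_cube`, and all of the numbers are made up for this sketch; an OLAP tool would run the equivalent operation against the warehouse.

```python
# Each cell: (quarter, city, product, units_sold). Values are made up.
cube = [
    ("Q1", "Toronto", "computer", 10),
    ("Q1", "Vancouver", "phone", 4),
    ("Q2", "Toronto", "computer", 7),
    ("Q2", "Vancouver", "security", 2),
]


def slice_cube(cells, quarter):
    """Fix the time dimension at one value and keep the other dimensions."""
    return [
        (city, product, units)
        for q, city, product, units in cells
        if q == quarter
    ]


first_quarter = slice_cube(cube, "Q1")
print(first_quarter)
```

The result has one fewer dimension than the cube: time is fixed at Q1, leaving city and product.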

Online Analytical Processing (OLAP)
- Dice
  - View or manipulate the data within subsets of the ranges of the dimensions.
  - Consider only data from Q1 and Q2 from only Toronto and Vancouver for only computers and home entertainment.
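The dice described above can be sketched the same way. The quarters, cities, and product categories follow the slide's example; the tuple layout, the function name `dice`, the extra cells, and the unit counts are assumptions for illustration.

```python
# Each cell: (quarter, city, product, units_sold). Values are made up.
cube = [
    ("Q1", "Toronto", "computer", 10),
    ("Q2", "Vancouver", "home entertainment", 6),
    ("Q3", "Toronto", "computer", 8),   # excluded: quarter not selected
    ("Q1", "Chicago", "computer", 5),   # excluded: city not selected
    ("Q2", "Toronto", "phone", 3),      # excluded: product not selected
]


def dice(cells, quarters, cities, products):
    """Keep only the cells whose value on every dimension is in the chosen subset."""
    return [
        cell
        for cell in cells
        if cell[0] in quarters and cell[1] in cities and cell[2] in products
    ]


sub_cube = dice(
    cube,
    quarters={"Q1", "Q2"},
    cities={"Toronto", "Vancouver"},
    products={"computer", "home entertainment"},
)
print(sub_cube)
```

Unlike a slice, a dice keeps every dimension and narrows the range of values on each.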

Online Analytical Processing (OLAP)
- Drill down
  - View or manipulate a dimension at a lower level of detail.
  - Drill down on the time dimension from quarters to months.
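A minimal Python sketch of drilling down from quarters to months follows. It assumes the facts are stored at the month grain, so the quarterly view is an aggregate and drilling down exposes the rows behind one quarter; the hierarchy table, function names, and figures are invented.

```python
from collections import defaultdict

# Time hierarchy: month -> quarter (first half of the year only).
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

# Facts stored at the month grain: (month, units_sold). Values are made up.
monthly_sales = [("Jan", 5), ("Feb", 3), ("Mar", 2), ("Apr", 4), ("May", 6)]


def by_quarter(rows):
    """The summary view: units sold per quarter."""
    totals = defaultdict(int)
    for month, units in rows:
        totals[MONTH_TO_QUARTER[month]] += units
    return dict(totals)


def drill_down(rows, quarter):
    """The detailed view: the monthly rows that make up one quarter."""
    return [(month, units) for month, units in rows if MONTH_TO_QUARTER[month] == quarter]


print(by_quarter(monthly_sales))
print(drill_down(monthly_sales, "Q1"))
```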

Online Analytical Processing (OLAP)
- Drill up
  - "Roll up" (aggregate) data to a higher level along a dimension.
  - Sum up the cities by country.
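Rolling cities up to countries might be sketched like this in Python. The city-to-country hierarchy, the function name `roll_up`, and the sales figures are assumptions made for the example.

```python
from collections import defaultdict

# Location hierarchy: city -> country.
CITY_TO_COUNTRY = {
    "Toronto": "Canada",
    "Vancouver": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}

# Units sold per city. Values are made up.
city_sales = {"Toronto": 10, "Vancouver": 4, "Chicago": 7, "New York": 9}


def roll_up(sales, hierarchy):
    """Aggregate a measure to the next level up in a dimension hierarchy."""
    totals = defaultdict(int)
    for member, units in sales.items():
        totals[hierarchy[member]] += units
    return dict(totals)


country_sales = roll_up(city_sales, CITY_TO_COUNTRY)
print(country_sales)
```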

Online Analytical Processing (OLAP)
- Drill across
  - Integrate data from more than one fact table.
- Drill through
  - Access the database tables that underlie the OLAP cube.
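Drill across can be illustrated with two small fact tables that share a product dimension. This is a sketch only: the second fact table (returns), the dictionary layout, the function name `drill_across`, and the numbers are all invented, and filling a missing measure with 0 is an assumption.

```python
# Two fact tables already aggregated by the shared product dimension.
# Values are made up.
sales_by_product = {"computer": 17, "phone": 4}
returns_by_product = {"computer": 2, "security": 1}


def drill_across(*facts):
    """Line up measures from several fact tables on their shared dimension.

    A product missing from one fact table gets 0 for that measure.
    """
    keys = sorted(set().union(*facts))
    return {key: tuple(fact.get(key, 0) for fact in facts) for key in keys}


combined = drill_across(sales_by_product, returns_by_product)
print(combined)
```

This works only because both fact tables describe products the same way, that is, the dimension has been conformed during ETL.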

Online Analytical Processing (OLAP)
- Pivot
  - Rotate the axes (dimensions) to present a different view.
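For a two-dimensional view, a pivot amounts to swapping which dimension supplies the rows and which the columns. The sketch below assumes a cube stored as a dictionary keyed by (quarter, city); the function name `as_table` and the numbers are made up.

```python
# Cells keyed by (quarter, city). Values are made up.
cells = {
    ("Q1", "Toronto"): 10,
    ("Q1", "Vancouver"): 4,
    ("Q2", "Toronto"): 7,
    ("Q2", "Vancouver"): 2,
}


def as_table(data, row_axis, col_axis):
    """Lay the cells out as {row: {column: value}}.

    row_axis and col_axis are positions in the key tuple.
    """
    table = {}
    for key, value in data.items():
        table.setdefault(key[row_axis], {})[key[col_axis]] = value
    return table


by_quarter = as_table(cells, row_axis=0, col_axis=1)  # rows are quarters
by_city = as_table(cells, row_axis=1, col_axis=0)     # pivoted: rows are cities
print(by_quarter)
print(by_city)
```

The data is unchanged by the pivot; only its presentation differs.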

OLAP Summary

DW Summary
- Plus: dashboards and scorecards

Cognos
- Business intelligence (BI) tool from IBM.
  - Queries and reports
  - Dashboards and scorecards
  - OLAP
  - Data mining
    - predictive analysis
- Cognos Business Intelligence 10 is available in the IBM Academic Cloud along with a sample data warehouse.
  - I will create student accounts.
  - Online tutorials
- Cognos demo