Data Warehousing and OLAP. Outline: Models & operations; Implementing a warehouse; Future directions.

Presentation transcript:

Slide 1: Data Warehousing and OLAP
Outline
- Models & operations
- Implementing a warehouse
- Future directions

Hector Garcia Molina: Data Warehousing and OLAP
Slide 2: Warehouse Models & Operators
Data Models
- relations
- stars & snowflakes
- cubes
Operators
- slice & dice
- roll-up, drill down
- pivoting
- other

Slide 3: Star

Slide 4: Star Schema
[Figure: star schema around the table sale(orderId, date, custId, prodId, storeId, qty, amt)]

Slide 5: Terms
- Fact table
- Dimension tables
- Measures
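A minimal sketch of how these terms map onto tables, using SQLite through Python's sqlite3 module. The sale columns are the ones listed on the Star Schema slide; the dimension tables' non-key columns (name, city) are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per product, customer, store.
-- The non-key columns are assumed, not taken from the slides.
CREATE TABLE product  (prodId  TEXT PRIMARY KEY, name TEXT);
CREATE TABLE customer (custId  TEXT PRIMARY KEY, name TEXT);
CREATE TABLE store    (storeId TEXT PRIMARY KEY, city TEXT);

-- Fact table: one row per sale, with keys into each dimension.
CREATE TABLE sale (
    orderId TEXT PRIMARY KEY,
    date    INTEGER,
    custId  TEXT REFERENCES customer(custId),
    prodId  TEXT REFERENCES product(prodId),
    storeId TEXT REFERENCES store(storeId),
    qty     INTEGER,  -- measure
    amt     INTEGER   -- measure
);
""")

# Column names of the fact table, in declaration order.
fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(sale)")]
```

The fact table sits at the center of the star and the measures (qty, amt) are the values that later slides aggregate.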

Slide 6: Dimension Hierarchies
[Figure: tables store, sType, city, region]
→ snowflake schema
→ constellations

Slide 7: Cube
[Figure: fact table view next to the equivalent multi-dimensional cube; dimensions = 2]

Slide 8: 3-D Cube
[Figure: fact table view next to the equivalent multi-dimensional cube, with slices for day 1 and day 2; dimensions = 3]

Slide 9: ROLAP vs. MOLAP
- ROLAP: Relational On-Line Analytical Processing
- MOLAP: Multi-Dimensional On-Line Analytical Processing

Slide 10: Aggregates
Add up amounts for day 1.
In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
Result: 81

Slide 11: Aggregates
Add up amounts by day.
In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

Slide 12: Another Example
Add up amounts by day, product.
In SQL: SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId
Going from the by-day totals to the by-day, by-product totals is a drill-down; going back is a rollup.
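The three aggregate queries can be run end to end with Python's sqlite3 module. This is a sketch only: the six rows are assumed sample data, chosen so that the day 1 total is the 81 shown on the Aggregates slide, and the four-column sale table is a cut-down version of the star schema's fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER)")
# Assumed sample rows, not data taken from the slides.
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
     ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4)],
)

# Total for one day.
total_day1 = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Rolled-up view: one total per day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

# Drilled-down view: one total per (day, product).
by_day_product = conn.execute(
    "SELECT date, prodId, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()
```

Summing the drilled-down rows for a given day gives back that day's rolled-up total, which is why sums roll up cleanly.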

Slide 13: Aggregates
- Operators: sum, count, max, min, median, avg
- “Having” clause
- Using dimension hierarchy
  - average by region (within store)
  - maximum by month (within date)
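A small sketch of the "Having" clause and of aggregating along a dimension hierarchy, again with sqlite3. The store-to-region assignment and the sale amounts are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (storeId TEXT PRIMARY KEY, region TEXT);
CREATE TABLE sale  (storeId TEXT, date INTEGER, amt INTEGER);
-- Assumed sample data.
INSERT INTO store VALUES ('s1', 'A'), ('s2', 'B'), ('s3', 'B');
INSERT INTO sale VALUES
    ('s1', 1, 10), ('s1', 1, 20), ('s1', 2, 30),
    ('s2', 1, 8),  ('s2', 2, 4),  ('s3', 1, 51);
""")

# HAVING filters groups after aggregation: keep only days whose total exceeds 50.
busy_days = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date HAVING sum(amt) > 50"
).fetchall()

# Using the hierarchy store -> region: join to the dimension table,
# then group by the coarser level.
avg_by_region = conn.execute(
    "SELECT store.region, avg(sale.amt) FROM sale "
    "JOIN store ON sale.storeId = store.storeId "
    "GROUP BY store.region ORDER BY store.region"
).fetchall()
```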

Slide 14: Cube Aggregation
Example: computing sums
[Figure: cube with day 1 and day 2 slices and its summed aggregates; arrows labeled rollup and drill-down]

Slide 15: Cube Operators
[Figure: cube with day 1 and day 2 slices; aggregate cells labeled sale(c1,*,*), sale(c2,p2,*), sale(*,*,*)]
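The sale(c1,*,*) notation can be read as a function in which "*" means "sum over every value of this dimension". A minimal sketch follows; the argument order (customer, product, day) and the sample facts are assumptions.

```python
# Assumed sample facts: (custId, prodId, day, amt).
SALES = [
    ("c1", "p1", 1, 12), ("c1", "p2", 1, 11), ("c3", "p1", 1, 50),
    ("c2", "p2", 1, 8), ("c1", "p1", 2, 44), ("c2", "p1", 2, 4),
]

def sale(cust="*", prod="*", day="*"):
    """Sum amt over all facts that match; "*" matches any value."""
    return sum(
        amt for c, p, d, amt in SALES
        if cust in ("*", c) and prod in ("*", p) and day in ("*", d)
    )
```

With these rows, sale("c1", "*", "*") adds up everything customer c1 bought, and sale("*", "*", "*") is the grand total.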

Slide 16: Extended Cube
[Figure: cube with day 1 and day 2 slices plus a * slice holding the aggregates, e.g. sale(*,p2,*)]
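One way to picture the extended cube is as a table that stores every cell, including those with "*" in one or more positions. The sketch below precomputes all of them in a single pass; the sample facts and the (customer, product, day) key order are assumptions.

```python
from collections import defaultdict
from itertools import product as cartesian

# Assumed sample facts: (custId, prodId, day, amt).
SALES = [
    ("c1", "p1", 1, 12), ("c1", "p2", 1, 11), ("c3", "p1", 1, 50),
    ("c2", "p2", 1, 8), ("c1", "p1", 2, 44), ("c2", "p1", 2, 4),
]

def extended_cube(rows):
    """Return {(cust, prod, day): total}, where any position may be "*"."""
    cube = defaultdict(int)
    for cust, prod, day, amt in rows:
        # Each fact contributes to the 2**3 cells obtained by replacing
        # any subset of its coordinates with "*".
        for key in cartesian((cust, "*"), (prod, "*"), (day, "*")):
            cube[key] += amt
    return dict(cube)

cube = extended_cube(SALES)
```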

Slide 17: Aggregation Using Hierarchies
Hierarchy: customer → region → country
(customer c1 in Region A; customers c2, c3 in Region B)
[Figure: cube with day 1 and day 2 slices, aggregated from customers up to regions]
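Rolling up along the customer hierarchy amounts to replacing each customer by its region before summing. The region assignment below is the one given on the slide; the sale rows are assumed sample data.

```python
from collections import defaultdict

# From the slide: c1 is in Region A; c2 and c3 are in Region B.
REGION_OF = {"c1": "A", "c2": "B", "c3": "B"}

# Assumed sample facts: (custId, prodId, day, amt).
SALES = [
    ("c1", "p1", 1, 12), ("c1", "p2", 1, 11), ("c3", "p1", 1, 50),
    ("c2", "p2", 1, 8), ("c1", "p1", 2, 44), ("c2", "p1", 2, 4),
]

def roll_up_to_region(rows):
    """Total amt per (region, day), summing over customers and products."""
    totals = defaultdict(int)
    for cust, _prod, day, amt in rows:
        totals[(REGION_OF[cust], day)] += amt
    return dict(totals)

by_region = roll_up_to_region(SALES)
```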

Slide 18: Pivoting
[Figure: fact table view next to the multi-dimensional cube with day 1 and day 2 slices]
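Pivoting can be sketched as choosing which two dimensions of the fact table become the rows and columns of a cross-tab, with the remaining dimensions summed away. The helper name pivot, its index-based arguments, and the sample facts are assumptions for illustration.

```python
# Assumed sample facts: (custId, prodId, day, amt).
SALES = [
    ("c1", "p1", 1, 12), ("c1", "p2", 1, 11), ("c3", "p1", 1, 50),
    ("c2", "p2", 1, 8), ("c1", "p1", 2, 44), ("c2", "p1", 2, 4),
]

def pivot(rows, row_idx, col_idx):
    """Build {row_value: {col_value: total amt}} from fact tuples.

    row_idx and col_idx are positions in the fact tuple
    (0 = customer, 1 = product, 2 = day); amt is the last field.
    """
    table = {}
    for fact in rows:
        cells = table.setdefault(fact[row_idx], {})
        cells[fact[col_idx]] = cells.get(fact[col_idx], 0) + fact[-1]
    return table

by_product_and_day = pivot(SALES, 1, 2)       # rows = products, columns = days
by_customer_and_product = pivot(SALES, 0, 1)  # same facts, different axes
```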

Slide 19: Integration
- Data Cleaning
- Data Loading
- Derived Data
[Figure: architecture diagram with components Client, Query & Analysis, Warehouse, Metadata, Integration, Source]

Slide 20: Data Cleaning
- Migration (e.g., yen → dollars)
- Scrubbing: use domain-specific knowledge (e.g., social security numbers)
- Fusion (e.g., mail list, customer merging)
- Auditing: discover rules & relationships (like data mining)
[Figure: records customer1(Joe) and customer2(Joe) from the billing DB and service DB fused into merged_customer(Joe)]
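A deliberately simple sketch of fusion: customer records from two sources are merged when their names match after normalization. Real fusion needs far more robust matching; the record fields (balance, plan) and the name-only matching rule are assumptions.

```python
def fuse(billing, service):
    """Merge customer records from two sources, keyed on the normalized name."""
    merged = {}
    for source, records in (("billing", billing), ("service", service)):
        for rec in records:
            key = rec["name"].strip().lower()  # naive matching rule (assumed)
            entry = merged.setdefault(
                key, {"name": rec["name"].strip(), "sources": []}
            )
            entry["sources"].append(source)
            for field, value in rec.items():
                if field != "name":
                    # Keep the first value seen for each field.
                    entry.setdefault(field, value)
    return list(merged.values())

merged_customers = fuse(
    [{"name": "Joe", "balance": 10}],    # hypothetical billing DB record
    [{"name": "joe ", "plan": "gold"}],  # hypothetical service DB record
)
```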

Slide 21: Loading Data
- Incremental vs. refresh
- Off-line vs. on-line
- Frequency of loading
  - At night, 1x a week/month, continuously
- Parallel/Partitioned load

Slide 22: Derived Data
- Derived Warehouse Data
  - indexes
  - aggregates
  - materialized views (next slide)
- When to update derived data?
- Incremental vs. refresh

Slide 23: Materialized Views
- Define new warehouse relations using SQL expressions
[Figure: a derived relation, annotated "does not exist at any source"]
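A sketch of a materialized aggregate and the two ways of keeping it current, refresh and incremental. SQLite has no materialized views, so an ordinary table (daily_total, a hypothetical name) stands in for one; the sale rows are assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER);
INSERT INTO sale VALUES
    ('p1', 'c1', 1, 12), ('p2', 'c1', 1, 11), ('p1', 'c3', 1, 50),
    ('p2', 'c2', 1, 8),  ('p1', 'c1', 2, 44), ('p1', 'c2', 2, 4);
-- Stand-in for a materialized view: the query result stored as a table.
CREATE TABLE daily_total AS
    SELECT date, sum(amt) AS total FROM sale GROUP BY date;
""")

def refresh(conn):
    """Refresh: recompute the derived table from scratch."""
    conn.execute("DELETE FROM daily_total")
    conn.execute("INSERT INTO daily_total SELECT date, sum(amt) FROM sale GROUP BY date")

def insert_sale(conn, prod, cust, date, amt):
    """Incremental: apply only the change caused by one new fact."""
    conn.execute("INSERT INTO sale VALUES (?, ?, ?, ?)", (prod, cust, date, amt))
    cur = conn.execute(
        "UPDATE daily_total SET total = total + ? WHERE date = ?", (amt, date)
    )
    if cur.rowcount == 0:  # first sale seen for this date
        conn.execute("INSERT INTO daily_total VALUES (?, ?)", (date, amt))

def totals(conn):
    return conn.execute("SELECT date, total FROM daily_total ORDER BY date").fetchall()
```

Both strategies should leave daily_total in the same state; incremental maintenance touches one row per new fact, while refresh rescans the whole fact table.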

Slide 24: Processing
- ROLAP servers vs. MOLAP servers
- Index Structures
- What to Materialize?
- Algorithms
[Figure: architecture diagram with components Client, Query & Analysis, Warehouse, Metadata, Integration, Source]

Slide 25: ROLAP Server
- Relational OLAP Server
[Figure: layers labeled relational DBMS, ROLAP server, tools, utilities]
Special indices, tuning; schema is “denormalized”

Slide 26: MOLAP Server
- Multi-Dimensional OLAP Server
[Figure: layers labeled multi-dimensional server, M.D. tools, utilities; note: could also sit on relational DBMS]
[Figure: Sales cube with dimensions Product (milk, soda, eggs, soap), City (A, B), and Date]

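Slide 27: Join
“Combine” SALE, PRODUCT relations.
In SQL: SELECT * FROM SALE, PRODUCT WHERE SALE.prodId = PRODUCT.id
(Without the WHERE clause the query returns the cross product of the two relations. PRODUCT.id is an assumed name for the product key.)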
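The difference between listing two relations and actually joining them shows up in the row counts. In this sqlite3 sketch the product table's columns (id, name) and all rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE sale (prodId TEXT, custId TEXT, date INTEGER, amt INTEGER);
-- Assumed sample data.
INSERT INTO product VALUES ('p1', 'bolt'), ('p2', 'nut');
INSERT INTO sale VALUES
    ('p1', 'c1', 1, 12), ('p2', 'c1', 1, 11), ('p1', 'c3', 1, 50),
    ('p2', 'c2', 1, 8),  ('p1', 'c1', 2, 44), ('p1', 'c2', 2, 4);
""")

# No join condition: every sale is paired with every product.
cross_rows = conn.execute("SELECT * FROM sale, product").fetchall()

# With a join condition: each sale is paired only with its own product.
joined_rows = conn.execute(
    "SELECT * FROM sale, product WHERE sale.prodId = product.id"
).fetchall()
```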

Slide 28: Join Indexes
[Figure: join index]
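The figure for this slide did not survive, so the following is only a sketch of the general idea of a join index: for each dimension row, precompute the identifiers of the fact rows that join with it, so the join can be answered without scanning the fact table. The table contents and the use of list positions as row identifiers are assumptions.

```python
# Assumed sample data. Row identifiers (rids) are list positions.
PRODUCT = [("p1", "bolt"), ("p2", "nut")]
SALE = [  # (prodId, custId, date, amt)
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

def build_join_index(product_rows, sale_rows):
    """Map each product key to the rids of the sale rows that reference it."""
    index = {prod_id: [] for prod_id, _name in product_rows}
    for rid, row in enumerate(sale_rows):
        index[row[0]].append(rid)
    return index

def sales_of(prod_id, index, sale_rows):
    """Fetch the matching sale rows directly through the index."""
    return [sale_rows[rid] for rid in index[prod_id]]

join_index = build_join_index(PRODUCT, SALE)
```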

Slide 29: What to Materialize?
- Store in warehouse results useful for common queries
- Example: materialize total sales
[Figure: cube with day 1 and day 2 slices; the total sales aggregates are marked "materialize"]

Slide 30: Cube Aggregates Lattice
Lattice levels, from finest to coarsest:
  (city, product, date)
  (city, product)   (city, date)   (product, date)
  (city)   (product)   (date)
  all
Use greedy algorithm to decide what to materialize.
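The slide does not say which greedy algorithm is meant. The sketch below uses a benefit-based heuristic: the finest view is always materialized, and each round picks the view that most reduces the total cost of answering every view in the lattice from its smallest materialized ancestor. The view sizes are invented for illustration.

```python
def view(*dims):
    return frozenset(dims)

# Assumed row counts for each group-by in the lattice.
SIZES = {
    view("city", "product", "date"): 100,
    view("city", "product"): 50,
    view("city", "date"): 75,
    view("product", "date"): 90,
    view("city"): 20,
    view("product"): 30,
    view("date"): 40,
    view(): 1,  # "all"
}

def greedy_select(sizes, k):
    """Pick k views to materialize in addition to the finest one."""
    top = max(sizes, key=len)  # the finest group-by is always materialized
    # cost[w] = size of the smallest materialized view that can answer w
    cost = {w: sizes[top] for w in sizes}
    chosen = []
    for _ in range(k):
        def benefit(v):
            # v can answer every view w whose dimensions are a subset of v's.
            return sum(max(0, cost[w] - sizes[v]) for w in sizes if w <= v)
        candidates = [v for v in sizes if v != top and v not in chosen]
        best = max(candidates, key=benefit)
        chosen.append(best)
        for w in sizes:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])
    return chosen

selected = greedy_select(SIZES, 2)
```

With these sizes the first pick is (city, product), which halves the cost of four views, and the second is (date), which nothing chosen so far could answer cheaply.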

Slide 31: Dimension Hierarchies
Hierarchy (coarsest to finest): all, state, city

Slide 32: Dimension Hierarchies
Lattice nodes (not all arcs shown):
  (city, product, date)   (state, product, date)
  (city, product)   (city, date)   (product, date)   (state, product)   (state, date)
  (city)   (product)   (date)   (state)
  all
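With hierarchies, the lattice is the cross product of the levels of each dimension. This sketch enumerates its nodes, assuming (as the node list suggests) that only the location dimension has an intermediate level, state, between city and all.

```python
from itertools import product as cartesian

# Levels per dimension, finest first. "all" means the dimension is summed away.
LEVELS = {
    "location": ["city", "state", "all"],
    "product": ["product", "all"],
    "date": ["date", "all"],
}

def lattice_nodes(levels):
    """One node per choice of level in every dimension."""
    nodes = []
    for combo in cartesian(*levels.values()):
        attrs = tuple(a for a in combo if a != "all")
        nodes.append(attrs or ("all",))
    return nodes

nodes = lattice_nodes(LEVELS)
```

The count is 3 x 2 x 2 = 12, which matches the twelve nodes listed on the slide.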

Slide 33: Interesting Hierarchy
[Figure: conceptual dimension table with levels all, years, quarters, months, days, weeks]
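A likely reason this hierarchy is "interesting" is that weeks do not nest inside months, so days roll up along two separate paths. The sketch below checks this with Python's datetime module, using ISO week numbering as an assumed definition of a week.

```python
from collections import defaultdict
from datetime import date, timedelta

def weeks_spanning_months(start, days):
    """Return the ISO (year, week) pairs that contain days from more than one month."""
    months_by_week = defaultdict(set)
    for offset in range(days):
        d = start + timedelta(days=offset)
        iso_year, iso_week, _weekday = d.isocalendar()
        months_by_week[(iso_year, iso_week)].add((d.year, d.month))
    return sorted(w for w, months in months_by_week.items() if len(months) > 1)

# January and February 2024: ISO week 5 runs from Jan 29 to Feb 4.
straddling = weeks_spanning_months(date(2024, 1, 1), 60)
```

Because such weeks exist, a weekly total cannot be computed from monthly totals or the other way around; both have to be aggregated from days.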