Data Cube Computation: model dependencies among the aggregates

Data Cube Computation
Model dependencies among the aggregates as a lattice of views:
- (product, store, quarter): the most detailed view
- (product, store), (product, quarter), (store, quarter)
- (product), (store), (quarter)
- none
Example: view (product, store) can be computed from view (product, store, quarter) by summing up all quarterly sales.
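The dependency between views can be illustrated with a minimal Python sketch. The sales figures and the `roll_up` helper are hypothetical, introduced only for illustration; the sketch assumes the aggregate is SUM, so a coarser view can be derived from any finer view that contains its dimensions.

```python
from collections import defaultdict

# Hypothetical toy data: (product, store, quarter) -> total sales.
DIMS = ("product", "store", "quarter")
base_view = {
    ("pen", "s1", "Q1"): 10,
    ("pen", "s1", "Q2"): 5,
    ("pen", "s2", "Q1"): 7,
    ("ink", "s1", "Q1"): 3,
}

def roll_up(view, view_dims, keep_dims):
    """Aggregate a SUM view over `view_dims` down to the coarser `keep_dims`."""
    positions = [view_dims.index(d) for d in keep_dims]
    result = defaultdict(int)
    for key, sales in view.items():
        # Project the key onto the kept dimensions and add up the sales.
        result[tuple(key[p] for p in positions)] += sales
    return dict(result)

# Walk down one path of the lattice: each view is computed from its parent.
product_store = roll_up(base_view, DIMS, ("product", "store"))
product_only = roll_up(product_store, ("product", "store"), ("product",))
grand_total = roll_up(product_only, ("product",), ())
```

Here `product_store` sums away the quarter, and the "none" view (the empty tuple key) holds the grand total.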

Computation Directives
Hash/sort-based methods (Agrawal et al., VLDB'96):
1. Smallest-parent
2. Cache-results
3. Amortize-scans
4. Share-sorts
5. Share-partitions
(Figure: the view lattice, from (product, store, quarter) down to none.)
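As a rough illustration of the first directive, smallest-parent, the sketch below picks, for each view, the smallest view one level up in the lattice to compute it from. The row counts in `sizes` are made up for the example, and `smallest_parent` is a hypothetical helper, not code from the cited paper.

```python
# Hypothetical row counts per view; a view is a frozenset of its dimensions.
sizes = {
    frozenset(("product", "store", "quarter")): 6000,
    frozenset(("product", "store")): 800,
    frozenset(("product", "quarter")): 400,
    frozenset(("store", "quarter")): 80,
    frozenset(("product",)): 100,
    frozenset(("store",)): 20,
    frozenset(("quarter",)): 4,
    frozenset(): 1,
}

def smallest_parent(view, sizes):
    """Return the smallest view with exactly one more dimension than `view`."""
    parents = [p for p in sizes if view < p and len(p) == len(view) + 1]
    return min(parents, key=sizes.get) if parents else None

# Plan: for every view, the parent it should be aggregated from.
plan = {view: smallest_parent(view, sizes) for view in sizes}
```

With these sizes, (product) is computed from (product, quarter) rather than from the larger (product, store), and the top view has no parent.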

Alternative Array-based Approach
Model data as a sparse multidimensional array:
- Partition the array into chunks (a chunk is a small sub-cube that fits in memory).
- Fast addressing based on (chunk_id, offset).
Compute aggregates in a "multi-way" fashion, visiting cube cells in the order that minimizes the number of visits to each cell and reduces memory access and storage cost.
What is the best traversal order for multi-way aggregation?
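One possible way to realize (chunk_id, offset) addressing is sketched below. It assumes row-major ordering both across chunks and within a chunk, and equally sized (padded) chunks; the function name and the layout are assumptions for illustration, since the slide does not specify them.

```python
def chunk_address(coords, dim_sizes, chunk_sizes):
    """Map cell coordinates to (chunk_id, offset), row-major at both levels."""
    chunk_id = 0
    offset = 0
    for c, n, k in zip(coords, dim_sizes, chunk_sizes):
        chunks_along_dim = -(-n // k)  # ceiling division
        chunk_id = chunk_id * chunks_along_dim + c // k  # which chunk
        offset = offset * k + c % k                      # position inside it
    return chunk_id, offset

# A 16 x 16 x 16 array split into 4 x 4 x 4 chunks gives 64 chunks of 64 cells.
address = chunk_address((5, 2, 9), (16, 16, 16), (4, 4, 4))
```

Cell (5, 2, 9) lies in chunk (1, 0, 2), that is chunk_id 18, at in-chunk position (1, 2, 1), that is offset 25.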

Roadmap
- What is a data warehouse, a data mart
- Multi-dimensional data modeling
- Data warehouse design: the star schema, bitmap indexes
- The Data Cube operator: semantics and computation
- Aggregate View Selection

Views and Decision Support
OLAP queries are typically aggregate queries.
- Pre-computation is essential for interactive response times.
- The CUBE is in fact a collection of aggregate queries, and pre-computation is especially important: there is a lot of work on what is best to pre-compute given a limited amount of space to store pre-computed results.
Warehouses can be thought of as a collection of asynchronously replicated tables and periodically maintained views.
- This has renewed interest in view maintenance.

Materialized Views
A view whose tuples are stored in the database is said to be materialized.
- Provides fast access, like a (very high-level) cache.
- Need to maintain the view as the underlying tables change.
- Ideally, we want incremental view maintenance algorithms.
Close relationship to data warehousing, OLAP, (asynchronously) maintaining distributed databases, checking integrity constraints, and evaluating rules and triggers.
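To make incremental maintenance concrete, here is a minimal sketch for a SUM view grouped by (product, store). The `SumView` class is hypothetical; it assumes the aggregate is SUM, and it keeps a per-group row count so that a group disappears once its last base row is deleted.

```python
from collections import defaultdict

class SumView:
    """Materialized SUM(sales) GROUP BY (product, store), updated per change."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)  # base rows contributing to each group

    def _apply(self, product, store, sales, sign):
        key = (product, store)
        self.totals[key] += sign * sales
        self.counts[key] += sign
        if self.counts[key] == 0:
            # No base rows left for this group: drop it from the view.
            del self.totals[key]
            del self.counts[key]

    def on_insert(self, product, store, sales):
        self._apply(product, store, sales, +1)

    def on_delete(self, product, store, sales):
        self._apply(product, store, sales, -1)

view = SumView()
view.on_insert("pen", "s1", 10)
view.on_insert("pen", "s1", 5)
view.on_delete("pen", "s1", 10)
snapshot = dict(view.totals)
```

Each base-table change touches one group of the view instead of triggering a full recomputation.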

Issues in View Materialization
- What algorithm should we use to maintain a materialized view?
- What views should we materialize, and what indexes should we build on the pre-computed results?
- Given a query and a set of materialized views (possibly with some indexes), can we use the materialized views to answer the query?

View Selection Problem
- Use some notion of benefit per view.
- Limit: disk space.
- Harinarayan et al., SIGMOD'96: pick views greedily until the space is filled.
(Figure: the view lattice, from (product, store, quarter) down to none.)
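A greedy selection of this kind might be sketched as follows. This is an illustrative variant, not the exact algorithm from the cited paper: it assumes a linear cost model (answering a view costs the row count of its smallest materialized ancestor), ranks candidates by benefit per row of space, and always keeps the top view. The sizes and function names are hypothetical.

```python
# Hypothetical row counts per view; a view is a frozenset of its dimensions.
TOP = frozenset(("product", "store", "quarter"))
sizes = {
    TOP: 6000,
    frozenset(("product", "store")): 800,
    frozenset(("product", "quarter")): 400,
    frozenset(("store", "quarter")): 80,
    frozenset(("product",)): 100,
    frozenset(("store",)): 20,
    frozenset(("quarter",)): 4,
    frozenset(): 1,
}

def query_cost(view, materialized):
    """Rows scanned to answer `view` from its smallest materialized ancestor."""
    return min(sizes[m] for m in materialized if view <= m)

def benefit(candidate, materialized):
    """Total rows saved over all views answerable from `candidate`."""
    return sum(
        max(0, query_cost(w, materialized) - sizes[candidate])
        for w in sizes if w <= candidate
    )

def greedy_select(space_budget):
    materialized = [TOP]  # the base view is always stored
    used = 0
    while True:
        best, best_ratio = None, 0.0
        for v in sizes:
            if v in materialized or used + sizes[v] > space_budget:
                continue
            ratio = benefit(v, materialized) / sizes[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            return materialized[1:]  # chosen views, in pick order
        materialized.append(best)
        used += sizes[best]

selected = greedy_select(space_budget=500)
```

With a budget of 500 rows, this variant fills the space with the small views first and never fits (product, store) or (product, quarter), which shows how strongly the choice of benefit measure shapes the result.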

Reality check: too many views!
- 2^n views for n dimensions (no hierarchies)
- Storage and update-time explosion
- More pre-computation doesn't mean better performance!
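The 2^n count follows from each dimension being either kept or aggregated away in a view. A short sketch, with the hypothetical helper `cube_views`, enumerates the views for a given list of dimensions:

```python
from itertools import combinations

def cube_views(dims):
    """List every group-by of the cube, from most detailed to the empty one."""
    return [c for r in range(len(dims), -1, -1) for c in combinations(dims, r)]

views = cube_views(("product", "store", "quarter"))
count_for_ten = len(cube_views(tuple(f"d{i}" for i in range(10))))
```

Three dimensions give 8 views; ten dimensions already give 1024, before any dimension hierarchies are considered.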
