Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals Presenter : Parminder Jeet Kaur Discussion Lead : Kailang.

Slides:



Advertisements
Similar presentations
Ch2 Data Preprocessing part3 Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
Advertisements

OLAP Tuning. Outline OLAP 101 – Data warehouse architecture – ROLAP, MOLAP and HOLAP Data Cube – Star Schema and operations – The CUBE operator – Tuning.
Outline What is a data warehouse? A multi-dimensional data model Data warehouse architecture Data warehouse implementation Further development of data.
5.1Database System Concepts - 6 th Edition Chapter 5: Advanced SQL Advanced Aggregation Features OLAP.
Set operators (UNION, UNION ALL, MINUS, INTERSECT) [SQL]
Chapter 11 Group Functions
Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases Presented by Darren Gates for ICS 280.
Jennifer Widom On-Line Analytical Processing (OLAP) Introduction.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Pivoting and SQL:1999.
Decision Support and Data Warehouse. Decision supports Systems Components Data management function –Data warehouse Model management function –Analytical.
OLAP. Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more time-consuming, analytic queries.
1 i247: Information Visualization and Presentation Marti Hearst Multidimensional Graphing.
Data Cube and OLAP Server
Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases Chris Stolte and Pat Hanrahan Computer Science Department.
Implementing Business Analytics with MDX Chris Webb London September 29th.
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
Horizontal data sets: Number of attributes is of the same order to several orders of magnitude higher than the number of records. Example: genetic data.
©Silberschatz, Korth and Sudarshan22.1Database System Concepts 4 th Edition 1 SQL:1999 Advanced Querying Decision-Support Systems Data Warehousing Data.
©Silberschatz, Korth and Sudarshan22.1Database System Concepts 4 th Edition 1 Extended Aggregation SQL-92 aggregation quite limited  Many useful aggregates.
Chap8: Trends in DBMS 8.1 Database support for Field Entities 8.2 Content-based retrieval 8.3 Introduction to spatial data warehouses 8.4 Summary.
COMP 578 Data Warehousing And OLAP Technology Keith C.C. Chan Department of Computing The Hong Kong Polytechnic University.
1 Lecture 10: More OLAP - Dimensional modeling
© Tan,Steinbach, Kumar Introduction to Data Mining 8/05/ Data Warehouse and Data Cube Lecture Notes for Chapter 3 Introduction to Data Mining By.
Lab3 CPIT 440 Data Mining and Warehouse.
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
Microsoft SQL Server 2012 Analysis Services (SSAS) Reporting Services (SSRS)
Online Analytical Processing (OLAP) Hweichao Lu CS157B-02 Spring 2007.
OLAP OPERATIONS. OLAP ONLINE ANALYTICAL PROCESSING OLAP provides a user-friendly environment for Interactive data analysis. In the multidimensional model,
Advanced Databases 5841 DATA CUBE. Index of Content 1. The “ALL” value and ALL() function 2. The New Features added in CUBE 3. Computing the CUBE and.
Enhancements to the GROUP BY Clause Fresher Learning Program January, 2012.
Database Programming Sections 5– GROUP BY, HAVING clauses, Rollup & Cube Operations, Grouping Set, Set Operations 11/2/10.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Decision Support Chapter 23.
Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor
SharePoint 2010 Business Intelligence Module 6: Analysis Services.
On-Line Analytic Processing Chetan Meshram Class Id:221.
1 CUBE: A Relational Aggregate Operator Generalizing Group By By Ata İsmet Özçelik.
Ahsan Abdullah 1 Data Warehousing Lecture-11 Multidimensional OLAP (MOLAP) Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for.
20.5 Data Cubes Instructor : Dr. T.Y. Lin Chandrika Satyavolu 222.
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross- Tab and Sub-Totals Gray et Al. Presented By: Priya Rajan.
Computing & Information Sciences Kansas State University Monday, 26 Nov 2007CIS 560: Database System Concepts Lecture 37 of 42 Monday, 26 November 2007.
OLAP & DSS SUPPORT IN DATA WAREHOUSE By - Pooja Sinha Kaushalya Bakde.
Prof. Bayer, DWH, Ch.4, SS Chapter 4: Dimensions, Hierarchies, Operations, Modeling.
Data Warehousing.
©Silberschatz, Korth and Sudarshan18.1Database System Concepts - 5 th Edition, Aug 26, 2005 Extended Aggregation in SQL:1999 The cube operation computes.
Some OLAP Issues CMPT 455/826 - Week 9, Day 2 Jan-Apr 2009 – w9d21.
ISO/IEC JTC1 SC32 1SQL/OLAP Sang-Won Lee Let’s e-Wha! URL: Jul. 12th,
SHIFALI CHOUBEY GISE LAB IITB Decision Support System For Farmers.
Modeling Issues for Data Warehouses CMPT 455/826 - Week 7, Day 1 (based on Trujollo) Sept-Dec 2009 – w7d11.
1 On-Line Analytic Processing Warehousing Data Cubes.
Online Analytical Processing (OLAP) An Overview Kian Win Ong, Nicola Onose Mar 3 rd 2006.
Data Warehousing.
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals 데이터베이스 연구실 김호숙
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support Chapter 25.
1 Database Systems, 8 th Edition Star Schema Data modeling technique –Maps multidimensional decision support data into relational database Creates.
Introduction to OLAP and Data Warehouse Assoc. Professor Bela Stantic September 2014 Database Systems.
Data Analysis Decision Support Systems Data Analysis and OLAP Data Warehousing.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation OLAP Queries and SQL:1999.
Data Analysis and OLAP Dr. Ms. Pratibha S. Yalagi Topic Title
Lecturer : Dr. Pavle Mogin
On-Line Analytic Processing
Chapter 5: Advanced SQL Database System concepts,6th Ed.
SQL/OLAP Sang-Won Lee Let’s e-Wha! URL: Jul. 12th, 2001 SQL/OLAP
Based on notes by Jim Gray
DATA CUBE Advanced Databases 584.
On-Line Analytical Processing (OLAP)
Data warehouse Design Using Oracle
DATA CUBES E0 261 Jayant Haritsa Computer Science and Automation
Slides based on those originally by : Parminder Jeet Kaur
Presentation transcript:

Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals Presenter : Parminder Jeet Kaur Discussion Lead : Kailang Paper II

Presentation Outline 1 Introduction Data Analysis: What? Why is it useful? How this is done? Visualization and Dimensionality Reduction 2 Standard SQL features & GROUP-BY Operator Relevant features of SQL Problems with Group-By operator 3 CUBE and ROLL-UP Operators General idea of Cube operator Why are they different?

1 Introduction Data Analysis: What? Why is it useful? How this is done? Dimensionality Reduction 2 Standard SQL features & GROUP-BY Operator Relevant features of SQL Problems with Group-By operator 3 CUBE and ROLL-UP Operators General idea of Cube operator Why are they different?

Data Analysis ›Applications look for unusual patterns in data ›Four steps: –Formulating a complex query –Extracting the aggregated data from DB into a relation –Visualizing results in N-dimensional space –Analyzing the results to find unusual or interesting patterns by roll-up and drill-down on data ›Facilitate decision making

Dimensionality Reduction ›Summarizes data along dimensions that are left out ›Ex. Analysis car sales –Focus on role of model, year and color of the cars –Ignore differences between sales along dimensions of date of sale or car dealership –As a result, extensive constructs are used, such as cross- tabulation, subtotals, roll-up and drill-down

1 Introduction Data Analysis: What? Why is it useful? How this is done? Visualization and Dimensionality Reduction 2 Standard SQL features & GROUP-BY Operator Relevant features of SQL Problems with Group-By operator 3 CUBE and ROLL-UP Operators General idea of Cube operator Why are they different?

Relevant features of SQL ›Relational system models N-dimensional data as a relation with N-attribute domains ›Example: 4D Earth temperature data

continued… Relevant features of SQL ›For summarization, standard SQL supports group-by op ›GROUP-BY operator –Partitions the relation into disjoint tuple set –Then, aggregates over each sets

Problems with GROUP-BY operator 1.Histograms - Weather table, group time into days, weeks or months and group locations into areas (US, Canada) by mapping longitude and latitude to country’s name -Standard SQL computes histograms indirectly from a table expression

Problems with GROUP-BY operator 2. Roll-up Totals and Sub-Totals for drill-downs – One dimensional Aggregation SELECT Models, SUM(Sales) FROM Sales GROUP BY Models –Two dimensional Aggregation SELECT Models, Years, Color, SUM(Sales) FROM Sales WHERE Models = “Chevy” GROUP BY Year, Color ModelsSales Chevy290 Ford220 ModelsYearColorSales Chevy1994Black50 Chevy1995Black85 Chevy1994White40 Chevy1995White115

Problems with GROUP-BY operator 2. Roll-up Totals and Sub-Totals for drill-downs Three dimensional aggregation (by Model by Year by Color) Not relational (can’t form a key)

Problems with GROUP-BY operator 2. Roll-up Totals and Sub-Totals for drill-downs Alternate representation 3D aggregation (Pivot table) –Creates columns based on subsets of column values rather than subsets of column name (as recommended by Chris Date approach) –Results in larger set –If one pivots on two columns containing N and M values, the resulting pivot table has N x M values, that’s, so many columns and such obtuse column names!

ALL value approach ›Rather than extending result table by adding new cols ›Prevents the exponential growth of columns ›“ALL” added to fill in the super-aggregation items

ALL value approach - SQL statement to build 3D roll- up - Aggregating over N dimensions requires N unions

Problems with GROUP-BY operator 3. Cross Tabulations –Expressing cross-tab queries with conventional SQL is daunting –6D cross-tab requires 64-way union of 64 different GROUP BY operators! –Resulting representation is too complex for optimization

1 Introduction Data Analysis: What? Why is it useful? How this is done? Visualization and Dimensionality Reduction 2 Standard SQL features & GROUP-BY Operator Relevant features of SQL Problems with Group-By operator 3 CUBE and ROLL-UP Operators General idea of Cube operator Why are they different?

CUBE operator ›N dimensional generalization of simple aggregate function ›N-1 lower-dimensional aggregates points, lines, planes, cubes ›Data cube operator builds a table containing all these aggregated values ›Unifies several common and popular concepts: such as aggregates, group by, histograms, roll-ups and drill-downs and, cross tabs

CUBE operator SELECT Model, Year, Color, SUM (Sales) AS sales FROM Sales WHERE Model in ['Ford', 'Chevy'] AND year BETWEEN 1994 AND 1995 GROUP BY CUBE Model, Year, Color ›A relational operator –GROUP BY and ROLL UP are degenerate forms of the operator. ›Aggregates over all attributes in GROUP BY clause as in standard GROUP BY ›It UNIONs in each super-aggregate of global cube—substituting ALL for the aggregation columns ›If there are N attributes in the, there will be 2 N -1 super- aggregate value

Discussion Questions #TODO