5.1Database System Concepts - 6 th Edition Chapter 5: Advanced SQL Advanced Aggregation Features OLAP.

Slides:



Advertisements
Similar presentations
Chapter 18: Data Analysis and Mining Kat Powell. Chapter 18: Data Analysis and Mining ➔ Decision Support Systems ➔ Data Analysis and OLAP ➔ Data Warehousing.
Advertisements

Data Warehouses These slides are a modified version of the slides of the book “Database System Concepts” (Chapter 18), 5th Ed., McGraw-Hill, by Silberschatz,
Chapter 11 Group Functions
OLAP Services Business Intelligence Solutions. Agenda Definition of OLAP Types of OLAP Definition of Cube Definition of DMR Differences between Cube and.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Pivoting and SQL:1999.
©Silberschatz, Korth and Sudarshan5.1Database System Concepts - 6 th Edition Authorization Forms of authorization on parts of the database: Read - allows.
Chapter 5: Advanced SQL.
Database System Concepts ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 18: Data Analysis and Mining.
Chapter 18: Data Analysis and Mining Chapter 18: Data Analysis and Mining Decision Support Systems Data Analysis and OLAP Data Warehousing Data.
1 1 Data Warehousing Decision-Support Systems  Data Analysis  OLAP  Extended aggregation features in SQL –Windowing and ranking  Implementation Techniques.
Data Cube and OLAP Server
Chapter 18: Data Analysis and Mining
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
Horizontal data sets: Number of attributes is of the same order to several orders of magnitude higher than the number of records. Example: genetic data.
©Silberschatz, Korth and Sudarshan22.1Database System Concepts 4 th Edition 1 SQL:1999 Advanced Querying Decision-Support Systems Data Warehousing Data.
©Silberschatz, Korth and Sudarshan22.1Database System Concepts 4 th Edition 1 Extended Aggregation SQL-92 aggregation quite limited  Many useful aggregates.
Chap8: Trends in DBMS 8.1 Database support for Field Entities 8.2 Content-based retrieval 8.3 Introduction to spatial data warehouses 8.4 Summary.
COMP 578 Data Warehousing And OLAP Technology Keith C.C. Chan Department of Computing The Hong Kong Polytechnic University.
Copyright: Silberschatz, Korth and Sudarshan 1 OLAP Functions Order-Dependent Aggregates and Windows in SQL: SQL: same as SQL:1999.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support Chapter 25, Part A.
© Tan,Steinbach, Kumar Introduction to Data Mining 8/05/ Data Warehouse and Data Cube Lecture Notes for Chapter 3 Introduction to Data Mining By.
Lab3 CPIT 440 Data Mining and Warehouse.
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
Chapter 4 Tutorial.
Online Analytical Processing (OLAP) Hweichao Lu CS157B-02 Spring 2007.
OLAP OPERATIONS. OLAP ONLINE ANALYTICAL PROCESSING OLAP provides a user-friendly environment for Interactive data analysis. In the multidimensional model,
1 Basic concepts of On-Line Analytical processing DT211 /4.
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals Presenter : Parminder Jeet Kaur Discussion Lead : Kailang.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Decision Support Chapter 23.
1. Data Analysis and Mining 2 Decision Support Systems Data Analysis and OLAP Data Warehousing Data Mining.
Computing & Information Sciences Kansas State University Tuesday, 01 May 2007CIS 560: Database System Concepts Lecture 41 of 42 Tuesday, 01 May 2007 William.
Chapter 22: Advanced Querying and Information Retrieval
Database System Concepts ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 18: Data Analysis and Mining.
ADVANCE T-SQL: WINDOW FUNCTIONS Rahman Wehelie 7/16/2013 ITC 226.
Computing & Information Sciences Kansas State University Monday, 26 Nov 2007CIS 560: Database System Concepts Lecture 37 of 42 Monday, 26 November 2007.
Roadmap 1.What is the data warehouse, data mart 2.Multi-dimensional data modeling 3.Data warehouse design – schemas, indices 4.The Data Cube operator –
Database Systems Microsoft Access Practical #3 Queries Nos 215.
©Silberschatz, Korth and Sudarshan18.1Database System Concepts - 5 th Edition, Aug 26, 2005 Extended Aggregation in SQL:1999 The cube operation computes.
Computing & Information Sciences Kansas State University Wednesday, 12 Nov 2008CIS 560: Database System Concepts Lecture 31 of 42 Wednesday, 12 November.
Computing & Information Sciences Kansas State University Wednesday, 29 Nov 2006CIS 560: Database System Concepts Lecture 39 of 42 Wednesday, 29 November.
1 On-Line Analytic Processing Warehousing Data Cubes.
Decision support, data mining & data warehousing.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Module B: Advanced SQL.
Data warehousing, data analysis and OLAP Sunita Sarawagi
Computing & Information Sciences Kansas State University Thursday, 03 May 2007CIS 560: Database System Concepts Lecture 42 of 42 Thursday, 03 May 2007.
Decision support, data mining & data warehousing.
Copyright© 2014, Sira Yongchareon Department of Computing, Faculty of Creative Industries and Business Lecturer : Dr. Sira Yongchareon ISCG 6425 Data Warehousing.
Decision support, data mining & data warehousing.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support.
©Silberschatz, Korth and Sudarshan5.1Database System Concepts - 6 th Edition Recursive Queries.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support Chapter 25.
1 Database Systems, 8 th Edition Star Schema Data modeling technique –Maps multidimensional decision support data into relational database Creates.
Introduction to OLAP and Data Warehouse Assoc. Professor Bela Stantic September 2014 Database Systems.
Chapter 20 Data Warehousing and Mining 1 st Semester, 2016 Sanghyun Park.
Data Analysis Decision Support Systems Data Analysis and OLAP Data Warehousing.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation OLAP Queries and SQL:1999.
Data Analysis and OLAP Dr. Ms. Pratibha S. Yalagi Topic Title
Lecturer : Dr. Pavle Mogin
Chapter 5: Advanced SQL.
On-Line Analytic Processing
Chapter 5: Advanced SQL Database System concepts,6th Ed.
SQL/OLAP Sang-Won Lee Let’s e-Wha! URL: Jul. 12th, 2001 SQL/OLAP
Chapter 5: Advanced SQL Database System concepts,6th Ed.
Chapter 5: Advanced SQL.
DATA CUBE Advanced Databases 584.
Data warehouse Design Using Oracle
Chapter 18: Data Analysis and Mining
Data Warehousing Decision-Support Systems
OLAP Functions Order-Dependent Aggregates and Windows in SQL: SQL: same as SQL:1999.
Slides based on those originally by : Parminder Jeet Kaur
Presentation transcript:

5.1Database System Concepts - 6 th Edition Chapter 5: Advanced SQL Advanced Aggregation Features OLAP

5.2Database System Concepts - 6 th Edition Ranking Ranking is done in conjunction with an order by specification. Suppose we are given a relation student_grades(ID, GPA) giving the grade-point average of each student Find the rank of each student. select ID, rank() over (order by GPA desc) as s_rank from student_grades An extra order by clause is needed to get them in sorted order select ID, rank() over (order by GPA desc) as s_rank from student_grades order by s_rank Ranking may leave gaps: e.g. if 2 students have the same top GPA, both have rank 1, and the next rank is 3 dense_rank does not leave gaps, so next dense rank would be 2

5.3Database System Concepts - 6 th Edition Ranking Ranking can be done using basic SQL aggregation, but resultant query is very inefficient select ID, (1 + (select count(*) from student_grades B where B.GPA > A.GPA)) as s_rank from student_grades A order by s_rank;

5.4Database System Concepts - 6 th Edition Ranking (Cont.) Ranking can be done within partition of the data. Given a relation dept_grades (ID, dept_name, GPA) “Find the rank of students within each department.” select ID, dept_name, rank () over (partition by dept_name order by GPA desc) as dept_rank from dept_grades order by dept_name, dept_rank; Multiple rank clauses can occur in a single select clause. Ranking is done after applying group by clause/aggregation Can be used to find top-n results More general than the limit n clause supported by many databases, since it allows top-n within each partition

5.5Database System Concepts - 6 th Edition Windowing Used to smooth out random variations. E.g., moving average: “Given sales values for each date, calculate for each date the average of the sales on that day, the previous day, and the next day” Window specification in SQL: Given relation sales(date, value) select date, avg(value) over (order by date between rows 1 preceding and 1 following) from sales Examples of other window specifications: between rows unbounded preceding and current rows unbounded preceding ( 從自己前面一筆到最前面 ) range between 10 preceding and current row  All rows with values between current row value –10 to current value range interval 10 day preceding  Not including current row

5.6Database System Concepts - 6 th Edition Data Analysis and OLAP Online Analytical Processing (OLAP) Interactive analysis of data, allowing data to be summarized and viewed in different ways in an online fashion (with negligible delay) Data that can be modeled as dimension attributes and measure attributes are called multidimensional data. For the relation sales(item_name, color, clothes_size, quantify) Measure attributes  measure some value  can be aggregated upon  e.g., the attribute quantity of the sales relation Dimension attributes  define the dimensions on which measure attributes (or aggregates thereof) are viewed  e.g., the attributes item_name, color, and clothes_size of the sales relation

5.7Database System Concepts - 6 th Edition Example sales relation

5.8Database System Concepts - 6 th Edition Cross Tabulation of sales by item_name and color The table above is an example of a cross-tabulation (cross-tab), also referred to as a pivot-table. Values for one of the dimension attributes form the row headers Values for another dimension attribute form the column headers Other dimension attributes are listed on top Values in individual cells are (aggregates of) the values of the dimension attributes that specify the cell.

5.9Database System Concepts - 6 th Edition Data Cube A data cube is a multidimensional generalization of a cross-tab Can have n dimensions; we show 3 below Cross-tabs can be used as views on a data cube

5.10Database System Concepts - 6 th Edition Hierarchies on Dimensions Hierarchy on dimension attributes: lets dimensions to be viewed at different levels of detail  E.g., the dimension DateTime can be used to aggregate by hour of day, date, day of week, month, quarter or year

5.11Database System Concepts - 6 th Edition Cross Tabulation With Hierarchy Cross-tabs can be easily extended to deal with hierarchies Can drill down or roll up on a hierarchy

5.12Database System Concepts - 6 th Edition Relational Representation of Cross-tabs Cross-tabs can be represented as relations We use the value all to represent aggregates. The SQL standard actually uses null values in place of all despite confusion with regular null values.

5.13Database System Concepts - 6 th Edition Online Analytical Processing Operations Pivoting: changing the dimensions used in a cross-tab Slicing: creating a cross-tab for fixed values only Sometimes called dicing, particularly when values for multiple dimensions are fixed. Rollup: moving from finer-granularity data to a coarser granularity Drill down: The opposite operation - that of moving from coarser- granularity data to finer-granularity data

5.14Database System Concepts - 6 th Edition Example of “pivot” select * from sales pivot ( sum(quantity) for color in (‘dark’, ‘pastel’, ‘white’) ) order by item_name; The for clause within the pivot clause specifies what values from the attribute color should appear as attribute names in the pivot result. The values for the newly created attributes are specified to come from the attribute quantity, and the aggregate function specifies how the values should be combined.

5.15Database System Concepts - 6 th Edition Extended Aggregation to Support OLAP The cube operation computes union of group by’s on every subset of the specified attributes Example relation for this section sales(item_name, color, clothes_size, quantity) E.g. consider the query select item_name, color, size, sum(number) from sales group by cube(item_name, color, size) This computes the union of eight different groupings of the sales relation: { (item_name, color, size), (item_name, color), (item_name, size), (color, size), (item_name), (color), (size), ( ) } where ( ) denotes an empty group by list. For each grouping, the result contains the null value for attributes not present in the grouping.

5.16Database System Concepts - 6 th Edition Extended Aggregation (Cont.) The rollup construct generates union on every prefix of specified list of attributes E.g., select item_name, color, size, sum(number) from sales group by rollup(item_name, color, size) Generates union of four groupings: { (item_name, color, size), (item_name, color), (item_name), ( ) } Rollup can be used to generate aggregates at multiple levels of a hierarchy. E.g., suppose table itemcategory(item_name, category) gives the category of each item. Then select category, item_name, sum(number) from sales, itemcategory where sales.item_name = itemcategory.item_name group by rollup(category, item_name) would give a hierarchical summary by item_name and by category.

5.17Database System Concepts - 6 th Edition Extended Aggregation (Cont.) Multiple rollups and cubes can be used in a single group by clause Each generates set of group by lists, cross product of sets gives overall set of group by lists E.g., select item_name, color, size, sum(number) from sales group by rollup(item_name), rollup(color, size) generates the groupings {item_name, ()} X {(color, size), (color), ()} = { (item_name, color, size), (item_name, color), (item_name), (color, size), (color), ( ) }