Algorithm for the Aggregate Function SUM


SUM function: SUM totals a field of numerical values.

Algorithm 4.1. Evaluating sum() with P-trees.

    total = 0.00;
    For i = 0 to n {
        total = total + 2^i * RootCount(Pi);
    }
    Return total;

For example, to find the total number of products sold in relation S, apply the procedure to the bit-slice P-trees P4,3, P4,2, P4,1 and P4,0 of the # Product attribute.

    # Product values: 10, 5, 6, 7, 11, 9, 3

    P-tree      P4,3   P4,2   P4,1   P4,0
    RootCount   {3}    {3}    {5}    {5}

    2^3 * 3 + 2^2 * 3 + 2^1 * 5 + 2^0 * 5 = 51
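The loop in Algorithm 4.1 can be sketched in Python. This is only a minimal sketch, not the presenters' implementation: it assumes each P-tree is replaced by an uncompressed bit vector held in a Python int, with RootCount reduced to a count of 1-bits. The names bit_slices, root_count and p_sum are hypothetical.

```python
def bit_slices(values, nbits):
    """Return slices[i]: an int whose bit t is set iff bit i of values[t] is set.

    Assumption: a plain bit vector stands in for the compressed P-tree Pi.
    """
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Number of 1-bits, standing in for RootCount on a P-tree."""
    return bin(mask).count("1")


def p_sum(slices):
    """Algorithm 4.1: weight the count of each bit slice by 2^i and add."""
    return sum((1 << i) * root_count(p) for i, p in enumerate(slices))


products = [10, 5, 6, 7, 11, 9, 3]   # the # Product column from the slide
slices = bit_slices(products, 4)     # slices[3] plays the role of P4,3
counts = [root_count(slices[i]) for i in (3, 2, 1, 0)]
total = p_sum(slices)
print(counts, total)                 # [3, 3, 5, 5] 51
```

The result agrees with adding the seven values directly, which is a quick way to check the bit-slice decomposition.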

Algorithm of Aggregates AVERAGE, MAX

Average function: AVERAGE returns the average value in a field. It is calculated from the COUNT and SUM functions: Average() = Sum() / Count().

Algorithm 4.2. Evaluating max() with P-trees.

    max = 0.00; c = 0; /* Pc is set to all 1s */
    For i = n to 0 {
        c = RootCount(Pc AND Pi);
        If (c >= 1) {
            Pc = Pc AND Pi;
            max = max + 2^i;
        }
    }
    Return max;

Worked example on the # Product values 10, 5, 6, 7, 11, 9, 3 (P'i denotes the complement of Pi):

    Step  IF                                          Pc update             Bit
    1.    Pc = P4,3; RootCount(Pc) = 3 >= 1                                 {1}
    2.    RootCount(Pc AND P4,2) = 0 < 1              Pc = Pc AND P'4,2     {0}
    3.    RootCount(Pc AND P4,1) = 2 >= 1             Pc = Pc AND P4,1      {1}
    4.    RootCount(Pc AND P4,0) = 1 >= 1                                   {1}

    2^3 * 1 + 2^2 * 0 + 2^1 * 1 + 2^0 * 1 = 11
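A minimal Python sketch of Algorithm 4.2 and of Average() = Sum() / Count() follows. It assumes plain bit vectors in Python ints in place of P-trees; bit_slices, root_count and p_max are hypothetical helper names, not part of the presentation.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for Pi)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Count of 1-bits, standing in for RootCount."""
    return bin(mask).count("1")


def p_max(slices, n_rows):
    """Algorithm 4.2: scan from the high bit, keeping rows that have the bit set."""
    pc = (1 << n_rows) - 1            # Pc starts as all 1s
    result = 0
    for i in reversed(range(len(slices))):
        hit = pc & slices[i]          # Pc AND Pi
        if root_count(hit) >= 1:      # some surviving row has bit i set
            pc = hit
            result += 1 << i
    return result


products = [10, 5, 6, 7, 11, 9, 3]
slices = bit_slices(products, 4)
maximum = p_max(slices, len(products))

# Average() = Sum() / Count(), with Sum() computed from the bit slices.
total = sum((1 << i) * root_count(p) for i, p in enumerate(slices))
average = total / len(products)
print(maximum, average)
```

When no surviving row has bit i set, Pc is left unchanged, which is equivalent to the slide's Pc = Pc AND P'4,2 in step 2.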

Algorithm of Aggregate Function MIN

Algorithm 4.3. Evaluating min() with P-trees.

    min = 0.00; c = 0; /* Pc is set to all 1s */
    For i = n to 0 {
        c = RootCount(Pc AND NOT(Pi));
        If (c >= 1)
            Pc = Pc AND NOT(Pi);
        Else
            min = min + 2^i;
    }
    Return min;

Worked example on the # Product values 10, 5, 6, 7, 11, 9, 3 (P'i denotes NOT(Pi)):

    Step  IF                                          Pc update             Bit
    1.    Pc = P'4,3; RootCount(Pc) = 4 >= 1                                {0}
    2.    RootCount(Pc AND P'4,2) = 1 >= 1            Pc = Pc AND P'4,2     {0}
    3.    RootCount(Pc AND P'4,1) = 0 < 1             Pc = Pc AND P4,1      {1}
    4.    RootCount(Pc AND P'4,0) = 0 < 1                                   {1}

    2^3 * 0 + 2^2 * 0 + 2^1 * 1 + 2^0 * 1 = 3
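Algorithm 4.3 mirrors the max procedure, and a minimal Python sketch is given here under the same assumptions: plain bit vectors in Python ints replace P-trees, and the helper names (bit_slices, root_count, p_min) are hypothetical. The sketch also assumes the column has at least one row.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for Pi)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Count of 1-bits, standing in for RootCount."""
    return bin(mask).count("1")


def p_min(slices, n_rows):
    """Algorithm 4.3: scan from the high bit, keeping rows that have the bit clear."""
    all_rows = (1 << n_rows) - 1
    pc = all_rows                             # Pc starts as all 1s
    result = 0
    for i in reversed(range(len(slices))):
        miss = pc & ~slices[i] & all_rows     # Pc AND NOT(Pi)
        if root_count(miss) >= 1:
            pc = miss                         # some row has bit i clear: min bit is 0
        else:
            result += 1 << i                  # every surviving row has bit i set
    return result


products = [10, 5, 6, 7, 11, 9, 3]
minimum = p_min(bit_slices(products, 4), len(products))
print(minimum)
```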

Algorithms of Aggregate Function MEDIAN, RANK, TOP-K

The Rank(K) function returns the Kth largest value in a field.

Algorithm 4.4. Evaluating median() with P-trees.

    median = 0.00;
    pos = N/2;       /* rounded up; for rank, pos = K */
    c = 0;           /* Pc is set to all 1s for a single attribute */
    For i = n to 0 {
        c = RootCount(Pc AND Pi);
        If (c >= pos) {
            median = median + 2^i;
            Pc = Pc AND Pi;
        } Else {
            pos = pos - c;
            Pc = Pc AND NOT(Pi);
        }
    }
    Return median;

Worked example on the # Product values 10, 5, 6, 7, 11, 9, 3 (N = 7, so pos = 4):

    Step  IF                                          Pc update and pos                   Bit
    1.    Pc = P4,3; RootCount(Pc) = 3 < 4            Pc = P'4,3; pos = 4 - 3 = 1         {0}
    2.    RootCount(Pc AND P4,2) = 3 >= 1             Pc = Pc AND P4,2                    {1}
    3.    RootCount(Pc AND P4,1) = 2 >= 1             Pc = Pc AND P4,1                    {1}
    4.    RootCount(Pc AND P4,0) = 1 >= 1                                                 {1}

    2^3 * 0 + 2^2 * 1 + 2^1 * 1 + 2^0 * 1 = 7

Top-k (largest k values): find the rank-k value Vk, then find all tuples >= Vk using EINRING.
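The rank search of Algorithm 4.4 can be sketched in Python as below. As before, this assumes plain bit vectors in Python ints instead of P-trees, and bit_slices, root_count and p_rank are hypothetical names. The top-k step at the end uses a plain scan of the column as a stand-in, since EINRING itself is not described in the slides.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for Pi)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Count of 1-bits, standing in for RootCount."""
    return bin(mask).count("1")


def p_rank(slices, n_rows, k):
    """Algorithm 4.4: return the k-th largest value, for 1 <= k <= n_rows."""
    pc = (1 << n_rows) - 1            # Pc starts as all 1s
    pos = k
    result = 0
    for i in reversed(range(len(slices))):
        hit = pc & slices[i]          # Pc AND Pi
        c = root_count(hit)
        if c >= pos:                  # the target lies among rows with bit i set
            result += 1 << i
            pc = hit
        else:                         # skip those c rows and look among the rest
            pos -= c
            pc &= ~slices[i]          # Pc AND NOT(Pi)
    return result


products = [10, 5, 6, 7, 11, 9, 3]
slices = bit_slices(products, 4)
n = len(products)

median = p_rank(slices, n, (n + 1) // 2)     # pos = N/2 rounded up = 4
v3 = p_rank(slices, n, 3)                    # rank-3 value Vk
top3 = [v for v in products if v >= v3]      # plain scan, NOT the EINRING method
print(median, v3, top3)
```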

Iceberg Query Operation Using P-trees

    SELECT Loc, Type, Sum(# Product)
    FROM Relation S
    GROUP BY Loc, Type
    HAVING Sum(# Product) >= 15

Step one: Build value P-trees for the 3 values of attribute Loc, {Loc | New York, Minneapolis, Chicago}. These are PNY, PMN and PCH.

Calculation of value P-tree PNY: because the binary value of New York is 00001, PNY is obtained from the bit-slice P-trees of Loc by formula (1), complementing every slice whose bit is 0.

    PNY = P'1,4 AND P'1,3 AND P'1,2 AND P'1,1 AND P1,0    (1)
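Formula (1) can be sketched in Python as follows. Only the New York code 00001 comes from the slides; the codes for Minneapolis and Chicago and the per-row Loc assignment are hypothetical values chosen for illustration, as are the helper names. Bit vectors in Python ints stand in for P-trees.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for P1,i)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def value_ptree(slices, code, n_rows):
    """AND together Pi where bit i of code is 1 and P'i where it is 0."""
    all_rows = (1 << n_rows) - 1
    mask = all_rows
    for i, p in enumerate(slices):
        mask &= p if (code >> i) & 1 else (~p & all_rows)
    return mask


NY, MN, CH = 0b00001, 0b00010, 0b00011   # NY from the slide; MN and CH assumed
loc_codes = [NY, MN, NY, NY, MN, CH, MN] # hypothetical Loc column, one code per row

loc_slices = bit_slices(loc_codes, 5)    # P1,0 .. P1,4
p_ny = value_ptree(loc_slices, NY, len(loc_codes))

# Rows (1-indexed tuples) whose Loc is New York.
ny_rows = [t + 1 for t in range(len(loc_codes)) if (p_ny >> t) & 1]
print(ny_rows)
```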

After getting the value P-trees for every location, we calculate the total number of products sold in each place. We again use the value New York as our example.

    Sum(# Product | New York) = 2^3 * RootCount(P4,3 AND PNY)
                              + 2^2 * RootCount(P4,2 AND PNY)
                              + 2^1 * RootCount(P4,1 AND PNY)
                              + 2^0 * RootCount(P4,0 AND PNY)
                              = 8 * 1 + 4 * 2 + 2 * 3 + 1 * 1 = 23    (2)

The table shows the total number of products sold in each location. Because the threshold is 15, Chicago is eliminated.

    Loc Values     Sum(# Product)   Threshold
    New York       23               Y
    Minneapolis    18               Y
    Chicago        9                N

Step two: Similarly, build value P-trees for every value of attribute Type. Attribute Type has 4 values, {Type | Notebook, Desktop, Printer, FAX}, giving the value P-trees PNotebook, PDesktop, PPrinter and PFAX.

In the same way we get the summary table for each value of attribute Type. With the threshold T = 15, only the value P-tree of Notebook is used in the next step.

    Type Values    Sum(# Product)   Threshold
    Notebook       28               Y
    Desktop        14               N
    FAX            3                N
    Printer        6                N
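Formula (2) and the per-value summary tables can be sketched in Python. The # Product column and the threshold come from the slides; the Type of each row and the New York rows (tuples 1, 3 and 4) are assumptions, chosen so that the sums reproduce the slide's figures of 23 for New York and 28, 14, 3 and 6 for the Type values. Helper names are hypothetical, and bit vectors in Python ints stand in for P-trees.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for P4,i)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Count of 1-bits, standing in for RootCount."""
    return bin(mask).count("1")


def value_mask(column, value):
    """Bit vector with bit t set iff column[t] == value (stand-in for a value P-tree)."""
    return sum(1 << t for t, v in enumerate(column) if v == value)


def group_sum(slices, mask):
    """Formula (2): sum of 2^i * RootCount(Pi AND Pvalue)."""
    return sum((1 << i) * root_count(p & mask) for i, p in enumerate(slices))


products = [10, 5, 6, 7, 11, 9, 3]
# Assumed Type column, consistent with the slide's Type totals.
types = ["Notebook", "Desktop", "Printer", "Notebook", "Notebook", "Desktop", "FAX"]
p_ny = value_mask([1, 0, 1, 1, 0, 0, 0], 1)   # assumed: tuples 1, 3, 4 are New York

slices = bit_slices(products, 4)
threshold = 15

ny_sum = group_sum(slices, p_ny)
type_sums = {v: group_sum(slices, value_mask(types, v)) for v in sorted(set(types))}
kept_types = [v for v, s in type_sums.items() if s >= threshold]
print(ny_sum, type_sums, kept_types)
```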

Step three: We generate candidate (Loc, Type) pairs only from the store locations and product types that pass the threshold T.

Performing the AND operation on PNY and PNotebook gives the value P-tree PNY AND Notebook. Performing the AND operation on PMN and PNotebook gives the value P-tree PMN AND Notebook.

We calculate the total number of notebooks sold in New York by formula (3).

    Sum(# Product | New York AND Notebook) = 2^3 * RootCount(P4,3 AND PNY AND Notebook)
                                           + 2^2 * RootCount(P4,2 AND PNY AND Notebook)
                                           + 2^1 * RootCount(P4,1 AND PNY AND Notebook)
                                           + 2^0 * RootCount(P4,0 AND PNY AND Notebook)
                                           = 8 * 1 + 4 * 1 + 2 * 2 + 1 * 1 = 17    (3)
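The pairwise AND followed by the weighted root counts can be sketched in Python. The row memberships below are assumptions: New York as tuples 1, 3, 4, Minneapolis including tuple 5, and Notebook as tuples 1, 4, 5, chosen to be consistent with the totals quoted in the slides. Bit vectors in Python ints stand in for P-trees, and all helper names are hypothetical.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for P4,i)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def root_count(mask):
    """Count of 1-bits, standing in for RootCount."""
    return bin(mask).count("1")


def mask_of(tuples):
    """Bit vector with a 1 for each listed 1-indexed tuple number."""
    return sum(1 << (t - 1) for t in tuples)


def group_sum(slices, mask):
    """Sum of 2^i * RootCount(Pi AND mask)."""
    return sum((1 << i) * root_count(p & mask) for i, p in enumerate(slices))


products = [10, 5, 6, 7, 11, 9, 3]
slices = bit_slices(products, 4)

p_ny = mask_of([1, 3, 4])          # assumed New York rows
p_mn = mask_of([2, 5, 7])          # assumed Minneapolis rows (hypothetical)
p_notebook = mask_of([1, 4, 5])    # assumed Notebook rows

p_ny_nb = p_ny & p_notebook        # PNY AND Notebook
p_mn_nb = p_mn & p_notebook        # PMN AND Notebook

ny_nb_counts = [root_count(slices[i] & p_ny_nb) for i in (3, 2, 1, 0)]
ny_nb_sum = group_sum(slices, p_ny_nb)
mn_nb_sum = group_sum(slices, p_mn_nb)
print(ny_nb_counts, ny_nb_sum, mn_nb_sum)
```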

We calculate the total number of notebooks sold in Minneapolis by formula (4).

    Sum(# Product | Minneapolis AND Notebook) = 2^3 * RootCount(P4,3 AND PMN AND Notebook)
                                              + 2^2 * RootCount(P4,2 AND PMN AND Notebook)
                                              + 2^1 * RootCount(P4,1 AND PMN AND Notebook)
                                              + 2^0 * RootCount(P4,0 AND PMN AND Notebook)
                                              = 8 * 1 + 4 * 0 + 2 * 1 + 1 * 1 = 11    (4)

Finally, we obtain the summary table (Table 5). With the threshold T = 15, only the group pair "New York AND Notebook" passes. From the value P-tree PNY AND Notebook, we can see that tuples 1 and 4 are the results of our iceberg query example.

    Loc and Type Values          Sum(# Product)   Threshold
    New York AND Notebook        17               Y
    Minneapolis AND Notebook     11               N
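The three steps of the iceberg query can be put together in one short Python sketch. It is an illustration under stated assumptions rather than the presenters' implementation: bit vectors in Python ints stand in for P-trees, the Loc and Type columns are hypothetical row assignments (so the Minneapolis and Chicago totals it computes need not match the slide table), and every function name is invented for this example. Pruning single values before forming pairs is safe here because # Product is non-negative, so a pair's sum can never exceed the sum of either of its values.

```python
def bit_slices(values, nbits):
    """slices[i] has bit t set iff bit i of values[t] is set (stand-in for P4,i)."""
    slices = [0] * nbits
    for t, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << t
    return slices


def value_mask(column, value):
    """Bit vector with bit t set iff column[t] == value (stand-in for a value P-tree)."""
    return sum(1 << t for t, v in enumerate(column) if v == value)


def group_sum(slices, mask):
    """Sum of 2^i * RootCount(Pi AND mask)."""
    return sum((1 << i) * bin(p & mask).count("1") for i, p in enumerate(slices))


def iceberg(products, locs, types, nbits, threshold):
    """Return {(loc, type): (sum, 1-indexed tuples)} for pairs reaching the threshold."""
    slices = bit_slices(products, nbits)

    def survivors(column):
        # Steps one and two: keep only values whose own sum passes the threshold.
        masks = {v: value_mask(column, v) for v in set(column)}
        return {v: m for v, m in masks.items() if group_sum(slices, m) >= threshold}

    result = {}
    for loc, lm in survivors(locs).items():
        for typ, tm in survivors(types).items():
            pair = lm & tm                       # step three: AND the value P-trees
            total = group_sum(slices, pair)
            if total >= threshold:
                rows = [t + 1 for t in range(len(products)) if (pair >> t) & 1]
                result[(loc, typ)] = (total, rows)
    return result


products = [10, 5, 6, 7, 11, 9, 3]
# Hypothetical Loc and Type columns, one entry per tuple.
locs = ["New York", "Minneapolis", "New York", "New York",
        "Minneapolis", "Chicago", "Minneapolis"]
types = ["Notebook", "Desktop", "Printer", "Notebook", "Notebook", "Desktop", "FAX"]

answer = iceberg(products, locs, types, nbits=4, threshold=15)
print(answer)
```

With these assumed columns the only surviving group is New York AND Notebook with a sum of 17 over tuples 1 and 4, matching the slide's final table.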