
Database Principles SQL 2

Aggregate Queries
SQL has five built-in "column functions", called aggregate functions:
- min(): Returns the minimum value in a column
- max(): Returns the maximum value in a column
- sum(): Returns the sum of the values in a numeric column
- count(): Returns the number of non-null values in a column
- avg(): Returns the average of the values in a numeric column
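As a minimal sketch of the five functions, the snippet below runs them against an in-memory SQLite database from Python. SQLite stands in for db2 here, and the table contents are made up for illustration; they are not the data behind the slides.

```python
import sqlite3

# Hypothetical sample data: accession numbers and prices are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE copy (accession_no TEXT, isbn TEXT, p_price REAL)")
conn.executemany(
    "INSERT INTO copy VALUES (?, ?, ?)",
    [("a1", "1-11", 10.0), ("a2", "1-11", 20.0),
     ("a3", "2-22", 20.0), ("a4", "2-22", None)],
)

# One row comes back because the aggregates run over the whole table.
row = conn.execute(
    "SELECT min(p_price), max(p_price), sum(p_price), "
    "count(p_price), count(DISTINCT p_price), count(*), avg(p_price) "
    "FROM copy"
).fetchone()
print(row)
```

The null price is skipped by count(p_price) and avg(p_price), while count(*) still counts the row it sits in.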

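Simple Syntax for Aggregate Functions:
    select <select_list>
    from <table_list>
    where <where_clause>
    group by <group_by_list>
The non-aggregate columns of the select_list are repeated in the group_by clause.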

Example:
    select min(p_price) AS MIN,
           max(p_price) AS MAX,
           sum(p_price) AS SUM,
           count(p_price) AS COUNT,
           count(distinct p_price) AS COUNT_DISTINCT,
           CAST(avg(p_price) AS NUMERIC(5,2)) AS AVERAGE,
           CAST(avg(distinct p_price) AS NUMERIC(5,2)) AS AVERAGE_DISTINCT
    from copy
MIN MAX SUM COUNT COUNT_DISTINCT AVERAGE AVERAGE_DISTINCT
record(s) selected.

The Gory Details:
Using aggregate functions on table columns (or expressions) becomes more complicated when the functions operate on subgroups of rows, where each subgroup is defined by the values in some other column or columns. This is similar to something you might do with Excel.

Another example: Perform the previous query, but aggregate over the individual books in the Copy table and not over the entire table.
    select ISBN,
           min(p_price) AS MIN,
           max(p_price) AS MAX,
           sum(p_price) AS SUM,
           count(p_price) AS COUNT,
           count(distinct p_price) AS COUNT_DISTINCT,
           CAST(avg(p_price) AS NUMERIC(5,2)) AS AVERAGE,
           CAST(avg(distinct p_price) AS NUMERIC(5,2)) AS AVERAGE_DISTINCT
    from copy
    group by ISBN

The Answer (one row per ISBN):
ISBN MIN MAX SUM COUNT COUNT_DISTINCT AVERAGE AVERAGE_DISTINCT

How Group-By Works:
Step 1: Ignore the group_by clause; use the where_clause to build a work table invisible to the programmer. The work table will contain all the columns necessary to calculate the final result table.
Step 2: Use the columns in the group_by clause to divide the work table into groups where the values of the group_by columns are the same.
Step 3: Calculate the aggregate functions of the select_list one group at a time.
Step 4: Produce one row of output per group.
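The four steps can be mimicked in a few lines of plain Python. This is only a sketch of the conceptual model, not how db2 executes the query, and the rows in copy_rows are made-up values.

```python
from collections import defaultdict

# Hypothetical rows of the Copy table as (isbn, p_price); values are made up.
copy_rows = [("1-11", 10.0), ("2-22", 30.0), ("1-11", 20.0), ("2-22", 30.0)]


def group_by_isbn(rows):
    # Step 1: build the work table (there is no where_clause, so keep every row).
    work_table = [(isbn, price) for isbn, price in rows]

    # Step 2: divide the work table into groups sharing the same ISBN.
    groups = defaultdict(list)
    for isbn, price in work_table:
        groups[isbn].append(price)

    # Steps 3 and 4: aggregate one group at a time, one output row per group.
    result = []
    for isbn in sorted(groups):
        prices = groups[isbn]
        result.append((
            isbn,
            min(prices),                # MIN
            max(prices),                # MAX
            sum(prices),                # SUM
            len(prices),                # COUNT
            len(set(prices)),           # COUNT_DISTINCT
            sum(prices) / len(prices),  # AVERAGE
        ))
    return result


summary = group_by_isbn(copy_rows)
for line in summary:
    print(line)
```

A dictionary keyed on the group_by value is used here instead of sorting; either approach yields the same groups.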

Step 1: Create the work table with all the data needed to produce the final output.
Work table columns: ISBN, P_PRICE
Remember that the final output contains the ISBN column and various aggregate functions applied to p_price.

Step 2: Break the work table into groups using the columns of the group_by clause (ISBN). Each group contains a single set of values for the group_by columns, so there is a single ISBN value in each group.
Work table columns: ISBN, P_PRICE
NOTE: This can be done by sorting the rows of the work table on the columns of the group_by clause.

Step 3: Calculate the aggregate functions of the select_list one group at a time.
Work table columns: ISBN, P_PRICE
Computed per group: MIN MAX SUM COUNT COUNT_DISTINCT AVERAGE AVERAGE_DISTINCT

Step 4: Produce one row of output per group.
ISBN MIN MAX SUM COUNT COUNT_DISTINCT AVERAGE AVERAGE_DISTINCT

Observations:
It is vitally important that there not be any variation in the non-aggregate values in each group, since only one row of output per group is permitted and there can be no ambiguity about what goes in that row. For this reason db2 insists that the non-aggregate columns of the select_list match the columns in the group_by clause.
Queries like the following are not permitted because of the potential that author and title might not be constant for a given ISBN (even though we know they are).
    select k.ISBN, k.author, k.title, sum(p_price) AS SUM
    from copy c, book k
    where c.isbn = k.isbn
    group by k.ISBN

Explanation of Single-Row Rule:
Suppose we started to build the work table for the previous query and there were two copies with the same ISBN but a different author or title. For the group '5-55', what would the single output row look like?
    select k.ISBN, k.author, k.title, sum(p_price) AS SUM
    from copy c, book k
    where c.isbn = k.isbn
    group by k.ISBN
Work table columns: ISBN, Author, Title, p_price
The output row for '5-55' could carry one author/title combination OR the other. Since db2 can't decide, it doesn't let this happen.

Solution: Rewrite the query so that every non-aggregate column appears in the group_by clause. The single group in the work table then becomes two groups, and there is no ambiguity about the output.
    select k.ISBN, k.author, k.title, sum(p_price) AS SUM
    from copy c, book k
    where c.isbn = k.isbn
    group by k.ISBN, k.author, k.title
Work table columns: ISBN, Author, Title, p_price

Example: Find how many books have been borrowed by each cardholder.
    select b_name, b_addr, count(*)
    from cardholder ch, borrows b
    where ch.borrowerid = b.borrowerid
    group by b_name, b_addr
B_NAME  B_ADDR
diana   Tilson     1
jo-ann  New Paltz  2
john    Kingston   2
john    New Paltz  2
mike    Modena     3
susan   Wallkill   1
Problem: The result has only six cardholders, but the original Cardholder table has seven.
Analysis: One cardholder (Albert from Rosendale) has not borrowed any book and does not appear in the query result. Why?

Example (cont): According to our description of how group by works, the first thing created is a work table. Let's look at that table, built with the join_term ch.borrowerid = b.borrowerid.
B_NAME  B_ADDR     L_DATE
john    New Paltz  12/10/1992
john    New Paltz  12/01/1992
jo-ann  New Paltz  12/14/1992
jo-ann  New Paltz  11/30/1992
mike    Modena     12/08/1992
mike    Modena     12/04/1992
john    Kingston   12/09/1992
diana   Tilson     12/12/1992
susan   Wallkill   12/01/1992
john    Kingston   11/28/1992
Albert of Rosendale is missing from the work table because the join_term is never true for that cardholder. Since Albert never makes it into the work table, he can never make it into the final answer table.

Solution: SQL has a special join called the outer join that helps resolve this problem. The left outer join acts like a normal join when the join_term is true. When the join_term is never true for a row in the table to the left of the left outer join keywords, the left outer join still produces that row once, with nulls in the columns of the right-hand table. This changes the work table.
    select b_name, b_addr, count(l_date)
    from cardholder ch left outer join borrows b
         on ch.borrowerid = b.borrowerid
    group by b_name, b_addr

Solution (2): The work table now contains a row for Albert with a null l_date. Since no row of the Borrows table joins with the row containing Albert's information, there is no corresponding l_date value, so the work table has to put a null in place of a date. On top of that, count(l_date) ignores nulls, so a group whose only l_date is null gets a count of 0.
    select b_name, b_addr, count(l_date)
    from cardholder ch left outer join borrows b
         on ch.borrowerid = b.borrowerid
    group by b_name, b_addr
B_NAME  B_ADDR     L_DATE
john    New Paltz  12/10/1992
john    New Paltz  12/01/1992
albert  Rosendale  null
jo-ann  New Paltz  12/14/1992
jo-ann  New Paltz  11/30/1992
mike    Modena     12/08/1992
mike    Modena     12/04/1992
john    Kingston   12/09/1992
diana   Tilson     12/12/1992
susan   Wallkill   12/01/1992
john    Kingston   11/28/1992

Solution (3): The final table then becomes:
B_NAME  B_ADDR
albert  Rosendale  0
diana   Tilson     1
jo-ann  New Paltz  2
john    Kingston   2
john    New Paltz  2
mike    Modena     3
susan   Wallkill   1
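The difference between the two joins can be reproduced on a reduced data set. The sketch below uses SQLite in place of db2 and only three of the cardholders; the accession numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT, l_date TEXT);
INSERT INTO cardholder VALUES
    (1345, 'albert', 'Rosendale'), (9823, 'diana', 'Tilson'), (7635, 'john', 'Kingston');
INSERT INTO borrows VALUES
    (9823, 'c1', '1992-12-12'), (7635, 'c2', '1992-12-09'), (7635, 'c3', '1992-11-28');
""")

# Ordinary join: albert has no matching borrows row, so he never reaches a group.
inner_rows = conn.execute(
    "SELECT b_name, b_addr, count(*) "
    "FROM cardholder ch JOIN borrows b ON ch.borrowerid = b.borrowerid "
    "GROUP BY b_name, b_addr ORDER BY b_name"
).fetchall()

# Left outer join: albert is kept with a null l_date, which count(l_date) ignores.
outer_rows = conn.execute(
    "SELECT b_name, b_addr, count(l_date) "
    "FROM cardholder ch LEFT OUTER JOIN borrows b ON ch.borrowerid = b.borrowerid "
    "GROUP BY b_name, b_addr ORDER BY b_name"
).fetchall()

print(inner_rows)
print(outer_rows)
```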

Alternative (Incorrect) Solution: You must be careful that the column used in the aggregate function comes from the right-hand table. The following query fails to produce the correct result. When count() is applied to ch.borrowerid, Albert's row holds a non-null value, so the count comes out as 1 rather than 0.
    select b_name, b_addr, count(ch.borrowerid)
    from cardholder ch left outer join borrows b
         on ch.borrowerid = b.borrowerid
    group by b_name, b_addr
Work table (the column being counted is ch.borrowerid):
B_NAME  B_ADDR     BORROWERID
diana   Tilson     9823
jo-ann  New Paltz  1325
Albert  Rosendale  1345
john    Kingston   7635
john    New Paltz  1234
mike    Modena     2653
susan   Wallkill   5342

Alternative (Incorrect) Solution (cont): Trying to count from the Cardholder table and not the Borrows table yields the following incorrect solution:
    select b_name, b_addr, count(ch.borrowerid)
    from cardholder ch left outer join borrows b
         on ch.borrowerid = b.borrowerid
    group by b_name, b_addr
B_NAME  B_ADDR
albert  Rosendale  1    (turns out to be 1 instead of the correct 0)
diana   Tilson     1
jo-ann  New Paltz  2
john    Kingston   2
john    New Paltz  2
mike    Modena     3
susan   Wallkill   1
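To see why the choice of counted column matters, the sketch below computes three counts for a cardholder with no loans. SQLite stands in for db2, and the single borrows row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT, l_date TEXT);
INSERT INTO cardholder VALUES (1345, 'albert', 'Rosendale'), (9823, 'diana', 'Tilson');
INSERT INTO borrows VALUES (9823, 'c1', '1992-12-12');
""")

# albert's single work-table row has a null l_date but a non-null ch.borrowerid.
albert_counts = conn.execute(
    "SELECT count(l_date), count(ch.borrowerid), count(*) "
    "FROM cardholder ch LEFT OUTER JOIN borrows b ON ch.borrowerid = b.borrowerid "
    "WHERE b_name = 'albert' "
    "GROUP BY b_name, b_addr"
).fetchone()
print(albert_counts)
```

Only the count over a right-hand column reports zero loans; count(ch.borrowerid) and count(*) both count the null-padded row itself.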

Left Outer Join vs Right Outer Join: The following are equivalent:
    select b_name, b_addr, count(l_date)
    from cardholder ch left outer join borrows b
         on ch.borrowerid = b.borrowerid
    group by b_name, b_addr

    select b_name, b_addr, count(l_date)
    from borrows b right outer join cardholder ch
         on b.borrowerid = ch.borrowerid
    group by b_name, b_addr

Warning: You are not allowed to use an aggregate function in a where_clause except inside a subquery. The reason is that where_clause conditions are evaluated one row at a time, whereas count(*) is always applied to a set of rows as a unit.
Find the cardholders with two books borrowed (incorrect):
    select b_name, b_addr
    from cardholder ch, borrows b
    where ch.borrowerid = b.borrowerid
      AND count(*) = 2    -- this causes a syntax error
Find the cardholders with two books borrowed (correct, using a correlated subquery):
    select b_name, b_addr
    from cardholder ch
    where 2 = (select count(*)
               from borrows b
               where b.borrowerid = ch.borrowerid)
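Both forms can be tried out in SQLite, used here as a stand-in for db2 (the error reported by db2 will be worded differently). The loans are a made-up reduced sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT);
INSERT INTO cardholder VALUES
    (1345, 'albert', 'Rosendale'), (9823, 'diana', 'Tilson'), (7635, 'john', 'Kingston');
INSERT INTO borrows VALUES (9823, 'c1'), (7635, 'c2'), (7635, 'c3');
""")

# An aggregate placed directly in the where_clause is rejected.
try:
    conn.execute(
        "SELECT b_name, b_addr FROM cardholder ch, borrows b "
        "WHERE ch.borrowerid = b.borrowerid AND count(*) = 2"
    )
    rejected = False
except sqlite3.Error:
    rejected = True

# The correlated subquery is re-evaluated once per cardholder row.
two_books = conn.execute(
    "SELECT b_name, b_addr FROM cardholder ch "
    "WHERE 2 = (SELECT count(*) FROM borrows b WHERE b.borrowerid = ch.borrowerid)"
).fetchall()

print(rejected, two_books)
```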

Complete Group By Syntax:
    select <select_list>
    from <table_list>
    where <where_clause>
    group by <group_by_list>
    having <having_clause>
The having_clause is intended to do for groups what the where_clause does for rows. In other words, the having_clause is intended to include some groups and not others.

How the Complete Group-By Works:
Step 1: Ignore the group_by clause; use the where_clause to build a work table invisible to the programmer. The work table will contain all the columns necessary to calculate the final result table.
Step 2: Use the columns in the group_by clause to divide the work table into groups where the values of the group_by columns are the same.
Step 3: Apply the having_clause condition to each group in turn, evaluating any aggregate functions it uses and throwing away groups where it is false.
Step 4: Calculate the aggregate functions of the select_list one group at a time.
Step 5: Produce one row of output per group.
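A small sketch of the having_clause filtering whole groups, again using SQLite as a stand-in for db2. The prices and the threshold of 50 are hypothetical values chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT);
CREATE TABLE copy (accession_no TEXT PRIMARY KEY, p_price REAL);
INSERT INTO cardholder VALUES (9823, 'diana', 'Tilson'), (7635, 'john', 'Kingston');
INSERT INTO borrows VALUES (9823, 'c1'), (7635, 'c2'), (7635, 'c3');
INSERT INTO copy VALUES ('c1', 25.0), ('c2', 30.0), ('c3', 40.0);
""")

# where builds the work table row by row; having then keeps or drops whole groups.
big_borrowers = conn.execute(
    "SELECT b_name, b_addr, sum(p_price) "
    "FROM cardholder ch, borrows b, copy c "
    "WHERE ch.borrowerid = b.borrowerid AND b.accession_no = c.accession_no "
    "GROUP BY b_name, b_addr "
    "HAVING sum(p_price) >= 50 "
    "ORDER BY b_name"
).fetchall()
print(big_borrowers)
```

The group for diana totals 25.0 and is thrown away; only john's group survives to the output.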

Example: For each cardholder, find the total value of all books on loan to that cardholder, provided the total value exceeds $…
NOTE: We don't need to use a left outer join here because we are only interested in cardholders with one or more book loans.
    select b_name, b_addr, sum(p_price)
    from cardholder ch, borrows b, copy c
    where ch.borrowerid = b.borrowerid
      AND b.accession_no = c.accession_no
    group by b_name, b_addr
    having sum(p_price) >= …
B_NAME  B_ADDR
john    Kingston
mike    Modena    95.00

Example 2: For each cardholder, find the total value of all books on loan to that cardholder, provided the total value is less than $40.00.
NOTES:
- coalesce(A,B): if A is null then the value is B; otherwise the value is A.
- The join to Copy must also be a left outer join. A condition such as b.accession_no = c.accession_no in the where_clause would be false for the null-padded row of a cardholder with no loans and would remove that row from the work table.
    select b_name, b_addr, coalesce(sum(p_price),0)
    from cardholder ch
         left outer join borrows b on ch.borrowerid = b.borrowerid
         left outer join copy c on b.accession_no = c.accession_no
    group by b_name, b_addr
    having coalesce(sum(p_price),0.0) < 40.00;
B_NAME  B_ADDR
albert  Rosendale   0.00
diana   Tilson
jo-ann  New Paltz
john    New Paltz
susan   Wallkill   37.00
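The role of coalesce() can be checked on a reduced, made-up data set. In this sketch SQLite stands in for db2, and the second left outer join to Copy is an assumption of the sketch so that a cardholder with no loans stays in the work table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT);
CREATE TABLE copy (accession_no TEXT PRIMARY KEY, p_price REAL);
INSERT INTO cardholder VALUES
    (1345, 'albert', 'Rosendale'), (9823, 'diana', 'Tilson'), (7635, 'john', 'Kingston');
INSERT INTO borrows VALUES (9823, 'c1'), (7635, 'c2'), (7635, 'c3');
INSERT INTO copy VALUES ('c1', 25.0), ('c2', 30.0), ('c3', 40.0);
""")

# sum() over albert's group sees only a null and returns null;
# coalesce() turns that null into 0 so the comparison with 40 can succeed.
small_borrowers = conn.execute(
    "SELECT b_name, b_addr, coalesce(sum(p_price), 0) "
    "FROM cardholder ch "
    "LEFT OUTER JOIN borrows b ON ch.borrowerid = b.borrowerid "
    "LEFT OUTER JOIN copy c ON b.accession_no = c.accession_no "
    "GROUP BY b_name, b_addr "
    "HAVING coalesce(sum(p_price), 0) < 40 "
    "ORDER BY b_name"
).fetchall()
print(small_borrowers)
```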

Revisit Left Outer Join: Yes, know how to write them, but avoid them if you can.
Consider: to fully join Cardholder to Borrows or Copy we need a left outer join. To join Book to Copy we do not need a left outer join.

Dummy Rows in Copy and Book
Perform the following inserts into Book and Copy. Think of these as "dummy" rows; the inserts need to be done only once.
    insert into Book (ISBN) values ('0-00');
    insert into Copy (accession_no, ISBN) values ('0', '0-00');
The minimum participation number of COPY in is_copy_of BOOK stays at 1.

Insert a New Cardholder
Every time you add a new Cardholder, add a corresponding dummy row in Borrows.
    insert into Cardholder (borrowerid, b_name, b_addr, b_status)
    values (9999, 'Donna', 'Accord', 'junior');
    -- also add
    insert into Borrows (borrowerid, accession_no)
    values (9999, '0');
What we have done is make it appear as though Donna has borrowed the "dummy" copy of the "dummy" book. Now the minimum participation number of Cardholder in the borrows relationship with Copy is 1.

Automatic Input
Databases provide a mechanism called a trigger to do automatic things like the insert into Borrows. Insert a row into Cardholder and the trigger "fires" and causes an insert to take place in Borrows as well. So even Cardholders who have borrowed nothing have borrowed the dummy book.
    create trigger i_cardholder
    after insert on Cardholder
    referencing new as n
    for each row
    begin atomic
      insert into borrows (borrowerid, accession_no)
      values (n.borrowerid, '0');
    end

Automatic Input (cont)
We also need triggers on Borrows, because a cardholder must have borrowed either the dummy book or a real book, but not both. The when condition keeps the trigger from deleting the dummy row at the moment the dummy row itself is inserted.
    create trigger i_borrows
    after insert on borrows
    referencing new as n
    for each row
    when (n.accession_no <> '0')
    begin atomic
      delete from borrows
      where borrowerid = n.borrowerid
        and accession_no = '0';
    end

Automatic Input (cont)
And when we delete a book loan, the dummy row is put back if the cardholder has no loans left.
    create trigger d_borrows
    after delete on borrows
    referencing old as o
    for each row
    BEGIN ATOMIC
      declare v_accession_cnt int;
      set v_accession_cnt = (select count(*)
                             from borrows
                             where borrowerid = o.borrowerid);
      IF (v_accession_cnt = 0) THEN
        insert into borrows (borrowerid, accession_no)
        values (o.borrowerid, '0');
      END IF;
    END
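The interplay of the three triggers can be exercised in SQLite, whose trigger syntax differs from db2 (NEW and OLD are built in, triggers are per-row by default, and conditions go in a WHEN clause). This is a sketch under those assumptions; the WHEN guards and the accession number 'a17' are additions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT);

-- A new cardholder starts out holding the dummy copy 0.
CREATE TRIGGER i_cardholder AFTER INSERT ON cardholder
BEGIN
    INSERT INTO borrows (borrowerid, accession_no) VALUES (NEW.borrowerid, '0');
END;

-- A real loan replaces the dummy loan.
CREATE TRIGGER i_borrows AFTER INSERT ON borrows
WHEN NEW.accession_no <> '0'
BEGIN
    DELETE FROM borrows WHERE borrowerid = NEW.borrowerid AND accession_no = '0';
END;

-- Returning the last real loan restores the dummy loan.
CREATE TRIGGER d_borrows AFTER DELETE ON borrows
WHEN OLD.accession_no <> '0'
     AND NOT EXISTS (SELECT 1 FROM borrows WHERE borrowerid = OLD.borrowerid)
BEGIN
    INSERT INTO borrows (borrowerid, accession_no) VALUES (OLD.borrowerid, '0');
END;
""")


def loans(borrowerid):
    """Return the accession numbers currently recorded for one cardholder."""
    rows = conn.execute(
        "SELECT accession_no FROM borrows WHERE borrowerid = ? ORDER BY accession_no",
        (borrowerid,),
    ).fetchall()
    return [r[0] for r in rows]


conn.execute("INSERT INTO cardholder VALUES (9999, 'Donna')")
after_signup = loans(9999)   # only the dummy loan
conn.execute("INSERT INTO borrows VALUES (9999, 'a17')")
after_loan = loans(9999)     # the real loan has replaced the dummy
conn.execute("DELETE FROM borrows WHERE borrowerid = 9999 AND accession_no = 'a17'")
after_return = loans(9999)   # the dummy loan is back
print(after_signup, after_loan, after_return)
```

At every point the cardholder has either the dummy loan or real loans, never both and never neither, which is what lets later queries use an ordinary join.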

No More Left Outer Join: Find the number of books borrowed by each cardholder.
NOTES:
- qnec(a,b) is a user-defined function that returns 1 if a != b and 0 if a = b.
- Summing these 0/1 values counts the rows where the value is 1, that is, the rows for real (non-dummy) loans.
    select ch.borrowerid, b_name, b_addr, sum(qnec(b.accession_no,'0'))
    from cardholder ch, borrows b
    where ch.borrowerid = b.borrowerid
    group by ch.borrowerid, b_name, b_addr;
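The slides do not show how qnec is defined. As a sketch, it can be registered as a Python function on a SQLite connection (SQLite standing in for db2); the data below is a made-up reduced sample in which albert holds only the dummy copy.

```python
import sqlite3


def qnec(a, b):
    # Hypothetical Python version of qnec: 1 if a != b, otherwise 0.
    return 1 if a != b else 0


conn = sqlite3.connect(":memory:")
conn.create_function("qnec", 2, qnec)
conn.executescript("""
CREATE TABLE cardholder (borrowerid INTEGER PRIMARY KEY, b_name TEXT, b_addr TEXT);
CREATE TABLE borrows (borrowerid INTEGER, accession_no TEXT);
INSERT INTO cardholder VALUES
    (1345, 'albert', 'Rosendale'), (9823, 'diana', 'Tilson'), (7635, 'john', 'Kingston');
INSERT INTO borrows VALUES (1345, '0'), (9823, 'c1'), (7635, 'c2'), (7635, 'c3');
""")

# Every cardholder has at least one borrows row, so an ordinary join keeps them all;
# the dummy row contributes 0 to the sum and each real loan contributes 1.
counts = conn.execute(
    "SELECT ch.borrowerid, b_name, b_addr, sum(qnec(b.accession_no, '0')) "
    "FROM cardholder ch, borrows b "
    "WHERE ch.borrowerid = b.borrowerid "
    "GROUP BY ch.borrowerid, b_name, b_addr "
    "ORDER BY b_name"
).fetchall()
print(counts)
```

Where user-defined functions are not available, an expression such as CASE WHEN b.accession_no <> '0' THEN 1 ELSE 0 END plays the same role.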

Non-Aggregate Example
Suppose we want a list of cardholder names and the titles of all the books each cardholder has borrowed, with a "-" in place of the title where the cardholder has borrowed no books. With the dummy rows in place, ordinary joins are enough:
    select b_name, title
    from cardholder ch, borrows b, copy c, book k
    where ch.borrowerid = b.borrowerid
      and b.accession_no = c.accession_no
      and c.isbn = k.isbn;
compared to the left outer join version, in which every join after Cardholder has to be a left outer join so that cardholders with no loans are kept:
    select b_name, title
    from cardholder ch
         left outer join borrows b on ch.borrowerid = b.borrowerid
         left outer join copy c on b.accession_no = c.accession_no
         left outer join book k on c.isbn = k.isbn;