Database Systems Subqueries, Aggregation


Database Systems Subqueries, Aggregation Gergely Lukács Pázmány Péter Catholic University Faculty of Information Technology Budapest, Hungary lukacs@itk.ppke.hu

Overview
Subqueries: WHERE clause, FROM clause
Views
Aggregation: GROUP BY, HAVING
Grouping and aggregating + joining: chasm trap, fan trap
SQL for analysis, window functions

Subqueries WHERE CLAUSE

Subqueries Returning Relations
Company(name, city)
Product(pname, maker)
Purchase(id, product, buyer)
Return cities where one can find companies that manufacture products bought by Joe Blow:
    SELECT Company.city
    FROM Company
    WHERE Company.name IN (SELECT Product.maker
                           FROM Purchase
                                INNER JOIN Product
                                        ON Product.pname = Purchase.product
                           WHERE Purchase.buyer = 'Joe Blow');

Subqueries Returning Relations
Is it equivalent to this?
    SELECT company.city
    FROM company
         INNER JOIN product
                 ON company.name = product.maker
         INNER JOIN purchase
                 ON product.pname = purchase.product
                    AND purchase.buyer = 'Joe Blow';
Beware of duplicates!

Removing Duplicates
With DISTINCT in both queries, each city is returned once, even if the join matches several purchases or several qualifying companies share a city:
    SELECT DISTINCT Company.city
    FROM Company
    WHERE Company.name IN (SELECT Product.maker
                           FROM Purchase, Product
                           WHERE Product.pname = Purchase.product
                             AND Purchase.buyer = 'Joe Blow');

    SELECT DISTINCT company.city
    FROM company
         INNER JOIN product
                 ON company.name = product.maker
         INNER JOIN purchase
                 ON product.pname = purchase.product
                    AND purchase.buyer = 'Joe Blow';
Now they are equivalent.
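The duplicate problem can be reproduced on a tiny database. The sketch below uses Python's built-in sqlite3 module; the rows are made-up sample data (an assumption, not from the slides), chosen so that Joe Blow bought two products from the same company.

```python
import sqlite3

# In-memory database with made-up sample rows (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company  (name TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE Product  (pname TEXT PRIMARY KEY, maker TEXT);
    CREATE TABLE Purchase (id INTEGER PRIMARY KEY, product TEXT, buyer TEXT);
    INSERT INTO Company  VALUES ('GizmoWorks', 'Seattle'), ('Acme', 'Boston');
    INSERT INTO Product  VALUES ('gizmo', 'GizmoWorks'), ('gadget', 'GizmoWorks'),
                                ('anvil', 'Acme');
    INSERT INTO Purchase VALUES (1, 'gizmo', 'Joe Blow'), (2, 'gadget', 'Joe Blow'),
                                (3, 'anvil', 'Ann Lee');
""")

# Subquery version: one row per qualifying company.
in_rows = conn.execute("""
    SELECT Company.city
    FROM Company
    WHERE Company.name IN (SELECT Product.maker
                           FROM Purchase
                                INNER JOIN Product ON Product.pname = Purchase.product
                           WHERE Purchase.buyer = 'Joe Blow')
""").fetchall()

JOIN_QUERY = """
    SELECT {distinct} company.city
    FROM company
         INNER JOIN product  ON company.name = product.maker
         INNER JOIN purchase ON product.pname = purchase.product
                            AND purchase.buyer = 'Joe Blow'
"""
# Join version: one row per matching purchase, so Seattle appears twice.
join_rows = conn.execute(JOIN_QUERY.format(distinct="")).fetchall()
# DISTINCT collapses the repeated city.
distinct_rows = conn.execute(JOIN_QUERY.format(distinct="DISTINCT")).fetchall()

print(in_rows, join_rows, distinct_rows)
```

With this data the plain join lists Seattle once per purchase, while the IN version and the DISTINCT join list it once.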

Subqueries Returning Relations
You can also use:
    s > ALL R
    s > ANY R
    EXISTS R
Product(pname, price, category, maker)
Find products that are more expensive than all those produced by "Gizmo-Works":
    SELECT pname
    FROM Product
    WHERE price > ALL (SELECT price
                       FROM Product
                       WHERE maker = 'Gizmo-Works')
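Not every system implements the ALL quantifier; SQLite, for instance, does not. A possible rewrite uses NOT EXISTS: a product qualifies if no Gizmo-Works product has an equal or higher price. The sketch below assumes made-up sample rows and no NULL prices (with NULLs the two forms can differ).

```python
import sqlite3

# Made-up sample data (illustrative assumption); prices are never NULL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT PRIMARY KEY, price REAL, category TEXT, maker TEXT);
    INSERT INTO Product VALUES
        ('gizmo',  20, 'gadgets', 'Gizmo-Works'),
        ('widget', 30, 'gadgets', 'Gizmo-Works'),
        ('camera', 50, 'photo',   'Canon'),
        ('pen',     5, 'office',  'Bic');
""")

# "price > ALL (prices of Gizmo-Works)" expressed as:
# there is no Gizmo-Works product whose price is >= this product's price.
more_expensive = conn.execute("""
    SELECT p.pname
    FROM Product AS p
    WHERE NOT EXISTS (SELECT 1
                      FROM Product AS g
                      WHERE g.maker = 'Gizmo-Works'
                        AND g.price >= p.price)
""").fetchall()

print(more_expensive)
```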

Correlated Queries
Movie(title, year, director, length)
Find movies whose title appears more than once.
    SELECT DISTINCT title
    FROM Movie AS x
    WHERE year <> ANY (SELECT year
                       FROM Movie
                       WHERE title = x.title);
Notes:
(1) scope of variables: x.title in the subquery refers to the row of the outer query
(2) this can still be expressed as a single SELECT-FROM-WHERE (SFW) query
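Note (2) can be illustrated with a self-join. The sketch below runs in SQLite, which has no ANY quantifier, so the correlated version is written with EXISTS instead (an equivalent formulation chosen here, not the slide's syntax). The movie rows are made-up sample data.

```python
import sqlite3

# Made-up sample data (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (title TEXT, year INTEGER, director TEXT, length INTEGER);
    INSERT INTO Movie VALUES
        ('King Kong', 1933, 'Cooper',    100),
        ('King Kong', 2005, 'Jackson',   187),
        ('Jaws',      1975, 'Spielberg', 124);
""")

# Correlated subquery: the inner query is re-evaluated for each outer row x.
correlated = conn.execute("""
    SELECT DISTINCT title
    FROM Movie AS x
    WHERE EXISTS (SELECT 1
                  FROM Movie AS y
                  WHERE y.title = x.title
                    AND y.year <> x.year)
""").fetchall()

# The same question as a single SELECT-FROM-WHERE query (self-join).
single_sfw = conn.execute("""
    SELECT DISTINCT x.title
    FROM Movie AS x, Movie AS y
    WHERE x.title = y.title
      AND x.year <> y.year
""").fetchall()

print(correlated, single_sfw)
```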

IN, NOT IN, EXISTS, NOT EXISTS, ALL

Subqueries FROM CLAUSE

[Figure: the statement SQL> SELECT … FROM … WHERE … with three "Subquery" labels]

Subquery in the FROM clause
    SELECT DISTINCT company.city
    FROM company,
         product,
         (SELECT *
          FROM purchase
          WHERE purchase.buyer = 'Joe Blow') Purchase_filtered
    WHERE company.name = product.maker
      AND product.pname = Purchase_filtered.product
Very useful in more complex queries; see Aggregation later.
Also called: "inline view"

Views

Views
In some cases, we want to have the results of queries available as tables, without having to think about the query again.
In some cases, it is not desirable for all users to see the entire logical model (that is, all the actual relations stored in the database).
Consider a person who needs to know an instructor's name and department, but not the salary. This person should see a relation described, in SQL, by
    SELECT ID, name, dept_name FROM instructor
A view provides a mechanism for both of these needs.
View: a "virtual relation", defined by a query.

Example View
A view of instructors without their salary:
    CREATE VIEW faculty AS
    SELECT ID, name, dept_name
    FROM instructor
Find all instructors in the Biology department:
    SELECT name
    FROM faculty
    WHERE dept_name = 'Biology'

View Definition
A view is defined using the create view statement, which has the form
    create view v as < query expression >
where <query expression> is any legal SQL expression. The view name is represented by v.
Once a view is defined, the view name can be used to refer to the virtual relation that the view generates.
A view definition is not the same as creating a new relation by evaluating the query expression. Rather, a view definition causes the saving of an expression; the expression is substituted into queries using the view.
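That a view stores an expression rather than a copy of the data can be checked directly: rows added to the base table after the view was created show up in the view. A minimal sketch in SQLite, with made-up instructor rows (an assumption):

```python
import sqlite3

# Made-up sample data (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary REAL);
    INSERT INTO instructor VALUES
        ('1', 'Crick',    'Biology', 72000),
        ('2', 'Einstein', 'Physics', 95000);

    -- The view hides the salary column.
    CREATE VIEW faculty AS
    SELECT ID, name, dept_name
    FROM instructor;
""")

BIOLOGY = "SELECT name FROM faculty WHERE dept_name = 'Biology' ORDER BY name"

view_columns = [d[0] for d in conn.execute("SELECT * FROM faculty").description]
before = conn.execute(BIOLOGY).fetchall()

# Insert into the base table AFTER the view was defined.
conn.execute("INSERT INTO instructor VALUES ('3', 'Watson', 'Biology', 70000)")
after = conn.execute(BIOLOGY).fetchall()

print(view_columns, before, after)
```

The second result includes the new instructor, because the view's query is substituted and evaluated each time the view is used.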

Aggregation

Aggregation
    SELECT avg(price)
    FROM Product
    WHERE maker = 'Toyota'

    SELECT count(*)
    FROM Product
    WHERE year > 1995
SQL supports several aggregation operations: sum, count, min, max, avg.
Except count, all aggregations apply to a single attribute.

Aggregation: Count
COUNT counts duplicates unless DISTINCT is specified. Count(category) is the same as Count(*), except that rows where category is NULL are not counted.
    SELECT Count(category)
    FROM Product
    WHERE year > 1995
We probably want:
    SELECT Count(DISTINCT category)
    FROM Product
    WHERE year > 1995
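The three flavours of COUNT can be compared side by side. The sketch below uses SQLite and made-up product rows (an assumption), including one product with a NULL category.

```python
import sqlite3

# Made-up sample data (illustrative assumption); product 'd' has no category.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT PRIMARY KEY, category TEXT, year INTEGER);
    INSERT INTO Product VALUES
        ('a', 'toy',   1996),
        ('b', 'toy',   1997),
        ('c', 'photo', 1998),
        ('d', NULL,    1999),
        ('e', 'toy',   1990);
""")

count_star, count_cat, count_distinct = conn.execute("""
    SELECT Count(*),                 -- all rows after the WHERE filter
           Count(category),          -- rows with a non-NULL category, duplicates included
           Count(DISTINCT category)  -- number of different non-NULL categories
    FROM Product
    WHERE year > 1995
""").fetchone()

print(count_star, count_cat, count_distinct)
```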

More Examples
Purchase(product, date, price, quantity)
What do they mean?
    SELECT Sum(price * quantity)
    FROM Purchase

    SELECT Sum(price * quantity)
    FROM Purchase
    WHERE product = 'bagel'

Simple Aggregations
Purchase
    Product   Date    Price   Quantity
    Bagel     10/21   1       20
    Bagel     10/25   1.50    20
    Banana    10/3    0.5     10
    Banana    10/10   ?       ?

    SELECT Sum(price * quantity)
    FROM Purchase
    WHERE product = 'bagel'
Result: 50 (= 20 + 30)

Grouping and Aggregation Purchase(product, date, price, quantity) Find total sales per product. SELECT product, Sum(price*quantity) AS TotalSales FROM Purchase GROUP BY product Let’s see what this means…

SELECT – FROM – GROUP BY
Purchase
    Product   Date    Price   Quantity
    Bagel     10/21   1       20
    Bagel     10/25   1.50    20
    Banana    10/3    0.5     10
    Banana    10/10   ?       ?

    SELECT product, Sum(price*quantity) AS TotalSales
    FROM Purchase
    GROUP BY product
Result
    Product   TotalSales
    Bagel     50
    Banana    15

WHERE
Find total sales per product. Consider only sales after 10/1/2005.
    SELECT product, Sum(price*quantity) AS TotalSales
    FROM Purchase
    WHERE date > '10/1/2005'
    GROUP BY product

HAVING Clause Same query, except that we consider only products the total quantity of which is more than 30. SELECT product, Sum(price * quantity) FROM Purchase GROUP BY product HAVING Sum(quantity) > 30 HAVING clause contains conditions on aggregates. Filters groups

General form of Grouping and Aggregation
    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2
    ORDER BY O
Evaluation steps:
1. Evaluate FROM-WHERE, apply condition C1
2. Group by the attributes a1,…,ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S
5. Sort the result according to O
6. Return the result

General form of Grouping and Aggregation
    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2
    ORDER BY O
S = may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES
C1 = is any condition on the attributes in R1,…,Rn
C2 = is any condition on aggregate expressions
O = may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES
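The interplay of WHERE (filters rows before grouping) and HAVING (filters groups after grouping) can be tried out on a small Purchase table. The sketch below uses SQLite; the rows, and in particular the ISO-formatted dates and the second Banana row, are assumptions made for illustration so that text comparison on dates works.

```python
import sqlite3

# Illustrative rows (assumptions); dates are ISO strings so that '>' compares correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (product TEXT, date TEXT, price REAL, quantity INTEGER);
    INSERT INTO Purchase VALUES
        ('Bagel',  '2005-10-21', 1.0, 20),
        ('Bagel',  '2005-10-25', 1.5, 20),
        ('Banana', '2005-10-03', 0.5, 10),
        ('Banana', '2005-10-10', 1.0, 10),
        ('Bagel',  '2005-09-15', 1.0, 99);  -- removed by the WHERE clause
""")

# Steps 1-2 and 4-5: filter rows, group, aggregate, sort.
per_product = conn.execute("""
    SELECT product, Sum(price * quantity) AS TotalSales
    FROM Purchase
    WHERE date > '2005-10-01'
    GROUP BY product
    ORDER BY product
""").fetchall()

# Step 3 added: HAVING keeps only groups whose total quantity exceeds 30.
big_sellers = conn.execute("""
    SELECT product, Sum(price * quantity) AS TotalSales
    FROM Purchase
    WHERE date > '2005-10-01'
    GROUP BY product
    HAVING Sum(quantity) > 30
    ORDER BY product
""").fetchall()

print(per_product, big_sellers)
```

The September row never reaches the grouping step, and the Banana group (total quantity 20) is dropped by HAVING.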

Aggregating with Join
Person
    Id   Name
    1    Peter
    2    Anna
Cat
    Id   User_id   Name
    1    ?         Bella
    2    ?         Tiger
    3    ?         Molly
Dog
    Id   User_id   Name
    1    ?         Max
    2    ?         Jack
    3    ?         Duke

Aggregating with OUTER JOIN
LEFT OUTER JOIN: persons without cats are kept, with cat_count = 0.
    SELECT p.id,
           p.name,
           Count(c.id) AS cat_count
    FROM   jn_person p
           LEFT OUTER JOIN jn_cat c
                        ON p.id = c.person_id
    GROUP  BY p.id,
              p.name;
Chasm trap: with an INNER JOIN, persons without cats drop out of the result.
    SELECT p.id,
           p.name,
           Count(c.id) AS cat_count
    FROM   jn_person p
           INNER JOIN jn_cat c
                   ON p.id = c.person_id
    GROUP  BY p.id,
              p.name;
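The difference between the two joins shows up as soon as one person owns no cat. The sketch below uses SQLite; the assignment of cats to persons is an assumption made for illustration.

```python
import sqlite3

# Illustrative data: all three cats belong to Peter, Anna has none (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jn_person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jn_cat    (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
    INSERT INTO jn_person VALUES (1, 'Peter'), (2, 'Anna');
    INSERT INTO jn_cat    VALUES (1, 1, 'Bella'), (2, 1, 'Tiger'), (3, 1, 'Molly');
""")

COUNT_CATS = """
    SELECT p.id, p.name, Count(c.id) AS cat_count
    FROM   jn_person p
           {join} jn_cat c ON p.id = c.person_id
    GROUP  BY p.id, p.name
    ORDER  BY p.id
"""

# Outer join: Anna is kept; Count(c.id) ignores the NULL and yields 0.
outer_counts = conn.execute(COUNT_CATS.format(join="LEFT OUTER JOIN")).fetchall()
# Inner join: Anna has no matching cat row and disappears.
inner_counts = conn.execute(COUNT_CATS.format(join="INNER JOIN")).fetchall()

print(outer_counts, inner_counts)
```

Note that Count(c.id) is used rather than Count(*): with the outer join, Count(*) would report 1 for Anna, because her single NULL-extended row is still a row.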

Aggregating over three tables
Fan trap: each cat row is paired with each dog row of the same person, so both counts are inflated.
    SELECT p.id,
           p.name,
           COUNT(c.id) AS cat_count,
           COUNT(d.id) AS dog_count
    FROM   jn_person p
           LEFT OUTER JOIN jn_cat c
                        ON p.id = c.person_id
           LEFT OUTER JOIN jn_dog d
                        ON p.id = d.person_id
    GROUP  BY p.id,
              p.name;

Aggregating over three tables
Aggregate the cats in a subquery first, then join the dogs:
    SELECT pc.id,
           pc.name,
           pc.cat_count,
           Count(d.id) AS dog_count
    FROM   (SELECT p.id,
                   p.name,
                   Count(c.id) AS cat_count
            FROM   jn_person p
                   LEFT OUTER JOIN jn_cat c
                                ON p.id = c.person_id
            GROUP  BY p.id,
                      p.name) pc
           LEFT OUTER JOIN jn_dog d
                        ON pc.id = d.person_id
    GROUP  BY pc.id,
              pc.name,
              pc.cat_count;
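The inflated counts of the fan trap, and the effect of aggregating one table in a subquery first, can be checked on a small example. The sketch below uses SQLite; the ownership data is an assumption (Peter has 2 cats and 3 dogs, Anna has 1 cat and no dog).

```python
import sqlite3

# Illustrative data (assumption): Peter owns 2 cats and 3 dogs, Anna 1 cat and 0 dogs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jn_person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jn_cat    (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
    CREATE TABLE jn_dog    (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
    INSERT INTO jn_person VALUES (1, 'Peter'), (2, 'Anna');
    INSERT INTO jn_cat    VALUES (1, 1, 'Bella'), (2, 1, 'Tiger'), (3, 2, 'Molly');
    INSERT INTO jn_dog    VALUES (1, 1, 'Max'), (2, 1, 'Jack'), (3, 1, 'Duke');
""")

# Fan trap: joining both child tables at once yields 2 x 3 = 6 rows for Peter.
naive = conn.execute("""
    SELECT p.id, p.name, COUNT(c.id) AS cat_count, COUNT(d.id) AS dog_count
    FROM   jn_person p
           LEFT OUTER JOIN jn_cat c ON p.id = c.person_id
           LEFT OUTER JOIN jn_dog d ON p.id = d.person_id
    GROUP  BY p.id, p.name
    ORDER  BY p.id
""").fetchall()

# Fix: count the cats in an inline view, then join and count the dogs.
fixed = conn.execute("""
    SELECT pc.id, pc.name, pc.cat_count, COUNT(d.id) AS dog_count
    FROM   (SELECT p.id, p.name, COUNT(c.id) AS cat_count
            FROM   jn_person p
                   LEFT OUTER JOIN jn_cat c ON p.id = c.person_id
            GROUP  BY p.id, p.name) pc
           LEFT OUTER JOIN jn_dog d ON pc.id = d.person_id
    GROUP  BY pc.id, pc.name, pc.cat_count
    ORDER  BY pc.id
""").fetchall()

print(naive, fixed)
```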

SQL for analysis and reporting, window functions

    SELECT fname,
           lname,
           salary,
           Rank() OVER (ORDER BY salary)
    FROM   employee;

    SELECT fname,
           lname,
           salary,
           Rank() OVER (ORDER BY salary),
           Dense_rank() OVER (ORDER BY salary),
           Round(100 * Percent_rank() OVER (ORDER BY salary))
    FROM   employee;
Ranking functions: Row_number(), rank(), dense_rank(), percent_rank(), cume_dist(), ntile()

    SELECT fname,
           lname,
           salary,
           Avg(salary) OVER (ORDER BY salary),
           Sum(salary) OVER (ORDER BY salary)
    FROM   employee;

    SELECT fname,
           lname,
           salary,
           Lag(salary) OVER (ORDER BY salary),
           salary - Lag(salary) OVER (ORDER BY salary)
    FROM   employee;
Offset functions: Lag(), Lead(), First_value(), Last_value()

    SELECT fname,
           lname,
           dno,
           salary,
           Rank() OVER (PARTITION BY dno ORDER BY salary)
    FROM   employee;
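The ranking queries above can be tried out with a few employees. The sketch below uses SQLite (window functions need SQLite 3.25 or newer, which is an assumption about the local installation); the employee rows are made up, with two equal salaries to show how rank() and dense_rank() treat ties.

```python
import sqlite3

# Made-up sample data (illustrative assumption); Bob and Cid earn the same.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (fname TEXT, lname TEXT, dno INTEGER, salary INTEGER);
    INSERT INTO employee VALUES
        ('Ann', 'Adams', 1, 1000),
        ('Bob', 'Brown', 1, 2000),
        ('Cid', 'Clark', 2, 2000),
        ('Dee', 'Dunn',  2, 3000);
""")

# One window over all employees: ties share a rank; rank() then skips, dense_rank() does not.
overall = conn.execute("""
    SELECT fname,
           salary,
           Rank()       OVER (ORDER BY salary),
           Dense_rank() OVER (ORDER BY salary)
    FROM   employee
    ORDER  BY salary, fname
""").fetchall()

# PARTITION BY restarts the ranking inside each department.
per_dept = conn.execute("""
    SELECT fname,
           dno,
           Rank() OVER (PARTITION BY dno ORDER BY salary)
    FROM   employee
    ORDER  BY dno, salary
""").fetchall()

print(overall, per_dept)
```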

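Window Functions Overview
Frame specifications:
    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (default when the window has an ORDER BY)
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING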
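The choice between ROWS and RANGE only matters when the ORDER BY column has ties. In SQLite, as in standard SQL to the best of our knowledge, a window with ORDER BY and no explicit frame uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so tied rows are summed together. A minimal sketch with made-up salaries (an assumption), requiring SQLite 3.25 or newer:

```python
import sqlite3

# Made-up sample data (illustrative assumption); the salary 2000 occurs twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (fname TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Ann', 1000), ('Bob', 2000), ('Cid', 2000), ('Dee', 3000);
""")

running = conn.execute("""
    SELECT salary,
           -- no frame given: default frame
           Sum(salary) OVER (ORDER BY salary),
           -- explicit RANGE frame: peers (equal salaries) enter the sum together
           Sum(salary) OVER (ORDER BY salary
                             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
           -- ROWS frame: strictly row by row
           Sum(salary) OVER (ORDER BY salary
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM   employee
""").fetchall()

# Sort each column in Python, since the order among the two tied rows is not defined.
default_sums = sorted(row[1] for row in running)
range_sums = sorted(row[2] for row in running)
rows_sums = sorted(row[3] for row in running)

print(default_sums, range_sums, rows_sums)
```

With the RANGE frame both 2000 rows see the same running total, while the ROWS frame gives each of them a different one.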