Structured Query Language


Query Language
SQL is a query language, used to examine data in the database.
SQL queries do not change the contents of the database (no side effects).
The result of an SQL query is printed to the screen, not stored in the database.

Basic SQL query structure
    SELECT attributes
    FROM relations
    WHERE condition
For example:
    SELECT sid, sname
    FROM students
    WHERE sid = 1122

Query Components
A query can contain the following clauses: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
Only SELECT and FROM are obligatory.
The order of clauses is always as above.

Very Basic SQL Query
    SELECT [DISTINCT] attributes
    FROM relation
Attributes: the attributes or values that will appear in the query result (for example: id, name).
DISTINCT: optional keyword that removes duplicate rows from the result.
Relation: the relation to perform the query on.
Example:
    SELECT studentID, studentName
    FROM students

    SELECT studentID, studentName
    FROM students
students:
    StudentID  StudentDept  StudentName  StudentAge
    1123       Math         Moshe        25
    2245       Computers    Mickey       26
    55611      Math         Menahem      29
Result:
    StudentID  StudentName
    1123       Moshe
    2245       Mickey
    55611      Menahem

Basic SQL Query
    SELECT [DISTINCT] attributes
    FROM relation
    WHERE condition
condition: a Boolean condition (for example: Eid > 21, or Ename = 'Yuval').
Only tuples for which the condition evaluates to true appear in the result.

    SELECT studentID, studentName
    FROM students
    WHERE StudentDept = 'Math'
students:
    StudentID  StudentDept  StudentName  StudentAge
    1123       Math         Moshe        25
    2245       Computers    Mickey       26
    55611      Math         Menahem      29
Result:
    StudentID  StudentName
    1123       Moshe
    55611      Menahem

SQL and relational algebra
    SELECT DISTINCT A1, ..., An
    FROM R1, ..., Rm
    WHERE C;
is equivalent to the relational algebra expression
    $\pi_{A_1,\dots,A_n}(\sigma_C(R_1 \times \dots \times R_m))$

Basic SQL Query
    SELECT [DISTINCT] attributes
    FROM relations
    WHERE condition;
Important! The evaluation order, conceptually, is:
1. Compute the cross product of the tables in relations.
2. Delete all rows that do not satisfy condition.
3. Delete all columns that do not appear in attributes.
4. If DISTINCT is specified, eliminate duplicate rows.
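The four conceptual steps above can be sketched outside the database. The following is a minimal illustration, assuming Python with its standard sqlite3 module (neither is part of the original slides); it replays the steps by hand on the Sailors and Reserves rows used in the slides' example tables and compares the outcome with what the database returns.

```python
import sqlite3
from itertools import product

# Rows taken from the slides' example tables.
sailors = [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)]
reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

# Step 1: cross product of the tables in the FROM clause.
cross = [s + r for s, r in product(sailors, reserves)]
# Step 2: keep rows satisfying Sailors.sid = Reserves.sid (positions 0 and 4).
kept = [row for row in cross if row[0] == row[4]]
# Step 3: keep only the sname column.
projected = [(row[1],) for row in kept]
# Step 4: DISTINCT removes duplicate rows.
manual = sorted(set(projected))

# The same query, evaluated by an actual SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", sailors)
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)", reserves)
from_sql = sorted(conn.execute(
    "SELECT DISTINCT sname FROM Sailors, Reserves "
    "WHERE Sailors.sid = Reserves.sid"
).fetchall())

print(len(cross), manual, from_sql)
```

A real engine is free to pick a cheaper plan; the conceptual order only defines what the answer must be.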

Example Tables Used
Boats:
    bid  bname   color
    101  Nancy   red
    103  Gloria  green
Sailors:
    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    58   Rusty   10      35.0
Reserves:
    sid  bid  day
    22   101  10/10/96
    58   103  11/12/96

What does this compute?
    SELECT sname
    FROM sailors, reserves
    WHERE sailors.sid = reserves.sid
Answer: the names of all sailors who have reserved a boat.
The computation, on the Sailors and Reserves example tables, is shown on the upcoming slides.

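Stage 1: Sailors x Reserves
    Sailors.sid  sname   rating  age   Reserves.sid  bid  day
    22           Dustin  7       45.0  22            101  10/10/96
    22           Dustin  7       45.0  58            103  11/12/96
    31           Lubber  8       55.5  22            101  10/10/96
    31           Lubber  8       55.5  58            103  11/12/96
    58           Rusty   10      35.0  22            101  10/10/96
    58           Rusty   10      35.0  58            103  11/12/96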

Stage 2: "where sailors.sid=reserves.sid"
Only the rows of the cross product in which both sid values agree are kept:
    Sailors.sid  sname   rating  age   Reserves.sid  bid  day
    22           Dustin  7       45.0  22            101  10/10/96
    58           Rusty   10      35.0  58            103  11/12/96

Stage 3: "select sname"
Only the sname column of the remaining rows is kept. Final answer:
    sname
    Dustin
    Rusty

Example Query
Q: What does this compute?
    SELECT sname, age
    FROM Sailors
    WHERE rating > 7;
A: Name and age of sailors with rating > 7.

Example Query
Q: What does this compute?
    SELECT DISTINCT sname
    FROM Sailors, Reserves
    WHERE Sailors.sid = Reserves.sid and bid = 103;
A: Names of sailors who reserved boat number 103.

WHERE Sailors.sid = Reserves.sid and bid = 103;
Of the six rows of Sailors x Reserves, only one satisfies the condition:
    Sailors.sid  sname  rating  age   Reserves.sid  bid  day
    58           Rusty  10      35.0  58            103  11/12/96

Select sname
    sname
    Rusty

A Few SELECT Options
Select all columns:
    SELECT *
    FROM Sailors;
Rename selected columns:
    SELECT sname AS Sailors_Name
    FROM Sailors;
Applying functions (e.g., mathematical manipulations):
    SELECT (age-5)*2
    FROM Sailors;

The WHERE Clause
Numerical and string comparison: !=, <>, =, <, >, >=, <=, BETWEEN val1 AND val2
Logical components: AND, OR
Null verification: IS NULL, IS NOT NULL
Checking against a list: IN, NOT IN

Examples
    SELECT sname
    FROM Sailors
    WHERE age >= 40 AND rating IS NOT NULL;

    SELECT sid, sname
    FROM sailors
    WHERE sid IN (1223, 2334, 3344) or sname BETWEEN 'George' AND 'Paul';

The LIKE Operator
A pattern matching operator (simple wildcard patterns, not full regular expressions).
Basic format: colname LIKE pattern
    _ matches a single character
    % matches 0 or more characters
Example:
    SELECT sid
    FROM Sailors
    WHERE sname LIKE 'R_%y';
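To see which names the pattern 'R_%y' accepts, here is a small sketch, assuming Python's standard sqlite3 module. The names other than Rusty and Dustin are hypothetical and added only for illustration. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?)",
    # Ry, Roy and Randy are made-up names used to probe the pattern.
    [(22, "Dustin"), (58, "Rusty"), (59, "Ry"), (60, "Roy"), (61, "Randy")],
)

# 'R_%y': an R, exactly one character, any run of characters, then a y.
matches = [row[0] for row in conn.execute(
    "SELECT sname FROM Sailors WHERE sname LIKE 'R_%y' ORDER BY sname"
)]
print(matches)
```

"Ry" is rejected because the underscore needs one character between the R and the final y, so a match has at least three characters.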

Relation naming
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid and R.bid = 103;
Naming relations is good style.
It is necessary if the same relation appears twice in the FROM clause.

Example Query
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid and R.bid != 103;
Q: Does this return the names of sailors who did not reserve boat 103?
A: No! It returns the names of sailors who reserved a boat other than boat 103. Explanation in the next slides.

    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid and R.bid != 103;
Sailors:
    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
Reserves:
    sid  bid  day
    22   101  10/10/07
    22   103  11/12/07
    31   104  12/2/07

Sailors x Reserves has six rows. The rows that satisfy S.sid = R.sid and R.bid != 103 are:
    Sailors.sid  sname   rating  age   Reserves.sid  bid  day
    22           Dustin  7       45.0  22            101  10/10/07
    31           Lubber  8       55.5  31            104  12/2/07
Result:
    sname
    Dustin
    Lubber
But Dustin did reserve boat 103!
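The pitfall can be reproduced directly. The sketch below assumes Python's standard sqlite3 module and uses the two-sailor data from the slides. It contrasts the join with R.bid != 103 against one possible way to ask for sailors with no reservation of boat 103, a NOT IN subquery (a formulation the deck introduces later under nested queries).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5)])
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)",
                 [(22, 101, "10/10/07"), (22, 103, "11/12/07"), (31, 104, "12/2/07")])

# Sailors with SOME reservation of a boat other than 103.
other_boat = [r[0] for r in conn.execute(
    "SELECT S.sname FROM Sailors S, Reserves R "
    "WHERE S.sid = R.sid AND R.bid != 103 ORDER BY S.sname"
)]

# Sailors with NO reservation of boat 103.
never_103 = [r[0] for r in conn.execute(
    "SELECT S.sname FROM Sailors S "
    "WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103) "
    "ORDER BY S.sname"
)]
print(other_boat, never_103)
```

The join tests each reservation row separately, so Dustin's reservation of boat 101 is enough to put him in the first result even though he also reserved boat 103.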

SQL query
    SELECT S.sid
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid;
When would adding DISTINCT give a different result?
When there is a sailor who reserved more than a single boat.

Are any of these the same?
    SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid = R.sid;
    SELECT DISTINCT R.sid FROM Sailors S, Reserves R WHERE S.sid = R.sid;
    SELECT R.sid FROM Reserves R
Schemas: Sailors(sid, sname, rating, age), Reserves(sid, bid, day).
No. The first returns a sailor's sid as many times as that sailor reserved a boat. The second returns the sids of sailors who reserved boats, without repetition. The third returns all sids in Reserves. The first and the third are the same if every sid in Reserves appears in Sailors (i.e., if there is a foreign key constraint).

Example Query
How would you find sailors who have reserved more than one boat?
    SELECT S.sname
    FROM Sailors S, Reserves R1, Reserves R2
    WHERE S.sid = R1.sid and R1.sid = R2.sid and R1.bid != R2.bid;

SQL query
Q: What does this return?
    SELECT S.sname
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid and R.bid = B.bid and B.color = 'red'
A: Names of sailors who reserved red boats.

SQL query
Q: How would you find the colors of boats reserved by Bob?
    SELECT DISTINCT B.color
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sname = 'Bob' and S.sid = R.sid and R.bid = B.bid

Order Of the Result
The ORDER BY clause can be used to sort results by one or more columns.
The default sorting, when ORDER BY is used, is in ascending order.
ASC or DESC can be specified for each column.
    SELECT sname, rating, age
    FROM Sailors S
    WHERE age > 50
    ORDER BY rating ASC, age DESC

What does this return?
    SELECT DISTINCT S.sname
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid and R.bid = B.bid and (B.color = 'red' or B.color = 'green')
It returns sailors who reserved red or green boats.
What would happen if we replaced or by and? We would get sailors who reserved a boat that is both red and green. Since no boat is both red and green (each boat has a single color), the result will be empty. We would get no results!
Then how can we find sailors who have reserved both a green and a red boat?

Sailors who've reserved red and green boats
    SELECT S.sname
    FROM Sailors S, Reserves R1, Reserves R2, Boats B1, Boats B2
    WHERE S.sid = R1.sid and R1.bid = B1.bid and B1.color = 'red'
      and S.sid = R2.sid and R2.bid = B2.bid and B2.color = 'green';

Other Relational Algebra Operators So far, we have seen selection, projection and Cartesian product How do we do operators UNION and MINUS?

Three SET Operators
    [Query] UNION [Query]
    [Query] EXCEPT [Query]
    [Query] INTERSECT [Query]
Note: The operators remove duplicates by default!

Sailors who've reserved a red or a green boat
    SELECT S.sname
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid and R.bid = B.bid and B.color = 'red'
    UNION
    SELECT S.sname
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid and R.bid = B.bid and B.color = 'green';
Would INTERSECT give us sailors who reserved both red and green boats?
Almost, but not quite, because sname is not unique: two different sailors with the same name, one who reserved a red boat and one who reserved a green boat, would put that name in the result.
Be careful when intersecting queries that do not return a key!
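The warning about intersecting on a non-key column can be checked with a small experiment. This sketch assumes Python's standard sqlite3 module; the two sailors named Horatio and all reservations below are hypothetical data chosen to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT)")
conn.execute("CREATE TABLE Boats (bid INTEGER, color TEXT)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER)")
# Hypothetical data: two different sailors share the name Horatio.
conn.executemany("INSERT INTO Sailors VALUES (?, ?)",
                 [(22, "Dustin"), (64, "Horatio"), (74, "Horatio")])
conn.executemany("INSERT INTO Boats VALUES (?, ?)", [(101, "red"), (103, "green")])
conn.executemany("INSERT INTO Reserves VALUES (?, ?)",
                 [(22, 101), (22, 103),   # Dustin: red and green
                  (64, 101),              # first Horatio: red only
                  (74, 103)])             # second Horatio: green only

RED_OP_GREEN = """
SELECT S.{col} FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
{op}
SELECT S.{col} FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'green'
ORDER BY 1
"""

def run(col, op):
    """Run the red/green query with the given column and set operator."""
    return [r[0] for r in conn.execute(RED_OP_GREEN.format(col=col, op=op))]

union_names = run("sname", "UNION")          # duplicates removed
intersect_names = run("sname", "INTERSECT")  # wrongly includes Horatio
intersect_sids = run("sid", "INTERSECT")     # intersecting on the key is safe
print(union_names, intersect_names, intersect_sids)
```

Intersecting on sid and then looking up the names is one way to avoid the false match.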

Multiset (Bag) Operators
Union without removing duplicates: UNION ALL
    SELECT DISTINCT sname
    FROM Sailors S
    UNION ALL
    SELECT DISTINCT sname
    FROM Sailors S

Nested Queries

Nested Queries
A query is nested if one of its clauses contains a query.
Queries can be nested in the following clauses: SELECT, FROM, WHERE, HAVING.

Nested Queries (cont)
A sub-query of a nested query is correlated if it refers to relations appearing in the outer portion of the query.
We start by discussing subqueries in the WHERE clause.
Common operators used with subqueries are: IN, ANY/ALL, EXISTS.

Remember! The WHERE clause is evaluated for each tuple in the Cartesian Product formed by the FROM clause, and a Boolean answer is returned The subquery is used to define the Boolean answer!

Nested queries in WHERE
Subqueries with multiple results:
    SELECT S.sname
    FROM Sailors S
    WHERE S.sid IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid = 103);
This returns the names of sailors who reserved boat 103.
What would happen if we wrote NOT IN? We would return names of sailors who did not reserve boat 103.

In/Not In Format
The format of these subqueries is always:
    [attribute or value] IN / NOT IN [subquery that returns a single column]
Returns true if the attribute value is in / not in the result of the subquery.

What does this produce?
    SELECT S.sname
    FROM Sailors S
    WHERE S.sid NOT IN (SELECT R.sid
                        FROM Reserves R
                        WHERE R.bid IN (SELECT B.bid
                                        FROM Boats B
                                        WHERE B.color = 'red'))
Names of sailors who did not reserve a red boat.

Any/All Format
The format of these subqueries is always:
    [attribute or value] [arithmetic comparison operator] ANY / ALL [subquery that returns a single column]
Returns true if the attribute value satisfies the comparison with respect to any/all of the subquery results.

Set-Comparison Queries
    SELECT *
    FROM Sailors S1
    WHERE S1.age > ANY (SELECT S2.age
                        FROM Sailors S2);
We can also use op ALL (op is >, <, =, >=, <=, or <>).

Exists/Not Exists Format
The format of these subqueries is always:
    EXISTS / NOT EXISTS [subquery that returns any number of columns]
Returns true if the subquery returns a non-empty (resp. empty) result.

Correlated Nested Queries
    SELECT S.sid
    FROM Sailors S
    WHERE EXISTS (SELECT *
                  FROM Reserves R
                  WHERE R.bid = 103 and S.sid = R.sid);
S is not defined in the subquery; it refers to the outer loop.
Result: sid of sailors who reserved boat 103.
Q: What if we wrote NOT EXISTS?
A: We would get sid of sailors who did not reserve boat 103.

Exists and Not Exists Differ from In and Not In
Exists: for every tuple in the outer loop, the inner loop is tested. If the inner loop produces a result, the outer tuple is added to the result.
Not Exists: the outer tuple is added to the result if the inner loop produces no result.
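The "outer loop / inner loop" reading of EXISTS can be tried on the slides' example tables. This is a minimal sketch, assuming Python's standard sqlite3 module; the helper name sids_where is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)])
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)",
                 [(22, 101, "10/10/96"), (58, 103, "11/12/96")])

def sids_where(test):
    """Hypothetical helper: run the correlated query with EXISTS or NOT EXISTS."""
    query = (
        "SELECT S.sid FROM Sailors S "
        f"WHERE {test} (SELECT * FROM Reserves R "
        "WHERE R.bid = 103 AND S.sid = R.sid) "
        "ORDER BY S.sid"
    )
    return [row[0] for row in conn.execute(query)]

reserved_103 = sids_where("EXISTS")
did_not_reserve_103 = sids_where("NOT EXISTS")
print(reserved_103, did_not_reserve_103)
```

Every sailor lands in exactly one of the two lists, because the inner query for a given sailor is either empty or not.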

How would you find the names of sailors who have reserved a red boat but not a green boat?
    SELECT SS.sname
    FROM sailors SS
    WHERE SS.sid IN (
        SELECT R1.sid
        FROM Reserves R1, Boats B1
        WHERE R1.bid = B1.bid and B1.color = 'red'
        EXCEPT
        SELECT R2.sid
        FROM Reserves R2, Boats B2
        WHERE R2.bid = B2.bid and B2.color = 'green');

Rewrite using not in
    SELECT SS.sname
    FROM sailors SS
    WHERE SS.sid IN (
        SELECT R1.sid
        FROM Reserves R1, Boats B1
        WHERE R1.bid = B1.bid and B1.color = 'red'
          and R1.sid NOT IN (
            SELECT R2.sid
            FROM Reserves R2, Boats B2
            WHERE R2.bid = B2.bid and B2.color = 'green'));

How would you find the sailors who have reserved all boats?

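Remember: Algebraic Operator of Division
Consider A(X, Y) and B(Y). Then
    $A / B = \{ \langle x \rangle \in \pi_X(A) \mid \forall \langle y \rangle \in B : \langle x, y \rangle \in A \}$
In general, we require that the set of fields in B be contained in those of A.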

Reminder: Suppliers from A who supply All Parts from B
A(sno, pno) records which of the suppliers S1, S2, S3, S4 supplies which of the parts P1, P2, P3, P4.
    (1) B1(pno) = {P2}:          A / B1 = {S1, S2, S3, S4}
    (2) B2(pno) = {P2, P4}:      A / B2 = {S1, S4}
    (3) B3(pno) = {P1, P2, P4}:  A / B3 = {S1}

Sailors who Reserved all Boats
So what is the result of Reserves(sid, bid) / Boats(bid)?
The sailors who reserved all boats: each sailor S whose "set of boats reserved" contains the "set of all boats".

What is the strategy for finding sailors who have reserved all boats? The sailors for which there does not exist a boat which they have not reserved

Sailors who reserved all boats (Division 1)
Sailors for which there does not exist a boat that they did not reserve:
    SELECT S.sid
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE B.bid NOT IN (SELECT R.bid
                                          FROM Reserves R
                                          WHERE R.sid = S.sid));

Sailors who reserved all boats (Division 2)
Sailors for which there does not exist a boat that they did not reserve:
    SELECT S.sid
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid
                                        FROM Reserves R
                                        WHERE R.bid = B.bid and R.sid = S.sid))

Sailors who reserved all boats (Division 3)
Sailors for which there does not exist a boat for which there is no reservation in Reserves:
    SELECT S.sid
    FROM Sailors S
    WHERE NOT EXISTS ((SELECT B.bid
                       FROM Boats B)
                      EXCEPT
                      (SELECT R.bid
                       FROM Reserves R
                       WHERE R.sid = S.sid));
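The three division formulations can be compared side by side. The sketch below assumes Python's standard sqlite3 module. The reservation of boat 103 by sailor 22 is a hypothetical addition to the example data so that one sailor has reserved every boat, and the EXCEPT variant is written without parentheses around its operands because SQLite does not accept them there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT)")
conn.execute("CREATE TABLE Boats (bid INTEGER, color TEXT)")
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?)",
                 [(22, "Dustin"), (31, "Lubber"), (58, "Rusty")])
conn.executemany("INSERT INTO Boats VALUES (?, ?)", [(101, "red"), (103, "green")])
# (22, 103) is hypothetical: it makes sailor 22 the only one with all boats.
conn.executemany("INSERT INTO Reserves VALUES (?, ?)", [(22, 101), (22, 103), (58, 103)])

division_1 = """
SELECT S.sid FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                  WHERE B.bid NOT IN (SELECT R.bid FROM Reserves R
                                      WHERE R.sid = S.sid))
ORDER BY S.sid"""

division_2 = """
SELECT S.sid FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid FROM Reserves R
                                    WHERE R.bid = B.bid AND R.sid = S.sid))
ORDER BY S.sid"""

division_3 = """
SELECT S.sid FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                  EXCEPT
                  SELECT R.bid FROM Reserves R WHERE R.sid = S.sid)
ORDER BY S.sid"""

results = [[row[0] for row in conn.execute(q)]
           for q in (division_1, division_2, division_3)]
print(results)
```

All three ask the same question: is the set of boats the sailor has not reserved empty?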

Aggregation

Aggregate Operators
The aggregate operators available in SQL are:
    COUNT(*)
    COUNT([DISTINCT] A)
    SUM([DISTINCT] A)
    AVG([DISTINCT] A)
    MAX(A)
    MIN(A)

Some Examples
    SELECT COUNT(*) FROM Sailors S
    SELECT COUNT(sid) FROM Sailors S
    SELECT AVG(S.age) FROM Sailors S WHERE S.rating = 10
    SELECT COUNT(DISTINCT color) FROM Boats

Find Average Age for each Rating So far, aggregation has been applied to all tuples that passed the WHERE clause test. How can we apply aggregation to groups of tuples?

Find Average Age for each Rating SELECT AVG(age) FROM Sailors GROUP BY rating;

Basic SQL Query
    SELECT [DISTINCT] attributes
    FROM relation-list
    WHERE condition
    GROUP BY grouping-attributes
    HAVING group-condition;
attributes: must appear in grouping-attributes or in aggregation operators.
group-condition: constrains groups. Can only constrain attributes appearing in grouping-attributes or in aggregation operators.

Evaluation: important!
    SELECT [DISTINCT] attributes
    FROM relation-list
    WHERE condition
    GROUP BY grouping-attributes
    HAVING group-condition;
1. Compute the cross product of relation-list.
2. Tuples failing condition are thrown away.
3. Tuples are partitioned into groups by values of grouping-attributes.
4. The group-condition is applied to eliminate groups.
5. One answer is generated for each group!

    SELECT AVG(age)
    FROM Sailors
    GROUP BY rating;
Sailors:
    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    58   Rusty   10      35.0
    63   Fluffy  7       44.0
    78   Morley  7       31.0
    84   Popeye  10      33.0
Sailors, partitioned by rating, with the average age of each group:
    sid  sname   rating  age   AVG(age)
    22   Dustin  7       45.0
    63   Fluffy  7       44.0
    78   Morley  7       31.0  40
    31   Lubber  8       55.5  55.5
    58   Rusty   10      35.0
    84   Popeye  10      33.0  34

    SELECT AVG(age)
    FROM Sailors
    WHERE age < 50
    GROUP BY rating
    HAVING count(*) > 2;
Step 1 (WHERE age < 50): Lubber, aged 55.5, is removed.
    sid  sname   rating  age
    22   Dustin  7       45.0
    58   Rusty   10      35.0
    63   Fluffy  7       44.0
    78   Morley  7       31.0
    84   Popeye  10      33.0
Step 2 (GROUP BY rating): the remaining tuples are partitioned by rating.
    sid  sname   rating  age
    22   Dustin  7       45.0
    63   Fluffy  7       44.0
    78   Morley  7       31.0
    58   Rusty   10      35.0
    84   Popeye  10      33.0
Step 3 (HAVING count(*) > 2): only the group with rating 7 has more than 2 tuples.
    sid  sname   rating  age
    22   Dustin  7       45.0
    63   Fluffy  7       44.0
    78   Morley  7       31.0
Step 4 (SELECT AVG(age)): one answer is generated for the remaining group.
Final answer: 40
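The numbers in this walkthrough can be reproduced with a short script. This is a sketch assuming Python's standard sqlite3 module; rating is added to the SELECT list (it is not in the slides' query) only so that each average can be matched to its group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0),
    (63, "Fluffy", 7, 44.0), (78, "Morley", 7, 31.0), (84, "Popeye", 10, 33.0),
])

# Average age for each rating: one output row per group.
avg_by_rating = conn.execute(
    "SELECT rating, AVG(age) FROM Sailors GROUP BY rating ORDER BY rating"
).fetchall()

# WHERE filters tuples before grouping; HAVING filters whole groups afterwards.
big_groups = conn.execute(
    "SELECT rating, AVG(age) FROM Sailors WHERE age < 50 "
    "GROUP BY rating HAVING COUNT(*) > 2 ORDER BY rating"
).fetchall()
print(avg_by_rating, big_groups)
```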

Find name and age of oldest Sailor..?
    SELECT S.sname, MAX(S.age)
    FROM Sailors S
Wrong! Why? This one is syntactically incorrect: sname cannot appear in the SELECT clause (outside of an aggregation operator) since it does not appear in a GROUP BY.
    SELECT S.sname, MAX(S.age)
    FROM Sailors S
    GROUP BY S.sname
Wrong! Why? This one returns the maximum age for each sname (since tuples are grouped by sname).

Find name and age of oldest Sailor
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age)
                   FROM Sailors S2)
Right!! How else can this be done? HINT: Use ALL
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age >= ALL (SELECT S2.age
                        FROM Sailors S2)

What does this return?
    SELECT B.bid, COUNT(*)
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid and B.color = 'red'
    GROUP BY B.bid
For each red boat that has been reserved, its bid and its number of reservations.
What would happen if we put the condition about the color in the HAVING clause? This would not be legal, since color does not appear in the GROUP BY clause.

Putting the condition about the color in the HAVING clause, with color added to the GROUP BY:
    SELECT B.bid, COUNT(*)
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid
    GROUP BY B.bid, B.color
    HAVING B.color = 'red'
This is legal, and is equivalent to the earlier version, since grouping by bid, or by bid and color, creates the same groups (bid determines color).

What does this return?
    SELECT bname
    FROM Boats B, Reserves R
    WHERE R.bid = B.bid
    GROUP BY B.bid, bname
    HAVING count(DISTINCT day) <= 5
Names of boats that were not reserved on more than 5 days. (Boats with no reservations at all do not appear, since they contribute no rows to the join.)
Can we move the condition in the HAVING to the WHERE? No! Aggregate functions are not allowed in WHERE.

The Color for which there are the most boats..?
    SELECT color
    FROM Boats B
    GROUP BY color
    HAVING max(count(bid))
What is wrong with this? How would you fix it?
Technically, we cannot apply an aggregate function to the result of an aggregate function. This also would not make sense, since max returns a number, and HAVING is supposed to return a Boolean value. Fixed version in the next slide.

The Color for which there are the most boats
    SELECT color
    FROM Boats B
    GROUP BY color
    HAVING count(bid) >= ALL (SELECT count(bid)
                              FROM Boats
                              GROUP BY color)

Aggregation Instead of Exists
Aggregation can take the place of exists. What does this return?
    SELECT color
    FROM Boats B1
    WHERE NOT EXISTS (SELECT *
                      FROM Boats B2
                      WHERE B1.bid <> B2.bid AND B1.color = B2.color)
The color of boats which have a unique color (no other boats with the same color).

Aggregation Instead of Exists
    SELECT color
    FROM Boats B1
    GROUP BY color
    HAVING count(bid) = 1
Somewhat simpler...
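That the NOT EXISTS query and the GROUP BY query agree can be checked on a small table. The sketch assumes Python's standard sqlite3 module; boat 102 ("Clipper", red) is a hypothetical row added so that red is no longer a unique color.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Boats (bid INTEGER, bname TEXT, color TEXT)")
conn.executemany("INSERT INTO Boats VALUES (?, ?, ?)", [
    (101, "Nancy", "red"),
    (102, "Clipper", "red"),   # hypothetical second red boat
    (103, "Gloria", "green"),
])

with_not_exists = conn.execute(
    "SELECT color FROM Boats B1 "
    "WHERE NOT EXISTS (SELECT * FROM Boats B2 "
    "                  WHERE B1.bid <> B2.bid AND B1.color = B2.color) "
    "ORDER BY color"
).fetchall()

with_group_by = conn.execute(
    "SELECT color FROM Boats B1 "
    "GROUP BY color HAVING COUNT(bid) = 1 "
    "ORDER BY color"
).fetchall()
print(with_not_exists, with_group_by)
```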

Subqueries in the FROM and in the SELECT clauses

A Complex Query
We would like to create a table containing 3 columns:
    Sailor id
    Sailor age
    Age of the oldest sailor (same value in all rows)
How can this be done?

What We Want:
Sailors:
    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    58   Rusty   10      35.0
Result:
    sid  age   Max-age
    22   45.0  55.5
    31   55.5  55.5
    58   35.0  55.5

Attempt 1
    SELECT S.sid, S.age, MAX(S.age)
    FROM Sailors S;
Why is this wrong? It is syntactically incorrect: sid and age appear in the SELECT clause outside an aggregation operator, but do not appear in a GROUP BY.

Attempt 2
    SELECT S.sid, S.age, MAX(S.age)
    FROM Sailors S
    GROUP BY S.sid, S.age;
Why is this wrong? Each (sid, age) pair forms its own group, so the query returns the same values in the last 2 columns.

Solution 1: Subquery in FROM
    SELECT S.sid, S.age, M.mx
    FROM Sailors S, (SELECT MAX(S2.age) AS mx
                     FROM Sailors S2) M;
We can put a query in the FROM clause instead of a table.
The query in the FROM clause must be renamed with a range variable (M in this case).

Solution 2: Subquery in SELECT
    SELECT S.sid, S.age, (SELECT MAX(S2.age)
                          FROM Sailors S2)
    FROM Sailors S;
A query in the SELECT clause must return at most one value for each row returned by the outer query.

Another Example of a Sub-query in SELECT
    SELECT S.sid, S.age, (SELECT MAX(S2.age)
                          FROM Sailors S2
                          WHERE S2.age < S.age)
    FROM Sailors S;
What does this query return? For each sailor, his age, and the age of the oldest sailor younger than him.
Note the use of S (defined in the outer query) within the inner query.

Result:
Sailors:
    sid  sname   rating  age
    22   Dustin  7       45.0
    31   Lubber  8       55.5
    58   Rusty   10      35.0
Result:
    sid  age   (Select...)
    22   45.0  35.0
    31   55.5  45.0
    58   35.0  null
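The two solutions and the correlated variant can be run against the three example sailors. This is a sketch assuming Python's standard sqlite3 module, in which SQL NULL comes back as Python's None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)])

# Solution 1: the subquery is a one-row table M in the FROM clause.
in_from = conn.execute(
    "SELECT S.sid, S.age, M.mx "
    "FROM Sailors S, (SELECT MAX(S2.age) AS mx FROM Sailors S2) M "
    "ORDER BY S.sid"
).fetchall()

# Solution 2: the subquery yields one value per outer row in the SELECT clause.
in_select = conn.execute(
    "SELECT S.sid, S.age, (SELECT MAX(S2.age) FROM Sailors S2) "
    "FROM Sailors S ORDER BY S.sid"
).fetchall()

# Correlated variant: the inner query refers to S from the outer query.
correlated = conn.execute(
    "SELECT S.sid, S.age, "
    "       (SELECT MAX(S2.age) FROM Sailors S2 WHERE S2.age < S.age) "
    "FROM Sailors S ORDER BY S.sid"
).fetchall()
print(in_from, in_select, correlated)
```

The youngest sailor gets NULL in the correlated variant because MAX over an empty set of rows is NULL.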

Another Example of a Sub-query in FROM??
    SELECT S.sid, S.age, M.mx
    FROM Sailors S, (SELECT MAX(S2.age) AS mx
                     FROM Sailors S2
                     WHERE S2.age < S.age) M;
Why is this wrong? We cannot correlate within the FROM clause. Query evaluation is not iterating over tuples at this point, so the reference to S does not make sense.

Translating RA to SQL

RA is strictly less expressive than SQL
Every query in relational algebra can be equivalently written in SQL.
There are SQL queries that cannot be expressed using relational algebra. Examples?
We now present a procedure for translating RA to SQL.
Note: This is not the most efficient translation.

Assumptions To make the presentation simpler, assume we are translating a relational algebra expression E into SQL where: E does not use the same relation twice No two relations have any attributes with the same names Easy to overcome these assumptions using RA renaming and SQL aliasing.

Translation By Induction on Structure of E Induction on the number of relational algebra operators appearing in E. Base case: 0 operators. E is simply a relation R SELECT DISTINCT * FROM R

Translation By Induction on Structure of E
Induction Step: Assume that for all E with fewer than k operators, there is an SQL expression equivalent to E. We show this for k.
We must consider several cases, depending on the "last operator" used in E: $\sigma$, $\pi$, $\cup$, $\times$, $-$

Last Operator is $\sigma$
E = $\sigma_C(E_1)$, where C is a Boolean condition.
Let S be an SQL expression equivalent to E1 (there is one by the induction hypothesis).
E is equivalent to
    SELECT DISTINCT * FROM S WHERE C
Sub-query in the FROM clause!

Last Operator is $\pi$
E = $\pi_{A_1,\dots,A_k}(E_1)$, where A1, ..., Ak are attributes.
Let S be an SQL expression equivalent to E1 (there is one by the induction hypothesis).
E is equivalent to
    SELECT DISTINCT A1, ..., Ak FROM S
Sub-query in the FROM clause!

Last Operator is $\cup$
E = $E_1 \cup E_2$
Let S1, S2 be SQL expressions equivalent to E1 and E2.
E is equivalent to
    S1 UNION S2

Last Operator is $\times$
E = $E_1 \times E_2$
Let S1, S2 be SQL expressions equivalent to E1 and E2.
E is equivalent to
    SELECT * FROM S1, S2
Sub-queries in the FROM clause!

Last Operator is $-$
E = $E_1 - E_2$
Let S1, S2 be SQL expressions equivalent to E1 and E2.
E is equivalent to
    S1 EXCEPT S2

Example
Translate $\pi_{sname,color}(\sigma_{rating<10}(Sailors \times Boats))$
    SELECT DISTINCT sname, color
    FROM (SELECT *
          FROM (SELECT *
                FROM (SELECT * FROM Sailors) T1,
                     (SELECT * FROM Boats) T2) T3
          WHERE rating < 10) T4

Null Values

What does NULL Mean?
There are different interpretations of a NULL value:
Value Unknown: I know that there is a value that belongs here, but I don't know what it is. Example: Birthday attribute.
Value Inapplicable: There is no value that makes sense here. Example: Spouse attribute for an unmarried person.

What does NULL Mean? (cont)
Value Withheld: We are not entitled to know this value. Example: phone number attribute.

Null Values in Expressions
Two important rules:
1. When we operate on NULL and any other value (including another NULL), using an arithmetic operator like * or +, the result is always NULL.
2. When we compare NULL and any other value (including another NULL), using a comparison operator like = or >, the result is UNKNOWN.
The correct way to determine if an attribute x has value NULL is using x IS NULL or x IS NOT NULL, which will return true or false.

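3-Valued Logic: True, False, Unknown
    A        B        A and B  A or B
    True     True     True     True
    True     False    False    True
    True     Unknown  Unknown  True
    False    True     False    True
    False    False    False    False
    False    Unknown  False    Unknown
    Unknown  True     Unknown  True
    Unknown  False    False    Unknown
    Unknown  Unknown  Unknown  Unknown

    A        Not A
    True     False
    False    True
    Unknown  Unknown
Only tuples for which the WHERE clause has value TRUE are used to create tuples in the result.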

What does this return?
    SELECT *
    FROM Sailors
    WHERE sname = sname
Returns all rows where sname is not null.

What does this return?
    SELECT *
    FROM Sailors
    WHERE rating > 5 or rating <= 5
Returns all rows where rating is not null.

What do these return?
    1) SELECT sname, rating * 0 FROM Sailors
    2) SELECT sname, rating - rating FROM Sailors
1) Returns (name, 0) when rating is not null, and (name, null) when rating is null.
2) Same.
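These three answers can be confirmed on a table that contains nulls. The sketch assumes Python's standard sqlite3 module; the three rows are hypothetical, and None on the Python side stands for SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER)")
# Hypothetical rows: sailor 2 has no name, sailor 3 has no rating.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?)",
                 [(1, "Ann", 7), (2, None, 3), (3, "Cy", None)])

# NULL = NULL is UNKNOWN, so the nameless row is filtered out.
same_name = [r[0] for r in conn.execute(
    "SELECT sid FROM Sailors WHERE sname = sname ORDER BY sid")]

# Both comparisons are UNKNOWN for a NULL rating, and UNKNOWN OR UNKNOWN is UNKNOWN.
any_rating = [r[0] for r in conn.execute(
    "SELECT sid FROM Sailors WHERE rating > 5 OR rating <= 5 ORDER BY sid")]

# Arithmetic on NULL gives NULL, even when the result "should" be 0.
arithmetic = conn.execute(
    "SELECT sid, rating * 0, rating - rating FROM Sailors ORDER BY sid").fetchall()
print(same_name, any_rating, arithmetic)
```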

Nulls in Aggregation Functions
count(*): counts all rows (even rows that are all null values).
count(A): counts non-null values of A; returns 0 if all values of A are null.
sum(A), avg(A), min(A), max(A): ignore null values of A; if A contains only null values, the result is null.

Distinct and Group By Rows are considered identical, for group by and distinct, if they have all the same non-null values and both have null values in the same columns Distinct removes duplicates of such rows Such rows form a single group when using GROUP BY

Example R B C 1 null 2 3 4 5 SELECT count(*), count(c), min(c), sum(c) FROM (SELECT c FROM R WHERE c IS NULL or c <> NULL GROUP BY c) Returns 1, 0, null, null
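The reasoning behind this answer is that c <> NULL evaluates to UNKNOWN for every row, so only the rows with a null c pass the WHERE clause, and GROUP BY collects them into a single group. The sketch below, assuming Python's standard sqlite3 module, runs the query on hypothetical contents for R (the slide's own table did not survive transcription) and also checks the aggregation rules on the full column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (b INTEGER, c INTEGER)")
# Hypothetical contents: two rows with a null c, two rows with a value.
conn.executemany("INSERT INTO R VALUES (?, ?)",
                 [(1, None), (2, 3), (4, 5), (6, None)])

# Aggregates over the whole column: nulls are skipped by all but count(*).
whole_column = conn.execute(
    "SELECT count(*), count(c), min(c), sum(c) FROM R").fetchone()

# The slide's query: the inner query returns one row whose c is null.
nested = conn.execute(
    "SELECT count(*), count(c), min(c), sum(c) "
    "FROM (SELECT c FROM R WHERE c IS NULL OR c <> NULL GROUP BY c)"
).fetchone()
print(whole_column, nested)
```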

Join Operators in the FROM Clause “Syntactic Sugar”

Shorthand for Conditional Join
SELECT S1.sname, S2.sname FROM Sailors S1, Sailors S2 WHERE S1.sid != S2.sid and S1.sname = 'Rusty'
SELECT S1.sname, S2.sname FROM Sailors S1 JOIN Sailors S2 on S1.sid != S2.sid WHERE S1.sname = 'Rusty'

Shorthand for Equi-Join
SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid and S.age > 20
SELECT S.sname FROM Sailors S JOIN Reserves R USING (sid) WHERE S.age > 20

Shorthand for Natural Join
SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid and S.age > 20
SELECT S.sname FROM Sailors S NATURAL JOIN Reserves R WHERE S.age > 20
Requires equality on all common fields

Outer Join

Left Outer Join The left outer join of R and S contains:
all the tuples in the join of R and S
all the tuples in R that did not join with tuples from S, padded with null values
SELECT Sailors.sid, Reserves.bid FROM Sailors NATURAL LEFT OUTER JOIN Reserves

Result
Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0
Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96
Result
sid  bid
31   null
22   101
58   103
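The result above can be reproduced with a short Python sketch, assuming an in-memory SQLite database loaded with the Sailors and Reserves rows from the slide. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.execute("CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)],
)
conn.executemany(
    "INSERT INTO Reserves VALUES (?, ?, ?)",
    [(22, 101, "10/10/96"), (58, 103, "11/12/96")],
)

# Sailor 31 has no reservation, so its bid is padded with NULL.
rows = conn.execute(
    "SELECT Sailors.sid, Reserves.bid "
    "FROM Sailors NATURAL LEFT OUTER JOIN Reserves "
    "ORDER BY Sailors.sid"
).fetchall()
```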

Right Outer Join, Full Outer Join
The right outer join of R and S contains:
all the tuples in the join of R and S
all the tuples in S that did not join with tuples from R, padded with null values
The full outer join of R and S contains:
all the tuples in the left outer join of R and S
all the tuples in the right outer join of R and S

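Express the Left Outer Join in SQL, without the Left Outer Join Operator Suppose we have R(A,B) and S(B,C). Can you write a query that returns the left outer join of R and S, that does not use the left outer join operator?
Select R.a, R.b, S.c From R, S Where R.b = S.b UNION Select R.a, R.b, null From R Where R.b not in (select b from S)
Both parts of the UNION must list the same three columns. This rewrite assumes that the b columns contain no null values (NOT IN never evaluates to true when S.b contains a null) and that duplicate rows do not matter (UNION removes duplicates, which the outer join would keep)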
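The rewrite can be compared with the real operator in a Python sketch, assuming an in-memory SQLite database and made-up rows for R and S that include nulls in the b columns. The NOT EXISTS variant is a substitution introduced here, not the query from the slide; it is shown because it also pads tuples when nulls are present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE S (b INTEGER, c TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [(1, 10), (2, 20), (3, None)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [(10, "x"), (None, "y")])

# Reference answer using the operator itself.
left_join = conn.execute(
    "SELECT R.a, R.b, S.c FROM R LEFT OUTER JOIN S ON R.b = S.b ORDER BY R.a"
).fetchall()

# Rewrite with NOT IN: loses the unmatched tuples because S.b has a NULL.
with_not_in = conn.execute(
    "SELECT R.a, R.b, S.c FROM R, S WHERE R.b = S.b "
    "UNION ALL "
    "SELECT R.a, R.b, NULL FROM R WHERE R.b NOT IN (SELECT b FROM S) "
    "ORDER BY 1"
).fetchall()

# Rewrite with NOT EXISTS: an unmatched tuple is padded even with NULLs.
with_not_exists = conn.execute(
    "SELECT R.a, R.b, S.c FROM R, S WHERE R.b = S.b "
    "UNION ALL "
    "SELECT R.a, R.b, NULL FROM R "
    "WHERE NOT EXISTS (SELECT 1 FROM S WHERE S.b = R.b) "
    "ORDER BY 1"
).fetchall()
```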

ALL and ANY: Special Cases

Query 1: What does this return? SELECT * FROM Sailors WHERE age > ANY (SELECT age FROM Sailors WHERE sname = 'Joe') Returns sailors that are older than at least one sailor named Joe. (If there is no Joe, no sailors will be returned)

Query 2: What does this return? SELECT * FROM Sailors WHERE age > ALL (SELECT age FROM Sailors WHERE sname = 'Joe') Returns sailors that are older than every sailor named Joe. (If there is no Joe, all sailors will be returned)

Query Containment We say that a query Q1 contains query Q2, if for all databases D, the result of applying Q1 to D contains the result of applying Q2 to D. Does Query 2 contain Query 1? Does Query 1 contain Query 2? (the answer to both questions is no, but why?)
Query 2 does not contain Query 1: a sailor might be older than one Joe but younger than another, so Query 1 returns that sailor and Query 2 does not
Query 1 does not contain Query 2: if there are no Joes, Query 2 returns all sailors while Query 1 returns no answers
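The empty-subquery behaviour of ANY and ALL mirrors Python's built-in any() and all(). The sketch below models the two conditions for a single sailor, assuming all ages are non-null; the function names are hypothetical.

```python
def older_than_any(age, joe_ages):
    # Models: age > ANY (ages of sailors named Joe)
    return any(age > j for j in joe_ages)


def older_than_all(age, joe_ages):
    # Models: age > ALL (ages of sailors named Joe)
    return all(age > j for j in joe_ages)


# A sailor aged 40, with Joes aged 30 and 50: in Query 1, not in Query 2.
in_q1_only = older_than_any(40, [30, 50]) and not older_than_all(40, [30, 50])

# No Joes at all: in Query 2 (vacuously), not in Query 1.
in_q2_only = older_than_all(40, []) and not older_than_any(40, [])
```

Each of the two cases is a counterexample to one direction of containment.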

Query Q3: What does this return? SELECT * FROM Sailors WHERE age = ANY (SELECT age FROM Sailors WHERE sname = 'Joe') Sailors who are the same age as at least one Joe

Query Q4: What does this return? SELECT * FROM Sailors WHERE age IN (SELECT age FROM Sailors WHERE sname = 'Joe') Equivalent to Q3

Query Q5: What does this return? SELECT * FROM Sailors WHERE age <> ANY (SELECT age FROM Sailors WHERE sname = 'Joe') Sailors whose age differs from that of at least one Joe

Query Q6: What does this return? SELECT * FROM Sailors WHERE age NOT IN (SELECT age FROM Sailors WHERE sname = 'Joe') Sailors whose age differs from the ages of all Joes. Not equivalent to Q5. Why?
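The difference between Q5 and Q6 shows up as soon as there are two Joes of different ages. The sketch below assumes an in-memory SQLite database with made-up rows. SQLite has no ANY operator, so Q5 is expressed here with an equivalent EXISTS subquery (a substitution, valid when ages are non-null).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, age REAL)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?)",
    [(1, "Joe", 30.0), (2, "Joe", 40.0), (3, "Ann", 30.0), (4, "Bob", 50.0)],
)

# Q5 (age <> ANY ...): some Joe has a different age.
q5 = conn.execute(
    "SELECT sid FROM Sailors S WHERE EXISTS "
    "(SELECT 1 FROM Sailors J WHERE J.sname = 'Joe' AND S.age <> J.age) "
    "ORDER BY sid"
).fetchall()

# Q6 (age NOT IN ...): every Joe has a different age.
q6 = conn.execute(
    "SELECT sid FROM Sailors WHERE age NOT IN "
    "(SELECT age FROM Sailors WHERE sname = 'Joe') ORDER BY sid"
).fetchall()
```

With two Joes aged 30 and 40, every sailor differs from at least one of them, but only Bob differs from both.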

Views

What is a View? A view is a virtual table. A view is defined by a query; the result of the query is the contents of the virtual table. The view is always up to date with respect to the database: it is not stored, but is computed every time it is referenced. Changing a table (insert/update/delete) automatically changes the view

Defining a View CREATE VIEW <view-name> AS <view-def>; Where view-def is an SQL query Example: CREATE VIEW GreatSailors AS SELECT sid, sname FROM Sailors WHERE rating>=9

Defining a View Another example: CREATE VIEW SailorsDates AS SELECT S.sid, R.day FROM Sailors S, Reserves R WHERE S.sid = R.sid

Querying a View Once you have defined a view, you can use it in a query (in the same way that you use a relation) SELECT sid FROM GreatSailors WHERE sname = ‘Joe’

Querying a View You can use a view and a regular relation together in a query SELECT bname FROM GreatSailors G, Reserves R, Boats B WHERE G.sid = R.sid and R.bid = B.bid

Understanding Queries using Views When writing a query with a view, it is as if the expression defining the view is a sub-query in the FROM clause
SELECT bname FROM GreatSailors G, Reserves R, Boats B WHERE G.sid = R.sid and R.bid = B.bid
SELECT bname FROM (SELECT sid, sname FROM Sailors WHERE rating >= 9) G, Reserves R, Boats B WHERE G.sid = R.sid and R.bid = B.bid
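A short Python sketch of these points, assuming an in-memory SQLite database with the three sailors from the earlier example: the view gives the same answer as the inlined sub-query, and a change to the base table is visible through the view immediately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
    INSERT INTO Sailors VALUES
        (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5), (58, 'Rusty', 10, 35.0);
    CREATE VIEW GreatSailors AS
        SELECT sid, sname FROM Sailors WHERE rating >= 9;
    """
)

view_query = "SELECT sid, sname FROM GreatSailors ORDER BY sid"
inlined_query = (
    "SELECT sid, sname "
    "FROM (SELECT sid, sname FROM Sailors WHERE rating >= 9) G ORDER BY sid"
)

before = conn.execute(view_query).fetchall()
inlined = conn.execute(inlined_query).fetchall()

# The view is recomputed on each reference, so it reflects this update.
conn.execute("UPDATE Sailors SET rating = 9 WHERE sid = 31")
after = conn.execute(view_query).fetchall()
```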

What are views good for? (1) Simplifying complex queries: a view allows the user to "pretend" that there is a single table in the database. Here is an example:
CREATE VIEW SRB as SELECT S.sid, sname, rating, age, R.bid, day, bname, color FROM Sailors S, Boats B, Reserves R WHERE S.sid = R.sid and R.bid = B.bid

What are views good for? (1) Now: Find snames of Sailors who reserved red boats on 1/11/09 using SRB SELECT sname FROM SRB WHERE color = ‘red’ and day = ‘1/11/09’

What are views good for? (2) Security issues: preventing unauthorized access. Example: hiding the rating value
CREATE VIEW SailorInfo AS SELECT sname, sid, age FROM Sailors;
grant SELECT on SailorInfo to shimon;

Modifying Views Sometimes it is possible to insert into, delete from, or update, a view !!! Actually, the user request is translated into a modification of the base tables (the tables used in the view definition) Modifications are possible only when the view is updatable

Updatable Views There are complex rules determining when a view is updatable. Basically, updates are possible when the view is defined by selecting (SELECT, not SELECT DISTINCT) from a single relation R such that:
The WHERE clause does not involve R in a subquery
The FROM clause contains only the one relation R
The list in the SELECT clause includes enough attributes so that, for every tuple inserted through the view, we can fill the other attributes with NULL or a default value

Inserting Example CREATE VIEW GreatSailors AS SELECT sid, sname FROM Sailors WHERE rating>=9
INSERT INTO GreatSailors VALUES(113, 'Sam') is translated into INSERT INTO Sailors(sid, sname) VALUES(113, 'Sam')
Interestingly, we won’t see this tuple when we query GreatSailors: its rating is NULL, so the condition rating>=9 is not TRUE

IMPORTANT NOTE There is no relation GreatSailors Insertion actually affects the table over which GreatSailors is defined, i.e., Sailors Similarly, deletion and updates will affect the underlying tables…

Deleting Example CREATE VIEW GreatSailors AS SELECT sid, sname FROM Sailors WHERE rating>=9
DELETE FROM GreatSailors WHERE sname = 'John' is translated into DELETE FROM Sailors WHERE sname = 'John' and rating>=9
We add the where condition from the view definition to make sure that only tuples appearing in the view are deleted

Updating Example CREATE VIEW GreatSailors AS SELECT sid, sname FROM Sailors WHERE rating>=9
Update GreatSailors SET sname = 'Abraham' WHERE sname = 'John' is translated into Update Sailors SET sname = 'Abraham' WHERE sname = 'John' and rating>=9

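Postgres Support Older versions of Postgres do not support updatable views; the same effect can be achieved using triggers. Since PostgreSQL 9.3, simple views such as GreatSailors are automatically updatable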
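The trigger approach can be sketched in Python with SQLite, whose views are read-only unless an INSTEAD OF trigger is defined. SQLite is used here as a stand-in (an assumption, this is not Postgres), and the trigger name GreatSailorsInsert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
    INSERT INTO Sailors VALUES (58, 'Rusty', 10, 35.0);
    CREATE VIEW GreatSailors AS
        SELECT sid, sname FROM Sailors WHERE rating >= 9;

    -- Redirect inserts on the view to the base table.
    CREATE TRIGGER GreatSailorsInsert INSTEAD OF INSERT ON GreatSailors
    BEGIN
        INSERT INTO Sailors(sid, sname) VALUES (NEW.sid, NEW.sname);
    END;
    """
)

conn.execute("INSERT INTO GreatSailors VALUES (113, 'Sam')")

# The tuple lands in Sailors with a NULL rating ...
base_row = conn.execute(
    "SELECT sid, sname, rating FROM Sailors WHERE sid = 113"
).fetchone()

# ... and is therefore not visible through the view.
in_view = conn.execute("SELECT sid FROM GreatSailors ORDER BY sid").fetchall()
```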

Recursion

Flights Flight(airline, from, to) Airline Frm To El Al Tel Aviv New York Continental Los Angeles Air Canada Toronto Montreal

Can you find? How can you find all places that you can get to by a direct flight from Tel Aviv? SELECT to FROM Flights WHERE frm = ‘Tel Aviv’

Can you find? How can you find all places that you can get to by a flight with one stop-over from Tel Aviv? SELECT F2.to FROM Flights F1, Flights F2 WHERE F1.frm = ‘Tel Aviv’ and F1.to = F2.frm

Can you find? How can you find all places that you can get to by a flight with zero or one stop-over from Tel Aviv? SELECT to FROM Flights WHERE frm = ‘Tel Aviv’ UNION SELECT F2.to FROM Flights F1, Flights F2 WHERE F1.frm = ‘Tel Aviv’ and F1.to = F2.frm

Why can’t you find? How can you find all places that you can get to by a flight with any number of stop-overs from Tel Aviv? Problem: How many times should Flights appear in the FROM clause? Not possible with features seen so far

Recursion in SQL The SQL-99 standard allows us to define temporary relations which can be recursive
WITH R AS <definition of R> <query involving R>
Or more generally:
WITH [RECURSIVE] R1 AS <definition of R1>, …, [RECURSIVE] Rn AS <definition of Rn> <query involving R1, …, Rn>
Note that many systems (for example PostgreSQL and SQLite) expect the keyword RECURSIVE only once, directly after WITH

Example
WITH RECURSIVE Reaches(frm,to) AS (SELECT frm, to FROM Flights) UNION (SELECT R1.frm, R2.to FROM Reaches R1, Reaches R2 WHERE R1.to = R2.frm)
SELECT to FROM Reaches WHERE frm = 'Tel Aviv'
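A runnable variant of this query, sketched in Python with SQLite under several assumptions: the Flights rows are made up for the demonstration, the column to is renamed dst because TO is a reserved word in SQL, and the recursive part joins Reaches with Flights (linear recursion) because SQLite allows only one reference to the recursive table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (airline TEXT, frm TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO Flights VALUES (?, ?, ?)",
    [
        # Assumed rows, chosen to form a chain starting in Tel Aviv.
        ("El Al", "Tel Aviv", "New York"),
        ("Continental", "New York", "Los Angeles"),
        ("Air Canada", "Los Angeles", "Toronto"),
        ("Air Canada", "Toronto", "Montreal"),
    ],
)

reachable = conn.execute(
    """
    WITH RECURSIVE Reaches(frm, dst) AS (
        SELECT frm, dst FROM Flights
        UNION
        SELECT R.frm, F.dst FROM Reaches R, Flights F WHERE R.dst = F.frm
    )
    SELECT dst FROM Reaches WHERE frm = 'Tel Aviv' ORDER BY dst
    """
).fetchall()
```

UNION (rather than UNION ALL) discards pairs that were already derived, so the evaluation also stops when the flight graph has cycles.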

Fix-Point Semantics The value of Reaches is derived by repeatedly evaluating its definition until no changes are made Before starting evaluation, Reaches is empty Then, its definition is repeatedly evaluated, and the result defines Reaches This continues until no more changes appear
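The fix-point evaluation can be sketched directly in Python, with Reaches as a set of (frm, to) pairs. The function name reaches and the flight pairs are assumptions made for illustration.

```python
def reaches(flights):
    """Evaluate the Reaches definition repeatedly until nothing changes."""
    result = set()  # before starting evaluation, Reaches is empty
    while True:
        # One round: the base case plus the join of Reaches with itself.
        new = set(flights) | {
            (a, d) for (a, b) in result for (c, d) in result if b == c
        }
        if new == result:  # no more changes: fix point reached
            return result
        result = new


flights = [
    ("Tel Aviv", "New York"),
    ("New York", "Los Angeles"),
    ("Los Angeles", "Toronto"),
    ("Toronto", "Montreal"),
]
from_tel_aviv = {to for (frm, to) in reaches(flights) if frm == "Tel Aviv"}
```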

Reaches: Step 1 (figure: the Flights table next to the contents of Reaches after the first round of evaluation)

Reaches: Step 2 (figure: Flights, Reaches after step 1, and Reaches after step 2)

Reaches: Step 3 (figure: Flights, Reaches after step 1, and Reaches after step 3)

Reaches: Final Value (figure: Flights, Reaches after step 1, and the final value of Reaches) Query will return values in Red

Mutually Recursive Relations We can define several recursive queries, which can use one another in their definitions. A dependency graph has a node for each relation defined, and an edge from one node to another if the first uses the second in its definition In particular, in the previous example, there would be an edge from Reaches to itself R and S are mutually recursive, if there is a cycle in the graph involving nodes R and S

Example
WITH RECURSIVE P(x) AS (SELECT * FROM R) EXCEPT (SELECT * FROM Q), RECURSIVE Q(x) AS (SELECT * FROM P)
SELECT * FROM P
Dependency graph: P uses R and Q; Q uses P. P and Q are mutually recursive

Problematic Recursion Complicated recursions are allowed However, sometimes the result may not be well defined These cases are not allowed by SQL Before defining exactly what is not allowed, we consider an example

Is there a Fix Point? Recall that the result of a defined relation is derived by simply evaluating it again and again until it no longer changes. However, what happens if this process never terminates?

Example
WITH RECURSIVE P(x) AS (SELECT * FROM R) EXCEPT (SELECT * FROM Q), RECURSIVE Q(x) AS (SELECT * FROM P)
SELECT * FROM P
What is in P and Q if R has the single tuple (0)? At the end of the first round of recursive evaluation, P and Q both have the tuple (0). At the end of the second round, they are both empty, at the end of the third round they both contain the tuple (0), …
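The oscillation can be simulated in a few lines of Python. This sketch assumes that within each round P is evaluated first and Q then uses the new P, which reproduces the sequence described on the slide; the function name simulate is hypothetical.

```python
def simulate(r, rounds):
    """Re-evaluate the definitions of P and Q for a fixed number of rounds."""
    p, q = set(), set()  # both relations start out empty
    history = []
    for _ in range(rounds):
        p = r - q    # P := (SELECT * FROM R) EXCEPT (SELECT * FROM Q)
        q = set(p)   # Q := SELECT * FROM P
        history.append((frozenset(p), frozenset(q)))
    return history


history = simulate({0}, 4)
```

The values alternate between {0} and the empty set forever, so no round ever repeats the previous one and there is no fix point.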

Monotonicity Requirement R can be defined using a mutually recursive relation S only if R is monotone in S, i.e., Adding an arbitrary tuple to S might add tuples to R, or might leave R unchanged, but can never cause a tuple to be deleted from R

Monotonicity Requirement In the previous example, P uses the mutually recursive relation Q, but P is not monotonic in Q (adding tuples to Q can cause tuples to be removed from P). Therefore, this type of recursion is not allowed in SQL