CMSC424: Database Design Instructor: Amol Deshpande
Today
  Mapping relational algebra to SQL
  Complex SQL queries
    See “movies database sql queries” on the class webpage for the actual queries
  SQL formal semantics
  Advanced SQL features
SQL Query Examples
  Movie(title, year, length, inColor, studioName, producerC#)
  StarsIn(movieTitle, movieYear, starName)
  MovieStar(name, address, gender, birthdate)
  MovieExec(name, address, cert#, netWorth)
  Studio(name, address, presC#)
  Queries:
    Producer with maximum average length of movies
    Find producer of Star Wars
    All producers of movies in which Harrison Ford stars
SQL Query Examples (schema as on the previous slide)
  Queries:
    Find movie titles that appear more than once
    Find number of people 3 hops away from Kevin Bacon
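Two of the queries listed on these slides can be sketched as follows. This is only an illustration, not the version posted on the class webpage: it uses Python's built-in sqlite3 module as a stand-in DBMS, renames producerC# and cert# to producerC and cert (an assumption, to avoid quoting identifiers), and the sample rows are invented.

```python
import sqlite3

# Hypothetical sample data; producerC and cert stand in for producerC# and cert#.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INT, length INT, producerC INT);
    CREATE TABLE MovieExec(name TEXT, cert INT, netWorth INT);
    INSERT INTO MovieExec VALUES ('Exec A', 1, 100), ('Exec B', 2, 200);
    INSERT INTO Movie VALUES
        ('Star Wars', 1977, 124, 1),
        ('Remake Me', 1950, 90, 2),
        ('Remake Me', 1990, 110, 2);
""")

# Find producer of Star Wars: join Movie to MovieExec on the certificate number.
producers = [row[0] for row in conn.execute("""
    SELECT DISTINCT e.name
    FROM Movie m, MovieExec e
    WHERE m.producerC = e.cert AND m.title = 'Star Wars'
""")]

# Find movie titles that appear more than once: group by title,
# then keep only the groups containing more than one tuple.
repeated = [row[0] for row in conn.execute("""
    SELECT title
    FROM Movie
    GROUP BY title
    HAVING COUNT(*) > 1
""")]

print(producers, repeated)
```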
More SQL
  Select * into temptable from X1, …
  Having
    WHERE is to FROM what HAVING is to GROUP BY
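The WHERE/HAVING analogy can be seen in one query. The sketch below assumes a cut-down Movie table with invented rows and uses sqlite3 as a stand-in DBMS: WHERE discards individual tuples before grouping, and HAVING discards whole groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INT, length INT, studioName TEXT);
    -- Hypothetical rows, for illustration only.
    INSERT INTO Movie VALUES
        ('M1', 1980, 100, 'Studio X'),
        ('M2', 1995, 120, 'Studio X'),
        ('M3', 1999, 140, 'Studio X'),
        ('M4', 1996, 90,  'Studio Y');
""")

rows = conn.execute("""
    SELECT studioName, AVG(length)
    FROM Movie
    WHERE year >= 1990        -- filters tuples produced by FROM
    GROUP BY studioName
    HAVING COUNT(*) >= 2      -- filters groups produced by GROUP BY
""").fetchall()

print(rows)
```

Here M1 is removed by WHERE, and the Studio Y group (one remaining movie) is removed by HAVING.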
Duplicates
  By definition, relations are sets, so no duplicates are allowed
  Problem: it is not practical to remove duplicates after every operation
    Why?
  So SQL by default does not remove duplicates
    SQL follows bag semantics, not set semantics
    Implicitly, we keep a count of the number of copies of each tuple
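A minimal sketch of this default behavior, using sqlite3 as a stand-in DBMS. The homes table and its rows are assumed here for illustration: a plain SELECT keeps every copy, and only SELECT DISTINCT pays for duplicate removal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE homes(cname TEXT, ccity TEXT);
    INSERT INTO homes VALUES
        ('Johnson', 'Brighton'), ('Smith', 'Perry'),
        ('Johnson', 'Brighton'), ('Smith', 'R.H.');
""")

# Bag semantics: the projection keeps one output tuple per input tuple.
bag = conn.execute("SELECT cname FROM homes ORDER BY cname").fetchall()

# Set semantics must be requested explicitly with DISTINCT.
as_set = conn.execute("SELECT DISTINCT cname FROM homes ORDER BY cname").fetchall()

print(bag)
print(as_set)
```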
Formal Semantics of SQL
  RA can only express SELECT DISTINCT queries
  To express SQL, must extend RA to a bag algebra
    Bags (aka multisets): like sets, but can have duplicates
    e.g.: {5, 3, 3}
    e.g.: homes =
      cname    ccity
      Johnson  Brighton
      Smith    Perry
      Johnson  Brighton
      Smith    R.H.
  Next: will define RA*, a bag version of RA
Formal Semantics of SQL: RA*
  1. σ*_p(r): preserves copies in r
     e.g.: σ*_{ccity = Brighton}(homes) =
       cname    ccity
       Johnson  Brighton
       Johnson  Brighton
  2. π*_{A1, …, An}(r): no duplicate elimination
     e.g.: π*_{cname}(homes) =
       cname
       Johnson
       Smith
       Johnson
       Smith
Formal Semantics of SQL: RA*
  3. r ∪* s: additive union
     r =
       A  B
       1  α
       1  α
       2  β
     s =
       A  B
       2  β
       3  α
       1  α
     r ∪* s =
       A  B
       1  α
       1  α
       2  β
       2  β
       3  α
       1  α
  4. r -* s: bag difference
     e.g.: r -* s =
       A  B
       1  α
     s -* r =
       A  B
       3  α
Formal Semantics of SQL: RA*
  5. r ×* s: cartesian product
     r =
       A  B
       1  α
       1  α
       2  β
     s =
       C
       +
       -
     r ×* s =
       A  B  C
       1  α  +
       1  α  -
       1  α  +
       1  α  -
       2  β  +
       2  β  -
Formal Semantics of SQL
  Query:
      SELECT a1, …, an
      FROM r1, …, rm
      WHERE p
  Semantics: π*_{A1, …, An}(σ*_p(r1 ×* … ×* rm))    (1)
  Query:
      SELECT DISTINCT a1, …, an
      FROM r1, …, rm
      WHERE p
  Semantics: What is the only operator to change in (1)?
      π_{A1, …, An}(σ*_p(r1 ×* … ×* rm))    (2)
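The two translations can be traced with a small bag-algebra sketch in Python. Everything here is an illustrative assumption: bags are modeled as lists of dicts, the function names are made up, attribute names are assumed to be distinct across relations, and the homes rows are sample data.

```python
from itertools import product

def bag_product(*relations):
    """x*: one output tuple for every combination of input tuples."""
    out = []
    for combo in product(*relations):
        merged = {}
        for t in combo:
            merged.update(t)  # assumes attribute names do not collide
        out.append(merged)
    return out

def bag_select(pred, r):
    """sigma*: keep every copy of each tuple that satisfies pred."""
    return [t for t in r if pred(t)]

def bag_project(attrs, r):
    """pi*: keep only attrs, with no duplicate elimination."""
    return [tuple(t[a] for a in attrs) for t in r]

def set_project(attrs, r):
    """pi: like pi*, but duplicates are removed (first occurrence kept)."""
    return list(dict.fromkeys(bag_project(attrs, r)))

homes = [
    {"cname": "Johnson", "ccity": "Brighton"},
    {"cname": "Smith", "ccity": "Perry"},
    {"cname": "Johnson", "ccity": "Brighton"},
    {"cname": "Smith", "ccity": "R.H."},
]

def in_brighton(t):
    return t["ccity"] == "Brighton"

# (1) SELECT cname FROM homes WHERE ccity = 'Brighton'
select_all = bag_project(["cname"], bag_select(in_brighton, bag_product(homes)))

# (2) SELECT DISTINCT ...: only the outermost projection changes.
select_distinct = set_project(["cname"], bag_select(in_brighton, bag_product(homes)))

print(select_all)
print(select_distinct)
```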
Set/Bag Operations Revisited
  Set operations
    UNION ≡ ∪
    INTERSECT ≡ ∩
    EXCEPT ≡ -
  Bag operations
    UNION ALL ≡ ∪*
    INTERSECT ALL ≡ ∩*
    EXCEPT ALL ≡ -*
  Duplicate counting: given m copies of t in r and n copies of t in s, how many copies of t in:
    r UNION ALL s?      A: m + n
    r INTERSECT ALL s?  A: min(m, n)
    r EXCEPT ALL s?     A: max(0, m - n)
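The three counting rules happen to match the arithmetic of Python's collections.Counter, which makes for a quick check. The values m = 3 and n = 5 and the tuple label "t" are arbitrary choices for this sketch.

```python
from collections import Counter

m, n = 3, 5
r = Counter({"t": m})  # m copies of tuple t in r
s = Counter({"t": n})  # n copies of tuple t in s

union_all = r + s      # UNION ALL: counts add up
intersect_all = r & s  # INTERSECT ALL: minimum of the counts
r_except_s = r - s     # EXCEPT ALL: counts subtract, never below zero
s_except_r = s - r

print(union_all["t"], intersect_all["t"], r_except_s["t"], s_except_r["t"])
```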
SQL: Summary
  Clause               Eval Order   Semantics (RA/RA*)
  SELECT [DISTINCT]                 π (or π*)
  FROM                              ×*
  WHERE                             σ*
  INTO                              ←
  GROUP BY                          Extended relational operator g
  HAVING                            σ*
  ORDER BY                          Can’t express: requires ordered sets, bags
  AS                                ρ
  UNION ALL                         ∪*
  UNION                             ∪
  (similarly intersection, except)
Next Database updates
Modification of the Database – Deletion
  Delete all account records at the Perryridge branch
      delete from account
      where branch-name = 'Perryridge'
  Delete all accounts at every branch located in Needham city.
      delete from account
      where branch-name in (select branch-name
                            from branch
                            where branch-city = 'Needham')
      delete from depositor
      where account-number in (select account-number
                               from branch, account
                               where branch-city = 'Needham'
                               and branch.branch-name = account.branch-name)
Example Query
  Delete the record of all accounts with balances below the average at the bank.
      delete from account
      where balance < (select avg (balance)
                       from account)
  Problem: as we delete tuples from account, the average balance changes
  Solution used in SQL:
    1. First, compute avg balance and find all tuples to delete
    2. Next, delete all tuples found above (without recomputing avg or retesting the tuples)
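The two-step evaluation can be spelled out by hand. The sketch below uses sqlite3 as a stand-in DBMS, with underscore column names and invented balances chosen so that recomputing the average after each deletion would give a different answer: the initial average is 400, so only the 100 balance should go, whereas a recomputed average of 550 would also remove 500.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account(account_number TEXT, branch_name TEXT, balance INT);
    -- Hypothetical rows, for illustration only.
    INSERT INTO account VALUES
        ('A-1', 'Perryridge', 100),
        ('A-2', 'Perryridge', 500),
        ('A-3', 'Perryridge', 600);
""")

# Step 1: compute the average once and find all tuples to delete.
avg_balance = conn.execute("SELECT AVG(balance) FROM account").fetchone()[0]
to_delete = [row[0] for row in conn.execute(
    "SELECT account_number FROM account WHERE balance < ?", (avg_balance,))]

# Step 2: delete exactly those tuples, without recomputing the average.
conn.executemany("DELETE FROM account WHERE account_number = ?",
                 [(acct,) for acct in to_delete])

remaining = [row[0] for row in conn.execute(
    "SELECT balance FROM account ORDER BY balance")]
print(avg_balance, to_delete, remaining)
```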
Modification of the Database – Insertion
  Add a new tuple to account
      insert into account
      values ('A-9732', 'Perryridge', 1200)
    or equivalently
      insert into account (branch-name, balance, account-number)
      values ('Perryridge', 1200, 'A-9732')
  Add a new tuple to account with balance set to null
      insert into account
      values ('A-777', 'Perryridge', null)
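A quick check that the two insert forms produce the same tuple, again with sqlite3 as a stand-in DBMS. The column order (account_number, branch_name, balance) and the underscore names are assumptions made for this sketch, and the table has no key so both copies can be stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account(account_number TEXT, branch_name TEXT, balance INT)")

# Positional form: values follow the column order of the table definition.
conn.execute("INSERT INTO account VALUES ('A-9732', 'Perryridge', 1200)")

# Explicit column list: values follow the order given in the list instead.
conn.execute("""
    INSERT INTO account (branch_name, balance, account_number)
    VALUES ('Perryridge', 1200, 'A-9732')
""")

# A null balance is inserted like any other value.
conn.execute("INSERT INTO account VALUES ('A-777', 'Perryridge', NULL)")

copies = conn.execute(
    "SELECT * FROM account WHERE account_number = 'A-9732'").fetchall()
null_row = conn.execute(
    "SELECT balance FROM account WHERE account_number = 'A-777'").fetchone()
print(copies, null_row)
```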
Modification of the Database – Updates
  Increase all accounts with balances over $10,000 by 6%; all other accounts receive 5%.
  Write two update statements:
      update account
      set balance = balance * 1.06
      where balance > 10000
      update account
      set balance = balance * 1.05
      where balance <= 10000
  The order is important
  Can be done better using the case statement
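Why the order matters, and how a case expression avoids the issue, can be sketched with sqlite3 as a stand-in DBMS. The two accounts are invented; A-1 starts at 9600, so applying the 5% update first pushes it over $10,000 and it then wrongly receives the 6% as well.

```python
import sqlite3

def fresh_account():
    """Build a hypothetical two-row account table for each experiment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account(account_number TEXT, balance REAL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A-1", 9600.0), ("A-2", 20000.0)])
    return conn

def balances(conn):
    """Return {account_number: balance rounded to cents}."""
    rows = conn.execute("SELECT account_number, balance FROM account")
    return {acct: round(bal, 2) for acct, bal in rows}

SIX = "UPDATE account SET balance = balance * 1.06 WHERE balance > 10000"
FIVE = "UPDATE account SET balance = balance * 1.05 WHERE balance <= 10000"

# Order from the slide: 6% first, then 5%.
conn = fresh_account()
conn.execute(SIX)
conn.execute(FIVE)
right_order = balances(conn)

# Reversed order: A-1 gets 5%, crosses 10000, then also gets 6%.
conn = fresh_account()
conn.execute(FIVE)
conn.execute(SIX)
wrong_order = balances(conn)

# Single statement with CASE: each tuple is tested once, so order is moot.
conn = fresh_account()
conn.execute("""
    UPDATE account
    SET balance = CASE
                      WHEN balance > 10000 THEN balance * 1.06
                      ELSE balance * 1.05
                  END
""")
with_case = balances(conn)

print(right_order, wrong_order, with_case)
```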