C20.0046: Database Management Systems, Lecture #13. M.P. Johnson, Stern School of Business, NYU, Spring 2005.


1 C20.0046: Database Management Systems, Lecture #13
M.P. Johnson, Stern School of Business, NYU, Spring 2005

2 Agenda
- Finish SQL queries
- Updates and creating tables with SQL
- Indices, views, programs talking to SQL
- 3/10 (next week Thurs, before spring break):
  - Midterm
  - Hw2 due
- Today: returning proj2 at end
  - Mean: 26
  - Stdev: 4

3 Grouping & Aggregation ops
In SQL:
- Aggregation operators in SELECT
- Grouping in GROUP BY clause
Recall aggregation operators:
- sum, avg, min, max, count (strings, numbers, dates)
- Each applies to scalars
- Count also applies to rows: count(*)
- Can use DISTINCT inside an aggregation op: count(DISTINCT x)
Grouping: group rows that agree on a single value
- Each group becomes one row in the result

4 Straight aggregation example
Purchase(product, date, price, quantity)
Q: Find total sales for the entire database:
    SELECT SUM(price * quantity)
    FROM Purchase
Q: Find total sales of bagels:
    SELECT SUM(price * quantity)
    FROM Purchase
    WHERE product = 'bagel'
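The two queries above can be tried on a small in-memory database. What follows is a minimal sketch using Python's built-in sqlite3 module; the sample rows are invented for illustration and are not from the lecture.

```python
import sqlite3

# In-memory database with the Purchase schema from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product TEXT, date TEXT, price REAL, quantity INTEGER)")
# Hypothetical sample rows (not from the lecture).
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?)",
    [
        ("bagel", "2005-03-01", 1.0, 20),
        ("bagel", "2005-03-02", 1.5, 20),
        ("banana", "2005-03-01", 0.5, 10),
    ],
)

# Total sales for the entire table: one aggregate over all rows.
total_sales = conn.execute("SELECT SUM(price * quantity) FROM Purchase").fetchone()[0]

# Total sales of bagels: the WHERE clause filters rows before aggregation.
bagel_sales = conn.execute(
    "SELECT SUM(price * quantity) FROM Purchase WHERE product = 'bagel'"
).fetchone()[0]

print(total_sales, bagel_sales)
```

Both queries return a single row because no GROUP BY is present, so the whole (filtered) table forms one group.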

5 Straight grouping
Group rows together by field values
Produces one row for each group
- I.e., one row per (combination of) grouped value(s)
- Don't select non-grouped fields
Reduces to DISTINCT selections:
    SELECT product
    FROM Purchase
    GROUP BY product
gives the same result as
    SELECT DISTINCT product
    FROM Purchase
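The claim that grouping without aggregates reduces to DISTINCT can be checked directly. This is a small sketch with Python's sqlite3 module and invented sample rows; it only compares the two result sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product TEXT, date TEXT, price REAL, quantity INTEGER)")
# Hypothetical sample rows (not from the lecture).
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?)",
    [
        ("bagel", "2005-03-01", 1.0, 20),
        ("bagel", "2005-03-02", 1.5, 20),
        ("banana", "2005-03-01", 0.5, 10),
    ],
)

# Grouping with no aggregates yields one row per distinct product.
grouped = sorted(row[0] for row in conn.execute("SELECT product FROM Purchase GROUP BY product"))
# DISTINCT removes duplicate products, giving the same set of rows.
distinct = sorted(row[0] for row in conn.execute("SELECT DISTINCT product FROM Purchase"))

print(grouped, distinct)
```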

6 Illustrated G&A example
Sometimes want to group and compute aggregations by group
- Aggregation op applied to rows in group,
- not to all rows in table
Purchase(product, date, price, quantity)
Find total sales for products that sold for > 0.50:
    SELECT product, SUM(price*quantity) total
    FROM Purchase
    WHERE price > .50
    GROUP BY product

7 Illustrated G&A example
[Figure: the Purchase table; contents not preserved in the transcript]

8 Illustrated G&A example
First compute the FROM-WHERE, then GROUP BY product:
[Figure not preserved in the transcript]

9 Illustrated G&A example
Finally, aggregate and select:
    SELECT product, SUM(price*quantity) total
    FROM Purchase
    WHERE price > .50
    GROUP BY product

10 Illustrated G&A example
GROUP BY may be reduced to a (maybe more complicated) subquery:
    SELECT product, SUM(price*quantity) total
    FROM Purchase
    WHERE price > .50
    GROUP BY product
can be rewritten as
    SELECT DISTINCT x.product,
           (SELECT SUM(y.price*y.quantity)
            FROM Purchase y
            WHERE x.product = y.product AND y.price > .50) total
    FROM Purchase x
    WHERE x.price > .50
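To see that the GROUP BY form and the correlated-subquery form agree, both can be run against the same data. The sketch below uses Python's sqlite3 module; the Purchase rows are hypothetical, chosen so that one banana row fails the price > .50 filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product TEXT, date TEXT, price REAL, quantity INTEGER)")
# Hypothetical sample rows (not from the lecture).
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?)",
    [
        ("bagel", "2005-03-01", 1.0, 20),
        ("bagel", "2005-03-02", 1.5, 20),
        ("banana", "2005-03-01", 0.5, 10),   # filtered out: price is not > .50
        ("banana", "2005-03-02", 0.75, 4),
    ],
)

group_by_sql = """
    SELECT product, SUM(price * quantity) total
    FROM Purchase
    WHERE price > .50
    GROUP BY product
"""

# Same result via a correlated subquery evaluated once per outer row.
subquery_sql = """
    SELECT DISTINCT x.product,
           (SELECT SUM(y.price * y.quantity)
            FROM Purchase y
            WHERE x.product = y.product AND y.price > .50) total
    FROM Purchase x
    WHERE x.price > .50
"""

via_group_by = sorted(conn.execute(group_by_sql).fetchall())
via_subquery = sorted(conn.execute(subquery_sql).fetchall())
print(via_group_by, via_subquery)
```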

11 Multiple aggregations
For every product, what are the total sales and max quantity sold?
    SELECT product, SUM(price * quantity) SumSales,
           MAX(quantity) MaxQuantity
    FROM Purchase
    WHERE price > .50
    GROUP BY product

12 Another grouping/aggregation e.g.
Movie(title, year, length, studioName)
Q: How many total minutes of film have been produced by each studio?
Strategy: Divide movies into groups per studio, then add lengths per group

13 Another grouping/aggregation e.g.
Title | Year | Length | Studio
Star Wars | 1977 | 120 | Fox
Jedi | 1980 | 105 | Fox
Aviator | 2004 | 800 | Miramax
Pulp Fiction | 1995 | 110 | Miramax
Lost in Translation | 2003 | 95 | Universal

    SELECT studio, sum(length) totalLength
    FROM Movies
    GROUP BY studio


15 Another grouping/aggregation e.g.
Title | Year | Length | Studio
Star Wars | 1977 | 120 | Fox
Jedi | 1980 | 105 | Fox
Aviator | 2004 | 800 | Miramax
Pulp Fiction | 1995 | 110 | Miramax
Lost in Translation | 2003 | 95 | Universal

    SELECT studio, sum(length) totalLength
    FROM Movies
    GROUP BY studio

Result:
Studio | Length
Fox | 225
Miramax | 910
Universal | 95
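The per-studio totals can be reproduced with the slide's own rows. This is a sketch using Python's sqlite3 module; the column types and the added ORDER BY (used only to make the output order deterministic) are assumptions, not part of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slide gives only names and sample values.
conn.execute("CREATE TABLE Movies (title TEXT, year INTEGER, length INTEGER, studio TEXT)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?, ?, ?)",
    [
        ("Star Wars", 1977, 120, "Fox"),
        ("Jedi", 1980, 105, "Fox"),
        ("Aviator", 2004, 800, "Miramax"),
        ("Pulp Fiction", 1995, 110, "Miramax"),
        ("Lost in Translation", 2003, 95, "Universal"),
    ],
)

# One output row per studio; sum(length) is computed within each group.
totals = conn.execute(
    "SELECT studio, sum(length) totalLength FROM Movies GROUP BY studio ORDER BY studio"
).fetchall()

for studio, total_length in totals:
    print(studio, total_length)
```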

16 Grouping/aggregation example
StarsIn(SName,Title,Year)
Q: Find the year of each star's first movie
    SELECT sname, min(year) firstyear
    FROM StarsIn
    GROUP BY sname
Q: Find the span of each star's career
- Look up first and last movies
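One possible answer to the career-span question is to subtract the first year from the last year within each group. The sketch below assumes that "span" means max(year) - min(year); the star names, titles, and years are placeholders, not lecture data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StarsIn (sname TEXT, title TEXT, year INTEGER)")
# Placeholder rows (hypothetical stars and films).
conn.executemany(
    "INSERT INTO StarsIn VALUES (?, ?, ?)",
    [
        ("Star A", "Film 1", 1984),
        ("Star A", "Film 2", 2004),
        ("Star B", "Film 3", 1977),
    ],
)

# min(year) is the first movie, max(year) the last; their difference is
# taken here as the career span (an assumption about what "span" means).
spans = conn.execute(
    """
    SELECT sname, min(year) firstyear, max(year) - min(year) span
    FROM StarsIn
    GROUP BY sname
    ORDER BY sname
    """
).fetchall()

print(spans)
```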

17 Acc(name,bal,type)
Q: Who has the largest balance of each type?
Can we do this with grouping/aggregation?
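Grouping alone gives the largest balance per type but not the account holder's name, since name is not a grouped field. One possible approach, sketched here with Python's sqlite3 module and invented account rows, is to join the grouped result back to Acc; this is an illustration, not necessarily the answer intended in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Acc (name TEXT, bal INTEGER, type TEXT)")
# Hypothetical accounts (not from the lecture).
conn.executemany(
    "INSERT INTO Acc VALUES (?, ?, ?)",
    [
        ("Ann", 100, "checking"),
        ("Bob", 250, "checking"),
        ("Cat", 900, "savings"),
        ("Dan", 40, "savings"),
    ],
)

# Step 1 (inner query): largest balance per type via grouping.
# Step 2 (join): recover the names of the accounts holding that balance.
largest = conn.execute(
    """
    SELECT a.type, a.name, a.bal
    FROM Acc a
    JOIN (SELECT type, max(bal) AS maxbal FROM Acc GROUP BY type) m
      ON a.type = m.type AND a.bal = m.maxbal
    ORDER BY a.type
    """
).fetchall()

print(largest)
```

If two accounts of the same type tie for the largest balance, this query returns both.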

18 G & A for constructed relations
Movie(title,year,producerSsn,length)
MovieExec(name,ssn,netWorth)
Can do the same thing for larger, non-atomic relations
Q: How many mins. of film did each producer make?
- What happens to non-producer movie-execs?
    SELECT name, sum(length) total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn
    GROUP BY name

19 HAVING clauses
Sometimes we want to limit which tuples may be grouped
Q: How many mins. of film did each rich producer (i.e., netWorth > 10000000) make?
    SELECT name, sum(length) total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn
    GROUP BY name
    HAVING netWorth > 10000000
Q: Is HAVING necessary here?
A: No, could just add the rich requirement to WHERE

20 HAVING clauses
Sometimes we want to limit which tuples may be grouped, based on properties of the group
Q: How many mins. of film did each old producer (i.e., who started before 1930) make?
    SELECT name, sum(length) total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn
    GROUP BY name
    HAVING min(year) < 1930
Q: Is HAVING necessary here?
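The difference between a row-level condition (which fits in WHERE) and a group-level condition (which needs HAVING) can be sketched as follows. The example uses Python's sqlite3 module; all executives and movies are made up, including one executive who produced nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INTEGER, producerSsn INTEGER, length INTEGER)")
conn.execute("CREATE TABLE MovieExec (name TEXT, ssn INTEGER, netWorth INTEGER)")
# Hypothetical data (not from the lecture).
conn.executemany(
    "INSERT INTO MovieExec VALUES (?, ?, ?)",
    [("Old Producer", 1, 5000000), ("Rich Producer", 2, 50000000), ("Non Producer", 3, 20000000)],
)
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?, ?)",
    [("M1", 1925, 1, 90), ("M2", 1935, 1, 100), ("M3", 1990, 2, 120), ("M4", 2000, 2, 110)],
)

# Group-level condition: min(year) depends on the whole group, so it goes in HAVING.
old_producers = conn.execute(
    """
    SELECT name, sum(length) total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn
    GROUP BY name
    HAVING min(year) < 1930
    """
).fetchall()

# Row-level condition: netWorth is a property of each row, so WHERE suffices.
rich_producers = conn.execute(
    """
    SELECT name, sum(length) total
    FROM Movie, MovieExec
    WHERE producerSsn = ssn AND netWorth > 10000000
    GROUP BY name
    """
).fetchall()

print(old_producers, rich_producers)
```

In this data the executive with no movies never appears in either result, because the join condition producerSsn = ssn matches no Movie row for that executive.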

21 General form of G&A
    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY As
    HAVING C2
S = may contain attributes As and/or any aggregates but no other attributes
C1 = condition on the attributes in R1,…,Rn
C2 = condition on aggregations or attributes from As
Why?
NB: “Any attribute of relations in the FROM clause may be aggregated in the HAVING clause, but only those attributes that are in the GROUP BY list may appear unaggregated in the HAVING clause (the same rule as for the SELECT clause)” (Ullman, p283).
Why?

22 Evaluation of G&A
    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2
Evaluation steps:
1. Compute the FROM-WHERE part as usual to obtain a table with all attributes in R1,…,Rn
2. Group by the attributes a1,…,ak
3. Compute the aggregates in C2 and keep only groups satisfying C2
4. Compute aggregates in S and return the result
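The four evaluation steps can be imitated in plain Python to make the order of operations concrete. The sketch below is only a conceptual model (a real DBMS may evaluate the query differently); the rows and the chosen query, with C1 as price > 0.50, grouping on product, and C2 as count(*) >= 2, are hypothetical.

```python
from collections import defaultdict

# Hypothetical Purchase rows: (product, price, quantity).
purchase = [
    ("bagel", 1.0, 20),
    ("bagel", 1.5, 20),
    ("banana", 0.5, 10),
    ("banana", 0.75, 4),
]

# Step 1: FROM-WHERE. Keep rows satisfying C1 (price > 0.50).
rows = [r for r in purchase if r[1] > 0.50]

# Step 2: GROUP BY product. Collect the surviving rows per product.
groups = defaultdict(list)
for r in rows:
    groups[r[0]].append(r)

# Step 3: HAVING. Compute the aggregate in C2 (count(*)) and keep groups with count >= 2.
kept = {product: g for product, g in groups.items() if len(g) >= 2}

# Step 4: SELECT. Compute the aggregates in S (sum of price * quantity) per kept group.
result = sorted((product, sum(p * q for _, p, q in g)) for product, g in kept.items())

print(result)
```

Note that the banana group is dropped in step 3 because only one of its rows survived the WHERE filter in step 1.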

23 Web page examples
Find all authors who wrote at least 10 documents
- Authors(login, name)
- Webpages(url, title, login)
Attempt 1: with nested queries
    SELECT DISTINCT name
    FROM Authors
    WHERE COUNT(SELECT url
                FROM Webpages
                WHERE Authors.login=Webpages.login) >= 10
Bad!

24 Web page examples
Find all authors who wrote at least 10 documents:
Attempt 2: Simplify with GROUP BY
    SELECT name
    FROM Authors, Webpages
    WHERE Authors.login = Webpages.login
    GROUP BY name
    HAVING count(Webpages.url) >= 10
Good!
No need for DISTINCT: get it for free from GROUP BY
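The GROUP BY/HAVING formulation can be exercised on generated data. This sketch uses Python's sqlite3 module with two invented authors and example.org URLs; the threshold is written as >= 10 to match "at least 10 documents".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Authors (login TEXT, name TEXT)")
conn.execute("CREATE TABLE Webpages (url TEXT, title TEXT, login TEXT)")
# Hypothetical authors: ann wrote 10 pages, bob wrote 3.
conn.executemany("INSERT INTO Authors VALUES (?, ?)", [("ann", "Ann"), ("bob", "Bob")])
pages = [(f"http://example.org/ann/{i}", f"Page {i}", "ann") for i in range(10)]
pages += [(f"http://example.org/bob/{i}", f"Page {i}", "bob") for i in range(3)]
conn.executemany("INSERT INTO Webpages VALUES (?, ?, ?)", pages)

# Join authors to their pages, group per author name, keep groups with >= 10 pages.
prolific = [
    row[0]
    for row in conn.execute(
        """
        SELECT name
        FROM Authors, Webpages
        WHERE Authors.login = Webpages.login
        GROUP BY name
        HAVING count(Webpages.url) >= 10
        """
    )
]

print(prolific)
```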

25 Web page examples
Find all authors who have a vocabulary over 10000 words:
- Authors(login, name)
- Webpages(url, title, login)
- Mentions(url, word)
    SELECT name
    FROM Authors, Webpages, Mentions
    WHERE Authors.login = Webpages.login AND
          Webpages.url = Mentions.url
    GROUP BY name
    HAVING count(distinct word) > 10000

26 Summary: SQL queries
Only SELECT, FROM required
Can’t have HAVING without GROUP BY
Can have GROUP BY without HAVING
Any clauses used must appear in this order:
    SELECT L
    FROM Rs
    WHERE s
    GROUP BY L2
    HAVING s2
    ORDER BY L3

27 New topic: Nulls in SQL
If we don’t have a value, can put a NULL
Null can mean several things:
- Value does not exist
- Value exists but is unknown
- Value not applicable
The schema specifies whether null is allowed for each attribute
- NOT NULL if not allowed
- By default, null is allowed

28 Null Values
x = NULL ⇒ 4*(3-x)/7 = NULL
x = NULL ⇒ x + 3 – x = NULL
x = NULL ⇒ 3 + (x-x) = NULL
x = NULL ⇒ x = ‘Joe’ is UNKNOWN
In general: no row whose null fields appear in the selection test will pass the test
- With one exception
Pace Boole, SQL has three boolean values:
- FALSE = 0
- TRUE = 1
- UNKNOWN = 0.5

29 Null values in boolean expressions
C1 AND C2 = min(C1, C2)
C1 OR C2 = max(C1, C2)
NOT C1 = 1 – C1
    SELECT *
    FROM Person
    WHERE (age < 25) AND
          (height > 6 OR weight > 190)
E.g. age=20, height=NULL, weight=180:
- height > 6 = UNKNOWN
- UNKNOWN OR weight > 190 = UNKNOWN
- (age < 25) AND UNKNOWN = UNKNOWN
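The min/max/1-minus encoding of three-valued logic can be written out directly. The following sketch models the WHERE clause above in plain Python; the helper names (and3, or3, not3, less_than, greater_than) are made up for illustration, and None stands in for NULL.

```python
# SQL truth values encoded as numbers, following the slide.
FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0

def and3(c1, c2):
    return min(c1, c2)

def or3(c1, c2):
    return max(c1, c2)

def not3(c1):
    return 1 - c1

def less_than(x, bound):
    """x < bound, or UNKNOWN when x is NULL (None)."""
    if x is None:
        return UNKNOWN
    return TRUE if x < bound else FALSE

def greater_than(x, bound):
    """x > bound, or UNKNOWN when x is NULL (None)."""
    if x is None:
        return UNKNOWN
    return TRUE if x > bound else FALSE

# The slide's example row: age=20, height=NULL, weight=180.
age, height, weight = 20, None, 180
where = and3(less_than(age, 25), or3(greater_than(height, 6), greater_than(weight, 190)))

# A row is returned only when the WHERE condition is TRUE, not UNKNOWN.
row_selected = where == TRUE
print(where, row_selected)
```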

30 Comparing null and non-nulls
Unexpected behavior:
    SELECT *
    FROM Person
    WHERE age < 25 OR age >= 25
Some Persons are not included!
The “trichotomy law” does not hold!

31 Testing for null values
Can test for NULL explicitly:
- x IS NULL
- x IS NOT NULL
But:
- x = NULL is always UNKNOWN
    SELECT *
    FROM Person
    WHERE age < 25 OR age >= 25 OR age IS NULL
Now it includes all Persons
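The effect of NULL on comparisons, and the need for IS NULL, can be observed on a tiny table. This sketch uses Python's sqlite3 module; the Person rows are invented, and the condition age < 25 OR age >= 25 is used here as an assumed example of a test that looks exhaustive but is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, age INTEGER)")
# Hypothetical people; Cat's age is NULL (None in Python).
conn.executemany("INSERT INTO Person VALUES (?, ?)", [("Ann", 20), ("Bob", 30), ("Cat", None)])

def names(where_clause):
    """Return the sorted names of Persons satisfying the given WHERE clause."""
    sql = "SELECT name FROM Person WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))

# Looks exhaustive, but both comparisons are UNKNOWN for a NULL age.
without_null_test = names("age < 25 OR age >= 25")

# Adding an explicit IS NULL test brings the missing row back.
with_null_test = names("age < 25 OR age >= 25 OR age IS NULL")

# Comparing with = NULL is never TRUE, so no row qualifies.
equals_null = names("age = NULL")

print(without_null_test, with_null_test, equals_null)
```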

32 Null/logic review
TRUE AND UNKNOWN = ?
TRUE OR UNKNOWN = ?
UNKNOWN OR UNKNOWN = ?
X = NULL = ?

33 Outerjoin
Like L ⋈ R except that dangling tuples are included, padded with nulls
Left outerjoin: dangling tuples from left are included
- Nulls appear “on the right”
Right outerjoin: dangling tuples from right are included
- Nulls appear “on the left”

34 Join operations
Variations:
- Cross join (Cartesian product)
- Join … On
- Natural join
- Outer join
Apply to relations appearing in selections

35 Cross join - example
MovieStar:
Name | Address | Gender | Birthdate
Hanks | 123 Palm Rd | M | 01/01/60
Taylor | 456 Maple Av | F | 02/02/40
Lucas | 789 Oak St | M | 03/03/55

MovieExec:
Name | Address | Networth
Spielberg | 246 Palm Rd | 10M
Taylor | 456 Maple Av | 20M
Lucas | 789 Oak St | 30M

36 Cross join - example
    SELECT *
    FROM MovieStar CROSS JOIN MovieExec

Row | MovieStar.Name | MovieStar.Address | MovieStar.Gender | MovieStar.Birthdate | MovieExec.Name | MovieExec.Address | MovieExec.Networth
1 | Hanks | 123 Palm Rd | M | 01/01/60 | Spielberg | 246 Palm Rd | 10M
2 | Hanks | 123 Palm Rd | M | 01/01/60 | Taylor | 456 Maple Av | 20M
3 | Hanks | 123 Palm Rd | M | 01/01/60 | Lucas | 789 Oak St | 30M
4 | Taylor | 456 Maple Av | F | 02/02/40 | Spielberg | 246 Palm Rd | 10M
5 | Taylor | 456 Maple Av | F | 02/02/40 | Taylor | 456 Maple Av | 20M
6 | Taylor | 456 Maple Av | F | 02/02/40 | Lucas | 789 Oak St | 30M
7 | Lucas | 789 Oak St | M | 03/03/55 | Spielberg | 246 Palm Rd | 10M
8 | Lucas | 789 Oak St | M | 03/03/55 | Taylor | 456 Maple Av | 20M
9 | Lucas | 789 Oak St | M | 03/03/55 | Lucas | 789 Oak St | 30M

37 Join … On: example
    SELECT *
    FROM MovieStar JOIN MovieExec
    ON MovieStar.Name <> MovieExec.Name

Row | MovieStar.Name | MovieStar.Address | MovieStar.Gender | MovieStar.Birthdate | MovieExec.Name | MovieExec.Address | MovieExec.Networth
1 | Hanks | 123 Palm Rd | M | 01/01/60 | Spielberg | 246 Palm Rd | 10M
2 | Hanks | 123 Palm Rd | M | 01/01/60 | Taylor | 456 Maple Av | 20M
3 | Hanks | 123 Palm Rd | M | 01/01/60 | Lucas | 789 Oak St | 30M
4 | Taylor | 456 Maple Av | F | 02/02/40 | Spielberg | 246 Palm Rd | 10M
5 | Taylor | 456 Maple Av | F | 02/02/40 | Taylor | 456 Maple Av | 20M
6 | Taylor | 456 Maple Av | F | 02/02/40 | Lucas | 789 Oak St | 30M
7 | Lucas | 789 Oak St | M | 03/03/55 | Spielberg | 246 Palm Rd | 10M
8 | Lucas | 789 Oak St | M | 03/03/55 | Taylor | 456 Maple Av | 20M
9 | Lucas | 789 Oak St | M | 03/03/55 | Lucas | 789 Oak St | 30M

Of these nine cross-join rows, the ON condition keeps those whose names differ (rows 1-4 and 6-8) and drops rows 5 and 9.

38 Natural Joins
MovieStar(name, address, gender, birthdate)
MovieExec(name, address, networth)
Natural Join:
    …MovieStar NATURAL JOIN MovieExec
Results in: list of individuals who are movie stars as well as executives:
(name, address, gender, birthdate, networth)

39 Example - Natural join
MovieStar:
Name | Address | Gender | Birthdate
Hanks | 123 Palm Rd | M | 01/01/60
Taylor | 456 Maple Av | F | 02/02/40
Lucas | 789 Oak St | M | 03/03/55

MovieExec:
Name | Address | Networth
Spielberg | 246 Palm Rd | 10M
Taylor | 456 Maple Av | 20M
Lucas | 789 Oak St | 30M

    SELECT *
    FROM MovieStar NATURAL JOIN MovieExec

Result:
Name | Address | Gender | Birthdate | Networth
Taylor | 456 Maple Av | F | 02/02/40 | 20M
Lucas | 789 Oak St | M | 03/03/55 | 30M
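The cross join, the JOIN … ON with <>, and the natural join can all be run against the slide's two tables. This is a sketch using Python's sqlite3 module; storing every column as text is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# All columns stored as text for simplicity (an assumption, not from the slides).
conn.execute("CREATE TABLE MovieStar (name TEXT, address TEXT, gender TEXT, birthdate TEXT)")
conn.execute("CREATE TABLE MovieExec (name TEXT, address TEXT, networth TEXT)")
conn.executemany(
    "INSERT INTO MovieStar VALUES (?, ?, ?, ?)",
    [
        ("Hanks", "123 Palm Rd", "M", "01/01/60"),
        ("Taylor", "456 Maple Av", "F", "02/02/40"),
        ("Lucas", "789 Oak St", "M", "03/03/55"),
    ],
)
conn.executemany(
    "INSERT INTO MovieExec VALUES (?, ?, ?)",
    [
        ("Spielberg", "246 Palm Rd", "10M"),
        ("Taylor", "456 Maple Av", "20M"),
        ("Lucas", "789 Oak St", "30M"),
    ],
)

# Cross join: every star paired with every exec (3 x 3 rows).
cross = conn.execute("SELECT * FROM MovieStar CROSS JOIN MovieExec").fetchall()

# Join ... On: only the pairs whose names differ.
different_names = conn.execute(
    "SELECT * FROM MovieStar JOIN MovieExec ON MovieStar.name <> MovieExec.name"
).fetchall()

# Natural join: match on the common columns (name, address) and merge them.
natural = sorted(conn.execute("SELECT * FROM MovieStar NATURAL JOIN MovieExec").fetchall())

print(len(cross), len(different_names), natural)
```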

40 Outer Join - Example
    SELECT *
    FROM MovieStar LEFT OUTER JOIN MovieExec
    ON MovieStar.name=MovieExec.name

    SELECT *
    FROM MovieStar RIGHT OUTER JOIN MovieExec
    ON MovieStar.name=MovieExec.name

Name | Address | Gender | Birthdate | Networth
Hanks | 123 Palm Rd | M | 01/01/60 | NULL
Spielberg | 246 Palm Rd | NULL | NULL | 10M
Taylor | 456 Maple Av | F | 02/02/40 | 20M
Lucas | 789 Oak St | M | 03/03/55 | 30M

The left outer join returns the Hanks, Taylor, and Lucas rows; the right outer join returns the Spielberg, Taylor, and Lucas rows.

41 Outer Join - Example
MovieStar:
Name | Address | Gender | Birthdate
Hanks | 123 Palm Rd | M | 01/01/60
Taylor | 456 Maple Av | F | 02/02/40
Lucas | 789 Oak St | M | 03/03/55

MovieExec:
Name | Address | Networth
Spielberg | 246 Palm Rd | 10M
Taylor | 456 Maple Av | 20M
Lucas | 789 Oak St | 30M

    SELECT *
    FROM MovieStar FULL OUTER JOIN MovieExec
    ON MovieStar.name=MovieExec.name

Result:
Name | Address | Gender | Birthdate | Networth
Hanks | 123 Palm Rd | M | 01/01/60 | NULL
Spielberg | 246 Palm Rd | NULL | NULL | 10M
Taylor | 456 Maple Av | F | 02/02/40 | 20M
Lucas | 789 Oak St | M | 03/03/55 | 30M
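The three outer joins can be sketched with Python's sqlite3 module on the slide's data. Because some SQLite versions do not support RIGHT or FULL OUTER JOIN, this sketch emulates the right join by swapping the operands of a LEFT join and builds the full join as the union of the two; that emulation is a workaround chosen here, not the syntax shown on the slides. Only three columns are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MovieStar (name TEXT, address TEXT, gender TEXT, birthdate TEXT)")
conn.execute("CREATE TABLE MovieExec (name TEXT, address TEXT, networth TEXT)")
conn.executemany(
    "INSERT INTO MovieStar VALUES (?, ?, ?, ?)",
    [
        ("Hanks", "123 Palm Rd", "M", "01/01/60"),
        ("Taylor", "456 Maple Av", "F", "02/02/40"),
        ("Lucas", "789 Oak St", "M", "03/03/55"),
    ],
)
conn.executemany(
    "INSERT INTO MovieExec VALUES (?, ?, ?)",
    [
        ("Spielberg", "246 Palm Rd", "10M"),
        ("Taylor", "456 Maple Av", "20M"),
        ("Lucas", "789 Oak St", "30M"),
    ],
)

# Left outer join: every star is kept; networth is NULL (None) for non-execs.
left = sorted(conn.execute(
    "SELECT s.name, s.gender, e.networth "
    "FROM MovieStar s LEFT OUTER JOIN MovieExec e ON s.name = e.name"
).fetchall())

# Right outer join, emulated by swapping the tables in a LEFT join:
# every exec is kept; gender is NULL (None) for non-stars.
right = sorted(conn.execute(
    "SELECT e.name, s.gender, e.networth "
    "FROM MovieExec e LEFT OUTER JOIN MovieStar s ON s.name = e.name"
).fetchall())

# Full outer join, emulated as the union of the two one-sided joins.
full = sorted(set(left) | set(right))

print(left)
print(right)
print(full)
```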

42 New-style join syntax
Old-style syntax simply lists tables separated by commas:
    SELECT *
    FROM A,B
    WHERE …;
New-style makes the join explicit:
    SELECT *
    FROM A JOIN B ON …
    WHERE …;

43 New-style join syntax
Functionally equivalent to old-style, but perhaps more elegant
Introduced in Oracle 8i, MySQL 3.x/4.x
Older versions / other DBMSs may only support old-style syntax

44 New-style join types
Cross joins (simplest):
- …FROM A CROSS JOIN B
Inner joins (regular joins):
- …FROM A [INNER] JOIN B ON …
Natural join:
- …FROM A NATURAL JOIN B;
- Joins on common fields and merges
Outer joins

45 New-style outer joins
Outer joins may be left, right, or full
- FROM A LEFT [OUTER] JOIN B;
- FROM A RIGHT [OUTER] JOIN B;
- FROM A FULL [OUTER] JOIN B;
“OUTER” is optional
If “OUTER” is included, then “FULL” is the default
Q: How to remember left v. right?
A: It indicates the side whose rows are always included

46 Old-style outer joins in Oracle
Outer joins can also be done with the old-style syntax, but with the (+)
    …WHERE A.att=B.att(+)
corresponds to:
    …FROM A LEFT JOIN B;
The (+) is applied to all B attributes referred to in the WHERE clause
Q: How to remember which side gets the (+)?
A: The side that gets null rows “added”

