M.P. Johnson, DBMS, Stern/NYU, Sp2004. C20.0046: Database Management Systems, Lecture #9. Matthew P. Johnson, Stern School of Business, NYU, Spring 2004.


1 C20.0046: Database Management Systems, Lecture #9. Matthew P. Johnson, Stern School of Business, NYU, Spring 2004

2 Agenda
- Last time: 4NF, RA
- This time:
  1. More RA
  2. Bags
- Project Part 2 due next time

3 Normalization Review
- Q: What’s required for BCNF?
- Q: What’s the loophole for 3NF?
- Q: How do we fix a non-BCNF, non-3NF, or non-4NF relation?
- Q: When are FDs also MVDs?
- Q: When are MVDs also FDs?

4 Relational Algebra Review
Five basic operators:
- Union: ∪
- Difference: −
- Selection: σ
- Projection: π
- Cartesian product: ×
Derived and extended operators:
- Intersection: ∩ (derivable from difference: R ∩ S = R − (R − S))
- Joins (natural, equijoin, theta join, semijoin)
- Renaming: ρ
- Extended projection

5 Renaming
- Changes the schema, not the instance
- Notation: ρ_{B1,…,Bn}(R)
- ρ is spelled "rho", pronounced "row"
- Example: Employee(ssn, name)
  ρ_{social, name}(Employee)
  Or just: ρ_{…}(Employee)

6 Complex RA Expressions
Q: How long was Star Wars (1977)?
Strategy: find the row with Star Wars; then project the length field
  Title      Year  Length  inColor  Studio     Prdcr#
  Star Wars  1977  124     True     Fox        12345
  M.Ducks    1991  104     True     Disney     67890
  W.World    1992  95      True     Paramount  99999

7 Combining operations
Q: Which Fox movies are at least 100 minutes long?
  Title          Year  Length  Filmtype  Studio
  Star wars      1977  124     Color     Fox
  Mighty ducks   1991  104     Color     Disney
  Wayne’s world  1992  85      Color     Paramount

8 Complex RA Expressions
Reps(ssn, name, etc.)
Clients(ssn, name, rssn)
Q: Who are George’s clients?
  π_{Clients.name}(σ_{Reps.name="George"}(σ_{Reps.ssn=rssn}(Reps × Clients)))
Or:
  π_{Clients.name}(σ_{Reps.name="George" AND Reps.ssn=rssn}(Reps × Clients))
Or:
  π_{Clients.name}(σ_{Reps.name="George"}(Reps × Clients) ∩ σ_{Reps.ssn=rssn}(Reps × Clients))
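The three formulations of the George's-clients query can be checked against each other on a small example. This is only a sketch: the rows below are hypothetical sample data (only the schemas Reps and Clients come from the slide), and Python sets stand in for the set-based projection.

```python
from itertools import product

# Hypothetical sample rows; only the schemas come from the slide.
reps = [{"ssn": 1, "name": "George"}, {"ssn": 2, "name": "Martha"}]
clients = [
    {"ssn": 10, "name": "Ann", "rssn": 1},
    {"ssn": 11, "name": "Bob", "rssn": 2},
    {"ssn": 12, "name": "Cy", "rssn": 1},
]

# Reps x Clients: every (rep, client) pair.
pairs = list(product(reps, clients))

def is_george(pair):
    return pair[0]["name"] == "George"

def is_own_rep(pair):
    return pair[0]["ssn"] == pair[1]["rssn"]

# Form 1: one selection nested inside the other.
inner = [p for p in pairs if is_own_rep(p)]
nested = {p[1]["name"] for p in inner if is_george(p)}

# Form 2: a single selection with a conjunctive condition.
conjunction = {p[1]["name"] for p in pairs if is_george(p) and is_own_rep(p)}

# Form 3: intersect the two selections (tracked by position in the product).
george_idx = {i for i, p in enumerate(pairs) if is_george(p)}
own_idx = {i for i, p in enumerate(pairs) if is_own_rep(p)}
intersection = {pairs[i][1]["name"] for i in george_idx & own_idx}

print(nested, conjunction, intersection)
```

All three variables hold the same set of client names, which is the point of the slide: the forms differ only in how the selection is written.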

9 Complex RA Expressions
People(ssn, name, street, city, state)
- Assume for clarity that cities are unique
Q: Who lives on George’s street?
A: First, find George:
  σ_{name="George"}(People)
Get George’s street/city:
  π_{street,city}(σ_{name="George"}(People))
Cross with People:
  People × π_{street,city}(σ_{name="George"}(People))

10 Complex RA Expressions
How to specify street = street? Rename the projected copy of George’s street/city:
  People × ρ_{p2(s2,c2)}(π_{street,city}(σ_{name="George"}(People)))
Now can select:
  σ_{street=s2 AND city=c2}(People × ρ_{p2(s2,c2)}(π_{street,city}(σ_{name="George"}(People))))
Then project names…
Only way? No. Join!
  People ⋈ π_{street,city}(σ_{name="George"}(People))
Q: Would the following work?
  π_{street,city}(σ_{name="George"}(People ⋈ People))
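The two routes to "who lives on George's street" (rename, cross, then select; or a natural join) can be compared on toy data. A minimal sketch, assuming hypothetical rows for People; the names p2, s2 and c2 follow the slide.

```python
# Hypothetical sample rows; only the schema comes from the slides.
people = [
    {"ssn": 1, "name": "George", "street": "Elm St", "city": "Albany", "state": "NY"},
    {"ssn": 2, "name": "Hana", "street": "Elm St", "city": "Albany", "state": "NY"},
    {"ssn": 3, "name": "Ivan", "street": "Elm St", "city": "Boston", "state": "MA"},
    {"ssn": 4, "name": "Jo", "street": "Oak Ave", "city": "Albany", "state": "NY"},
]

# Project street and city out of the selection name = "George".
george_addr = {(p["street"], p["city"]) for p in people if p["name"] == "George"}

# Route 1: rename the projected columns to p2(s2, c2), cross with People,
# then select street = s2 AND city = c2 and project the names.
p2 = [{"s2": s, "c2": c} for s, c in george_addr]
via_product = {
    p["name"]
    for p in people
    for q in p2
    if p["street"] == q["s2"] and p["city"] == q["c2"]
}

# Route 2: natural join, which matches on the shared attributes street and city.
via_join = {p["name"] for p in people if (p["street"], p["city"]) in george_addr}

print(via_product, via_join)
```

Both routes return George and his neighbor on the same street in the same city; Ivan (same street name, different city) and Jo are excluded.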

11 Complex RA Expressions
Scenario:
  1. Purchase(pid, seller-ssn, buyer-ssn, etc.)
  2. Person(ssn, name, etc.)
  3. Product(pid, name, etc.)
Q: Who (give names) bought gizmos from Dick?
Where to start? Purchase uses pid, ssn, so must get them…

12 Complex RA Expressions
Expression tree (leaves: Person, Purchase, Person, Product), written out as one expression:
  π_{name}(Person ⋈_{buyer-ssn=Person.ssn} ((π_{ssn}(σ_{name="Dick"}(Person)) ⋈_{seller-ssn=ssn} Purchase) ⋈_{pid=pid} π_{pid}(σ_{name="Gizmo"}(Product))))
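The plan for "who bought gizmos from Dick" can be traced step by step in code: find Dick's ssn, find the Gizmo pid, filter Purchase on both, then look up the buyers' names. This is a sketch over hypothetical rows; only the three schemas come from the slides.

```python
# Hypothetical sample rows; only the schemas come from the slides.
person = [(1, "Dick"), (2, "Eve"), (3, "Fay")]           # (ssn, name)
product_rel = [(100, "Gizmo"), (200, "Widget")]          # (pid, name)
purchase = [(100, 1, 2), (200, 1, 3), (100, 3, 2), (100, 3, 1)]  # (pid, seller, buyer)

# Project ssn from the selection name = "Dick" on Person.
dick_ssn = {ssn for ssn, name in person if name == "Dick"}

# Project pid from the selection name = "Gizmo" on Product.
gizmo_pid = {pid for pid, name in product_rel if name == "Gizmo"}

# Join Purchase with both results (seller-ssn = ssn, pid = pid).
buyer_ssn = {
    buyer for pid, seller, buyer in purchase
    if seller in dick_ssn and pid in gizmo_pid
}

# Join with Person on buyer-ssn = Person.ssn and project the name.
buyers = {name for ssn, name in person if ssn in buyer_ssn}

print(buyers)
```

Pushing the two selections down to the leaves before joining, as the tree does, keeps the intermediate results small.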

13 Complex RA Expressions
Acc(name, ssn, balance)
Q: Who has the largest balance?
First, get two copies of the relation to play with:
  Acc × ρ_{a2}(Acc)
Now, consider this:
  σ_{a2.balance < Acc.balance}(Acc × ρ_{a2}(Acc))
Q: What does it give us?
Now, subtract the names:
  π_{name}(Acc) − π_{a2.name}(σ_{a2.balance < Acc.balance}(Acc × ρ_{a2}(Acc)))
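The largest-balance trick (collect every name that is beaten by someone, then subtract) can be sketched as follows. The account rows are hypothetical, and the sketch assumes names are unique, as the slide's use of names in the difference implicitly does.

```python
# Hypothetical sample rows; only the schema Acc(name, ssn, balance) is from the slide.
acc = [
    {"name": "Ann", "ssn": 1, "balance": 500},
    {"name": "Bob", "ssn": 2, "balance": 900},
    {"name": "Cy", "ssn": 3, "balance": 900},
    {"name": "Di", "ssn": 4, "balance": 100},
]

# Self-product, keep pairs where the a2 copy has the smaller balance,
# then project a2.name: these are the accounts that are NOT the largest.
not_largest = {
    a2["name"]
    for a in acc
    for a2 in acc
    if a2["balance"] < a["balance"]
}

# Subtract them from all names.
largest = {a["name"] for a in acc} - not_largest

print(largest)
```

Ties survive: both Bob and Cy remain, since neither has a strictly smaller balance than any other account.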

14 Confession
Relations aren’t really sets! They’re bags!

15 Bag Theory (5.3)
Bags: like sets but elements may repeat
- Also called "multisets"
Set ops change somewhat when applied to bags
- Intuition: pretend identical elements are distinct
  {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
  {a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
  {a,b,b,b,c,c} ∩ {b,c,c,c,d} = {b,c,c}
Reading assignment: 5.3 – 5.4
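The three bag examples can be reproduced with Python's `collections.Counter`, whose `+`, `-` and `&` operators add counts, subtract counts (dropping anything that falls to zero or below), and take the minimum count. This is one convenient way to model bags, not something the lecture itself prescribes.

```python
from collections import Counter

def as_sorted_list(bag):
    """Expand a Counter back into a sorted list of its elements."""
    return sorted(bag.elements())

# Bag union: counts add up.
union = Counter("abbc") + Counter("abbbeff")

# Bag difference: counts subtract, never going below zero.
difference = Counter("abbbcc") - Counter("bcccd")

# Bag intersection: minimum of the two counts.
intersection = Counter("abbbcc") & Counter("bcccd")

print(as_sorted_list(union))
print(as_sorted_list(difference))
print(as_sorted_list(intersection))
```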

16 Bag theory
- σ_{C}(R): preserve the number of occurrences
- π_{A}(R), product, join: no duplicate elimination
- |R1 × R2| = |R1| * |R2|
Can convert to sets when necessary
Why not sets?
- Too expensive: eliminating duplicates after union, projection, updates…
- Sets can also lose information; consider: average(π_{balance}(Acc))
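The average-of-a-projection example shows why duplicate elimination can change an answer. A small sketch with hypothetical account rows: projecting onto the balance as a bag keeps both copies of 100, while projecting as a set merges them.

```python
# Hypothetical sample rows for Acc(name, ssn, balance).
acc = [("Ann", 1, 100), ("Bob", 2, 100), ("Cy", 3, 400)]

# Bag projection onto balance: duplicates are kept.
bag_projection = [balance for _, _, balance in acc]

# Set projection onto balance: the two 100s collapse into one.
set_projection = set(bag_projection)

bag_avg = sum(bag_projection) / len(bag_projection)
set_avg = sum(set_projection) / len(set_projection)

print(bag_avg, set_avg)
```

Only the bag version gives the true average balance per account.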

17 Some surprises in bag theory
Be careful about your set theory laws: not all hold in bag theory
  (R ∪ S) − T = (R − T) ∪ (S − T)
- Always true in set theory
- But true in bag theory?
- Suppose t is in R, S and T
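The hint on this slide can be worked through directly. The sketch below takes the simplest case, where t occurs exactly once in each of R, S and T, and evaluates both sides under bag semantics (`collections.Counter`) and under set semantics.

```python
from collections import Counter

# t occurs once in each of R, S and T.
R, S, T = Counter(["t"]), Counter(["t"]), Counter(["t"])

# Bag semantics: union adds counts, difference subtracts them.
bag_lhs = (R + S) - T          # two copies of t, minus one
bag_rhs = (R - T) + (S - T)    # nothing left on either side

# Set semantics on the same elements.
r, s, t = set(R), set(S), set(T)
set_lhs = (r | s) - t
set_rhs = (r - t) | (s - t)

print(bag_lhs, bag_rhs)
print(set_lhs, set_rhs)
```

Under bags the left side keeps one copy of t while the right side is empty, so the law fails; under sets both sides are empty and the law holds.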

18 Finally: RA has limitations
Cannot compute "transitive closure"
  Name1  Name2  Relationship
  Fred   Mary   Father
  Mary   Joe    Cousin
  Mary   Bill   Spouse
  Nancy  Lou    Sister
Find all direct and indirect relatives of Fred
Cannot express in RA!
RA is not Turing-complete
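What RA lacks here is a loop whose length depends on the data: each join follows one more "hop", but a fixed expression contains a fixed number of joins. A general-purpose language can iterate to a fixpoint instead. The sketch below uses the pairs from the slide's table and assumes, for illustration, that "relative" is symmetric; the function name `relatives` is hypothetical.

```python
# Name pairs from the slide's table (the Relationship column is ignored here).
edges = {("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")}

def relatives(person, edges):
    """Return everyone reachable from person, directly or indirectly."""
    # Assumption: being a relative is symmetric, so add the reversed pairs.
    symmetric = edges | {(b, a) for a, b in edges}
    found = set()
    frontier = {person}
    # Each pass corresponds to one more join; stop when nothing new appears.
    while frontier:
        reached = {b for a, b in symmetric if a in frontier}
        frontier = reached - found - {person}
        found |= frontier
    return found

print(relatives("Fred", edges))
```

The number of loop iterations is decided at run time by the longest chain in the data, which is exactly what a single RA expression cannot do.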

19 Extended Operators (5.4)
- Duplicate-eliminator: lower-case delta (δ)
  Converts a bag to a set
- Aggregation operators
  Compute functions of tuples: sum, average, etc.
- Grouping-and-aggregation op: lower-case gamma (γ)
  Partition tuples into groups, then compute a function per group
- Sorting: lower-case tau (τ)
- Extended projection
  Project onto new, computed columns
- Outerjoin
  Include dangling tuples by padding them with nulls

20 Duplicate elimination
R:
  A  B
  1  2
  3  4
  1  2
  1  2
δ(R):
  A  B
  1  2
  3  4

21 Aggregation operators
- Numerical: SUM, AVG, MIN, MAX
- Char: MIN, MAX
  In lexicographic/alphabetic order
- Any attribute: COUNT
  Number of values
R:
  A  B
  1  2
  3  4
  1  2
  1  2
SUM(B) = 10
AVG(A) = 1.5
MIN(A) = 1
MAX(B) = 4
COUNT(A) = 4
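Duplicate elimination and the aggregates can be computed directly on the four-row example bag R(A, B). A minimal sketch, with the bag held as a plain list of tuples:

```python
# The example bag R(A, B): the tuple (1, 2) occurs three times.
R = [(1, 2), (3, 4), (1, 2), (1, 2)]

# Duplicate elimination: convert the bag to a set (sorted for display).
delta_R = sorted(set(R))

a_values = [a for a, _ in R]
b_values = [b for _, b in R]

# Aggregates operate on the bag, so repeated values count every time.
sum_b = sum(b_values)
avg_a = sum(a_values) / len(a_values)
min_a = min(a_values)
max_a = max(a_values)
max_b = max(b_values)
count_a = len(a_values)

print(delta_R, sum_b, avg_a, min_a, max_a, max_b, count_a)
```

Note that COUNT counts occurrences, not distinct values, and that the largest value in column A is 3 while the largest in column B is 4.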

22 Grouping
Motivation: Movies(title, year, length, studio)
Q: How many minutes of film have been produced by each studio?
Strategy: divide movies into groups per studio, then add lengths
Our expression:
  γ_{studio, SUM(length)→total}(Movies)
The subscript: list of attributes and aggregations
- Movies is grouped by these attributes
- Result includes both the grouping attributes and the aggregates

23 Grouping example
  γ_{studio, SUM(length)→total}(Movies)
  Title      Year  Length  Studio
  Star Wars  1977  120     Fox
  Jedi       1980  105     Fox
  M.Ducks    1991  110     Disney
  Lion King  1995  110     Disney
  W.World    1992  95      Paramount

24 Grouping example
  γ_{studio, SUM(length)→total}(Movies)
  Title      Year  Length  Studio
  Star Wars  1977  120     Fox
  Jedi       1980  105     Fox
  M.Ducks    1991  110     Disney
  Lion King  1995  110     Disney
  W.World    1992  95      Paramount
Result:
  Studio     total
  Fox        225
  Disney     220
  Paramount  95
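The grouping step amounts to "partition by studio, then sum the lengths in each partition". A short sketch over the five movies from the slide, using a dictionary keyed by studio as the partition:

```python
from collections import defaultdict

# The Movies rows from the slide: (title, year, length, studio).
movies = [
    ("Star Wars", 1977, 120, "Fox"),
    ("Jedi", 1980, 105, "Fox"),
    ("M.Ducks", 1991, 110, "Disney"),
    ("Lion King", 1995, 110, "Disney"),
    ("W.World", 1992, 95, "Paramount"),
]

# Group by studio and accumulate SUM(length) per group.
total = defaultdict(int)
for _title, _year, length, studio in movies:
    total[studio] += length

# One output row per group: (studio, total).
result = sorted(total.items())
print(result)
```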

25 Extended projection
Form: π_{a→b, a+b→c, a||b→d}(R)
- a→b: rename attribute a as b
- a+b→c: create attribute c as the sum of a and b
- a||b→d: create attribute d as the concatenation of a and b
Example: π_{firstname||" "||lastname→fullname}(R)
- Replace the firstname and lastname fields with a fullname field

26 Grouping/extended projection example
StarsIn(SName, Title, Year)
Q: Find the year of each star’s first movie
  γ_{SName, MIN(year)→firstYear}(StarsIn)
How about Q: Find the span of each star’s career
An idea: get max and min and subtract
A: γ_{SName, MIN(year)→firstY, MAX(year)→lastY}(StarsIn)
Now project onto the difference:
  π_{SName, lastY−firstY+1→span}(γ_{SName, MIN(year)→firstY, MAX(year)→lastY}(StarsIn))
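The career-span query chains a grouping step (earliest and latest year per star) with an extended projection (latest minus earliest plus one). The sketch below follows those two steps; the StarsIn rows are hypothetical sample data, and only the schema comes from the slide.

```python
# Hypothetical sample rows for StarsIn(SName, Title, Year).
stars_in = [
    ("Ford", "Star Wars", 1977),
    ("Ford", "Jedi", 1980),
    ("Myers", "W.World", 1992),
]

# Grouping step: per star, track the earliest and latest year.
first_last = {}
for sname, _title, year in stars_in:
    first_y, last_y = first_last.get(sname, (year, year))
    first_last[sname] = (min(first_y, year), max(last_y, year))

# Extended projection step: compute span = lastY - firstY + 1 per star.
span = {sname: last_y - first_y + 1 for sname, (first_y, last_y) in first_last.items()}

print(first_last)
print(span)
```

The `+ 1` counts both endpoint years, so a star with a single movie has a span of one year rather than zero.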

