1
M.P. Johnson, DBMS, Stern/NYU, Sp2004
C20.0046: Database Management Systems
Lecture #9
Matthew P. Johnson
Stern School of Business, NYU
Spring 2004
2
Agenda
Last time: 4NF, RA
This time:
  1. More RA
  2. Bags
Project Part 2 due next time
3
Normalization Review
Q: What’s required for BCNF?
Q: What’s the loophole for 3NF?
Q: How do we fix a non-BCNF, non-3NF, or non-4NF relation?
Q: When are FDs also MVDs?
Q: When are MVDs also FDs?
4
Relational Algebra Review
Five basic operators:
  Union: ∪
  Difference: −
  Selection: σ
  Projection: π
  Cartesian product: ×
Intersection (∩) can be derived from difference: R ∩ S = R − (R − S)
Extended operators:
  Joins (natural, equijoin, theta join, semijoin)
  Renaming: ρ
  Extended projection
5
Renaming
Changes the schema, not the instance
Notation: ρ_{B1,…,Bn}(R)
ρ is spelled “rho”, pronounced “row”
Example: Employee(ssn, name)
  ρ_{…(social, name)}(Employee)
  Or just: ρ_{…}(Employee)
6
Complex RA Expressions
Q: How long was Star Wars (1977)?
Strategy: find the row with Star Wars; then project the length field
Title      Year  Length  inColor  Studio     Prdcr#
Star Wars  1977  124     True     Fox        12345
M.Ducks    1991  104     True     Disney     67890
W.World    1992  95      True     Paramount  99999
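The strategy above (select the row, then project one field) can be sketched in Python. This is only an illustration: the list-of-dicts encoding and the helper names select and project are assumptions, not part of the lecture.

```python
# Rows copied from the slide's table; the dict encoding is an assumption.
movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "inColor": True,
     "studio": "Fox", "producer": 12345},
    {"title": "M.Ducks", "year": 1991, "length": 104, "inColor": True,
     "studio": "Disney", "producer": 67890},
    {"title": "W.World", "year": 1992, "length": 95, "inColor": True,
     "studio": "Paramount", "producer": 99999},
]

def select(rel, pred):
    """sigma: keep the rows that satisfy pred."""
    return [row for row in rel if pred(row)]

def project(rel, attrs):
    """pi under bag semantics: keep only the listed attributes."""
    return [{a: row[a] for a in attrs} for row in rel]

# pi_length(sigma_{title="Star Wars" AND year=1977}(Movies))
star_wars = select(movies, lambda r: r["title"] == "Star Wars" and r["year"] == 1977)
result = project(star_wars, ["length"])
print(result)  # [{'length': 124}]
```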
7
Combining operations
Q: Which Fox movies are at least 100 minutes long?
Title          Year  Length  Filmtype  Studio
Star wars      1977  124     Color     Fox
Mighty ducks   1991  104     Color     Disney
Wayne’s world  1992  85      Color     Paramount
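One way to answer the question is a single selection with a conjunction; another is two selections applied in sequence. The Python sketch below checks that both give the same rows on the slide's table. The helper select and the predicate names are illustrative assumptions.

```python
# Rows copied from the slide's table; the dict encoding is an assumption.
movies = [
    {"title": "Star wars", "year": 1977, "length": 124, "filmtype": "Color", "studio": "Fox"},
    {"title": "Mighty ducks", "year": 1991, "length": 104, "filmtype": "Color", "studio": "Disney"},
    {"title": "Wayne's world", "year": 1992, "length": 85, "filmtype": "Color", "studio": "Paramount"},
]

def select(rel, pred):
    """sigma: keep the rows that satisfy pred."""
    return [row for row in rel if pred(row)]

def is_fox(row):
    return row["studio"] == "Fox"

def is_long(row):
    return row["length"] >= 100

# sigma_{studio="Fox" AND length>=100}(Movies)
combined = select(movies, lambda r: is_fox(r) and is_long(r))
# sigma_{studio="Fox"}(sigma_{length>=100}(Movies))
cascaded = select(select(movies, is_long), is_fox)

titles = [row["title"] for row in combined]
print(titles)  # ['Star wars']
```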
8
Complex RA Expressions
Reps(ssn, name, etc.)
Clients(ssn, name, rssn)
Q: Who are George’s clients?
  π_{Clients.name}(σ_{Reps.name=“George”}(σ_{Reps.ssn=rssn}(Reps × Clients)))
Or:
  π_{Clients.name}(σ_{Reps.name=“George” AND Reps.ssn=rssn}(Reps × Clients))
Or:
  π_{Clients.name}(σ_{Reps.name=“George”}(Reps × Clients) ∩ σ_{Reps.ssn=rssn}(Reps × Clients))
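The three formulations (nested selections, one selection with AND, and an intersection of two selections) should return the same names. A small Python check, using made-up sample rows and assumed helper names (product, select, names), might look like this:

```python
# Hypothetical sample data; only the schemas come from the slide.
reps = [{"ssn": 1, "name": "George"}, {"ssn": 2, "name": "Martha"}]
clients = [
    {"ssn": 10, "name": "Al", "rssn": 1},
    {"ssn": 11, "name": "Bea", "rssn": 2},
    {"ssn": 12, "name": "Cal", "rssn": 1},
]

def product(r, s, r_name, s_name):
    """Cartesian product; attributes are qualified as 'Relation.attr'."""
    return [
        {**{f"{r_name}.{k}": v for k, v in x.items()},
         **{f"{s_name}.{k}": v for k, v in y.items()}}
        for x in r for y in s
    ]

def select(rel, pred):
    """sigma: keep the rows that satisfy pred."""
    return [row for row in rel if pred(row)]

def names(rel):
    """pi_{Clients.name}"""
    return [row["Clients.name"] for row in rel]

def is_george(t):
    return t["Reps.name"] == "George"

def is_rep_of(t):
    return t["Reps.ssn"] == t["Clients.rssn"]

rc = product(reps, clients, "Reps", "Clients")

nested = names(select(select(rc, is_rep_of), is_george))
conjunction = names(select(rc, lambda t: is_george(t) and is_rep_of(t)))
george_rows = select(rc, is_george)
intersection = names([t for t in select(rc, is_rep_of) if t in george_rows])

print(nested)  # ['Al', 'Cal']
```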
9
Complex RA Expressions
People(ssn, name, street, city, state)
  assume for clarity that cities are unique
Q: Who lives on George’s street?
A: First, find George:
  σ_{name=“George”}(People)
Get George’s street/city:
  π_{street,city}(σ_{name=“George”}(People))
Cross with People:
  People × π_{street,city}(σ_{name=“George”}(People))
10
Complex RA Expressions
How to specify street = street?
Rename George’s street and city:
  People × ρ_{p2(s2,c2)}(π_{street,city}(σ_{name=“George”}(People)))
Now can select:
  σ_{street=s2 AND city=c2}(People × ρ_{p2(s2,c2)}(π_{street,city}(σ_{name=“George”}(People))))
Then project names…
Only way? No. Join!
  People ⋈ π_{street,city}(σ_{name=“George”}(People))
Q: Would the following work?
  π_{street,city}(σ_{name=“George”}(People ⋈ People))
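Both routes (rename, cross, select versus a natural join) can be compared in a short Python sketch. The sample rows and the natural_join helper are assumptions made for illustration; here the rename is applied to George's two-attribute street/city relation. The last lines also probe the slide's closing question: a natural join of People with itself matches every row only with itself, so it gives back People.

```python
# Hypothetical sample data; only the schema comes from the slide.
people = [
    {"ssn": 1, "name": "George", "street": "Elm St", "city": "Springfield", "state": "NY"},
    {"ssn": 2, "name": "Ann", "street": "Elm St", "city": "Springfield", "state": "NY"},
    {"ssn": 3, "name": "Bob", "street": "Oak Ave", "city": "Springfield", "state": "NY"},
]

def natural_join(r, s):
    """Pair rows that agree on every attribute name the two relations share."""
    out = []
    for x in r:
        for y in s:
            shared = x.keys() & y.keys()
            if all(x[a] == y[a] for a in shared):
                out.append({**x, **y})
    return out

# pi_{street,city}(sigma_{name="George"}(People))
george_addr = [{"street": p["street"], "city": p["city"]}
               for p in people if p["name"] == "George"]

# Route 1: rename to (s2, c2), take the cross product, select, project names.
renamed = [{"s2": g["street"], "c2": g["city"]} for g in george_addr]
via_product = [p["name"] for p in people for g in renamed
               if p["street"] == g["s2"] and p["city"] == g["c2"]]

# Route 2: natural join on the shared attributes street and city.
via_join = [row["name"] for row in natural_join(people, george_addr)]

# People joined with itself: every attribute is shared, so nothing new appears.
self_join = natural_join(people, people)

print(via_product, via_join)
```

On this data both routes list George himself as well as Ann, since George lives on his own street.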
11
Complex RA Expressions
Scenario:
  1. Purchase(pid, seller-ssn, buyer-ssn, etc.)
  2. Person(ssn, name, etc.)
  3. Product(pid, name, etc.)
Q: Who (give names) bought gizmos from Dick?
Where to start?
Purchase uses pid, ssn, so must get them…
12
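Complex RA Expressions
Expression tree, read from the leaves up:
  Dick’s ssn:   π_{ssn}(σ_{name=“Dick”}(Person))
  Gizmo’s pid:  π_{pid}(σ_{name=“Gizmo”}(Product))
  Join Purchase with Dick’s ssn on seller-ssn=ssn
  Join with Gizmo’s pid on pid=pid
  Join with Person on buyer-ssn=Person.ssn
  Root: π_{name}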
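The same plan can be traced step by step in Python. This is a sketch under assumptions: the sample rows are invented, hyphenated attribute names are written with underscores, and the two lookups are done as set-membership tests rather than full joins.

```python
# Hypothetical sample data; only the schemas come from the lecture.
person = [{"ssn": 1, "name": "Dick"}, {"ssn": 2, "name": "Jane"}, {"ssn": 3, "name": "Sam"}]
product = [{"pid": 10, "name": "Gizmo"}, {"pid": 11, "name": "Widget"}]
purchase = [
    {"pid": 10, "seller_ssn": 1, "buyer_ssn": 2},  # Jane buys a Gizmo from Dick
    {"pid": 11, "seller_ssn": 1, "buyer_ssn": 3},  # Sam buys a Widget from Dick
    {"pid": 10, "seller_ssn": 3, "buyer_ssn": 1},  # Dick buys a Gizmo from Sam
]

# Leaves: pi_ssn(sigma_{name="Dick"}(Person)) and pi_pid(sigma_{name="Gizmo"}(Product))
dick_ssns = {p["ssn"] for p in person if p["name"] == "Dick"}
gizmo_pids = {p["pid"] for p in product if p["name"] == "Gizmo"}

# Keep purchases whose seller is Dick and whose product is a Gizmo.
matching = [t for t in purchase
            if t["seller_ssn"] in dick_ssns and t["pid"] in gizmo_pids]

# Join with Person on buyer_ssn = Person.ssn, then project the name.
buyers = [p["name"] for t in matching for p in person if p["ssn"] == t["buyer_ssn"]]

print(buyers)  # ['Jane']
```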
13
Complex RA Expressions
Acc(name, ssn, balance)
Q: Who has the largest balance?
First, get two copies of the relation to play with:
  Acc × ρ_{a2}(Acc)
Now, consider this:
  σ_{a2.balance < Acc.balance}(Acc × ρ_{a2}(Acc))
Q: What does it give us?
Now, subtract the names:
  π_{name}(Acc) − π_{a2.name}(σ_{a2.balance < Acc.balance}(Acc × ρ_{a2}(Acc)))
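The selection returns every pair in which the a2 copy has a smaller balance than some other account, so its a2 names are exactly the accounts that are not the maximum. A Python sketch of the whole trick follows; the sample rows are invented, and it assumes names are unique so that subtracting names is safe.

```python
# Hypothetical sample data; only the schema Acc(name, ssn, balance) is from the slide.
acc = [
    {"name": "Ann", "ssn": 1, "balance": 500},
    {"name": "Bob", "ssn": 2, "balance": 900},
    {"name": "Cy", "ssn": 3, "balance": 200},
]

# Acc x rho_a2(Acc): all ordered pairs of accounts.
pairs = [(a, a2) for a in acc for a2 in acc]

# pi_{a2.name}(sigma_{a2.balance < Acc.balance}(...)): names beaten by someone.
not_max = {a2["name"] for a, a2 in pairs if a2["balance"] < a["balance"]}

# pi_name(Acc) minus the beaten names leaves the largest balance.
richest = {a["name"] for a in acc} - not_max

print(richest)  # {'Bob'}
```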
14
Confession
Relations aren’t really sets!
They’re bags!
15
Bag Theory (5.3)
Bags: like sets, but elements may repeat (“multisets”)
Set ops change somewhat when applied to bags
  intuition: pretend identical elements are distinct
  {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
  {a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
  {a,b,b,b,c,c} ∩ {b,c,c,c,d} = {b,c,c}
Reading assignment: 5.3 – 5.4
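Python's collections.Counter behaves like a bag, so the three examples above can be checked directly: + adds multiplicities (bag union), - subtracts them and drops anything that falls to zero or below (bag difference), and & takes the minimum (bag intersection). Using Counter here is an illustrative choice, not something the lecture prescribes.

```python
from collections import Counter

def as_list(bag):
    """Expand a Counter into a sorted list of its elements, with repeats."""
    return sorted(bag.elements())

# {a,b,b,c} U {a,b,b,b,e,f,f}: multiplicities add.
union = Counter("abbc") + Counter("abbbeff")

# {a,b,b,b,c,c} - {b,c,c,c,d}: multiplicities subtract, never below zero.
difference = Counter("abbbcc") - Counter("bcccd")

# {a,b,b,b,c,c} n {b,c,c,c,d}: take the smaller multiplicity.
intersection = Counter("abbbcc") & Counter("bcccd")

print(as_list(union))         # ['a', 'a', 'b', 'b', 'b', 'b', 'b', 'c', 'e', 'f', 'f']
print(as_list(difference))    # ['a', 'b', 'b']
print(as_list(intersection))  # ['b', 'c', 'c']
```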
16
Bag theory
σ_C(R): preserve the number of occurrences
π_A(R), product, join: no duplicate elimination
  |R1 × R2| = |R1| * |R2|
Can convert to sets when necessary
Why not sets?
  Too expensive: union, projection, updates…
  consider: average(π_{bal}(Acc))
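The average example shows that eliminating duplicates can also change an answer, not just cost time. A tiny Python illustration with invented balances:

```python
# Hypothetical balances: the projection of Acc onto its balance column, kept as a bag.
balances = [100, 100, 400]

# Bag semantics: both accounts holding 100 count toward the average.
bag_avg = sum(balances) / len(balances)

# Set semantics: the duplicate 100 is eliminated before averaging.
distinct = set(balances)
set_avg = sum(distinct) / len(distinct)

print(bag_avg, set_avg)  # 200.0 250.0
```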
17
Some surprises in bag theory
Be careful about your set theory laws: not all hold in bag theory
  (R ∪ S) − T = (R − T) ∪ (S − T)
  always true in set theory
  But true in bag theory?
  suppose t is in R, S and T
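Taking the hint literally, let R, S and T each contain t exactly once. With bags the left side keeps one copy of t (2 − 1) while the right side keeps none (0 + 0), so the law fails; with sets both sides are empty. The sketch below uses collections.Counter as an assumed stand-in for bags.

```python
from collections import Counter

# Bags: t occurs once in each of R, S and T.
R, S, T = Counter(["t"]), Counter(["t"]), Counter(["t"])
bag_lhs = (R + S) - T          # two copies of t, minus one
bag_rhs = (R - T) + (S - T)    # nothing plus nothing

# Sets: the same three relations without multiplicities.
r, s, t = {"t"}, {"t"}, {"t"}
set_lhs = (r | s) - t
set_rhs = (r - t) | (s - t)

print(bag_lhs == bag_rhs)  # False
print(set_lhs == set_rhs)  # True
```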
18
Finally: RA has limitations
Cannot compute “transitive closure”
Find all direct and indirect relatives of Fred
Cannot express in RA!
  RA is not Turing-complete
Name1  Name2  Relationship
Fred   Mary   Father
Mary   Joe    Cousin
Mary   Bill   Spouse
Nancy  Lou    Sister
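What a single RA expression cannot do is repeat a join an unknown number of times until nothing new appears. A general-purpose language can simply loop. The Python sketch below does that on the slide's table; it assumes, for illustration only, that relationships are followed from Name1 to Name2, and the function name relatives is made up.

```python
# (Name1, Name2) pairs from the slide's table; the Relationship column is ignored.
pairs = {("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")}

def relatives(start, pairs):
    """Everyone reachable from start by following Name1 -> Name2 links."""
    found = set()
    frontier = {start}
    while frontier:  # the number of rounds depends on the data
        step = {b for (a, b) in pairs if a in frontier} - found
        found |= step
        frontier = step
    return found

print(sorted(relatives("Fred", pairs)))  # ['Bill', 'Joe', 'Mary']
```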
19
Extended Operators (5.4)
Duplicate eliminator: δ (lower-case delta)
  Convert to set
Aggregation operators
  Compute functions of tuples: sum, average, etc.
Grouping-and-aggregation operator: γ (lower-case gamma)
  Partition tuples into groups, then compute function
Sorting: τ (lower-case tau)
Extended projection
  Project onto new, computed columns
Outerjoin
  Include dangling tuples by padding them with nulls
20
Duplicate elimination
R:
A  B
1  2
3  4
1  2
1  2
δ(R):
A  B
1  2
3  4
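In Python, one simple way to mimic duplicate elimination while keeping first-seen order is dict.fromkeys. The function name delta is an assumption chosen to echo the operator.

```python
# R from the slide, as a list (bag) of (A, B) tuples.
r = [(1, 2), (3, 4), (1, 2), (1, 2)]

def delta(rel):
    """Duplicate elimination: keep the first occurrence of each tuple."""
    return list(dict.fromkeys(rel))

print(delta(r))  # [(1, 2), (3, 4)]
```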
21
Aggregation operators
Numerical: SUM, AVG, MIN, MAX
Char: MIN, MAX
  In lexicographic/alphabetic order
Any attribute: COUNT
  Number of values
R:
A  B
1  2
3  4
1  2
1  2
SUM(B) = 10
AVG(A) = 1.5
MIN(A) = 1
MAX(B) = 4
COUNT(A) = 4
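The aggregates can be recomputed over the same four-tuple bag R with Python built-ins. Note that duplicates count: the three copies of (1, 2) all contribute to SUM, AVG and COUNT.

```python
# R from the slide, as a list (bag) of (A, B) tuples.
r = [(1, 2), (3, 4), (1, 2), (1, 2)]
a_vals = [a for a, _ in r]
b_vals = [b for _, b in r]

sum_b = sum(b_vals)                 # 2 + 4 + 2 + 2
avg_a = sum(a_vals) / len(a_vals)   # (1 + 3 + 1 + 1) / 4
min_a = min(a_vals)
max_a = max(a_vals)
max_b = max(b_vals)
count_a = len(a_vals)               # counts every occurrence

print(sum_b, avg_a, min_a, max_a, max_b, count_a)  # 10 1.5 1 3 4 4
```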
22
Grouping
Motivation: Movies(title, year, length, studio)
Q: How many minutes of film have been produced by each studio?
Strategy: Divide movies into groups per studio, then add lengths
Our expression:
  γ_{studio, SUM(length)→total}(Movies)
The subscript: list of grouping attributes and aggregations
  Movies is grouped by the attributes
  Result includes both
24
Grouping example
γ_{Studio, SUM(length)→total}(Movies)
Movies:
Title      Year  Length  Studio
Star Wars  1977  120     Fox
Jedi       1980  105     Fox
M.Ducks    1991  110     Disney
Lion King  1995  110     Disney
W.World    1992  95      Paramount
Result:
Studio     total
Fox        225
Disney     220
Paramount  95
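Grouping followed by SUM amounts to keeping a running total per group. A minimal Python sketch over the slide's rows (the tuple encoding and variable names are assumptions):

```python
# Rows from the slide: (title, year, length, studio).
movies = [
    ("Star Wars", 1977, 120, "Fox"),
    ("Jedi", 1980, 105, "Fox"),
    ("M.Ducks", 1991, 110, "Disney"),
    ("Lion King", 1995, 110, "Disney"),
    ("W.World", 1992, 95, "Paramount"),
]

# Group by studio and accumulate SUM(length) for each group.
totals = {}
for title, year, length, studio in movies:
    totals[studio] = totals.get(studio, 0) + length

print(totals)  # {'Fox': 225, 'Disney': 220, 'Paramount': 95}
```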
25
Extended projection
Form: π_{a→b, a+b→c, a||b→d}(R)
  a→b: rename attribute a as b
  a+b→c: create attribute c as the sum of a and b
  a||b→d: create attribute d as the concatenation of a and b
Example: π_{firstname||“ ”||lastname→fullname}(R)
  Replace firstname and lastname fields with a fullname field
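Extended projection can be pictured as a mapping from output attribute names to expressions over a row. The Python sketch below follows that picture; the helper extended_project and the sample rows are assumptions made for illustration.

```python
# Hypothetical rows for a relation R(firstname, lastname).
r = [
    {"firstname": "Ada", "lastname": "Lovelace"},
    {"firstname": "Alan", "lastname": "Turing"},
]

def extended_project(rel, exprs):
    """For each row, compute every output attribute from its expression."""
    return [{out: fn(row) for out, fn in exprs.items()} for row in rel]

# firstname || " " || lastname -> fullname
result = extended_project(
    r, {"fullname": lambda t: t["firstname"] + " " + t["lastname"]}
)

print(result)  # [{'fullname': 'Ada Lovelace'}, {'fullname': 'Alan Turing'}]
```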
26
Grouping/extended projection example
StarsIn(SName, Title, Year)
Q: Find the year of each star’s first movie
  γ_{SName, MIN(year)→firstYear}(StarsIn)
How about Q: Find the span of each star’s career
An idea: get max and min and subtract
A: γ_{SName, MIN(year)→firstY, MAX(year)→lastY}(StarsIn)
Now project onto diff:
  π_{SName, lastY−firstY+1→span}(γ_{SName, MIN(year)→firstY, MAX(year)→lastY}(StarsIn))
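The two steps (group to get the first and last year per star, then project the difference) can be sketched in Python. The rows are invented placeholders, and lastY is taken as the maximum year for each star.

```python
# Hypothetical rows for StarsIn(SName, Title, Year).
stars_in = [
    ("Star A", "Movie 1", 1977),
    ("Star A", "Movie 2", 1985),
    ("Star A", "Movie 3", 1980),
    ("Star B", "Movie 1", 1977),
]

# Grouping step: per star, track MIN(year) as firstY and MAX(year) as lastY.
first_last = {}
for sname, title, year in stars_in:
    first, last = first_last.get(sname, (year, year))
    first_last[sname] = (min(first, year), max(last, year))

# Extended projection step: lastY - firstY + 1 -> span.
span = {sname: last - first + 1 for sname, (first, last) in first_last.items()}

print(span)  # {'Star A': 9, 'Star B': 1}
```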