Functional dependencies CMSC 461 Michael Wilson
Designing tables Now we have all the tools to build our databases How should we actually go about doing this? First things first
Functional dependencies Going back to relational algebra and the relational model Identifying functional dependencies in relations helps us to identify keys in a relation
What is a functional dependency (FD)?
Take a relation R
A set of attributes X = A1, A2, …, An in R
A set of attributes Y = B1, B2, …, Bm in R
X functionally determines Y if there is only one Y value per X value
Every unique tuple in π A1,A2,…,An (R) maps to exactly one tuple in π B1,B2,…,Bm (R)
Can also say that if a tuple has the values (a1, a2, …, an), it will also have the values (b1, b2, …, bm)
Relation R then satisfies the functional dependency X → Y
What is a functional dependency (FD)? Tuples that have equal values for attributes in X will have equal values for attributes in Y Let’s look at some examples
Functional dependencies Tuples with the same values in (A1, A2, …, An) will always be paired with the same values in (B1, B2, …, Bm)
Functional dependency examples
title | year | length | genre | studio | star
Star Wars | | | SciFi | Fox | Carrie Fisher
Star Wars | | | SciFi | Fox | Mark Hamill
Star Wars | | | SciFi | Fox | Harrison Ford
Gone With the Wind | | | Drama | MGM | Vivien Leigh
Wayne’s World | | | Comedy | Paramount | Dana Carvey
Wayne’s World | | | Comedy | Paramount | Mike Myers
Functional dependency examples
title year → length genre studio
Does this work? Why does it work?
Yes, this functional dependency holds
Assumption: two movies with the same title will not come out in the same year
So each (title, year) pair identifies one movie, and that movie has exactly one length, genre, and studio name
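The check described above can be sketched in a few lines of Python. This is only an illustration: the helper name `holds` and the dict-per-row layout are assumptions, and the rows carry only the columns whose values survive in the example table (title, genre, studio, star). Because each title in this sample belongs to a single movie, title alone stands in for the (title, year) pair here.

```python
def holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this sample of rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    (Hypothetical helper, not part of the lecture.)
    """
    seen = {}  # maps each lhs value tuple to the rhs value tuple first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Same lhs values paired with different rhs values violates the FD
        if seen.setdefault(x, y) != y:
            return False
    return True


movies = [
    {"title": "Star Wars", "genre": "SciFi", "studio": "Fox", "star": "Carrie Fisher"},
    {"title": "Star Wars", "genre": "SciFi", "studio": "Fox", "star": "Mark Hamill"},
    {"title": "Star Wars", "genre": "SciFi", "studio": "Fox", "star": "Harrison Ford"},
    {"title": "Gone With the Wind", "genre": "Drama", "studio": "MGM", "star": "Vivien Leigh"},
    {"title": "Wayne's World", "genre": "Comedy", "studio": "Paramount", "star": "Dana Carvey"},
    {"title": "Wayne's World", "genre": "Comedy", "studio": "Paramount", "star": "Mike Myers"},
]

print(holds(movies, ["title"], ["genre", "studio"]))  # True
print(holds(movies, ["title"], ["star"]))             # False: Star Wars has three stars
```

Note that a sample of rows can only refute a dependency. A dependency that happens to hold in six rows may still be false for the relation in general, which is why the slide states its assumption explicitly.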
More examples
studio → title
studio genre → title
studio genre → title year
title → year
title year → studio
title year → genre
title year → star
Functional dependency reasoning Certain rules hold for functional dependencies that help us to reason about them You’ve seen these rules in math classes and logic courses
Transitivity
Given the functional dependencies
A → B
B → C
It can be shown that, because of these two dependencies, A → C
Splitting/combining
A functional dependency A1, A2, …, An → B1, B2, …, Bm
This can be split into
A1, A2, …, An → B1
A1, A2, …, An → B2
…
A1, A2, …, An → Bm
Similarly, we can combine the split dependencies back into the original dependency
Trivial dependencies
Trivial dependencies are functional dependencies where the right side is a subset of the left
Taking A1, A2, …, An → B1, B2, …, Bm
If B1, B2, …, Bm is some subset of A1, A2, …, An
These dependencies always hold
Therefore, they are called trivial
Trivial dependencies
Intermediate situation: A1, A2, …, An → B1, B2, …, Bm
Only some of the elements of B1, B2, …, Bm are in A1, A2, …, An
Can simply remove the elements of B1, B2, …, Bm that are in A1, A2, …, An to get a nontrivial dependency
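As a small illustration of the two cases, here is a sketch in Python. The function names `is_trivial` and `strip_trivial` are made up for this example, and attributes are modelled as plain strings in sets.

```python
def is_trivial(lhs, rhs):
    """A dependency is trivial when its right side is a subset of its left side."""
    return set(rhs) <= set(lhs)


def strip_trivial(lhs, rhs):
    """Drop right-side attributes that already appear on the left side."""
    lhs, rhs = set(lhs), set(rhs)
    return lhs, rhs - lhs


# title year -> year is trivial
print(is_trivial({"title", "year"}, {"year"}))  # True

# title year -> year length keeps only the nontrivial part, length
left, right = strip_trivial({"title", "year"}, {"year", "length"})
print(sorted(left), sorted(right))  # ['title', 'year'] ['length']
```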
Closures
Closures are a way of determining possible functional dependencies given a starting set of attributes and a few functional dependencies
You can use these to verify proposed functional dependencies
Notation for the closure of attributes A and B: {A, B}+
Closures
In other words: say I have a relation R with attributes A, B, C, D, E, F and the dependencies
AB → C
BC → AD
D → E
I want to know if AB → D will hold
In order to do this, I need to calculate {A, B}+
Calculating closures In order to calculate a closure, you need a set of attributes to start from and a set of functional dependencies to apply In calculating a closure, we’re essentially trying to see what attributes we can “jump” to given a starting set and some dependencies If we have no dependencies, there’s nowhere to go, and the closure is just the starting set
Calculating closures
Input: a set of attributes {A1, A2, …, An}, a set S of functional dependencies
For this example
Attributes: A, B
S:
AB → C
BC → AD
D → E
CF → B
Calculating closures: Step 1 Step 1: Split dependencies in S We want to reduce all of the dependencies in S such that they are as simple as possible Single attributes on the right side of the dependency Why? Makes things way easier, as we’ll see
Calculating closures: step 1
Before splitting: AB → C, BC → AD, D → E, CF → B
After splitting: AB → C, BC → A, BC → D, D → E, CF → B
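The splitting step can be sketched in Python as follows. The representation is an assumption made for brevity: each dependency is a (left, right) pair of strings, and every attribute is a single character, so iterating over "AD" yields A and D. The name `split_fds` is hypothetical.

```python
def split_fds(fds):
    """Rewrite each dependency so that its right side is a single attribute."""
    return [(lhs, b) for lhs, rhs in fds for b in rhs]


S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(split_fds(S))
# [('AB', 'C'), ('BC', 'A'), ('BC', 'D'), ('D', 'E'), ('CF', 'B')]
```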
Calculating closures: step 2 Step 2: Let X be your input attributes X = {A, B} X will eventually be your closure
Calculating closures: step 3 Step 3: Go through the functional dependencies in S. If the left side of a dependency is satisfied by some subset of the current X, and the right side is not currently in X, add the right side to X Repeat until there’s nothing left to add
Calculating closures: step 3
Iteration 1
X = {A, B}
Possible subsets: A, B, AB
S: AB → C, BC → A, BC → D, D → E, CF → B
AB is a subset of X and C is not in X, so add C

Iteration 2
X = {A, B, C}
Possible subsets: A, B, C, AB, AC, BC, ABC
S: AB → C, BC → A, BC → D, D → E, CF → B
BC is a subset of X; A is already in X, so BC → A adds nothing, but BC → D adds D

Iteration 3
X = {A, B, C, D}
Possible subsets: A, B, C, D, AB, AC, AD, BC, BD, CD, ABC, ABD, ACD, BCD, ABCD
S: D → E, CF → B
D is a subset of X and E is not in X, so add E

Iteration 4
X = {A, B, C, D, E}
S: CF → B
CF is not a subset of X (F is missing), so nothing more can be added
Done!
Calculating closures: step 4 Step 4: Enjoy your closure {A, B}+ = {A, B, C, D, E} F was not a part of the closure because at no point during the algorithm was there anything that “led” to F How could we modify S to include F?
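Steps 2 through 4 can be sketched as a short Python function. This is one possible rendering, not the only one: it assumes single-character attribute names and dependencies that have already been split (one attribute on the right side), and the name `closure` is chosen for this example. Unlike the slides, a single pass over S here may add several attributes at once, but the final set is the same.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds.

    attrs is an iterable of single-character attribute names;
    fds is a list of (left, right) pairs with one attribute on the right.
    """
    x = set(attrs)            # step 2: start from the input attributes
    changed = True
    while changed:            # step 3: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            # left side covered by X, right side not yet in X
            if set(lhs) <= x and rhs not in x:
                x.add(rhs)
                changed = True
    return x


S = [("AB", "C"), ("BC", "A"), ("BC", "D"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", S)))  # ['A', 'B', 'C', 'D', 'E']
```

Since D is in {A, B}+, the proposed dependency AB → D holds; F is not in the closure, so AB → F does not follow from S.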
Using closures to determine superkeys Calculate the closure of a set of attributes If the closure includes all attributes in the relation, then that set is a superkey of the relation It is not necessarily a candidate key, because it may not be minimal
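The superkey test can be sketched in Python using the running example (R with attributes A through F and the dependency set S from the closure slides). The names `closure` and `is_superkey` are illustrative, attributes are assumed to be single characters, and this version of `closure` accepts unsplit right sides so that the block stands on its own.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; right sides may hold several attributes."""
    x = set(attrs)
    while True:
        # every attribute reachable from X in one step that X lacks
        new = {b for lhs, rhs in fds if set(lhs) <= x for b in rhs} - x
        if not new:
            return x
        x |= new


def is_superkey(attrs, all_attrs, fds):
    """attrs is a superkey when its closure covers every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)


R = "ABCDEF"
S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

print(is_superkey("AB", R, S))    # False: F is never reached
print(is_superkey("ABF", R, S))   # True
print(is_superkey("ABCF", R, S))  # True, but not minimal: ABF already suffices
```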