1
Handout 4 Functional Dependencies
CIS 550 Handout 4 Functional Dependencies
2
Why we need relational design theory
We don’t need it to design databases: ER diagrams and related tools are much more understandable and effective.
The theory is useful
  as a check on our designs,
  to understand certain things that ER diagrams cannot do,
  to help us understand the consequences of redundancy (which we may use for efficiency).
3
Not all designs are equally good
Why is this design bad?
  Data(Id#, Name, Address, C#, Description, Grade)
And why is this one preferable?
  Student(Id#, Name, Address)
  Course(C#, Description)
  Enrolled(Id#, C#, Grade)
4
An example of “bad” design
Id#   Name    Address   C#      Description   Grade
124   Jones   Phila     Phil7   Plato         A
456   Smith   NYC       Math8   Topology      B
789   Brown   Boston    Eng12   Chaucer       C

Information is given redundantly, e.g., Name and Address.
Some information, e.g., course information, depends on the existence of some student.
5
Functional Dependencies
Recall that a key is a set of attribute names. If two tuples agree on a key they agree everywhere: they are the same.
In our “bad” design, if two tuples agree on Id#, they agree on Address, even though they are not the same.
We can say “Id# determines Address”, written Id# → Address.
This is a functional dependency.
6
Here are some functional dependencies that we expect to hold in our student-course database
  Id# → Name, Address
  C# → Description
  Id#, C# → Grade
Note that any relation (good or bad design) should be constrained by these dependencies.
A functional dependency X → Y is simply a pair of sets.
Notice the “sloppy” notation A,B → C,D or AB → CD rather than {A, B} → {C, D}.
7
The Meaning of fd’s
Defn. Given a relation scheme R (a set of attributes) and subsets X, Y of R, an instance r of R satisfies X → Y if, for any two tuples t1, t2 in r, t1[X] = t2[X] implies t1[Y] = t2[Y].
N.B. We cannot look at a relation instance to determine which fd’s hold; we can only tell when an instance fails to satisfy an fd.
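The definition translates almost word for word into a check over pairs of tuples. The following is a minimal sketch, assuming an instance is represented as a list of dictionaries; the function name `satisfies` and the sample rows are illustrative and not part of the handout.

```python
from itertools import combinations


def satisfies(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[b] != t2[b] for b in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True


# Hypothetical instance of the "bad" Data scheme (values are illustrative only).
data = [
    {"Id#": 124, "Name": "Jones", "Address": "Phila", "C#": "Phil7",
     "Description": "Plato", "Grade": "A"},
    {"Id#": 124, "Name": "Jones", "Address": "Phila", "C#": "Math8",
     "Description": "Topology", "Grade": "B"},
    {"Id#": 789, "Name": "Brown", "Address": "Boston", "C#": "Math8",
     "Description": "Topology", "Grade": "C"},
]

print(satisfies(data, ["Id#"], ["Address"]))  # True
print(satisfies(data, ["C#"], ["Grade"]))     # False
```

A result of True only says that this particular instance does not violate the fd; it does not show that the fd holds for the scheme.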
8
Basic Intuition in Relational Design
A database scheme is “good” if all fd’s are of the form K → R where K is a key for R.
Example: Our “bad” design is bad because, for example, Id# → Address holds but Id# is not a key for the relation scheme in which these attributes occur.
However, it isn’t as simple as this. A → A is a functional dependency for any attribute A. Are all attributes keys?
9
Armstrong’s Axioms
Some fd’s occur as consequences of others. These can be deduced by Armstrong’s axioms:
Reflexivity. If Y ⊆ X then X → Y. (These are called trivial dependencies.)
  Example: Name, Address → Address
Augmentation. If X → Y then XW → YW.
  Example: From C# → Description we deduce C#, Id# → Description, Id#
Transitivity. If X → Y and Y → Z then X → Z.
  Example: From Id#, C# → C# and C# → Description, we deduce Id#, C# → Description
10
Consequences of Armstrong’s Axioms
Union. If X → Y and X → Z then X → YZ.
Pseudotransitivity. If X → Y and WY → Z then XW → Z.
Decomposition. If X → Y and Z ⊆ Y then X → Z.
Prove these from Armstrong’s Axioms.
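As an illustration of the style of argument, here is one possible derivation of the union rule. It is a sketch of a single route through the axioms, not the only one; the other two rules can be handled similarly.

```latex
\begin{align*}
X \to Y &\;\Rightarrow\; X \to XY
  && \text{(augmentation with } X \text{, using } XX = X\text{)}\\
X \to Z &\;\Rightarrow\; XY \to YZ
  && \text{(augmentation with } Y\text{)}\\
X \to XY,\; XY \to YZ &\;\Rightarrow\; X \to YZ
  && \text{(transitivity)}
\end{align*}
```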
11
Closure of a set of fd’s
Defn. Let F be a set of fd’s. The closure of F, F+, is the set of fd’s
  {X → Y | X → Y can be deduced from F by Armstrong’s Axioms}
Which of the following are in the closure of our Student-Course fd’s?
  Address → Address
  C# → Description
  C# → Description, Name
  C#, Id# → Description, Name
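Enumerating F+ directly is impractical, so membership is usually tested through the attribute closure of the left-hand side: X → Y is in F+ exactly when Y lies inside the set of attributes determined by X. The sketch below uses that standard technique, which the handout does not spell out; the names `closure` and `in_closure` and the representation of fd’s as pairs of Python sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an fd when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def in_closure(fds, lhs, rhs):
    """True if lhs -> rhs can be deduced from fds, i.e. it is in F+."""
    return set(rhs) <= closure(lhs, fds)


# The Student-Course fd's from the handout.
F = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]

print(in_closure(F, {"Address"}, {"Address"}))                # True
print(in_closure(F, {"C#"}, {"Description"}))                 # True
print(in_closure(F, {"C#"}, {"Description", "Name"}))         # False
print(in_closure(F, {"C#", "Id#"}, {"Description", "Name"}))  # True
```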
12
Equivalence of fd sets
Defn. Two sets of fd’s, F and G, are equivalent if F+ = G+.
Example: {AB → C, A → B} and {A → C, A → B} are equivalent.
F+ contains a huge number of fd’s (exponential in the size of the scheme). One naturally looks for small equivalent fd sets.
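Equivalence can be tested without building either closure: F and G are equivalent when every fd of G follows from F and every fd of F follows from G. The sketch below assumes fd’s are pairs of Python sets and uses attribute closure as the implication test; `covers` and `equivalent` are illustrative names.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def covers(f, g):
    """True if every fd in g can be deduced from f."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f and g have the same closure."""
    return covers(f, g) and covers(g, f)


F = [({"A", "B"}, {"C"}), ({"A"}, {"B"})]
G = [({"A"}, {"C"}), ({"A"}, {"B"})]
H = [({"A"}, {"B"})]  # drops the fd that yields C

print(equivalent(F, G))  # True
print(equivalent(F, H))  # False
```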
13
Minimal Cover
Defn. A fd set F is minimal if
1. Every fd in F is of the form X → A, where A is a (single) attribute.
2. For no X → A ∈ F is F \ {X → A} equivalent to F.
3. For no X → A in F and Z ⊂ X is (F \ {X → A}) ∪ {Z → A} equivalent to F.
Example (from previous slide): {A → C, A → B} is a minimal cover for {AB → C, A → B}.
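The three conditions suggest a procedure: split right-hand sides, shrink left-hand sides, then drop fd’s that follow from the rest. The following is a sketch of that common procedure, not something the handout prescribes; the result can depend on the order in which attributes and fd’s are tried, and `minimal_cover` is an illustrative name.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover as a list of (frozenset lhs, single attribute)."""
    def as_sets(pairs):
        return [(lhs, {a}) for lhs, a in pairs]

    # Condition 1: one attribute on each right-hand side.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]

    # Condition 3: remove left-hand attributes that are not needed.
    for i in range(len(g)):
        lhs, a = g[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if a in closure(smaller, as_sets(g)):
                lhs = smaller
                g[i] = (lhs, a)

    # Condition 2: remove fd's that follow from the remaining ones.
    result = list(dict.fromkeys(g))  # drop duplicates, keep order
    for fd in list(result):
        rest = [h for h in result if h != fd]
        if fd[1] in closure(fd[0], as_sets(rest)):
            result = rest
    return result


F = [({"A", "B"}, {"C"}), ({"A"}, {"B"})]
for lhs, a in minimal_cover(F):
    print(sorted(lhs), "->", a)
```

On the handout’s example this yields A → C and A → B.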
14
More on closures
Fact. If F is a set of fd’s and X → Y ∉ F+ then there exists an attribute A ∈ Y s.t. X → A ∉ F+.
Proof. Assume otherwise. Let Y = {A1, ..., An}. Then X → A1, ..., X → An are in F+. Therefore, by union, X → A1 ... An is in F+, i.e., X → Y is in F+, a contradiction.
Notation: F(X) for {A | X → A ∈ F+}, the set of attributes determined by X.
15
Why Armstrong’s Axioms?
Why are Armstrong’s axioms (or an equivalent rule set) appropriate for fd’s? They are consistent and complete.
“Consistent” means that any relation that satisfies the fd’s in F will satisfy the fd’s in F+.
“Complete” means that if an fd X → Y cannot be derived from F by Armstrong’s axioms, then there is a relation instance satisfying F but not X → Y.
In other words, Armstrong’s axioms derive all the fd’s that should hold.
16
Proof of consistency
This comes directly from the definition. Consider augmentation, for example. This says that if X → Y then XW → YW.
If a relation instance r satisfies X → Y then, for any tuples t1, t2 ∈ r, if t1[X] = t2[X] then t1[Y] = t2[Y]. If, in addition, t1[W] = t2[W] then t1[YW] = t2[YW].
(Remember that we are using “sloppy” notation: YW for Y ∪ W.)
17
Proof of Completeness
To prove completeness we suppose X → Y ∉ F+ and construct a relation instance that satisfies F+ but not X → Y.
By our previous result, we know there is an attribute A ∈ Y such that X → A ∉ F+.
Our relation has 2 tuples. They agree on F(X) but disagree everywhere else.

       X            A       F(X) \ X        rest of R
  x1 x2 ... xn     a1,1    v1 v2 ... vm    w1,1 w2,1 ...
  x1 x2 ... xn     a1,2    v1 v2 ... vm    w1,2 w2,2 ...
18
Proof of Completeness cont’d
It is immediate that this relation fails to satisfy X → A and hence X → Y.
We also have to check that it satisfies every fd in F+.
The tuples agree only on F(X). Thus the only fd’s that might be violated are of the form X’ → Y’ where X’ ⊆ F(X).
But if X’ → Y’ ∈ F+ and X’ ⊆ F(X) then Y’ ⊆ F(X): by reflexivity and augmentation F(X) → Y’ ∪ F(X), and then X → Y’ by transitivity from X → F(X).
Therefore X’ → Y’ is satisfied.
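The two-tuple construction from the proof can be carried out mechanically. The sketch below builds the instance for the Student-Course fd’s with X = {C#}, computing F(X) by attribute closure; the 0/1 values stand in for the x, v, a and w entries of the proof, and the function names are illustrative.

```python
def closure(attrs, fds):
    """Return F(X): every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def witness(scheme, fds, x):
    """Two tuples that agree exactly on F(x) and differ everywhere else."""
    agree = closure(x, fds)
    t1 = {a: 0 for a in scheme}
    t2 = {a: 0 if a in agree else 1 for a in scheme}
    return t1, t2


def pair_satisfies(pair, lhs, rhs):
    """Check lhs -> rhs on a two-tuple instance."""
    t1, t2 = pair
    agree_lhs = all(t1[a] == t2[a] for a in lhs)
    agree_rhs = all(t1[b] == t2[b] for b in rhs)
    return (not agree_lhs) or agree_rhs


R = {"Id#", "Name", "Address", "C#", "Description", "Grade"}
F = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]

pair = witness(R, F, {"C#"})
print(all(pair_satisfies(pair, lhs, rhs) for lhs, rhs in F))    # True
print(pair_satisfies(pair, {"C#"}, {"Description", "Name"}))    # False
```

The instance satisfies every fd in F yet violates C# → Description, Name, which is therefore not in F+.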
19
Decomposition
Consider our attribute set
  Data(Id#, Name, Address, C#, Description, Grade)
We could decompose it into
  R1(Id#, Name, Address)
  R2(C#, Description, Grade)
But this decomposition loses information about the relationship between students and courses. Why?
20
Lossless Join Decomposition
R1, …, Rk is a lossless join decomposition of R with respect to a fd set F if, for every instance r of R that satisfies F,
  π_R1(r) ⋈ … ⋈ π_Rk(r) = r
Consider

  Id#   Name    Address   C#      Description   Grade
  124   Jones   Phila     Phil7   Plato         A
  789   Brown   Boston    Math8   Topology      C

What happens if we decompose on (Id#, Name, Address) and (C#, Description, Grade)?
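The question can be explored by projecting the two-row instance and joining the projections back together. This is a small sketch with hand-written `project` and `natural_join` helpers (illustrative names, relations as lists of dictionaries); it is not tied to any particular database system.

```python
def project(rows, attrs):
    """Projection onto attrs, with duplicate rows removed."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(r1, r2):
    """Combine every pair of rows that agrees on the shared attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            shared = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in shared):
                joined.append({**t1, **t2})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparisons."""
    return {frozenset(row.items()) for row in rows}


r = [
    {"Id#": 124, "Name": "Jones", "Address": "Phila",
     "C#": "Phil7", "Description": "Plato", "Grade": "A"},
    {"Id#": 789, "Name": "Brown", "Address": "Boston",
     "C#": "Math8", "Description": "Topology", "Grade": "C"},
]

R1 = ["Id#", "Name", "Address"]
R2 = ["C#", "Description", "Grade"]
R3 = ["Id#", "C#", "Description", "Grade"]

lossy = natural_join(project(r, R1), project(r, R2))
print(len(lossy))                # 4: two spurious tuples appear
print(as_set(lossy) == as_set(r))  # False

kept = natural_join(project(r, R1), project(r, R3))
print(as_set(kept) == as_set(r))   # True
```

With no shared attribute the join degenerates into a cross product, so every student is paired with every course and the original pairing is lost.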
21
Testing for lossless join
Fact. R1, R2 is a lossless join decomposition of R with respect to F iff at least one of the following dependencies is in F+:
  (R1 ∩ R2) → R1 \ R2
  (R1 ∩ R2) → R2 \ R1
Example: with respect to the fd set
  Id# → Name, Address
  C# → Description
  Id#, C# → Grade
is (Id#, Name, Address) and (Id#, C#, Description, Grade) a lossless decomposition?
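The test reduces to one attribute-closure computation on the shared attributes. A minimal sketch, assuming schemes and fd sides are Python sets; `is_lossless` is an illustrative name.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: the common attributes determine one side."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


F = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]

student = {"Id#", "Name", "Address"}
rest = {"Id#", "C#", "Description", "Grade"}
courses_only = {"C#", "Description", "Grade"}

print(is_lossless(student, rest, F))          # True
print(is_lossless(student, courses_only, F))  # False
```

The second call covers the earlier split with no shared attribute, which the test correctly rejects.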
22
Dependency preservation
Suppose we update a relation in a database. Can we easily check whether a fd X → Y is violated?
We can if X ∪ Y is contained within the attribute set of a single relation scheme.
The projection of an fd set F onto a set of attributes Z, F_Z, is
  {X → Y | X → Y ∈ F+ and X ∪ Y ⊆ Z}
A decomposition R1, …, Rk is dependency preserving if
  F+ = (F_R1 ∪ ... ∪ F_Rk)+
This means that the decomposition hasn’t “lost” any essential fd’s.
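For small schemes the definition can be checked by brute force: generate a cover of each projection by taking X → (F(X) ∩ Z) for every X ⊆ Z, then test whether the union still implies every fd of F. The sketch below takes that route; it is exponential in the size of each scheme and is meant only as an illustration, with `project_fds` and `preserves` as assumed names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project_fds(fds, z):
    """A generating set for the projection of fds onto the attribute set z."""
    z = set(z)
    projected = []
    for k in range(len(z) + 1):
        for xs in combinations(sorted(z), k):
            x = set(xs)
            rhs = (closure(x, fds) & z) - x
            if rhs:
                projected.append((x, rhs))
    return projected


def preserves(fds, schemes):
    """True if the projected fd's together imply every fd in fds."""
    union = [fd for s in schemes for fd in project_fds(fds, s)]
    return all(rhs <= closure(lhs, union) for lhs, rhs in fds)


F = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]

good = [{"Id#", "Name", "Address"}, {"C#", "Description"}, {"Id#", "C#", "Grade"}]
bad = [{"Id#", "Name", "Address"}, {"C#", "Description", "Grade"}]

print(preserves(F, good))  # True
print(preserves(F, bad))   # False: Id#, C# -> Grade is lost
```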
23
An example
A relation scheme: {Sname, Sadd, City, Zip, Item, Price}
A fd set:
  Sname → Sadd, City
  Sadd, City → Zip
  Sname, Item → Price
Consider the decomposition {Sname, Sadd, City, Zip} and {Sname, Item, Price}.
Is it lossless? Is it dependency preserving?
What if we replaced the first fd by Sname, Sadd → City?
24
Another example
The scheme: {Student, Teacher, Subject}
The fd set:
  Teacher → Subject
  Student, Subject → Teacher
The decomposition: {Student, Teacher} and {Teacher, Subject}
Is it lossless? Is it dependency preserving?
25
Fd’s and keys
Earlier we stated that the idea in relational database design (from fd’s) is to obtain a design such that for each nontrivial dependency X → Y, X is a superkey for some relation scheme in R.
The last example shows that this cannot always be achieved in a way that preserves dependencies.
This leads to two notions of normal forms.
26
Normal forms
Boyce-Codd Normal Form (BCNF). For every relation scheme R and for every X → A that holds over R, either
  A ∈ X (it is trivial), or
  X is a superkey for R.
Third Normal Form (3NF). For every relation scheme R and for every X → A that holds over R, either
  A ∈ X (it is trivial), or
  X is a superkey for R, or
  A is a member of some key for R.
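Both definitions can be checked directly on a single scheme. The sketch below assumes R is the full attribute set that F is stated over, in which case it suffices to examine the fd’s listed in F; candidate keys are found by brute force, so this is only suitable for small examples. All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(r, fds):
    """All minimal attribute sets whose closure is the whole scheme r."""
    keys = []
    for k in range(1, len(r) + 1):
        for xs in combinations(sorted(r), k):
            x = set(xs)
            # Smaller keys are found first, so supersets of a key are skipped.
            if closure(x, fds) >= r and not any(key <= x for key in keys):
                keys.append(x)
    return keys


def is_bcnf(r, fds):
    """Every fd is trivial or has a superkey on the left."""
    return all(rhs <= lhs or closure(lhs, fds) >= r for lhs, rhs in fds)


def is_3nf(r, fds):
    """As BCNF, but an attribute that belongs to some key is also allowed."""
    prime = set().union(*candidate_keys(r, fds))
    return all(
        a in lhs or closure(lhs, fds) >= r or a in prime
        for lhs, rhs in fds
        for a in rhs
    )


R = {"Student", "Teacher", "Subject"}
F = [({"Teacher"}, {"Subject"}), ({"Student", "Subject"}, {"Teacher"})]

print(is_bcnf(R, F))  # False: Teacher is not a superkey
print(is_3nf(R, F))   # True: Subject belongs to the key {Student, Subject}
```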
27
Normal Forms contd.
BCNF is clearly desirable, but the teacher/student/subject example shows that it is not always obtainable while preserving dependencies.
BCNF is stronger than 3NF.
There are algorithms to obtain
  a BCNF lossless join decomposition,
  a 3NF lossless join, dependency preserving decomposition.
The 3NF algorithm uses a minimal cover.
28
BCNF Decomposition Algorithm
RES := {R}    // R = set of all attributes
while there is a scheme S in RES that is not in BCNF do
begin
  let A → B be a nontrivial functional dependency that holds on S
      such that A → S is not in F+ and A and B are disjoint
  RES := (RES - {S}) ∪ {S - B} ∪ {AB}
end
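The loop can be sketched in Python as follows. The pseudocode leaves open how a violating A → B is found; this sketch searches all subsets A of S and takes B = (F(A) ∩ S) \ A, which is one workable choice for small schemes but exponential in general. The decomposition obtained can depend on the search order, and the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(s, fds):
    """Return (A, B) with A -> B nontrivial on s and A not a superkey of s."""
    for k in range(1, len(s)):
        for xs in combinations(sorted(s), k):
            a = set(xs)
            b = (closure(a, fds) & s) - a
            if b and not (a | b) >= s:
                return a, b
    return None


def bcnf_decompose(r, fds):
    """Split schemes on BCNF violations until none remain."""
    pending = [set(r)]
    done = []
    while pending:
        s = pending.pop()
        violation = find_violation(s, fds)
        if violation is None:
            done.append(s)
        else:
            a, b = violation
            pending += [s - b, a | b]  # RES := (RES - {S}) U {S - B} U {AB}
    return done


R = {"Student", "Teacher", "Subject"}
F = [({"Teacher"}, {"Subject"}), ({"Student", "Subject"}, {"Teacher"})]
for scheme in bcnf_decompose(R, F):
    print(sorted(scheme))
```

On the teacher example this produces {Student, Teacher} and {Teacher, Subject}, the decomposition discussed earlier.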
29
3NF Decomposition Algorithm
let F be a minimal cover
RES := {}
for each A → B in F do
  if none of the schemes in RES contains AB then
    RES := RES ∪ {AB}
if none of the schemes in RES contains a candidate key for R then
  RES := RES ∪ {any candidate key for R}
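A direct transcription of the pseudocode might look like the sketch below. It assumes the fd list passed in is already a minimal cover; "contains a candidate key" is tested as "is a superkey of R", and one candidate key is obtained by shrinking R attribute by attribute. The names `synthesize_3nf` and `candidate_key` are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the fd list fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(r, fds):
    """Shrink r to one minimal set that still determines all of r."""
    key = set(r)
    for a in sorted(r):
        if closure(key - {a}, fds) >= r:
            key.remove(a)
    return key


def synthesize_3nf(r, cover):
    """Build one scheme per fd of the cover, then add a key if none is present."""
    res = []
    for lhs, rhs in cover:
        scheme = set(lhs) | set(rhs)
        if not any(scheme <= s for s in res):
            res.append(scheme)
    # A scheme contains a candidate key exactly when it is a superkey of r.
    if not any(closure(s, cover) >= r for s in res):
        res.append(candidate_key(r, cover))
    return res


R = {"Sname", "Sadd", "City", "Zip", "Item", "Price"}
cover = [  # the supplier fd's, with the first one split into single attributes
    ({"Sname"}, {"Sadd"}),
    ({"Sname"}, {"City"}),
    ({"Sadd", "City"}, {"Zip"}),
    ({"Sname", "Item"}, {"Price"}),
]
for scheme in synthesize_3nf(R, cover):
    print(sorted(scheme))
```

Here {Sname, Item, Price} already contains a key, so no extra scheme is added; for R = {A, B, C} with the single fd A → B the key {A, C} would be appended.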