Download presentation
Presentation is loading. Please wait.
Published byJoella Fields Modified over 9 years ago
1
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization
2
2 Functional Dependencies A functional dependency (FD) takes the form of X Y, where X and Y are subsets of attributes in a relation What does X Y mean? Values of attributes X determines the values of attributes Y; Values of attributes Y depends on the values of attributes X; Suppose t 1 and t 2 are two tuples in the relation. If t 1 and t 2 have the same values for attribute set X, then their values for attribute set Y must be identical to each other in these two tuples
3
3 Functional Dependencies EMP_PRJ(Ssn, Pnumber, Hours, Ename, Pname, Plocation) {Ssn} {Ename} is a FD Ename depends on Ssn {Pnumber} {Pname, Plocation} is a FD Pname and Plocation depends on Pnumber Two rows with the same Pnumber must have the same values of Pname and Plocation {Plocation} {Pnumber} is not a FD {Ename, Plocation} {Pnumber} is not a FD
4
4 Functional Dependencies l Graphical Representation of FDs: l FD1: {SSN, Pnumber} {Hours} l FD2: {SSN} {Ename} l FD3: {PNumber} {PName, PLocation}
5
5 Functional Dependencies l A relation may contain many functional dependencies –How to derive all of them? l Given a set of functional dependencies of a relation R: = {AC B, A C, D A} –Does entail AD BC (i.e., is AD BC also a FD of R)?
6
6 Inference Rules (Example) Given AC B, A C, D A } Does entail AD BC? 1. D A (given in ) 2. AD A (augmenting (1) with A) 3. A C (given in ) 4. A AC (augmenting (3) with A) 5. AC B (given in ) 6. AC BC (augmenting (5) with C) 7. A BC (transitive between (4) and (6)) 8. AD BC (transitive between (2) and (7))
7
7 Normal Forms and Normalization l Functional dependencies can help us analyze whether a relational schema is “good” or “bad” l In relational model, we don’t say that a schema is good/bad. We say it is in 1NF, 2NF, 3NF, etc –Properties The higher the NF, the stricter the conditions placed on the schema A higher NF relation is also in lower NF but not vice-versa –A 3NF relation is in 2NF and 1NF (but not in 4NF, 5NF) l Normalization: –The process of decomposing "bad" (lower normal form) relations by breaking up their attributes into smaller relations
8
8 First Normal Form l A schema is in 1NF if it permits only atomic (indivisible) attribute values l 1NF disallows –composite attributes –multivalued attributes l The relational model itself prohibits relations that contain composite and multivalued attributes –Therefore, all the schemas in relational model are at least in 1NF
9
9 Example Relation is not in 1NF because it has a multivalued attribute (Dlocations)
10
10 Normalization into 1NF l 3 strategies for normalization: –Place the “offending” attributes in a separate relation DEPARTMENT(Dname, Dnumber, Dmgr_ssn) DEPTLOCATIONS(Dnumber, Dlocation) –Change Dlocations into Dlocation and modify the primary key DEPARTMENT(Dname, Dnumber, Dmgr_ssn, Dlocation) –If the maximum number of locations per department is 3: DEPARTMENT(Dname, Dnumber, Dmgr_ssn, Dloc1, Dloc2, Dloc3)
11
1 Is 1NF Sufficient? l Key of the relation is the combination of (Dnumber, Dlocation) l Relation is in 1NF, but there are redundancies: –Two rows with the same Dnumber must have the same Dname and Dmgr_ssn (even though their Dlocations are different)
12
12 2NF (Motivating Example) l Functional dependencies –{Dnumber, Dlocation} {Dname, Dmgr_ssn} (from primary key) –{Dnumber} {Dname, Dmgr_ssn} l Consequence: two tuples with same Dnumber but different Dlocation will have same Dname and Dmgr_ssn, which leads to redundancy! l If {Dnumber} {Dname, Dmgr_ssn} is not a FD, then there won’t be a redundancy problem
13
13 2NF (Motivating Example) l This example suggests that if X Y is a FD, where X is the key, you can’t have X’ Y also a FD of the same table (where X’ is a subset of X), otherwise, there’ll be redundancies in the table –We say that X Y must be a full FD {Dnumber, Dlocation} {Dname, Dmgr_ssn} (from primary key) {Dnumber} {Dname, Dmgr_ssn}
14
14 Full versus Partial Dependencies l X Y is a full FD if removal of any attribute from X means the FD does not hold any more l X Y is a partial FD if there is a FD X’ Y where X’ is a subset of X l Example: –{Dnumber, Dlocation} {Dname, Dmgr_ssn} is a partial FD because {Dnumber} {Dname, Dmgr_ssn} is also a FD of the schema
15
15 Prime versus NonPrime Attributes l Prime attribute: –an attribute that is a member of the candidate key K –Example (from previous slide): Dnumber, Dlocation l Nonprime attribute: –an attribute that is not a member of any candidate key. –Example (from previous slide): Dname, Dmgr_ssn
16
16 2NF Definition l A relation schema R is in second normal form (2NF) if every non- prime attribute A in R is fully functionally dependent on the key of R l Since {Dnumber, Dlocation} is the key –{Dnumber, Dlocation} {Dname, Dmgr_ssn} is FD of the schema –But {Dnumber} {Dname, Dmgr_ssn} is also a FD of the schema The non-prime attributes are not fully functionally dependent on the key So schema is not in 2NF
17
17 Example l FDs: –{SSN, Pnumber} {Hours, Ename, Pname, Plocation}, –{SSN} {Ename}, –{Pnumber} {Pname, Plocation}
18
18 Example –{SSN, PNUMBER} HOURS is a full FD since neither SSN HOURS nor PNUMBER HOURS hold –But {SSN, PNUMBER} ENAME is a partial dependency since SSN ENAME also holds
19
19 2NF –Is {SSN, PNUMBER} {Hours} a full FD? Yes –Is {SSN, PNUMBER} {Ename} a full FD? No –Is {SSN, PNUMBER} {Pname} a full FD? No –Is {SSN, PNUMBER} {Plocation} a full FD? No l Conclusion: The EMP_PROJ relation is not in 2NF l 2NF normalization: take the “offending” FDs and create separate relations
20
20 Normalizing into 2NF {SSN, Pnumber} {Hours}, {SSN} {Ename}, {Pnumber} {Pname, Plocation}
21
21 Is 2NF sufficient? l Key is SSN l FDs: –{SSN} {Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn} –{Dnumber} {Dname, Dmgr_ssn} l Is the table in 2NF? –Yes because every non-prime attribute is fully FD on the key
22
2 Is 2NF sufficient? l Are there still redundancies in the relation? Yes –Two tuples with the same Dnumber have the same Dname and Dmgr_ssn l What is the “offending” FD that causes redundancy?
23
23 Is 2NF sufficient? l Functional dependencies: –{SSN} {Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn} –{Dnumber} {Dname, Dmgr_ssn} l Since Dnumber is not a key, you can have two rows with the same Dnumber. Hence their Dname and Dmgr_ssn must be the same => redundancy!
24
24 3NF l A relation schema R is in third normal form (3NF) if –It is in 2NF and –There is no non-prime attribute in R that is transitively dependent on the primary key In X Y and Y Z are FDs, with X as the primary key, we consider Z to be transitively dependent on X only if Y is not a candidate key. If Y is a candidate key, then we do not consider this as a transitive dependency problem
25
25 Example of 3NF l FDs: –SSN Ename, Bdate, Address, Dnumber –SSN Dnumber –Dnumber Dname, Dmgr_ssn l Dname is transitively dependent on the primary key SSN because SSN Dnumber and Dnumber Dname are FDs of the relation –Therefore the relation is not in 3NF
26
26 Third Normal Form l Another way to check whether a relation is in 3NF (without checking for partial and transitive dependencies): –A relation schema R is in 3NF if whenever a nontrivial FD X A holds, either X is a superkey of R or A is a prime attribute of R
27
27 3NF l FDs: –SSN Ename, Bdate, Address –SSN Dnumber –Dnumber Dname, Dmgr_ssn But Dnumber is not superkey and Dname,Dmgr_ssn are not prime attributes l Therefore the relation is not in 3NF Transitive dependency
28
28 Normalizing into 3NF Take the “offending” FDs and create separate relations
29
29 Is 3NF enough to remove redundancy? l FDs: –{Student, Course} Instructor –Instructor Course l Relation is in 3NF (but there is still redundancy) Assume every instructor teaches only 1 course Key is (Student, Course) No transitive dependency because Course is not a prime attribute
30
30 BCNF (Boyce-Codd Normal Form) l A relation schema R is in BCNF if whenever an FD X A holds in R, then X must be a superkey of R l FDs: –{Student, Course} Instructor –Instructor Course l Relation is not in BCNF because Instructor is not a superkey
31
31 Achieving BCNF by Decomposition l STUD_COURSE –Key is {Student,Course} l COURSE_INSTRUCT –Key is {Instructor} –FD: Instructor Course l Loses the FD: {Student, Course} Instructor –But no redundancy STUD_COURSECOURSE_INSTRUCT
32
32 Decomposition 1 l Problem: decomposition does not result in lossless join (i.e., does not have nonadditive join property) –i.e., spurious tuples may be generated
33
3 Decomposition 2 l Dependency preserving? No –loses the FD: {Student, Course} Instructor l Lossless join? Yes
34
34 Decomposition 3 l Dependency preserving? No –loses the FD: {Student, Course} Instructor l Lossless join? No
35
35 Summary l 1 st normal form –no composite/multivalued attributes in relations l 2 nd, 3 rd, and Boyce-Code normal forms –Eliminate redundancies based on FDs l More normal forms (see textbook) –4 th : deal with multivalued dependencies –5 th : deal with join dependencies
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.