Published by Talon Straight. Modified over 9 years ago.
1
Normalization 2
CSE-503 :: Database Management System
Alak Roy, Assistant Professor, Dept. of CSE, NIT Agartala
National Institute of Technology Agartala, Aug-Dec, 2010
2
Outline
Functional dependencies
Normalization
3
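Normalization – use Functional Dependencies
Relation: Product Support Coverage
Columns: ProductNumber, Date, Time, EmployeeID, PhoneNumber, PayRate, Withholding
Sample values surviving from the slide, listed per column (row alignment was lost):
  ProductNumber: CR76, CR56, CR74, CR56
  Date: 05/13/02, 07/04/02
  Time: 13:30, 12:00, 10:30
  EmployeeID: SG5, SG37, SG5
  PhoneNumber: 3651101, 3651102
  PayRate: 12, 18
  Withholding: 1.80, 2.40
Functional dependencies:
  ProductNumber, Date -> Time, EmployeeID, PhoneNumber
  EmployeeID, Date, Time -> ProductNumber
  PhoneNumber, Date, Time -> EmployeeID, ProductNumber
  EmployeeID, Date -> PhoneNumber
  Date -> PayRate
  PayRate -> Withholding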
4
Illustrate FDs
[Diagram: the functional dependencies of Product Support Coverage drawn over its columns ProductNumber, Date, Time, EmployeeID, PhoneNumber, PayRate, Withholding]
  ProductNumber, Date -> Time, EmployeeID, PhoneNumber
  EmployeeID, Date, Time -> ProductNumber
  PhoneNumber, Date, Time -> EmployeeID, ProductNumber
  EmployeeID, Date -> PhoneNumber
  Date -> PayRate
  PayRate -> Withholding
5
Decompose
[Diagram: Product Support Coverage is decomposed across the stages 1NF, 2NF, 3NF, BCNF. Relation labels on the slide, in extraction order: Pay Coverage Wages Taxes Product Coverage Phone Assignment]
6
Functional Dependencies
A functional dependency is a constraint between two sets of attributes in a relational database.
If X and Y are two sets of attributes in the same relation T, then X → Y means that X functionally determines Y: the values of the attributes in X uniquely determine the values of the attributes in Y.
For any two tuples t1 and t2 in T, t1[X] = t2[X] implies t1[Y] = t2[Y].
In other words, if two tuples in T agree on their X column(s), they must also agree on their Y column(s).
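The tuple-pair condition can be checked directly on a table instance. The following is a minimal sketch, not part of the original slides: the function name `fd_holds` and the sample rows (Roll, Name, Town, Zip with made-up values) are assumptions for illustration only.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every two rows that agree on lhs also agree on rhs."""
    seen = {}  # maps each X-value to the first Y-value observed with it
    for row in rows:
        x_value = tuple(row[a] for a in lhs)
        y_value = tuple(row[a] for a in rhs)
        # setdefault stores y_value the first time and returns the stored one later
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True


# Hypothetical sample data, invented for this sketch.
students = [
    {"Roll": 1, "Name": "Asha", "Town": "T1", "Zip": "Z1"},
    {"Roll": 2, "Name": "Bimal", "Town": "T1", "Zip": "Z1"},
    {"Roll": 3, "Name": "Asha", "Town": "T2", "Zip": "Z2"},
]

print(fd_holds(students, ["Town"], ["Zip"]))   # Town -> Zip holds in this instance
print(fd_holds(students, ["Name"], ["Town"]))  # Name -> Town fails: "Asha" has two towns
```

Note that an instance can only refute an FD. An FD that happens to hold in one instance is still a design-level constraint that must be stated for the schema.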
7
FD and Keys
A key constraint is a special kind of functional dependency.
K is a superkey for relation schema R if and only if K → R.
K is a candidate key for R if and only if K → R and for no α ⊂ K does α → R hold.
The key is on the LHS and all attributes are on the RHS, e.g. ROLL → ROLL, Name, Address.
For a key, no two rows share the same values, so whenever two tuples agree on the LHS they necessarily agree on the RHS.
10
Welcome: 2nd day of Normalization
11
Armstrong's Axioms of FDs
1. Reflexivity: If X ⊇ Y then X → Y (trivial FD)
   Example: Name, Address → Name
2. Augmentation: If X → Y then XZ → YZ
   Example: If Town → Zip then Town, Name → Zip, Name
3. Transitivity: If X → Y and Y → Z then X → Z
Two other rules are also useful:
Union: If X → Y and X → Z, then X → YZ
Decomposition: If X → YZ, then X → Y and X → Z
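The union rule is not an extra axiom; it follows from augmentation and transitivity. The sketch below replays that derivation mechanically. The representation of an FD as a pair of frozensets and the helper names (`fd`, `augmentation`, `transitivity`) are assumptions made for this illustration.

```python
def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return frozenset(lhs), frozenset(rhs)


def augmentation(f, z):
    """From X -> Y derive XZ -> YZ."""
    x, y = f
    return x | frozenset(z), y | frozenset(z)


def transitivity(f, g):
    """From X -> Y and Y -> Z derive X -> Z."""
    (x, y), (y2, z) = f, g
    if y != y2:
        raise ValueError("RHS of the first FD must equal LHS of the second")
    return x, z


# Derive the union rule for X = A, Y = B, Z = C.
x_to_y = fd("A", "B")                 # given: A -> B
x_to_z = fd("A", "C")                 # given: A -> C
step1 = augmentation(x_to_y, "A")     # A -> AB   (augment A -> B with A)
step2 = augmentation(x_to_z, "B")     # AB -> BC  (augment A -> C with B)
union = transitivity(step1, step2)    # A -> BC
print(union == fd("A", "BC"))
```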
13
Soundness
Axioms are sound: if an expression f: X → Y can be derived from a set of FDs F using the axioms, then f holds in every relation that satisfies F. We say F entails f.
Completeness
Axioms are complete: if F entails f, then f can be derived from F using the axioms.
As a result, to determine whether F entails f, apply the axioms in all possible ways to generate F+ (the set of possible FDs is finite, so this can be done) and check whether f is in F+.
14
Functional Dependency Closure (F+)
Given: a set F of functional dependencies.
Relation: EmpProj(SSN, Pnumber, Hours, Ename, Pname, Plocation)
FDs F:
  SSN → Ename
  Pnumber → {Pname, Plocation}
  {SSN, Pnumber} → Hours
Closures:
  {SSN}+ = {SSN, Ename}
  {Pnumber}+ = {Pnumber, Pname, Plocation}
  {SSN, Pnumber}+ = {SSN, Pnumber, Ename, Pname, Plocation, Hours}
15
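Generating F+
F: AB → C, A → D, D → E
Derivation:
  A → D gives AB → BD (augmentation)
  AB → C and AB → BD give AB → BCD (union)
  D → E gives BCD → BCDE (augmentation)
  AB → BCD and BCD → BCDE give AB → BCDE (transitivity)
  AB → BCDE gives AB → CDE (decomposition)
Thus AB → BD, AB → BCD, AB → BCDE, and AB → CDE are all elements of F+.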
17
Attribute Closure
Calculating the attribute closure is a more efficient way of checking entailment.
The attribute closure of a set of attributes X with respect to a set of functional dependencies F (denoted X+_F) is the set of all attributes A such that X → A is entailed by F.
For two different FD sets F1 and F2, X+_F1 is not necessarily the same as X+_F2.
Checking entailment: given a set of FDs F, F entails X → Y if and only if Y ⊆ X+_F (by the union and decomposition rules).
18
Computation of Attribute Closure: Example
Given FDs:
  (a) AB → C
  (b) A → D
  (c) D → E
  (d) AC → B
Problem: compute the attribute closure of AB with respect to this set of FDs.
Solution:
  Initially closure = {A, B}
  Using (a), closure = {A, B, C}
  Using (b), closure = {A, B, C, D}
  Using (c), closure = {A, B, C, D, E}
19
Computation of Attribute Closure X+_F
closure := X;                       -- since X ⊆ X+_F
repeat
    old := closure;
    if there is an FD Z → V in F such that Z ⊆ closure and V ⊄ closure
        then closure := closure ∪ V
until old = closure
-- If T ⊆ closure then X → T is entailed by F
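The loop above translates almost line for line into Python. This is a minimal sketch under two assumptions that are not in the slides: attributes are single letters, and an FD is written as a pair of strings such as ("AB", "C"). The function name `attribute_closure` is likewise chosen for illustration.

```python
def attribute_closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    closure = set(attrs)              # closure := X
    changed = True
    while changed:                    # repeat ... until old = closure
        changed = False
        for lhs, rhs in fds:
            # apply Z -> V when Z is inside the closure and V adds something new
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


# FDs (a)-(d) from the worked example: AB -> C, A -> D, D -> E, AC -> B
F = [("AB", "C"), ("A", "D"), ("D", "E"), ("AC", "B")]
print(sorted(attribute_closure("AB", F)))  # ['A', 'B', 'C', 'D', 'E']
```

Each pass either adds at least one attribute or stops, so the loop runs at most once per attribute in the schema.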
20
Example - Computing Attribute Closure
F: AB → C, A → D, D → E, AC → B
X     X+_F
A     {A, D, E}
AB    {A, B, C, D, E} (hence AB is a key)
B     {B}
D     {D, E}
Is AB → E an FD? Yes
Is D → C an FD? No
Result: X+_F allows us to determine the FDs of the form X → Y entailed by F.
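The two questions above, and the observation that AB is a key, can be answered mechanically from attribute closures. The sketch below is an illustration only: the helper names `closure`, `entails`, and `candidate_keys` are assumptions, and the key search is a brute-force enumeration suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def entails(fds, lhs, rhs):
    """F entails X -> Y exactly when Y is contained in the closure of X."""
    return set(rhs) <= closure(lhs, fds)


def candidate_keys(attrs, fds):
    """Enumerate attribute subsets by size; keep minimal ones whose closure is everything."""
    all_attrs = set(attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if all_attrs <= closure(candidate, fds):
                keys.append(candidate)
    return keys


F = [("AB", "C"), ("A", "D"), ("D", "E"), ("AC", "B")]
print(entails(F, "AB", "E"))   # True
print(entails(F, "D", "C"))    # False
print([sorted(k) for k in candidate_keys("ABCDE", F)])
```

Under these FDs the search reports AC as a second candidate key alongside AB, because AC → B brings B into the closure of AC.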
22
NORMALIZATION
23
The goal is to remove redundancy based on dependencies.
24
Normal Forms
Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies).
The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF).
25
Levels of Normalization
1NF, 2NF, 3NF, BCNF
26
Normal Forms
Considerations:
Relational design by analysis
Normal forms are based on functional dependencies (FDs)
Intuitive, perhaps, but a strictly controlled procedure allows a programmatic process
Two additional properties should be considered:
  Lossless join (nonadditive join) property: required
  Dependency preservation property: use when possible
31
Lossless Joins and Dependency Preservation
Given a relation R and a set of FDs F that hold over R, decomposing R into R1 and R2 is lossless if the closure F+ contains either:
  R1 ∩ R2 → R1, or
  R1 ∩ R2 → R2
Let F1 be the FDs in F+ that involve only attributes of R1, and F2 the FDs in F+ that involve only attributes of R2. If the closure of F1 ∪ F2 is equivalent to the closure F+, then the decomposition preserves dependencies.
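The lossless-join condition reduces to one attribute closure: compute the closure of R1 ∩ R2 and see whether it covers R1 or R2. The sketch below assumes a binary decomposition and uses the project relation (PN, AN, PM, D, S) with the FDs {PN, AN} → S and PN → {PM, D} from this deck's 2NF example. The function names are chosen for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 must be in F+."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [(["PN", "AN"], ["S"]), (["PN"], ["PM", "D"])]

# Split on PN: the shared attribute PN determines all of (PN, PM, D).
print(is_lossless(["PN", "AN", "S"], ["PN", "PM", "D"], F))   # True

# Split on AN instead: AN alone determines neither side.
print(is_lossless(["PN", "AN", "S"], ["AN", "PM", "D"], F))   # False
```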
33
First Normal Form (1NF)
A relational schema R is in first normal form if the domains of all attributes of R are atomic (atomicity).
A domain is atomic if its elements are considered to be indivisible units.
Non-atomic values complicate storage and encourage redundant (repeated) storage of data.
Requirements: 1NF disallows multivalued attributes, composite attributes, and their combinations, by requiring only single atomic (indivisible) values in the domain of an attribute.
34
Business Rules Example
Staffing hours (S) are recorded per project activity (AN), i.e. per activity within a project.
Managers (PM) and their departments (D) are assigned to projects (PN).
A department is assigned to a project manager.
A project manager is assigned to projects.

1NF relation:
Project no (PN)  Activity no (AN)  Project Manager (PM)  dept (D)  Hour (S)
AD3111           20                20                    A00       0.8
AD3111           30                20                    A00       1.5
AD3111           40                20                    A00       1
AD3111           50                20                    A00       1.25
AD3111           60                20                    A00       0.75
AD3111           70                20                    A00       0.35
MA2100           20                10                    D11       0.5
MA2100           30                10                    D11       1
OP1000           30                10                    D11       0.25
IF1000           30                20                    A00       1
IF1000           50                20                    A00       0.5
IF1000           60                20                    A00       0.5
35
Prime Attribute: an attribute of relation schema R is called a prime attribute of R if it is a member of some candidate key of R.
Non-Prime Attribute: an attribute of relation schema R is called a nonprime attribute of R if it is not a member of any candidate key.
36
Second Normal Form (2NF)
2NF is based on the concept of full functional dependency.
A functional dependency X → Y is a full functional dependency if removing any attribute A from X means that the dependency no longer holds; i.e. for every attribute A ∈ X, (X − {A}) does not functionally determine Y.
Partial functional dependency: X → Y is partial if for some attribute A ∈ X, (X − {A}) → Y still holds. If A can be removed and the FD remains, X → Y is a partial functional dependency (a violation of 2NF).
37
Partial functional dependency (a violation of 2NF)
{SSN, PNUMBER} → HOURS is a full dependency.
{SSN, PNUMBER} → ENAME is partial because SSN → ENAME holds.
38
Second Normal Form (2NF)
Definition of 2NF: a relational schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R.
39
2NF
Staffing is on a per project activity basis (activities within projects): {PN, AN} → S
Managers and their departments are assigned to projects: PN → {PM, D}

1NF relation (before decomposition):
Project no (PN)  Activity no (AN)  Project Manager (PM)  dept (D)  Hour (S)
AD3111           20                20                    A00       0.8
AD3111           30                20                    A00       1.5
AD3111           40                20                    A00       1
AD3111           50                20                    A00       1.25
AD3111           60                20                    A00       0.75
AD3111           70                20                    A00       0.35
MA2100           20                10                    D11       0.5
MA2100           30                10                    D11       1
OP1000           30                10                    D11       0.25
IF1000           30                20                    A00       1
IF1000           50                20                    A00       0.5
IF1000           60                20                    A00       0.5

2NF relation 1:
Project no (PN)  Activity no (AN)  Hour (S)
AD3111           20                0.8
AD3111           30                1.5
AD3111           40                1
AD3111           50                1.25
AD3111           60                0.75
AD3111           70                0.35
MA2100           20                0.5
MA2100           30                1
OP1000           30                0.25
IF1000           30                1
IF1000           50                0.5
IF1000           60                0.5

2NF relation 2:
Project no (PN)  Project Manager (PM)  dept (D)
AD3111           20                    A00
MA2100           10                    D11
OP1000           10                    D11
IF1000           20                    A00
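Partial dependencies like PN → {PM, D} can be found by taking the closure of every proper subset of the key. The sketch below is an illustration under a simplifying assumption: the relation has a single candidate key, so every attribute outside that key is nonprime. The names `closure` and `partial_dependencies` are chosen for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attrs, fds):
    """List (key_subset, attribute) pairs where a proper subset of the key
    determines a non-key attribute. Assumes key is the only candidate key."""
    non_key = set(attrs) - set(key)
    found = []
    for size in range(1, len(key)):          # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_key):
                found.append((subset, attr))
    return found


F = [(["PN", "AN"], ["S"]), (["PN"], ["PM", "D"])]
R = ["PN", "AN", "PM", "D", "S"]

# In the 1NF relation, PN alone determines PM and D: a 2NF violation.
print(partial_dependencies(["PN", "AN"], R, F))

# After decomposition, neither relation has a partial dependency.
print(partial_dependencies(["PN", "AN"], ["PN", "AN", "S"], [(["PN", "AN"], ["S"])]))
print(partial_dependencies(["PN"], ["PN", "PM", "D"], [(["PN"], ["PM", "D"])]))
```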
40
Welcome: 3rd day of Normalization
41
Third Normal Form (3NF)
Definition of 3NF: a relational schema R is in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively dependent on the primary key.
3NF is based on the concept of transitive dependency: a nonprime attribute depending on the key through another nonprime attribute.
{X → Y, Y → Z} ⊨ X → Z
Such a transitive dependency is a 3NF violation.
Test: for every nontrivial FD, the LHS should be a superkey or the RHS should be a prime attribute.
42
3NF
A department is assigned to a project manager.
A project manager is assigned to projects.
PN → PM → D

2NF relation (before decomposition):
Project no (PN)  Project Manager (PM)  dept (D)
AD3111           20                    A00
MA2100           10                    D11
OP1000           10                    D11
IF1000           20                    A00

3NF relation 1:
Project no (PN)  Project Manager (PM)
AD3111           20
MA2100           10
OP1000           10
IF1000           20

3NF relation 2:
Project Manager (PM)  dept (D)
20                    A00
10                    D11
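The 3NF test (LHS is a superkey, or RHS is a prime attribute) can be run over the given FDs. The following is a sketch for small schemas, assuming the FDs passed in mention only attributes of the relation being tested. The helper names and the brute-force candidate-key search are illustrative choices, not the slides' method.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    all_attrs = set(attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue
            if all_attrs <= closure(candidate, fds):
                keys.append(candidate)
    return keys


def violations_3nf(attrs, fds):
    """Return (lhs, attribute) pairs where lhs is not a superkey and attribute is nonprime."""
    all_attrs = set(attrs)
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        lhs_is_superkey = all_attrs <= closure(lhs, fds)
        for attr in rhs:
            if attr in lhs or lhs_is_superkey or attr in prime:
                continue  # trivial, or allowed by the 3NF test
            bad.append((tuple(lhs), attr))
    return bad


F = [(["PN"], ["PM"]), (["PM"], ["D"])]
print(violations_3nf(["PN", "PM", "D"], F))                 # PM -> D violates 3NF
print(violations_3nf(["PN", "PM"], [(["PN"], ["PM"])]))     # []
print(violations_3nf(["PM", "D"], [(["PM"], ["D"])]))       # []
```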
43
Boyce-Codd Normal Form (BCNF)
BCNF is a simpler form of 3NF that is more restrictive.
Every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF.
Definition of BCNF: a relation schema R is in BCNF if whenever a nontrivial functional dependency X → A holds in R, then X is a superkey of R.
The LHS of every nontrivial FD should be a superkey.
Note: each attribute is identified by nothing but the key.
BCNF is sometimes too restrictive: a BCNF decomposition may not be dependency-preserving with regard to the closure F+.
[Diagram: 3NF and BCNF]
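The BCNF test needs only the superkey check. The sketch below uses a hypothetical textbook-style schema R(A, B, C) with AB → C and C → B, which is not taken from the slides: it is in 3NF because B is a prime attribute, yet C → B breaks BCNF. The function names are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey."""
    all_attrs = set(attrs)
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and not all_attrs <= closure(lhs, fds):
            bad.append((lhs, rhs))
    return bad


# Hypothetical schema for illustration: R(A, B, C) with AB -> C and C -> B.
F = [("AB", "C"), ("C", "B")]
print(bcnf_violations("ABC", F))           # [('C', 'B')]

# Splitting into (C, B) and (A, C) removes the violation.
print(bcnf_violations("CB", [("C", "B")])) # []
print(bcnf_violations("AC", []))           # []
```

In this sketch the split into (C, B) and (A, C) leaves no single relation that contains A, B, and C together, so AB → C can no longer be checked within one relation. That is the dependency-preservation cost mentioned above.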
44
[Diagram: KEY, A1, …]
45
New example: LOT
[Diagram: 1NF and 2NF]
46
New example: LOT
[Diagram: 3NF]
47
FD5
Let there exist a new functional dependency FD5 (AREA → COUNTRY_NAME).
FD5 violates BCNF in LOTS1A because AREA is not a superkey of LOTS1A.
FD5 satisfies 3NF in LOTS1A because COUNTRY_NAME is a prime attribute, but the definition of BCNF has no such exception.
So decompose LOTS1A into LOTS1AX and LOTS1AY.
48
3NF to BCNF
49
Other dependencies and normal forms (not commonly used)
Multivalued dependencies (4NF)
If X and Y are subsets of the attributes of relation schema R, the MVD X ↠ Y holds when the set of Y values associated with a given X value is independent of the values of the other attributes.
R is in 4NF if for every MVD X ↠ Y that holds over R, one of the following is true:
  Y ⊆ X or XY = R (trivial MVD), or
  X is a superkey
50
MVD
An employee can be assigned to any project and, within those projects, to any activities, but the assignments are consistent for that employee.
A project or activity can have any number of employees assigned to it.
EN ↠ P
EN ↠ A

Unnormalized relation:
EN   P                               A
130  Query Services, User Education  Debug, Supp
30   Query Services                  Debug, Test, Code

Decomposed relation (EN, P):
EN   P
130  Query Services
130  User Education
30   Query Services

Decomposed relation (EN, A):
EN   A
130  Debug
130  Supp
30   Debug
30   Test
30   Code
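Whether an MVD holds in a given instance can be checked group by group: for each X value, the (Y, rest) combinations must form a full cross product. The sketch below assumes the unnormalized table has been flattened to one (EN, P, A) row per combination. That flattening and the function name `mvd_holds` are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, x, y):
    """Check X ->> Y on an instance: within each X-group, the Y values and the
    values of the remaining attributes must combine in every possible way."""
    if not rows:
        return True
    rest = [a for a in rows[0] if a not in x and a not in y]
    groups = defaultdict(set)
    for row in rows:
        x_value = tuple(row[a] for a in x)
        groups[x_value].add((tuple(row[a] for a in y), tuple(row[a] for a in rest)))
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        rest_values = {p[1] for p in pairs}
        if pairs != set(product(y_values, rest_values)):
            return False
    return True


# The unnormalized table, flattened to one row per (EN, P, A) combination.
rows = [dict(zip(("EN", "P", "A"), t)) for t in [
    (130, "Query Services", "Debug"),
    (130, "Query Services", "Supp"),
    (130, "User Education", "Debug"),
    (130, "User Education", "Supp"),
    (30, "Query Services", "Debug"),
    (30, "Query Services", "Test"),
    (30, "Query Services", "Code"),
]]

print(mvd_holds(rows, ["EN"], ["P"]))                 # True
# Dropping one combination for employee 130 breaks the independence.
print(mvd_holds(rows[:2] + rows[3:], ["EN"], ["P"]))  # False
```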
51
Join Dependencies (5NF)
A further generalization of MVDs
All MVDs are JDs, but not all JDs are MVDs
R is in 5NF if for every JD ⋈ {R1, …, Rn} that holds over R, one of the following is true:
  Ri = R for some i, or
  the JD is implied by the set of those FDs over R in which the left side is a key for R
If a relation schema is in 3NF and each of its keys consists of a single attribute, it is also in 5NF
More normal forms (rare)
52
5NF
If an employee works for a project, the employee will be assigned to activities within that project.
Join dependency over {EN, P, A}: ⋈ {{EN, P}, {P, A}, {EN, A}}

Original relation:
EN   P               A
130  Query Services  Debug
130  User Education  Supp
140  Query Services  Supp
130  Query Services  Supp

Projection (EN, P):
EN   P
130  Query Services
130  User Education
140  Query Services

Projection (EN, A):
EN   A
130  Debug
130  Supp
140  Supp

Projection (P, A):
P               A
Query Services  Debug
User Education  Supp
Query Services  Supp
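A join dependency holds in an instance when joining the projections gives back exactly the original rows. The sketch below checks this for the four-row table on this slide. The helper names `project`, `natural_join`, and `satisfies_jd` are assumptions, and the check concerns this instance only, not the schema in general.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs, dropping duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Join two lists of dict rows on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


def satisfies_jd(rows, components):
    """True if joining the projections onto each component reproduces rows exactly."""
    result = project(rows, components[0])
    for attrs in components[1:]:
        result = natural_join(result, project(rows, attrs))

    def as_set(rs):
        return {frozenset(r.items()) for r in rs}

    return as_set(result) == as_set(rows)


rows = [dict(zip(("EN", "P", "A"), t)) for t in [
    (130, "Query Services", "Debug"),
    (130, "User Education", "Supp"),
    (140, "Query Services", "Supp"),
    (130, "Query Services", "Supp"),
]]

# All three projections together reconstruct the relation.
print(satisfies_jd(rows, [["EN", "P"], ["EN", "A"], ["P", "A"]]))  # True
# Two projections alone produce a spurious row (130, User Education, Debug).
print(satisfies_jd(rows, [["EN", "P"], ["EN", "A"]]))              # False
```

This is why the table needs a three-way decomposition: no pair of projections is lossless on its own, but the three together are.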