1
Normalization
2
Normalization
Database normalization is the process of removing redundant data from a database to improve storage efficiency, data integrity, and scalability. In the relational model, the degree of normalization a schema has reached is classified by normal forms (NF), and there are algorithms for converting a given schema from one normal form to another. Normalization generally involves splitting existing tables into several smaller ones, which must be joined again whenever a query needs the combined data.
3
Why Normalization? (Drawbacks of Redundant Information)
Wastage of storage
Causes problems with update anomalies: insertion anomalies, deletion anomalies, modification anomalies
4
Benefits of Normalization
Less storage space
Quicker updates
Less data inconsistency
Clearer data relationships
Easier to add data
Flexible structure
5
Example
6
New relation
7
EXAMPLE OF AN UPDATE ANOMALY
Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
Changing the name of project number P1 from “Billing” to “Customer Accounting” requires the update to be made in the row of every employee working on project P1. If any row is missed, the relation holds two different names for the same project.
EXAMPLE OF AN INSERT ANOMALY
Cannot insert a project unless an employee is assigned to it. Conversely, cannot insert an employee unless he/she is assigned to a project.
EXAMPLE OF A DELETE ANOMALY
When a project is deleted, the information about every employee who works only on that project is deleted with it. Alternatively, if an employee is the sole employee on a project, deleting that employee also deletes the corresponding project.
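The update and delete anomalies above can be reproduced with a few rows held in memory. This is only a sketch: the row values and the helper `project_names` are made up for illustration and are not part of the original slides.

```python
# Hypothetical rows of EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours).
emp_proj = [
    {"emp": "E1", "proj": "P1", "ename": "Ann", "pname": "Billing", "hours": 10},
    {"emp": "E2", "proj": "P1", "ename": "Bob", "pname": "Billing", "hours": 20},
    {"emp": "E2", "proj": "P2", "ename": "Bob", "pname": "Payroll", "hours": 5},
]


def project_names(rows, proj):
    """Return the set of distinct names stored for one project number."""
    return {r["pname"] for r in rows if r["proj"] == proj}


# Update anomaly: rename P1 in only the first matching row,
# as an application that forgets the other rows might do.
for row in emp_proj:
    if row["proj"] == "P1":
        row["pname"] = "Customer Accounting"
        break

inconsistent = project_names(emp_proj, "P1")  # two names for one project

# Delete anomaly: E2 is the sole employee on P2, so removing that
# row also removes every trace of project P2.
after_delete = [r for r in emp_proj if not (r["emp"] == "E2" and r["proj"] == "P2")]
remaining_projects = {r["proj"] for r in after_delete}
```

After the partial update, `inconsistent` contains both the old and the new project name, which is exactly the inconsistency that storing Pname once per project would prevent.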
8
Guidelines
Each tuple in a relation should represent one entity or relationship instance.
Design a schema that does not suffer from insertion, deletion and update anomalies. If any anomalies are present, note them so that applications can take them into account.
Relations should be designed so that their tuples have as few NULL values as possible.
Attributes that are frequently NULL could be placed in separate relations.
9
Types of Normalization
First Normal Form
Second Normal Form
Third Normal Form
Boyce Codd Normal Form
Fourth Normal Form (Multi Valued Dependencies)
Fifth Normal Form (Join Dependencies)
10
Functional Dependencies
An attribute Y is said to have a functional dependency on a set of attributes X (written X → Y) if and only if each X value is associated with precisely one Y value.
11
Functional Dependencies-Example
{Ssn,Pnumber} -> {Hours}
{Ssn} -> {Ename}
{Pnumber} -> {Pname, Plocation}
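The definition of a functional dependency can be tested against a concrete set of rows. The sketch below assumes rows are Python dictionaries; the function name `fd_holds` and the sample data are illustrative. Note that a sample of rows can only refute a dependency. Whether an FD holds for the schema is a design decision, not something data alone can prove.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each lhs value is paired with exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen and
        # returns the stored value on every later occurrence.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample rows using the attribute names from the slide.
rows = [
    {"Ssn": 1, "Pnumber": 10, "Hours": 8, "Ename": "Ann", "Pname": "X", "Plocation": "Oslo"},
    {"Ssn": 1, "Pnumber": 20, "Hours": 4, "Ename": "Ann", "Pname": "Y", "Plocation": "Rome"},
    {"Ssn": 2, "Pnumber": 10, "Hours": 6, "Ename": "Bob", "Pname": "X", "Plocation": "Oslo"},
]

print(fd_holds(rows, ["Ssn", "Pnumber"], ["Hours"]))  # True
print(fd_holds(rows, ["Ssn"], ["Hours"]))             # False
```

The second call fails because Ssn 1 appears with two different Hours values, so Ssn alone does not determine Hours.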
12
Types of Functional Dependencies
Trivial functional dependency
A trivial functional dependency is a functional dependency of an attribute on a superset of itself.
{Ssn,Pnumber} -> {Ssn} Trivial
{Ssn} -> {Ename} Non trivial
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X if it is functionally dependent on X, and not functionally dependent on any proper subset of X.
{Ssn,Pnumber} -> {Hours}
Transitive dependency
A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.
Join dependency
A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a subset of the attributes of T.
13
Inference Rules for FDs
Reflexive: If Y subset-of X, then X -> Y
Augmentation: If X -> Y, then XZ -> YZ
Transitive: If X -> Y and Y -> Z, then X -> Z
Decomposition: If X -> YZ, then X -> Y and X -> Z
Union: If X -> Y and X -> Z, then X -> YZ
Pseudo transitivity: If X -> Y and WY -> Z, then WX -> Z
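In practice the inference rules are usually applied through the attribute closure: the set of all attributes that a given set X determines. The following is a minimal sketch of the standard closure computation, assuming each FD is written as a pair of Python sets; the name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)  # reflexivity: attrs determine themselves
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, transitivity
            # lets us add the right side as well.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The FDs from the functional dependency example.
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

print(sorted(closure({"Ssn"}, fds)))  # ['Ename', 'Ssn']
```

X -> Y follows from a set of FDs exactly when Y is contained in the closure of X, so the same function can check whether a set of attributes is a key: its closure must contain every attribute of the relation.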
14
First Normal Form (1NF)
A relation is said to be in First Normal Form (1NF) if and only if each attribute of the relation is atomic. It does not allow composite or multivalued attributes.
First Normal Form (1NF) sets the very basic rules for an organized database:
Eliminate duplicate columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
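One common way to reach 1NF is to emit one row per value of a multivalued attribute. The sketch below assumes a hypothetical department table whose `Dlocations` attribute holds a list; the table, its values and the function `to_1nf` are illustrative and not taken from the slides.

```python
def to_1nf(rows, multi_attr):
    """Return rows with one atomic value of multi_attr per row."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)          # copy the other attributes
            new_row[multi_attr] = value  # replace the list by one value
            flat.append(new_row)
    return flat


# Hypothetical unnormalized rows: Dlocations is multivalued.
departments = [
    {"Dnumber": 5, "Dname": "Research", "Dlocations": ["Bellaire", "Houston"]},
    {"Dnumber": 4, "Dname": "Admin", "Dlocations": ["Stafford"]},
]

flat = to_1nf(departments, "Dlocations")
for row in flat:
    print(row)
```

After flattening, Dnumber alone no longer identifies a row; the key becomes the combination of Dnumber and Dlocations, which is why the later normal forms then examine dependencies on parts of the key.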
15
First Normal Form (1NF)-Example
Apply First Normal Form ->
16
First Normal Form (1NF)-Example
Decompose the relation into
{Ename, Ssn, Bdate, Address, Dnumber}
{Dnumber, Dname, Dmgr_Ssn}
17
First Normal Form (1NF)-Result
18
Second Normal Form (2NF)
A relation schema R is in second normal form (2NF) if it is in 1NF and every non-key attribute A in R is fully functionally dependent on the primary key.
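The 2NF condition can be checked mechanically: a non-key attribute that is determined by a proper subset of the primary key is a partial dependency. The sketch below assumes the relation has a single candidate key and reuses the Ssn/Pnumber FDs from the functional dependency example; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    non_key = set(attributes) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


attributes = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

found = partial_dependencies({"Ssn", "Pnumber"}, attributes, fds)
```

Here `found` reports that Ssn alone determines Ename and Pnumber alone determines Pname and Plocation, so the relation is not in 2NF. Only Hours depends on the full key.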
19
Second Normal Form (2NF)-Example
20
FD in 2NF
21
FD in 2NF
22
Results
23
Third Normal Form
A relation schema R is in third normal form (3NF) if it is in second normal form (2NF) and no non-key attribute is transitively dependent on the primary key.
(OR)
Meet all the requirements of 1NF.
Meet all the requirements of 2NF.
Remove columns that are not directly dependent on the primary key.
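A 3NF check can be sketched with the general form of the definition: a nontrivial FD X -> A violates 3NF when X is not a superkey and A is not part of any candidate key. The attribute names below come from the 1NF example; the FDs, the choice of Ssn as the only candidate key and the function names are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violates_3nf(attributes, fds, candidate_keys):
    """List (determinant, attribute) pairs that break the 3NF condition."""
    prime = set().union(*candidate_keys)  # attributes that belong to some key
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(attributes)
        for attr in sorted(rhs - lhs):  # skip the trivial part of the FD
            if not is_superkey and attr not in prime:
                bad.append((frozenset(lhs), attr))
    return bad


attributes = {"Ename", "Ssn", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_Ssn"}
fds = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),  # assumed FD
    ({"Dnumber"}, {"Dname", "Dmgr_Ssn"}),                 # assumed FD
]

bad = violates_3nf(attributes, fds, [{"Ssn"}])
```

Under these assumptions Dname and Dmgr_Ssn depend on Ssn only through Dnumber, so both are reported. Moving them into a separate relation keyed by Dnumber removes the transitive dependency.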
24
Third Normal Form-Example
FD ?
25
FD in 3NF
26
FD in 3NF
27
Results
28
Comparison of 1NF, 2NF & 3NF
1NF Remove repeating groups
2NF Remove partial key dependencies
3NF Remove transitive dependencies
29
BCNF (Boyce Codd Normal Form)
A relation schema R is in BCNF if for every nontrivial FD X -> Y in R, X is a superkey (that is, X contains a candidate key).
(OR)
A relation is in BCNF if and only if every determinant is a candidate key.
Each normal form is strictly stronger than the previous one:
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in BCNF
The goal is to have each relation in BCNF (or 3NF)
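The BCNF condition translates directly into a check on determinants: every nontrivial FD must have a left side whose closure covers the whole relation. The sketch below uses a hypothetical relation R(Student, Course, Instructor), a common textbook illustration of a relation that is in 3NF but not in BCNF; the relation, its FDs and the function names are assumptions and do not come from the slides.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= set(attributes)
    ]


attributes = {"Student", "Course", "Instructor"}
fds = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),  # each instructor teaches one course
]

violations = bcnf_violations(attributes, fds)
```

Only Instructor -> Course is reported. Instructor is not a superkey, yet Course belongs to the candidate key {Student, Course}, so the relation still satisfies 3NF while failing BCNF.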
30
3NF VS BCNF
Definition
3NF: A relation schema R is in 3NF if for every nontrivial FD X -> Y in R, X is a superkey or every attribute of Y is part of a candidate key
BCNF: A relation schema R is in Boyce-Codd Normal Form (BCNF) if for every nontrivial FD X -> Y in R, X is a superkey (contains a candidate key)
Redundancy
3NF: Has some redundancy
BCNF: Removes all redundancies caused by FDs
Performance
3NF: Lower than BCNF
BCNF: Better than 3NF
31
Example- 1NF,2NF,3NF & BCNF FD ?
32
Example- 1NF,2NF,3NF & BCNF FD ?
Patno -> PatName
Patno,appNo -> Time,doctor
33
Apply Normalization
Apply 1NF
Eliminate repeating groups
Apply 2NF
Eliminate partial key dependencies
R1(Patno,appNo,time,doctor)
R2(Patno,PatName)
Apply 3NF
Eliminate transitive dependencies
None, so the relations are the same as in 2NF
Apply BCNF
Every determinant is a candidate key
It is in BCNF.
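The decomposition into R1 and R2 can be checked on sample data: projecting the original rows onto the two schemas and joining them again on Patno should give back exactly the original rows. The rows and the helper functions below are hypothetical and only illustrate the idea.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result


def natural_join(left, right):
    """Join two row lists on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


def as_set(rows):
    """Make a row list comparable regardless of row or column order."""
    return {frozenset(r.items()) for r in rows}


# Hypothetical rows of the unnormalized patient relation.
rows = [
    {"Patno": 1, "PatName": "John", "appNo": 1, "time": "09:00", "doctor": "Dr. A"},
    {"Patno": 1, "PatName": "John", "appNo": 2, "time": "10:00", "doctor": "Dr. B"},
    {"Patno": 2, "PatName": "Kerr", "appNo": 1, "time": "09:00", "doctor": "Dr. A"},
]

r1 = project(rows, ["Patno", "appNo", "time", "doctor"])
r2 = project(rows, ["Patno", "PatName"])
lossless = as_set(natural_join(r1, r2)) == as_set(rows)
```

R2 stores each patient name once (two rows instead of three), and the join reproduces the original relation, which is what the FD Patno -> PatName guarantees for this split.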