Normalization.


Presentation on theme: "Normalization."— Presentation transcript:

1 Normalization

2 Normalization
Database normalization is the process of removing redundant data from a database to improve storage efficiency, data integrity, and scalability. In the relational model, the degree to which a relation is free of redundancy is classified by normal forms (NF), and there are algorithms for converting a given database from one normal form to another. Normalization generally involves splitting existing tables into several smaller ones, which must be re-joined whenever a query needs the combined data.

3 Why Normalization? (Drawbacks of Redundant Information)
Wastage of storage
Causes problems with update anomalies:
  Insertion anomalies
  Deletion anomalies
  Modification anomalies

4 Benefits of Normalization
Less storage space
Quicker updates
Less data inconsistency
Clearer data relationships
Easier to add data
Flexible structure

5 Example

6 New relation

7 EXAMPLE OF AN UPDATE ANOMALY
Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
Changing the name of project number P1 from “Billing” to “Customer Accounting” requires the update to be made in the row of every employee working on project P1. If any row is missed, the relation holds two names for the same project.
EXAMPLE OF AN INSERT ANOMALY
Cannot insert a project unless an employee is assigned to it. Conversely, cannot insert an employee unless he/she is assigned to a project.
EXAMPLE OF A DELETE ANOMALY
When a project is deleted, all the employees who work only on that project are deleted with it. Alternatively, if an employee is the sole employee on a project, deleting that employee also deletes the corresponding project.
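The anomalies above can be reproduced on a few rows. The sketch below is only an illustration: the employee names, the second project, and the hours are made-up sample data, and the helper names are hypothetical.

```python
# Hypothetical EMP_PROJ rows: (Emp#, Proj#, Ename, Pname, No_hours)
emp_proj = [
    ("E1", "P1", "Smith", "Billing", 10),
    ("E2", "P1", "Wong", "Billing", 20),
    ("E2", "P2", "Wong", "Payroll", 5),
]

def rename_project_first_match(rows, proj, new_name):
    """Careless update: renames the project only in the first matching row."""
    out, done = [], False
    for emp, p, ename, pname, hours in rows:
        if p == proj and not done:
            pname, done = new_name, True
        out.append((emp, p, ename, pname, hours))
    return out

def project_names(rows, proj):
    """All names stored for one project number."""
    return {pname for _, p, _, pname, _ in rows if p == proj}

# Update anomaly: one row was missed, so P1 now has two names.
updated = rename_project_first_match(emp_proj, "P1", "Customer Accounting")
inconsistent_names = project_names(updated, "P1")

# Delete anomaly: removing the sole employee on P2 also removes project P2.
after_delete = [r for r in emp_proj if not (r[0] == "E2" and r[1] == "P2")]
remaining_projects = {r[1] for r in after_delete}
```

Storing the project name once, in a separate PROJECT relation keyed by Proj#, would make the rename a single-row change.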

8 Guidelines
Each tuple in a relation should represent one entity or relationship instance.
Design a schema that does not suffer from insertion, deletion and update anomalies.
If any anomalies are present, note them so that applications can be made to take them into account.
Relations should be designed such that their tuples have as few NULL values as possible.
Attributes that are frequently NULL could be placed in separate relations.

9 Types of Normalization
First Normal Form
Second Normal Form
Third Normal Form
Boyce Codd Normal Form
Fourth Normal Form (Multi Valued Dependencies)
Fifth Normal Form (Join Dependencies)

10 Functional Dependencies
An attribute Y is said to have a functional dependency on a set of attributes X (written X -> Y) if and only if each X value is associated with precisely one Y value.

11 Functional Dependencies-Example
{Ssn, Pnumber} -> {Hours}
{Ssn} -> {Ename}
{Pnumber} -> {Pname, Plocation}
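A functional dependency can be tested against sample rows by checking that no X value is paired with two different Y values. The sketch below assumes rows stored as dictionaries; the row values and the name `fd_holds` are hypothetical. Sample data can only refute a dependency. It cannot prove that the dependency holds for every possible state of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each lhs value is associated with exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen and returns it afterwards
        if seen.setdefault(key, val) != val:
            return False
    return True

# Made-up sample rows using the attribute names from the slide
rows = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 10, "Ename": "Smith",
     "Pname": "ProductX", "Plocation": "Bellaire"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 8, "Ename": "Smith",
     "Pname": "ProductY", "Plocation": "Sugarland"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 20, "Ename": "Wong",
     "Pname": "ProductX", "Plocation": "Bellaire"},
]

hours_by_key = fd_holds(rows, ["Ssn", "Pnumber"], ["Hours"])
name_by_ssn = fd_holds(rows, ["Ssn"], ["Ename"])
project_by_number = fd_holds(rows, ["Pnumber"], ["Pname", "Plocation"])
hours_by_ssn = fd_holds(rows, ["Ssn"], ["Hours"])  # Ssn 111 has two Hours values
```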

12 Types of Functional Dependencies
Trivial functional dependency
A trivial functional dependency is a functional dependency of an attribute on a superset of itself.
{Ssn, Pnumber} -> {Ssn} Trivial
{Ssn, Pnumber} -> {Hours} Non-trivial
{Ssn} -> {Ename} Non-trivial
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X if it is functionally dependent on X, and not functionally dependent on any proper subset of X.
{Ssn, Pnumber} -> {Hours}
Transitive dependency
A transitive dependency is an indirect functional dependency, one in which X -> Z holds only by virtue of X -> Y and Y -> Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.
Join dependency
A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a subset of the attributes of T.

13 Inference Rules for FDs
(Reflexive) If Y subset-of X, then X -> Y
(Augmentation) If X -> Y, then XZ -> YZ
(Transitive) If X -> Y and Y -> Z, then X -> Z
Decomposition: If X -> YZ, then X -> Y and X -> Z
Union: If X -> Y and X -> Z, then X -> YZ
Pseudo transitivity: If X -> Y and WY -> Z, then WX -> Z
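In practice these rules are usually applied through the attribute closure X+, the set of all attributes that X determines. The following is a minimal sketch of the standard fixed-point closure computation, using the dependencies from slide 11. The representation of an FD as a pair of sets is an assumption made for this example.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if lhs is already determined, so is rhs (transitivity)
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

ssn_closure = closure({"Ssn"}, fds)
key_closure = closure({"Ssn", "Pnumber"}, fds)
```

X -> Y follows from the given dependencies exactly when Y is a subset of the closure of X, so `{Ssn, Pnumber}` determines every attribute here while `{Ssn}` alone determines only `Ename`.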

14 First Normal Form (1NF)
A relation is said to be in First Normal Form (1NF) if and only if each attribute of the relation is atomic. It does not allow composite or multivalued attributes.
First Normal Form (1NF) sets the very basic rules for an organized database:
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column (the primary key).
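One common way to reach 1NF is to replace a multivalued attribute with one row per value. The sketch below illustrates that idea on made-up department data. The attribute names, the values, and the helper `to_1nf` are assumptions for illustration, not taken from the slides.

```python
# Hypothetical relation with a multivalued attribute (a list of locations)
departments = [
    {"Dnumber": 5, "Dname": "Research",
     "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": ["Stafford"]},
]

def to_1nf(rows, multi_attr, new_attr):
    """Emit one row per value of multi_attr so every attribute is atomic."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            atomic = {k: v for k, v in row.items() if k != multi_attr}
            atomic[new_attr] = value
            flat.append(atomic)
    return flat

flat = to_1nf(departments, "Dlocations", "Dlocation")
```

After flattening, the primary key becomes the combination (Dnumber, Dlocation), and Dname is repeated for each location, which is the kind of redundancy the higher normal forms then remove.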

15 First Normal Form (1NF)-Example
Apply First Normal Form ->

16 First Normal Form (1NF)-Example
Decompose the relation into
{Ename, Ssn, Bdate, Address, Dnumber}
{Dnumber, Dname, Dmgr_Ssn}

17 First Normal Form (1NF)-Result

18 Second Normal Form (2NF)
A relation schema R is in second normal form (2NF) if it is in 1NF and every nonkey attribute A in R is fully functionally dependent on the primary key.
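The 2NF condition can be checked mechanically by looking for nonkey attributes that are determined by a proper subset of the primary key. A minimal sketch follows, assuming a single primary key and the dependencies from slide 11. The function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the nonkey attributes it determines."""
    key = set(key)
    nonkey = set(attributes) - key
    found = {}
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            hit = closure(subset, fds) & nonkey
            if hit:
                found[subset] = hit
    return found

attributes = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

violations = partial_dependencies({"Ssn", "Pnumber"}, attributes, fds)
```

An empty result means the relation is in 2NF with respect to that key. Here `Ename` depends on `Ssn` alone and `Pname`, `Plocation` depend on `Pnumber` alone, so the relation is not in 2NF.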

19 Second Normal Form (2NF)-Example

20 FD in 2NF

21 FD in 2NF

22 Results

23 Third Normal Form
A relation schema R is in third normal form (3NF) if it is in second normal form (2NF) and there are no transitive dependencies.
(OR)
Meet all the requirements of the 1NF.
Meet all the requirements of the 2NF.
Remove columns that are not dependent upon the primary key.

24 Third Normal Form-Example
FD ?

25 FD in 3NF

26 FD in 3NF

27 Results

28 Comparison of 1NF, 2NF & 3NF
1NF Remove repeating groups
2NF Remove partial key dependencies
3NF Remove transitive dependencies

29 BCNF (Boyce Codd Normal Form)
A relation schema R is in BCNF if for every nontrivial FD X -> Y in R, X is a superkey, that is, X contains a candidate key.
(OR)
A relation is in BCNF if and only if every determinant is a candidate key.
Each normal form is strictly stronger than the previous one:
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in BCNF.
The goal is to have each relation in BCNF (or 3NF).

30 3NF VS BCNF
3NF: A relation schema R is in 3NF if for every nontrivial FD X -> Y in R, either X is a superkey or every attribute of Y is a prime attribute (a member of some candidate key).
BCNF: A relation schema R is in Boyce-Codd Normal Form (BCNF) if for every nontrivial FD X -> Y in R, X is a superkey.
3NF: Has some redundancy.
BCNF: Removes all redundancies caused by FDs.
3NF: Lower performance than BCNF.
BCNF: Better performance than 3NF.
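The difference between the two definitions shows up on a relation that is in 3NF but not in BCNF. The sketch below uses a common textbook-style example that is not from the slides: R(Student, Course, Instructor) with {Student, Course} -> {Instructor} and {Instructor} -> {Course}. It finds candidate keys by brute force and tests only the listed FDs, which is reasonable for small schemas. All names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            s = set(combo)
            # smaller keys are found first, so supersets of them are skipped
            if closure(s, fds) == attributes and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def normal_form_violations(attributes, fds):
    """Return (BCNF violations, 3NF violations) among the listed FDs."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    bcnf, third = [], []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs) - set(lhs)  # ignore the trivial part
        if not rhs or closure(lhs, fds) == attributes:
            continue  # trivial FD, or lhs is a superkey
        bcnf.append((lhs, rhs))
        if not rhs <= prime:
            third.append((lhs, rhs))  # a nonprime attribute is determined
    return bcnf, third

attributes = {"Student", "Course", "Instructor"}
fds = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]
keys = candidate_keys(attributes, fds)
bcnf_violations, third_nf_violations = normal_form_violations(attributes, fds)
```

Every attribute here belongs to some candidate key, so {Instructor} -> {Course} is allowed by 3NF even though Instructor is not a superkey, which is exactly the case BCNF rejects.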

31 Example- 1NF,2NF,3NF & BCNF FD ?

32 Example- 1NF,2NF,3NF & BCNF FD ?
Patno -> PatName
Patno, appNo -> Time, doctor

33 Apply Normalization
Apply 1NF
Eliminate repeating groups.
Apply 2NF
Eliminate partial key dependencies.
R1(Patno, appNo, time, doctor)
R2(Patno, PatName)
Apply 3NF
Eliminate transitive dependencies.
None, so the result is the same as 2NF.
Apply BCNF
Every determinant is a candidate key.
It is in BCNF.
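The decomposition into R1 and R2 can be sanity-checked by projecting sample rows onto the two schemas and joining them back on Patno. The sketch below is only an illustration: the patient names, times, and doctor names are made-up data, and `project` and `natural_join` are hypothetical helpers.

```python
# Made-up rows of the unnormalized relation (Patno, PatName, appNo, time, doctor)
appointments = [
    {"Patno": 1, "PatName": "John", "appNo": 1, "time": "09:00", "doctor": "Adams"},
    {"Patno": 1, "PatName": "John", "appNo": 2, "time": "10:00", "doctor": "Baker"},
    {"Patno": 2, "PatName": "Kerr", "appNo": 1, "time": "09:30", "doctor": "Adams"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in seen:
            seen.append(t)
    return seen

def natural_join(left, right):
    """Join two non-empty relations on their shared attribute names."""
    common = set(left[0]) & set(right[0])
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]

r1 = project(appointments, ["Patno", "appNo", "time", "doctor"])
r2 = project(appointments, ["Patno", "PatName"])
rejoined = natural_join(r1, r2)

# The join is lossless on this data if it returns exactly the original rows.
lossless = (len(rejoined) == len(appointments)
            and all(row in rejoined for row in appointments))
```

Each patient name is now stored once in R2, and because Patno is the key of R2 the join reproduces the original rows without adding spurious ones.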

