A brief summary of database normalization
Database normalization
Database normalization was first proposed by Dr. Edgar F. Codd in 1970 as an integral part of the relational model. Normalization involves arranging attributes in relations based on the dependencies between attributes, and ensuring that those dependencies are properly enforced by database integrity constraints. It is accomplished by applying formal rules through a process of either synthesis or decomposition. Synthesis creates a normalized database design from a known set of dependencies. Decomposition takes an existing (insufficiently normalized) database design and improves it based on the known set of dependencies.
Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity. It reduces or eliminates redundant data, helps assure data integrity, and avoids data anomalies. Normalization is also the process of simplifying the design of a database so that it achieves the optimum structure.
The normal forms are progressive: to achieve Second Normal Form (2NF), a table must already be in First Normal Form (1NF), and to achieve Third Normal Form (3NF) it must already be in 2NF. Each form imposes stricter conditions than the one before it. Informally, a relational database relation is often described as "normalized" if it meets Third Normal Form. Most 3NF relations are free of insertion, update, and deletion anomalies.
Given a relation (table) design, how do we normalize it?
Step 1: Identify functional dependencies (FDs)
Identify the FDs, i.e. the dependence relationships between attributes: which attributes determine which other attributes.
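As a rough illustration of what identifying an FD means in practice, the sketch below checks whether a proposed dependency is consistent with some sample rows. The function name `fd_holds` and the sample student data are hypothetical and not part of the slides. Sample data can only disprove an FD; whether a dependency really holds is a design decision about the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # The first time a left-hand value appears, remember its right-hand value;
        # any later row with the same left-hand value must match it.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample rows for a student table.
students = [
    {"ssn": "111", "age": 20, "state": "NY"},
    {"ssn": "222", "age": 20, "state": "CA"},
    {"ssn": "333", "age": 21, "state": "NY"},
]

print(fd_holds(students, ["ssn"], ["age"]))    # True
print(fd_holds(students, ["age"], ["state"]))  # False
```

In this sample, ssn determines age, while age does not determine state, because two students of the same age live in different states.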
Dependency and partial dependency
What is dependency? Any two attributes in a table are related in one of two ways.
They are independent of each other, for example age and state in a student table.
One depends on the other, or in other words, one determines the other. For example, age depends on ssn: if you know a person's ssn, you know that person's age. The determining attribute, here ssn, is usually a key.
Partial dependency arises when the primary key consists of multiple fields: a field in the table is dependent on only part of the primary key.
Transitive dependency
A field is transitively dependent when it depends on another field within the table that is not the primary key field.
Step 2: Identify keys
If the relation has a designated primary key, the primary key determines all other attributes by definition. In addition, any attribute that determines other attributes acts as a key for that subset of attributes. If an attribute (or a set of attributes) determines all other attributes in the same relation, then it is a key of the relation, and a candidate key if none of its attributes can be dropped.
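One common way to test whether a set of attributes is a key is to compute its attribute closure under the known FDs. The following is a minimal sketch of that idea, not a method taken from the slides. The enrollment relation, its attribute names, and its FDs are assumptions made up for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """A candidate key determines all attributes and has no redundant member."""
    attrs = set(attrs)
    everything = set(all_attrs)
    if closure(attrs, fds) != everything:
        return False
    return all(closure(attrs - {a}, fds) != everything for a in attrs)


# Hypothetical relation: enrollment(student_id, course_id, student_name, grade)
all_attrs = {"student_id", "course_id", "student_name", "grade"}
fds = [
    (("student_id",), ("student_name",)),
    (("student_id", "course_id"), ("grade",)),
]

print(sorted(closure({"student_id"}, fds)))
# ['student_id', 'student_name']
print(is_candidate_key({"student_id", "course_id"}, all_attrs, fds))  # True
print(is_candidate_key({"student_id"}, all_attrs, fds))               # False
```

Here student_id alone determines only the student's name, so it is not a key, while the pair (student_id, course_id) determines every attribute and cannot be reduced further.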
Step 3: Verify against normal forms
First normal form (1NF)
Applies to every relation. A primary key field is identified. There are no multi-valued attributes and no composite attributes, i.e. each attribute is atomic and holds one value.
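To make the "one value for each attribute" rule concrete, here is a small sketch that turns a multi-valued attribute into atomic values by emitting one row per value. The helper `to_1nf` and the student rows with a list-valued `phone` field are hypothetical examples, not taken from the slides.

```python
def to_1nf(rows, multi_valued):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_valued]:
            # Copy the row, replacing the list with a single atomic value.
            flat.append({**row, multi_valued: value})
    return flat


# Hypothetical unnormalized rows: "phone" holds a list of values.
students = [
    {"ssn": "111", "name": "Ann", "phone": ["555-0100", "555-0101"]},
    {"ssn": "222", "name": "Bob", "phone": ["555-0102"]},
]

flat = to_1nf(students, "phone")
for row in flat:
    print(row)
print(len(flat))  # 3
```

After flattening, ssn alone no longer identifies a row, so the primary key becomes the pair (ssn, phone). The name is now repeated for each phone number, which is the kind of redundancy the later normal forms address.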
Normalization (continued): Second normal form (2NF)
A table is in 2NF when it is in 1NF and has no partial dependencies.
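The 2NF condition can be sketched as a check over a list of stated FDs: flag any FD whose determinant is a proper subset of a composite primary key. This is a simplified illustration that assumes a single candidate key and inspects only the FDs given, not those implied by them. The function name and the enrollment example are hypothetical.

```python
def partial_dependencies(primary_key, fds):
    """Return stated FDs where non-key attributes depend on only part of the key."""
    key = set(primary_key)
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        # "<" on sets means proper subset: part of the key, but not all of it.
        if set(lhs) < key and non_key:
            found.append((tuple(lhs), tuple(sorted(non_key))))
    return found


# Hypothetical relation: enrollment(student_id, course_id, student_name, grade)
primary_key = ("student_id", "course_id")
fds = [
    (("student_id",), ("student_name",)),
    (("student_id", "course_id"), ("grade",)),
]

print(partial_dependencies(primary_key, fds))
# [(('student_id',), ('student_name',))]
```

In this example student_name depends on student_id alone, so the relation is not in 2NF. A table whose primary key is a single field cannot have partial dependencies of this kind.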
Normalization (continued): Third normal form (3NF)
A table is in 3NF when it is in 2NF and has no transitive dependencies, i.e. the non-primary-key attributes are mutually independent. A transitive dependency means a field is dependent on another field within the table that is not the primary key field.
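Following the informal definition above, a transitive dependency can be sketched as a stated FD whose determinant consists only of non-key fields. The check below is a simplification under that reading. It assumes a single candidate key and looks only at the FDs listed. The employee relation and all names are hypothetical.

```python
def transitive_dependencies(primary_key, fds):
    """Return stated FDs where a non-key field determines another non-key field."""
    key = set(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs_set = set(lhs)
        dependents = set(rhs) - key - lhs_set
        # The determinant contains no primary key field at all.
        if lhs_set.isdisjoint(key) and dependents:
            found.append((tuple(lhs), tuple(sorted(dependents))))
    return found


# Hypothetical relation: employee(emp_id, dept_id, dept_name)
primary_key = ("emp_id",)
fds = [
    (("emp_id",), ("dept_id",)),
    (("dept_id",), ("dept_name",)),
]

print(transitive_dependencies(primary_key, fds))
# [(('dept_id',), ('dept_name',))]
```

Here dept_name depends on emp_id only through dept_id, so the relation is in 2NF but not in 3NF.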
Normalization by Decomposition
If a table meets all three normal forms, it is usually in good shape. Recall that, informally, a relational database relation is often described as "normalized" if it meets Third Normal Form, and most 3NF relations are free of insertion, update, and deletion anomalies. A table that does not meet them usually needs to be decomposed further to achieve the normal forms.
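A decomposition step can be sketched as projecting one table onto two smaller attribute sets and confirming that joining them back recovers the original rows. The helpers `project` and `natural_join`, and the employee data, are hypothetical and only illustrate the idea. The split shown assumes that dept_id determines dept_name.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows but keeping order."""
    seen = set()
    out = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left, right, on):
    """Join two lists of rows on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical table with a transitive dependency: dept_id -> dept_name.
employees = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": 20, "dept_name": "Support"},
]

emp = project(employees, ["emp_id", "dept_id"])
dept = project(employees, ["dept_id", "dept_name"])
rejoined = natural_join(emp, dept, "dept_id")

print(len(dept))              # 2
print(rejoined == employees)  # True
```

After the split, each department name is stored once, so renaming a department is a single update, and the join shows that no information was lost for this sample.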