Brian Thoms
Database normalization
The systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics. But what does that mean?
ERD
Macro view of an organization’s requirements and operations.
Can be transformed into a DB design using an iterative process.

DB Normalization
Micro view of entities (and their attributes).
Reduce unnecessary redundancy, since too much redundancy across tables will generate anomalies.
But the goal is not to eliminate all redundancy (some redundancy is desirable).
Trade-off: more normalization does not always mean better. Data becomes spread across a larger number of tables (i.e., more normalization means more joins).
Types: 1NF, 2NF, 3NF, BCNF, 4NF (higher forms, 5NF and 6NF, also exist but are mostly of theoretical interest).
Why do we normalize?
Data can suffer from logical inconsistencies caused by operations involving updates, insertions, and deletions (aka data anomalies).

Update Anomaly
The same information can be expressed on multiple records; therefore, updates to the table may result in logical inconsistencies.
(e.g.) Employee 519 changes his/her address, requiring changes across multiple records; if any record is missed, the table holds conflicting addresses.
Insertion Anomaly
Circumstances in which certain facts cannot be recorded at all.
(e.g.) A new faculty member arrives but has not yet been assigned any courses to teach, so the course fields must be left blank/null.

Deletion Anomaly
Circumstances in which deleting data that represents certain facts requires deleting data that represents completely different facts.
(e.g.) Deleting the only course taught by a faculty member could delete the entire record for that faculty member.
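The three anomalies can be reproduced on a small in-memory table. The sketch below is only an illustration: the faculty/course table, its column names, and every value in it are hypothetical and do not come from the slides.

```python
# Hypothetical denormalized table: one row per (faculty member, course) pairing.
rows = [
    {"faculty_id": 389, "name": "Dr. A", "office": "B-12", "course": "ENG-206"},
    {"faculty_id": 407, "name": "Dr. B", "office": "C-30", "course": "CMP-101"},
    {"faculty_id": 407, "name": "Dr. B", "office": "C-30", "course": "CMP-201"},
]

# Update anomaly: the office is changed in only one of the duplicated rows,
# so the table now gives two different answers for the same person.
rows[1]["office"] = "D-44"
offices_407 = {r["office"] for r in rows if r["faculty_id"] == 407}
update_anomaly = len(offices_407) > 1

# Insertion anomaly: a new faculty member with no course yet can only be
# stored by leaving the course column empty (None).
new_hire = {"faculty_id": 424, "name": "Dr. C", "office": "A-01", "course": None}
insertion_needs_null = new_hire["course"] is None

# Deletion anomaly: removing the only course row for faculty 389 also removes
# every other fact the table held about that faculty member.
after_delete = [r for r in rows if r["course"] != "ENG-206"]
faculty_389_lost = all(r["faculty_id"] != 389 for r in after_delete)

print(update_anomaly, insertion_needs_null, faculty_389_lost)
```

Splitting the data into a faculty table and a separate faculty-course table would store each office once, allow a faculty row without a course, and keep the faculty row when a course is dropped.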
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
(Theoretical) Fifth Normal Form (5NF)
(Theoretical) Sixth Normal Form (6NF)
1NF
Table faithfully represents a relation and has no "repeating groups" (i.e., no single column that holds several values, and no set of columns that store the same kind of data).
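As a minimal sketch of removing a repeating group, assume a hypothetical customer table that stores phone numbers in `phone1` and `phone2` columns (these names and values are invented for illustration). The repeating columns move to their own table with one row per value.

```python
# Hypothetical un-normalized rows: phone1/phone2 form a repeating group.
unnormalized = [
    {"customer_id": 1, "name": "Ada", "phone1": "555-0100", "phone2": "555-0101"},
    {"customer_id": 2, "name": "Ben", "phone1": "555-0199", "phone2": None},
]


def to_1nf(rows):
    """Split the repeating phone columns into a separate one-value-per-row table."""
    customers = [{"customer_id": r["customer_id"], "name": r["name"]} for r in rows]
    phones = [
        {"customer_id": r["customer_id"], "phone": r[col]}
        for r in rows
        for col in ("phone1", "phone2")
        if r[col] is not None  # empty slots simply produce no row
    ]
    return customers, phones


customers, phones = to_1nf(unnormalized)
print(customers)
print(phones)
```

A customer with a third phone number then needs only another row in the phone table, not a new column.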
2NF
A table is in 1NF.
No non-prime attribute in the table is functionally dependent on a part (proper subset) of a candidate key (i.e., no partial dependencies exist).
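A partial dependency can be spotted mechanically once the functional dependencies are written down. The sketch below assumes a hypothetical enrollment table with a single composite candidate key; the helper `partial_dependencies` and all attribute names are illustrative, and the check looks only at the dependencies listed explicitly, not at ones implied by them.

```python
def partial_dependencies(candidate_key, fds):
    """Return listed FDs whose left side is a proper subset of the key and
    whose right side includes at least one non-prime attribute."""
    key = frozenset(candidate_key)
    violations = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        # With a single candidate key, the prime attributes are the key's own.
        non_prime = frozenset(rhs) - key
        if lhs < key and non_prime:
            violations.append((sorted(lhs), sorted(non_prime)))
    return violations


# Hypothetical table: enrollment(student_id, course_id, grade, student_name)
key = ("student_id", "course_id")
fds = [
    (("student_id", "course_id"), ("grade",)),  # depends on the whole key
    (("student_id",), ("student_name",)),       # depends on part of the key
]

violations = partial_dependencies(key, fds)
print(violations)
```

Here `student_name` depends on `student_id` alone, so it would move to a separate student table keyed by `student_id`, leaving `grade` with the full key.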
3NF
A table is in 2NF.
Every non-prime attribute is non-transitively dependent on every key of the table.
(e.g.) Is DoB dependent on Winner? If so, DoB depends on the key only transitively (via Winner), which violates 3NF.
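The DoB/Winner question can be made concrete with a small decomposition. The sketch below assumes a tournament-winners table keyed by (tournament, year); the sample rows are illustrative only and are not taken from the slides.

```python
# Illustrative rows: (tournament, year, winner, winner_dob).
# The key is (tournament, year); winner_dob depends on winner, not on the key.
tournament_winners = [
    ("Indiana Invitational", 1998, "Al Fredrickson", "1975-07-21"),
    ("Cleveland Open", 1999, "Bob Albertson", "1968-09-28"),
    ("Des Moines Masters", 1999, "Al Fredrickson", "1975-07-21"),
    ("Indiana Invitational", 1999, "Chip Masterson", "1977-03-14"),
]

# Table 1: facts about the key only.
winners = [(t, y, w) for t, y, w, _ in tournament_winners]

# Table 2: facts about the winner, stored once per person.
players = {}
for _, _, w, dob in tournament_winners:
    # setdefault keeps the first DoB seen; a mismatch would mean the source
    # rows already contradict each other (an update anomaly).
    if players.setdefault(w, dob) != dob:
        raise ValueError(f"conflicting DoB for {w}")

# Joining the two tables on winner reproduces the original rows.
rejoined = [(t, y, w, players[w]) for t, y, w in winners]
print(rejoined == tournament_winners)
```

After the split, each date of birth is stored once, so correcting it touches a single row.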
(Informally) A relational database table is described as "normalized" if it is in 3NF. Most 3NF tables are free of insertion, update, and deletion anomalies.
BCNF
For every non-trivial functional dependency, the determinant (left-hand side) is a superkey, i.e., it is a candidate key or contains one.
A 3NF table can violate BCNF only if it has more than one candidate key and those keys overlap.
Only in rare cases does a 3NF table not meet the requirements of BCNF.
(e.g.) Saver should always be on Court 1.
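The BCNF condition can be tested by computing attribute closures: a determinant is a superkey exactly when its closure covers every attribute of the table. The sketch below assumes a court-booking table in which the rate type determines the court (suggested by the "Saver ... Court 1" remark); the attribute names, the listed dependencies, and the helpers `closure` and `bcnf_violations` are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Listed non-trivial FDs whose left side is not a superkey."""
    all_attrs = set(all_attrs)
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and not all_attrs <= closure(lhs, fds):
            bad.append((tuple(lhs), tuple(rhs)))
    return bad


# Assumed table: booking(court, start, end, rate_type)
attrs = ("court", "start", "end", "rate_type")
fds = [
    (("court", "start"), ("end", "rate_type")),  # a candidate key
    (("court", "end"), ("start", "rate_type")),  # another candidate key
    (("rate_type",), ("court",)),                # e.g. SAVER implies Court 1
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

Only `rate_type -> court` is reported, because `rate_type` alone does not determine the booking times. A common fix is to move that dependency into its own rate-type table.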
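4NF
Table is in 3NF (more strictly, BCNF).
No non-trivial multivalued dependencies exist, other than those whose determinant is a superkey.
Multivalued dependencies occur when a determinant determines a set of values rather than a single value (e.g., an employee's degrees: BS, MS, Ph.D.).
Solution: move each multivalued fact into its own table.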
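The problem shows up when two independent multivalued facts share one table. The sketch below keeps the slide's employee-degree example and adds a second multivalued attribute, `skill`, which is an assumption introduced only for illustration, as are all the sample values.

```python
# Hypothetical table: (employee, degree, skill). Degrees and skills are
# independent of each other, so every combination has to be stored.
employee_facts = [
    ("E1", "BS", "Python"),
    ("E1", "BS", "SQL"),
    ("E1", "MS", "Python"),
    ("E1", "MS", "SQL"),
    ("E2", "PhD", "SQL"),
]

# Decompose into one table per multivalued fact.
employee_degree = sorted({(e, d) for e, d, _ in employee_facts})
employee_skill = sorted({(e, s) for e, _, s in employee_facts})

# Joining the two tables on employee reproduces the original rows, so the
# decomposition loses no information.
rejoined = sorted(
    (e, d, s)
    for e, d in employee_degree
    for e2, s in employee_skill
    if e == e2
)
print(rejoined == sorted(employee_facts))
```

With separate tables, adding a third skill for E1 means one new row instead of one row per degree.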