Gergely Lukács Pázmány Péter Catholic University

Presentation transcript:

Database Systems Relational Database Design Functional Dependencies, Normalisation Theory Gergely Lukács Pázmány Péter Catholic University Faculty of Information Technology Budapest, Hungary lukacs@itk.ppke.hu

Overview: Redundancy; insert, update, delete anomalies; goals of normalisation; keys (superkey, candidate key, primary key); functional dependencies; 1NF, 2NF, 3NF, BCNF; multivalued dependencies, 4NF

Example: Repetition of information. Data for building and budget are repeated for each dept_name. Problems: (1) Redundancy: complicates updating, introducing the possibility of inconsistency, and wastes space. (2) Null values: information about a department cannot be stored if no instructor exists; null values can be used, but they are difficult to handle.

Update, Insertion, Deletion anomalies. Uncontrolled redundancy leads to: Update anomaly: if one instance of a value is updated, then all other occurrences have to be updated to the same new value (otherwise the data becomes inconsistent). Insertion anomaly: it is not possible to store some information unless other information is stored as well. Deletion anomaly: it is not possible to delete some pieces of information without losing some other pieces of information.
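The update and deletion anomalies can be reproduced on a small in-memory table. The following is a minimal sketch, not taken from the slides: the relation layout (instructor, dept_name, building, budget) follows the attribute names mentioned earlier, while the sample rows and all function names are hypothetical.

```python
# Hypothetical denormalised relation: (instructor, dept_name, building, budget).
# The sample rows are made up for illustration only.
inst_dept = [
    ("Ada", "Physics", "Watson", 70000),
    ("Bob", "Physics", "Watson", 70000),
    ("Cy", "Music", "Packard", 80000),
]


def update_budget_once(rows, dept, new_budget):
    """Update only the FIRST matching row, mimicking an incomplete update."""
    out, done = [], False
    for name, d, building, budget in rows:
        if d == dept and not done:
            budget, done = new_budget, True
        out.append((name, d, building, budget))
    return out


def inconsistent_depts(rows):
    """Departments whose budget is stored with more than one value."""
    budgets = {}
    for _, d, _, budget in rows:
        budgets.setdefault(d, set()).add(budget)
    return sorted(d for d, values in budgets.items() if len(values) > 1)


def delete_instructor(rows, name):
    """Delete every row of the given instructor."""
    return [r for r in rows if r[0] != name]


def departments(rows):
    """Set of departments that are still recorded in the table."""
    return {r[1] for r in rows}


# Update anomaly: one of the two Physics rows keeps the old budget.
after_update = update_budget_once(inst_dept, "Physics", 75000)
# Deletion anomaly: removing the only Music instructor loses the department.
after_delete = delete_instructor(inst_dept, "Cy")
```

Here `inconsistent_depts(after_update)` reports Physics, and `departments(after_delete)` no longer contains Music, although only an instructor was meant to be removed.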

Features of a good design: free the database of modification anomalies; make the database schema more informative to users; minimize redesign when extending the database schema; avoid bias towards any particular pattern of querying.

The role of normalisation. Theoretical approach: start with one relation containing all attributes (the universal relation). Some points in the design: getting information on the miniworld; E/R model; relational model; normalisation; redesign, reverse-engineering.

Decomposition: A decomposition of R = (A1, A2, ..., An) is a set of relations R1, ..., Rk, with R1(A11, A12, ..., A1i), R2(A21, A22, ..., A2j), ..., Rk(Ak1, Ak2, ..., Akm), such that the following 2 properties hold: 1. The union of the attribute sets of R1, ..., Rk is {A1, A2, ..., An}. 2. An instance of each Ri is the projection ri = π(Ai1, Ai2, ...)(r), where r is an instance of R.

Decomposition All attributes of an original schema (R) must appear in the decomposition (R1, R2): R = R1 ∪ R2

Example: Lossy Decomposition

Definition of lossless decomposition. Theorem: If relations R1, ..., Rk form a decomposition of R, then r ⊆ r1 ⋈ r2 ⋈ ... ⋈ rk (⋈ = natural join). Definition: If relations R1, ..., Rk form a decomposition of R, then it is said to be a lossless decomposition if r = r1 ⋈ r2 ⋈ ... ⋈ rk.
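The definition can be tried out on a concrete instance by projecting and joining again. The following is a minimal sketch under stated assumptions: relations are modelled as lists of dicts, the sample relation (id, name, city) is hypothetical, and the check only says whether the decomposition is lossless for this one instance, not for every legal instance of the schema.

```python
def project(rows, attrs):
    """Projection: keep only `attrs` and remove duplicate tuples."""
    seen = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(t) for t in seen]


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = []
    for t1 in r1:
        for t2 in r2:
            shared = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in shared):
                out.append({**t1, **t2})
    return out


def as_set(rows):
    """Order-independent representation of a relation instance."""
    return {tuple(sorted(row.items())) for row in rows}


def join_of_projections(rows, schemas):
    """Compute r1 join r2 join ... join rk for the projections of `rows`."""
    parts = [project(rows, attrs) for attrs in schemas]
    joined = parts[0]
    for part in parts[1:]:
        joined = natural_join(joined, part)
    return joined


def is_lossless_on(rows, schemas):
    """True if joining the projections gives back exactly `rows`."""
    return as_set(join_of_projections(rows, schemas)) == as_set(rows)


# Hypothetical instance: two different people share the same name.
r = [
    {"id": 1, "name": "Kim", "city": "Budapest"},
    {"id": 2, "name": "Kim", "city": "Szeged"},
]
lossy = [["id", "name"], ["name", "city"]]     # joins on name: spurious tuples
lossless = [["id", "name"], ["id", "city"]]    # joins on id: nothing is added
```

With the first decomposition the join contains four tuples instead of two, so information about which Kim lives in which city is lost; the original tuples are still contained in the join, as the theorem states.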

Functional Dependencies: A functional dependency is a constraint: the value of a certain set of attributes uniquely determines the value of another set of attributes. It is a generalization of the notion of a key.

Functional Dependencies: Let R be a relation schema, α ⊆ R and β ⊆ R. The functional dependency α → β holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is, t1[α] = t2[α] ⇒ t1[β] = t2[β]. Example (with one attribute): consider r(A, B) with the following instance of r. [Table: instance of r(A, B); surviving cell values: 4, 1, 5, 7] On this instance, A → B does NOT hold, but B → A does hold. Functional dependencies are related to the semantics of the relationships, not to the particular data in the tables! A dependency X → A is full if the dependency fails for every proper subset X' of X; the dependency is partial if not, i.e. if there is a proper subset X' of X such that X' → A.
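Whether a dependency is violated by a given instance can be tested mechanically. The following is a minimal sketch; the function name is hypothetical, and the sample instance of r(A, B) is an assumption chosen to agree with the statement above, since the original table is not recoverable. Such a test can only refute a dependency on the data at hand; whether a dependency really holds follows from the semantics of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two tuples agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, val) != val:
            return False
    return True


# Assumed instance of r(A, B), not taken from the slide.
r = [
    {"A": 1, "B": 4},
    {"A": 1, "B": 5},
    {"A": 3, "B": 7},
]

a_determines_b = fd_holds(r, ["A"], ["B"])   # A = 1 occurs with B = 4 and B = 5
b_determines_a = fd_holds(r, ["B"], ["A"])   # every B value occurs with one A
```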

Armstrong's axioms: F1: reflexivity: if Y ⊆ X then X → Y. F2: augmentation: if X → Y then XZ → YZ. F3: transitivity: if X → Y and Y → Z then X → Z. Additional rules derived from the axioms: Union: if A → B and A → C, then A → BC. Decomposition: if A → BC, then A → B and A → C.
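In practice the axioms are rarely applied by hand; whether a dependency follows from a given set is usually decided by computing an attribute closure. The following is a minimal sketch of that standard technique, assuming single-letter attribute names so that a string such as "AB" stands for the attribute set {A, B}; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from `fds`."""
    return set(rhs) <= closure(lhs, fds)


chain = [("A", "B"), ("B", "C")]   # A -> B, B -> C
fan = [("A", "B"), ("A", "C")]     # A -> B, A -> C

transitivity = implies(chain, "A", "C")     # A -> C
augmentation = implies(chain, "AD", "BD")   # AD -> BD
reflexivity = implies([], "AB", "A")        # AB -> A holds without any FDs
union_rule = implies(fan, "A", "BC")        # A -> BC
```

The closure of A under the first set is {A, B, C}, which is why A → C is derivable while C → A is not.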

Super/Candidate/Primary Keys: K is a superkey for relation schema R if and only if K → R. K is a candidate key for R if and only if K → R, and for no α ⊂ K, α → R. Primary key: one selected candidate key. Prime attribute: an attribute that belongs to some candidate key.
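These definitions translate directly into a brute-force search over attribute subsets. The following is a minimal sketch, assuming single-letter attribute names and a hypothetical schema R(A, B, C) with AB → C and C → B; the search is exponential in the number of attributes and is meant only for small examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """K is a superkey if K determines every attribute of the schema."""
    return set(schema) <= closure(attrs, fds)


def candidate_keys(schema, fds):
    """Minimal superkeys, found by trying subsets from small to large."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            # Skip supersets of a key that was already found (not minimal).
            if any(key <= candidate for key in keys):
                continue
            if is_superkey(candidate, schema, fds):
                keys.append(candidate)
    return keys


def prime_attributes(schema, fds):
    """Attributes that belong to at least one candidate key."""
    return set().union(*candidate_keys(schema, fds))


schema = "ABC"
fds = [("AB", "C"), ("C", "B")]
keys = candidate_keys(schema, fds)
```

For this schema the search finds the two candidate keys {A, B} and {A, C}, so all three attributes are prime.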

First normal form: First normal form (1NF) is now considered part of the formal definition of the relational model. A relational schema R is in first normal form if the domains of all attributes of R are atomic (indivisible) and the value of any attribute in a tuple is a single value from the domain. Example of non-atomic domains: composite attributes. Non-atomic values complicate storage and encourage redundant (repeated) storage of data. Example: a set of accounts stored with each customer, and a set of owners stored with each account. NOTE: Object-relational databases (used e.g. for geographic or XML databases) have moved away from this restriction.

Second normal form (2NF): No non-prime attribute in the table is functionally dependent on a proper subset of any candidate key. That is, if K is the set of attributes making up a candidate key, then every nonprime attribute A (an attribute that is not a member of any candidate key) is functionally dependent on K (i.e. K⟶A), but this fails for every proper subset of K (no proper subset of K functionally determines A).

Third normal form (3NF): The relation is in 2NF and there is no nontrivial dependency X⟶A for a nonprime attribute A and an attribute set X that does not contain a key (i.e. X is not a superkey). In other words, if a nontrivial X⟶A holds for some nonprime A, then X must be a superkey. For comparison, 2NF says that if X⟶A for nonprime A, then X cannot be a proper subset of any key, but X can still overlap with a key or be disjoint from a key.

Boyce-Codd Normal Form (BCNF) BCNF requires that whenever there is a nontrivial functional dependency X⟶A, then X is a superkey. It differs from 3NF in that 3NF requires either that X be a superkey or that A be prime (a member of some key). BCNF bans all nontrivial nonsuperkey dependencies X⟶A; 3NF makes an exception if A is prime.
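The difference between the two normal forms can be made concrete by listing the dependencies that violate each of them. The following is a minimal sketch, assuming single-letter attribute names, two hypothetical schemas, and that only the given dependencies are inspected (for a test on the whole schema this is generally taken to be sufficient; it would not be for a sub-schema of a decomposition).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def prime_attributes(schema, fds):
    """Union of all candidate keys (minimal superkeys), by brute force."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue
            if set(schema) <= closure(cand, fds):
                keys.append(cand)
    return set().union(*keys)


def bcnf_violations(schema, fds):
    """Nontrivial X -> A where X is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and not set(schema) <= closure(lhs, fds)]


def third_nf_violations(schema, fds):
    """Like BCNF, but X -> A is tolerated when A is prime."""
    prime = prime_attributes(schema, fds)
    bad = []
    for lhs, rhs in fds:
        if set(schema) <= closure(lhs, fds):
            continue  # X is a superkey, nothing to check
        if any(a not in prime for a in set(rhs) - set(lhs)):
            bad.append((lhs, rhs))
    return bad


overlapping_keys = [("AB", "C"), ("C", "B")]   # keys AB and AC, B is prime
transitive_chain = [("A", "B"), ("B", "C")]    # key A, C depends on B
```

On R(A, B, C) with AB → C and C → B, the dependency C → B violates BCNF but not 3NF, because B is prime. With A → B and B → C, the dependency B → C violates both.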

“I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd.”

(BCNF and Dependency Preservation) Constraints, including functional dependencies, are costly to check in practice unless they pertain to only one relation. If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving. It is not always possible to achieve both BCNF and dependency preservation.

(BCNF cont.) All databases enforce primary-key constraints. One could use a CHECK constraint to enforce the lost dependency FD2 (county, lot_num → property_ID), but this is often a lost cause:
    CHECK (not exists (
        select ay.county, ax.lot_num, ax.property_ID, ax2.property_ID
        from LOTS1AX ax, LOTS1AX ax2, LOTS1AY ay, LOTS1AY ay2
        where ax.area = ay.area and ax2.area = ay2.area   -- join conditions
          and ay.county = ay2.county                      -- both lots in the same county
          and ax.lot_num = ax2.lot_num
          and ax.property_ID <> ax2.property_ID))
We might be better off ignoring FD5 (area → county) here, and just allowing for the possibility that area does not determine county, or determines it only "by accident". Generally, it is good practice to normalize to 3NF, but it is sometimes not possible to achieve BCNF without giving up dependency preservation.

Multivalued Dependencies: There are database schemas in BCNF that do not seem to be sufficiently normalized. Consider a database classes(course, teacher, book) such that (c, t, b) ∈ classes means that t is qualified to teach c, and b is a required textbook for c. The database is supposed to list for each course the set of teachers, any one of which can be the course’s instructor, and the set of books, all of which are required for the course (no matter who teaches it).

Multivalued Dependencies (Cont.) There are no non-trivial functional dependencies and therefore the relation is in BCNF. Insertion anomalies: if Sara is a new teacher who can teach database, two tuples need to be inserted: (database, Sara, DB Concepts) and (database, Sara, Ullman). [Table classes(course, teacher, book); surviving values: course: database, operating systems; teacher: Avi, Hank, Sudarshan, Jim; book: DB Concepts, Ullman, OS Concepts, Shaw]

Multivalued Dependencies (Cont.) Therefore, it is better to decompose classes into teaches(course, teacher) and text(course, book). [Table teaches; surviving values: course: database, operating systems; teacher: Avi, Hank, Sudarshan, Jim] [Table text; surviving values: course: database, operating systems; book: DB Concepts, Ullman, OS Concepts, Shaw]

Multivalued Dependencies (MVDs): A multivalued dependency is a constraint between two sets of attributes in a relation (in fact, 3 sets of attributes are involved). A multivalued dependency requires that certain tuples be present in a relation. Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency α ↠ β holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that: t1[α] = t2[α] = t3[α] = t4[α]; t3[β] = t1[β]; t3[R - β] = t2[R - β]; t4[β] = t2[β]; t4[R - β] = t1[R - β].
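The definition can be checked on a concrete instance by looking for the required tuple t3 for every ordered pair (t1, t2); the tuple t4 is then covered by the swapped pair (t2, t1). The following is a minimal sketch: the function name is hypothetical, and the sample rows are assumptions built from the course, teacher and book names of the classes example, not the table from the slides.

```python
def mvd_holds(rows, schema, alpha, beta):
    """Check alpha ->> beta on an instance given as a list of dicts."""
    rest = [a for a in schema if a not in beta]          # R - beta
    present = {tuple(row[a] for a in schema) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in alpha):
                # t3 takes beta from t1 and everything else from t2.
                t3 = {**{a: t2[a] for a in rest},
                      **{a: t1[a] for a in beta}}
                if tuple(t3[a] for a in schema) not in present:
                    return False
    return True


schema = ["course", "teacher", "book"]


def row(course, teacher, book):
    return {"course": course, "teacher": teacher, "book": book}


# Assumed instance: every database teacher combined with every database book.
classes = [
    row("database", "Avi", "DB Concepts"),
    row("database", "Avi", "Ullman"),
    row("database", "Hank", "DB Concepts"),
    row("database", "Hank", "Ullman"),
]

# Inserting Sara with only one of the two books breaks course ->> teacher.
one_tuple = classes + [row("database", "Sara", "DB Concepts")]
two_tuples = one_tuple + [row("database", "Sara", "Ullman")]
```

`mvd_holds(one_tuple, schema, ["course"], ["teacher"])` is false, which is the insertion anomaly in executable form: the dependency is only restored once both Sara tuples are present.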

MVD (Cont.) Tabular representation of α ↠ β

Example (Cont.) In our example: course ↠ teacher and course ↠ book. The above formal definition is supposed to formalize the notion that given a particular value of Y (course) it has associated with it a set of values of Z (teacher) and a set of values of W (book), and these two sets are in some sense independent of each other. Note: If Y → Z then Y ↠ Z

4NF: avoiding nontrivial multivalued dependencies whose left-hand side is not a superkey

YouTube (good, somewhat too detailed)
Functional Dependencies: Part 1 https://www.youtube.com/watch?v=jwNv1-b0tJs Part 2 https://www.youtube.com/watch?v=MI1IdFAuNcM
BCNF: Part 1 https://www.youtube.com/watch?v=aQAzqTJ-8o8 Part 2 https://www.youtube.com/watch?v=SIBY4lHBXT0
Multivalued Dependencies: Part 1 https://www.youtube.com/watch?v=RFiyKvFguVs Part 2 https://www.youtube.com/watch?v=e67f26RSsu4