Dependency preservation, 3NF revisited and BCNF


Decomposition - more than one possibility
- normalisation is carried out by (non-loss) decomposition
- Modules(M_id, M_name, Type, Value)
- solution #1 (3NF): Modules_Descr(M_id, M_name, Type) and Type_Val(Type, Value)
- solution #2 (3NF): Modules_Descr(M_id, M_name, Type) and Module_Val(M_id, Value)
- are they both non-loss? (apply Heath’s theorem)
- is one better than the other?
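The Heath's theorem question can be checked mechanically. The sketch below is only an illustration: it assumes the FDs M_id → M_name, M_id → Type and Type → Value for Modules (read off the slide, not stated as a list), and the helper names `closure` and `heath_non_loss` are made up for this example. It tests whether the attributes shared by the two projections functionally determine all of one projection, which is the condition Heath's theorem needs.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def heath_non_loss(r1, r2, fds):
    """True if the common attributes determine all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [({"M_id"}, {"M_name"}), ({"M_id"}, {"Type"}), ({"Type"}, {"Value"})]
MODULES_DESCR = {"M_id", "M_name", "Type"}

solution_1 = heath_non_loss(MODULES_DESCR, {"Type", "Value"}, FDS)
solution_2 = heath_non_loss(MODULES_DESCR, {"M_id", "Value"}, FDS)
print(solution_1, solution_2)
```

Under these assumptions both decompositions pass the test, so the choice between them has to be made on other grounds.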

Decomposition - update anomalies
- updates
  - u1: insert the fact that a 3-semester module is worth 1.5cu
  - u2: modify 1-semester modules; they are no longer worth 0.5cu, they are worth 0.75cu
  - u3: change the type of a module but forget to change its value
- solution #2: u1 and u2 are impossible or very difficult to perform; u3 is allowed
- solution #1: u1 and u2 are straightforward; u3 is not allowed

Solution #1 vs solution #2
- solution #1 is more expressive: certain facts cannot be expressed in solution #2, e.g. the value of a new type
- in solution #1, updates can be performed independently on the two component relations (i.e. all constraints are properly expressed)
- in solution #2, Type → Value is lost, so this constraint must be enforced by the user, through procedural code
- independent projections: updates can be performed independently on each projection, without the danger of ending up with inconsistent data

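Independent projections
[FD diagrams: Modules(M_id, M_name, Type, Value); solution #1 as (M_id, M_name, Type) and (Type, Value); solution #2 as (M_id, M_name, Type) and (M_id, Value)]
- solution #1: all direct FDs are intra-relational; all transitive FDs are inter-relational
- solution #2: one transitive FD is intra-relational; one direct FD is lost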

Independent projections - Rissanen
- R1 and R2 are two projections of R
- R1 and R2 are independent if and only if
  - every FD in R is a logical consequence of the FDs in R1 and R2, and
  - the common attributes of R1 and R2 form a candidate key for at least one of R1 or R2
- an atomic relation is one that cannot be decomposed into independent projections
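Rissanen's two conditions can be tried out on the Modules example. This is a brute-force sketch, not an efficient implementation: the FD list for Modules is an assumption read off the earlier slides, and `project_fds`, `is_candidate_key` and `independent` are hypothetical helper names. FDs are projected by enumerating every attribute subset, which is only practical for very small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project_fds(sub, fds):
    """Non-trivial FDs implied by `fds` that only mention attributes of `sub`."""
    projected = []
    names = sorted(sub)
    for size in range(1, len(names) + 1):
        for lhs in combinations(names, size):
            rhs = (closure(lhs, fds) & sub) - set(lhs)
            if rhs:
                projected.append((set(lhs), rhs))
    return projected


def is_candidate_key(attrs, rel, fds):
    """`attrs` determines all of `rel` and no proper subset of `attrs` does."""
    if not attrs or not rel <= closure(attrs, fds):
        return False
    return all(not rel <= closure(attrs - {a}, fds) for a in attrs)


def independent(r1, r2, fds):
    """Rissanen's test: FDs preserved, and common attributes key one side."""
    local = project_fds(r1, fds) + project_fds(r2, fds)
    preserved = all(rhs <= closure(lhs, local) for lhs, rhs in fds)
    common = r1 & r2
    keyed = is_candidate_key(common, r1, fds) or is_candidate_key(common, r2, fds)
    return preserved and keyed


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [({"M_id"}, {"M_name"}), ({"M_id"}, {"Type"}), ({"Type"}, {"Value"})]
DESCR = {"M_id", "M_name", "Type"}

solution_1_independent = independent(DESCR, {"Type", "Value"}, FDS)
solution_2_independent = independent(DESCR, {"M_id", "Value"}, FDS)
print(solution_1_independent, solution_2_independent)
```

With these FDs, solution #1 passes both conditions, while solution #2 fails the first one because Type → Value cannot be derived from the FDs local to its projections.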

Dependency preservation
- R was decomposed (normalisation) into R1, …, Rn
- S is the set of FDs for R
- S1, …, Sn are the sets of FDs for R1, …, Rn (each Si refers only to the attributes of Ri)
- S’ = S1 ∪ … ∪ Sn (usually, S’ ≠ S)
- the decomposition is dependency preserving if (not iff) S’+ = S+
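The condition S’+ = S+ can be tested without writing out S1, …, Sn explicitly. The sketch below uses the standard textbook iteration (grow the left-hand side relation by relation, using closures under the original FDs restricted to each Ri); this is a swapped-in technique, not something the slides describe. The FD list for Modules and the function names are assumptions made for the illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def fd_preserved(lhs, rhs, decomposition, fds):
    """True if lhs -> rhs follows from the FDs local to the projections."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for rel in decomposition:
            # What the part of `result` inside `rel` determines within `rel`.
            gained = closure(result & rel, fds) & rel
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


def dependency_preserving(decomposition, fds):
    """True if every FD in `fds` is preserved by the decomposition."""
    return all(fd_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


# Assumed FDs for Modules(M_id, M_name, Type, Value).
FDS = [({"M_id"}, {"M_name"}), ({"M_id"}, {"Type"}), ({"Type"}, {"Value"})]
SOLUTION_1 = [{"M_id", "M_name", "Type"}, {"Type", "Value"}]
SOLUTION_2 = [{"M_id", "M_name", "Type"}, {"M_id", "Value"}]

preserving_1 = dependency_preserving(SOLUTION_1, FDS)
preserving_2 = dependency_preserving(SOLUTION_2, FDS)
print(preserving_1, preserving_2)
```

Checking only the FDs of a cover of S is enough, because S’+ ⊆ S+ always holds and S+ ⊆ S’+ follows once every FD of the cover is derivable from S’.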

2NF and 3NF - more than one CK
- a relation is in 2NF if and only if it is in 1NF and every non-key attribute is irreducibly dependent on each candidate key
- 3NF (Zaniolo): R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:
  - X contains A
  - X contains a candidate key of R
  - A is contained in a candidate key of R
- if at least one of the three holds for every FD X → A, then R is in 3NF

Example
Assume the supply department in a company is in charge of bringing in parts from different manufacturers. A part is uniquely identified by its name and manufacturer; for convenience, a part is also given an id. A separate delivery is necessary for each type of part, from each manufacturer. At most one delivery is made in one day for one type of part from one manufacturer. A “transport” (e.g. van23) is associated with each delivery. Each transport has a unique driver. A driver can drive more than one “transport”.

Relevant FDs
- CK: (Type, Manufacturer, Date)
- CK: (Id, Date)
- (Type, Manufacturer) → Id
- Id → (Type, Manufacturer)
- Transport → Driver
- Manufacturer → Address
- Type → Handling_req
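The two candidate keys can be recovered from the FDs by exhaustive search. The sketch below assumes the relation has exactly the eight attributes named on these slides, and it adds one FD that the slide leaves implicit: (Type, Manufacturer, Date) → Transport, taken from the statement that a transport is associated with each delivery. The function name `candidate_keys` is made up for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    names = sorted(attrs)
    for size in range(1, len(names) + 1):  # smallest sets first
        for combo in combinations(names, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so not minimal
            if closure(candidate, fds) >= attrs:
                keys.append(candidate)
    return keys


# Assumed attribute set for the delivery relation.
ATTRS = {"Type", "Manufacturer", "Date", "Id",
         "Transport", "Driver", "Address", "Handling_req"}
FDS = [
    ({"Type", "Manufacturer"}, {"Id"}),
    ({"Id"}, {"Type", "Manufacturer"}),
    ({"Type", "Manufacturer", "Date"}, {"Transport"}),  # assumption, see above
    ({"Transport"}, {"Driver"}),
    ({"Manufacturer"}, {"Address"}),
    ({"Type"}, {"Handling_req"}),
]

keys = candidate_keys(ATTRS, FDS)
print(sorted(sorted(key) for key in keys))
```

The search is exponential in the number of attributes, which is acceptable for a relation of this size.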

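2NF
[FD diagrams: is the relation in 2NF? The 2NF decomposition splits off (Type, HR) and (Man, Add), i.e. Type with Handling_req and Manufacturer with Address]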

3NF
[FD diagrams: is the relation in 3NF? The 3NF decomposition splits off (Transp, Driver), i.e. Transport with Driver]

3NF
- 3NF is not free from update anomalies, e.g. those caused by (Type, Manufacturer) → Id
- exercise: find insert, delete and update anomalies

BCNF
- a relation is in Boyce/Codd normal form (BCNF) if and only if every non-trivial, irreducible FD has a candidate key as its determinant
- any relation can be non-loss decomposed into an equivalent set of BCNF relations
- BCNF ⇒ 3NF ⇒ 2NF ⇒ 1NF
- BCNF is still not guaranteed to be free of all update anomalies caused by FDs (example later)

BCNF - examples
- previous example: one candidate key only
- CKs: Id, Name, Photo (what do you think about this?), User_name; draw the corresponding FD diagram
- overlapping CKs: (Name, Contest), (Contest, Position)

Zaniolo’s definitions
- R is a relation; X is any set of attributes of R; A is any single attribute of R; consider the following conditions:
  - X contains A
  - X contains a candidate key of R
  - A is contained in a candidate key of R
- if at least one of the three holds for every FD X → A, then R is in 3NF
- if at least one of the first two holds for every FD X → A, then R is in BCNF

BCNF again
[FD diagram: Patient, Doctor, Disease]
- a patient is treated by a single doctor for a certain disease
- each doctor only treats one kind of disease
- a doctor can treat more than one patient
- is this relation in 3NF? is this relation in BCNF?
- can you identify update anomalies?
- consider also (Patient, Disease, Doctor, Treatment) with (Patient, Disease) → Treatment
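The 3NF and BCNF questions for the (Patient, Doctor, Disease) relation can be answered by applying Zaniolo's three conditions to each FD. The sketch below assumes the FDs (Patient, Disease) → Doctor and Doctor → Disease, read from the three statements on the slide, and the candidate keys (Patient, Disease) and (Patient, Doctor) that follow from them. The function name `zaniolo` is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def zaniolo(attrs, fds, keys):
    """Return (is_3nf, is_bcnf) by testing each FD X -> A of the given set."""
    key_attrs = set().union(*keys)
    is_3nf = True
    is_bcnf = True
    for lhs, rhs in fds:
        for a in rhs:
            x_contains_a = a in lhs                    # condition 1
            x_contains_key = closure(lhs, fds) >= attrs  # condition 2
            a_in_some_key = a in key_attrs             # condition 3
            if not (x_contains_a or x_contains_key):
                is_bcnf = False
                if not a_in_some_key:
                    is_3nf = False
    return is_3nf, is_bcnf


ATTRS = {"Patient", "Doctor", "Disease"}
FDS = [({"Patient", "Disease"}, {"Doctor"}), ({"Doctor"}, {"Disease"})]
KEYS = [{"Patient", "Disease"}, {"Patient", "Doctor"}]

is_3nf, is_bcnf = zaniolo(ATTRS, FDS, KEYS)
print(is_3nf, is_bcnf)
```

Doctor → Disease is the FD that decides the outcome: Doctor is not a superkey, so the second condition fails, but Disease belongs to the candidate key (Patient, Disease), so the third condition holds.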

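Possible decompositions
[diagrams of three candidate decompositions]
- first decomposition: non-loss? (choose PKs)
- second decomposition: non-loss? (choose PKs)
- third decomposition: Heath’s theorem (choose PKs)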

BCNF vs dependency preservation
- the projections of the BCNF decomposition do not enforce an FD existing in the original specification, namely (Patient, Disease) → Doctor
- e.g. a patient can be given two doctors that treat the same disease (the system will not disallow this); the constraint would have to be maintained by procedural code

BCNF vs dependency preservation
- not every FD is expressible through normalisation
- when the relation was in 3NF
  - (Patient, Disease) → Doctor was expressed: a patient could not be assigned more than one doctor for the same disease
  - Doctor → Disease was not expressed: this generated update anomalies
- in BCNF
  - Doctor → Disease was expressed
  - (Patient, Disease) → Doctor was not expressed: this generated update anomalies (refer to previous slide)
- this latter FD would not have been expressed even if the decomposition into all three 2-attribute relations had been considered
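A small data example makes the lost FD concrete. It assumes the BCNF decomposition into (Patient, Doctor) and (Doctor, Disease), which is the split Heath's theorem gives for Doctor → Disease; the relation names, the sample rows and the helper `satisfies_fd` are invented for the illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical contents of the two BCNF projections.
doctor_disease = [
    {"Doctor": "Lee", "Disease": "flu"},
    {"Doctor": "Kim", "Disease": "flu"},
]
patient_doctor = [
    {"Patient": "Ann", "Doctor": "Lee"},
    {"Patient": "Ann", "Doctor": "Kim"},
]

# Natural join of the two projections on Doctor.
joined = [
    {**pd, **dd}
    for pd in patient_doctor
    for dd in doctor_disease
    if pd["Doctor"] == dd["Doctor"]
]

# The only FD that a projection can enforce locally still holds ...
local_ok = satisfies_fd(doctor_disease, ["Doctor"], ["Disease"])
# ... but the joined data breaks (Patient, Disease) -> Doctor.
global_ok = satisfies_fd(joined, ["Patient", "Disease"], ["Doctor"])
print(local_ok, global_ok)
```

Each projection is consistent on its own, yet the join shows Ann with two doctors for the same disease, which is the situation the slide says must be prevented by procedural code.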

Conclusions
- normal forms: formalisation of common sense
- art → engineering; possibility for automation
- BCNF: always achievable; not always free of update anomalies (recall previous example), because it cannot always express all the FDs existing in the problem