Database Group, Georgia Tech: Normalization

Presentation transcript:

Database Group, Georgia Tech 1 Normalization

Slide 2: What it's all about

Given a relation R and a set of functional dependencies F on R, assume that R is not in a desirable form for enforcing F.

Decompose R into relations R1, ..., Rk, with associated functional dependencies F1, ..., Fk, such that R1, ..., Rk are in a more desirable form: 3NF or BCNF.

While decomposing R, make sure to preserve the dependencies, and make sure not to lose information.

Slide 3: Contents

The Good and the Bad
Bad database design
– redundancy of fact
– fact clutter
– information loss
– dependency loss
Good database design
How to compute with meaning
– functional dependencies (FDs)
– Armstrong's inference rules
– the meaning of a set of FDs
– minimal cover of a set of FDs
Normal Forms: overview
– 1NF, 2NF, 3NF, BCNF

Slide 4: The Good

The Four Commandments:
– Thou Shalt Commit No Redundancy of Fact
– Thou Shalt Clutter No Facts
– Thou Shalt Preserve Information
– Thou Shalt Preserve Functional Dependencies

Slide 5: Primitive Domains

Attributes must be defined over domains with atomic values.

FLT-SCHEDULE (weekday is not atomic)
flt#    weekday     airline  dtime  from  atime  to
DL242   MO WE FR    DELTA    10:40  ATL   12:30  BOS
SK912   SA SU       SAS      12:00  CPH   15:30  JFK
AA242   MO FR       AA       08:00  CHI   10:10  ATL

FLT-SCHEDULE (atomic values only)
flt#    weekday  airline  dtime  from  atime  to
DL242   MO       DELTA    10:40  ATL   12:30  BOS
SK912   SA       SAS      12:00  CPH   15:30  JFK
AA242   MO       AA       08:00  CHI   10:10  ATL
DL242   WE       DELTA    10:40  ATL   12:30  BOS
DL242   FR       DELTA    10:40  ATL   12:30  BOS
SK912   SU       SAS      12:00  CPH   15:30  JFK
AA242   FR       AA       08:00  CHI   10:10  ATL
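The step from the first table to the second can be sketched in a few lines of Python. This is only an illustration: the list-of-dicts representation and the helper name `flatten` are assumptions made for this example, not part of the slides.

```python
# Hypothetical in-memory form of the first FLT-SCHEDULE table:
# the "weekday" attribute holds a list, so the relation is not in 1NF.
flt_schedule = [
    {"flt#": "DL242", "weekday": ["MO", "WE", "FR"], "airline": "DELTA",
     "dtime": "10:40", "from": "ATL", "atime": "12:30", "to": "BOS"},
    {"flt#": "SK912", "weekday": ["SA", "SU"], "airline": "SAS",
     "dtime": "12:00", "from": "CPH", "atime": "15:30", "to": "JFK"},
    {"flt#": "AA242", "weekday": ["MO", "FR"], "airline": "AA",
     "dtime": "08:00", "from": "CHI", "atime": "10:10", "to": "ATL"},
]


def flatten(rows, attr):
    """Emit one row per value of the multi-valued attribute `attr`."""
    return [{**row, attr: value} for row in rows for value in row[attr]]


flat = flatten(flt_schedule, "weekday")
for row in flat:
    print(row["flt#"], row["weekday"], row["airline"])
```

Every row of `flat` now carries a single weekday, at the cost of repeating the other attributes once per weekday.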

Slide 6: Bad Database Design - redundancy of fact

FLIGHTS
flt#    date      airline   plane#
DL242   10/23/00  Delta     k-yo
DL242   10/24/00  Delta     t-up
DL242   10/25/00  Delta     o-ge
AA121   10/24/00  American  p-rw
AA121   10/25/00  American  q-yg
AA411   10/22/00  American  h-fe

redundancy: the airline name is repeated for the same flight
inconsistency: when the airline name for a flight changes, it must be changed in many places

Slide 7: Bad Database Design - fact clutter

insertion anomalies: how do we represent that SK912 is flown by Scandinavian without there being a date and a plane assigned?
deletion anomalies: cancelling AA411 on 10/22/00 makes us lose that it is flown by American.
update anomalies: if DL242 is flown by Sabena, we must change it everywhere.

FLIGHTS
flt#    date      airline   plane#
DL242   10/23/00  Delta     k-yo
DL242   10/24/00  Delta     t-up
DL242   10/25/00  Delta     o-ge
AA121   10/24/00  American  p-rw
AA121   10/25/00  American  q-yg
AA411   10/22/00  American  h-fe-65748

Slide 8: Bad Database Design - information loss

FLIGHTS
flt#    date      airline   plane#
DL242   10/23/00  Delta     k-yo
DL242   10/24/00  Delta     t-up
DL242   10/25/00  Delta     o-ge
AA121   10/24/00  American  p-rw
AA121   10/25/00  American  q-yg
AA411   10/22/00  American  h-fe

FLIGHTS-AIRLINE
flt#    airline
DL242   Delta
AA121   American
AA411   American

DATE-AIRLINE-PLANE
date      airline   plane#
10/23/00  Delta     k-yo
10/24/00  Delta     t-up
10/25/00  Delta     o-ge
10/24/00  American  p-rw
10/25/00  American  q-yg
10/22/00  American  h-fe-65748

Slide 9: Bad Database Design - information loss

FLIGHTS (as reconstructed by joining FLIGHTS-AIRLINE and DATE-AIRLINE-PLANE on airline)
flt#    date      airline   plane#
DL242   10/23/00  Delta     k-yo
DL242   10/24/00  Delta     t-up
DL242   10/25/00  Delta     o-ge
AA121   10/24/00  American  p-rw
AA121   10/25/00  American  q-yg
AA121   10/22/00  American  h-fe
AA411   10/24/00  American  p-rw
AA411   10/25/00  American  q-yg
AA411   10/22/00  American  h-fe

DATE-AIRLINE-PLANE
date      airline   plane#
10/23/00  Delta     k-yo
10/24/00  Delta     t-up
10/25/00  Delta     o-ge
10/24/00  American  p-rw
10/25/00  American  q-yg
10/22/00  American  h-fe

FLIGHTS-AIRLINE
flt#    airline
DL242   Delta
AA121   American
AA411   American

information loss: we polluted the database with false facts; we can't find the true facts.
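The effect can be reproduced with a small Python sketch. The tuples are the six FLIGHTS rows from the slides, with the plane# values used only as opaque labels; the variable names are chosen for this illustration.

```python
# The original FLIGHTS relation as (flt#, date, airline, plane#) tuples.
flights = {
    ("DL242", "10/23/00", "Delta", "k-yo"),
    ("DL242", "10/24/00", "Delta", "t-up"),
    ("DL242", "10/25/00", "Delta", "o-ge"),
    ("AA121", "10/24/00", "American", "p-rw"),
    ("AA121", "10/25/00", "American", "q-yg"),
    ("AA411", "10/22/00", "American", "h-fe"),
}

# Project onto the two schemas of the bad decomposition.
flights_airline = {(f, a) for f, d, a, p in flights}
date_airline_plane = {(d, a, p) for f, d, a, p in flights}

# Natural join on the only shared attribute, airline.
joined = {
    (f, d, a, p)
    for f, a in flights_airline
    for d, a2, p in date_airline_plane
    if a == a2
}

# Tuples produced by the join that were never in FLIGHTS.
spurious = joined - flights
for row in sorted(spurious):
    print(row)
```

The join returns nine tuples instead of six. The three extra tuples are the false facts, and nothing in the two projections tells them apart from the true ones.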

Slide 10: Bad Database Design - dependency loss

DATE-AIRLINE-PLANE
date      airline   plane#
10/23/00  Delta     k-yo
10/24/00  Delta     t-up
10/25/00  Delta     o-ge
10/24/00  American  p-rw
10/25/00  American  q-yg
10/22/00  American  h-fe

FLIGHTS-AIRLINE
flt#    airline
DL242   Delta
AA121   American
AA411   American

dependency loss: we lost the fact that (flt#, date) → plane#

Slide 11: Good Database Design

– no redundancy of FACT (!)
– no inconsistency
– no insertion, deletion or update anomalies
– no information loss
– no dependency loss

FLIGHTS-DATE-PLANE
flt#    date      plane#
DL242   10/23/00  k-yo
DL242   10/24/00  t-up
DL242   10/25/00  o-ge
AA121   10/24/00  p-rw
AA121   10/25/00  q-yg
AA411   10/22/00  h-fe

FLIGHTS-AIRLINE
flt#    airline
DL242   Delta
AA121   American
AA411   American

Slide 12: Functional Dependencies and Keys

Let X and Y be sets of attributes in R.

Y is functionally dependent on X in R iff for each x ∈ R.X there is precisely one y ∈ R.Y.

Y is fully functionally dependent on X in R if Y is functionally dependent on X and Y is not functionally dependent on any proper subset of X.

We use keys to enforce functional dependencies in relations: X → Y is enforced by a relation with attributes X and Y in which X is the key.
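As a rough illustration of the definition, the following Python sketch tests whether X → Y holds in a given relation instance. The function name `fd_holds` and the dict-per-row representation are assumptions made for this example. Note that an instance can only refute a dependency; whether an FD really holds is a statement about the meaning of the data, not about one snapshot.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every lhs value is paired with exactly one rhs value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The FLIGHTS instance from the earlier slides (plane# values used as labels only).
columns = ("flt#", "date", "airline", "plane#")
flights = [dict(zip(columns, values)) for values in [
    ("DL242", "10/23/00", "Delta", "k-yo"),
    ("DL242", "10/24/00", "Delta", "t-up"),
    ("DL242", "10/25/00", "Delta", "o-ge"),
    ("AA121", "10/24/00", "American", "p-rw"),
    ("AA121", "10/25/00", "American", "q-yg"),
    ("AA411", "10/22/00", "American", "h-fe"),
]]

print(fd_holds(flights, ["flt#"], ["airline"]))         # flt# -> airline
print(fd_holds(flights, ["flt#"], ["plane#"]))          # flt# -> plane#
print(fd_holds(flights, ["flt#", "date"], ["plane#"]))  # (flt#, date) -> plane#
```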

Slide 13: Functional Dependencies and Keys

FLIGHTS(flt#, date, airline, plane#)

– plane# is not determined by flt# alone
– airline is not fully dependent on (flt#, date); it is determined by flt# alone
– the FLIGHTS relation will not allow the FDs to be enforced by keys

Slide 14: Functional Dependencies and Keys

Consider the meaning.

[Figure: real world versus database, showing the attributes cust#, name and address once combined and once separate.]

Slide 15: Functional Dependencies in the ER-Diagram

Relations:
AIRPORT(airportcode, name, city, state)
FLT-SCHEDULE(flt#, airline, dtime, from-airportcode, atime, to-airportcode, miles, price)
FLT-WEEKDAY(flt#, weekday)
FLT-INSTANCE(flt#, date, plane#, #avail-seats)
AIRPLANE(plane#, plane-type, total-#seats)
CUSTOMER(cust#, first, middle, last, phone#, street, city, state, zip)
RESERVATION(flt#, date, cust#, seat#, check-in-status, ticket#)

[Figure: ER diagram with the entity types Airport, Flt Schedule, Flt Instance, Airplane and Customer, the relationships From, To, Instance Of, Assigned and Reservation, and the attributes listed above.]

Functional dependencies:
AIRPORT → Airportcode
FLT-SCHEDULE → Flt#
FLT-INSTANCE → (Flt#, Date)
AIRPLANE → Plane#
CUSTOMER → Cust#
RESERVATION → (Cust#, Flt#, Date)
RESERVATION → Ticket#
Airportcode → Name, City, State
Flt# → Airline, Dtime, Atime, Miles, Price, (from) Airportcode, (to) Airportcode
(Flt#, Date) → Flt#, Date, Plane#
(Cust#, Flt#, Date) → Cust#, Flt#, Date, Ticket#, Seat#, CheckInStatus
Ticket# → Cust#, Flt#, Date
Cust# → CustomerName, CustomerAddress, Phone#

Slide 16: How to Compute Meaning - Armstrong's inference rules

Rules of the computation:
– Reflexivity: if Y ⊆ X, then X → Y
– Augmentation: if X → Y, then WX → WY
– Transitivity: if X → Y and Y → Z, then X → Z

Derived rules:
– Union: if X → Y and X → Z, then X → YZ
– Decomposition: if X → YZ, then X → Y and X → Z
– Pseudotransitivity: if X → Y and WY → Z, then XW → Z

Armstrong's Axioms:
– sound
– complete

Slide 17: How to Compute Meaning - the meaning of a set of FDs, F+

umbrella: a collapsible shade consisting of fabric stretched over hinged ribs radiating from a central pole

Given the ribs of an umbrella (the FDs), what does the whole umbrella, F+, look like?

– Determine each set of attributes, X, that appears on the left-hand side of an FD.
– Determine the set X+, the closure of X under F.
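The closure X+ can be computed with the usual fixpoint iteration. The sketch below is a minimal Python version under an assumed representation: each FD is a pair of frozensets (left-hand side, right-hand side). The example FDs are the two FLIGHTS dependencies discussed on the earlier slides.

```python
def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    (frozenset({"flt#"}), frozenset({"airline"})),
    (frozenset({"flt#", "date"}), frozenset({"plane#"})),
]

print(sorted(closure({"flt#"}, fds)))
print(sorted(closure({"flt#", "date"}, fds)))
```

With these FDs, flt# alone determines only the airline, while (flt#, date) determines every attribute of FLIGHTS and is therefore a key.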

Slide 18: How to Compute Meaning - when do sets of FDs mean the same?

F covers E if every FD in E is also in F+.

F and E are equivalent if F covers E and E covers F.

We can determine whether F covers E by calculating X+ with respect to F for each FD X → Y in E, and then checking whether this X+ includes the attributes in Y. If this is the case for every FD in E, then F covers E.

[Figure: F covers E, drawn as E lying inside F+.]
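The cover test described above translates almost directly into code. This is a minimal Python sketch; the names `covers` and `equivalent`, the frozenset-pair representation of FDs, and the small example sets over attributes A, B, C are all assumptions made for the illustration.

```python
def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def covers(f, e):
    """F covers E if, for every X -> Y in E, Y is contained in X+ computed under F."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in e)


def equivalent(f, e):
    return covers(f, e) and covers(e, f)


def fd(lhs, rhs):
    """Shorthand: fd("AB", "C") stands for AB -> C over single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))


F = [fd("A", "B"), fd("B", "C")]
E = [fd("A", "C")]
G = [fd("A", "BC"), fd("B", "C")]

print(covers(F, E), covers(E, F))  # F implies A -> C, but E does not imply A -> B
print(equivalent(F, G))
```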

Slide 19: How to Compute Meaning - minimal cover of a set of FDs

Is there a minimal set of ribs that will hold the umbrella open?

F is minimal if:
– every dependency in F has a single attribute as its right-hand side
– we can't replace any dependency X → A in F with a dependency Y → A, where Y ⊂ X, and still have a set of dependencies equivalent to F
– we can't remove any dependency from F and still have a set of dependencies equivalent to F
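One common way to obtain a minimal cover follows the three conditions in order: split right-hand sides, shrink left-hand sides, then drop redundant dependencies. The Python sketch below takes that textbook route; it is not necessarily the procedure the slides had in mind, and the function names and the example over attributes A, B, C are assumptions. A set of FDs can have several minimal covers, so the result depends on the order in which attributes and dependencies are tried.

```python
def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute per right-hand side.
    g = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop left-hand-side attributes that are not needed.
    for i in range(len(g)):
        lhs, rhs = g[i]
        for b in sorted(lhs):
            reduced = g[i][0] - {b}
            if rhs <= closure(reduced, g):
                g[i] = (reduced, rhs)
    # Step 3: remove duplicates, then dependencies implied by the remaining ones.
    result = list(dict.fromkeys(g))
    for dep in list(result):
        rest = [other for other in result if other != dep]
        if dep[1] <= closure(dep[0], rest):
            result = rest
    return result


F = [
    (frozenset("A"), frozenset("BC")),
    (frozenset("B"), frozenset("C")),
    (frozenset("AB"), frozenset("C")),
]
for lhs, rhs in minimal_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For this input the sketch keeps A → B and B → C: A → C follows by transitivity, and AB → C shrinks to B → C, which is already present.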

Slide 20: How to guarantee lossless joins

Decompose relation R, with functional dependencies F, into relations R1 and R2, with attributes A1 and A2 and associated functional dependencies F1 and F2.

The decomposition is lossless (R1 ⋈ R2 = R) iff:
– (A1 ∩ A2) → (A1 \ A2) is in F+, or
– (A1 ∩ A2) → (A2 \ A1) is in F+
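Because X → Y is in F+ exactly when Y is contained in X+, the condition can be checked with one closure computation. The Python sketch below applies it to the good and the bad FLIGHTS decompositions from the earlier slides; the function name `is_lossless_binary` and the frozenset representation are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(a1, a2, fds):
    """Lossless iff the shared attributes determine all of A1 \\ A2 or all of A2 \\ A1."""
    determined = closure(a1 & a2, fds)
    return (a1 - a2) <= determined or (a2 - a1) <= determined


fds = [
    (frozenset({"flt#"}), frozenset({"airline"})),
    (frozenset({"flt#", "date"}), frozenset({"plane#"})),
]

# Good design: FLIGHTS-DATE-PLANE and FLIGHTS-AIRLINE share flt#.
good = (frozenset({"flt#", "date", "plane#"}), frozenset({"flt#", "airline"}))
# Bad design: FLIGHTS-AIRLINE and DATE-AIRLINE-PLANE share only airline.
bad = (frozenset({"flt#", "airline"}), frozenset({"date", "airline", "plane#"}))

print(is_lossless_binary(*good, fds))
print(is_lossless_binary(*bad, fds))
```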

Slide 21: How to guarantee preservation of FDs

Decompose relation R, with functional dependencies F, into relations R1, ..., Rk, with associated functional dependencies F1, ..., Fk.

The decomposition is dependency preserving iff:
F+ = (F1 ∪ ... ∪ Fk)+
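Computing F+ explicitly is expensive, so the test is usually done one dependency at a time: for each X → Y in F, grow X using only what each Ri can see, and check that Y is reached. The Python sketch below follows that standard textbook procedure, which is a choice made for this illustration rather than something the slide spells out; the function name and the example schemas are assumptions.

```python
def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves_dependencies(schemas, fds):
    """True if every FD in `fds` follows from the FDs projected onto `schemas`."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for attrs in schemas:
                # Attributes of this schema determined by the part of z it contains.
                gained = (closure(z & attrs, fds) & attrs) - z
                if gained:
                    z |= gained
                    changed = True
        if not rhs <= z:
            return False
    return True


fds = [
    (frozenset({"flt#"}), frozenset({"airline"})),
    (frozenset({"flt#", "date"}), frozenset({"plane#"})),
]
good = [frozenset({"flt#", "date", "plane#"}), frozenset({"flt#", "airline"})]
bad = [frozenset({"flt#", "airline"}), frozenset({"date", "airline", "plane#"})]

print(preserves_dependencies(good, fds))
print(preserves_dependencies(bad, fds))
```

For the bad decomposition the check fails on (flt#, date) → plane#, which is the dependency loss described on the earlier slide.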

Slide 22: Overview of NFs

[Figure: the normal forms drawn nested, from outermost to innermost: NF², 1NF, 2NF, 3NF, BCNF.]

Slide 23: Normal Forms - definitions

NF²: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute (or set of attributes) on which some other attribute is fully functionally dependent.

Slide 24: Example of Normalization

FLT-INSTANCE(flt#, date, plane#, airline, from, to, miles)

[Figure: the FLT-INSTANCE attributes with their functional dependencies drawn as arrows.]

Slide 25: Example of Normalization

1NF:
(flt#, date, plane#, airline, from, to, miles)

2NF:
(flt#, date, plane#)
(flt#, airline, from, to, miles)

3NF & BCNF:
(flt#, date, plane#)
(flt#, airline, from, to)
(from, to, miles)

Slide 26: 3NF that is not BCNF

R(A, B, C)
Candidate keys: {A,B} and {A,C}
Determinants: {A,B} and {C}

A decomposition:
R1(C, B)
R2(A, C)

Lossless, but not dependency preserving!
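The claims on this slide can be checked mechanically. The sketch below assumes the FDs AB → C and C → B, which is what the listed determinants suggest but is not stated explicitly in the transcript. The helper names are hypothetical, and the candidate-key search is brute force, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure covers the schema (tried smallest first)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            if closure(x, fds) >= schema and not any(k <= x for k in keys):
                keys.append(x)
    return keys


def is_bcnf(schema, fds):
    """Every nontrivial FD must have a superkey on its left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) >= schema for lhs, rhs in fds)


def is_3nf(schema, fds):
    """As BCNF, except that right-hand sides made of prime attributes are allowed."""
    prime = set().union(*candidate_keys(schema, fds))
    return all(closure(lhs, fds) >= schema or (rhs - lhs) <= prime
               for lhs, rhs in fds)


R = frozenset("ABC")
# Assumed FDs: AB -> C and C -> B.
fds = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]

print([sorted(k) for k in candidate_keys(R, fds)])
print(is_3nf(R, fds), is_bcnf(R, fds))

# Lossless test for R1(C, B) and R2(A, C): the shared attribute C determines B.
r1, r2 = frozenset("CB"), frozenset("AC")
print((r1 - r2) <= closure(r1 & r2, fds))
```

Under the assumed FDs, C → B violates BCNF because C is not a superkey, yet it is acceptable in 3NF because B belongs to the candidate key {A,B}.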

Slide 27: Major Results in Normalization Theory

Theorem: There is an algorithm for testing a decomposition for lossless join wrt. a set of FDs.
Theorem: There is an algorithm for testing a decomposition for dependency preservation.
Theorem: There is an algorithm for lossless join decomposition into BCNF.
Theorem: There is an algorithm for dependency preserving decomposition into 3NF.
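To make the third theorem concrete, here is a sketch of the usual textbook BCNF decomposition: find a dependency X → Y whose left-hand side is not a superkey, split the schema into X ∪ Y and the schema without Y, and recurse. This is not necessarily the algorithm the slides refer to. The function names are hypothetical, the projection step is exponential, and the FDs for FLT-INSTANCE are an assumption inferred from the decomposition shown in the normalization example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return X+, the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project(fds, schema):
    """Nontrivial FDs X -> (X+ within schema) for every proper subset X of schema."""
    projected = []
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            y = (closure(x, fds) & schema) - x
            if y:
                projected.append((x, frozenset(y)))
    return projected


def bcnf_decompose(schema, fds):
    """Split on the first BCNF violation found; each split is a lossless join."""
    for lhs, rhs in project(fds, schema):
        if not closure(lhs, fds) >= schema:  # lhs is not a superkey of schema
            return (bcnf_decompose(frozenset(lhs | rhs), fds)
                    + bcnf_decompose(frozenset(schema - rhs), fds))
    return [schema]


flt_instance = frozenset({"flt#", "date", "plane#", "airline", "from", "to", "miles"})
# Assumed FDs for the example.
fds = [
    (frozenset({"flt#", "date"}), frozenset({"plane#"})),
    (frozenset({"flt#"}), frozenset({"airline", "from", "to"})),
    (frozenset({"from", "to"}), frozenset({"miles"})),
]

for relation in bcnf_decompose(flt_instance, fds):
    print(sorted(relation))
```

With the assumed FDs, the sketch arrives at the same three relations as the 3NF & BCNF step of the normalization example. In general the result depends on which violation is picked first, and it need not preserve all dependencies.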