

Normalization CMSC 461 Michael Wilson

Anomalies
- Poor relational database design can lead to anomalies
- Anomalies that we tend to encounter:
  - Redundancy: data is repeated unnecessarily in several tuples
  - Update anomalies: data is updated in one place but not another
  - Deletion anomalies: deleting data from a relation unintentionally removes other data along with it

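Redundancy

| title | year | length | genre | studio | star |
|---|---|---|---|---|---|
| Star Wars | 1977 | | SciFi | Fox | Carrie Fisher |
| Star Wars | 1977 | | SciFi | Fox | Mark Hamill |
| Star Wars | 1977 | | SciFi | Fox | Harrison Ford |
| Gone With the Wind | 1939 | | Drama | MGM | Vivien Leigh |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Dana Carvey |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Mike Myers |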


Redundancy
- The movie title, year, length, genre, and studio name are repeated every time we add a new star to the relation
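To make the repetition concrete, here is a minimal Python sketch that counts how many extra copies of each movie's facts the single-relation design stores. The tuple layout and the function name `redundant_copies` are assumptions made for illustration, and the length column is left out because most of its values are not available in the transcript.

```python
from collections import Counter

# Rows are (title, year, genre, studio, star); length is omitted in this sketch.
MOVIES = [
    ("Star Wars", 1977, "SciFi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, "SciFi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, "SciFi", "Fox", "Harrison Ford"),
    ("Gone With the Wind", 1939, "Drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, "Comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, "Comedy", "Paramount", "Mike Myers"),
]


def redundant_copies(rows):
    """Return {title: number of extra copies of that movie's facts}."""
    counts = Counter(row[:-1] for row in rows)  # everything except the star
    return {facts[0]: n - 1 for facts, n in counts.items() if n > 1}


print(redundant_copies(MOVIES))  # {'Star Wars': 2, "Wayne's World": 1}
```

Every extra copy is a place where the same fact has to be kept in sync by hand.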

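Update anomaly

| title | year | length | genre | studio | star |
|---|---|---|---|---|---|
| Star Wars | 1977 | | SciFi | Fox | Carrie Fisher |
| Star Wars | 1977 | | SciFi | Fox | Mark Hamill |
| Star Wars | 1977 | | SciFi | Fox | Harrison Ford |
| Gone With the Wind | 1939 | | Drama | MGM | Vivien Leigh |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Dana Carvey |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Mike Myers |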

Update anomaly
- We changed the film length for one of the entries
- Forgot to update it for the rest of the entries
- Now the data is inconsistent
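The following Python sketch reproduces this situation on the two Wayne's World rows. The replacement length of 94 and the helper name `inconsistent_lengths` are hypothetical; only the original length of 95 comes from the slides.

```python
from collections import defaultdict

rows = [
    {"title": "Wayne's World", "year": 1992, "length": 95, "star": "Dana Carvey"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "star": "Mike Myers"},
]


def inconsistent_lengths(rows):
    """Return movies that are stored with more than one distinct length."""
    lengths = defaultdict(set)
    for row in rows:
        lengths[(row["title"], row["year"])].add(row["length"])
    return {movie: seen for movie, seen in lengths.items() if len(seen) > 1}


before = inconsistent_lengths(rows)  # nothing wrong yet
rows[0]["length"] = 94               # hypothetical edit applied to one row only
after = inconsistent_lengths(rows)   # the same movie now has two lengths
print(before, after)
```

If the length were stored once per movie, a single update could not leave the data in this state.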

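Delete anomaly

| title | year | length | genre | studio | star |
|---|---|---|---|---|---|
| Star Wars | 1977 | | SciFi | Fox | Carrie Fisher |
| Star Wars | 1977 | | SciFi | Fox | Mark Hamill |
| Star Wars | 1977 | | SciFi | Fox | Harrison Ford |
| Gone With the Wind | 1939 | | Drama | MGM | Vivien Leigh |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Dana Carvey |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Mike Myers |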

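Delete anomaly

| title | year | length | genre | studio | star |
|---|---|---|---|---|---|
| Star Wars | 1977 | | SciFi | Fox | Carrie Fisher |
| Star Wars | 1977 | | SciFi | Fox | Mark Hamill |
| Star Wars | 1977 | | SciFi | Fox | Harrison Ford |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Dana Carvey |
| Wayne's World | 1992 | 95 | Comedy | Paramount | Mike Myers |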

Delete anomaly
- Say we wanted to remove Vivien Leigh as a star
- We unintentionally removed all information about Gone With the Wind from our relation
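The same effect can be sketched in Python. The dictionary layout is an assumption for illustration, and the length column is omitted because it plays no part in the deletion.

```python
COLUMNS = ("title", "year", "genre", "studio", "star")
MOVIES = [dict(zip(COLUMNS, values)) for values in [
    ("Star Wars", 1977, "SciFi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, "SciFi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, "SciFi", "Fox", "Harrison Ford"),
    ("Gone With the Wind", 1939, "Drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, "Comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, "Comedy", "Paramount", "Mike Myers"),
]]

# Remove Vivien Leigh as a star: the only option is to drop her whole row.
remaining = [row for row in MOVIES if row["star"] != "Vivien Leigh"]

# Compare which movies the relation still knows about.
titles_before = {row["title"] for row in MOVIES}
titles_after = {row["title"] for row in remaining}
lost = titles_before - titles_after

print(lost)  # {'Gone With the Wind'}
```

The year, genre, and studio of the movie disappear together with its only star.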

How do we address these anomalies?
- Same answer for all three
- Decomposing relations
  - Split the attributes of a relation to make two new relations
  - Relatively simple operation

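Decomposing a relation
- A relation $R(A_1, A_2, \ldots, A_n)$ can be decomposed into relations $S(B_1, B_2, \ldots, B_m)$ and $T(C_1, C_2, \ldots, C_k)$ such that:
  - $\{A_1, A_2, \ldots, A_n\} = \{B_1, B_2, \ldots, B_m\} \cup \{C_1, C_2, \ldots, C_k\}$
  - $S = \pi_{B_1, B_2, \ldots, B_m}(R)$
  - $T = \pi_{C_1, C_2, \ldots, C_k}(R)$
- In other words, project R onto one subset of its attributes to get S and onto another subset to get T, so that every attribute of R lands in at least one of them
- The two subsets may overlap; shared attributes are what let S and T be joined back together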
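As a rough illustration, the projection operator can be sketched in Python as "keep the chosen attributes, then drop duplicate tuples". The function name `project` is an assumption, and the lengths 124 and 231 are placeholder values chosen for this example; only the length 95 appears in the slides.

```python
COLUMNS = ("title", "year", "length", "genre", "studio", "star")
# Lengths 124 and 231 are assumed placeholder values, not taken from the slides.
MOVIES = [dict(zip(COLUMNS, values)) for values in [
    ("Star Wars", 1977, 124, "SciFi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "SciFi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "SciFi", "Fox", "Harrison Ford"),
    ("Gone With the Wind", 1939, 231, "Drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, 95, "Comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "Comedy", "Paramount", "Mike Myers"),
]]


def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


# S and T from the definition; title and year appear in both projections.
movies2 = project(MOVIES, ("title", "year", "length", "genre", "studio"))
movies3 = project(MOVIES, ("title", "year", "star"))

print(len(movies2), len(movies3))  # 3 6
```

Dropping duplicates is what removes the redundancy: each movie's facts collapse to one row in `movies2`.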

Decomposing our movies relation: movies2

| title | year | length | genre | studio |
|---|---|---|---|---|
| Star Wars | 1977 | | SciFi | Fox |
| Gone With the Wind | 1939 | | Drama | MGM |
| Wayne's World | 1992 | 95 | Comedy | Paramount |

Decomposing our movies relation: movies3

| title | year | star |
|---|---|---|
| Star Wars | 1977 | Carrie Fisher |
| Star Wars | 1977 | Mark Hamill |
| Star Wars | 1977 | Harrison Ford |
| Gone With the Wind | 1939 | Vivien Leigh |
| Wayne's World | 1992 | Dana Carvey |
| Wayne's World | 1992 | Mike Myers |
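One way to check that nothing was lost by splitting the relation is to join movies2 and movies3 back together on their shared attributes (title and year). The sketch below assumes a simple nested-loop natural join; `natural_join` is a hypothetical helper, and the lengths 124 and 231 are placeholder values not found in the slides.

```python
# Lengths 124 and 231 are assumed placeholder values, not taken from the slides.
MOVIES2 = [
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "SciFi", "studio": "Fox"},
    {"title": "Gone With the Wind", "year": 1939, "length": 231, "genre": "Drama", "studio": "MGM"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "genre": "Comedy", "studio": "Paramount"},
]
MOVIES3 = [
    {"title": "Star Wars", "year": 1977, "star": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "star": "Mark Hamill"},
    {"title": "Star Wars", "year": 1977, "star": "Harrison Ford"},
    {"title": "Gone With the Wind", "year": 1939, "star": "Vivien Leigh"},
    {"title": "Wayne's World", "year": 1992, "star": "Dana Carvey"},
    {"title": "Wayne's World", "year": 1992, "star": "Mike Myers"},
]


def natural_join(left, right):
    """Join two lists of dict rows on every attribute name they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


rejoined = natural_join(MOVIES2, MOVIES3)
print(len(rejoined))  # 6
```

The join yields the six rows of the original relation, while each movie's length, genre, and studio are now stored only once.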

Further decomposition?
- There are still some anomaly possibilities in these resulting relations
- Also, we would like to be able to handle movies named the same thing in the same year
- How could we decompose them further to avoid this?

Normalization
- There are normal forms that can be applied to relation schemas and databases
- Organizing the attributes to reduce redundancy and dependency on one another
- Done primarily through decomposing relations
- Many different normal forms

Normal forms
- You'll see NF used to mean "normal form"
- 1NF
  - Attributes are atomic; that is, their values cannot be decomposed any further
- 2NF
  - Satisfies 1NF, and every non-prime attribute depends on the whole of every candidate key, not on a proper subset of one
  - Non-prime attributes are attributes that are not part of any candidate key
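The 2NF condition can be checked mechanically once the functional dependencies are written down. The sketch below assumes, for illustration only, that the original movies relation has the single candidate key {title, year, star} and the single FD title, year → length, genre, studio; neither is stated in the slides. `closure` and `partial_dependencies` are hypothetical helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, all_attrs, fds):
    """Map proper subsets of key to the non-prime attributes they determine.

    Assumes key is the only candidate key, so non-prime = all_attrs - key.
    """
    non_prime = set(all_attrs) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_prime
            if determined:
                found[subset] = determined
    return found


ATTRS = {"title", "year", "length", "genre", "studio", "star"}
KEY = {"title", "year", "star"}                                 # assumed key
FDS = [({"title", "year"}, {"length", "genre", "studio"})]      # assumed FD

print(partial_dependencies(KEY, ATTRS, FDS))
```

Under these assumptions, length, genre, and studio depend on only part of the key, so the single-relation design is not in 2NF; splitting it into movies2 and movies3 removes that partial dependency.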

Third normal form (3NF)
- The relation satisfies 2NF
- Every non-prime attribute is directly dependent on every candidate key of the relation
- Directly dependent means that the dependency is not transitive
- If you have two FDs:
  - A→B
  - B→C
  then A→C is a transitive dependency
- In other words, a relation with key A should contain only A and attributes that depend directly on A
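A transitive dependency like the one above can be spotted with a commonly used equivalent test: a listed FD X→Y breaks 3NF when X is not a superkey and Y contains a non-prime attribute outside X. The sketch below applies that test to a hypothetical relation R(A, B, C) with candidate key A; the helper names are assumptions, and only the FDs passed in are examined.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(all_attrs, candidate_keys, fds):
    """Return (lhs, offending attributes) for each listed FD that breaks 3NF."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(all_attrs)
        offending = rhs - lhs - prime  # non-prime attributes determined by lhs
        if not is_superkey and offending:
            violations.append((lhs, offending))
    return violations


ATTRS = {"A", "B", "C"}
KEYS = [{"A"}]                          # A is the only candidate key
FDS = [({"A"}, {"B"}), ({"B"}, {"C"})]  # the two FDs from the slide

print(third_nf_violations(ATTRS, KEYS, FDS))  # [({'B'}, {'C'})]
```

B→C is reported because C reaches the key only through B. Decomposing into R1(A, B) and R2(B, C) leaves no violation in either relation.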

Bill Kent's pledge
- 3NF: "Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key."
  - 1NF: the key exists
  - 2NF: non-key attributes are dependent on the whole key
  - 3NF: non-key attributes are dependent on nothing but the key