Systems Analysis & Design Methods III Classic normalization rules for relational databases

Presentation transcript:

2 Systems Analysis III Normalization Rules
Contents
- Introduction
- First Normal Form: Columns should not repeat.
- Second Normal Form: Non-key columns should depend on the whole primary key, not just on a part.
- Third Normal Form: Non-key columns should not depend on other non-key columns.
Note: Excellent reading:

3 Introduction
By using normalization rules, we try to
- avoid inconsistencies,
- avoid wasted space,
- enable flexible use of data (easy SQL queries),
- minimize the effect of application changes on the database structure.
The normalization rules are to be read cumulatively: e.g. your database is in 3NF if it complies with the rules of 1NF, 2NF and 3NF.

4 Some vocabulary
- candidate key = a minimal combination of columns that uniquely determines a table row
- primary key = the candidate key chosen to identify the rows of the table
- alternate key = a candidate key not chosen as the primary key
- foreign key = a column (or combination of columns) that holds the primary key of another table; it is used to reference a specific row in that other table
With non-key I mean a column that is not part of any candidate key (primary or alternate).
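As a rough illustration of this vocabulary, the sketch below declares each kind of key in SQL, run through Python's built-in sqlite3 module. The table names, the ssn column and all values are assumptions made up for the example. They only loosely follow the applicant tables used on the later slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE applicant (
    appl_id   INTEGER PRIMARY KEY,   -- primary key: the chosen candidate key
    ssn       TEXT NOT NULL UNIQUE,  -- alternate key: candidate key not chosen
    appl_name TEXT NOT NULL          -- non-key column
);
CREATE TABLE applicant_reference (
    appl_id   INTEGER NOT NULL REFERENCES applicant(appl_id),  -- foreign key
    ref_phone TEXT NOT NULL,
    PRIMARY KEY (appl_id, ref_phone)  -- composite primary key
);
""")
# Hypothetical sample row; the SSN value is invented.
conn.execute("INSERT INTO applicant VALUES (1237, '000-00-0001', 'Smithers')")


def rejected(sql):
    """Return True if the database refuses the statement as a key violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same primary key value twice.
dup_primary = rejected("INSERT INTO applicant VALUES (1237, '000-00-0002', 'Simpson')")
# Same alternate key value twice.
dup_alternate = rejected("INSERT INTO applicant VALUES (1238, '000-00-0001', 'Simpson')")
# Foreign key pointing at an applicant that does not exist.
dangling_fk = rejected("INSERT INTO applicant_reference VALUES (9999, '555-0100')")
```

All three statements are expected to be refused, since each one breaks one of the key rules declared in the schema.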

5 First Normal Form: Columns should not repeat
This means that you are not allowed to store an array or a collection of the same kind of information in one table row. Such an attempt can take two forms, which result in two subrules:
- You cannot have several columns holding similar information: e.g. 3 columns child1, child2, child3 (see also next slide).
- Nor can you put multiple values in one column: e.g. 1 column children which contains a string of concatenated first names like ‘David-Ben-Joe’.

6 First Normal Form: Violation example
Appl_id   Appl_Name   Refphone1   Refphone2
1237      Smithers
1238      Simpson

7 First Normal Form: Problems when violating 1NF
- Every time more repeated fields are needed, the structure of the table changes, and existing code and queries have to be rewritten.
- The different columns have to be named explicitly when querying or programming.
- Rows that do not need many contacts waste space.
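The second and third problems can be seen in a small sketch, assuming a table shaped like the violation example and run with Python's sqlite3 module. The phone numbers are invented placeholders, and the helper applicants_with_reference is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE applicant (
        appl_id   INTEGER PRIMARY KEY,
        appl_name TEXT,
        refphone1 TEXT,   -- repeated column (violates 1NF)
        refphone2 TEXT    -- repeated column (violates 1NF)
    )
""")
conn.executemany(
    "INSERT INTO applicant VALUES (?, ?, ?, ?)",
    [
        (1237, "Smithers", "555-0101", "555-0102"),  # made-up phone numbers
        (1238, "Simpson", "555-0103", None),         # second slot stays empty
    ],
)


def applicants_with_reference(phone):
    """Find applicants who list the given reference phone."""
    # Every repeated column has to be named explicitly; adding a
    # refphone3 column would force this query to be rewritten.
    rows = conn.execute(
        "SELECT appl_name FROM applicant WHERE refphone1 = ? OR refphone2 = ?",
        (phone, phone),
    )
    return [name for (name,) in rows]
```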

8 First Normal Form: Solution
Appl_id   Appl_Name
1237      Smithers
1238      Simpson

Cand_id   RefPhone
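A minimal sketch of this two-table layout, again using Python's sqlite3 module. The table name applicant_reference and the phone numbers are assumptions, since the slide gives neither. With one row per reference, an applicant can have any number of references without a change to the table structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applicant (
    appl_id   INTEGER PRIMARY KEY,
    appl_name TEXT
);
CREATE TABLE applicant_reference (
    cand_id   INTEGER REFERENCES applicant(appl_id),
    ref_phone TEXT,
    PRIMARY KEY (cand_id, ref_phone)
);
""")
conn.executemany(
    "INSERT INTO applicant VALUES (?, ?)",
    [(1237, "Smithers"), (1238, "Simpson")],
)
# Made-up phone numbers: three references for one applicant, one for the other.
conn.executemany(
    "INSERT INTO applicant_reference VALUES (?, ?)",
    [(1237, "555-0101"), (1237, "555-0102"), (1237, "555-0104"), (1238, "555-0103")],
)


def reference_counts():
    """Return (applicant name, number of references) ordered by applicant id."""
    rows = conn.execute("""
        SELECT a.appl_name, COUNT(r.ref_phone)
        FROM applicant AS a
        LEFT JOIN applicant_reference AS r ON r.cand_id = a.appl_id
        GROUP BY a.appl_id
        ORDER BY a.appl_id
    """)
    return rows.fetchall()
```

The query names no numbered phone columns, so it keeps working when an applicant gains a further reference.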

9 Second Normal Form: Non-key columns should depend on the whole primary key, not just on a part.
A field y depends on a field x if there is only one possible value for y, given a value for x.
E.g. in the next slide: the applicant table at the top of the slide has the primary key Appl_id. The applicant/reference table below it tells you which applicants have which references; its primary key is Appl_id + RefPhone. Within the applicant/reference table we can say this: when you know the value of the Appl_id column, you know which Appl_Name goes with it. So the Appl_Name field is completely dependent on Appl_id alone, which is only a part of the primary key. That is why we say the database violates 2NF.
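The definition of "depends on" can be tried out on sample rows with a few lines of Python. This is only a sketch: the function depends_on is a hypothetical helper, the phone numbers are invented, and a real dependency is a rule about all data the table may ever hold, which a check on sample rows can refute but never prove.

```python
def depends_on(rows, x, y):
    """Return True if, in these rows, each value of x goes with only one value of y."""
    seen = {}
    for row in rows:
        # Remember the first y seen for this x; any different y breaks the dependency.
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True


# Rows shaped like the applicant/reference table (made-up phone numbers).
rows = [
    {"appl_id": 1237, "appl_name": "Smithers", "ref_phone": "555-0101"},
    {"appl_id": 1237, "appl_name": "Smithers", "ref_phone": "555-0102"},
    {"appl_id": 1238, "appl_name": "Simpson", "ref_phone": "555-0103"},
]

name_depends_on_id = depends_on(rows, "appl_id", "appl_name")
phone_depends_on_id = depends_on(rows, "appl_id", "ref_phone")
```

Here name_depends_on_id is True, because Appl_Name follows from a part of the key, while phone_depends_on_id is False.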

10 Second Normal Form: Violation example
Appl_id   Appl_Name
1237      Smithers
1238      Simpson

Appl_id   Appl_Name   RefPhone
1237      Smithers
1237      Smithers
1238      Simpson
1238      Simpson

11 Second Normal Form: Problems when violating 2NF
- The part of the primary key on which the column depends is usually a foreign key. This means that the dependent column contains information that is probably already available in the record (in another table) to which this foreign key points. So the dependent column contains copied information (from another table) that needs to be kept consistent with the original. (Danger of anomalies.)

12 Second Normal Form: Problems when violating 2NF
- The part of the primary key on which the column depends probably contains the same value in several rows. This means that the dependent column contains the same value in each of those rows. So the dependent column contains copied information (from the same table) that needs to be kept consistent with the original.
- Copying information (see above) is a waste of space.
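The consistency risk can be reproduced in a small sketch with Python's sqlite3 module, assuming a table shaped like the 2NF violation example. The phone numbers, the misspelled name and the helper inconsistent_applicants are all invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE applicant_reference (
        appl_id   INTEGER,
        appl_name TEXT,      -- depends on appl_id only (violates 2NF)
        ref_phone TEXT,
        PRIMARY KEY (appl_id, ref_phone)
    )
""")
conn.executemany(
    "INSERT INTO applicant_reference VALUES (?, ?, ?)",
    [
        (1237, "Smithers", "555-0101"),  # made-up phone numbers
        (1237, "Smithers", "555-0102"),
        (1238, "Simpson", "555-0103"),
    ],
)

# A careless update changes only one of the two copies of the name.
conn.execute(
    "UPDATE applicant_reference SET appl_name = 'Smythers' "
    "WHERE appl_id = 1237 AND ref_phone = '555-0101'"
)


def inconsistent_applicants():
    """Return the ids of applicants that now carry more than one name."""
    rows = conn.execute("""
        SELECT appl_id
        FROM applicant_reference
        GROUP BY appl_id
        HAVING COUNT(DISTINCT appl_name) > 1
    """)
    return [appl_id for (appl_id,) in rows]
```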

13 Second Normal Form: Solution
Appl_id   Appl_Name
1237      Smithers
1238      Simpson

Appl_id   RefPhone
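In this layout each name is stored once, so a rename is a single-row update and a join brings the name back wherever it is needed. The sketch below assumes the table name applicant_reference and made-up phone numbers; the new name is invented as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applicant (
    appl_id   INTEGER PRIMARY KEY,
    appl_name TEXT
);
CREATE TABLE applicant_reference (
    appl_id   INTEGER REFERENCES applicant(appl_id),
    ref_phone TEXT,
    PRIMARY KEY (appl_id, ref_phone)   -- no column depends on just a part of it
);
""")
conn.executemany(
    "INSERT INTO applicant VALUES (?, ?)",
    [(1237, "Smithers"), (1238, "Simpson")],
)
conn.executemany(
    "INSERT INTO applicant_reference VALUES (?, ?)",
    [(1237, "555-0101"), (1237, "555-0102"), (1238, "555-0103")],
)

# The name lives in exactly one row, so one update changes it everywhere.
conn.execute("UPDATE applicant SET appl_name = 'Smythers' WHERE appl_id = 1237")


def names_by_phone():
    """Return (reference phone, applicant name) pairs ordered by phone."""
    rows = conn.execute("""
        SELECT r.ref_phone, a.appl_name
        FROM applicant_reference AS r
        JOIN applicant AS a ON a.appl_id = r.appl_id
        ORDER BY r.ref_phone
    """)
    return rows.fetchall()
```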

14 Third Normal Form: Non-key columns should not depend on other non-key columns.
Violation example:
Client_id   Client_Name   Zip      City
1           Smith         B-1000   Brussels
2           Jones         B-2000   Antwerp
3           Vacarello     B-2000   Antwerp
4           Peters        B-3000   Leuven

15 Third Normal Form: Problems when violating 3NF
- The dependent column contains information that is also available in other rows of the same table. So the dependent column contains copied information that needs to be kept consistent with the original. (Danger of update anomalies.)
- Copied information wastes space.

16 Third Normal Form: Solution
Client_id   Client_Name   Zip
1           Smith         B-1000
2           Jones         B-2000
3           Vacarello     B-2000
4           Peters        B-3000

Zip      City
B-1000   Brussels
B-2000   Antwerp
B-3000   Leuven
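The split loses no information: joining the two tables on Zip gives back the rows of the violation example. A minimal sketch with Python's sqlite3 module follows; the data is taken from the slides, while the table names client and zip_city are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zip_city (
    zip  TEXT PRIMARY KEY,
    city TEXT NOT NULL          -- each city name is stored once per zip
);
CREATE TABLE client (
    client_id   INTEGER PRIMARY KEY,
    client_name TEXT NOT NULL,
    zip         TEXT NOT NULL REFERENCES zip_city(zip)
);
""")
conn.executemany(
    "INSERT INTO zip_city VALUES (?, ?)",
    [("B-1000", "Brussels"), ("B-2000", "Antwerp"), ("B-3000", "Leuven")],
)
conn.executemany(
    "INSERT INTO client VALUES (?, ?, ?)",
    [(1, "Smith", "B-1000"), (2, "Jones", "B-2000"),
     (3, "Vacarello", "B-2000"), (4, "Peters", "B-3000")],
)


def clients_with_city():
    """Rebuild the original four-column rows by joining on zip."""
    rows = conn.execute("""
        SELECT c.client_id, c.client_name, c.zip, z.city
        FROM client AS c
        JOIN zip_city AS z ON z.zip = c.zip
        ORDER BY c.client_id
    """)
    return rows.fetchall()
```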

17 Third Normal Form: Remarks
Columns may be dependent on alternate keys. So they may be dependent on a field or combination of fields that uniquely defines a record, but that was not chosen as the primary key.
E.g. on the next slide Appl_Name is dependent on the alternate key. This is okay, since an alternate key uniquely defines the whole record. (In other words, we could just as well have chosen this alternate key as the primary key.)

18 Third Normal Form: Remarks
Columns: Appl_id, Social_Security_Number, Appl_Name
Rows: Smithers, Simpson
Labels: alternate key, primary key
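Whether a column combination can serve as a key in a given set of rows can be checked with a short Python sketch. Everything in it is an assumption made for illustration: the helper uniquely_identifies, the social security numbers, and the third applicant, who is added only so that two rows share a name. As with any check on sample rows, it can rule a key out but cannot prove that it holds for all future data.

```python
def uniquely_identifies(rows, columns):
    """Return True if no two rows share the same values for the given columns."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


# Hypothetical applicant rows; the SSN values are invented.
rows = [
    {"appl_id": 1237, "ssn": "000-00-0001", "appl_name": "Smithers"},
    {"appl_id": 1238, "ssn": "000-00-0002", "appl_name": "Simpson"},
    {"appl_id": 1239, "ssn": "000-00-0003", "appl_name": "Simpson"},
]

id_is_key = uniquely_identifies(rows, ["appl_id"])      # one candidate key
ssn_is_key = uniquely_identifies(rows, ["ssn"])         # the other candidate key
name_is_key = uniquely_identifies(rows, ["appl_name"])  # two rows share a name
```

Both appl_id and ssn identify every row here, so either could have been chosen as the primary key, and a dependency of Appl_Name on either of them is acceptable under 3NF.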