Chapter 5 Normalization of Database Tables Database Systems: Design, Implementation, and Management Peter Rob & Carlos Coronel.

Presentation transcript:

Chapter 5 Normalization of Database Tables Database Systems: Design, Implementation, and Management Peter Rob & Carlos Coronel

In this chapter, you will learn:
- What normalization is and what role it plays in database design
- About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF
- How normal forms can be transformed from lower normal forms to higher normal forms
- That normalization and E-R modeling are used concurrently to produce a good database design
- That some situations require denormalization to generate information efficiently

Database Tables and Normalization
- Normalization
  - Process for evaluating and correcting table structures to minimize data redundancies
  - Process for assigning attributes to tables. It
    - reduces data redundancies
    - helps eliminate data anomalies
    - produces controlled redundancies to link tables
- Normalization works through a series of stages called normal forms:
  - First normal form (1NF)
  - Second normal form (2NF)
  - Third normal form (3NF)
  - Fourth normal form (4NF)

Database Tables and Normalization
- Normalization
  - 2NF is structurally better than 1NF, and 3NF is structurally better than 2NF.
  - For most business database design purposes, 3NF is as high as we need to go in the normalization process.
- The highest level of normalization is not always the most desirable.

Database Tables and Normalization
- The Need for Normalization
  - Case of a construction company
    - Building project: project number, name, employees assigned to the project.
    - Employee: employee number, name, job classification.
    - The company charges its clients by billing the hours spent on each project. The hourly billing rate depends on the employee's position.
    - Periodically, a report is generated (Table 5.1).
    - The easiest way to generate the required report might seem to be a table whose contents correspond to the reporting requirements (Figure 5.1).

Database Tables and Normalization
- Need for Normalization: problems with the table in Figure 5.1
  - The project number is intended to be a primary key, but it contains nulls.
  - The table displays data redundancies.
  - The table entries invite data inconsistencies.
  - The data redundancies yield the following anomalies:
    - Update anomalies: modifying JOB_CLASS for employee 105 requires a change in every row in which that employee appears.
    - Insertion anomalies: a new employee who is not yet assigned to a project cannot be entered without a project number.
    - Deletion anomalies: if employee 103 quits, deleting that employee's rows can also delete other data stored in those rows.
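The update and deletion anomalies can be reproduced on a tiny in-memory table. This is only a sketch: the rows, project numbers, and job classes below are made up for illustration and are not the contents of Figure 5.1.

```python
# Hypothetical rows of an unnormalized report table (not the data of Figure 5.1).
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 105, "JOB_CLASS": "Engineer"},
    {"PROJ_NUM": 18, "EMP_NUM": 105, "JOB_CLASS": "Engineer"},
    {"PROJ_NUM": 18, "EMP_NUM": 103, "JOB_CLASS": "Analyst"},
]

# Update anomaly: the job class of employee 105 is changed in one row only,
# so the table now holds two different job classes for the same employee.
rows[0]["JOB_CLASS"] = "Manager"
classes_for_105 = {r["JOB_CLASS"] for r in rows if r["EMP_NUM"] == 105}

# Deletion anomaly: employee 103 quits and the row is deleted; the fact that
# an "Analyst" job class exists disappears together with the employee.
remaining = [r for r in rows if r["EMP_NUM"] != 103]
analyst_still_known = any(r["JOB_CLASS"] == "Analyst" for r in remaining)

print(sorted(classes_for_105))
print(analyst_still_known)
```

Both problems come from storing the same fact in more than one row, which is what the normalization steps that follow are meant to remove.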

The Normalization Process

Database Tables and Normalization
- Conversion to 1NF
  - Repeating group: a group of multiple entries that exists for a single key attribute occurrence.
  - Here, any project number (PROJ_NUM) can have a group of several data entries.
  - A relational table must not contain repeating groups, so they must be eliminated.

- Step 1: Eliminate the repeating groups
  - Repeating groups can be eliminated by adding the appropriate entry in at least the primary key column(s).
- Step 2: Identify the primary key
  - The primary key must uniquely identify the attribute values (rows).
  - Here it is the combination of PROJ_NUM and EMP_NUM.
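Steps 1 and 2 can be sketched in a few lines of Python. The rows are hypothetical (they are not the entries of Figure 5.1), and `fill_repeating_groups` is an illustrative helper name, not something from the text. `None` stands for a blank project cell inside a repeating group.

```python
# Hypothetical report rows: (PROJ_NUM, PROJ_NAME, EMP_NUM, HOURS).
raw_rows = [
    (15, "Project A", 103, 23.8),
    (None, None, 101, 19.4),
    (None, None, 105, 35.7),
    (18, "Project B", 114, 24.6),
    (None, None, 103, 45.3),
]

def fill_repeating_groups(rows):
    """Copy the last seen project number and name into rows that left them blank."""
    filled = []
    current_num, current_name = None, None
    for proj_num, proj_name, emp_num, hours in rows:
        if proj_num is not None:
            current_num, current_name = proj_num, proj_name
        filled.append((current_num, current_name, emp_num, hours))
    return filled

rows_1nf = fill_repeating_groups(raw_rows)

# Step 2: the combination (PROJ_NUM, EMP_NUM) should now identify each row.
keys = [(row[0], row[2]) for row in rows_1nf]
print(len(keys) == len(set(keys)))
```

Note that employee 103 appears under both projects in this sample, so EMP_NUM alone would not be unique; only the combination is.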

Database Tables and Normalization
- Step 3: Identify all dependencies
  - Dependency diagram
    - The primary key components are bold, underlined, and shaded in a different color.
    - The arrows above the attributes indicate all desirable dependencies (dependencies based on the PK).
    - The arrows below the dependency diagram indicate less desirable dependencies:
      - partial dependencies (dependencies based on only a part of the PK)
      - transitive dependencies (nonprime attribute → nonprime attribute)
    - Prime attribute = key attribute
    - Nonprime attribute = nonkey attribute

- EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR
- PROJ_NUM → PROJ_NAME
- JOB_CLASS → CHG_HOUR
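A dependency such as JOB_CLASS → CHG_HOUR can be checked against sample data: every value of the determinant must map to exactly one value of the dependent attributes. The sketch below uses made-up rows and a hypothetical helper called `holds`. A sample can only refute a dependency; whether it really holds is a business rule, not something the data can prove.

```python
# Hypothetical 1NF rows; the values are illustrative only.
rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 103,
     "EMP_NAME": "Emp 103", "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0, "HOURS": 10.0},
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 101,
     "EMP_NAME": "Emp 101", "JOB_CLASS": "Analyst", "CHG_HOUR": 95.0, "HOURS": 12.0},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B", "EMP_NUM": 103,
     "EMP_NAME": "Emp 103", "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0, "HOURS": 5.0},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B", "EMP_NUM": 105,
     "EMP_NAME": "Emp 105", "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0, "HOURS": 7.5},
]

def holds(rows, determinant, dependent):
    """Return True if no determinant value maps to two different dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        value = tuple(row[name] for name in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True

print(holds(rows, ["EMP_NUM"], ["EMP_NAME", "JOB_CLASS", "CHG_HOUR"]))
print(holds(rows, ["PROJ_NUM"], ["PROJ_NAME"]))
print(holds(rows, ["JOB_CLASS"], ["CHG_HOUR"]))
print(holds(rows, ["PROJ_NUM"], ["EMP_NUM"]))
```

The last call is a counterexample: a project number alone does not determine the employee, which is why the key has to be composite.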

Database Tables and Normalization
- 1NF definition
  - The term first normal form (1NF) describes the tabular format in which:
    - All the key attributes are defined.
    - There are no repeating groups in the table: each row/column intersection contains one and only one value, not a set of values.
    - All attributes are dependent on the primary key.
- All relational tables satisfy the 1NF requirements.
- 1NF drawback
  - Partial dependencies (EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR) lead to data redundancies, which in turn lead to data anomalies.

Database Tables and Normalization
- Conversion to 2NF
  - Step 1: Identify all key components
    - Write each key component on a separate line, and then write the original key on the last line:
      PROJ_NUM
      EMP_NUM
      PROJ_NUM, EMP_NUM
  - Step 2: Identify the dependent attributes
    - Write the dependent attributes after each new key:
      PROJECT (PROJ_NUM, PROJ_NAME)
      EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
      ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
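The split into PROJECT, EMPLOYEE, and ASSIGN amounts to three relational projections of the 1NF table. The following sketch uses made-up rows and a hypothetical `project` helper (the name refers to the relational operation, not the PROJECT table).

```python
COLUMNS = ("PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS")

# Hypothetical 1NF rows; the values are illustrative only.
ROWS_1NF = [
    (15, "Project A", 103, "Emp 103", "Engineer", 80.0, 10.0),
    (15, "Project A", 101, "Emp 101", "Analyst", 95.0, 12.0),
    (18, "Project B", 103, "Emp 103", "Engineer", 80.0, 5.0),
    (18, "Project B", 105, "Emp 105", "Engineer", 80.0, 7.5),
]

def project(rows, columns, wanted):
    """Relational projection: keep the wanted columns and drop duplicate rows."""
    positions = [columns.index(name) for name in wanted]
    result = []
    for row in rows:
        item = tuple(row[i] for i in positions)
        if item not in result:
            result.append(item)
    return result

project_tbl = project(ROWS_1NF, COLUMNS, ("PROJ_NUM", "PROJ_NAME"))
employee_tbl = project(ROWS_1NF, COLUMNS,
                       ("EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"))
assign_tbl = project(ROWS_1NF, COLUMNS, ("PROJ_NUM", "EMP_NUM", "HOURS"))

print(len(project_tbl), len(employee_tbl), len(assign_tbl))
```

Each project name and each employee's details are now stored once, while ASSIGN keeps one row per project/employee pair.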

Database Tables and Normalization
- 2NF definition
  - A table is in 2NF if:
    - It is in 1NF, and
    - It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.
  - A table whose primary key is not composite is automatically in 2NF, because a partial dependency can exist only if a table has a composite primary key.
  - It is still possible for a table in 2NF to exhibit a transitive dependency; that is, one or more attributes may be functionally dependent on nonkey attributes.
- 2NF drawback
  - Transitive dependencies (JOB_CLASS → CHG_HOUR) lead to data redundancies, which in turn lead to data anomalies.

Database Tables and Normalization
- Conversion to 3NF
  - Create a separate table for the attributes that are in a transitive functional dependence relationship.
    - Step 1: Identify each new determinant
      JOB_CLASS
    - Step 2: Identify the dependent attributes
      JOB_CLASS → CHG_HOUR
    - Step 3: Remove the dependent attributes from the transitive dependencies
      EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
      JOB (JOB_CLASS, CHG_HOUR)
      PROJECT (PROJ_NUM, PROJ_NAME)
      ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
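One way to realize the four 3NF structures is as SQL tables. The sketch below uses Python's built-in `sqlite3` module; the column types, the constraints, and the sample rows are assumptions added for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE JOB (
    JOB_CLASS TEXT PRIMARY KEY,
    CHG_HOUR  REAL NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_NAME  TEXT NOT NULL,
    JOB_CLASS TEXT NOT NULL REFERENCES JOB (JOB_CLASS)
);
CREATE TABLE PROJECT (
    PROJ_NUM  INTEGER PRIMARY KEY,
    PROJ_NAME TEXT NOT NULL
);
CREATE TABLE ASSIGN (
    PROJ_NUM INTEGER NOT NULL REFERENCES PROJECT (PROJ_NUM),
    EMP_NUM  INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    HOURS    REAL NOT NULL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO JOB VALUES ('Engineer', 80.0)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(103, "Emp 103", "Engineer"), (105, "Emp 105", "Engineer")])

# The hourly rate is stored once, so a single UPDATE changes it for everyone.
conn.execute("UPDATE JOB SET CHG_HOUR = 85.0 WHERE JOB_CLASS = 'Engineer'")
rates = conn.execute("""
    SELECT E.EMP_NUM, J.CHG_HOUR
    FROM EMPLOYEE E JOIN JOB J ON J.JOB_CLASS = E.JOB_CLASS
    ORDER BY E.EMP_NUM
""").fetchall()
print(rates)
```

With CHG_HOUR moved into JOB, the update anomaly caused by the transitive dependency no longer arises: there is only one row to change per job class.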

Database Tables and Normalization
- 3NF definition
  - A table is in 3NF if:
    - It is in 2NF, and
    - It contains no transitive dependencies.

Improving the Design
- Table structures are cleaned up to eliminate the troublesome initial partial and transitive dependencies.
- Normalization cannot, by itself, be relied on to make good designs.
- It is valuable because its use helps eliminate data redundancies.

Improving the Design
- Issues to address in order to produce a good normalized set of tables:
  - Evaluate PK assignments
  - Evaluate naming conventions
  - Refine attribute atomicity
  - Identify new attributes
  - Identify new relationships
  - Refine primary keys as required for data granularity
  - Maintain historical accuracy
  - Evaluate using derived attributes

Improving the Design
- Adding relationships
  - Project's manager
  - EMP_NUM as a FK in PROJECT
- Figure labels: 3NF; project manager

Improving the Design
- PK assignment
  - JOB_CODE
- Naming conventions
  - JOB_CHG_HOUR
  - JOB_CLASS >> JOB_DESCRIPTION
- Figure label: 2NF

Improving the Design
- Attribute atomicity
  - EMP_NAME >> EMP_LNAME, EMP_FNAME, EMP_INITIAL
- Adding attributes
  - EMP_HIREDATE
- Figure labels: JOB_CLASS; 3NF

Improving the Design
- Refining PKs
  - (EMP_NUM + PROJ_NUM) >> ASSIGN_NUM
- Maintaining historical accuracy
  - ASSIGN_CHG_HOUR stores the JOB_CHG_HOUR that was in effect when the assignment was made.
- Using derived attributes
  - ASSIGN_CHARGE = ASSIGN_HOURS × ASSIGN_CHG_HOUR
- Figure label: 3NF
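The interplay between historical accuracy and the derived attribute can be illustrated with a short sketch. The job code, rates, and hours are made up, and `make_assignment` is a hypothetical helper; the point is only that the rate is copied at assignment time and the charge is computed from that copy.

```python
# Hypothetical current rates, keyed by JOB_CODE.
job_chg_hour = {"J1": 80.0}

def make_assignment(assign_num, job_code, hours):
    """Build an assignment row, copying the hourly rate in effect right now."""
    rate = job_chg_hour[job_code]
    return {
        "ASSIGN_NUM": assign_num,
        "ASSIGN_HOURS": hours,
        "ASSIGN_CHG_HOUR": rate,
        # Derived attribute: ASSIGN_CHARGE = ASSIGN_HOURS x ASSIGN_CHG_HOUR
        "ASSIGN_CHARGE": round(hours * rate, 2),
    }

first = make_assignment(1001, "J1", 3.5)

# The job rate changes later; the earlier assignment keeps its original rate.
job_chg_hour["J1"] = 90.0
second = make_assignment(1002, "J1", 2.0)

print(first["ASSIGN_CHARGE"], second["ASSIGN_CHARGE"])
```

Storing ASSIGN_CHARGE is a trade-off: it avoids recomputing the product in every report, at the cost of keeping a value that could in principle drift from its inputs.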

Surrogate Key Considerations
- When the primary key is considered to be unsuitable, designers use system-defined surrogate keys.
  - The DBMS can be used to have the system assign the PK values (JOB_CODE), which ensures entity integrity.

Limitations on System-Defined Surrogate Keys
- The data entries in Table 5.3 are inappropriate because they duplicate existing records.
- However, the surrogate key does not prevent us from making the entries shown in Table 5.3, which leads to the multiple duplicate records problem.
- We must still ensure the uniqueness of JOB_DESCRIPTION through the use of a unique index.
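The limitation can be demonstrated with Python's built-in `sqlite3` module. This is a sketch under assumptions: the job description and rate are made-up values (not the entries of Table 5.3), and the index name `JOB_DESC_UNIQUE` is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE JOB (
        JOB_CODE        INTEGER PRIMARY KEY AUTOINCREMENT,  -- system-assigned surrogate key
        JOB_DESCRIPTION TEXT NOT NULL,
        JOB_CHG_HOUR    REAL NOT NULL
    )
""")

insert = "INSERT INTO JOB (JOB_DESCRIPTION, JOB_CHG_HOUR) VALUES (?, ?)"
conn.execute(insert, ("Programmer", 35.75))
conn.execute(insert, ("Programmer", 35.75))  # accepted: it simply gets a new JOB_CODE
duplicates = conn.execute(
    "SELECT COUNT(*) FROM JOB WHERE JOB_DESCRIPTION = 'Programmer'").fetchone()[0]

# Remove the duplicate, then protect the description with a unique index.
conn.execute("""
    DELETE FROM JOB
    WHERE JOB_CODE NOT IN (SELECT MIN(JOB_CODE) FROM JOB GROUP BY JOB_DESCRIPTION)
""")
conn.execute("CREATE UNIQUE INDEX JOB_DESC_UNIQUE ON JOB (JOB_DESCRIPTION)")

try:
    conn.execute(insert, ("Programmer", 35.75))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(duplicates, rejected)
```

The surrogate key guarantees that every row has a distinct identifier, but only the unique index stops the same job from being recorded twice.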

Database Tables and Normalization
- Boyce-Codd Normal Form (BCNF)
  - A table is in Boyce-Codd normal form (BCNF) if every determinant in the table is a candidate key. (A determinant is any attribute whose value determines other values within a row.)
  - If a table contains only one candidate key, 3NF and BCNF are equivalent.
  - BCNF can be violated only if the table contains more than one candidate key.
  - BCNF is a special case of 3NF.
  - Figure 5.7 illustrates a table that is in 3NF but not in BCNF.
  - Figure 5.8 shows how the table can be decomposed to conform to BCNF.

A Table That Is in 3NF but Not in BCNF
- A + B → C, D
- C → B is not a transitive dependency (a nonkey attribute is the determinant of a key attribute), so the table is in 3NF.
- C is not a candidate key, so the table is not in BCNF.

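The Decomposition of a Table Structure to Meet BCNF Requirements
- Original structure (3NF, but not BCNF): A + B → C, D and C → B
- Change the PK to A + C: A + C → B, D and C → B (C → B is now a partial dependency)
- Decompose into two tables: A + C → D and C → B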

The Boyce-Codd Normal Form (BCNF)
- STU_ID + STAFF_ID → CLASS_CODE, ENROLL_GRADE
- CLASS_CODE → STAFF_ID
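Whether a set of dependencies violates BCNF can be tested mechanically with attribute closures. The sketch below is a standard textbook technique rather than anything specific to these slides, and it checks that each determinant is a superkey (its closure covers all attributes), which is the usual operational form of the "every determinant is a candidate key" rule. The helper names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the dependencies whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if closure(lhs, fds) != set(all_attrs)]

ATTRS = ("STU_ID", "STAFF_ID", "CLASS_CODE", "ENROLL_GRADE")
FDS = [
    (("STU_ID", "STAFF_ID"), ("CLASS_CODE", "ENROLL_GRADE")),
    (("CLASS_CODE",), ("STAFF_ID",)),
]

violations = bcnf_violations(ATTRS, FDS)
print(violations)
```

Only CLASS_CODE → STAFF_ID is reported: CLASS_CODE determines STAFF_ID but nothing else, so it cannot serve as a key for the whole table.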

Decomposition into BCNF Figure 5.9

Normalization and Database Design
- Normalization should be part of the design process.
- The E-R diagram provides a macro view.
- Normalization provides a micro view of the entities within the E-R diagram.
  - It focuses on the characteristics of specific entities.
- It is difficult to separate normalization from E-R diagramming.
- The two techniques should be used concurrently.

Normalization and Database Design
- Database design and normalization example (construction company)
  - Summary of operations:
    - The company manages many projects.
    - Each project requires the services of many employees.
    - An employee may be assigned to several different projects.
    - Some employees are not assigned to a project and perform duties not specifically related to a project.
    - Some employees are part of a labor pool, to be shared by all project teams.
    - Each employee has a (single) primary job classification. This job classification determines the hourly billing rate.
    - Many employees can have the same job classification.

Normalization and Database Design
- Two initial entities:
  - PROJECT (PROJ_NUM, PROJ_NAME) [3NF]
  - EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_DESCRIPTION, JOB_CHG_HOUR) [2NF]
    - No partial dependencies.
    - Transitive dependency: JOB_DESCRIPTION → JOB_CHG_HOUR

Normalization and Database Design
- Three entities after the transitive dependency is removed:
  - PROJECT (PROJ_NUM, PROJ_NAME)
  - EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_CODE)
  - JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)

The Modified ERD for a Contracting Company
Because the normalization process yields an additional entity (JOB), we modify the initial ERD.
[Figure: modified ERD; relationship label "is held by"]


Normalization and Database Design
- Creation of the composite entity ASSIGNMENT
[Figure: The Final (Implementable) ERD for the Contracting Company; relationship label "is held by"; composite entity ASSIGNMENT]


Normalization and Database Design
- The attribute ASSIGN_HOURS is assigned to the composite entity ASSIGNMENT.
- A "Manages" relationship is created between EMPLOYEE and PROJECT.
- Resulting table structures:
  - PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)
  - ASSIGNMENT (ASSIGN_NUM, ASSIGN_DATE, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE, EMP_NUM, PROJ_NUM)
  - EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIREDATE, JOB_CODE)
  - JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)


Higher-Level Normal Forms
- In some databases, multiple multivalued attributes exist.
- 4NF definition
  - A table is in 4NF if it is in 3NF and has no multiple sets of multivalued dependencies.

Higher-Level Normal Forms
- An employee can have multiple assignments and can also be involved in multiple service organizations.
- EMP_SERVICE (volunteer work) and EMP_ASSIGN (assigned project) may each have many different values.
- The table contains two sets of independent multivalued dependencies.

A Set of Tables in 4NF
- The solution is to eliminate the problems caused by independent multivalued dependencies.
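The cost of keeping two independent multivalued facts in one table can be seen in a small sketch. It assumes one common way of storing both facts without nulls, namely pairing every service organization with every assignment; the employee number, organizations, and project numbers are made up.

```python
from itertools import product

EMP_NUM = 10123                  # hypothetical employee
services = ["Org A", "Org B"]    # EMP_SERVICE values (volunteer work)
assignments = [1, 3, 4]          # EMP_ASSIGN values (assigned projects)

# Single table: every service must be paired with every assignment.
combined = [(EMP_NUM, s, a) for s, a in product(services, assignments)]

# 4NF-style decomposition: one table per independent multivalued fact.
service_tbl = [(EMP_NUM, s) for s in services]
assign_tbl = [(EMP_NUM, a) for a in assignments]

# Joining the two tables on EMP_NUM reproduces the combined rows.
rebuilt = sorted((e, s, a)
                 for e, s in service_tbl
                 for e2, a in assign_tbl if e == e2)

print(len(combined), len(service_tbl) + len(assign_tbl))
```

The combined table grows multiplicatively (2 × 3 rows here) while the decomposed tables grow additively (2 + 3 rows), and no information is lost because the join restores the original rows.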

[Figure: ERD with entities EMPLOYEE, ORGANIZATION, and PROJECT; M:N relationships "Service" and "Assign"]

Denormalization
- Creation of normalized relations is an important database design goal.
- Meeting processing requirements should also be a goal.
- If tables are decomposed to conform to normalization requirements:
  - The number of database tables expands.
- Joining a larger number of tables takes additional disk input/output (I/O) operations and processing logic.
  - This reduces system speed.

Denormalization
- Normalization is only one of many database design goals.
- Normalized (decomposed) tables require additional processing: joins need additional I/O operations, which reduce system speed.
- Example: CUSTOMER (CUS_NUM, CUS_NAME, …, ZIP_CODE, CITY) contains the transitive dependency ZIP_CODE → CITY. Is it really practical to split out ZIP (ZIP_CODE, CITY)?
- Some degree of denormalization can increase processing speed.
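The CUSTOMER/ZIP trade-off can be made concrete with `sqlite3`. This is a sketch only: the table names CUSTOMER_NORM and CUSTOMER_DENORM, the column types, and the single sample row are assumptions, and the elided CUSTOMER attributes are simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized design: CITY lives in its own ZIP table.
CREATE TABLE ZIP (ZIP_CODE TEXT PRIMARY KEY, CITY TEXT NOT NULL);
CREATE TABLE CUSTOMER_NORM (
    CUS_NUM  INTEGER PRIMARY KEY,
    CUS_NAME TEXT NOT NULL,
    ZIP_CODE TEXT REFERENCES ZIP (ZIP_CODE)
);
-- Denormalized design: CITY is repeated in every customer row.
CREATE TABLE CUSTOMER_DENORM (
    CUS_NUM  INTEGER PRIMARY KEY,
    CUS_NAME TEXT NOT NULL,
    ZIP_CODE TEXT,
    CITY     TEXT
);
INSERT INTO ZIP VALUES ('00001', 'City A');
INSERT INTO CUSTOMER_NORM VALUES (1, 'Customer 1', '00001');
INSERT INTO CUSTOMER_DENORM VALUES (1, 'Customer 1', '00001', 'City A');
""")

# The normalized design needs a join to list customers with their city.
with_join = conn.execute("""
    SELECT C.CUS_NAME, Z.CITY
    FROM CUSTOMER_NORM C JOIN ZIP Z ON Z.ZIP_CODE = C.ZIP_CODE
""").fetchall()

# The denormalized design answers the same question from a single table.
without_join = conn.execute(
    "SELECT CUS_NAME, CITY FROM CUSTOMER_DENORM").fetchall()

print(with_join == without_join)
```

Both queries return the same rows; the denormalized table avoids the join at the price of repeating CITY for every customer that shares a ZIP code.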

Denormalization
- Normalization purity is often difficult to sustain in the modern database environment.
- The conflicts among design efficiency, information requirements, and processing speed are often resolved through compromises that include denormalization.
- In fact, we used a 2NF structure in the JOB table (Fig. 5.6) to decrease the likelihood of referential integrity violations.
  - JOB table
    - Original PK: JOB_CLASS
    - New PK: JOB_CODE
  - EMPLOYEE table
    - Original FK: JOB_CLASS
    - New FK: JOB_CODE
- Use denormalization cautiously.

The Initial 1NF Structure

Identifying the Possible PK Attributes

Table Structures Based On The Selected PKs Foreign key

Summary