Download presentation
Published byKatrina Weaver Modified over 9 years ago
1
Normalization of Database Tables Special adaptation for INFS-3200
Chapter 5 Normalization of Database Tables Database Systems: Design, Implementation, and Management, Rob and Coronel Special adaptation for INFS-3200
2
Database Tables and Normalization
Table is basic building block in database design Normalization is process for assigning attributes to entities. Why: Reduces data redundancies Helps eliminate data anomalies Produces controlled redundancies to link tables and therefore, establish relationships. GENERAL GUIDELINES: Define business rules Define level of detail (granularity) – details & aggregates Each table must represent one and only one subject only All attributes in the table must be fully dependent on the PK, the entire PK and nothing but the PK. Database Systems: Design, Implementation, & Management, Rob & Coronel
3
Database Tables and Normalization
Normalization stages 1NF - First normal form: Put data in table format. Eliminate repeating groups. Select a suitable primary key. 2NF - Second normal form Eliminate partial dependencies. 3NF - Third normal form Eliminate transitive dependencies BCNF - Boyce-Codd normal form Every determinant in the table is a candidate key. 4NF - Fourth normal form Eliminate independent multi-valued set of facts. Database Systems: Design, Implementation, & Management, Rob & Coronel
4
Database Tables and Normalization
Original Report Data Database Systems: Design, Implementation, & Management, Rob & Coronel
5
Sample Data for Project Report
Report Data in Table Format (incomplete) Observations PROJ_NUM intended to be primary key Table entries invite data inconsistencies Table tend to create data anomalies: Update Modifying JOB_CLASS Insertion New employee must be assigned project Deletion If employee deleted, other vital data lost Database Systems: Design, Implementation, & Management, Rob & Coronel
6
Database Systems: Design, Implementation, & Management, Rob & Coronel
Conversion to 1NF Repeating groups must be eliminated (duplicated column types/multi-valued columns) Tabular format Each cell has single value - No repeating groups In our case: PROJ_NUM, EMP_NUM, etc. Proper primary key developed Uniquely determines (identifies) attribute values (in each row) In our case: combination of PROJ_NUM and EMP_NUM Identify Dependencies Desirable dependencies based on primary key Less desirable dependencies Partial based on part of composite primary key Transitive one nonprime attribute depends on another nonprime attribute Database Systems: Design, Implementation, & Management, Rob & Coronel
7
Database Systems: Design, Implementation, & Management, Rob & Coronel
Data Organization: 1NF Database Systems: Design, Implementation, & Management, Rob & Coronel
8
Dependency Diagram (1NF)
1NF (PROJ_NUM, EMP_NUM, PROJ_NUM,EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS) Partial Dependencies: (PROJ_NUM -> PROJ_NAME) (EMP_NUM -> EMP_NAME, JOB_CLASS, CHG_HOUR) Transitive Dependencies: (JOB_CLASS -> CHG_HOUR) Database Systems: Design, Implementation, & Management, Rob & Coronel
9
Database Systems: Design, Implementation, & Management, Rob & Coronel
1NF Summarized Data in a tabular format No repeating groups in table (repeated attributes or multi-value attributes ) Primary key attribute(s) identified All dependent attributes depend on the primary key Identify all partial and transitive dependencies Partial Attributes that depend on part of PK Can only exist if the table has a composite PK Transitive Non-PK attribute(s) determines other attributes JOB_CLASS determines CHG_HOUR JOB_CLASS is the determinant attribute CHG_HOUR is the dependent attribute Database Systems: Design, Implementation, & Management, Rob & Coronel
10
Database Systems: Design, Implementation, & Management, Rob & Coronel
Conversion to 2NF Start with 1NF format: Write each key component on separate line Write original key on last line Each component is new table Write dependent attributes after each key PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR) ASSIGN (PROJ_NUM, EMP_NUM, ASSIGN_HOURS) Transitive Dependencies: (JOB_CLASS -> CHG_HOUR) Database Systems: Design, Implementation, & Management, Rob & Coronel
11
Database Systems: Design, Implementation, & Management, Rob & Coronel
2NF Conversion Results PROJ_NUM PROJ_NAME Table name: PROJECT EMP_NUM EMP_NAME JOB_CLASS CHG_HOUR Table name: EMPLOYEE Transitive dependency EMP_NUM PROJ_NUM ASSIGN_HOURS Table name: ASSIGN Database Systems: Design, Implementation, & Management, Rob & Coronel
12
Database Systems: Design, Implementation, & Management, Rob & Coronel
2NF Summarized In 1NF Includes no partial dependencies No attribute dependent on a portion of primary key Still possible to exhibit transitive dependency Attributes may be functionally dependent on non-key attributes Database Systems: Design, Implementation, & Management, Rob & Coronel
13
Database Systems: Design, Implementation, & Management, Rob & Coronel
Conversion to 3NF Create separate table(s) to eliminate transitive functional dependencies For every transitive dependency: Write determinant as PK of new table Write all dependent attributes for each determinant Delete the dependent attribute(s) from the original table(s) Leave the determinant attribute(s) in the original table(s) PROJECT (PROJ_NUM, PROJ_NAME) ASSIGN (PROJ_NUM, EMP_NUM, ASSIGN_HOURS) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS) JOB (JOB_CLASS, CHG_HOUR) Database Systems: Design, Implementation, & Management, Rob & Coronel
14
Database Systems: Design, Implementation, & Management, Rob & Coronel
Conversion to 3NF Database Systems: Design, Implementation, & Management, Rob & Coronel
15
Database Systems: Design, Implementation, & Management, Rob & Coronel
3NF Summarized In 2NF Contains no transitive dependencies Database Systems: Design, Implementation, & Management, Rob & Coronel
16
Database Systems: Design, Implementation, & Management, Rob & Coronel
Improving the Design Evaluate PK Assignment (meet PK guidelines) Evaluate Naming Conventions Refine Attribute Atomicity (simple, single-valued) Identify New Attributes Identify New Relationships (decompose M:M) Refine Primary Keys (as required for data granularity) Maintain Historical Transactional Accuracy Identify Use of Derived Attributes Database Systems: Design, Implementation, & Management, Rob & Coronel
17
Improving the Design (Job Table)
Evaluate PK Assignment Introduce a better suited PK free of semantic content (non-intelligent PK) Add JOB_CODE as surrogate PK. To reduce data entry errors Repeat for other tables (see # 6 – Assign)* Evaluate Naming Conventions JOB_CLASS is actually a description of the job, change to JOB_DESCRIPTION. CHG_HOURS should be JOB_CHG_HOUR Database Systems: Design, Implementation, & Management, Rob & Coronel
18
Improving the Design (Employee Table)
Refine Attribute Atomicity Decompose composite attributes into simple attributes EMP_NAME should be EMP_LNAME EMP_FNAME EMP_INITIAL Identify New Attributes Add new attributes that describe real world entity characteristics EMP_HIREDATE Database Systems: Design, Implementation, & Management, Rob & Coronel
19
Improving the Design (Project Table)
Identify New Relationships Add new relationships as required by business rules. A project is managed by an employee, an employee can be the manager of only one project. Add EMP_NUM as FK in PROJECT Database Systems: Design, Implementation, & Management, Rob & Coronel
20
Improving the Design (Assign Table)
Identify new attributes Add ASSIGN_DATE Refine Primary Keys Consider the “granulity” of the data being represented in order to determine the PK. Can an employee have multiple hours worked entries for a given day in a given project? If yes, add ASSIGN_NUM as surrogate PK. Maintain Historical Transaction Accuracy Add ASSIGN_CHG_HOUR Identify Use of Derived Attributes Add ASSIGN_CHARGE Database Systems: Design, Implementation, & Management, Rob & Coronel
21
Limitations of System Assigned PK
Surrogate PK ensures that each row has an unique ID, not that the row’s dependent values are unique. JOB_CODE system assigned PK We still could have duplicate values: 511 Programmer 512 Programmer Clearly, entries are duplicated! To ensure unique values we must have create an unique index on all candidate keys. Unique index on JOB_DESCRIPTION This still will still not avoid data entry errors! Progranmer Database Systems: Design, Implementation, & Management, Rob & Coronel
22
Normalization and Database Design
First, business rules must be determined Determine the granularity of the data in each entity. Normalization should be part of the design process E-R Diagram provides macro view (conceptual) Normalization provides micro view of entities (logical) Focuses on characteristics of specific entities May yield additional entities/relationships Difficult to separate normalization from E-R diagramming – complementary Database Systems: Design, Implementation, & Management, Rob & Coronel
23
Initial ERD for Contracting Company
A company has many projects. Each project requires the services of many employees. An employee may be assigned to several different projects. Some employees are not assigned to a project. Each employee has a single primary job classification. Many employees can have the same job classification. The job classification determines the hourly billing rate. Database Systems: Design, Implementation, & Management, Rob & Coronel
24
Modified ERD for Contracting Company
Figure 4.11 Database Systems: Design, Implementation, & Management, Rob & Coronel
25
Final ERD for Contracting Company
Figure 4.12 Database Systems: Design, Implementation, & Management, Rob & Coronel
26
Database Systems: Design, Implementation, & Management, Rob & Coronel
Denormalization Normalization is one of many database design goals Normalization creates many small tables with PK/FK Reporting requirements over normalized tables requires multiple table joins to get complete data: Additional processing (join operations) Additional I/Os operations Design must find right balance among: Design Integrity requirements Information requirements Performance requirements Database Systems: Design, Implementation, & Management, Rob & Coronel
27
Unnormalized Table Defects
Data updates less efficient Normalization ensures that data is updated (insert/update/delete) only once in one place. Indexing more cumbersome No simple strategies for creating views Database Systems: Design, Implementation, & Management, Rob & Coronel
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.