Database Design Using Normalization

Slides:



Advertisements
Similar presentations
Chapter 5 Normalization of Database Tables
Advertisements

Accounting System Design
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 David M. Kroenke’s Chapter Three: The Relational Model and Normalization.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 COS 346 Day 5.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 COS 346 Day4.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 COS 346 Day5.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 4-1 David M. Kroenke Database Processing Chapter 4 Database Design Using.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 4-1 COS 346 Day6.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 David M. Kroenke Database Processing Chapter 3 Normalization.
Accounting Databases Chapter 2 The Crossroads of Accounting & IT
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 COS 346 Day4.
Chapter 5 Normalization of Database Tables
Database Lecture Notes Normalization 3 – Denormalization
Logical Database Design Nazife Dimililer. II - Logical Database Design Two stages –Building and validating local logical model –Building and validating.
Chapter 3 The Relational Model and Normalization
IT420: Database Management and Organization Normalization 31 January 2006 Adina Crăiniceanu
Database Systems: Design, Implementation, and Management Tenth Edition
Concepts of Database Management, Fifth Edition
Normalization (Codd, 1972) Practical Information For Real World Database Design.
In this chapter, you learn about the following: ❑ Anomalies ❑ Dependency and determinants ❑ Normalization ❑ A layman’s method of understanding normalization.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, Modified by Dr. Mathis 3-1 David M. Kroenke’s Chapter Three: The Relational.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-1 David M. Kroenke’s Chapter Four: Database.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 What Makes Determinant Values Unique? A determinant is unique in.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
David M. Kroenke and David J. Auer Database Processing Fundamentals, Design, and Implementation Chapter Four: Database Design Using Normalization 4-1 KROENKE.
Gegevens Analyse Les 4: Normaliseren. Functional Dependency Rules If A  (B, C), then A  B and A  C If (A,B)  C, then neither A nor B determines C.
1 CS 430 Database Theory Winter 2005 Lecture 7: Designing a Database Logical Level.
David M. Kroenke and David J. Auer Database Processing Fundamentals, Design, and Implementation Chapter Three: The Relational Model and Normalization.
David M. Kroenke and David J. Auer Database Processing Fundamentals, Design, and Implementation Chapter Four: Database Design Using Normalization.
5 1 Chapter 5 Normalization of Database Tables Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel.
David M. Kroenke and David J. Auer Database Processing: F undamentals, Design, and Implementation Chapter Three: The Relational Model and Normalization.
1 CS490 Database Management Systems. 2 CS490 Database Normalization.
CSIS 115 Database Design and Applications for Business Dr. Meg Fryling “Dr. Meg” Fall #csis115 © 2012 Meg Fryling.
Adapted from DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 Functional Dependencies and Normalization.
Logical Database Design and the Rational Model
Chapter 5 Database Design
A Guide to SQL, Eighth Edition
Chapter 4 Logical Database Design and the Relational Model
CSIS 115 Database Design and Applications for Business
CSIS 115 Database Design and Applications for Business
Chapter 6 - Database Implementation and Use
David M. Kroenke and David J
DESIGNING DATABASE APPLICATIONS
Chapter 5: Logical Database Design and the Relational Model
Applied CyberInfrastructure Concepts Fall 2017
Quiz Questions Q.1 An entity set that does not have sufficient attributes to form a primary key is a (A) strong entity set. (B) weak entity set. (C) simple.
Database Processing: David M. Kroenke’s Chapter Four:
Payroll Management System
COS 346 Day 8.
CHAPTER 5: PHYSICAL DATABASE DESIGN AND PERFORMANCE
CMPE 226 Database Systems February 21 Class Meeting
Chapter 9 Designing Databases
SQL 101.
Accounting System Design
Normalization Referential Integrity
Database Design Using Normalization
Database Processing: Chapter Four: Using Normalization
Database Processing: David M. Kroenke’s Chapter Three:
Normalization Dale-Marie Wilson, Ph.D..
Normalization of Database Tables Uploaded by: mysoftbooks.ml
A Guide to SQL, Eighth Edition
Accounting System Design
David M. Kroenke and David J
國立臺北科技大學 課程:資料庫系統 2015 fall Chapter 14 Normalization.
Copyright © 2018, 2015, 20 Pearson Education, Inc. All Rights Reserved Database Concepts Eighth Edition Chapter # 2 The Relational Model.
Relational Database Design
Chapter 17 Designing Databases
Systems Analysis and Design, 7e Kendall & Kendall
Chapter 4 The Relational Model and Normalization
Presentation transcript:

Database Design Using Normalization Chapter Four: Database Design Using Normalization

Chapter Objectives To design updatable databases to store data received from another source To use SQL to access table structure To understand the advantages and disadvantages of normalization To understand denormalization To design read-only databases to store data from updateable databases

Chapter Objectives To recognize and be able to correct common design problems: The multivalue, multicolumn problem The inconsistent values problem The missing values problem The general-purpose remarks column problem

Chapter Premise We have received one or more tables of existing data. The data is to be stored in a new database. QUESTION: Should the data be stored as received, or should it be transformed for storage?

How Many Tables? SKU_DATA (SKU, SKU_Description, Buyer) BUYER (Buyer, Department) Where SKU_DATA.Buyer must exist in BUYER.Buyer Should we store these two tables as they are, or should we combine them into one table in our new database? Stock Keeping Unit

Assessing Table Structure

Counting Rows in a Table To count the number of rows in a table use the SQL COUNT(*) built-in aggregate function :

Examining the Columns To determine the number and type of columns in a table, use an SQL SELECT statement. To limit the number of rows retrieved, use the SQL TOP {NumberOfRows} function:

Checking Validity of Assumed Referential Integrity Constraints I Given two tables with an assumed foreign key constraint: SKU_DATA (SKU, SKU_Description, Buyer) BUYER (Buyer, Department) Where SKU_DATA.Buyer must exist in BUYER.Buyer

Checking Validity of Assumed Referential Integrity Constraints II To find any foreign key values that violate the foreign key constraint An empty set for the query result indicates that no foreign key values violate the foreign key constraint Note that this is correct and the book, p. 179 is not.

Type of Database Updateable database, or read-only database? If updateable database, we normally want tables in BCNF. If read-only database, we may not use BCNF tables.

Designing Updatable Databases Updatable databases are typically the operational databases of a company, such as the online transaction processing (OLTP) system discussed for Cape Codd Outdoor Sports at the beginning of Chapter 2. If you are constructing an updatable database, then you need to be concerned about modification anomalies and inconsistent data. Consequently, you must carefully consider normalization principles.

Normalization: Advantages and Disadvantages Why do we say reduce data duplication rather than eliminate data duplication? The answer is that we cannot eliminate all duplicated data because we must duplicate data in foreign keys. We cannot eliminate Buyer, for example, from the SKU_DATA table because we would then not be able to relate BUYER and SKU_DATA rows. Values of Buyer are thus duplicated in the BUYER and SKU_DATA tables. This observation leads to a second question: If we only reduce data duplication, how can we claim to eliminate inconsistent data values? Data duplication in foreign keys will not cause inconsistencies because referential integrity constraints prohibit them. As long as we enforce such constraints, the duplicate foreign key values will cause no inconsistencies.

Non-Normalized Table: EQUIPMENT_REPAIR

Normalized Tables: ITEM and REPAIR

Copying Data to New Tables To copy data from one table to another, use the SQL INSERT statement:

Final Steps In Chapters 7 and 8, you will learn how to: Remove unneeded tables after the data is copied, using the SQL DROP TABLE statement. Create the referential integrity constraint, using the SQL ALTER TABLE statement.

Choosing Not To Use BCNF BCNF is used to control anomalies from functional dependencies. There are times when BCNF is not desirable. The classic example is ZIP codes: ZIP codes almost never change. Any anomalies are likely to be caught by normal business practices. Not having to use SQL to join data in two tables will speed up application processing. ZIP → (City, State)

Multivalued Dependencies Anomalies from multivalued dependencies are very problematic. Always place the columns of a multivalued dependency into a separate table (4NF).

Designing Read-Only Databases The extracted sales data that we used for Cape Codd Outdoor Sports in Chapter 2 is a small, but typical example of a read-only database. Read-only databases are used in business intelligence (BI) systems for producing information for assessment, analysis, planning, and control, as we discussed for Cape Codd Outdoor Sports in Chapter 2. Read-only databases are commonly used in a data warehouse.

Read-Only Databases Read-only databases are nonoperational databases using data extracted from operational databases. They are used for querying, reporting, and data mining applications. They are never updated (in the operational database sense—they may have new data imported from time to time).

Denormalization For read-only databases, normalization is seldom an advantage. Application processing speed is more important. Denormalization is the joining of the data in normalized tables prior to storing the data. The data is then stored in nonnormalized tables.

Normalized Tables

Denormalizing the Data

Customized Tables I Read-only databases are often designed with many copies of the same data, but with each copy customized for a specific application. Consider the PRODUCT table:

Customized Tables II

Common Design Problems

The Multivalue, Multicolumn Problem The multivalue, multicolumn problem occurs when multiple values of an attribute are stored in more than one column: EMPLOYEE (EmployeeNumber, EmployeeLastName, Auto2_LicenseNumber, Auto3_LicenseNumber) This is another form of a multivalued dependency. Solution = like the 4NF solution for multivalued dependencies, use a separate table to store the multiple values.

Inconsistent Values I Inconsistent values occur when different users, or different data sources, use slightly different forms of the same data value: Different codings: SKU_Description = 'Corn, Large Can' SKU_Description = 'Can, Corn, Large' SKU_Description = 'Large Can Corn‘ Different spellings: Coffee, Cofee, Coffeee

Inconsistent Values II Particularly problematic are primary or foreign key values. To detect: Use referential integrity check already discussed for checking keys. Use the SQL GROUP BY clause on suspected columns.

Inconsistent Values III

Missing Values A missing value or null value is a value that has never been provided. In a database table, a null value appears in upper case letters as NULL.

Null Values Null values are ambiguous: May indicate that a value is inappropriate; DateOfLastChildbirth is inappropriate for a male. May indicate that a value is appropriate but unknown; DateOfLastChildbirth is appropriate for a female, but may be unknown. May indicate that a value is appropriate and known, but has never been entered; DateOfLastChildbirth is appropriate for a female, and may be known but no one has recorded it in the database.

Checking for Null Values Use the SQL IS NULL operator to check for null values:

The General-Purpose Remarks Column A general-purpose remarks column is a column with a name such as: Remarks Comments Notes It often contains important data stored in an inconsistent, verbal, and verbose way. A typical use is to store data on a customer’s interests. Such a column may: Be used inconsistently Hold multiple data items

The General-Purpose Remarks Column: Hidden Foreign Key Data In a typical situation, the data for the foreign key may have been recorded in the Remarks column. 'Wants to buy a Piper Seneca II‘ 'Owner of a Piper Seneca II‘ 'Possible buyer for a turbo Seneca'.

End of Presentation: Chapter Four