Database Design Using Normalization

Presentation transcript:

David M. Kroenke and David J. Auer, Database Processing: Fundamentals, Design, and Implementation. Chapter Four: Database Design Using Normalization

Chapter Objectives: To design updatable databases to store data received from another source. To use SQL to access table structure. To understand the advantages and disadvantages of normalization. To understand denormalization. To design read-only databases to store data from updatable databases. KROENKE AND AUER - DATABASE PROCESSING, 14th Edition © 2016 Pearson Prentice Hall

Chapter Objectives (continued): To recognize and be able to correct common design problems: the multivalue, multicolumn problem; the inconsistent values problem; the missing values problem; the general-purpose remarks column problem.

Chapter Premise: We have received one or more tables of existing data. The data is to be stored in a new database. QUESTION: Should the data be stored as received, or should it be transformed for storage?

How Many Tables? SKU_DATA (SKU, SKU_Description, Buyer) and BUYER (Buyer, Department), where SKU_DATA.Buyer must exist in BUYER.Buyer. Should we store these two tables as they are, or should we combine them into one table in our new database?

Normal Forms Review. 1NF: Eliminate repeating groups. Make a separate table for each set of related attributes, and give each table a primary key. 2NF: Eliminate redundant data. Each non-key attribute must be functionally dependent on the entire primary key. If an attribute depends on only part of a composite (multi-column) key, remove it to a separate table.

Normal Forms Review (continued). 3NF: Eliminate columns not dependent on the key. If attributes do not contribute to a description of the key, remove them to a separate table. Any transitive dependencies are moved into a smaller table. BCNF: Every determinant in the table is a candidate key. If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables. All normal forms are additive, in that if a model is in 3NF, it is by definition also in 2NF and 1NF.

Another Example

Putting a Relation into BCNF: EQUIPMENT_REPAIR

Step 1: Is the Table in 1NF? A quick scan of the table suggests it is in 1NF. Even though a primary key is not identified, one could be determined. [Remember: since no two rows can be identical in a relation, a composite key made up of all the attributes is always a candidate for the primary key.]

Identify Functional Dependencies: EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount). FD: ItemNumber → (Type, AcquisitionCost); RepairNumber → (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)

2NF: Look for a composite primary key [or candidate key]. The PK for this table could be a composite of all the attributes, so the best place to start is to assess the determinants of the functional dependencies. Hint: another way to look at this is to ask whether you can see different possible entities in the table.

Identify Functional Dependencies: EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount). FD: ItemNumber → (Type, AcquisitionCost); RepairNumber → (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount). Is there a determinant that is not a candidate key?
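The question on this slide can be explored against data. The following is a minimal sketch, assuming Python's standard sqlite3 module and a handful of hypothetical rows (they are not the data from the textbook figure); the helper name violates_fd is also made up for illustration. It checks whether each stated functional dependency is contradicted by the rows, and whether ItemNumber is unique. Note that sample data can only refute a functional dependency, never prove it.

```python
import sqlite3

# Hypothetical sample rows for illustration only (not the textbook's data).
ROWS = [
    (100, "Drill Press", 3500.00, 2000, "2015-05-05", 375.00),
    (200, "Lathe", 4750.00, 2100, "2015-05-07", 255.00),
    (100, "Drill Press", 3500.00, 2200, "2015-06-19", 178.00),
    (300, "Mill", 27300.00, 2300, "2015-06-19", 1875.00),
]

def violates_fd(conn, table, determinant, dependents):
    """Return determinant values that map to more than one combination of
    dependent values, i.e. values that contradict determinant -> dependents."""
    dep_list = ", ".join(dependents)
    sql = (
        f"SELECT {determinant} FROM "
        f"(SELECT DISTINCT {determinant}, {dep_list} FROM {table}) AS d "
        f"GROUP BY {determinant} HAVING COUNT(*) > 1"
    )
    return [row[0] for row in conn.execute(sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EQUIPMENT_REPAIR (ItemNumber INTEGER, Type TEXT, "
    "AcquisitionCost REAL, RepairNumber INTEGER, RepairDate TEXT, RepairAmount REAL)"
)
conn.executemany("INSERT INTO EQUIPMENT_REPAIR VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# ItemNumber -> (Type, AcquisitionCost)
item_fd_violations = violates_fd(
    conn, "EQUIPMENT_REPAIR", "ItemNumber", ["Type", "AcquisitionCost"]
)
# RepairNumber -> all other columns
repair_fd_violations = violates_fd(
    conn, "EQUIPMENT_REPAIR", "RepairNumber",
    ["ItemNumber", "Type", "AcquisitionCost", "RepairDate", "RepairAmount"],
)
# A candidate key must be unique; ItemNumber repeats, so it is only a determinant.
item_is_unique = conn.execute(
    "SELECT COUNT(*) = COUNT(DISTINCT ItemNumber) FROM EQUIPMENT_REPAIR"
).fetchone()[0] == 1

print(item_fd_violations, repair_fd_violations, item_is_unique)
```

With these rows, neither dependency is contradicted, and ItemNumber is not unique, which is the situation the slide describes: a determinant that is not a candidate key.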

Put into Tables: ItemNumber is a determinant but not a candidate key, so move it and the attributes it determines to a new table: ITEM (ItemNumber, Type, AcquisitionCost). The determinant becomes the primary key of the new table. Leave it as a foreign key in the original table: REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount).

Tables

3NF: Look for transitive dependencies. There are no transitive dependencies; all functional dependencies have been taken care of.

BCNF: All determinants are candidate keys.

What Does a Database Do? It stores information in a highly organized manner. It manipulates information in various ways, some of which are not available in other applications or are easier to accomplish with a database. It models some real-world process or activity through electronic means; this is often called modeling a business process, and it often replicates the process only in appearance or end result.

The Design Process: (1) Identify the purpose of the database. (2) Review existing data. (3) Make a preliminary list of fields. (4) Make a preliminary list of tables and enter fields. (5) Identify the key fields. (6) Draft the table relationships. (7) Enter sample data and normalize the data/tables. (8) Review and finalize the design. [HANDOUT: EXERCISE 1]

1. Identify the Purpose of the DB: Clients can tell you what information they want but often have no idea what data they need. “We need to keep track of inventory.” “We need an order entry system.” “I need monthly sales reports.” “We need to provide our product catalog on the Web.” Be sure to limit the scope of the database.

1. Identify the Purpose of the DB (continued): Quite often, the stated intention implies data needs far beyond the client’s knowledge. Be sure to offer or question extension of the design to other areas. Example: Tracking inventory implies adjusting inventory in stock every time there is a sale, thus implying that some method of tracking sales is also needed.

1. Identify the Purpose of the DB (continued): The client may say “We have a database already for that”, which implies that you, the designer, may need to tap into the existing DB in some manner. Or the client may say “We don’t have the budget for that this year; just do the inventory tracking part and we’ll keep track of sales manually”, thus limiting the scope of your design.

2. Review Existing Data. Electronic: legacy database(s), spreadsheets, web forms. Manual: paper forms, receipts and other printed output.

3. Make Preliminary Field List: Make sure fields exist to support needs. Ex.: if the client wants monthly sales reports, you need a date field for orders. Ex.: to group employees by division, you need a division identifier. Make sure values are atomic. Ex.: first and last names stored separately. Ex.: addresses broken down into Street, City, State, etc. Do not store values that can be calculated from other values. Ex.: “Age” can be calculated from “Date of Birth”.

4. Make Preliminary Tables (and insert the fields into them): Each table holds info about one subject. Don’t worry about the quantity of tables. Look for logical groupings of information. Use a consistent naming convention.

Naming Conventions (rules of thumb): Table names must be unique in the DB and should be plural. Field names must be unique within their table. Clearly identify the table subject or field data. Be as brief as possible. Avoid abbreviations and acronyms. Use fewer than 30 characters. Use letters, numbers, and underscores (_). Do not use spaces or other special characters. Uniqueness of field names applies only to the table they are in: fields in different tables can have the same name, and linked fields usually should, so they are easily identified.

5. Identify the Key Fields. Primary key: can never be null and must hold unique values; automatically indexed in most RDBMSs; values rarely (if ever) change; try to include as few fields as possible. Multi-field primary key: a combination of two or more fields that uniquely identifies an individual record. Candidate key: a field or set of fields that qualifies as a primary key; important in Third and Boyce-Codd Normal Forms.

6. Identify Table Relationships: Based on the business rules being modeled. Examples: “each customer can place many orders”; “all employees belong to a department”; “each TA is assigned to one course”. Historical note: “Relational” as in “Relational Database” has nothing to do with “relationship” as in “table relationships”. Codd was a mathematician, and devised his rules for modern databases based on mathematical set theory. In set theory, when two groups of numbers have a correspondence of some kind, this is called a “relation”, and Codd named this type of database “relational” because the database storage structure follows some of the same rules as mathematical sets, not because we relate tables together.

7. Normalization: Normal Forms (NF) are design standards based on database design theory. Normalization is the process of applying the NFs to table design to eliminate redundancy and create a more efficient organization of DB storage. Each successive NF applies an increasingly stringent set of rules. Much of what we’ll talk about now, and much that you’ve already run into in your own experience, will tell you that common sense can avoid many of these problems. At the very least, some of the earlier steps in the design process will prevent these problems from occurring later in the process. But the normal forms are your safety net. If you aren’t sure whether something belongs in a table, run it through the normal forms to find out. Sometimes the problem isn’t in the table you’re currently analyzing, but in one you’ve already looked at.

8. Finalizing the Design: Double-check to ensure good, principle-based design. Evaluate the design in light of the business model and determine desired deviations from design principles, considering process efficiency and security concerns.

Design and Normalization Process Summary: Watch for repeating values and fields. Check against the Normal Forms. Make new tables when necessary. Re-check all tables against the NFs. Remember the business rules. Use common sense, but check anyway!

Assessing Table Structure

Counting Rows in a Table: To count the number of rows in a table, use the SQL COUNT(*) built-in aggregate function.
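The SQL for this slide did not survive in the transcript. A minimal sketch of a COUNT(*) query follows, assuming Python's standard sqlite3 module and the SKU_DATA table named earlier in the deck; the rows and the NumRows alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?)",
    [
        (1, "Hypothetical item A", "Buyer One"),
        (2, "Hypothetical item B", "Buyer Two"),
        (3, "Hypothetical item C", "Buyer One"),
    ],
)

# COUNT(*) counts every row in the table, whether or not columns hold NULL.
row_count = conn.execute("SELECT COUNT(*) AS NumRows FROM SKU_DATA").fetchone()[0]
print(row_count)
```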

Reasons for Counting Rows: There are various reasons why you might need to know the row count of database structures (tables, etc.), including: determining whether an application has loaded data; estimating how long a query might take to run; estimating how long updating statistics might take; estimating how long creating an index might take; understanding why a query plan has chosen a particular join type.

Examining the Columns: To determine the number and type of columns in a table, use an SQL SELECT statement. To limit the number of rows retrieved, use SQL TOP {NumberOfRows} (this is SQL Server syntax; other DBMS products use LIMIT or FETCH FIRST instead).
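As a rough illustration of examining columns, the sketch below uses Python's standard sqlite3 module. SQLite does not support TOP, so the example substitutes LIMIT, and it reads declared column types through SQLite's PRAGMA table_info, which is specific to SQLite. The rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?)",
    [(1, "Item A", "Buyer One"), (2, "Item B", "Buyer Two"), (3, "Item C", "Buyer One")],
)

# Retrieve only a few rows; LIMIT plays the role that TOP plays in SQL Server.
cursor = conn.execute("SELECT * FROM SKU_DATA LIMIT 2")
column_names = [d[0] for d in cursor.description]  # names of the result columns
sample_rows = cursor.fetchall()

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
column_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(SKU_DATA)")}

print(column_names, len(sample_rows), column_types)
```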

Checking Validity of Assumed Referential Integrity Constraints I: Given two tables with an assumed foreign key constraint: SKU_DATA (SKU, SKU_Description, Buyer) and BUYER (Buyer, Department), where SKU_DATA.Buyer must exist in BUYER.Buyer.

Checking Validity of Assumed Referential Integrity Constraints II: To find any foreign key values that violate the foreign key constraint, query for values of SKU_DATA.Buyer that do not appear in BUYER.Buyer. An empty set for the query result indicates that no foreign key values violate the foreign key constraint.
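The query itself is missing from the transcript. One common way to write such a check is a NOT IN subquery, sketched below with Python's standard sqlite3 module; the rows and the helper name find_orphan_buyers are hypothetical. The check is run twice, once on clean data and once after a violating row is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No foreign key is declared: the constraint is only assumed, as on the slide.
conn.execute("CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT)")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO BUYER VALUES (?, ?)",
    [("Buyer One", "Dept A"), ("Buyer Two", "Dept B")],
)
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?)",
    [(1, "Item A", "Buyer One"), (2, "Item B", "Buyer Two")],
)

CHECK_SQL = """
SELECT DISTINCT Buyer
FROM SKU_DATA
WHERE Buyer NOT IN (SELECT Buyer FROM BUYER)
"""

def find_orphan_buyers(connection):
    """Return SKU_DATA.Buyer values that have no match in BUYER.Buyer."""
    return [row[0] for row in connection.execute(CHECK_SQL)]

clean_result = find_orphan_buyers(conn)  # empty: no violations yet

# Add a row whose Buyer does not exist in BUYER, then re-run the check.
conn.execute("INSERT INTO SKU_DATA VALUES (3, 'Item C', 'Buyer Three')")
dirty_result = find_orphan_buyers(conn)

print(clean_result, dirty_result)
```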

Assessing Assumed Constraints: Placing constraints on how, when, and where data can be entered. Done after or along with table design. Part of the design process because many constraints are established at the database and table levels.

Referential Integrity: True relational databases support referential integrity: every non-null foreign key value must match an existing primary key value. In other words, every record in a related table must have a matching record in the primary table. This preserves the validity of foreign key values and is enforced at the database level. Why is this important? Referential integrity helps ensure that the database contains valid and usable values and records by preserving the connection between tables. Without it, table relationships quickly become meaningless and queries return unreliable results. The most common problem in the absence of referential integrity is the creation of orphan records: the primary key value is changed, causing the matching of the related records to fail. In most RDBMSs, referential integrity is not enforced unless you ask for it, probably because the software can’t tell from the table design whether you want it or not. So, what happens when you want to change the value on one side of a set of related records? Referential integrity in its absolute form won’t allow this, so…
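To make the enforcement behavior concrete, here is a small sketch using Python's standard sqlite3 module. It relies on SQLite-specific behavior: SQLite leaves foreign key enforcement off unless PRAGMA foreign_keys = ON is issued on the connection. Other DBMS products behave differently, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: enforcement must be switched on for each connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT)")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, "
    "Buyer TEXT REFERENCES BUYER (Buyer))"
)
# Hypothetical rows for illustration only.
conn.execute("INSERT INTO BUYER VALUES ('Buyer One', 'Dept A')")
conn.execute("INSERT INTO SKU_DATA VALUES (1, 'Item A', 'Buyer One')")

# A foreign key value with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO SKU_DATA VALUES (2, 'Item B', 'Nobody')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Changing a primary key value that related rows still reference is also
# rejected, because it would orphan those rows.
try:
    conn.execute("UPDATE BUYER SET Buyer = 'Buyer 1' WHERE Buyer = 'Buyer One'")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True

sku_rows = conn.execute("SELECT COUNT(*) FROM SKU_DATA").fetchone()[0]
print(insert_rejected, update_rejected, sku_rows)
```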

Levels of Enforcement: Referential integrity is enforced at the database level because it affects the relationship between two tables. Many other business rules are enforced at the field and table level to ensure data integrity. Business rule implementation should be documented: how and where it is enforced in the design. Some rules can’t be enforced at the table or field level and must be enforced at the application level.

Testing of Business Rules: Always test business rule implementation. What happens when the rule is met? What happens when the rule is violated? A data entry constraint is not much good if it doesn’t constrain properly. Good application or interface design will provide feedback when the user violates a constraint or rule.

Type of Database: Updatable database, or read-only database? If an updatable database, we normally want tables in BCNF. If a read-only database, we may not use BCNF tables.

Designing Updatable Databases: Updatable databases are typically the operational databases of a company, such as the online transaction processing (OLTP) system discussed for Cape Codd Outdoor Sports at the beginning of Chapter 2. If you are constructing an updatable database, then you need to be concerned about modification anomalies and inconsistent data. Consequently, you must carefully consider normalization principles.

Normalization: Advantages and Disadvantages. Why do we say reduce data duplication rather than eliminate data duplication? The answer is that we cannot eliminate all duplicated data because we must duplicate data in foreign keys. We cannot eliminate Buyer, for example, from the SKU_DATA table because we would then not be able to relate BUYER and SKU_DATA rows. Values of Buyer are thus duplicated in the BUYER and SKU_DATA tables. This observation leads to a second question: If we only reduce data duplication, how can we claim to eliminate inconsistent data values? Data duplication in foreign keys will not cause inconsistencies because referential integrity constraints prohibit them. As long as we enforce such constraints, the duplicate foreign key values will cause no inconsistencies.

Non-Normalized Table: EQUIPMENT_REPAIR

Normalized Tables: ITEM and REPAIR

Copying Data to New Tables: To copy data from one table to another, use the SQL INSERT statement.
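The INSERT statements for this slide are not in the transcript. The sketch below shows one plausible form, an INSERT with a SELECT subquery, run through Python's standard sqlite3 module. It uses the ITEM and REPAIR structures given earlier in the deck; the column types and the rows are assumptions made for illustration. DISTINCT is needed for ITEM because each item appears once per repair in EQUIPMENT_REPAIR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EQUIPMENT_REPAIR (ItemNumber INTEGER, Type TEXT, "
    "AcquisitionCost REAL, RepairNumber INTEGER, RepairDate TEXT, RepairAmount REAL)"
)
# Hypothetical rows for illustration only (not the textbook's data).
conn.executemany(
    "INSERT INTO EQUIPMENT_REPAIR VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, "Drill Press", 3500.00, 2000, "2015-05-05", 375.00),
        (200, "Lathe", 4750.00, 2100, "2015-05-07", 255.00),
        (100, "Drill Press", 3500.00, 2200, "2015-06-19", 178.00),
        (300, "Mill", 27300.00, 2300, "2015-06-19", 1875.00),
    ],
)

conn.execute(
    "CREATE TABLE ITEM (ItemNumber INTEGER PRIMARY KEY, Type TEXT, AcquisitionCost REAL)"
)
conn.execute(
    "CREATE TABLE REPAIR (RepairNumber INTEGER PRIMARY KEY, "
    "ItemNumber INTEGER REFERENCES ITEM (ItemNumber), RepairDate TEXT, RepairAmount REAL)"
)

# One ITEM row per distinct item.
conn.execute(
    "INSERT INTO ITEM (ItemNumber, Type, AcquisitionCost) "
    "SELECT DISTINCT ItemNumber, Type, AcquisitionCost FROM EQUIPMENT_REPAIR"
)
# One REPAIR row per repair, keeping ItemNumber as the foreign key.
conn.execute(
    "INSERT INTO REPAIR (RepairNumber, ItemNumber, RepairDate, RepairAmount) "
    "SELECT RepairNumber, ItemNumber, RepairDate, RepairAmount FROM EQUIPMENT_REPAIR"
)

item_count = conn.execute("SELECT COUNT(*) FROM ITEM").fetchone()[0]
repair_count = conn.execute("SELECT COUNT(*) FROM REPAIR").fetchone()[0]

# Joining the two new tables should reproduce the original rows.
original = conn.execute(
    "SELECT ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount "
    "FROM EQUIPMENT_REPAIR ORDER BY RepairNumber"
).fetchall()
rebuilt = conn.execute(
    "SELECT i.ItemNumber, i.Type, i.AcquisitionCost, r.RepairNumber, r.RepairDate, "
    "r.RepairAmount FROM REPAIR AS r JOIN ITEM AS i ON r.ItemNumber = i.ItemNumber "
    "ORDER BY r.RepairNumber"
).fetchall()
lossless = rebuilt == original

print(item_count, repair_count, lossless)
```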

Final Steps: In Chapters 7 and 8, you will learn how to: remove unneeded tables after the data is copied, using the SQL DROP TABLE statement; create the referential integrity constraint, using the SQL ALTER TABLE statement.

Choosing Not To Use BCNF: BCNF is used to control anomalies from functional dependencies. There are times when BCNF is not desirable. The classic example is ZIP codes: ZIP codes almost never change; any anomalies are likely to be caught by normal business practices; not having to use SQL to join data in two tables will speed up application processing.

Multivalued Dependencies: Anomalies from multivalued dependencies are very problematic. Always place the columns of a multivalued dependency into a separate table (4NF).

Designing Read-Only Databases: The extracted sales data that we used for Cape Codd Outdoor Sports in Chapter 2 is a small but typical example of a read-only database. Read-only databases are used in business intelligence (BI) systems for producing information for assessment, analysis, planning, and control, as we discussed for Cape Codd Outdoor Sports in Chapter 2. Read-only databases are commonly used in a data warehouse, which we also introduced in Chapter 2.

Read-Only Databases: Read-only databases are nonoperational databases using data extracted from operational databases. They are used for querying, reporting, and data mining applications. They are never updated (in the operational database sense; they may have new data imported from time to time).

Denormalization: For read-only databases, normalization is seldom an advantage. Application processing speed is more important. Denormalization is the joining of the data in normalized tables prior to storing the data. The data is then stored in nonnormalized tables.
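Denormalization as described here can be sketched as a join whose result is stored as a new table. The example below uses Python's standard sqlite3 module and the SKU_DATA and BUYER tables from earlier in the deck; the table name SKU_BUYER and the rows are hypothetical, and this is not necessarily the example shown in the slide images.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT)")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO BUYER VALUES (?, ?)",
    [("Buyer One", "Dept A"), ("Buyer Two", "Dept B")],
)
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?)",
    [(1, "Item A", "Buyer One"), (2, "Item B", "Buyer Two"), (3, "Item C", "Buyer One")],
)

# Join the normalized tables once and store the result, so that read-only
# queries no longer need to perform the join themselves.
conn.execute(
    "CREATE TABLE SKU_BUYER AS "
    "SELECT s.SKU, s.SKU_Description, s.Buyer, b.Department "
    "FROM SKU_DATA AS s JOIN BUYER AS b ON s.Buyer = b.Buyer"
)

denormalized = conn.execute(
    "SELECT SKU, Buyer, Department FROM SKU_BUYER ORDER BY SKU"
).fetchall()
print(denormalized)
```

The Department value is now repeated for every SKU a buyer handles, which is the redundancy that normalization removes and that a read-only database accepts in exchange for simpler, faster reads.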

Normalized Tables

Denormalizing the Data

Customized Tables I: Read-only databases are often designed with many copies of the same data, but with each copy customized for a specific application. Consider the PRODUCT table:

Customized Tables II

Common Design Problems

The Multivalue, Multicolumn Problem: The multivalue, multicolumn problem occurs when multiple values of an attribute are stored in more than one column: EMPLOYEE (EmployeeNumber, EmployeeLastName, Auto2_LicenseNumber, Auto3_LicenseNumber). This is another form of a multivalued dependency. Solution: as with the 4NF solution for multivalued dependencies, use a separate table to store the multiple values.
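One way to apply this solution is to move every non-null license number into its own row in a separate table. The sketch below uses Python's standard sqlite3 module and the EMPLOYEE columns exactly as listed on the slide; the table name EMPLOYEE_AUTO and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EmployeeNumber INTEGER PRIMARY KEY, EmployeeLastName TEXT, "
    "Auto2_LicenseNumber TEXT, Auto3_LicenseNumber TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (1, "Lastname A", "ABC123", "XYZ789"),
        (2, "Lastname B", "DEF456", None),
        (3, "Lastname C", None, None),
    ],
)

# Separate table: one row per (employee, license number) pair.
conn.execute(
    "CREATE TABLE EMPLOYEE_AUTO (EmployeeNumber INTEGER REFERENCES EMPLOYEE (EmployeeNumber), "
    "LicenseNumber TEXT, PRIMARY KEY (EmployeeNumber, LicenseNumber))"
)
# Stack the repeated columns into rows, skipping empty slots.
conn.execute(
    "INSERT INTO EMPLOYEE_AUTO (EmployeeNumber, LicenseNumber) "
    "SELECT EmployeeNumber, Auto2_LicenseNumber FROM EMPLOYEE "
    "WHERE Auto2_LicenseNumber IS NOT NULL "
    "UNION "
    "SELECT EmployeeNumber, Auto3_LicenseNumber FROM EMPLOYEE "
    "WHERE Auto3_LicenseNumber IS NOT NULL"
)

auto_rows = conn.execute(
    "SELECT EmployeeNumber, LicenseNumber FROM EMPLOYEE_AUTO "
    "ORDER BY EmployeeNumber, LicenseNumber"
).fetchall()
print(auto_rows)
```

With the values in rows, an employee can have any number of autos without a schema change, and a search for a license number touches one column instead of several.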

Inconsistent Values I: Inconsistent values occur when different users, or different data sources, use slightly different forms of the same data value. Different codings: SKU_Description = 'Corn, Large Can'; SKU_Description = 'Can, Corn, Large'; SKU_Description = 'Large Can Corn'. Different spellings: Coffee, Cofee, Coffeee.

Inconsistent Values II: Inconsistent primary or foreign key values are particularly problematic. To detect them: use the referential integrity check already discussed for checking keys; use the SQL GROUP BY clause on suspected columns.
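A GROUP BY check of this kind might look like the sketch below, which uses Python's standard sqlite3 module and the three SKU_Description codings quoted on the previous slide. The SKU numbers, the row counts, and the NameCount alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT)")
# Hypothetical rows using the three codings of the same product.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?)",
    [
        (1, "Corn, Large Can"),
        (2, "Can, Corn, Large"),
        (3, "Large Can Corn"),
        (4, "Corn, Large Can"),
    ],
)

# Listing each distinct value with its count puts near-duplicates side by side,
# where a person can spot them. The query does not decide which values match.
value_counts = conn.execute(
    "SELECT SKU_Description, COUNT(*) AS NameCount "
    "FROM SKU_DATA GROUP BY SKU_Description ORDER BY SKU_Description"
).fetchall()
print(value_counts)
```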

Inconsistent Values III

Missing Values: A missing value or null value is a value that has never been provided. In a database table, a null value appears in upper case letters as NULL.

Null Values: Null values are ambiguous. A null may indicate that a value is inappropriate: DateOfLastChildbirth is inappropriate for a male. It may indicate that a value is appropriate but unknown: DateOfLastChildbirth is appropriate for a female, but may be unknown. It may indicate that a value is appropriate and known, but has never been entered: DateOfLastChildbirth is appropriate for a female and may be known, but no one has recorded it in the database.

Checking for Null Values: Use the SQL IS NULL operator to check for null values.
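A short sketch of an IS NULL check follows, using Python's standard sqlite3 module. The choice of the Buyer column and the rows are hypothetical. The second query is included to show why IS NULL is needed: a comparison written as = NULL never evaluates to true, so it matches no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, Buyer TEXT)"
)
# Hypothetical rows for illustration only; one row has no Buyer recorded.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?)",
    [(1, "Item A", "Buyer One"), (2, "Item B", None), (3, "Item C", "Buyer Two")],
)

# Count the rows where Buyer is missing.
null_count = conn.execute(
    "SELECT COUNT(*) FROM SKU_DATA WHERE Buyer IS NULL"
).fetchone()[0]

# Comparing with = NULL yields unknown rather than true, so nothing matches.
equals_null_count = conn.execute(
    "SELECT COUNT(*) FROM SKU_DATA WHERE Buyer = NULL"
).fetchone()[0]

print(null_count, equals_null_count)
```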

The General-Purpose Remarks Column: A general-purpose remarks column is a column with a name such as Remarks, Comments, or Notes. It often contains important data stored in an inconsistent, verbal, and verbose way. A typical use is to store data on a customer’s interests. Such a column may be used inconsistently and may hold multiple data items.

The General-Purpose Remarks Column: Hidden Foreign Key Data. In a typical situation, the data for the foreign key may have been recorded in the Remarks column: 'Wants to buy a Piper Seneca II'; 'Owner of a Piper Seneca II'; 'Possible buyer for a turbo Seneca'.

End of Presentation: Chapter Four. David Kroenke and David Auer, Database Processing: Fundamentals, Design, and Implementation (14th Edition)
