DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-1 David M. Kroenke’s Chapter Four: Database Design Using Normalization Database Processing: Fundamentals, Design, and Implementation
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-2 Chapter Premise Received one or more tables of existing data Need to store data in new database Problem: Store data as received, or transform?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-3 How Many Tables? One, two, more?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-4 Assessing Table Structure
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-5 Counting Rows in a Table To count the number of rows in a table use the SQL built-in function COUNT(*): SELECTCOUNT(*) AS NumRows FROMSKU_DATA;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-6 Examining the Columns To determine the number and type of columns in a table, use an SQL SELECT statement To limit the number of rows retreived, use the SQL TOP {NumberOfRows} keyword: SELECTTOP (10) * FROM SKU_DATA;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-7 Checking Validity of Assumed Referential Integrity Constraints Given two tables with an assumed foreign key constraint: SKU_DATA (SKU, SKU_Description, Department, Buyer) BUYER(BuyerName, Department) Where SKU_DATA.Buyer must exist in BUYER.BuyerName
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-8 Checking Validity of Assumed Referential Integrity Constraints To find any foreign key values that violate the foreign key constraint: SELECTBuyer FROM SKU_DATA WHEREBuyer NOT IT (SELECTBuyer FROM SKU_DATA, BUYER WHERESKU_DATA.BUYER = BUYER.BuyerName;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-9 Type of Database Updateable Read-only
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-10 Updateable Database Normally, put tables in BCNF Always, remove MVD
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-11 Read-Only Database Likely not to use BCNF tables Remove MVD
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-12 Designing Updateable Databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-13 Normalization: Advantages and Disadvantages
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-14 Choosing Not to Use BCNF BCNF controls anomalies from functional dependencies BCNF is not, always, desirable Classic ZIP code example
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-15 Multivaled Dependencies MVDs cause very problematic anomalies Always place MVDs into a separate tables for operational databases.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-16 Designing Read-Only Databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-17 Read-Only Database non-operational database data extracted from operational databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-18 Read-Only Databases For applications –querying –reporting –data mining NEVER updated operationally (new data imported)
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-19 Read-Only Databases Data Warehouse Data Mart Decision Support System
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-20 Read-Only Databases Normalization seldom an advantage Denormalization –Join data from normalized tables To create a new table –Store New non-normalized tables Not original normalized tables
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-21 Read-Only Databases Often, many copies of the same data Each customized for a specific application
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-22 Customized Tables PRODUCT_PURCHASING (SKU, SKU_Description, VendorNumber, VendorName, VendorContact_1, VendorContact_2, VendorStreet, VendorCity, VendorState, VendorZip) PRODUCT_USAGE (SKU, SKU_Description, QuantitySoldPastYear, QuantitySoldPastQuarter, QuantitySoldPastMonth) PRODUCT_WEB (SKU, DetailPicture, ThumbnailPicture, MarketingShortDescription, MarketingLongDescription, PartColor) PRODUCT_INVENTORY (SKU, PartNumber, SKU_Description, UnitsCode, BinNumber, ProductionKeyCode)
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-23 Creating Databases from Existing Tables Generally, accomplished by automated means –Import –SQL Beyond the scope of this class
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-24 Common Design Problems
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-25 The Multivalue, Multicolumn Problem WORKER (WorkerID, Name, Skill-Type, BuildID1, BuildID2, BuildID3) Problems?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-26 The Multivalue, Multicolumn Problem WORKER (WorkerID, Name, Skill-Type, BuildID1, BuildID2, BuildID3) Another form of a MVD A Different Solution: –use separate table to store multiple values –WORKER ( WorkerID, Name, Skill-Type) –ASSIGNMENT (WorkerID, BuildingID
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-27 Inconsistent Values: Different forms of the same data value Different codings: CommonName = ‘Blueberry' CommonName = ‘Huckleberry' CommonName = ‘Wild Blueberry‘ Different spellings: ‘Blueberry’ ‘Blueberries’ ‘BLUBERRY’
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-28 Inconsistent Values Particularly problematic are –primary key values,or –foreign key values To detect: –Some semi-automatic methods –Beyond the scope of this class
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-29 Null Values Missing value Not a blank Not a zero Never been provided!
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-30 Null Values Ambiguous: –May indicate n/a –May indicate applicable but unknown –May indicate value applicable and known, but never entered When is your DB fault intolerant of null values?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-31 The General-Purpose Remarks Column Black Hole –Remarks –Comments –Notes Such a column may: –Be used inconsistently –Hold multiple data items
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-32 The General-Purpose Remarks Column Example – Figure 4-10, p. 118
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-33 The General-Purpose Remarks Column Decompose column into separate columns Manual task
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis 4-34 David M. Kroenke’s Database Processing Fundamentals, Design, and Implementation (10 th Edition) End of Presentation: Chapter Four