1
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 COS 346 Day4
2
Agenda
Questions?
Assignment Two is posted
  - Marcia’s Dry Cleaning Project on pages 97 and 98, questions A through F
  - Due Feb 6 at 3:35 PM
Finish discussion of the Relational Model and Normalization
Discussion of Database Design Using Normalization
3
Eliminating Modification Anomalies from Functional Dependencies in Relations
Put all relations into Boyce-Codd Normal Form (BCNF).
4
Putting a Relation into BCNF: EQUIPMENT_REPAIR
5
Putting a Relation into BCNF: EQUIPMENT_REPAIR
Original relation:
    EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount)
Functional dependencies:
    ItemNumber -> (Type, AcquisitionCost)
    RepairNumber -> (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)
Relations in BCNF:
    ITEM (ItemNumber, Type, AcquisitionCost)
    REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)
    Where REPAIR.ItemNumber must exist in ITEM.ItemNumber
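The functional dependencies above can be tested against sample rows. The sketch below is only an illustration: `fd_holds` is a hypothetical helper, not something from the text, and the rows are invented. Sample data can refute a functional dependency but cannot prove that it holds in general.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if every determinant value maps to a single dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        val = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Invented sample rows for EQUIPMENT_REPAIR (assumption, not data from the text).
rows = [
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500.0,
     "RepairNumber": 2000, "RepairDate": "2006-05-05", "RepairAmount": 375.0},
    {"ItemNumber": 200, "Type": "Lathe", "AcquisitionCost": 4750.0,
     "RepairNumber": 2100, "RepairDate": "2006-05-07", "RepairAmount": 255.0},
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500.0,
     "RepairNumber": 2200, "RepairDate": "2006-06-19", "RepairAmount": 178.0},
]

# ItemNumber -> (Type, AcquisitionCost)
item_fd = fd_holds(rows, ["ItemNumber"], ["Type", "AcquisitionCost"])
# RepairNumber -> (ItemNumber, RepairDate, RepairAmount)
repair_fd = fd_holds(rows, ["RepairNumber"],
                     ["ItemNumber", "RepairDate", "RepairAmount"])
# ItemNumber does not determine RepairNumber, so it is not a candidate key.
item_is_key = fd_holds(rows, ["ItemNumber"], ["RepairNumber"])
```

Because ItemNumber is a determinant but not a candidate key of EQUIPMENT_REPAIR, the relation is not in BCNF, which is what motivates splitting it into ITEM and REPAIR.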
6
Putting a Relation into BCNF: New Relations
7
Putting a Relation into BCNF: SKU_DATA
8
Putting a Relation into BCNF: SKU_DATA
Original relation:
    SKU_DATA (SKU, SKU_Description, Department, Buyer)
Functional dependencies:
    SKU -> (SKU_Description, Department, Buyer)
    SKU_Description -> (SKU, Department, Buyer)
    Buyer -> Department
Relations in BCNF:
    SKU_DATA (SKU, SKU_Description, Buyer)
    BUYER (Buyer, Department)
    Where SKU_DATA.Buyer must exist in BUYER.Buyer
9
Putting a Relation into BCNF: New Relations
10
Multivalued Dependencies
A multivalued dependency occurs when a determinant determines a particular set (one or more) of values:
    Employee ->-> Degree
    Employee ->-> Sibling
    PartKit ->-> Part
The determinant of a multivalued dependency can never be a primary key
11
Multivalued Dependencies
12
Eliminating Anomalies from Multivalued Dependencies
Multivalued dependencies are not a problem if they are in a separate relation, so:
  - Always put multivalued dependencies into their own relation
  - This is known as Fourth Normal Form (4NF)
As long as the relation is also in BCNF!
13
Fixing 4NF (Generally Speaking)
A relation R(A, B, C)
  - A ->-> B
  - A ->-> C
  - B and C are independent
Create R(A, B) and R1(A, C)
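A small sketch of this split, using Python sets as stand-ins for relations. The rows are invented, with A as Employee, B as Degree, and C as Sibling. When B and C are independent given A, projecting onto (A, B) and (A, C) and joining back on A should reproduce the original relation.

```python
# Invented rows for R(A, B, C): (Employee, Degree, Sibling).
r = {
    ("Jones", "BS", "Fred"), ("Jones", "BS", "Sally"),
    ("Jones", "MS", "Fred"), ("Jones", "MS", "Sally"),
    ("Chau", "BA", "Eileen"),
}

# Project onto the two new relations; sets drop duplicate rows automatically.
r_ab = {(a, b) for a, b, _ in r}   # R(A, B)
r_ac = {(a, c) for a, _, c in r}   # R1(A, C)

# Natural join on A: pair every B with every C that shares the same A.
rejoined = {(a, b, c) for a, b in r_ab for a2, c in r_ac if a == a2}

# The decomposition is lossless here: the join gives back exactly R.
lossless = rejoined == r
```

The five original rows shrink to three rows in each projection, and adding a new degree for Jones now takes one row instead of one row per sibling.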
14
Fifth Normal Form (5NF)
The Fifth Normal Form concerns dependencies that are obscure and beyond the scope of this text. Punt!
15
Domain/Key Normal Form (DK/NF)
To be in Domain/Key Normal Form (DK/NF), every constraint on the relation must be a logical consequence of the definition of keys and domains.
The Ultimate Normal Form
  - Fagin, 1981
  - NO possible anomalies
16
DK/NF Terminology
Constraint
  - A rule governing static values of attributes
Key
  - A unique identifier of a tuple
Domain
  - A description of an attribute’s allowable values
17
De-normalized Designs
When a normalized design is unnatural, awkward, or results in unacceptable performance, a de-normalized design is preferred
Example
  - Normalized relations
      CUSTOMER (CustNumber, CustName, Zip)
      CODES (Zip, City, State)
  - De-normalized relation
      CUSTOMER (CustNumber, CustName, City, State, Zip)
18
David M. Kroenke’s Database Processing: Fundamentals, Design, and Implementation (10th Edition)
End of Presentation: Chapter Three
19
David M. Kroenke’s Database Processing: Fundamentals, Design, and Implementation
Chapter Four: Database Design Using Normalization
20
Chapter Premise
We have received one or more tables of existing data
The data is to be stored in a new database
QUESTION: Should the data be stored as received, or should it be transformed for storage?
21
How Many Tables?
Should we store these two tables as they are, or should we combine them into one table in our new database?
22
Assessing Table Structure
23
Counting Rows in a Table
To count the number of rows in a table, use the SQL built-in function COUNT(*):
    SELECT COUNT(*) AS NumRows
    FROM SKU_DATA;
24
Examining the Columns
To determine the number and type of columns in a table, use an SQL SELECT statement
To limit the number of rows retrieved, use the SQL TOP {NumberOfRows} keyword:
    SELECT TOP (10) *
    FROM SKU_DATA;
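The two assessment queries can be tried without a database server by using Python's built-in sqlite3 module. This is a sketch under assumptions: the SKU_DATA rows are invented, and SQLite is only a convenient stand-in. TOP is SQL Server syntax; SQLite expresses the same limit with a LIMIT clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT, "
    "Department TEXT, Buyer TEXT)"
)
# Invented sample rows (assumption, not data from the text).
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?, ?, ?)",
    [
        (100100, "Std. Scuba Tank, Yellow", "Water Sports", "Pete Hansen"),
        (101100, "Dive Mask, Small Clear", "Water Sports", "Nancy Meyers"),
        (201000, "Half-dome Tent", "Camping", "Cindy Lo"),
    ],
)

# Count the rows in the table.
num_rows = conn.execute("SELECT COUNT(*) AS NumRows FROM SKU_DATA").fetchone()[0]

# Look at a few rows; LIMIT plays the role of TOP in SQLite.
cursor = conn.execute("SELECT * FROM SKU_DATA LIMIT 2")
column_names = [col[0] for col in cursor.description]  # names of returned columns
sample = cursor.fetchall()
```

Here `cursor.description` supplies the column names of the result, which answers the "how many columns and what are they" question alongside the sample rows.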
25
Checking Validity of Assumed Referential Integrity Constraints
Given two tables with an assumed foreign key constraint:
    SKU_DATA (SKU, SKU_Description, Department, Buyer)
    BUYER (BuyerName, Department)
    Where SKU_DATA.Buyer must exist in BUYER.BuyerName
26
Checking Validity of Assumed Referential Integrity Constraints
To find any foreign key values that violate the foreign key constraint:
    SELECT Buyer
    FROM SKU_DATA
    WHERE Buyer NOT IN
        (SELECT Buyer
         FROM SKU_DATA, BUYER
         WHERE SKU_DATA.Buyer = BUYER.BuyerName);
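A runnable sketch of this check, again using sqlite3 as a stand-in engine with invented rows. One SKU_DATA row deliberately names a buyer that is missing from BUYER, so the query should report exactly that value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BUYER (BuyerName TEXT PRIMARY KEY, Department TEXT);
CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT,
                       Department TEXT, Buyer TEXT);

-- Invented sample data; 'Cindy Lo' has no matching BUYER row.
INSERT INTO BUYER VALUES
    ('Pete Hansen', 'Water Sports'),
    ('Nancy Meyers', 'Water Sports');
INSERT INTO SKU_DATA VALUES
    (100100, 'Std. Scuba Tank, Yellow', 'Water Sports', 'Pete Hansen'),
    (101100, 'Dive Mask, Small Clear', 'Water Sports', 'Nancy Meyers'),
    (201000, 'Half-dome Tent', 'Camping', 'Cindy Lo');
""")

# Buyers in SKU_DATA that do not appear in BUYER.BuyerName.
orphans = conn.execute("""
    SELECT Buyer
    FROM SKU_DATA
    WHERE Buyer NOT IN
        (SELECT Buyer
         FROM SKU_DATA, BUYER
         WHERE SKU_DATA.Buyer = BUYER.BuyerName)
""").fetchall()
```

An empty result would mean the assumed constraint holds for the data received. Rows whose Buyer is NULL are not reported by NOT IN, so nulls need a separate IS NULL check.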
27
Type of Database
Updateable database or read-only database?
If the database is updateable, we normally want tables in BCNF
If the database is read-only, we may not use BCNF tables
28
Designing Updateable Databases
29
Normalization: Advantages and Disadvantages
30
Non-Normalized Table: EQUIPMENT_REPAIR
31
Normalized Tables: ITEM and REPAIR
32
Copying Data to New Tables
To copy data from one table to another, use the SQL INSERT INTO TableName command:
    INSERT INTO ITEM
    SELECT DISTINCT ItemNumber, Type, AcquisitionCost
    FROM EQUIPMENT_REPAIR;

    INSERT INTO REPAIR
    SELECT ItemNumber, RepairNumber, RepairDate, RepairAmount
    FROM EQUIPMENT_REPAIR;
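The copy can be tried end to end with sqlite3. The column types and sample rows below are assumptions made for illustration; the two INSERT INTO ... SELECT statements are the ones from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column types are assumed; the slides give only the column names.
CREATE TABLE EQUIPMENT_REPAIR (
    ItemNumber INTEGER, Type TEXT, AcquisitionCost REAL,
    RepairNumber INTEGER PRIMARY KEY, RepairDate TEXT, RepairAmount REAL);
CREATE TABLE ITEM (
    ItemNumber INTEGER PRIMARY KEY, Type TEXT, AcquisitionCost REAL);
CREATE TABLE REPAIR (
    ItemNumber INTEGER REFERENCES ITEM (ItemNumber),
    RepairNumber INTEGER PRIMARY KEY, RepairDate TEXT, RepairAmount REAL);

-- Invented sample data: item 100 was repaired twice.
INSERT INTO EQUIPMENT_REPAIR VALUES
    (100, 'Drill Press', 3500.0, 2000, '2006-05-05', 375.0),
    (200, 'Lathe',       4750.0, 2100, '2006-05-07', 255.0),
    (100, 'Drill Press', 3500.0, 2200, '2006-06-19', 178.0);

-- DISTINCT collapses the repeated item data to one row per item.
INSERT INTO ITEM
    SELECT DISTINCT ItemNumber, Type, AcquisitionCost
    FROM EQUIPMENT_REPAIR;

INSERT INTO REPAIR
    SELECT ItemNumber, RepairNumber, RepairDate, RepairAmount
    FROM EQUIPMENT_REPAIR;
""")

item_count = conn.execute("SELECT COUNT(*) FROM ITEM").fetchone()[0]
repair_count = conn.execute("SELECT COUNT(*) FROM REPAIR").fetchone()[0]
```

ITEM is loaded before REPAIR so that every REPAIR.ItemNumber already has a parent row, which matters on engines that enforce the foreign key during the load.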
33
Choosing Not to Use BCNF
BCNF is used to control anomalies from functional dependencies
There are times when BCNF is not desirable
The classic example is ZIP codes:
  - ZIP codes almost never change
  - Any anomalies are likely to be caught by normal business practices
  - Not having to use SQL to join data in two tables will speed up application processing
34
Multivalued Dependencies
Anomalies from multivalued dependencies are very problematic
Always place the columns of a multivalued dependency into a separate table (4NF)
35
Designing Read-Only Databases
36
Read-Only Databases
Read-only databases are non-operational databases using data extracted from operational databases
They are used for querying, reporting and data mining applications
They are never updated (in the operational database sense; they may have new data imported from time to time)
37
Denormalization
For read-only databases, normalization is seldom an advantage
  - Application processing speed is more important
Denormalization is the joining of data in normalized tables prior to storing the data
The data is then stored in non-normalized tables
38
Normalized Tables
39
Denormalizing the Data
    INSERT INTO PAYMENT_DATA
    SELECT STUDENT.SID, Name, CLUB.Club, Cost, AmtPaid
    FROM STUDENT, PAYMENT, CLUB
    WHERE STUDENT.SID = PAYMENT.SID
      AND PAYMENT.Club = CLUB.Club;
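A sketch of this denormalizing join in sqlite3. The table definitions are guesses inferred from the column names in the query (the slide showing the normalized tables did not survive as text), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schemas, inferred from the columns used in the query.
CREATE TABLE STUDENT (SID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CLUB (Club TEXT PRIMARY KEY, Cost REAL);
CREATE TABLE PAYMENT (SID INTEGER, Club TEXT, AmtPaid REAL,
                      PRIMARY KEY (SID, Club));
CREATE TABLE PAYMENT_DATA (SID INTEGER, Name TEXT, Club TEXT,
                           Cost REAL, AmtPaid REAL);

-- Invented sample data.
INSERT INTO STUDENT VALUES (1000, 'Francisco'), (1001, 'Rich');
INSERT INTO CLUB VALUES ('Ski', 200.0), ('Scuba', 175.0);
INSERT INTO PAYMENT VALUES
    (1000, 'Ski', 50.0), (1001, 'Ski', 200.0), (1001, 'Scuba', 75.0);

-- Join the three normalized tables and store the result as one wide table.
INSERT INTO PAYMENT_DATA
    SELECT STUDENT.SID, Name, CLUB.Club, Cost, AmtPaid
    FROM STUDENT, PAYMENT, CLUB
    WHERE STUDENT.SID = PAYMENT.SID
      AND PAYMENT.Club = CLUB.Club;
""")

rows = conn.execute(
    "SELECT SID, Name, Club, Cost, AmtPaid FROM PAYMENT_DATA ORDER BY SID, Club"
).fetchall()
```

Each payment becomes one self-contained row, so the student name and club cost repeat. That redundancy is acceptable in a read-only database because no update can make the copies disagree.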
40
Customized Tables
Read-only databases are often designed with many copies of the same data, but with each copy customized for a specific application
Consider the PRODUCT table:
41
Customized Tables
PRODUCT_PURCHASING (SKU, SKU_Description, VendorNumber, VendorName, VendorContact_1, VendorContact_2, VendorStreet, VendorCity, VendorState, VendorZip)
PRODUCT_USAGE (SKU, SKU_Description, QuantitySoldPastYear, QuantitySoldPastQuarter, QuantitySoldPastMonth)
PRODUCT_WEB (SKU, DetailPicture, ThumbnailPicture, MarketingShortDescription, MarketingLongDescription, PartColor)
PRODUCT_INVENTORY (SKU, PartNumber, SKU_Description, UnitsCode, BinNumber, ProductionKeyCode)
42
Common Design Problems
43
The Multivalue, Multicolumn Problem
The multivalue, multicolumn problem occurs when multiple values of an attribute are stored in more than one column:
    EMPLOYEE (EmpNumber, Name, Email, Auto1_LicenseNumber, Auto2_LicenseNumber, Auto3_LicenseNumber)
This is another form of a multivalued dependency
Solution: Like the 4NF solution for multivalued dependencies, use a separate table to store the multiple values
The example on page 110 is wrong. Can you tell me why?
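One possible way to apply the separate-table solution is sketched below with sqlite3. The names EMPLOYEE_OLD and AUTO, the column types, and the rows are all hypothetical; the text names only the EMPLOYEE columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- EMPLOYEE_OLD holds the data as received (hypothetical name).
CREATE TABLE EMPLOYEE_OLD (
    EmpNumber INTEGER PRIMARY KEY, Name TEXT, Email TEXT,
    Auto1_LicenseNumber TEXT, Auto2_LicenseNumber TEXT, Auto3_LicenseNumber TEXT);
CREATE TABLE EMPLOYEE (EmpNumber INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
-- AUTO is a hypothetical table holding one row per employee vehicle.
CREATE TABLE AUTO (
    EmpNumber INTEGER REFERENCES EMPLOYEE (EmpNumber),
    LicenseNumber TEXT,
    PRIMARY KEY (EmpNumber, LicenseNumber));

-- Invented sample data: one employee with two autos, one with none.
INSERT INTO EMPLOYEE_OLD VALUES
    (1, 'Ana', 'ana@example.com', 'ABC123', 'XYZ789', NULL),
    (2, 'Ben', 'ben@example.com', NULL, NULL, NULL);

INSERT INTO EMPLOYEE SELECT EmpNumber, Name, Email FROM EMPLOYEE_OLD;

-- Stack the three license columns into rows, skipping empty slots.
INSERT INTO AUTO
    SELECT EmpNumber, Auto1_LicenseNumber FROM EMPLOYEE_OLD
        WHERE Auto1_LicenseNumber IS NOT NULL
    UNION
    SELECT EmpNumber, Auto2_LicenseNumber FROM EMPLOYEE_OLD
        WHERE Auto2_LicenseNumber IS NOT NULL
    UNION
    SELECT EmpNumber, Auto3_LicenseNumber FROM EMPLOYEE_OLD
        WHERE Auto3_LicenseNumber IS NOT NULL;
""")

autos = conn.execute(
    "SELECT EmpNumber, LicenseNumber FROM AUTO ORDER BY EmpNumber, LicenseNumber"
).fetchall()
```

With the values in rows rather than columns, a fourth auto needs no schema change, and finding a license number no longer means searching three columns.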
44
Inconsistent Values
Inconsistent values occur when different users or different data sources use slightly different forms of the same data value:
  - Different codings:
      SKU_Description = 'Corn, Large Can'
      SKU_Description = 'Can, Corn, Large'
      SKU_Description = 'Large Can Corn'
  - Different spellings:
      Coffee, Cofee, Coffeee
45
Inconsistent Values
Inconsistent values in primary or foreign key columns are particularly problematic
To detect:
  - Use the referential integrity check already discussed for checking keys
  - Use the SQL GROUP BY clause on suspected columns
    SELECT SKU_Description, COUNT(*) AS NameCount
    FROM SKU_DATA
    GROUP BY SKU_Description;
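A sketch of the GROUP BY check in sqlite3, with invented rows that include the three codings of the same product. An ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, SKU_Description TEXT)")
# Invented rows: the same product described three different ways.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?)",
    [
        (1, "Corn, Large Can"),
        (2, "Can, Corn, Large"),
        (3, "Large Can Corn"),
        (4, "Corn, Large Can"),
    ],
)

# One output row per distinct spelling, with the number of rows using it.
counts = conn.execute("""
    SELECT SKU_Description, COUNT(*) AS NameCount
    FROM SKU_DATA
    GROUP BY SKU_Description
    ORDER BY SKU_Description
""").fetchall()
```

The query only lists the distinct values and how often each occurs. Deciding that three of them describe the same product still takes a person reading the list, and low counts next to a near-identical high-count value are a useful hint.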
46
Missing Values
A missing value or null value is a value that has never been provided
47
Null Values
Null values are ambiguous:
  - May indicate that a value is inappropriate:
      DateOfLastChildbirth is inappropriate for a male
  - May indicate that a value is appropriate but unknown:
      DateOfLastChildbirth is appropriate for a female, but may be unknown
  - May indicate that a value is appropriate and known, but has never been entered:
      DateOfLastChildbirth is appropriate for a female, and may be known, but no one has recorded it in the database
48
Checking for Null Values
Use the SQL keyword IS NULL to check for null values:
    SELECT COUNT(*) AS QuantityNullCount
    FROM ORDER_ITEM
    WHERE Quantity IS NULL;
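A sketch of the null count in sqlite3. The ORDER_ITEM columns other than Quantity, and all of the rows, are assumptions. The second query is included to show why IS NULL is needed: a comparison with = NULL is never true, so it matches no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only Quantity comes from the slide; the other columns are assumed.
conn.execute("CREATE TABLE ORDER_ITEM (OrderNumber INTEGER, SKU INTEGER, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO ORDER_ITEM VALUES (?, ?, ?)",
    [
        (3000, 100100, 1),
        (3000, 101100, None),   # Python None is stored as SQL NULL
        (3001, 201000, None),
        (3001, 100100, 4),
    ],
)

# Correct check: IS NULL.
null_count = conn.execute(
    "SELECT COUNT(*) AS QuantityNullCount FROM ORDER_ITEM WHERE Quantity IS NULL"
).fetchone()[0]

# Common mistake: "= NULL" evaluates to unknown for every row.
equals_null_count = conn.execute(
    "SELECT COUNT(*) FROM ORDER_ITEM WHERE Quantity = NULL"
).fetchone()[0]
```

Comparing `null_count` with the total row count gives a quick sense of how much of a column is missing before deciding how to treat it.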
49
The General-Purpose Remarks Column
A general-purpose remarks column is a column with a name such as:
  - Remarks
  - Comments
  - Notes
It often contains important data stored in an inconsistent, verbal and verbose way
  - A typical use is to store data on a customer’s interests
Such a column may:
  - Be used inconsistently
  - Hold multiple data items
50
David M. Kroenke’s Database Processing: Fundamentals, Design, and Implementation (10th Edition)
End of Presentation: Chapter Four