CSIS 115 Database Design and Applications for Business Dr. Meg Fryling “Dr. Meg” Fall 2012 @SienaDrMeg #csis115 © 2012 Meg Fryling
Agenda: Chapter 3: The Relational Model and Normalization. Next quiz (Monday, 11/19): minimum cardinality and referential integrity (including cascade updates and deletes), how to enforce those constraints, and what they mean. As usual, you can have a cheat sheet!
Normalizing for Data Integrity: Data integrity problems happen when data are duplicated. Normalized tables eliminate data duplication. The general goal of normalization is to construct tables so that every table has a single topic or theme. One fact – one place! Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall
Normalization: The process of converting a poorly structured table into two or more well-structured tables. The problem with the tables on this slide is that they have two independent themes: Employees and Departments. [Table before update] [Table after update] What’s wrong after the update? Some rows show DeptNo 100 as “Accounting and Finance” and others show DeptNo 100 as “Accounting.” Which one is correct?
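The DeptNo 100 problem can be detected mechanically. The sketch below is only an illustration: the rows are made-up stand-ins for the slide's table (which is not reproduced here), and `conflicting_departments` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows (EmployeeName, DeptNo, DeptName); the values are invented
# to mimic the "table after update" described on the slide.
employees = [
    ("Employee A", 100, "Accounting and Finance"),
    ("Employee B", 100, "Accounting"),
    ("Employee C", 200, "Marketing"),
]

def conflicting_departments(rows):
    """Return {DeptNo: sorted names} for every DeptNo stored with more than one name."""
    names = defaultdict(set)
    for _employee, dept_no, dept_name in rows:
        names[dept_no].add(dept_name)
    return {no: sorted(ns) for no, ns in names.items() if len(ns) > 1}

conflicts = conflicting_departments(employees)
print(conflicts)  # {100: ['Accounting', 'Accounting and Finance']}
```

Any DeptNo that shows up in the result is stored in more than one place with different values, which is exactly the integrity problem normalization is meant to remove.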
Normalization Steps: (1) Create a new table for the separate theme (the repeated data). (2) Keep a copy of the new table’s primary key in the original table as a foreign key. (3) Create a relationship between the original and the new table.
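The three steps can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the course's own solution: the column names (`EmployeeNumber`, `EmployeeName`, `DeptName`) and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Step 1: a new table for the separate theme (departments).
conn.execute("CREATE TABLE DEPARTMENT (DeptNo INTEGER PRIMARY KEY, DeptName TEXT NOT NULL)")

# Steps 2 and 3: the original table keeps DeptNo as a foreign key,
# which creates the relationship between the two tables.
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EmployeeNumber INTEGER PRIMARY KEY,
        EmployeeName   TEXT NOT NULL,
        DeptNo         INTEGER NOT NULL REFERENCES DEPARTMENT(DeptNo)
    )
""")

# Hypothetical sample data.
conn.execute("INSERT INTO DEPARTMENT VALUES (100, 'Accounting')")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(1, "Employee A", 100), (2, "Employee B", 100)])

# One fact, one place: renaming the department touches a single row.
conn.execute("UPDATE DEPARTMENT SET DeptName = 'Accounting and Finance' WHERE DeptNo = 100")
dept_names = [row[0] for row in conn.execute(
    "SELECT d.DeptName FROM EMPLOYEE e JOIN DEPARTMENT d ON d.DeptNo = e.DeptNo "
    "ORDER BY e.EmployeeNumber")]
print(dept_names)  # ['Accounting and Finance', 'Accounting and Finance']

# Referential integrity: an employee cannot point at a department that does not exist.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (3, 'Employee C', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

After the split, every employee in department 100 sees the same name, because the name is stored once.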
Normal Forms: A systematic way to remove update anomalies and improve data integrity. Each higher layer prevents a different update anomaly. We move “up the ladder” from 0NF, and each level (normal form) removes more anomalies. The ladder, from bottom to top: 0NF, 1NF, 2NF, 3NF / BCNF, 4NF, 5NF and DKNF. Since 3NF / BCNF (Boyce-Codd Normal Form) is easy to describe and fixes most update problems, we will focus on it. BCNF ~ 3.5NF: only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF. Source: http://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
Our Goal (BCNF): Make any needed design changes so that every determinant in every table is a candidate key. Well, let’s define…
Candidate and Primary Keys: A candidate key is a key that functionally determines all of the other columns (fields) in a record. A primary key is a candidate key selected as the primary means of identifying rows in a relation. There is only one primary key per relation (table). The primary key may be a composite key. The ideal primary key is short, numeric, and never changes (Chapter 6). By definition, a candidate key of a relation functionally determines all other attributes in the row. Likewise, by definition, a primary key of a relation functionally determines all other attributes in the row. KROENKE AND AUER - DATABASE PROCESSING, 11th Edition © 2010 Pearson Prentice Hall
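One practical way to screen columns for candidate keys is to check whether their values are unique across the rows you have. The sketch below assumes made-up STUDENT rows and a hypothetical helper `is_unique`. Note that sample data can only rule a column out: uniqueness in a few rows does not prove that a column is a candidate key.

```python
def is_unique(rows, columns):
    """True if no two rows share the same combination of values in `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical sample rows; the SSN values are placeholders, not real numbers.
students = [
    {"SID": 1, "LastName": "Smith", "SSN": "000-00-0001"},
    {"SID": 2, "LastName": "Smith", "SSN": "000-00-0002"},
    {"SID": 3, "LastName": "Jones", "SSN": "000-00-0003"},
]

print(is_unique(students, ["SID"]))       # True  (still a possible candidate key)
print(is_unique(students, ["SSN"]))       # True  (still a possible candidate key)
print(is_unique(students, ["LastName"]))  # False (ruled out: two students named Smith)
```

Whether SID or SSN becomes the primary key is then a design decision, guided by the "short, numeric, never changes" rule of thumb.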
STUDENT Example (SID, FirstName, LastName, Street, City, State, Zip, AdmissionTerm, SSN) What are the candidate key(s)? What would you use as the primary key?
Functional Dependency: A relationship between attributes in which one attribute (or group of attributes) uniquely determines the value of another attribute in the same table. Example: The price of one cookie, together with the quantity in the box, determines the price of a box of cookies (for example, a box of 12): (CookiePrice, Qty) → BoxPrice. KROENKE and AUER - DATABASE CONCEPTS (3rd Edition) © 2008 Pearson Prentice Hall
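A functional dependency can be tested against data: whenever two rows agree on the left-hand attributes, they must also agree on the right-hand attributes. A minimal sketch, assuming invented prices and a hypothetical helper `fd_holds`:

```python
def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs in `rows`: equal lhs values must always come with equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs seen for this lhs; any later mismatch breaks the dependency.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical rows for the cookie example.
boxes = [
    {"CookiePrice": 0.50, "Qty": 12, "BoxPrice": 6.00},
    {"CookiePrice": 0.50, "Qty": 24, "BoxPrice": 12.00},
    {"CookiePrice": 0.25, "Qty": 24, "BoxPrice": 6.00},
]

print(fd_holds(boxes, ["CookiePrice", "Qty"], ["BoxPrice"]))  # True
print(fd_holds(boxes, ["CookiePrice"], ["BoxPrice"]))         # False
print(fd_holds(boxes, ["BoxPrice"], ["Qty"]))                 # False
```

As with key checks, data can disprove a dependency but not prove it. A dependency is a statement about the business rules, not about one particular set of rows.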
Determinants: The attribute (or attributes) that we use as the starting point (the variable on the left side of the equation) is called a determinant. In the functional dependency (CookiePrice, Qty) → BoxPrice, the determinant is (CookiePrice, Qty).
Composite Determinants: A composite determinant is a determinant of a functional dependency that consists of more than one attribute. Examples: (CookiePrice, Qty) → (BoxPrice); (SID, ClassID) → (Grade); (EmployeeNumber, Phone) → (FirstName, LastName, Department, Email).
STUDENT Example (SID, FirstName, LastName, Street, City, State, Zip, AdmissionTerm, SSN, DriversLicenseNum, DriversLicenseState) What are the candidate key(s)? What would you use as the primary key?
Functional Dependency Rules: Rule 1: If A → (B, C), then A → B and A → C. Example: (SID) → (FirstName, LastName), so (SID) → (FirstName) and (SID) → (LastName). Rule 2: If (A, B) → C, then neither A nor B determines C by itself (if one does, then you are not in 2NF). Example: (SID, CourseID) → (Grade), but (SID) does NOT determine (Grade) and (CourseID) does NOT determine (Grade).
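Both examples can be explored by computing what a set of attributes determines under a list of functional dependencies. The sketch below uses the standard attribute-closure procedure, which is not named on the slides; the helper name `closure` and the way dependencies are written as `(lhs, rhs)` tuples are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole determinant, we gain its right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# The two dependencies from the slide.
fds = [
    (("SID",), ("FirstName", "LastName")),
    (("SID", "CourseID"), ("Grade",)),
]

sid_only = closure({"SID"}, fds)
print("FirstName" in sid_only, "LastName" in sid_only)  # True True
print("Grade" in sid_only)                              # False
print("Grade" in closure({"CourseID"}, fds))            # False
print("Grade" in closure({"SID", "CourseID"}, fds))     # True
```

SID alone gives FirstName and LastName individually, while Grade needs both SID and CourseID.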
Functional Dependency Rule 1: If A → (B, C), then A → B and A → C. What about (EmployeeNumber, Phone) → (FirstName, LastName, Department, Email)?
Functional Dependency Rule 2: No Partial Key Dependencies. If (A, B) → C, then neither A nor B determines C by itself. (EmployeeNumber, Phone) → (FirstName, LastName, Department, Email): any partial-key dependencies? FirstName, LastName, Department, and Email depend on EmployeeNumber alone! Thus this table is not in 2NF!
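A partial-key dependency can be spotted by looking for a dependency whose determinant is only part of the composite key. The sketch below is a simplification: it inspects only the dependencies you list explicitly (it does not derive implied ones), and `partial_key_dependencies` is a hypothetical helper name.

```python
def partial_key_dependencies(key, fds):
    """List (determinant, non-key attributes) where the determinant is a proper subset of `key`."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        non_key = sorted(set(rhs) - key)
        if set(lhs) < key and non_key:  # `<` on sets means "proper subset"
            found.append((tuple(lhs), tuple(non_key)))
    return found

key = ("EmployeeNumber", "Phone")
# The dependency the slide points out: the name, department, and email
# depend on EmployeeNumber alone.
fds = [
    (("EmployeeNumber",), ("FirstName", "LastName", "Department", "Email")),
]

violations = partial_key_dependencies(key, fds)
print(violations)
# [(('EmployeeNumber',), ('Department', 'Email', 'FirstName', 'LastName'))]
```

A non-empty result means the table is not in 2NF with respect to that key.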
Functional Dependencies in the SKU_DATA Table What are the functional dependencies?
Functional Dependencies in the SKU_DATA Table: SKU → (SKU_Description, Department, Buyer). SKU_Description → (SKU, Department, Buyer). Buyer → Department (buyers only have one department). Department does NOT determine Buyer (there are multiple buyers for the same department).
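With the dependencies written down, the BCNF goal (every determinant is a candidate key) can be checked mechanically. The sketch below is one way to do it, not the textbook's procedure: it flags any determinant that fails to determine every attribute of the table, using the standard attribute-closure idea. The helper names `closure` and `bcnf_violations` are hypothetical.

```python
ATTRS = {"SKU", "SKU_Description", "Department", "Buyer"}

# The functional dependencies listed on the slide, as (determinant, dependents).
FDS = [
    (("SKU",), ("SKU_Description", "Department", "Buyer")),
    (("SKU_Description",), ("SKU", "Department", "Buyer")),
    (("Buyer",), ("Department",)),
]

def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Determinants that do not determine the whole row, so they cannot be candidate keys."""
    return [lhs for lhs, _rhs in fds if closure(lhs, fds) != set(attrs)]

violations = bcnf_violations(ATTRS, FDS)
print(violations)  # [('Buyer',)]
```

Buyer is a determinant but not a candidate key, so SKU_DATA as listed does not meet the BCNF goal. Buyer and Department form a separate theme.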
Functional Dependencies in the ORDER_ITEM Table What are the functional dependencies?
Functional Dependencies in the ORDER_ITEM Table: (OrderNumber, SKU) → (Quantity, Price, ExtendedPrice), because for a particular order and item there is only one quantity, price, and extended price. Because ExtendedPrice is a computed field, we also have (Quantity, Price) → (ExtendedPrice).
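A computed field is a function of other fields, which is why it creates a functional dependency. The sketch below assumes the usual formula ExtendedPrice = Quantity × Price (the slide only says the field is computed) and uses an invented order line.

```python
from decimal import Decimal

def extended_price(quantity, price):
    """Assumed formula for the computed field: ExtendedPrice = Quantity * Price."""
    return quantity * price

# Hypothetical order line; the values are for illustration only.
item = {"OrderNumber": 1000, "SKU": 201000, "Quantity": 3, "Price": Decimal("9.50")}
item["ExtendedPrice"] = extended_price(item["Quantity"], item["Price"])

print(item["ExtendedPrice"])  # 28.50
```

The same (Quantity, Price) pair always produces the same ExtendedPrice, which is exactly what (Quantity, Price) → (ExtendedPrice) states.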
Now you do it! Advisor Table: What are the functional dependencies? What are the determinants?