Normalization ISYS 464
Database Design Based on ERD
Strong entity: create a table that includes all simple attributes
  –Composite attribute: include its simple component attributes
Weak entity: add the owner's primary key
Multi-valued attribute: create a table for each multi-valued attribute
  –Key + attribute
Relationship:
  –1:1, 1:M
      Relationship table: for partial participation, to avoid nulls
      Foreign key
  –M:M: relationship table
  –N-ary relationship: relationship table
  –Recursive relationship
Attribute of relationship
Superclass and subclass
Note: A database designed according to these rules will meet the 3NF requirements.
Database design objectives
Eliminate data duplication.
Link related records in related tables.
Example
Employee/Dependent report:
EmpID: E101    Ename: Peter    Address: 123 XYZ St
DependentName  Relationship  DOB
Nancy          Daughter      1/1/95
Alan           Son           12/25/03

EmpDependent Table:
EmpID  EmpName  Address     DepName  Relation  DepDOB
E101   Peter    123 XYZ St  Nancy    D         1/1/95
E101   Peter    123 XYZ St  Alan     S         12/25/03
Note: This database is able to produce the report, but has duplicated data.
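Update Anomalies Due To Duplication
Modification anomaly:
  –Inconsistent data: a change to a duplicated value must be made in every row that repeats it
Insertion anomaly:
  –Entering an employee with no dependent leaves the dependent fields null
Deletion anomaly:
  –If Nancy and Alan become independent and their rows are deleted, the employee's own data is deleted with them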
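The modification and deletion anomalies can be reproduced on the EmpDependent rows from the earlier example. This is a minimal sketch using plain Python dictionaries rather than a real database; the new address "456 ABC Ave" is a hypothetical value made up for the illustration.

```python
# EmpDependent rows from the example report (one row per dependent).
emp_dependent = [
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Nancy", "Relation": "D", "DepDOB": "1/1/95"},
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Alan", "Relation": "S", "DepDOB": "12/25/03"},
]

# Modification anomaly: the address is updated in only one of the
# duplicated rows ("456 ABC Ave" is a hypothetical new address).
rows = [dict(r) for r in emp_dependent]
rows[0]["Address"] = "456 ABC Ave"
addresses = {r["Address"] for r in rows if r["EmpID"] == "E101"}
inconsistent = len(addresses) > 1  # E101 now has two different addresses

# Deletion anomaly: removing both dependents removes every row that
# mentions employee E101, so the employee's own data is lost too.
remaining = [r for r in emp_dependent if r["DepName"] not in {"Nancy", "Alan"}]
employee_ids = {r["EmpID"] for r in remaining}

print(inconsistent, employee_ids)
```

Both problems come from storing employee facts once per dependent instead of once per employee.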
If we mix a multi-valued attribute with regular attributes in one table
Employee table:
  –SSN, Ename, Sex, DOB, Phone
  –An employee may have more than one phone.
Key: SSN or SSN + Phone
Duplication?
Example 2
EmpDependent table:
  –EmpID, Ename, Address, Depname, Relation, DepDOB
Key: EmpID + Depname
If we mix two entities with a 1:M relationship in one table
FacultyStudent table:
  –Faculty Advise Student: 1:M relationship
  –FID, Fname, SID, Sname, SAddress
Key: SID
Duplication?
If we mix two entities with an M:M relationship in one table
StudentCourse table:
  –SID, Sname, GPA, CID, Cname, Units
Key: SID + CID
Duplication?
Normalization
Decompose an unsatisfactory relation into smaller relations with desirable properties.
  –No duplication
The original relation can be recovered by applying a natural join to the smaller relations.
  –So that no information is lost in the process.
Keys and functional dependency:
  –Which field is the key field of the EmpDependent table? EmpID + DepName
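The claim that the original relation can be recovered by a natural join can be checked on the EmpDependent rows. The sketch below is an illustration only: `project`, `natural_join`, and `as_set` are hypothetical helper names, and the split into an employee part and a dependent part is one possible decomposition.

```python
emp_dependent = [
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Nancy", "Relation": "D", "DepDOB": "1/1/95"},
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Alan", "Relation": "S", "DepDOB": "12/25/03"},
]

def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Join two lists of dicts on every attribute name they share."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]

def as_set(rows):
    """Order-independent view of a relation, for comparison."""
    return {frozenset(row.items()) for row in rows}

employee = project(emp_dependent, ["EmpID", "EmpName", "Address"])
dependent = project(emp_dependent, ["EmpID", "DepName", "Relation", "DepDOB"])

recovered = natural_join(employee, dependent)
lossless = as_set(recovered) == as_set(emp_dependent)
print(len(employee), len(dependent), lossless)
```

The employee data is now stored in a single row, and joining on the shared EmpID field gives back exactly the original rows.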
Functional Dependency
Relationship between attributes
X -> Y
  –The value of X uniquely determines the value of Y.
  –Y is functionally dependent on X.
  –A value of X is associated with only one value of Y.
Example
Employee table:
SSN  Ename  Sex  DOB
S1   Peter  M    1/1/75
S2   Paul   M    12/25/80
S3   Mary   F    7/4/72
Functional dependencies:
  –SSN -> Ename, SSN -> Sex, SSN -> DOB
  –SSN -> Ename, Sex, DOB
Any other FD?
  –Ename -> SSN?
  –Ename -> Sex?
  –DOB -> SSN?
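Whether a candidate dependency X -> Y is contradicted by the sample rows can be tested mechanically. The function below is a minimal sketch (`holds` is a hypothetical name). Note that sample data can only refute a dependency: a check that passes on three rows does not prove the dependency holds in general.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X; a different Y breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True

employees = [
    {"SSN": "S1", "Ename": "Peter", "Sex": "M", "DOB": "1/1/75"},
    {"SSN": "S2", "Ename": "Paul", "Sex": "M", "DOB": "12/25/80"},
    {"SSN": "S3", "Ename": "Mary", "Sex": "F", "DOB": "7/4/72"},
]

key_fd = holds(employees, ["SSN"], ["Ename", "Sex", "DOB"])
sex_to_ssn = holds(employees, ["Sex"], ["SSN"])      # two employees share "M"
ename_to_ssn = holds(employees, ["Ename"], ["SSN"])  # names happen to be unique here
print(key_fd, sex_to_ssn, ename_to_ssn)
```

Ename -> SSN passes only because the three sample names differ. Two employees could share a name, so it should not be treated as a functional dependency of the table.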
What is the key of the Employee table?
  –SSN
Observations:
  –All non-key fields are functionally dependent on SSN.
  –There is no other FD.
  –The only FD is the key dependency.
  –There is no data duplication in the Employee table.
Normalization Process
Inputs:
  –A “universal relation”
  –Functional dependencies
Output: normalized tables
Process:
  –Decompose the unnormalized relation into smaller relations such that in each relation the non-key fields are functionally dependent on the key, the whole key, and nothing but the key. So help me Codd!
First Normal Form
The fields of a relation are all simple attributes.
  –All relational database tables meet this requirement.
EmpDependent table:
  –EmpID, Ename, Address, Depname, Relation, DepDOB
  –First normal form? Yes
  –Second normal form?
Second Normal Form
The non-key fields are functionally dependent on the key, and the whole key.
  –That is, every non-key field is fully functionally dependent on the key.
EmpDependent table:
  –FD: EmpID -> Ename, Address
  –Key: EmpID + Depname
  –Ename and Address depend on only part of the key, so the table is not in 2NF.
Decompose the EmpDependent table into two tables:
  –EmpID, Ename, Address
  –EmpID, Depname, Relation, DepDOB
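The 2NF test on EmpDependent can be sketched as a search for dependencies whose left side is only part of the key. `partial_dependencies` is a hypothetical helper, and the second dependency in the list (the whole key determining Relation and DepDOB) is an assumption read off the decomposition rather than stated as an FD. The check looks only at the listed dependencies, not at ones implied by them.

```python
def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key
    and whose right side contains at least one non-key field."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        non_key = set(rhs) - key
        if lhs < key and non_key:  # '<' means proper subset
            found.append((sorted(lhs), sorted(non_key)))
    return found

key = {"EmpID", "Depname"}
fds = [
    ({"EmpID"}, {"Ename", "Address"}),
    # Assumed: the dependent's fields depend on the whole key.
    ({"EmpID", "Depname"}, {"Relation", "DepDOB"}),
]

violations = partial_dependencies(key, fds)
print(violations)
```

Each violation names the fields to move into their own table, keyed by the part of the key they depend on.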
Employee table:
  –SSN, Ename, Sex, DOB, Phone
  –An employee may have more than one phone.
FD:
  –SSN -> Ename, Sex, DOB
  –SSN -> Phone? No: one SSN can be associated with several phones.
Key: SSN + Phone
2NF? No
Decompose into two tables:
  –SSN, Ename, Sex, DOB
  –SSN, Phone
FacultyStudent table:
  –Faculty Advise Student: 1:M relationship
  –FID, Fname, Office, SID, Sname, SAddress
FD:
  –FID -> Fname, Office
  –SID -> Sname, SAddress, FID, Fname, Office
Key: SID
2NF? Yes
Duplication? Yes
Why?
  –All non-key fields depend on the whole key, but not on nothing but the key!
      SID -> FID, Fname, Office
      FID -> Fname, Office
Transitive Dependency
If X -> Y and Y -> Z, then X -> Z.
  –When X is the key, Z is transitively dependent on the key.
SID -> FID, FID -> Fname, Office
  –SID -> Fname, Office
  –Fname and Office are transitively dependent on SID.
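Chaining dependencies in this way can be automated by computing everything a set of attributes determines (its closure). The sketch below applies the standard attribute-closure procedure to the FacultyStudent dependencies; `closure` is a hypothetical function name.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (["SID"], ["Sname", "SAddress", "FID"]),
    (["FID"], ["Fname", "Office"]),
]

sid_closure = closure(["SID"], fds)
fid_closure = closure(["FID"], fds)
print(sorted(sid_closure))
print(sorted(fid_closure))
```

SID reaches Fname and Office only by passing through FID, which is the transitive dependency. FID does not determine every field, so it is not a key of the table.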
Third Normal Form
Every non-key field is:
  –Fully functionally dependent on the key, and
  –Non-transitively dependent on the key.
Decompose:
  –FID, Fname, Office
  –SID, FID, Sname, SAddress
Example
Customer/Orders report:
CID: C101    Cname: Peter    Address: 123 XYZ St
OID  Odate    SalesPerson  Amount
O25  1/1/04   John         125
O30  2/25/04  Alan         500

CustomerOrders Table:
CID   CName  Address     OID  Odate    SalesPerson  Amount
C101  Peter  123 XYZ St  O25  1/1/04   John         125
C101  Peter  123 XYZ St  O30  2/25/04  Alan         500
Example
Key: OID
FD:
  –OID -> CID, Cname, Address, Odate, SalesPerson, Amount
  –CID -> Cname, Address
2NF? Yes
3NF? No
Decompose:
  –CID, Cname, Address
  –OID, CID, Odate, SalesPerson, Amount
Example with 1:M Relationship
FacultyStudent table:
  –Faculty Advise Student: 1:M relationship
  –FID, Fname, SID, Sname, SAddress
FD:
  –FID -> Fname
  –SID -> Sname, SAddress, FID
Key: SID
2NF? Yes
3NF? No, because SID -> FID and FID -> Fname
Decompose:
  –Table 1: FID, Fname
  –Table 2: SID, FID, Sname, SAddress
Example with M:M Relationship
StudentCourse table:
  –SID, Sname, GPA, CID, Cname, Units
Key: SID + CID
Functional dependencies:
  –SID -> Sname, GPA
  –CID -> Cname, Units
2NF? No
  –Decompose:
      Table 1: SID, Sname, GPA
      Table 2: CID, Cname, Units
      Table 3: SID, CID
3NF (after decomposition)? Yes
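Online Shopping Cart
[ER diagram]
Entities:
  –Customer: CID, Cname, Addr
  –ShoppingCart: CartID, Date
  –Product: PID, Pname, Price
Relationships:
  –Customer to ShoppingCart: 1:M
  –ShoppingCart to Product: M:M, with attribute Qty
Relationship label shown in the diagram: Has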
Normalized Database
Universal relation:
  –CID, Cname, Addr, CartID, Date, PID, Pname, Price, Qty
Key: CartID + PID
FDs:
  –CartID -> Date, CID, Cname, Addr
  –CID -> Cname, Addr
  –PID -> Pname, Price
Normalized database:
  –CID, Cname, Addr
  –CartID, Date, CID
  –PID, Pname, Price
  –CartID, PID, Qty
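The four normalized tables can be tried out with Python's built-in sqlite3 module. This is a sketch under several assumptions: the table name CartItem, the column name CartDate (used in place of Date), the column types, and all sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CID TEXT PRIMARY KEY, Cname TEXT, Addr TEXT);
CREATE TABLE ShoppingCart (CartID TEXT PRIMARY KEY, CartDate TEXT,
                           CID TEXT REFERENCES Customer(CID));
CREATE TABLE Product (PID TEXT PRIMARY KEY, Pname TEXT, Price REAL);
CREATE TABLE CartItem (CartID TEXT REFERENCES ShoppingCart(CartID),
                       PID TEXT REFERENCES Product(PID),
                       Qty INTEGER,
                       PRIMARY KEY (CartID, PID));
""")

# Hypothetical sample data: one customer, one cart, two products.
conn.execute("INSERT INTO Customer VALUES ('C101', 'Peter', '123 XYZ St')")
conn.execute("INSERT INTO ShoppingCart VALUES ('K1', '2004-01-01', 'C101')")
conn.executemany("INSERT INTO Product VALUES (?, ?, ?)",
                 [("P1", "Pen", 2.5), ("P2", "Book", 20.0)])
conn.executemany("INSERT INTO CartItem VALUES (?, ?, ?)",
                 [("K1", "P1", 3), ("K1", "P2", 1)])

# Joining on the foreign keys rebuilds the universal relation.
rows = conn.execute("""
    SELECT c.CID, c.Cname, c.Addr, s.CartID, s.CartDate,
           p.PID, p.Pname, p.Price, i.Qty
    FROM CartItem i
    JOIN ShoppingCart s ON s.CartID = i.CartID
    JOIN Customer c ON c.CID = s.CID
    JOIN Product p ON p.PID = i.PID
    ORDER BY s.CartID, p.PID
""").fetchall()

customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
for row in rows:
    print(row)
```

The customer's name and address are stored once in Customer, yet they appear on every line of the joined result, one line per CartID + PID combination.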
Denormalization
A refinement of the relational schema such that the degree of normalization of a modified relation is less than the degree of at least one of the original relations.
Objective:
  –Speed up processing