Published by Elfrieda Johnston. Modified over 8 years ago.
1
1 Theory, Practice & Methodology of Relational Database Design and Programming Copyright © Ellis Cohen 2002-2008 Functional Dependencies & Normalization These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db
2
2 © Ellis Cohen 2001-2008 Overview of Lecture
– Normalization & Redundancy
– Simple Functional Dependencies
– Simple Conceptual Normalization
– Redundancy & FD Constraints
– Simple FD's and Normalization
– Simple Relational Normalization
– The Reverse Fan Trap
– Composite Functional Dependencies
– Composite FD's and Normalization
3
3 © Ellis Cohen 2001-2008 Normalization & Redundancy
4
4 © Ellis Cohen 2001-2008 Normalization
Conceptual Normalization
– Do Conceptual Normalization as part of the design of entity classes
– Useful as a way of really understanding and honing the design
– Simple conceptual normalization is simple
Relational Normalization
– Do Relational Normalization after creating the relational schema
– Simple, consistent approach for many normalization problems
– Essential for catching problems missed during entity-class design
5
5 © Ellis Cohen 2001-2008 Normalization Problem
Suppose we are designing a table for Employees with the attributes empno, ename, deptno, dname, addr.
What's wrong with just using a single entity class (or table)
Employee( empno, ename, deptno, dname, addr )?
6
6 © Ellis Cohen 2001-2008 Answer: What's Wrong …
– Redundancy: deptno & dname
– Extra Work: If the name of a department changed, it would have to be changed in multiple places
– Anomalies: Could change deptno without changing dname, or vice versa
– Too many NULLs: If an employee could be unassigned, both deptno and dname would be NULL
7
7 © Ellis Cohen 2001-2008 Redundancy: deptno → dname
Employee
empno  …  deptno  dname       …
7654      30      SALES
7986      50      SUPPORT
7698      30      SALES
7839      10      ACCOUNTING
7844      30      SALES
Entities with the same value for deptno have the same value for dname.
Including dname in the entity class is redundant, since it can be derived from deptno.
Redundancy causes duplicate work: suppose the company wants to change deptno 30 to be the Sales & Marketing department. That change must be made to multiple employees.
8
8 © Ellis Cohen 2001-2008 Redundancy and Anomaly
Redundancy can cause anomalies (inconsistencies) if modifications are not done carefully
– Update Anomaly: Updating a value in a single cell can make the database inconsistent
– Insertion Anomaly: Adding an entity can make the database inconsistent
– Deletion Anomaly: Deleting some information can make the database inconsistent or cause unintended loss of information
9
9 © Ellis Cohen 2001-2008 Anomaly Examples
Employee
empno  …  deptno  dname       …
7654      30      SALES
7986      50      SUPPORT
7698      30      SALES
7839      10      ACCOUNTING
7844      30      SALES
– Modification Anomaly: Modify 7654's dname to 'SUPPORT' (without changing its deptno)
– Insert Anomaly: Insert a new employee with a deptno of 20 and a dname of 'SUPPORT'
– Delete Anomaly: Delete employee 7986 (it's the only employee in SUPPORT, and no other entity class keeps track that dept 50 is SUPPORT)
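Two of these anomalies can be reproduced on a small copy of the Employee table. The following is only an illustrative sketch: it assumes Python's standard sqlite3 module with an in-memory database, and the helper name names_for is hypothetical (it does not appear in the slides).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (empno INTEGER PRIMARY KEY, deptno INTEGER, dname TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(7654, 30, "SALES"), (7986, 50, "SUPPORT"), (7698, 30, "SALES"),
     (7839, 10, "ACCOUNTING"), (7844, 30, "SALES")],
)

def names_for(deptno):
    """Return the distinct department names recorded for one deptno."""
    cur = conn.execute(
        "SELECT DISTINCT dname FROM Employee WHERE deptno = ? ORDER BY dname",
        (deptno,))
    return [row[0] for row in cur]

# Modification anomaly: one cell changes, and dept 30 now has two names.
conn.execute("UPDATE Employee SET dname = 'SUPPORT' WHERE empno = 7654")
print(names_for(30))  # ['SALES', 'SUPPORT']

# Delete anomaly: removing the only SUPPORT employee loses the fact
# that dept 50 is called SUPPORT.
conn.execute("DELETE FROM Employee WHERE empno = 7986")
print(names_for(50))  # []
```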
10
10 © Ellis Cohen 2001-2008 Simple Functional Dependencies
11
11 © Ellis Cohen 2001-2008 Redundancy and Functional Dependencies
Functional Dependencies
– Specify which attributes in a table or entity class are determined by other attributes
– Identify potential redundancies
– Help us see how to eliminate those redundancies (generating the conceptual model we really should have produced initially!)
12
12 © Ellis Cohen 2001-2008 Simple Functional Dependencies (FD's)
Dependencies among attributes: A → B
– A functionally determines B
– B functionally depends on A
– The value of A uniquely determines a single value for B
– If two or more tuples (of a specific table or entity class) have the same value for A, they have the same value for B (e.g. every employee that has the same value for deptno, such as 30, has the same value for dname, such as SALES)
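The definition above translates directly into a check over a set of tuples. A minimal sketch, assuming rows are represented as Python dictionaries; the function name fd_holds is hypothetical. Note that such a check can only show that sample data does not contradict an FD; the FD itself is a statement about every legal state of the table.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # Remember the first dependent value seen for each determinant value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Sample rows taken from the Employee table shown in the slides.
employees = [
    {"empno": 7654, "deptno": 30, "dname": "SALES"},
    {"empno": 7986, "deptno": 50, "dname": "SUPPORT"},
    {"empno": 7698, "deptno": 30, "dname": "SALES"},
    {"empno": 7839, "deptno": 10, "dname": "ACCOUNTING"},
    {"empno": 7844, "deptno": 30, "dname": "SALES"},
]

print(fd_holds(employees, ["deptno"], ["dname"]))  # True
print(fd_holds(employees, ["dname"], ["empno"]))   # False
```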
13
13 © Ellis Cohen 2001-2008 FD's for a Normalized Example
Employee( empno, ename, addr )
empno → ename
empno → addr
This can also be written as empno → ename, addr or empno → { ename, addr }
Also empno → empno (this is a trivial FD, which we usually don't write)
empno can be used to look up (and therefore uniquely determine) all the other attributes of an Employee tuple
14
14 © Ellis Cohen 2001-2008 How FD's Come About
– Reflect real-world facts: zip → state
  The real world defines a mapping from zip code to state, reflected in the database
– Created UNIQUE & NOT NULL id: empno → ename, deptno, ..
  empno is established as a unique non-null id which identifies each employee
– Computationally (Derived): suppose we had an attribute tithe with tithe = sal / 10
  Then sal → tithe and tithe → sal
In general, it is a bad idea to try to use normalization to deal with derived attributes!
15
15 © Ellis Cohen 2001-2008 Simple Candidate Keys
Employee( empno, ssno, ename, addr )
– A simple candidate key is any attribute of an entity class or table which uniquely identifies a tuple
– It must be UNIQUE and NOT NULL
– A designer chooses a primary key from among the candidate keys
– Both empno and ssno uniquely identify an employee
16
16 © Ellis Cohen 2001-2008 Determinants & Dependents
empno → addr
– empno is the determinant
– addr is the dependent
In a Simple FD, the determinant is a single attribute
17
17 © Ellis Cohen 2001-2008 FD's for an Example with Redundancy
Employee( empno, ename, deptno, dname, addr )
empno → ename
empno → deptno
empno → dname
empno → addr
However, also
deptno → dname
(possibly) dname → deptno
This is a problem! deptno is NOT a candidate key, so deptno → dname indicates redundancy!
18
18 © Ellis Cohen 2001-2008 Redundancy: deptno → dname
Because deptno is not a candidate key, the same deptno value (e.g. 30) can appear multiple times.
But deptno → dname, that is, two tuples with the same value of deptno have the same value of dname.
Voila! REDUNDANCY!
Employee
empno  …  deptno  dname       …
7654      30      SALES
7986      50      SUPPORT
7698      30      SALES
7839      10      ACCOUNTING
7844      30      SALES
19
19 © Ellis Cohen 2001-2008 Transitive Dependencies
Employee( empno, ename, deptno, dname, addr )
Because empno is the primary key: empno → deptno AND empno → dname
But there is an attribute deptno, which is not a candidate key, and deptno → dname
So empno → dname can also be derived from empno → deptno & deptno → dname
This transitive dependency or transitivity violation is another way of characterizing the redundancy problem
20
20 © Ellis Cohen 2001-2008 Simple Conceptual Normalization
21
21 © Ellis Cohen 2001-2008 Simple Conceptual Normal Form
An entity class is in Simple Conceptual Normal Form if the determinant of every (simple, non-trivial) functional dependency within it is a candidate key
Employee( empno, ename, deptno, dname, addr )
– empno is the only candidate key (i.e. the only attribute which uniquely identifies tuples)
– but deptno → dname
– so this is not in Simple Conceptual Normal Form, since deptno is the determinant of deptno → dname, but deptno is not a candidate key
22
22 © Ellis Cohen 2001-2008 Simple Conceptual Normalization
Given an entity class with a (non-trivial) functional dependency whose determinant is NOT a candidate key
– Split out a new entity class
– Make the determinant the primary key (or at least a candidate key) of the new class
– Move all attributes that depend on it
Before: Employee( empno, ename, deptno, dname, addr ) with deptno → dname
After: Employee( empno, ename, addr ) and Dept( deptno, dname )
Note: Most books only discuss Normalization at the Relational Design level. However, Conceptual Normalization, though not complete, is a way to improve a conceptual design.
23
23 © Ellis Cohen 2001-2008 Result of Simple Conceptual Normalization
Each simple conceptual normalization step
– adds one entity class
– adds one 1:M relationship link
Before: Employee( empno, ename, deptno, dname, addr ) with deptno → dname
After: Employee( empno, ename, addr ) and Dept( deptno, dname ), linked by a "has" relationship
24
24 © Ellis Cohen 2001-2008 Conceptual Normalization Exercise
Assume you have designed an Item entity class with the following attributes
– itemsku, e.g. 'FX311B-24M'
– stylecode, e.g. '302'
– stylenam, e.g. 'Hunting Bikini Brief'
– styledate, e.g. 1992 (when introduced)
– catid, e.g. 'MU'
– catnam, e.g. 'Mens Underwear'
– size, e.g. 9
– color, e.g. 'red'
Find an FD with a non-candidate-key determinant, and use Conceptual Normalization to split out a new entity class. Continue doing this with all of the resulting entity classes until each of them is in Conceptual Normal Form.
25
25 © Ellis Cohen 2001-2008 Conceptual Normalization (a)
Start: Item( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1, using stylecode → stylenam, styledate, catid, catnam:
  Item( itemsku, size, color ) and Style( stylecode, stylenam, styledate, catid, catnam ), linked by a "has" relationship
Step 2, using catid → catnam:
  Item( itemsku, size, color ), Style( stylecode, stylenam, styledate ) and Category( catid, catnam ), linked by two "has" relationships
Each simple conceptual normalization step
– adds one entity class
– adds one relationship link
26
26 © Ellis Cohen 2001-2008 Conceptual Normalization (b)
Start: Item( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1, using catid → catnam:
  Item( itemsku, stylecode, stylenam, styledate, size, color ) and Category( catid, catnam ), linked by a "has" relationship
Step 2, using stylecode → stylenam, styledate:
  Item( itemsku, size, color ), Style( stylecode, stylenam, styledate ) and Category( catid, catnam ), linked by two "has" relationships
Each simple conceptual normalization step
– adds one entity class
– adds one relationship link
27
27 © Ellis Cohen 2001-2008 Redundancy & FD Constraints
28
28 © Ellis Cohen 2001-2008 Anomalies & Constraints
Anomalies caused by redundancy can be avoided by using state constraints
– A state constraint could specify a relationship between deptno & dname (avoids insertion & modification anomalies)
– A state constraint could even check that the dname associated with each deptno matched a separate table containing just deptno & dname (avoids deletion anomalies as well)
29
29 © Ellis Cohen 2001-2008 FD's as Constraints
FD's are constraints
empno → deptno means: each employee works for (at most) a single department
What does deptno → dname mean (in Employee)?
30
30 © Ellis Cohen 2001-2008 Enforce FD's
deptno → dname: every tuple with the same deptno has the same dname
Consider this table, which violates the constraint:
empno  deptno  dname
3699   10      MKT
8198   10      MKT
5122   10      MKT
8077   20      PARTY
2115   30      SALES
8173   30      ACCT
4902   30      SALES
What's the assertion corresponding to the constraint deptno → dname? (Use a manifest view)
31
31 © Ellis Cohen 2001-2008 Assertions to Enforce FD's
deptno → dname: every tuple with the same deptno has the same dname
(SELECT count(DISTINCT dname) FROM Emps GROUP BY deptno) ALL = 1
Example with Violation!
empno  deptno  dname
3699   10      MKT
8198   10      MKT
5122   10      MKT
8077   20      PARTY
2115   30      SALES
8173   30      ACCT
4902   30      SALES
Result of the grouped query:
deptno  knt
10      1
20      1
30      2
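The "ALL = 1" form of the assertion is not standard executable SQL, but an equivalent check is to look for groups whose count differs from 1. The sketch below assumes Python's sqlite3 module and an in-memory copy of the example table; an empty result would mean the FD holds in the current data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, deptno INTEGER, dname TEXT)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (3699, 10, "MKT"), (8198, 10, "MKT"), (5122, 10, "MKT"),
    (8077, 20, "PARTY"),
    (2115, 30, "SALES"), (8173, 30, "ACCT"), (4902, 30, "SALES"),
])

# Each returned row is a deptno that maps to more than one dname,
# i.e. a witness that deptno -> dname is violated.
violations = conn.execute("""
    SELECT deptno, count(DISTINCT dname) AS knt
    FROM Emps
    GROUP BY deptno
    HAVING count(DISTINCT dname) <> 1
    ORDER BY deptno
""").fetchall()

print(violations)  # [(30, 2)]
```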
32
32 © Ellis Cohen 2001-2008 Anomalies, Constraints & Triggers
Anomalies caused by redundancy can be avoided by enforcing state constraints.
– No built-in support to automatically enforce FD constraints, since commercial DB's don't support assertions
– Assertions can be enforced using Triggers, but triggers can be difficult to understand, debug & maintain
Generally Preferred Solution: Avoid redundancies. No anomalies, no triggers.
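To give a feel for what trigger-based enforcement involves, here is a rough sketch using SQLite's trigger dialect through Python's sqlite3 module. The trigger name emps_fd_insert is hypothetical, and the sketch only guards INSERT; a comparable UPDATE trigger would also be needed, which hints at the maintenance burden mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, deptno INTEGER, dname TEXT)")

# Reject an inserted row whose deptno already appears with a different dname.
conn.execute("""
    CREATE TRIGGER emps_fd_insert BEFORE INSERT ON Emps
    BEGIN
        SELECT RAISE(ABORT, 'deptno -> dname violated')
        WHERE EXISTS (SELECT 1 FROM Emps
                      WHERE deptno = NEW.deptno AND dname <> NEW.dname);
    END
""")

conn.execute("INSERT INTO Emps VALUES (2115, 30, 'SALES')")
conn.execute("INSERT INTO Emps VALUES (4902, 30, 'SALES')")  # consistent, accepted

try:
    conn.execute("INSERT INTO Emps VALUES (8173, 30, 'ACCT')")  # conflicts with SALES
    rejected = False
except sqlite3.DatabaseError as exc:
    rejected = True
    print("rejected:", exc)

row_count = conn.execute("SELECT count(*) FROM Emps").fetchone()[0]
print(rejected, row_count)
```

After normalization the same guarantee comes from a primary key on Depts( deptno ), with no trigger at all.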
33
33 © Ellis Cohen 2001-2008 Simple Functional Dependencies & Normalization
34
34 © Ellis Cohen 2001-2008 FD's and Normalization
Goal: Simple Normalization
– Eliminate "bad" FD's
Why: Because they would need to be turned into assertions and potentially enforced in some way
How: By decomposing the original table into multiple tables with foreign keys. The original FD's can then be inferred from
a) The tables' uniqueness constraints
b) The key constraints of the FK's
35
35 © Ellis Cohen 2001-2008 Enforcing FD's Before Normalization
Employee( empno, ename, addr, deptno, dname )
– empno → ename, addr, deptno, dname
  enforced via empno uniqueness constraints (i.e. empno is unique)
– deptno → dname (i.e. every deptno has a single associated dname)
  cannot be enforced directly; it would need to be enforced via a state assertion
36
36 © Ellis Cohen 2001-2008 Enforcing FD's After Normalization
Employee( empno, ename, addr ) works for Dept( deptno, dname ) (diagrams labelled Chen and Crow Magnum)
– empno → ename, addr
– deptno → dname
– empno → deptno
Note how all the original FD's are now represented by either key constraints or unique constraints, or can be derived from them, e.g. empno → dname
37
37 © Ellis Cohen 2001-2008 Simple Relational Normalization
38
38 © Ellis Cohen 2001-2008 Conceptual vs Relational Normalization Conceptual Normalization Do Conceptual Normalization as part of the design of entity classes Relational Normalization Do Relational Normalization after creating relational schema Essential for catching problems missed during entity-class design
39
39 © Ellis Cohen 2001-2008 Simple Relational Normalization
Given a relation (i.e. table) with a (simple, non-trivial) functional dependency whose determinant is NOT a candidate key
– Split out a new relation
– Make the determinant the primary key (or at least a candidate key) of the new relation
– Move all attributes that depend on it; the determinant itself also stays in the original relation, where it becomes a foreign key
Before: Emps( empno, ename, deptno, dname, addr ) with deptno → dname
After:
  Depts( deptno, dname )
  Emps( empno, ename, addr, deptno ), where deptno references Depts
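The decomposition step can be tried out end to end. A minimal sketch, assuming Python's sqlite3 module; the table name EmpsOld and the address values are made up for illustration, while the employee numbers, names and departments come from the slides. Joining the two new tables reproduces the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmpsOld (empno INTEGER PRIMARY KEY, ename TEXT,
                          deptno INTEGER, dname TEXT, addr TEXT);
    INSERT INTO EmpsOld VALUES (7654, 'MARTIN', 30, 'SALES',   '1 Elm St');
    INSERT INTO EmpsOld VALUES (7986, 'STERN',  50, 'SUPPORT', '2 Oak Ave');
    INSERT INTO EmpsOld VALUES (7698, 'BLAKE',  30, 'SALES',   '3 Pine Rd');
    INSERT INTO EmpsOld VALUES (7844, 'TURNER', 30, 'SALES',   '4 Ash Ln');

    -- New relation keyed by the determinant of deptno -> dname.
    CREATE TABLE Depts (deptno INTEGER PRIMARY KEY, dname TEXT NOT NULL);
    -- The determinant stays behind in Emps as a foreign key.
    CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, addr TEXT,
                       deptno INTEGER REFERENCES Depts(deptno));

    INSERT INTO Depts SELECT DISTINCT deptno, dname FROM EmpsOld;
    INSERT INTO Emps  SELECT empno, ename, addr, deptno FROM EmpsOld;
""")

original = conn.execute(
    "SELECT empno, ename, deptno, dname, addr FROM EmpsOld ORDER BY empno"
).fetchall()
rejoined = conn.execute("""
    SELECT e.empno, e.ename, e.deptno, d.dname, e.addr
    FROM Emps e JOIN Depts d ON d.deptno = e.deptno
    ORDER BY e.empno
""").fetchall()
depts = conn.execute("SELECT deptno, dname FROM Depts ORDER BY deptno").fetchall()

print(depts)                 # [(30, 'SALES'), (50, 'SUPPORT')]
print(rejoined == original)  # True
```

If the old table had violated deptno → dname, the INSERT into Depts would fail on the primary key, which is one way of discovering dirty data before migrating.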
40
40 © Ellis Cohen 2001-2008 Normalizing Relations
Emps (old): empno, ename, addr, deptno, dname, with deptno → dname
Emps (new): empno, ename, addr, deptno
Depts: deptno, dname
41
41 © Ellis Cohen 2001-2008 Relational Normalization Exercise
Assume you have designed an Items table with the following attributes
– itemsku, e.g. 'FX311B-24M'
– stylecode, e.g. '302'
– stylenam, e.g. 'Hunting Bikini Brief'
– styledate, e.g. 1992 (when introduced)
– catid, e.g. 'MU'
– catnam, e.g. 'Mens Underwear'
– size, e.g. 9
– color, e.g. 'red'
Find an FD with a non-candidate-key determinant, and use Relational Normalization to split out a new table. Continue doing this with all of the resulting tables until no remaining FD that represents redundancy can be used to split a table.
42
42 © Ellis Cohen 2001-2008 Relational Normalization (a)
Start: Items( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1, using stylecode → stylenam, styledate, catid, catnam:
  Items( itemsku, size, color, stylecode )
  Styles( stylecode, stylenam, styledate, catid, catnam )
Step 2, using catid → catnam:
  Items( itemsku, size, color, stylecode )
  Styles( stylecode, stylenam, styledate, catid )
  Categories( catid, catnam )
Each simple relational normalization step
– adds one table
– adds one foreign key
43
43 © Ellis Cohen 2001-2008 Relational Normalization (b)
Start: Items( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1, using catid → catnam:
  Items( itemsku, stylecode, stylenam, styledate, catid, size, color )
  Categories( catid, catnam )
Step 2, using stylecode → stylenam, styledate, catid:
  Items( itemsku, stylecode, size, color )
  Styles( stylecode, stylenam, styledate, catid )
  Categories( catid, catnam )
Each simple relational normalization step
– adds one table
– adds one foreign key
44
44 © Ellis Cohen 2001-2008 The Reverse Fan Trap
45
45 © Ellis Cohen 2001-2008 The Reverse Fan Trap
Diagram: Division (divno) employs Employee (empno); Employee works for Dept (deptno)
Suppose a company has multiple divisions. Every employee is employed by a division, and assigned to a particular department in that division.
What's wrong with this diagram?
46
46 © Ellis Cohen 2001-2008 Reverse Fan Trap Instances
Employees: 7698 BLAKE, 7499 ALLEN, 7654 MARTIN, 7986 STERN, 7844 TURNER
Divisions: DIV A, DIV B, …
Depts: 10 SALES, 30 ACCOUNTING, …
Two employees in the same department could be assigned to different divisions.
But we could enforce the state constraint: two employees in the same department must be in the same division.
47
47 © Ellis Cohen 2001-2008 Reverse Fan Trap Instances with Assertion
Employees: 7698 BLAKE, 7499 ALLEN, 7654 MARTIN, 7986 STERN, 7844 TURNER
Divisions: DIV A, DIV B, …
Depts: 10 SALES, 30 ACCOUNTING, 50 PARTY, …
So suppose we require that constraint. Are there any other problems?
48
48 © Ellis Cohen 2001-2008 Reverse Fan Trap Instances
Employees: 7698 BLAKE, 7499 ALLEN, 7654 MARTIN, 7986 STERN, 7844 TURNER
Divisions: DIV A, DIV B, …
Depts: 10 SALES, 30 ACCOUNTING, 50 PARTY, …
There's a Deletion Anomaly: if Turner is deleted (or no longer assigned to Accounting), then the link between Accounting & DIV A is lost.
This suggests redundancy, which is not obvious in the conceptual model but which we can see in the relational model.
49
49 © Ellis Cohen 2001-2008 Relational Reverse Fan Trap
Diagram: Division (divno) employs Employee (empno); Employee works for Dept (deptno)
Emps( empno, ename, addr, deptno, divno )
Depts( deptno, dname )
Divs( divno, divname )
Where's the redundancy?
50
50 © Ellis Cohen 2001-2008 Relational Redundancy
Emps( empno, ename, addr, deptno, divno )
Depts( deptno, dname )
Divs( divno, divname )
REDUNDANCY! A department is part of a single division. That is, deptno → divno
The redundancy is not immediately obvious in the conceptual model.
It is obvious in the relational model: we can see that deptno → divno
51
51 © Ellis Cohen 2001-2008 Relational Transitivity Violation
Emps( empno, ename, addr, deptno, divno )
Depts( deptno, dname )
Divs( divno, divname )
Because empno is key: empno → deptno and empno → divno
Because divisions are divided into departments: deptno → divno
So empno → divno can be derived through transitivity.
The Reverse Fan Trap has a transitivity violation. We have to use the relational model to see it.
Full discovery & normalization of redundancies must be done at the relational level by considering FD's at the relational level!
52
52 © Ellis Cohen 2001-2008 Fixing the Reverse Fan Trap
Start, with deptno → divno:
  Emps( empno, ename, addr, deptno, divno )
  Depts( deptno, dname )
  Divs( divno, divname )
Normalize:
  Emps( empno, ename, addr, deptno )
  DeptDivs( deptno, divno )
  Depts( deptno, dname )
  Divs( divno, divname )
Combine Depts & DeptDivs:
  Emps( empno, ename, addr, deptno )
  Depts( deptno, divno, dname )
  Divs( divno, divname )
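The final schema can be checked against the deletion anomaly described earlier. This is a sketch under assumptions: it uses Python's sqlite3 module, omits addr for brevity, uses 'A' and 'B' as made-up divno values, and assumes SALES belongs to DIV B (the slides only state that ACCOUNTING is linked to DIV A). The helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Divs  (divno TEXT PRIMARY KEY, divname TEXT);
    CREATE TABLE Depts (deptno INTEGER PRIMARY KEY, dname TEXT,
                        divno TEXT REFERENCES Divs(divno));
    CREATE TABLE Emps  (empno INTEGER PRIMARY KEY, ename TEXT,
                        deptno INTEGER REFERENCES Depts(deptno));
    INSERT INTO Divs  VALUES ('A', 'DIV A');
    INSERT INTO Divs  VALUES ('B', 'DIV B');
    INSERT INTO Depts VALUES (10, 'SALES', 'B');
    INSERT INTO Depts VALUES (30, 'ACCOUNTING', 'A');
    INSERT INTO Emps  VALUES (7698, 'BLAKE', 10);
    INSERT INTO Emps  VALUES (7844, 'TURNER', 30);
""")

def division_of_employee(empno):
    """An employee's division is derived via the department, not stored in Emps."""
    row = conn.execute("""
        SELECT v.divname
        FROM Emps e
        JOIN Depts d ON d.deptno = e.deptno
        JOIN Divs  v ON v.divno  = d.divno
        WHERE e.empno = ?
    """, (empno,)).fetchone()
    return row[0] if row else None

def division_of_dept(deptno):
    row = conn.execute("""
        SELECT v.divname FROM Depts d JOIN Divs v ON v.divno = d.divno
        WHERE d.deptno = ?
    """, (deptno,)).fetchone()
    return row[0] if row else None

print(division_of_employee(7844))  # DIV A

# Deleting Turner no longer loses the link between Accounting and DIV A.
conn.execute("DELETE FROM Emps WHERE empno = 7844")
print(division_of_dept(30))        # DIV A
```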
53
53 © Ellis Cohen 2001-2008 Reverse Fan Trap Conceptually
Original diagram: Division (divno) employs Employee (empno); Employee works for Dept (deptno)
Revised diagram: Employee works for Dept; Dept part of Division
+ Conceptual State Constraint: Every department is in at most a single division
54
54 © Ellis Cohen 2001-2008 Database Design Process
Requirements
  → Conceptual Design & Conceptual Normalization →
Conceptual Model
  → Relational Mapping & Relational Normalization →
Relational Model
  → Physical Mapping →
Physical Model using DDL & DCL
55
55 © Ellis Cohen 2001-2008 Composite Functional Dependencies Functional Dependencies with Composite Determinants
56
56 © Ellis Cohen 2001-2008 Composite Keys
Suppose employee numbers are unique within a division
Emps( divno, empno, ename, addr )
Composite Key: divno + empno
Definition of Composite Key
– A group of fields
– No proper subset of the fields uniquely identifies a tuple
– The combination of all the fields in the key does uniquely identify a tuple
57
57 © Ellis Cohen 2001-2008 Composite Determinants
A1 + A2 + … + An → B or { A1, A2, …, An } → B
– A1 + A2 + … + An functionally determine B
– B functionally depends on A1 + A2 + … + An
– If you know A1 and A2 and … An, you know (or you can look up) a single corresponding value for B
– If two tuples have the same values for A1, A2, … and An, then they have the same value for B
58
58 © Ellis Cohen 2001-2008 Minimal Composite Determinants
Suppose employee numbers are unique within a division
Emps( divno, empno, ename, addr )
divno → ename                   Not true
empno → ename                   Not true: neither divno nor empno alone determines ename
divno + empno → ename           True & Minimal
divno + empno + addr → ename    True, but Not Minimal
A determinant is minimal if no proper subset of the determinant fields uniquely determines the dependent
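Minimality can be tested mechanically by trying every proper subset of the determinant. A sketch, assuming rows are Python dictionaries; the sample rows and the function names fd_holds and is_minimal are invented for illustration. As with any data-driven check, it shows what the sample data permits, not what the requirements guarantee.

```python
from itertools import combinations

def fd_holds(rows, determinant, dependent):
    """True if rows that agree on `determinant` also agree on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_minimal(rows, determinant, dependent):
    """True if the FD holds and no proper subset of the determinant suffices."""
    if not fd_holds(rows, determinant, dependent):
        return False
    for size in range(len(determinant)):
        for subset in combinations(determinant, size):
            if fd_holds(rows, list(subset), dependent):
                return False
    return True

# Hypothetical data: employee numbers are unique only within a division.
emps = [
    {"divno": 1, "empno": 100, "ename": "ALLEN", "addr": "NY"},
    {"divno": 1, "empno": 101, "ename": "BLAKE", "addr": "LA"},
    {"divno": 2, "empno": 100, "ename": "STERN", "addr": "NY"},
    {"divno": 2, "empno": 101, "ename": "ALLEN", "addr": "SF"},
]

print(fd_holds(emps, ["divno"], ["ename"]))                     # False
print(fd_holds(emps, ["empno"], ["ename"]))                     # False
print(is_minimal(emps, ["divno", "empno"], ["ename"]))          # True
print(is_minimal(emps, ["divno", "empno", "addr"], ["ename"]))  # False
```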
59
59 © Ellis Cohen 2001-2008 FDs as Assertions
You can think of FDs as assertions about relations.
Writing that a relation R has the functional dependency A → B (where A or B can be a set of attributes) means:
any set of tuples that have the same values for the A attribute(s) also have the same values for the B attribute(s).
As a relational expression:
R{ A, B ! }{ A ! knt:count(*) }{ knt } ALL = 1
As an SQL assertion:
WITH ABs AS (SELECT DISTINCT A1, …, Am, B1, …, Bn FROM R)
(SELECT count(*) FROM ABs GROUP BY A1, …, Am) ALL = 1
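The SQL form of the assertion generalizes to any lists of attributes. Below is a sketch that builds the query text for SQLite, assuming Python's sqlite3 module; fd_violations is a hypothetical helper, and it interpolates table and column names directly into the SQL, so it is only suitable for trusted identifiers. The sample rows are those of the Depts example used elsewhere in this deck.

```python
import sqlite3

def fd_violations(conn, table, determinant, dependent):
    """Return (A-values..., knt) for each A-group with more than one B-value."""
    a_cols = ", ".join(determinant)
    ab_cols = ", ".join(list(determinant) + list(dependent))
    sql = (
        f"WITH ABs AS (SELECT DISTINCT {ab_cols} FROM {table}) "
        f"SELECT {a_cols}, count(*) AS knt FROM ABs "
        f"GROUP BY {a_cols} HAVING count(*) <> 1 ORDER BY {a_cols}"
    )
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Depts (divno INTEGER, deptno INTEGER, dname TEXT, loc TEXT)")
conn.executemany("INSERT INTO Depts VALUES (?, ?, ?, ?)", [
    (1, 10, "SALES", "NY"), (1, 20, "MKT", "NY"), (1, 30, "PARTY", "LA"),
    (2, 10, "BEACH", "LA"), (2, 20, "MKT", "PHILA"), (2, 30, "SALES", "DC"),
    (2, 40, "R&D", "SF"),
])

# The composite determinant holds; deptno alone does not determine dname.
print(fd_violations(conn, "Depts", ["divno", "deptno"], ["dname"]))  # []
print(fd_violations(conn, "Depts", ["deptno"], ["dname"]))  # [(10, 2), (30, 2)]
```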
60
60 © Ellis Cohen 2001-2008 Surrogate Keys
For Emps( empid, divno, empno, ename, addr )
Candidate Keys
– empid (chosen by the designer as the primary key)
– divno + empno
Surrogate keys are sometimes used to add a single-attribute key when there is also a composite key
61
61 © Ellis Cohen 2001-2008 Prime Attributes
Any attribute in a candidate key is called a prime attribute
For Emps( empid, divno, empno, ename, addr )
The candidate keys of Emps:
– empid
– divno + empno
The prime attributes of Emps: empid, divno, empno
62
62 © Ellis Cohen 2001-2008 Overlapping Candidate Keys
Suppose that in the Depts table, both deptno and dname uniquely identify a department within a division.
Depts( divno, deptno, dname, loc )
The candidate keys of Depts:
– divno + deptno
– divno + dname
One of these is chosen by the designer as the primary key
The prime attributes of Depts: divno, deptno, dname
63
63 © Ellis Cohen 2001-2008 Overlapping Key Example
divno  deptno  dname  loc
1      10      SALES  NY
1      20      MKT    NY
1      30      PARTY  LA
2      10      BEACH  LA
2      20      MKT    PHILA
2      30      SALES  DC
2      40      R&D    SF
(diagram annotation: candidate key)
64
64 © Ellis Cohen 2001-2008 Functional Dependency Exercise
Suppose the Video table keeps track of all the videos which are currently rented
Video( vidid, acqdate, title, year, studio, empno, ename, rentdate, renttime, duration, custid, cphone )
– vidid: unique id of a video
– acqdate: date this video was acquired
– title: title of the film
– year: year the film was made
– studio: studio that made the film
– empno: employee who rented this video
– ename: name of that employee
– rentdate: date the video was rented
– renttime: time the video was rented
– duration: # of days video can be rented (depends on the film, and may change daily)
– custid: customer id
– cphone: customer's phone
There may be other candidate keys (simple or composite) if the requirements specify additional constraints. What are other possible keys & the corresponding constraints?
65
65 © Ellis Cohen 2001-2008 Candidate Keys
– acqdate + title + year: requires that each shipment only has a single copy of a particular film
– custid: requires that customers are only allowed to have a single video rented at a time
– custid + title + year: requires that customers can only have a single copy of a film rented at a time
– title + year + rentdate + renttime: requires that only a single copy of a film can be rented at a time
– custid + rentdate + renttime: requires that each video be rented separately
OK. Suppose vidid is actually the only candidate key. Find a minimal set of FD's, from which all the (non-trivial) FD's can be inferred
66
66 © Ellis Cohen 2001-2008 Functional Dependency Answer
Candidate Keys: vidid
Prime Attributes: vidid
Minimal Functional Dependencies
vidid → custid, acqdate, title, year, rentdate, renttime
Non-key FD's:
empno → ename
custid → cphone
title + year → studio
title + year + rentdate → duration
custid + rentdate + renttime → empno
empno + rentdate + renttime → custid
This is fun! In practice, we don't need to find all the FD's. We can normalize by finding one FD at a time.
67
67 © Ellis Cohen 2001-2008 Transitive Inference of FD's
Transitivity & Pseudo-Transitivity:
– If A → B and B → C, then A → C
– If A → B and W + B → C, then W + A → C
– If W + A → B and W + B → C, then W + A → C
So if you're writing a collection of FD's, don't include FD's that can be deduced by transitivity or pseudo-transitivity
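One standard way to decide whether an FD can be deduced from a given collection is to compute the attribute closure of its determinant. The slides do not present this algorithm, so the sketch below is offered as a common textbook technique rather than the lecture's own method; the names closure, implies and VIDEO_FDS are hypothetical, and the FD list is the one from the Functional Dependency Answer slide.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole determinant is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(lhs, rhs, fds):
    """True if lhs -> rhs can be deduced from fds."""
    return set(rhs) <= closure(lhs, fds)

VIDEO_FDS = [
    (("vidid",), ("custid", "acqdate", "title", "year", "rentdate", "renttime")),
    (("empno",), ("ename",)),
    (("custid",), ("cphone",)),
    (("title", "year"), ("studio",)),
    (("title", "year", "rentdate"), ("duration",)),
    (("custid", "rentdate", "renttime"), ("empno",)),
    (("empno", "rentdate", "renttime"), ("custid",)),
]

# vidid -> ename is deducible, so it need not be written down explicitly.
print(implies(("vidid",), ("ename",), VIDEO_FDS))   # True
# custid alone does not determine the employee.
print(implies(("custid",), ("empno",), VIDEO_FDS))  # False
# vidid determines all twelve attributes, as a candidate key must.
print(len(closure({"vidid"}, VIDEO_FDS)))           # 12
```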
68
68 © Ellis Cohen 2001-2008 Composite Functional Dependencies & Normalization
69
69 © Ellis Cohen 2001-2008 Example Conceptual Model
Entity classes: Video( vidid, acqdate, rentdtime ), Employee( empno, ename ), Customer( custid, cphone )
Relationships: rent, processed by
Consider this conceptual model in which employees process videos rented by customers.
We add the state constraint: All rentals by a customer at the same time are processed by the same employee.
It may not be immediately obvious how to use that information to further normalize the model. Using the relational model, the normalization becomes clear.
70
70 © Ellis Cohen 2001-2008 Relational Mapping
Conceptual model: Video( vidid, acqdate, rentdtime ), Employee( empno, ename ), Customer( custid, cphone ); relationships: rent, processed by
Relational mapping:
  Videos( vidid, acqdate, custid, rentdtime, empno )
  Employees( empno, ename )
  Customers( custid, cphone )
The state constraint can be represented as the composite FD: custid + rentdtime → empno
It can be resolved using Relational Normalization
71
71 © Ellis Cohen 2001-2008 Normalization with Composite FD's
Before:
  Videos( vidid, acqdate, custid, rentdtime, empno )
  Employees( empno, ename )
  Customers( custid, cphone )
Normalize based on custid + rentdtime → empno
After:
  Rentals( custid, rentdtime, empno )
  Videos( vidid, acqdate, custid, rentdtime )
  Employees( empno, ename )
  Customers( custid, cphone )
What's the conceptual model?
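The same split-and-rejoin check used for simple FD's works when the determinant is composite; the new table gets a composite primary key and the old table keeps a composite foreign key. A sketch assuming Python's sqlite3 module; the table name VideosOld and all row values are invented for illustration. SQLite only enforces foreign keys when PRAGMA foreign_keys is switched on, which this sketch does not rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VideosOld (vidid INTEGER PRIMARY KEY, acqdate TEXT,
                            custid INTEGER, rentdtime TEXT, empno INTEGER);
    INSERT INTO VideosOld VALUES (1, '2007-01-05', 100, '2008-03-01 18:00', 7654);
    INSERT INTO VideosOld VALUES (2, '2007-01-05', 100, '2008-03-01 18:00', 7654);
    INSERT INTO VideosOld VALUES (3, '2007-02-10', 200, '2008-03-02 09:30', 7698);

    -- New relation keyed by the composite determinant custid + rentdtime.
    CREATE TABLE Rentals (custid INTEGER, rentdtime TEXT, empno INTEGER,
                          PRIMARY KEY (custid, rentdtime));
    -- Videos keeps the determinant as a composite foreign key.
    CREATE TABLE Videos (vidid INTEGER PRIMARY KEY, acqdate TEXT,
                         custid INTEGER, rentdtime TEXT,
                         FOREIGN KEY (custid, rentdtime)
                             REFERENCES Rentals (custid, rentdtime));

    INSERT INTO Rentals SELECT DISTINCT custid, rentdtime, empno FROM VideosOld;
    INSERT INTO Videos  SELECT vidid, acqdate, custid, rentdtime FROM VideosOld;
""")

original = conn.execute(
    "SELECT vidid, acqdate, custid, rentdtime, empno FROM VideosOld ORDER BY vidid"
).fetchall()
rejoined = conn.execute("""
    SELECT v.vidid, v.acqdate, v.custid, v.rentdtime, r.empno
    FROM Videos v
    JOIN Rentals r ON r.custid = v.custid AND r.rentdtime = v.rentdtime
    ORDER BY v.vidid
""").fetchall()
rental_count = conn.execute("SELECT count(*) FROM Rentals").fetchone()[0]

print(rental_count)          # 2
print(rejoined == original)  # True
```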
72
72 © Ellis Cohen 2001-2008 Resulting Conceptual Model
Entity classes: Rental( rentdtime ), Video( vidid, acqdate ), Employee( empno, ename ), Customer( custid, cphone )
Relationships: rent, processed by