Presentation is loading. Please wait.

Presentation is loading. Please wait.

IS 230Lecture 8Slide 1 Normalization Lecture 9. IS 230Lecture 8Slide 2 Lecture 8: Normalization 1. Normalization 2. Data redundancy and anomalies 3. Spurious.

Similar presentations


Presentation on theme: "IS 230Lecture 8Slide 1 Normalization Lecture 9. IS 230Lecture 8Slide 2 Lecture 8: Normalization 1. Normalization 2. Data redundancy and anomalies 3. Spurious."— Presentation transcript:

1 IS 230Lecture 8Slide 1 Normalization Lecture 9

2 IS 230Lecture 8Slide 2 Lecture 8: Normalization 1. Normalization 2. Data redundancy and anomalies 3. Spurious information 4. Functional dependencies 5. Normalization first normal form second normal form third normal form BCNF normal form 6. Normalization Methodology: Example

3 IS 230Lecture 8Slide 3 1. Normalization A technique for producing a set of relations with desirable properties, given the data requirements of the applications

4 IS 230Lecture 8Slide 4 2. Data redundancy and anomalies Emp-Dept relation Insertion How do we insert a new department with no employees yet? (keys?) Entering employees is difficult as department information must be entered correctly. Deletion What happens when we delete CC's data - do we lose department 6? Modification If we change the manager of department 5, we must change it for tuples with Dept# = 5.

5 IS 230Lecture 8Slide 5 3. Spurious information Avoid breaking up relations in such a way that spurious information is created may be broken into: Joining them back together, we get NEW TUPLES!

6 IS 230Lecture 8Slide 6 4. Functional dependencies Formal concepts that may be used to exhibit “goodness” and “badness” of individual relational schemas, and describe relationships between attributes Examples of Functional Dependency Name, DateOfBirth, Dept  all depend on NI  Dname & Manager depend on Dept  ProjLocation depends on ProjName.

7 IS 230Lecture 8Slide 7 Functional Dependence An attribute, X, of a relation is functionally dependent on attributes A, B,..., N if the same values of A,..., N are always associated with the same value of X {A,…,N}  X A,..., N is called the determinant of the functional dependency

8 IS 230Lecture 8Slide 8 Full Functional Dependency X fully depends on A, …, N if it is not dependent on any subset of A,..., N; otherwise we talk of partial dependency e.g. age is dependent on NI  and Name but only fully dependent on NI .

9 IS 230Lecture 8Slide 9 5. Normalization Process taking a set of relations and decomposing them into more relations satisfying some criteria. Decomposition essentially a series of projections so that the original data can be reconstituted using joins. Normal form form of the relations which satisfy the criteria

10 IS 230Lecture 8Slide 10 5.1. First Normal Form (1) A relation is in first normal form (1NF) if all values are atomic, i.e. single values - small strings and numbers. Identify and remove repeating groups (multi-valued attributes)

11 IS 230Lecture 8Slide 11 First Normal Form (2) DEPARTMENT Two ways of normalising this: Have a tuple for each location of each department: Have a separate relation for (Dnumber, Locations) pairs: The latter is better as it avoids redundancy.

12 IS 230Lecture 8Slide 12 First Normal Form (3) Example of composite attribute: Room is not atomic Room includes Deptname and Room#

13 IS 230Lecture 8Slide 13 5.2. Second Normal Form By the definition of the primary key, every other attribute is functionally dependent on it. If all the other attributes are fully functionally dependent then the relation is in Second Normal Form (2NF). Clearly, any relation with a single primary key will be 2NF. If there are two primary key attributes, A & B, then each other attribute is either dependent on A alone; dependent on B alone; or dependent on both. 2NF Normal consists of creating a separate relation for each of the three cases.

14 IS 230Lecture 8Slide 14 Example of 2NF decomposition EMP_PROJ(SSN, Pnumber, Hours, Ename, Pname, Ploc) SSNPnumber Hours EnamePnamePloc Decomposition into three 2NF relations: Work(SSN,Pnumber,Hours) EMP(SSN,Ename) Project(Pnumber,Pname,Ploc) SSN  Ename Pnumber  Pname,Ploc SSN, Pnumber  Hours

15 IS 230Lecture 8Slide 15 5.3. Third Normal Form Third Normal Form eliminates transitive dependencies - i.e. those dependencies which hold only because of some intermediary. An attribute is transitively dependent on the primary key if there is some other attribute which is dependent on and which is, in turn, dependent on the key. Dname is dependent on NI , but only because it is dependent on Dnumber which is, in turn, dependent on NI . Non-3NF relations are likely to hold redundant information. A relation is in 3NF if for any pair of attribute A & B such that A  B, there is no attribute such that A  X and X  B. Normalizing this would create:

16 IS 230Lecture 8Slide 16 Example of 3NF decomposition EMP_DEPT(SSN, Ename, Bdate, Address, Dnum, Dname, Dman) Ename Bdate AddressDnum Dname Dman SSN Decomposition into two 3NF relations: EMPLOYEE(SSN, Ename, Bdate, Address, Dnum) DEPT(Dnum, Dname, Dman) SSN  Ename, Bdate, Address, Dnum Dnum  Dname, Dman

17 IS 230Lecture 8Slide 17 General Definitions of Normal Forms

18 IS 230Lecture 8Slide 18 5.4. Boyce-Codd Normal Form Every relation in BCNF is also in 3NF Relation in 3NF is not necessarily in BCNF Nontrivial FD means not trivial FD A trivial functional dependency X  Y is one in which Y is a subset of X Example: A, B  B is a trivial FD Most relation schemas that are in 3NF are also in BCNF

19 IS 230Lecture 8Slide 19 Example of not BCNF The following is not in BCNF: bor_loan = (customer_id, loan_number, amount) The following is a functional dependency that may hold: loan_number  amount but loan_number is not a superkey of bor_loan We decompose into two relations: R1=(customer_id, loan_number) R2=(loan_number, amount) R1 and R2 are in BCNF

20 IS 230Lecture 8Slide 20 R=(A, B, C, D, E, F) A, B  D B  E (not in BCNF) D  F (not in 3NF)

21 IS 230Lecture 8Slide 21 Normalization Methodology: Example Consider the following description of a company: The company is divided into departments. Each department is identified by its department number. A department has a name and a manager (an employee). The employees of the company are identified by their National Insurance number. An employee has a name, an address, an age, and work in one department only. An employee is supervised by several supervisors (employees), and a supervisor can supervise several employees (supervisees). An employee can have dependents, where each dependent is described by his/her name and age. The company has a number of running projects. A project is identified by its project number. A project has also a name and a description. Several employees can work on a project, and an employee can work on several projects, each a fixed number of hours.

22 IS 230Lecture 8Slide 22 AB141537AB141537 1. Functional dependencies An attribute, X, of a relation is functionally dependent on attributes A, B,..., N if the same values of A,..., N are always associated with the same value of X {A,…,N}  X Example A  B does NOT hold, B  A does hold

23 IS 230Lecture 8Slide 23 1. Functional dependencies (Cont.) The company is divided into departments. Each department is identified by its department number. A department has a name and a manager (an employee). Dnumber  Dname, Manager The employees of the company are identified by their National Insurance number. An employee has a name, an address, an age, and work in one department only. NI  Ename, Address, Eage, Dnumber

24 IS 230Lecture 8Slide 24 1. Functional dependencies (Cont.) An employee is supervised by several supervisors (employees), and a supervisor can supervise several employees (supervisees). NI  Supervisor (is not a valid functional dependency since an employee can have several supervisors) An employee can have dependents, where each dependent is described by his/her name and age. NI, DepName  DepAge

25 IS 230Lecture 8Slide 25 1. Functional dependencies (Cont.) The company has a number of running projects. A project is identified by its project number. A project has also a name and a description. Pno  Pname, Description Several employees can work on a project, and an employee can work on several projects, each a fixed number of hours. Pno, NI  Hours

26 IS 230Lecture 8Slide 26 1. All Functional dependencies Dnumber  Dname, Manager NI  Ename, Address, Eage, Dnumber NI, DepName  DepAge Pno  Pname, Description Pno, NI  Hours The Universal Relation: U(Dnumber, Dname, Manager, NI, Ename, Address, Eage, Pno, Pname, Description, DepName, DepAge, Hours, Supervisor)

27 IS 230Lecture 8Slide 27 2. The Primary key What is the primary key? (Dnumber, NI, Pno, DepName)? Supervisor must be part of the primary key because it is a multivalued attribute DepName must be part of the primary key since there can be several dependents to an employee Dnumber is not part of the primary key since it can be derived from NI Thus the primary key is (NI, Pno, DepName, Supervisor)

28 IS 230Lecture 8Slide 28 3. Is U in 1NF? A relation is in first normal form (1NF) if all values are atomic, i.e. single values Supervisor is a multivalued attribute, hence U is not in 1NF To remove the multivalued attribute, the relation U is decomposed as follows U1(Dnumber, Dname, Manager, NI, Ename, Address, Eage, Pno, Pname, Description, DepName, DepAge, Hours) Supervise(NI, Supervisor)

29 IS 230Lecture 8Slide 29 4. Is U in 2NF? A relation is in Second Normal Form (2NF) if all the attributes are fully functionally dependent on the primary key. A relation with a single primary key is in 2NF. Supervise is in 2NF U1 is not in 2NF because some attributes are partially functionally dependent on the primary key (e.g. DepAge, Ename, etc). We decompose U1 into the following 2NF relations: Dept_Emp(Dnumber, Dname, Manager, NI, Ename, Address, Eage) Dependent(NI, DepName, DepAge) Project(Pno, Pname, Description) Work(NI, Pno, Hours)

30 IS 230Lecture 8Slide 30 5. Is the previous decomposition in 3NF? A relation is in 3NF if for any pair of attributes A & B such that A  B, there is no attribute such that A  X and X  B Supervise is in 3NF Among the above relations, only Dept_Emp is not in 3NF, since it has transitive dependencies (e.g. NI  Dnumber  Dname). We therefore decompose the relation into the following two relations, which are in 3NF: Department(Dnumber, Dname, Manager) Employee(NI, Ename, Address, Eage, Dnumber)

31 IS 230Lecture 8Slide 31 6. Complete Schema in 3NF Department(Dnumber, Dname, Manager) Employee(NI, Ename, Address, Eage,Dnumber) Dependent(NI, DepName, DepAge) Project(Pno, Pname, Description) Work(NI, Pno, Hours) Supervise(NI, Supervisor)

32 IS 230Lecture 8Slide 32 End of Chapter


Download ppt "IS 230Lecture 8Slide 1 Normalization Lecture 9. IS 230Lecture 8Slide 2 Lecture 8: Normalization 1. Normalization 2. Data redundancy and anomalies 3. Spurious."

Similar presentations


Ads by Google