Published by Georgina Kennedy. Modified over 9 years ago.
1
Schema Refinement and Normal Forms
CS3754 Class Notes #7, John Shieh, 2013
2
Normalization (CS4753/2006F)
–Normalization is a process that we can use to remove design flaws from a database.
–There are a number of normal forms, which are sets of rules describing what we should and should not do in our table structure.
–In practice, 3NF is usually sufficient to avoid most of the data redundancy problems of a relational database design.
3
Problems caused by redundancy
Redundant Storage
–Some information is stored repeatedly.
Update Anomalies
–If one copy of such repeated data is updated, an inconsistency is created, unless all copies are similarly updated.
Insertion Anomalies
–It may not be possible to store certain information unless some other, unrelated, information is stored.
Deletion Anomalies
–It may not be possible to delete certain information without losing some other, unrelated, information.
4
Redundant Storage
–The hourly wages depend on rating levels. So, for example, hourly wage 10 for rating level 8 is repeated three times.
Update Anomalies
–The Hourly_wages value in the first tuple could be updated without making a similar change in the second tuple.

Id           name       lot  rating  Hourly_wages  Hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40
5
Insertion Anomalies
–We cannot insert a tuple for an employee unless we know the hourly wage for the employee's rating value.
Deletion Anomalies
–If we delete all tuples with a given rating value (e.g., the tuples of Smethurst and Guldu), we lose the association between that rating value and its Hourly_wages value.

Id           name       lot  rating  Hourly_wages  Hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40
6
Decompositions
–Intuitively, redundancy arises when a relational schema forces an association between attributes that is not natural.
–Functional dependencies can be used to identify such situations and suggest refinements to the schema.
–The essential idea is that many problems arising from redundancy can be addressed by replacing a relation with a collection of 'smaller' relations.
7
Original relation:

Id           name       lot  rating  Hourly_wages  Hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40

Decomposed into two relations:

Id           name       lot  rating  Hours_worked
123-22-3666  Attishoo   48   8       40
231-31-5368  Smiley     22   8       30
131-24-3650  Smethurst  35   5       30
434-26-3751  Guldu      35   5       32
612-67-4134  Madayan    35   8       40

rating  Hourly_wages
8       10
5       7

A decomposition of a relation schema R consists of replacing the relation schema by two (or more) relation schemas, each of which contains a subset of the attributes of R and which together include all attributes of R.
Functional dependency:
–rating determines Hourly_wages
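The decomposition on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `EMPLOYEES`, `decompose`, and `rejoin` are hypothetical and do not come from the notes, and rows are plain tuples in the column order shown on the slide.

```python
# Rows from the slide: (Id, name, lot, rating, Hourly_wages, Hours_worked).
# EMPLOYEES, decompose and rejoin are illustrative names, not from the notes.
EMPLOYEES = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def decompose(rows):
    """Project onto (Id, name, lot, rating, Hours_worked) and (rating, Hourly_wages)."""
    emps = [(eid, name, lot, rating, hours)
            for (eid, name, lot, rating, _wage, hours) in rows]
    # A set removes the repeated (rating, wage) pairs: this is the redundancy.
    wages = sorted({(rating, wage) for (_, _, _, rating, wage, _) in rows},
                   reverse=True)
    return emps, wages


def rejoin(emps, wages):
    """Join the two projections back together on the rating attribute."""
    wage_of = dict(wages)
    return [(eid, name, lot, rating, wage_of[rating], hours)
            for (eid, name, lot, rating, hours) in emps]


emps, wages = decompose(EMPLOYEES)
print(wages)                              # [(8, 10), (5, 7)]
print(rejoin(emps, wages) == EMPLOYEES)   # True
```

Each rating/wage pair is now stored once, and joining on rating gives back the original rows, so no information is lost for this instance.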
8
Functional Dependencies
A functional dependency (FD) is a kind of integrity constraint (IC) that generalizes the concept of a key.
Let R be a relation schema, and let X and Y be nonempty sets of attributes in R.
–An FD X → Y exists if, in every relation instance for R, any two tuples that agree on the value of X also agree on the value of Y.
–More formally: an FD X → Y exists in R if every instance of R preserves the FD X → Y.
–We say that an instance r of R preserves the FD X → Y if the following holds for every pair of tuples t1 and t2 in r: if t1.X = t2.X, then t1.Y = t2.Y.
–The notation t1.X refers to the subset of fields of tuple t1 for the attributes in X.
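The pairwise condition above translates almost directly into code. The following is a minimal sketch, assuming rows are represented as Python dictionaries; the function name `preserves_fd` and the variable `ROWS` are hypothetical, and the sample values are taken from the employee table shown earlier in these notes.

```python
from itertools import combinations


def preserves_fd(rows, lhs, rhs):
    """Return True if the instance `rows` preserves the FD lhs -> rhs.

    For every pair of tuples t1, t2: if they agree on all attributes in lhs,
    they must also agree on all attributes in rhs.
    """
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[a] != t2[a] for a in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True


# Three attributes of the employee instance used earlier in the notes.
ROWS = [
    {"lot": 48, "rating": 8, "Hourly_wages": 10},
    {"lot": 22, "rating": 8, "Hourly_wages": 10},
    {"lot": 35, "rating": 5, "Hourly_wages": 7},
    {"lot": 35, "rating": 5, "Hourly_wages": 7},
    {"lot": 35, "rating": 8, "Hourly_wages": 10},
]

print(preserves_fd(ROWS, ["rating"], ["Hourly_wages"]))  # True
print(preserves_fd(ROWS, ["lot"], ["rating"]))           # False
```

The second call fails because two tuples share lot 35 but have ratings 5 and 8.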
9
Examples (table instance Take; the table itself did not survive in this transcript)
–Is course_ID → course_name preserved?
–Is {student_ID, course_ID} → course_name preserved?
–If no two rows agree on the value of X, then X → Y is trivially preserved.
–Answer given on the slide: yes
10
The table instance also preserves the following:
–student_ID → student_name
–student_ID, course_ID → {student_name, course_name}
–student_ID, course_ID → {student_ID, student_name, course_ID, course_name}
–student_name → student_name (a trivial dependency)
–student_name, course_name → student_name (also trivial)
–many more ...
11
How do we know if an FD exists in R?
Can we check all instances of R to see if the FD is preserved?
–Definitely not possible!
–Whether or not a functional dependency exists must be determined by assumptions given in advance, or by common sense, not by individual relation instances.
Given an instance r of R, we can check whether r preserves some functional dependency f, but we cannot tell whether f holds over R.
–Does course_ID → student_name hold? No. Although it is preserved by this table, it does not fit the assumptions.
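The point that one instance can refute an FD but never establish it can be illustrated with a small sketch. The rows below are invented for illustration (the original table is not available in this transcript), and `find_violation` is a hypothetical helper name.

```python
def find_violation(rows, lhs, rhs):
    """Return a pair of rows that agree on lhs but differ on rhs, or None."""
    seen = {}  # lhs value -> (rhs value, first row that had it)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key][0] != val:
            return seen[key][1], row
        seen.setdefault(key, (val, row))
    return None


# Hypothetical instance: each course happens to have exactly one student.
take = [
    {"student_ID": 1, "student_name": "Ann", "course_ID": "C1"},
    {"student_ID": 2, "student_name": "Bob", "course_ID": "C2"},
]
# This instance preserves course_ID -> student_name ...
print(find_violation(take, ["course_ID"], ["student_name"]))  # None

# ... but a perfectly legal new enrollment breaks it, so the FD
# does not hold over the schema.
take.append({"student_ID": 2, "student_name": "Bob", "course_ID": "C1"})
print(find_violation(take, ["course_ID"], ["student_name"]) is not None)  # True
```

A returned pair is a counterexample that proves the FD does not hold; a result of `None` only says that this particular instance happens to preserve it.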
12
The assumptions given in advance, or common sense, impose some constraints, and are called the semantics of a database.
–Assumptions given in advance impose explicit constraints.
–Common sense imposes implicit constraints.
13
Example: The application is to keep track of information about employees in a company. Information to be kept track of includes:
–eid: employee's id number
–ename: employee name
–address: address of the employee
–sex: employee's sex
–dname: name of the department that the employee works for
–dhname: department head's name
–dhsex: department head's sex
14
Let's construct a relation schema as follows:
Employee(eid, ename, address, sex, dname, dhname, dhsex)
Which of the following dependencies are true?
1. eid → ename
2. ename → eid
3. eid → address
4. eid → sex
5. sex → address
6. dhname → dname
7. dhname → eid
8. dhsex → sex
Assumptions:
a: Employee's id number is unique
b: Each employee has a unique address
c: Each employee works for only one dept.
d: A person can be the head of at most one department
e: All department heads have different names
Implicit assumptions: common sense
15
X is a superkey for relation schema R iff X → attri(R), where attri(R) denotes the set of all the attributes in schema R.
X is a candidate key (or simply, key) for R iff X → attri(R) and X is minimal, i.e., for any proper subset X′ of X, X′ → attri(R) does not hold.
In other words, a candidate key is a minimal superkey.
–(student_ID, course_ID) is a candidate key (and the only one)
–(student_ID, course_ID, course_name) is a superkey, but not a candidate key
–(student_ID, course_ID, student_name) is another non-candidate superkey
–(student_ID, course_ID, course_name, student_name) is also a non-candidate superkey
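These definitions can be checked mechanically with the standard attribute-closure computation, which the notes do not describe; it is brought in here only as an illustration. The sketch assumes that student_ID → student_name and course_ID → course_name are the FDs that hold over the schema, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) >= all_attrs


def is_candidate_key(attrs, all_attrs, fds):
    """A candidate key is a superkey none of whose proper subsets is a superkey.

    Dropping one attribute at a time is enough: any superset of a superkey
    is itself a superkey.
    """
    attrs = set(attrs)
    if not is_superkey(attrs, all_attrs, fds):
        return False
    return not any(is_superkey(attrs - {a}, all_attrs, fds) for a in attrs)


# Assumed schema and FDs for the student/course example.
R = {"student_ID", "student_name", "course_ID", "course_name"}
FDS = [
    ({"student_ID"}, {"student_name"}),
    ({"course_ID"}, {"course_name"}),
]

print(is_candidate_key({"student_ID", "course_ID"}, R, FDS))               # True
print(is_superkey({"student_ID", "course_ID", "course_name"}, R, FDS))      # True
print(is_candidate_key({"student_ID", "course_ID", "course_name"}, R, FDS)) # False
```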
16
Normal Forms

1st Normal Form         No repeating data groups
2nd Normal Form         No partial key dependency
3rd Normal Form         No transitive dependency
Boyce-Codd Normal Form  Reduce keys dependency
4th Normal Form         No multi-valued dependency
5th Normal Form         No join dependency
17
Normal Form (NF)
–1NF: each attribute or column value must be atomic.
–2NF: a schema is in 2NF if it is in 1NF and all attributes that are not part of the primary key are fully functionally dependent on the primary key.
–3NF: a schema is in 3NF if it is in 2NF and all transitive dependencies have been removed.
Ex: employeeDept(employeeID, name, job, deptID, deptName) has to be converted to
employee(employeeID, name, job, deptID)
Dept(deptID, deptName)
18
2NF
2NF means that each non-key attribute must be functionally dependent on all parts of the primary key (i.e., on the whole combination of attributes that make up a composite key).
Example: not 2NF
Employee(employeeID, name, job, departmentID, skill)
–employeeID, skill → name, job, departmentID
–employeeID → name, job, departmentID
(Note: → reads "determine")
Break the table into two tables to reach 2NF:
Employee(employeeID, name, job, departmentID)
employeeSkills(employeeID, skill)
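The 2NF test in this example can be sketched as a check for partial key dependencies. This is a simplified illustration: it inspects only the FDs listed explicitly (not FDs implied by them), it assumes a single candidate key, and `partial_dependencies` is a hypothetical name.

```python
def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key and whose
    right side contains at least one non-key attribute."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        if set(lhs) < key and non_key:  # "<" means proper subset
            found.append((set(lhs), non_key))
    return found


# Employee(employeeID, name, job, departmentID, skill), key (employeeID, skill)
before = partial_dependencies(
    {"employeeID", "skill"},
    [({"employeeID"}, {"name", "job", "departmentID"})],
)
print(len(before))  # 1: employeeID alone determines non-key attributes

# After the split: Employee has key employeeID, employeeSkills has no
# non-key attributes at all.
after_employee = partial_dependencies(
    {"employeeID"},
    [({"employeeID"}, {"name", "job", "departmentID"})],
)
after_skills = partial_dependencies({"employeeID", "skill"}, [])
print(after_employee, after_skills)  # [] []
```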
19
3NF
Example: 2NF but not 3NF
Employee(employeeID, name, job, departmentID, departmentName)
Here
–employeeID → departmentID
–employeeID → departmentName
Also
–departmentID → departmentName, and departmentID is not a key
Therefore, employeeID → departmentName is a transitive dependency.
Convert the schema to 3NF by breaking it into two tables:
Employee(employeeID, name, job, departmentID)
Department(departmentID, departmentName)
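The reasoning on this slide can be approximated by a small check that flags FDs whose left side is not a superkey but whose right side contains non-key attributes. This is a rough sketch: it assumes a single candidate key per table, looks only at the FDs listed explicitly, and uses the hypothetical name `third_nf_violations`.

```python
def third_nf_violations(key, fds):
    """List FDs lhs -> rhs where lhs does not contain the key, yet rhs
    includes non-key attributes that are not already in lhs."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        determined = set(rhs) - key - lhs
        if not lhs >= key and determined:
            found.append((lhs, determined))
    return found


# Employee(employeeID, name, job, departmentID, departmentName)
before = third_nf_violations(
    {"employeeID"},
    [
        ({"employeeID"}, {"name", "job", "departmentID", "departmentName"}),
        ({"departmentID"}, {"departmentName"}),
    ],
)
print(before == [({"departmentID"}, {"departmentName"})])  # True

# After the split, every listed FD has a key on its left side.
employee = third_nf_violations(
    {"employeeID"}, [({"employeeID"}, {"name", "job", "departmentID"})]
)
department = third_nf_violations(
    {"departmentID"}, [({"departmentID"}, {"departmentName"})]
)
print(employee, department)  # [] []
```

The flagged FD, departmentID → departmentName, is exactly the one that makes employeeID → departmentName transitive; moving it into its own Department table removes the violation.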
20
Normal Forms Defined Informally
1st normal form
–All attributes depend on the key
2nd normal form
–All attributes depend on the whole key
3rd normal form
–All attributes depend on nothing but the key
21
SUMMARY OF NORMAL FORMS based on Primary Keys