Published by Denis Carter. Modified over 8 years ago.
1
Chapter 15
Functional Dependencies and Normalization for Relational Databases
Note: The slides are a means of explaining the lesson and one of the tools for doing so. The primary reference for the course material is the textbook specified in the course description.
2
Introduction (Chapter 10-2)
We have studied the relational model, which consists of a group of relation schemas.
Each relation schema consists of a number of attributes.
Until now, attributes have been grouped into a relation schema using the common sense of the database designer.
We still need a formal measure of why one grouping of attributes into a relation schema may be better than another.
So far, we have not developed any measure of appropriateness for judging the quality of a design.
3
Outline
Informal design guidelines for good and bad relation schemas:
1. Semantics of the Relation Attributes
2. Reducing the Redundant Information in Tuples
3. Reducing the Null Values in Tuples
4. Disallowing the Possibility of Generating Spurious Tuples
4
Semantics of the Relation Attributes
Design a relation schema so that it is easy to explain its meaning.
Do not combine attributes from multiple entity types and relationship types into a single relation.
Intuitively, if a relation corresponds to one entity type or one relationship type, it is straightforward to explain its meaning.
Otherwise, if the relation corresponds to a mixture of multiple entities, semantic ambiguities will result and the relation cannot be easily explained.
The semantics of attributes should be easy to interpret.
5
Simplified COMPANY (1)
EMPLOYEE(ENAME, SSN, BDATE, ADDRESS, DNUMBER)
DEPARTMENT(DNAME, DNUMBER, DMGR_SSN)
DEPARTMENT_LOCATIONS(DNUMBER, DLOCATION)
PROJECT(PNAME, PNUMBER, PLOCATION, DNUM)
WORKS_ON(SSN, PNUMBER, HOURS)
6
Simplified COMPANY (1)
The meaning of the EMPLOYEE relation is quite simple.
Each tuple represents an employee, with values for the employee's name, social security number, birth date, and address, and the number of the department that the employee works for.
The DNUMBER attribute is a foreign key that represents an implicit relationship between EMPLOYEE and DEPARTMENT.
7
Simplified COMPANY (1)
The semantics of the DEPARTMENT and PROJECT schemas are also straightforward.
Each DEPARTMENT tuple represents a department entity, and each PROJECT tuple represents a project entity.
The attribute DMGR_SSN of DEPARTMENT relates a department to the employee who is its manager.
The attribute DNUM of PROJECT relates a project to its controlling department.
Both attributes are foreign keys.
8
Simplified COMPANY (1)
The semantics of the other two relation schemas are slightly more complex.
Each tuple in DEPARTMENT_LOCATIONS gives a department number (DNUMBER) and one of the locations of the department (DLOCATION).
Each tuple in WORKS_ON gives an employee's social security number (SSN), the project number of one of the projects that the employee works on (PNUMBER), and the number of hours per week that the employee works on that project (HOURS).
All five relation schemas are considered easy to explain and hence have clear semantics.
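The foreign-key relationship between EMPLOYEE and DEPARTMENT can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the column types are guesses, the DEPARTMENT_LOCATIONS, PROJECT, and WORKS_ON relations are left out for brevity, and the employee "Doe, Jane" and department 99 are hypothetical values that do not appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DNAME    TEXT NOT NULL,
    DNUMBER  INTEGER PRIMARY KEY,
    DMGR_SSN TEXT                      -- column types are assumptions
);
CREATE TABLE EMPLOYEE (
    ENAME   TEXT NOT NULL,
    SSN     TEXT PRIMARY KEY NOT NULL,
    BDATE   TEXT,
    ADDRESS TEXT,
    DNUMBER INTEGER REFERENCES DEPARTMENT(DNUMBER)
);
""")

conn.execute("INSERT INTO DEPARTMENT VALUES ('Research', 5, '333445555')")
conn.execute(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    ("Smith, John B.", "123456789", "1965-01-09", "731 Fondren, Houston, TX", 5),
)

# Hypothetical employee pointing at a department that does not exist.
try:
    conn.execute(
        "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
        ("Doe, Jane", "000000000", None, None, 99),
    )
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```

Because each relation describes one entity type, the only link between them is the DNUMBER foreign key, and the database itself can reject a tuple that refers to a missing department.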
9
Informally
The ease with which the meaning of a relation's attributes can be explained is an informal measure of how well the relation is designed.
10
Guideline 1
Design a relation schema so that it is easy to explain its meaning.
Do not combine attributes from multiple entity types and relationship types into a single relation.
11
Simplified COMPANY (2)
EMP_DEPT(ENAME, SSN, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
12
Simplified COMPANY (2)
The relation schemas above have clear semantics.
Each tuple in EMP_DEPT represents a single employee but includes additional information, namely the name of the department for which the employee works and the social security number of the department's manager.
Each tuple in EMP_PROJ relates an employee to a project but also includes the employee name, project name, and project location.
13
Conclusion
Although there is nothing logically wrong with these two relations, they are considered poor design because they violate Guideline 1 by mixing attributes from distinct real-world entities.
They may be used as views, but they cause problems when used as base relations.
14
Redundant Information in Tuples and Update Anomalies
One goal of schema design is to minimize the storage space used by the base relations.
Combining attributes from multiple entity types has a significant effect on storage space.
15
Redundant Information in Tuples and Update Anomalies
EMP_DEPT(ENAME, SSN, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
Redundancy: DNAME and DMGRSSN are repeated for every employee who works for that department.
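A rough way to see the redundancy is to count how often each department's facts are stored. The sketch below uses the eight employees from the deck's Emp_Dept sample data, with the birth date and address columns dropped for brevity; the variable names are illustrative, not part of the course material.

```python
from collections import Counter

# (Ename, Ssn, Dnumber, Dname, Dmgr_ssn); Bdate and Address omitted for brevity.
EMP_DEPT = [
    ("Smith, John B.",       "123456789", 5, "Research",       "333445555"),
    ("Wong, Franklin T.",    "333445555", 5, "Research",       "333445555"),
    ("Zelaya, Alicia J.",    "999887777", 4, "Administration", "987654321"),
    ("Wallace, Jennifer S.", "987654321", 4, "Administration", "987654321"),
    ("Narayan, Ramesh K.",   "666884444", 5, "Research",       "333445555"),
    ("English, Joyce A.",    "453453453", 5, "Research",       "333445555"),
    ("Jabbar, Ahmad V.",     "987987987", 4, "Administration", "987654321"),
    ("Borg, James E.",       "888665555", 1, "Headquarters",   "888665555"),
]

# How many times is each (Dnumber, Dname, Dmgr_ssn) combination stored?
dept_copies = Counter((dnum, dname, mgr) for _, _, dnum, dname, mgr in EMP_DEPT)

# A separate DEPARTMENT relation would store each combination exactly once.
DEPARTMENT = sorted(dept_copies)
redundant_copies = len(EMP_DEPT) - len(DEPARTMENT)

for dept, copies in sorted(dept_copies.items()):
    print(dept, "stored", copies, "time(s)")
print("redundant department copies:", redundant_copies)
```

With this sample, the Research facts are stored four times and the Administration facts three times, whereas a separate DEPARTMENT relation would need only three tuples in total.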
16
Redundant Information in Tuples and Update Anomalies
Another serious problem with using the previous schemas as base relations is update anomalies, which are classified into:
Insertion anomalies
Deletion anomalies
Modification anomalies
17
Insertion anomalies
EMP_DEPT(ENAME, SSN, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
Problem 1: To insert a new tuple for an employee who works in department 5, we must enter the attribute values of department 5 correctly so that they are consistent with the values for department 5 in the other tuples.
Problem 2: The only way to insert a new department that has no employees is to place null values in the employee attributes. This causes a problem because SSN is the primary key and cannot be NULL.
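Both insertion problems can be reproduced in a small sqlite3 sketch. The assumptions here are the column types, the misspelled department name 'Reserch' (a deliberate data-entry slip), and the employee-less department 6 'Marketing', none of which come from the slides. SSN is declared NOT NULL explicitly because SQLite does not otherwise forbid NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMP_DEPT (
        ENAME TEXT, SSN TEXT PRIMARY KEY NOT NULL, BDATE TEXT, ADDRESS TEXT,
        DNUMBER INTEGER, DNAME TEXT, DMGRSSN TEXT
    )
""")
conn.execute(
    "INSERT INTO EMP_DEPT VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Smith, John B.", "123456789", "1965-01-09",
     "731 Fondren, Houston, TX", 5, "Research", "333445555"),
)

# Problem 1: nothing stops a second employee of department 5 from
# carrying a department name that disagrees with the first tuple.
conn.execute(
    "INSERT INTO EMP_DEPT VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Wong, Franklin T.", "333445555", "1955-12-08",
     "638 Voss, Houston, TX", 5, "Reserch", "333445555"),
)
names_for_dept_5 = {
    row[0] for row in conn.execute("SELECT DNAME FROM EMP_DEPT WHERE DNUMBER = 5")
}

# Problem 2: a department without employees would need a NULL SSN,
# which the primary key does not allow.
try:
    conn.execute(
        "INSERT INTO EMP_DEPT (SSN, DNUMBER, DNAME, DMGRSSN) VALUES (NULL, 6, 'Marketing', NULL)"
    )
    department_only_insert_ok = True
except sqlite3.IntegrityError:
    department_only_insert_ok = False
```

After running it, department 5 has two different names on record, and the attempt to store a department on its own is rejected.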
18
Deletion anomalies
EMP_DEPT(ENAME, SSN, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
Problem 3: If an employee is the sole employee of a department, deleting that employee's tuple also deletes the information about the corresponding department.
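The deletion anomaly can be illustrated in a few lines of plain Python, using three rows taken from the deck's Emp_Dept sample data (only four of the seven columns are kept). The helper name `departments` is an illustrative choice.

```python
# (Ename, Ssn, Dnumber, Dname); remaining EMP_DEPT columns omitted for brevity.
emp_dept = [
    ("Smith, John B.",    "123456789", 5, "Research"),
    ("Wong, Franklin T.", "333445555", 5, "Research"),
    ("Borg, James E.",    "888665555", 1, "Headquarters"),
]

def departments(rows):
    """Return the set of (Dnumber, Dname) facts recorded in the rows."""
    return {(dnum, dname) for _, _, dnum, dname in rows}

before = departments(emp_dept)

# Delete the only employee of department 1.
remaining = [row for row in emp_dept if row[1] != "888665555"]
after = departments(remaining)

lost_departments = before - after
print("departments lost by the deletion:", lost_departments)
```

Removing Borg's tuple also removes the only record that department 1 (Headquarters) exists.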
19
Modification anomalies
EMP_DEPT(ENAME, SSN, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN)
Problem 4: If we change the manager of department 5, we must update the tuples of all employees who work in that department; otherwise, the database becomes inconsistent.
If some tuples are not updated, the same department will show two different values for its manager, which is wrong.
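A minimal sketch of the modification anomaly follows, assuming a buggy update routine that stops after the first matching tuple. The rows come from the deck's Emp_Dept sample data; choosing Smith's SSN as the new manager is purely hypothetical.

```python
emp_dept = [
    {"Ename": "Smith, John B.",     "Ssn": "123456789", "Dnumber": 5, "Dmgr_ssn": "333445555"},
    {"Ename": "Wong, Franklin T.",  "Ssn": "333445555", "Dnumber": 5, "Dmgr_ssn": "333445555"},
    {"Ename": "Narayan, Ramesh K.", "Ssn": "666884444", "Dnumber": 5, "Dmgr_ssn": "333445555"},
]

new_manager = "123456789"  # hypothetical new manager of department 5

# Faulty update: only the first tuple of department 5 is changed.
for row in emp_dept:
    if row["Dnumber"] == 5:
        row["Dmgr_ssn"] = new_manager
        break

managers_of_dept_5 = {row["Dmgr_ssn"] for row in emp_dept if row["Dnumber"] == 5}
is_consistent = len(managers_of_dept_5) == 1
print("managers recorded for department 5:", sorted(managers_of_dept_5))
```

Department 5 now has two managers on record. With a separate DEPARTMENT relation the manager would be stored in a single tuple, so a partial update of this kind could not occur.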
20
Conclusion
Mixing attributes of multiple entities may cause problems:
Information is stored redundantly, wasting storage.
Update anomalies arise:
  Insertion anomalies
  Deletion anomalies
  Modification anomalies
21
Guideline 2
Design a schema that does not suffer from insertion, deletion, and modification anomalies.
If any anomalies are present:
  Note them clearly.
  Make sure that the programs that update the database will operate correctly.
22
Null Values in Tuples
In some schema designs we may group many attributes together into a "fat" relation.
If many of the attributes do not apply to all tuples in the relation, we end up with many nulls in those tuples.
Problems with NULLs:
  They waste space at the storage level.
  They lead to problems when using aggregate operations such as SUM.
  They have multiple interpretations:
    The attribute does not apply to this tuple.
    The attribute value is unknown.
    The value is known to exist but is unavailable.
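The aggregate problem can be seen with sqlite3, using the three WORKS_ON rows for project 20 from the deck's sample data, one of which has NULL hours. The table definition and column types are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WORKS_ON (SSN TEXT, PNUMBER INTEGER, HOURS REAL)")
conn.executemany(
    "INSERT INTO WORKS_ON VALUES (?, ?, ?)",
    [
        ("987654321", 20, 15.0),
        ("333445555", 20, 10.0),
        ("888665555", 20, None),  # hours stored as NULL
    ],
)

# COUNT(*) counts tuples; COUNT, SUM and AVG over a column skip NULLs.
count_rows, count_hours, sum_hours, avg_hours = conn.execute(
    "SELECT COUNT(*), COUNT(HOURS), SUM(HOURS), AVG(HOURS) "
    "FROM WORKS_ON WHERE PNUMBER = 20"
).fetchone()

# A comparison with NULL is never true, so this selects nothing.
equals_null_matches = conn.execute(
    "SELECT COUNT(*) FROM WORKS_ON WHERE HOURS = NULL"
).fetchone()[0]

print(count_rows, count_hours, sum_hours, avg_hours, equals_null_matches)
```

Three people work on the project, but the average is computed over two values, so the result depends on which interpretation of the NULL the reader has in mind.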
23
Guideline 3
Avoid placing attributes in a base relation whose values may frequently be NULL.
Attributes that are frequently NULL can be placed in separate relations (together with the primary key).
For example, if only 10% of employees have individual offices, it does not make sense to include an attribute OFFICE_NUMBER in the EMPLOYEE relation.
Rather, create a new relation EMP_OFFICE(ESSN, OFFICE_NUMBER).
24
Generation of Spurious Tuples
Emp_Dept
Ename                 SSN        Bdate       Address                   Dnumber  Dname           Dmgr_ssn
Smith, John B.        123456789  1965/01/09  731 Fondren, Houston, TX  5        Research        333445555
Wong, Franklin T.     333445555  1955/12/08  638 Voss, Houston, TX     5        Research        333445555
Zelaya, Alicia J.     999887777  1968/07/19  3321 Castle, Spring, TX   4        Administration  987654321
Wallace, Jennifer S.  987654321  1941/06/20  291 Berry, Bellaire, TX   4        Administration  987654321
Narayan, Ramesh K.    666884444  1962/09/15  975 Fire Oak, Humble, TX  5        Research        333445555
English, Joyce A.     453453453  1972/07/31  5631 Rice, Houston, TX    5        Research        333445555
Jabbar, Ahmad V.      987987987  1969/03/29  980 Dallas, Houston, TX   4        Administration  987654321
Borg, James E.        888665555  1937/11/10  450 Stone, Houston, TX    1        Headquarters    888665555
Redundancy: the Dname and Dmgr_ssn values are repeated in the tuple of every employee of the same department.
25
Generation of Spurious Tuples
Emp_Proj
SSN        Pnumber  Hours  Ename                 Pname            Plocation
123456789  1        32.5   Smith, John B.        ProductX         Bellaire
123456789  2        7.5    Smith, John B.        ProductY         Sugarland
666884444  3        40     Narayan, Ramesh K.    ProductZ         Houston
453453453  1        20     English, Joyce A.     ProductX         Bellaire
453453453  2        20     English, Joyce A.     ProductY         Sugarland
333445555  2        10     Wong, Franklin T.     ProductY         Sugarland
333445555  3        10     Wong, Franklin T.     ProductZ         Houston
333445555  10       10     Wong, Franklin T.     Computerization  Stafford
333445555  20       10     Wong, Franklin T.     Reorganization   Houston
999887777  30       30     Zelaya, Alicia J.     Newbenefits      Stafford
999887777  10       10     Zelaya, Alicia J.     Computerization  Stafford
987987987  10       35     Jabbar, Ahmad V.      Computerization  Stafford
987987987  30       5      Jabbar, Ahmad V.      Newbenefits      Stafford
987654321  30       20     Wallace, Jennifer S.  Newbenefits      Stafford
987654321  20       15     Wallace, Jennifer S.  Reorganization   Houston
888665555  20       null   Borg, James E.        Reorganization   Houston
Redundancy: employee names and project names and locations are repeated across tuples.
26
Generation of Spurious Tuples
EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
Suppose that we used EMP_PROJ1 and EMP_LOCS as the base relations instead of EMP_PROJ.
This produces a bad schema design, because we cannot recover the information that was originally in EMP_PROJ from EMP_PROJ1 and EMP_LOCS.
27
Generation of Spurious Tuples
Emp_Proj1
SSN        Pnumber  Hours  Pname            Plocation
123456789  1        32.5   ProductX         Bellaire
123456789  2        7.5    ProductY         Sugarland
666884444  3        40     ProductZ         Houston
453453453  1        20     ProductX         Bellaire
453453453  2        20     ProductY         Sugarland
333445555  2        10     ProductY         Sugarland
333445555  3        10     ProductZ         Houston
333445555  10       10     Computerization  Stafford
333445555  20       10     Reorganization   Houston
999887777  30       30     Newbenefits      Stafford
999887777  10       10     Computerization  Stafford
987987987  10       35     Computerization  Stafford
987987987  30       5      Newbenefits      Stafford
987654321  30       20     Newbenefits      Stafford
987654321  20       15     Reorganization   Houston
888665555  20       null   Reorganization   Houston

EMP_LOCS
Ename                 Plocation
Smith, John B.        Bellaire
Smith, John B.        Sugarland
Narayan, Ramesh K.    Houston
English, Joyce A.     Bellaire
English, Joyce A.     Sugarland
Wong, Franklin T.     Sugarland
Wong, Franklin T.     Houston
Wong, Franklin T.     Stafford
Zelaya, Alicia J.     Stafford
Jabbar, Ahmad V.      Stafford
Wallace, Jennifer S.  Stafford
Wallace, Jennifer S.  Houston
Borg, James E.        Houston
28
Generation of Spurious Tuples
If we attempt a NATURAL JOIN operation on EMP_PROJ1 and EMP_LOCS, the result contains many more tuples than the original set of tuples in EMP_PROJ.
The additional tuples that were not in EMP_PROJ are called spurious tuples because they represent information that is not valid.
29
Generation of Spurious Tuples
Emp_Proj1 * Emp_Locs (rows shown on the slide; spurious tuples are marked with *)
   SSN        Pnumber  Hours  Pname            Plocation  Ename
   123456789  1        32.5   ProductX         Bellaire   Smith, John B.
*  123456789  1        32.5   ProductX         Bellaire   English, Joyce A.
   123456789  2        7.5    ProductY         Sugarland  Smith, John B.
*  123456789  2        7.5    ProductY         Sugarland  English, Joyce A.
*  123456789  2        7.5    ProductY         Sugarland  Wong, Franklin T.
   666884444  3        40     ProductZ         Houston    Narayan, Ramesh K.
*  666884444  3        40     ProductZ         Houston    Wong, Franklin T.
*  453453453  1        20     ProductX         Bellaire   Smith, John B.
   453453453  1        20     ProductX         Bellaire   English, Joyce A.
*  453453453  2        20     ProductY         Sugarland  Smith, John B.
   453453453  2        20     ProductY         Sugarland  English, Joyce A.
*  453453453  2        20     ProductY         Sugarland  Wong, Franklin T.
*  333445555  2        10     ProductY         Sugarland  Smith, John B.
*  333445555  2        10     ProductY         Sugarland  English, Joyce A.
   333445555  2        10     ProductY         Sugarland  Wong, Franklin T.
*  333445555  3        10     ProductZ         Houston    Narayan, Ramesh K.
   333445555  3        10     ProductZ         Houston    Wong, Franklin T.
   333445555  10       10     Computerization  Stafford   Wong, Franklin T.
*  333445555  20       10     Reorganization   Houston    Narayan, Ramesh K.
   333445555  20       10     Reorganization   Houston    Wong, Franklin T.
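The effect can be reproduced on a small subset of the data. The sketch below takes four EMP_PROJ tuples from the sample data, projects them onto EMP_PROJ1 and EMP_LOCS, and joins the two projections on Plocation, their only common attribute. Python sets stand in for relations here; this is an illustration under that assumption, not how a DBMS evaluates a join.

```python
# (Ssn, Pnumber, Hours, Ename, Pname, Plocation): four tuples of EMP_PROJ.
EMP_PROJ = {
    ("123456789", 1, 32.5, "Smith, John B.",    "ProductX", "Bellaire"),
    ("123456789", 2, 7.5,  "Smith, John B.",    "ProductY", "Sugarland"),
    ("453453453", 1, 20.0, "English, Joyce A.", "ProductX", "Bellaire"),
    ("333445555", 2, 10.0, "Wong, Franklin T.", "ProductY", "Sugarland"),
}

# Projections: EMP_PROJ1 drops Ename; EMP_LOCS keeps only Ename and Plocation.
EMP_PROJ1 = {(ssn, pno, hrs, pname, ploc)
             for ssn, pno, hrs, _, pname, ploc in EMP_PROJ}
EMP_LOCS = {(ename, ploc) for _, _, _, ename, _, ploc in EMP_PROJ}

# Natural join on the shared attribute Plocation.
joined = {
    (ssn, pno, hrs, ename, pname, ploc)
    for ssn, pno, hrs, pname, ploc in EMP_PROJ1
    for ename, loc in EMP_LOCS
    if ploc == loc
}

spurious = joined - EMP_PROJ

print("original tuples:", len(EMP_PROJ))
print("joined tuples:  ", len(joined))
for row in sorted(spurious):
    print("spurious:", row)
```

Every original tuple is still present in the join, but it is mixed with tuples such as Smith's project 1 hours attributed to English, and nothing in the result distinguishes the valid tuples from the spurious ones.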
30
31
Guideline 4
Avoid relations that contain matching attributes that are not primary or foreign keys, because joining on such attributes may produce spurious tuples.
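For contrast, here is a sketch of a decomposition whose relations are joined only on key attributes. The split into EMPLOYEE(Ssn, Ename), PROJECT(Pnumber, Pname, Plocation), and WORKS_ON(Ssn, Pnumber, Hours) is an assumption made for illustration; it mirrors the simplified COMPANY schema but is not spelled out on this slide. The four EMP_PROJ tuples come from the sample data.

```python
# (Ssn, Pnumber, Hours, Ename, Pname, Plocation): four tuples of EMP_PROJ.
EMP_PROJ = {
    ("123456789", 1, 32.5, "Smith, John B.",    "ProductX", "Bellaire"),
    ("123456789", 2, 7.5,  "Smith, John B.",    "ProductY", "Sugarland"),
    ("453453453", 1, 20.0, "English, Joyce A.", "ProductX", "Bellaire"),
    ("333445555", 2, 10.0, "Wong, Franklin T.", "ProductY", "Sugarland"),
}

# Decompose so that every shared attribute is a primary or foreign key.
EMPLOYEE = {(ssn, ename) for ssn, _, _, ename, _, _ in EMP_PROJ}
PROJECT = {(pno, pname, ploc) for _, pno, _, _, pname, ploc in EMP_PROJ}
WORKS_ON = {(ssn, pno, hrs) for ssn, pno, hrs, _, _, _ in EMP_PROJ}

# Join WORKS_ON to EMPLOYEE on Ssn and to PROJECT on Pnumber.
employee_by_ssn = dict(EMPLOYEE)
project_by_pno = {pno: (pname, ploc) for pno, pname, ploc in PROJECT}
rejoined = {
    (ssn, pno, hrs, employee_by_ssn[ssn], *project_by_pno[pno])
    for ssn, pno, hrs in WORKS_ON
}

recovered_exactly = rejoined == EMP_PROJ
print("join on keys recovers EMP_PROJ exactly:", recovered_exactly)
```

Because each Ssn identifies one employee and each Pnumber identifies one project, every WORKS_ON tuple matches exactly one tuple in each of the other relations, so no extra tuples appear.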
32
Conclusions
Anomalies cause:
  Redundant work during insertion and modification
  Loss of information during a deletion
NULL values cause:
  Waste of storage space
  Difficulty of performing aggregation operations
Joins on improperly related base relations generate invalid and spurious data.