
PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.


1 PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University

2 Lecture 04 Relational Database Design Normalization

3 Outline
- Overview of Relational DBMS
- Normalization

4 Normalization
The aim of normalization is to eliminate various anomalies (or undesirable aspects) of a relation in order to obtain “better” relations. The following four problems might exist in a relation scheme:
- Repetition anomaly
- Update anomaly
- Insertion anomaly
- Deletion anomaly

5 Repetition Anomaly
The ENAME, TITLE, and SAL attribute values are repeated for each project that the employee is involved in. This repetition:
- Wastes space
- Complicates updates
- Is contrary to the spirit of databases

EMP
ENO  ENAME      TITLE        SAL    PNO  RESP        DUR
E1   J. Doe     Elect. Eng.  40000  P1   Manager     12
E2   M. Smith   Analyst      34000  P1   Analyst     24
E2   M. Smith   Analyst      34000  P2   Analyst     6
E3   A. Lee     Mech. Eng.   27000  P3   Consultant  10
E3   A. Lee     Mech. Eng.   27000  P4   Engineer    48
E4   J. Miller  Programmer   24000  P2   Programmer  18
E5   B. Casey   Syst. Anal.  34000  P2   Manager     24
E6   L. Chu     Elect. Eng.  40000  P4   Manager     48
E7   R. Davis   Mech. Eng.   27000  P3   Engineer    36
E8   J. Jones   Syst. Anal.  34000  P3   Manager     40

6 Update Anomaly
If a repeated attribute value (say, the SAL of an employee) is updated, every tuple that stores it has to be updated to reflect the change. (The slide shows the same EMP relation as slide 5.)

7 Insertion Anomaly
It may not be possible to store information about a new project until an employee is assigned to it, because every EMP tuple needs an ENO value. (The slide shows the same EMP relation as slide 5.)

8 Deletion Anomaly
If an engineer who is the only employee on a project leaves the company, either his personal information cannot be deleted or the information about that project is lost. Many tuples may have to be deleted. (The slide shows the same EMP relation as slide 5.)
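The anomalies above can be made concrete with a small Python sketch. It is only an illustration: the three sample rows follow the EMP relation, while the helper names (`split_emp`, `rows_storing_salary`) and the particular split into an employee table and an assignment table are assumptions, not something the slides prescribe.

```python
# Three sample rows in the shape of EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR).
EMP_ROWS = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng.", "SAL": 40000,
     "PNO": "P1", "RESP": "Manager", "DUR": 12},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Analyst", "SAL": 34000,
     "PNO": "P1", "RESP": "Analyst", "DUR": 24},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Analyst", "SAL": 34000,
     "PNO": "P2", "RESP": "Analyst", "DUR": 6},
]


def rows_storing_salary(rows, eno):
    """Count the tuples that would need changing if this employee's SAL changed."""
    return sum(1 for row in rows if row["ENO"] == eno)


def split_emp(rows):
    """Hypothetical decomposition: employee facts keyed by ENO, plus assignments."""
    employees = {}
    assignments = []
    for row in rows:
        employees[row["ENO"]] = {
            "ENAME": row["ENAME"], "TITLE": row["TITLE"], "SAL": row["SAL"],
        }
        assignments.append(
            {"ENO": row["ENO"], "PNO": row["PNO"],
             "RESP": row["RESP"], "DUR": row["DUR"]}
        )
    return employees, assignments


employees, assignments = split_emp(EMP_ROWS)
print(rows_storing_salary(EMP_ROWS, "E2"))  # tuples touched in the single EMP table
print(len(employees), len(assignments))     # employee facts stored once per ENO
```

In the single EMP table a salary change for E2 touches every tuple of that employee; after the split the salary is stored once per employee, so one update is enough.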

9 What to do?
Take each relation individually and “improve” it in terms of the desired characteristics.
- Normal forms
  o Atomic values (1NF)
  o Can be defined according to keys and dependencies
  o Functional dependencies (2NF, 3NF, BCNF)
  o Multivalued dependencies (4NF)
  o Projection-join dependencies (5NF)
- Normalization
  o Normalization is a process of concept separation that applies a top-down methodology for producing a schema by successive refinements and decompositions.
  o Do not combine unrelated sets of facts in one table; each relation should contain an independent set of facts.
  o Universal relation assumption

10 Normalization Issues
How do we decompose a schema into a desirable normal form? What criteria should the decomposed schemas follow in order to preserve the semantics of the original schema?
- Reconstructability: recover the original relation → no spurious joins
- Lossless decomposition: no information loss
- Dependency preservation: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.

11 A Lossy Decomposition

12 Example of Lossless-Join Decomposition
Decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)

r            Π_{A,B}(r)    Π_{B,C}(r)
A  B  C      A  B          B  C
α  1  A      α  1          1  A
β  2  B      β  2          2  B

Π_{A,B}(r) ⋈ Π_{B,C}(r)
A  B  C
α  1  A
β  2  B

Joining the two projections returns exactly r, so no spurious tuples appear.
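The lossless-join test on a single relation instance can be sketched in Python as follows. This is a minimal illustration under stated assumptions: relations are lists of dicts, the function names (`project`, `natural_join`, `is_lossless_on`) and the sample values are hypothetical, and the check covers only the given instance, not every legal instance of the schema.

```python
def project(relation, attrs):
    """Projection onto attrs; the set removes duplicate tuples."""
    return {frozenset((a, row[a]) for a in attrs) for row in relation}


def natural_join(r1, r2):
    """Natural join of two projected relations on their shared attributes."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            shared = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in shared):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


def is_lossless_on(relation, attrs1, attrs2):
    """True if joining the two projections gives back exactly this instance."""
    original = {frozenset(row.items()) for row in relation}
    return natural_join(project(relation, attrs1), project(relation, attrs2)) == original


# Hypothetical instance where B values are distinct: the join recovers r.
r_ok = [{"A": "a1", "B": 1, "C": "c1"}, {"A": "a2", "B": 2, "C": "c2"}]
# Hypothetical instance where B repeats: the join produces spurious tuples.
r_bad = [{"A": "a1", "B": 1, "C": "c1"}, {"A": "a2", "B": 1, "C": "c2"}]

print(is_lossless_on(r_ok, ("A", "B"), ("B", "C")))
print(is_lossless_on(r_bad, ("A", "B"), ("B", "C")))
```

The second instance shows what a lossy decomposition looks like in practice: the join contains tuples that were never in the original relation.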

13 Stages of Normalization
Unnormalized (UDF)
  → remove repeating groups
First normal form (1NF)
  → remove partial dependencies
Second normal form (2NF)
  → remove transitive dependencies
Third normal form (3NF)
  → remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
  → remove multivalued dependencies
Fourth normal form (4NF)
  → remove remaining anomalies
Fifth normal form (5NF)

14 Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.

Example: the following relation contains staff and department details and a list of telephone contact numbers for each member of staff.

staffNo  job       dept  dname       city       contactNumber
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking    Null
OS45     Clerk     30    Operations  Barking    079311555

Repeating groups are not allowed in a relational design, since all attributes have to be ‘atomic’, i.e., there can only be one value per cell in a table.
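One way to remove the repeating group is to move the contact numbers into their own relation, one number per tuple. The Python sketch below illustrates this under assumptions: the function name `to_1nf` and the list-based encoding of the repeating group are hypothetical choices made for the example, and Null is modelled as an empty list.

```python
# Staff rows with the repeating group held as a list (Null becomes an empty list).
STAFF = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales",
     "city": "Stratford", "contactNumber": ["018111777", "018111888", "079311122"]},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts",
     "city": "Barking", "contactNumber": ["017111777"]},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts",
     "city": "Barking", "contactNumber": []},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations",
     "city": "Barking", "contactNumber": ["079311555"]},
]


def to_1nf(rows, key, repeating):
    """Split off the repeating attribute so that every cell holds one value."""
    base = [{k: v for k, v in row.items() if k != repeating} for row in rows]
    detail = [{key: row[key], repeating: value}
              for row in rows for value in row[repeating]]
    return base, detail


staff, contacts = to_1nf(STAFF, "staffNo", "contactNumber")
for contact in contacts:
    print(contact)
```

Each contact tuple now carries exactly one number, identified by the pair (staffNo, contactNumber), and a member of staff without a number simply has no tuple in the new relation.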

15 Repeating Groups
Multivalued attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified, directly or indirectly, by the value of the primary key or a part of it (that is, they are not functionally dependent on it).

STUDENT
Stud_ID  Name     Course_ID         Units
101      Lennon   MSI 250, MSI 415  3.00
125      Johnson  MSI 331           3.00

Here Course_ID holds a repeating group for student 101.

16 Functional Dependency
Formal definition: Attribute B is functionally dependent on attribute A (or on a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal notation: A → B
This should be read as ‘A determines B’ or ‘B is functionally dependent on A’. A is called the determinant and B is called the object of the determinant.

Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Functional dependencies:
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
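The definition can be checked mechanically against sample data. The Python sketch below is an illustration only; `fd_holds` is a hypothetical helper name. Note that a relation instance can refute a functional dependency but cannot prove it, since a dependency is a statement about all legal instances.

```python
STAFF = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]


def fd_holds(rows, determinant, dependent):
    """True if no two rows agree on the determinant but differ on the dependent."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # setdefault stores rhs on first sight and returns the stored value after.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


print(fd_holds(STAFF, ["staffNo"], ["job"]))  # consistent with staffNo -> job
print(fd_holds(STAFF, ["dept"], ["dname"]))   # consistent with dept -> dname
print(fd_holds(STAFF, ["job"], ["dept"]))     # Clerk appears with dept 20 and 30
```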

17 Functional Dependency
Compound determinants: If more than one attribute is necessary to determine another attribute in an entity, then such a determinant is termed a composite determinant.
Full functional dependency: Only of relevance with composite determinants. This is the situation when it is necessary to use all the attributes of the composite determinant to identify its object uniquely.

Example:
order#  line#  qty  price
A001    001    10   200
A002    001    20   400
A002    002    20   800
A004    001    15   300

Full functional dependencies:
(order#, line#) → qty
(order#, line#) → price

18 Functional Dependency
Partial functional dependency: This is the situation that exists when only a subset of the attributes of the composite determinant is needed to identify its object uniquely.

student#  unit#  room   grade
9900100   A01    TH224  2
9900010   A01    TH224  14
9901011   A02    JS075  3
9900001   A01    TH224  16

Full functional dependency: (student#, unit#) → grade
Partial functional dependency: unit# → room
The partial dependency causes repetition of data: the room is stored again for every student taking the unit.
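Whether a dependency on a composite determinant is full or partial can be tested by trying every proper subset of the determinant. The Python sketch below is an illustration under assumptions: the helper names are hypothetical, and the last sample row is an invented extra enrolment, added so that student# alone does not appear to determine room or grade in this small instance.

```python
from itertools import combinations

KEY = ("student#", "unit#")
ROWS = [
    {"student#": "9900100", "unit#": "A01", "room": "TH224", "grade": 2},
    {"student#": "9900010", "unit#": "A01", "room": "TH224", "grade": 14},
    {"student#": "9901011", "unit#": "A02", "room": "JS075", "grade": 3},
    {"student#": "9900001", "unit#": "A01", "room": "TH224", "grade": 16},
    # Hypothetical extra row (not on the slide): a second unit for one student.
    {"student#": "9900100", "unit#": "A02", "room": "JS075", "grade": 5},
]


def fd_holds(rows, determinant, dependent):
    """True if no two rows agree on the determinant but differ on the dependent."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


def partial_determinants(rows, key, attr):
    """Proper subsets of the composite key that already determine attr."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if fd_holds(rows, subset, (attr,)):
                found.append(subset)
    return found


print(partial_determinants(ROWS, KEY, "room"))   # unit# alone is enough
print(partial_determinants(ROWS, KEY, "grade"))  # no proper subset is enough
```

An empty result means the attribute is fully functionally dependent on the whole key in this instance; a non-empty result flags a candidate partial dependency to remove when moving to 2NF.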

19 Partial Dependency
A partial dependency exists when a non-key attribute is determined by a part, but not the whole, of a composite primary key.

20 Transitive Dependency
Definition: A transitive dependency exists when there is an intermediate functional dependency.
Formal notation: If A → B and B → C, then it can be stated that the following transitive dependency exists: A → B → C

Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

staffNo → dept
dept → dname
Transitive dependency: staffNo → dept → dname
The transitive dependency causes repetition of data: dname is stored again for every member of staff in the department.
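Removing the transitive dependency amounts to moving dept → dname into its own relation. The Python sketch below illustrates this; the function name `split_transitive` and the dict-based department table are assumptions made for the example, not a method given on the slides.

```python
STAFF = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]


def split_transitive(rows, determinant, dependent):
    """Move determinant -> dependent into a lookup; drop dependent from the rows."""
    lookup = {}
    reduced = []
    for row in rows:
        lookup[row[determinant]] = row[dependent]
        reduced.append({k: v for k, v in row.items() if k != dependent})
    return reduced, lookup


staff, departments = split_transitive(STAFF, "dept", "dname")
print(departments)  # each department name is now stored once

# Joining back on dept reconstructs the original relation.
rejoined = [{**row, "dname": departments[row["dept"]]} for row in staff]
print(rejoined == STAFF)
```

After the split, renaming a department is a single change in the department table, and the join on dept recovers the original tuples.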

21 Transitive Dependency
A transitive dependency exists when a non-key attribute determines another non-key attribute.

22 Thank You

