Presentation is loading. Please wait.

Presentation is loading. Please wait.

N ORMALIZATION 1. Chapter 5 R ELATION Definition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute.

Similar presentations


Presentation on theme: "N ORMALIZATION 1. Chapter 5 R ELATION Definition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute."— Presentation transcript:

1 N ORMALIZATION 1

2 Chapter 5 R ELATION Definition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute or field) Requirements for a table to qualify as a relation: It must have a unique name It must have a unique name Every attribute value must be atomic (not multivalued, not composite) Every attribute value must be atomic (not multivalued, not composite) Every row must be unique (can’t have two rows with exactly the same values for all their fields) Every row must be unique (can’t have two rows with exactly the same values for all their fields) Attributes (columns) in tables must have unique names Attributes (columns) in tables must have unique names The order of the columns must be irrelevant The order of the columns must be irrelevant The order of the rows must be irrelevant The order of the rows must be irrelevant NOTE: all relations are in 1 st Normal form 2

3 Chapter 5 N OT A R ELATION BUT W HY ? 3

4 Chapter 5 A R ELATION 4

5 Chapter 5 C ORRESPONDENCE WITH E-R M ODEL Relations (tables) correspond with entity types Rows correspond with entity instances Columns correspond with attributes NOTE: The word relation (in relational database) is NOT the same as the word relationship (in E-R model) 5

6 Chapter 5 I NTEGRITY C ONSTRAINTS Domain Constraints Allowable values for an attribute. See Table 5-1 Allowable values for an attribute. See Table 5-1 Entity Integrity No primary key attribute may be null. All primary key fields MUST have data No primary key attribute may be null. All primary key fields MUST have data Action Assertions Business rules that control the actions of the organization Business rules that control the actions of the organization E.g. A student may graduate if he/she has CGPA >= 2.0 E.g. A student may graduate if he/she has CGPA >= 2.0 Similarly, A student may enroll in a course if he/she has passed the prerequisite of that course Similarly, A student may enroll in a course if he/she has passed the prerequisite of that course 6

7 Chapter 5 7 Domain definitions enforce domain integrity constraints

8 Chapter 5 I NTEGRITY C ONSTRAINTS Referential Integrity–rule states that any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side. (Or the foreign key can be null) For example: Delete Rules For example: Delete Rules Restrict–don’t allow delete of “parent” side if related rows exist in “dependent” side Cascade–automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted Set-to-Null–set the foreign key in the dependent side to null if deleting from the parent side  not allowed for weak entities 8

9 Chapter 5 9 Figure 5-5 Referential integrity constraints (Pine Valley Furniture) Referential integrity constraints are drawn via arrows from dependent to parent table

10 Chapter 5 10 Figure 5-6 SQL table definitions Referential integrity constraints are implemented with foreign key to primary key references

11 Chapter 5 D ATA N ORMALIZATION Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data The process of decomposing relations with anomalies to produce smaller, well- structured relations 11

12 Chapter 5 W ELL -S TRUCTURED R ELATIONS A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies Goal is to avoid anomalies Insertion Anomaly –adding new rows forces user to create duplicate data Insertion Anomaly –adding new rows forces user to create duplicate data Deletion Anomaly –deleting rows may cause a loss of data that would be needed for other future rows Deletion Anomaly –deleting rows may cause a loss of data that would be needed for other future rows Modification Anomaly –changing data in a row forces changes to other rows because of duplication Modification Anomaly –changing data in a row forces changes to other rows because of duplication 12 General rule of thumb: A table should not pertain to more than one entity type

13 Chapter 5 E XAMPLE –F IGURE 5-2 B 13 Question–Is this a relation? Answer–Yes: Unique rows and no multivalued attributes Question–What’s the primary key? Answer–Composite: Emp_ID, Course_Title

14 Chapter 5 A NOMALIES IN THIS T ABLE Insertion –can’t enter a new employee without having the employee take a class Deletion –if we remove employee 140, we lose information about the existence of a Tax Acc class Modification –giving a salary increase to employee 100 forces us to update multiple records 14 Why do these anomalies exist? Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities

15 Chapter 5 F UNCTIONAL D EPENDENCIES AND K EYS Functional Dependency: The value of one attribute (the determinant ) determines the value of another attribute Candidate Key: A unique identifier. One of the candidate keys will become the primary key A unique identifier. One of the candidate keys will become the primary key E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys Each non-key field is functionally dependent on every candidate key Each non-key field is functionally dependent on every candidate key 15

16 Chapter 5 16 Figure 5.22 Steps in normalization

17 Chapter 5 F IRST N ORMAL F ORM No multivalued attributes Every attribute value is atomic Fig. 5-25 is not in 1 st Normal Form (multivalued attributes)  it is not a relation Fig. 5-26 is in 1 st Normal form All relations are in 1 st Normal Form 17

18 Chapter 5 A S IMPLE R EPORT 18

19 Chapter 5 19 Table with multi valued attributes, not in 1 st normal form Note: this is NOT a relation

20 Chapter 5 20 Table with no multi valued attributes and unique rows, in 1 st normal form Note: this is relation, but not a well-structured one

21 Chapter 5 A NOMALIES IN THIS T ABLE Insertion –if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication Deletion –if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price Update –changing the price of product ID 4 requires update in several records 21 Why do these anomalies exist? Because there are multiple entity types in one relation. This results in duplication and an unnecessary dependency between the entities

22 Chapter 5 S ECOND N ORMAL F ORM 1NF PLUS every non-key attribute is fully functionally dependent on the ENTIRE primary key Every non-key attribute must be defined by the entire key, not by only part of the key Every non-key attribute must be defined by the entire key, not by only part of the key No partial functional dependencies No partial functional dependencies 22

23 Chapter 5 23 Order_ID  Order_Date, Customer_ID, Customer_Name, Customer_Address Therefore, NOT in 2 nd Normal Form Customer_ID  Customer_Name, Customer_Address Product_ID  Product_Description, Product_Finish, Unit_Price Order_ID, Product_ID  Order_Quantity Figure 5-27 Functional dependency diagram for INVOICE

24 Chapter 5 24 Partial dependencies are removed, but there are still transitive dependencies Getting it into Second Normal Form Figure 5-28 Removing partial dependencies

25 Chapter 5 T HIRD N ORMAL F ORM 2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes) Note: This is called transitive, because the primary key is a determinant for another attribute, which in turn is a determinant for a third Solution: Non-key determinant with transitive dependencies go into a new table; non-key determinant becomes primary key in the new table and stays as foreign key in the old table 25

26 Chapter 5 26 Transitive dependencies are removed Figure 5-28 Removing partial dependencies Getting it into Third Normal Form

27 Chapter 5 B OYCE C ODD NORMAL FORM (BCNF) Boyce Codd Normal Form (BCNF) is considered a special condition of third Normal form. A table is in BCNF if every determinant is a candidate key. A table can be in 3NF but not in BCNF. This occurs when a non key attribute is a determinant of a key attribute. 27

28 Chapter 5 E XAMPLE TABLE NOT IN BCNF S_NumT_Code Offering#Review Date 123599FIT10401764 2nd March 123599PIT3050176512th April 123599PIT30501789 2nd May 346700FIT10401764 3rd March 346700PIT30501765 7th May 28

29 Chapter 5 ANOMALIES The dependencies are S_Num, T_Code ® Offering#, Review Date which means that the table is in third normal form. The table is not in BCNF as if we know the offering number we know who the teacher is. Each offering can only have one teacher! Offering# ® T_Code A non key attribute is a determinant. 29

30 Chapter 5 R ELATIONS IN BCNF S_NumOffering# Review Date 123599017642nd March 1235990176512th April 123599017892nd May 346700017643rd March 346700017657th May 30 Offering#T_Code 01764FIT104 01765PIT305 01789PIT107

31 Chapter 5 F OURTH NORMAL FORM A relation R is in Fourth Normal Form (4NF) if and only if the following conditions are satisfied simultaneously: 1. R is already in 3NF or BCNF. 2. If it contains no multi-valued dependencies. 31

32 Chapter 5 M ULTI VALUED DEPENDENCY A multi-valued dependency exists when: 1. There are atleast three attributes A, B and C in a relation and 2. For each value of A there is a well defined set of values for B and C. 3. But the set of values of B is independent of set C 32

33 Chapter 5 E XAMPLE TABLE NOT IN 4 TH NF : Pizza Delivery Permutations RestaurantPizza VarietyDelivery Area A1 PizzaThick CrustSpringfield A1 PizzaThick CrustShelbyville A1 PizzaThick CrustCapital City A1 PizzaStuffed CrustSpringfield A1 PizzaStuffed CrustShelbyville A1 PizzaStuffed CrustCapital City Elite PizzaThin CrustCapital City Elite PizzaStuffed CrustCapital City Vincenzo's PizzaThick CrustSpringfield Vincenzo's PizzaThick CrustShelbyville Vincenzo's PizzaThin CrustSpringfield Vincenzo's PizzaThin CrustShelbyville 33

34 Chapter 5 A NOMALIES The table has no non-key attributes because its only key is {Restaurant, Pizza Variety, Delivery Area}. Therefore it meets all normal forms up to BCNF. Pizza varieties offered by a restaurant are not affected by delivery area. The varieties of pizza a restaurant offers are independent from the areas to which the restaurant delivers. This leads to redundancy in the table. The dependencies are: 1. {Restaurant} {Pizza Variety} 2. {Restaurant} {Delivery Area} 34

35 Chapter 5 R ELATIONS IN 4 TH NORMAL FORM Varieties By Restaurant RestaurantPizza Variety A1 PizzaThick Crust A1 PizzaStuffed Crust Elite PizzaThin Crust Elite PizzaStuffed Crust Vincenzo's Pizza Thick Crust Vincenzo's Pizza Thin Crust 35 Delivery Areas By Restaurant RestaurantDelivery Area A1 PizzaSpringfield A1 PizzaShelbyville A1 PizzaCapital City Elite PizzaCapital City Vincenzo's Pizza Springfield Vincenzo's Pizza Shelbyville

36 Chapter 5 4NF IN PRACTICE Tables violating 4NF (but meeting all lower normal forms) are rarely encountered in business applications. In a study of forty organizational databases, over 20% contained one or more tables that violated 4NF while meeting all lower normal forms. 36

37 Chapter 5 F IFTH NORMAL FORM A relation R is in Fifth Normal Form (5NF) if and only if the following conditions are satisfied simultaneously: 1. R is already in 4NF. 2. If it contains no join dependencies. Also known as project-join normal form. 37

38 Chapter 5 J OIN DEPENDENCY Let R be a relation and R 1, R 2,..., R n be a decomposition of R. If R = R 1  R 2  ….  R n, we say that a relation r ( R ) satisfies the join dependency *( R 1, R 2,..., R n ) if: r =  R1 ( r ) ⋈  R2 ( r ) ⋈ …… ⋈  Rn ( r ) The main goal of the decomposition technique is to avoid redundancies due to data dependencies by decomposing a relation into smaller parts. A good decomposition should have the lossless join property, meaning that no information should be lost after the decomposition. 38

39 Chapter 5 E XAMPLE TABLE NOT IN 5 TH NF : Traveling Salesman Product Availability By Brand Traveling SalesmanBrandProduct Type Jack SchneiderAcmeVacuum Cleaner Jack SchneiderAcmeBreadbox Mary JonesRobustoPruning Shears Mary JonesRobustoVacuum Cleaner Mary JonesRobustoBreadbox Mary JonesRobustoUmbrella Stand Louis FergusonRobustoVacuum Cleaner Louis FergusonRobustoTelescope Louis FergusonAcmeVacuum Cleaner Louis FergusonAcmeLava Lamp Louis FergusonNimbusTie Rack 39

40 Chapter 5 R ELATIONS IN 5 TH NORMAL FORM Product Types By Traveling Salesman Traveling Salesman Product Type Jack SchneiderVacuum Cleaner Jack SchneiderBreadbox Mary JonesPruning Shears Mary JonesVacuum Cleaner Mary JonesBreadbox Mary JonesUmbrella Stand Louis FergusonTelescope Louis FergusonVacuum Cleaner Louis FergusonLava Lamp Louis FergusonTie Rack 40 Product Types By Brand BrandProduct Type Acme Vacuum Cleaner AcmeBreadbox AcmeLava Lamp Robusto Pruning Shears Robusto Vacuum Cleaner RobustoBreadbox Robusto Umbrella Stand RobustoTelescope NimbusTie Rack

41 Chapter 5 R ELATIONS IN 5 TH NORMAL FORM Brands By Traveling Salesman Traveling SalesmanBrand Jack SchneiderAcme Mary JonesRobusto Louis FergusonRobusto Louis FergusonAcme Louis FergusonNimbus 41


Download ppt "N ORMALIZATION 1. Chapter 5 R ELATION Definition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute."

Similar presentations


Ads by Google