INLS 623 – Database Normalization


1 INLS 623 – Database Normalization
Instructor: Jason Carter

2 Set Theory
Set: a collection of zero or more distinct objects.
What does set theory have to do with databases?
A record is a set of attribute/property values.
Columns are a set of attributes.
Rows are a set of records.
Conventionally, sets are denoted with capital letters:
A = {1,2,3}
B = {2,1,5}
C = {red, green, blue}

3 Sets
Equality
{6, 11} = {11, 6} = {11, 6, 6, 11}
{1,2} = {2,1}
Membership
A = {1,2,3,4}
∈ = member of: 4 ∈ A, 1 ∈ A, 3 ∈ A, 2 ∈ A
∉ = not a member of: 6 ∉ A
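The equality and membership rules above can be tried out with Python's built-in `set` type. This is only an illustrative sketch; the variable names are made up for the example.

```python
# Order and repetition do not matter for set equality.
a = {6, 11}
b = {11, 6}
c = {11, 6, 6, 11}  # the repeated elements collapse, leaving {6, 11}

sets_equal = (a == b == c)

# Membership tests correspond to the ∈ and ∉ symbols.
A = {1, 2, 3, 4}
four_is_member = 4 in A          # 4 ∈ A
six_is_not_member = 6 not in A   # 6 ∉ A

print(sets_equal, four_is_member, six_is_not_member)
```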

4 Sets
Subsets
A set A is a subset of a set B if every member of A is also a member of B.
⊆ = subset
A = {1,3}
B = {1,2,3,4}
{1, 3} ⊆ {1, 2, 3, 4}
A ⊆ B

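5 Sets
Superset
A set B is a superset of a set A if every member of A is also a member of B.
⊇ = superset (⊋ = proper superset: a superset that is not equal to the subset)
A = {1,3}
B = {1,2,3,4}
{1, 2, 3, 4} ⊇ {1, 3}
B ⊇ A
Because B ≠ A here, B ⊋ A also holds.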
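As a small sketch, the subset and superset relations for the same A and B can be checked with Python's set comparison operators. The operators `<=`, `>=`, and `>` stand in for ⊆, ⊇, and ⊋; the result variable names are illustrative only.

```python
A = {1, 3}
B = {1, 2, 3, 4}

a_subset_of_b = A <= B          # A ⊆ B (same as A.issubset(B))
b_superset_of_a = B >= A        # B ⊇ A (same as B.issuperset(A))
b_proper_superset_of_a = B > A  # B ⊋ A: superset and not equal

# Every set is a superset of itself, but never a proper superset of itself.
b_superset_of_b = B >= B
b_proper_superset_of_b = B > B

print(a_subset_of_b, b_superset_of_a, b_proper_superset_of_a)
```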

6 Terminology

7 What is Normalization?
A technique to organize data “efficiently” in a database.
“Efficiently” means:
Eliminating redundant data
Not storing the same data in more than one table
Ensuring that functional dependencies make sense
Database normalization, or data normalization, is a technique to organize the contents of the tables for transactional databases and data warehouses. Normalization is part of successful database design; without normalization, database systems can be inaccurate, slow, and inefficient, and they might not produce the data you expect.

8 Without Normalization
student_id | name   | address        | subject
401        | Adam   | 133 Our Lane   | Biology
402        | Alex   | 123 Here Lane  | Math
403        | Stuart | 123 My Lane    |
404        |        | 123 Their Lane | Physics

Update Anomaly: To update the address of a student who appears two or more times in the table, we have to update the address column in every one of those rows; otherwise the data becomes inconsistent.
Insertion Anomaly: Suppose that for a new admission we have the student's id (student_id), name, and address, but the student has not yet opted for any subjects. We then have to insert NULL in the subject column, leading to an insertion anomaly.
Deletion Anomaly: If student 401 has only one subject and temporarily drops it, deleting that row deletes the entire student record along with it.
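The update and deletion anomalies can be reproduced in a few lines with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions: the second row for student 401 and the new address are hypothetical, and the deletion is shown for student 402 rather than 401 so that both anomalies fit in one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER, name TEXT, address TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (401, "Adam", "133 Our Lane", "Biology"),
        (401, "Adam", "133 Our Lane", "Physics"),  # hypothetical second subject
        (402, "Alex", "123 Here Lane", "Math"),
    ],
)

# Update anomaly: the address is changed in only one of Adam's two rows,
# so the table now holds two different addresses for the same student.
conn.execute(
    "UPDATE student SET address = ? WHERE student_id = ? AND subject = ?",
    ("9 New Lane", 401, "Biology"),  # hypothetical new address
)
addresses_for_401 = {
    row[0]
    for row in conn.execute("SELECT address FROM student WHERE student_id = 401")
}

# Deletion anomaly: Alex drops his only subject, and the row that carried
# his name and address disappears with it.
conn.execute("DELETE FROM student WHERE student_id = 402 AND subject = 'Math'")
alex_rows = conn.execute(
    "SELECT COUNT(*) FROM student WHERE student_id = 402"
).fetchone()[0]

print(addresses_for_401, alex_rows)
```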

9 Normal Form
1st Normal Form
2nd Normal Form
3rd Normal Form
Boyce-Codd Normal Form (3.5 Normal Form)
4th Normal Form

10 1st Normal Form
Every cell in the table is atomic
A cell value cannot be divided further.
Seen differently: there is no grouping of information inside a cell.
No repeating groups
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

11 1st Normal Form
Does this table violate first normal form?

Student Id | First name | Last name | Grades | Classes
1          | Bob        | Wood      | C,B    | 401, 623
2          | Joe        | Smith     | A,D    | 550, 823
3          | Alice      | Boone     | A,A    | 890,991
4          | Shelly     | Kent      | A,B    | 770,881

12 1st Normal Form

Student Id | First name | Last name | Grades | Classes
1          | Bob        | Wood      | C,B    | 401, 623
2          | Joe        | Smith     | A,D    | 550, 823
3          | Alice      | Boone     | A,A    | 890,991
4          | Shelly     | Kent      | A,B    | 770,881

Grades and Classes hold multiple values in a single cell, so the table violates first normal form.

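13 1st Normal Form
Create new rows

Student Id | First name | Last name | Grades | Classes
1          | Bob        | Wood      | C      | 401
2          | Joe        | Smith     | A      | 550
3          | Alice      | Boone     | A      | 890
4          | Shelly     | Kent      | A      | 770
5          | Bob        | Wood      | B      | 623
6          | Joe        | Smith     | D      | 823
7          | Alice      | Boone     | A      | 991
8          | Shelly     | Kent      | B      | 881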
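The "create new rows" step can be sketched in Python as follows. The function name `to_first_normal_form` is hypothetical, and the sketch assumes that the grades and classes in a cell pair up by position (the first grade belongs to the first class). Unlike the slide, it keeps each student's original id on the new rows rather than assigning new ids.

```python
# The table from the previous slide, with multi-valued Grades and Classes cells.
rows = [
    (1, "Bob", "Wood", "C,B", "401, 623"),
    (2, "Joe", "Smith", "A,D", "550, 823"),
    (3, "Alice", "Boone", "A,A", "890,991"),
    (4, "Shelly", "Kent", "A,B", "770,881"),
]


def to_first_normal_form(rows):
    """Split comma-separated grade/class cells into one atomic row per pair."""
    atomic = []
    for student_id, first, last, grades, classes in rows:
        grade_list = [g.strip() for g in grades.split(",")]
        class_list = [c.strip() for c in classes.split(",")]
        # Assumption: the n-th grade belongs to the n-th class.
        for grade, class_no in zip(grade_list, class_list):
            atomic.append((student_id, first, last, grade, class_no))
    return atomic


atomic_rows = to_first_normal_form(rows)
for row in atomic_rows:
    print(row)
```

Keeping the original student id means the id alone no longer identifies a row; a key such as (student id, class) would be needed.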

14 2nd Normal Form
Table must be in 1st Normal Form
If the primary key is a composite of attributes (contains multiple columns), the non-key attributes (columns) must depend on the whole key.

15 Functional Dependencies
A functional dependency is a relationship between or among attributes in a table.
One attribute is functionally dependent on another if the value of the second attribute determines the value of the first attribute. If you know the value of the second attribute, you can determine the value of the first attribute.
Example: Total Charge = StandardCharge * NumberOfTests
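One way to make the definition concrete is to test whether a candidate dependency is contradicted by sample rows. The helper `determines` and the billing rows below are hypothetical, written only for illustration. A check like this can show that a dependency fails on the data, but passing it does not prove the dependency holds in general.

```python
def determines(rows, determinant, dependent):
    """Return True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical billing rows where TotalCharge = StandardCharge * NumberOfTests.
bills = [
    {"StandardCharge": 10.0, "NumberOfTests": 2, "TotalCharge": 20.0},
    {"StandardCharge": 10.0, "NumberOfTests": 3, "TotalCharge": 30.0},
    {"StandardCharge": 5.0, "NumberOfTests": 4, "TotalCharge": 20.0},
    {"StandardCharge": 10.0, "NumberOfTests": 2, "TotalCharge": 20.0},
]

# (StandardCharge, NumberOfTests) -> TotalCharge is consistent with the rows.
charge_is_determined = determines(
    bills, ["StandardCharge", "NumberOfTests"], ["TotalCharge"]
)

# TotalCharge -> NumberOfTests fails: a total of 20.0 occurs with 2 tests and with 4.
reverse_holds = determines(bills, ["TotalCharge"], ["NumberOfTests"])

print(charge_is_determined, reverse_holds)
```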

16 2nd NF: Functional Dependencies Examples
SSN (PK) | First Name | Last Name | Age
         | Jack       | Doe       | 21
         | Jane       |           | 25
         | Jill       | Roy       | 32
         |            |           | 50

SSN → Age
SSN → FN
SSN → LN

17 2nd NF: Functional Dependencies Examples
Customer_ID | Product    | Price
100         | Cell Phone | 295.00
101         | Wallet     | 25.00
            | Toothpaste | 5.99
            | Jeans      | 49.99

What is the primary key?
Customer_ID, Product

18 2nd NF: Functional Dependencies Examples
All attributes must depend on the whole key (Customer_ID + Product).
Is this table in 2nd NF?

Customer_ID (PK) | Product    | Price
100              | Cell Phone | 295.00
101              | Wallet     | 25.00
                 | Toothpaste | 5.99
                 | Jeans      | 49.99

19 2nd NF: Functional Dependencies Examples
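No: Price depends only on Product, which is just part of the key, so the table is split in two.

Customer_ID (PK) | Product (FK)
100              | Cell Phone
101              | Wallet
                 | Toothpaste
                 | Jeans

Product (PK) | Price
Cell Phone   | 295.00
Wallet       | 25.00
Toothpaste   | 5.99
Jeans        | 49.99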
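The split into a customer/product table and a product/price table can be sketched in plain Python. The customer ids on the Toothpaste and Jeans rows are not given in the transcript, so the values below are assumed purely for illustration. The sketch also rejoins the two tables to confirm that nothing was lost.

```python
# (customer_id, product, price) rows before decomposition.
orders = [
    (100, "Cell Phone", 295.00),
    (101, "Wallet", 25.00),
    (100, "Toothpaste", 5.99),  # customer id assumed
    (101, "Jeans", 49.99),      # customer id assumed
]

# Table 1: which customer bought which product (the composite key on its own).
customer_product = sorted({(cid, product) for cid, product, _ in orders})

# Table 2: Product -> Price. The split is only valid if each product has one price.
product_price = {}
for _, product, price in orders:
    if product_price.setdefault(product, price) != price:
        raise ValueError(f"{product} has more than one price; Product -> Price fails")

# Joining the two tables on product reproduces the original rows.
rejoined = [(cid, product, product_price[product]) for cid, product in customer_product]

print(customer_product)
print(product_price)
```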

20 2nd NF: Functional Dependencies Examples
The first ten customers get a discount off the normal price.
Is this table in 2nd NF?

Customer_ID (PK) | Product    | Price
100              | Cell Phone | 295.00
101              | Wallet     | 25.00
                 | Toothpaste | 5.99
                 | Jeans      | 49.99

21 2nd NF: Functional Dependencies Examples
YES

Customer_ID (PK) | Product    | Price
100              | Cell Phone | 295.00
101              | Wallet     | 25.00
                 | Toothpaste | 5.99
                 | Jeans      | 49.99

Price depends on both Customer_ID and Product.
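To see why the discount changes the answer, the sketch below groups prices by product and by the full key. The sales rows, including the discounted wallet price, are hypothetical and are not taken from the slides.

```python
from collections import defaultdict

# (customer_id, product, price); customer 100 is assumed to be an early,
# discounted customer. All values are made up for this illustration.
sales = [
    (100, "Wallet", 20.00),
    (101, "Wallet", 25.00),
    (100, "Jeans", 49.99),
]

prices_by_product = defaultdict(set)
prices_by_key = defaultdict(set)
for customer_id, product, price in sales:
    prices_by_product[product].add(price)
    prices_by_key[(customer_id, product)].add(price)

# Product alone no longer fixes the price: Wallet has two prices.
product_determines_price = all(len(p) == 1 for p in prices_by_product.values())

# The whole key (Customer_ID, Product) still fixes the price.
key_determines_price = all(len(p) == 1 for p in prices_by_key.values())

print(product_determines_price, key_determines_price)
```

Since Price now depends on the whole key rather than on part of it, there is no partial dependency to remove.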

22 3rd Normal Form
Table must be in 2nd Normal Form
A table is in 3NF if:
It has no transitive dependencies.

23 Transitive Dependency
One attribute (column) depends on a second attribute, which depends on a third attribute.
In other words, a non-key column relies on another non-key column rather than directly on the primary key.

24 3rd NF: Transitive Dependency
Only one row for each Customer_ID.

Customer_ID (PK) | Product    | Price
100              | Cell Phone | 295.00
102              | Wallet     | 25.00
103              | Toothpaste | 5.99
104              | Jeans      | 49.99

Customer 102 is unhappy with his wallet and wants to return it. What do you do?

25 3rd NF: Transitive Dependency
We want to remove the second row.

Customer_ID (PK) | Product    | Price
100              | Cell Phone | 295.00
103              | Toothpaste | 5.99
104              | Jeans      | 49.99

You lose the fact that a wallet costs $25.
Price depends on Product, and Product depends on Customer_ID.

26 3rd NF: Transitive Dependency
Customer_ID (PK) | Product (FK)
100              | Cell Phone
101              | Wallet
                 | Toothpaste
                 | Jeans

Product (PK) | Price
Cell Phone   | 295.00
Wallet       | 25.00
Toothpaste   | 5.99
Jeans        | 49.99
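With the price moved into its own product table, the wallet return no longer erases the wallet's price. The sketch below uses Python's standard `sqlite3` module; the table names `product` and `purchase` are assumptions, and the customer ids follow the earlier return example (100, 102, 103, 104).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product TEXT PRIMARY KEY, price REAL)")
conn.execute(
    "CREATE TABLE purchase ("
    "customer_id INTEGER PRIMARY KEY, "
    "product TEXT REFERENCES product(product))"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("Cell Phone", 295.00), ("Wallet", 25.00), ("Toothpaste", 5.99), ("Jeans", 49.99)],
)
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?)",
    [(100, "Cell Phone"), (102, "Wallet"), (103, "Toothpaste"), (104, "Jeans")],
)

# Customer 102 returns the wallet: only the purchase row is deleted.
conn.execute("DELETE FROM purchase WHERE customer_id = 102")

# The price of a wallet is still stored in the product table.
wallet_price = conn.execute(
    "SELECT price FROM product WHERE product = 'Wallet'"
).fetchone()[0]
remaining_customers = [
    row[0] for row in conn.execute("SELECT customer_id FROM purchase ORDER BY customer_id")
]

print(wallet_price, remaining_customers)
```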

