Normalisation: Lecture 3. Akhtar Ali, 12/16/2015



Learning Objectives
1. To consider the process of Normalisation
2. To consider the definition and application of 1NF
3. To consider the definition and application of 2NF
4. To consider the definition and application of 3NF

NORMALISATION PRINCIPLES

Normalisation
Definition: a systematic method that takes pre-existing relations and produces a canonical set of relations.
– By canonical is meant well-designed, sound, or in a recognised and lawful form.
It can be used both for:
– designing canonical relations,
– checking existing relations to ensure they are canonical.

How Normalization Supports Database Design (diagram not reproduced in this transcript)

Normal Forms
Normalisation uses the concept of Normal Forms. They are organised in a sequence, each successive normal form being higher than the one before. A normal form is higher because it applies more stringent constraints to a relation than a lower normal form. A relation is said to be in a certain "normal form" if it conforms to the constraints of that normal form.

Normalisation as a Relational Design Tool
Sometimes, we need to use normalisation for designing relations.
– For example, when ER modelling is not feasible or if we deal with a small number of attributes.
– So we need to learn normalisation.
1NF stands for First Normal Form, 2NF for Second Normal Form, and so on.
The constraints of a particular normal form are those of the previous normal form, plus the additional constraint(s) peculiar to this particular normal form.

The Normalisation Procedure
The normalisation procedure starts with a set of relations, each of which is presumed to be un-normalised, i.e. in 0NF.
  DO FOR each normal form xNF in sequence, starting at 1NF
    DO FOR each relation that exists
      IF relation already conforms to xNF
        THEN it is in xNF, so do nothing
        ELSE create 2 or more replacement relations from it that do conform to xNF.
    END-LOOP
  END-LOOP
5NF is the highest possible normal form. In practice, 3NF is the highest normal form usually reached.

What is a Normal Form?
Each Normal Form has two parts:
– A definition that specifies exactly what constraints apply to a relation in that normal form. This is used to check whether any given relation is already in that normal form or not.
– A method to be used to replace the relation with 2 or more that will be in that normal form. The method assumes that the relation-to-be-replaced is in the previous normal form.

Normalising: Possibilities (diagram not reproduced in this transcript)

Consequences of Normalisation
If new, replacement relations are created, then they must be projections of the original.
– New-Relation = π <set of attributes> (Original-relation)
The symbol π denotes projection of a set of attributes from a relation.
Normalisation always creates new relations such that
– Original-relation = New-Rel-1 ⋈ New-Rel-2
The symbol ⋈ denotes a join between two relations. This ensures that no information is ever lost.

FIRST NORMAL FORM (1NF)

Definition of 1NF
A relation is in 1NF if and only if every attribute value it can ever contain is an atomic value.
Question: What is an atomic value?
Answer: A value that cannot meaningfully be broken down into two or more constituent parts.

Example: Purchase Order Relation
The following relation holds data about purchase orders placed on suppliers for parts.
Ord     Order number that uniquely identifies every purchase order.
Sno     Supplier number that uniquely identifies any supplier.
Sname   The name of a supplier.
Saddr   The address of a supplier.
Date    The date on which the order was placed.
Part    Part number that uniquely identifies every kind of part used by the company.
Pname   The name of a particular kind of part.
Qty     The quantity of a particular kind of part ordered on a purchase order.
Price   The price of that quantity of that particular kind of part.
Tot     The total price to be paid for the whole order.

Not in 1NF
Attributes Ord, Sno, Sname, Saddr, Date and Tot currently contain only atomic values, and in fact can only ever contain atomic values.
Attributes Part, Pname, Qty and Price currently contain non-atomic values, and in fact may often contain non-atomic values.
Therefore the relation is not in 1NF.

Putting Purchase Order into 1NF
Separate out the atomic and non-atomic attributes.
Put all the atomic attributes in a new replacement relation, which then by definition is in 1NF.

The Non-Atomic Attributes
We can't just throw away this data because it is a nuisance to store!
The values in all these attributes repeat together.
– If a part is removed from an order, its values must be removed from all 4 attributes.
– If another part is placed on an order, there must be a value for that part in all 4 attributes.

Repeating Together
Thus a set of values that repeat together should become a tuple in a new relation. Now the attributes in these tuples contain only atomic data!
Thus we form another new replacement relation to hold the tuples of data that repeat together.
There is no intrinsic reason why all the non-atomic attributes in an un-normalised relation should always repeat together.

Part  Pname  Qty  Price
N8    Nut    70   4
B6    Bolt   60   5
L4    Nail   100  3
P3    Pump   5    150
Q7    Motor  5    250

Foreign Keys
The problem with this relation is that the part data is no longer associated with its order data. We no longer know which part type was ordered on which purchase order.
We can solve this problem by adding the (purchase) order number attribute to this relation.
In general, we must add the attribute(s) which formed a candidate key in the original relation to this relation as a foreign key. This retains the relationship information.

Ord  Part  Pname  Qty  Price
L5   N8    Nut    70   4
L5   B6    Bolt   60   5
L5   L4    Nail   100  3
L6   P3    Pump   5    150
L6   Q7    Motor  5    250

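Candidate Keys for Relations
For the relation holding the atomic attributes, the candidate key is Ord.
For the relation holding the repeating data, extend the candidate key to Ord and Part, so that it includes the foreign key Ord*.

Ord*  Part  Pname  Qty  Price
L5    N8    Nut    70   4
L5    B6    Bolt   60   5
L5    L4    Nail   100  3
L6    P3    Pump   5    150
L6    Q7    Motor  5    250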
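The 1NF steps above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the nested record layout and the name `to_1nf` are hypothetical, only Ord is kept from the atomic attributes, and the Qty and Price values assume the flattened table is read as 70/4, 60/5, 100/3, 5/150 and 5/250.

```python
# Un-normalised purchase orders: each order carries a repeating group of parts.
# Assumption: only Ord is kept from the atomic attributes to keep the sketch short.
unnormalised = [
    {"Ord": "L5", "items": [("N8", "Nut", 70, 4),
                            ("B6", "Bolt", 60, 5),
                            ("L4", "Nail", 100, 3)]},
    {"Ord": "L6", "items": [("P3", "Pump", 5, 150),
                            ("Q7", "Motor", 5, 250)]},
]


def to_1nf(orders):
    """Split un-normalised orders into two relations that hold only atomic values."""
    p_order = []  # one tuple per order: the atomic attributes
    p_item = []   # one tuple per (order, part): the repeating group, flattened
    for order in orders:
        p_order.append((order["Ord"],))
        for part, pname, qty, price in order["items"]:
            # Ord is copied in as a foreign key so each part stays tied to its order.
            p_item.append((order["Ord"], part, pname, qty, price))
    return p_order, p_item


p_order_1, p_item_1 = to_1nf(unnormalised)
for row in p_item_1:
    print(row)
```

Each printed tuple is one row of the second relation, and the pair (Ord, Part) is different in every row, which is what makes it usable as the candidate key.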

SECOND NORMAL FORM (2NF)

Definition of 2NF
Note that 2NF is stricter than 1NF because it requires the relation to conform to the additional "full functional dependency" constraint.

Fully Functionally Dependent
Question: What does fully functionally dependent mean?
We will first consider the principle of functional dependency, and then see
– what full functional dependency means,
– the application to achieve 2NF.

Example of Functional Dependency
Assume some kind of loan account where payments of a certain amount have to be made on a regular basis to pay off the loan.
This means: a given account number determines what payment is due.
In principle, given an account number, one can find out what regular payment is due. (This may not always be easy or feasible in practice.)

Terminology
The Account Number is said to functionally determine the Payment Due.
The Payment Due is said to be functionally dependent on the Account Number.
Both are equally good means of expression, and convenience and emphasis usually determine which of the two is preferred in any particular situation.

Further FD Examples (diagrams not reproduced in this transcript)
– A set containing one attribute determining a set of three attributes.
– A set of two attributes determining a set containing one attribute.

Full Functional Dependency & 2NF
The definition of 2NF requires not merely functional dependency, but full functional dependency.
Definition of FULL Functional Dependency: a set of attributes Y is fully functionally dependent on a set of attributes X if and only if Y is functionally dependent on the whole of X and not on any proper subset of X.

Condition for 2NF
Thus, to be in 2NF means that all attributes not in the candidate key are fully FD on all those attributes that are in the candidate key.
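As a rough illustration of the definitions above, the following Python sketch tests functional and full functional dependency against sample rows. The function names `holds` and `fully_dependent` are hypothetical, and the rows assume the Qty/Price reading 70/4, 60/5, 100/3, 5/150, 5/250 and 2/100 of the lecture's item data. Note the limitation: a functional dependency is a statement about the meaning of the data, so sample rows can refute a dependency but never prove it.

```python
from itertools import combinations

ATTRS = ("Ord", "Part", "Pname", "Qty", "Price")
P_ITEM_1 = [
    ("L5", "N8", "Nut", 70, 4),
    ("L5", "B6", "Bolt", 60, 5),
    ("L5", "L4", "Nail", 100, 3),
    ("L6", "P3", "Pump", 5, 150),
    ("L6", "Q7", "Motor", 5, 250),
    ("L7", "Q7", "Motor", 2, 100),
]


def holds(rows, lhs, rhs, attrs=ATTRS):
    """True if no two rows agree on lhs but differ on rhs (lhs -> rhs is not violated)."""
    idx = {name: i for i, name in enumerate(attrs)}
    seen = {}
    for row in rows:
        key = tuple(row[idx[a]] for a in lhs)
        val = tuple(row[idx[a]] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, val) != val:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper, non-empty subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True


key = ("Ord", "Part")
for attr in ("Pname", "Qty", "Price"):
    print(attr, fully_dependent(P_ITEM_1, key, (attr,)))
# Prints: Pname False, Qty True, Price True
```

Pname fails the test because Part alone already determines it, which is exactly the partial dependency that keeps the relation out of 2NF.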

Examples: Purchase Order Relations
P_ORDER_1: FD Diagram (diagram not reproduced in this transcript)

P_ITEM_1: FD Diagram (diagram not reproduced in this transcript)

P_ITEM_1
Ord*  Part  Pname  Qty  Price
L5    N8    Nut    70   4
L5    B6    Bolt   60   5
L5    L4    Nail   100  3
L6    P3    Pump   5    150
L6    Q7    Motor  5    250

Reason for non-2NF
Attributes Price and Qty depend on the full key.
– They depend not only on what kind of part they refer to, but also on the order itself: the quantity of a part type ordered will vary with, and depend on, the order, as will the price, since it depends on the quantity.
However, Pname depends solely on the type of part.
– A particular kind of part will have the same name on every order on which it appears.

Three Problems of a Non-2NF Relation
1. Redundant data may be stored.
2. Update anomalies: there can be problems in inserting, deleting and amending some of the data.
3. Semantic problems: the relation does not reflect the real-world meaning of the data, leading to problems in its use.

Redundant Data
Example: Pname is unnecessarily repeated. Every time a part type appears on an order (say Q7), its name (Motor) also appears.
N.B. The part number (say Q7) is enough to identify the part type.
Motor is repeated in orders L6 and L7. One occurrence is sufficient to give us the name, so the other Pname value is redundant.

P_ITEM_1
Ord*  Part  Pname  Qty  Price
L5    N8    Nut    70   4
L5    B6    Bolt   60   5
L5    L4    Nail   100  3
L6    P3    Pump   5    150
L6    Q7    Motor  5    250
L7    Q7    Motor  2    100

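Update Anomalies
Example: Part type details (Part and Pname) cannot always be updated. In the table below, part F5 (Flange) cannot be inserted until it appears on an order, because Ord is part of the key, and the name of Q7 has been amended on order L7 but not on order L6.

P_ITEM_1
Ord*  Part  Pname   Qty  Price
L5    N8    Nut     70   4
L5    B6    Bolt    60   5
L5    L4    Nail    100  3
L6    P3    Pump    5    150
L6    Q7    Motor   5    250
L7    Q7    Engine  2    100
??    F5    Flange  ?    ??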

Semantic Problems
Q7 now has two different names.

P_ITEM_1
Ord*  Part  Pname   Qty  Price
L5    N8    Nut     70   4
L5    B6    Bolt    60   5
L5    L4    Nail    100  3
L6    P3    Pump    5    150
L6    Q7    Motor   5    250
L7    Q7    Engine  2    100

Putting P_ITEM_1 into 2NF (1) (diagram not reproduced in this transcript)

Satisfaction of 2NF
A relation created with a determinant as its candidate key, and with non-key attributes that are fully functionally dependent on that candidate key, must be in 2NF by definition.
Note that a determining attribute (Part in the above example) can appear in more than one complete determinant.
– This is perfectly acceptable. It just depends on what attributes form determinants.

Putting P_ITEM_1 into 2NF (2)

P_ITEM_2
Ord*  Part  Qty  Price
L5    N8    70   4
L5    B6    60   5
L5    L4    100  3
L6    P3    5    150
L6    Q7    5    250
L7    Q7    2    100

Putting P_ITEM_1 into 2NF (3)

PART_2
Part  Pname
N8    Nut
B6    Bolt
L4    Nail
P3    Pump
Q7    Motor

Benefits of 2NF
No information has been lost.
– A natural join of P_ITEM_2 and PART_2 on attribute Part will re-create the original relation P_ITEM_1.
Problems solved:
– Redundant data removed: each Pname is stored once.
– Update anomalies: operations have no side effects.
– Semantic problems: each part type has just one name.
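The claim that no information is lost can be checked mechanically. The Python sketch below projects P_ITEM_1 onto the two 2NF relations and joins them back on Part. The helper names `project` and `natural_join_on_part` are hypothetical, relations are modelled as sets of tuples, and the Qty/Price values again assume the 70/4, 60/5, 100/3, 5/150, 5/250, 2/100 reading of the tables.

```python
ATTRS = ("Ord", "Part", "Pname", "Qty", "Price")
P_ITEM_1 = {
    ("L5", "N8", "Nut", 70, 4),
    ("L5", "B6", "Bolt", 60, 5),
    ("L5", "L4", "Nail", 100, 3),
    ("L6", "P3", "Pump", 5, 150),
    ("L6", "Q7", "Motor", 5, 250),
    ("L7", "Q7", "Motor", 2, 100),
}


def project(rows, keep, attrs=ATTRS):
    """Relational projection: keep the named attributes; the set drops duplicate tuples."""
    positions = [attrs.index(name) for name in keep]
    return {tuple(row[i] for i in positions) for row in rows}


def natural_join_on_part(p_item_2, part_2):
    """Join P_ITEM_2(Ord, Part, Qty, Price) with PART_2(Part, Pname) on Part."""
    names = dict(part_2)  # Part -> Pname; safe because Part is the key of PART_2
    return {
        (ord_no, part, names[part], qty, price)
        for ord_no, part, qty, price in p_item_2
        if part in names
    }


p_item_2 = project(P_ITEM_1, ("Ord", "Part", "Qty", "Price"))
part_2 = project(P_ITEM_1, ("Part", "Pname"))

print(len(part_2))                                         # 5: Motor is stored once
print(natural_join_on_part(p_item_2, part_2) == P_ITEM_1)  # True: nothing was lost
```

Using a dictionary for PART_2 also shows why the semantic problem disappears: a key can map to only one name.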

THIRD NORMAL FORM (3NF)

Definition of 3NF
Note that 3NF is more stringent than 2NF, as it requires that the relation not only have full functional dependencies on the candidate key, but that these dependencies must now additionally be "non-transitive".
Question: What does non-transitively mean?

Transitivity
Assume there are three sets of attributes, 'A', 'B' and 'C'. If A determines B, and B determines C, then logically A determines C, but transitively via B.

Example of Transitive FD
Suppose pilots always fly the same aircraft. Then if we know the pilot, we know the aircraft; so pilot functionally determines aircraft.
If we know the aircraft, then we know the airline that owns it; so aircraft functionally determines airline.
Putting these two dependencies together, pilot functionally determines airline. But the functional dependency of airline on pilot is transitive, because it goes via aircraft.
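One way to see the transitive step is to compute everything a set of attributes determines by repeatedly applying the known dependencies. The Python sketch below does this for the pilot example. The attribute-closure technique and the name `closure` are not from the lecture; they are a standard textbook device brought in here only to illustrate the reasoning.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


FDS = [
    ({"pilot"}, {"aircraft"}),    # pilots always fly the same aircraft
    ({"aircraft"}, {"airline"}),  # an aircraft is owned by one airline
]

print(sorted(closure({"pilot"}, FDS)))      # ['aircraft', 'airline', 'pilot']
print(sorted(closure({"pilot"}, FDS[:1])))  # ['aircraft', 'pilot']
```

Dropping the second dependency removes airline from the result, which shows that pilot reaches airline only by going via aircraft.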

Non-Transitive Full FD & 3NF
So, to be in 3NF means that all attributes not in the candidate key are non-transitively (i.e. directly) fully FD on all those attributes that are in the candidate key, and not FD on the candidate key via some other non-key attribute.

Reviewing the Definition of 3NF (FD diagrams not reproduced in this transcript)
R1's FD diagram shows a "chain of dependencies". It is not in 3NF.
R2's FD diagram shows no "chain of dependencies". It is in 3NF.

Example: P_ITEM_2
Neither Price nor Qty is FD on the candidate key via the other; both are non-transitively FD on the key. Thus P_ITEM_2 is already in 3NF.

P_ITEM_2
Ord*  Part  Qty  Price
L5    N8    70   4
L5    B6    60   5
L5    L4    100  3
L6    P3    5    150
L6    Q7    5    250
L7    Q7    2    100

Example: PART_2
If a 2NF relation only has one non-key attribute, then it must already be in 3NF, as there is no other non-key attribute via which a transitive dependency can occur. Thus PART_2 is already in 3NF.

PART_2
Part  Pname
N8    Nut
B6    Bolt
L4    Nail
P3    Pump
Q7    Motor

Example: P_ORDER_1
Taking account now of transitivity, the FD diagram can be re-drawn (diagram not reproduced in this transcript). Hence P_ORDER_1 is not in 3NF.

Putting P_ORDER_1 into 3NF (1) to (3) (diagrams not reproduced in this transcript)

Benefits of 3NF
No information has been lost.
– A natural join of P_ORDER_3 and SUPPLIER_3 on attribute Sno will re-create the original relation P_ORDER_1.
Problems solved:
– Redundant data removed: each Sname is stored once.
– Update anomalies: operations have no side effects.
– Semantic problems: each supplier has just one name.