Published by Elijah Hoover. Modified over 9 years ago.
1
Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan. See www.db-book.com for conditions on re-use.
Chapter 8: Relational Database Design
Normalization in Databases
2
Chapter 8: Relational Database Design
  Features of Good Relational Design
  Atomic Domains and First Normal Form (1NF)
  Second Normal Form (2NF)
  Third Normal Form (3NF)
3
Combine Schemas?
Suppose we combine instructor and department into inst_dept. (This has no connection to the relationship set inst_dept.)
The result is possible repetition of information.
4
A Combined Schema Without Repetition
Consider combining the relations
  sec_class(sec_id, building, room_number) and
  section(course_id, sec_id, semester, year)
into one relation
  section(course_id, sec_id, semester, year, building, room_number)
There is no repetition in this case.
5
What About Smaller Schemas?
Suppose we had started with inst_dept. How would we know to split it up (decompose it) into instructor and department?
Write a rule: “if there were a schema (dept_name, building, budget), then dept_name would be a candidate key.”
Denote this as a functional dependency: dept_name → building, budget
In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated. This indicates the need to decompose inst_dept.
Not all decompositions are good. Suppose we decompose
  employee(ID, name, street, city, salary)
into
  employee1(ID, name)
  employee2(name, street, city, salary)
The next slide shows how we lose information: we cannot reconstruct the original employee relation, so this is a lossy decomposition.
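To make the functional dependency dept_name → building, budget concrete, here is a minimal Python sketch that tests whether a dependency holds in a given set of rows. The helper name fd_holds and the sample rows are assumptions made for illustration; they are not part of the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical inst_dept rows; the values are illustrative only.
inst_dept = [
    {"ID": 1, "name": "Ada", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": 2, "name": "Bo", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": 3, "name": "Cy", "dept_name": "Music", "building": "Packard", "budget": 80000},
]

# dept_name determines building and budget ...
dept_determines_location = fd_holds(inst_dept, ["dept_name"], ["building", "budget"])
# ... but it does not determine ID and name, so it is not a key of inst_dept.
dept_is_key = fd_holds(inst_dept, ["dept_name"], ["ID", "name"])
print(dept_determines_location, dept_is_key)
```

A check like this can only refute a dependency on the data at hand; whether a dependency really holds is a statement about the schema, not about one sample.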
6
A Lossy Decomposition
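The loss of information can be reproduced with a small Python sketch: project employee onto employee1 and employee2, then natural-join the two projections. The two sample rows (two different people who share a name) and the helper names project and natural_join are assumptions for illustration.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; returns a list of dicts."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


def natural_join(r, s):
    """Natural join of two non-empty lists of dicts on their shared attributes."""
    common = [a for a in r[0] if a in s[0]]
    return [{**x, **y} for x in r for y in s
            if all(x[a] == y[a] for a in common)]


# Hypothetical employee rows: two different people who share the name "Kim".
employee = [
    {"ID": 57766, "name": "Kim", "street": "Main", "city": "Perryridge", "salary": 75000},
    {"ID": 98776, "name": "Kim", "street": "North", "city": "Hampton", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)

# The join pairs each ID with both addresses, so spurious rows appear.
print(len(employee), len(rejoined))
```

The rejoined relation contains every original row plus extra ones, and there is no way to tell which rows are genuine. That is what "lossy" means here.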
7
Example of Lossless-Join Decomposition
Decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)
[Figure: a two-tuple relation r on R, its projections Π_{A,B}(r) and Π_{B,C}(r), and their natural join, which equals r.]
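A minimal Python sketch of the lossless-join test for this decomposition: project r onto (A, B) and (B, C), join on B, and compare with r. The tuple values below are assumptions, since the slide's table contents are not recoverable, and the function name is hypothetical.

```python
def is_lossless_ab_bc(r):
    """Check whether joining the (A, B) and (B, C) projections of r gives back r."""
    ab = {(a, b) for a, b, c in r}
    bc = {(b, c) for a, b, c in r}
    joined = {(a, b, c) for a, b in ab for b2, c in bc if b == b2}
    return joined == r


# Assumed instance in which each B value appears once: the join is lossless.
r_good = {("alpha", 1, "A"), ("beta", 2, "B")}

# Assumed instance in which one B value appears with two different A and C values:
# the join produces spurious tuples.
r_bad = {("alpha", 1, "A"), ("beta", 1, "B")}

print(is_lossless_ab_bc(r_good), is_lossless_ab_bc(r_bad))
```

This tests a single instance only. At the schema level, a decomposition into R1 and R2 is lossless when the common attributes functionally determine all of R1 or all of R2.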
8
Normal Forms
  1NF
  2NF
  3NF
  Other… Not covered.
9
Normal Forms: Review
  Unnormalized: there are multivalued attributes or repeating groups
  1NF: no multivalued attributes or repeating groups
  2NF: 1NF plus no partial dependencies
  3NF: 2NF plus no transitive dependencies
10
First Normal Form
A domain is atomic if its elements are considered to be indivisible units.
Examples of non-atomic domains:
  Set of names, composite attributes
  Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes of R are atomic.
Non-atomic values complicate storage and encourage redundant (repeated) storage of data.
  Example: a set of accounts stored with each customer, and a set of owners stored with each account
11
First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the domain are used.
Example: strings would normally be considered indivisible.
Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127.
If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.
Doing so is a bad idea: it leads to encoding of information in the application program rather than in the database.
12
Example 1: Table Violating 1NF
Instructor | First Name | Last Name | Phone Number
123 | Ali | Baba | 215-000-1212
222 | Mikey | Mouse | 215-000-1212, 215-111-1212, 215-222-1212
555 | Donald | Duck | 312-000-1212, 312-111-1212, 312-222-1212
13
Example 1: Table Not Violating 1NF
Instructor | First Name | Last Name | Phone Number
123 | Ali | Baba | 215-000-1212
222 | Mikey | Mouse | 215-000-1212
222 | Mikey | Mouse | 215-111-1212
222 | Mikey | Mouse | 215-222-1212
555 | Donald | Duck | 312-000-1212
555 | Donald | Duck | 312-111-1212
It violates other normal forms, though.
14
Example 2: Table Violating 1NF
Product ID | Color | Price
1 | Black, Red | $15
2 | Yellow, Purple | $20
5 | White, Green | $40
15
Example 2: Table Not Violating 1NF
Product ID | Color | Price
1 | Black | $15
1 | Red | $15
2 | Yellow | $20
2 | Purple | $20
5 | White | $40
5 | Green | $40
It violates other normal forms, though.
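The step from the violating table to the 1NF table can be sketched in Python as splitting the multivalued Color field into one row per value. The function name to_1nf is hypothetical, and the sketch assumes colors are separated by commas, as in the table shown.

```python
# Rows from the table that violates 1NF: (Product ID, Color, Price).
products = [
    (1, "Black, Red", "$15"),
    (2, "Yellow, Purple", "$20"),
    (5, "White, Green", "$40"),
]


def to_1nf(rows):
    """Emit one row per (product, single color), repeating the price."""
    return [
        (product_id, color.strip(), price)
        for product_id, colors, price in rows
        for color in colors.split(",")
    ]


for row in to_1nf(products):
    print(row)
```

The price is now repeated for every color of a product, which is the redundancy behind the remark that the 1NF table still violates other normal forms.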
16
Types of Normalization
First Normal Form
  Each field contains the smallest meaningful value.
  The table does not contain repeating groups of fields or repeating data within the same field.
  Create a separate field/table for each set of related data.
  Identify each set of related data with a primary key.
17
Tables Violating First Normal Form

Really Bad Set-up!
PART (Primary Key) | WAREHOUSE
P0010 | Warehouse A, Warehouse B, Warehouse C
P0020 | Warehouse B, Warehouse D

Better, but still flawed!
PART (Primary Key) | WAREHOUSE A | WAREHOUSE B | WAREHOUSE C
P0010 | Yes | No | Yes
P0020 | No | Yes |
18
Table Conforming to 1NF
PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY
P0010 | Warehouse A | 400
P0010 | Warehouse B | 543
P0010 | Warehouse C | 329
P0020 | Warehouse B | 200
P0020 | Warehouse D | 278
19
Second Normal Form (2NF)
Usually used in tables with a multiple-field primary key (composite key).
Each non-key field relates to the entire primary key.
Any field that does not relate to the entire primary key is placed in a separate table.
MAIN POINT: eliminate redundant data in a table.
Create separate tables for sets of values that apply to multiple records.
20
Table Violating 2NF
PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY | WAREHOUSE ADDRESS
P0010 | Warehouse A | 400 | 1608 New Field Road
P0010 | Warehouse B | 543 | 4141 Greenway Drive
P0010 | Warehouse C | 329 | 171 Pine Lane
P0020 | Warehouse B | 200 | 4141 Greenway Drive
P0020 | Warehouse D | 278 | 800 Massey Street
Where is the problem?
21
Table Violating 2NF (same table as on the previous slide)
WAREHOUSE ADDRESS depends only on WAREHOUSE, which is just one part of the composite key (PART, WAREHOUSE), so the address of Warehouse B is stored twice.
22
Tables Conforming to 2NF

PART_STOCK TABLE
PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY
P0010 | Warehouse A | 400
P0010 | Warehouse B | 543
P0010 | Warehouse C | 329
P0020 | Warehouse B | 200
P0020 | Warehouse D | 278

WAREHOUSE TABLE
WAREHOUSE (Primary Key) | WAREHOUSE_ADDRESS
Warehouse A | 1608 New Field Road
Warehouse B | 4141 Greenway Drive
Warehouse C | 171 Pine Lane
Warehouse D | 800 Massey Street

Relationship: WAREHOUSE (1) to PART_STOCK (∞)
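One possible way to declare these two tables, sketched with Python's built-in sqlite3 module. The lower-case table and column names and the SQL column types are assumptions; the slides only give the column headings and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE warehouse (
    warehouse         TEXT PRIMARY KEY,
    warehouse_address TEXT NOT NULL
);
CREATE TABLE part_stock (
    part      TEXT NOT NULL,
    warehouse TEXT NOT NULL REFERENCES warehouse(warehouse),
    quantity  INTEGER NOT NULL,
    PRIMARY KEY (part, warehouse)   -- composite key
);
""")

conn.executemany("INSERT INTO warehouse VALUES (?, ?)", [
    ("Warehouse A", "1608 New Field Road"),
    ("Warehouse B", "4141 Greenway Drive"),
    ("Warehouse C", "171 Pine Lane"),
    ("Warehouse D", "800 Massey Street"),
])
conn.executemany("INSERT INTO part_stock VALUES (?, ?, ?)", [
    ("P0010", "Warehouse A", 400),
    ("P0010", "Warehouse B", 543),
    ("P0010", "Warehouse C", 329),
    ("P0020", "Warehouse B", 200),
    ("P0020", "Warehouse D", 278),
])

# Joining the two tables gives back the original four-column table.
rows = conn.execute("""
    SELECT p.part, p.warehouse, p.quantity, w.warehouse_address
    FROM part_stock AS p JOIN warehouse AS w ON w.warehouse = p.warehouse
    ORDER BY p.part, p.warehouse
""").fetchall()
for row in rows:
    print(row)
```

Each address is now stored once, and the join shows that nothing was lost by splitting the table.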
23
Third Normal Form (3NF)
Usually used in tables with a single-field primary key.
Records do not depend on anything other than a table's primary key.
Each non-key field is a fact about the key.
Values in a record that are not part of that record's key do not belong in the table.
In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.
24
Table Violating 3NF

EMPLOYEE_DEPARTMENT TABLE
EMPNO (Primary Key) | FIRSTNAME | LASTNAME | WORKDEPT | DEPTNAME
000290 | John | Parker | E11 | Operations
000320 | Ramlal | Mehta | E21 | Software Support
000310 | Maude | Setright | E11 | Operations

The underlying problem is the transitive dependency to which the DEPTNAME attribute is subject. DEPTNAME actually depends on WORKDEPT, which in turn depends on the key EMPNO.
25
Tables Conforming to Third Normal Form

EMPLOYEE TABLE
EMPNO (Primary Key) | FIRSTNAME | LASTNAME | WORKDEPT
000290 | John | Parker | E11
000320 | Ramlal | Mehta | E21
000310 | Maude | Setright | E11

DEPARTMENT TABLE
DEPTNO (Primary Key) | DEPTNAME
E11 | Operations
E21 | Software Support

Relationship: DEPARTMENT (1) to EMPLOYEE (∞)
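A small Python sketch of this decomposition, using the rows from the EMPLOYEE_DEPARTMENT table. It splits the rows, renames a department in exactly one place, and rejoins. The variable names and the new department name are illustrative assumptions.

```python
# Rows of EMPLOYEE_DEPARTMENT: (EMPNO, FIRSTNAME, LASTNAME, WORKDEPT, DEPTNAME).
employee_department = [
    ("000290", "John", "Parker", "E11", "Operations"),
    ("000320", "Ramlal", "Mehta", "E21", "Software Support"),
    ("000310", "Maude", "Setright", "E11", "Operations"),
]

# EMPLOYEE keeps the department code; DEPARTMENT maps each code to its name.
employee = [row[:4] for row in employee_department]
department = {row[3]: row[4] for row in employee_department}


def rejoin(employee_rows, department_map):
    """Rebuild the five-column rows by looking up each employee's department name."""
    return [row + (department_map[row[3]],) for row in employee_rows]


# The decomposition loses nothing: rejoining reproduces the original rows.
round_trip_ok = rejoin(employee, department) == employee_department

# Renaming a department (hypothetical new name) is now a single update.
department["E11"] = "Field Operations"
renamed = rejoin(employee, department)
print(round_trip_ok, [row[4] for row in renamed])
```

In the original table the same rename would touch every employee row in E11, and missing one of them would leave the table inconsistent.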
26
A Note on 2NF
A table may have multiple candidate keys.
A functional dependency of a non-prime attribute on part of any candidate key is a violation of 2NF.
It is necessary to establish that no non-prime attributes have part-key dependencies on any of these candidate keys.
27
Example
Manufacturer | Model | Model Full Name | Manufacturer Country
Forte | X-Prime | Forte X-Prime | Italy
Forte | Ultraclean | Forte Ultraclean | Italy
Dent-o-Fresh | EZbrush | Dent-o-Fresh EZbrush | USA
Kobayashi | ST-60 | Kobayashi ST-60 | Japan
Hoch | Toothmaster | Hoch Toothmaster | Germany
Hoch | X-Prime | Hoch X-Prime | Germany
Both {Manufacturer, Model} and {Model Full Name} are candidate keys; one of them is designated the primary key (PK).
Example taken from Wikipedia: http://en.wikipedia.org/wiki/Second_normal_form
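The check described in the note on 2NF can be sketched in Python with an attribute-closure routine applied to the toothbrush relation above. The functional dependencies listed in the code are an assumption read off the example data, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(all_attrs, candidate_keys, fds):
    """List (key part, non-prime attributes it determines) for every candidate key."""
    prime = set().union(*candidate_keys)
    non_prime = set(all_attrs) - prime
    violations = []
    for key in candidate_keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependent = closure(part, fds) & non_prime
                if dependent:
                    violations.append((part, sorted(dependent)))
    return violations


attrs = ["Manufacturer", "Model", "Model Full Name", "Manufacturer Country"]
keys = [("Manufacturer", "Model"), ("Model Full Name",)]
# Assumed dependencies for the example relation.
fds = [
    (("Manufacturer", "Model"), ("Model Full Name",)),
    (("Model Full Name",), ("Manufacturer", "Model")),
    (("Manufacturer",), ("Manufacturer Country",)),
]

print(partial_dependencies(attrs, keys, fds))
```

The only violation reported is that Manufacturer, a proper part of the key {Manufacturer, Model}, determines the non-prime attribute Manufacturer Country. It is found even though the other candidate key is a single attribute.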
28
Example

Electric Toothbrush Manufacturers
Manufacturer | Manufacturer Country
Forte | Italy
Dent-o-Fresh | USA
Kobayashi | Japan
Hoch | Germany

Electric Toothbrush Models
Manufacturer | Model | Model Full Name
Forte | X-Prime | Forte X-Prime
Forte | Ultraclean | Forte Ultraclean
Dent-o-Fresh | EZbrush | Dent-o-Fresh EZbrush
Kobayashi | ST-60 | Kobayashi ST-60
Hoch | Toothmaster | Hoch Toothmaster
Hoch | X-Prime | Hoch X-Prime
29
More Examples
30
Example 1
Un-normalized Table:
Student# | Advisor# | Advisor | Adv-Room | Class1 | Class2 | Class3
1022 | 10 | Susan Jones | 412 | 101-07 | 143-01 | 159-02
4123 | 12 | Anne Smith | 216 | 101-07 | 159-02 | 214-01
31
Table in First Normal Form
No Repeating Fields
Data in Smallest Parts
Student# | Advisor# | AdvisorFName | AdvisorLName | Adv-Room | Class#
1022 | 10 | Susan | Jones | 412 | 101-07
1022 | 10 | Susan | Jones | 412 | 143-01
1022 | 10 | Susan | Jones | 412 | 159-02
4123 | 12 | Anne | Smith | 216 | 101-07
4123 | 12 | Anne | Smith | 216 | 159-02
4123 | 12 | Anne | Smith | 216 | 214-01
32
Is the table in 2NF? What is the key?
Student# | Advisor# | AdvisorFName | AdvisorLName | Adv-Room | Class#
1022 | 10 | Susan | Jones | 412 | 101-07
1022 | 10 | Susan | Jones | 412 | 143-01
1022 | 10 | Susan | Jones | 412 | 159-02
4123 | 12 | Anne | Smith | 216 | 101-07
4123 | 12 | Anne | Smith | 216 | 159-02
4123 | 12 | Anne | Smith | 216 | 214-01
2011 | 10 | Susan | Jones | 412 | 101-07
33
Is the table in 2NF? What is the key? (Same table as on the previous slide.)
What do we notice? The advisor fields depend on Student# alone, which is only part of the key (Student#, Class#).
34
Tables in Second Normal Form
Redundant Data Eliminated

Table: Students
Student# | Advisor# | AdvFirstName | AdvLastName | Adv-Room
1022 | 10 | Susan | Jones | 412
4123 | 12 | Anne | Smith | 216
2011 | 10 | Susan | Jones | 412

Table: Registration
Student# | Class#
1022 | 101-07
1022 | 143-01
1022 | 159-02
4123 | 101-07
4123 | 159-02
4123 | 214-01
2011 | 101-07
35
Table Registration is in 2NF. What about Students?
(Students and Registration tables as on the previous slide.)
What is the candidate key for Students?
36
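Tables in 2NF.
Moving the advisor fields into their own table also removes the transitive dependency Student# → Advisor# → advisor fields.

Table: Students
Student# | Advisor#
1022 | 10
4123 | 12
2011 | 10

Table: Registration
Student# | Class#
1022 | 101-07
1022 | 143-01
1022 | 159-02
4123 | 101-07
4123 | 159-02
4123 | 214-01
2011 | 101-07

Table: Advisors
Advisor# | AdvFirstName | AdvLastName | Adv-Room
10 | Susan | Jones | 412
12 | Anne | Smith | 216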
37
Relationships for Example 1
Registration(Student#, Class#)
Students(Student#, Advisor#)
Advisors(Advisor#, AdvFirstName, AdvLastName, Adv-Room)
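These relationships could be declared as foreign keys. The sketch below uses Python's sqlite3 module, loads the data of the seven-row 1NF table from Example 1, and rejoins the three tables. The snake_case names and the column types are assumptions; the slides name the tables and columns only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES
conn.executescript("""
CREATE TABLE advisors (
    advisor_no     INTEGER PRIMARY KEY,
    adv_first_name TEXT, adv_last_name TEXT, adv_room TEXT
);
CREATE TABLE students (
    student_no INTEGER PRIMARY KEY,
    advisor_no INTEGER NOT NULL REFERENCES advisors(advisor_no)
);
CREATE TABLE registration (
    student_no INTEGER NOT NULL REFERENCES students(student_no),
    class_no   TEXT NOT NULL,
    PRIMARY KEY (student_no, class_no)
);
""")
conn.executemany("INSERT INTO advisors VALUES (?, ?, ?, ?)",
                 [(10, "Susan", "Jones", "412"), (12, "Anne", "Smith", "216")])
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(1022, 10), (4123, 12), (2011, 10)])
conn.executemany("INSERT INTO registration VALUES (?, ?)", [
    (1022, "101-07"), (1022, "143-01"), (1022, "159-02"),
    (4123, "101-07"), (4123, "159-02"), (4123, "214-01"),
    (2011, "101-07"),
])

# Rejoining the three tables reproduces the rows of the 1NF table.
rows = conn.execute("""
    SELECT s.student_no, a.advisor_no, a.adv_first_name, a.adv_last_name,
           a.adv_room, r.class_no
    FROM registration r
    JOIN students s ON s.student_no = r.student_no
    JOIN advisors a ON a.advisor_no = s.advisor_no
""").fetchall()

# A student pointing at a non-existent advisor is rejected.
try:
    conn.execute("INSERT INTO students VALUES (9999, 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(len(rows), fk_enforced)
```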
38
Example 2
Un-normalized Table:
EmpID | Name | Dept Code | Dept Name | Proj 1 | Time Proj 1 | Proj 2 | Time Proj 2 | Proj 3 | Time Proj 3
EN1-26 | Sean Breen | TW | Technical Writing | 30-T3 | 25% | 30-TC | 40% | 31-T3 | 30%
EN1-33 | Amy Guya | TW | Technical Writing | 30-T3 | 50% | 30-TC | 35% | 31-T3 | 60%
EN1-36 | Liz Roslyn | AC | Accounting | 35-TC | 90% | | | |
39
Table in First Normal Form
EmpID | Project Number | Time on Project | Last Name | First Name | Dept Code | Dept Name
EN1-26 | 30-T3 | 25% | Breen | Sean | TW | Technical Writing
EN1-26 | 30-TC | 40% | Breen | Sean | TW | Technical Writing
EN1-26 | 31-T3 | 30% | Breen | Sean | TW | Technical Writing
EN1-33 | 30-T3 | 50% | Guya | Amy | TW | Technical Writing
EN1-33 | 30-TC | 35% | Guya | Amy | TW | Technical Writing
EN1-33 | 31-T3 | 60% | Guya | Amy | TW | Technical Writing
EN1-36 | 35-TC | 90% | Roslyn | Liz | AC | Accounting
What is the candidate key?
40
Tables in Second Normal Form

Table: Employees and Projects
EmpID | Project Number | Time on Project
EN1-26 | 30-T3 | 25%
EN1-26 | 30-TC | 40%
EN1-26 | 31-T3 | 30%
EN1-33 | 30-T3 | 50%
EN1-33 | 30-TC | 35%
EN1-33 | 31-T3 | 60%
EN1-36 | 35-TC | 90%

Table: Employees
EmpID | Last Name | First Name | Dept Code | Dept Name
EN1-26 | Breen | Sean | TW | Technical Writing
EN1-33 | Guya | Amy | TW | Technical Writing
EN1-36 | Roslyn | Liz | AC | Accounting

Are they in 3NF?
The underlying problem is the transitive dependency to which the Dept Name attribute is subject. Dept Name actually depends on Dept Code, which in turn depends on the key EmpID.
41
Tables in Third Normal Form

Table: Departments
Dept Code | Dept Name
TW | Technical Writing
AC | Accounting

Table: Employees_and_Projects
EmpID | Project Number | Time on Project
EN1-26 | 30-T3 | 25%
EN1-26 | 30-TC | 40%
EN1-26 | 31-T3 | 30%
EN1-33 | 30-T3 | 50%
EN1-33 | 30-TC | 35%
EN1-33 | 31-T3 | 60%
EN1-36 | 35-TC | 90%

Table: Employees
EmpID | Last Name | First Name | Dept Code
EN1-26 | Breen | Sean | TW
EN1-33 | Guya | Amy | TW
EN1-36 | Roslyn | Liz | AC
42
Relationships for Example 2
Employees_and_Projects(EmpID, ProjectNumber, TimeonProject)
Employees(EmpID, FirstName, LastName, DeptCode)
Departments(DeptCode, DeptName)
43
Example 3
Un-normalized Table:
EmpID | Name | Manager | Dept | Sector | Spouse/Children
285 | Carl Carlson | Smithers | Engineering | 6G |
365 | Lenny | Smithers | Marketing | 8G |
458 | Homer Simpson | Mr. Burns | Safety | 7G | Marge, Bart, Lisa, Maggie
44
Table in First Normal Form
Fields contain smallest meaningful values
EmpID | FName | LName | Manager | Dept | Sector | Spouse | Child1 | Child2 | Child3
285 | Carl | Carlson | Smithers | Eng. | 6G | | | |
365 | Lenny | | Smithers | Marketing | 8G | | | |
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Marge | Bart | Lisa | Maggie
45
Table in First Normal Form
No more repeated fields
EmpID | FName | LName | Manager | Department | Sector | Dependent
285 | Carl | Carlson | Smithers | Engineering | 6G |
365 | Lenny | | Smithers | Marketing | 8G |
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Marge
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Bart
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Lisa
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Maggie
Is the table in 2NF? What is the candidate key?
46
Second/Third Normal Form
Remove Repeated Data From Table
Step 1

EmpID | FName | LName | Manager | Department | Sector
285 | Carl | Carlson | Smithers | Engineering | 6G
365 | Lenny | | Smithers | Marketing | 8G
458 | Homer | Simpson | Mr. Burns | Safety | 7G

EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie
47
Tables in Second Normal Form
Removed Repeated Data From Table
Step 2

EmpID | FName | LName | ManagerID | Dept | Sector
285 | Carl | Carlson | 2 | Engineering | 6G
365 | Lenny | | 2 | Marketing | 8G
458 | Homer | Simpson | 1 | Safety | 7G

EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie

ManagerID | Manager
1 | Mr. Burns
2 | Smithers

We look for the transitive dependency.
48
Tables in Second Normal Form (same tables as on the previous slide)
How about 3NF?
Step 3
We look for the transitive dependency.
If I know Dept, then I know ManagerID and Sector. If I know EmpID, then I know Dept.
49
Tables in Third Normal Form

Employees Table
EmpID | FName | LName | DeptCode
285 | Carl | Carlson | EN
365 | Lenny | | MK
458 | Homer | Simpson | SF

Dependents Table
EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie

Department Table
DeptCode | Department | Sector | ManagerID
EN | Engineering | 6G | 2
MK | Marketing | 8G | 2
SF | Safety | 7G | 1

Manager Table
ManagerID | Manager
1 | Mr. Burns
2 | Smithers
50
Example 4

Table Violating 1st Normal Form
Rep ID | Representative | Client 1 | Time 1 | Client 2 | Time 2 | Client 3 | Time 3
TS-89 | Gilroy Gladstone | US Corp. | 14 hrs | Taggarts | 26 hrs | Kilroy Inc. | 9 hrs
RK-56 | Mary Mayhem | Italiana | 67 hrs | Linkers | 2 hrs | |

Table in 1st Normal Form
Rep ID | Rep First Name | Rep Last Name | Client ID* | Client | Time With Client
TS-89 | Gilroy | Gladstone | 978 | US Corp | 14 hrs
TS-89 | Gilroy | Gladstone | 665 | Taggarts | 26 hrs
TS-89 | Gilroy | Gladstone | 782 | Kilroy Inc. | 9 hrs
RK-56 | Mary | Mayhem | 221 | Italiana | 67 hrs
RK-56 | Mary | Mayhem | 982 | Linkers | 2 hrs
51
Tables in 2nd and 3rd Normal Form

Rep ID* | Client ID* | Time With Client
TS-89 | 978 | 14 hrs
TS-89 | 665 | 26 hrs
TS-89 | 782 | 9 hrs
RK-56 | 221 | 67 hrs
RK-56 | 982 | 2 hrs
RK-56 | 665 | 4 hrs

Rep ID* | First Name | Last Name
TS-89 | Gilroy | Gladstone
RK-56 | Mary | Mayhem

Client ID* | Client Name
978 | US Corp
665 | Taggarts
782 | Kilroy Inc.
221 | Italiana
982 | Linkers

This example comes from a tutorial from http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=95 and http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=104
Please check them out, as they are very well done.
52
Example 5

Table in 1st Normal Form
SupplierID | Status | City | PartID | Quantity
S1 | 20 | London | P1 | 300
S1 | 20 | London | P2 | 200
S2 | 10 | Paris | P1 | 300
S2 | 10 | Paris | P2 | 400
S3 | 10 | Paris | P2 | 200
S4 | 20 | London | P2 | 200
S4 | 20 | London | P4 | 300

Although this table is in 1NF, it contains redundant data. For example, information about the supplier's location and the location's status has to be repeated for every part supplied. Redundancy causes what are called update anomalies. Update anomalies are problems that arise when information is inserted, deleted, or updated. For example, the following anomalies could occur in this table:
INSERT. The fact that a certain supplier (S5) is located in a particular city (Athens) cannot be added until that supplier has supplied a part.
DELETE. If a row is deleted, then not only is the information about quantity and part lost but also information about the supplier.
UPDATE. If supplier S1 moved from London to New York, then two rows would have to be updated with this new information.
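The UPDATE and DELETE anomalies can be observed directly on this table. The sketch below loads the seven rows into an in-memory SQLite database through Python's sqlite3 module. The table name supplier_part, the column names, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE supplier_part (
        supplier_id TEXT, status INTEGER, city TEXT,
        part_id TEXT, quantity INTEGER,
        PRIMARY KEY (supplier_id, part_id)
    )
""")
conn.executemany("INSERT INTO supplier_part VALUES (?, ?, ?, ?, ?)", [
    ("S1", 20, "London", "P1", 300),
    ("S1", 20, "London", "P2", 200),
    ("S2", 10, "Paris", "P1", 300),
    ("S2", 10, "Paris", "P2", 400),
    ("S3", 10, "Paris", "P2", 200),
    ("S4", 20, "London", "P2", 200),
    ("S4", 20, "London", "P4", 300),
])

# UPDATE anomaly: one real-world change (S1 moves) touches several rows.
updated = conn.execute(
    "UPDATE supplier_part SET city = 'New York' WHERE supplier_id = 'S1'"
).rowcount

# DELETE anomaly: removing S3's only shipment also removes all knowledge of S3.
conn.execute("DELETE FROM supplier_part WHERE supplier_id = 'S3' AND part_id = 'P2'")
s3_rows_left = conn.execute(
    "SELECT COUNT(*) FROM supplier_part WHERE supplier_id = 'S3'"
).fetchone()[0]

print(updated, s3_rows_left)
```

The INSERT anomaly shows up in the key itself: part_id belongs to the primary key, so a supplier with no parts has no valid row to occupy.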
53
Tables in 2NF

Suppliers
SupplierID | Status | City
S1 | 20 | London
S2 | 10 | Paris
S3 | 10 | Paris
S4 | 20 | London
S5 | 30 | Athens

Parts
SupplierID | PartID | Quantity
S1 | P1 | 300
S1 | P2 | 200
S2 | P1 | 300
S2 | P2 | 400
S3 | P2 | 200
S4 | P4 | 300
S4 | P5 | 400

Tables in 2NF but not in 3NF still contain modification anomalies. In the example of Suppliers, they are:
INSERT. The fact that a particular city has a certain status (Rome has a status of 50) cannot be inserted until there is a supplier in the city.
DELETE. Deleting any row in SUPPLIER destroys the status information about the city as well as the association between supplier and city.
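The slide stops at naming these anomalies. One way they could be removed is to split Suppliers once more on the dependency City → Status. The Python sketch below illustrates that assumed 3NF step; the structure names supplier_city and city_status are hypothetical and do not appear in the slides.

```python
# Rows from the Suppliers table above: (SupplierID, Status, City).
suppliers = [
    ("S1", 20, "London"),
    ("S2", 10, "Paris"),
    ("S3", 10, "Paris"),
    ("S4", 20, "London"),
    ("S5", 30, "Athens"),
]

# Assumed 3NF split: SupplierID -> City and City -> Status.
supplier_city = {sid: city for sid, _status, city in suppliers}
city_status = {city: status for _sid, status, city in suppliers}

# INSERT: a city's status can be recorded before any supplier is located there.
city_status["Rome"] = 50

# DELETE: removing the only Athens supplier keeps the status of Athens.
del supplier_city["S5"]

print(city_status["Athens"], sorted(city_status))
```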