1 Lecture 7: Normal Forms, Relational Algebra Monday, 10/15/2001.

Slides:



Advertisements
Similar presentations
ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala N ATIONAL I NSTITUTE OF T ECHNOLOGY A GARTALA Aug-Dec,2010 Normalization 2 CSE-503 :: D ATABASE.
Advertisements

Lecture 07: Relational Algebra
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Lecture 6: Design Constraints and Functional Dependencies January 21st, 2004.
603 Database Systems Senior Lecturer: Laurie Webster II, M.S.S.E.,M.S.E.E., M.S.BME, Ph.D., P.E. Lecture 8 A First Course in Database Systems.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 10: Database Design XML Wednesday, October 20, 2004.
Functional Dependencies Definition: If two tuples agree on the attributes A, A, … A 12n then they must also agree on the attributes B, B, … B 12m Formally:
1 Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design April 16 & 18, 2008.
Lecture #3 Functional Dependencies Normalization Relational Algebra Thursday, October 12, 2000.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 06 The Relational Data Model. 2 Outline Relational Data Model Functional Dependencies FDs in ER Logical Schema Design Reading Chapter 8.
1 Lecture 07: Relational Algebra. 2 Outline Relational Algebra (Section 6.1)
M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #7 Matthew P. Johnson Stern School of Business, NYU Spring,
Boyce-Codd NF & Lossless Decomposition Professor Sin-Min Lee.
M.P. Johnson, DBMS, Stern/NYU, Sp20041 C : Database Management Systems Lecture #6 Matthew P. Johnson Stern School of Business, NYU Spring, 2004.
Chapter 14 Advanced Normalization Transparencies © Pearson Education Limited 1995, 2005.
Relational Schema Design (end) Relational Algebra Finally, querying the database!
One More Normal Form Consider the dependencies: Product Company Company, State Product Is it in BCNF?
Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such.
Functional Dependencies and Relational Schema Design.
Copyright © Curt Hill Schema Refinement III 4 th NF and 5 th NF.
1 Schema Design & Refinement (aka Normalization).
Lecture 6 Normalization: Advanced forms. Objectives How inference rules can identify a set of all functional dependencies for a relation. How Inference.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
1 Introduction to Database Systems CSE 444 Lecture 20: Query Execution: Relational Algebra May 21, 2008.
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
Tallahassee, Florida, 2015 COP4710 Database Systems Relational Design Fall 2015.
Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language.
1 Lecture 10: Database Design Wednesday, January 26, 2005.
Functional Dependencies and Relational Schema Design.
1 Multivalued Dependencies Fourth Normal Form Reasoning About FD’s + MVD’s.
Design Theory for RDB Normal Forms. Lu Chaojun, SJTU 2 Redundant because these info may be figured out by using FD s1  … What’s Bad Design? Redundancy.
Lecture 3: Conceptual Database Design and Schema Design April 12 th, 2004.
Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003.
1 Lecture 10: Database Design and Relational Algebra Monday, October 20, 2003.
Multivalued Dependencies and 4th NF CIS 4301 Lecture Notes Lecture /21/2006.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
© D. Wong Functional Dependencies (FD)  Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X  Y means X functionally.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
1 Lecture 9: Database Design Wednesday, January 25, 2006.
CSE 326: Data Structures Lecture #22 Databases and Sorting Alon Halevy Spring Quarter 2001.
Formal definition of a key A key is a set of attributes A 1,..., A n such that for any other attribute B: A 1,..., A n  B A minimal key is a set of attributes.
Lecture 11: Functional Dependencies
Relational Algebra.
Advanced Normalization
COP4710 Database Systems Relational Design.
Relational Algebra at a Glance
Lecture 8: Relational Algebra
3.1 Functional Dependencies
Advanced Normalization
Problems in Designing Schema
Relational Design Theory
Cse 344 May 16th – Normalization.
Functional Dependencies and Relational Schema Design
Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design October 12 & 15, 2007.
Lecture 09: Functional Dependencies, Database Design
RDBMS RELATIONAL DATABASE MANAGEMENT SYSTEM.
Lecture 8: Database Design
Lecture 33: The Relational Model 2
Lecture 07: E/R Diagrams and Functional Dependencies
CSE544 Data Modeling, Conceptual Design
Terminology Product Attribute names Name Price Category Manufacturer
Relational Schema Design (end) Relational Algebra SQL (maybe)
Syllabus Introduction Website Management Systems
Lecture 6: Functional Dependencies
Lecture 11: Functional Dependencies
Lecture 09: Functional Dependencies
Presentation transcript:

1 Lecture 7: Normal Forms, Relational Algebra Monday, 10/15/2001

2 Outline Normal forms: BCNF, 3NF (3.7) Multi-valued Dependencies, 4NF (3.8) Relational Algebra (4.1)

3 Relational Schema Design (or Logical Design) Main idea: Start with some relational schema Find out its FD’s Use them to design a better relational schema

4 Relational Schema Design Goal: eliminate anomalies Redundancy anomalies Deletion anomalies Update anomalies

5 Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = this lecture Boyce Codd Normal Form (BCNF) = this lecture Fourth Normal Form (4NF) = this lecture (only briefly)

6 Boyce-Codd Normal Form A simple condition for removing anomalies from relations: In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R. A relation R is in BCNF if and only if: Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R. A relation R is in BCNF if and only if: Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R.

7 Example Name SSN Phone Number Fred (201) Fred (206) Joe (908) Joe (212) What are the dependencies? SSN Name What are the keys? Name, SSN, PhoneNumber Is it in BCNF?

8 Decompose it into BCNF SSN Name Fred Joe SSN Phone Number (201) (206) (908) (212) SSN Name SSN PhoneNumber

9 What About This? Name Price Category Gizmo $19.99 gadgets OneClick $24.99 camera Name Price, Category It is already in BCNF

10 BCNF Decomposition Algorithm Find a dependency that violates the BCNF condition: A, A, … A 12n B, B, … B 12m A’s Others B’s R1R2 Heuristics: choose B, B, … B “as large as possible” 12m Decompose: Continue, until there are no BCNF violations left.

11 Example Decomposition FDs: SSN  Name, Age, Eye Color, Draft-worthy Age  Draft-worthy FDs: SSN  Name, Age, Eye Color, Draft-worthy Age  Draft-worthy Person: BNCF Decomposition: Step 1: Person1(SSN, Name, Age, EyeColor, Draft-worthy), Person2(SSN, PhoneNumber) Step 2: Person1a(SSN, Name, Age, EyeColor ), Person1b(Age, Draft-worthy) Person2(SSN, PhoneNumber) Name SSN Age EyeColor Draft-worthy PhoneNumber Keys ? Meaning ?

12 Other Example R(A,B,C,D) A  B, B  C Key: A, D Violations of BCNF: A  B, A  C, A  BC Pick A  BC: split into R1(A,BC) R2(A,D) What happens if we pick A  B first ?

13 Correct Decompositions A decomposition is lossless if we can recover: R(A,B,C) { R1(A,B), R2(A,C) } R’(A,B,C) = R(A,B,C) R’ is in general larger than R. Must ensure R’ = R Decompose Recover

14 Decomposition Based on BCNF is Necessarily Lossless R(A, B, C), A  C BCNF: R1(A,B), R2(A,C) Some tuple (a,b,c) in R (a,b’,c’) also in R decomposes into (a,b) in R1 (a,b’) also in R1 and (a,c) in R2 (a,c’) also in R2 Recover tuples in R: (a,b,c), (a,b,c’), (a,b’,c), (a,b’,c’) also in R ? Can (a,b,c’) be a bogus tuple? What about (a,b’,c’) ?

15 3NF: A Problem with BCNF Unit Company Product Unit Company Unit Product FD’s: Unit  Company; Company, Product  Unit So, there is a BCNF violation, and we decompose. Unit  Company No FDs

16 So What’s the Problem? Unit Company Product Unit CompanyUnit Product Galaga99 UW Galaga99 databases Bingo UW Bingo databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again: Galaga99 UW databases Bingo UW databases Violates the dependency: company, product -> unit!

17 Solution: 3rd Normal Form (3NF) A simple condition for removing anomalies from relations: A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key. A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key.

18 Multi-valued Dependencies SSN Phone Number Course (206) CSE (206) CSE (206) CSE (206) CSE-341 The multi-valued dependencies are: SSN Phone Number SSN Course

19 Definition of Multi-valued Dependecy Given R(A1,…,An,B1,…,Bm,C1,…,Cp) the MVD A1,…,An B1,…,Bm holds if: for any values of A1,…,An the “set of values” of B1,…,Bm is “independent” of those of C1,…Cp Given R(A1,…,An,B1,…,Bm,C1,…,Cp) the MVD A1,…,An B1,…,Bm holds if: for any values of A1,…,An the “set of values” of B1,…,Bm is “independent” of those of C1,…Cp

20 Definition of MVDs Continued Equivalently: the decomposition into R1(A1,…,An,B1,…,Bm), R2(A1,…,An,C1,…,Cp) is lossless Note: an MVD A1,…,An B1,…,Bm Implicitly talks about “the other” attributes C1,…Cp

21 Rules for MVDs If A1,…An B1,…,Bm then A1,…,An B1,…,Bm Other rules in the book

22 4 th Normal Form (4NF) R is in 4NF if whenever: A1,…,An B1,…,Bm is a nontrivial MVD, then A1,…,An is a superkey R is in 4NF if whenever: A1,…,An B1,…,Bm is a nontrivial MVD, then A1,…,An is a superkey Same as BCNF with FDs replaced by MVDs

23 Confused by Normal Forms ? 3NF BCNF 4NF In practice: (1) 3NF is enough, (2) don’t overdo it !

24 Querying the Database

25 Querying the Database Find all the employees who earn more than $50,000 and pay taxes in New Jersey. We design high-level query languages: –SQL (used everywhere) –Datalog (used by database theoreticians, their students, friends and family) –Relational algebra: a basic set of operations on relations that provide the basic principles.

26 Relational Algebra at a Glance Operators: relations as input, new relation as output Five basic RA operators: –Basic Set Operators union, difference (no intersection, no complement) –Selection:  –Projection:  –Cartesian Product: X Derived operators: –Intersection, complement –Joins (natural,equi-join, theta join, semi-join) When our relations have attribute names: –Renaming: 

27 Set Operations Binary operations Union: all tuples in R1 or R2 –R1 U R2 –Example: ActiveEmployees U RetiredEmployees Difference: all tuples in R1 and not in R2 –R1 – R2 –Example AllEmployees - RetiredEmployees

28 Selection Unary operation: returns a subset of the tuples which satisfy some condition Notation:  (R) c is a condition: –=,, and, or, not Find all employees with salary more than $40,000: –  (Employee) c Salary > 40000

29 Find all employees with salary more than $40,000.

30 Projection Unary operation: returns certain columns Eliminates duplicate tuples ! Notation:  (R) Example: project social-security number and names: –  (Employee) A1,…,An SSN, Name

31

32 Cartesian Product Binary Operation Result is tuples combining any element of R1 with any element of R2, for R1 X R2 Schema is union of Schema(R1) & Schema(R2) Notation: R1 x R2 Example: Employee x Dependents Very rare in practice; but joins are very often

33

34 Join (Natural) Most important, expensive and exciting. Combines two relations, selecting only related tuples Equivalent to a cross product followed by selection Resulting schema has all attributes of the two relations, but one copy of join condition attributes

35