Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003.

Slides:



Advertisements
Similar presentations
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
Advertisements

Logical Database Design (3 of 3) John Ortiz. Lecture 7Logical Database Design (2)2 Normalization  If a relation is not in BCNF or 3NF, we refine it by.
Tallahassee, Florida, 2014 COP4710 Database Systems Relational Algebra Fall 2014.
Lecture 07: Relational Algebra
1 Relational Algebra. Motivation Write a Java program Translate it into a program in assembly language Execute the assembly language program As a rough.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
1 Relational Algebra Lecture #9. 2 Querying the Database Goal: specify what we want from our database Find all the employees who earn more than $50,000.
Lecture 6: Design Constraints and Functional Dependencies January 21st, 2004.
Relational Algebra Maybe -- SQL. Confused by Normal Forms ? 3NF BCNF 4NF If a database doesn’t violate 4NF (BCNF) then it doesn’t violate BCNF (3NF) !
1 Lecture 10: Database Design XML Wednesday, October 20, 2004.
Functional Dependencies Definition: If two tuples agree on the attributes A, A, … A 12n then they must also agree on the attributes B, B, … B 12m Formally:
1 Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design April 16 & 18, 2008.
Lecture #3 Functional Dependencies Normalization Relational Algebra Thursday, October 12, 2000.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 06 The Relational Data Model. 2 Outline Relational Data Model Functional Dependencies FDs in ER Logical Schema Design Reading Chapter 8.
1 Lecture 07: Relational Algebra. 2 Outline Relational Algebra (Section 6.1)
Boyce-Codd NF & Lossless Decomposition Professor Sin-Min Lee.
Relational Schema Design (end) Relational Algebra Finally, querying the database!
One More Normal Form Consider the dependencies: Product Company Company, State Product Is it in BCNF?
Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such.
Functional Dependencies and Relational Schema Design.
Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
SCUJ. Holliday - coen 1784–1 Schedule Today: u Normal Forms. u Section 3.6. Next u Relational Algebra. Read chapter 5 to page 199 After that u SQL Queries.
1 Introduction to Database Systems CSE 444 Lecture 20: Query Execution: Relational Algebra May 21, 2008.
Transactions, Relational Algebra, XML February 11 th, 2004.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
1 Schema Refinement and Normal Forms Week 6. 2 The Evils of Redundancy  Redundancy is at the root of several problems associated with relational schemas:
CSE 544: Relational Operators, Sorting Wednesday, 5/12/2004.
1 Lecture 7: Normal Forms, Relational Algebra Monday, 10/15/2001.
Relational Algebra 2. Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language.
1 Lecture 10: Database Design Wednesday, January 26, 2005.
Functional Dependencies and Relational Schema Design.
1 CS 430 Database Theory Winter 2005 Lecture 5: Relational Algebra.
The Relational Model Lecture 16. Today’s Lecture 1.The Relational Model & Relational Algebra 2.Relational Algebra Pt. II [Optional: may skip] 2 Lecture.
1 Lecture 10: Database Design and Relational Algebra Monday, October 20, 2003.
1 Lecture 5: Relational Algebra and XML Monday, April 26th, 2004.
1 CSE544: Lecture 7 XQuery, Relational Algebra Monday, 4/22/02.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke1 Schema Refinement and Normal Forms Chapter 19.
1 Lecture 9: Database Design Wednesday, January 25, 2006.
Normalization and FUNctional Dependencies. Redundancy: root of several problems with relational schemas: –redundant storage, insert/delete/update anomalies.
©Silberschatz, Korth and Sudarshan2.1Database System Concepts - 6 th Edition Chapter 8: Relational Algebra.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
Formal definition of a key A key is a set of attributes A 1,..., A n such that for any other attribute B: A 1,..., A n  B A minimal key is a set of attributes.
Lecture 11: Functional Dependencies
Lecture 14: Relational Algebra Projects XML?
Relational Algebra.
COP4710 Database Systems Relational Algebra.
Relational Algebra at a Glance
CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1.
Lecture 8: Relational Algebra
Chapter 2: Intro to Relational Model
Database Management Systems (CS 564)
Problems in Designing Schema
Lecture 18 The Relational Model.
Functional Dependencies and Relational Schema Design
RDBMS RELATIONAL DATABASE MANAGEMENT SYSTEM.
Lecture 8: Database Design
Lecture 33: The Relational Model 2
Relational Algebra Friday, 11/14/2003.
CS639: Data Management for Data Science
Chapter 2: Intro to Relational Model
Lecture 3: Relational Algebra and SQL
Relational Schema Design (end) Relational Algebra SQL (maybe)
Lecture 6: Functional Dependencies
Lecture 11: Functional Dependencies
Lecture 09: Functional Dependencies
Presentation transcript:

Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003

Summary of Previous Discussion FD’s are given as part of the schema. You derived keys from them. You can also specify keys directly (they’re just another FD) First, compute the keys and the FD’s that hold on the relation you’re considering. Then, look for violations of BCNF. There may be a few. Choose one.

Summary (continued) You may need to decompose the decomposed relations: –To decide that, you need to project the FD’s on the decomposed relations. –The end result may differ, depending on which violation you chose to decompose by. They’re all correct. Relations with 2 attributes are in BCNF.

Summary (continued –2) If you have only a single FD representing a key, then the relation is in BCNF. But, you can still have another FD that forces you to decompose. This is all really pretty simple if you think about it long enough.

Boyce-Codd Normal Form A simple condition for removing anomalies from relations: In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R. A relation R is in BCNF if: Whenever there is a nontrivial dependency A 1,..., A n  B in R, {A 1,..., A n } is a key for R A relation R is in BCNF if: Whenever there is a nontrivial dependency A 1,..., A n  B in R, {A 1,..., A n } is a key for R

Example Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age, hairColor age  hairColor Decompose in BCNF (in class): Step 1: find all keys ssn, phoneNumber SSN, name, age, hairColor SSN, phoneNumber Step 2: now decompose

Other Example R(A,B,C,D) A B, B C Keys: Violations of BCNF:

Correct Decompositions A decomposition is lossless if we can recover: R(A,B,C) R1(A,B) R2(A,C) R’(A,B,C) should be the same as R(A,B,C) R’ is in general larger than R. Must ensure R’ = R Decompose Recover

Correct Decompositions Given R(A,B,C) s.t. A  B, the decomposition into R1(A,B), R2(A,C) is lossless

3NF: A Problem with BCNF Unit Company Product Unit Company Unit Product FD’s: Unit  Company; Company, Product  Unit So, there is a BCNF violation, and we decompose. Unit  Company No FDs

So What’s the Problem? Unit Company Product Unit CompanyUnit Product Galaga99 UW Galaga99 databases Bingo UW Bingo databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again: Galaga99 UW databases Bingo UW databases Violates the dependency: company, product -> unit!

Solution: 3rd Normal Form (3NF) A simple condition for removing anomalies from relations: A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key. A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key.

Relational Algebra Formalism for creating new relations from existing ones Its place in the big picture: Declartive query language Algebra Implementation SQL, relational calculus Relational algebra Relational bag algebra

Relational Algebra Five operators: –Union:  –Difference: - –Selection:  –Projection:  –Cartesian Product:  Derived or auxiliary operators: –Intersection, complement –Joins (natural,equi-join, theta join, semi-join) –Renaming: 

1. Union and 2. Difference R1  R2 Example: –ActiveEmployees  RetiredEmployees R1 – R2 Example: –AllEmployees -- RetiredEmployees

What about Intersection ? It is a derived operator R1  R2 = R1 – (R1 – R2) Also expressed as a join (will see later) Example –UnionizedEmployees  RetiredEmployees

3. Selection Returns all tuples which satisfy a condition Notation:  c (R) Examples –  Salary > (Employee) –  name = “Smithh” (Employee) The condition c can be =,, , <>

Find all employees with salary more than $40,000.  Salary > (Employee)

4. Projection Eliminates columns, then removes duplicates Notation:  A1,…,An (R) Example: project social-security number and names: –  SSN, Name (Employee) –Output schema: Answer(SSN, Name)

 SSN, Name (Employee)

5. Cartesian Product Each tuple in R1 with each tuple in R2 Notation: R1  R2 Example: –Employee  Dependents Very rare in practice; mainly used to express joins

Relational Algebra Five operators: –Union:  –Difference: - –Selection:  –Projection:  –Cartesian Product:  Derived or auxiliary operators: –Intersection, complement –Joins (natural,equi-join, theta join, semi-join) –Renaming: 

Renaming Changes the schema, not the instance Notation:  B1,…,Bn (R) Example: –  LastName, SocSocNo (Employee) –Output schema: Answer(LastName, SocSocNo)

Renaming Example Employee NameSSN John Tony LastNameSocSocNo John Tony  LastName, SocSocNo (Employee)

Natural Join Notation: R1 ⋈ R2 Meaning: R1 ⋈ R2 =  A (  C (R1  R2)) Where: –The selection  C checks equality of all common attributes –The projection eliminates the duplicate common attributes

Natural Join Example Employee NameSSN John Tony Dependents SSNDname Emily Joe NameSSNDname John Emily Tony Joe Employee Dependents =  Name, SSN, Dname (  SSN=SSN2 (Employee x  SSN2, Dname (Dependents))

Natural Join R= S= R ⋈ S= AB XY XZ YZ ZV BC ZU VW ZV ABC XZU XZV YZU YZV ZVW

Natural Join Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ? Given R(A, B, C), S(D, E), what is R ⋈ S ? Given R(A, B), S(A, B), what is R ⋈ S ?

Theta Join A join that involves a predicate R1 ⋈  R2 =   (R1  R2) Here  can be any condition

Eq-join A theta join where  is an equality R1 ⋈ A=B R2 =  A=B (R1  R2) Example: –Employee ⋈ SSN=SSN Dependents Most useful join in practice

Semijoin R ⋉ S =  A1,…,An (R ⋈ S) Where A 1, …, A n are the attributes in R Example: –Employee ⋉ Dependents

Semijoins in Distributed Databases Semijoins are used in distributed databases SSNName... SSNDnameAge... Employee Dependents network Employee ⋈ ssn=ssn (  age>71 (Dependents)) T =  SSN  age>71 (Dependents) R = Employee ⋉ T Answer = R ⋈ Dependents

Complex RA Expressions Person Purchase Person Product  name=fred  name=gizmo  pid  ssn seller-ssn=ssnpid=pidbuyer-ssn=ssn  name

Operations on Bags A bag = a set with repeated elements All operations need to be defined carefully on bags {a,b,b,c}  {a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f} {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}  C (R): preserve the number of occurrences  A (R): no duplicate elimination Cartesian product, join: no duplicate elimination Important ! Relational Engines work on bags, not sets ! Reading assignment: 5.3 – 5.4

Finally: RA has Limitations ! Cannot compute “transitive closure” Find all direct and indirect relatives of Fred Cannot express in RA !!! Need to write C program Name1Name2Relationship FredMaryFather MaryJoeCousin MaryBillSpouse NancyLouSister