CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1.

CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1

Boyce-Codd Normal Form
A relation R is in BCNF if whenever there is a nontrivial FD A1 ... An  B for R, {A1 ... An} is a superkey for R. An FD is trivial if all the attributes on its right-hand side are also on its left-hand side.

BCNF Summary BCNF violation occurs when we do not store information on multiple functions efficiently This does not always mean that there is redundancy in a relation; it means that the relation could have redundant data If the input of a function f is a key or a super key of a table, we capture information on f without redundancy If the input of a function f is NOT a key, there are two situations we could have two redundant data

Two Possible Causes of Redundancy
The input of function g is an output or part of output of another function f The input of function g is a subset of attributes of another function f g f X Y Z 1 2 3 f: X -> Y g: Y-> Z In the first example, we say that function g maps value ‘2’ in Y into ‘3’ in Z three times. In the second example, we say that the function g maps value ‘1’ in Y into ‘1’ in Z three times. f: X,Y-> W g: Y -> Z X Y Z W 1 2 3

When BCNF Decomposition Breaks FDs?
We say that a decomposition is NOT dependency-preserving if we cannot check each FD with decomposed relations. Suppose that R(A, B, C) with FDs: A, B -> C, C->B. By using C -> B, we get R1(B, C) and R2(A,B). Therefore, we cannot check A,B -> C with R1 and R2.

3rd Normal Form R is in 3NF if for every nontrivial FD A1, ..., An  B, either {A1, ..., An} is a superkey, or B is part of a key. Weakens BCNF.

Primary Goal of 3NF Preserve FDs of the initial relation R with docomposed relations R1,…,Rn Try to minimize redundancy while achieving the first goal

Basic Approach for 3NF Decomposition
Create a relation for each FD E.g., If A->B, then create R(A, B) But, we want to minimize the number of such relations Thus, we find a minimum set of FDs, from which we can derive all FDs. E.g., If A->B, B->C, we don’t need to worry about A -> C. Thus, A->C is not part of the minimum set. Every FD in the minimum set has nice properties: A1,…,An -> B where 1) no FDs among A1,…,An, 2) a single attribute on the right and 3) no FDs of the form Ai,…,Aj -> B where {Ai,…,Aj} {A1,…,An}

Synthesis Algorithm for 3NF Schemas
1. Find a minimal basis G of the set of FDs for relation R 2. For each FD XA in G, add a relation with attributes XA 3. If none of the relation schemas from Step 2 is a superkey for R, add a relation whose schema is a key for R Result will be lossless and will preserve dependencies. Result will be in 3NF, but might not be in BCNF.

Minimal Basis A set of FD’s F is a minimal basis of a set
of dependencies E if E = F+ Every dependency in F has a single attribute for its right-hand side Cannot remove any dependency from F or remove attributes from the left side of any FD in F (minimality) Example: E = {AB, AC, BA, BC, CA, CB} F = {AB, BC, CA}

Example 3NF Decomposition
R(A, B, C, D, E), F = {AB -> C, C -> B, C -> D} We assume that F is the minimum cover Obtain R1(A, B, C) , R2(B, C), and R3(C, D) from each FD in set F Since either R1 or R3 does not contain a key {ABE} or {ACE}, add one more relation R4(A, B, E) We finally get R1(ABC), R3(C, D), and R4(A, B, E) omitting R2(B, C). In step 3, you can pick only one of the keys freely.

Is a decomposed relation for 3NF also in BCNF?
If A1,…,An -> B in the minimum set, then we create R(A1,…, An, B). Remember that there is no FDs among A1,…,An. What type of FD in R will violate BCNF? B -> Ai,…,Aj where {Ai,…,Aj} {A1,…,An} If we apply BCNF decomposition with the above FD, we break, A1,…,An -> B Recall the primary goal of 3NF. We need to keep R any way Then, let’s stop here and find out what property FD: B -> Ai,…,Aj has Ai,…,Aj is part of a key!

Relational Schema Design
What is a keys and a super key? What is a functional dependency? How to compute the closure of a set of attributes given FDs? How to determine keys given functional dependencies? How to use Armstrong's Axioms and compute a closure of FDs? What are update and deletion anomalies? Why are they bad? What is BCNF? How to decompose relation into BCNF? Is BCNF decomposition lossless? What is 3NF? How is it different from BCNF? Is 3NF decomposition lossless? What are tradeoffs between BCNF and 3NF? What’s a multi-valued dependency? Is it related to a FD in some way? What is the relationship between BCNF and 4NF?

Relational Algebra What are basic five operators in RA?
What are derived operators? Do you know the symbols of the operators and what they do? How to define each derived operator with the basic ones? What are theta-join and natural join? What is turned by the following expression on relation R(a)?

Which products are available only at a single store?
R(ProductName, Store, ID) ProductName StoreID Bread 2 Cheese-Cheddar 4 5

T3: Inventory Relation T31: Inventory Relation ProductName StoreID Bread 2 Cheese-Cheddar 4 5 ProductName StoreID Bread 2 Cheese-Cheddar 4 5 1. Make a copy of T3 T31 := T3

T3: Inventory Relation T31: Inventory Relation ProductName StoreID Bread 2 Cheese-Cheddar 4 5 ProductName StoreID Bread 2 Cheese-Cheddar 4 5 2. Compute products available at multiple stores A = π[T3.ProductName](T3 ⋈ [T3.ProductName = T31.ProductName AND T3.StoreID != T31.StoreID] T31) T3.ProductName T3.StoreID T31.ProductName T31.StoreID Cheese-Cheddar 4 5

T3: Inventory Relation A: Products available at multile stores ProductName StoreID Bread 2 Cheese-Cheddar 4 5 ProductName Cheese-Cheddar 3. Subtract stores in A from stores in T3 B = π[T3.ProductName] - A B: Products available at a single store ProductName Bread

Find all the store names whose products in their inventories are a subset of the inventory of some other store? T(ProductName, StoreName) ProductName StoreName Bread WalMart Cheese-Cheddar Meijer Schnucks Lettuce Your answer should be “Schnucks”

1. Compute all the possible pair of ProductName and StoreName
A =π[ProductName]T3  π[StoreName]T3 T3: Inventory Relation A: ProductName StoreName Bread WalMart Meijer Schnucks Cheese-Cheddar Lettuce ProductName StoreName Bread WalMart Cheese-Cheddar Meijer Schnucks Lettuce

Compute a list of ProductName-StoreName pair (p, s) where product p is NOT available at store s.
Negative Inventory Relation B = A - T3 T3: Inventory Relation B: ProductName StoreName Bread WalMart Meijer Schnucks Cheese-Cheddar Lettuce ProductName StoreName Bread WalMart Cheese-Cheddar Meijer Schnucks Lettuce

3. Take the natural join of T3 and B and project onto a pair of stores
T3: Inventory Relation ProductName StoreName Bread Meijer Schnucks Cheese-Cheddar WalMart Lettuce B: Negative Inventory Relation ProductName StoreName Bread WalMart Cheese-Cheddar Meijer Schnucks Lettuce C = π[T3.StoreName, B.StoreName] (T3 ⋈ B) ProductName T3.StoreName B.StoreName Bread WalMart Meijer Schnucks Cheese-Cheddar Lettuce

4. Compute all the possible pair of two different stores
T31 = T3, D(store1, store2) = π[StoreName] T3 ⋈[T3.StoreName != T31.StoreName] π[StoreName] T31 T3: Inventory Relation D: Store1 Store2 WalMart Meijer Schnucks Schunucks ProductName StoreName Bread WalMart Cheese-Cheddar Meijer Schnucks Lettuce

All the possible pair of two stores D(store1, store2)
Notice that store1  store2 and store1  store2 are mutually exclusive and collectively exhaustive D(store1, store2) - C(store1, store2) C(store1, store2) where store1  store2 where store1  store2 All the possible pair of two stores D(store1, store2)

5. Subtract D from C D: C: E = D - C Store1 Store2 WalMart Meijer
Schnucks Schunucks C: Store1 Store2 WalMart Meijer Schnucks E = D - C Store1 Store2 Schnucks Meijer

SQL What are the three clauses in a SQL query statement?
How to express a single relation query in SQL with RA? Do you know how conditions involving NULL are evaluated? Do you know how to disambiguate attribute names in the WHERE clause when the FROM clause contain multiple relations? Do you understand the semantics (meaning) of a multi-relation query in SQL in two different ways? What if a query needs two copies of the same relation? In which clauses can we use subqueries? Can you use IN, ALL, ANY, and EXISTS operators on the result relation of a subquery? How to express set operations such as union, intersect, and set difference in SQL? Which operations in SQL support bag semantics? Which operations in SQL support set semantics? How to remove duplicate tuples from the result of a SQL query? Do you known when a correlated subquery is evaluated?

SQL Aggregation/Grouping
What are five aggregation function in SQL? How aggregations in SQL handle NULL values? Is there any difference among the functions? How to partition the result relation in a SQL query into multiple groups? Where can you define conditions on each group? Which attributes can you refer to in the HAVING clause? Which attributes can you include in the SELECT clause? In which clauses can you use aggregate functions? How to write insertion, deletion, and update statements in SQL? How to create a new table in SQL? What is a view? Why some views are not updatable?

Constraints and Triggers
What’s the major difference between constraints and triggers? What are example constraints in SQL? How primary keys and a set of attributes declared as UNIQUE handle NULL values differently? In which situations a foreign-key constraint could be violated? What are three strategies to prevent dangling tuples? When an attribute-based or tuple-based CHECK evaluated? In what situations a DBMS cannot enforce the conditions in CHECKs? When are the condition in an ASSERTION checked? What are events in a TRIGGER statement? What is the difference between statement-level triggers and row-level ones?

CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1.

Similar presentations

Presentation on theme: "CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1.

Similar presentations

Presentation on theme: "CS411 Database Systems 08: Midterm Review Kazuhiro Minami 1."— Presentation transcript:

Similar presentations

About project

Feedback