© D. Wong
Ch. 3 (continued)
Database design problems
Functional Dependency
Keys of relations
Decompositions based on Functional Dependency (normalization)
–Boyce-Codd Normal Form (BCNF)
–Third Normal Form
–Recovering information from a decomposition
Multivalued Dependencies and decomposition
–Fourth Normal Form
Relationships Among Normal Forms

Database Design Problems
Principal kinds of anomalies that constitute bad database design:
–Redundancy
–Update anomalies
–Deletion anomalies
–Insertion anomalies
The goal is to find a good relation schema design for the relational model.

Redundancy
Information is repeated unnecessarily in several tuples.
Example:
–length and filmType for Movie

Update Anomalies
Information is changed in one tuple, but the same information is left unchanged in another.
A consequence of redundancy.
Causes potential inconsistency.
Example:
–Changing the length of Star Wars in one tuple but not in the others

Deletion Anomalies
When a set of values becomes empty (deleted), other information may be lost as a side effect.
Example:
–Deleting the only star listed for the movie Mighty Ducks

Insertion Anomalies
A tuple cannot be inserted because some of its data is not yet available.
The inverse of deletion anomalies.
Problems with using null values to fill in missing or unavailable data:
–When the data becomes available, will we remember to delete the tuple with nulls?
–If the missing data is part of a key, null cannot be used.
Example:
–Cannot start keeping track of a new movie’s information when the cast is not yet determined

Dependencies
A dependency is an assertion that only a subset of all possible relations are ‘legal’.
It is an assertion about the real world and cannot be proved.
It is a form of constraint.
A dependency in a relation implies some sort of redundancy in the legal relations.
Example: title, year → length
Functional dependencies (FD)
Multivalued dependencies (MVD)
title      year  length  filmType  studioName  starName
Star Wars                Color     Fox         Carrie Fisher
           1977  ???     Color     Fox         Mark Hamill

Functional Dependencies (FD)
Given: relation schema R(A1, …, An), and let X and Y be subsets of {A1, …, An}.
FD: X → Y means X functionally determines Y, e.g. A1 A2 … An → B1 B2 … Bm
X → Y is an assertion about R that whenever two tuples agree on all the attributes of X, they must also agree on all the attributes of Y.
It is a constraint on the data that may appear within a relation (i.e. schema-level control of data).
It is a restriction on relations that depends only on the equality or inequality of values, i.e. it is value-oblivious.
It is the most important type of value-oblivious constraint.
Value-oblivious constraints have the greatest impact on the design of database schemas.
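The definition of X → Y can be tried out on a small relation instance. The following is a minimal Python sketch, not part of the original slides: the function name `satisfies_fd` and the sample rows (including the length and year values) are made up for illustration. Note that an instance can only refute an FD; a passing check does not prove that the FD holds for the schema.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows: list of dicts mapping attribute name -> value (hypothetical layout)
    lhs, rhs: lists of attribute names, read as the FD lhs -> rhs
    """
    seen = {}  # lhs values -> rhs values first seen with them
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y on first sight of x, otherwise returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Illustrative data only; the values are assumptions, not taken from the slides.
movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Mark Hamill"},
    {"title": "Mighty Ducks", "year": 1991, "length": 104, "starName": "Emilio Estevez"},
]

holds_length = satisfies_fd(movies, ["title", "year"], ["length"])
holds_star = satisfies_fd(movies, ["title", "year"], ["starName"])
```

Here `holds_length` is True because both Star Wars tuples agree on length, while `holds_star` is False because the same title and year appear with two different stars.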

Significance of Functional Dependencies
Key:
1. If R(A1, …, An) is an entity set and X is a subset of {A1, …, An} that forms a key for R, then we may assert X → Y, where Y is any subset of {A1, …, An}.
2. If R(A1, …, An) is a many-one relationship from entity set E1 to entity set E2, and among the Ai’s are attributes that form a key X for E1 and a key Y for E2, then we may assert X → Y.
Tool for explaining the process of normalization.

Representing FD using Relational Algebra
Example 5.31
MovieStar(name, address, gender, birthdate)
FD: name → address
σ_{MS1.name = MS2.name AND MS1.address ≠ MS2.address}(ρ_{MS1}(MovieStar) × ρ_{MS2}(MovieStar)) = ∅

Keys in the E/R Model
A key for an entity set E is a set K of one or more attributes such that, given any two distinct entities e1 and e2 in E, e1 and e2 cannot have identical values for each of the attributes in the key K.
–A key can consist of more than one attribute.
–There can be more than one possible key (candidate keys), but one is designated as the “primary key”.
–Attributes that constitute the primary key cannot be null.
–If the entity set is in an isa-hierarchy, the root entity set is required to have all the attributes needed for a key.
–In an E/R diagram, the attributes of a key are underlined.

Keys of Relations
K is a key for relation R if:
1. K → all attributes of R
2. For no proper subset of K is (1) true (i.e. a key must be minimal)
If K satisfies at least (1), then K is a superkey.
Example: {title, year, starName} forms a key for the Movie relation
Superkeys: a set of attributes that contains a key is called a superkey
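Conditions (1) and (2) can be checked mechanically once a set of FDs is given. The sketch below is an illustration under stated assumptions: the helper names (`closure`, `is_superkey`, `is_key`) are hypothetical, and the FD `title year → length filmType studioName` is assumed for the Movie relation (the slides themselves only state `title, year → length`).

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds.

    fds: list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already in the result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(k, all_attrs, fds):
    """Condition (1): k determines every attribute of the relation."""
    return closure(k, fds) >= set(all_attrs)


def is_key(k, all_attrs, fds):
    """Conditions (1) and (2): k is a superkey and is minimal."""
    k = set(k)
    if not is_superkey(k, all_attrs, fds):
        return False
    # Dropping single attributes is enough: any proper subset of k lies inside
    # some k - {a}, and every superset of a superkey is itself a superkey.
    return all(not is_superkey(k - {a}, all_attrs, fds) for a in k)


# Assumed schema and FDs for the Movie example (illustration only).
movie_attrs = {"title", "year", "length", "filmType", "studioName", "starName"}
movie_fds = [(("title", "year"), ("length", "filmType", "studioName"))]

key_ok = is_key({"title", "year", "starName"}, movie_attrs, movie_fds)
whole_is_superkey = is_superkey(movie_attrs, movie_attrs, movie_fds)
whole_is_key = is_key(movie_attrs, movie_attrs, movie_fds)
```

Under these assumptions `{title, year, starName}` passes both conditions, while the full attribute set is a superkey but not a key, since it is not minimal.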

Discovering Keys for Relations
Consider relation R:
1. If R comes from an entity set, then the key for the relation is the key attributes of this entity set.
2. If R is from a many-many relationship, then the keys of both connected entity sets are the key attributes for R.
3. If R is from a many-one relationship from entity set E1 to entity set E2, then the key attributes of E1 are the key attributes for R.
4. If R is from a one-one relationship, then the key attributes for either of the connected entity sets are key attributes of R. There is more than one candidate key in this case.

FD Rules
Splitting / Combining rule
Trivial Dependencies
Armstrong’s Axioms:
1. Reflexivity
2. Augmentation
3. Transitivity
Computing closure of attributes
Finding all implied FDs (Ref. section 3.5.7)

Splitting / Combining rule
I: the single FD A1 A2 … An → B1 B2 … Bm
vs.
II: the set of FDs A1 A2 … An → B1, …, A1 A2 … An → Bm
Splitting rule: replace FD I by the set of FDs II
Combining rule: replace the set of FDs II by FD I
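The two rules are inverses of each other, which a short Python sketch can make concrete. The function names and the tuple representation of an FD are assumptions made for this illustration, and the example FD reuses Movie attribute names.

```python
def split_fd(lhs, rhs):
    """Splitting rule: one FD lhs -> B1...Bm becomes m FDs lhs -> Bi."""
    return [(tuple(lhs), (b,)) for b in rhs]


def combine_fds(fds):
    """Combining rule: merge FDs that share the same left side.

    Left sides are compared as sets, so attribute order does not matter;
    the merged FD lists its left side in sorted order.
    """
    combined = {}
    for lhs, rhs in fds:
        bucket = combined.setdefault(tuple(sorted(lhs)), [])
        for b in rhs:
            if b not in bucket:  # avoid repeating a right-side attribute
                bucket.append(b)
    return [(lhs, tuple(rhs)) for lhs, rhs in combined.items()]


fd_one = (("title", "year"), ("length", "filmType", "studioName"))
fd_set = split_fd(*fd_one)       # form II
round_trip = combine_fds(fd_set)  # back to form I
```

Splitting `fd_one` yields three FDs with a single attribute on the right, and combining them returns the original FD.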

Trivial Dependencies
FD A1 A2 … An → B is trivial if B is one of the A’s
Every trivial dependency holds in every relation
For FD A1 A2 … An → B1 B2 … Bm:
–Trivial if the B’s are a subset of the A’s, e.g. title year → title
–Nontrivial if at least one of the B’s is not among the A’s
–Completely nontrivial if none of the B’s is also one of the A’s
Trivial-dependency rule:
–A1 A2 … An → B1 B2 … Bm is equivalent to A1 A2 … An → C1 C2 … Ck, where the C’s are all those B’s that are not also A’s
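The three categories and the trivial-dependency rule translate directly into set operations. This is a small sketch with hypothetical function names, using Movie attributes as sample input.

```python
def classify_fd(lhs, rhs):
    """Classify lhs -> rhs as trivial, nontrivial, or completely nontrivial."""
    left, right = set(lhs), set(rhs)
    if right <= left:
        return "trivial"                 # every B is among the A's
    if right & left:
        return "nontrivial"              # some B's are A's, but not all of them
    return "completely nontrivial"       # no B is among the A's


def strip_trivial_part(lhs, rhs):
    """Trivial-dependency rule: drop right-side attributes that are also on the left."""
    return tuple(lhs), tuple(b for b in rhs if b not in lhs)


kind_a = classify_fd(("title", "year"), ("title",))
kind_b = classify_fd(("title", "year"), ("year", "length"))
kind_c = classify_fd(("title", "year"), ("length",))
simplified = strip_trivial_part(("title", "year"), ("year", "length"))
```

For example, `title year → year length` is nontrivial but not completely nontrivial; the rule reduces it to the equivalent `title year → length`.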

Armstrong’s Axioms: a set of inference rules (Pg. 135; 2nd ed. Pg. 99)
1. Reflexivity
If {B1, B2, …, Bm} ⊆ {A1, A2, …, An}, then A1 A2 … An → B1 B2 … Bm. // trivial dependencies
2. Augmentation
If A1 A2 … An → B1 B2 … Bm, then A1 A2 … An C1 … Ck → B1 B2 … Bm C1 … Ck for any set of attributes C1 … Ck
3. Transitivity
If A1 A2 … An → B1 B2 … Bm and B1 B2 … Bm → C1 … Ck, then A1 A2 … An → C1 … Ck
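To see how the axioms chain together, the sketch below encodes each rule as a function on FDs and derives AC → D from A → B and BC → D. The function names and the single-letter attributes are assumptions made for this illustration; an FD is represented as a pair of frozensets.

```python
def reflexivity(attrs, subset):
    """If subset ⊆ attrs, conclude attrs -> subset."""
    attrs, subset = frozenset(attrs), frozenset(subset)
    if not subset <= attrs:
        raise ValueError("reflexivity needs subset to be contained in attrs")
    return attrs, subset


def augmentation(fd, extra):
    """From X -> Y conclude XZ -> YZ for any attribute set Z (here: extra)."""
    lhs, rhs = fd
    extra = frozenset(extra)
    return frozenset(lhs) | extra, frozenset(rhs) | extra


def transitivity(fd1, fd2):
    """From X -> Y and Y -> Z conclude X -> Z."""
    (lhs1, rhs1), (lhs2, rhs2) = fd1, fd2
    if frozenset(rhs1) != frozenset(lhs2):
        raise ValueError("right side of fd1 must equal left side of fd2")
    return frozenset(lhs1), frozenset(rhs2)


# Given FDs (hypothetical): A -> B and BC -> D
a_to_b = (frozenset("A"), frozenset("B"))
bc_to_d = (frozenset("BC"), frozenset("D"))

step1 = augmentation(a_to_b, "C")      # AC -> BC
step2 = transitivity(step1, bc_to_d)   # AC -> D
trivial = reflexivity("AB", "A")       # AB -> A
```

Each derived FD is justified by exactly one axiom, so the list of steps doubles as a proof that AC → D follows from the two given FDs.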