Decomposition and Higher Forms of Normalization Chapter 15.8 (skim) 1/17/2019
"Lossless" Joins The main idea: if you decompose a relation schema, then join the parts of an instance via a natural join, you might get more rows than you started with, i.e., spurious tuples This is bad! Called a "lossy join". Goal: decompositions which produce only "lossless" joins "non-additive" join is more descriptive 1/17/2019
Preserving FDs What if, when a relation is decomposed, the X of an XY ends up only in one of the new relations and the Y ends up only in another? Such a decomposition is not “dependency-preserving.” Goal: Always have FD-preserving decompositions 1/17/2019
Decomposing R Into BCNF Pick an X-> A which violates BCNF, where A is a single attribute. Decompose so that R1={X,A} is in one relation, R2=R-A is in the other. Repeat if necessary on R2. Guaranteed lossless... but may lose dependencies SBD with SB->D, D->B 1/17/2019
Decomposing R into 3NF The algorithm is much more complicated 1. Get a “minimal cover” of FDs 2. Find a lossless-join decomposition of R (which might miss dependencies) 3. Add additional relations to the decomposition to cover any missing FDs of the cover Result will be lossless, will be dependency-preserving 3NF; might not be BCNF 1/17/2019
Fact of life... Finding a decomposition which is both lossless and dependency-preserving is not always possible. 1/17/2019
Contracts schema CSJDPQV Contracts (cid, supplierid, projectid, deptid, partid, qty, value) cid is a key; JP -> C; SD -> P What are candidate keys? BCNF, 3NF decompositions? What if J->S is added? 1/17/2019
Multivalued Dependencies (MVDs) XY means that given X, there is a unique set of possible Y values (which do not depend on other attributes of the relation) PARENTNAMECHILDNAME An FD is also a MVD MVD problems arise if there are two independent 1:N relationships in a relation. 1/17/2019
Textbook Example (3-way pun) course, teacher, book Each teacher teaches a set of courses teacher courses Each course has a recommended set of books (independent of teacher) course book No FDs! 1/17/2019
MVD Theory A whole theory exists, analogous to FD theory Armstrong’s axioms plus 5 additional rules 1/17/2019
Fourth Normal Form A relation R is in 4NF if for every nontrivial XY, X is a superkey of R. Decomposition into 4NF: If there is a non-trivial XY, form one relation with only X and Y, and another with R-Y. In other words, allow only one MVD per relation This will be lossless, but not necessarily FD-preserving. Achieving 4NF is a trade-off 1/17/2019
Fifth Normal Form Sometimes a relation cannot be losslessly decomposed into two relations, but can be into three or more. 5NF captures the idea that a relation scheme must have some particular lossless decomposition ("join dependency"). Finding actual 5NF cases is difficult. 1/17/2019
Normalization Summary 1NF: usually part of the woodwork even so, know how to decompose 2NF: usually skipped but lots of defs. that make great exam Q's! 3NF: a biggie Always aim for this BCNF and 4NF: tradeoffs start here in re: d-preserving and losslessness 5NF: You can say you've heard of it... 1/17/2019
Caveat Normalization is not the be-all and end-all of DB design Example: suppose attributes A and B are always used together, but normalization theory says they should be in different tables. Decomposition might produce unacceptable performance loss (extra disk reads) Plus -- there are constraints other than FDs and MVDs 1/17/2019
Looking Ahead: One Query Is Usually Not Enough A meaningful unit of DB work might involve doing one or more queries, checking constraints, performing calculations, interacting with user, etc. Such a unit is termed a "transaction" "Transaction Processing" will be our next topic 1/17/2019
Current Trends Object DBs and Object-Relational DB’s Data Warehouses May permit complex attributes 1st normal form unnecessary Data Warehouses Huge historical databases, seldom or never updated after creation Joins expensive or impractical Argues against normalization Every-day Relational DBs Aim for BCNF, settle for 3NF 1/17/2019