Download presentation
Presentation is loading. Please wait.
Published byAino Laaksonen Modified over 5 years ago
1
Decomposition and Higher Forms of Normalization
Chapter 15.8 (skim) 1/17/2019
2
"Lossless" Joins The main idea: if you decompose a relation schema, then join the parts of an instance via a natural join, you might get more rows than you started with, i.e., spurious tuples This is bad! Called a "lossy join". Goal: decompositions which produce only "lossless" joins "non-additive" join is more descriptive 1/17/2019
3
Preserving FDs What if, when a relation is decomposed, the X of an XY ends up only in one of the new relations and the Y ends up only in another? Such a decomposition is not “dependency-preserving.” Goal: Always have FD-preserving decompositions 1/17/2019
4
Decomposing R Into BCNF
Pick an X-> A which violates BCNF, where A is a single attribute. Decompose so that R1={X,A} is in one relation, R2=R-A is in the other. Repeat if necessary on R2. Guaranteed lossless... but may lose dependencies SBD with SB->D, D->B 1/17/2019
5
Decomposing R into 3NF The algorithm is much more complicated
1. Get a “minimal cover” of FDs 2. Find a lossless-join decomposition of R (which might miss dependencies) 3. Add additional relations to the decomposition to cover any missing FDs of the cover Result will be lossless, will be dependency-preserving 3NF; might not be BCNF 1/17/2019
6
Fact of life... Finding a decomposition which is both lossless and dependency-preserving is not always possible. 1/17/2019
7
Contracts schema CSJDPQV
Contracts (cid, supplierid, projectid, deptid, partid, qty, value) cid is a key; JP -> C; SD -> P What are candidate keys? BCNF, 3NF decompositions? What if J->S is added? 1/17/2019
8
Multivalued Dependencies (MVDs)
XY means that given X, there is a unique set of possible Y values (which do not depend on other attributes of the relation) PARENTNAMECHILDNAME An FD is also a MVD MVD problems arise if there are two independent 1:N relationships in a relation. 1/17/2019
9
Textbook Example (3-way pun)
course, teacher, book Each teacher teaches a set of courses teacher courses Each course has a recommended set of books (independent of teacher) course book No FDs! 1/17/2019
10
MVD Theory A whole theory exists, analogous to FD theory
Armstrong’s axioms plus 5 additional rules 1/17/2019
11
Fourth Normal Form A relation R is in 4NF if for every nontrivial XY, X is a superkey of R. Decomposition into 4NF: If there is a non-trivial XY, form one relation with only X and Y, and another with R-Y. In other words, allow only one MVD per relation This will be lossless, but not necessarily FD-preserving. Achieving 4NF is a trade-off 1/17/2019
12
Fifth Normal Form Sometimes a relation cannot be losslessly decomposed into two relations, but can be into three or more. 5NF captures the idea that a relation scheme must have some particular lossless decomposition ("join dependency"). Finding actual 5NF cases is difficult. 1/17/2019
13
Normalization Summary
1NF: usually part of the woodwork even so, know how to decompose 2NF: usually skipped but lots of defs. that make great exam Q's! 3NF: a biggie Always aim for this BCNF and 4NF: tradeoffs start here in re: d-preserving and losslessness 5NF: You can say you've heard of it... 1/17/2019
14
Caveat Normalization is not the be-all and end-all of DB design
Example: suppose attributes A and B are always used together, but normalization theory says they should be in different tables. Decomposition might produce unacceptable performance loss (extra disk reads) Plus -- there are constraints other than FDs and MVDs 1/17/2019
15
Looking Ahead: One Query Is Usually Not Enough
A meaningful unit of DB work might involve doing one or more queries, checking constraints, performing calculations, interacting with user, etc. Such a unit is termed a "transaction" "Transaction Processing" will be our next topic 1/17/2019
16
Current Trends Object DBs and Object-Relational DB’s Data Warehouses
May permit complex attributes 1st normal form unnecessary Data Warehouses Huge historical databases, seldom or never updated after creation Joins expensive or impractical Argues against normalization Every-day Relational DBs Aim for BCNF, settle for 3NF 1/17/2019
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.