Presentation is loading. Please wait.

Presentation is loading. Please wait.

Normal Forms. First Normal Form: all table cells must contain atomic values − no sets, arrays, lists, or other collection types − no structured objects.

Similar presentations


Presentation on theme: "Normal Forms. First Normal Form: all table cells must contain atomic values − no sets, arrays, lists, or other collection types − no structured objects."— Presentation transcript:

1 Normal Forms

2 First Normal Form: all table cells must contain atomic values − no sets, arrays, lists, or other collection types − no structured objects all relational databases satisfy first normal form by definition − the definition restricts attribute values to singletons from specified domains indeed, relations that do not satisfy first normal form are specifically called unnormalized relations

3 Definitions: Goal of functional dependency analysis is to decompose the universal relation into components which cannot become inconsistent with respect to any functional dependency constraint

4 Some requirements: + indicates union of attributes lossless join

5 ABC123524ABC123524 AB1252AB1252 BC2324BC2324 ABC123124523524ABC123124523524 spurious tuples Lossy join:

6 More definitions:

7 Whence functional dependencies: Trick: compute minimal cover directly from application FDs without calculating closure

8 Yet more definitions:

9 Yet more definitions (continued):

10 Armstrong's Axioms

11 Theorem: Armstrong's Axioms are both sound and complete. That is, if you apply them repeatedly to a given set of FDs, then (1) every new FD generated belongs to the closure (sound) (2) every FD in the closure will eventually be generated (complete)

12 Definition: Because inferences via Armstrong's Axioms yields precisely the FD set closure, this definition is the same as:

13 Generating the closure is computationally expensive because of its exponential size But, checking a given FD for inclusion in the closure is easy:

14 Algorithm uses only the application-driven FDs, not the closure

15 Other uses of the algorithm: key determination

16 Other uses of the algorithm: equivalence of two sets of FDs

17 Other uses of the algorithm: minimal cover computation:

18 Boyce-Codd Normal Form (BCNF):

19 Dependency preservation:

20 Example: BCNF decomposition process

21 no longer BCNF violators A is a superkey of first relation FDs no longer apply to the second But, still have a violator here

22 BCNF violators

23 BCNF violator

24

25 Aquarium database --- minimal cover sno → sname sno → sfood tno → tname tno → tvolume fno → fname fno → fcolor fno → fweight eno → edate eno → enote tvolume → tcolor eno → fno fno → tno fno → sno Only superkeys: supersets of eno ==> all FDs are BCNF violators except those with eno on left side BCNF: sno sname sfood tno tname tvolume tvolume tcolor fno fname fcolor fweight sno tno eno edate enote fno

26 Note: it suffices to look for BCNF violators in minimal cover of original application-driven FDs In applying Armstrong's axioms to generate FD closure, only augmentation and pseudotransitivity generate new left hand sides; none generate any new right-side attributes

27 BCNF decomposition is lossless, but not always dependency-preserving Example:

28 Example (continued):

29 In general: if decomposition does NOT split any FD from the minimal cover, then decomposition IS dependency-preserving if decomposition does split an FD from the minimal cover, then it may or may not be dependency-preserving (text presents an algorithm for deciding)

30 Second Normal Form:

31 Note: BCNF implies 2NF

32 Example: sno sname sfood tno tname tcolor tvolume only key is (sno, tno), but (sno → sname) and sname is non-prime have several 2NF violators signals mixing of application entities in a single table decompose into: sno sname sfoodtno tname tcolor tvolume Note: (1) tables with single-attribute keys cannot violate 2NF (2) tvolume --> tcolor is not a 2NF violator because tvolume is not part of a key sno → sname sno → sfood tno → tname tno → tvolume tvolume → tcolor eno → fno fno → tno fno → sno fno → fname fno → fcolor fno → fweight eno → edate eno → enote

33 Storage anomalies (irregularities): classify negative consequences of 2NF violators insertion anomaly − relation with a 2NF violator mixes two or more application entities − can't insert an instance of just one of them − e.g., in previous example, want to insert a dolphin species that is as yet unassociated with any tank... sno sname sfood tno tname tcolor tvolume 12 shark everything 25 lagoon blue 5000 12 shark everything 27 deep dive green 50000...... 17 dolphin herring ---- ------ ----- ------ insertion forces nulls into inappropriate attributes, such as part of the key (sno, tno)

34 Storage anomalies (irregularities): deletion anomaly − relation with a 2NF violator mixes two or more application entities − deleting last tank associated with shark species removes all information about the species update anomaly − change sfood attribute of a shark => update several tuples − invites inconsistency sno sname sfood tno tname tcolor tvolume 12 shark everything 25 lagoon blue 5000 12 shark everything 27 deep dive green 50000...... 2NF decomposition removes most of anomalies: sno sname sfood 12 shark everything 17 dolphin herring tno tname tcolor tvolume 25 lagoon blue 5000 27 deep dive green 50000

35 Storage anomalies (irregularities): there remains the negative effect of tvolume --> tcolor − if change all 5000 volume tanks to purple, must update several tuples − still invites inconsistency in general, anomalies arise when the contents of some cells can predict the content of others sno sname sfood 12 shark everything 17 dolphin herring tno tname tcolor tvolume 25 lagoon blue 5000 27 deep dive green 50000 84 puddle blue 5000 sno → sname sno → sfood tno → tname tno → tvolume tvolume → tcolor eno → fno fno → tno fno → sno fno → fname fno → fcolor fno → fweight eno → edate eno → enote predictable entry

36 Third Normal Form:

37 BCNF => 3NF => 2NF 3NF decomposition forces further decomposition in previous example: sno sname sfood 12 shark everything 17 dolphin herring tno tname tcolor tvolume 25 lagoon blue 5000 27 deep dive green 50000 84 puddle blue 5000 sno → sname sno → sfood tno → tname tno → tvolume tvolume → tcolor eno → fno fno → tno fno → sno fno → fname fno → fcolor fno → fweight eno → edate eno → enote tno tname tvolume 25 lagoon 5000 27 deep dive 50000 84 puddle 5000 tcolor tvolume blue 5000 green 50000

38 In practice, 3NF is usually BCNF only exception occurs when a non-superkey determines a prime attribute recall earlier example:

39 There always exists a lossless, dependency-preserving decomposition into 3NF Recall that the BCNF decomposition may not be dependency-preserving For proof of algorithm's validity, see text page 788.

40 Cannot always achieve dependency-preserving BCNF; may have to settle for 3NF and some redundancy excessive decomposition is possible, e.g., decompose species into two tables: (sno, sname) and (sno, sfood) − general idea is that each table represent one distinct application entity plus additional decomposition necessary to accommodate non-relationship constraints, such as tvolume --> tcolor there remain redundancies that persist through BCNF, e.g., fish weight in a given tank must sum to 1000 pounds − can predict weight of last fish in the tank ==> redundancy − constraint is not an FD ==> redundancy is not removed by decomposition FD analysis misses multivalued dependencies and join dependencies Some limitations of functional dependency analysis: e.g.,


Download ppt "Normal Forms. First Normal Form: all table cells must contain atomic values − no sets, arrays, lists, or other collection types − no structured objects."

Similar presentations


Ads by Google