Chapter 3 The Relational Model and Normalization CS420 Western Oregon University
Relations Relational DBMS products store data in the form of relations, a special type of table A relation is a two-dimensional table that has the following characteristics Rows contain data about an entity Columns contain data about attributes of the entity All entries in a column are of the same kind Each column has a unique name Cells of the table hold a single value The order of the columns is unimportant The order of the rows is unimportant No two rows may be identical Although not all tables are relations, the terms table and relation are normally used interchangeably Table/row/column = file/record/field = relation/tuple/attribute
Example: Relation
Example: Tables Not Relations
Types of Keys A key is one or more columns of a relation that identifies a row Composite key is a key that contains two or more attributes A Surrogate Key is an artificial column added to a table to serve as the primary key A foreign key is a column or set of columns that is the primary of another table A relation has one unique primary key and may also have additional unique keys called candidate keys Primary key is used to Represent the table in relationships Organize table storage Generate indexes Sometimes, relations are denoted by showing the name of the relation followed by the columns of the relation in parentheses. The primary key of the relation is underlined.
Functional Dependencies A functional dependency occurs when the value of one (set of) attribute(s) determines the value of a second (set of) attribute(s) The attribute on the left side of the functional dependency is called the determinant SID GPA SKU Department, SKU_Description (CustomerNumber, ItemNumber, Quantity) Price While a primary key is always a determinant, a determinant is not necessarily a primary key
Normalization Normalization eliminates modification anomalies Deletion anomaly: deletion of a row loses information about two or more entities Insertion anomaly: insertion of a fact in one entity cannot be done until a fact about another entity is added Anomalies can be removed by splitting the relation into two or more relations; each with a different, single theme Normalization works through classes of relations called normal forms
Relationship of Normal Forms
Normal Forms Any table of data is in 1NF if it meets the definition of a relation A relation is in 2NF if all its non-key attributes are dependent on all of the key (no partial dependencies) If a relation has a single attribute key, it is automatically in 2NF A relation is in 3NF if it is in 2NF and has no transitive dependencies A relation is in BCNF if every determinant is a candidate key A relation is in fourth normal form if it is in BCNF and has no multi-value dependencies
Example: 2NF
Example: NOT IN 3NF
Example: 3NF
Example: NOT IN BCNF
Example: BCNF
Example: NOT IN 4NF
Example: 4NF
DK/NF First published in 1981 by Fagin DK/NF has no modification anomalies; so no higher normal form is needed A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of keys and domains
Example 1: DK/NF
Example: DK/NF
Example 2: DK/NF
Example: DK/NF
Normal Forms-Review Any table of data is in 1NF if it meets the definition of a relation A relation is in 2NF if all its non-key attributes are dependent on all of the key (no partial dependencies) If a relation has a single attribute key, it is automatically in 2NF A relation is in 3NF if it is in 2NF and has no transitive dependencies A relation is in BCNF if every determinant is a candidate key A relation is in fourth normal form if it is in BCNF and has no multi-value dependencies
The Synthesis of Relations Given a set of attributes with certain functional dependencies, what relations should we form? Example: A and B are two attributes If A B and B A A and B have a one-to-one attribute relationship If A B, but B not A A and B have a many-to-one attribute relationship If A not B and B not A A and B have a many-to-many attribute relationship
Types of Attribute Relationship
One-to-One Attribute Relationships Attributes that have a one-to-one relationship must occur together in at least one relation Call the relation R and the attributes A and B: Either A or B must be the key of R An attribute can be added to R if it is functionally determined by A or B An attribute that is not functionally determined by A or B cannot be added to R A and B must occur together in R, but should not occur together in other relations Either A or B should be consistently used to represent the pair in relations other than R
Many-to-One Attribute Relationships Attributes that have a many-to-one relationship can exist in a relation together Assume C determines D in relation S C must be the key of S An attribute can be added to S if it is determined by C An attribute that is not determined by C cannot be added to S
Many-to-Many Attribute Relationships Attributes that have a many-to-many relationship can exist in a relation together Assume attributes E and F reside together in relation T The key of T must be (E, F) An attribute can be added to T if it is determined by the combination (E, F) An attribute may not be added to T if it is not determined by the combination (E, F) If adding a new attribute, G, expands the key to (E, F, G), then the theme of the relation has been changed Either G does not belong in T or the name of T must be changed to reflect the new theme
De-normalized Designs When a normalized design is unnatural, awkward, or results in unacceptable performance, a de-normalized design is preferred Example Normalized relation CUSTOMER (CustNumber, CustName, Zip) CODES (Zip, City, State) De-Normalized relations CUSTOMER (CustNumber, CustName, City, State, Zip)