Relational Schema Design II

Elmasri and Navathe CISC 332

Outline
Functional dependencies
Informal design guidelines
Normal forms

Functional Dependencies (FD)
A functional dependency is a relationship among attributes: given the value of one attribute, we can determine the value of another attribute.
Examples:
  Employee Number -> Bar they work at
  Drinker -> Date of birth
  Drinker -> Name of spouse

FD (Formally)
Notation: X -> Y
If t1 and t2 are any two tuples in r with t1[X] = t2[X], then we must also have t1[Y] = t2[Y].
The attribute(s) X determine the attribute(s) Y for the given relation.
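The formal definition can be tested against a concrete relation instance. The following is a minimal sketch, assuming a relation is held as a list of dictionaries; the function name fd_holds and the sample data are hypothetical and not from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this relation instance.

    rows is a list of dicts (the tuples of the relation); lhs and rhs
    are lists of attribute names.
    """
    seen = {}  # maps each X-value to the Y-value first seen with it
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample data, invented for illustration.
drinkers = [
    {"drinker": "Ann", "dob": "1990-01-01", "spouse": "Bob"},
    {"drinker": "Cal", "dob": "1985-06-15", "spouse": "Dee"},
    {"drinker": "Eve", "dob": "1990-01-01", "spouse": "Fay"},
]

print(fd_holds(drinkers, ["drinker"], ["dob"]))  # True
print(fd_holds(drinkers, ["dob"], ["drinker"]))  # False
```

A check like this can only refute a dependency. An FD is a statement about every legal state of the relation, so one instance that happens to satisfy it does not establish it.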

FDs (cont’d)
A functional dependency is a property of the semantics or meaning of the attributes.
An FD is a constraint on the data.
In every relation R(A1, A2, …, An) the primary key determines every attribute: PK -> A1, A2, …, An

Schemas: Good vs Bad
What is a good schema? What is a bad schema?
What should we be considering?
  Logical and physical aspects
  Guidelines for design
Note that there is no “measure” for relational design, only informal guidelines.

What is a good schema?
At the logical level…
  Easy to understand
  Helpful for formulating correct queries
At the physical storage level…
  Tuples are stored efficiently
  Tuples are accessed efficiently

Design approaches
Top-down
  Start with groupings of attributes obtained from the conceptual design and mapping
  Design by analysis is applied
Bottom-up
  Consider relationships between attributes
  Build up relations
  Also called design by synthesis

Informal Measures for Design
1. Semantics of the attributes.
2. Reducing the redundant values in tuples and avoiding update anomalies.
3. Reducing the null values in tuples.
4. Disallowing the possibility of generating spurious tuples.

1. Semantics of the attributes.
Design a relation schema so that it is easy to explain its meaning.
A relation schema should correspond to one semantic object (entity or relationship).
Example – What is clearer?
  Bar (Name, Address)
  Employee (StaffID, Name, Salary, Bar)
or
  Employee_works (StaffID, Name, Salary, BarName, Address)

2a – Reduce redundant data
Design has a significant impact on storage requirements.
Which scheme needs more storage: Bar and Employee, or Employee_works? Why?
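One way to approach the storage question is to count the attribute values each design has to store. This is a rough sketch with hypothetical data (three employees at one bar); the helper stored_values is an assumption made for illustration, and real storage cost also depends on value sizes and the storage engine.

```python
# Hypothetical data: three employees who all work at the same bar.
bars = [("The Pub", "1 Main St")]
employees = [
    (1, "Ann", 30000, "The Pub"),
    (2, "Bob", 32000, "The Pub"),
    (3, "Cal", 31000, "The Pub"),
]

# Single-relation design: the address is repeated in every employee tuple.
address_of = dict(bars)
employee_works = [
    (staff_id, name, salary, bar, address_of[bar])
    for staff_id, name, salary, bar in employees
]


def stored_values(relation):
    """Count the attribute values stored across all tuples of a relation."""
    return sum(len(t) for t in relation)


two_relations = stored_values(bars) + stored_values(employees)
one_relation = stored_values(employee_works)
address_copies = sum(1 for t in employee_works if t[4] == "1 Main St")

print(two_relations, one_relation, address_copies)  # 14 15 3
```

In this sketch the single relation stores the address once per employee instead of once per bar, so the gap grows with every additional employee at the same bar.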

2b – Avoid update anomalies
Relation schemes can suffer from update anomalies:
  Insertion anomaly
  Deletion anomaly
  Modification anomaly

Insertion anomaly
Insert a new employee into employee_works:
  We must keep the values for the bar name and address consistent with the other tuples for that bar.
Insert a new bar with no employees into employee_works:
  We would have to insert nulls for the employee info.
  We would have to delete this entry later, once the bar has a real employee.

Deletion anomaly
Delete the last employee for a bar from the employee_works relation.
If we delete the last employee for a bar from the database, all the bar information disappears as well.
This is like deleting the bar from the database.
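The deletion anomaly can be reproduced with an in-memory SQLite database. This is a sketch only: the column types and the sample row are assumptions, while the table and column names follow the Employee_works schema used earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_works "
    "(staffID INTEGER PRIMARY KEY, name TEXT, salary INTEGER, "
    "barName TEXT, address TEXT)"
)
# Hypothetical row: the only employee of this bar.
conn.execute(
    "INSERT INTO employee_works VALUES (1, 'Ann', 30000, 'The Pub', '1 Main St')"
)

# Removing the employee also removes the only record of the bar.
conn.execute("DELETE FROM employee_works WHERE staffID = 1")

remaining = conn.execute(
    "SELECT COUNT(*) FROM employee_works WHERE barName = 'The Pub'"
).fetchone()[0]
print(remaining)  # 0: the bar's address is gone along with the employee
```

With separate Bar and Employee relations, the same DELETE would leave the bar's tuple untouched.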

Modification anomaly
Update the address of a bar in the employee_works relation.
We would have to search out each employee that works at that particular bar and update the address information in each of those tuples.
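The cost of a modification, and what happens when one tuple is missed, can be sketched as follows. The data and the helper names update_address and addresses_for are hypothetical.

```python
# Hypothetical employee_works tuples: three employees at the same bar.
employee_works = [
    {"staffID": 1, "name": "Ann", "barName": "The Pub", "address": "1 Main St"},
    {"staffID": 2, "name": "Bob", "barName": "The Pub", "address": "1 Main St"},
    {"staffID": 3, "name": "Cal", "barName": "The Pub", "address": "1 Main St"},
]


def update_address(rows, bar, new_address):
    """Set the address in every tuple for the bar; return how many were touched."""
    touched = 0
    for t in rows:
        if t["barName"] == bar:
            t["address"] = new_address
            touched += 1
    return touched


def addresses_for(rows, bar):
    """Return the set of distinct addresses recorded for the bar."""
    return {t["address"] for t in rows if t["barName"] == bar}


# A careless update that changes only one tuple leaves two addresses on record.
employee_works[0]["address"] = "9 King St"
inconsistent = addresses_for(employee_works, "The Pub")

# The correct update has to visit every employee tuple for that bar.
touched = update_address(employee_works, "The Pub", "9 King St")
consistent = addresses_for(employee_works, "The Pub")

print(len(inconsistent), touched, consistent)  # 2 3 {'9 King St'}
```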

3. Reduce null values in tuples
Avoid attributes in relations whose values may often be null.
  Reduces the problem of “fat” relations
  Saves physical storage space
Example: don’t include a “bar_manager” field for each employee.

4. Avoid spurious tuples
Design relation schemes so that they can be joined with equality conditions on attributes that are either primary keys or foreign keys.
If you don’t, the join will generate spurious (incorrect) tuples.

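Spurious tuples (cont’d)
Suppose we replace
  Employee (staffID, salary, bar)
with
  Bar_data (name)
  Employee_data (staffID, salary)
Then Employee != Bar_data * Employee_data, where * is the natural join.
The two relations share no attribute, so the join pairs every employee with every bar.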
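The effect of this decomposition can be checked on a small instance. The sketch below uses hypothetical data; because Bar_data and Employee_data have no attribute in common, their natural join is computed here as a Cartesian product.

```python
from itertools import product

# Hypothetical Employee (staffID, salary, bar) instance.
employee = [(1, 30000, "The Pub"), (2, 32000, "Joe's")]

# The decomposition from the slide.
bar_data = sorted({(bar,) for _, _, bar in employee})
employee_data = [(staff_id, salary) for staff_id, salary, _ in employee]

# With no shared attribute, the natural join is the Cartesian product.
joined = [
    (staff_id, salary, bar)
    for (staff_id, salary), (bar,) in product(employee_data, bar_data)
]

spurious = [t for t in joined if t not in employee]
print(len(employee), len(joined), len(spurious))  # 2 4 2
```

Every original tuple is still in the join, but so are pairings that were never true, and nothing in the decomposed relations says which ones are real.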

Normalization
Based on the rule: one fact – one place
Process of ensuring a schema design is free of redundancy
Why is redundancy a bad thing?
Normalization is top-down, so it is considered relational design by analysis.

Normalization (cont’d)
Used to:
  Minimize redundancy
  Minimize update anomalies
We use normal form tests to determine the level of normalization for the scheme.

Normal Forms
1NF
2NF
3NF
Boyce-Codd NF
4NF
5NF

First Normal Form (1NF)
Now part of the formal definition of a relation (we already do this)
Attributes may only have atomic values (i.e. single values)
Disallows “relations within relations” or “relations as attributes of tuples”

Second Normal Form (2NF)
A relation is in 2NF if all of its nonkey attributes are fully dependent on the key.
This is known as full functional dependency.
A dependency on the key is full if removing any attribute from the key breaks the dependency.

2NF (cont’d)
Employee (staffID, bar, sName, sSalary, bar_address)
  staffID -> sName, sSalary
  bar -> bar_address
Decompose into:
  Employee (staffID, name, salary)
  Bar (name, address)
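The 2NF test for this example can be sketched as a search for partial dependencies. The slide text does not state the key, so the code assumes it is (staffID, bar). The function partial_dependencies is hypothetical and only inspects the FDs it is given; it does not derive implied dependencies.

```python
def partial_dependencies(key, fds):
    """Return the FDs whose left side is a proper subset of the key and
    whose right side includes at least one nonkey attribute."""
    key = frozenset(key)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        nonkey = rhs - key
        if lhs < key and nonkey:  # '<' is the proper-subset test
            violations.append((sorted(lhs), sorted(nonkey)))
    return violations


key = ["staffID", "bar"]  # assumed key; not stated in the slide text
fds = [
    (["staffID"], ["sName", "sSalary"]),
    (["bar"], ["bar_address"]),
]

print(partial_dependencies(key, fds))
# [(['staffID'], ['sName', 'sSalary']), (['bar'], ['bar_address'])]
```

Both dependencies hang off only part of the assumed key, which is why each one is moved into a relation of its own.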

Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and no nonkey attribute is transitively dependent on the key.
X -> Z is a transitive dependency when X -> Y and Y -> Z both hold, and Y is neither a candidate key nor a subset of any key.

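3NF
Employee (StaffID, name, salary, bar, bar_address)
  StaffID -> bar and bar -> bar_address, so bar_address depends on StaffID only transitively, through bar.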
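A simple 3NF test for this relation looks for a nonkey determinant. The sketch below assumes StaffID is the only candidate key and takes its FDs from the earlier slides (an employee determines the bar they work at, and a bar determines its address). The function transitive_dependencies is hypothetical and only inspects the FDs it is given.

```python
def transitive_dependencies(key, fds):
    """Return the FDs Y -> Z where Y is neither a superkey nor part of the
    key, and Z contains attributes outside both the key and Y.

    Assumes `key` is the relation's only candidate key.
    """
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        is_superkey = key <= lhs
        inside_key = lhs <= key
        dependents = rhs - key - lhs
        if not is_superkey and not inside_key and dependents:
            found.append((sorted(lhs), sorted(dependents)))
    return found


key = ["StaffID"]  # assumed to be the only candidate key
fds = [
    (["StaffID"], ["name", "salary", "bar"]),
    (["bar"], ["bar_address"]),
]

print(transitive_dependencies(key, fds))  # [(['bar'], ['bar_address'])]
```

The reported dependency suggests moving bar_address into a separate relation keyed by bar, leaving bar in Employee as a foreign key.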

Example
bar (bar, bAddress, bPhones, beer, price, drinkerName, dGender, dDOB, dPlates, spouseName, spouseDOB)
Primary key = bar, drinkerName

1NF – Only atomic values
Both bPhones and dPlates are multi-valued, so they must be moved out of bar into relations of their own:
  barPhones (bar, phone)
  drinkerPlates (drinker, plate)
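Moving a multi-valued attribute into its own relation amounts to emitting one tuple per value. A minimal sketch for bPhones, using hypothetical data; the same pattern would apply to dPlates.

```python
# Hypothetical bar data with a multi-valued bPhones attribute.
bars = [
    {"bar": "The Pub", "bAddress": "1 Main St", "bPhones": ["555-0100", "555-0101"]},
    {"bar": "Joe's", "bAddress": "2 Side St", "bPhones": ["555-0199"]},
]

# barPhones (bar, phone): one atomic phone value per tuple.
bar_phones = [(b["bar"], phone) for b in bars for phone in b["bPhones"]]

# What is left of the bar data once the multi-valued attribute is taken out.
bar_rest = [(b["bar"], b["bAddress"]) for b in bars]

print(bar_phones)
# [('The Pub', '555-0100'), ('The Pub', '555-0101'), ("Joe's", '555-0199')]
```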

2NF – full functional dependency
  bar -> bAddress
  bar, beer -> price
  drinkerName -> dGender, dDOB, spouseName, spouseDOB
Note: spouseName is a non-key attribute

3NF – remove transitive dependencies
drinker (name, gender, DOB, spouseName, spouseDOB)
  name -> gender, DOB, spouseName
  spouseName -> spouseDOB
spouseDOB depends on name only through spouseName, so it is a transitive dependency.
