Relational Schema Design II

1 Relational Schema Design II
Elmasri and Navathe CISC 332

2 Outline
Functional dependencies
Informal design guidelines
Normal forms

3 Functional Dependencies (FD)
A functional dependency is a relationship among attributes: given the value of one attribute, we can determine the value of another attribute.
  Employee Number -> Bar they work at
  Drinker -> Date of birth
  Drinker -> Name of spouse

4 FD (Formally)
Notation: X -> Y
If t1 and t2 are tuples in r with t1[X] = t2[X], then we must also have t1[Y] = t2[Y].
The attribute(s) X determine the attribute(s) Y for the given relation.
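
The tuple-pair test above can be sketched in a few lines of Python. This is only an illustration: the function name fd_holds and the sample rows are hypothetical and not part of the course material.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]

def fd_holds(rows: Iterable[Row], lhs: List[str], rhs: List[str]) -> bool:
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # t1[X] = t2[X] but t1[Y] != t2[Y]
        seen.setdefault(x, y)
    return True

# Hypothetical relation instance
drinkers = [
    {"drinker": "Ann", "dob": "1980-01-01", "bar": "Joe's"},
    {"drinker": "Ann", "dob": "1980-01-01", "bar": "Moe's"},
    {"drinker": "Bob", "dob": "1975-06-30", "bar": "Joe's"},
]

print(fd_holds(drinkers, ["drinker"], ["dob"]))  # True
print(fd_holds(drinkers, ["drinker"], ["bar"]))  # False
```

A check like this can only refute a dependency on the data at hand. Whether an FD really holds depends on what the attributes mean, not on one instance.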

5 FDs (cont’d)
A functional dependency is a property of the semantics (meaning) of the attributes.
An FD is a constraint on the data.
In every relation R(A1, A2, …, An) with primary key PK, the FD PK -> A1, A2, …, An holds.

6 Schemas: Good vs Bad
What is a good schema? What is a bad schema?
What should we be considering?
  Logical and physical aspects
  Guidelines for design
Note that there is no “measure” for relational design, only informal guidelines.

7 What is a good schema?
At the logical level…
  Easy to understand
  Helpful for formulating correct queries
At the physical storage level…
  Tuples are stored efficiently
  Tuples are accessed efficiently

8 Design approaches
Top-down
  Start with groupings of attributes obtained from the conceptual design and mapping
  Design by analysis is applied
Bottom-up
  Consider relationships between attributes
  Build up relations
  Also called design by synthesis

9 Informal Measures for Design
1. Semantics of the attributes.
2. Reducing redundant values in tuples and avoiding update anomalies.
3. Reducing null values in tuples.
4. Disallowing the possibility of generating spurious tuples.

10 1. Semantics of the attributes.
Design a relation schema so that it is easy to explain its meaning.
A relation schema should correspond to one semantic object (entity or relationship).
Example – What is clearer?
  Bar (Name, Address)
  Employee (StaffID, Name, Salary, Bar)
or
  Employee_works (StaffID, Name, Salary, BarName, Address)

11 2a Reduce redundant data
Design has a significant impact on storage requirements.
Which scheme needs more storage?
  Bar and Employee
or
  Employee_works
Why?
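
One rough way to explore the storage question is to count stored attribute values under each design. The sketch below assumes a hypothetical instance with three employees who all work at one bar. The names and figures are invented for illustration.

```python
# Hypothetical instance: one bar, three employees.
bar = [("Joe's", "12 Main St")]
employee = [
    (1, "Ann", 30000, "Joe's"),
    (2, "Bob", 32000, "Joe's"),
    (3, "Cy", 31000, "Joe's"),
]

# The single-relation design repeats the bar's address in every employee tuple.
employee_works = [
    (staff_id, name, salary, bar_name, address)
    for (staff_id, name, salary, bar_name) in employee
    for (b_name, address) in bar
    if b_name == bar_name
]

def stored_values(*relations):
    """Count attribute values stored across the given relations."""
    return sum(len(t) for r in relations for t in r)

split = stored_values(bar, employee)      # 2 + 12 values
combined = stored_values(employee_works)  # 3 * 5 values
address_copies = sum(1 for t in employee_works if t[4] == "12 Main St")

print(split, combined, address_copies)  # 14 15 3
```

In this instance the address is stored once in Bar but three times in Employee_works, and the gap grows with every additional employee at the same bar.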

12 2b – Avoid update anomalies
Relation schemes can suffer from update anomalies:
  Insertion anomaly
  Deletion anomaly
  Modification anomaly

13 Insertion anomaly
Insert a new employee into employee_works
  We must keep the values for the bar name and address consistent between tuples.
Insert a new bar with no employees into employee_works
  We would have to insert nulls for the employee info.
  We would have to delete this entry later, once the bar has its first real employee.

14 Deletion anomaly
Delete the last employee for a bar from the employee_works relation.
  If we delete the last employee for a bar from the database, all the bar information disappears as well.
  This is like deleting the bar from the database.

15 Modification Anomaly
Update the address of a bar in the employee_works relation.
  We would have to find every employee who works at that bar and update the address in each of those tuples.
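
The deletion and modification anomalies can be reproduced on a small in-memory copy of employee_works. The rows and helper names below are hypothetical, and plain Python lists stand in for a real database table.

```python
# Hypothetical employee_works instance: bar data is repeated per employee.
employee_works = [
    {"staffID": 1, "name": "Ann", "bar": "Joe's", "address": "12 Main St"},
    {"staffID": 2, "name": "Bob", "bar": "Joe's", "address": "12 Main St"},
    {"staffID": 3, "name": "Cy", "bar": "Moe's", "address": "9 Elm St"},
]

def bars(rows):
    """Bars the relation still knows about."""
    return {r["bar"] for r in rows}

def addresses_of(rows, bar):
    """Distinct addresses recorded for one bar."""
    return {r["address"] for r in rows if r["bar"] == bar}

# Deletion anomaly: removing the only employee of Moe's removes Moe's itself.
after_delete = [r for r in employee_works if r["staffID"] != 3]
print(sorted(bars(after_delete)))  # ["Joe's"]

# Modification anomaly: updating just one tuple leaves Joe's with two addresses.
after_update = [dict(r) for r in employee_works]
after_update[0]["address"] = "40 King St"
print(sorted(addresses_of(after_update, "Joe's")))  # ['12 Main St', '40 King St']
```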

16 3. Reduce null values in tuples
Avoid attributes in relations whose values may often be null
  Reduces the problem of “fat” relations
  Saves physical storage space
Don’t include a “bar_manager” field for each employee

17 4. Avoid spurious tuples
Design relation schemes so that they can be joined with equality conditions on attributes that are either primary or foreign keys.
If you don’t, spurious (incorrect) tuples will be generated.

18 Spurious tuples (cont’d)
Suppose we replace
  Employee (staffID, salary, bar)
with
  Bar_data (name)
  Employee_data (staffID, salary)
then Employee != Bar_data * Employee_data (where * is the natural join), because the two new relations share no attribute to join on.
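
A small sketch of why the join fails to rebuild Employee, using Python sets of tuples as a stand-in for relations. The sample rows are hypothetical. With no shared attribute, the natural join pairs every Employee_data tuple with every Bar_data tuple.

```python
# Hypothetical Employee (staffID, salary, bar) instance.
employee = {(1, 30000, "Joe's"), (2, 32000, "Moe's")}

# The proposed decomposition: two projections with no attribute in common.
bar_data = {(bar,) for (_, _, bar) in employee}
employee_data = {(sid, sal) for (sid, sal, _) in employee}

# Natural join with no shared attributes is a Cartesian product.
rejoined = {(sid, sal, bar) for (sid, sal) in employee_data for (bar,) in bar_data}

# Tuples in the join that were never in Employee are spurious.
spurious = rejoined - employee

print(len(rejoined))     # 4
print(sorted(spurious))  # [(1, 30000, "Moe's"), (2, 32000, "Joe's")]
```

Every original tuple is still present in the join, but it can no longer be told apart from the spurious ones.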

19 Normalization
Based on the rule: one fact – one place
Process of ensuring a schema design is free of redundancy
  Why is redundancy a bad thing?
Normalization is top-down, so it is considered relational design by analysis

20 Normalization (cont’d)
Used to
  Minimize redundancy
  Minimize update anomalies
We use normal form tests to determine the level of normalization for the scheme

21 Normal Forms
1NF
2NF
3NF
Boyce-Codd NF
4NF
5NF

22 First Normal Form (1NF)
Now part of the formal definition of a relation (we already do this)
Attributes may only have atomic values (i.e. single values)
Disallows “relations within relations” or “relations as attributes of tuples”

23 Second Normal Form (2NF)
A relation is in 2NF if all of its nonkey attributes are fully dependent on the key.
This is known as full functional dependency.
A dependency X -> Y is full if removing any attribute from X breaks the dependency.

24 2NF (cont’d)
Employee (staffID, bar, sName, sSalary, bar_address)
  staffID -> sName, sSalary
  bar -> bar_address
Decomposed into:
  Employee (staffID, name, salary)
  Bar (name, address)
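
The decomposition can be sketched as projections over a hypothetical instance. Two assumptions are made here that the slide does not state: the key of the original relation is (staffID, bar), and a linking relation (called works below, a hypothetical name) is kept on that key so the original tuples can be rebuilt.

```python
# Hypothetical instance of Employee (staffID, bar, sName, sSalary, bar_address).
rows = [
    (1, "Joe's", "Ann", 30000, "12 Main St"),
    (1, "Moe's", "Ann", 30000, "9 Elm St"),
    (2, "Joe's", "Bob", 32000, "12 Main St"),
]

# Project onto each determinant and the attributes it fully determines.
employee = {(sid, name, sal) for (sid, _, name, sal, _) in rows}  # staffID -> name, salary
bar = {(b, addr) for (_, b, _, _, addr) in rows}                  # bar -> bar_address
works = {(sid, b) for (sid, b, _, _, _) in rows}                  # assumed key (staffID, bar)

# Rejoin through the linking relation to confirm nothing was lost or invented.
emp_by_id = {sid: (name, sal) for (sid, name, sal) in employee}
addr_by_bar = dict(bar)
rejoined = {(sid, b, *emp_by_id[sid], addr_by_bar[b]) for (sid, b) in works}

print(len(employee), len(bar), len(works))  # 2 2 3
print(rejoined == set(rows))                # True
```

Each employee's name and salary, and each bar's address, is now stored once rather than once per (employee, bar) pair.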

25 Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and has no transitive dependencies.
A dependency X -> Z is transitive if there is a set of attributes Y, neither a candidate key nor a subset of any key, such that X -> Y and Y -> Z both hold.

26 3NF Employee (StaffID, name, salary, bar, bar_address)

27 Example
bar (bar, bAddress, bPhones, beer, price, drinkerName, dGender, dDOB, dPlates, spouseName, spouseDOB)
Primary key = bar, drinkerName

28 1NF – Only atomic values
Both bPhones and dPlates are multi-valued, so they must be moved out of bar into their own relations:
  barPhones (bar, phone)
  drinkerPlates (drinker, plate)
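
As a rough illustration of the 1NF step, the sketch below flattens a multi-valued bPhones attribute into (bar, phone) tuples. The sample bars and phone numbers are hypothetical.

```python
# Hypothetical bars whose bPhones attribute holds several values (not 1NF).
bars = [
    {"bar": "Joe's", "bPhones": ["555-0100", "555-0101"]},
    {"bar": "Moe's", "bPhones": ["555-0199"]},
]

# barPhones (bar, phone): one tuple per phone, every value atomic.
bar_phones = [(b["bar"], phone) for b in bars for phone in b["bPhones"]]

print(bar_phones)
# [("Joe's", '555-0100'), ("Joe's", '555-0101'), ("Moe's", '555-0199')]
```

The same pattern would apply to dPlates and drinkerPlates.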

29 2NF – full functional dependency
bar -> bAddress
bar, beer -> price
drinkerName -> dGender, dDOB, spouseName, spouseDOB
Note: spouseName is a non-key attribute

30 3NF – remove transitive dependencies
drinker (name, gender, DOB, spouseName, spouseDOB)
  name -> gender, DOB, spouseName
  spouseName -> spouseDOB
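
One possible way to remove the transitive dependency is to move spouseDOB into a relation of its own, keyed on spouseName. The slide does not show the resulting relations, so the relation name spouse and the sample rows below are assumptions made for illustration.

```python
# Hypothetical instance of drinker (name, gender, DOB, spouseName, spouseDOB).
drinker = [
    ("Ann", "F", "1980-01-01", "Raj", "1979-03-15"),
    ("Bob", "M", "1975-06-30", "Eve", "1976-11-02"),
]

# name -> gender, DOB, spouseName stays with the drinker.
drinker_3nf = {(n, g, d, s) for (n, g, d, s, _) in drinker}

# spouseName -> spouseDOB moves to its own (assumed) relation.
spouse = {(s, sd) for (_, _, _, s, sd) in drinker}

# Joining on spouseName, a key of spouse, rebuilds the original tuples.
spouse_dob = dict(spouse)
rejoined = {(n, g, d, s, spouse_dob[s]) for (n, g, d, s) in drinker_3nf}

print(rejoined == set(drinker))  # True
```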
