Part IV: Logical Database Design
2: Database Systems
Logical Database Design
  - The process of transforming the conceptual data model (i.e., ERDs) into a logical database model (i.e., relational)
  - A logical database model is a design that conforms to the data model for a class of DBMS

Review: Data Models
  - Hierarchical
  - Network
  - Relational
  - Object-oriented
Overview of Logical Design
  - Represent entities
    - Each entity type in an ERD is represented as a relation
  - Represent relationships
    - Each relationship in the ERD must be represented in the relational model
  - Normalize relations
    - Relations must be refined to avoid unnecessary redundancies and anomalies
  - Merge relations
    - Redundant relations must be merged
Relational Database Model
  - Data is stored in relations (entities)
  - A relation consists of tuples/rows (instances) and attributes
  - Goal: to store data without unnecessary redundancy and to be able to process information easily

Components of Relational Database Model
  - Data structure
    - Data are organized in the form of tables
  - Data manipulation
    - Powerful data manipulation operations are used
  - Data integrity
    - Business rules are included to maintain data integrity
Keys
  - Key
    - Minimal set of attributes that uniquely identifies each row in a relation
  - Composite key
    - A key consisting of more than one attribute
  - Candidate key
    - Any set of attributes that could be chosen as a key of a relation
    - Should be unique and non-redundant
  - Primary key
    - The candidate key designated for principal use in uniquely identifying rows in a relation
  - Foreign key
    - A set of attributes in one relation that constitutes a key in some other (possibly the same) relation
    - Used to indicate logical links between relations
Foreign Key

EMP (primary key: EMPNO; foreign key: DEPTNO)
EMPNO   ENAME    DEPTNO
------  -------  -------
7839    KING     10
7698    BLAKE    30
7782    CLARK    10
7566    JONES    20
7654    MARTIN   30
7499    ALLEN    30
7844    TURNER   30
7900    JAMES    30
7521    WARD     30
7902    FORD     20
7369    SMITH    20
...

DEPT (primary key: DEPTNO)
DEPTNO   DNAME       LOC
-------  ----------  --------
10       ACCOUNTING  NEW YORK
20       RESEARCH    DALLAS
30       SALES       CHICAGO
...
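The link between EMP.DEPTNO and DEPT.DEPTNO can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the lowercase table names, the `insert_emp` helper, and the row for employee 9999 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT, loc TEXT)")
conn.execute(
    "CREATE TABLE emp ("
    " empno INTEGER PRIMARY KEY,"
    " ename TEXT,"
    " deptno INTEGER REFERENCES dept(deptno))"  # foreign key to DEPT
)
conn.executemany(
    "INSERT INTO dept VALUES (?, ?, ?)",
    [(10, "ACCOUNTING", "NEW YORK"), (20, "RESEARCH", "DALLAS"), (30, "SALES", "CHICAGO")],
)
conn.execute("INSERT INTO emp VALUES (7839, 'KING', 10)")


def insert_emp(empno, ename, deptno):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (empno, ename, deptno))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_emp(7698, "BLAKE", 30)   # department 30 exists in DEPT
rejected = insert_emp(9999, "NOBODY", 40)  # hypothetical row: no department 40
```

The second insert fails because its DEPTNO value does not match any primary key value in DEPT, which is the logical link the foreign key is meant to protect.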
Relations
  - A named, two-dimensional table of data
  - Consists of a set of named columns and an arbitrary number of unnamed rows
  - Can be expressed as: RELATION (attribute1, attribute2, …)
  - Example: STUDENT (ID_Num, Name, Address, Birthday)

Properties of Relations
  - Entries in columns are atomic (single-valued)
  - Entries in columns are from the same domain
  - Each row is unique (no duplicate rows)
  - The sequence of columns is insignificant
  - The sequence of rows is insignificant

Anomalies
  - Errors or inconsistencies that may result when manipulating data in a table that contains redundancies
  - Types of anomalies:
    - Insertion anomaly
    - Deletion anomaly
    - Modification anomaly
Anomalies: An Example

EMPLOYEE COURSE
EMPID  NAME            DEPT            SALARY  COURSE      DATE COMPLETED
100    Dana Scully     Marketing       42,000  Planning    5/6/99
                                               Management  5/27/95
140    Fox Mulder      Info Systems    39,000  C++         12/28/93
110    Walter Skinner  Administration  41,500  Budgeting   6/6/86
190    Alex Krycek     Finance         38,000  Tax Acct.   10/1/93
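Two of these anomalies can be reproduced on a few rows of EMPLOYEE COURSE. The sketch below is illustrative only. It assumes that the Management row repeats employee 100's details (which is what makes the table redundant), and the new salary of 45,000 is a made-up value.

```python
# Rows of EMPLOYEE COURSE: one row per course completed.
# Assumption: the second row repeats employee 100's name, dept, and salary.
rows = [
    {"emp_id": 100, "name": "Dana Scully", "dept": "Marketing", "salary": 42000, "course": "Planning"},
    {"emp_id": 100, "name": "Dana Scully", "dept": "Marketing", "salary": 42000, "course": "Management"},
    {"emp_id": 140, "name": "Fox Mulder", "dept": "Info Systems", "salary": 39000, "course": "C++"},
]


def salaries_for(emp_id, table):
    """Collect every distinct salary recorded for one employee."""
    return {r["salary"] for r in table if r["emp_id"] == emp_id}


# Modification anomaly: a raise applied to only one of the duplicated rows
rows[0]["salary"] = 45000  # hypothetical new salary
inconsistent = salaries_for(100, rows)  # now holds two different salaries

# Deletion anomaly: deleting employee 140's only course row
# also deletes everything known about that employee
remaining = [r for r in rows if not (r["emp_id"] == 140 and r["course"] == "C++")]
mulder_known = any(r["emp_id"] == 140 for r in remaining)
```

An insertion anomaly is the mirror image: a new employee who has not yet completed a course cannot be recorded without leaving the course columns empty.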
Well-Structured Relations
  - Contain a minimum amount of redundancy and allow users to manipulate data without errors
  - Normalization is used to achieve well-structured relations

Normalization
  - Process of converting a relation to a standard form
  - Used to derive well-structured relations that are free of anomalies when manipulated
  - Often accomplished in stages, or normal forms

Normal Form
  - State of a relation that can be determined by applying dependency rules to that relation
  - Normal forms:
    - First Normal Form (1NF)
    - Second Normal Form (2NF)
    - Third Normal Form (3NF)
    - Boyce-Codd Normal Form (BCNF)
    - Fourth Normal Form (4NF)
    - Fifth Normal Form (5NF)
Functional Dependency
  - The value of one attribute in a relation determines a unique value of one or more other attributes in the relation
  - Example: Stud_ID -> Name, Bday, Major
  - The left-side attribute (Stud_ID) is called a determinant; it determines unique values of the other attributes in the relation
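A functional dependency can be tested against sample data with a few lines of Python. This is a sketch under stated assumptions: the `fd_holds` helper and the two student rows are invented for illustration. A check like this can only show that a dependency is violated by the data at hand; whether it holds in general is a business rule.

```python
def fd_holds(rows, determinant, dependents):
    """Return True if determinant -> dependents holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # The first value seen for a key is remembered; any later mismatch
        # means one determinant value maps to two different dependent values.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for STUDENT (values are made up)
students = [
    {"Stud_ID": 1, "Name": "Ana", "Bday": "1990-01-01", "Major": "MIS"},
    {"Stud_ID": 2, "Name": "Ben", "Bday": "1991-02-02", "Major": "MIS"},
]

id_determines_rest = fd_holds(students, ["Stud_ID"], ["Name", "Bday", "Major"])
major_determines_name = fd_holds(students, ["Major"], ["Name"])
```

Here `id_determines_rest` is true, while `major_determines_name` is false because two students share the major MIS.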
Partial Functional Dependency
  - One or more non-key attributes are functionally dependent on only part of the primary key
  - Example: EMPLOYEE COURSE (Emp_ID, Name, Dept, Salary, Course, Date_Completed)
  - Functional dependencies:
    - Emp_ID, Course -> Date_Completed
    - Emp_ID -> Name, Dept, Salary
  - Emp_ID is only part of the primary key

Transitive Dependency
  - A non-key attribute is functionally dependent on one or more other non-key attributes
  - Example: SALES (Cust_No, Name, Salesperson, Region)
  - Functional dependencies:
    - Cust_No -> Name, Salesperson, Region
    - Salesperson -> Region
  - Salesperson is not a primary key!
Steps in Normalization
  - First normal form (1NF)
    - Repeating groups have been removed

Grade Report (with repeating groups)
STUDENT ID  STUDENT NAME  CAMPUS ADDRESS  MAJOR  COURSE ID  COURSE TITLE  INSTRUCTOR NAME  INSTRUCTOR LOCATION  GRADE
143         Mulder        101 Cervini     MIS    CS 122     DB Sys.       Codd             F 227                B+
                                                 CS 161     O/S           Tannenbaum       F 104                A
434         Scully        304 Eliazo      Psy    Psy 101    Basic Psy     Freud            Bel 204
                                                 Th 141     Marriage      Pope John Paul   B 102
                                                 En 12      Basic Eng.    Shakespeare      B 202
Steps in Normalization
  - First normal form (1NF)
    - Repeating groups have been removed: every row carries its own student values

Grade Report
STUDENT ID  STUDENT NAME  CAMPUS ADDRESS  MAJOR  COURSE ID  COURSE TITLE  INSTRUCTOR NAME  INSTRUCTOR LOCATION  GRADE
143         Mulder        101 Cervini     MIS    CS 122     DB Sys.       Codd             F 227                B+
143         Mulder        101 Cervini     MIS    CS 161     O/S           Tannenbaum       F 104                A
434         Scully        304 Eliazo      Psy    Psy 101    Basic Psy     Freud            Bel 204
434         Scully        304 Eliazo      Psy    Th 141     Marriage      Pope John Paul   B 102
Steps in Normalization
  - Second normal form (2NF)
    - Partial functional dependencies have been removed
  - Assume
    - A student cannot have multiple majors
    - A student cannot repeat a subject
    - Only one teacher is available per course
  - Resulting relations
    STUDENT (STUDENT ID, STUDENT NAME, CAMPUS ADDRESS, MAJOR)
    COURSE INSTRUCTOR (COURSE ID, COURSE TITLE, INSTRUCTOR NAME, INSTRUCTOR LOCATION)
    REGISTRATION (STUDENT ID, COURSE ID, GRADE)
Steps in Normalization
  - Third normal form (3NF)
    - Transitive dependencies have been removed
  - Assume
    - An instructor teaches in only one classroom
    - Previous assumptions hold
  - Resulting relations
    STUDENT (STUDENT ID, STUDENT NAME, CAMPUS ADDRESS, MAJOR)
    COURSE INSTRUCTOR (COURSE ID, COURSE TITLE, INSTRUCTOR NAME)
    INSTRUCTOR (INSTRUCTOR NAME, INSTRUCTOR LOCATION)
    REGISTRATION (STUDENT ID, COURSE ID, GRADE)
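The 3NF relations can be obtained from the 1NF Grade Report by projecting it onto each relation's attributes and dropping duplicate tuples. The sketch below is a minimal illustration that uses only the two complete rows for student 143; the `project` helper and the snake_case attribute names are assumptions made for the example.

```python
# The two complete 1NF rows for student 143, taken from the Grade Report
grade_report = [
    {"student_id": 143, "student_name": "Mulder", "campus_address": "101 Cervini",
     "major": "MIS", "course_id": "CS 122", "course_title": "DB Sys.",
     "instructor_name": "Codd", "instructor_location": "F 227", "grade": "B+"},
    {"student_id": 143, "student_name": "Mulder", "campus_address": "101 Cervini",
     "major": "MIS", "course_id": "CS 161", "course_title": "O/S",
     "instructor_name": "Tannenbaum", "instructor_location": "F 104", "grade": "A"},
]


def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples, as a relation requires."""
    return sorted({tuple(r[a] for a in attrs) for r in rows})


student = project(grade_report, ["student_id", "student_name", "campus_address", "major"])
course_instructor = project(grade_report, ["course_id", "course_title", "instructor_name"])
instructor = project(grade_report, ["instructor_name", "instructor_location"])
registration = project(grade_report, ["student_id", "course_id", "grade"])
```

The student's name, address, and major, which appeared twice in the 1NF table, are stored once in `student`, while `registration` keeps one tuple per course taken.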
Steps in Normalization
  - Boyce-Codd normal form (BCNF)
    - Remaining anomalies from functional dependencies are removed
    - A relation is in BCNF if and only if every determinant is a candidate key
  - Example: STUDENT ADVISOR (Student ID, Major, Advisor)
    - For each major a student has only one advisor
    - Each advisor advises only one major
    - Each advisor advises several students in one major
    - Each major has several advisors
    - Each student may major in several subjects

STUDENT ADVISOR
Student ID  Major    Advisor
123         Physics  Einstein
            Music    Mozart
456         Biology  Darwin
789                  Bohr
143
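The BCNF condition can be checked mechanically once the business rules are written as functional dependencies. The sketch below uses the standard attribute-closure technique, which the slides do not cover; the function names are illustrative. The two dependencies encode the first two rules of the example: (Student ID, Major) -> Advisor and Advisor -> Major.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Determinants that fail to determine every attribute of the relation.

    Assumes only non-trivial dependencies are listed.
    """
    return [lhs for lhs, _ in fds if not relation <= closure(lhs, fds)]


student_advisor = {"Student ID", "Major", "Advisor"}
fds = [
    ({"Student ID", "Major"}, {"Advisor"}),  # one advisor per student per major
    ({"Advisor"}, {"Major"}),                # each advisor advises only one major
]
violations = bcnf_violations(student_advisor, fds)
```

Advisor is a determinant but does not determine Student ID, so it is not a candidate key and STUDENT ADVISOR is not in BCNF, even though it is in 3NF.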
Steps in Normalization
  - Fourth normal form (4NF)
    - Any multivalued dependencies have been removed

Steps in Normalization
  - Fifth normal form (5NF)
    - Any remaining anomalies (join dependencies) have been removed
    - Join dependency: a relation can be recreated exactly by joining its projections; in 5NF every such dependency follows from the candidate keys, so no further decomposition removes redundancy

Steps in Normalization
  - Domain-Key Normal Form (DK/NF)
    - Proposed by Fagin in 1981
    - Fagin showed that any relation in DK/NF is automatically in 5NF, 4NF, etc.
    - Does not provide a methodology for converting to DK/NF
Transforming ERDs to Relations
  - Represent entities
    - Entity = Relation
    - Primary key of entity = Primary key of relation
  - Convert:
    - Multivalued attributes
    - Composite attributes
    - Weak entities
Transforming ERDs to Relations
  - Represent entities: converting multivalued attributes
  [Diagram: EMPLOYEE (Employee_ID, Name, Address) with multivalued attribute Skill_Name is converted to EMPLOYEE (Employee_ID, Name, Address) in a many-to-many "has" relationship with SKILL (Skill_ID, Skill_Name)]

Transforming ERDs to Relations
  - Represent entities: converting composite attributes
  [Diagram: STUDENT (Student_ID, Address, Name), where Name is composed of First, MI, and Last, is converted to STUDENT (Student_ID, Address, First, MI, Last)]

Transforming ERDs to Relations
  - Represent entities: converting weak entities
  [Diagram: EMPLOYEE (Employee_ID, Name, Address) "has" weak entity DEPENDENT (Dep_Name, Birthdate); after conversion DEPENDENT (Employee_ID, Dep_Name, Birthdate) carries the owner's Employee_ID]
Transforming ERDs to Relations
  - Represent relationships
  - Depends on:
    - Degree of the relationship
    - Cardinalities of the relationship

Transforming Relationships
  - Binary one-to-many relationship
    - The primary key attributes of the entity on the one side of the relationship become a foreign key in the relation on the many side
  [Diagram: DEPT (DeptNo, DName, Loc) "has" EMP (EmpNo, EName); after conversion EMP (EmpNo, EName, DeptNo) carries DeptNo as a foreign key]
Transforming Relationships
  - Binary one-to-many relationship

EMP (foreign key: DEPTNO)
EMPNO   ENAME    DEPTNO
7839    KING     10
7698    BLAKE    30
7782    CLARK    10
7566    JONES    20
7654    MARTIN   30
7499    ALLEN    30
7844    TURNER   30
7900    JAMES    30
7521    WARD     30
7902    FORD     20
7369    SMITH    20
...

DEPT (primary key: DEPTNO)
DEPTNO   DNAME       LOC
10       ACCOUNTING  NEW YORK
20       RESEARCH    DALLAS
30       SALES       CHICAGO
...
Transforming Relationships
  - Binary one-to-one relationship
    - Similar to the one-to-many case
    - Create the foreign key on either side of the relationship
  [Diagram: EMPLOYEE (Employee_ID, Name, Address) "assigned" COMPUTER (Terminal_ID, Description); after conversion COMPUTER (Terminal_ID, Description, Employee_ID) carries Employee_ID as a foreign key]

Transforming Relationships
  - Binary one-to-one relationship
    - Primary key of relation A = foreign key of relation B
    - Primary key of relation B = foreign key of relation A
    - Both situations apply
  [Diagram: STUDENT (Student_ID, Name, Address) "has" PICTURE (Student_ID, JPG_Image)]
Transforming Relationships
  - Binary many-to-many relationship
    - Create a separate relation
    - Its primary key is a composite key consisting of the primary keys of the two entities
    - Occasionally requires a primary key that includes more than just the primary keys of the two relations

Transforming Relationships
  - Binary many-to-many relationship example
  [Diagram: EMPLOYEE (Employee_ID, Name, Address) "assigned to" PROJECT (Project_ID, Project_Name), with relationship attributes Role and Date Assigned; after conversion EMPLOYEE "is given" PROJECT_ASSIGNMENT (Employee_ID, Project_ID, Role, Date Assigned), which "refers to" PROJECT]
Transforming Relationships
  - Binary many-to-many relationship example

EMPLOYEE
Employee_ID  Name            Address
40780        Summers, Buffy  Sunnydale
21295        Grissom, Gil    Las Vegas
50666        Kent, Clark     Smallville

PROJECT
Project_ID  Project_Name
100         Looking for Clues
101         Monsters, Inc.

PROJECT_ASSIGNMENT
Employee_ID  Project_ID  Date Assigned  Role
21295        100         27/05/2003     Lead Analyst
50666                                   Jr. Programmer
40780        101         28/12/2003     Project Manager
                         05/01/2004     Sr. Programmer
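The separate relation and its composite primary key can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions: only the two fully legible PROJECT_ASSIGNMENT rows are loaded, the lowercase names are chosen for the example, and the final "Tester" row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE project (project_id INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE project_assignment (
    employee_id   INTEGER REFERENCES employee(employee_id),
    project_id    INTEGER REFERENCES project(project_id),
    date_assigned TEXT,
    role          TEXT,
    PRIMARY KEY (employee_id, project_id)  -- composite key from both parents
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (40780, "Summers, Buffy", "Sunnydale"),
    (21295, "Grissom, Gil", "Las Vegas"),
    (50666, "Kent, Clark", "Smallville"),
])
conn.executemany("INSERT INTO project VALUES (?, ?)", [
    (100, "Looking for Clues"),
    (101, "Monsters, Inc."),
])
conn.executemany("INSERT INTO project_assignment VALUES (?, ?, ?, ?)", [
    (21295, 100, "27/05/2003", "Lead Analyst"),
    (40780, 101, "28/12/2003", "Project Manager"),
])

# Who works on project 100, and in what role?
staff = conn.execute(
    "SELECT e.name, a.role FROM project_assignment a"
    " JOIN employee e ON e.employee_id = a.employee_id"
    " WHERE a.project_id = ?", (100,)
).fetchall()

# The composite key rejects a second row for the same employee and project
try:
    conn.execute("INSERT INTO project_assignment VALUES (21295, 100, '01/01/2004', 'Tester')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```

If an employee could hold several roles on the same project, the key would need a further attribute such as Role or Date Assigned, which is the case where the primary key includes more than the two parent keys.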
Transforming Relationships
  - Unary relationships: unary one-to-many
    - A recursive foreign key is added to reference the primary key values of the same relation
  [Diagram: EMPLOYEE (Employee_ID, Name) "manages" EMPLOYEE; after conversion EMPLOYEE (Employee_ID, Name, Manager_ID)]
Transforming Relationships
  - Unary relationships: unary one-to-many

EMPLOYEE
EMPLOYEE_ID  NAME    MANAGER_ID
7839         KING
7698         BLAKE   7839
7782         CLARK   7839
7566         JONES   7839
7654         MARTIN  7698
7499         ALLEN   7698
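A recursive foreign key is usually read back with a self-join, pairing each row with the row its Manager_ID points to. The sketch below uses Python's built-in sqlite3 module and the EMPLOYEE rows from the table; the table aliases `e` and `m` and the lowercase names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " manager_id INTEGER REFERENCES employee(employee_id))"  # recursive foreign key
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (7839, "KING", None),   # top of the hierarchy: no manager
    (7698, "BLAKE", 7839),
    (7782, "CLARK", 7839),
    (7566, "JONES", 7839),
    (7654, "MARTIN", 7698),
    (7499, "ALLEN", 7698),
])

# Self-join: e is the employee, m is the same table read as the manager
pairs = conn.execute(
    "SELECT e.name, m.name FROM employee e"
    " LEFT JOIN employee m ON e.manager_id = m.employee_id"
    " ORDER BY e.employee_id"
).fetchall()
```

The LEFT JOIN keeps KING in the result with no manager; an inner join would drop that row.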
Transforming Relationships
  - Unary relationships: unary many-to-many
    - Create a separate relation to represent the many-to-many relationship
    - Primary key = composite key of two attributes drawn from the same primary key domain
  [Diagram: ITEM (Item_No., Name, Unit_Cost) "consists of" ITEM, with relationship attribute Quantity; after conversion COMPONENT (Item_No., Comp_No., Quantity) "refers to" and is "part of" ITEM]
Transforming Relationships
  - Unary relationships: unary many-to-many

ITEM
Item_No.  Name          Unit_Cost
500       Hard Drive    3,000
006       Pentium 4 PC  27,000
101       Keyboard      400
999       Screw         0.50

COMPONENT
Item_No.  Comp_No.  Quantity
006       500       2
          101       1
          999       180
                    30
                    20
Transforming Relationships
  - Subtypes
    - Create a separate relation for the supertype and for each subtype
    - The supertype relation consists of the attributes common to all of the subtypes
    - The relation for each subtype contains the primary key and the attributes unique to that subtype
    - Primary keys of the supertype and subtypes are from the same domain
Transforming Relationships
  - Subtypes example
  [Diagram: supertype EMPLOYEE (Emp_ID, Name, Address, Emp_Type) "may be" one of three disjoint (d) subtypes selected by Emp_Type: "H" HOURLY (Emp_ID, Hourly_Rate), "S" SALARIED (Emp_ID, Monthly_Sal), "C" CONSULTANT (Emp_ID, Billing_Rate)]
Transforming Relationships
  - Subtypes example

EMPLOYEE
Emp_ID  Name             Address     Emp_Type
40780   Summers, Buffy   Sunnydale   S
21295   Grissom, Gil     Las Vegas
50666   Kent, Clark      Smallville  H
56466   Bristow, Sidney  Washington  C
97872   Bauer, Jack
15249   Mulder, Fox

SALARIED
Emp_ID  Monthly_Sal
40780   12,000
21295   30,000
15249   20,000

HOURLY
Emp_ID  Hourly_Rate
50666   500
97872   750

CONSULTANT
Emp_ID  Billing_Rate
56466   3,500
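One way to read a supertype row together with its subtype row is to use Emp_Type to pick the subtype relation. The sketch below uses Python's built-in sqlite3 module and loads only the three employees whose Emp_Type is legible in the table; the `SUBTYPE` mapping and the `pay_detail` helper are hypothetical names introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, address TEXT, emp_type TEXT);
-- Each subtype shares the supertype's primary key
CREATE TABLE hourly     (emp_id INTEGER PRIMARY KEY REFERENCES employee(emp_id), hourly_rate  REAL);
CREATE TABLE salaried   (emp_id INTEGER PRIMARY KEY REFERENCES employee(emp_id), monthly_sal  REAL);
CREATE TABLE consultant (emp_id INTEGER PRIMARY KEY REFERENCES employee(emp_id), billing_rate REAL);
INSERT INTO employee VALUES (40780, 'Summers, Buffy', 'Sunnydale', 'S');
INSERT INTO employee VALUES (50666, 'Kent, Clark', 'Smallville', 'H');
INSERT INTO employee VALUES (56466, 'Bristow, Sidney', 'Washington', 'C');
INSERT INTO salaried VALUES (40780, 12000);
INSERT INTO hourly VALUES (50666, 500);
INSERT INTO consultant VALUES (56466, 3500);
""")

# Discriminator value -> (subtype relation, subtype-specific attribute)
SUBTYPE = {
    "H": ("hourly", "hourly_rate"),
    "S": ("salaried", "monthly_sal"),
    "C": ("consultant", "billing_rate"),
}


def pay_detail(emp_id):
    """Return (name, pay attribute, value) by following Emp_Type to the subtype."""
    name, emp_type = conn.execute(
        "SELECT name, emp_type FROM employee WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    table, column = SUBTYPE[emp_type]  # names come from the fixed mapping above
    (value,) = conn.execute(
        f"SELECT {column} FROM {table} WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    return name, column, value
```

For example, `pay_detail(50666)` looks up Emp_Type "H" and then reads Hourly_Rate from HOURLY, so attributes that apply to only one kind of employee never appear as empty columns in EMPLOYEE.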
Merge Relations: View Integration
  - Merge relations that refer to the same entity to remove redundancy
  - View integration problems:
    - Synonyms
    - Homonyms
    - Transitive dependencies
    - Subtypes

Synonyms
  - Two or more attributes may have different names but the same meaning
  - Choose one of the attribute names and eliminate the other, or use a new attribute name to replace both synonyms

Homonyms
  - A single attribute name may have more than one meaning
  - Create new attribute names
Transitive Dependencies
  - May result when two 3NF relations are merged to form a single relation
  - Example
    STUDENT1 (Student ID, Major)
    STUDENT2 (Student ID, Advisor)
    merged into STUDENT (Student ID, Major, Advisor)
  - Note: assume only one advisor per major, so Major -> Advisor and Advisor depends transitively on Student ID
  - Remove transitive dependencies by creating 3NF relations
Subtypes
  - Two or more relations may describe different types of the same entity while sharing some common characteristics
  - Create supertype-subtype relationships
  - Example
    PATIENT1 (Patient No., Name, Address)
    PATIENT2 (Patient No., Room No.)
    become
    PATIENT (Patient No., Name, Address)
    INPATIENT (Patient No., Room No.)
    OUTPATIENT (Patient No., Date Treated)