Logical Design & the Relational Model MGIS 641 Koç University
Learning objectives Define the terms logical database model, relation, relational data model, well-structured relation, anomaly, normalisation, functional dependency, determinant, composite key, partial functional dependency, transitive dependency, and foreign key. Describe the four steps in logical database design. List the properties of relations. Define the two properties that are essential for a candidate key. Give definitions of the 1st, 2nd, and 3rd normal forms. Transform a relation in 1st normal form into 2nd and 3rd normal form. Transform an E-R diagram into logically equivalent relations.
Introduction Logical database design is the process of transforming the conceptual data model into a logical database model. Four major logical database models are in use: hierarchical, network, relational (the most common), and object-oriented.
DB development process Each phase produces a model: Planning -> Enterprise data model; Analysis -> Conceptual data model; Logical DB design -> Logical data model; Physical DB design -> Technology model; Implementation -> DB & repositories.
Database Design Logical database model: A design that conforms to the data model for a class of DBMSs. Hierarchical model: A data model in which records are arranged in a top-down structure that resembles a tree.
Hierarchical DB Model The hierarchical data model is set up like a "forest", that is, a collection of tree structures. Advantage: speed and efficiency for certain kinds of applications; it is a good choice when the data to be modelled is itself tree-like. Problem: the way data is accessed is predefined, and each relationship must be explicitly defined when the database is created. The hierarchical data model is a special case of the network data model. The best-known hierarchical database management system is IBM's IMS, which was originally purely hierarchical but has gained some non-hierarchical features as a result of practical needs.
Network DB Model Network database model: A data model in which each record type may be associated with an arbitrary number of different record types.
Network DB Model An example of a network database management system is IDMS. The model creates relationships among data through a linked-list structure in which subordinate records (members) can be linked to more than one data element (owner). A single data element may be part of many relationships. Pointer: an explicit link, that is, a storage address that contains the location of a related record. Complexity: for every set of linked data elements, a pair of pointers must be maintained, and it is difficult to conceptualise complex data structures using this model. Speed: direct links make systems implemented using the network model very fast.
Object-Oriented DB Model : A DB model in which data attributes and methods that operate on those attributes are encapsulated in structures called objects. Relational DB Model : A data model that represents data in the form of tables
Relational Data Model Relational DB model: A data model that represents data in the form of tables, called relations. Relation: A named, two-dimensional table of data; each relation consists of a set of named columns and an arbitrary number of rows. The relational data model consists of three components. Data structure: data are organised in the form of tables. Data manipulation: operations such as those of SQL are used to manipulate the data. Data integrity: facilities are included to specify business rules that maintain the integrity of data when they are manipulated.
Overview of Logical DB Design Four steps lead from the conceptual data model to the logical data model. Represent entities: each entity type becomes a relation, its identifier becomes the primary key, and its other attributes become non-key attributes. Represent relationships: the representation depends on the nature of the relationship (a foreign key, a new relation, etc.). Normalise relations: unnecessary duplication and anomalies are removed. Merge relations: redundant relations are removed. Flow: Conceptual data model -> Represent entities -> Represent relationships -> Normalise relations -> Merge relations -> Logical data model.
Relations Example: the relation EMPLOYEE. Shorthand for the structure of the EMPLOYEE relation: EMPLOYEE (EMP NO, NAME, PHONE, SALARY)
Properties of Relations Not all tables are relations. Entries in columns are atomic: an entry in a cell is single-valued, so no repeating groups or multivalued attributes are allowed. Entries in a column are from the same domain. Each row is unique. The sequence of columns is insignificant. The sequence of rows is insignificant.
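Two of the properties above (atomic entries and unique rows) can be checked mechanically. The following is a minimal sketch, assuming a table is held as a list of Python dicts; the function name `is_relation` and the sample EMPLOYEE values are hypothetical and only illustrate the idea.

```python
def is_relation(rows):
    """Check atomic entries and row uniqueness for a list of dict rows."""
    # Atomic entries: no cell may hold a collection of values.
    for row in rows:
        for value in row.values():
            if isinstance(value, (list, set, tuple, dict)):
                return False
    # Unique rows: sort by column name so that column order is insignificant.
    as_tuples = [tuple(sorted(row.items())) for row in rows]
    return len(set(as_tuples)) == len(as_tuples)


# Hypothetical EMPLOYEE rows (values invented for illustration).
employee = [
    {"EMP_NO": 1, "NAME": "Ali", "PHONE": "555-0101", "SALARY": 5000},
    {"EMP_NO": 2, "NAME": "Elif", "PHONE": "555-0102", "SALARY": 5200},
]
multivalued = [{"EMP_NO": 1, "NAME": "Ali", "PHONE": ["555-0101", "555-0103"]}]
duplicated = employee + [dict(employee[0])]
```

Here `is_relation(employee)` is true, while the table with a multivalued PHONE cell and the table with a repeated row both fail the check.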
Properties of relations Converting multivalued cells into single-valued cells
Well Structured Relations Well-structured relation: A relation that contains a minimum amount of redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies. Anomalies: errors or inconsistencies that may result when a user tries to update a table that contains redundant data. Insertion anomaly (e.g. trying to add a student to the Student table). Deletion anomaly (e.g. deleting student E30). Modification anomaly (e.g. modifying the DEPT. of student E10).
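The modification and deletion anomalies can be reproduced on a small redundant table. This is only a sketch: the column layout (ID, COURSE, DEPT) and all values are assumptions, since the original Student table is not reproduced here.

```python
# Hypothetical STUDENT rows: the department is repeated once per course taken.
student = [
    {"ID": "E10", "COURSE": "C1", "DEPT": "ENG"},
    {"ID": "E10", "COURSE": "C2", "DEPT": "ENG"},
    {"ID": "E30", "COURSE": "C1", "DEPT": "MGMT"},
]


def depts_of(rows, student_id):
    """Return every department recorded for one student."""
    return {r["DEPT"] for r in rows if r["ID"] == student_id}


# Modification anomaly: updating only one of E10's rows leaves two departments.
student[0]["DEPT"] = "CS"
inconsistent = depts_of(student, "E10")

# Deletion anomaly: removing E30's only row also loses E30's department.
remaining = [r for r in student if r["ID"] != "E30"]
lost = depts_of(remaining, "E30")
```

After the partial update, `inconsistent` holds two different departments for the same student, and `lost` is empty because no row records E30's department any more.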
Concepts of Normalisation Normalisation: The process of converting complex data structures into simple, stable data structures. Normal form: A state of a relation that can be determined by applying simple rules regarding dependencies to that relation. Steps: Table with repeating groups -> (remove repeating groups) -> First Normal Form -> (remove partial dependencies) -> Second Normal Form -> (remove transitive dependencies) -> Third Normal Form -> (remove remaining anomalies) -> Boyce-Codd Normal Form -> (remove multivalued dependencies) -> Fourth Normal Form -> (remove remaining anomalies) -> Fifth Normal Form.
Functional dependence and Keys Functional dependency: A particular relationship between two attributes. For any relation R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B. The functional dependence of B on A is written A -> B, e.g. SSN -> NAME, ADDRESS and ISBN -> TITLE, AUTHOR. Determinant: The attribute on the left-hand side of the arrow in a functional dependency, e.g. SSN, ISBN.
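The definition of A -> B translates directly into a check over sample rows: no two rows may agree on A and disagree on B. The sketch below assumes rows are stored as dicts; the function name `holds`, the BRANCH column, and the data values are hypothetical. Note that sample data can only refute a dependency, since a functional dependency is a statement about every valid instance of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # Remember the first dependent value seen for each determinant value.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical book copies held at library branches.
books = [
    {"ISBN": "111", "TITLE": "Databases", "AUTHOR": "A", "BRANCH": "Main"},
    {"ISBN": "111", "TITLE": "Databases", "AUTHOR": "A", "BRANCH": "East"},
    {"ISBN": "222", "TITLE": "Databases", "AUTHOR": "B", "BRANCH": "Main"},
]

isbn_determines_book = holds(books, ["ISBN"], ["TITLE", "AUTHOR"])
title_determines_author = holds(books, ["TITLE"], ["AUTHOR"])
```

In this sample ISBN -> TITLE, AUTHOR is not contradicted, whereas TITLE -> AUTHOR is refuted by the two books that share a title.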
Rules of Functional Dependency
Reflexive rule: X -> X.
Augmentation rule: if X -> Y, then XZ -> Y.
Union rule: if X -> Y and X -> Z, then X -> YZ.
Decomposition rule: if X -> Y, then X -> Z, where Z is a subset of Y.
Transitivity rule: if X -> Y and Y -> Z, then X -> Z.
Pseudotransitivity rule: if X -> Y and YZ -> W, then XZ -> W.
Properties of candidate keys
Unique identification: for every row, the value of the key must uniquely identify that row (key attribute -> non-key attribute).
Non-redundancy: no attribute in the key can be deleted without destroying the property of unique identification.
Composite key: a primary key that contains more than one attribute.
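The rules above are what an attribute-closure computation applies repeatedly, and the two candidate-key properties can then be tested against that closure. The following is a sketch of the standard closure algorithm, not something taken from these slides; the function names are hypothetical, and the dependencies are those of the grade report example used in this deck, with shortened attribute names.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_candidate_key(key, all_attrs, fds):
    """Check unique identification and non-redundancy for a proposed key."""
    key = set(key)
    unique = closure(key, fds) == set(all_attrs)
    non_redundant = all(closure(key - {a}, fds) != set(all_attrs) for a in key)
    return unique and non_redundant


all_attrs = {"STD_ID", "NAME", "MAJOR", "COURSE", "TITLE",
             "INST_NAME", "INST_ROOM", "GRADE"}
fds = [
    ({"STD_ID"}, {"NAME", "MAJOR"}),
    ({"COURSE"}, {"TITLE", "INST_NAME", "INST_ROOM"}),
    ({"STD_ID", "COURSE"}, {"GRADE"}),
    ({"INST_NAME"}, {"INST_ROOM"}),
]
```

With these dependencies, `is_candidate_key({"STD_ID", "COURSE"}, all_attrs, fds)` is true, `{"STD_ID"}` alone fails unique identification, and `{"STD_ID", "COURSE", "GRADE"}` fails non-redundancy.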
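Basic Normal Forms First Normal Form: A relation that contains no repeating groups.
K.U. GRADE REPORT
Name: Ahmet   ID: S10   Major: ENG
COURSE    TITLE       INST  INST ROOM  GRADE
MGIS401   Databases   OAY   B110       A
OPSM402   Operations  BT    B109       B
1NF Grade Report Relation
STD ID  NAME   MAJOR  COURSE   COURSE TITLE  INST. NAME  INST. ROOM  GRADE
S10     Ahmet  ENG    MGIS401  Databases     OAY         B110        A
S10     Ahmet  ENG    OPSM402  Operations    BT          B109        B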
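Removing the repeating group means copying the report header (student data) onto every course line. A minimal sketch, assuming the grade report is held as a nested Python dict; the key `COURSES`, the shortened attribute names, and the function name `to_1nf` are hypothetical.

```python
# The grade report with its repeating group of course lines.
report = {
    "STD_ID": "S10", "NAME": "Ahmet", "MAJOR": "ENG",
    "COURSES": [
        {"COURSE": "MGIS401", "TITLE": "Databases",
         "INST_NAME": "OAY", "INST_ROOM": "B110", "GRADE": "A"},
        {"COURSE": "OPSM402", "TITLE": "Operations",
         "INST_NAME": "BT", "INST_ROOM": "B109", "GRADE": "B"},
    ],
}


def to_1nf(report):
    """Flatten the repeating group: one row per (student, course) pair."""
    header = {k: v for k, v in report.items() if k != "COURSES"}
    return [{**header, **course} for course in report["COURSES"]]


grade_report = to_1nf(report)
```

The result has two rows, each with eight single-valued attributes, at the cost of repeating the student's name and major on every row.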
Basic Normal Forms Second Normal Form: A relation is in 2nd Normal Form if it is in 1st Normal Form and every non-key attribute is fully functionally dependent on the primary key; that is, no non-key attribute is functionally dependent on part (but not all) of the primary key. The 1NF grade report relation is GRADE REPORT (STD ID, NAME, MAJOR, COURSE, COURSE TITLE, INST. NAME, INST. ROOM, GRADE), with the composite primary key (STD ID, COURSE). Analyse the functional dependencies: STD ID -> NAME, MAJOR; COURSE -> COURSE TITLE, INST. NAME, INST. ROOM; STD ID, COURSE -> GRADE; INST. NAME -> INST. ROOM.
Basic Normal Forms Identify the partial dependencies in the 1NF grade report relation: STD ID -> NAME, MAJOR and COURSE -> COURSE TITLE, INST. NAME, INST. ROOM each depend on only part of the key, whereas STD ID, COURSE -> GRADE is a full dependency. Remove the partial dependencies by creating three new relations, each in 2NF: STUDENT (STD ID, NAME, MAJOR); COURSE (COURSE, COURSE TITLE, INST. NAME, INST. ROOM); REGISTRATION (STD ID, COURSE, GRADE).
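The decomposition into STUDENT, COURSE, and REGISTRATION is a set of projections of the 1NF relation. The sketch below assumes rows are dicts with shortened attribute names; the helper `project` is hypothetical, and the third row (student S20) is invented to show duplicates being removed.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows but keeping order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


cols = ["STD_ID", "NAME", "MAJOR", "COURSE", "TITLE",
        "INST_NAME", "INST_ROOM", "GRADE"]
grade_report = [dict(zip(cols, values)) for values in [
    ("S10", "Ahmet", "ENG", "MGIS401", "Databases", "OAY", "B110", "A"),
    ("S10", "Ahmet", "ENG", "OPSM402", "Operations", "BT", "B109", "B"),
    ("S20", "Ayse", "MGMT", "MGIS401", "Databases", "OAY", "B110", "B"),  # hypothetical
]]

student = project(grade_report, ["STD_ID", "NAME", "MAJOR"])
course = project(grade_report, ["COURSE", "TITLE", "INST_NAME", "INST_ROOM"])
registration = project(grade_report, ["STD_ID", "COURSE", "GRADE"])
```

Each student and each course is now stored once, while REGISTRATION keeps one row per (student, course) pair.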
Basic Normal Forms Anomalies still exist in 2nd Normal Form. Modification anomaly: e.g. changing the room of instructor OAY requires updating every COURSE row taught by OAY. Deletion anomaly: deleting the OPSM402 row loses the information about BT. These anomalies arise because the entity type INSTRUCTOR has not been designed in yet. Analyse the functional dependencies: COURSE -> COURSE TITLE, INST. NAME, INST. ROOM and INST. NAME -> INST. ROOM, so INST. ROOM depends on COURSE transitively through INST. NAME. Third Normal Form: A relation is in 3rd Normal Form if it is in 2nd Normal Form and no transitive dependencies exist. Transitive dependency: A functional dependency between two (or more) non-key attributes in a relation. STUDENT and REGISTRATION are already in 3rd Normal Form; COURSE is only in 2nd Normal Form and is split into COURSE (COURSE, COURSE TITLE, INST. NAME) and INSTRUCTOR (INST. NAME, INST. ROOM).
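Removing the transitive dependency INST. NAME -> INST. ROOM can be sketched as splitting the 2NF COURSE relation in two and checking that a join restores the original rows. The variable names and the new room "B200" are hypothetical; the other values come from the grade report.

```python
course_2nf = [
    {"COURSE": "MGIS401", "TITLE": "Databases", "INST_NAME": "OAY", "INST_ROOM": "B110"},
    {"COURSE": "OPSM402", "TITLE": "Operations", "INST_NAME": "BT", "INST_ROOM": "B109"},
]

# COURSE keeps INST_NAME as a foreign key; INSTRUCTOR stores each room once.
course = [{k: r[k] for k in ("COURSE", "TITLE", "INST_NAME")} for r in course_2nf]
instructor = {r["INST_NAME"]: r["INST_ROOM"] for r in course_2nf}

# Joining on INST_NAME restores the original rows, so no information is lost.
rejoined = [{**c, "INST_ROOM": instructor[c["INST_NAME"]]} for c in course]

# Deleting OPSM402 no longer loses the information about BT.
course_after_delete = [c for c in course if c["COURSE"] != "OPSM402"]
bt_room = instructor["BT"]

# Changing OAY's room is now a single update (hypothetical new room).
instructor["OAY"] = "B200"
```

The deletion and modification anomalies described for the 2NF relation no longer occur, because each instructor's room is recorded in exactly one place.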
Transforming E-R Diagrams to Relations Represent entities: The primary key of the entity type becomes the primary key of the relation. Check that (a) the value of the key uniquely identifies every row in the relation, and (b) the key is non-redundant, that is, no attribute in the key can be deleted without destroying the unique identification. Each non-key attribute of the entity type becomes a non-key attribute of the relation. E.g. the entity type CUSTOMER, with attributes CUSTOMER NO, NAME, ADDRESS, and DISCOUNT in the E-R diagram, corresponds to the relation CUSTOMER (CUSTOMER NO, NAME, ADDRESS, DISCOUNT).
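Transforming E-R Diagrams to Relations Represent relationships: 1:N relationship. E-R diagram: CUSTOMER (CUSTOMER NO, NAME, ADDRESS, DISCOUNT) Places ORDER (ORDER NO, DATE, PROM. DATE). Corresponding relations: CUSTOMER (CUSTOMER NO, NAME, ADDRESS, DISCOUNT) and ORDER (ORDER NO, DATE, PROM. DATE, CUSTOMER NO), where CUSTOMER NO in ORDER is a foreign key referencing CUSTOMER.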
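A 1:N relationship represented by a foreign key can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table is named `orders` because ORDER is a reserved word in SQL, the column names are lower-case renderings of the attributes, and the inserted values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_no INTEGER PRIMARY KEY,
    name TEXT, address TEXT, discount REAL
);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    date TEXT, prom_date TEXT,
    customer_no INTEGER NOT NULL REFERENCES customer(customer_no)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme', 'Istanbul', 0.1)")
conn.execute("INSERT INTO orders VALUES (100, '2024-01-05', '2024-01-12', 1)")

# An order for a customer that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (101, '2024-01-06', '2024-01-13', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The foreign key sits on the "many" side (the order), so one customer can place any number of orders while each order refers to exactly one customer.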
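Transforming E-R Diagrams to Relations Represent relationships: M:N relationship. E-R diagram: ORDER (ORDER NO, DATE, PROM. DATE) Requests PRODUCT (PRODUCT NO, DESCP.), with QUANTITY as an attribute of the relationship. Corresponding relations: ORDER (ORDER NO, DATE, PROM. DATE), PRODUCT (PRODUCT NO, DESCP.), and a new relation for the relationship whose primary key combines the keys of both entities, e.g. REQUESTS (ORDER NO, PRODUCT NO, QUANTITY).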
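An M:N relationship becomes a relation of its own, keyed by the primary keys of both participating entities. The sketch below uses Python's `sqlite3` module; the table names `orders` and `requests`, the lower-case column names, and all inserted values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_no INTEGER PRIMARY KEY, date TEXT, prom_date TEXT);
CREATE TABLE product (product_no INTEGER PRIMARY KEY, descp TEXT);
-- One row per (order, product) pair; QUANTITY belongs to the relationship.
CREATE TABLE requests (
    order_no INTEGER REFERENCES orders(order_no),
    product_no INTEGER REFERENCES product(product_no),
    quantity INTEGER,
    PRIMARY KEY (order_no, product_no)
);
INSERT INTO orders VALUES (100, '2024-01-05', '2024-01-12');
INSERT INTO orders VALUES (101, '2024-01-06', '2024-01-13');
INSERT INTO product VALUES (1, 'Desk');
INSERT INTO product VALUES (2, 'Chair');
INSERT INTO requests VALUES (100, 1, 2);
INSERT INTO requests VALUES (100, 2, 4);
INSERT INTO requests VALUES (101, 2, 1);
""")

rows = conn.execute("""
    SELECT o.order_no, p.descp, r.quantity
    FROM requests r
    JOIN orders o ON o.order_no = r.order_no
    JOIN product p ON p.product_no = r.product_no
    ORDER BY o.order_no, p.product_no
""").fetchall()
```

Order 100 requests two products and the chair appears on two orders, which neither a single foreign key in `orders` nor one in `product` could represent.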
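Transforming E-R Diagrams to Relations Represent relationships: unary relationships. 1:N: EMPLOYEE (EMP. NO, NAME, ADDRESS) manages EMPLOYEE; corresponding relation: EMPLOYEE (EMP. NO, NAME, ADDRESS, MANAGER ID), where MANAGER ID is a foreign key referencing EMP. NO in the same relation. M:N: ITEM (ITEM NO, NAME) Contains ITEM, with QUANTITY as an attribute of the relationship; corresponding relations: ITEM (ITEM NO, NAME) and ITEM BILL (ITEM NO, COMPONENT NO, QUANTITY).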
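The ITEM BILL relation is a bill of materials: each row says that one item contains a quantity of another item. The sketch below walks such rows recursively to total the components of an item. The item names and quantities are invented, the function name `explode` is hypothetical, and the bill is assumed to contain no cycles.

```python
# Hypothetical ITEM and ITEM BILL rows.
item = {1: "Bicycle", 2: "Wheel", 3: "Spoke"}
item_bill = [(1, 2, 2), (2, 3, 32)]  # (ITEM NO, COMPONENT NO, QUANTITY)


def explode(item_no, bill, multiplier=1):
    """Total quantity of every direct or indirect component of item_no."""
    totals = {}
    for parent, component, quantity in bill:
        if parent == item_no:
            need = quantity * multiplier
            totals[component] = totals.get(component, 0) + need
            # Recurse into the component's own components.
            for sub, sub_quantity in explode(component, bill, need).items():
                totals[sub] = totals.get(sub, 0) + sub_quantity
    return totals


bicycle_parts = explode(1, item_bill)
```

Both columns of ITEM BILL refer to ITEM NO, which is what makes the relationship unary: the same relation plays the role of the containing item and of the component.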
Transforming E-R Diagrams to Relations Represent relationships: ISA relationship. E-R diagram: PROPERTY (PROPERTY ID, RENT, ADDRESS) has two subtypes, BEACH PROPERTY (BLOCKS TO BEACH) and MOUNTAIN PROPERTY (SKIING), each linked to PROPERTY by ISA. Corresponding relations, all sharing the same primary key: PROPERTY (PROPERTY ID, RENT, ADDRESS); BEACH PROPERTY (PROPERTY ID, BLOCKS TO BEACH); MOUNTAIN PROPERTY (PROPERTY ID, SKIING).
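In the ISA mapping each subtype relation reuses the supertype's primary key, so a full description of a beach property is obtained by joining the two. A minimal sketch with Python's `sqlite3` module; the lower-case table and column names and all inserted values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property (
    property_id INTEGER PRIMARY KEY, rent REAL, address TEXT
);
-- Each subtype shares the supertype's key and adds its own attributes.
CREATE TABLE beach_property (
    property_id INTEGER PRIMARY KEY REFERENCES property(property_id),
    blocks_to_beach INTEGER
);
CREATE TABLE mountain_property (
    property_id INTEGER PRIMARY KEY REFERENCES property(property_id),
    skiing TEXT
);
INSERT INTO property VALUES (1, 900.0, '12 Shore Rd');
INSERT INTO property VALUES (2, 700.0, '5 Ridge Ln');
INSERT INTO beach_property VALUES (1, 3);
INSERT INTO mountain_property VALUES (2, 'yes');
""")

beach = conn.execute("""
    SELECT p.property_id, p.rent, p.address, b.blocks_to_beach
    FROM property p
    JOIN beach_property b ON b.property_id = p.property_id
""").fetchall()
```

Attributes common to all properties are stored once in `property`, and only the rows that really are beach properties carry a `blocks_to_beach` value.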
Merging relations - View Integration E-R diagrams are transformed into relations, which are then checked for 3rd Normal Form. Because the normalised relations may have been created from several E-R diagrams, some relations may be redundant; merge relations to remove the redundancies. Example: EMPLOYEE1 (EMPLOYEE NO, NAME, ADDRESS) and EMPLOYEE2 (EMPLOYEE NO, NAME, JOBCODE) are merged into EMPLOYEE (EMPLOYEE NO, NAME, ADDRESS, JOBCODE). Some view integration problems are due to synonyms, homonyms, transitive dependencies, and class/subclass relationships.
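Merging two relations that describe the same entity amounts to combining their rows by primary key. The sketch below assumes rows are dicts; the function name `merge` and the sample values are hypothetical. It raises an error when the two views disagree on a shared attribute, which is one way an integration problem would show up.

```python
def merge(rel_a, rel_b, key):
    """Merge two lists of dict rows that share the same primary key attribute."""
    merged = {}
    for row in rel_a + rel_b:
        target = merged.setdefault(row[key], {})
        for attr, value in row.items():
            # The same attribute of the same entity must not have two values.
            if attr in target and target[attr] != value:
                raise ValueError(f"conflict on {attr} for {key}={row[key]}")
            target[attr] = value
    return list(merged.values())


# Hypothetical rows for the two views.
employee1 = [{"EMPLOYEE NO": 7, "NAME": "Ali", "ADDRESS": "Sariyer"}]
employee2 = [{"EMPLOYEE NO": 7, "NAME": "Ali", "JOBCODE": "J3"}]

employee = merge(employee1, employee2, "EMPLOYEE NO")
```

The merged row carries NAME only once, together with ADDRESS from the first view and JOBCODE from the second.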
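View Integration Problems Synonyms: different names for the same attribute, e.g. STUDENT1 (STUDENT ID, NAME) and STUDENT2 (REGISTRATION NO, NAME, ADDRESS) merge into STUDENT (ID NO, NAME, ADDRESS). Homonyms: the same name for different attributes, e.g. in STUDENT1 (STUDENT ID, NAME, ADDRESS) and STUDENT2 (STUDENT ID, NAME, ADDRESS) the two ADDRESS attributes mean different things, so the merged relation is STUDENT (STUDENT ID, NAME, HOME ADDRESS, CAMPUS ADDRESS).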
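Both problems are resolved by renaming attributes before the merge: synonyms are mapped to one agreed name, homonyms to distinct names. The sketch below shows the homonym case and assumes, purely for illustration, that ADDRESS means the home address in STUDENT1 and the campus address in STUDENT2; the helper `rename` and the data values are hypothetical.

```python
def rename(rows, mapping):
    """Rename attributes according to mapping, leaving other attributes as they are."""
    return [{mapping.get(attr, attr): value for attr, value in row.items()}
            for row in rows]


# Hypothetical rows; which view holds which kind of address is an assumption.
student1 = [{"STUDENT ID": "S10", "NAME": "Ahmet", "ADDRESS": "Home St 1"}]
student2 = [{"STUDENT ID": "S10", "NAME": "Ahmet", "ADDRESS": "Dorm B"}]

# Give the homonym two distinct names before merging.
view1 = rename(student1, {"ADDRESS": "HOME ADDRESS"})
view2 = rename(student2, {"ADDRESS": "CAMPUS ADDRESS"})

by_id = {}
for row in view1 + view2:
    by_id.setdefault(row["STUDENT ID"], {}).update(row)
student = list(by_id.values())
```

Without the renaming step the second ADDRESS value would silently overwrite the first. The synonym case works the same way, with a mapping such as `{"REGISTRATION NO": "ID NO"}` applied to one view and `{"STUDENT ID": "ID NO"}` to the other.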
View Integration Problems Transitive dependencies: e.g. STUDENT1 (STUDENT ID, MAJOR) and STUDENT2 (STUDENT ID, ADVISOR). If MAJOR -> ADVISOR, merging directly into STUDENT (STUDENT ID, MAJOR, ADVISOR) creates a transitive dependency; instead use STUDENT (STUDENT ID, MAJOR) and MAJOR ADVISOR (MAJOR, ADVISOR). ISA: e.g. PATIENT1 (PATIENT ID, NAME, ADDRESS) and PATIENT2 (PATIENT ID, ROOM NO). There are two types of patients, inpatients and outpatients, so the relations are not merged directly: PATIENT (PATIENT ID, NAME, ADDRESS); INPATIENT (PATIENT ID, ROOM NO); OUTPATIENT (PATIENT ID, DATE TREATED).
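Whether a merge would introduce a transitive dependency can be checked from the list of functional dependencies: any dependency whose determinant and dependents are all non-key attributes is a problem. A small sketch, assuming dependencies are written as pairs of attribute tuples; the function name is hypothetical.

```python
def transitive_dependencies(key, fds):
    """Return the dependencies that hold between non-key attributes only."""
    key = set(key)
    return [(lhs, rhs) for lhs, rhs in fds
            if not (set(lhs) & key) and not (set(rhs) & key)]


# Dependencies of the directly merged STUDENT (STUDENT ID, MAJOR, ADVISOR).
fds = [
    (("STUDENT ID",), ("MAJOR",)),
    (("STUDENT ID",), ("ADVISOR",)),
    (("MAJOR",), ("ADVISOR",)),
]
problems = transitive_dependencies(["STUDENT ID"], fds)

# After the split, each relation is checked against its own key.
student_ok = transitive_dependencies(["STUDENT ID"], [(("STUDENT ID",), ("MAJOR",))])
major_advisor_ok = transitive_dependencies(["MAJOR"], [(("MAJOR",), ("ADVISOR",))])
```

The direct merge reports MAJOR -> ADVISOR as a dependency between non-key attributes, while the two relations produced by the split report none.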