The Relational Model System Development Life Cycle Normalisation


Exam Revision: The Relational Model, System Development Life Cycle, Normalisation

Properties of Relations Each tuple is distinct; there are no duplicate tuples. Order of attributes has no significance. Order of tuples has no significance in theory; in practice, however, the order of tuples may affect query response time and hence efficiency.

Relational Keys
Superkey: An attribute, or set of attributes, that uniquely identifies a tuple within a relation. A superkey may contain additional attributes that are not needed for unique identification.
Candidate Key: A superkey (K) such that no proper subset is a superkey within the relation.
Properties of a candidate key: In each tuple of R, values of K uniquely identify that tuple (uniqueness). No proper subset of K has the uniqueness property (irreducibility).
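The uniqueness and irreducibility properties can be sketched in a few lines of Python. This is a minimal illustration, not a definitive method: the helper names and the sample `staff` rows are invented for the example, and a check against one instance can only refute a key, never prove it, because keys are a property of the meaning of the data.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs (uniqueness)."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Superkeys of this instance that contain no smaller key (irreducibility)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # combo contains a smaller key, so it is reducible
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

# Hypothetical sample rows, invented for illustration only.
staff = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1"},
    {"staffNo": "S3", "sName": "Ann", "branchNo": "B2"},
    {"staffNo": "S4", "sName": "Ann", "branchNo": "B1"},
]

keys = candidate_keys(staff, ["staffNo", "sName", "branchNo"])
```

Here (staffNo, sName) is a superkey but not a candidate key, because staffNo alone is already unique.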

Relational Keys
Primary Key: Candidate key selected to identify tuples uniquely within the relation. A relation always has a primary key; in the worst case this could be the whole set of attributes.
Alternate Keys: Candidate keys that are not selected to be the primary key.
Foreign Key: Attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.

Instances of Branch and Staff Relations

Integrity Constraints: Part of the data model that ensures accuracy of data.
Null: Represents a value for an attribute that is currently unknown or not applicable for a tuple. Deals with incomplete or exceptional data. Represents the absence of a value and is not the same as zero or spaces, which are values.

Integrity Constraints
Base Relation: Named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database.
Entity Integrity: In a base relation, no attribute of a primary key can be null. No proper subset of the primary key can be used to identify tuples uniquely. The rule applies only to primary keys, not to candidate keys in general.
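As a rough sketch of the entity integrity rule, the check below lists rows whose primary key contains a null. The function name and the sample rows are hypothetical, and Python's `None` stands in for a database null.

```python
def entity_integrity_violations(rows, primary_key):
    """Return rows in which any primary-key attribute is null (None)."""
    return [row for row in rows
            if any(row.get(attr) is None for attr in primary_key)]

# Hypothetical rows: the second one has no staffNo, so it breaks the rule.
staff = [
    {"staffNo": "S1", "sName": "Ann"},
    {"staffNo": None, "sName": "Bob"},
]

bad_rows = entity_integrity_violations(staff, ["staffNo"])
```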

Integrity Constraints
Referential Integrity: If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation, or the foreign key value must be wholly null.
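The referential integrity rule can be illustrated with a small check over two in-memory relations. This is only a sketch under stated assumptions: the function name and the `staff` and `branch` rows are invented, and `None` plays the role of null.

```python
def referential_violations(child_rows, fk, parent_rows, key):
    """Return child rows whose foreign key is neither wholly null
    nor equal to a key value of some parent row."""
    parent_keys = {tuple(p[a] for a in key) for p in parent_rows}
    bad = []
    for row in child_rows:
        value = tuple(row[a] for a in fk)
        if all(v is None for v in value):
            continue  # a wholly null foreign key is allowed
        if value not in parent_keys:
            bad.append(row)
    return bad

# Hypothetical data: B9 does not exist in branch.
branch = [{"branchNo": "B1"}, {"branchNo": "B2"}]
staff = [
    {"staffNo": "S1", "branchNo": "B1"},
    {"staffNo": "S2", "branchNo": None},
    {"staffNo": "S3", "branchNo": "B9"},
]

dangling = referential_violations(staff, ["branchNo"], branch, ["branchNo"])
```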

Integrity Constraints
General Constraints: Additional rules specified by users or database administrators that define or constrain some aspect of the enterprise.

Entity/Relationship Models An Entity/Relationship model consists of diagrams that represent designs. An entity is like an object: a "thing". An entity type is like a class: a set of "similar" entities/objects. An attribute is a property of the entities in an entity type. Relationships connect different entity types.

Weak entity types Entity types that do not have sufficient key attributes of their own are called weak entity types. The remaining key attributes come from one or more other entity types, which are related to the weak entity type by an identifying relationship. In E/R diagrams you should represent weak entity types as double rectangles and identifying relationships as double diamonds.

Specialization

Generalization

Constraints on Specialization and Generalization
Disjointness Constraint: Specifies that the subclasses of the specialization must be disjoint: an entity can be a member of at most one of the subclasses of the specialization. Specified by d in an EER diagram.
If not disjoint, the specialization is overlapping: the same entity may be a member of more than one subclass of the specialization. Specified by o in an EER diagram.

Constraints on Specialization and Generalization
Completeness Constraint: Total specifies that every entity in the superclass must be a member of some subclass in the specialization/generalization. Shown in EER diagrams by a double line (e.g. Employee).
Partial allows an entity not to belong to any of the subclasses. Shown in EER diagrams by a single line.

Constraints on Specialization and Generalization Hence, we have four types of specialization/generalization: disjoint, total; disjoint, partial; overlapping, total; overlapping, partial. Note: Generalization is usually total because the superclass is derived from the subclasses.

Database System Development Lifecycle
Database planning: planning how the stages of the lifecycle can be realised most efficiently and effectively
System definition
Requirements collection and analysis
Database design
DBMS selection (optional)

Database System Development Lifecycle (continued)
Application design
Prototyping (optional)
Implementation
Data conversion and loading
Testing
Operational maintenance

Stages of the Database System Development Lifecycle

Database Design Three phases of database design: conceptual database design, logical database design, and physical database design.

Conceptual Database Design Process of constructing a model of the data used in an enterprise, independent of all physical considerations. Data model is built using the information in users’ requirements specification. Conceptual data model is source of information for logical design phase.

Logical Database Design Process of constructing a model of the data used in an enterprise based on a specific data model (e.g. relational), but independent of a particular DBMS and other physical considerations. The conceptual data model is refined and mapped onto a logical data model based on the target data model for the database (e.g. the relational data model).

Physical Database Design Process of producing a description of the database implementation on secondary storage. Describes base relations, file organizations, and indexes used to achieve efficient access to data. Also describes any associated integrity constraints and security measures. Tailored to a specific DBMS system.

Purpose of Normalization Normalization is a technique for producing a set of suitable relations that support the data requirements of an enterprise and eliminate redundancy.

Data Redundancy and Update Anomalies Relations that contain redundant information may potentially suffer from update anomalies. Types of update anomalies include insertion, deletion, and modification anomalies.

Functional Dependencies Important concept associated with normalization. A functional dependency describes a relationship between attributes. FDs are properties of the meaning (semantics) of the attributes in a relation. FDs are constraints that describe inter-relationships between the data attributes. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.
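The definition of A → B can be tested against a relation instance, as in the sketch below. The function name and sample rows are illustrative assumptions. Since FDs come from the semantics of the attributes, such a test can show that an FD is violated by the data, but it cannot establish that the FD holds for all possible instances.

```python
def fd_holds(rows, lhs, rhs):
    """True if every value of lhs is paired with exactly one value of rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        # setdefault stores b the first time a is seen, then returns the stored value
        if seen.setdefault(a, b) != b:
            return False
    return True

# Hypothetical StaffBranch-style rows, invented for illustration.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "bAddress": "22 Deer Rd"},
    {"staffNo": "S2", "branchNo": "B1", "bAddress": "22 Deer Rd"},
    {"staffNo": "S3", "branchNo": "B2", "bAddress": "16 Argyll St"},
]

branch_determines_address = fd_holds(staff_branch, ["branchNo"], ["bAddress"])
branch_determines_staff = fd_holds(staff_branch, ["branchNo"], ["staffNo"])
```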

Characteristics of Functional Dependencies Determinants should have the minimal number of attributes necessary to maintain the functional dependency with the attribute(s) on the right hand-side. This requirement is called full functional dependency.

Characteristics of Functional Dependencies Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A, if B is functionally dependent on A, but not on any proper subset of A.

Example Full Functional Dependency Consider the following dependency in the Staff relation: staffNo, sName → branchNo. It is true that each value of (staffNo, sName) is associated with a single value of branchNo. However, branchNo is also functionally dependent on a subset of (staffNo, sName), namely staffNo. The example above is therefore a partial dependency, not a full functional dependency.
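A partial dependency can be spotted by testing every proper subset of the determinant. The sketch below does this against sample data; the helper names and rows are hypothetical, and the same caveat applies that an instance can only suggest, not prove, a dependency.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """True if each lhs value maps to a single rhs value in rows."""
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        if seen.setdefault(a, b) != b:
            return False
    return True

def partial_determinants(rows, lhs, rhs):
    """Proper, non-empty subsets of lhs that still determine rhs."""
    return [sub
            for size in range(1, len(lhs))
            for sub in combinations(lhs, size)
            if fd_holds(rows, sub, rhs)]

# Hypothetical Staff rows, invented for illustration.
staff = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1"},
    {"staffNo": "S3", "sName": "Ann", "branchNo": "B2"},
]

subsets = partial_determinants(staff, ("staffNo", "sName"), ("branchNo",))
```

A non-empty result means the dependency on the full determinant is only partial; here staffNo alone already determines branchNo.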

Transitive Dependencies Important to recognize a transitive dependency because its existence in a relation can potentially cause update anomalies. Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Example Transitive Dependency Consider the functional dependencies in the StaffBranch relation (see Slide 12):
staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress
bAddress is transitively dependent on staffNo via branchNo, because staffNo → branchNo and branchNo → bAddress.

Example - Identifying a set of FDs for the StaffBranch relation With sufficient information available, identify the functional dependencies for the StaffBranch relation as:
staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress
bAddress → branchNo
branchNo, position → salary
bAddress, position → salary

The Process of Normalization As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.

The Process of Normalization

The Process of Normalization

First Normal Form (1NF) A relation in which the intersection of each row and column contains one and only one value.
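One common way to reach 1NF is to flatten a repeating group so that every row/column intersection holds a single value. The sketch below assumes a hypothetical unnormalized client record with a nested list of rentals; all names and values are invented for illustration.

```python
def flatten(rows, repeating):
    """Emit one flat row per item of the repeating group, copying the other attributes."""
    flat_rows = []
    for row in rows:
        for item in row[repeating]:
            flat = {k: v for k, v in row.items() if k != repeating}
            flat.update(item)
            flat_rows.append(flat)
    return flat_rows

# Hypothetical unnormalized data: 'rentals' holds several values per client.
clients = [
    {"clientNo": "C1", "cName": "Ann",
     "rentals": [{"propertyNo": "P1", "rent": 350},
                 {"propertyNo": "P2", "rent": 450}]},
    {"clientNo": "C2", "cName": "Bob",
     "rentals": [{"propertyNo": "P1", "rent": 350}]},
]

client_rental = flatten(clients, "rentals")
```

The flattened relation repeats cName for each rental, which is exactly the kind of redundancy the later normal forms address.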

Second Normal Form (2NF) Based on the concept of full functional dependency. Full functional dependency indicates that if A and B are attributes of a relation, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.

Second Normal Form (2NF) A relation that is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.

1NF to 2NF Identify the primary key for the 1NF relation. Identify the functional dependencies in the relation. If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

Third Normal Form (3NF) Based on the concept of transitive dependency. Transitive dependency is a condition where A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).

Third Normal Form (3NF) A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.

2NF to 3NF Identify the primary key in the 2NF relation. Identify the functional dependencies in the relation. If transitive dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
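The 2NF to 3NF step can be sketched as two projections followed by a join that confirms nothing was lost. The example assumes the transitive dependency staffNo → branchNo → bAddress from the StaffBranch relation; the row values and helper names are invented for illustration.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicate tuples removed (order preserved)."""
    unique = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right, on):
    """Join rows that agree on the attributes listed in 'on'."""
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in on)]

# Hypothetical StaffBranch rows: bAddress is repeated for every member of a branch.
staff_branch = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1", "bAddress": "22 Deer Rd"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1", "bAddress": "22 Deer Rd"},
    {"staffNo": "S3", "sName": "Cat", "branchNo": "B2", "bAddress": "16 Argyll St"},
]

# Move the transitively dependent attribute out with a copy of its determinant.
staff = project(staff_branch, ["staffNo", "sName", "branchNo"])
branch = project(staff_branch, ["branchNo", "bAddress"])

rejoined = natural_join(staff, branch, ["branchNo"])
```

Each branch address is now stored once, and joining on branchNo reproduces the original rows.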

Inference Rules for Functional Dependencies Let A, B, and C be subsets of the attributes of the relation R. Armstrong's axioms are as follows: (1) Reflexivity: If B is a subset of A, then A → B. (2) Augmentation: If A → B, then A,C → B,C. (3) Transitivity: If A → B and B → C, then A → C.

Inference Rules for Functional Dependencies Further rules can be derived from the first three rules that simplify the practical task of computing X+. Let D be another subset of the attributes of relation R, then: (4) Self-determination A → A (5) Decomposition If A → B,C, then A → B and A → C

Inference Rules for Functional Dependencies (6) Union If A → B and A → C, then A → B,C (7) Composition If A → B and C → D then A,C → B,D
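The inference rules are what justify the standard attribute-closure computation of X+. The sketch below is one straightforward way to write it, applied to the StaffBranch dependencies listed earlier; the function name and the representation of FDs as (left, right) tuples are assumptions made for the example.

```python
def closure(attrs, fds):
    """Compute X+: every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already in X+, its dependents join X+ too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of the StaffBranch relation, written as (determinant, dependent) pairs.
fds = [
    (("staffNo",), ("sName", "position", "salary", "branchNo", "bAddress")),
    (("branchNo",), ("bAddress",)),
    (("bAddress",), ("branchNo",)),
    (("branchNo", "position"), ("salary",)),
    (("bAddress", "position"), ("salary",)),
]

address_position_plus = closure({"bAddress", "position"}, fds)
staff_no_plus = closure({"staffNo"}, fds)
```

If X+ contains every attribute of the relation, X is a superkey; here that is true of staffNo but not of (bAddress, position).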

Boyce–Codd Normal Form (BCNF) Based on functional dependencies that take into account all candidate keys in a relation. BCNF also imposes additional constraints compared with the general definition of 3NF.
Definition: A relation is in BCNF if and only if every determinant is a candidate key.

Boyce–Codd Normal Form (BCNF) The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key. Every relation in BCNF is also in 3NF. However, a relation in 3NF is not necessarily in BCNF.
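A BCNF check can be sketched by testing whether each nontrivial determinant is a superkey, using attribute closure. This is an illustrative sketch with assumed function names; it tests for superkeys rather than candidate keys, which gives the same verdict when the determinants are written without redundant attributes.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation_attrs, fds):
    """Nontrivial FDs whose determinant does not determine the whole relation."""
    all_attrs = set(relation_attrs)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)                 # skip trivial FDs
            and not closure(lhs, fds) >= all_attrs]     # determinant is not a superkey

staff_branch_attrs = ["staffNo", "sName", "position", "salary", "branchNo", "bAddress"]
staff_branch_fds = [
    (("staffNo",), ("sName", "position", "salary", "branchNo", "bAddress")),
    (("branchNo",), ("bAddress",)),
    (("bAddress",), ("branchNo",)),
    (("branchNo", "position"), ("salary",)),
    (("bAddress", "position"), ("salary",)),
]

# A Branch relation holding only branchNo and bAddress.
branch_attrs = ["branchNo", "bAddress"]
branch_fds = [(("branchNo",), ("bAddress",)), (("bAddress",), ("branchNo",))]

staff_branch_bad = bcnf_violations(staff_branch_attrs, staff_branch_fds)
branch_bad = bcnf_violations(branch_attrs, branch_fds)
```

StaffBranch fails because only staffNo is a key, while the small Branch relation passes because both of its determinants are keys.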

Fourth Normal Form (4NF) Although BCNF removes anomalies due to functional dependencies, another type of dependency called a multi-valued dependency (MVD) can also cause data redundancy. Possible existence of multi-valued dependencies in a relation is due to 1NF and can result in data redundancy.

Fourth Normal Form (4NF) A multi-valued dependency can be further defined as being trivial or nontrivial. An MVD A −>> B in relation R is defined as being trivial if (a) B is a subset of A or (b) A ∪ B = R. An MVD is defined as being nontrivial if neither (a) nor (b) is satisfied. A trivial MVD does not specify a constraint on a relation, while a nontrivial MVD does specify a constraint.
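The two triviality conditions translate directly into set operations. The sketch below uses a hypothetical relation with attributes branchNo, sName and oName; the function name is an assumption made for the example.

```python
def is_trivial_mvd(a, b, relation_attrs):
    """An MVD A ->> B is trivial if B is a subset of A, or A union B is all of R."""
    a, b = set(a), set(b)
    return b <= a or (a | b) == set(relation_attrs)

# Hypothetical relation schema, invented for illustration.
R = ["branchNo", "sName", "oName"]

case_subset = is_trivial_mvd(["branchNo", "sName"], ["sName"], R)    # condition (a)
case_union = is_trivial_mvd(["branchNo"], ["sName", "oName"], R)     # condition (b)
case_nontrivial = is_trivial_mvd(["branchNo"], ["sName"], R)         # neither
```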

Fourth Normal Form (4NF) Defined as a relation that is in Boyce-Codd Normal Form and contains no nontrivial multi-valued dependencies.
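Whether an MVD A −>> B holds in a given instance can be checked by grouping rows on A and confirming that the B values and the remaining attribute values occur in every combination. This is a sketch only: the function name and the branch/staff/owner rows are invented, A and B are assumed to be disjoint, and an instance can only refute an MVD, not prove it.

```python
from itertools import product

def mvd_holds(rows, a, b, all_attrs):
    """Check A ->> B: within each A group, (B, rest) pairs form a full cross product."""
    rest = [x for x in all_attrs if x not in a and x not in b]
    groups = {}
    for row in rows:
        key = tuple(row[x] for x in a)
        b_vals, rest_vals, pairs = groups.setdefault(key, (set(), set(), set()))
        bv = tuple(row[x] for x in b)
        rv = tuple(row[x] for x in rest)
        b_vals.add(bv)
        rest_vals.add(rv)
        pairs.add((bv, rv))
    return all(pairs == set(product(b_vals, rest_vals))
               for b_vals, rest_vals, pairs in groups.values())

# Hypothetical rows: staff and owners of a branch are independent of each other.
attrs = ["branchNo", "sName", "oName"]
branch_staff_owner = [
    {"branchNo": "B3", "sName": "Ann", "oName": "Carol"},
    {"branchNo": "B3", "sName": "Ann", "oName": "Tina"},
    {"branchNo": "B3", "sName": "David", "oName": "Carol"},
    {"branchNo": "B3", "sName": "David", "oName": "Tina"},
]

full_instance_ok = mvd_holds(branch_staff_owner, ["branchNo"], ["sName"], attrs)
missing_row_ok = mvd_holds(branch_staff_owner[:3], ["branchNo"], ["sName"], attrs)
```

The four rows needed to record two staff and two owners show the redundancy that 4NF removes by splitting the relation into (branchNo, sName) and (branchNo, oName).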

4NF - Example