Database Principles. Constructed by Hanh Pham based on slides from: “Database Processing: Fundamentals, Design, and Implementation”, D. Kroenke, D. Auer, Prentice Hall, and “Database Principles: Fundamentals of Design, Implementation, and Management”, C. Coronel, S. Morris, P. Rob. Relational Databases. This slide deck has 2 parts: part 1 with pages 1-32 and part 2 with pages 33-69.
Outline (of part 1): The Problem; Relational Model; Entity; Relation; Functional dependency; Determinant; Keys (candidate key, composite key, primary key, surrogate key, foreign key).
Chapter Objectives: To understand basic relational terminology. To understand the characteristics of relations. To understand alternative terminology used in describing the relational model. To be able to identify functional dependencies, determinants, and dependent attributes. To identify primary, candidate, and composite keys. To learn about relational database operators, the data dictionary, and the system catalog. To learn how data redundancy is handled in the relational database model. To learn why indexing is important.
The Problem: We have received one or more tables of existing data. The data is to be stored in a new database. QUESTION: Should the data be stored as received, or should it be transformed for storage?
How Many Tables? Should we store these two tables as they are, or should we combine them into one table in our new database?
The Relational Model: Introduced in 1970 by E. F. Codd, who was an IBM engineer. The model uses mathematics known as “relational algebra”. It is now the standard model for commercial DBMS products.
A Logical View of Data: The relational model lets you view data logically rather than physically. The logical view is based on the relation, which is thought of as a table: a two-dimensional structure composed of rows and columns. Tables provide structural and data independence and resemble files conceptually. The relational database model is easier to understand than the hierarchical and network models.
Important Relational Model Terms: Entity; Relation; Functional dependency; Determinant; Keys (candidate key, composite key, primary key, surrogate key, foreign key); Referential integrity constraint; Normal form; Multivalued dependency.
An entity is some identifiable thing that users want to track, for example customers, computers, or sales.
Relation: Relational DBMS products store data about entities in relations, which are a special type of table. A relation is a two-dimensional table that has the following characteristics: Rows contain data about an entity. Columns contain data about attributes of the entity. All entries in a column are of the same kind. Each column has a unique name. Cells of the table hold a single value. The order of the columns is unimportant. The order of the rows is unimportant. No two rows may be identical.
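The characteristics of a relation listed above can be checked mechanically for a small in-memory table. The following is a minimal sketch, not a definitive test: the function name is_relation and the sample employee rows are hypothetical, and "same kind" is approximated by comparing Python types.

```python
from typing import Any


def is_relation(columns: list[str], rows: list[tuple[Any, ...]]) -> bool:
    """Return True if the table passes the relation checks we can automate."""
    # Each column has a unique name.
    if len(set(columns)) != len(columns):
        return False
    for row in rows:
        # Every row supplies exactly one entry per column.
        if len(row) != len(columns):
            return False
        # Cells hold a single value, not a collection of values.
        if any(isinstance(cell, (list, tuple, set, dict)) for cell in row):
            return False
    # All entries in a column are of the same kind (approximated by type).
    for position in range(len(columns)):
        if len({type(row[position]) for row in rows}) > 1:
            return False
    # No two rows may be identical.
    return len(set(rows)) == len(rows)


# Hypothetical sample data for illustration only.
columns = ["EmployeeNumber", "LastName", "Department"]
valid = [(100, "Jones", "Accounting"), (200, "Abel", "Finance")]
multi_valued = [(100, "Jones", ["555-1234", "555-9876"])]
duplicated = [(100, "Jones", "Accounting"), (100, "Jones", "Accounting")]

print(is_relation(columns, valid))
print(is_relation(columns, multi_valued))
print(is_relation(columns, duplicated))
```

Row order and column order are deliberately not checked, since a relation places no meaning on either.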
Relation
Tables That Are Not Relations: Multiple Entries per Cell
Tables That Are Not Relations: Table with Required Row Order
A Relation with Values of Varying Length
Alternative Terminology: Although not all tables are relations, the terms table and relation are normally used interchangeably. The following sets of terms are equivalent: table, column, row; relation, attribute, tuple; file, field, record.
Functional Dependency: A functional dependency occurs when the value of one (set of) attribute(s) determines the value of a second (set of) attribute(s): StudentID → StudentName; StudentID → (DormName, DormRoom, Fee). The attribute on the left side of the functional dependency is called the determinant. Functional dependencies may be based on equations: ExtendedPrice = Quantity × UnitPrice, so (Quantity, UnitPrice) → ExtendedPrice. Functional dependencies are not equations!
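One way to make the definition concrete is to test whether sample rows contradict a proposed functional dependency. This is a minimal sketch under stated assumptions: the helper fd_holds and the student rows are hypothetical, and a passing check only shows that the sample does not violate the dependency, not that the dependency is logically true.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if no two rows share a determinant value but differ on the dependent."""
    seen = {}
    for row in rows:
        lhs = tuple(row[attr] for attr in determinant)
        rhs = tuple(row[attr] for attr in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


# Hypothetical sample rows for illustration only.
students = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "East", "Fee": 500},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "East", "Fee": 500},
    {"StudentID": 3, "StudentName": "Ann", "DormName": "West", "Fee": 650},
]

print(fd_holds(students, ["StudentID"], ["StudentName"]))
print(fd_holds(students, ["StudentName"], ["DormName"]))
print(fd_holds(students, ["DormName"], ["Fee"]))
```

In this sample, StudentName does not determine DormName because two students named Ann live in different dorms.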
Functional Dependencies Are Not Equations: ObjectColor → Weight; ObjectColor → Shape; ObjectColor → (Weight, Shape).
Composite Determinants: A composite determinant is a determinant of a functional dependency that consists of more than one attribute, for example (StudentName, ClassName) → (Grade).
Functional Dependency Rules: If A → (B, C), then A → B and A → C. If (A, B) → C, then, in general, neither A nor B determines C by itself.
Functional Dependencies in the SKU_DATA Table
Functional Dependencies in the SKU_DATA Table: SKU → (SKU_Description, Department, Buyer); SKU_Description → (SKU, Department, Buyer); Buyer → Department.
Functional Dependencies in the ORDER_ITEM Table
Functional Dependencies in the ORDER_ITEM Table: (OrderNumber, SKU) → (Quantity, Price, ExtendedPrice); (Quantity, Price) → (ExtendedPrice).
What Makes Determinant Values Unique? A determinant is unique in a relation if and only if it determines every other column in the relation. You cannot find the determinants of all functional dependencies simply by looking for unique values in one column: the data set may be too limited to show the real pattern, and a column must logically be a determinant, not just happen to hold unique values.
Keys: A key is a combination of one or more columns that is used to identify rows in a relation. A composite key is a key that consists of two or more columns.
Candidate and Primary Keys: A candidate key is a key that determines all of the other columns in a relation. A primary key is a candidate key selected as the primary means of identifying rows in a relation. There is only one primary key per relation. The primary key may be a composite key. The ideal primary key is short, numeric, and never changes.
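Because a candidate key determines every other column, candidate keys can be derived from a set of functional dependencies by computing attribute closures. The sketch below uses the SKU_DATA and ORDER_ITEM dependencies given earlier in the deck; the function names closure and candidate_keys are hypothetical, and the brute-force search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency when its whole determinant is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure covers the relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            # Supersets of a key already found are not minimal, so skip them.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


SKU_DATA = {"SKU", "SKU_Description", "Department", "Buyer"}
SKU_FDS = [
    (("SKU",), ("SKU_Description", "Department", "Buyer")),
    (("SKU_Description",), ("SKU", "Department", "Buyer")),
    (("Buyer",), ("Department",)),
]

ORDER_ITEM = {"OrderNumber", "SKU", "Quantity", "Price", "ExtendedPrice"}
ORDER_FDS = [
    (("OrderNumber", "SKU"), ("Quantity", "Price", "ExtendedPrice")),
    (("Quantity", "Price"), ("ExtendedPrice",)),
]

sku_keys = candidate_keys(SKU_DATA, SKU_FDS)
order_keys = candidate_keys(ORDER_ITEM, ORDER_FDS)
print(sku_keys)
print(order_keys)
```

SKU_DATA has two single-column candidate keys, either of which could be chosen as the primary key, while ORDER_ITEM has one composite candidate key. Buyer is a determinant but not a candidate key, because it determines only Department.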
Surrogate Keys: A surrogate key is an artificial column added to a relation to serve as a primary key. It is supplied by the DBMS. It is short, numeric, and never changes, which makes it an ideal primary key. Its artificial values are meaningless to users, so it is normally hidden in forms and reports.
Surrogate Keys: RENTAL_PROPERTY without surrogate key: RENTAL_PROPERTY (Street, City, State/Province, Zip/PostalCode, Country, Rental_Rate), with the composite primary key (Street, City, State/Province, Zip/PostalCode, Country). RENTAL_PROPERTY with surrogate key: RENTAL_PROPERTY (PropertyID, Street, City, State/Province, Zip/PostalCode, Country, Rental_Rate), with primary key PropertyID.
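A DBMS-supplied surrogate key can be sketched with SQLite through Python's standard sqlite3 module. This is an illustration under assumptions, not the deck's own implementation: the column names StateProvince and ZipPostalCode stand in for State/Province and Zip/PostalCode (a slash is not valid in an unquoted SQL identifier), and the address rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE RENTAL_PROPERTY (
        PropertyID    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        Street        TEXT NOT NULL,
        City          TEXT NOT NULL,
        StateProvince TEXT NOT NULL,
        ZipPostalCode TEXT NOT NULL,
        Country       TEXT NOT NULL,
        Rental_Rate   REAL NOT NULL,
        -- The natural composite key is still kept unique.
        UNIQUE (Street, City, StateProvince, ZipPostalCode, Country)
    )
    """
)

# Hypothetical rows; PropertyID is omitted so the DBMS assigns it.
insert = (
    "INSERT INTO RENTAL_PROPERTY "
    "(Street, City, StateProvince, ZipPostalCode, Country, Rental_Rate) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)
conn.execute(insert, ("12 Oak St", "Seattle", "WA", "98101", "USA", 1500.0))
conn.execute(insert, ("98 Elm Ave", "Vancouver", "BC", "V5K 0A1", "Canada", 1800.0))

property_ids = [
    row[0]
    for row in conn.execute("SELECT PropertyID FROM RENTAL_PROPERTY ORDER BY PropertyID")
]
print(property_ids)
```

Other tables can now reference a property by the single short PropertyID column instead of repeating all five address columns.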
Foreign Keys: A foreign key is the primary key of one relation that is placed in another relation to form a link between the relations. A foreign key can be a single column or a composite key. The term refers to the fact that key values are foreign to the relation in which they appear as foreign key values.
Foreign Keys: DEPARTMENT (DepartmentName, BudgetCode, ManagerName), with primary key DepartmentName. EMPLOYEE (EmployeeNumber, EmployeeLastName, EmployeeFirstName, DepartmentName), with primary key EmployeeNumber and foreign key DepartmentName, which references DEPARTMENT.
Summary: Tables are the basic building blocks of a relational database. Keys are central to the use of relational tables. Keys define functional dependencies. Each table row must have a primary key that uniquely identifies all attributes. Tables are linked by common attributes.
Relational Databases (II)
Outline (of part 2): Relational Databases (Cont.); Referential Integrity Constraint; Relational Set Operators; Dictionary, Catalog; Indexes.
The Referential Integrity Constraint: A referential integrity constraint is a statement that limits the values of the foreign key to those already existing as primary key values in the corresponding relation.
Foreign Key with a Referential Integrity Constraint: SKU_DATA (SKU, SKU_Description, Department, Buyer), with primary key SKU. ORDER_ITEM (OrderNumber, SKU, Quantity, Price, ExtendedPrice), with primary key (OrderNumber, SKU) and foreign key SKU, where ORDER_ITEM.SKU must exist in SKU_DATA.SKU.
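The constraint that ORDER_ITEM.SKU must exist in SKU_DATA.SKU can be tried out in SQLite via Python's sqlite3 module. This is a minimal sketch: the column types and the inserted values are assumptions made for illustration, and SQLite only enforces foreign keys after PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE SKU_DATA (
        SKU             INTEGER PRIMARY KEY,
        SKU_Description TEXT,
        Department      TEXT,
        Buyer           TEXT
    );
    CREATE TABLE ORDER_ITEM (
        OrderNumber   INTEGER,
        SKU           INTEGER REFERENCES SKU_DATA (SKU),  -- foreign key
        Quantity      INTEGER,
        Price         REAL,
        ExtendedPrice REAL,
        PRIMARY KEY (OrderNumber, SKU)
    );
    """
)

# Hypothetical values for illustration only.
conn.execute("INSERT INTO SKU_DATA VALUES (100100, 'Sample item', 'Sample dept', 'Sample buyer')")
conn.execute("INSERT INTO ORDER_ITEM VALUES (1000, 100100, 2, 50.0, 100.0)")

try:
    # SKU 999999 has no matching row in SKU_DATA, so the insert must fail.
    conn.execute("INSERT INTO ORDER_ITEM VALUES (1000, 999999, 1, 25.0, 25.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_item_count = conn.execute("SELECT COUNT(*) FROM ORDER_ITEM").fetchone()[0]
print(rejected, order_item_count)
```

The second ORDER_ITEM insert is refused by the DBMS itself, so the application cannot store an order line for a SKU that does not exist.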
Many RDBMSs enforce integrity rules automatically, but it is safer to ensure that the application design conforms to entity and referential integrity rules. Designers use flags to avoid nulls; a flag indicates the absence of some value.
Relational Set Operators: Relational algebra defines the theoretical way of manipulating table contents using relational operators. Using relational algebra operators on existing relations produces new relations. The operators are SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, and DIVIDE.
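Several of these operators map directly onto Python sets when a relation is modeled as a set of tuples. The sketch below is an informal illustration, not a full relational algebra: the helper names select, project, and product and the sample rows are hypothetical, and columns are addressed by position rather than by name.

```python
def select(rows, predicate):
    """SELECT: keep only the rows that satisfy the predicate."""
    return {row for row in rows if predicate(row)}


def project(rows, positions):
    """PROJECT: keep only the listed column positions; duplicate rows collapse."""
    return {tuple(row[i] for i in positions) for row in rows}


def product(left, right):
    """PRODUCT: pair every row of left with every row of right."""
    return {lrow + rrow for lrow in left for rrow in right}


# Two hypothetical relations with the same columns (id, name).
a = {(1, "Ann"), (2, "Bob"), (3, "Cy")}
b = {(2, "Bob"), (4, "Dee")}

union = a | b          # UNION: rows in either relation
intersect = a & b      # INTERSECT: rows in both relations
difference = a - b     # DIFFERENCE: rows in a that are not in b

print(select(a, lambda row: row[0] > 1))
print(project(union, [1]))
print(len(product(a, b)))
```

Each operation takes relations as input and returns a new relation, which is what lets relational operators be combined freely.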
Relational Set Operators: A natural join links tables by selecting rows with common values in common attributes (join columns). An equijoin links tables on the basis of an equality condition that compares specified columns. A theta join is the same kind of join with any other comparison operator used in place of equality.
Relational Set Operators: An inner join returns only the matched records from the tables that are being joined. In an outer join, matched pairs are retained, and any unmatched values in the other table are left null.
Relational Set Operators: A left outer join yields all of the rows in the CUSTOMER table, including those that do not have a matching value in the AGENT table. A right outer join yields all of the rows in the AGENT table, including those that do not have matching values in the CUSTOMER table.
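The difference between inner, left outer, and right outer joins can be seen on a small CUSTOMER and AGENT pair in SQLite. This is a sketch under assumptions: the column names and the rows are hypothetical, and the right outer join is written as a left outer join with the tables swapped, because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AGENT (AGENT_CODE INTEGER PRIMARY KEY, AGENT_NAME TEXT);
    CREATE TABLE CUSTOMER (
        CUS_CODE   INTEGER PRIMARY KEY,
        CUS_NAME   TEXT,
        AGENT_CODE INTEGER
    );
    INSERT INTO AGENT VALUES (501, 'Alby'), (502, 'Hahn'), (503, 'Okon');
    INSERT INTO CUSTOMER VALUES (1, 'Ramas', 501), (2, 'Dunne', 502), (3, 'Smith', NULL);
    """
)

# Inner join: only customers that have a matching agent.
inner = conn.execute(
    """
    SELECT CUS_NAME, AGENT_NAME
    FROM CUSTOMER JOIN AGENT ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY CUS_CODE
    """
).fetchall()

# Left outer join: every CUSTOMER row, with NULL where no agent matches.
left_outer = conn.execute(
    """
    SELECT CUS_NAME, AGENT_NAME
    FROM CUSTOMER LEFT OUTER JOIN AGENT ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY CUS_CODE
    """
).fetchall()

# Right outer join, expressed by swapping the tables: every AGENT row.
right_outer = conn.execute(
    """
    SELECT CUS_NAME, AGENT_NAME
    FROM AGENT LEFT OUTER JOIN CUSTOMER ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY AGENT.AGENT_CODE
    """
).fetchall()

print(inner)
print(left_outer)
print(right_outer)
```

The customer without an agent appears only in the left outer join, and the agent without customers appears only in the right outer join.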
The Data Dictionary and System Catalog: The data dictionary provides a detailed accounting of all tables found within the user/designer-created database. It contains (at least) all the attribute names and characteristics for each table in the system, and it contains metadata: data about data. The system catalog also contains metadata; it is a detailed system data dictionary that describes all objects within the database.
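Most DBMS products expose their catalog as queryable metadata. As one concrete illustration, SQLite keeps a catalog table named sqlite_master and reports column characteristics through PRAGMA table_info. The sketch below uses the DEPARTMENT relation from earlier in the deck; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DEPARTMENT ("
    "DepartmentName TEXT PRIMARY KEY, BudgetCode TEXT, ManagerName TEXT)"
)

# The catalog lists every table defined in the database.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type, pk)
    for _cid, name, col_type, _notnull, _default, pk in conn.execute(
        "PRAGMA table_info(DEPARTMENT)"
    )
]

print(tables)
print(columns)
```

The attribute names, their types, and the primary key flag are all data about data, read back with ordinary queries.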
The Data Dictionary and System Catalog: A homonym is the use of the same name to label different attributes. A synonym, the opposite of a homonym, is the use of different names to describe the same attribute.
Relationships within the Relational Database: The 1:M relationship is the relational modeling ideal and should be the norm in any relational database design. The 1:1 relationship should be rare in any relational database design. M:N relationships cannot be implemented as such in the relational model, but they can be changed into 1:M relationships.
The 1:M Relationship: The relational database norm, found in any database environment.
The 1:1 Relationship: One entity is related to only one other entity, and vice versa. A 1:1 relationship sometimes means that entity components were not defined properly, and it could indicate that two entities actually belong in the same table. Certain conditions, however, absolutely require its use.
The M:N Relationship: Implemented by breaking it up to produce a set of 1:M relationships. Avoid the problems inherent to the M:N relationship by creating a composite entity, which includes as foreign keys the primary keys of the tables to be linked.
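The composite-entity approach can be sketched with a student and class example in SQLite. The table names STUDENT, CLASS, and ENROLL, their columns, and the rows are all hypothetical; ENROLL is the composite entity whose primary key combines the two foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE STUDENT (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE CLASS (ClassID INTEGER PRIMARY KEY, ClassName TEXT);
    -- Composite entity: one row per (student, class) pairing.
    CREATE TABLE ENROLL (
        StudentID INTEGER REFERENCES STUDENT (StudentID),
        ClassID   INTEGER REFERENCES CLASS (ClassID),
        Grade     TEXT,
        PRIMARY KEY (StudentID, ClassID)
    );
    INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CLASS VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO ENROLL VALUES (1, 10, 'A'), (1, 20, 'B'), (2, 10, 'B');
    """
)

roster = conn.execute(
    """
    SELECT StudentName, ClassName
    FROM ENROLL
    JOIN STUDENT USING (StudentID)
    JOIN CLASS USING (ClassID)
    ORDER BY StudentName, ClassName
    """
).fetchall()
print(roster)
```

STUDENT to ENROLL and CLASS to ENROLL are each 1:M, and together they represent the M:N relationship between students and classes.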
Data Redundancy Revisited: Data redundancy leads to data anomalies, which can destroy the effectiveness of the database. Foreign keys control data redundancies by using common attributes shared by tables, and they are crucial to exercising data redundancy control. Sometimes, data redundancy is necessary.
Indexes: An index is an orderly arrangement used to logically access rows in a table. The index key is the index’s reference point; it points to the data location identified by the key. A unique index is an index in which the index key can have only one pointer value (row) associated with it. Each index is associated with only one table.
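The difference between an ordinary index and a unique index can be tried out in SQLite. This is a minimal sketch: the Email column, the index names, and the rows are hypothetical additions to an EMPLOYEE table, used only to show that a unique index allows one row per key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EmployeeNumber INTEGER PRIMARY KEY, EmployeeLastName TEXT, Email TEXT)"
)

# Ordinary index: many rows may share the same index key value.
conn.execute("CREATE INDEX idx_employee_lastname ON EMPLOYEE (EmployeeLastName)")
# Unique index: each index key value may point to only one row.
conn.execute("CREATE UNIQUE INDEX idx_employee_email ON EMPLOYEE (Email)")

conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Jones', 'jones@example.com')")
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Jones', 'jones2@example.com')")

try:
    # Same Email as employee 1, so the unique index must refuse it.
    conn.execute("INSERT INTO EMPLOYEE VALUES (3, 'Abel', 'jones@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(duplicate_rejected, employee_count)
```

Two employees may share the last name Jones under the ordinary index, but a repeated Email value is rejected. Both indexes belong to the single table EMPLOYEE.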
Codd’s Relational Database Rules: In 1985, Codd published a list of 12 rules to define a relational database system. He was concerned that products were being marketed as “relational” even though they did not meet minimum relational standards. Even dominant database vendors do not fully support all 12 rules.
Summary: Tables are the basic building blocks of a relational database. Keys are central to the use of relational tables. Keys define functional dependencies. Each table row must have a primary key that uniquely identifies all attributes. Tables are linked by common attributes. The relational model supports the relational algebra functions SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, and DIVIDE. Good design begins by identifying entities, attributes, and relationships (1:1, 1:M, M:N).