Database Systems The Basic (Flat) Relational Model
Relational Model Concepts (1)
Table is called relation
Row (record) is called tuple
Column header is called attribute

Students
SSN        Name  Age  GPA
123456789  John  20   3.2
23456789   Mary  18   2.9
345678912  Bill  19   2.7

Figure labels: name of the relation (Students), attributes (column headers), tuples (rows), columns
Relational Model Concepts (2) Given a tuple t and an attribute A in a relation R, t[A] represents the value of t under A in R Example: If t is the second tuple in Students t[Name] = Mary t[Age] = 18 t[Name, Age] = (Mary, 18)
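The t[A] notation can be tried out with a small sketch. It assumes a tuple is modeled as a Python dict keyed by attribute name; the helper name `project` is hypothetical and not part of the slides.

```python
def project(t, *attrs):
    """Return t[A] for one attribute, or the value list t[A1, ..., Ak] for several."""
    if len(attrs) == 1:
        return t[attrs[0]]
    return tuple(t[a] for a in attrs)


# The Students relation from the earlier slide, one dict per tuple.
students = [
    {"SSN": "123456789", "Name": "John", "Age": 20, "GPA": 3.2},
    {"SSN": "23456789", "Name": "Mary", "Age": 18, "GPA": 2.9},
    {"SSN": "345678912", "Name": "Bill", "Age": 19, "GPA": 2.7},
]

t = students[1]                   # the second tuple
print(project(t, "Name"))         # Mary
print(project(t, "Name", "Age"))  # ('Mary', 18)
```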
Domain of an Attribute Definition: The set of values that an attribute can take on is the domain of the attribute dom(A) --- the domain of attribute A A domain is usually represented by a type Examples: SSN char(9) --- character string of length 9 Name varchar(30) --- character string of variable length up to 30 characters Age number --- a number
Relational Model Concepts (3) Two aspects of a relation: Schema --- the set of attributes of R. State (or contents) --- the CURRENT set of tuples of R (denoted by r(R)) Schema of a relation rarely changes. Some possible changes are: Rename an attribute Delete an attribute Add an attribute Delete the schema
Relational Model Concepts (4) The state of a relation may change frequently. Some possible changes are: Modify some attribute values Delete an existing tuple Insert a new tuple A given schema may have different states at different times.
An Example Database

Students
SSN   Name  Major  GPA
1234  Jeff  CS     3.2
2345  Mary  Math   3.0
3456  Bob   CS     2.7
4567  Wang  EE     2.9

Departments
Name     Location  Chairperson
CS       N18 EB    Aggarwal
EE       Q4 EB     Sackman
Math     LN2200    Hanson
Biology  210 S3    Smith

Courses
Name       Course#  CreditHours  Dept
Database   CS432    4            CS
Database   CS532    4            CS
Dis. Math  Math314  4            Math
Lin. Alg.  Math304  4            Math

Sections
Course#  Section#  Semester  Instructor
CS432    01        Fall98    Meng
CS532    01        Fall98    Meng
Math314  02        Fall97    Hanson
Math304  01        Spring97  Brown
Relation Schema A relation schema is used to describe a relation Denoted by R(A1, A2, A3, …, An), where R: Relation schema name A1, …, An: attributes of R The degree of a table/relation is the number of attributes in a relation schema
Examples STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA) Degree(STUDENT) = 7 dom(Name) is the set of all names that consist of at most 30 characters. dom(SSN) is the set of valid 9-digit social security numbers. dom(HomePhone) is the set of local phone numbers.
Relation Instance (1) A relation (or relation instance) r of the relation schema R(A1, A2, …, An), denoted by r(R), is a set of n-tuples r = {t1, t2, …, tm}. Each n-tuple ti is an ordered list of n values, ti = <v1, v2, …, vn>, where each value vi, 1 ≤ i ≤ n, is an element of dom(Ai) or a special null value. The Cartesian product of two sets is the set of all pairs whose first element comes from the first set and whose second element comes from the second set.
Relation Instance (2) A relation r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2), …, dom(An), that is, a subset of the Cartesian product of the domains that define R: r(R) ⊆ dom(A1) × dom(A2) × … × dom(An). We denote the number of values, or cardinality, of a domain D by |D|. The total number of tuples in the Cartesian product is |dom(A1)| × |dom(A2)| × … × |dom(An)|.
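A small sketch can make the subset and cardinality statements concrete. The three domains below are made-up illustrative sets, not taken from the slides, and null values are ignored for simplicity.

```python
from itertools import product
from math import prod

# Hypothetical finite domains for a relation schema R(A1, A2, A3).
dom_A1 = {"a1", "a2"}
dom_A2 = {"b1", "b2", "b3"}
dom_A3 = {0, 1}
domains = (dom_A1, dom_A2, dom_A3)

# The Cartesian product contains every possible 3-tuple over the domains.
cartesian = set(product(*domains))
print(len(cartesian))                 # 12
print(prod(len(d) for d in domains))  # 12, i.e. 2 * 3 * 2

# Any relation state r(R) must be a subset of that product.
r = {("a1", "b1", 0), ("a2", "b3", 1)}
is_valid_state = r <= cartesian
print(is_valid_state)                 # True
```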
Characteristics of Relations A relation has a name that is distinct from all other relation names Each cell of the relation contains exactly one atomic (single) value Each attribute has a distinct name (unique) All values of an attribute (single column) are from the same domain (same type and meaning)
Characteristics of Relations Ordering of attributes in a relation schema R (and of values within each tuple): We will consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be ordered. (However, a more general alternative definition of relation does not require this ordering.) Each tuple is distinct; there are no duplicate tuples. Theoretically, the order of tuples has no significance; in practice, ordering matters only for efficiency.
Relational Model Constraints Domain Constraint No multi-valued attributes are allowed in a table. That is, for any tuple t and attribute A in a table, t[A] must be a single atomic value. Therefore composite and multi-valued attributes are not directly represented in the relational model. Attribute data should match its data type, i.e. t[A] must be a value from dom(A).
Examples of Multi-valued Attributes

Employees
SSN        Name  Age  Dependents
123456789  Bob   34   Allen, Ann
234567891  Mary  42   Kathy
345678912  Bill  47   Mike, Susan, David

Other examples:
Attribute Authors of relation Books
Attribute Reference_Books of relation Courses
Attribute Hobbies of relation Employees
Key Constraints All tuples in a relation must be distinct: no two rows in the same table can be identical at any given time. That is, each tuple in a table is unique. This rule comes from the mathematical definition that a relation is a set of tuples and the fact that a set never contains two identical elements. This rule has serious implications for the performance of relational database systems: when a new tuple is inserted into a relation, the system has to make sure that the new tuple is different from all existing tuples in the relation.
Superkey (1) Definition: A superkey (SK) of a relation is a set of one or more attributes whose values uniquely identify every tuple of the relation Superkey: a subset of attributes whose values are distinct for each tuple in R Superkeys may contain redundant (extra) attributes Examples: Attribute SSN is a SK of relation Students {SSN, Name} is also a SK The set of attributes {Name, Birthdate, Home_Address} is a SK of Students
Superkey (2)
Is the set of all attributes of a relation a superkey of the relation?
In the following relation, is attribute A a superkey? How about {B, C}?

A   B   C   D
a1  b1  c1  d1
a1  b2  c2  d1
a2  b2  c1  d1
a2  b1  c2  d1
Superkey (3) Some claims: Every relation has at least one superkey Any superset of a superkey is a superkey From a given state of a relation, we may determine whether a set of attributes of the relation does not form a superkey, but we can not determine if a set of attributes forms a superkey
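The last claim can be illustrated with a short sketch. It assumes tuples are modeled as dicts; the function name `is_unique_in_state` and the extra tuple added at the end are hypothetical. A state can refute a superkey, but passing the check on one state says nothing about later states.

```python
def is_unique_in_state(rows, attrs):
    """True if no two tuples of this particular state agree on all of attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


# The four-tuple relation over A, B, C, D used in the superkey exercise.
state = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c2", "D": "d1"},
    {"A": "a2", "B": "b2", "C": "c1", "D": "d1"},
    {"A": "a2", "B": "b1", "C": "c2", "D": "d1"},
]

print(is_unique_in_state(state, ["A"]))       # False: A is certainly not a superkey
print(is_unique_in_state(state, ["B", "C"]))  # True: not refuted by this state

# A hypothetical later state: if the schema allows this tuple, {B, C} is refuted too.
later = state + [{"A": "a3", "B": "b1", "C": "c1", "D": "d2"}]
print(is_unique_in_state(later, ["B", "C"]))  # False
```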
Key (1) Key: a subset of attributes in R whose values are unique for each tuple in r(R), but with no redundant (extra) attributes. Definition: A set of attributes is a key of a relation if (1) it is a superkey of the relation, and (2) no proper subset of it is a superkey of the relation. A key of any relation is a minimal superkey. Example: Student ID is a candidate key for Student, since it is a superkey and no proper subset of it is a superkey.
Key (2) If any attribute is removed from a key, then the remaining attributes no longer form a key (minimality property) Example: Students(SSN, Name, Home_Address, Birthdate, GPA), SSN is a key. {SSN, Name} is a superkey but not a key {Name, Home_Address, Birthdate} is also a key
Key (3) Every relation has at least one key. A relation may have more than one key. Keys of a relation are also known as candidate keys of the relation. Candidate key: a set of attributes that can be used as a key. Examples: Customer-id is a candidate key of customer; account-number is a candidate key of account.
Candidate Key
Example: Find all possible candidate keys for the following relation based on its current tuples.

A   B   C   D
a1  b1  c1  d1
a1  b2  c2  d1
a2  b2  c1  d1
a2  b1  c2  d1

Answer: {A, B}, {A, C}, {B, C}
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key
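The answer to the exercise can be checked by brute force. This is a minimal sketch, assuming tuples are modeled as dicts; the function name `candidate_keys` is hypothetical. Like the exercise, it reports only the keys consistent with the current tuples, which is not proof that they hold for every legal state.

```python
from itertools import combinations


def candidate_keys(rows, attributes):
    """Minimal attribute sets whose values are distinct across the given tuples."""
    keys = []
    # Try smaller sets first so that minimality is easy to enforce.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            values = {tuple(row[a] for a in combo) for row in rows}
            if len(values) == len(rows):
                keys.append(combo)
    return keys


rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c2", "D": "d1"},
    {"A": "a2", "B": "b2", "C": "c1", "D": "d1"},
    {"A": "a2", "B": "b1", "C": "c2", "D": "d1"},
]

print(candidate_keys(rows, ["A", "B", "C", "D"]))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```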
Primary Key (1) Definition: A primary key of a relation is the candidate key chosen by the database designer for a particular application. The primary key of each relation is chosen and declared at the time when the relation is defined Once chosen, it cannot be changed The primary key is usually chosen to be the candidate key that has the smallest number of attributes to improve both storage efficiency and query processing efficiency
Primary Key (2) With the primary key defined, only the values under the attributes in the primary key need to be checked to identify duplicates when new tuples are inserted (an index is often used for this check). The primary key of a relation is often used in references from other relations.
Example STUDENT(ST-NO, SSN, Name, Age, GPA, … ) Superkeys: {ST-NO, Name}, {SSN, Age, Address} Candidate keys: SSN, ST-NO Primary Key:
Null Value For a given tuple t and a given attribute A of a relation R, the following cases may occur when t is to be inserted into R. t[A] is unknown t[A] is yet to be assigned t[A] is inapplicable When one of the above cases occurs, assign a null value to t[A]: t[A] = null
Constraints on Null Entity Integrity Constraint: No attribute in the primary key can take on null values, i.e. null is not allowed as a value for a primary key. Note: A null value is different from either a 0 or a space.
Foreign Key (1) Definition: A set of attributes of relation R1 is a foreign key FK in R1 if it satisfies the following two conditions: There is a relation R2 with the primary key PK such that FK and PK have the same number of attributes with compatible domains For any tuple ti in R1, either there exists a tuple tj in R2 such that ti[FK] = tj[PK] or ti[FK] is null
Foreign Key (2) R1(PK1, A1, A2, …, An, FK) R2(PK2, B1, B2, …, Bm) FK is called a foreign key iff: (1) the attributes in FK have the same domain as PK2, and (2) the value of FK in a tuple ti of R1 either occurs as a value of PK2 for some tuple tj in R2, i.e. ti[FK] = tj[PK2], or is null. When ti[FK] = tj[PK2], we say that ti refers to (references) tj. R1 and R2 in the definition could be the same relation, e.g. Employees(SSN, Age, Salary, Position, Manager_SSN), where Manager_SSN references SSN.
Foreign Key (3)

Employee
SSN        Name  Age  Dept_Name
123456789  John  45   Sales
234567891  Mary  42   Service
345678912  Bob   39   null

Department
Name       Location    Manager
Sales      Binghamton  Bill
Inventory  Vestal      Charles
Service    Vestal      Maria

Dept_Name of Employee is a foreign key referencing Name of Department
Foreign Key (4) Referential Integrity Constraint No relation is allowed to contain unmatched foreign key values States that a tuple in one relation which refers to another relation must refer to an existing tuple in that relation Using a foreign key of a relation to reference the (primary) key of another relation is THE WAY used by the relational data model to establish relationships among different relations
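The referential integrity rule can be expressed as a simple check over two relation states. The sketch below uses the Employee and Department tuples from the previous slide, modeled as dicts with None standing for null. The function name `dangling_references` and the extra employee "Ann" are hypothetical additions for illustration.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return tuples whose non-null foreign key value matches no primary key value."""
    pk_values = {row[pk] for row in referenced}
    return [row for row in referencing
            if row[fk] is not None and row[fk] not in pk_values]


department = [
    {"Name": "Sales", "Location": "Binghamton", "Manager": "Bill"},
    {"Name": "Inventory", "Location": "Vestal", "Manager": "Charles"},
    {"Name": "Service", "Location": "Vestal", "Manager": "Maria"},
]
employee = [
    {"SSN": "123456789", "Name": "John", "Age": 45, "Dept_Name": "Sales"},
    {"SSN": "234567891", "Name": "Mary", "Age": 42, "Dept_Name": "Service"},
    {"SSN": "345678912", "Name": "Bob", "Age": 39, "Dept_Name": None},  # null is allowed
]

ok = dangling_references(employee, "Dept_Name", department, "Name")
print(ok)  # []

# Hypothetical tuple that refers to a department that does not exist.
bad_state = employee + [
    {"SSN": "456789123", "Name": "Ann", "Age": 30, "Dept_Name": "Marketing"}
]
violations = dangling_references(bad_state, "Dept_Name", department, "Name")
print([row["Name"] for row in violations])  # ['Ann']
```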
Foreign Key (5)

Name  STNO  CSNO
Ruba  123   cs111
Ali   222   cs210

CSNO   Name           Hrs
cs111  Intro to com   3
cs210  C programming  3

CSNO in the first relation is a foreign key referencing CSNO in the second relation
Semantic Integrity Constraints GPA: greater than or equal to 0 and less than or equal to 4 Age: greater than 0 Grade: greater than or equal to 35 and less than or equal to 100
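These three rules can be written as predicates and checked per tuple. This is a minimal sketch, assuming tuples are dicts and that a null (None) value is not checked; the names `CHECKS` and `violated_attributes` are hypothetical.

```python
# One predicate per attribute, mirroring the three rules above.
CHECKS = {
    "GPA": lambda v: 0 <= v <= 4,
    "Age": lambda v: v > 0,
    "Grade": lambda v: 35 <= v <= 100,
}


def violated_attributes(t):
    """Return the attributes of tuple t whose values break a semantic constraint."""
    return [attr for attr, is_ok in CHECKS.items()
            if t.get(attr) is not None and not is_ok(t[attr])]


print(violated_attributes({"GPA": 3.2, "Age": 20, "Grade": 80}))  # []
print(violated_attributes({"GPA": 4.5, "Age": 20, "Grade": 30}))  # ['GPA', 'Grade']
```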
Update Operations on Relations Updates: Insert, Delete, or Modify. Retrievals: Queries.
Populated database state for COMPANY
The Insert Operation
Provides a list of attribute values for a new tuple (t) which is to be inserted into a relation r(R)
When inserting a new tuple in r(R), we should make sure that the values preserve all constraint types we studied before
An insertion may violate any of these constraint types
Insert <'Cecilia', 'F', 'Kolonsky', null, '1960-04-05', '3rd Street, Katy, TX', F, 28000, null, 4> into EMPLOYEE
Rejected: the primary key is null (Entity Integrity Constraint)
The Insert Operation
Insert <'Cecilia', 'F', 'Kolonsky', '999887777', '1960-04-05', '3rd Street, Katy, TX', F, 28000, null, 4> into EMPLOYEE
Rejected: the primary key value is a duplicate (Key Constraint)
Insert <'Cecilia', 'F', 'Kolonsky', '677678989', '1960-04-05', '3rd Street, Katy, TX', F, 28000, null, 7> into EMPLOYEE
Rejected: DEPARTMENT number 7 does not exist (Referential Integrity Constraint)
Insert <'Cecilia', 'F', 'Kolonsky', '677678989', '1960-04-05', '3rd Street, Katy, TX', F, 28000, null, 4> into EMPLOYEE
Accepted
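The four insert attempts can be replayed against a small in-memory database. This is only a sketch using Python's `sqlite3` module: EMPLOYEE is trimmed to four attributes, and the pre-existing rows (department 4 and an employee with SSN '999887777') are placeholder assumptions standing in for the populated COMPANY state. SQLite needs `PRAGMA foreign_keys = ON` to enforce foreign keys and an explicit NOT NULL on a text primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        SSN    TEXT NOT NULL PRIMARY KEY,   -- entity integrity and key constraint
        LNAME  TEXT,
        SALARY INTEGER,
        DNO    INTEGER REFERENCES DEPARTMENT(DNUMBER)  -- referential integrity
    );
    -- Assumed existing state (placeholder values):
    INSERT INTO DEPARTMENT VALUES (4);
    INSERT INTO EMPLOYEE VALUES ('999887777', 'Existing', 25000, 4);
""")


def try_insert(row):
    """Attempt one insert and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert((None, "Kolonsky", 28000, 4)),         # null primary key
    try_insert(("999887777", "Kolonsky", 28000, 4)),  # duplicate primary key
    try_insert(("677678989", "Kolonsky", 28000, 7)),  # department 7 does not exist
    try_insert(("677678989", "Kolonsky", 28000, 4)),  # satisfies all constraints
]
for outcome in results:
    print(outcome)
```

The first three attempts are reported as rejected and the last as accepted, matching the slide.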
The Delete Operation
Deletes a tuple or tuples from r(R)
It can violate only the referential integrity constraint
"Delete from WORKS_ON where ESSN = '999887777' and PNO = 10"
Accepted
"Delete from EMPLOYEE where SSN = '999887777'"
Rejected. Tuples in WORKS_ON refer to this tuple; if the tuple is deleted, referential integrity violations will result
The Delete Operation
"Delete from EMPLOYEE where SSN = '333445555'"
Rejected. Tuples in EMPLOYEE, DEPARTMENT, WORKS_ON, and DEPENDENT refer to this tuple; if the tuple is deleted, referential integrity violations will result
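The first two delete examples can be reproduced with a minimal `sqlite3` sketch. The schema is reduced to the attributes involved, and the assumed state (employee '999887777' working on projects 10 and 30) is a placeholder for the populated COMPANY database. With SQLite's default foreign key action, deleting a referenced tuple is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE WORKS_ON (
        ESSN TEXT NOT NULL REFERENCES EMPLOYEE(SSN),
        PNO  INTEGER NOT NULL,
        PRIMARY KEY (ESSN, PNO)
    );
    -- Assumed existing state (placeholder values):
    INSERT INTO EMPLOYEE VALUES ('999887777');
    INSERT INTO WORKS_ON VALUES ('999887777', 10), ('999887777', 30);
""")


def try_delete(sql):
    """Attempt one delete statement and report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# No other tuple refers to a WORKS_ON tuple, so this delete is safe.
first = try_delete("DELETE FROM WORKS_ON WHERE ESSN = '999887777' AND PNO = 10")
# A WORKS_ON tuple (project 30) still refers to this employee.
second = try_delete("DELETE FROM EMPLOYEE WHERE SSN = '999887777'")
print(first)
print(second)
```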
The Modify Operation
It is used to change the values of one or more attributes in one or more tuples of some relation r(R)
A modification may violate any of the constraints
"Update EMPLOYEE, set Salary = 29000 where SSN = '999887777'"
Accepted
"Update EMPLOYEE, set DNO = 1 where SSN = '999887777'"
The Modify Operation
"Update EMPLOYEE, set DNO = 7 where SSN = '999887777'"
Rejected: it violates the referential integrity constraint
"Update EMPLOYEE, set SSN = '980000000' where SSN = '999887777'"
"Update EMPLOYEE, set SSN = '987654321' where SSN = '999887777'"
Rejected: it violates the unique row (key) and referential integrity constraints
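Three of the modify examples, the ones with a stated outcome, can be replayed in the same way. This is a sketch with `sqlite3` under assumed data: department 4 exists, department 7 does not, and employees '999887777' and '987654321' both exist, with placeholder salaries. Because the sketch has no tables that reference EMPLOYEE, the last update fails here only on the key constraint, whereas in the full COMPANY database it would also break references to the old SSN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        SSN    TEXT NOT NULL PRIMARY KEY,
        SALARY INTEGER,
        DNO    INTEGER REFERENCES DEPARTMENT(DNUMBER)
    );
    -- Assumed existing state (placeholder values):
    INSERT INTO DEPARTMENT VALUES (4);
    INSERT INTO EMPLOYEE VALUES ('999887777', 25000, 4), ('987654321', 43000, 4);
""")


def try_update(sql):
    """Attempt one update statement and report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


salary = try_update("UPDATE EMPLOYEE SET SALARY = 29000 WHERE SSN = '999887777'")
dept = try_update("UPDATE EMPLOYEE SET DNO = 7 WHERE SSN = '999887777'")
ssn = try_update("UPDATE EMPLOYEE SET SSN = '987654321' WHERE SSN = '999887777'")
print(salary)  # accepted
print(dept)    # rejected: no department 7 to refer to
print(ssn)     # rejected: another tuple already has this primary key value
```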