376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 3 – Relational Data Model and Relational Algebra Prof. Billibon Yoshimi 9/18/2018
So far Covered chapters 1, 2, and Pages 409-424 Now covering chapter 6. Prof. Billibon Yoshimi 9/18/2018
MySQL Free relational database system Available for download from http://www.mysql.com Versions available for win32, unix, and Mac OS X. You may wish to download a copy for your dorm machine to experiment with administering the database. You'll use the Solaris version in the lab for development. Should start using it. Prof. Billibon Yoshimi 9/18/2018
The Final Project Prof. Billibon Yoshimi 9/18/2018
The Relational Data Model Need to know Domains, Tuple, Attributes and Relations. A relation is represented by a table. Each row is a collection of related data values. A row is also called a tuple. Each column holds the values for a single attribute of all the entities. Each column has the same type from the domain D (of all valid values for that type). Prof. Billibon Yoshimi 9/18/2018
Domain A domain is a set of atomic values (indivisible) Each domain has a name, data type, and format. e.g USA_phone_number, character string, (ddd)-ddd-dddd Social_security_numbers Grade_point_averages (where d are decimal digits and first 3 digits are valid area code) The domain of attribute, Ai, is written as dom(Ai) e.g. dom(HOME_PHONE) = USA_phone_numbers Prof. Billibon Yoshimi 9/18/2018
Relational Schema Used to describe a relation, R is the name of the relation. Degree of relation is n, the number of attributes in the schema. R(A1, A2, … An) Each attribute Ai is a role played by domain D in schema R. e.g. STUDENT(Name, SSN, HomePhone, Address, Age, GPA) has degree 6. Prof. Billibon Yoshimi 9/18/2018
Relational Instances Denoted by r or r( R ) r has n-tuples r = {t1, t2,… tm} Each n-tuple t is a list of values <v1, v2, … vn> where each vi is from dom(Ai). vi can be null. Remember: intension – R and extension – r( R) r( R) (dom(A1) X dom(A2) X … dom(An)) Prof. Billibon Yoshimi 9/18/2018
Example of STUDENT relation STUDENT(Name, SSN, HomePhone, Address, Age, GPA) STUDENT schema STUDENT Name SSN HomePhone Address Age GPA Bill Y 939732212 9147773333 19 J ST 35 3.5 Felica Z 333729999 Null 23 A ST 21 3.2 Mark M 123332122 8454372212 8 B ST 19 3.8 STUDENT relation Prof. Billibon Yoshimi 9/18/2018
Characteristics of Relations Tuples have no implicit order (can be ordered by other means, but implicitly are not ordered). Values within a tuple are ordered. Attributes are listed in a specific order. Alternative definition of relations use <attribute,value> pairs to remove this constraint. Prof. Billibon Yoshimi 9/18/2018
Values in tuples All values are atomic. Multivalued and composite values (from earlier) must be resolved Multivalued attributes are represented by separate relations. Composite attributes are reduced to simple component attributes. Prof. Billibon Yoshimi 9/18/2018
Relational Model Notation Q, R, S – relations Q, r, s – relation instances. T, u, v – tuples STUDENT – represents the current set of tuples in the current relation instance. STUDENT (Name, SSN …) – represent relation schema STUDENT.Age – attribute names qualified with relation name. Prof. Billibon Yoshimi 9/18/2018
Key attributes A relation is a set of tuples. No duplication in a set. Subsets of attributes in R can be used to maintain this constraint. SK – is a subset of attributes for relation schema R. t1[SK] != t2[SK] where t1 and t2 are distinct tuples. Prof. Billibon Yoshimi 9/18/2018
Super key Every set has at least one superkey (the set of all attributes) A key of relation schema R is a minimal superkey of R. Minimal superkey – remove any attribute and its no longer a superkey. Key are derived from data semantic. Key should be time invariant. Every potential key is a candidate key. One candiate key designated primary key (used to identify tuples in a relation.) Prof. Billibon Yoshimi 9/18/2018
Relational Database Schema A set of relation schemas. S = {R1, R2, … R3} Relational data base instance of a relational database schema is a set of relation instances (such that earch ri is taken from Ri in S) DB = {r1, r2, … r3} Prof. Billibon Yoshimi 9/18/2018
Two key integrity constraints Key constraint Entity integrity constraint Referential integrity Prof. Billibon Yoshimi 9/18/2018
Key constraints Relate to candidate keys of each relation. Must be unique for each tuple in each relation instance Prof. Billibon Yoshimi 9/18/2018
Entity integrity constraint No primary key may be null Can’t identify a tuple with a null key. Prof. Billibon Yoshimi 9/18/2018
Referential integrity constraint Maintains consistency between relations. Tuples can only refer to existing tuples. Foreign key (FK) in R1 A set of attributes (FK) in relation schema R1 having the same domain as primary key PK on relation R2. The attributes FK “refers to” R1. FK’s value for t1 in R1 is either a PK for t2 in R2 or is null. In first case, t1 “refers to” t2. Prof. Billibon Yoshimi 9/18/2018
Foreign keys Can refer to their own relation. Diagrammatically display referential integrity using a directed graph. Draw an arrow from the foreign key to the primary key. Prof. Billibon Yoshimi 9/18/2018
Foreign key example EMPLOYEE FNAME MINIT LNAME SSN BDATE ADDR SUPERSSN DEPT_LOCATIONS DEPT LOCATION DEPARTMENT DNAME DNUMBER MGRSSN MGRSTARTDATE PROJECT PNAME PNUM PLOC DNUM WORKS_ON ESSN PNO HOURS DEPENDENT ESSN DEPENDENT_NAME SEX BDATE RELATIONSHIP Prof. Billibon Yoshimi 9/18/2018
Foreign key example EMPLOYEE FNAME MINIT LNAME SSN BDATE ADDR SUPERSSN DEPT DEPT_LOCATIONS DEPT LOCATION DEPARTMENT DNAME DNUMBER MGRSSN MGRSTARTDATE PROJECT PNAME PNUM PLOC DNUM WORKS_ON ESSN PNO HOURS DEPENDENT ESSN DEPENDENT_NAME SEX BDATE RELATIONSHIP Prof. Billibon Yoshimi 9/18/2018
Update operations on relations Look at three basic operations, insert, delete and modify. Verify that constraints are satisfied for these operations. Prof. Billibon Yoshimi 9/18/2018
Try insert Given this relations schema, do EMPLOYEE FNAME MINIT LNAME SSN BDATE ADDR SUPERSSN DEPT Given this relations schema, do Insert <‘Bill’, ‘H’, ‘Yoshimi’, ‘333443333’, ’05-05-1970’, ‘Ave. B, Morgan, TX’, null, 4> into EMPLOYEE What is the required to succeed? Prof. Billibon Yoshimi 9/18/2018
On insert failure Many options Abort the transaction. Ask user for more information (like null primary key). If reference made to non-existant relation, ask user if they’d like to add the relation. May cascade. Prof. Billibon Yoshimi 9/18/2018
Try delete Delete WORKS_ON(‘999887777’ 10) Delete EMPLOYEE(SSN=`999887777’) Delete EMPLOYEE(SSN=‘33344555’) Prof. Billibon Yoshimi 9/18/2018
On delete failure Reject the delete. Try cascading the delete Change references to null or set to valid tuple. Prof. Billibon Yoshimi 9/18/2018
Relational algebra Manipulating operations Operations on sets of tuples including union, intersection, difference and Cartesian product Operations like SELECT, PROJECT and JOIN Prof. Billibon Yoshimi 9/18/2018
SELECT operation (s) s<condition>(RELATION) -select a subset of tuples from RELATION where <condition> is satisfied. e.g. sSSN=‘333445555’(EMPLOYEE), sDEPT=5(EMPLOYEE) Degree of result is same as R. # of relations returned <= # of relations in R. All relations in R are evaluated. SELECT is commutative. Prof. Billibon Yoshimi 9/18/2018
Conditions <condition> can be a composite predicate like ((DNO=4 AND SALARY>12000) or (DNO=5 AND FNAME=‘Bill’)) using boolean operators AND, OR and NOT. Operators include =, >, <, , , and if ordered. If unordered can only do comparison (= or ). Strings can use SUBSTRING-OF Prof. Billibon Yoshimi 9/18/2018
PROJECT operation (s) p<list of attributes>(RELATION) -selects <list of attributes> columns from RELATION e.g. pSEX, SALARY(EMPLOYEE) PROJECT removes duplicate tuples from result set. Degree of result is = length of <list of attributes>. If one attribute is a key of RELATION, result will have same number of relations as RELATION. Prof. Billibon Yoshimi 9/18/2018
Prof. Billibon Yoshimi 9/18/2018