7-1 Chapter 7 The Relational Data Model, Relational Constraints, and the Relational Algebra.


7-1 Chapter 7 The Relational Data Model, Relational Constraints, and the Relational Algebra

7-2 Outline
Relational Model Concepts
Characteristics of Relations
Relational Integrity Constraints
 – Key Constraints
 – Entity Integrity Constraints
 – Referential Integrity Constraints
Update Operations on Relations
Relational Algebra Operations

7-3 Outline (Continued)
Relational Algebra Operations
 – SELECT and PROJECT
 – Set Operations
 – JOIN Operations
 – Additional Relational Operations

Relational Model Concepts
Database: a collection of relations.
Relation (informally): a table of values. Each column in the table has a column header called an attribute. Each row is called a tuple and represents an entity or a relationship.
Formal relational concepts:
-- Domain: a set of atomic (indivisible) values. A domain has a name and a data type (format).

Continued
-- Attribute: a name that suggests the role a domain plays in a particular relation. Each attribute A_i has a domain dom(A_i). E.g. the domain Names (the set of names of persons) can be used by the attributes EMPNAME and MGRNAME.
-- Relation schema (intension): a relation name R and a set of attributes A_i that define the relation. Denoted by R(A_1, A_2, …, A_n), where R is the relation name and A_1, A_2, …, A_n are the attributes.
Example: STUDENT(Name, SSN, BirthDate, Addr)

Continued
-- Degree of a relation: its number of attributes n (n = 4 for the STUDENT schema above).
-- Tuple t of R(A_1, A_2, …, A_n): an ordered list of values t = <v_1, v_2, …, v_n>, where each value v_i is an element of dom(A_i). Also called an n-tuple.
-- Relation instance r(R), also called the extension (state): a set of tuples r(R) = {t_1, t_2, …, t_m}, or alternatively r(R) ⊆ dom(A_1) × dom(A_2) × … × dom(A_n).

Figure 7.1 The attributes and tuples of a relation STUDENT (student entity). Annotations: attributes drawn from the same domain can play different roles; a null can mean that the value is unknown, that the attribute does not apply to this tuple, or that this tuple has no value for this attribute; degree = 7.

Characteristics of Relations
Ordering of tuples in a relation r(R): the tuples are not considered to be ordered, even though they appear to be ordered in the tabular form (compare a sequential file, where records do have an order).
Ordering of attributes in a relation schema R (and of values within each tuple): we will consider the attributes in R(A_1, A_2, …, A_n) and the values in t = <v_1, v_2, …, v_n> to be ordered. (However, a more general alternative definition of relation does not require this ordering: a tuple is then a set of (<attribute>, <value>) pairs instead of an ordered list of values.)

7-9 Values in a tuple: all values are considered atomic (indivisible). This is the first normal form assumption, and it raises the question of how to map composite and multivalued attributes. A special null value is used to represent values that are unknown or inapplicable to certain tuples.
Interpretation: a relation (representing an entity or a relationship) can be read as an assertion, or as a predicate that its tuples satisfy.
Notation:
-- We refer to component values of a tuple t by t[A_i] = v_i (the value of attribute A_i for tuple t).
-- Similarly, t[A_u, A_v, …, A_w] refers to the subtuple of t containing the values of attributes A_u, A_v, …, A_w respectively.
-- Example: for a STUDENT tuple t = <…>, t[Name] = <…> and t[SSN, GPA, Age] = <…>.

Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints: domain constraints, key constraints, entity integrity constraints, and referential integrity constraints. (Other kinds exist as well, e.g. functional dependencies.)
Domain constraints: the value of each attribute A must be an element of its domain, v(A) ∈ dom(A).

Key Constraints
Superkey of R: a set of attributes SK of R such that no two tuples in any valid relation instance r(R) have the same value for SK. That is, for any distinct tuples t_1 and t_2 in r(R), t_1[SK] ≠ t_2[SK].
Key of R: a “minimal” superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
Example: the CAR relation schema CAR(State, Reg#, SerialNo, Make, Model, Year) has two keys, Key1 = {State, Reg#} and Key2 = {SerialNo}, which are also superkeys. {SerialNo, Make} is a superkey but not a key. The set of all attributes forms a superkey, too.
A relation schema may have more than one key. If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
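The superkey and key definitions can be tried out on a concrete instance. The sketch below is only an illustration: the function names and the CAR rows are invented here, and testing one instance can only refute a key, never prove it, because a key must hold in every valid instance of the schema.

```python
def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs (this instance only)."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """A key is a minimal superkey: dropping any one attribute breaks uniqueness."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )


# Hypothetical CAR tuples, invented for illustration.
cars = [
    {"State": "TX", "Reg#": "ABC-123", "SerialNo": "S1", "Make": "Ford"},
    {"State": "TX", "Reg#": "XYZ-789", "SerialNo": "S2", "Make": "Ford"},
    {"State": "CA", "Reg#": "ABC-123", "SerialNo": "S3", "Make": "Audi"},
]

print(is_key(cars, ["State", "Reg#"]))      # consistent with Key1
print(is_key(cars, ["SerialNo"]))           # consistent with Key2
print(is_key(cars, ["SerialNo", "Make"]))   # superkey, but not minimal
```

On this sample, {SerialNo, Make} passes the superkey test but fails the minimality test, matching the CAR example.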

Entity Integrity
Relational database schema: a set S of relation schemas that belong to the same database, S = {R_1, R_2, …, R_n}, together with a set of integrity constraints IC. S is the name of the database.
Relational database instance: DB = {r_1, r_2, …, r_n} such that each r_i is an instance of R_i and the r_i satisfy IC (see 7-13, 7-14).
Entity integrity: the primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R), because primary key values are used to identify the individual tuples: t[PK] ≠ null for any tuple t in r(R).

Figure 7.5 The COMPANY relational database schema; primary keys are underlined. Attributes that represent the same real-world concept may or may not have identical names in different relations. Attributes that represent different concepts may have the same name in different relations. (to 7-16)

Figure 7.6 A relational database instance (state) of COMPANY schema

Continued

7-16 Note: other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key. Key constraints and entity integrity constraints are specified on individual relations.

Referential Integrity
A tuple in one relation that refers to another relation must refer to an existing tuple in that relation.
-- A constraint involving two relations (the previous constraints involve a single relation).
-- Arises from relationships among entities, e.g. EMPLOYEE(DNo) refers to DEPARTMENT(DNUMBER).
-- Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation.
-- Tuples in the referencing relation R_1 (e.g. EMPLOYEE) have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R_2 (e.g. DEPARTMENT). FK and PK have the same domain.
-- A tuple t_1 in R_1 is said to reference a tuple t_2 in R_2 if t_1[FK] = t_2[PK]. FK can be null.

7-18 A referential integrity constraint can be displayed in a relational database schema as a directed arc from R_1.FK to R_2 (see 7-19).
[Diagram: the FK attributes of the referencing relation R_1 point to the PK attributes of the referenced relation R_2.]
Semantic integrity constraints, for example:
-- a constraint relating the salary of an employee to the salary of his boss
-- the maximum number of work hours specified in labor law

A foreign key can refer to its own relation.

Operations: retrievals and updates
4 Update Operations on Relations
-- INSERT a tuple.
-- DELETE a tuple.
-- MODIFY a tuple.
-- Integrity constraints should not be violated by the update operations.
-- Several update operations may have to be grouped together.
-- Updates may propagate to cause other updates automatically. This may be necessary to maintain integrity constraints.
-- In case of integrity violation, several actions can be taken:
 - cancel the operation that causes the violation
 - perform the operation but inform the user of the violation
 - trigger additional updates so the violation is corrected
 - execute a user-specified error-correction routine

7-21 Insert operation. An insertion can violate:
-- Domain constraint: if an attribute value is given that does not appear in the domain.
-- Key constraint: if a key value in the new tuple t already exists in another tuple in the relation.
-- Entity integrity: if the primary key of the new tuple is null.
-- Referential integrity: if the value of any foreign key in t refers to a tuple that does not exist in the referenced relation.

Examples:
1. Insert <…> into EMPLOYEE. Acceptable.
2. Insert <…> into EMPLOYEE. Violates the key constraint.
3. Insert <…> into EMPLOYEE. Violates the entity integrity constraint.
4. Insert <…> into EMPLOYEE. Violates the referential integrity constraint (see 7-14).
[Diagram: EMPLOYEE(SSN, SUPERSSN, DNo) referencing DEPARTMENT(DNUM, MGRSSN).]

Two options are available:
-- Reject the insertion.
-- Correct the reason for rejecting the insertion:
 (3) Provide an acceptable SSN.
 (4) Change the value of DNo, or insert a DEPARTMENT tuple with DNUMBER = 7 (the correction cascades back to the EMPLOYEE relation).
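The four checks that an INSERT must pass can be sketched as one function. This is an assumption-laden illustration, not how any particular DBMS does it: `check_insert`, the sample state and the domain predicates are all made up, and only the SSN and DNO attributes of EMPLOYEE are modelled.

```python
def check_insert(new, employees, dept_numbers, domains):
    """Return the names of the constraints a new EMPLOYEE tuple would violate."""
    violations = []
    # Domain constraint: every non-null value must belong to its attribute's domain.
    for attr, in_domain in domains.items():
        if new.get(attr) is not None and not in_domain(new[attr]):
            violations.append("domain")
            break
    # Entity integrity: the primary key (SSN) may not be null.
    if new.get("SSN") is None:
        violations.append("entity integrity")
    # Key constraint: the primary key value may not already be present.
    elif any(e["SSN"] == new["SSN"] for e in employees):
        violations.append("key")
    # Referential integrity: a non-null DNO must match an existing DEPARTMENT.
    if new.get("DNO") is not None and new["DNO"] not in dept_numbers:
        violations.append("referential integrity")
    return violations


# Hypothetical database state, invented for illustration.
employees = [{"SSN": "111111111", "DNO": 4}]
dept_numbers = {1, 4, 5}
domains = {
    "SSN": lambda v: isinstance(v, str) and len(v) == 9 and v.isdigit(),
    "DNO": lambda v: isinstance(v, int),
}

print(check_insert({"SSN": "222222222", "DNO": 4}, employees, dept_numbers, domains))
print(check_insert({"SSN": "333333333", "DNO": 7}, employees, dept_numbers, domains))
```

An empty list means the insertion is acceptable. A non-empty list is the point at which the DBMS would reject the insertion or ask for a correction.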

DELETE OPERATION
Only the referential integrity constraint may be violated. A deletion cannot violate a domain constraint, a key constraint, or the entity integrity constraint.
1. Delete the WORKS_ON tuple with ESSN=‘ ’ and PNO=10. Acceptable.
2. Delete the EMPLOYEE tuple with SSN=‘ ’. Unacceptable: two tuples in WORKS_ON refer to this tuple.
3. Delete the EMPLOYEE tuple with SSN=‘ ’. Unacceptable: the tuple involved is referenced by tuples from the EMPLOYEE, DEPARTMENT, WORKS_ON and DEPENDENT relations.


Three options are available:
-- Reject the deletion.
-- Attempt to cascade (or propagate) the deletion, e.g. delete the two offending WORKS_ON tuples in (2).
-- Modify the referencing attribute values that cause the violation: change them to reference another valid tuple, or set them to null.
When a referential integrity constraint is specified, the DBMS should allow the users to specify which of the three options applies in case of a violation of the constraint.
The three alternatives can also be combined. E.g. for operation (3): automatically delete the referencing WORKS_ON and DEPENDENT tuples; for the referencing EMPLOYEE tuples, set the foreign key to null or change it to another tuple.
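The three options can be sketched as a single helper that is told which policy to apply. Everything named here (`delete_referenced`, the option strings, the sample tuples) is hypothetical and only meant to make the alternatives concrete.

```python
def delete_referenced(key, referencing, fk, on_violation="reject"):
    """Handle the tuples of `referencing` whose foreign key `fk` equals `key`
    when the referenced tuple is deleted. Returns the updated relation."""
    hits = [t for t in referencing if t[fk] == key]
    if not hits:
        return list(referencing)          # nothing refers to the deleted tuple
    if on_violation == "reject":
        raise ValueError(f"{len(hits)} tuple(s) still reference {key!r}")
    if on_violation == "cascade":
        return [t for t in referencing if t[fk] != key]
    if on_violation == "set null":
        return [{**t, fk: None} if t[fk] == key else t for t in referencing]
    raise ValueError(f"unknown option: {on_violation}")


# Hypothetical tuples, invented for illustration.
works_on = [
    {"ESSN": "111111111", "PNO": 1},
    {"ESSN": "111111111", "PNO": 2},
    {"ESSN": "222222222", "PNO": 1},
]
supervised = [{"SSN": "222222222", "SUPERSSN": "111111111"}]

# Deleting employee 111111111: cascade into WORKS_ON, set SUPERSSN to null.
print(delete_referenced("111111111", works_on, "ESSN", "cascade"))
print(delete_referenced("111111111", supervised, "SUPERSSN", "set null"))
```

Setting the foreign key to null is only possible when that attribute is not part of the referencing relation's primary key. That is why the sketch cascades into WORKS_ON (where ESSN is part of the key) and nulls SUPERSSN instead.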

Modify operation
1. Modify the SALARY of the EMPLOYEE tuple with SSN=‘ ’. Acceptable.
2. Modify the DNO of the EMPLOYEE tuple with SSN=‘ ’ to 1. Acceptable.
3. Modify the DNO of the EMPLOYEE tuple with SSN=‘ ’ to 7. Unacceptable (violates referential integrity).
4. Modify the SSN of the EMPLOYEE tuple with SSN=‘ ’ to ‘ ’. Unacceptable (violates primary key and referential integrity constraints).


-- Modify attributes other than primary key and foreign key: check for the correct data type and domain.
-- Modify primary key: check the domain constraint, key constraint, entity integrity constraint, and referential integrity constraint.
-- Modify foreign key: check the referential integrity constraint.

Defining Relations
-- Deciding which attributes belong together in each relation.
-- Choosing appropriate names for the relations and their attributes.
-- Specifying the domains and data types of various attributes.
-- Identifying the candidate keys and choosing a primary key for each relation.
-- Specifying all foreign keys.

Name of the relational database schema:
  DECLARE SCHEMA COMPANY;

Domain names and data types:
  DECLARE DOMAIN PERSON_SSNS TYPE FIXED_CHAR (9);
  DECLARE DOMAIN PERSON_NAMES TYPE VARIABLE_CHAR (15);
  DECLARE DOMAIN PERSON_INITIALS TYPE ALPHABETIC_CHAR (1);
  DECLARE DOMAIN DATES TYPE DATE;
  DECLARE DOMAIN ADDRESSES TYPE VARIABLE_CHAR (35);
  DECLARE DOMAIN PERSON_SEX TYPE ENUMERATED {M, F};
  DECLARE DOMAIN PERSON_SALARIES TYPE MONEY;
  DECLARE DOMAIN DEPT_NUMBERS TYPE INTEGER_RANGE [1,10];
  DECLARE DOMAIN DEPT_NAMES TYPE VARIABLE_CHAR (20);

Relations:
  DECLARE RELATION EMPLOYEE
    FOR SCHEMA COMPANY
    ATTRIBUTES FNAME DOMAIN PERSON_NAMES,
               MINIT DOMAIN PERSON_INITIALS,
               LNAME DOMAIN PERSON_NAMES,
               SSN DOMAIN PERSON_SSNS,
               BDATE DOMAIN DATES,
               ADDRESS DOMAIN ADDRESSES,
               SEX DOMAIN PERSON_SEX,
               SALARY DOMAIN PERSON_SALARIES,
               SUPERSSN DOMAIN PERSON_SSNS,
               DNO DOMAIN DEPT_NUMBERS
    CONSTRAINTS PRIMARY_KEY (SSN),
                FOREIGN_KEY (SUPERSSN) REFERENCES EMPLOYEE,
                FOREIGN_KEY (DNO) REFERENCES DEPARTMENT;

  DECLARE RELATION DEPARTMENT
    FOR SCHEMA COMPANY
    ATTRIBUTES DNAME DOMAIN DEPT_NAMES,
               DNUMBER DOMAIN DEPT_NUMBERS,
               MGRSSN DOMAIN PERSON_SSNS,
               MGRSTARTDATE DOMAIN DATES
    CONSTRAINTS PRIMARY_KEY (DNUMBER),
                KEY (DNAME),
                FOREIGN_KEY (MGRSSN) REFERENCES EMPLOYEE;

The Relational Algebra
- Operations to manipulate relations.
- Used to specify retrieval requests (queries).
- Query result is in the form of a relation.
Relational operations (some are set operations, others are specific to relational databases):
5.1 SELECT (σ) and PROJECT (Π) operations.
5.2 Set operations: these include UNION, INTERSECTION, DIFFERENCE, CARTESIAN PRODUCT.
5.3 JOIN operations.
5.4 Other relational operations: DIVISION, OUTER JOIN, AGGREGATE FUNCTIONS.

SELECT and PROJECT
SELECT operation (denoted by σ, "selection"):
- Selects the tuples (rows) from a relation R that satisfy a certain selection condition c.
- Form of the operation: σ <c> (R)
- The condition c is an arbitrary Boolean expression on the attributes of R, built from the comparisons =, <, ≤, >, ≥, ≠ and the connectives AND, OR, NOT.
- The resulting relation has the same attributes as R.
- The resulting relation includes each tuple in r(R) whose attribute values satisfy the condition c.

Continued
Examples:
-- To select the subset of EMPLOYEE tuples who work in department 4:
 σ DNO=4 (EMPLOYEE)
-- To select the subset of EMPLOYEE tuples whose salary is greater than 30000:
 σ SALARY>30000 (EMPLOYEE)
-- To select tuples for all employees who either work in department 4 and make over $25000 per year, or work in department 5:
 σ (DNO=4 AND SALARY>25000) OR DNO=5 (EMPLOYEE)

7-36 The SELECT operation is commutative:
 σ <c1> (σ <c2> (R)) = σ <c2> (σ <c1> (R))
A cascade of SELECT operations can be combined into a single SELECT operation with a conjunction:
 σ <c1> (σ <c2> (…(σ <cn> (R))…)) = σ <c1> AND <c2> AND … AND <cn> (R)
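A minimal sketch of SELECT, assuming a relation is held as a list of dictionaries and a condition is an ordinary Python predicate. The `select` helper and the EMPLOYEE rows are invented for illustration; the last lines check commutativity and the cascade rule on this sample.

```python
def select(rows, condition):
    """σ <condition> (rows): keep the tuples that satisfy the condition."""
    return [row for row in rows if condition(row)]


# Hypothetical EMPLOYEE tuples, invented for illustration.
employee = [
    {"LNAME": "A", "DNO": 4, "SALARY": 26000},
    {"LNAME": "B", "DNO": 4, "SALARY": 24000},
    {"LNAME": "C", "DNO": 5, "SALARY": 20000},
    {"LNAME": "D", "DNO": 1, "SALARY": 50000},
]

# σ (DNO=4 AND SALARY>25000) OR DNO=5 (EMPLOYEE)
q = select(employee, lambda t: (t["DNO"] == 4 and t["SALARY"] > 25000) or t["DNO"] == 5)


def in_dept4(t):
    return t["DNO"] == 4


def well_paid(t):
    return t["SALARY"] > 25000


a = select(select(employee, in_dept4), well_paid)         # σ c2 (σ c1 (R))
b = select(select(employee, well_paid), in_dept4)         # σ c1 (σ c2 (R))
c = select(employee, lambda t: in_dept4(t) and well_paid(t))  # σ c1 AND c2 (R)
print(a == b == c)
```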

PROJECT operation (denoted by Π, "projection"):
- Keeps only certain attributes (columns) from a relation R, specified in an attribute list L.
- Form of the operation: Π <L> (R)
- The resulting relation has only those attributes of R specified in L. Its degree is equal to the number of attributes in L.
Example: list each employee’s first and last names and salary.
 Π FNAME,LNAME,SALARY (EMPLOYEE)

Continued
The PROJECT operation eliminates duplicate tuples in the resulting relation so that it remains a mathematical set (no duplicate elements). This is called duplicate elimination.
Example: Π SEX,SALARY (EMPLOYEE)
If several male employees have salary 30000, only a single tuple is kept in the resulting relation.
Number of tuples in the resulting relation ≤ number of tuples in the original relation.

7-39 PROJECT is not commutative: in general
 Π <list1> (Π <list2> (R)) ≠ Π <list2> (Π <list1> (R))
 Π <list1> (Π <list2> (R)) = Π <list1> (R) when <list2> contains all the attributes in <list1>.

Sequences of operations:
-- Several operations can be combined to form a relational algebra expression (query).
Example: retrieve the names and salaries of employees who work in department 4:
 Π FNAME,LNAME,SALARY (σ DNO=4 (EMPLOYEE))
-- Alternatively, we specify explicit intermediate relations for each step:
 DEPT4_EMPS ← σ DNO=4 (EMPLOYEE)
 R ← Π FNAME,LNAME,SALARY (DEPT4_EMPS)

7-41 Continued
Rename the attributes:
-- Attributes can optionally be renamed in the resulting left-hand-side relation (this may be required for some operations that will be presented later, e.g. UNION and JOIN):
 DEPT4_EMPS ← σ DNO=4 (EMPLOYEE)
 R(FIRSTNAME,LASTNAME,SALARY) ← Π FNAME,LNAME,SALARY (DEPT4_EMPS)
-- With no renaming, the resulting relation has the same attribute names as the operand.
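Duplicate elimination is the part of PROJECT that is easy to forget in code. The sketch below assumes a list-of-dictionaries representation; `project` and the sample rows are made up for illustration.

```python
def project(rows, attrs):
    """Π <attrs> (rows): keep only attrs and drop duplicate tuples."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:          # a relation is a set: keep one copy only
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# Hypothetical EMPLOYEE tuples, invented for illustration.
employee = [
    {"LNAME": "A", "SEX": "M", "SALARY": 30000},
    {"LNAME": "B", "SEX": "M", "SALARY": 30000},
    {"LNAME": "C", "SEX": "F", "SALARY": 25000},
]

print(project(employee, ["SEX", "SALARY"]))
```

The two male employees earning 30000 collapse into one result tuple, so the result has two tuples although EMPLOYEE has three.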

Set Operations
A relation is a set of tuples, so the binary operations from mathematical set theory apply:
- UNION: R_1 ∪ R_2, INTERSECTION: R_1 ∩ R_2, SET DIFFERENCE: R_1 - R_2, CARTESIAN PRODUCT: R_1 × R_2.
- For ∪, ∩, -, the operand relations R_1(A_1, A_2, …, A_n) and R_2(B_1, B_2, …, B_n) must have the same number of attributes, and the domains of corresponding attributes must be compatible; that is, dom(A_i) = dom(B_i) for i = 1, 2, …, n. This condition is called union compatibility.
- The resulting relation for ∪, ∩, or - has the same attribute names as the first operand R_1 (by convention).

7-42 Figure 7.11 Two union-compatible relations, with STUDENT ∪ INSTRUCTOR, STUDENT ∩ INSTRUCTOR, STUDENT - INSTRUCTOR, and INSTRUCTOR - STUDENT.
Commutative: R ∪ S = S ∪ R and R ∩ S = S ∩ R, but R - S ≠ S - R in general.
Associative: R ∪ (S ∪ T) = (R ∪ S) ∪ T and (R ∩ S) ∩ T = R ∩ (S ∩ T).
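Because a relation is a set of tuples, Python's built-in set type is enough to try these properties. The names in the two relations below are invented for illustration and are not the contents of Figure 7.11.

```python
# Two union-compatible relations over (first name, last name); hypothetical data.
student = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
instructor = {("John", "Smith"), ("Susan", "Yao")}

union = student | instructor               # STUDENT ∪ INSTRUCTOR
intersection = student & instructor        # STUDENT ∩ INSTRUCTOR
only_students = student - instructor       # STUDENT - INSTRUCTOR
only_instructors = instructor - student    # INSTRUCTOR - STUDENT

print(sorted(union))
print(sorted(intersection))
print(only_students != only_instructors)   # set difference is not commutative
```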

CARTESIAN PRODUCT (also called CROSS PRODUCT or CROSS JOIN)
 R(A_1, A_2, …, A_m, B_1, B_2, …, B_n) ← R_1(A_1, A_2, …, A_m) × R_2(B_1, B_2, …, B_n)
R_1 has m attributes and n_R1 tuples, R_2 has n attributes and n_R2 tuples, and R has m + n attributes and n_R1 × n_R2 tuples.
- A tuple t exists in R for each combination of tuples t_1 from R_1 and t_2 from R_2 such that t[A_1, A_2, …, A_m] = t_1 and t[B_1, B_2, …, B_n] = t_2.
- If R_1 has n_1 tuples and R_2 has n_2 tuples, then R will have n_1 * n_2 tuples.

7-45 - CARTESIAN PRODUCT is not a meaningful operation on its own. It can combine related tuples from two relations if followed by the appropriate SELECT operation; this combination is what JOIN provides.
Example: combine each DEPARTMENT tuple with the EMPLOYEE tuple of the manager.
 DEP_EMP ← DEPARTMENT × EMPLOYEE
 DEPT_MANAGER ← σ MGRSSN=SSN (DEP_EMP)
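The DEP_EMP / DEPT_MANAGER steps can be sketched directly. The helper `cartesian_product` and the sample tuples are assumptions made for illustration, and the helper relies on the two relations having no attribute names in common.

```python
from itertools import product


def cartesian_product(r1, r2):
    """R1 × R2: concatenate every tuple of r1 with every tuple of r2.
    Assumes r1 and r2 share no attribute names."""
    return [{**t1, **t2} for t1, t2 in product(r1, r2)]


# Hypothetical tuples, invented for illustration.
department = [
    {"DNAME": "Research", "MGRSSN": "111111111"},
    {"DNAME": "Administration", "MGRSSN": "222222222"},
]
employee = [
    {"SSN": "111111111", "LNAME": "A"},
    {"SSN": "222222222", "LNAME": "B"},
    {"SSN": "333333333", "LNAME": "C"},
]

dep_emp = cartesian_product(department, employee)                  # DEPARTMENT × EMPLOYEE
dept_manager = [t for t in dep_emp if t["MGRSSN"] == t["SSN"]]     # σ MGRSSN=SSN (DEP_EMP)

print(len(dep_emp))       # 2 * 3 combinations
print([(t["DNAME"], t["LNAME"]) for t in dept_manager])
```

Only two of the six combinations survive the selection, which is why the product is normally used together with a SELECT.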

JOIN Operations
THETA JOIN: similar to a CARTESIAN PRODUCT followed by a SELECT. The condition c is called the join condition.
 R(A_1, A_2, …, A_m, B_1, B_2, …, B_n) ← R_1(A_1, A_2, …, A_m) ⋈ <c> R_2(B_1, B_2, …, B_n)
R_1 has m attributes and n_R1 tuples, R_2 has n attributes and n_R2 tuples, and R has m + n attributes and at most n_R1 × n_R2 tuples.
 c: <cond> AND … AND <cond>
 cond: A_i θ B_j, where θ ∈ {=, <, ≤, >, ≥, ≠}

7-47 EQUIJOIN: the join condition c includes one or more equality comparisons involving attributes from R_1 and R_2. That is, c is of the form
 (A_i = B_j) AND … AND (A_h = B_k); 1 ≤ i, h ≤ m, 1 ≤ j, k ≤ n
In the above EQUIJOIN operation, A_i, …, A_h are called the join attributes of R_1, and B_j, …, B_k are called the join attributes of R_2.

7-48 Example of using EQUIJOIN: retrieve each DEPARTMENT’s name and its manager’s name.
 T ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
 RESULT ← Π DNAME,FNAME,LNAME (T)
A product followed by a selection,
 EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
 ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)
can be replaced by a single join:
 ACTUAL_DEPENDENTS ← EMPNAMES ⋈ SSN=ESSN DEPENDENT

NATURAL JOIN (*):
In an EQUIJOIN R ← R_1 ⋈ <c> R_2, the join attributes of R_2 appear redundantly in the result relation R. In a NATURAL JOIN, the redundant join attributes of R_2 are eliminated from R. The equality condition is implied and need not be specified.
 R ← R_1 * (join attributes of R_1),(join attributes of R_2) R_2
The number of tuples in R satisfies 0 ≤ n_R ≤ n_R1 × n_R2.
Example: retrieve each EMPLOYEE’s name and the name of the DEPARTMENT he/she works for.
 T ← EMPLOYEE * (DNO),(DNUMBER) DEPARTMENT
 RESULT ← Π FNAME,LNAME,DNAME (T)

7-50 If the join attributes have the same names in both relations, they need not be specified and we can write R ← R_1 * R_2.
Example: retrieve each EMPLOYEE’s name and the name of his/her SUPERVISOR.
 SUPERVISOR(SUPERSSN,SFN,SLN) ← Π SSN,FNAME,LNAME (EMPLOYEE)
 T ← EMPLOYEE * SUPERVISOR
 RESULT ← Π FNAME,LNAME,SFN,SLN (T)

Note: in the original definition of NATURAL JOIN, the join attributes were required to have the same names in both relations.
There can be more than one set of join attributes, each with a different meaning, between the same two relations. For example:
 JOIN ATTRIBUTES: EMPLOYEE.SSN = DEPARTMENT.MGRSSN; RELATIONSHIP: EMPLOYEE manages the DEPARTMENT
 JOIN ATTRIBUTES: EMPLOYEE.DNO = DEPARTMENT.DNUMBER; RELATIONSHIP: EMPLOYEE works for the DEPARTMENT
Example: retrieve each EMPLOYEE’s name and the name of the DEPARTMENT he/she works for.
 T ← EMPLOYEE ⋈ DNO=DNUMBER DEPARTMENT
 RESULT ← Π FNAME,LNAME,DNAME (T)

A relation can have a set of join attributes to join it with itself:
 JOIN ATTRIBUTES: EMPLOYEE(1).SUPERSSN = EMPLOYEE(2).SSN; RELATIONSHIP: EMPLOYEE(2) supervises EMPLOYEE(1)
- One can think of this as joining two distinct copies of the relation, although only one relation actually exists.
- In this case, renaming can be useful.
Example: retrieve each EMPLOYEE’s name and the name of his/her SUPERVISOR.
 SUPERVISOR(SSSN,SFN,SLN) ← Π SSN,FNAME,LNAME (EMPLOYEE)
 T ← EMPLOYEE ⋈ SUPERSSN=SSSN SUPERVISOR
 RESULT ← Π FNAME,LNAME,SFN,SLN (T)
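The supervisor example can be sketched as a natural join on shared attribute names. This is an illustrative sketch only: `natural_join` infers each relation's attributes from its first tuple, and the EMPLOYEE rows are invented.

```python
def natural_join(r1, r2):
    """R1 * R2: equijoin on every attribute name the relations share,
    keeping a single copy of each join attribute."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    return [
        {**t1, **t2}
        for t1 in r1
        for t2 in r2
        if all(t1[a] == t2[a] for a in common)
    ]


# Hypothetical EMPLOYEE tuples, invented for illustration.
employee = [
    {"SSN": "1", "FNAME": "Ann", "LNAME": "Lee", "SUPERSSN": None},
    {"SSN": "2", "FNAME": "Bob", "LNAME": "Ray", "SUPERSSN": "1"},
    {"SSN": "3", "FNAME": "Cy", "LNAME": "Poe", "SUPERSSN": "2"},
]

# SUPERVISOR(SUPERSSN,SFN,SLN) ← Π SSN,FNAME,LNAME (EMPLOYEE), with renaming
supervisor = [
    {"SUPERSSN": e["SSN"], "SFN": e["FNAME"], "SLN": e["LNAME"]} for e in employee
]

t = natural_join(employee, supervisor)       # T ← EMPLOYEE * SUPERVISOR
result = [(r["FNAME"], r["LNAME"], r["SFN"], r["SLN"]) for r in t]
print(result)
```

The employee with a null SUPERSSN matches no SUPERVISOR tuple and therefore does not appear in the result.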

DIVISION Operation
R(Z) ÷ S(X), where X ⊆ Z. Let Y = Z - X. Then:
 T_1 ← Π Y (R)
 T_2 ← Π Y ((S × T_1) - R)
 T ← T_1 - T_2
The result is a relation T(Y) that includes a tuple t if a tuple t_R whose t_R[Y] = t appears in R with t_R[X] = t_S for every tuple t_S in S.

7-54 DIVISION Operation
Retrieve the names of employees who work on all the projects that ‘John Smith’ works on.
(1) Retrieve the list of project numbers that ‘John Smith’ works on.
 SMITH ← σ FNAME=‘John’ AND LNAME=‘Smith’ (EMPLOYEE)
 SMITH_PNOS ← Π PNO (WORKS_ON ⋈ ESSN=SSN SMITH)
(2) Create a relation that includes the <PNO, ESSN> tuples from WORKS_ON.
 SSN_PNOS ← Π PNO,ESSN (WORKS_ON)
(3) Apply the DIVISION operation.
 SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS
 RESULT ← Π FNAME,LNAME (SSNS * EMPLOYEE)

7-55 Figure 7.15 DIVISION. Panels: SSNS ← SSN_PNOS ÷ SMITH_PNOS; T ← R ÷ S.

Complete Set of Relational Algebra Operations:
-- All the operations discussed so far can be described as a sequence of only the operations SELECT, PROJECT, UNION, SET DIFFERENCE, and CARTESIAN PRODUCT. For example:
 R ∩ S ≡ (R ∪ S) - ((R - S) ∪ (S - R))
 (natural) join ≡ Π L (σ C (R × S))
-- Hence, the set {σ, Π, ∪, -, ×} is called a complete set of relational algebra operations. Any query language equivalent to these operations is called relationally complete.
-- For database applications, additional operations are needed that were not part of the original relational algebra; a language that offers them is more than relationally complete. These include:
 1. Aggregate functions and grouping.
 2. OUTER JOIN and OUTER UNION.
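The T_1 / T_2 / T formulation of DIVISION translates almost line by line into set operations. In this sketch R is a set of (y, x) pairs and S a set of x values; the employee identifiers and project numbers are invented for illustration.

```python
def divide(r, s):
    """R(Y, X) ÷ S(X), with r a set of (y, x) pairs and s a set of x values."""
    t1 = {y for (y, x) in r}                         # T1 ← Π Y (R)
    missing = {(y, x) for y in t1 for x in s} - r    # (S × T1) - R
    t2 = {y for (y, x) in missing}                   # T2 ← Π Y ((S × T1) - R)
    return t1 - t2                                   # T ← T1 - T2


# Hypothetical (employee, project) pairs, invented for illustration.
ssn_pnos = {
    ("e1", 1), ("e1", 2),
    ("e2", 1),
    ("e3", 1), ("e3", 2), ("e3", 3),
}
smith_pnos = {1, 2}

print(sorted(divide(ssn_pnos, smith_pnos)))   # employees who work on every project in smith_pnos
```

T_2 collects the employees that are missing at least one of the required projects, so subtracting it from T_1 leaves exactly those who work on all of them.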
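The intersection identity can be checked mechanically on sample sets. The function name and the sets are made up for illustration, and agreement on a few samples is a sanity check rather than a proof.

```python
def intersection_from_complete_set(r, s):
    """R ∩ S written with only ∪ and -: (R ∪ S) - ((R - S) ∪ (S - R))."""
    return (r | s) - ((r - s) | (s - r))


# Hypothetical sample relations (sets of single-attribute tuples).
r = {1, 2, 3, 4}
s = {3, 4, 5}

print(intersection_from_complete_set(r, s) == (r & s))
```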

5.4 Additional Relational Operations
AGGREGATE FUNCTIONS:
-- Functions such as SUM, COUNT, AVERAGE, MIN, MAX are often applied to sets of values or sets of tuples in database applications.
 <grouping attributes> ℱ <function list> (R)
 where ℱ is "script F", <grouping attributes> is a list of attributes of the relation specified in R, and <function list> is a list of (<function>, <attribute>) pairs.
-- The grouping attributes are optional.
Example 1: retrieve the average salary of all employees (no grouping needed):
 R(AVGSAL) ← ℱ AVERAGE SALARY (EMPLOYEE)
The result has degree 1 and contains a single tuple only.

7-58 Continued
Example 2: for each department, retrieve the department number, the number of employees, and the average salary (in the department):
 R(DNO,NUMEMPS,AVGSAL) ← DNO ℱ COUNT SSN, AVERAGE SALARY (EMPLOYEE)
DNO is called the grouping attribute in the above example. The result has degree 3.
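Example 2 can be sketched with a dictionary that collects the salaries of each group. The function, its output attribute names NUMEMPS and AVGSAL, and the sample rows are assumptions for illustration; COUNT is taken here as the number of tuples per group, which matches COUNT SSN as long as SSN is never null.

```python
from collections import defaultdict


def group_count_average(rows, group_attr, value_attr):
    """<group_attr> ℱ COUNT, AVERAGE <value_attr> (rows)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_attr]].append(row[value_attr])
    return [
        {group_attr: g, "NUMEMPS": len(vals), "AVGSAL": sum(vals) / len(vals)}
        for g, vals in sorted(groups.items())
    ]


# Hypothetical EMPLOYEE tuples, invented for illustration.
employee = [
    {"SSN": "1", "DNO": 4, "SALARY": 30000},
    {"SSN": "2", "DNO": 4, "SALARY": 20000},
    {"SSN": "3", "DNO": 5, "SALARY": 40000},
]

print(group_count_average(employee, "DNO", "SALARY"))
```

Each group yields exactly one result tuple, so the result has one row per distinct DNO value.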


Recursive Closure
An operation applied to a recursive relationship between tuples of the same type, e.g. employee and supervisor.
Example: retrieve all supervisees of an employee e at all levels. This needs a looping mechanism and cannot, in general, be specified in relational algebra. A fixed number of levels can be specified:
1st level:
 BORG_SSN ← Π SSN (σ FNAME=‘James’ AND LNAME=‘BORG’ (EMPLOYEE))
 SUPERVISION(SSN1,SSN2) ← Π SSN,SUPERSSN (EMPLOYEE)
 RESULT1(SSN) ← Π SSN1 (SUPERVISION ⋈ SSN2=SSN BORG_SSN)
1st + 2nd level:
 RESULT2(SSN) ← Π SSN1 (SUPERVISION ⋈ SSN2=SSN RESULT1)
 RESULT3 ← (RESULT1 ∪ RESULT2)

OUTER JOIN
[Figure: EMPLOYEE tuples (James Bars, John Smith, Franklin Wong, Alicia Zelaya, Jennifer Wallace, Ramesh Narayan, Joce English, Ahmad Jabbar) and DEPARTMENT tuples (Research, Administration, Headquarters), illustrating Employee * Department.]
-- In a regular EQUIJOIN or NATURAL JOIN operation, tuples in R_1 or R_2 that do not have matching tuples in the other relation do not appear in the result.
-- Some queries require all tuples in R_1 (or R_2 or both) to appear in the result.
-- When no matching tuples are found, nulls are placed for the missing attributes.
Example query: list all employee names and the name of the department they manage.

7-62
-- LEFT OUTER JOIN: R_1 ⟕ R_2 lets every tuple in R_1 appear in the result.
-- RIGHT OUTER JOIN: R_1 ⟖ R_2 lets every tuple in R_2 appear in the result.
-- FULL OUTER JOIN: R_1 ⟗ R_2 lets every tuple in R_1 or R_2 appear in the result.
Example (see 6-38):
 TEMP ← (EMPLOYEE ⟕ SSN=MGRSSN DEPARTMENT)
 RESULT ← Π FNAME,MINIT,LNAME,DNAME (TEMP)

OUTER UNION
Takes the union of tuples from two relations that are partially compatible.
 STUDENT(Name, SSN, Department, Advisor) OUTER UNION FACULTY(Name, SSN, Department, Rank) = R(Name, SSN, Department, Advisor, Rank)
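The looping mechanism that the algebra lacks is a few lines in a host language. The sketch repeats the RESULT1, RESULT2, … step until no new supervisees appear; the function name and the SUPERVISION pairs are invented for illustration.

```python
def all_supervisees(supervision, boss):
    """supervision: set of (ssn, superssn) pairs.
    Returns every employee supervised by boss, directly or at any deeper level."""
    result = set()
    frontier = {boss}
    while frontier:
        # One join step: employees whose supervisor is in the current level.
        next_level = {ssn for (ssn, superssn) in supervision if superssn in frontier}
        next_level -= result          # already found; also guards against cycles
        result |= next_level
        frontier = next_level
    return result


# Hypothetical SUPERVISION(SSN1, SSN2) pairs, invented for illustration.
supervision = {("2", "1"), ("3", "1"), ("4", "2"), ("5", "4")}

print(sorted(all_supervisees(supervision, "1")))
```

Each pass through the loop corresponds to one more level of the hierarchy, so the number of iterations is bounded by the depth of the supervision tree.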
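A left outer join can be sketched as a join that pads unmatched left tuples with nulls (None in Python). The helper signature and the sample tuples are assumptions made for illustration.

```python
def left_outer_join(r1, r2, condition, r2_attrs):
    """Left outer join of r1 and r2: every tuple of r1 appears in the result;
    unmatched tuples get None for the attributes of r2."""
    result = []
    for t1 in r1:
        matches = [t2 for t2 in r2 if condition(t1, t2)]
        if matches:
            result.extend({**t1, **t2} for t2 in matches)
        else:
            result.append({**t1, **{a: None for a in r2_attrs}})
    return result


# Hypothetical tuples, invented for illustration.
employee = [{"SSN": "1", "LNAME": "A"}, {"SSN": "2", "LNAME": "B"}]
department = [{"DNAME": "Research", "MGRSSN": "1"}]

temp = left_outer_join(
    employee, department,
    lambda e, d: e["SSN"] == d["MGRSSN"],     # join condition SSN=MGRSSN
    ["DNAME", "MGRSSN"],
)
print([(t["LNAME"], t["DNAME"]) for t in temp])
```

Employee B manages no department but still appears, with a null department name. A right outer join is the same operation with the operands swapped.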
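One way to sketch OUTER UNION: tuples that agree on the shared attributes are merged into one result tuple, and attributes a tuple does not have are filled with nulls. The function, its parameters and the sample rows are hypothetical.

```python
def outer_union(r1, r2, shared, only1, only2):
    """Outer union of two partially compatible relations.
    shared: attributes common to both; only1/only2: attributes unique to each."""
    merged = {}  # shared-attribute values -> values of the non-shared attributes
    for rows, attrs in ((r1, only1), (r2, only2)):
        for row in rows:
            key = tuple(row[a] for a in shared)
            merged.setdefault(key, {}).update({a: row[a] for a in attrs})
    return [
        {**dict(zip(shared, key)), **{a: extra.get(a) for a in only1 + only2}}
        for key, extra in merged.items()
    ]


# Hypothetical tuples, invented for illustration.
student = [{"Name": "Ann", "SSN": "1", "Department": "CS", "Advisor": "Kim"}]
faculty = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Rank": "Lecturer"},
    {"Name": "Bo", "SSN": "2", "Department": "EE", "Rank": "Professor"},
]

r = outer_union(student, faculty, ["Name", "SSN", "Department"], ["Advisor"], ["Rank"])
print(r)
```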

Examples of Queries in Relational Algebra

QUERY 1 Retrieve the name and address of all employees who work for the ‘Research’ department.
 RESEARCH_DEPT ← σ DNAME=‘Research’ (DEPARTMENT)
 RESEARCH_DEPT_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)
 RESULT ← π FNAME,LNAME,ADDRESS (RESEARCH_DEPT_EMPS)

QUERY 2 For every project located in ‘Stafford’, list the project number, the controlling department number, and the department manager’s last name, address, and birthdate.
 STAFFORD_PROJS ← σ PLOCATION=‘Stafford’ (PROJECT)
 CONTR_DEPT ← (STAFFORD_PROJS ⋈ DNUM=DNUMBER DEPARTMENT)
 PROJ_DEPT_MGR ← (CONTR_DEPT ⋈ MGRSSN=SSN EMPLOYEE)
 RESULT ← π PNUMBER,DNUM,LNAME,ADDRESS,BDATE (PROJ_DEPT_MGR)

QUERY 3 Find the names of employees who work on all the projects controlled by department number 5.
 DEPT5_PROJS(PNO) ← π PNUMBER (σ DNUM=5 (PROJECT))
 EMP_PROJ(SSN,PNO) ← π ESSN,PNO (WORKS_ON)
 RESULT_EMP_SSNS ← EMP_PROJ ÷ DEPT5_PROJS
 RESULT ← π LNAME,FNAME (RESULT_EMP_SSNS * EMPLOYEE)

QUERY 4 Make a list of project numbers for projects that involve an employee whose last name is ‘Smith’, either as a worker or as a manager of the department that controls the project.
 SMITHS(ESSN) ← π SSN (σ LNAME=‘Smith’ (EMPLOYEE))
 SMITH_WORKER_PROJS ← π PNO (WORKS_ON * SMITHS)
 MGRS ← π LNAME,DNUMBER (EMPLOYEE ⋈ SSN=MGRSSN DEPARTMENT)
 SMITH_MGRS ← σ LNAME=‘Smith’ (MGRS)
 SMITH_MANAGED_DEPTS(DNUM) ← π DNUMBER (SMITH_MGRS)
 SMITH_MGR_PROJS(PNO) ← π PNUMBER (SMITH_MANAGED_DEPTS * PROJECT)
 RESULT ← (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)

QUERY 5 List the names of all employees with two or more dependents.
 T_1(SSN,NO_OF_DEPS) ← ESSN ℱ COUNT DEPENDENT_NAME (DEPENDENT)
 T_2 ← σ NO_OF_DEPS≥2 (T_1)
 RESULT ← π LNAME,FNAME (T_2 * EMPLOYEE)

QUERY 6 Retrieve the names of employees who have no dependents.
 ALL_EMPS ← π SSN (EMPLOYEE)
 EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)
 EMPS_WITHOUT_DEPS ← (ALL_EMPS - EMPS_WITH_DEPS)
 RESULT ← π LNAME,FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)

QUERY 7 List the names of managers who have at least one dependent.
 MGRS(SSN) ← π MGRSSN (DEPARTMENT)
 EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)
 MGRS_WITH_DEPS ← (MGRS ∩ EMPS_WITH_DEPS)
 RESULT ← π LNAME,FNAME (MGRS_WITH_DEPS * EMPLOYEE)
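Queries 6 and 7 reduce to a set difference and a set intersection on SSN values. The sketch below follows the intermediate relations step by step; the data and the `names` helper are invented for illustration.

```python
# Hypothetical tuples, invented for illustration.
employee = [
    {"SSN": "1", "LNAME": "Lee", "FNAME": "Ann"},
    {"SSN": "2", "LNAME": "Ray", "FNAME": "Bob"},
    {"SSN": "3", "LNAME": "Poe", "FNAME": "Cy"},
]
dependent = [
    {"ESSN": "2", "DEPENDENT_NAME": "Dee"},
    {"ESSN": "2", "DEPENDENT_NAME": "Eli"},
]
department = [{"DNUMBER": 1, "MGRSSN": "1"}, {"DNUMBER": 2, "MGRSSN": "2"}]

all_emps = {e["SSN"] for e in employee}           # ALL_EMPS ← π SSN (EMPLOYEE)
emps_with_deps = {d["ESSN"] for d in dependent}   # EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)
mgrs = {d["MGRSSN"] for d in department}          # MGRS(SSN) ← π MGRSSN (DEPARTMENT)


def names(ssns):
    """π LNAME,FNAME (ssns * EMPLOYEE), as a sorted list of distinct pairs."""
    return sorted({(e["LNAME"], e["FNAME"]) for e in employee if e["SSN"] in ssns})


query6 = names(all_emps - emps_with_deps)   # employees with no dependents
query7 = names(mgrs & emps_with_deps)       # managers with at least one dependent
print(query6)
print(query7)
```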