Published by Sophie Montgomery. Modified over 6 years ago.
1
Chapters 5-8 Relational Data Models, Relational Constraints, and Relational Algebra
Flat file: A two-dimensional array of attributes or data items.
ProductX          Bellaire
ProductY          Sugarland
ProductZ          Houston
Computerization   Stafford
Reorganization    Houston
Newbenefits       Stafford
Database Management System (DBMS): A generalized software system that is used to create, manage, and protect databases.
3
Attribute: A named characteristic or property of an entity; corresponds to a column header.
Entity: A “thing” in the real world with an independent existence, for example one with a physical existence: person, student, car.
4
Domain: The set of valid atomic values for an attribute in a relation, e.g.
SSN: set of 9 digits
GPA: 0 <= GPA <= 4.0
Atomic: each value in the domain is indivisible.
Name (Fname, Minit, Lname) -- not atomic
Fname -- atomic
Minit -- atomic
Lname -- atomic
5
Relational Model Concepts
A Relation is a mathematical concept based on the ideas of sets.
The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in the following paper: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
The above paper caused a major revolution in the field of database management and earned Dr. Codd the coveted ACM Turing Award.
6
Informal Definitions
Informally, a relation looks like a table of values.
A relation typically contains a set of rows.
The data elements in each row represent certain facts that correspond to a real-world entity or relationship.
In the formal model, rows are called tuples.
Each column has a column header that gives an indication of the meaning of the data items in that column.
In the formal model, the column header is called an attribute name (or just attribute).
7
Formal Definitions - Schema
The Schema (or description) of a Relation:
Denoted by R(A1, A2, ..., An)
R is the name of the relation.
The attributes of the relation are A1, A2, ..., An.
Example: CUSTOMER (Cust-id, Cust-name, Address, Phone#)
CUSTOMER is the relation name.
It is defined over the four attributes: Cust-id, Cust-name, Address, Phone#.
Each attribute has a domain, or set of valid values. For example, the domain of Cust-id is 6-digit numbers.
8
Formal Definitions - Tuple
A tuple is an ordered set of values (enclosed in angled brackets ‘< … >’).
Each value is derived from an appropriate domain.
A row in the CUSTOMER relation is a 4-tuple and would consist of four values, for example:
<632895, "John Smith", "101 Main St. Atlanta, GA ", "(404) ">
This is called a 4-tuple as it has 4 values; it is one tuple (row) in the CUSTOMER relation.
A relation is a set of such tuples (rows).
9
Formal Definitions - Domain
A domain has a logical definition:
Example: “USA_phone_numbers” are the set of 10-digit phone numbers valid in the U.S.
A domain also has a data type or a format defined for it.
The USA_phone_numbers may have a format: (ddd)ddd-dddd, where each d is a decimal digit.
Dates have various formats, such as year, month, date formatted as yyyy-mm-dd, or as dd mm,yyyy etc.
The attribute name designates the role played by a domain in a relation:
It is used to interpret the meaning of the data elements corresponding to that attribute.
Example: The domain Date may be used to define two attributes named “Invoice-date” and “Payment-date” with different meanings.
10
Formal Definitions - State
The relation state is a subset of the Cartesian product of the domains of its attributes.
Each domain contains the set of all possible values the attribute can take.
Example: attribute Cust-name is defined over the domain of character strings of maximum length 25:
dom(Cust-name) is varchar(25)
The role these strings play in the CUSTOMER relation is that of the name of a customer.
11
Formal Definitions - Summary
Formally, given R(A1, A2, …, An):
r(R) ⊆ dom(A1) X dom(A2) X ... X dom(An)
R(A1, A2, …, An) is the schema of the relation.
R is the name of the relation.
A1, A2, …, An are the attributes of the relation.
r(R): a specific state (or "value" or “population”) of relation R; this is a set of tuples (rows).
r(R) = {t1, t2, …, tm}, where each ti is an n-tuple.
ti = <v1, v2, …, vn>, where each vj is an element of dom(Aj).
12
Formal Definitions - Example
Let R(A1, A2) be a relation schema:
Let dom(A1) = {0,1}
Let dom(A2) = {a,b,c}
Then dom(A1) X dom(A2) is all possible combinations:
{<0,a>, <0,b>, <0,c>, <1,a>, <1,b>, <1,c>}
The relation state r(R) ⊆ dom(A1) X dom(A2).
For example, r(R) could be {<0,a>, <0,b>, <1,c>}.
This is one possible state (or “population” or “extension”) r of the relation R, defined over A1 and A2.
It has three 2-tuples: <0,a>, <0,b>, <1,c>
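The example above can be checked mechanically. The following is a minimal Python sketch, assuming tuples are modelled as Python tuples and relations as sets; the variable names are illustrative and not part of the slides.

```python
from itertools import product

# Domains from the slide's example.
dom_a1 = {0, 1}
dom_a2 = {"a", "b", "c"}

# dom(A1) X dom(A2): every possible 2-tuple.
all_tuples = set(product(dom_a1, dom_a2))

# One possible relation state r(R): any subset of the product.
r = {(0, "a"), (0, "b"), (1, "c")}

print(len(all_tuples))   # 6 combinations
print(r <= all_tuples)   # True: r(R) is a subset of the product
```

Any other subset of the six combinations, including the empty set, would also be a valid state.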
13
Definition Summary
Informal Terms               Formal Terms
Table                        Relation
Column Header                Attribute
All possible Column Values   Domain
Row                          Tuple
Table Definition             Schema of a Relation
Populated Table              State of the Relation
14
Super key: an attribute or a set of attributes that uniquely identifies an entity (it need not be a minimal set). Examples:
SSN
SSN, NAME
SSN, NAME, MAJOR
15
Candidate key: a super key such that no proper subset of its attributes is itself a super key. So a candidate key is a minimal identifier. Examples: STUID, SSN
Primary key: the candidate key that is chosen to identify tuples in a relation. It must be unique and must exist (it cannot be null).
Alternate key: a candidate key in a relation that is not selected as the primary key, e.g. if the primary key is SSN, then STUID is an alternate key.
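The difference between a super key and a candidate key can be sketched in Python. This is only an illustration: the STUDENT rows are made-up values, and checking a single table state can refute a key but cannot prove that it holds for every legal state of the schema.

```python
from itertools import combinations

# Hypothetical STUDENT rows; the values are made up for illustration.
ATTRS = ("SSN", "STUID", "NAME", "MAJOR")
rows = [
    ("111", "S1", "Ann", "CS"),
    ("222", "S2", "Bob", "CS"),
    ("333", "S3", "Ann", "Math"),
]

def is_superkey(attrs):
    """True if no two rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(attrs):
    """A super key none of whose proper subsets is itself a super key."""
    if not is_superkey(attrs):
        return False
    return not any(
        is_superkey(sub)
        for k in range(1, len(attrs))
        for sub in combinations(attrs, k)
    )

print(is_superkey(("SSN", "NAME")))       # True, but not minimal
print(is_candidate_key(("SSN", "NAME")))  # False: SSN alone suffices
print(is_candidate_key(("STUID",)))       # True
```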
16
Concatenated (composite) key: a primary key that is composed of two or more attributes or data items.
GRADE_REPORT(STUID, COURSE#, GRADE)
17
Foreign key: a non-key attribute in one relation that appears as the primary key (or part of the key) in another relation.
EMPLOYEE(SSN, FNAME, MINIT, DNO)
DEPARTMENT(DNUMBER, DNAME, MANAGER)
Here DNO in EMPLOYEE is a foreign key that refers to DNUMBER in DEPARTMENT.
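A foreign key implies a check: every DNO value in EMPLOYEE should match some DNUMBER in DEPARTMENT. The sketch below illustrates that check in Python; all row values are assumptions made up for the example.

```python
# Hypothetical rows for EMPLOYEE(SSN, FNAME, MINIT, DNO) and
# DEPARTMENT(DNUMBER, DNAME, MANAGER); all values are made up.
employee = [
    ("111", "Ann", "B", 5),
    ("222", "Bob", "C", 4),
    ("333", "Cy", "D", 7),   # refers to a department that does not exist
]
department = [
    (4, "Administration", "222"),
    (5, "Research", "111"),
]

# Primary-key values of the referenced relation.
dnumbers = {d[0] for d in department}

# EMPLOYEE rows whose DNO matches no DEPARTMENT.DNUMBER.
violations = [e for e in employee if e[3] not in dnumbers]
print(violations)   # [('333', 'Cy', 'D', 7)]
```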
18
Secondary key: a field that can have duplicate values and that can be used as a search path by users.
20
Referential Integrity Constraints for COMPANY database
22
Relational Algebra Overview
Relational algebra is the basic set of operations for the relational model.
These operations enable a user to specify basic retrieval requests (or queries).
The result of an operation is a new relation, which may have been formed from one or more input relations.
This property makes the algebra “closed” (all objects in relational algebra are relations).
23
Relational Algebra Overview (continued)
The algebra operations thus produce new relations.
These can be further manipulated using operations of the same algebra.
A sequence of relational algebra operations forms a relational algebra expression.
The result of a relational algebra expression is also a relation that represents the result of a database query (or retrieval request).
24
Relational Algebra Overview
Relational Algebra consists of several groups of operations:
Unary Relational Operations
  SELECT (symbol: σ (sigma))
  PROJECT (symbol: π (pi))
Relational Algebra Operations From Set Theory
  UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, – )
  CARTESIAN PRODUCT ( x )
Binary Relational Operations
  JOIN (several variations of JOIN exist)
  DIVISION
Additional Relational Operations
  OUTER JOINS, OUTER UNION
  AGGREGATE FUNCTIONS (these compute summaries of information: for example, SUM, COUNT, AVG, MIN, MAX)
25
Unary Relational Operations: SELECT
The SELECT operation (denoted by σ (sigma)) is used to select a subset of the tuples from a relation based on a selection condition.
The selection condition acts as a filter: it keeps only those tuples that satisfy the qualifying condition.
Tuples satisfying the condition are selected, whereas the other tuples are discarded (filtered out).
Examples:
Select the EMPLOYEE tuples whose department number is 4:
σ DNO = 4 (EMPLOYEE)
Select the EMPLOYEE tuples whose salary is greater than $30,000:
σ SALARY > 30,000 (EMPLOYEE)
26
Unary Relational Operations: SELECT
In general, the select operation is denoted by σ<selection condition>(R), where
the symbol σ (sigma) is used to denote the select operator,
the selection condition is a Boolean (conditional) expression specified on the attributes of relation R,
tuples that make the condition true are selected (they appear in the result of the operation),
tuples that make the condition false are filtered out (they are discarded from the result of the operation).
27
Unary Relational Operations: SELECT (contd.)
SELECT Operation Properties
The SELECT operation σ<selection condition>(R) produces a relation S that has the same schema (same attributes) as R.
SELECT is commutative:
σ<condition1>(σ<condition2>(R)) = σ<condition2>(σ<condition1>(R))
Because of the commutativity property, a cascade (sequence) of SELECT operations may be applied in any order:
σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond2>(σ<cond3>(σ<cond1>(R)))
A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions:
σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond1> AND <cond2> AND <cond3>(R)
The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R.
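These properties can be tried out on a small table. The following Python sketch assumes a relation is a list of dictionaries and uses a hypothetical helper named select; the EMPLOYEE rows are made up for illustration.

```python
# Hypothetical EMPLOYEE tuples; names and values are illustrative only.
employee = [
    {"name": "A", "dno": 4, "salary": 26000},
    {"name": "B", "dno": 4, "salary": 20000},
    {"name": "C", "dno": 5, "salary": 40000},
    {"name": "D", "dno": 5, "salary": 25000},
]

def select(cond, relation):
    """SELECT: keep only the tuples that satisfy cond."""
    return [t for t in relation if cond(t)]

def in_dept_4(t):
    return t["dno"] == 4

def well_paid(t):
    return t["salary"] > 25000

# Commutativity: the order of two selections does not matter.
a = select(in_dept_4, select(well_paid, employee))
b = select(well_paid, select(in_dept_4, employee))

# Cascade: nested selections equal one selection with AND.
c = select(lambda t: in_dept_4(t) and well_paid(t), employee)

print(a == b == c)               # True
print(len(a) <= len(employee))   # True: never more tuples than the input
```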
28
Select
Works on a single table: it takes the rows that meet a specified condition and copies them into a new table.
σ Condition(s) (Table name)
SQL (Structured Query Language):
SELECT * FROM (table name) WHERE condition 1 AND condition 2 AND condition 3 …
29
[Query tree: σ Condition(s) above Table]
30
Find employees who work for department number 5.
σ DNO = 5 (employee)
SQL: SELECT * FROM employee WHERE dno = 5;
32
[Query tree: σ DNO=5 above Employee]
33
σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
σ<cond1>(σ<cond2>(. . .(σ<condn>(R)) . . .)) = σ<cond1> AND <cond2> AND . . . AND <condn>(R)
34
Project
Operates on a single table and produces a vertical subset of the table:
it extracts the values of the specified columns,
eliminates duplicate rows,
and places the values in a new table.
π column1, column2, column3, … (table name)
35
SQL: SELECT column1, column2, column3, … FROM (table name)
(SQL keeps duplicate rows unless DISTINCT is added after SELECT.)
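The duplicate elimination that distinguishes PROJECT from a plain column list can be sketched in Python. The helper name project and the sample rows are assumptions for illustration only.

```python
def project(columns, relation):
    """PROJECT: keep the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = tuple(row[c] for c in columns)
        if reduced not in result:      # duplicate elimination
            result.append(reduced)
    return result

# Hypothetical rows; values are illustrative only.
employee = [
    {"fname": "John", "lname": "Smith", "dno": 5},
    {"fname": "Joyce", "lname": "English", "dno": 5},
    {"fname": "Alicia", "lname": "Zelaya", "dno": 4},
]

print(project(["dno"], employee))             # [(5,), (4,)]
print(project(["fname", "lname"], employee))  # three distinct name pairs
```

Projecting on dno alone returns two rows from three, because the two department-5 rows collapse into one.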
36
[Query tree: π column(s) above Table]
37
E.g. Show the names of all employees
π fname, minit, lname (employee)
SELECT fname, minit, lname FROM employee;
39
[Query tree: π fname,minit,lname above Employee]
40
Select & project
Show the names of all employees who work for department number 5
π fname, minit, lname (σ dno = 5 (employee))
SELECT fname, minit, lname FROM employee WHERE dno = 5;
42
[Query tree: π fname,minit,lname above σ DNO = 5 above Employee]
43
Examples of applying SELECT and PROJECT operations
44
PRODUCT (or Cartesian product): R1 X R2
R1 X R2 is a table whose width is the width of R1 plus the width of R2, and whose columns are the columns of R1 followed by the columns of R2.
If R1 has X rows and M columns, and R2 has Y rows and N columns,
then R1 X R2 has X * Y rows and M + N columns.
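The row and column counts can be confirmed with a short Python sketch. The two tables below are made-up examples with X = 3, M = 2, Y = 2 and N = 3.

```python
from itertools import product

# Two hypothetical tables: r1 has 3 rows and 2 columns,
# r2 has 2 rows and 3 columns.
r1 = [(1, "a"), (2, "b"), (3, "c")]
r2 = [("x", 10, True), ("y", 20, False)]

# Each result row is a row of r1 followed by a row of r2.
r1_x_r2 = [left + right for left, right in product(r1, r2)]

print(len(r1_x_r2))      # 6 rows = 3 * 2
print(len(r1_x_r2[0]))   # 5 columns = 2 + 3
```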
45
Cartesian Product
46
Query Tree for Cartesian Product
[Query tree: X above Table1 and Table2]
47
Example of Query Tree
48
Theta Join
The result of performing a SELECT operation on the product of two tables, using a comparison operator theta (=, <, <=, >, >=, <>).
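Read literally, this definition is a Cartesian product followed by a filter. The Python sketch below follows that reading; the helper name theta_join and the sample rows are assumptions for illustration.

```python
import operator
from itertools import product

# Hypothetical tables: student rows are (id, lname),
# credit_hours rows are (stuid, hours).
student = [(101, "Smith"), (102, "Brown"), (103, "Houston")]
credit_hours = [(101, 60), (102, 85)]

def theta_join(r1, i, theta, r2, j):
    """Product of r1 and r2, keeping rows where r1[i] theta r2[j]."""
    return [a + b for a, b in product(r1, r2) if theta(a[i], b[j])]

# Theta join with ">" (ID > STUID).
greater = theta_join(student, 0, operator.gt, credit_hours, 0)
# The same operation with "=" keeps only matching IDs.
equal = theta_join(student, 0, operator.eq, credit_hours, 0)

print(len(greater))   # 3
print(equal)          # [(101, 'Smith', 101, 60), (102, 'Brown', 102, 85)]
```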
49
Theta Join (>)
50
Theta Join (ID>STUID)
51
Query Tree for Theta Join
[Query tree: X with condition ID > STUID, above Student and Credit_Hours]
52
Equijoin
A theta join in which the comparison operator “theta” is equality (=).
53
Equijoin
54
Equijoin
55
Query Tree for Equijoin
[Query tree: X with condition ID = STUID, above Student and Credit_Hours]
56
Natural Join |X|
An equijoin in which the repeated column is eliminated.
The join is usually performed over columns with the same name.
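A natural join can be sketched in Python by matching rows on every column name the two tables share and keeping a single copy of those columns. The helper name natural_join and the sample rows are assumptions; here the shared column is called id in both tables.

```python
def natural_join(r1, r2):
    """Join rows that agree on all shared column names; keep one copy."""
    result = []
    for a in r1:
        for b in r2:
            shared = a.keys() & b.keys()
            if all(a[c] == b[c] for c in shared):
                result.append({**a, **b})   # shared columns appear once
    return result

# Hypothetical tables that share the column name "id".
student = [{"id": 101, "lname": "Smith"}, {"id": 103, "lname": "Houston"}]
credit_hours = [{"id": 101, "hours": 60}, {"id": 102, "hours": 85}]

print(natural_join(student, credit_hours))
# [{'id': 101, 'lname': 'Smith', 'hours': 60}]
```

An equijoin on the same data would return four columns, with the id value appearing twice.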
57
[Figure: equi-join result; the repeated column is marked "Remove"]
58
[Figure: table with the note "Remove this column"]
60
Query Tree for Natural Join
[Query tree: |X| above Student and Credit_Hours]
61
Semi-join: If R1 and R2 are tables, the semijoin of R1 and R2 is the natural join of R1 and R2, followed by projecting the result onto the attributes of R1.
Semijoin is not commutative.
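A short Python sketch shows why the order of the operands matters. The helper name semijoin and the sample rows are assumptions made for illustration.

```python
def semijoin(r1, r2):
    """Rows of r1 that match at least one row of r2 on shared columns."""
    result = []
    for a in r1:
        matched = any(
            all(a[c] == b[c] for c in a.keys() & b.keys()) for b in r2
        )
        if matched and a not in result:
            result.append(a)
    return result

# Hypothetical tables that share the column name "id".
student = [{"id": 101, "lname": "Smith"}, {"id": 103, "lname": "Houston"}]
credit_hours = [{"id": 101, "hours": 60}, {"id": 102, "hours": 85}]

print(semijoin(student, credit_hours))   # [{'id': 101, 'lname': 'Smith'}]
print(semijoin(credit_hours, student))   # [{'id': 101, 'hours': 60}]
```

The two results have different attributes, which is the sense in which the operation is not commutative.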
62
Create tables
create table student1 (id char(3) primary key, fname char(10), lname char(10));
insert into student1 values('101','Jim','Smith');
insert into student1 values('102','Tim','Brown');
insert into student1 values('103','Babara','Houston');
create table credit_hours (stuid char(3) primary key, hours number(3));
insert into credit_hours values('101',60);
insert into credit_hours values('102',85);
63
Left Semi-Join
64
Right Semi-Join
65
Outer Join: an extension of a THETA JOIN, an EQUIJOIN, or a NATURAL JOIN.
An outer join consists of all rows that appear in the usual theta join, plus an additional row for each tuple from the original tables that does not participate in the theta join.
Each such unmatched tuple is extended by assigning null values to the attributes of the other table.
66
Left outer join: unmatched rows from the first (left) table appear in the resulting table.
Right outer join: unmatched rows from the second (right) table appear in the resulting table.
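The null padding can be sketched in Python, with None standing in for null. The helper name left_outer_join and the sample rows are assumptions for illustration.

```python
def left_outer_join(r1, r2, key1, key2):
    """Equijoin on r1[key1] = r2[key2], keeping unmatched r1 rows."""
    r2_columns = list(r2[0].keys()) if r2 else []   # taken from the first row
    result = []
    for a in r1:
        matches = [b for b in r2 if a[key1] == b[key2]]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            # Unmatched row: pad the right-hand columns with nulls.
            result.append({**a, **{c: None for c in r2_columns}})
    return result

student = [{"id": 101, "lname": "Smith"}, {"id": 103, "lname": "Houston"}]
credit_hours = [{"stuid": 101, "hours": 60}, {"stuid": 102, "hours": 85}]

joined = left_outer_join(student, credit_hours, "id", "stuid")
for row in joined:
    print(row)
```

Student 103 has no credit_hours row, so it appears with None in the stuid and hours columns. Calling the function with the roles of the two tables swapped keeps the unmatched credit_hours rows instead, which corresponds to a right outer join.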
67
Left Outer Join
Right Outer Join
68
Outer Join -- Oracle
Left outer join:
select * from student, credit_hours where id = stuid(+);
SELECT E.FNAME, E.LNAME, dependent_name FROM EMPLOYEE E, DEPENDENT D WHERE E.SSN = D.ESSN(+);
69
Right outer join:
select * from student, credit_hours where id(+) = stuid;
70
Sample SQL
Create view:
create view v_emp_dno as select fname, lname, dno from employee;
select * from v_emp_dno;
create view v_department as select dnumber, dname from department;
select * from v_department;
Cartesian product:
select * from v_emp_dno, v_department;
Natural join (written here as an equijoin, so both dno and dnumber appear in the result):
select * from v_emp_dno, v_department where dno = dnumber;
Left outer join:
select fname, lname, ssn, essn, dependent_name from employee, dependent where ssn = essn (+);
Right outer join (the (+) marks the table whose columns are padded with nulls, so it goes on ssn):
select essn, dependent_name, fname, lname, ssn from employee, dependent where ssn (+) = essn;
71
Set operations: Union, Difference, Intersection, Division
Union (∪): The tables must be union compatible: they must have the same basic structure, and corresponding attributes must have the same domains.
The union of two relations is the set of tuples in either or both relations.
72
Example to illustrate the result of UNION, INTERSECT, and DIFFERENCE
73
SQL -- Union
select ssn from employee where dno = 5 union select distinct(essn) from dependent;
5 rows selected (the SSN query alone selects 4 rows; the ESSN query alone selects 3 rows)
74
Difference (-): The difference between two relations is the set of tuples that belong to the first relation but not to the second relation.
75
SQL -- Minus
select ssn from employee minus select unique essn from dependent;
5 rows selected (the SSN query alone selects 8 rows; the ESSN query alone selects 3 rows)
76
Intersection (∩): The intersection of two relations is the set of tuples that belong to both relations simultaneously.
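Because relations are sets of tuples, the three operations map directly onto Python's set operators. The sketch below uses made-up SSN values; both relations have a single attribute over the same domain, so they are union compatible.

```python
# Hypothetical SSN values; both relations have one attribute
# over the same domain, so they are union compatible.
dept5_ssn = {("111",), ("222",), ("333",), ("444",)}
dependent_essn = {("111",), ("222",), ("999",)}

union = dept5_ssn | dependent_essn          # in either or both
difference = dept5_ssn - dependent_essn     # in the first, not the second
intersection = dept5_ssn & dependent_essn   # in both

print(len(union), len(difference), len(intersection))   # 5 2 2
```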
77
Intersection
78
Division (÷): A binary operation that can be defined on two relations where the entire structure of one (the divisor) is a portion of the structure of the other (the dividend).
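Division is easiest to see on a two-column dividend and a one-column divisor: the result holds the values that are paired with every row of the divisor. The Python sketch below assumes that simplified shape; the helper name divide and the WORKS_ON style data are made up for illustration.

```python
def divide(dividend, divisor):
    """dividend holds (x, y) pairs, divisor holds (y,) tuples.

    Returns the x values that are paired with every y in the divisor.
    """
    needed = {y for (y,) in divisor}
    xs = {x for x, _ in dividend}
    return {
        (x,) for x in xs
        if needed <= {y for x2, y in dividend if x2 == x}
    }

# Hypothetical (essn, pno) pairs and a set of project numbers.
works_on = {("111", 1), ("111", 2), ("222", 1), ("333", 1), ("333", 2)}
projects = {(1,), (2,)}

print(sorted(divide(works_on, projects)))   # [('111',), ('333',)]
```

Employee 222 is excluded because it is paired with project 1 only.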
79
Division
80
Example of DIVISION
81
Aggregate Functions and Grouping
Script F (ℱ): <grouping attributes> ℱ <function, attribute> (R)
Functions: sum, average, maximum, minimum, count
82
All Employees (No Group By)
SELECT sum(salary), max(salary), min(salary), avg(salary) FROM employee;
Output columns: SUM(SALARY) MAX(SALARY) MIN(SALARY) AVG(SALARY)
83
Example: Retrieve the department number, the number of employees, and the average salary in each department (group by DNO).
RESULT(DNO, NUMBER_OF_EMPLOYEES, AVG_SAL):
DNO ℱ count SSN, average SALARY (EMPLOYEE)
SELECT dno, count(ssn), avg(salary) FROM employee GROUP BY dno ORDER BY dno;
Output columns: DNO COUNT(SSN) AVG(SALARY)
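Grouping can be read as two steps: partition the tuples by the grouping attribute, then apply the functions to each partition. The Python sketch below follows that reading; the EMPLOYEE rows are made-up values.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE rows: (ssn, dno, salary).
employee = [
    ("111", 5, 30000),
    ("222", 5, 40000),
    ("333", 4, 25000),
    ("444", 4, 43000),
    ("555", 1, 55000),
]

# Partition the tuples by the grouping attribute DNO.
groups = defaultdict(list)
for ssn, dno, salary in employee:
    groups[dno].append(salary)

# One result tuple per group: (DNO, COUNT SSN, AVERAGE SALARY).
result = [
    (dno, len(salaries), sum(salaries) / len(salaries))
    for dno, salaries in sorted(groups.items())
]
print(result)   # [(1, 1, 55000.0), (4, 2, 34000.0), (5, 2, 35000.0)]
```

Sorting the groups by DNO plays the role of ORDER BY dno in the SQL version.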
84
Group By
SELECT dno, sum(salary), max(salary), min(salary), avg(salary) FROM employee GROUP BY dno;
Output columns: DNO SUM(SALARY) MAX(SALARY) MIN(SALARY) AVG(SALARY)
85
DNO ℱ count SSN, average SALARY (EMPLOYEE)
86
If grouping attributes are not specified
ℱ count SSN, average SALARY (EMPLOYEE)
87
SELECT sum(salary), max(salary), min(salary), avg(salary) FROM employee, department WHERE dno = dnumber AND dname = 'Research';
Output columns: SUM(SALARY) MAX(SALARY) MIN(SALARY) AVG(SALARY)
88
View
Create View V_Dno5 as (select fname, lname, dno from employee where dno = 5);
view V_DNO5 created.
Select * from V_DNO5;
FNAME      LNAME     DNO
John       Smith     5
Franklin   Wong      5
Ramesh     Narayan   5
Joyce      English   5