COP4710 Database Systems Relational Algebra.

Slides:



Advertisements
Similar presentations
IS698: Database Management Min Song IS NJIT. The Relational Data Model.
Advertisements

1 Relational Algebra* and Tuple Calculus * The slides in this lecture are adapted from slides used in Standford's CS145 course.
Relational Algebra, Join and QBE Yong Choi School of Business CSUB, Bakersfield.
Tallahassee, Florida, 2014 COP4710 Database Systems Relational Algebra Fall 2014.
D ATABASE S YSTEMS I R ELATIONAL A LGEBRA. 22 R ELATIONAL Q UERY L ANGUAGES Query languages (QL): Allow manipulation and retrieval of data from a database.
Lecture 07: Relational Algebra
1 Relational Algebra. Motivation Write a Java program Translate it into a program in assembly language Execute the assembly language program As a rough.
Relational Algebra Dashiell Fryer. What is Relational Algebra? Relational algebra is a procedural query language. Relational algebra is a procedural query.
Relational Algebra Ch. 7.4 – 7.6 John Ortiz. Lecture 4Relational Algebra2 Relational Query Languages  Query languages: allow manipulation and retrieval.
1 Relational Algebra Lecture #9. 2 Querying the Database Goal: specify what we want from our database Find all the employees who earn more than $50,000.
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 52 Database Systems I Relational Algebra.
1 Relational Algebra Basic operators Relational algebra expression tree.
One More Normal Form Consider the dependencies: Product Company Company, State Product Is it in BCNF?
Relational Algebra Basic Operations Algebra of Bags.
Relational Algebra CIS 4301 Lecture Notes Lecture /28/2006.
Databases 1 First lecture. Informations Lecture: Monday 12:15-13:45 (3.716) Practice: Thursday 10:15-11:45 (2-519) Website of the course:
Relational Algebra - Chapter (7th ed )
M Taimoor Khan Course Objectives 1) Basic Concepts 2) Tools 3) Database architecture and design 4) Flow of data (DFDs)
Relational Algebra (Chapter 7)
The Relational Algebra and Calculus
1 Relational Algebra Basic Operations Algebra of Bags.
Databases : Relational Algebra 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof. Jeffrey D. Ullman distributes.
From Professor Ullman, Relational Algebra.
1 Lecture 2 Relational Algebra Based on
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 The Relational Algebra and Relational Calculus.
Advanced Relational Algebra & SQL (Part1 )
Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof.
1 Algebra of Queries Classical Relational Algebra It is a collection of operations on relations. Each operation takes one or two relations as its operand(s)
Relational Algebra BASIC OPERATIONS 1 DATABASE SYSTEMS AND CONCEPTS, CSCI 3030U, UOIT, COURSE INSTRUCTOR: JAREK SZLICHTA.
1 CSE544: Lecture 7 XQuery, Relational Algebra Monday, 4/22/02.
1 Introduction to Database Systems, CS420 Relational Algebra.
1 Database Design: DBS CB, 2 nd Edition Relational Algebra: Basic Operations & Algebra of Bags Ch. 5.
Relational Algebra COMP3211 Advanced Databases Nicholas Gibbins
1. Chapter 2: The relational Database Modeling Section 2.4: An algebraic Query Language Chapter 5: Algebraic and logical Query Languages Section 5.1:
©Silberschatz, Korth and Sudarshan2.1Database System Concepts - 6 th Edition Chapter 8: Relational Algebra.
Ritu CHaturvedi Some figures are adapted from T. COnnolly
Database Systems Chapter 6
Summary of Relational Algebra
Basic Operations Algebra of Bags
Relational Algebra.
CPSC-310 Database Systems
COMP3017 Advanced Databases
Fundamental of Database Systems
Relational Algebra at a Glance
Relational Algebra Chapter 4 1.
Lecture 8: Relational Algebra
The Relational Algebra and Relational Calculus
CS 440 Database Management Systems
Chapter 2: Intro to Relational Model
Relational Algebra Chapter 4, Part A
COP5725 Advanced Database Systems
Database Management Systems (CS 564)
Relational Algebra 461 The slides for this text are organized into chapters. This lecture covers relational algebra, from Chapter 4. The relational calculus.
Operators Expression Trees Bag Model of Data
LECTURE 3: Relational Algebra
Relational Algebra Chapter 4 1.
CPSC-310 Database Systems
2018, Fall Pusan National University Ki-Joune Li
Relational Algebra Chapter 4, Sections 4.1 – 4.2
Basic Operations Algebra of Bags
Lecture 33: The Relational Model 2
Chapter 2: Intro to Relational Model
Relational Algebra Friday, 11/14/2003.
CS639: Data Management for Data Science
Example of a Relation attributes (or columns) tuples (or rows)
Chapter 2: Intro to Relational Model
Lecture 3: Relational Algebra and SQL
Lecture 11: Functional Dependencies
Select-From-Where Statements Multirelation Queries Subqueries
2019, Fall Pusan National University Ki-Joune Li
Presentation transcript:

COP4710 Database Systems Relational Algebra

Why Do We Learn This? Querying the database: specify what we want from our database Find all the people who earn more than $1,000,000 and pay taxes in Tallahassee Could write in C++/Java, but a bad idea Instead use high-level query languages: Theoretical: Relational Algebra, Datalog Practical: SQL Relational algebra: a basic set of operations on relations that provide the basic principles

What is an “Algebra”? Mathematical system consisting of: Examples Operands --- variables or values from which new values can be constructed Operators --- symbols denoting procedures that construct new values from given values Examples Arithmetic(Elementary) algebra, linear algebra, Boolean algebra …… What are operands? What are operators?

What is Relational Algebra? An algebra Whose operands are relations or variables that represent relations Whose operators are designed to do common things that we need to do with relations in a database relations as input, new relation as output Can be used as a query language for relations!

Relational Operators at a Glance Five basic RA operations: Basic Set Operations union, difference (no intersection, no complement) Selection: s Projection: p Cartesian Product: X When our relations have attribute names: Renaming: r Derived operations: Intersection, complement Joins (natural join, equi-join, theta join, semi-join, ……)

Set Operations Union: all tuples in R1 or R2, denoted as R1 U R2 R1, R2 must have the same schema R1 U R2 has the same schema as R1, R2 Example: Active-Employees U Retired-Employees If any, is duplicate elimination required? Difference: all tuples in R1 but not in R2, denoted as R1 – R2 R1 - R2 has the same schema as R1, R2 Example All-Employees - Retired-Employees

Selection Returns all tuples (rows) which satisfy a condition, denoted as sc(R) c is a condition: =, <, >, AND, OR, NOT Output schema: same as input schema Find all employees with salary more than $40,000: sSalary > 40000 (Employee) SSN Name Dept-ID Salary 111060000 Alex 1 30K 754320032 Bob 32K 983210129 Chris 2 45K SSN Name Dept-ID Salary 983210129 Chris 2 45K

Projection Unary operation: returns certain columns, denoted as P A1,…,An (R) Eliminates duplicate tuples ! Input schema R(B1, …, Bm) Condition: {A1, …, An} {B1, …, Bm} Output schema S(A1, …, An) Example: project social-security number and names: P SSN, Name (Employee) SSN Name Dept-ID Salary 111060000 Alex 1 30K 754320032 Bob 32K 983210129 Chris 2 45K SSN Name 111060000 Alex 754320032 Bob 983210129 Chris

Selection vs. Projection Think of relation as a table How are they similar? How are they different? Horizontal vs. vertical? Duplicate elimination for both? What about in real systems? Why do you need both?

Cartesian Product Each tuple in R1 with each tuple in R2, denoted as R1 x R2 Input schemas R1(A1,…,An), R2(B1,…,Bm) Output schema is S(A1, …, An, B1, …, Bm) Two relations are combined! Very rare in practice; but joins are very common Example: Employee x Dependent

Example SSN Name 111060000 Alex 754320032 Brandy Employee-SSN Dependent SSN Name 111060000 Alex 754320032 Brandy Employee-SSN Dependent-Name 111060000 Chris 754320032 David Employee x Dependent SSN Name Employee-SSN Dependent-Name 111060000 Alex Chris 754320032 David Brandy

Renaming Soc-sec-num, firstname(Employee) Does not change the relational instance, denoted as Notation: r S(B1,…,Bn) (R) Changes the relational schema only Input schema: R(A1, …, An) Output schema: S(B1, …, Bn) Example: Soc-sec-num, firstname(Employee) SSN Name 111060000 Alex 754320032 Bob 983210129 Chris Soc-sec-num firstname 111060000 Alex 754320032 Bob 983210129 Chris

Set Operations: Intersection Intersection: all tuples both in R1 and in R2, denoted as R1 R2 R1, R2 must have the same schema R1 R2 has the same schema as R1, R2 Example UnionizedEmployees RetiredEmployees Intersection is derived: R1 R2 = R1 – (R1 – R2) why ?

Theta Join A join that involves a predicate q, denoted as R1 q R2 Input schemas: R1(A1,…,An), R2(B1,…,Bm) Output schema: S(A1,…,An,B1,…,Bm) Derived operator: R1 q R2 = s q (R1 x R2) Take the (Cartisian) product R1 x R2 Then apply SELECTC to the result As for SELECT, C can be any Boolean-valued condition

Theta Join: Example Name Address Bar Beer Price AJ’s Bud 2.5 Miller Sells Name Address AJ's 1800 Tennessee Michael's Pub 513 Gaines Bar Beer Price AJ’s Bud 2.5 Miller 2.75 Michael’s Pub Corona 3.0 BarInfo := Sells Sells.Bar=Bar.Name Bar Bar Beer Price Name Address AJ’s Bud 2.5 AJ's 1800 Tennessee Miller 2.75 Michael’s Pub Michael's Pub 513 Gaines Corona 3.0

Natural Join Notation: R1 R2 Input Schema: R1(A1, …, An), R2(B1, …, Bm) Output Schema: S(C1,…,Cp) Where {C1, …, Cp} = {A1, …, An} U{B1, …, Bm} Meaning: combine all pairs of tuples in R1 and R2 that agree on the join attributes: {A1,…,An} {B1,…, Bm} (called the join attributes)

Natural Join: Examples Employee Dependent SSN Name 111060000 Alex 754320032 Brandy SSN Dependent-Name 111060000 Chris 754320032 David Employee Dependent = P SSN, Name, Dependent-Name(sEmployee.SSN=Dependent.SSN(Employee x Dependent) SSN Name Dependent-Name 111060000 Alex Chris 754320032 Brandy David

Natural Join: Examples B X Y Z V B C Z U V W R S A B C X Z U V Y W

Equi-join Special case of theta join: condition c contains only conjunctions of equalities Result schema is the same as that of Cartesian product May have fewer tuples than Cartesian product Most frequently used in practice: R1 A=B R2 Natural join is a particular case of equi-join A lot of research on how to do it efficiently

A Joke About Join A join query walks up to two tables in a restaurant and asks : “Mind if I join you?”

Building Complex Expressions Algebras allow us to express sequences of operations in a natural way Example In arithmetic algebra: (x + 4)*(y - 3) Relational algebra allows the same Three notations, just as in arithmetic: Sequences of assignment statements Expressions with several operators Expression trees

Sequences of Assignments Create temporary relation names Renaming can be implied by giving relations a list of attributes Example: R3 := R1 C R2 can be written: R4 := R1 x R2 (R4: temporary relation) R3 := sC (R4)

Expressions with Several Operators Example: the theta-join R3 := R1 JOINC R2 can be written: R3 := sC (R1 x R2) Precedence of relational operators: Unary operators --- select, project, rename --- have highest precedence, bind first Then come products and joins Then intersection Finally, union and set difference bind last But you can always insert parentheses to force the order you desire

Expression Trees Leaves are operands either variables standing for relations or particular constant relations Interior nodes are operators, applied to their child or children

Expression Tree: Examples Given Bars(name, addr), Sells(bar, beer, price), find the names of all the bars that are either on Tennessee St. or sell Bud for less than $3 UNION RENAMER(name) PROJECTname PROJECTbar SELECTaddr = “Tennessee St.” SELECT price<3 AND beer=“Bud” Bars Sells

Summary of Relational Algebra Why bother ? Can write any RA expression directly in C++/Java, seems easy Two reasons: Each operator admits sophisticated implementations (think of and s C) Expressions in relational algebra can be rewritten: optimized s(age >= 30 AND age <= 35)(Employees) Method 1: scan the file, test each employee Method 2: use an index on age Employees Relatives Iterate over Employees, then over Relatives? Or iterate over Relatives, then over Employees? Sort Employees, Relatives, do “merge-join” “hash-join” etc.