 CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian.



Chen Qian, University of Kentucky, 12/22/2015
Review
  Informal terms        Formal terms
  Table                 Relation
  Column                Attribute/Domain
  Row                   Tuple
  Values in a column    Domain
  Table definition      Schema of a relation
  Populated table       Extension

Update Operations on Relations
  - Update operations: INSERT a tuple, DELETE a tuple, MODIFY a tuple.
  - Constraints should not be violated by updates.

Example
We have the following relational schemas:
  Student(sid: string, name: string, gpa: float)
  Course(cid: string, department: string)
  Enrolled(sid: string, cid: string, grade: character)
We have the following sequence of database update operations (assume all tables are empty before any operation is applied).
INSERT into Student:
  sid   name        gpa
  1234  John Smith  3.5

Example (Cont.)
  - INSERT into Course.
  - INSERT into Enrolled.
  - UPDATE the grade in the Enrolled tuple with sid = 1234 and cid = 647 to 'A'.
  - DELETE the Enrolled tuple with sid = 1234 and cid = 647.
Resulting tables:
  Student:                (1234, John Smith, 3.5)
  Course:                 (647, EECS)
  Enrolled after INSERT:  (1234, 647, B)
  Enrolled after UPDATE:  (1234, 647, A)
  Enrolled after DELETE:  (empty)
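
The sequence of update operations above can be replayed with a small sketch using Python's built-in sqlite3 module. Using SQL here is an assumption (the slides describe the operations abstractly), and the inserted Enrolled tuple (1234, 647, B) is inferred from the UPDATE step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid TEXT, name TEXT, gpa REAL);
    CREATE TABLE Course (cid TEXT, department TEXT);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
""")

# INSERT one tuple into each table.
conn.execute("INSERT INTO Student VALUES ('1234', 'John Smith', 3.5)")
conn.execute("INSERT INTO Course VALUES ('647', 'EECS')")
conn.execute("INSERT INTO Enrolled VALUES ('1234', '647', 'B')")
after_insert = conn.execute("SELECT * FROM Enrolled").fetchall()

# MODIFY (UPDATE) the grade of the Enrolled tuple.
conn.execute("UPDATE Enrolled SET grade = 'A' WHERE sid = '1234' AND cid = '647'")
after_update = conn.execute("SELECT * FROM Enrolled").fetchall()

# DELETE the Enrolled tuple; the table is empty again.
conn.execute("DELETE FROM Enrolled WHERE sid = '1234' AND cid = '647'")
after_delete = conn.execute("SELECT * FROM Enrolled").fetchall()
```

Each of the three snapshots corresponds to one state of the Enrolled table in the example.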

Exercise
  - INSERT into Course: Course goes from (647, EECS) to (647, EECS), (108, MATH).
  - INSERT into Enrolled: adds one tuple with grade B to the empty Enrolled table.
  - INSERT into Student: adds Mary Carter (gpa 3.8) next to (1234, John Smith, 3.5).

Exercise (cont.): a little bit tricky
  - INSERT into Student: fails due to a domain constraint.
  - INSERT into Enrolled: fails due to entity integrity.
  - INSERT into Enrolled: fails due to referential integrity.
Tables shown: Student holds John Smith and Mary Carter (gpa 3.8); Course holds (647, EECS) and (108, MATH); Enrolled holds one tuple with grade B.

Exercise (cont.): a more tricky one
  - UPDATE the cid in the Course tuple where cid = 108 to 109.
Before: Course holds (647, EECS) and (108, MATH).
After: Course holds (647, EECS) and (109, MATH).
Student (John Smith, Mary Carter) and the Enrolled tuple with grade B are shown alongside.

Update Operations on Relations
In case of an integrity violation, several actions can be taken:
  - Cancel the operation that causes the violation (REJECT option).
  - Perform the operation but inform the user of the violation.
  - Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option).
  - Execute a user-specified error-correction routine.
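
As a rough illustration of the REJECT and CASCADE options, the sketch below uses Python's built-in sqlite3 module. The SQL schema, the Enrolled tuple (1234, 108, B), and the non-existent sid 9999 are assumptions made for this example; they are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Student (sid TEXT PRIMARY KEY NOT NULL, name TEXT, gpa REAL);
    CREATE TABLE Course (cid TEXT PRIMARY KEY NOT NULL, department TEXT);
    CREATE TABLE Enrolled (
        sid TEXT NOT NULL REFERENCES Student(sid),
        cid TEXT NOT NULL REFERENCES Course(cid) ON UPDATE CASCADE,
        grade TEXT,
        PRIMARY KEY (sid, cid)
    );
    INSERT INTO Student VALUES ('1234', 'John Smith', 3.5);
    INSERT INTO Course VALUES ('647', 'EECS'), ('108', 'MATH');
    INSERT INTO Enrolled VALUES ('1234', '108', 'B');  -- hypothetical tuple
""")

# REJECT: sid 9999 is not in Student, so the insert violates referential integrity.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('9999', '647', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# CASCADE: changing Course.cid from 108 to 109 also rewrites the referencing Enrolled tuple.
conn.execute("UPDATE Course SET cid = '109' WHERE cid = '108'")
enrolled = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()
```

Without ON UPDATE CASCADE on Enrolled.cid, the same UPDATE would be rejected instead, because the Enrolled tuple would be left pointing at a course that no longer exists.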

Relational Query Languages
  - Query languages (QLs) allow manipulation and retrieval of data from a database.
  - The relational model supports simple, powerful QLs:
      - Strong formal foundation based on logic.
      - Allows for much optimization.
  - Query languages != programming languages!
      - QLs are not intended for complex calculations and inference (e.g. logical reasoning).
      - QLs support easy, efficient access to large data sets.

Formal Relational Query Languages
Two mathematical query languages form the basis for "real" languages (e.g. SQL) and for implementation:
  - Relational Algebra: more operational; very useful for representing execution plans.
  - Relational Calculus: lets users describe what they want, rather than how to compute it (non-procedural, declarative).
Understanding algebra and calculus is key to understanding SQL and query processing!

Relational algebra
A language for querying relational databases based on operators.
  - Core set of operators: selection, projection, cross product, union, difference, and renaming.
  - Additional, derived operators: join, natural join, intersection, etc.
  - Compose operators to make complex queries.

Selection
  - Input: a table R
  - Notation: σ_p R, where p is called a selection condition/predicate
  - Purpose: filter rows according to some criteria
  - Output: same columns as R, but only the rows of R that satisfy p

Selection example
Students with GPA higher than 3.0: σ_{GPA > 3.0} Student
Input: Student(sid, name, age, gpa) with rows for John Smith, Mary Carter, Bob Lee, Susan Wong, and Kevin Kim (age 21, gpa 2.9).
Output: the same columns, restricted to the rows with gpa above 3.0.
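
A minimal sketch of selection in Python, assuming a table is modelled as a list of dicts (one dict per row). The `select` helper is hypothetical, and the ages and some GPAs below are placeholder values, since most of the table's contents did not survive in this transcript.

```python
def select(relation, predicate):
    """sigma_p(R): keep the rows of R that satisfy p; columns are unchanged."""
    return [row for row in relation if predicate(row)]

# Illustrative Student table; ages and Susan Wong's gpa are placeholders.
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 19, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
    {"sid": 1204, "name": "Susan Wong", "age": 22, "gpa": 3.4},
    {"sid": 1306, "name": "Kevin Kim", "age": 21, "gpa": 2.9},
]

# sigma_{GPA > 3.0} Student
high_gpa = select(student, lambda row: row["gpa"] > 3.0)
```

The predicate is an ordinary function of one row, which mirrors the requirement that p be evaluable on a single row of the input.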

More on selection
  - A selection predicate can in general include any column of R, constants, comparisons (=, >, etc.), and Boolean connectives (∧: and, ∨: or, ¬: not).
      Example: straight-A students under 18 or over 21: σ_{GPA = 4.0 ∧ (age < 18 ∨ age > 21)} Student
  - But the predicate must be evaluable over a single row of the input table.
      Counterexample: student with the highest GPA. σ_{GPA >= all GPA in Student table} Student is not a valid selection, because the predicate refers to the whole table rather than to one row.
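
The compound predicate can be sketched in Python as follows. The `select` helper and the four rows are hypothetical, chosen only so that each branch of the predicate is exercised.

```python
def select(relation, predicate):
    """sigma_p(R): keep the rows of R for which the row-level predicate holds."""
    return [row for row in relation if predicate(row)]

# Hypothetical rows, not taken from the slides.
student = [
    {"sid": 1, "age": 17, "gpa": 4.0},
    {"sid": 2, "age": 19, "gpa": 4.0},
    {"sid": 3, "age": 23, "gpa": 4.0},
    {"sid": 4, "age": 23, "gpa": 3.1},
]

# sigma_{GPA = 4.0 AND (age < 18 OR age > 21)} Student
straight_a = select(
    student,
    lambda r: r["gpa"] == 4.0 and (r["age"] < 18 or r["age"] > 21),
)
```

The lambda only ever looks at the single row `r` it is given. A "highest GPA" test would have to consult every row of `student`, which is exactly what a selection predicate is not allowed to do.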

Projection
  - Input: a table R
  - Notation: π_L R, where L is a list of columns in R
  - Purpose: select columns to output
  - Output: same rows, but only the columns in L
      - Order of the rows is preserved.
      - The number of rows may be smaller (depending on whether there are duplicates).

Projection example
IDs and names of all students: π_{SID, name} Student
Input: Student(sid, name, age, gpa) with rows for John Smith, Mary Carter, Bob Lee, Susan Wong, and Kevin Kim.
Output:
  sid   name
  1234  John Smith
  1123  Mary Carter
  1011  Bob Lee
  1204  Susan Wong
  1306  Kevin Kim

More on projection
  - Duplicate output rows are removed (by definition).
  - Example: student ages: π_age Student
    The output is a single age column in which each distinct age from Student appears once.
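
Projection with duplicate removal might be sketched like this, again modelling a table as a list of dicts. The `project` helper is hypothetical and the ages are placeholder values.

```python
def project(relation, columns):
    """pi_L(R): keep only the columns in L and drop duplicate output rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:  # set semantics: each output row appears once
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

# Illustrative Student table; ages are placeholders.
student = [
    {"sid": 1234, "name": "John Smith", "age": 21},
    {"sid": 1123, "name": "Mary Carter", "age": 19},
    {"sid": 1011, "name": "Bob Lee", "age": 22},
    {"sid": 1204, "name": "Susan Wong", "age": 22},
    {"sid": 1306, "name": "Kevin Kim", "age": 21},
]

ids_and_names = project(student, ["sid", "name"])  # pi_{SID, name} Student
ages = project(student, ["age"])                   # pi_{age} Student
```

Because sid is unique, the first projection keeps all five rows, while the projection on age collapses the repeated ages.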

Cross product
  - Input: two tables R and S
  - Notation: R × S
  - Purpose: pair rows from two tables
  - Output: for each row r in R and each row s in S, output a row rs (the concatenation of r and s)

Cross product example
Student × Enroll
Input: Student(sid, name, age, gpa) with three rows (John Smith, Mary Carter, Bob Lee) and Enroll(sid, cid, grade) with two rows, both with grade A.
Output: columns (sid, name, age, gpa, sid, cid, grade) and 3 × 2 = 6 rows, each Student row paired with each Enroll row.
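
A small sketch of cross product in Python. The `cross` helper is hypothetical; it prefixes each column with its table name so that both sid columns survive. The Enroll sid and cid values and the first two ages are placeholders.

```python
from itertools import product

def cross(r, s, r_name, s_name):
    """R x S: concatenate every row of R with every row of S."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

# Illustrative data; Enroll's sid/cid values are placeholders.
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 19, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
]
enroll = [
    {"sid": 1234, "cid": "647", "grade": "A"},
    {"sid": 1123, "cid": "108", "grade": "A"},
]

pairs = cross(student, enroll, "Student", "Enroll")
```

The result has len(student) * len(enroll) rows regardless of the data, which is why cross product is usually followed by a selection.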

A note on column ordering
  - The ordering of columns in a table is considered unimportant (as is the ordering of rows).
    For example, Student with columns (sid, name, age, gpa) equals Student with columns (sid, name, gpa, age).
  - That means cross product is commutative, i.e., R × S = S × R for any R and S.

Derived operator: join
  - Input: two tables R and S
  - Notation: R ⋈_p S, where p is called a join condition/predicate
  - Purpose: relate rows from two tables according to some criteria
  - Output: for each row r in R and each row s in S, output a row rs if r and s satisfy p
  - Shorthand for σ_p(R × S)

Join example
Info about students, plus CIDs of their courses: Student ⋈_{Student.SID = Enroll.SID} Enroll
Use table_name.column_name syntax to disambiguate identically named columns from different input tables.
Input: Student(sid, name, age, gpa) with three rows (John Smith, Mary Carter, Bob Lee) and Enroll(sid, cid, grade) with two rows, both with grade A.
Output: of the six rows in Student × Enroll, only those satisfying Student.SID = Enroll.SID are kept.
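
Since a join is shorthand for a selection over a cross product, it can be sketched by composing the two. The `cross` and `join` helpers are hypothetical, and the Enroll sid and cid values are placeholders.

```python
from itertools import product

def cross(r, s, r_name, s_name):
    """R x S with table-qualified column names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

def join(r, s, r_name, s_name, predicate):
    """R join_p S, computed literally as sigma_p(R x S)."""
    return [row for row in cross(r, s, r_name, s_name) if predicate(row)]

# Illustrative data; Enroll's sid/cid values are placeholders.
student = [
    {"sid": 1234, "name": "John Smith"},
    {"sid": 1123, "name": "Mary Carter"},
    {"sid": 1011, "name": "Bob Lee"},
]
enroll = [
    {"sid": 1234, "cid": "647", "grade": "A"},
    {"sid": 1123, "cid": "108", "grade": "A"},
]

# Student join_{Student.SID = Enroll.SID} Enroll
joined = join(student, enroll, "Student", "Enroll",
              lambda row: row["Student.sid"] == row["Enroll.sid"])
```

Building the full cross product first is the definition, not an efficient strategy; a real system would pick a cheaper plan that yields the same rows.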

Derived operator: natural join
  - Input: two tables R and S
  - Notation: R ⋈ S
  - Purpose: relate rows from two tables, and
      - enforce equality on all common attributes
      - eliminate one copy of common attributes
  - Shorthand for π_L(R ⋈_p S), where
      - p equates all attributes common to R and S
      - L is the union of all attributes from R and S, with duplicate attributes removed

Natural join example
Student ⋈ Enroll = π_L(Student ⋈_p Enroll)
                 = π_{SID, name, age, GPA, CID, grade}(Student ⋈_{Student.SID = Enroll.SID} Enroll)
Input: Student(sid, name, age, gpa) with three rows (John Smith, Mary Carter, Bob Lee) and Enroll(sid, cid, grade) with two rows, both with grade A.
Output: the rows of Student × Enroll whose two sid values agree, with a single sid column.
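
A natural join can be sketched directly on the list-of-dicts model. The `natural_join` helper is hypothetical, and the Enroll sid and cid values and the first two ages are placeholders.

```python
def natural_join(r, s):
    """R natural-join S: equate all common columns and keep one copy of each."""
    if not r or not s:
        return []  # joining with an empty table yields no rows
    common = [c for c in r[0] if c in s[0]]
    return [
        {**a, **b}  # merging the dicts keeps a single copy of each common column
        for a in r
        for b in s
        if all(a[c] == b[c] for c in common)
    ]

# Illustrative data; Enroll's sid/cid values are placeholders.
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 19, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
]
enroll = [
    {"sid": 1234, "cid": "647", "grade": "A"},
    {"sid": 1123, "cid": "108", "grade": "A"},
]

result = natural_join(student, enroll)
```

Here the only common column is sid, so the output schema is the union of both schemas with sid listed once.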

Union
  - Input: two tables R and S
  - Notation: R ∪ S
      - R and S must have identical schema
  - Output:
      - Has the same schema as R and S
      - Contains all rows in R and all rows in S, with duplicate rows eliminated

Difference
  - Input: two tables R and S
  - Notation: R - S
      - R and S must have identical schema
  - Output:
      - Has the same schema as R and S
      - Contains all rows in R that are not found in S

Derived operator: intersection
  - Input: two tables R and S
  - Notation: R ∩ S
      - R and S must have identical schema
  - Output:
      - Has the same schema as R and S
      - Contains all rows that are in both R and S
  - Shorthand for R - (R - S)
  - Also equivalent to S - (S - R)
  - And to R ⋈ S
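
With rows modelled as tuples over a shared schema, Python's built-in set type gives union, difference, and intersection directly, which makes the identities above easy to check. The (sid, cid) rows below are hypothetical.

```python
# Two tables with identical schema (sid, cid); rows are hypothetical.
r = {(1234, "647"), (1123, "108")}
s = {(1123, "108"), (1011, "647")}

union = r | s          # all rows of R and S, duplicates eliminated
difference = r - s     # rows of R not found in S
intersection = r & s   # rows in both R and S

# Intersection expressed with difference only, as on the slide.
via_r = r - (r - s)
via_s = s - (s - r)
```

Both derived forms produce the same set as the direct intersection, which is why intersection adds convenience rather than expressive power.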

Renaming
  - Input: a table R
  - Notation: ρ_S R, ρ_(A1, A2, …) R, or ρ_S(A1, A2, …) R
  - Purpose: rename a table and/or its columns
  - Output: a renamed table with the same rows as R
  - Used to
      - avoid confusion caused by identical column names
      - create identical column names for natural joins

Renaming example
ρ_Enroll1(SID1, CID1, Grade1) Enroll
Input: Enroll(sid, cid, grade) with two rows, both with grade A.
Output: Enroll1(sid1, cid1, grade1) with the same two rows.
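
Column renaming can be sketched as a positional relabelling of each row. The `rename` helper is hypothetical, the sid and cid values are placeholders, and the table name itself is not tracked in this list-of-dicts model.

```python
def rename(relation, new_columns):
    """rho_(A1, A2, ...)(R): same rows, new column names in positional order."""
    return [dict(zip(new_columns, row.values())) for row in relation]

# Illustrative Enroll table; sid/cid values are placeholders.
enroll = [
    {"sid": 1234, "cid": "647", "grade": "A"},
    {"sid": 1123, "cid": "108", "grade": "A"},
]

enroll1 = rename(enroll, ["sid1", "cid1", "grade1"])
```

Only the column labels change; the values in each row are carried over untouched.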

Review: summary of core operators
  - Selection: σ_p R
  - Projection: π_L R
  - Cross product: R × S
  - Union: R ∪ S
  - Difference: R - S
  - Renaming: ρ_S(A1, A2, …) R (does not really add "processing" power)

Review: summary of derived operators
  - Join: R ⋈_p S
  - Natural join: R ⋈ S
  - Intersection: R ∩ S
  - Many more: outer join, division, semijoin, anti-semijoin, …

  - Red parts
  - pid of red parts
  - Catalog having red parts

  - sid of suppliers who supply red parts
  - names of suppliers who supply red parts
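
One way these five queries could be composed is sketched below. The schema is an assumption, since the transcript gives only the query titles: Suppliers(sid, sname), Parts(pid, pname, color), and Catalog(sid, pid, cost). The helpers and all rows are hypothetical as well.

```python
def select(relation, predicate):
    """sigma_p(R)."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """pi_L(R) with duplicate rows removed."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

def natural_join(r, s):
    """R natural-join S on all common columns."""
    if not r or not s:
        return []
    common = [c for c in r[0] if c in s[0]]
    return [{**a, **b} for a in r for b in s if all(a[c] == b[c] for c in common)]

# Assumed schema and made-up rows.
suppliers = [{"sid": 1, "sname": "Acme"}, {"sid": 2, "sname": "Best"}]
parts = [{"pid": 10, "pname": "bolt", "color": "red"},
         {"pid": 11, "pname": "nut", "color": "green"}]
catalog = [{"sid": 1, "pid": 10, "cost": 5.0}, {"sid": 2, "pid": 11, "cost": 3.0}]

red_parts = select(parts, lambda p: p["color"] == "red")            # red parts
red_pids = project(red_parts, ["pid"])                              # pid of red parts
red_catalog = natural_join(catalog, red_pids)                       # Catalog having red parts
red_sids = project(red_catalog, ["sid"])                            # sid of suppliers of red parts
red_names = project(natural_join(suppliers, red_sids), ["sname"])   # their names
```

Each step feeds the next, which illustrates how operators are composed into a complex query.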