 CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian.


1  CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian

2 Review
12/22/2015, Chen Qian, University of Kentucky

Informal Terms       Formal Terms
Table                Relation
Column               Attribute/Domain
Row                  Tuple
Values in a column   Domain
Table Definition     Schema of a Relation
Populated Table      Extension

3 Update Operations on Relations
Update operations:
- INSERT a tuple.
- DELETE a tuple.
- MODIFY a tuple.
Constraints should not be violated by updates.

4 Example
We have the following relational schemas:
Student(sid: string, name: string, gpa: float)
Course(cid: string, department: string)
Enrolled(sid: string, cid: string, grade: character)
We apply the following sequence of database update operations (assume all tables are empty before we apply any operations).

INSERT into Student. Student after the insert:
sid   name        gpa
1234  John Smith  3.5

5 Example (Cont.)
Student (unchanged throughout):
sid   name        gpa
1234  John Smith  3.5

INSERT into Course. Course after the insert:
cid  department
647  EECS

INSERT into Enrolled. Enrolled after the insert:
sid   cid  grade
1234  647  B

UPDATE the grade in the Enrolled tuple with sid = 1234 and cid = 647 to ‘A’. Enrolled after the update:
sid   cid  grade
1234  647  A

DELETE the Enrolled tuple with sid = 1234 and cid = 647. Enrolled after the delete:
sid   cid  grade
(no rows)

6 Exercise
Starting state:
Student:
sid   name        gpa
1234  John Smith  3.5
Course:
cid  department
647  EECS
Enrolled:
sid   cid  grade
(no rows)

INSERT into Course. Course after the insert:
cid  department
647  EECS
108  MATH

INSERT into Enrolled. Enrolled after the insert:
sid   cid  grade
1234  108  B

INSERT into Student. Student after the insert:
sid   name         gpa
1234  John Smith   3.5
1123  Mary Carter  3.8

7 Exercise (cont.)
A little bit tricky:
- INSERT into Student: fails due to a domain constraint.
- INSERT into Enrolled: fails due to entity integrity.
- INSERT into Enrolled: fails due to referential integrity.

The tables stay as they were:
Student:
sid   name         gpa
1234  John Smith   3.5
1123  Mary Carter  3.8
Course:
cid  department
647  EECS
108  MATH
Enrolled:
sid   cid  grade
1234  108  B

8 Exercise (cont.)
A trickier one: UPDATE the cid in the Course tuple where cid = 108 to 109.

Before the update:
Student:
sid   name         gpa
1234  John Smith   3.5
1123  Mary Carter  3.8
Course:
cid  department
647  EECS
108  MATH
Enrolled:
sid   cid  grade
1234  108  B

After the update (the change is propagated to the referencing Enrolled tuple):
Course:
cid  department
647  EECS
109  MATH
Enrolled:
sid   cid  grade
1234  109  B

9 Update Operations on Relations
In case of an integrity violation, several actions can be taken:
- Cancel the operation that causes the violation (REJECT option).
- Perform the operation but inform the user of the violation.
- Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option).
- Execute a user-specified error-correction routine.
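The REJECT and CASCADE options can be illustrated with a minimal Python sketch. It assumes in-memory tables stored as lists of dicts and a hypothetical helper `update_course_cid`; neither comes from the lecture, and a real DBMS enforces these rules internally.

```python
# Hypothetical in-memory tables mirroring the Course/Enrolled example state.
course = [
    {"cid": "647", "department": "EECS"},
    {"cid": "108", "department": "MATH"},
]
enrolled = [{"sid": "1234", "cid": "108", "grade": "B"}]


def update_course_cid(course, enrolled, old, new, on_violation="REJECT"):
    """Change Course.cid from old to new and return the new (course, enrolled).

    Enrolled.cid is treated as a foreign key referencing Course.cid.
    on_violation is either "REJECT" or "CASCADE" (an illustrative parameter).
    """
    referenced = any(e["cid"] == old for e in enrolled)
    if referenced and on_violation == "REJECT":
        # REJECT: cancel the operation that would break referential integrity.
        raise ValueError(f"cid {old} is still referenced by Enrolled")
    new_course = [dict(c, cid=new) if c["cid"] == old else c for c in course]
    if on_violation == "CASCADE":
        # CASCADE: propagate the key change to every referencing tuple.
        new_enrolled = [dict(e, cid=new) if e["cid"] == old else e for e in enrolled]
    else:
        new_enrolled = enrolled
    return new_course, new_enrolled


cascaded_course, cascaded_enrolled = update_course_cid(
    course, enrolled, "108", "109", on_violation="CASCADE"
)
```

With CASCADE the Enrolled tuple follows the renamed course; with REJECT the same call raises an error and both tables stay untouched.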

10 Relational Query Languages
Query languages allow manipulation and retrieval of data from a database.
The relational model supports simple, powerful QLs:
- Strong formal foundation based on logic.
- Allows for much optimization.
Query languages != programming languages!
- QLs are not intended to be used for complex calculations and inference (e.g. logical reasoning).
- QLs support easy, efficient access to large data sets.

11 Formal Relational Query Languages
Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for implementation:
- Relational Algebra: more operational, very useful for representing execution plans.
- Relational Calculus: lets users describe what they want, rather than how to compute it (non-procedural, declarative).
* Understanding algebra and calculus is key to understanding SQL and query processing!

12 Relational algebra
A language for querying relational databases based on operators.
[Diagram label: OPER]
- Core set of operators: selection, projection, cross product, union, difference, and renaming.
- Additional, derived operators: join, natural join, intersection, etc.
- Compose operators to make complex queries.

13 Selection
Input: a table R
Notation: σ_p R, where p is called a selection condition/predicate
Purpose: filter rows according to some criteria
Output: same columns as R, but only the rows of R that satisfy p

14 Selection example
Students with GPA higher than 3.0: σ_{GPA > 3.0} Student

Student (input):
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6
1204  Susan Wong   22   3.4
1306  Kevin Kim    21   2.9

σ_{GPA > 3.0} Student (output; the rows for Bob Lee and Kevin Kim are filtered out):
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1204  Susan Wong   22   3.4
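Selection can be sketched in a few lines of Python. The representation (a relation as a list of dicts) and the helper name `select` are assumptions made for illustration, not part of the lecture.

```python
# The Student table from the example, as a list of rows (dicts).
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 22, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
    {"sid": 1204, "name": "Susan Wong", "age": 22, "gpa": 3.4},
    {"sid": 1306, "name": "Kevin Kim", "age": 21, "gpa": 2.9},
]


def select(relation, predicate):
    """sigma_p(R): keep exactly the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


# sigma_{GPA > 3.0}(Student)
high_gpa = select(student, lambda row: row["gpa"] > 3.0)
```

The predicate receives one row at a time, which matches the requirement that a selection condition be evaluable over a single row.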

15 More on selection
A selection predicate can in general include any column of R, constants, comparisons (=, >, etc.), and Boolean connectives (∧: and, ∨: or, ¬: negation (not)).
Example: straight-A students under 18 or over 21
  σ_{GPA = 4.0 ∧ (age < 18 ∨ age > 21)} Student
But you must be able to evaluate the predicate over a single row of the input table.
Counterexample: student with the highest GPA
  σ_{GPA >= all GPA in Student table} Student
This is not a valid selection, because the predicate depends on every row of Student rather than only on the row being tested.

16 Projection
Input: a table R
Notation: π_L R, where L is a list of columns in R
Purpose: select columns to output
Output: same rows, but only the columns in L
- Order of the rows is preserved.
- The number of rows may be smaller (depending on whether there are duplicates).

17 Projection example
IDs and names of all students: π_{SID, name} Student

Student (input):
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6
1204  Susan Wong   22   3.4
1306  Kevin Kim    21   2.9

π_{SID, name} Student (output):
sid   name
1234  John Smith
1123  Mary Carter
1011  Bob Lee
1204  Susan Wong
1306  Kevin Kim

18 More on projection
Duplicate output rows are removed (by definition).
Example: student ages, π_{age} Student

Student (input):
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6
1204  Susan Wong   22   3.4
1306  Kevin Kim    21   2.9

π_{age} Student (output; the repeated 21 and 22 values are dropped):
age
21
22
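A minimal Python sketch of projection with duplicate elimination follows. The list-of-dicts representation and the helper name `project` are illustrative assumptions.

```python
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 22, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
    {"sid": 1204, "name": "Susan Wong", "age": 22, "gpa": 3.4},
    {"sid": 1306, "name": "Kevin Kim", "age": 21, "gpa": 2.9},
]


def project(relation, columns):
    """pi_L(R): keep only the listed columns, then drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:  # set semantics: each output row appears once
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


ages = project(student, ["age"])               # pi_{age}(Student)
ids_and_names = project(student, ["sid", "name"])  # pi_{SID, name}(Student)
```

Projecting on `age` yields two rows, while projecting on `sid, name` keeps all five because `sid` is unique.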

19 Cross product
Input: two tables R and S
Notation: R × S
Purpose: pairs rows from two tables
Output: for each row r in R and each row s in S, output a row rs (the concatenation of r and s)

20 Cross product example
Student × Enroll

Student:
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6

Enroll:
sid   cid  grade
1234  647  A
1123  108  A

Student × Enroll:
sid   name         age  gpa  sid   cid  grade
1234  John Smith   21   3.5  1234  647  A
1123  Mary Carter  22   3.8  1234  647  A
1011  Bob Lee      22   2.6  1234  647  A
1234  John Smith   21   3.5  1123  108  A
1123  Mary Carter  22   3.8  1123  108  A
1011  Bob Lee      22   2.6  1123  108  A
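The cross product can be sketched in Python as a nested loop over both inputs. The helper name `cross` and the choice to prefix every column with its table name (so the two `sid` columns stay distinct) are assumptions made for this illustration.

```python
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 22, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
]
enroll = [
    {"sid": 1234, "cid": 647, "grade": "A"},
    {"sid": 1123, "cid": 108, "grade": "A"},
]


def cross(r, r_name, s, s_name):
    """R x S: concatenate every row of R with every row of S."""
    return [
        {
            **{f"{r_name}.{col}": val for col, val in row_r.items()},
            **{f"{s_name}.{col}": val for col, val in row_s.items()},
        }
        for row_r in r
        for row_s in s
    ]


pairs = cross(student, "Student", enroll, "Enroll")
```

The result has 3 × 2 = 6 rows and 4 + 3 = 7 columns, matching the table above.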

21 A note on column ordering
The ordering of columns in a table is considered unimportant (as is the ordering of rows), so these two tables are equal:

sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6

sid   name         gpa  age
1234  John Smith   3.5  21
1123  Mary Carter  3.8  22
1011  Bob Lee      2.6  22

That means cross product is commutative, i.e., R × S = S × R for any R and S.

22 Derived operator: join
Input: two tables R and S
Notation: R ⋈_p S, where p is called a join condition/predicate
Purpose: relate rows from two tables according to some criteria
Output: for each row r in R and each row s in S, output a row rs if r and s satisfy p
Shorthand for σ_p (R × S)

23 Join example
Info about students, plus CIDs of their courses: Student ⋈_{Student.SID = Enroll.SID} Enroll
Use table_name.column_name syntax to disambiguate identically named columns from different input tables.

Student:
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6

Enroll:
sid   cid  grade
1234  647  A
1123  108  A

Of the six rows in Student × Enroll, only those satisfying Student.SID = Enroll.SID are kept:
sid   name         age  gpa  sid   cid  grade
1234  John Smith   21   3.5  1234  647  A
1123  Mary Carter  22   3.8  1123  108  A
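Because a join is shorthand for a selection over a cross product, it can be sketched in Python by composing the two. The helper names `cross`, `select`, and `join` and the table-name column prefixes are illustrative assumptions.

```python
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 22, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
]
enroll = [
    {"sid": 1234, "cid": 647, "grade": "A"},
    {"sid": 1123, "cid": 108, "grade": "A"},
]


def cross(r, r_name, s, s_name):
    """R x S, with each column prefixed by its table name."""
    return [
        {
            **{f"{r_name}.{col}": val for col, val in row_r.items()},
            **{f"{s_name}.{col}": val for col, val in row_s.items()},
        }
        for row_r in r
        for row_s in s
    ]


def select(relation, predicate):
    """sigma_p(R)."""
    return [row for row in relation if predicate(row)]


def join(r, r_name, s, s_name, predicate):
    """R join_p S, computed literally as sigma_p(R x S)."""
    return select(cross(r, r_name, s, s_name), predicate)


joined = join(
    student, "Student", enroll, "Enroll",
    lambda row: row["Student.sid"] == row["Enroll.sid"],
)
```

Materializing the full cross product first is the definition, not an efficient strategy; it is used here only because it mirrors the algebraic identity directly.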

24 Derived operator: natural join
Input: two tables R and S
Notation: R ⋈ S
Purpose: relate rows from two tables, and
- enforce equality on all common attributes;
- eliminate one copy of the common attributes.
Shorthand for π_L (R ⋈_p S), where
- p equates all attributes common to R and S;
- L is the union of all attributes from R and S, with duplicate attributes removed.

25 Natural join example
Student ⋈ Enroll
= π_L (Student ⋈_p Enroll)
= π_{SID, name, age, GPA, CID, grade} (Student ⋈_{Student.SID = Enroll.SID} Enroll)

Student:
sid   name         age  gpa
1234  John Smith   21   3.5
1123  Mary Carter  22   3.8
1011  Bob Lee      22   2.6

Enroll:
sid   cid  grade
1234  647  A
1123  108  A

Student ⋈ Enroll (matching rows only, with a single sid column):
sid   name         age  gpa  cid  grade
1234  John Smith   21   3.5  647  A
1123  Mary Carter  22   3.8  108  A
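A minimal Python sketch of natural join follows. It assumes relations are non-ragged lists of dicts, so the column names can be read from the first row; the helper name `natural_join` is hypothetical.

```python
student = [
    {"sid": 1234, "name": "John Smith", "age": 21, "gpa": 3.5},
    {"sid": 1123, "name": "Mary Carter", "age": 22, "gpa": 3.8},
    {"sid": 1011, "name": "Bob Lee", "age": 22, "gpa": 2.6},
]
enroll = [
    {"sid": 1234, "cid": 647, "grade": "A"},
    {"sid": 1123, "cid": 108, "grade": "A"},
]


def natural_join(r, s):
    """R natural-join S: equate all common columns and keep one copy of each."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attributes shared by both schemas
    return [
        {**row_r, **row_s}  # merging dicts keeps a single copy of common columns
        for row_r in r
        for row_s in s
        if all(row_r[col] == row_s[col] for col in common)
    ]


result = natural_join(student, enroll)
```

Here the only common attribute is `sid`, so the output has six columns: `sid`, `name`, `age`, `gpa`, `cid`, and `grade`.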

26 Union
Input: two tables R and S
Notation: R ∪ S
R and S must have identical schemas.
Output:
- Has the same schema as R and S.
- Contains all rows in R and all rows in S, with duplicate rows eliminated.

27 Difference
Input: two tables R and S
Notation: R - S
R and S must have identical schemas.
Output:
- Has the same schema as R and S.
- Contains all rows in R that are not found in S.

28 Derived operator: intersection
Input: two tables R and S
Notation: R ∩ S
R and S must have identical schemas.
Output:
- Has the same schema as R and S.
- Contains all rows that are in both R and S.
Shorthand for R - (R - S)
Also equivalent to S - (S - R)
And to R ⋈ S
12/22/2015, Jinze Liu @ University of Kentucky
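Union, difference, and intersection map directly onto Python's built-in set type when each row is a tuple. The two relations below are made-up (sid, cid) pairs used only to check the identities; they are not from the lecture.

```python
# Two hypothetical relations with the identical schema (sid, cid).
R = {(1234, 647), (1123, 108), (1011, 647)}
S = {(1123, 108), (1011, 647), (1306, 108)}

union = R | S          # R ∪ S: duplicates are eliminated automatically
difference = R - S     # R - S: rows of R not found in S
intersection = R & S   # R ∩ S: rows in both R and S

# Intersection is derived: it can be rewritten using difference alone.
via_r = R - (R - S)
via_s = S - (S - R)
```

Both rewritten forms give the same rows as the direct intersection, which is why intersection is classified as a derived operator.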

29 Renaming
Input: a table R
Notation: ρ_S R, ρ_{(A1, A2, …)} R, or ρ_{S(A1, A2, …)} R
Purpose: rename a table and/or its columns
Output: a renamed table with the same rows as R
Used to:
- avoid confusion caused by identical column names;
- create identical column names for natural joins.

30 Renaming example
ρ_{Enroll1(SID1, CID1, Grade1)} Enroll

Enroll:
sid   cid  grade
1234  647  A
1123  108  A

Enroll1:
sid1  cid1  grade1
1234  647   A
1123  108   A
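Renaming only changes column names, never the data, which a short Python sketch makes visible. The helper name `rename` and the mapping-based interface are assumptions for illustration.

```python
enroll = [
    {"sid": 1234, "cid": 647, "grade": "A"},
    {"sid": 1123, "cid": 108, "grade": "A"},
]


def rename(relation, mapping):
    """rho: rename columns according to mapping; unmapped columns keep their names."""
    return [
        {mapping.get(col, col): val for col, val in row.items()}
        for row in relation
    ]


# rho_{Enroll1(SID1, CID1, Grade1)}(Enroll)
enroll1 = rename(enroll, {"sid": "sid1", "cid": "cid1", "grade": "grade1"})
```

The table name itself (Enroll1) is represented here only by the Python variable name, since the sketch does not model table names.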

31 Review: Summary of core operators
Selection: σ_p R
Projection: π_L R
Cross product: R × S
Union: R ∪ S
Difference: R - S
Renaming: ρ_{S(A1, A2, …)} R (does not really add “processing” power)

32 Review: Summary of derived operators
Join: R ⋈_p S
Natural join: R ⋈ S
Intersection: R ∩ S
Many more: outer join, division, semijoin, anti-semijoin, …

33 Red parts
pid of red parts
Catalog having red parts

34 sid of suppliers who supply red parts
names of suppliers who supply red parts

