Fundamentals/ICY: Databases 2012/13 WEEK 11 (relational operators & relational algebra) John Barnden Professor of Artificial Intelligence School of Computer.


Fundamentals/ICY: Databases 2012/13 WEEK 11 (relational operators & relational algebra) John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK

Relational Operators from Chapters 3 (and 7 and 8) of Rob & Coronel (7th Ed) plus one handout and extra Maths observations

Relational Database Operators
- Relational algebra
  - Defines a theoretical way of manipulating tables using “relational operators” that mainly manipulate the relations in the tables.
  - Use of relational algebra operators on existing tables produces new tables.
- The operators: SELECT, PROJECT, JOIN (various sorts), INTERSECT, UNION, DIFFERENCE, PRODUCT, ((DIVIDE))

Relational Operators (continued)
- Select [a better name would be Select-Rows]
  - Yields a table S whose rows are some (or all) of the rows in the single given table T, preserving duplicates.
  - S’s rows could be all of T’s, but more usually are those that satisfy some specified criterion.
  - Yields a “horizontal subset” of T (“collection of horizontal slices” would be a better term).
  - Does not itself reduce the set of columns.

Select

Select [a better name would be Select-Rows]
- SQL:
  - SELECT * FROM … WHERE …
  - Note: it’s the WHERE part that actually does the selection according to a criterion.
- Relational algebra notation in handout:
  - Result table is $\sigma_C(T)$, where T is the given table and C is the selection criterion.
  - More compact than SQL notation. Avoids notation private to particular versions of particular programming languages.
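As a rough illustration of row selection, here is a minimal sketch using Python's built-in sqlite3 module. The table name `t`, its columns and its rows are invented for the example and do not come from the slides.

```python
import sqlite3

# Hypothetical table T, with one row deliberately duplicated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 5), ("B", 12), ("B", 12), ("C", 30)],
)

# sigma_{price > 10}(T): the WHERE clause does the row selection.
selected = conn.execute(
    "SELECT * FROM t WHERE price > 10 ORDER BY code"
).fetchall()

# Every column survives, and the duplicated row is preserved.
print(selected)
```

The selection keeps both copies of the duplicated row, in line with the "preserving duplicates" behaviour of the table-level Select.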

Relational Operators (continued)
- Project [a better name would be Select-Columns]
  - Yields a table S whose columns are a specified subset of the columns of the single given table T, and whose rows contain the corresponding values from all of T’s rows.
  - The operation may create duplication even when none is present in the original table.
  - Yields a “vertical subset” of T [better: “set of vertical slices”].

Project

Project [a better name would be Select-Columns]
- SQL:
  - SELECT …column specs … FROM …
- Relational algebra notation in handout:
  - Result table is $\pi_X(T)$, where T is the given table and X is the list of selected attributes (columns).
  - But this always removes row duplications from the result, and so does not exactly correspond to the full DB notion of projection.
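The difference between SQL projection and the set-based $\pi$ can be seen in a small sqlite3 sketch. The table and its rows are hypothetical; using DISTINCT to mimic the algebra's duplicate removal is one possible way to line the two up, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code TEXT, loc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 5), ("A", 7), ("B", 5)],  # T itself has no duplicate rows
)

# SQL projection onto code: duplicates appear although T had none.
sql_projection = conn.execute(
    "SELECT code FROM t ORDER BY code"
).fetchall()

# DISTINCT removes them, which matches the set-based pi_{code}(T).
algebra_projection = conn.execute(
    "SELECT DISTINCT code FROM t ORDER BY code"
).fetchall()

print(sql_projection)
print(algebra_projection)
```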

Relational Operators (continued)
- Union and its All version
- Intersect and its All version
- Difference and its All version
- The given tables must have compatible value domains.

Union

Intersect

Difference

Union, Intersection and Difference
- SQL:
  - UNION, INTERSECT, EXCEPT (or MINUS)
  - UNION ALL, INTERSECT ALL, EXCEPT (or MINUS) ALL
- Relational algebra notation in handout:
  - Result tables are $T_1 \cup T_2$, $T_1 \cap T_2$ and $T_1 - T_2$, where T1 and T2 are the given tables.
- Maths of relations:
  - Result relations are $R_1 \cup R_2$, $R_1 \cap R_2$ and $R_1 - R_2$ in the non-ALL cases, where R1 and R2 are the relations in the given tables.
  - Problem: relations don’t account for duplicates of rows, so they don’t handle the ALL versions.
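A minimal sqlite3 sketch of the SQL forms, using two invented single-column tables. SQLite is assumed here only because it ships with Python; it offers UNION, UNION ALL, INTERSECT and EXCEPT, so the ALL variants of INTERSECT and EXCEPT are left out of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (x INTEGER)")
conn.execute("CREATE TABLE t2 (x INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

def run(op):
    """Combine t1 and t2 with the given set operator and return sorted values."""
    sql = f"SELECT x FROM t1 {op} SELECT x FROM t2 ORDER BY x"
    return [row[0] for row in conn.execute(sql)]

union = run("UNION")          # duplicates removed
union_all = run("UNION ALL")  # duplicates kept
intersect = run("INTERSECT")
difference = run("EXCEPT")

print(union, union_all, intersect, difference)
```

Only UNION ALL keeps the repeated value 2, which is the behaviour that plain sets of tuples cannot express.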

Some “Relational Operations”: Set Operations Applied to Relations
- Union of relations R and S: $R \cup S$ = the set of tuples that are in R or S (or both). NB: no repetitions created!
- Intersection of relations R and S: $R \cap S$ = the set of tuples that are in both R and S.
- Difference of relations R and S: $R - S$ = the set of tuples that are in R but not S.

Relational Operations: contrast to SQL
- Those operations do NOT themselves require R and S to have similar tuples in order to be well-defined.
  - E.g., R could be binary and on integer sets, S could be ternary and on character-string sets.
- But the corresponding DB table operations (which are usually called “relational operators”) do require the tables to have the same shape (same number of columns, same domains for corresponding columns).
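The set-level view can be sketched with Python's built-in sets of tuples. All the relation contents below are made up; the last part illustrates that the set operations remain well-defined even when the two relations have differently shaped tuples.

```python
# Relations modelled as plain sets of tuples (invented contents).
R = {(1, 2), (3, 4)}
S = {(3, 4), (5, 6)}

union = R | S         # tuples in R or S (or both); (3, 4) appears once
intersection = R & S  # tuples in both R and S
difference = R - S    # tuples in R but not in S

# Still well-defined when the tuples differ in shape:
binary_ints = {(1, 2)}
ternary_strings = {("a", "b", "c")}
mixed = binary_ints | ternary_strings

print(union, intersection, difference, mixed)
```

A DB table operation would reject the last union, because the two "tables" do not have the same number of columns or the same domains.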

Relational Operators (continued)
- Join (various types)
  - Allows us to join related rows from two or more tables.
  - It’s an important feature of the relational database idea.
  - Joining has been implicitly important in some of the module handouts, because of the use of WHERE to test for attribute equality between tables.

Relational Operators (continued)
- Product or Cross Join
  - Yields a table containing all concatenations of whole rows from the first given table with whole rows from the second given table.

Product If second table also had a PRICE attribute, then the product would have a Table1.PRICE attr. and a Table2.PRICE attr.

Product or Cross Join (continued)
- SQL:
  - SELECT * FROM …two [or more] tables …
    NB: it’s the mere listing of the tables that does the Product, but it’s also possible to write: SELECT * FROM T1 CROSS JOIN T2 CROSS JOIN ...
- Relational algebra notation:
  - Result table is $T_1 \times T_2$, where T1 and T2 are the given tables.
- Maths of relations:
  - Result relation is $R_1 \underline{\times} R_2$ (the flattened Cartesian product), where R1 and R2 are the relations in the given tables.
  - Problem: relations don’t account for duplicates of rows.
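A small sqlite3 sketch of the two SQL spellings of Product, with invented tables `t1` and `t2`. It is only meant to show that listing the tables and writing CROSS JOIN give the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a TEXT)")
conn.execute("CREATE TABLE t2 (b INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [("x",), ("y",)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(1,), (2,), (3,)])

# Merely listing the tables performs the Product ...
implicit = conn.execute(
    "SELECT * FROM t1, t2 ORDER BY a, b"
).fetchall()

# ... and CROSS JOIN spells the same operation out.
explicit = conn.execute(
    "SELECT * FROM t1 CROSS JOIN t2 ORDER BY a, b"
).fetchall()

print(implicit)
```

With 2 rows in one table and 3 in the other, each result has 2 × 3 = 6 rows, every one a whole row of `t1` followed by a whole row of `t2`.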

So, I want …
- … to define the non-standard notion of the “flattened Cartesian product” of two relations R and S, notated by the symbol $\underline{\times}$ (an underlined multiplication symbol).
  $R \underline{\times} S$ = the set of tuples that are the concatenations of members of R with members of S.
  E.g., if a tuple r is in R and a tuple s is in S, then the single tuple formed by concatenating r and s is in $R \underline{\times} S$.

Contd.
- If A is the People relation and B is the Organizations relation, and
  A has members of the form <E156, ‘Sam’, ‘Finks’, I678> and
  B has members of the form <I459, ‘Dell’, ‘UK’>
  THEN $A \times B$ has members of the form < <E156, ‘Sam’, ‘Finks’, I678>, <I459, ‘Dell’, ‘UK’> >
  BUT $A \underline{\times} B$ has members of the form <E156, ‘Sam’, ‘Finks’, I678, I459, ‘Dell’, ‘UK’>
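The contrast between the ordinary and the flattened product can be sketched in Python with the People and Organizations tuples from the slide. Modelling relations as sets of tuples and writing the identifiers as strings are assumptions made for the illustration.

```python
from itertools import product

# One member of each relation, taken from the slide's example.
A = {("E156", "Sam", "Finks", "I678")}  # People
B = {("I459", "Dell", "UK")}            # Organizations

# Ordinary Cartesian product: each member is a PAIR of tuples.
ordinary = {(a, b) for a, b in product(A, B)}

# Flattened Cartesian product: each member is ONE concatenated tuple.
flattened = {a + b for a, b in product(A, B)}

print(ordinary)
print(flattened)
```

The member of `ordinary` has length 2 (two nested tuples), whereas the member of `flattened` has length 7, which is the shape of a row in the DB Product table.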

Product or Cross Join (continued)
- CAUTION:
  - Note the use of the flattened Cartesian product $R_1 \underline{\times} R_2$. The ordinary Cartesian product would be incorrect, even though in DB lingo the word Product might suggest Cartesian product and people do sometimes say Cartesian product …
  - … and even though in relational algebra the ordinary multiplication symbol is used, which again makes it look like a Cartesian product.

Two Tables That Will Be Used to Illustrate Other Joins

Natural Join (continued)
- SQL:
  - SELECT …all the attributes but including only one version of each shared one … FROM T1, T2 WHERE … explicit condition of equalities for ALL the shared attributes ...
  - SELECT * FROM T1 NATURAL JOIN T2;
    Instead of using *, you can choose columns, and you can add a WHERE.
- Relational algebra notation:
  - Result table is $T_1 \bowtie T_2$, where T1 and T2 are the given tables. $\bowtie$ is the “bow tie” symbol.

- Correspondence to your SQL experience:
  - SELECT staff.sid, office FROM staff, lecturing WHERE staff.sid = lecturing.sid;
    Does a natural join (because sid is the only shared attribute) followed by a projection onto sid, office.
  - SELECT staff.sid, office FROM staff, lecturing WHERE staff.sid = lecturing.sid AND year > 2001;
    In effect, does a natural join followed by a further (row) selection followed by a projection.
  - SELECT sid, office FROM staff NATURAL JOIN lecturing WHERE year > 2001;
    Does the same thing.
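A hedged sqlite3 sketch of this correspondence. The `staff` and `lecturing` schemas are cut down to the columns mentioned in the queries and the rows are invented. In the WHERE form the select list uses `staff.sid`, since many SQL systems reject an unqualified reference to a column that exists in both listed tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (sid TEXT, office TEXT)")
conn.execute("CREATE TABLE lecturing (sid TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("s1", "101"), ("s2", "202"), ("s3", "303")],
)
conn.executemany(
    "INSERT INTO lecturing VALUES (?, ?)",
    [("s1", 2000), ("s1", 2003), ("s2", 2005)],
)

# Product + WHERE equality on the shared attribute + extra row selection.
where_form = conn.execute(
    "SELECT staff.sid, office FROM staff, lecturing "
    "WHERE staff.sid = lecturing.sid AND year > 2001 "
    "ORDER BY staff.sid"
).fetchall()

# The same query written with NATURAL JOIN.
natural_form = conn.execute(
    "SELECT sid, office FROM staff NATURAL JOIN lecturing "
    "WHERE year > 2001 ORDER BY sid"
).fetchall()

print(where_form)
print(natural_form)
```

Staff member s3 has no lecturing row and s1's year-2000 row fails the extra condition, so both forms return the same two rows.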

Natural Join
- The common attributes (or columns) are called the join attributes (or columns): just the AGENT_CODE attribute in the example above.
- Can be thought of as the result of a three-stage process:
  - the PRODUCT of the tables is created
  - a SELECT is performed on the resulting table to yield only the rows for which the join-attribute values (e.g. AGENT_CODE values) are equal
  - a PROJECT is then performed to yield a single copy of each join attribute, thereby eliminating duplicate columns
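The three-stage view can be sketched directly in Python. This is an illustrative model only: tables are represented as a list of column names plus a list of row tuples, the helper `natural_join` is hypothetical, and apart from AGENT_CODE the column names and all the rows are invented.

```python
from itertools import product

# Hypothetical miniature CUSTOMER and AGENT tables: column names plus rows.
customer_cols = ["CUS_CODE", "AGENT_CODE"]
customer_rows = [(10010, 502), (10011, 501), (10012, 999)]
agent_cols = ["AGENT_CODE", "AGENT_PHONE"]
agent_rows = [(501, "555-0101"), (502, "555-0102")]

def natural_join(cols1, rows1, cols2, rows2):
    shared = [c for c in cols1 if c in cols2]
    n1 = len(cols1)

    # Stage 1: PRODUCT, each row of table 1 followed by each row of table 2.
    prod = [r1 + r2 for r1, r2 in product(rows1, rows2)]

    # Stage 2: SELECT the rows whose join-attribute values are equal.
    def matches(row):
        return all(
            row[cols1.index(c)] == row[n1 + cols2.index(c)] for c in shared
        )
    selected = [row for row in prod if matches(row)]

    # Stage 3: PROJECT away the second copy of each join attribute.
    keep = list(range(n1)) + [
        n1 + i for i, c in enumerate(cols2) if c not in shared
    ]
    out_cols = cols1 + [c for c in cols2 if c not in shared]
    out_rows = [tuple(row[i] for i in keep) for row in selected]
    return out_cols, out_rows

cols, rows = natural_join(customer_cols, customer_rows, agent_cols, agent_rows)
print(cols)
print(rows)
```

In this sketch the customer with agent code 999 drops out because no agent row matches it. When the two tables share no column names, stage 2 keeps every row and the function returns the full product.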

Natural Join, Step 1: PRODUCT Note the two AGENT_CODE columns

Natural Join, Step 2: SELECT

Natural Join, Step 3: PROJECT

Natural Join (continued)
- A row in one of the given tables that does not match any row in the other given table on the join attributes does not lead to a row in the result table.
- Note that if the two tables have no attributes in common, then every row of each table trivially matches every row of the other table! So in this case the result is the PRODUCT (CROSS JOIN) of the two tables!!

Natural Join (continued)
- If the same AGENT_CODE were to occur several times in the AGENT table, then
  - a customer would be listed for each match.

Following stuff on Joins is optional (but important in practice)

Other Forms of Join
- Equijoin
  - Links tables on the basis of an equality condition that compares SPECIFIED attributes of each table, rather than automatically taking the common attributes.
  - Result does not eliminate duplicate columns that are not involved in the join condition.
- Theta join
  - Like equijoin but using a non-equality join condition.
- Outer joins (left, right, and full)
  - Equijoin or theta join plus unmatched rows from the left table, the right table or both, padded out with NULLs to fit the result table.

Equijoin and Theta Join (continued)
- SQL:
  - SELECT * FROM T1, T2 WHERE … explicit join condition, stating (non)equality of the CHOSEN attributes ...
  - SELECT * FROM T1 JOIN T2 ON … such a condition …
  - SELECT * FROM T1 JOIN T2 USING (… some common attribs …) [for equijoin only]
- Relational algebra notation:
  - $T_1 \bowtie_C T_2$, where C is the join condition.
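The three SQL spellings, plus a theta join, can be tried out with sqlite3. The tables `t1` and `t2`, the column `k` and all rows are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (k INTEGER, a TEXT)")
conn.execute("CREATE TABLE t2 (k INTEGER, b TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "p"), (2, "q")])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, "x"), (3, "y")])

# Equijoin written three ways.
equi_where = conn.execute(
    "SELECT * FROM t1, t2 WHERE t1.k = t2.k"
).fetchall()
equi_on = conn.execute(
    "SELECT * FROM t1 JOIN t2 ON t1.k = t2.k"
).fetchall()
# With USING, SELECT * shows the named column k only once.
equi_using = conn.execute(
    "SELECT * FROM t1 JOIN t2 USING (k)"
).fetchall()

# Theta join: a non-equality join condition.
theta = conn.execute(
    "SELECT * FROM t1 JOIN t2 ON t1.k < t2.k ORDER BY t1.k, t2.k"
).fetchall()

print(equi_where, equi_on, equi_using, theta)
```

The WHERE and ON forms keep both `k` columns in the result, while the USING form merges them into one.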

Outer Join of CUSTOMER and AGENT, using equal AGENT_CODE
- Left outer
  - Uses all the rows in the CUSTOMER table, by doing an equijoin on AGENT_CODE but also including non-matching CUSTOMER rows.
- Right outer
  - Uses all the rows in the AGENT table, by doing an equijoin on AGENT_CODE but also including non-matching AGENT rows.
- Full outer
  - Uses all the rows in the AGENT and CUSTOMER tables, by doing an equijoin on AGENT_CODE but also including non-matching rows from each table.
  - Union of the Left Outer Join result and the Right Outer Join result.

Left Outer Join Same as an equijoin with the addition of the “extra”, last, row shown above

Right Outer Join: Full Outer Join: Would have the “extra” row of this table as well as the extra row of the Left Outer Join table

Outer Joins (continued)
- SQL:
  - SELECT * FROM T1, T2 WHERE … explicit join condition …
    UNION … a SELECT expression that gets the extra LEFT rows
    UNION … a SELECT expression that gets the extra RIGHT rows
  - SELECT * FROM T1 LEFT/RIGHT/FULL JOIN T2, followed by either USING (… some shared attribs …) or ON … explicit join cond …
- Relational algebra notation:
  - Variants of the bow tie symbol. See R,C&C sec (though their symbols need a subscript stating the join condition unless natural).
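A sketch of how a left outer join relates to the "equijoin plus extra rows" formulation, using sqlite3. The miniature `customer` and `agent` tables and their rows are invented. The simulation uses UNION ALL rather than UNION, which is a choice of this sketch so that any duplicate rows would be kept just as the built-in join keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus TEXT, agent_code INTEGER)")
conn.execute("CREATE TABLE agent (agent_code INTEGER, phone TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", [("c1", 501), ("c2", 999)]
)
conn.executemany(
    "INSERT INTO agent VALUES (?, ?)",
    [(501, "555-0101"), (502, "555-0102")],
)

# Built-in left outer join.
left_join = conn.execute(
    "SELECT c.cus, c.agent_code, a.agent_code, a.phone "
    "FROM customer c LEFT JOIN agent a ON c.agent_code = a.agent_code "
    "ORDER BY c.cus"
).fetchall()

# Simulation: the equijoin rows, plus the unmatched customer rows
# padded out with NULLs.
simulated = conn.execute(
    "SELECT c.cus, c.agent_code, a.agent_code, a.phone "
    "FROM customer c, agent a WHERE c.agent_code = a.agent_code "
    "UNION ALL "
    "SELECT c.cus, c.agent_code, NULL, NULL FROM customer c "
    "WHERE NOT EXISTS "
    "(SELECT 1 FROM agent a WHERE a.agent_code = c.agent_code) "
    "ORDER BY 1"
).fetchall()

print(left_join)
print(simulated)
```

Customer c2 has no matching agent, so it appears once with NULLs (shown as `None` in Python) in the agent columns. A right or full outer join could be simulated in the same style by adding the unmatched agent rows.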

Note on SQL Join Queries
- Can of course do your own extra projection (= attrib selection) in the SELECT, and can add a WHERE.
- E.g.:
  - SELECT …attribs … FROM T1 LEFT JOIN T2 USING (… some shared attribs …) WHERE … ;

Following stuff on DIVIDE is optional

Towards the DIVIDE operation
- It’s analogous to the “integer division” of an integer T by an integer S, included in many programming languages.
  T div S = the largest integer Q such that $S \times Q \le T$
  So 7 div 3 = 2.
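The definition can be checked with a few lines of Python. The helper name `int_div` is hypothetical, and the sketch assumes a non-negative T and a positive S, which is the case the analogy relies on.

```python
def int_div(t, s):
    """Largest integer q with s * q <= t, assuming t >= 0 and s > 0."""
    q = 0
    while s * (q + 1) <= t:
        q += 1
    return q

result = int_div(7, 3)
print(result)  # agrees with Python's built-in 7 // 3
```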

DIVIDE operation on DB tables
Simplest case: 2-col table by 1-col table
[Figure: tables T, S and Q]
The only value of LOC that is associated in T with both values ‘A’ and ‘B’ of CODE is 5.

Divide
- DIVIDE T by S: the attributes $X_1 \dots X_M$ of table S must be some but not all of those of T.
  - Gives a table Q having the remaining attributes $Y_1 \dots Y_N$ of T.
  - Q holds the values of $Y_1 \dots Y_N$ that T associates with every row $(X_1 \dots X_M)$ in S.
- So the rows of the Product of S with Q form a subset of the rows (suitably re-ordered) of T, and Q is maximal in this respect (i.e., adding further rows to Q would stop the Product’s rows all being in T).
  - So Q is the largest table such that $S \times Q \subseteq T$ (with rows suitably re-ordered), using $\subseteq$ to mean: consists of some or all of the rows of.

Divide (continued)
- SQL:
  - Not standardly included. Effect can be simulated.
- Relational algebra notation:
  - $T_2 \div T_1$
- Maths of relations:
  - Result relation R could be described as the maximal set R of tuples such that $R_1 \underline{\times} R \subseteq R_2$ (with tuples suitably re-ordered), where R1 and R2 are the relations in the given tables.
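One way to simulate DIVIDE for the simplest case (a 2-column table divided by a 1-column table) is sketched below with Python sets. The helper `divide` is hypothetical, and the rows of T are invented so that only LOC 5 is paired with both CODE values 'A' and 'B'.

```python
# Hypothetical contents: only LOC 5 is paired with both 'A' and 'B'.
T = {("A", 5), ("B", 5), ("A", 7), ("C", 5), ("B", 3)}  # columns (CODE, LOC)
S = {("A",), ("B",)}                                    # column (CODE)

def divide(t, s):
    """Return the LOC values that t associates with every CODE in s."""
    codes = {code for (code,) in s}
    locs = {loc for (_, loc) in t}
    return {
        (loc,) for loc in locs
        if all((code, loc) in t for code in codes)
    }

Q = divide(T, S)

# Check: every concatenation of an S row with a Q row is a row of T.
product_in_t = all(s_row + q_row in T for s_row in S for q_row in Q)

print(Q, product_in_t)
```

Adding any other LOC value to `Q` would break the final check, which is the sense in which the quotient is maximal.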