1 Relational Algebra and SQL
2 Relational Query Languages Languages for describing queries on a relational database Relational AlgebraRelational Algebra –Intermediate language used within DBMS –Procedural Structured Query LanguageStructured Query Language (SQL) –Predominant application-level query language –Declarative
3 What is an Algebra? A language based on operators and a domain of values Operators map values taken from the domain into other domain values Hence, an expression involving operators and arguments produces a value in the domain relational algebraWhen the domain is a set of all relations (and the operators are as described later), we get the relational algebra query queryresult We refer to the expression as a query and the value produced as the query result
4 Relational Algebra Domain: set of relations selectprojectunionset differenceCartesianproductBasic operators: select, project, union, set difference, Cartesian product set intersectiondivisionjoinDerived operators: set intersection, division, join Procedural: Relational expression specifies query by describing an algorithm (the sequence in which operators are applied) for determining the result of an expression
5 Relational Algebra in a DBMS parser SQL query Relational algebra expression Optimized Relational algebra expression Query optimizer Code generator Query execution plan Executable code DBMS
Relational Algebra-Operators Select Operator ( ) Project Operator ( ) Union operator ( ∪ ) Set Difference (−) Cartesian Product (*) 6
7 Select Operator ( ) Produce table containing subset of rows of argument table satisfying condition condition relation Example: Person Person Person Hobby=‘stamps’ ( Person ) 1123 John 123 Main stamps 1123 John 123 Main coins 5556 Mary 7 Lake Dr hiking 9876 Bart 5 Pine St stamps 1123 John 123 Main stamps 9876 Bart 5 Pine St stamps Id Name Address Hobby
Operator ( ) Selection returns a subset of the rows of a single table. Syntax: select where /* the must involve only columns from the indicated table */ alternatively σ (table_name) Find all suppliers from Boston. Select Supplier where Location = ‘Bos’ σ Location = ‘Bos’ (Supplier)
Select Operator ( ) Find the Cardholders from Modena. Observations: –There is only one input table. –Both Cardholder and the answer table have the same schema (list of columns) –Every row in the answer has the value ‘Modena’ in the b_addr column. select Cardholder where b_addr = ‘Modena’ alternatively σ b_addr = ‘Modena’ (Cardholder)
Select Operator ( ) σ subject = "database" (Books) Output − Selects tuples from books where subject is 'database'. σ subject = "database" and price = "450" (Books) Output − Selects tuples from books where subject is 'database' and 'price' is 450 σ subject = "database" and price = "450" or year > "2010" (Books) Output − Selects tuples from books where subject is 'database' and 'price' is 450 or those books published after
11 Selection Condition Operators:, =, Simple selection condition: – operator AND OR NOT
12 Selection Condition - Examples Person Id>3000 Or Hobby=‘hiking’ (Person) Person Id>3000 AND Id <3999 (Person) Person NOT(Hobby=‘hiking’) (Person) Person Hobby ‘hiking’ (Person)
13 Project Operator ( ) Produces table containing subset of columns of argument table attribute list (relation) Example: PersonPerson Person Name,Hobby (Person) 1123 John 123 Main stamps 1123 John 123 Main coins 5556 Mary 7 Lake Dr hiking 9876 Bart 5 Pine St stamps John stamps John coins Mary hiking Bart stamps Id Name Address Hobby Name Hobby
Project Operator ( ) Projection returns a subset of the columns of a single table. Syntax: project over /* the columns in must come from the indicated table */ alternatively π (table_name) Find all supplier names Project Supplier over Sname π Sname (Supplier)
15 Project Operator 1123 John 123 Main stamps 1123 John 123 Main coins 5556 Mary 7 Lake Dr hiking 9876 Bart 5 Pine St stamps John 123 Main Mary 7 Lake Dr Bart 5 Pine St Result is a table (no duplicates) Id Name Address Hobby Name Address Example: PersonPerson Person Name,Address (Person)
Project Operator ( ) ∏ subject, author (Books) Selects and projects columns named as subject and author from the relation Books. 16
17 Expressions 1123 John 123 Main stamps 1123 John 123 Main coins 5556 Mary 7 Lake Dr hiking 9876 Bart 5 Pine St stamps 1123 John 9876 Bart Id Name Address Hobby Id Name Person Result
18 Set Operators Relation is a set of tuples => set operations should apply Result of combining two relations with a set operator is a relation => all its elements must be tuples having same structure union compatible relationsHence, scope of set operations limited to union compatible relations
Union operator ( ∪ ) It performs binary union between two given relations r ∪ s = { t | t ∈ r or t ∈ s} For a union operation to be valid, the following conditions must hold − r and s must have the same number of attributes. Attribute domains must be compatible. Duplicate tuples are automatically eliminated. ∏ author (Books) ∪ ∏ author (Articles) Output − Projects the names of the authors who have either written a book or an article or both. 19
20 Union -Example Tables: Person Person (SSN, Name, Address, Hobby) Professor Professor (Id, Name, Office, Phone) are not union compatible. However
Union operator ( ∪ ) Part1Suppliers = project (select Supplies where Pno = ‘p1’) over Sno Part2Suppliers = project (select Supplies where Pno = ‘p2’) over Sno Part1Suppliers UNION Part2Suppliers alternatively Part1Suppliers = π Sno (σ Pno = ‘p1’ (Supplies) ) Part2Suppliers = π Sno (σ Pno = ‘p2’ (Supplies) ) Answer = Part1Suppliers Part2Suppliers ∩
UNION Exercise: Find the borrower numbers of all borrowers who have either borrowed or reserved a book (any book). Reservers = project Reserves over borrowerid Borrowers = project Borrows over borrowerid Answer = Borrowers union Reservers alternatively Reservers = π borrowerid (Reserves) Borrowers = π borrowerid (Borrows) Answer = Borrowers Reservers not duplicated ∩
INTERSECTION Treat two tables as sets and perform a set intersection Syntax: Observations: –This operation is impossible unless both tables involved have the same schemas. Why? –Because rows from both tables must fit into a single answer table; hence they must “look alike”. Table1 INTERSECTION Table2 alternatively Table1 Table2 ∩
INTERSECTION Example Part1Suppliers = project (select Supplies where Pno = ‘p1’) over Sno Part2Suppliers = project (select Supplies where Pno = ‘p2’) over Sno Part1Suppliers INTERSECT Part2Suppliers alternatively Part1Suppliers = π Sno (σ Pno = ‘p1’ (Supplies) ) Part2Suppliers = π Sno ( σPno = ‘p2’ (Supplies) ) Answer = Part1Suppliers Part2Suppliers ∩
INTERSECTION Exercise Find the borrower numbers of all borrowers who have borrowed and reserved a book. Reservers = project Reserves over borrowerid Borrowers = project Borrows over borrowerid Answer = Borrowers intersect Reservers alternatively Reservers = π borrowerid (Reserves) Borrowers = π borrowerid (Borrows) Answer = Borrowers Reservers ∩
Set Difference (−) The result of set difference operation is tuples, which are present in one relation but are not in the second relation Notation − r − s Finds all the tuples that are present in r but not in s. ∏ author (Books) − ∏ author (Articles) Output − Provides the name of authors who have written books but not articles. 26
SET DIFFERENCE Treat two tables as sets and perform a set intersection Syntax: Observations: –This operation is impossible unless both tables involved have the same schemas. Why? –Because it only makes sense to calculate the set difference if the two sets have elements in common. Table1 MINUS Table2 alternatively Table1 \ Table2
SET DIFFERENCE Example Part1Suppliers = project (select Supplies where Pno = ‘p1’) over Sno Part2Suppliers = project (select Supplies where Pno = ‘p2’) over Sno Part1Suppliers MINUS Part2Suppliers alternatively Part1Suppliers = π Sno (σ Pno = ‘p1’ (Supplies) ) Part2Suppliers = π Sno (σ Pno = ‘p2’ (Supplies) ) Answer = Part1Suppliers \ Part2Suppliers
SET DIFFERENCE Exercise Find the borrower numbers of all borrowers who have borrowed something and reserved nothing. Reservers = project Reserves over borrowerid Borrowers = project Borrows over borrowerid Answer = Borrowers minus Reservers alternatively Reservers = π borrowerid (Reserves) Borrowers = π borrowerid (Borrows) Answer = Borrowers \ Reservers
30 Cartesian Product If R and S are two relations, R S is the set of all concatenated tuples, where x is a tuple in R and y is a tuple in S –(R and S need not be union compatible) R S is expensive to compute Cartesian products are usually unrelated to a real-world thing -normally contain some noise tuples. a b c d a b c d x1 x2 y1 y2 x1 x2 y1 y2 x3 x4 y3 y4 x1 x2 y3 y4 x3 x4 y1 y2 R S x3 x4 y3 y4 2* 2 rows R S
Cartesian Product 5 rows 4 rows 20 rows info: 7 rows in total noise: 13 rows in total
Cartesian Product Names = Project Cardholder over b_name Addresses = Project Cardholder over b_addr Names x Addresses Names x Addresses Info = project cardholder over b_name, b_addr noise How many rows? 36
Cartesian Product Notation − r Χ s Where r and s are relations and their output will be defined as − r Χ s = { q t | q ∈ r and t ∈ s} σ author = 'tutorialspoint' (Books Χ Articles) Output − Yields a relation, which shows all the books and articles written by tutorials point. 33
34 Renaming Result of expression evaluation is a relation Attributes of relation must have distinct names. This is not guaranteed with Cartesian product –e.g., suppose in previous example a and c have the same name Renaming operator tidies this up. To assign the names A 1, A 2,… A n to the attributes of the n column relation produced by expression expr use expr [A 1, A 2, … A n ]
35 Renaming- Example This is a relation with 4 attributes: StudId, CrsCode1, ProfId, CrsCode2
36
JOIN: The most useful and most common operation. Tables are “related” by having columns in common; primary key on one table appears as a “foreign” key in another. Join uses this relatedness to combine the two tables into one. Join is usually needed when a database query involves knowing something found in one table but wanting to know something found in a different table. Join is useful because both Select and Project work on only one table at a time.
JOIN Example: Suppose we want to know the names of all parts ordered between Nov 4 and Nov 6. } What we know is here? The names we want are here related tables
Assignment Natural Join / self join Equijoin (Inner join) Outer join ( left, Right, Full) %29 39