INFO 340 Lecture 4 Relational Algebra and Calculus SQL Syntax
Relational Algebra and Calculus
Relational Algebra is procedural, whereas Relational Calculus is declarative. SQL is based on Relational Calculus. You tell the server WHAT you want, not HOW to get it.
Tuple Relational Calculus
{ T | P(T) }
–T is a tuple variable
–P(T) is a formula defining T
–Result is the set of all tuples T where P(T) is true
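The notation { T | P(T) } maps closely onto a set comprehension. The sketch below is only an illustration: the Employee rows are borrowed from the example tables used later in this lecture, and the predicate p (department 1) is an assumption chosen for the demo.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    last_name: str
    dept_id: int


# The relation: a set of tuples (rows borrowed from the lecture's Employee table).
employees = {
    Employee("Smith", 1),
    Employee("Johnson", 1),
    Employee("Miller", 2),
    Employee("Lee", 3),
}


def p(t: Employee) -> bool:
    """The formula P(T): true when tuple T belongs to department 1 (illustrative choice)."""
    return t.dept_id == 1


# { T | P(T) }: the set of all tuples T for which P(T) is true.
result = {t for t in employees if p(t)}
print(sorted(result))
```

Note that the comprehension states which tuples qualify and says nothing about how to find them, which is the declarative idea.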
Domain Relational Calculus First Order Predicate Logic –Looks at predicates on one side and individuals on the other.
Oh CRUD
Create - Using INSERT
Retrieve - Using SELECT
Update - Using UPDATE
Delete - Using DELETE
This is how we manipulate the data. Together these statements are called the Data Manipulation Language (DML).
Remember SELECT from Lab?
    select whatever-attributes-you-want
    from the-table-that-you-want
    where x-attribute = what-you-want;
For example:
    select last_name
    from myTable
    where first_name = 'Suzie';
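To try the SELECT pattern without a server, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The rows in myTable are hypothetical, and the "?" placeholder is sqlite3's parameter style, not something from the lab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("Suzie", "Quinn"), ("Joe", "Park"), ("Suzie", "Lopez")],  # hypothetical rows
)

# Literal value written directly in the query, as in the lab example.
literal_rows = conn.execute(
    "SELECT last_name FROM myTable WHERE first_name = 'Suzie' ORDER BY last_name"
).fetchall()

# The same query with a bound parameter instead of a literal.
bound_rows = conn.execute(
    "SELECT last_name FROM myTable WHERE first_name = ? ORDER BY last_name",
    ("Suzie",),
).fetchall()

print(literal_rows)
```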
SQL and Relational Calculus SQL is grounded in Relational Calculus. You tell the DBMS WHAT you want and it figures out how best to retrieve it. Well, that’s not entirely true.. SELECT does break Relational Theory but that’s for another day…
Two tables for today's examples

DEPARTMENT TABLE
ID  Name
1   HR
2   Sales
3   Engineering
4   Marketing

EMPLOYEE TABLE
LastName  DeptID
Smith     1
Johnson   1
Miller    2
Lee       3
Cross Join
Cross joins are the Cartesian product of two tables: every row of one table is paired with every row of the other. There are two ways to express cross joins.

Implicit (comma) form:
    SELECT * FROM Employee E, Department D
Explicit form:
    SELECT * FROM Employee E CROSS JOIN Department D

Result for today's DEPARTMENT and EMPLOYEE tables (4 x 4 = 16 rows):

E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Smith       1         2     Sales
Smith       1         3     Engineering
Smith       1         4     Marketing
Johnson     1         1     HR
Johnson     1         2     Sales
Johnson     1         3     Engineering
Johnson     1         4     Marketing
Miller      2         1     HR
Miller      2         2     Sales
Miller      2         3     Engineering
Miller      2         4     Marketing
Lee         3         1     HR
Lee         3         2     Sales
Lee         3         3     Engineering
Lee         3         4     Marketing
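A quick way to check that both cross join forms give the same Cartesian product is to run them side by side. This sketch assumes Python's built-in sqlite3 module as a stand-in for MySQL and recreates the two example tables in memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Implicit form: comma-separated tables with no WHERE clause.
implicit = conn.execute("SELECT * FROM Employee E, Department D").fetchall()

# Explicit form: CROSS JOIN keyword.
explicit = conn.execute("SELECT * FROM Employee E CROSS JOIN Department D").fetchall()

print(len(implicit), len(explicit))
```

Each of the 4 employees is paired with each of the 4 departments, whether or not the department IDs agree.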
Inner Join
Inner joins are the most common type of join performed in SQL. There are two ways to express an inner join. Conceptually, an inner join takes the Cartesian product of the two tables, then returns only the rows that satisfy the join condition.

Implicit form (join condition in WHERE):
    SELECT * FROM Employee E, Department D
    WHERE D.ID=E.DeptID
Explicit form (JOIN ... ON):
    SELECT * FROM Employee E
    JOIN Department D ON D.ID=E.DeptID

Result for today's DEPARTMENT and EMPLOYEE tables:

E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Johnson     1         1     HR
Miller      2         2     Sales
Lee         3         3     Engineering
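The two inner join forms can be compared directly. A minimal sketch, again assuming Python's sqlite3 module rather than MySQL, with the example tables rebuilt in memory:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Join condition placed in the WHERE clause.
where_rows = sorted(conn.execute(
    "SELECT * FROM Employee E, Department D WHERE D.ID = E.DeptID"
))

# Join condition placed in the ON clause.
join_rows = sorted(conn.execute(
    "SELECT * FROM Employee E JOIN Department D ON D.ID = E.DeptID"
))

for row in join_rows:
    print(row)
```

Marketing never appears because no employee row has DeptID 4.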
Outer Join
Outer joins are used to return all rows from one table regardless of a match in the other table. If no match is found in the other table, NULL is returned for each of that table's columns.
Three types: LEFT, RIGHT, FULL

Show all the departments and the employees in them, if any:
    SELECT * FROM Department D
    LEFT JOIN Employee E ON D.ID=E.DeptID

Result for today's DEPARTMENT and EMPLOYEE tables:

D.ID  D.Name       E.LastName  E.DeptID
1     HR           Smith       1
1     HR           Johnson     1
2     Sales        Miller      2
3     Engineering  Lee         3
4     Marketing    NULL        NULL
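The LEFT JOIN above can be run as follows. This is a sketch assuming Python's sqlite3 module, where SQL NULL comes back as Python's None; the ORDER BY is added here only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Every department is kept; employee columns are NULL when nobody matches.
rows = conn.execute("""
    SELECT * FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    ORDER BY D.ID, E.LastName
""").fetchall()

# Departments with no employees are the rows whose employee columns are NULL.
empty_departments = [name for (_id, name, last_name, _dept) in rows if last_name is None]
print(empty_departments)
```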
Self Join
A self-join is a query in which a table is joined (compared) to itself. Self-joins are used to compare values in a column with other values in the same column in the same table. One practical use for self-joins: obtaining running counts and running totals in an SQL query.
To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value would be equal to itself.

Example: Which customers are located in the same state (column name is Region)?
    SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName
    ORDER BY c1.Region, c1.ContactName;

Exercise (another example): Which customers are located in the same city? (32 rows)
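The self-join pattern can be tried on a tiny table. The Customers rows below are made up for illustration (the lecture's real Customers table is not reproduced here), and the sketch assumes Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ContactName TEXT, City TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [  # hypothetical sample data
        ("Ann", "Seattle", "WA"),
        ("Bob", "Spokane", "WA"),
        ("Cy", "Portland", "OR"),
        ("Dee", "Boise", "ID"),
    ],
)

# The same table appears twice under aliases c1 and c2. The <> test removes
# the pairs in which a customer would be matched with themselves.
same_region = conn.execute("""
    SELECT DISTINCT c1.ContactName, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName
    ORDER BY c1.Region, c1.ContactName
""").fetchall()

print(same_region)
```

Only Ann and Bob share a region with another customer, so Cy and Dee drop out.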
Aggregate Functions
While returning rows is nice, you often want to return a value computed from a set of rows.
–Count
–Sum
–Min
–Max
–Avg
An example of Aggregates

STUDENT_GRADES TABLE
Name   Grade
Steve  2.5
John   3.5
Wendy  3.8
Niki   4.0
Kevin  1.4

    SELECT count(*), max(grade), min(grade), avg(grade), sum(grade)
    FROM student_grades

Count(*)  Max(grade)  Min(grade)  Avg(grade)  Sum(grade)
5         4.0         1.4         3.04        15.2
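The aggregate query can be checked against the five grades. A minimal sketch assuming Python's sqlite3 module; because the grades are stored as floating-point numbers, the average and sum are compared approximately and rounded for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_grades (name TEXT, grade REAL)")
conn.executemany(
    "INSERT INTO student_grades VALUES (?, ?)",
    [("Steve", 2.5), ("John", 3.5), ("Wendy", 3.8), ("Niki", 4.0), ("Kevin", 1.4)],
)

# One row comes back, with one value per aggregate function.
count, highest, lowest, average, total = conn.execute(
    "SELECT COUNT(*), MAX(grade), MIN(grade), AVG(grade), SUM(grade) FROM student_grades"
).fetchone()

print(count, highest, lowest, round(average, 2), round(total, 2))
```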
SELECT Statement - Grouping
Aggregate functions become especially useful once results are grouped. Going back to the example join tables, maybe you want a count of the number of employees in each department. The GROUP BY clause is added to the SELECT statement after the FROM clause (and after WHERE, if there is one).
SELECT Statement - Grouping All column names in SELECT list must appear in GROUP BY clause unless name is used only in an aggregate function. If WHERE is used with GROUP BY, WHERE is applied first, then groups are formed from remaining rows satisfying predicate. ISO considers two nulls to be equal for purposes of GROUP BY.
Group By Example
How many employees are in each department?
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID=E.DeptID
    GROUP BY D.Name

Result for today's DEPARTMENT and EMPLOYEE tables:

D.Name       COUNT(E.DeptID)
HR           2
Sales        1
Engineering  1
Marketing    0
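The grouping query counts E.DeptID, not rows, and that choice matters for the empty department. The sketch below (assuming Python's sqlite3 module) runs the query as written and then a COUNT(*) variant, added here only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# COUNT(column) skips NULLs, so a department with no employees gets 0.
counts = dict(conn.execute("""
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
"""))

# COUNT(*) counts rows, including the all-NULL row the LEFT JOIN produces.
counts_star = dict(conn.execute("""
    SELECT D.Name, COUNT(*)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
"""))

print(counts["Marketing"], counts_star["Marketing"])
```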
HAVING clause
But what if we want to return results based upon a GROUP BY? Enter the HAVING clause.
Let's only see the departments with people in them:
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID=E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0

Result for today's DEPARTMENT and EMPLOYEE tables:

D.Name       COUNT(E.DeptID)
HR           2
Sales        1
Engineering  1
ORDER BY clause
Finally, what if we want some order imposed on our results? Order by can contain any field or value specified in the selection criteria.
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID=E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
    ORDER BY D.Name

Result for today's DEPARTMENT and EMPLOYEE tables:

D.Name       COUNT(E.DeptID)
Engineering  1
HR           2
Sales        1
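GROUP BY, HAVING, and ORDER BY can be exercised together. A sketch assuming Python's sqlite3 module; the second query, which sorts by the count instead of the name, is an extra variation and not part of the lecture example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

base = """
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
"""

# HAVING filters whole groups (Marketing drops out); ORDER BY sorts what is left.
by_name = conn.execute(base + " ORDER BY D.Name").fetchall()

# Variation: largest departments first, ties broken by name.
by_size = conn.execute(base + " ORDER BY COUNT(E.DeptID) DESC, D.Name").fetchall()

print(by_name)
print(by_size)
```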
Set Theory Review
Intersection of 2 Sets

R = {1,2,3,4}
S = {4,5,6,7}
R ∩ S = {4}

R = {Joe, Suzie}
S = {Jane, Bob, Sam}
R ∩ S = Ø

R = {'big stereo', 'blue', 'safe'}
S = {'blue', 'fire trap', 'AM radio'}
R ∩ S = {'blue'}

[Venn diagram: Set A and Set B, with the overlap labeled The Intersection]
Set Theory Review
Union of 2 Sets

R = {1,2,3,4}
S = {4,5,6,7}
R ∪ S = {1,2,3,4,5,6,7}

R = {Joe, Suzie}
S = {Jane, Bob, Sam}
R ∪ S = {Joe, Suzie, Jane, Bob, Sam}

R = {'big stereo', 'blue', 'safe'}
S = {'AM radio', 'blue', 'fire trap'}
R ∪ S = {'big stereo', 'blue', 'safe', 'AM radio', 'fire trap'}

[Venn diagram: Set A and Set B]
Set Theory Review
Difference of 2 Sets

R = {1,2,3,4}
S = {4,5,6,7}
R \ S = {1,2,3}

R = {Joe, Suzie}
S = {Jane, Bob, Sam}
R \ S = {Joe, Suzie}

R = {'big stereo', 'blue', 'safe'}
S = {'AM radio', 'blue', 'fire trap'}
R \ S = {'big stereo', 'safe'}

[Venn diagram: Set A and Set B, with The Difference labeled]
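The three set operations from this review can be checked with Python's built-in set type, using the same R and S as the examples. The variable names are chosen here for illustration.

```python
R = {1, 2, 3, 4}
S = {4, 5, 6, 7}

intersection = R & S   # elements in both R and S
union = R | S          # elements in R, in S, or in both
difference = R - S     # elements in R that are not in S

# Sets with nothing in common have an empty intersection.
names_r = {"Joe", "Suzie"}
names_s = {"Jane", "Bob", "Sam"}
no_overlap = names_r & names_s

print(intersection, union, difference, no_overlap)
```

Difference is the only one of the three where order matters: S - R would give {5, 6, 7} instead.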
Union, Intersect, and Difference (Except)
Use between select clauses
–Keyword for union is union
–Keyword for intersection is intersect
–Keyword for difference is except
The queries must be union compatible: each must return the same number of columns, with compatible data types in corresponding positions.
Pattern:
    select ... from ...
    union
    select ... from ...
Example:
    (select Name from Staff)
    union
    (select Name from Faculty)
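Here is a small sketch of the three set keywords, assuming Python's sqlite3 module. The Staff and Faculty rows are hypothetical, and the parentheses around each select are left out because SQLite does not accept them in a compound query (other DBMSs may).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (Name TEXT);
    CREATE TABLE Faculty (Name TEXT);
    INSERT INTO Staff VALUES ('Joe'), ('Suzie'), ('Sam');
    INSERT INTO Faculty VALUES ('Jane'), ('Sam'), ('Suzie');
""")


def names(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# UNION removes duplicates: Sam and Suzie appear once each.
in_either = names("SELECT Name FROM Staff UNION SELECT Name FROM Faculty")

# INTERSECT keeps only names present in both tables.
in_both = names("SELECT Name FROM Staff INTERSECT SELECT Name FROM Faculty")

# EXCEPT keeps names in Staff that are not in Faculty.
staff_only = names("SELECT Name FROM Staff EXCEPT SELECT Name FROM Faculty")

print(in_either, in_both, staff_only)
```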
INSERT
    INSERT INTO TableName [ (columnList) ]
    VALUES (dataValueList)
columnList is optional; if omitted, SQL assumes a list of all columns in their original CREATE TABLE order.
Any columns omitted must have been declared as NULL when table was created, unless DEFAULT was specified when creating column.
© Pearson Education Limited 1995, 2005
INSERT
dataValueList must match columnList as follows:
–number of items in each list must be same;
–must be direct correspondence in position of items in two lists;
–data type of each item in dataValueList must be compatible with data type of corresponding column.
INSERT … VALUES
Insert a new row into Employee table supplying data for all columns.
–Let's finally put someone in the marketing department!

We are supplying the full row, so we can omit the column names:
    INSERT INTO Employee
    VALUES ('Brown', 4);
Or we can explicitly list the column names:
    INSERT INTO Employee (LastName, DeptID)
    VALUES ('Brown', 4);
Perhaps the DeptID field allows NULLs or has a default:
    INSERT INTO Employee (LastName)
    VALUES ('Brown');
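The three INSERT forms, plus one statement that breaks the matching rule, can be tried as follows. This sketch assumes Python's sqlite3 module and a DeptID column that allows NULL; the extra names Green, White, and Black are hypothetical, used so each insert adds a distinguishable row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT NOT NULL, DeptID INTEGER DEFAULT NULL);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# 1. No column list: values are taken in CREATE TABLE column order.
conn.execute("INSERT INTO Employee VALUES ('Brown', 4)")

# 2. Explicit column list.
conn.execute("INSERT INTO Employee (LastName, DeptID) VALUES ('Green', 4)")

# 3. Omitted column: DeptID falls back to its default, which is NULL here.
conn.execute("INSERT INTO Employee (LastName) VALUES ('White')")

# 4. One value for two columns and no column list: the DBMS rejects it.
try:
    conn.execute("INSERT INTO Employee VALUES ('Black')")
    mismatch_rejected = False
except sqlite3.Error:
    mismatch_rejected = True

marketing = [row[0] for row in conn.execute(
    "SELECT LastName FROM Employee WHERE DeptID = 4 ORDER BY LastName")]
white_dept = conn.execute(
    "SELECT DeptID FROM Employee WHERE LastName = 'White'").fetchone()[0]

print(marketing, white_dept, mismatch_rejected)
```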
UPDATE
    UPDATE TableName
    SET columnName1 = dataValue1 [, columnName2 = dataValue2...]
    [WHERE searchCondition]
TableName can be name of a base table or an updatable view.
SET clause specifies names of one or more columns that are to be updated.
WHERE clause is optional; if omitted, all rows are updated.
UPDATE example
Ms. Johnson gets married and wants to change her name to Anderson.
    UPDATE Employee
    SET LastName='Anderson'
    WHERE LastName='Johnson'
Better way to find Ms. Johnson (a last name alone may match more than one employee):
    UPDATE Employee
    SET LastName='Anderson'
    WHERE LastName='Johnson' AND DeptID=1
The Marketing department is being merged with Sales and as such all the employees in that department are being moved into Sales.
    UPDATE Employee
    SET DeptID=2
    WHERE DeptID=4
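To see why the narrower WHERE clause is the better way, the sketch below adds a second, hypothetical Johnson in Engineering and keeps Brown in Marketing from the INSERT example. It assumes Python's sqlite3 module, whose cursor reports the number of changed rows in rowcount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3),
        ('Brown', 4),     -- added in the INSERT example
        ('Johnson', 3);   -- hypothetical second Johnson
""")

# A WHERE on LastName alone would touch both Johnsons.
johnsons = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE LastName = 'Johnson'").fetchone()[0]

# Adding DeptID narrows the update to the intended row.
renamed = conn.execute(
    "UPDATE Employee SET LastName = 'Anderson' "
    "WHERE LastName = 'Johnson' AND DeptID = 1").rowcount

# Merge Marketing (4) into Sales (2).
moved = conn.execute("UPDATE Employee SET DeptID = 2 WHERE DeptID = 4").rowcount

final_rows = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY LastName, DeptID").fetchall()
print(johnsons, renamed, moved)
```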
DELETE
    DELETE FROM TableName
    [WHERE searchCondition]
TableName can be name of a base table or an updatable view.
searchCondition is optional; if omitted, all rows are deleted from table. This does not delete the table itself.
If searchCondition is specified, only those rows that satisfy the condition are deleted.
DELETE example
Mr. Smith decides to take another job and quits:
    DELETE FROM Employee
    WHERE LastName='Smith' AND DeptID=1
Remember the Marketing department? Well, rather than merge with Sales we are going to eliminate it and all the employees in that department.
    DELETE FROM Employee
    WHERE DeptID=4
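Both deletes, and the point that DELETE without WHERE empties a table without removing it, can be sketched like this. It assumes Python's sqlite3 module and keeps Brown in Marketing from the INSERT example so the second delete has a row to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3), ('Brown', 4);
""")


def row_count():
    """Number of rows currently in Employee."""
    return conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]


# Mr. Smith quits: only the matching row is removed.
smith_deleted = conn.execute(
    "DELETE FROM Employee WHERE LastName = 'Smith' AND DeptID = 1").rowcount

# Marketing is eliminated along with everyone in it.
marketing_deleted = conn.execute(
    "DELETE FROM Employee WHERE DeptID = 4").rowcount

remaining_before = row_count()

# No WHERE clause: every row goes, but the table still exists and can be queried.
conn.execute("DELETE FROM Employee")
remaining_after = row_count()

print(smith_deleted, marketing_deleted, remaining_before, remaining_after)
```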
Variants & Like
There is a rich set of functions that can be used in SQL. Of course, most of them depend heavily on the SQL dialect (DBMS) you are using.
LIKE allows searching a text field for a pattern.
    SELECT * FROM students
    WHERE name LIKE 'R%'
–% is a wildcard that matches any sequence of zero or more characters, whereas _ matches exactly one character
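The difference between the two wildcards is easiest to see on a few names. A sketch assuming Python's sqlite3 module and a hypothetical students table; case sensitivity of LIKE varies by DBMS and collation, so the sample names are all capitalized the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?)",
    [("Rachel",), ("Ron",), ("Bob",), ("Rex",)],  # hypothetical names
)


def matching(pattern):
    """Return the names matching a LIKE pattern, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM students WHERE name LIKE ? ORDER BY name", (pattern,))
    return [row[0] for row in rows]


starts_with_r = matching("R%")    # R followed by anything, of any length
r_plus_two = matching("R__")      # R followed by exactly two characters
o_in_middle = matching("_o_")     # three characters with o in the middle

print(starts_with_r, r_plus_two, o_in_middle)
```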
CASE statements
    SELECT CASE Sex
             WHEN 'M' THEN 'Male'
             WHEN 'F' THEN 'Female'
           END
    FROM Students
Inside a SELECT, a CASE expression is closed with END (not END CASE).
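A CASE expression can be tried on a small table. This sketch assumes Python's sqlite3 module; the Students rows and the SexLabel alias are made up for illustration. With no ELSE branch, a value that matches no WHEN yields NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Kevin", "M"), ("Niki", "F"), ("Pat", None)],  # hypothetical rows
)

# CASE maps each stored code to a readable label, row by row.
labels = conn.execute("""
    SELECT Name,
           CASE Sex
             WHEN 'M' THEN 'Male'
             WHEN 'F' THEN 'Female'
           END AS SexLabel
    FROM Students
    ORDER BY Name
""").fetchall()

print(labels)
```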
Mini-Project
Due Feb 4, 2009
–Build on your iSchool MySQL account
Choose between the following:
–UW OnTech Archive
–UW Privacy Policy Set
Mini-Project
–UW OnTech Archive
–UW Privacy Policy
UW OnTech
–archive contents
–contents by issue
UW OnTech
–issue contents
–article example
UW OnTech
Sample questions:
–How many contents-by-issue pages list topics that are not the same as the topics on the corresponding issue contents pages? Or are missing entirely?
–How many pages list an 'exposed address' on their readable page?
–How many pages have an address that is visible in the page source?
More sample questions:
–What is the average number of clickable links per article in the archive?
–What are the min & max numbers of clickable links in the archive? Which articles were they?
–More to come
UW Privacy & Security Policies
Sample questions:
–Which policies have the greatest distance between Effective Date & Review Date? Which ones are they?
–How many policies have the same Effective Date & Review Date? Which ones are they?
–Which policies have more than 5 attachments?
UW Privacy & Security Policies
More sample questions:
–Which policies have greater than 5 references? Which policies are they?
–What is the most often cited reference in the reference section?
–Do any of the policies in these two sets reference each other?