INFO 340 Lecture 5: Relational Algebra and Calculus, SQL Syntax
Topics: Cartesian product examples, SQL examples. Change to website.
Relational Algebra and Calculus
Relational algebra is procedural, whereas relational calculus is declarative. SQL is based on relational calculus: you tell the server WHAT you want, not HOW to get it.
Two tables for today’s examples

EMPLOYEE TABLE
LastName   DeptID
Smith      1
Johnson    1
Miller     2
Lee        3

DEPARTMENT TABLE
ID   Name
1    HR
2    Sales
3    Engineering
4    Marketing
Cross Join
Cross joins are the Cartesian product of two tables: every row of the first table is paired with every row of the second. There are two ways to express a cross join:

    SELECT * FROM Employee E, Department D
    SELECT * FROM Employee E CROSS JOIN Department D

Result (4 employees x 4 departments = 16 rows; the four department rows shown for Smith repeat for Johnson, Miller, and Lee):

E.LastName   E.DeptID   D.ID   D.Name
Smith        1          1      HR
Smith        1          2      Sales
Smith        1          3      Engineering
Smith        1          4      Marketing

Reference: http://en.wikipedia.org/wiki/Join_(SQL)
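The two cross join spellings can be checked against each other. The sketch below is an illustration only: it uses Python's built-in sqlite3 module rather than the MySQL server used in class, and it assumes Johnson's DeptID is 1, as the later UPDATE example implies.

```python
import sqlite3

# In-memory database holding the lecture's two example tables.
# Assumption: Johnson belongs to department 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

# Comma syntax and explicit CROSS JOIN syntax.
implicit = conn.execute("SELECT * FROM Employee E, Department D").fetchall()
explicit = conn.execute("SELECT * FROM Employee E CROSS JOIN Department D").fetchall()

# Every employee is paired with every department: 4 x 4 rows.
print(len(implicit), len(explicit))
# Row order is not guaranteed, so compare the sorted results.
print(sorted(implicit) == sorted(explicit))
```

Both queries should return the same 16 rows; only the spelling differs.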
Inner Join
Inner joins are the most common type of join performed in SQL. Conceptually, an inner join takes the Cartesian product of the two tables and returns only the rows that satisfy the join condition. There are two ways to express an inner join:

    SELECT * FROM Employee E, Department D WHERE D.ID = E.DeptID
    SELECT * FROM Employee E JOIN Department D ON D.ID = E.DeptID

Result:

E.LastName   E.DeptID   D.ID   D.Name
Smith        1          1      HR
Johnson      1          1      HR
Miller       2          2      Sales
Lee          3          3      Engineering

Sub-types of inner joins: natural join, equi-join.
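The "Cartesian product, then filter" description of an inner join can be tested directly. This is a minimal sketch using Python's sqlite3 module (an assumption for illustration; the course uses MySQL), with Johnson assumed to be in department 1. It filters the cross join in Python and compares the result with the JOIN ... ON form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

columns = "E.LastName, E.DeptID, D.ID, D.Name"

# Step 1: the full Cartesian product (16 rows).
product = conn.execute(
    f"SELECT {columns} FROM Employee E CROSS JOIN Department D"
).fetchall()

# Step 2: keep only rows where E.DeptID (index 1) equals D.ID (index 2).
filtered = [row for row in product if row[1] == row[2]]

# The same result expressed as an inner join.
joined = conn.execute(
    f"SELECT {columns} FROM Employee E JOIN Department D ON D.ID = E.DeptID"
).fetchall()

print(sorted(filtered) == sorted(joined))
```

A real database engine is free to compute the join without materializing the full product; the product-then-filter view is only the conceptual definition.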
Outer Join
Outer joins return all rows from one table regardless of whether a match exists in the other table. Where no match is found, the columns of the other table are returned as NULL. Three types: LEFT, RIGHT, FULL.

Show all the departments and the employees in them, if any:

    SELECT * FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID

Result:

D.ID   D.Name        E.LastName   E.DeptID
1      HR            Smith        1
1      HR            Johnson      1
2      Sales         Miller       2
3      Engineering   Lee          3
4      Marketing     NULL         NULL

Note: FULL OUTER JOIN is not supported in MySQL, Oracle 8i and earlier, etc.
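The LEFT JOIN above can be tried out as follows. This is a sketch under stated assumptions: it runs on Python's sqlite3 module instead of MySQL, Johnson is assumed to be in department 1, and an ORDER BY is added only to make the output order predictable. SQL NULL arrives in Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

# Every department is kept; departments without employees get NULLs.
rows = conn.execute("""
    SELECT D.ID, D.Name, E.LastName, E.DeptID
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    ORDER BY D.ID, E.LastName
""").fetchall()

for row in rows:
    print(row)

# Departments with no matching employee row.
unstaffed = [name for (_, name, last_name, _) in rows if last_name is None]
print(unstaffed)
```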
Self Join
A self-join is a query in which a table is joined (compared) to itself. Self-joins are used to compare values in a column with other values in the same column of the same table. One practical use for self-joins: obtaining running counts and running totals in an SQL query.

To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value would be equal to itself.

Example: Which customers are located in the same state (column name is Region)?

    SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName
    ORDER BY c1.Region, c1.ContactName;

Exercise: Which customers are located in the same city? (32 rows)

Reference: http://www.udel.edu/evelyn/SQL-Class3/SQL3_self.html
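To see the self-join at work on a small scale, the sketch below runs the same query against a three-row Customers table. The rows are made up for illustration (they are not the class database), and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two customers in WA, one in OR.
conn.executescript("""
    CREATE TABLE Customers (ContactName TEXT, Address TEXT, City TEXT, Region TEXT);
    INSERT INTO Customers VALUES
        ('Ann', '1 Main St', 'Seattle', 'WA'),
        ('Bob', '2 Pine St', 'Tacoma', 'WA'),
        ('Cy', '3 Oak St', 'Portland', 'OR');
""")

# The table appears twice under aliases c1 and c2; the <> condition
# stops a customer from being matched with itself.
same_region = conn.execute("""
    SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName
    ORDER BY c1.Region, c1.ContactName
""").fetchall()

for row in same_region:
    print(row)
```

Cy is the only customer in OR, so no second row satisfies both conditions and Cy is left out of the result.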
Aggregate Functions
Returning rows is useful, but often you want a value computed over a set of rows. The aggregate functions are COUNT, SUM, MIN, MAX, and AVG.
An example of Aggregates

student_grades
Name    Grade
Steve   2.5
John    3.5
Wendy   3.8
Niki    4.0
Kevin   1.4

    SELECT count(*), max(grade), min(grade), avg(grade), sum(grade) FROM student_grades

Count(*)   Max(grade)   Min(grade)   Avg(grade)   Sum(grade)
5          4.0          1.4          3.04         15.2
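The aggregate query can be reproduced with the grades from the slide. This is a sketch using Python's sqlite3 module (an assumption; the class uses MySQL). Because the grades are stored as floating-point values, the average and sum are rounded before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_grades (name TEXT, grade REAL);
    INSERT INTO student_grades VALUES
        ('Steve', 2.5), ('John', 3.5), ('Wendy', 3.8), ('Niki', 4.0), ('Kevin', 1.4);
""")

# One row comes back, with one value per aggregate function.
count, highest, lowest, mean, total = conn.execute(
    "SELECT count(*), max(grade), min(grade), avg(grade), sum(grade) "
    "FROM student_grades"
).fetchone()

print(count, highest, lowest, round(mean, 2), round(total, 2))
```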
SELECT Statement - Grouping
Aggregate functions become especially useful when results are grouped. Returning to the example join tables, you might want a count of the employees in each department. The GROUP BY clause is added after the FROM and WHERE clauses of the SELECT statement, before any HAVING or ORDER BY clause.
SELECT Statement - Grouping
Every column name in the SELECT list must appear in the GROUP BY clause unless the name is used only in an aggregate function.
If WHERE is used with GROUP BY, WHERE is applied first; groups are then formed from the remaining rows that satisfy the predicate.
ISO considers two nulls to be equal for purposes of GROUP BY.
It is important to note that GROUP BY is applied after the joins and the WHERE clause; HAVING and ORDER BY, when present, are applied after the groups are formed.
Group By Example
How many employees are in each department?

    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name

Result:

D.Name        COUNT(E.DeptID)
HR            2
Sales         1
Engineering   1
Marketing     0
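One detail of the query above is easy to miss: COUNT(E.DeptID) counts only non-NULL values, which is why the empty Marketing department comes out as 0. The sketch below contrasts it with COUNT(*), which counts rows. It assumes Python's sqlite3 module in place of MySQL and Johnson in department 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

join = "FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID GROUP BY D.Name"

# COUNT(column) skips the NULL produced by the outer join for Marketing.
by_column = dict(conn.execute(f"SELECT D.Name, COUNT(E.DeptID) {join}").fetchall())

# COUNT(*) counts the NULL-padded Marketing row as one row.
by_star = dict(conn.execute(f"SELECT D.Name, COUNT(*) {join}").fetchall())

print(by_column)
print(by_star)
```

The results are collected into dictionaries because, without an ORDER BY, the order of the groups is not guaranteed.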
HAVING clause
But what if we want to filter results based upon a GROUP BY? Enter the HAVING clause. Let’s only see the departments with people in them:

    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0

Result:

D.Name        COUNT(E.DeptID)
HR            2
Sales         1
Engineering   1

The HAVING clause is applied after the groups are formed.
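HAVING can be read as a filter over finished groups. The sketch below (Python's sqlite3 module, assumed here instead of MySQL; Johnson assumed in department 1) runs the grouped query with and without HAVING and checks that filtering the grouped counts by hand gives the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

grouped_sql = """
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
"""

# All groups, including the empty Marketing department.
all_groups = dict(conn.execute(grouped_sql).fetchall())

# Same query with HAVING: only groups whose count is above zero survive.
with_people = dict(
    conn.execute(grouped_sql + " HAVING COUNT(E.DeptID) > 0").fetchall()
)

# Filtering the grouped result by hand mirrors what HAVING does.
filtered_by_hand = {name: n for name, n in all_groups.items() if n > 0}

print(with_people)
print(with_people == filtered_by_hand)
```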
ORDER BY clause
Finally, what if we want some order imposed on our results? ORDER BY can contain any field or value specified in the selection criteria.

    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
    ORDER BY D.Name

Result:

D.Name        COUNT(E.DeptID)
Engineering   1
HR            2
Sales         1

Discuss: DESC, ASC, multiple ORDER BY columns, etc.
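As a starting point for the DESC/ASC discussion, the sketch below orders the same grouped result two ways: by name, and by head count descending with the name as a tie-breaker. It assumes Python's sqlite3 module rather than MySQL, and Johnson in department 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    INSERT INTO Employee VALUES ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
    INSERT INTO Department VALUES (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
""")

base = """
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
"""

# Ascending by department name (ASC is the default direction).
by_name = conn.execute(base + " ORDER BY D.Name").fetchall()

# Largest departments first; ties are broken alphabetically by name.
by_size = conn.execute(
    base + " ORDER BY COUNT(E.DeptID) DESC, D.Name ASC"
).fetchall()

print(by_name)
print(by_size)
```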
Set Theory Review: Intersection of 2 Sets
[Venn diagram: Set A, Set B, the intersection]

R = {1,2,3,4}, S = {4,5,6,7}: R ∩ S = {4}
R = {Joe, Suzie}, S = {Jane, Bob, Sam}: R ∩ S = Ø
R = {‘big stereo’, ‘blue’, ‘safe’}, S = {‘blue’, ‘fire trap’, ‘AM radio’}: R ∩ S = {‘blue’}
Set Theory Review: Union of 2 Sets
[Venn diagram: Set A, Set B, the union]

R = {1,2,3,4}, S = {4,5,6,7}: R ∪ S = {1,2,3,4,5,6,7}
R = {Joe, Suzie}, S = {Jane, Bob, Sam}: R ∪ S = {Joe, Suzie, Jane, Bob, Sam}
R = {‘big stereo’, ‘blue’, ‘safe’}, S = {‘AM radio’, ‘blue’, ‘fire trap’}: R ∪ S = {‘big stereo’, ‘blue’, ‘safe’, ‘AM radio’, ‘fire trap’}
Set Theory Review: Difference of 2 Sets
[Venn diagram: Set A, Set B, the difference]

R = {1,2,3,4}, S = {4,5,6,7}: R \ S = {1,2,3}
R = {Joe, Suzie}, S = {Jane, Bob, Sam}: R \ S = {Joe, Suzie}
R = {‘big stereo’, ‘blue’, ‘safe’}, S = {‘AM radio’, ‘blue’, ‘fire trap’}: R \ S = {‘big stereo’, ‘safe’}
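The three set operations from the review can be checked with Python's built-in set type, using the same R and S as the slides. This is only a quick illustration of the definitions, not SQL.

```python
# The numeric example from the slides.
R = {1, 2, 3, 4}
S = {4, 5, 6, 7}

intersection = R & S   # elements in both R and S
union = R | S          # elements in R, in S, or in both
difference = R - S     # elements in R that are not in S

print(intersection, union, difference)

# The name example: the sets share nothing, so the intersection is empty
# and the difference leaves the first set unchanged.
names_r = {"Joe", "Suzie"}
names_s = {"Jane", "Bob", "Sam"}
print(names_r & names_s == set())
print(names_r - names_s == names_r)
```

Note that difference is not symmetric: S - R would give {5, 6, 7} instead.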
Union, Intersect, and Difference (Except)
Use between SELECT statements.
Keyword for union is UNION.
Keyword for intersection is INTERSECT.
Keyword for difference is EXCEPT.
The queries must be union-compatible: each must return the same number of columns, with compatible data types in corresponding positions.
Example:

    (select Name from Staff) union (select Name from Faculty)
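The three keywords can be compared side by side on two small tables. The Staff and Faculty rows below are made up for illustration, and the sketch uses Python's sqlite3 module rather than MySQL. SQLite's grammar does not accept parentheses around the individual SELECT statements, so they are written without them here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: Sam appears in both tables.
conn.executescript("""
    CREATE TABLE Staff (Name TEXT);
    CREATE TABLE Faculty (Name TEXT);
    INSERT INTO Staff VALUES ('Joe'), ('Suzie'), ('Sam');
    INSERT INTO Faculty VALUES ('Jane'), ('Bob'), ('Sam');
""")

def names(operator):
    """Combine the two SELECTs with the given set operator, sorted by name."""
    sql = f"SELECT Name FROM Staff {operator} SELECT Name FROM Faculty ORDER BY Name"
    return [row[0] for row in conn.execute(sql)]

in_either = names("UNION")       # duplicates are removed, so Sam appears once
in_both = names("INTERSECT")     # only names present in both tables
staff_only = names("EXCEPT")     # Staff names that are not Faculty names

print(in_either)
print(in_both)
print(staff_only)
```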
INSERT

    INSERT INTO TableName [ (columnList) ]
    VALUES (dataValueList)

columnList is optional; if omitted, SQL assumes a list of all columns in their original CREATE TABLE order.
Any columns omitted must have been declared as NULL when the table was created, unless DEFAULT was specified when creating the column.
© Pearson Education Limited 1995, 2005
INSERT
dataValueList must match columnList as follows:
the number of items in each list must be the same;
there must be a direct correspondence in the position of items in the two lists;
the data type of each item in dataValueList must be compatible with the data type of the corresponding column.
INSERT … VALUES
Insert a new row into the Employee table, supplying data for all columns. Let’s finally put someone in the Marketing department!

Full row, so we can omit the column names:

    INSERT INTO Employee VALUES ('Brown', 4);

Or we can explicitly list the column names:

    INSERT INTO Employee (LastName, DeptID) VALUES ('Brown', 4);

Perhaps the DeptID field allows NULLs or has a default:

    INSERT INTO Employee (LastName) VALUES ('Brown');
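The three INSERT forms can be run in sequence to see what each one stores. In this sketch (Python's sqlite3 module, not MySQL) the table definition is an assumption: LastName is declared NOT NULL and DeptID allows NULLs, so the last statement shows what happens when a required column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed definition: LastName is required, DeptID may be NULL.
conn.execute("CREATE TABLE Employee (LastName TEXT NOT NULL, DeptID INTEGER)")

# 1. All columns supplied in CREATE TABLE order, no column list.
conn.execute("INSERT INTO Employee VALUES ('Brown', 4)")
# 2. Explicit column list.
conn.execute("INSERT INTO Employee (LastName, DeptID) VALUES ('Brown', 4)")
# 3. DeptID omitted: it allows NULLs, so NULL is stored.
conn.execute("INSERT INTO Employee (LastName) VALUES ('Brown')")

rows = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY rowid"
).fetchall()
print(rows)

# Omitting a NOT NULL column that has no DEFAULT is rejected.
try:
    conn.execute("INSERT INTO Employee (DeptID) VALUES (4)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```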
UPDATE

    UPDATE TableName
    SET columnName1 = dataValue1 [, columnName2 = dataValue2 ...]
    [WHERE searchCondition]

TableName can be the name of a base table or an updatable view.
The SET clause specifies the names of one or more columns that are to be updated.
The WHERE clause is optional; if omitted, all rows are updated.
UPDATE example
Ms. Johnson gets married and wants to change her name to Anderson.

    UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson'

This first query is poor because it updates every employee named Johnson. A better way to find Ms. Johnson:

    UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson' AND DeptID=1

The Marketing department is being merged with Sales, so all the employees in that department are being moved into Sales.

    UPDATE EMPLOYEE SET DeptID=2 WHERE DeptID=4

Note that while ANSI SQL doesn’t allow JOINs in an UPDATE, many database software packages do.
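The risk in the first UPDATE is easier to see when a second Johnson exists. The sketch below adds a hypothetical Johnson in Engineering (not part of the lecture's table), counts how many rows the loose condition would touch, and then runs the narrower update. It uses Python's sqlite3 module instead of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The lecture's employees plus one hypothetical extra: a Johnson in department 3.
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3), ('Johnson', 3);
""")

# How many rows would the name-only condition change?
loose_matches = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE LastName = 'Johnson'"
).fetchone()[0]

# The narrower condition touches only the intended row.
cursor = conn.execute(
    "UPDATE Employee SET LastName = 'Anderson' "
    "WHERE LastName = 'Johnson' AND DeptID = 1"
)
updated = cursor.rowcount

employees = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY DeptID, LastName"
).fetchall()

print(loose_matches, updated)
print(employees)
```

Running a SELECT with the same WHERE clause before an UPDATE is a cheap way to preview which rows will change.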
DELETE

    DELETE FROM TableName
    [WHERE searchCondition]

TableName can be the name of a base table or an updatable view.
searchCondition is optional; if omitted, all rows are deleted from the table. This does not delete the table itself.
If searchCondition is specified, only those rows that satisfy the condition are deleted.
DELETE example
Mr. Smith decides to take another job and quits:

    DELETE FROM EMPLOYEE WHERE LastName='Smith' AND DeptID=1

Remember the Marketing department? Well, rather than merge with Sales, we are going to eliminate it and all the employees in that department.

    DELETE FROM EMPLOYEE WHERE DeptID=4
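Both DELETE statements, and the point that deleting every row leaves the table in place, can be tried as follows. The sketch uses Python's sqlite3 module rather than MySQL, and it assumes Brown has already been inserted into Marketing (department 4) so that the second DELETE has something to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: Brown was added to Marketing earlier.
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3), ('Brown', 4);
""")

# Mr. Smith quits.
conn.execute("DELETE FROM Employee WHERE LastName = 'Smith' AND DeptID = 1")
# Marketing and everyone in it is eliminated.
conn.execute("DELETE FROM Employee WHERE DeptID = 4")

remaining = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY LastName"
).fetchall()
print(remaining)

# No WHERE clause: every row goes, but the table itself still exists
# and can still be queried.
conn.execute("DELETE FROM Employee")
rows_left = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(rows_left)
```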
Variants & LIKE
There is a rich set of functions that can be used in SQL. Most of them, however, depend heavily on the SQL dialect in use.
LIKE allows searching a text field for a pattern:

    SELECT * FROM students WHERE name LIKE 'R%'

% matches any sequence of zero or more characters, whereas _ matches exactly one character.
Discuss examples: MySQL can do SHA1, string manipulation, SELECT 5*10+10 works, etc.
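The difference between the two wildcards shows up with a handful of names. The students below are made up for illustration, and the sketch runs on Python's sqlite3 module rather than MySQL, so dialect-specific functions such as SHA1 are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.executescript("""
    CREATE TABLE students (name TEXT);
    INSERT INTO students VALUES ('Rob'), ('Rita'), ('Ron'), ('Bob');
""")

def matching(pattern):
    """Return the names matching a LIKE pattern, in alphabetical order."""
    sql = "SELECT name FROM students WHERE name LIKE ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_with_r = matching("R%")   # R followed by anything, of any length
ends_in_ob = matching("_ob")     # exactly one character, then 'ob'

# Plain arithmetic in a SELECT, with no table involved.
arithmetic = conn.execute("SELECT 5*10+10").fetchone()[0]

print(starts_with_r)
print(ends_in_ob)
print(arithmetic)
```

Passing the pattern as a bound parameter (the ? placeholder) keeps the quoting out of the SQL string.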
CASE statements

    SELECT CASE Sex
             WHEN 'M' THEN 'Male'
             WHEN 'F' THEN 'Female'
           END
    FROM Students
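A short run shows what the CASE expression returns, including for a value that matches no WHEN branch. The Students rows and the Name column are made up for illustration, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; Sam's Sex value is unknown (NULL).
conn.executescript("""
    CREATE TABLE Students (Name TEXT, Sex TEXT);
    INSERT INTO Students VALUES ('Rob', 'M'), ('Rita', 'F'), ('Sam', NULL);
""")

labels = conn.execute("""
    SELECT Name,
           CASE Sex
             WHEN 'M' THEN 'Male'
             WHEN 'F' THEN 'Female'
           END
    FROM Students
    ORDER BY Name
""").fetchall()

for name, label in labels:
    print(name, label)
```

With no ELSE branch, a row that matches none of the WHEN values yields NULL, which Python shows as None.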
Mini-Project
Due Feb 4, 2009
Build on your iSchool MySQL account.
Choose between the following:
UW OnTech Archive
UW Privacy Policy Set
Mini-Project
UW OnTech Archive: http://www.washington.edu/computing/ontech/archive.php
UW Privacy Policy: http://depts.washington.edu/comply/privacy.shtml
http://security.uwmedicine.org/policies/sec_policies.asp
UW OnTech contents by issue archive contents
UW OnTech issue contents article example
UW OnTech
Sample questions:
How many contents-by-issue pages list topics that are not the same as the topics on the corresponding issue contents pages? Or are missing entirely?
How many pages list an ‘exposed e-mail address’ on their readable page?
How many pages have an e-mail address that is visible in the page source?
UW OnTech
More sample questions:
What is the average number of clickable links per article in the archive?
What are the min & max number of clickable links in the archive? Which articles were they?
More to come.
UW Privacy & Security Policies
Sample questions:
Which policies have the greatest distance between Effective Date & Review Date? Which ones are they?
How many policies have the same Effective Date & Review Date?
Which policies have more than 5 attachments?
UW Privacy & Security Policies
More sample questions:
Which policies have more than 5 references? Which policies are they?
What is the most often cited reference in the reference section?
Do any of the policies in these two sets reference each other?
Customers