Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Chapter 6 THE DATABASE LANGUAGE SQL 1.

Similar presentations


Presentation on theme: "Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Chapter 6 THE DATABASE LANGUAGE SQL 1."— Presentation transcript:

1 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Chapter 6 THE DATABASE LANGUAGE SQL 1

2 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6. The Database Language SQL 6.1. 4 6.1.7. 2

3 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6. The Database Language SQL 6.1 Simple Queries in SQL 6.2 Queries Involving More Than One Relation 6.3 Subqueries 6.4 Full-Relation Operation 6.5 Database Modification 6.6 Transactions in SQL 6.7 Summary of Chapter 6 6.8 References for chapter 6 3

4 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Introduction SQL (pronounce “sequel”) stands for: Structured Query Language SQL (pronounce “sequel”) stands for: Structured Query Language SQL consists of: SQL consists of: DML (Data Manipulation Language) DML (Data Manipulation Language) DDL (Data Definition Language) DDL (Data Definition Language) Dialects of SQL: Dialects of SQL: ANSI SQL ANSI SQL SQL-92 or SQL2 SQL-92 or SQL2 SQL-99 or SQL3 SQL-99 or SQL3 SQL:2003 SQL:2003 4

5 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Introduction (cont’d) There are versions of SQL produced by the principal DBMS vendors. There are versions of SQL produced by the principal DBMS vendors. They support ANSI standards They support ANSI standards Conform to a large extent to the more recent SQL2 Conform to a large extent to the more recent SQL2 Each has its variations and extensions beyond SQL2 including some of the features in the SQL-99 and SQL:2003 Each has its variations and extensions beyond SQL2 including some of the features in the SQL-99 and SQL:2003 5

6 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 SIMPLE QUERIES IN SQL Section 6.1 6

7 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1 Simple Queries in SQL 6.1.1 Projecting in SQL 6.1.2 Selecting in SQL 6.1.3 Comparison of Strings 6.1.4 Pattern Matching in SQL 6.1.5 Dates and Times 6.1.6 Null Values and Comparisons Involving NULL 6.1.7 The Truth-Value UNKNOWN 6.1.8 Ordering the Output 6.1.9 Exercises for Section 6.1 7

8 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1 Simple Queries in SQL A simple query uses the following keywords: A simple query uses the following keywords: SELECT, FROM, WHERE (BNF) SELECT, FROM, WHERE (BNF) We call these queries: select-from-where form We call these queries: select-from-where form 8

9 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1 Simple Queries in SQL (cont’d) Example 6.1 (selection) Given the relation Movies with the following schema: Movies(title, year, length, genre, studioName, producerC#) Query all movies produced by Disney Studios in 1990 Example 6.1 (selection) Given the relation Movies with the following schema: Movies(title, year, length, genre, studioName, producerC#) Query all movies produced by Disney Studios in 1990 SELECT * FROM Movies WHERE studioName = ‘Disney’ AND year = 1990; SELECT * FROM Movies WHERE studioName = ‘Disney’ AND year = 1990; 9

10 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.1 Projection in SQL We can project the relation produced by a SQL onto some of its attributes. We can project the relation produced by a SQL onto some of its attributes. Example 6.2 Modify the example 6.1 and bring just title and length of the movies with the same condition. SELECT title, length FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Example 6.2 Modify the example 6.1 and bring just title and length of the movies with the same condition. SELECT title, length FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; 10

11 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.1 Projection in SQL (cont’d) If you wish to have different names for the attributes, then just type what you wish as follows: If you wish to have different names for the attributes, then just type what you wish as follows: Example 6.3 SELECT title AS name, length AS duration FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Example 6.3 SELECT title AS name, length AS duration FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Note that you can eliminate “AS” Note that you can eliminate “AS” 11

12 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.1 Projection in SQL (cont’d) Another option is to use an expression in place of an attribute. Another option is to use an expression in place of an attribute. Example 6.4 compute the length in hours SELECT title AS name, length/60 AS Length_In_Hours FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Example 6.4 compute the length in hours SELECT title AS name, length/60 AS Length_In_Hours FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; 12

13 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.1 Projection in SQL (cont’d) Another option is to use a constant in place of an attribute. Another option is to use a constant in place of an attribute. Example 6.5 SELECT title, length/60 AS Length ‘hrs.’ AS Hours FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Example 6.5 SELECT title, length/60 AS Length ‘hrs.’ AS Hours FROM Movies WHERE studioName = 'DISNEY' AND year = 1990; Note that SQL is case insensitive. Note that SQL is case insensitive. 13

14 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.2 Selection in SQL We can select desired tuples through “WHERE” clause. We can select desired tuples through “WHERE” clause. Comparison operators: Comparison operators: = (like == in Java) = (like == in Java) <> (like != in Java) <> (like != in Java) < > <= <= >= >= Concatenation operator for strings: || Concatenation operator for strings: || 14

15 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.2 Selection in SQL (cont’d) String constants are surrounded by single quotes. Like ‘Disney’ in the previous examples. String constants are surrounded by single quotes. Like ‘Disney’ in the previous examples. Numeric constants can be Integer or Real. Numeric constants can be Integer or Real. The result of a comparison is TRUE or FALSE. The result of a comparison is TRUE or FALSE. Logical operators are: AND, OR, NOT Logical operators are: AND, OR, NOT The precedence of logical operators is: NOT, AND, OR The precedence of logical operators is: NOT, AND, OR Use parenthesis to break this precedence. Use parenthesis to break this precedence. 15

16 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.2 Selection in SQL (cont’d) Example 6.6 Query for the movies that made by ‘MGM’ studios and either were made after 1970 or were less than 90 minutes long. SELECT title FROM Movies WHERE studioName = ‘MGM’ AND (year > 1970 OR length 1970 OR length < 90); 16

17 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.2 Selection in SQL (cont’d) Example 6.6 Query for the movies that made by ‘MGM’ studios and either were made after 1970 or were less than 90 minutes long. SELECT title FROM Movies WHERE studioName = ‘MGM’ AND (year > 1970 OR length 1970 OR length < 90); 17

18 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.3 Comparison of Strings Strings can be stored as either fixed-length using CHAR data type or variable-length using VARCHAR. Strings can be stored as either fixed-length using CHAR data type or variable-length using VARCHAR. When comparing strings, only real characters are considered and padding characters are ignored regardless of the data type declaration. When comparing strings, only real characters are considered and padding characters are ignored regardless of the data type declaration. When comparing strings using, =, lexicographic order of characters are considered. (like dictionary) When comparing strings using, =, lexicographic order of characters are considered. (like dictionary) 18

19 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.3 Comparison of Strings (cont’d) Example: ‘fodder’ < ‘foo’ Because, ‘fo’ = ‘fo’ but ‘d’ < ‘o’ Example: ‘fodder’ < ‘foo’ Because, ‘fo’ = ‘fo’ but ‘d’ < ‘o’ Example: ‘bar’ < ‘bargain’ Because, ‘bar’ = ‘bar’ but ‘’ < ‘gain’ Example: ‘bar’ < ‘bargain’ Because, ‘bar’ = ‘bar’ but ‘’ < ‘gain’ Note that ‘A’ < ‘a’ Note that ‘A’ < ‘a’ 19

20 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.4 Pattern Matching in SQL Two alternative forms of comparison are: s LIKE p s NOT LIKE p Where s is a string and p is a pattern. Two alternative forms of comparison are: s LIKE p s NOT LIKE p Where s is a string and p is a pattern. p can contain wildcards or ordinary chars. p can contain wildcards or ordinary chars. % matches zero or more chars % matches zero or more chars _ (underscore) matches one char _ (underscore) matches one char Note that strings are case sensitive. Note that strings are case sensitive. 20

21 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.4 Pattern Matching in SQL (cont’d) Example 6.7 SELECT title FROM Movies WHERE title LIKE 'STAR_ _ _ _ _'; Example 6.7 SELECT title FROM Movies WHERE title LIKE 'STAR_ _ _ _ _'; SELECT title FROM Movies WHERE title LIKE 'STAR ____'; Retrieves the titles that starts with ‘STAR’, then one blank and the 4 last chars can be anything. Retrieves the titles that starts with ‘STAR’, then one blank and the 4 last chars can be anything. So, possible matches can be: ‘Star Wars’, ‘Star Trek’ So, possible matches can be: ‘Star Wars’, ‘Star Trek’ 21

22 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.4 Pattern Matching in SQL (cont’d) Example 6.8 SELECT title FROM Movies WHERE title LIKE ‘%’’s%’; Example 6.8 SELECT title FROM Movies WHERE title LIKE ‘%’’s%’; Note that if your string contains single quote, put another single quote to distinguish between surrounding single quotes and the single quote itself. Note that if your string contains single quote, put another single quote to distinguish between surrounding single quotes and the single quote itself. Retrieve all movies that contain the ‘s in their name like: Logan’s Run, Alice’s Restaurant Retrieve all movies that contain the ‘s in their name like: Logan’s Run, Alice’s Restaurant 22

23 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.5 Dates and Times A date constant is represented by the keyword DATE followed by a quoted string. A date constant is represented by the keyword DATE followed by a quoted string. For example: DATE ‘1961-08-24’ For example: DATE ‘1961-08-24’ Note the strict format of the ‘YYYY-mm-dd’ Note the strict format of the ‘YYYY-mm-dd’ 23

24 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.5 Dates and Times (cont’d) A time constant is represented by the keyword TIME followed by a quoted string. A time constant is represented by the keyword TIME followed by a quoted string. For example: TIME ’15:05:03’ For example: TIME ’15:05:03’ Another example: TIME ’15:05:03.15’ Another example: TIME ’15:05:03.15’ Note the strict format of ‘HH:mm:ss’ and ‘HH:mm:ss.nnn’ Note the strict format of ‘HH:mm:ss’ and ‘HH:mm:ss.nnn’ Note that HH is a military format (24-hour) Note that HH is a military format (24-hour) Fractions of seconds can be as many as significant digit you like Fractions of seconds can be as many as significant digit you like 24

25 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.5 Dates and Times (cont’d) An alternative for the time is to represent it with respect to GMT (Greenwich Mean Time) An alternative for the time is to represent it with respect to GMT (Greenwich Mean Time) TIME ‘HH:mm:ss – HH:mm’ to represent the times behind GMT. TIME ‘HH:mm:ss – HH:mm’ to represent the times behind GMT. For example: TIME ‘12:00:00 – 8:00) represent the noon in Pacific standard time (PST) For example: TIME ‘12:00:00 – 8:00) represent the noon in Pacific standard time (PST) TIME ‘HH:mm:ss + HH:mm’ to represent the times ahead GMT. TIME ‘HH:mm:ss + HH:mm’ to represent the times ahead GMT. 25

26 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.5 Dates and Times (cont’d) To combine date and time, we use a value of type TIMESTAMP. To combine date and time, we use a value of type TIMESTAMP. Use format: TIMESTAMP ‘YYYY-mm-dd HH:mm:ss’ to represent a timestamp. Note the space between the date and the time. Use format: TIMESTAMP ‘YYYY-mm-dd HH:mm:ss’ to represent a timestamp. Note the space between the date and the time. For example: TIMESTAMP ‘1961-08-24 12:00:00’ For example: TIMESTAMP ‘1961-08-24 12:00:00’ TIMESTAMPs values can be compared by the comparison operators. TIMESTAMPs values can be compared by the comparison operators. 26

27 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.6 Null Values and Comparisons Involving NULL Different interpretations for NULL values: 1. Value unknown I know there is some value here but I don’t know what it is? 2. Value inapplicable There is no value that make sense here. 3. Value withheld We are not entitled to know this value. 27

28 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.6 Null Values and Comparisons Involving NULL (cont’d) When operating upon a NULL value, remember that: When operating upon a NULL value, remember that: 1. The result would be NULL if any value, including NULL, is one of the operands of an arithmetic operation. Example: price + 1 = NULL if the price is NULL 2. The result would be UNKNOWN if we compare a value, including NULL, with NULL. Example: ‘Ali’ < NULL is UNKNOWN Note that NULL is not a constant. Note that NULL is not a constant. 28

29 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.6 Null Values and Comparisons Involving NULL (cont’d) Example 6.9 Let x have the value NULL Then x + 3 is also NULL But NULL + 3 is not a legal SQL expression Example 6.9 Let x have the value NULL Then x + 3 is also NULL But NULL + 3 is not a legal SQL expression Also: x=3 is unknown in the above example because we cannot say whether the x which currently is NULL is equal 3 or not? Also: x=3 is unknown in the above example because we cannot say whether the x which currently is NULL is equal 3 or not? Note that NULL = 3 is not correct SQL. Note that NULL = 3 is not correct SQL. We use: x IS NULL or x IS NOT NULL to check if the x is NULL or not. We use: x IS NULL or x IS NOT NULL to check if the x is NULL or not. 29

30 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.7 The Truth-Value UNKNOWN UNKNOWN is the third truth value beside TRUE and FALSE. UNKNOWN is the third truth value beside TRUE and FALSE. To memorize what the results of an operation would be, consider TRUE = 1, FALSE = 0, and UNKNOWN = ½ To memorize what the results of an operation would be, consider TRUE = 1, FALSE = 0, and UNKNOWN = ½ The AND of two operand is the minimum of those values The AND of two operand is the minimum of those values The OR of two operand is the maximum of those values The OR of two operand is the maximum of those values The NOT of a value v is 1 - v The NOT of a value v is 1 - v 30

31 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.7 The Truth-Value UNKNOWN (cont’d) The truth table The truth table XYX AND YX OR YNOT X TRUE FALSE TRUEUNKNOWN TRUEFALSE TRUEFALSE TRUEFALSE UNKNOWNTRUEUNKNOWNTRUEUNKNOWN FALSE UNKNOWN FALSETRUEFALSETRUE FALSEUNKNOWNFALSEUNKNOWNTRUE FALSE TRUE 31

32 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.7 The Truth-Value UNKNOWN (cont’d) The relevant tuples won’t be retrieved if a condition in WHERE clause is evaluated to UNKNOWN. The relevant tuples won’t be retrieved if a condition in WHERE clause is evaluated to UNKNOWN. Example 6.10 SELECT * FROM Movies_null WHERE length 120; Example 6.10 SELECT * FROM Movies_null WHERE length 120; The query won’t retrieve the tuples that contain NULL values in the length column even though we expect it should bring all movies. The query won’t retrieve the tuples that contain NULL values in the length column even though we expect it should bring all movies. 32

33 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.7 The Truth-Value UNKNOWN (cont’d) The relevant tuples won’t be retrieved if a condition in WHERE clause is evaluated to UNKNOWN. The relevant tuples won’t be retrieved if a condition in WHERE clause is evaluated to UNKNOWN. Example 6.10 SELECT Studioname Example 6.10 SELECT Studioname FROM Movies_null WHERE length is null; FROM Movies_null WHERE length is null; 33

34 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.8 Ordering the Output We may ask to represent the output in sorted order. We may ask to represent the output in sorted order. We modify the select-from-where format and add the following clause: ORDER BY We modify the select-from-where format and add the following clause: ORDER BY The order is by default ascending but you can ask to order in descending order by appending the keyword DESC to an attributes The order is by default ascending but you can ask to order in descending order by appending the keyword DESC to an attributes Similarly, you can add keyword ASC for ascending order. Similarly, you can add keyword ASC for ascending order. 34

35 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.8 Ordering the Output (cont’d) Example 6.11 SELECT producerC#,title FROM Movies WHERE studioName = 'DISNEY' AND year = 1990 ORDER BY length DESC, title; Example 6.11 SELECT producerC#,title FROM Movies WHERE studioName = 'DISNEY' AND year = 1990 ORDER BY length DESC, title; Note that we can list any attributes in the ORDER BY clause even if the attributes are not listed in the SELECT clause. Note that we can list any attributes in the ORDER BY clause even if the attributes are not listed in the SELECT clause. The list can even have an expression. For example ORDER BY year - 10 The list can even have an expression. For example ORDER BY year - 10 35

36 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.1.9 Exercises for Section 6.1 36

37 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 QUERIES INVOLVING MORE THAN ONE RELATION Section 6.2 37

38 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2 Queries Involving More Than One Relation 6.2.1 Products and Joins in SQL 6.2.2 Disambiguating Attributes 6.2.3 Tuple Variables 6.2.4 Interpreting Multi-Relation Queries 6.2.5 Union, Intersection, and Difference of Queries 6.2.6 Exercises for Section 6.2 38

39 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL Cartesian product of two relations R(a, b) containing n tuples and T(c, d, e) containing m tuples, produces another relation U(a, b, c, d, e) with n x m tuples. Cartesian product of two relations R(a, b) containing n tuples and T(c, d, e) containing m tuples, produces another relation U(a, b, c, d, e) with n x m tuples. To have a clear picture about what happens when we make a Cartesian product of two relations, consider a two level nested for-loop: in the outer loop, each tuples of R is retrieved and in the inner loop, it combines with each tuples of T and produce the resulting relation tuples. To have a clear picture about what happens when we make a Cartesian product of two relations, consider a two level nested for-loop: in the outer loop, each tuples of R is retrieved and in the inner loop, it combines with each tuples of T and produce the resulting relation tuples. 39

40 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (Product) Example 6.12a Movies(title, year, length, genre, studioName, producerC#) MovieExec(name, address, cert#, netWorth) SELECT * FROM Movies, MovieExec 40

41 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (cont’d) Cartesian product of two relations are not usually used. We should eliminate the unnecessary tuples by applying some conditions in the WHERE clause. Cartesian product of two relations are not usually used. We should eliminate the unnecessary tuples by applying some conditions in the WHERE clause. Based on what conditions we apply in the WHERE clause, different types of joins are produced. Based on what conditions we apply in the WHERE clause, different types of joins are produced. 41

42 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (Equi/Theta Join) Example 6.12a Retrieve the name of the producer of ‘Star Wars’. We need the following relations: Movies(title, year, length, Incolor, studioName, producerC#) MovieExec(name, address, cert#, netWorth) SELECT * FROM Movies, MovieExec WHERE title = 'STAR WARS' AND producerC# = cert#; 42

43 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (DB1) EquiJoin EquiJoin involves the join of two or more tables using the equality operator. The equijoin by definition produces a result containing two identical columns. 43

44 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (DB1) Example : Get all supplier and part information such that the supplier and part in question are located in the same city. SELECTSUPPLIER.*, PART.* FROM SUPPLIER, PART WHERESUPPLIER.CITY=PART.CITY; 44

45 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (Natural Join) Example 6.12b SELECT SELECT title, year, length, InColor, studioName, producerC#,name,address,,netWorth producerC#,name,address,cert#(del),netWorth FROM Movies, MovieExec WHERE title = 'STAR WARS' AND producerC# = cert#; 45

46 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (cont’d) Example 6.12 Retrieve the name of the producer of ‘Star Wars’. We need the following relations: Movies(title, year, length, InColor, studioName, producerC#) MovieExec(name, address, cert#, netWorth) SELECT name FROM Movies, MovieExec WHERE title = 'STAR WARS' AND producerC# = cert#; 46

47 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (DB1) Natural Join In the equijoin, if one of the two identical columns is eliminated, then what is left is called the natural join. Natural join is probably the single most useful form of join and is usually referred to as simply "Join" 47

48 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.1 Products and Joins in SQL (DB1) SELECTS.SNUM, S.SNAME, S.STATUS, S.CITY, P.PNUM, P.PNAME, P.COLOR, P.WEIGHT FROMSUPPLIERS S, PARTS P Where S.CITY=P.CITY; 48

49 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.2 Disambiguating Attributes If the relations involving in a join contain attributes with the same name, then we must qualify each attribute by the relation name and a dot as follows: Example 6.13 Retrieve pairs of stars and executives with the same address. SELECT MovieStar.name, MovieExec.name FROM MovieStar, MovieExec WHERE MovieStar.address = MovieExec.address; MovieExec.address; 49

50 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.3 Tuple Variables Sometimes we need to join a relation with itself. So, the previous method does not work here. Sometimes we need to join a relation with itself. So, the previous method does not work here. We may list a relation R as many times as we need to in the FROM clause. We may list a relation R as many times as we need to in the FROM clause. We need a way to refer to each occurrence of R. We need a way to refer to each occurrence of R. SQL allows us to define an alias for each occurrence of R. SQL allows us to define an alias for each occurrence of R. This is also called Tuple Variable. This is also called Tuple Variable. 50

51 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.3 Tuple Variables (cont’d) Tuple variable syntax: From R AS R1, R AS R2,... Tuple variable syntax: From R AS R1, R AS R2,... Note that AS is optional and usually we omit it. Note that AS is optional and usually we omit it. If a relation has an alias, you are allowed to disambiguate the attributes with the alias as well. If a relation has an alias, you are allowed to disambiguate the attributes with the alias as well. When you don’t mention tuple variable for a relation, in fact it has a tuple variable with the same name of the relation. When you don’t mention tuple variable for a relation, in fact it has a tuple variable with the same name of the relation. 51

52 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.3 Tuple Variables (cont’d) Example 6.14 Retrieve pairs of stars that share the same address. SELECT Star1.name, Star2.name FROM MovieStar Star1, MovieStar Star2 WHERE Star1.address = Star2.address AND Star1.name < Star2.name; What’s the role of the second condition? What’s the role of the second condition? What would happen if we use <>? What would happen if we use <>? 52

53 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.4 Interpreting Multi-Relation Queries If there are several tuple variables, we may imagine nested loops, one for each tuple variable, in which the variables each range over the tuples of their respective relations. If there are several tuple variables, we may imagine nested loops, one for each tuple variable, in which the variables each range over the tuples of their respective relations. For each assignment of tuples to the tuple variables, we decide whether the WHERE clause is true. For each assignment of tuples to the tuple variables, we decide whether the WHERE clause is true. If so, we produce a tuple consisting of the values of the expressions following SELECT. If so, we produce a tuple consisting of the values of the expressions following SELECT. 53

54 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.4 Interpreting Multi-Relation Queries (cont’d) Another approach is to relate the query to relational algebra. Another approach is to relate the query to relational algebra. At first, create the Cartesian product of the relations. At first, create the Cartesian product of the relations. Then, apply selection operator by considering WHERE clause conditions. Then, apply selection operator by considering WHERE clause conditions. Then, make a projection of the attributes listed in the SELECT clause. Then, make a projection of the attributes listed in the SELECT clause. 54

55 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.4 Interpreting Multi-Relation Queries (cont’d) Example 6.15 Convert the query of example 6.14 to RA. Π L1 (σ C1 And C2 (R X T)) Where: R = MovieStar Star1 T = MovieStar Star2 L1 = Star1.name, Start2.name C1 = Star1.address = Star2.address C2 = Star1.name < Star2.name 55

56 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.5 Union, Intersection, and Difference of Queries There are 3 operations in set theory called: Union, Intersection, and Difference There are 3 operations in set theory called: Union, Intersection, and Difference SQL uses the keywords UNION, INTERSECT, and EXCEPT for the same operations on relations. SQL uses the keywords UNION, INTERSECT, and EXCEPT for the same operations on relations. Next example shows how to use these operators on relations. Next example shows how to use these operators on relations. 56

57 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.5 Union, Intersection, and Difference of Queries (cont’d) Example 6.16 Query the names and addresses of all female movie stars who are also movie executives with a net worth over $10,000,000 (SELECT name, address FROM MovieStar WHERE gender = 'M') INTERSECT (SELECT name, address FROM MovieExec WHERE netWorth > 10000000); 57

58 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.5 Union, Intersection, and Difference of Queries (cont’d) Example 6.17 Query the names and addresses of movie stars who are not movie executives. (SELECT name, address FROM MovieStar) MINUS (SELECT name, address FROM MovieExec); 58

59 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.5 Union, Intersection, and Difference of Queries (cont’d) Example 6.18 Query all the titles and years of movies that appeared in either the Movies or StarsIn relations. (SELECT title, year FROM Movies) UNION (SELECT movieTitle AS title, movieYear AS year FROM StarsIn ); 59

60 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.2.6 Exercises for Section 6.2 60

61 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 SUBQUERIES Section 6.3 61

62 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3 Subqueries 6.3.1 Subqueries that Produce Scalar Values 6.3.2 Conditions Involving Relations 6.3.3 Conditions Involving Tuples 6.3.4 Correlated Subqueries 6.3.5 Subqueries in From Clauses 6.3.6 SQL Join Expressions 6.3.7 Natural Joins 6.3.8 Outer Joins 6.3.9 Exercises for Section 6.3 62

63 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3 Subqueries A query that is part of another is called a subquery. A query that is part of another is called a subquery. The subqueries can have subqueries as well. The subqueries can have subqueries as well. We already saw subqueries in the previous examples. We created a UNION query by connecting two subqueries. We already saw subqueries in the previous examples. We created a UNION query by connecting two subqueries. 63

64 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.1 Subqueries that Produce Scalar Values A select-from-where expression can produce a relation with any number of attributes and any number of tuples. A select-from-where expression can produce a relation with any number of attributes and any number of tuples. If it produces one tuple with one attribute, we call it a scalar. If it produces one tuple with one attribute, we call it a scalar. We can use a scalar as a constant. To do that, we surround the query in a parenthesis as the following example shows. We can use a scalar as a constant. To do that, we surround the query in a parenthesis as the following example shows. 64

65 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.1 Subqueries that Produce Scalar Values (cont’d) Example 6.19 (another version of Example 6.12) Query the producer of Star Wars. SELECT name FROM MovieExec WHERE cert# = (SELECT producerC# FROM Movies FROM Movies WHERE title = 'STAR WARS' and producerC# > 950); WHERE title = 'STAR WARS' and producerC# > 950); What would happen if the subquery retrieve zero or more than one tuple? 65

66 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.1 Subqueries that Produce Scalar Values (cont’d) Example 6.19 (another version of Example 6.12) Query the producer of Star Wars. SELECT name FROM MovieExec WHERE cert# in (SELECT producerC# FROM Movies FROM Movies WHERE title = 'STAR WARS' ); WHERE title = 'STAR WARS' ); The subquery retrieve zero or more than one tuple. 66

67 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.2 Conditions Involving Relations (skip to examples) These operators can be applied to relations and produce a Boolean result. These operators can be applied to relations and produce a Boolean result. The relation must be expressed as a subquery. The relation must be expressed as a subquery. In this sub-section, we consider the operators in their simple form where a scalar value s is involved. In this sub-section, we consider the operators in their simple form where a scalar value s is involved. Therefore, the subquery R is required to produce a one-column relation. Therefore, the subquery R is required to produce a one-column relation. 67

68 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.2 Conditions Involving Relations (skip to examples) EXISTS R is true iff R is not empty. EXISTS R is true iff R is not empty. s IN R is true iff s is equal to one of the values in R. s IN R is true iff s is equal to one of the values in R. s > ALL R is true iff s is greater than every value in unary relation R. Other comparison operators ( =, =, <>) can be used. s > ALL R is true iff s is greater than every value in unary relation R. Other comparison operators ( =, =, <>) can be used. s > ANY R is true iff s is greater than at least one value in unary relation R. Other comparison operators ( =, =, <>) can be used. s > ANY R is true iff s is greater than at least one value in unary relation R. Other comparison operators ( =, =, <>) can be used. 68

69 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.2 Conditions Involving Relations (skip to examples) To negate EXISTS, ALL, and ANY operators, put NOT in front of the entire expression. NOT EXISTS R, NOT s > ALL R, NOT s > ANY R To negate EXISTS, ALL, and ANY operators, put NOT in front of the entire expression. NOT EXISTS R, NOT s > ALL R, NOT s > ANY R s NOT IN R is the negation of IN operator. s NOT IN R is the negation of IN operator. Some situations of these operators are equal to other operators. Some situations of these operators are equal to other operators. For example: s <> ALL R is equal to s NOT IN R s = ANY R is equal to s IN R For example: s <> ALL R is equal to s NOT IN R s = ANY R is equal to s IN R 69

70 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.3 Conditions Involving Tuples (skip to examples) A tuple in SQL is represented by a parenthesized list of scalar values. A tuple in SQL is represented by a parenthesized list of scalar values. Examples: (123, ‘I am a string’, 0, NULL) (name, address, salary) Examples: (123, ‘I am a string’, 0, NULL) (name, address, salary) The first example shows all constants and the second shows attributes. The first example shows all constants and the second shows attributes. Mixing constants and attributes are allowed. Mixing constants and attributes are allowed. 70

71 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.1 Subqueries that Produce Scalar Values (cont’d) Example 6.19 (another version of Example 6.12) Query the producer of Star Wars. SELECT name FROM MovieExec WHERE Exists (SELECT producerC# FROM Movies FROM Movies WHERE title = 'STAR WARS' and cert# = producerC# ); WHERE title = 'STAR WARS' and cert# = producerC# ); The subquery retrieve zero or more than one tuple. 71

72 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.3 Conditions Involving Tuples (Examples) If the type and the number of attributes in a tuple are the same as of a relation, we can compare them. If the type and the number of attributes in a tuple are the same as of a relation, we can compare them. Example: ('Tom', 'Smith') IN Example: ('Tom', 'Smith') IN (SELECT firstName, LastName FROM foo); (SELECT firstName, LastName FROM foo); Note that the order of the attributes must be the same in the tuple and the SELECT list. Note that the order of the attributes must be the same in the tuple and the SELECT list. 72

73 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.3 Conditions Involving Tuples (One/two Column Relation) Example 6.20: Query all the producers of movies in which Harrison Ford stars. SELECT name FROM MovieExec WHERE cer# IN (SELECT producerC# FROM Movies (SELECT producerC# FROM Movies WHERE (title, year) IN WHERE (title, year) IN (SELECT movieTitle, movieYear (SELECT movieTitle, movieYear FROM StarsIN FROM StarsIN WHERE starName = 'Harrison Ford') WHERE starName = 'Harrison Ford') ); ); 73

74 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.3 Conditions Involving Tuples (using Join Queries) Note that sometimes, you can get the same result without the expensive subqueries. Note that sometimes, you can get the same result without the expensive subqueries. For example, the previous query can be written as follows: For example, the previous query can be written as follows: SELECT name FROM MovieExec, Movies, StarsIN WHERE cer# = producerC# AND title = movieTitle AND year = movieYear And starName = 'Harrison Ford'; 74

75 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.4 Correlated Subqueries The simplest subquery is evaluated once and the result is used in a higher-level query. The simplest subquery is evaluated once and the result is used in a higher-level query. Some times a subquery is required to be evaluated several times, once for each assignment of a value that comes from a tuple variable outside the subquery. Some times a subquery is required to be evaluated several times, once for each assignment of a value that comes from a tuple variable outside the subquery. A subquery of this type is called correlated subquery. A subquery of this type is called correlated subquery. 75

76 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.4 Correlated Subqueries (cont'd) Example 6.21 Query the titles that have been used for two or more movies. SELECT title FROM Movies old WHERE year < ANY (SELECT year FROM Movies (SELECT year FROM Movies WHERE title = old.title); WHERE title = old.title); 76

77 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.5 Subqueries in From Clauses In a FROM list, we my use a parenthesized subquery. In a FROM list, we my use a parenthesized subquery. The subquery must have a tuple variable or alias. The subquery must have a tuple variable or alias. 77

78 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.5 Subqueries in From Clauses (cont'd) Example 6.22 Query the producers of Harrison Ford's movies. Select name FROM MovieExec, (SELECT producerC# FROM Movies, StarsIN (SELECT producerC# FROM Movies, StarsIN WHERE title = movieTitle WHERE title = movieTitle AND year = movieYear AND year = movieYear AND starName = 'Harrison Ford' AND starName = 'Harrison Ford' ) Prod ) Prod WHERE cert# = Prod.producerC#; 78

79 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.6 SQL Join Expressions Join operators construct new temp relations from existing relations. Join operators construct new temp relations from existing relations. These relations can be used in any part of the query that you can put a subquery. These relations can be used in any part of the query that you can put a subquery. Cross join is the simplest form of a join. Cross join is the simplest form of a join. Actually, this is synonym for Cartesian product. Actually, this is synonym for Cartesian product. For example: From Movies CROSS JOIN StarsIn is equal to: From Movies, StarsIn For example: From Movies CROSS JOIN StarsIn is equal to: From Movies, StarsIn 79

80 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.6 SQL Join Expressions (cont'd) If the relations we used are: Movies(title, year, length, genre, studioName, producerC#) StarsIn(movieTitle, movieYear, starName) Then the result of the CROSS JOIN would be a relation with the following attributes: R(title, year, length, genre, studioName, producerC#, movieTitle, movieYear, starName) If the relations we used are: Movies(title, year, length, genre, studioName, producerC#) StarsIn(movieTitle, movieYear, starName) Then the result of the CROSS JOIN would be a relation with the following attributes: R(title, year, length, genre, studioName, producerC#, movieTitle, movieYear, starName) Note that if there is a common name in the two relations, then the attributes names would be qualified with the relation name. Note that if there is a common name in the two relations, then the attributes names would be qualified with the relation name. 80

81 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.6 SQL Join Expressions (cont'd) Cross join by itself is rarely a useful operation. Cross join by itself is rarely a useful operation. Usually, a theta-join is used as follows: FROM R JOIN S ON condition Usually, a theta-join is used as follows: FROM R JOIN S ON condition For example: Movies JOIN StarsIn ON title = movieTitle AND year = movieYear For example: Movies JOIN StarsIn ON title = movieTitle AND year = movieYear The result would be the same number of attributes but the tuples would be those that agree on both the title and year. The result would be the same number of attributes but the tuples would be those that agree on both the title and year. 81

82 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.6 SQL Join Expressions (cont'd) Note that in the previous example, the title and year are repeated twice. Once as title and year and once as movieTitle and movieYear. Note that in the previous example, the title and year are repeated twice. Once as title and year and once as movieTitle and movieYear. Considering the point that the resulting tuples have the same value for title and movieTitle, and year and movieYear, then we encounter the redundancy of information. Considering the point that the resulting tuples have the same value for title and movieTitle, and year and movieYear, then we encounter the redundancy of information. One way to remove the unnecessary attributes is projection. You can mention the attributes names in the SELECT list. One way to remove the unnecessary attributes is projection. You can mention the attributes names in the SELECT list. 82

83 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.7 Natural Joins Natural join and theta-join differs in: Natural join and theta-join differs in: 1. The join condition All pairs of attributes from the two relations having a common name are equated, and also there are no other conditions. 2. The attributes list One of each pair of equated attributes is projected out. Example MovieStar NATURAL JOIN MovieExec Example MovieStar NATURAL JOIN MovieExec 83

84 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.7 Natural Joins (cont'd) Example 6.24 Query those stars who are executive as well. The relations are: MovieStar(name, address, gender, birthdate) MovieExec(name, address, cert#, netWorth) SELECT MovieStar.name FROM MovieStar NATURAL JOIN MovieExec 84

85 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.8 Outer Joins Outer join is a way to augment the result of a join by dangling tuples, padded with null values. Outer join is a way to augment the result of a join by dangling tuples, padded with null values. 85

86 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.8 Outer Joins (cont'd) Example 6.25 Consider the following relations: MovieStar(name, address, gender, birthdate) MovieExec(name, address, cert#, netWorth) Then MovieStar NATURAL FULL OUTER JOIN MovieExec Will produce a relation whose tuples are of 3 kinds: 1. Those who are both movie stars and executive 2. Those who are movie star but not executive 3. Those who are executive but not movie star 86

87 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.8 Outer Joins (cont'd) We can replace keyword FULL with LEFT or RIGHT to get two new join. We can replace keyword FULL with LEFT or RIGHT to get two new join. NATURAL LEFT OUTER JOIN would yield the first two tuples but not the third. NATURAL LEFT OUTER JOIN would yield the first two tuples but not the third. NATURAL RIGHT OUTER JOIN would yield the first and third tuples but not the second. NATURAL RIGHT OUTER JOIN would yield the first and third tuples but not the second. 87

88 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.8 Outer Joins (cont'd) We can have theta-outer-join as follows: We can have theta-outer-join as follows: R FULL OUTER JOIN S ON condition R FULL OUTER JOIN S ON condition R LEFT OUTER JOIN S ON condition R LEFT OUTER JOIN S ON condition R RIGHT OUTER JOIN S ON condition R RIGHT OUTER JOIN S ON condition 88

89 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.3.9 Exercises for Section 6.3 89

90 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 FULL-RELATION OPERATIONS Section 6.4 90

91 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4 Full-Relation Operations 6.4.1 Eliminating Duplicates 6.4.2 Duplicates in Unions, Intersections, and Differences 6.4.3 Grouping and Aggregation in SQL 6.4.4 Aggregation Operators 6.4.5 Grouping 6.4.6 Grouping, Aggregation, and Nulls 6.4.7 Having Clauses 6.4.8 Exercises for Section 6.4 91

92 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.1 Eliminating Duplicates SQL does not eliminate duplicate tuples by itself. So, it does not treat the relations as a set. It treats the relations as a bag. SQL does not eliminate duplicate tuples by itself. So, it does not treat the relations as a set. It treats the relations as a bag. To eliminate duplicate tuples, use DISTINCT keyword after SELECT as the next example shows. To eliminate duplicate tuples, use DISTINCT keyword after SELECT as the next example shows. Note that duplicate tuples elimination is a very expensive operation for database, so, use DISTINCT keyword wisely. Note that duplicate tuples elimination is a very expensive operation for database, so, use DISTINCT keyword wisely. 92

93 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.1 Eliminating Duplicates Example 6.27 Query all the producers of movies in which Harrison Ford stars. SELECT DISTINCT name FROM MovieExec, Movies, StarsIN WHERE cer# = producerC# AND title = movieTitle AND year = movieYear And starName = 'Harrison Ford'; 93

94 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.2 Duplicates in Unions, Intersections, and Differences Duplicate tuples are eliminated in UNION, INTERSECT, and EXCEPT. Duplicate tuples are eliminated in UNION, INTERSECT, and EXCEPT. In other words, bags are converted to sets. In other words, bags are converted to sets. If you don't want this conversion, use keyword ALL after the operators. If you don't want this conversion, use keyword ALL after the operators. Example 6.28 Example 6.28 ( SELECT title, year FROM Movies) UNION ALL (SELECT movieTitle AS title, movieYear AS year FROM StarsIn); 94

95 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.3 Grouping and Aggregation in SQL We can partition the tuples of a relation into "groups" based on the values of one or more attributes. The relation can be an output of a SELECT statement. We can partition the tuples of a relation into "groups" based on the values of one or more attributes. The relation can be an output of a SELECT statement. Then, we can aggregate the other attributes using aggregation operators. Then, we can aggregate the other attributes using aggregation operators. For example, we can sum up the salary of the employees of each department by grouping the company into departments. For example, we can sum up the salary of the employees of each department by grouping the company into departments. 95

96 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.4 Aggregation Operators SQL uses the five aggregation operators: SUM, AVG, MIN, MAX, and COUNT SQL uses the five aggregation operators: SUM, AVG, MIN, MAX, and COUNT These operators can be applied to scalar expressions, typically, a column name. These operators can be applied to scalar expressions, typically, a column name. One exception is COUNT(*) which counts all the tuples of a query output. One exception is COUNT(*) which counts all the tuples of a query output. We can eliminate the duplicate values before applying aggregation operators by using DISTINCT keyword. For example: COUNT(DISTINCT x) We can eliminate the duplicate values before applying aggregation operators by using DISTINCT keyword. For example: COUNT(DISTINCT x) 96

97 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.4 Aggregation Operators (cont'd) Example 6.29 Find the average net worth of all movie executives. SELECT AVG(netWorth) FROM MovieExec; 97

98 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.4 Aggregation Operators (cont'd) Example 6.30 Count the number of tuples in the StarsIn relation. SELECT COUNT(*) FROM StarsIn; SELECT COUNT(starName) FROM StarsIn; These two statements do the same but you will see the difference in later slides. 98

99 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.5 Grouping We can group the tuples by using GROUP BY clause following the WHERE clause. We can group the tuples by using GROUP BY clause following the WHERE clause. The keywords GROUP BY are followed by a list of grouping attributes. The keywords GROUP BY are followed by a list of grouping attributes. 99

100 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.5 Grouping (cont'd) Example 6.31 Find sum of the movies length each studio is produced. SELECT studioName, SUM(length) AS Total_Length SUM(length) AS Total_Length FROM Movies GROUP BY studioName; 100

101 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.5 Grouping (cont'd) In a SELECT clause that has aggregation, only those attributes that are mentioned in the GROUP BY clause may appear unaggregated. In a SELECT clause that has aggregation, only those attributes that are mentioned in the GROUP BY clause may appear unaggregated. For example, in previous example, if you want to add genre in the SELECT list, then, you must mention it in the GROUP BY list as well. For example, in previous example, if you want to add genre in the SELECT list, then, you must mention it in the GROUP BY list as well. SELECT studioName, genre, SUM(length) AS Total_Length SUM(length) AS Total_Length FROM Movies GROUP BY studioName, genre; 101

102 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.5 Grouping (cont'd) It is possible to use GROUP BY in a more complex queries about several relations. It is possible to use GROUP BY in a more complex queries about several relations. In these cases the following steps are applied: In these cases the following steps are applied: 1. Produce the output relation based on the select-from-where parts. 2. Group the tuples according to the list of attributes mentioned in the GROUP BY list. 3. Apply the aggregation operators 102

103 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.5 Grouping (cont'd) Example 6.32 Create a list of each producer name and the total length of film produced. SELECT name, SUM(length) FROM MovieExec, Movies WHERE producerC# = cert# GROUP BY name; 103

104 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.6 Grouping, Aggregation, and Nulls What would happen to aggregation operators if the attributes have null values? What would happen to aggregation operators if the attributes have null values? There are a few rules to remember There are a few rules to remember 104

105 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.6 Grouping, Aggregation, and Nulls (cont'd) 1. NULL values are ignored when the aggregation operator is applied on an attribute. 2. COUNT(*) counts all tuples of a relation, therefore, it counts the tuples even if the tuple contains NULL value. 3. NULL is treated as an ordinary value when forming groups. 4. When we perform an aggregation, except COUNT, over an empty bag, the result is NULL. The COUNT of an empty bag is 0 105

106 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.6 Grouping, Aggregation, and Nulls (cont'd) Example 6.33 Consider a relation R(A, B) with one tuple, both of whose components are NULL. What's the result of the following SELECT? SELECT A, COUNT(B) FROM R GROUP BY A; The result is (NULL, 0) but why? 106

107 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.6 Grouping, Aggregation, and Nulls (cont'd) What's the result of the following SELECT? SELECT A, COUNT(*) FROM R GROUP BY A; The result is (NULL, 1) because COUNT(*) counts the number of tuples and this relation has one tuple. 107

108 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.6 Grouping, Aggregation, and Nulls (cont'd) What's the result of the following SELECT? SELECT A, SUM(B) FROM R GROUP BY A; The result is (NULL, NULL) because SUM(B) addes one NULL value which is NULL. 108

109 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.7 HAVING Clauses So far, we have learned how to restrict tuples from contributing in the output of a query. So far, we have learned how to restrict tuples from contributing in the output of a query. How about if we don't want to list all groups? How about if we don't want to list all groups? HAVING clause is used to restrict groups. HAVING clause is used to restrict groups. HAVING clause followed by one or more conditions about the group. HAVING clause followed by one or more conditions about the group. 109

110 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.7 HAVING Clauses (cont'd) Example 6.34 Query the total film length for only those producers who made at least one film prior to 1930. SELECT name, SUM(length) FROM MovieExec, Movies WHERE producerC# = cert# GROUP BY name HAVING MIN(year) < 1930; 110

111 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.7 HAVING Clauses (cont'd) The rules we should remember about HAVING: The rules we should remember about HAVING: 1. An aggregation in a HAVING clause applies only to the tuples of the group being tested. 2. Any attribute of relations in the FROM clause may be aggregated in the HAVING clause, but only those attributes that are in the GROUP BY list may appear unaggregated in the HAVING clause (the same rule as for the SELECT clause). 111

112 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.7 HAVING Clauses (cont'd) The order of clauses in SQL queries would be: The order of clauses in SQL queries would be: SELECT SELECT FROM FROM WHERE WHERE GROUP BY GROUP BY HAVING HAVING Only SELECT and FROM are mandatory. Only SELECT and FROM are mandatory. 112

113 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.4.8 Exercises for Section 6.4 113

114 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 DATABASE MODIFICATIONS Section 6.5 114

115 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5 Database Modifications 6.5.1 Insertion 6.5.2 Deletion 6.5.3 Updates 6.5.4 Exercises for Section 6.5 115

116 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.1 Insertion The syntax of INSERT statement: INSERT INTO R(A 1,..., A N ) VALUES (v 1,..., v n ); The syntax of INSERT statement: INSERT INTO R(A 1,..., A N ) VALUES (v 1,..., v n ); If the list of attributes doesn't include all attributes, then it put default values for the missing attributes. If the list of attributes doesn't include all attributes, then it put default values for the missing attributes. 116

117 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.1 Insertion (cont'd) Example 6.35 INSERT INTO StarsIn(MovieTitle, movieYear, starName) VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet'); If we are sure about the order of the attributes, then we can write the statement as follows: INSERT INTO StarsIn VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet'); 117

118 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.1 Insertion (cont'd) The simple insert can insert only one tuple, however, if you want to insert multiple tuples, then you can use the following syntax: The simple insert can insert only one tuple, however, if you want to insert multiple tuples, then you can use the following syntax: INSERT INTO R(A 1,..., A N ) SELECT v 1,..., v n FROM R 1, R 2,..., R N WHERE ; 118

119 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.1 Insertion (cont'd) Example 6.36 Suppose that we want to insert all studio names that are mentioned in the Movies relation but they are not in the Studio yet. INSERT INTO Studio(name) SELECT DISTINCT studioName FROM Movies WHERE studionName NOT IN (SELECT name FROM Studio); FROM Studio); 119

120 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.2 Deletion The syntax of DELETE statement: DELETE FROM R WHERE ; The syntax of DELETE statement: DELETE FROM R WHERE ; Every tuples satisfying the condition will be deleted from the relation R. Every tuples satisfying the condition will be deleted from the relation R. 120

121 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.2 Deletion (cont'd) Example 6.37 DELETE FROM StarsIn WHERE movieTitle = 'The Maltese Falcon' AND movieYear = 1942 AND movieYear = 1942 AND starName = 'Sydney Greenstreet'; starName = 'Sydney Greenstreet'; 121

122 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.2 Deletion (cont'd) Example 6.38 Delete all movie executives whose net worth is less than ten million dollars. DELETE FROM MovieExec WHERE netWorth < 10000000; 122

123 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.3 Updates The syntax of UPDATE statement: UPDATE R SET WHERE ; The syntax of UPDATE statement: UPDATE R SET WHERE ; Every tuples satisfying the condition will be updated from the relation R. Every tuples satisfying the condition will be updated from the relation R. If there are more than one value-assignment, we should separate them with comma. If there are more than one value-assignment, we should separate them with comma. 123

124 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.3 Updates Example 6.39 Attach the title 'Pres.' in front of the name of every movie executive who is the president of a studio. UPDATE MovieExec SET name = 'Pres.' || name WHERE cert# IN (SELECT presC# FROM Studio); 124

125 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.5.4 Exercises for Section 6.5 125

126 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 TRANSACTIONS IN SQL Section 6.6 126

127 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6 Transactions in SQL 6.6.1 Serializability 6.6.2 Atomicity 6.6.3 Transactions 6.6.4 Read-Only Transactions 6.6.5 Dirty Reads 6.6.6 Other Isolation Levels 6.6.7 Exercises for Section 6.6 127

128 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6 Transactions in SQL Up to this point, we assumed that: Up to this point, we assumed that: the SQL operations are done by one user. the SQL operations are done by one user. The operations are done one at a time. The operations are done one at a time. There is no hardware/software failure in middle of a database modification. Therefore, the operations are done atomically. There is no hardware/software failure in middle of a database modification. Therefore, the operations are done atomically. In Real life, situations are totally different. In Real life, situations are totally different. There are millions of users using the same database and it is possible to have some concurrent operations on one tuple. There are millions of users using the same database and it is possible to have some concurrent operations on one tuple. 128

129 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.1 Serializability In applications like web services, banking, or airline reservations, hundreds to thousands operations per second are done on one database. In applications like web services, banking, or airline reservations, hundreds to thousands operations per second are done on one database. It's quite possible to have two or more operations affecting the same, let's say, one bank account. It's quite possible to have two or more operations affecting the same, let's say, one bank account. If these operations overlap in time, then they may act in a strange way. If these operations overlap in time, then they may act in a strange way. Let's take an example. Let's take an example. 129

130 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.1 Serializability (cont'd) Example 6.40 Consider an airline reservation web application. Users can book their desired seat by themselves. The application is using the following schema: Flights(fltNo, fltDae, seatNo, seatStatus) When a user requests the available seats for the flight no 123 on date 2011-12-15, the following query is issued: 130

131 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.1 Serializability (cont'd) SELECT seatNo FROM Flights WHERE fltNo = 123 AND fltDate = DATE '2011-12-25' AND fltDate = DATE '2011-12-25' AND seatStatus = 'available'; seatStatus = 'available'; When the customer clicks on the seat# 22A, the seat status is changed by the following SQL: UPDATE Flights SET seatStatus = 'occupied' WHERE fltNo = 123 AND fltDate = DATE '2011-12-25' AND fltDate = DATE '2011-12-25' AND seatNo = '22A'; seatNo = '22A'; 131

132 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.1 Serializability (cont'd) What would happen if two users at the same time click on the reserve button for the same seat#? What would happen if two users at the same time click on the reserve button for the same seat#? Both see the same seats available and both reserve the same seat. Both see the same seats available and both reserve the same seat. To prevent these happen, SQL has some solutions. To prevent these happen, SQL has some solutions. We group a set of operations that need to be performed together. This is called 'transaction'. We group a set of operations that need to be performed together. This is called 'transaction'. 132

133 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.1 Serializability (cont'd) For example, the query and the update in example 6.40 can be grouped in a transaction. For example, the query and the update in example 6.40 can be grouped in a transaction. SQL allows the programmer to stat that a certain transaction must be serializable with respect to other transactions. SQL allows the programmer to stat that a certain transaction must be serializable with respect to other transactions. That is, these transactions must behave as if they were run serially, one at a time with no overlap. That is, these transactions must behave as if they were run serially, one at a time with no overlap. 133

134 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.2 Atomicity What would happen if a transaction consisting of two operations is in progress and after the first operation is done, the database and/or network crashes? What would happen if a transaction consisting of two operations is in progress and after the first operation is done, the database and/or network crashes? Let's take an example. Let's take an example. 134

135 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.2 Atomicity (cont'd) Example 6.41 Consider a bank's account records system with the following relation: Accounts(acctNo, balance) Let's suppose that $100 is going to transfer from acctNo 123 to acctNo 456. To do this, the following two steps should be done: 1. Add $100 to account# 456 2. Subtract $100 from account# 123. 135

136 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.2 Atomicity (cont'd) The needed SQL statements are as follows: UPDATE Accounts SET balance = balance + 100 WHERE acctNo = 456; UPDATE Accounts SET balance = balance - 100 WHERE acctNo = 123; What would happen if right after the first operation, the database crashes? 136

137 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.2 Atomicity (cont'd) The problem addressed by example 6.41 is that certain combinations of operations need to be done atomically. The problem addressed by example 6.41 is that certain combinations of operations need to be done atomically. That is, either they are both done or neither is done. That is, either they are both done or neither is done. 137

138 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.3 Transactions The solution to the problems of serialization and atomicity is to group database operations into transactions. The solution to the problems of serialization and atomicity is to group database operations into transactions. A transaction is a set of one or more operations on the database that must be executed atomically and in a serializable manner. A transaction is a set of one or more operations on the database that must be executed atomically and in a serializable manner. To create a transation, we use the following SQL command: To create a transation, we use the following SQL command: START TRANSACTION 138

139 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.3 Transactions (cont'd) There are two ways to end a transaction: There are two ways to end a transaction: 1. The SQL receives COMMIT command. 2. The SQL receives ROLLBACK command. COMMIT command causes all changes become permanent in the database. COMMIT command causes all changes become permanent in the database. ROLLBACK command causes all changes undone. ROLLBACK command causes all changes undone. 139

140 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.4 Read-Only Transactions We saw that when a transaction read a data and then want to write something, is prone to serialization problems. We saw that when a transaction read a data and then want to write something, is prone to serialization problems. When a transaction only reads data and does not write data, we have more freedom to let the transaction execute in parallel with other transactions. When a transaction only reads data and does not write data, we have more freedom to let the transaction execute in parallel with other transactions. We call these transactions read-only. We call these transactions read-only. 140

141 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.4 Read-Only Transactions (cont'd) Example 6.43 Suppose we want to read data from the Flights relation of example 6.40 to determine whether a certain seat was available? What's the worst thing that can happen? When we query the availability of a certain seat, that seat was being booked or was being released by the execution of some other program. Then we get the wrong answer. 141

142 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.4 Read-Only Transactions (cont'd) If we tell the SQL that our current transaction is read-only, then SQL allows our transaction be executed with other read-only transactions in parallel. If we tell the SQL that our current transaction is read-only, then SQL allows our transaction be executed with other read-only transactions in parallel. The syntax of SQL command for read-only setting: The syntax of SQL command for read-only setting: SET TRANSACTION READ ONLY; We put this statement before our read-only transaction. We put this statement before our read-only transaction. 142

143 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.4 Read-Only Transactions (cont'd) The syntax of SQL command for read-write setting: The syntax of SQL command for read-write setting: SET TRANSACTION READ WRITE; We put this statement before our read-write transaction. We put this statement before our read-write transaction. This option is the default. This option is the default. 143

144 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.5 Dirty Reads The data that is written but not committed yet is called dirty data. The data that is written but not committed yet is called dirty data. A dirty read is a read of dirty data written by another transaction. A dirty read is a read of dirty data written by another transaction. The risk in reading dirty data is that the transaction that wrote it never commit it. The risk in reading dirty data is that the transaction that wrote it never commit it. 144

145 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.5 Dirty Reads (cont'd) Example 6.44 Consider the account transfer of example 6.41. Here are the steps: 1. Add money to account 2. 2. Test if account 1 has enough money? a. If there is not enough money, remove the money from account 2 and end. b. If there is, subtract the money from account 1 and end. Imagine, there are 3 accounts A1, A2, and A3 with $100, $200, and $300. 145

146 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.5 Dirty Reads (cont'd) Example 6.44 (cont'd) Let's suppose: Transaction T1 transfers $150 from A1 to A2 Transaction T2 transfers $250 from A2 to A3 What would happen if the dirty read is allowed? 146

147 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.5 Dirty Reads (cont'd) The syntax of SQL command for dirty-read setting: The syntax of SQL command for dirty-read setting: SET TRANSACTION READ WRITE ISOLATION LEVEL READ UNCOMMITTED; We put this statement before our read-write transaction. We put this statement before our read-write transaction. This option is the default. This option is the default. 147

148 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels There are four isolation level. There are four isolation level. We have seen the first two before. We have seen the first two before. Serializable (default) Serializable (default) Read-uncommitted Read-uncommitted Read-committed Syntax: SET TRANSACTION ISOLATION LEVEL READ COMMITTED; Read-committed Syntax: SET TRANSACTION ISOLATION LEVEL READ COMMITTED; Repeatable-read Syntax SET TRANSACTION ISOLATION LEVEL READ COMMITTED; Repeatable-read Syntax SET TRANSACTION ISOLATION LEVEL READ COMMITTED; 148

149 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) For each the default is 'READ WRITE' (except the isolation READ UNCOMMITTED that the default is 'READ ONLY') and if you want 'READ ONLY', you should mention it explicitly. For each the default is 'READ WRITE' (except the isolation READ UNCOMMITTED that the default is 'READ ONLY') and if you want 'READ ONLY', you should mention it explicitly. The default isolation level is 'SERIALIZABLE'. The default isolation level is 'SERIALIZABLE'. Note that if a transaction T is acting in 'SERIALIZABLE' level and the other one is acting in 'READ UNCOMMITTED' level, then this transaction can see the dirty data of T. It means that each one acts based on their level. Note that if a transaction T is acting in 'SERIALIZABLE' level and the other one is acting in 'READ UNCOMMITTED' level, then this transaction can see the dirty data of T. It means that each one acts based on their level. 149

150 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) Under READ COMMITTED isolation, it forbids reading the dirty data. Under READ COMMITTED isolation, it forbids reading the dirty data. But it does not guarantee that if we issue several queries, we get the same tuples. But it does not guarantee that if we issue several queries, we get the same tuples. That's because there may be some new committed tuples by other transactions. That's because there may be some new committed tuples by other transactions. 150

151 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) Example 6.46 Let's consider the seat choosing problem under 'READ COMMITTED' isolation. Your query won't see seat as available if another transaction reserved it but not committed yet. You may see different set of seats in subsequent queries depends on if the other transactions commit their reservations or rollback them. 151

152 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) Under REPEATABLE READ isolation, if a tuple is retrieved for the first time, then we are sure that the same tuple will be retrieve if the query is repeated. Under REPEATABLE READ isolation, if a tuple is retrieved for the first time, then we are sure that the same tuple will be retrieve if the query is repeated. But the query may show more tuples because of the phantom tuples. But the query may show more tuples because of the phantom tuples. A phantom tuple is a tuple that is inserted by other transactions. A phantom tuple is a tuple that is inserted by other transactions. 152

153 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) Example 6.47 Let's continue the seat choosing problem under 'REPEATABLE READ' isolation. If a seat is available on the first query, then it will remain available at the subsequent queries. Now suppose that some new tuples are inserted into the flight relation (phantom tuples) for that particular flight for any reason. Then the subsequent queries retrieve the new tuples as well. 153

154 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.6 Other Isolation Levels (cont'd) Properties of SQL isolation levels Properties of SQL isolation levels 154 Isolation Level Dirty Read Non- repeatable Read Phantom Read Uncommitted Read Committed - Repeatable Read -- Serializable---

155 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.6.7 Exercises for Section 6.6 155

156 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.7 Summary of Chapter 6 156

157 Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 6.8 References for Chapter 6 157


Download ppt "Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011 Chapter 6 THE DATABASE LANGUAGE SQL 1."

Similar presentations


Ads by Google