Presentation is loading. Please wait.

Presentation is loading. Please wait.

EECS 647: Introduction to Database Systems

Similar presentations


Presentation on theme: "EECS 647: Introduction to Database Systems"— Presentation transcript:

1 EECS 647: Introduction to Database Systems
Instructor: Luke Huan Spring 2007

2 Luke Huan Univ. of Kansas
Administrative Final project is posted at the class website. Go through it and we will discuss the final project at next class meeting 9/19/2018 Luke Huan Univ. of Kansas

3 Luke Huan Univ. of Kansas
Review SFW statement: Sort the outputs: USE bag operation: Enforce set operation: Output average: Partition rows in a table: Select certain groups: ORDER BY Default choice DISTINCT AVG (aggregation function) GROUP BY HAVING 9/19/2018 Luke Huan Univ. of Kansas

4 Luke Huan Univ. of Kansas
Review Nested queries Treat as a table Treat as a scalar IN EXIST 9/19/2018 Luke Huan Univ. of Kansas

5 Luke Huan Univ. of Kansas
Today’s Topic NULL value Outer join Nested queries Table creation and constraints 9/19/2018 Luke Huan Univ. of Kansas

6 Incomplete information
Example: Student (SID, name, age, GPA) Value unknown We do not know Nelson’s age Value not applicable Nelson has not taken any classes yet; what is his GPA? 9/19/2018 Luke Huan Univ. of Kansas

7 Luke Huan Univ. of Kansas
Solution 1 A dedicated special value for each domain (type) GPA cannot be –1, so use –1 as a special value to indicate a missing or invalid GPA Leads to incorrect answers if not careful SELECT AVG(GPA) FROM Student; Complicates applications SELECT AVG(GPA) FROM Student WHERE GPA <> -1; Remember the Y2K bug? “00” was used as a missing or invalid year value 9/19/2018 Luke Huan Univ. of Kansas

8 Luke Huan Univ. of Kansas
Solution 2 A valid-bit for every column Student (SID, name, name_is_valid, age, age_is_valid, GPA, GPA_is_valid) Complicates schema and queries SELECT AVG(GPA) FROM Student WHERE GPA_is_valid; 9/19/2018 Luke Huan Univ. of Kansas

9 Luke Huan Univ. of Kansas
Solution 3? Decompose the table; missing row = missing value StudentName (SID, name) StudentAge (SID, age) StudentGPA (SID, GPA) StudentID (SID) Conceptually the cleanest solution Still complicates schema and queries How to get all information about a student in a table? Would natural join work? 9/19/2018 Luke Huan Univ. of Kansas

10 Luke Huan Univ. of Kansas
SQL’s solution A special value NULL For every domain Special rules for dealing with NULL’s Example: Student (SID, name, age, GPA) h 789, “Nelson”, NULL, NULL i 9/19/2018 Luke Huan Univ. of Kansas

11 Luke Huan Univ. of Kansas
Three-valued logic TRUE = 1, FALSE = 0, UNKNOWN = 0.5 x AND y = min(x, y) x OR y = max(x, y) NOT x = 1 – x WHERE and HAVING clauses only select rows for output if the condition evaluates to TRUE UNKNOWN is not enough AND True False NULL UNK OR True False NULL TRUE UNK AND OR 9/19/2018 Luke Huan Univ. of Kansas

12 Luke Huan Univ. of Kansas
Computing with NULL’s (Arithmetic operation) when we operate on a NULL and another value (including another NULL) using +, –, etc., the result is NULL Aggregate functions ignore NULL, except COUNT(*) (since it counts rows) When we compare a NULL with another value (including another NULL) using =, >, etc., the result is UNKNOWN 9/19/2018 Luke Huan Univ. of Kansas

13 Unfortunate consequences
SELECT AVG(GPA) FROM Student 3.4 SELECT SUM(GPA)/COUNT(*) FROM Student; 2.72 SELECT * FROM Student; SELECT * FROM Student WHERE GPA = GPA Not equivalent sid name age gpa 1234 John Smith 21 3.5 1123 Mary Carter 19 3.8 1011 Bob Lee 22 NULL 1204 Susan Wong 3.4 1306 Kevin Kim 18 2.9 WHY DO YOU CARE WHEN EQUIVALENCES ARE BROKEN? Q.O. is harder. 9/19/2018 Luke Huan Univ. of Kansas

14 Luke Huan Univ. of Kansas
Another problem Example: Who has NULL GPA values? SELECT * FROM Student WHERE GPA = NULL; Does not work; never returns anything (SELECT * FROM Student) EXCEPT ALL (SELECT * FROM Student WHERE GPA = GPA) Works, but ugly Introduced built-in predicates IS NULL and IS NOT NULL SELECT * FROM Student WHERE GPA IS NULL; 9/19/2018 Luke Huan Univ. of Kansas

15 Luke Huan Univ. of Kansas
Outerjoin motivation Example: a master class list SELECT c.CID, s.SID FROM Enroll e, Student s, Course c WHERE e.SID = s.SID and c.CID = e.CID; What if a student take no classes For these students, CID column should be NULL What if a course with no student enrolled yet? For these courses, SID should be NULL Very useful in information integration as well 9/19/2018 Luke Huan Univ. of Kansas

16 Luke Huan Univ. of Kansas
Outerjoin SELECT * FROM R FULL OUTER JOIN S ON p; A full outer join between R and S (denoted R ! S) includes all rows in the result of R !pS, plus “Dangling” R rows (those that do not join with any S rows) padded with NULL’s for S’s columns “Dangling” S rows (those that do not join with any R rows) padded with NULL’s for R’s columns 9/19/2018 Luke Huan Univ. of Kansas

17 Luke Huan Univ. of Kansas
Outerjoin (II) SELECT * FROM R LEFT OUTER JOIN S ON p;; SELECT * FROM R RIGHT OUTER JOIN S ON p; A left outer join (R ! S) includes rows in R ! S plus dangling R rows padded with NULL’s A right outer join (R ! S) includes rows in R ! S plus dangling S rows padded with NULL’s 9/19/2018 Luke Huan Univ. of Kansas

18 Luke Huan Univ. of Kansas
Outerjoin examples SELECT * FROM Employee LEFT OUTER JOIN Department ON Eid = Mid Employee Eid Name 1123 John Smith 1234 Mary Carter 1311 Bob Lee Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance 1311 Bob Lee NULL SELECT * FROM Employee FULL OUTER JOIN Department ON Eid = Mid SELECT * FROM Employee RIGHT OUTER JOIN Department ON Eid = Mid Department Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance 1311 Bob Lee NULL 6 1312 HR Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance NULL 6 1312 HR Did Mid Dname 4 1123 Research 5 1234 Finance 6 1312 HR 9/19/2018 Luke Huan Univ. of Kansas

19 Luke Huan Univ. of Kansas
Summary Query SELECT-FROM-WHERE statements Ordering Set and bag operations Aggregation and grouping Table expressions, subqueries NULL Outerjoins 9/19/2018 Luke Huan Univ. of Kansas

20 Luke Huan Univ. of Kansas
Exercise Goal: Get the Name of one’s supervisor EMPLOYEE (Eid, Mid, Name) SQL Statement: SELECT EID, Name FROM EMPLOYEE WHERE MId =Eid; SELECT E1.EID, E2.Name FROM EMPLOYEE E1, EMPLOYEE E2 WHERE E1.MId = E2.EId; Eid Mid Name 1123 1234 John Smith 1311 Mary Carter 1611 Bob Lee 1455 Lisa Wang Jack Snow Eid Name 1123 Mary Carter 1234 Bob Lee 1311 Jack Snow 1455 1611 9/19/2018 Luke Huan Univ. of Kansas

21 Luke Huan Univ. of Kansas
Exercise Goal: group employees according to their department, for each department, list EIds of its employees and list the head count EMPLOYEE (Eid, Name), DEPARTMENT(Eid, Did) SQL Statement: SELECT Did, Eid, COUNT(Eid) FROM EMPLOYEE e, DEPARTMENT d WHERE e.Eid = d.Eid GROUP BY Did There is something wrong! Wishful thinking (list a group of Eid after grouping) won’t work Solution: produce (1) a summary table listing Did and head count, and (2) sort Department according to Did. 9/19/2018 Luke Huan Univ. of Kansas

22 More on Nested Queries: Scoping
To find out which table a column belongs to Start with the immediately surrounding query If not found, look in the one surrounding that; repeat if necessary Use table_name.column_name notation and AS (renaming) to avoid confusion 9/19/2018 Luke Huan Univ. of Kansas

23 Luke Huan Univ. of Kansas
An Example SELECT * FROM Student s WHERE EXISTS (SELECT * FROM Enroll e WHERE SID = s.SID AND EXISTS (SELECT * FROM Enroll WHERE SID = s.SID AND CID <> e.CID)); Students who are taking at least two courses 9/19/2018 Luke Huan Univ. of Kansas

24 Quantified subqueries
A quantified subquery can be used as a value in a WHERE condition Universal quantification (for all): … WHERE x op ALL (subquery) … True iff for all t in the result of subquery, x op t Existential quantification (exists): … WHERE x op ANY (subquery) … True iff there exists some t in the result of subquery such that x op t Beware In common parlance, “any” and “all” seem to be synonyms In SQL, ANY really means “some” 9/19/2018 Luke Huan Univ. of Kansas

25 Examples of quantified subqueries
Which employees have the highest salary? Employee (Sid, Name, Salary) SELECT * FROM Employee WHERE Salary >= ALL (SELECT Salary FROM Employee); How about the lowest salary? SELECT * FROM Employee WHERE Salary <= ALL (SELECT salary FROM Employee); 9/19/2018 Luke Huan Univ. of Kansas

26 More ways of getting the highest Salary
Who has the highest Salary? SELECT * FROM Employee e WHERE NOT EXISTS (SELECT * FROM Employee WHERE Salary > e.Salary); SELECT * FROM Employee WHERE Eid NOT IN (SELECT e1.SID FROM Employee e1, Employee e2 WHERE e1.Salary < e2.Salary); 9/19/2018 Luke Huan Univ. of Kansas

27 Luke Huan Univ. of Kansas
Nested Queries Nested queries do not add expression power to SQL For convenient Write intuitive SQL queries Can always use SQL queries without nesting to complete the same task (though sometime it is hard) 9/19/2018 Luke Huan Univ. of Kansas

28 Luke Huan Univ. of Kansas
More Exercise Sailors (sid: INTEGER, sname: string, rating: INTEGER, age: REAL) Boats (bid: INTEGER, bname: string, color: string) Reserves (sid: INTEGER, bid: INTEGER) Sailors Boats sid sname rating age 1 Fred 7 22 2 Jim 39 3 Nancy 8 27 bid bname color 101 Nina red 102 Pinta blue 103 Santa Maria sid bid 1 102 2 Reserves 9/19/2018 Luke Huan Univ. of Kansas

29 Luke Huan Univ. of Kansas
Exercise I Find sid’s of sailors who’ve reserved a red AND a green boat SELECT R1.sid FROM Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE B1.color=‘red’ AND B2.color=‘green’ AND R1.bid=B1.bid AND R2.bid=B2.bid AND R1.sid=R2.sid SELECT sid FROM Boats, Reserves Where B.color = ‘red’ INTERSECT (SELECT sid Where B.color = ‘green’) 9/19/2018 Luke Huan Univ. of Kansas

30 Luke Huan Univ. of Kansas
Exercise II Find sid’s of sailors who have not reserved a boat SELECT sid FROM Sailors EXCEPT (SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid=R.sid) Non-monotonic operation! 9/19/2018 Luke Huan Univ. of Kansas

31 Exercise III (a tough one)
Find sailors who’ve reserved all boats SELECT Sid FROM Sailor EXCEPT FROM (SELECT bid, sid FROM Boat, Sailor Reserves) Non-monotonic operation! #All sailors #Those who do not reserve all boats #All possible combinations between sid and bid #Existing reservations 9/19/2018 Luke Huan Univ. of Kansas

32 Summary of SQL features covered so far
SELECT-FROM-WHERE statements Set and bag operations Ordering Aggregation and grouping Table expressions, subqueries NULL’s and outerjoins Next: data modification statements, constraints 9/19/2018 Luke Huan Univ. of Kansas

33 Relational Query Languages
Two sublanguages: DDL – Data Definition Language Define and modify schema DML – Data Manipulation Language Queries can be written intuitively. Insert, delete, update rows in a table DBMS is responsible for efficient evaluation. The key: precise semantics for relational queries. Optimizer can re-order operations, without affecting query answer. Choices driven by “cost model”

34 Constraints in Table Declaration
Restrictions on allowable data in a database In addition to the simple structure and type restrictions imposed by the table definitions Declared as part of the schema Enforced by the DBMS Why use constraints? Protect data integrity (catch errors) Tell the DBMS about the data (so it can optimize better) 9/19/2018 Luke Huan Univ. of Kansas

35 Types of SQL constraints
Key constraint NOT NULL Referential integrity (foreign key) 9/19/2018 Luke Huan Univ. of Kansas

36 Luke Huan Univ. of Kansas
Key declaration At most one PRIMARY KEY per table Typically implies a primary index Rows are stored inside the index, typically sorted by the primary key value ) best speedup for queries Any number of UNIQUE keys per table Typically implies a secondary index To speedup queries 9/19/2018 Luke Huan Univ. of Kansas

37 Luke Huan Univ. of Kansas
NOT NULL Indicates that the value of an attribute can not be NULL 9/19/2018 Luke Huan Univ. of Kansas

38 Luke Huan Univ. of Kansas
Examples CREATE TABLE Student (SID INTEGER NOT NULL PRIMARY KEY, name VARCHAR(30) NOT NULL, VARCHAR(30) UNIQUE, age INTEGER, GPA FLOAT); CREATE TABLE Course (CID CHAR(10) NOT NULL PRIMARY KEY, title VARCHAR(100) NOT NULL); CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID)); Used for query processing This form is required for multi-attribute keys 9/19/2018 Luke Huan Univ. of Kansas

39 Referential integrity example
Enroll.SID references Student.SID If an SID appears in Enroll, it must appear in Student Enroll.CID references Course.CID If a CID appears in Enroll, it must appear in Course That is, no “dangling pointers” How do you translate a relationship set in E/R? Referential integrity! 9/19/2018 Luke Huan Univ. of Kansas

40 Referential integrity in SQL
Referenced column(s) must be PRIMARY KEY Referencing column(s) form a FOREIGN KEY Example CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID), FOREIGN KEY CID REFERENCES Course(CID) FOREIGN KEY SID REFERENCES Student(SID)); 9/19/2018 Luke Huan Univ. of Kansas

41 Enforcing referential integrity
Example: Enroll.SID references Student.SID Insert or update an Enroll row so it refers to a non-existent SID Reject Delete or update a Student row whose SID is referenced by some Enroll row Cascade: ripple changes to all referring rows Set NULL: set all references to NULL Set DEFAULT: use the default value of the attribute All four options can be specified in SQL 9/19/2018 Luke Huan Univ. of Kansas

42 Luke Huan Univ. of Kansas
SQL Example CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID), FOREIGN KEY CID REFERENCES Course(CID) CASCADE ON DELETE FOREIGN KEY SID REFERENCES Student(SID) CASCADE ON DELETE); Delete all student enrollments for a canceled class Delete all student enrollments for a dropped student 9/19/2018 Luke Huan Univ. of Kansas

43 Database Modification Operations
INSERT DELETE UPDATE 9/19/2018 Luke Huan Univ. of Kansas

44 Luke Huan Univ. of Kansas
INSERT Insert one row INSERT INTO Enroll VALUES (456, ’EECS647’); Student 456 takes EECS647 Insert the result of a query INSERT INTO Enroll (SELECT SID, ’EECS647’ FROM Student WHERE SID NOT IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’)); Force everybody to take EECS647! 9/19/2018 Luke Huan Univ. of Kansas

45 Luke Huan Univ. of Kansas
DELETE Delete everything DELETE FROM Enroll; Delete according to a WHERE condition Example: Student 456 drops CPS116 DELETE FROM Enroll WHERE SID = 456 AND CID = ’EECS647’; Example: Drop students from all EECS classes with GPA lower than 1.0 DELETE FROM Enroll WHERE SID IN (SELECT SID FROM Student WHERE GPA < 1.0) AND CID LIKE ’EECS%’; 9/19/2018 Luke Huan Univ. of Kansas

46 Luke Huan Univ. of Kansas
UPDATE Example: Student 142 changes name to “Barney” UPDATE Student SET name = ’Barney’ WHERE SID = 142; 9/19/2018 Luke Huan Univ. of Kansas

47 Summary of SQL features covered so far
Query SELECT-FROM-WHERE statements Set and bag operations Table expressions, subqueries Aggregation and grouping Ordering Outerjoins Constraints Modification INSERT/DELETE/UPDATE Next: triggers, views, indexes 9/19/2018 Luke Huan Univ. of Kansas


Download ppt "EECS 647: Introduction to Database Systems"

Similar presentations


Ads by Google