EECS 647: Introduction to Database Systems

Slides:



Advertisements
Similar presentations
SQL: The Query Language Part 2
Advertisements

SQL: The Query Language Part 1 R &G - Chapter 5 Life is just a bowl of queries. -Anon (not Forrest Gump)
Relational Database. Relational database: a set of relations Relation: made up of 2 parts: − Schema : specifies the name of relations, plus name and type.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Programming, Triggers Chapter 5 Modified by Donghui Zhang.
Introduction to Database Systems 1 SQL: The Query Language Relation Model : Topic 4.
1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
1 Lecture 11: Basic SQL, Integrity constraints
SQL: Queries, Constraints, Triggers
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Constraints, Triggers Chapter 5.
SQL: The Query Language Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY courtesy of Joe Hellerstein and etc for some slides.
CS 405G: Introduction to Database Systems
 CS 405G: Introduction to Database Systems Lecture 7: Relational Algebra II Instructor: Chen Qian Spring 2014.
1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
SQL 2 – The Sequel R&G, Chapter 5 Lecture 10. Administrivia Homework 2 assignment now available –Due a week from Sunday Midterm exam will be evening of.
FALL 2004CENG 351 File Structures and Data Management1 SQL: Structured Query Language Chapter 5.
SPRING 2004CENG 3521 The Relational Model Chapter 3.
1 Query Languages: How to build or interrogate a relational database Structured Query Language (SQL)
Rutgers University SQL: Queries, Constraints, Triggers 198:541 Rutgers University.
1 Relational Model. 2 Relational Database: Definitions  Relational database: a set of relations  Relation: made up of 2 parts: – Instance : a table,
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Constraints, Triggers Chapter 5.
1 SQL: Structured Query Language Chapter 5. 2 SQL and Relational Calculus relationalcalculusAlthough relational algebra is useful in the analysis of query.
CSC343 – Introduction to Databases - A. Vaisman1 SQL: Queries, Programming, Triggers.
The Relational Model These slides are based on the slides of your text book.
The Relational Model. Review Why use a DBMS? OS provides RAM and disk.
CSC 411/511: DBMS Design Dr. Nan WangCSC411_L6_SQL(1) 1 SQL: Queries, Constraints, Triggers Chapter 5 – Part 1.
Instructor: Jinze Liu Fall Basic Components (2) Relational Database Web-Interface Done before mid-term Must-Have Components (2) Security: access.
FALL 2004CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Constraints, Triggers Chapter 5.
SQL: Queries, Programming, Triggers. Example Instances We will use these instances of the Sailors and Reserves relations in our examples. If the key for.
ICS 321 Fall 2009 SQL: Queries, Constraints, Triggers Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 9/8/20091Lipyeow.
 CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
CMPT 258 Database Systems SQL Queries (Chapter 5).
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
CS34311 The Relational Model. cs34312 Why Relational Model? Currently the most widely used Vendors: Oracle, Microsoft, IBM Older models still used IBM’s.
1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2007.
CS 405G: Introduction to Database Systems SQL III.
SQL: The Query Language Part 1 R &G - Chapter 5 The important thing is not to stop questioning. Albert Einstein.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu.
1 SQL: The Query Language. 2 Example Instances R1 S1 S2 v We will use these instances of the Sailors and Reserves relations in our examples. v If the.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2007.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
1 CS122A: Introduction to Data Management Lecture #4 (E-R  Relational Translation) Instructor: Chen Li.
CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
SQL – Part 2.
COP Introduction to Database Structures
COP Introduction to Database Structures
01/31/11 SQL Examples Edited by John Shieh CS3754 Classnote #10.
SQL The Query Language R & G - Chapter 5
Basic SQL Lecture 6 Fall
CS 405G: Introduction to Database Systems
Translation of ER-diagram into Relational Schema
Lecture#7: Fun with SQL (Part 2)
CS 405G: Introduction to Database Systems
Database Applications (15-415) SQL-Part II Lecture 9, February 04, 2018 Mohammad Hammoud.
The Relational Model Content based on Chapter 3
CS 405G: Introduction to Database Systems
The Relational Model Relational Data Model
CS 405G: Introduction to Database Systems
SQL: Queries, Constraints, Triggers
SQL: The Query Language Part 1
SQL: Structured Query Language
SQL: Queries, Programming, Triggers
CS 405G: Introduction to Database Systems
The Relational Model Content based on Chapter 3
CS 405G: Introduction to Database Systems
Recap – Relational languages
CS 405G: Introduction to Database Systems
Presentation transcript:

EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2007

Luke Huan Univ. of Kansas Administrative Final project is posted at the class website. Go through it and we will discuss the final project at next class meeting 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Review SFW statement: Sort the outputs: USE bag operation: Enforce set operation: Output average: Partition rows in a table: Select certain groups: ORDER BY Default choice DISTINCT AVG (aggregation function) GROUP BY HAVING 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Review Nested queries Treat as a table Treat as a scalar IN EXIST 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Today’s Topic NULL value Outer join Nested queries Table creation and constraints 9/19/2018 Luke Huan Univ. of Kansas

Incomplete information Example: Student (SID, name, age, GPA) Value unknown We do not know Nelson’s age Value not applicable Nelson has not taken any classes yet; what is his GPA? 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Solution 1 A dedicated special value for each domain (type) GPA cannot be –1, so use –1 as a special value to indicate a missing or invalid GPA Leads to incorrect answers if not careful SELECT AVG(GPA) FROM Student; Complicates applications SELECT AVG(GPA) FROM Student WHERE GPA <> -1; Remember the Y2K bug? “00” was used as a missing or invalid year value 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Solution 2 A valid-bit for every column Student (SID, name, name_is_valid, age, age_is_valid, GPA, GPA_is_valid) Complicates schema and queries SELECT AVG(GPA) FROM Student WHERE GPA_is_valid; 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Solution 3? Decompose the table; missing row = missing value StudentName (SID, name) StudentAge (SID, age) StudentGPA (SID, GPA) StudentID (SID) Conceptually the cleanest solution Still complicates schema and queries How to get all information about a student in a table? Would natural join work? 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas SQL’s solution A special value NULL For every domain Special rules for dealing with NULL’s Example: Student (SID, name, age, GPA) h 789, “Nelson”, NULL, NULL i 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Three-valued logic TRUE = 1, FALSE = 0, UNKNOWN = 0.5 x AND y = min(x, y) x OR y = max(x, y) NOT x = 1 – x WHERE and HAVING clauses only select rows for output if the condition evaluates to TRUE UNKNOWN is not enough AND True False NULL UNK OR True False NULL TRUE UNK AND OR 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Computing with NULL’s (Arithmetic operation) when we operate on a NULL and another value (including another NULL) using +, –, etc., the result is NULL Aggregate functions ignore NULL, except COUNT(*) (since it counts rows) When we compare a NULL with another value (including another NULL) using =, >, etc., the result is UNKNOWN 9/19/2018 Luke Huan Univ. of Kansas

Unfortunate consequences SELECT AVG(GPA) FROM Student 3.4 SELECT SUM(GPA)/COUNT(*) FROM Student; 2.72 SELECT * FROM Student; SELECT * FROM Student WHERE GPA = GPA Not equivalent sid name age gpa 1234 John Smith 21 3.5 1123 Mary Carter 19 3.8 1011 Bob Lee 22 NULL 1204 Susan Wong 3.4 1306 Kevin Kim 18 2.9 WHY DO YOU CARE WHEN EQUIVALENCES ARE BROKEN? Q.O. is harder. 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Another problem Example: Who has NULL GPA values? SELECT * FROM Student WHERE GPA = NULL; Does not work; never returns anything (SELECT * FROM Student) EXCEPT ALL (SELECT * FROM Student WHERE GPA = GPA) Works, but ugly Introduced built-in predicates IS NULL and IS NOT NULL SELECT * FROM Student WHERE GPA IS NULL; 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Outerjoin motivation Example: a master class list SELECT c.CID, s.SID FROM Enroll e, Student s, Course c WHERE e.SID = s.SID and c.CID = e.CID; What if a student take no classes For these students, CID column should be NULL What if a course with no student enrolled yet? For these courses, SID should be NULL Very useful in information integration as well 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Outerjoin SELECT * FROM R FULL OUTER JOIN S ON p; A full outer join between R and S (denoted R ! S) includes all rows in the result of R !pS, plus “Dangling” R rows (those that do not join with any S rows) padded with NULL’s for S’s columns “Dangling” S rows (those that do not join with any R rows) padded with NULL’s for R’s columns 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Outerjoin (II) SELECT * FROM R LEFT OUTER JOIN S ON p;; SELECT * FROM R RIGHT OUTER JOIN S ON p; A left outer join (R ! S) includes rows in R ! S plus dangling R rows padded with NULL’s A right outer join (R ! S) includes rows in R ! S plus dangling S rows padded with NULL’s 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Outerjoin examples SELECT * FROM Employee LEFT OUTER JOIN Department ON Eid = Mid Employee Eid Name 1123 John Smith 1234 Mary Carter 1311 Bob Lee Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance 1311 Bob Lee NULL SELECT * FROM Employee FULL OUTER JOIN Department ON Eid = Mid SELECT * FROM Employee RIGHT OUTER JOIN Department ON Eid = Mid Department Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance 1311 Bob Lee NULL 6 1312 HR Eid Name Did Mid Dname 1123 John Smith 4 Research 1234 Mary Carter 5 Finance NULL 6 1312 HR Did Mid Dname 4 1123 Research 5 1234 Finance 6 1312 HR 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Summary Query SELECT-FROM-WHERE statements Ordering Set and bag operations Aggregation and grouping Table expressions, subqueries NULL Outerjoins 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Exercise Goal: Get the Name of one’s supervisor EMPLOYEE (Eid, Mid, Name) SQL Statement: SELECT EID, Name FROM EMPLOYEE WHERE MId =Eid; SELECT E1.EID, E2.Name FROM EMPLOYEE E1, EMPLOYEE E2 WHERE E1.MId = E2.EId; Eid Mid Name 1123 1234 John Smith 1311 Mary Carter 1611 Bob Lee 1455 Lisa Wang Jack Snow Eid Name 1123 Mary Carter 1234 Bob Lee 1311 Jack Snow 1455 1611 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Exercise Goal: group employees according to their department, for each department, list EIds of its employees and list the head count EMPLOYEE (Eid, Name), DEPARTMENT(Eid, Did) SQL Statement: SELECT Did, Eid, COUNT(Eid) FROM EMPLOYEE e, DEPARTMENT d WHERE e.Eid = d.Eid GROUP BY Did There is something wrong! Wishful thinking (list a group of Eid after grouping) won’t work Solution: produce (1) a summary table listing Did and head count, and (2) sort Department according to Did. 9/19/2018 Luke Huan Univ. of Kansas

More on Nested Queries: Scoping To find out which table a column belongs to Start with the immediately surrounding query If not found, look in the one surrounding that; repeat if necessary Use table_name.column_name notation and AS (renaming) to avoid confusion 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas An Example SELECT * FROM Student s WHERE EXISTS (SELECT * FROM Enroll e WHERE SID = s.SID AND EXISTS (SELECT * FROM Enroll WHERE SID = s.SID AND CID <> e.CID)); Students who are taking at least two courses 9/19/2018 Luke Huan Univ. of Kansas

Quantified subqueries A quantified subquery can be used as a value in a WHERE condition Universal quantification (for all): … WHERE x op ALL (subquery) … True iff for all t in the result of subquery, x op t Existential quantification (exists): … WHERE x op ANY (subquery) … True iff there exists some t in the result of subquery such that x op t Beware In common parlance, “any” and “all” seem to be synonyms In SQL, ANY really means “some” 9/19/2018 Luke Huan Univ. of Kansas

Examples of quantified subqueries Which employees have the highest salary? Employee (Sid, Name, Salary) SELECT * FROM Employee WHERE Salary >= ALL (SELECT Salary FROM Employee); How about the lowest salary? SELECT * FROM Employee WHERE Salary <= ALL (SELECT salary FROM Employee); 9/19/2018 Luke Huan Univ. of Kansas

More ways of getting the highest Salary Who has the highest Salary? SELECT * FROM Employee e WHERE NOT EXISTS (SELECT * FROM Employee WHERE Salary > e.Salary); SELECT * FROM Employee WHERE Eid NOT IN (SELECT e1.SID FROM Employee e1, Employee e2 WHERE e1.Salary < e2.Salary); 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Nested Queries Nested queries do not add expression power to SQL For convenient Write intuitive SQL queries Can always use SQL queries without nesting to complete the same task (though sometime it is hard) 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas More Exercise Sailors (sid: INTEGER, sname: string, rating: INTEGER, age: REAL) Boats (bid: INTEGER, bname: string, color: string) Reserves (sid: INTEGER, bid: INTEGER) Sailors Boats sid sname rating age 1 Fred 7 22 2 Jim 39 3 Nancy 8 27 bid bname color 101 Nina red 102 Pinta blue 103 Santa Maria sid bid 1 102 2 Reserves 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Exercise I Find sid’s of sailors who’ve reserved a red AND a green boat SELECT R1.sid FROM Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE B1.color=‘red’ AND B2.color=‘green’ AND R1.bid=B1.bid AND R2.bid=B2.bid AND R1.sid=R2.sid SELECT sid FROM Boats, Reserves Where B.color = ‘red’ INTERSECT (SELECT sid Where B.color = ‘green’) 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Exercise II Find sid’s of sailors who have not reserved a boat SELECT sid FROM Sailors EXCEPT (SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid=R.sid) Non-monotonic operation! 9/19/2018 Luke Huan Univ. of Kansas

Exercise III (a tough one) Find sailors who’ve reserved all boats SELECT Sid FROM Sailor EXCEPT FROM (SELECT bid, sid FROM Boat, Sailor Reserves) Non-monotonic operation! #All sailors #Those who do not reserve all boats #All possible combinations between sid and bid #Existing reservations 9/19/2018 Luke Huan Univ. of Kansas

Summary of SQL features covered so far SELECT-FROM-WHERE statements Set and bag operations Ordering Aggregation and grouping Table expressions, subqueries NULL’s and outerjoins Next: data modification statements, constraints 9/19/2018 Luke Huan Univ. of Kansas

Relational Query Languages Two sublanguages: DDL – Data Definition Language Define and modify schema DML – Data Manipulation Language Queries can be written intuitively. Insert, delete, update rows in a table DBMS is responsible for efficient evaluation. The key: precise semantics for relational queries. Optimizer can re-order operations, without affecting query answer. Choices driven by “cost model”

Constraints in Table Declaration Restrictions on allowable data in a database In addition to the simple structure and type restrictions imposed by the table definitions Declared as part of the schema Enforced by the DBMS Why use constraints? Protect data integrity (catch errors) Tell the DBMS about the data (so it can optimize better) 9/19/2018 Luke Huan Univ. of Kansas

Types of SQL constraints Key constraint NOT NULL Referential integrity (foreign key) 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Key declaration At most one PRIMARY KEY per table Typically implies a primary index Rows are stored inside the index, typically sorted by the primary key value ) best speedup for queries Any number of UNIQUE keys per table Typically implies a secondary index To speedup queries 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas NOT NULL Indicates that the value of an attribute can not be NULL 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas Examples CREATE TABLE Student (SID INTEGER NOT NULL PRIMARY KEY, name VARCHAR(30) NOT NULL, email VARCHAR(30) UNIQUE, age INTEGER, GPA FLOAT); CREATE TABLE Course (CID CHAR(10) NOT NULL PRIMARY KEY, title VARCHAR(100) NOT NULL); CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID)); Used for query processing This form is required for multi-attribute keys 9/19/2018 Luke Huan Univ. of Kansas

Referential integrity example Enroll.SID references Student.SID If an SID appears in Enroll, it must appear in Student Enroll.CID references Course.CID If a CID appears in Enroll, it must appear in Course That is, no “dangling pointers” How do you translate a relationship set in E/R? Referential integrity! 9/19/2018 Luke Huan Univ. of Kansas

Referential integrity in SQL Referenced column(s) must be PRIMARY KEY Referencing column(s) form a FOREIGN KEY Example CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID), FOREIGN KEY CID REFERENCES Course(CID) FOREIGN KEY SID REFERENCES Student(SID)); 9/19/2018 Luke Huan Univ. of Kansas

Enforcing referential integrity Example: Enroll.SID references Student.SID Insert or update an Enroll row so it refers to a non-existent SID Reject Delete or update a Student row whose SID is referenced by some Enroll row Cascade: ripple changes to all referring rows Set NULL: set all references to NULL Set DEFAULT: use the default value of the attribute All four options can be specified in SQL 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas SQL Example CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID), FOREIGN KEY CID REFERENCES Course(CID) CASCADE ON DELETE FOREIGN KEY SID REFERENCES Student(SID) CASCADE ON DELETE); Delete all student enrollments for a canceled class Delete all student enrollments for a dropped student 9/19/2018 Luke Huan Univ. of Kansas

Database Modification Operations INSERT DELETE UPDATE 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas INSERT Insert one row INSERT INTO Enroll VALUES (456, ’EECS647’); Student 456 takes EECS647 Insert the result of a query INSERT INTO Enroll (SELECT SID, ’EECS647’ FROM Student WHERE SID NOT IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’)); Force everybody to take EECS647! 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas DELETE Delete everything DELETE FROM Enroll; Delete according to a WHERE condition Example: Student 456 drops CPS116 DELETE FROM Enroll WHERE SID = 456 AND CID = ’EECS647’; Example: Drop students from all EECS classes with GPA lower than 1.0 DELETE FROM Enroll WHERE SID IN (SELECT SID FROM Student WHERE GPA < 1.0) AND CID LIKE ’EECS%’; 9/19/2018 Luke Huan Univ. of Kansas

Luke Huan Univ. of Kansas UPDATE Example: Student 142 changes name to “Barney” UPDATE Student SET name = ’Barney’ WHERE SID = 142; 9/19/2018 Luke Huan Univ. of Kansas

Summary of SQL features covered so far Query SELECT-FROM-WHERE statements Set and bag operations Table expressions, subqueries Aggregation and grouping Ordering Outerjoins Constraints Modification INSERT/DELETE/UPDATE Next: triggers, views, indexes 9/19/2018 Luke Huan Univ. of Kansas