CS 405G: Introduction to Database Systems Instructor: Jinze Liu.

Slides:



Advertisements
Similar presentations
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Programming, Triggers Chapter 5 Modified by Donghui Zhang.
Advertisements

1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Programming, Triggers Chapter 5.
CS 405G: Introduction to Database Systems
SQL Review.
Subqueries Example Find the name of the producer of ‘Star Wars’.
 CS 405G: Introduction to Database Systems Lecture 7: Relational Algebra II Instructor: Chen Qian Spring 2014.
1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
FALL 2004CENG 351 File Structures and Data Management1 SQL: Structured Query Language Chapter 5.
Nov 24, 2003Murali Mani SQL B term 2004: lecture 12.
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Constraints, Triggers Chapter 5.
Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example.
CS405G: Introduction to Database Systems Final Review.
SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”
A Guide to SQL, Seventh Edition. Objectives Retrieve data from a database using SQL commands Use compound conditions Use computed columns Use the SQL.
SQL Operations Aggregate Functions Having Clause Database Access Layer A2 Teacher Up skilling LECTURE 5.
©Silberschatz, Korth and Sudarshan4.1Database System Concepts Chapter 4: SQL Basic Structure Set Operations Aggregate Functions Null Values Nested Subqueries.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 3: Introduction.
1 IT420: Database Management and Organization SQL - Data Manipulation Language 27 January 2006 Adina Crăiniceanu
Chapter 3 Single-Table Queries
1 CS 430 Database Theory Winter 2005 Lecture 12: SQL DML - SELECT.
SQL: Data Manipulation Presented by Mary Choi For CS157B Dr. Sin Min Lee.
1 Single Table Queries. 2 Objectives  SELECT, WHERE  AND / OR / NOT conditions  Computed columns  LIKE, IN, BETWEEN operators  ORDER BY, GROUP BY,
1 TAC2000/ Protocol Engineering and Application Research Laboratory (PEARL) Structured Query Language Introduction to SQL Structured Query Language.
Instructor: Jinze Liu Fall Basic Components (2) Relational Database Web-Interface Done before mid-term Must-Have Components (2) Security: access.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 3: Introduction.
MIS 3053 Database Design & Applications The University of Tulsa Professor: Akhilesh Bajaj RM/SQL Lecture 5 © Akhilesh Bajaj, 2000, 2002, 2003, All.
Chapter 8: SQL. Data Definition Modification of the Database Basic Query Structure Aggregate Functions.
Database Management COP4540, SCS, FIU Structured Query Language (Chapter 8)
DATA RETRIEVAL WITH SQL Goal: To issue a database query using the SELECT command.
1 CSCE Database Systems Anxiao (Andrew) Jiang The Database Language SQL.
 CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
CMPT 258 Database Systems The Relationship Model (Chapter 3)
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 The Relational Algebra and Relational Calculus.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
CS 405G: Introduction to Database Systems Relational Algebra.
A Guide to SQL, Eighth Edition Chapter Four Single-Table Queries.
1 SQL: Structured Query Language (‘Sequel’) Chapter 5.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2007.
CS 405G: Introduction to Database Systems SQL III.
1 SQL: The Query Language. 2 Example Instances R1 S1 S2 v We will use these instances of the Sailors and Reserves relations in our examples. v If the.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2007.
SQL: Interactive Queries (2) Prof. Weining Zhang Cs.utsa.edu.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Basic SQL Queries.
Concepts of Database Management, Fifth Edition Chapter 3: The Relational Model 2: SQL.
Subqueries CIS 4301 Lecture Notes Lecture /23/2006.
More SQL: Complex Queries,
Chapter 3 Introduction to SQL
Chapter 3 Introduction to SQL(2)
Slides are reused by the approval of Jeffrey Ullman’s
COP Introduction to Database Structures
Introduction to SQL.
CS 440 Database Management Systems
EECS 647: Introduction to Database Systems
CS 405G: Introduction to Database Systems
Lecture#7: Fun with SQL (Part 2)
SQL Structured Query Language 11/9/2018 Introduction to Databases.
CS 405G: Introduction to Database Systems
CS405G: Introduction to Database Systems
CS 405G: Introduction to Database Systems
CS 405G: Introduction to Database Systems
More SQL: Complex Queries, Triggers, Views, and Schema Modification
CMPT 354: Database System I
CS 405G: Introduction to Database Systems
Section 4 - Sorting/Functions
SQL: Structured Query Language
Recap – Relational languages
CS 405G: Introduction to Database Systems
Presentation transcript:

CS 405G: Introduction to Database Systems Instructor: Jinze Liu

2/23/2016Luke Huan Univ. of Kansas2 Database Design

2/23/2016Luke Huan Univ. of Kansas3 SQL SQL: Structured Query Language Pronounced “S-Q-L” or “sequel” The standard query language supported by most commercial DBMS A brief history IBM System R ANSI SQL89 ANSI SQL92 (SQL2) ANSI SQL99 (SQL3) ANSI SQL 2003 (+OLAP, XML, etc.)

2/23/2016Luke Huan Univ. of Kansas4 Creating and dropping tables CREATE TABLE table_name ( …, column_name i column_type i, … ); DROP TABLE table_name ; Examples create table Student (SID integer, name varchar(30), varchar(30), age integer, GPA float); create table Course (CID char(10), title varchar(100)); create table Enroll (SID integer, CID char(10)); drop table Student; drop table Course; drop table Enroll; -- everything from -- to the end of the line is ignored. -- SQL is insensitive to white space. -- SQL is insensitive to case (e.g.,...Course... is equivalent to --...COURSE...)

2/23/2016Luke Huan Univ. of Kansas5 Basic queries: SFW statement SELECT A 1, A 2, …, A n FROM R 1, R 2, …, R m WHERE condition ; Also called an SPJ (select-project-join) query (almost) Equivalent to relational algebra query π A 1, A 2, …, A n (σ condition (R 1 X R 2 X … X R m ))

2/23/2016Luke Huan Univ. of Kansas6 Semantics of SFW SELECT E 1, E 2, …, E n FROM R 1, R 2, …, R m WHERE condition; For each t 1 in R 1 : For each t 2 in R 2 : … … For each t m in R m : If condition is true over t 1, t 2, …, t m : Compute and output E 1, E 2, …, E n as a row t 1, t 2, …, t m are often called tuple variables Not 100% correct, we will see

2/23/2016Luke Huan Univ. of Kansas7 Example: selection and projection Name of students under 18 SELECT name FROM Student WHERE age < 20; π name ( σ age <20 (Student)) sidnameagegpa 1234John Smith Mary Carter Bob Lee Susan Wong Kevin Kim182.9 sidnameagegpa 1234John Smith Mary Carter Bob Lee Susan Wong Kevin Kim182.9

2/23/2016Luke Huan Univ. of Kansas8 Example: Operations When was Lisa born? SELECT 2006 – age FROM Student WHERE name = ’Lisa’; SELECT list can contain expressions Can also use built-in functions such as SUBSTR, ABS, etc. String literals (case sensitive) are enclosed in single quotes

2/23/2016Luke Huan Univ. of Kansas9 Example: reading a table SELECT * FROM Student; Single-table query, so no cross product here WHERE clause is optional * is a short hand for “all columns”

2/23/2016Luke Huan Univ. of Kansas10 Example: join SID’s and names of students taking the “Database” courses SELECT Student.SID, Student.name FROM Student, Enroll, Course WHERE Student.SID = Enroll.SID AND Enroll.CID = Course.CID AND title = ’Database’; Okay to omit table_name in table_name. column_name if column_name is unique A better way to deal with string is to use “LIKE” and %, which matches with any string with length 0 or more AND title LIKE ’%Database%’; \ is the escape operator ‘_’ is used for representing any a single character

2/23/2016Luke Huan Univ. of Kansas11 Example: rename SID’s of all pairs of classmates Relational algebra query: π e1.SID, e2.SID ( …ρ e1 Enroll e1.CID = e2.CID ^ e1.SID > e2.SID ρ e2 Enroll ) SQL: SELECT e1.SID AS SID1, e2.SID AS SID2 FROM Enroll AS e1, Enroll AS e2 WHERE e1.CID = e2.CID AND e1.SID > e2.SID; AS keyword is optional

2/23/2016Luke Huan Univ. of Kansas12 A more complicated example Titles of all courses that Bart and Lisa are taking together Tip: Write the FROM clause first, then WHERE, and then SELECT FROM Student sb, Student sl, Enroll eb, Enroll el, Course c WHERE sb.name = ’Bart’ AND sl.name = ’Lisa’ AND eb.SID = sb.SID AND el.SID = sl.SID AND eb.CID = c.CID AND el.CID = c.CID; SELECT c.title

2/23/2016Luke Huan Univ. of Kansas13 ORDER BY SELECT... FROM … WHERE ORDER BY output_column [ ASC | DESC ], … ; ASC = ascending, DESC = descending Operational semantics After SELECT list has been computed, sort the output according to ORDER BY specification

2/23/2016Luke Huan Univ. of Kansas14 ORDER BY example List all students, sort them by GPA (descending) and name (ascending) SELECT SID, name, age, GPA FROM Student ORDER BY GPA DESC, name; ASC is the default option Strictly speaking, only output columns can appear in ORDER BY clause (although some DBMS support more) Can use sequence numbers instead of names to refer to output columns: ORDER BY 4 DESC, 2;

2/23/2016Luke Huan Univ. of Kansas15 Why SFW statements? Out of many possible ways of structuring SQL statements, why did the designers choose SELECT - FROM - WHERE ? A large number of queries can be written using only selection, projection, and cross product (or join) Any query that uses only these operators can be written in a canonical form: π L (σ p (R 1 X … X R m )) Example: π R.A, S.B (R p1 S) p2 (π T.C σ p3 T) = π R.A, S.B, T.C σ p1 ^ p2^p3 ( R X S X T ) SELECT - FROM - WHERE captures this canonical form

2/23/2016Luke Huan Univ. of Kansas16 Operational semantics of SFW SELECT [ DISTINCT ] E 1, E 2, …, E n FROM R 1, R 2, …, R m WHERE condition; For each t 1 in R 1 : For each t 2 in R 2 : … … For each t m in R m : If condition is true over t 1, t 2, …, t m : Compute and output E 1, E 2, …, E n as a row If DISTINCT is present Eliminate duplicate rows in output t 1, t 2, …, t m are often called tuple variables

2/23/2016Luke Huan Univ. of Kansas17 Review SELECT - FROM - WHERE statements (select-project-join queries) Set and bag operations

2/23/2016Jinze University of Kentucky18 Topic next Set and Bag operation UNION, INTERSECTION, EXCEPT Aggregation HAVING Nested queries SELECT age, AVG(GPA) FROM Student S ENROLL E WHERE S.SID = E.SID AND E.CID = ‘EECS108’ GROUP BY age HAVING age > 20; --Compute the average GPA for Students who are at least 20 years old and are enrolled in 108 with the same age

2/23/2016Luke Huan Univ. of Kansas19 Set versus bag semantics Set No duplicates Relational model and algebra use set semantics Bag Duplicates allowed Number of duplicates is significant SQL uses bag semantics by default

2/23/2016Luke Huan Univ. of Kansas20 Set versus bag example π SID Enroll Enroll SELECT SID FROM Enroll; sidcidgrade A B A A sid sid

2/23/2016Luke Huan Univ. of Kansas21 A case for bag semantics Efficiency Saves time of eliminating duplicates Which one is more useful? π GPA Student SELECT GPA FROM Student; The first query just returns all possible GPA’s The second query returns the actual GPA distribution Besides, SQL provides the option of set semantics with DISTINCT keyword

2/23/2016Luke Huan Univ. of Kansas22 Forcing set semantics SID’s of all pairs of classmates SELECT e1.SID AS SID1, e2.SID AS SID2 FROM Enroll AS e1, Enroll AS e2 WHERE e1.CID = e2.CID AND e1.SID > e2.SID; Say Bart and Lisa both take CPS116 and CPS114 SELECT DISTINCT e1.SID AS SID1, e2.SID AS SID2... With DISTINCT, all duplicate ( SID1, SID2 ) pairs are removed from the output

2/23/2016Jinze University of Kentucky23 SQL set and bag operations UNION, EXCEPT, INTERSECT Set semantics Duplicates in input tables, if any, are first eliminated Exactly like set union, - and intersect in relational algebra UNION ALL, EXCEPT ALL, INTERSECT ALL Bag semantics Think of each row as having an implicit count (the number of times it appears in the table) Bag union: sum up the counts from two tables Bag difference: subtract the two counts (a row with negative count vanishes) Bag intersection: take the minimum of the two counts

2/23/2016Jinze University of Kentucky24 Examples of bag operations Bag1Bag2 Bag1 UNION ALL Bag2 Bag1 EXCEPT ALL Bag2 Bag1 INTERSECT ALL Bag2 fruit Apple Orange fruit Apple Orange fruit Apple Orange Apple fruit Apple Orange

2/23/2016Jinze University of Kentucky25 Exercise Enroll(SID, CID), ClubMember(club, SID) (SELECT SID FROM ClubMember) EXCEPT (SELECT SID FROM Enroll); SID’s of students who are in clubs but not taking any classes (SELECT SID FROM ClubMember) EXCEPT ALL (SELECT SID FROM Enroll); SID’s of students who are in more clubs than classes

2/23/2016Jinze University of Kentucky26 Operational Semantics of SFW SELECT [ DISTINCT ] E 1, E 2, …, E n FROM R 1, R 2, …, R m WHERE condition; For each t 1 in R 1 : For each t 2 in R 2 : … … For each t m in R m : If condition is true over t 1, t 2, …, t m : Compute and output E 1, E 2, …, E n as a row If DISTINCT is present Eliminate duplicate rows in output t 1, t 2, …, t m are often called tuple variables

2/23/2016Jinze University of Kentucky27 Summary of SQL features SELECT - FROM - WHERE statements (select-project-join queries) Renaming operation Set and bag operations UNION, DIFFERENCE, INTERSECTION  Next: aggregation and grouping

2/23/2016Jinze University of Kentucky28 Aggregates Standard SQL aggregate functions: COUNT, SUM, AVG, MIN, MAX Example: number of students under 18, and their average GPA SELECT COUNT(*), AVG(GPA) FROM Student WHERE age < 18; COUNT(*) counts the number of rows

2/23/2016Jinze University of Kentucky29 Aggregates with DISTINCT Example: How many students are taking classes? SELECT COUNT (SID) FROM Enroll; SELECT COUNT(DISTINCT SID) FROM Enroll;

2/23/2016Jinze University of Kentucky30 GROUP BY SELECT … FROM … WHERE … GROUP BY list_of_columns ; Example: find the average GPA for each age group SELECT age, AVG(GPA) FROM Student GROUP BY age;

2/23/2016Jinze University of Kentucky31 Operational semantics of GROUP BY SELECT … FROM … WHERE … GROUP BY … ; Compute FROM Compute WHERE Compute GROUP BY : group rows according to the values of GROUP BY columns Compute SELECT for each group For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group  Number of groups = number of rows in the final output

2/23/2016Jinze University of Kentucky32 Example of computing GROUP BY SELECT age, AVG(GPA) FROM Student GROUP BY age; Compute GROUP BY: group rows according to the values of GROUP BY columns Compute SELECT for each group sidnameagegpa 1234John Smith Mary Carter Bob Lee Susan Wong Kevin Kim192.9 sidnameagegpa 1234John Smith Mary Carter Kevin Kim Bob Lee Susan Wong223.4 agegpa

2/23/2016Jinze University of Kentucky33 Aggregates with no GROUP BY An aggregate query with no GROUP BY clause represent a special case where all rows go into one group SELECT AVG(GPA) FROM Student; Compute aggregate over the group sidnameagegpa 1234John Smith Mary Carter Bob Lee Susan Wong Kevin Kim192.9 sidnameagegpa 1234John Smith Mary Carter Bob Lee Susan Wong Kevin Kim192.9 gpa 3.24 Group all rows into one group

2/23/2016Jinze University of Kentucky34 Restriction on SELECT If a query uses aggregation/group by, then every column referenced in SELECT must be either Aggregated, or A GROUP BY column  This restriction ensures that any SELECT expression produces only one value for each group

2/23/2016Jinze University of Kentucky35 Examples of invalid queries SELECT SID, age FROM Student GROUP BY age; Recall there is one output row per group There can be multiple SID values per group SELECT SID, MAX(GPA) FROM Student; Recall there is only one group for an aggregate query with no GROUP BY clause There can be multiple SID values Wishful thinking (that the output SID value is the one associated with the highest GPA) does NOT work

2/23/2016Jinze University of Kentucky36 HAVING Used to filter groups based on the group properties (e.g., aggregate values, GROUP BY column values) SELECT … FROM … WHERE … GROUP BY … HAVING condition; Compute FROM Compute WHERE Compute GROUP BY : group rows according to the values of GROUP BY columns Compute HAVING (another selection over the groups) Compute SELECT for each group that passes HAVING

2/23/2016Jinze University of Kentucky37 HAVING examples Find the average GPA for each age group over 10 SELECT age, AVG(GPA) FROM Student GROUP BY age HAVING age > 10; Can be written using WHERE without table expressions List the average GPA for each age group with more than a hundred students SELECT age, AVG(GPA) FROM Student GROUP BY age HAVING COUNT(*) > 100; Can be written using WHERE and table expressions

2/23/2016Jinze University of Kentucky38 Summary of SQL features covered so far SELECT - FROM - WHERE statements Renaming operation Set and bag operations Aggregation and grouping  Next: Table expressions, subqueries  Table  Scalar  In  Exist

2/23/2016Jinze University of Kentucky39 Table expression Use query result as a table In set and bag operations, FROM clauses, etc. A way to “nest” queries Example: names of students who are in more clubs than classes (SELECT SID FROM ClubMember) EXCEPT ALL (SELECT SID FROM Enroll) SELECT DISTINCT name FROM Student, ( ) AS S WHERE Student.SID = S.SID;

2/23/2016Jinze University of Kentucky40 Scalar subqueries A query that returns a single row can be used as a value in WHERE, SELECT, etc. Example: students at the same age as Bart SELECT * FROM Student WHERE age = ( ); SELECT age FROM Student WHERE name = ’Bart’ What’s Bart’s age? Runtime error if subquery returns more than one row Under what condition will this runtime error never occur? name is a key of Student What if subquery returns no rows? The value returned is a special NULL value, and the comparison fails

2/23/2016Jinze University of Kentucky41 IN subqueries x IN ( subquery ) checks if x is in the result of subquery Example: students at the same age as (some) Bart SELECT * FROM Student WHERE age IN ( ); SELECT age FROM Student WHERE name = ’Bart’ What’s Bart’s age?

2/23/2016Jinze University of Kentucky42 EXISTS subqueries EXISTS ( subquery ) checks if the result of subquery is non-empty Example: students at the same age as (some) Bart SELECT * FROM Student AS s WHERE EXISTS (SELECT * FROM Student WHERE name = ’Bart’ AND age = s.age); This happens to be a correlated subquery—a subquery that references tuple variables in surrounding queries

2/23/2016Jinze University of Kentucky43 Operational semantics of subqueries SELECT * FROM Student AS s WHERE EXISTS (SELECT * FROM Student WHERE name = ’Bart’ AND age = s.age); For each row s in Student Evaluate the subquery with the appropriate value of s.age If the result of the subquery is not empty, output s.* The DBMS query optimizer may choose to process the query in an equivalent, but more efficient way (example?)

2/23/2016Jinze University of Kentucky44 Summary of SQL features covered so far SELECT - FROM - WHERE statements Ordering Set and bag operations Aggregation and grouping Table expressions, subqueries  Next: NULL ’s, outerjoins, data modification, constraints, …