Download presentation
Presentation is loading. Please wait.
1
SQL (DML) II
2
Review: Formal Semantics of SQL: RA* SELECT A1, A2, …, An FROM r1, r2, …, rm WHERE P Query: Semantics: * A1, A2,..., An ( * P (r1 x* r2 x*... rm)) Query: SELECT DISTINCT A1, A2,..., An FROM r1, r2,..., rm WHERE P Semantics: (Remove * on Pi) A1, A2,..., An ( * P (r1 x* r2 x*... rm)) Semantics: Assign result to a new relation S S A1, A2,..., An ( * P (r1 x* r2 x*... rm)) Query: CREATE TABLE S AS SELECT DISTINCT A1, A2,..., An FROM r1, r2,..., rm WHERE P
3
Review of SQL Clause ---------- SELECT[DISTINCT] FROM WHERE CREATE TABLE AS SELECT GROUP BY HAVING ORDER BY Semantics (RA/RA*) ---------- * (or ) X* * Can’t express * Can’t express
4
Aggregates and Group By Usually, aggregates used with “Group By” E.g. SELECT bname, COUNT (DISTINCT cname) FROM depositor d, account a WHERE d.acct_no = a.acct_no GROUP BY bname depositor (customer-name, account-number) account (account-number, branch-name, balance) Result: bname COUNT -------------------------- Downtown 2 Mianus 5 Perry 1 Brighton 5 Kenmore 7
5
Aggregates and Group By Intuition behind “Group By” SELECT bname, COUNT (DISTINCT cname) FROM depositor d, account a WHERE d.acct_no = a.acct_no GROUP BY bname Step 1: “Group “ results of join Step 2: aggregate on groups and project on result bname COUNT -------------------------- Downtown 1 Mianus 1 Perry 1 Brighton 2 Kenmore 2
6
Group By Another example: SELECT bname, SUM(assets) as total FROM branch GROUP BY bcity Result ?? bname total -------------------------- Redwood 2.1M Pownal 0.3M N. Town 3.7M ???? 2M ????? 10.1M Above query is NOT allowed Non-aggregated values in SELECT clause (e.g., bname) must also appear in GROUP BY clause SELECT A1, A2,..., Ak, Agg1(),...., Aggi() FROM.......... WHERE............ GROUP BY A1, A2,..., Ak, Ak+1,..., An branch(bname, bcity, assets)
7
HAVING WHERE :: FROM as HAVING :: GROUP BY HAVING P: selects rows from result of GROUP BY Optional (missing HAVING means TRUE) Example: Find names of branches and the average account balance for those branches having average account balance > $1200 SELECT bname, AVG(balance) AS avg FROM account GROUP BY bname HAVING avg > 1200 Result same as: CREATE TABLE Temp AS SELECT bname, AVG(balance) AS avg FROM account GROUP BY bname SELECT * FROM Temp WHERE avg > 1200
8
Review - Summary Thus Far Kitchen sink query: SELECT bcity, sum(balance) AS totalbalance FROM branch b, account a WHERE b.bname=a.bname AND assets >= 1M GROUP BY bcity HAVING totalbalance > 700 ORDER BY bcity DESC Steps 1,2 : FROM, WHERE account (acct_no, bname, balance) branch(bname, bcity, assets)
9
Summary thus far Steps 3,4 : GROUP BY, SELECT Steps 5 : HAVING Steps 6 : ORDER BY
10
Summary thus far Clause ---------- SELECT[DISTINCT] FROM WHERE GROUP BY HAVING ORDER BY Evaluation Order ---------- 4 1 2 3 5 6 Semantics (RA/RA*) ---------- * (or ) X* * Can’t express * Can’t express
11
branch (bname, bcity, assets) customer (cname, cstreet, customer-only) account (acct_no, bname, balance) loan (lno, bname, amt) depositor (cname, acct_no) borrower (cname, lno)
12
More SQL NULL values and SQL Nested Queries Views Updates Joins PseudoColumns Connect-By Hierarchical Queries
13
NULLs The “dirty litle secret” of SQL (major headache for query optimization) can be a value for any attribute e.g. : branch2= What does this mean? We don’t know Kenmore’s assets? Kenmore has no assets?.................... Effect on Queries: SELECT * FROM branch2 WHERE assets = NULL SELECT * FROM branch2 WHERE assets IS NULL
14
NULLs Arithmetic with nulls: n op null = null op : +, -, *, /, mod,... Booleans with nulls: One can write: 3-valued logic (true, false, unknown) SELECT........... FROM............. WHERE boolexpr IS UNKNOWN What expressions evaluate to UNKNOWN? 1.Comparisons with NULL (e.g. assets = NULL) 2.FALSE OR UNKNOWN (but: TRUE OR UNKNOWN = TRUE) 3.TRUE AND UNKNOWN 4.UNKNOWN AND/OR UNKNOWN
15
NULLs Given: branch2= Aggregate operations: SELECT SUM(assets) FROM branch2 returns SUM -------- 11.1M NULL is ignored Same for AVG, MIN, MAX But.... COUNT(assets) returns 4! Let branch3 an empty relation Then: SELECT SUM(assets) FROM branch3 returns NULL but COUNT( ) = 0
16
Nested Queries in SQL Queries containing other queries Inner query: Can appear in FROM or WHERE clause “outer query” “inner query” Example: SELECT cname FROM borrower WHERE cname IN (SELECT cname FROM depositor) Think of this as a function that returns the result of the inner query
17
Nested Queries in SQL Another example: SELECT DISTINCT cname FROM borrower as b, loan as l WHERE b.lno = l.lno AND bname = “Perry” AND (bname, cname) IN ( SELECT bname, cname FROM depositor as d, account as a WHERE d.acct_no = a.acct_no) Q: Describe what this query returns Ans: Names of borrowers at the Perry branch who also have savings account at the same branch
18
Nested Queries in SQL What can we do with queries in the WHERE clause? Let A = SELECT... FROM... WHERE.... 1.WHERE IN A WHERE NOT IN A 2.WHERE < SOME (A) ( ) WHERE < ALL (A) ( ) Also (‘>’, ‘ =‘, ‘=‘,...) 3.WHERE EXISTS(A) ( )
19
Nested Queries in SQL SOME ( ): e < SOME(A) 0 5 6 (5< some) = true 0 5 0 ) = false 5 0 5 (5 some) = true (since 0 5) (read: 5 < some tuple in the relation) (5< some ) = true (5 = some (= some) in However, ( some) not in example: (3 some ) 3 5
20
Nested Queries in SQL Example query with SOME: SELECT bname FROM branch WHERE assets > SOME ( SELECT assets FROM branch WHERE bcity=‘Bkln’) Q: Describe what this query returns Ans: Find all branches that have greater assets than some branch in Brooklyn.
21
Nested Queries in SQL Optimization of queries! SELECT bname FROM branch WHERE assets > SOME ( SELECT assets FROM branch WHERE bcity=‘Bkln’) Q: Can you rewrite this query to something that is equivalent, but more efficient to execute? 1) CREATE TABLE TEMP 2) SELECT bname AS SELECT assets FROM branch FROM branch WHERE assets > SOME (Temp) WHERE bname=‘Bkln’ Q: Why this is better ?
22
Nested Queries in SQL ALL ( ): e < ALL (A) 0 5 6 (5< all) = false 6 10 4 ) = true 5 4 6 (5 all) = true (since 5 4 and 5 6) (5< all ) = false (5 = all ( all) not in However, (= all) in
23
Nested Queries in SQL Example query with ALL: SELECT bname FROM branch WHERE assets > ALL ( SELECT assets FROM branch WHERE bcity=‘Bkln’) Returns: all branches that have greater assets than all branches in Brooklyn
24
Nested Queries in SQL EXISTS: EXISTS(A) (i.e., true if A is not empty) Example: Find all customers who have accounts at all branches in Brooklyn SELECT DISTINCT cname FROM depositor as S WHERE NOT EXISTS ( ( SELECT bname FROM branch WHERE bcity = ‘Bkln’) EXCEPT ( SELECT bname FROM depositor as T, account as R WHERE T.acct_no = R.acct_no AND S.cname = T.cname)) Inner Query: (branches in Brooklyn) - (branches where S has an account)
25
Nested Queries in SQL Correlated Queries: when the inner query contains tuple variables of the outer query Example: Find all accounts located at branches that also hold a loan for a smaller amount SELECT DISTINCT A.cct_no FROM account as A WHERE EXISTS ( SELECT * FROM loan as L WHERE A.bname = L.bname AND A.balance > L.amt) “correlation”: inner query contains reference to table in outer query
26
Nested Queries in SQL What about query optimization? SELECT DISTINCT A.cct_no FROM account as A WHERE EXISTS ( SELECT * FROM loan as L WHERE A.bname = L.bname AND A.balance > L.amt) SELECT DISTINCT A.acct_no FROM account as A, loan as L WHERE A.bname = L.bname AND A.balance > L.amt Above an example of Query Unnesting. advantageous because joins evaluate in more ways than nested queries
27
Nested Queries in SQL Correlation with Aggregates: SELECT bname FROM account as A WHERE bal > (SELECT AVG(amt) FROM loan as L WHERE A.bname = L.bname) Returns the branch names where accounts are held whose balances are more than the average loan taken at the same branch Kim’s technique: (1) CREATE TABLE Temp AS SELECT bname, AVG(amt) as avgloan FROM Loan GROUP BY bname (2) SELECT A. bname FROM account as A, temp as L WHERE A.bname = L.bname AND A.bal > L.avgloan
28
Kim’s Unnesting Technique Why is the rewrite better than the original query? Ans: the rewrite computes the avg loans JUST ONCE per branch Is the rewrite always better than the nested query? Ans: NO: if Loan has branch names not in account
29
Nested Queries in SQL Nesting also possible in FROM clause Example: Another way to express HAVING SELECT bname, avg(balance) as avgbal FROM account GROUP BY bname HAVING AVG(balance) > 1200 SELECT * FROM (SELECT bname, AVG(balance) as avgbal FROM account GOUP BY bname) TempRes WHERE avgbal > 1200
30
Nested Queries in SQL Nested query in FROM clause, another example: Find the maximum total balance across all branches SELECT MAX(tot-balance) FROM (SELECT bname, SUM(balance) AS tot_balance FROM account GROUP BY bname) branch-total
31
SQL: Joins Suppose: loan borrower (1)Select bname, loan.lno, amt, cname, borrower.lno From loan INNER JOIN borrower ON loan.lno = borrower.lno (2) Alternately, in Oracle SQL you can specify as: Select bname, loan.lno, amt, cname, borrower.lno from loan l, borrower b where l.lno = b.lno;
32
SQL: Joins Suppose: loan borrower (1)Select bname, loan.lno, amt, cname, borrower.lno From loan LEFT OUTER JOIN borrower ON loan.lno = borrower.lno (2) Alternately, in Oracle SQL you can specify as: Select bname, loan.lno, amt, cname, borrower.lno from loan l, borrower b where l.lno = b.lno (+);
33
SQL: Joins SQL Expression R INNER JOIN S ON P R LEFT OUTER JOIN S ON P R RIGHT OUTER JOIN S ON P R FULL OUTER JOIN S ON P Also: R NATURAL INNER JOIN S R NATURAL LEFT OUTER JOIN S RA expressions R P * S R * S Etc..
34
SQL: Joins In general: NATURAL or ON or USING (,,...., ) Where: = ( INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN)
35
SQL: Joins Using: loan borrower loan FULL OUTER JOIN borrower USING lno USING similar to NATURAL join but.. Columns listed must be those in common (used for equi-join), not necessarily all of them
36
SQL: Joins Find all customers who have an account or a loan at the bank (but not both) SELECT cname FROM depositor NATURAL FULL OUTER JOIN borrower WHERE (acct_no IS NULL) OR (lno IS NULL)
37
Views Can involve complex queries: example: create a view consisting of branches and customers. CREATE VIEW all-customers AS (SELECT bname, cname FROM depositor as d, account as a WHERE d.acct_no = a.acct_no) UNION (SELECT bname, cname FROM borrower as b, loan as l WHERE b.lno=l.lno) Can treat that view like a table: Find all customers of Perry: SELECT cname FROM all-customers WHERE bname = ‘Perry’
38
Views Mechanism for hiding data from view of certain users CREATE VIEW V AS Example: show all branches and cities, but hide assets CREATE VIEW branch2 AS SELECT bname, bcity FROM branch
39
Views create table vs create view (1) CREATE TABLE branch2 AS SELECT bname, bcity FROM branch (2) CREATE VIEW branch2 AS SELECT bname, bcity FROM branch vs (1) creates new table that gets stored on disk (2) creates “virtual table” (materialized when needed) Therefore: changes in branch are seen in the view version of branch2 (2) but not for the (1) case. Any indexes on branch are automatically used.
40
Pseudocolumns Rownum (numeric) A number for each row in a select query Example: Branch(bname, bcity) SELECT rownum ordered_branch_seq, bname, bcity FROM branch ORDER BY bname; Rowid (char string) A character string that represents the physical address for a row of a table. SELECT rowid, bname, bcity FROM branch. Access by rowid fastest. However, rowids may change during some table maintenance operations Split of table partitions. If you delete and insert the same branch name, the rowids for that row are likely to be different
41
CONNECT-BY (not in book) Person (father, name) Find the children of ‘Mike Johnson’ SELECT a.name FROM Person a where a.father = ‘Mike Johnson’; What-if you want to find great-great-grandchildren? Easier way in Oracle: CONNECT-BY Useful for answering hierarchical queries
42
CONNECT-BY Find all descendants from Mike Johnson SELECT a.name FROM Person a START WITH a.name = ‘Mike Johnson’ CONNECT BY PRIOR name = father; First Level (of recursion) Mike Johnson Second Level All names for which Mike is a father (i.e., children of Mike) Third Level All names for which Mike’s children are fathers ….
43
CONNECT-BY Find all descendants from Mike Johnson SELECT a.name FROM Person a START WITH a.name = ‘Mike Johnson’ CONNECT BY PRIOR name = father; Pseudocolumn: LEVEL SELECT a.name, LEVEL FROM Person a START WITH a.name = ‘Mike Johnson’ CONNECT BY PRIOR name = father; Ordering among siblings: SELECT a.name, LEVEL FROM Person a START WITH a.name = ‘Mike Johnson’ CONNECT BY PRIOR name = father ORDER SIBLINGS BY name;
44
CONNECT-BY Find upto grand-children only for Mike Johnson Restrict the LEVEL to be <=3 (Mike, his children and grand-children) What if you have “loops” in the data? Oracle raises an error To ignore the loops, specify NOCYCLE SELECT a.name, LEVEL FROM Person a START WITH a.name = ‘Mike Johnson’ CONNECT BY NOCYCLE PRIOR name = father ORDER SIBLINGS BY name; Lot more Information: Oracle Help links, or http://www.oracle.com/technology/oramag/webcolumns/2003/techarti cles/gennick_connectby.html http://www.stanford.edu/dept/itss/docs/oracle/10g/server.101/b10759 /queries003.htm
45
SQL Summary Query Language (DML): SELECT, FROM, WHERE GROUP BY, HAVING ORDER BY Nested Queries ( IN, EXISTS, UNIQUE, ALL, SOME, correlation) Aggregates Joins Updates Views PseudoColumns Connect-by hierarchical queries More to come: Inserts, deletes,…. DDL: CREATE TABLE DROP TABLE ALTER TABLE Embedded SQL/JDBC Integrity Constraints Security, authorization
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.