Lecture 3 Monday, April 8, 2002.

Slides:



Advertisements
Similar presentations
1 Lecture 02: SQL. 2 Outline Data in SQL Simple Queries in SQL (6.1) Queries with more than one relation (6.2) Recomeded reading: Chapter 3, Simple Queries.
Advertisements

SQL Introduction Standard language for querying and manipulating data Structured Query Language Many standards out there: SQL92, SQL2, SQL3. Vendors support.
Relational Algebra (end) SQL April 19 th, Complex Queries Product ( pid, name, price, category, maker-cid) Purchase (buyer-ssn, seller-ssn, store,
Relational Algebra Maybe -- SQL. Confused by Normal Forms ? 3NF BCNF 4NF If a database doesn’t violate 4NF (BCNF) then it doesn’t violate BCNF (3NF) !
1 Lecture 12: SQL Friday, October 26, Outline Simple Queries in SQL (5.1) Queries with more than one relation (5.2) Subqueries (5.3) Duplicates.
1 Lecture 03: Advanced SQL. 2 Outline Unions, intersections, differences Subqueries, Aggregations, NULLs Modifying databases, Indexes, Views Reading:
SQL April 25 th, Agenda Grouping and aggregation Sub-queries Updating the database Views More on views.
1 Lecture 03: SQL Friday, January 7, Administrivia Have you logged in IISQLSRV yet ? HAVE YOU CHANGED YOUR PASSWORD ? Homework 1 is now posted.
1 Lecture 02: Basic SQL. 2 Outline Data in SQL Simple Queries in SQL Queries with more than one relation Reading: Chapter 3, “Simple Queries” from SQL.
Correlated Queries SELECT title FROM Movie AS Old WHERE year < ANY (SELECT year FROM Movie WHERE title = Old.title); Movie (title, year, director, length)
1 Lecture 2: SQL Wednesday, January 7, Agenda Leftovers from Monday The relational model (very quick) SQL Homework #1 given out later this week.
1 Lecture 3: More SQL Friday, January 9, Agenda Homework #1 on the web site today. Sign up for the mailing list! Next Friday: –In class ‘activity’
Union, Intersection, Difference (SELECT name FROM Person WHERE City=“Seattle”) UNION (SELECT name FROM Person, Purchase WHERE buyer=name AND store=“The.
1 Information Systems Chapter 6 Database Queries.
Complex Queries (1) Product ( pname, price, category, maker)
One More Normal Form Consider the dependencies: Product Company Company, State Product Is it in BCNF?
CSE544: SQL Monday 3/27 and Wednesday 3/29, 2006.
Integrity Constraints An important functionality of a DBMS is to enable the specification of integrity constraints and to enforce them. Knowledge of integrity.
Exercises Product ( pname, price, category, maker) Purchase (buyer, seller, store, product) Company (cname, stock price, country) Person( per-name, phone.
1 Lecture 4: More SQL Monday, January 13th, 2003.
SQL (almost end) April 26 th, Agenda HAVING clause Views Modifying views Reusing views.
SQL April 22 th, Agenda Union, intersections Sub-queries Modifying the database Views Modifying views Reusing views.
1. Midterm summary Types of data on the web: unstructured, semi- structured, structured Scale and uncertainty are key features Main goals are to model,
Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008.
1 SQL cont.. 2 Outline Unions, intersections, differences (6.2.5, 6.4.2) Subqueries (6.3) Aggregations (6.4.3 – 6.4.6) Hint for reading the textbook:
IM433-Industrial Data Systems Management Lecture 5: SQL.
More SQL: Complex Queries, Triggers, Views, and Schema Modification UMM AL QURA UNIVERSITY College of Computer Dr. Ali Al Najjar 1.
SQL SQL Review. SQL Introduction Standard language for querying and manipulating data Structured Query Language Many standards out there: ANSI SQL, SQL92.
Lecture 2: E/R Diagrams and the Relational Model Wednesday, April 3 rd, 2002.
1 Introduction to Database Systems CSE 444 Lecture 02: SQL September 28, 2007.
1 Lecture 02: SQL Friday, September 30, Administrivia Homework 1 is out. Due: Wed., Oct. 12 Did you login on IISQLSRV ? Did you change your password.
1 Introduction to Database Systems CSEP 544 Lecture #1 March 29, 2004 Alon Halevy.
Aggregation SELECT Sum(price) FROM Product WHERE manufacturer=“Toyota” SQL supports several aggregation operations: SUM, MIN, MAX, AVG, COUNT Except COUNT,
1 Lecture 03: SQL Monday, January 9, Project t/Default.aspxhttp://iisqlsrv.cs.washington.edu/444/Projec.
SQL.
Cours 7: Advanced SQL.
Lecture 04: SQL Monday, January 10, 2005.
Relational Algebra at a Glance
Database Systems Subqueries, Aggregation
Server-Side Application and Data Management IT IS 3105 (FALL 2009)
SQL Introduction Standard language for querying and manipulating data
06a: SQL-1 The Basics– Select-From-Where
Cse 344 April 4th – Subqueries.
Monday 3/27 and Wednesday 3/29, 2006
Lecture 2 (cont’d) & Lecture 3: Advanced SQL – Part I
Introduction to Database Systems CSE 444 Lecture 03: SQL
Lecture 4: Advanced SQL – Part II
Introduction to SQL Wenhao Zhang October 5, 2018.
Introduction to Database Systems CSE 444 Lecture 03: SQL
CSE544 SQL Wednesday, March 31, 2004.
Building a Database Application
SQL Introduction Standard language for querying and manipulating data
Lecture 12: SQL Friday, October 20, 2000.
Lectures 3: Introduction to SQL Part II
Introduction to Database Systems CSE 444 Lecture 02: SQL
Lectures 5: Introduction to SQL 4
Where are we? Until now: Modeling databases (ODL, E/R): all about the schema Now: Manipulating the data: queries, updates, SQL Then: looking inside -
Lecture 4: SQL Thursday, January 11, 2001.
Lectures 6: Introduction to SQL 5
Integrity Constraints
Lecture 4: SQL Wednesday, April 10, 2002.
Lecture 03: SQL Friday, October 3, 2003.
Lecture 3: Relational Algebra and SQL
Relational Schema Design (end) Relational Algebra SQL (maybe)
Lecture 04: SQL Monday, October 6, 2003.
Lecture 14: SQL Wednesday, October 31, 2001.
Presentation transcript:

Lecture 3 Monday, April 8, 2002

Review What does this mean ? What does this mean ? makes Company Product What does this mean ? makes Company Product This: also called referential integrity constraint

Review affiliation Team University sport number name city Team

Review name category price Product isa isa Software Product Educational Product platforms swID Age Group

Reading Assignment A Relational Model of Data for Large Shared Data Banks Of Objects and Databases: A Decade of Turmoil

Outline Basic SQL Queries 5.2 Binary operators 5.3 Subqueries 5.4 Aggregates 5.5

SQL Introduction Basic form: (many many more bells and whistles in addition) Select [attributes] From [relations] Where [conditions]

Projections Select only a subset of the attributes Input schema: Company(sticker, name, country, stockPrice) Output schema: R(name, stock price) SELECT name, stockPrice FROM Company WHERE country=“USA” AND stockPrice > 50

Projections Rename the attributes in the resulting table Input schema: Company(sticker, name, country, stockPrice) Output schema: R(company, price) SELECT name AS company, stockprice AS price FROM Company WHERE country=“USA” AND stockPrice > 50

Eliminating Duplicates Rename the attributes in the resulting table Input schema: Company(sticker, name, country, stockPrice) Output schema: R(country) SELECT DISTINCT country FROM Company WHERE stockPrice > 50

Ordering the Results SELECT name, stockPrice FROM Company WHERE country=“USA” AND stockPrice > 50 ORDERBY country, name Ordering is ascending, unless you specify the DESC keyword. Ties are broken by the second attribute on the ORDERBY list, etc.

Joins Find names of people living in Seattle that bought gizmo products, and the names of the stores they bought from Input schema: Person(pname, phoneNumber, city) Purchase (buyer, seller, store, product) Output Schema: R(pname, store) SELECT pname, store FROM Person, Purchase WHERE pname=buyer AND city=“Seattle” AND product=“gizmo”

Join Joins are a key operation in DBMS pname phone city John 1111 Seattle Fred 2222 Chicago Mary 3333 4444 5555 Portland 6666 7777 buyer . . . . . . product Jane Gizmo John Soap Mary Book Joins are a key operation in DBMS

Disambiguating Attributes Find names of people buying telephony products: SELECT Person.name FROM Person, Purchase, Product WHERE Person.name=Purchase.buyer AND Product=Product.name AND Product.category=“telephony” Input schema: Product (name, price, category, maker) Purchase (buyer, seller, store, product) Person(name, phoneNumber, city) Output schema: R(name)

Tuple Variables Find pairs of companies making products in the same category SELECT product1.maker AS maker1, product2.maker AS maker2 FROM Product AS product1, Product AS product2 WHERE product1.category=product2.category AND product1.maker <> product2.maker Input schema: Product ( name, price, category, maker) Output schema: R(maker1, maker2)

Tuple Variables Tuple variables introduced automatically by the system: Product ( name, price, category, maker) Becomes: Doesn’t work when Product occurs more than once: In that case the user needs to define variables explicitely. SELECT name FROM Product WHERE price > 100 SELECT Product.name FROM Product AS Product WHERE Product.price > 100

Meaning (Semantics) of SQL Queries SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions 1. Nested loops: Answer = {} for x1 in R1 do for x2 in R2 do ….. for xn in Rn do if Conditions then Answer = Answer U {(a1,…,ak) return Answer

Meaning (Semantics) of SQL Queries SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions 2. Parallel assignment Doesn’t impose any order ! Like Datalog Answer = {} for all assignments x1 in R1, …, xn in Rn do if Conditions then Answer = Answer U {(a1,…,ak)} return Answer

Meaning (Semantics) of SQL Queries SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions 3. Translation to Datalog: one rule Answer(a1,…,ak)  R1(x11,…,x1p),…,Rn(xn1,…,xnp), Conditions

Meaning (Semantics) of SQL Queries SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions 4. Translation to Relational algebra: a1,…,ak ( s Conditions (R1 x R2 x … x Rn)) Select-From-Where queries are precisely Select-Project-Join

First Unintuitive SQLism SELECT DISTINCT R.A FROM R, S, T WHERE R.A=S.A OR R.A=T.A Looking for R  (S  T) But what happens if T is empty?

Union, Intersection, Difference (SELECT DISTINCT name FROM Person WHERE City=“Seattle”) UNION FROM Person, Purchase WHERE buyer=name AND store=“The Bon”) Similarly, you can use INTERSECT and EXCEPT. You must have the same attribute names (otherwise: rename).

Conserving Duplicates The UNION, INTERSECTION and EXCEPT operators operate as sets, not bags. (SELECT name FROM Person WHERE City=“Seattle”) UNION ALL FROM Person, Purchase WHERE buyer=name AND store=“The Bon”)

Exercises Product ( pname, price, category, maker) Purchase (buyer, seller, store, product) Company (cname, stock price, country) Person( per-name, phone number, city) Ex #1: Find people who bought telephony products. Ex #2: Find names of people who bought American products Ex #3: Find names of people who bought American products and did not buy French products Ex #4: Find names of people who bought American products and they live in Seattle. Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.

Subqueries (or Nested Queries) A subquery producing a single tuple: In this case, the subquery returns one value. If it returns more, it’s a run-time error. SELECT DISTINCT Purchase.product FROM Purchase WHERE buyer = (SELECT name FROM Person WHERE ssn = “123456789”);

Can say the same thing without a subquery: How do the two queries compare if we drop the DISTINCT keyword ? SELECT DISTINCT Purchase.product FROM Purchase, Person WHERE buyer = name AND ssn = “123456789”

Subqueries Returning Relations Find companies who manufacture products bought by Joe Blow. SELECT DISTINCT Company.name FROM Company, Product WHERE Company.name=Product.maker AND Product.name IN (SELECT Purchase.product FROM Purchase WHERE Purchase .buyer = “Joe Blow”); Here the subquery returns a set of values

Equivalent to: SELECT DISTINCT Company.name FROM Company, Product, Purchase WHERE Company.name= Product.maker AND Product.name = Purchase.product AND Purchase.buyer = “Joe Blow”

Subqueries Returning Relations You can also use: s > ALL R s > ANY R EXISTS R Product ( pname, price, category, maker) Find products that are more expensive than all those produced By “Gizmo-Works” SELECT DISTINCT name FROM Product WHERE price > ALL (SELECT price FROM Purchase WHERE maker=“Gizmo-Works”)

Question for Database Fans and their Friends Can we express this query as a single SELECT-FROM-WHERE query, without subqueries ? Hint: show that all SFW queries are monotone (figure out what this means). A query with ALL is not monotone

Conditions on Tuples SELECT DISTINCT Company.name FROM Company, Product WHERE Company.name= Product.maker AND (Product.name,price) IN (SELECT Purchase.product, Purchase.price) FROM Purchase WHERE Purchase.buyer = “Joe Blow”);

Correlated Queries Movie (title, year, director, length) Find movies whose title appears more than once. correlation SELECT DISTINCT title FROM Movie AS x WHERE year < ANY (SELECT year FROM Movie WHERE title = x.title); Note (1) scope of variables (2) this can still be expressed as single SFW

Complex Correlated Query Product ( pname, price, category, maker, year) Find products (and their manufacturers) that are more expensive than all products made by the same manufacturer before 1972 Powerful, but much harder to optimize ! SELECT DISTINCT pname, maker FROM Product AS x WHERE price > ALL (SELECT price FROM Product AS y WHERE x.maker = y.maker AND y.year < 1972);

Aggregation SELECT Sum(price) FROM Product WHERE maker=“Toyota” SQL supports several aggregation operations: SUM, MIN, MAX, AVG, COUNT SELECT Count(*) FROM Product WHERE year > 1995 Except COUNT, all aggregations apply to a single attribute

Aggregation: Count COUNT applies to duplicates, unless otherwise stated: Better: SELECT Count(name, category) FROM Product WHERE year > 1995 same as Count(*) SELECT Count(DISTINCT name, category) FROM Product WHERE year > 1995

Simple Aggregation Purchase(product, date, price, quantity) Example 1: find total sales for the entire database Example 1’: find total sales of bagels SELECT Sum(price * quantity) FROM Purchase SELECT Sum(price * quantity) FROM Purchase WHERE product = ‘bagel’

Simple Aggregations

Grouping and Aggregation Usually, we want aggregations on certain parts of the relation. Purchase(product, date, price, quantity) Example 2: find total sales after 10/1 per product. SELECT product, Sum(price*quantity) AS TotalSales FROM Purchase WHERE date > “10/1” GROUPBY product

Grouping and Aggregation 1. Compute the relation (I.e., the FROM and WHERE). 2. Group by the attributes in the GROUPBY 3. Select one tuple for every group (and apply aggregation) SELECT can have (1) grouped attributes or (2) aggregates.

First compute the relation (date > “9/1”) then group by product:

Then, aggregate SELECT product, Sum(price*quantity) AS TotalSales FROM Purchase WHERE date > “9/1” GROUPBY product

Another Example SELECT product, Sum(price * quantity) AS SumSales Max(quantity) AS MaxQuantity FROM Purchase GROUP BY product For every product, what is the total sales and max quantity sold?

HAVING Clause Same query, except that we consider only products that had at least 100 buyers. SELECT product, Sum(price * quantity) FROM Purchase WHERE date > “9/1” GROUP BY product HAVING Sum(quantity) > 30 HAVING clause contains conditions on aggregates.

General form of Grouping and Aggregation S = may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES C1 = is any condition on the attributes in R1,…,Rn C2 = is any condition on aggregate expressions SELECT S FROM R1,…,Rn WHERE C1 GROUP BY a1,…,ak HAVING C2

General form of Grouping and Aggregation Evaluation steps: Compute the FROM-WHERE part, obtain a table with all attributes in R1,…,Rn Group by the attributes a1,…,ak Compute the aggregates in C2 and keep only groups satisfying C2 Compute aggregates in S and return the result SELECT S FROM R1,…,Rn WHERE C1 GROUP BY a1,…,ak HAVING C2

More Aggregation Examples Author(login,name) Document(url, title) Wrote(login,url) Mentions(url,word)

Find all authors who wrote at least 10 documents: Select author.name From author, wrote Where author.login=wrote.login Groupby author.name Having count(wrote.url) > 10

Find all authors who have a vocabulary over 10000: Select author.name From author, wrote, mentions Where author.login=wrote.login and wrote.url=mentions.url Groupby author.name Having count(distinct mentions.word) > 10000