1
CS4433 Database Systems SQL - Basics
2
Introduction
SQL stands for Structured Query Language
Standard language for querying and manipulating data
Initially developed at IBM by Donald Chamberlin and Raymond Boyce in the early 1970s, and called SEQUEL (Structured English Query Language)
Many standards out there: SQL92, SQL3, SQL99, ..., SQL2011, ...
Vendors support various subsets of these standards
Why SQL?
  A very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++
  Its queries are “optimized” quite well, yielding efficient query executions
3
Introduction
Two sublanguages
DDL – Data Definition Language: define and modify schema
  CREATE TABLE table_name (
    { column_name data_type [ DEFAULT default_expr ] [ column_constraint [, ... ] ]
    | table_constraint } [, ... ]
  )
DML – Data Manipulation Language
  Queries can be written intuitively
  Select-From-Where
4
Data types
Data definition
Character strings
  Fixed length: CHAR(n)
    'foo' becomes 'foo  ' if defined as CHAR(5)
  Varying length: VARCHAR(n)
    Up to n characters
Bit strings
  Fixed length: BIT(n)
  Varying length: BIT VARYING(n)
5
Data types
BOOLEAN: TRUE, FALSE, UNKNOWN
INT, INTEGER, SHORTINT
FLOAT, REAL, DOUBLE PRECISION
DECIMAL(n,d): n decimal digits, with the decimal point d positions from the right
6
Data types
DATE
TIME
…
7
Dates and times
Can compare dates and times using the operators used for numbers or strings: >, <, etc.
  DATE ' '
  TIME '00:00:00'
  DATETIME ' :00:00'
  TIMESTAMP
  YEAR 0000
8
Select-From-Where Statements
The principal form of a SQL query is:
  SELECT desired attributes
  FROM one or more tables
  WHERE condition about tuples of the tables
9
Our Running Example
Most of our SQL queries will be based on the following database schema
Underline indicates key attributes
  Movie(title, year, length, studioName, prodC#)
  StarsIn(title, year, starId)
  MovieStar(id, name, address, gender, birthdate)
  MovieExec(name, address, cert#, netWorth)
  Studio(name, address, presC#)
10
Our Running Example
CREATE TABLE Movie (
  title char(50),
  year int,
  length int,
  studioName char(30),
  prodC# int,
  PRIMARY KEY (title, year),
  FOREIGN KEY (studioName) REFERENCES Studio
)
CREATE TABLE StarsIn (
  MTitle char(50),
  Myear int,
  starId int,
  PRIMARY KEY (MTitle, Myear, starId),
  FOREIGN KEY (MTitle, Myear) REFERENCES Movie,
  FOREIGN KEY (starId) REFERENCES MovieStar
)
11
Select-From-Where Example
All movies produced by Disney studios in 1990
  select title
  from Movie
  where studioName = 'Disney' and year = 1990
The answer is a relation with the title attribute of Movie, containing the tuples with studioName = 'Disney' and year = 1990
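The query above can be tried out with a small script. This is only a sketch: it assumes Python's built-in sqlite3 module as the database (the slides do not name a DBMS), it keeps only four of the Movie attributes, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; schema trimmed and rows invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movie (title TEXT, year INT, length INT, studioName TEXT, "
    "PRIMARY KEY (title, year))"
)
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?, ?)",
    [
        ("Pretty Woman", 1990, 119, "Disney"),
        ("Dick Tracy", 1990, 105, "Disney"),
        ("Star Wars", 1977, 124, "Fox"),
    ],
)

# The slide's query; ORDER BY is added only to make the output deterministic.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM Movie "
        "WHERE studioName = 'Disney' AND year = 1990 ORDER BY title"
    )
]
print(titles)
```

Only the two rows that satisfy both conditions contribute a title; the 1977 row is filtered out by the WHERE clause.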
12
Single-Relation Query
Operation
  Begin with the relation in the FROM clause
  Apply the selection indicated by the WHERE clause
  Apply the extended projection indicated by the SELECT clause
Semantics
  To implement this algorithm, think of a tuple variable ranging over each tuple of the relation mentioned in FROM
  Check if the “current” tuple satisfies the WHERE clause
  If so, compute the attributes or expressions of the SELECT clause using the components of this tuple
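The tuple-variable semantics can be sketched directly as a loop. The helper name single_relation_query and the sample tuples below are hypothetical, chosen only to illustrate the three steps; this is not how a real DBMS executes queries.

```python
# Hypothetical sample relation: each tuple is a dict keyed by attribute name.
movie = [
    {"title": "Pretty Woman", "year": 1990, "length": 119, "studioName": "Disney"},
    {"title": "Star Wars", "year": 1977, "length": 124, "studioName": "Fox"},
]


def single_relation_query(relation, where, select):
    """Evaluate SELECT-FROM-WHERE over one relation, one tuple at a time."""
    answer = []
    for t in relation:              # tuple variable ranges over the FROM relation
        if where(t):                # keep only tuples satisfying WHERE
            answer.append(select(t))  # compute the SELECT expressions
    return answer


result = single_relation_query(
    movie,
    where=lambda t: t["studioName"] == "Disney" and t["year"] == 1990,
    select=lambda t: (t["title"], t["length"] * 60),
)
print(result)
```

The select function here computes an expression (length in seconds), which is what "extended projection" refers to.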
13
* In SELECT clauses
When there is one relation in the FROM clause, * in the SELECT clause stands for “all attributes of this relation.”
Example
  select *
  from Movie
  where studioName = 'Disney' and year = 1990
Now, the result has each of the attributes of Movie
14
Projections
Find the title and length of all movies produced by Disney studios in 1990
  select title, length
  from Movie
  where studioName = 'Disney' and year = 1990
Now, the result has the title and length attributes of Movie
15
Renaming Attributes
If you want the result to have different attribute names, use “AS <new name>” to rename an attribute
Example
  select title as name, length as duration
  from Movie
  where studioName = 'Disney' and year = 1990
The result will have name and duration attributes for movies with ..
16
Selection
WHERE clause
  select title
  from movie
  where year > 1970
17
Expressions in SELECT Clauses
Any expression that makes sense can appear as an element of a SELECT clause
Example:
  select title, length * 60 AS durationInsecs
  from Movie
  where studioName = 'Disney' and year = 1990
18
Selections
What you can use in WHERE:
  attribute names of the relation(s) used in the FROM
  comparison operators: =, <>, <, >, <=, >=
  arithmetic operations: stockprice*2
  operations on strings (e.g., “||” for concatenation)
  lexicographic order on strings
  special stuff for comparing dates and times
19
String Comparison
<, >= use lexicographic order
  'fodder' < 'foo'
  'bar' < 'bargain'
20
Patterns
WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches
  s like p or s not like p
  s – string, p – pattern
General form: <Attribute> LIKE <pattern> or <Attribute> NOT LIKE <pattern>
  % in p can match any sequence of 0 or more characters in s
  _ in p matches any one character in s
  s not like p is true iff string s does not match pattern p
21
Patterns
Remember first name of movie – Star
What movie is this?
  select title from movie where title like 'Star %'
Second part of name – 4 letters
  select title from movie where title like 'Star ____'
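A quick way to see the difference between % and _ is to run both patterns on a few titles. This is a sketch assuming Python's sqlite3 module; the helper titles_like and the sample titles are illustrative, and SQLite's LIKE ignores ASCII case by default, which may differ from other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT)")
conn.executemany(
    "INSERT INTO Movie VALUES (?)",
    [("Star Wars",), ("Star Trek",), ("Star Trek Generations",), ("Stargate",)],
)


def titles_like(pattern):
    """Return the titles matching a LIKE pattern, sorted for stable output."""
    rows = conn.execute(
        "SELECT title FROM Movie WHERE title LIKE ? ORDER BY title", (pattern,)
    )
    return [r[0] for r in rows]


any_second_part = titles_like("Star %")    # 'Star ' followed by anything
four_letters = titles_like("Star ____")    # 'Star ' followed by exactly 4 characters
print(any_second_part)
print(four_letters)
```

"Stargate" matches neither pattern because both require a space after "Star".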
22
Important Details
SQL is case-insensitive
  In general, upper and lower case characters are the same, except inside quoted strings
Two single quotes inside a string represent the single-quote (apostrophe)
Conditions in the WHERE clause can use AND, OR, NOT, and parentheses in the usual way Boolean conditions are built
23
String Comparison
Movies with title starting with s
  select title from movie where title like 's%'
24
NULL Values
Tuples in SQL relations can have NULL as a value for one or more components
Meaning depends on context. Two common cases:
  Missing value (unknown value)
    Unknown salary of an employee: we know that every employee has a salary, but we don’t know what it is
  Inapplicable value
    Spouse entry is inapplicable for a single employee
The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN
When any value is compared with NULL, the truth value is UNKNOWN
A query only produces a tuple in the answer if its value for the WHERE clause is TRUE (not FALSE or UNKNOWN)
25
Null
Suppose x is null
An arithmetic operation on a null yields a null
  x + 3 is NULL
Comparing a null value with any other value yields UNKNOWN
  x = 3 is UNKNOWN
26
Three-Valued Logic
To understand how AND, OR, and NOT work in 3-valued logic, think of TRUE = 1, FALSE = 0, and UNKNOWN = ½
  AND = minimum of the two values
  OR = maximum of the two values
  NOT = negation of the truth value: NOT(x) = 1 - x
27
Unknown
  x   y   x AND y   x OR y   NOT x
  U   T   U         T        U
  U   U   U         U        U
  U   F   F         U        U
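The min/max/1-x rule is easy to check mechanically. The sketch below encodes the three truth values as exact fractions (the names and3, or3, and not3 are illustrative) and recomputes the rows where x is UNKNOWN.

```python
from fractions import Fraction

# Truth values encoded as numbers, as on the slide: T = 1, F = 0, U = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)
NAMES = {TRUE: "T", FALSE: "F", UNKNOWN: "U"}


def and3(x, y):
    return min(x, y)   # AND is the minimum of the two values


def or3(x, y):
    return max(x, y)   # OR is the maximum of the two values


def not3(x):
    return 1 - x       # NOT(x) = 1 - x


# Rows of the truth table with x = UNKNOWN: (x, y, x AND y, x OR y, NOT x).
rows = [
    (NAMES[UNKNOWN], NAMES[y], NAMES[and3(UNKNOWN, y)],
     NAMES[or3(UNKNOWN, y)], NAMES[not3(UNKNOWN)])
    for y in (TRUE, UNKNOWN, FALSE)
]
for row in rows:
    print(*row)
```

Fractions are used instead of floats only so that the dictionary lookups are exact.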
28
Example
Can test for NULL explicitly:
  x IS NULL
  x IS NOT NULL
  SELECT *
  FROM Person
  WHERE age < 25 OR age >= 25 OR age IS NULL
Now it includes all Persons
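To see why the IS NULL test matters, compare the row counts with and without it. This sketch assumes Python's sqlite3 module; the Person rows and the helper count are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, age INT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("Ann", 20), ("Bob", 40), ("Cal", None)],  # Cal's age is NULL
)


def count(where):
    """Count the Person tuples for which the WHERE condition is TRUE."""
    return conn.execute(f"SELECT COUNT(*) FROM Person WHERE {where}").fetchone()[0]


# For Cal both comparisons are UNKNOWN, so the first condition is not TRUE.
without_null_test = count("age < 25 OR age >= 25")
with_null_test = count("age < 25 OR age >= 25 OR age IS NULL")
print(without_null_test, with_null_test)
```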
29
Multi-relation Queries
Interesting queries often combine data from more than one relation
We can address several relations in one query by listing them all in the FROM clause
Distinguish attributes of the same name by “<relation>.<attribute>”
  Movie(title, year, length, inColor, studioName, prodC#)
  MovieExec(name, address, cert#, netWorth)
  select name
  from Movie, MovieExec
  where title = 'Star Wars' AND prodC# = cert#
30
Another Example
  Product (name, price, category, maker)
  Purchase (buyer, seller, store, product)
  Company (name, stockPrice, country)
  Person (name, phoneNumber, city)
Find names of people living in Stillwater that bought gizmo products, and the names of the stores they bought from
  SELECT name, store
  FROM Person, Purchase
  WHERE name = buyer AND city = 'Stillwater' AND product = 'gizmo'
31
Disambiguating Attributes
  Product (name, price, category, maker)
  Purchase (buyer, seller, store, product)
  Company (name, stockPrice, country)
  Person (name, phoneNumber, city)
Find names of people buying telephony products:
  SELECT Person.name
  FROM Person, Purchase, Product
  WHERE Person.name = Purchase.buyer
    AND Purchase.product = Product.name
    AND Product.category = 'telephony'
Tip: Always prefix with relation name to make attributes clear
32
Attributes with the same name
  MovieStar(name, address, gender, birthdate)
  MovieExec(name, address, cert#, netWorth)
  select MovieStar.name, MovieExec.name
  from MovieStar, MovieExec
  where MovieStar.address = MovieExec.address
33
Semantics
Almost the same as for single-relation queries:
  Start with the (Cartesian) product of all the relations in the FROM clause
  Apply the selection condition from the WHERE clause
  Project onto the list of attributes and expressions in the SELECT clause
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions
Translation to relational algebra:
  $\Pi_{a_1,\ldots,a_k}(\sigma_{\text{Conditions}}(R_1 \times R_2 \times \cdots \times R_n))$
Select-From-Where queries are precisely Project-Join-Select
34
Explicit Tuple-Variables
Sometimes, a query needs to use two copies of the same relation
Distinguish copies by following the relation name by the name of a tuple-variable, in the FROM clause
  select Star1.name, Star2.name
  from MovieStar Star1, MovieStar Star2
  where Star1.address = Star2.address AND Star1.name < Star2.name
It’s always an option to rename relations this way, even when not essential
  from MovieStar AS Star1, MovieStar AS Star2
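The self-join with two tuple variables can be run as follows. This is a sketch assuming Python's sqlite3 module, with MovieStar cut down to two attributes and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MovieStar (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO MovieStar VALUES (?, ?)",
    [("Alice", "1 Elm St"), ("Bob", "1 Elm St"), ("Carol", "9 Oak Ave")],
)

# Star1 and Star2 are two tuple variables over the same relation.
pairs = conn.execute(
    "SELECT Star1.name, Star2.name "
    "FROM MovieStar AS Star1, MovieStar AS Star2 "
    "WHERE Star1.address = Star2.address AND Star1.name < Star2.name"
).fetchall()
print(pairs)
```

The condition Star1.name < Star2.name keeps a star from being paired with itself and reports each pair only once.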
35
Semantics
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions

Answer = {}
for x1 in R1 do
  for x2 in R2 do
    …..
      for xn in Rn do
        if Conditions then Answer = Answer U {(a1,…,ak)}
return Answer
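The nested-loop semantics can be written as a short runnable sketch. The function name select_from_where and the sample tuples are hypothetical; itertools.product stands in for the n nested loops, and a list is used for the answer so that duplicates are kept, in line with the bag semantics of SELECT-FROM-WHERE.

```python
from itertools import product


def select_from_where(relations, conditions, select):
    """Nested-loop evaluation of SELECT ... FROM R1, ..., Rn WHERE Conditions."""
    answer = []                         # a list, so duplicate tuples are kept
    for xs in product(*relations):      # same as loops x1 in R1, ..., xn in Rn
        if conditions(*xs):
            answer.append(select(*xs))
    return answer


# Hypothetical data: Movie(title, year, prodC) and MovieExec(name, cert).
movie = [("Star Wars", 1977, 555), ("Other Film", 1990, 111)]
movie_exec = [("Exec A", 555), ("Exec B", 111)]

names = select_from_where(
    [movie, movie_exec],
    conditions=lambda m, e: m[0] == "Star Wars" and m[2] == e[1],
    select=lambda m, e: e[0],
)
print(names)
```

A real query processor would avoid enumerating the full Cartesian product; the loops only define what the answer must be.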
36
Ordering
To get output in sorted order
  order by <list of attributes>
Default is ascending
Append DESC for descending
37
Ordering
  Movie(title, year, length, inColor, studioName, prodC#)
To get movies listed by length, shortest first, and among movies of equal length, alphabetically
  select *
  from movie
  where studioName = 'Disney' AND year = 1990
  order by length, title
38
Ordering - descending
Order the tuples of a relation R(A,B) by the sum of the two components of the tuples, highest first
  select *
  from R
  order by A+B DESC
39
SubQueries
Query that is part of another
Can have many levels
A parenthesized SELECT-FROM-WHERE statement (subquery) can be used as a value in a number of places, including FROM and WHERE clauses
Example: in place of a relation in the FROM clause, we can place another query, and then query its result
  Better use a tuple-variable to name tuples of the result
Subqueries that return a single constant (a scalar): this constant can be compared with another value in a WHERE clause
  If a subquery is guaranteed to produce one tuple with one component, then the subquery can be used as a value
  A “single” tuple is often guaranteed by a key constraint
  A run-time error occurs if there is no tuple or more than one tuple
Subqueries that return relations: can be used in many ways in a WHERE clause
40
Subqueries – produce scalar values
Subquery produces a constant that is used by the main query
  Movie(title, year, length, inColor, studioName, prodC#)
  MovieExec(name, address, cert#, netWorth)
  select name
  from Movie, MovieExec
  where title = 'Star Wars' AND prodC# = cert#
41
Subqueries – produce scalar values
  select name
  from MovieExec
  where cert# =
    (select prodC#
     from Movie
     where title = 'Star Wars')
The parenthesized query is the subquery
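The join form and the scalar-subquery form can be compared side by side. This sketch assumes Python's sqlite3 module; the attribute names prodC and cert drop the '#' because SQLite does not accept it in unquoted identifiers, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INT, prodC INT)")
conn.execute("CREATE TABLE MovieExec (name TEXT, cert INT)")
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?)",
    [("Star Wars", 1977, 555), ("Other Film", 1990, 111)],
)
conn.executemany(
    "INSERT INTO MovieExec VALUES (?, ?)", [("Exec A", 555), ("Exec B", 111)]
)

# Two-relation join form.
join_names = conn.execute(
    "SELECT name FROM Movie, MovieExec "
    "WHERE title = 'Star Wars' AND prodC = cert"
).fetchall()

# Scalar-subquery form: the inner query yields the single value 555.
subquery_names = conn.execute(
    "SELECT name FROM MovieExec "
    "WHERE cert = (SELECT prodC FROM Movie WHERE title = 'Star Wars')"
).fetchall()
print(join_names, subquery_names)
```

Behavior when the subquery returns several rows is system-dependent; SQLite, for instance, takes the first row rather than raising the run-time error described in the slides.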
42
Conditions involving relations
exists R – TRUE iff R is not empty
s in R – TRUE iff s is equal to one of the values in R
s > all R – TRUE iff s is greater than every value in R (can be < etc.)
s > any R – TRUE iff s is greater than at least one value in R (can be < etc.)
43
Conditions involving relations
exists, all, in, any can be negated by putting not in front of the expression
not exists R – TRUE iff R is empty
s not in R – TRUE iff s is equal to no value of R
not s > all R – TRUE iff s is not greater than the maximum value in R (can be < etc.)
not s > any R – TRUE iff s is not greater than the minimum value in R (can be < etc.)
44
The IN Operator
<tuple> IN <relation> is true if and only if the tuple is a member of the relation
<tuple> NOT IN <relation> means the opposite
IN-expressions can appear in WHERE clauses
The <relation> is often a subquery
s in R: true if and only if s is equal to one of the values in R
s not in R: true if and only if s is equal to no value in R
45
Conditions involving tuples
  select name
  from MovieExec
  where cert# in
    (select prodC#
     from Movie
     where (title, year) in
       (select movieTitle, movieYear
        from StarsIn
        where starName = 'Harrison Ford'))
46
The Exists Operator
EXISTS( <relation> ) is true if and only if the <relation> is not empty
Being a Boolean-valued operator, EXISTS can appear in WHERE clauses
  select name
  from MovieExec ME
  where not exists
    (select *
     from Movie
     where prodC# = ME.cert# and (title, year) in
       (select movieTitle, movieYear
        from StarsIn
        where starName = 'Harrison Ford'))
47
The Operator ANY
x = ANY( <relation> ) is a Boolean condition meaning that x equals at least one tuple in the relation
Similarly, = can be replaced by any of the comparison operators
Example: x >= ANY( <relation> ) means x is not smaller than at least one tuple in the relation
Note that the tuples must have one component only
48
Correlated Subqueries
Simple subquery
  Subquery evaluated once and result passed to higher level query
Correlated subquery
  Subquery evaluated many times
  Each time, the value for a term in the subquery comes from a tuple variable outside the subquery
49
The Operator ANY in correlated Subqueries
Query: Titles that have been used for 2 or more movies
  1 select title
  2 from Movie as Old
  3 where year < ANY
  4   (select year from Movie where title = Old.title)
The subquery returns the years of movies with the same title; the condition asks for one with a greater year
For each tuple, the subquery determines whether there is a movie with the same title and a greater year
Old.title is used instead of a constant, say, 'Star Wars'; 'Old' is an alias for 'Movie'
Lines 1-3 – each tuple provides a value of Old.title; the query produces a title one fewer times than there are movies with that title
Line 4 – this value of Old.title is used
50
The Operator ALL
x <> ALL( <relation> ) is true if and only if for every tuple t in the relation, x is not equal to t
  That is, x is not a member of the relation
The <> can be replaced by any comparison operator
Example: x >= ALL( <relation> ) means there is no tuple larger than x in the relation
Query: From Sells(bar, beer, price), find the beer(s) sold for the highest price
  SELECT beer
  FROM Sells
  WHERE price >= ALL( SELECT price FROM Sells );
price from the outer Sells must not be less than any price
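The meaning of ALL and ANY maps directly onto Python's built-in all() and any(). The sketch below emulates the highest-price query in plain Python rather than SQL (some systems, SQLite among them, do not implement ALL/ANY); the Sells rows are invented and contain no NULLs.

```python
# Hypothetical Sells(bar, beer, price) tuples.
sells = [
    ("Joe", "Bud", 2.50),
    ("Joe", "Miller", 3.00),
    ("Sue", "Bud", 3.00),
]

# Result of the subquery: SELECT price FROM Sells
prices = [price for (_, _, price) in sells]

# price >= ALL(prices): the outer price is not less than any price.
highest = [beer for (_, beer, price) in sells if all(price >= p for p in prices)]

# price > ANY(prices): the outer price exceeds at least one price.
not_cheapest = [beer for (_, beer, price) in sells if any(price > p for p in prices)]
print(highest, not_cheapest)
```

Both beers appear in the first result because each is sold somewhere at the top price of 3.00.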
51
FROM clause subqueries
Subqueries also can appear in the FROM clause of a query. For example:
  SELECT name
  FROM MovieExec,
       (SELECT producerC#
        FROM Movie, StarsIn
        WHERE title = movieTitle AND year = movieYear AND starName = 'Harrison Ford') Prod
  WHERE cert# = Prod.producerC#
52
Bag (Set) Semantics for SFW Queries
The SELECT-FROM-WHERE statement uses bag semantics
  Selection: preserve the number of occurrences
  Projection: preserve the number of occurrences (no duplicate elimination)
  Cartesian product, join: no duplicate elimination
The default for union, intersection, and difference is set semantics, and is expressed by the following forms, each involving subqueries:
  ( subquery ) UNION ( subquery )
  ( subquery ) INTERSECT ( subquery )
  ( subquery ) EXCEPT ( subquery )
Union, intersection, and difference eliminate duplicates: bags are converted to sets
53
Union, intersection, difference
  (select name, address from MovieStar where gender = 'F')
  intersect
  (select name, address from MovieExec where netWorth > …)
54
Union, intersection, difference
  (select name, address from MovieStar)
  except
  (select name, address from MovieExec)
55
Set vs. Bag: Efficiency
When doing projection in relational algebra, it is easier to avoid eliminating duplicates
  Just work tuple-at-a-time
When doing intersection or difference, it is most efficient to sort the relations first
  At that point you may as well eliminate the duplicates anyway
56
Controlling Duplicate Elimination
Force the result to be a set by SELECT DISTINCT
  select distinct name
  from MovieExec, Movie, StarsIn
  where cert# = prodC# AND title = movieTitle AND year = movieYear AND starName = 'Harrison Ford'
Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in . . . UNION ALL . . .
To prevent the elimination of duplicates, follow union, intersect or except by the keyword all
  (select title, year from Movie)
  union all
  (select movieTitle as title, movieYear as year from StarsIn)
  R intersect all S
  R except all S
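The difference between UNION and UNION ALL shows up as soon as the same (title, year) pair occurs more than once. This sketch assumes Python's sqlite3 module with trimmed tables and invented rows; SQLite does not accept parentheses around the operands of a compound SELECT, so they are omitted, and it has no INTERSECT ALL or EXCEPT ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INT)")
conn.execute("CREATE TABLE StarsIn (movieTitle TEXT, movieYear INT, starName TEXT)")
conn.execute("INSERT INTO Movie VALUES ('Star Wars', 1977)")
conn.executemany(
    "INSERT INTO StarsIn VALUES (?, ?, ?)",
    [("Star Wars", 1977, "Harrison Ford"), ("Star Wars", 1977, "Carrie Fisher")],
)

# Set semantics: duplicates are eliminated.
set_union = conn.execute(
    "SELECT title, year FROM Movie "
    "UNION SELECT movieTitle, movieYear FROM StarsIn"
).fetchall()

# Bag semantics: every occurrence is kept.
bag_union = conn.execute(
    "SELECT title, year FROM Movie "
    "UNION ALL SELECT movieTitle, movieYear FROM StarsIn"
).fetchall()
print(set_union, len(bag_union))
```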
57
Aggregations
SUM, AVG, COUNT, MIN, and MAX can be applied to a column in a SELECT clause to produce that aggregation on the column
  e.g. COUNT(*) counts the number of tuples
  count(distinct x) counts the number of distinct values in column x
Query: From Sells(bar, beer, price), find the average price of Bud
  SELECT AVG(price)
  FROM Sells
  WHERE beer = 'Bud'
Query: number of tuples in the StarsIn relation
  select count(*)
  from StarsIn
58
Eliminating Duplicates in an Aggregation
DISTINCT inside an aggregation causes duplicates to be eliminated before the aggregation
Query: find the number of different prices charged for Bud:
  SELECT COUNT(DISTINCT price)
  FROM Sells
  WHERE beer = 'Bud';
59
NULLs in Aggregation
NULL never contributes to a sum, average, or count, and can never be the minimum or maximum of a column
But if there are no non-NULL values in a column, then the result of the aggregation is NULL (except for COUNT, which is 0)
  Select count(price)
  From Sells
  Where beer = 'Bud'
The number of bars that sell Bud at a known price
  Select count(*)
  From Sells
  Where beer = 'Bud'
The number of bars that sell Bud
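The two counts can be compared on a table where one price is unknown. This sketch assumes Python's sqlite3 module and made-up Sells rows; Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe", "Bud", 2.5), ("Sue", "Bud", None), ("Moe", "Bud", 3.5)],
)


def scalar(sql):
    """Run a query that returns one row with one component."""
    return conn.execute(sql).fetchone()[0]


known_price = scalar("SELECT COUNT(price) FROM Sells WHERE beer = 'Bud'")
all_bars = scalar("SELECT COUNT(*) FROM Sells WHERE beer = 'Bud'")
average = scalar("SELECT AVG(price) FROM Sells WHERE beer = 'Bud'")
# Only NULL prices qualify here, so the SUM has nothing to add up.
sum_of_nulls = scalar("SELECT SUM(price) FROM Sells WHERE price IS NULL")
print(known_price, all_bars, average, sum_of_nulls)
```

The average is taken over the two known prices only, which is why the NULL row does not pull it down.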
60
Group By
We may follow a SELECT-FROM-WHERE expression by GROUP BY and a list of attributes
The relation that results from the SELECT-FROM-WHERE is grouped according to the values of all those attributes, and any aggregation is applied only within each group
Query: From Movie(title, year, length, inColor, studioName, prodC#), find the sum of the lengths of all movies for each studio
  select studioName, sum(length)
  from movie
  group by studioName
Query: From Sells(bar, beer, price), find the average price for each beer:
  SELECT beer, AVG(price)
  FROM Sells
  GROUP BY beer
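The per-beer average can be checked on a tiny table. This is a sketch assuming Python's sqlite3 module; the rows are invented, and ORDER BY is added only so that the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe", "Bud", 2.0), ("Sue", "Bud", 3.0), ("Joe", "Miller", 4.0)],
)

# One output tuple per group; AVG is computed within each group only.
avg_by_beer = conn.execute(
    "SELECT beer, AVG(price) FROM Sells GROUP BY beer ORDER BY beer"
).fetchall()
print(avg_by_beer)
```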
61
Group By
Steps for group by over several relations:
  Relation R is the Cartesian product of the relations mentioned in the from clause
  The selection of the where clause is applied to R
  Group the tuples of R according to the attributes in the group by clause
  Execute the select clause
62
Group By
  Movie(title, year, length, inColor, studioName, prodC#)
  MovieExec(name, address, cert#, netWorth)
Print a table listing each producer’s total length of film produced
  select name, sum(length)
  from Movie, MovieExec
  where prodC# = cert#
  group by name
63
Example
Query: From Sells(bar, beer, price) and Frequents(drinker, bar), find for each drinker the average price of Bud at the bars they frequent:
  SELECT drinker, AVG(price)
  FROM Frequents, Sells
  WHERE beer = 'Bud' AND Frequents.bar = Sells.bar
  GROUP BY drinker;
Compute drinker-bar-price of Bud tuples first, then group by drinker
64
Restriction on SELECT Lists With Aggregation
If any aggregation is used, then each element of the SELECT list must be either:
  Aggregated, or
  An attribute on the GROUP BY list
Question: How about this query?
  SELECT bar, MIN(price)
  FROM Sells
  WHERE beer = 'Bud';
65
Having Clause
HAVING <condition> may follow a GROUP BY clause. If so, the condition applies to each group, and groups not satisfying the condition are eliminated
These conditions may refer to any relation or tuple-variable in the FROM clause
They may refer to attributes of those relations, as long as the attribute makes sense within a group; i.e., it is either:
  A grouping attribute, or
  Aggregated
66
Full-relation operations
Print each producer’s total length of film produced – only for those producers who made at least one film prior to 1930
  select name, sum(length)
  from Movie, MovieExec
  where prodC# = cert#
  group by name
  having min(year) < 1930
67
Having Clause: Example
  SELECT beer, AVG(price)
  FROM Sells
  GROUP BY beer
  HAVING COUNT(bar) >= 3 OR beer = 'michelob';
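Running the HAVING example on sample data shows which groups survive. This sketch assumes Python's sqlite3 module; the rows are invented so that Bud is sold at three bars, Miller at two, and michelob at one. ORDER BY is added only for deterministic output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [
        ("Joe", "Bud", 2.0), ("Sue", "Bud", 3.0), ("Moe", "Bud", 4.0),
        ("Joe", "Miller", 3.0), ("Sue", "Miller", 3.0),
        ("Moe", "michelob", 5.0),
    ],
)

# Groups are kept if sold at 3+ bars, or if the grouping attribute is 'michelob'.
kept = conn.execute(
    "SELECT beer, AVG(price) FROM Sells GROUP BY beer "
    "HAVING COUNT(bar) >= 3 OR beer = 'michelob' ORDER BY beer"
).fetchall()
print(kept)
```

The Miller group is dropped: it has only two bars and is not named in the condition.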
68
General form of Grouping and Aggregation
  SELECT S
  FROM R1,…,Rn
  WHERE C1
  GROUP BY a1,…,ak
  HAVING C2
S: may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES
C1: any condition on the attributes in R1,…,Rn
C2: any condition on aggregate expressions or grouping attributes
69
General form of Grouping and Aggregation
  SELECT S
  FROM R1,…,Rn
  WHERE C1
  GROUP BY a1,…,ak
  HAVING C2
Evaluation steps:
  1. Compute the FROM-WHERE part, obtain a table with all attributes in R1,…,Rn
  2. Group by the attributes a1,…,ak
  3. Compute the aggregates in C2 and keep only groups satisfying C2
  4. Compute aggregates in S and return the result
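The four evaluation steps can be traced by hand in plain Python. The sketch below follows them for the query "select name, sum(length) ... group by name having min(year) < 1930"; the tuples and their layout are hypothetical, and a real system is free to evaluate the query in any equivalent way.

```python
from collections import defaultdict

# Hypothetical data: Movie(title, year, length, prodC), MovieExec(name, cert).
movie = [
    ("Old Film", 1925, 80, 1),
    ("New Film", 1995, 120, 1),
    ("Other Film", 2000, 100, 2),
]
movie_exec = [("Exec A", 1), ("Exec B", 2)]

# Step 1: FROM-WHERE (Cartesian product filtered by prodC = cert).
joined = [(m, e) for m in movie for e in movie_exec if m[3] == e[1]]

# Step 2: group by the grouping attribute, name.
groups = defaultdict(list)
for m, e in joined:
    groups[e[0]].append(m)

# Step 3: compute the HAVING aggregate and keep groups with min(year) < 1930.
kept = {name: ms for name, ms in groups.items() if min(m[1] for m in ms) < 1930}

# Step 4: compute the SELECT aggregates, here sum(length) per producer.
result = sorted((name, sum(m[2] for m in ms)) for name, ms in kept.items())
print(result)
```

Note that the HAVING test is applied to whole groups, so the 1995 film still counts toward Exec A's total.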