OLTP We will be talking about Online Transaction Processing (OLTP) for most of this course: operational databases. As opposed to OLAP, Online Analytical Processing: decision support.
SQL Chapters 6, 7
SQL or SEQUEL (Structured English Query Language) Based on relational algebra First called 'Square' Developed in the 1970s, released in the early 1980s Standardized: SQL-92 (SQL2), SQL-3, SQL:1999 (SQL-99), SQL:2003 (aka SQL:200n), SQL:2008, SQL:2011 (includes better support for temporal databases), SQL:2016 (access to JSON) https://en.wikipedia.org/wiki/SQL:2016
SQL High-level DB language used in ORACLE, etc. created at IBM with System R SQL provides DDL and DML DDL - create table, drop table, alter table DML - Queries in SQL
SQL Is SQL useful? Languages to learn
Give examples of queries in English What basic keywords would you use for a query language for a DB
SQL Basic building block of SQL is the Select Statement SELECT <attribute list> FROM <table list > [WHERE <search conditions>]
Select Statement Select - chooses columns (project operation π in relational algebra) From - combines tables if > 1 table (join operation ⋈ in relational algebra and Cartesian product ×) Where - chooses rows (select operation σ in relational algebra) Result of a query is usually considered another relation Results may contain duplicate tuples
Queries Select specified columns for all rows of a table Select all columns for some of the rows of a table Select specified columns for some rows of a table Select all rows and columns of a table All of the above for multiple tables
select lname from employee LNAME ---------- Smith Wong Zelaya Wallace Narayan English Jabbar Borg
select salary from employee; SALARY ---------- 30000 40000 25000 43000 38000 55000
Differences with relational model A relation in SQL is not a set of tuples but a multiset (bag) of tuples Therefore, 2 or more tuples may be identical Use distinct to eliminate duplicate tuples Select distinct salary from employee
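The bag-versus-set behavior can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module in place of the class Oracle server; the four sample rows are made up for the example.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Smith", 30000), ("Zelaya", 25000), ("English", 25000), ("Jabbar", 25000)],
)

# Without DISTINCT the result is a bag: equal salaries appear once per row.
all_salaries = [r[0] for r in conn.execute(
    "SELECT salary FROM employee ORDER BY salary")]

# DISTINCT collapses duplicate tuples in the result.
distinct_salaries = [r[0] for r in conn.execute(
    "SELECT DISTINCT salary FROM employee ORDER BY salary")]

print(all_salaries)
print(distinct_salaries)
```

The ORDER BY is only there to make the printed lists deterministic; it does not affect which rows are returned.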
Select Clause Select <attribute list> Attribute list can be: Column names * lists all attributes in a table Constants arithmetic expressions involving columns, etc. Aggregate functions In Oracle, can also be a select statement (but select can only return 1 column and 1 row) To rename an attribute, keyword ‘as’ is optional Select lname as last_name From employee
Queries To retrieve all the attribute values of the selected tuples, a * is used: Select * From Employee
From clause From <table list> Table list can be: one or more table names a select statement itself
Where clause Where <search conditions> You can specify more than one condition in the where clause separated by: and or
Where clause Where <search conditions> (σ in relational algebra) Search conditions can be: Comparison predicate: expr § expr2 where § is <, >, <=, etc. in, between, like, is, etc. expr is constant, col, qual.col, aexpr op aexpr expr2 is expr | select statement expr can be a select statement expr CANNOT be an aggregate – must appear within a select statement
Retrieve the ssn of the employee whose name is 'Smith' SQL> select ssn 2 from employee 3 where lname='Smith'; SSN ---------- 123456789
Miscellaneous SQL is NOT case sensitive: Select from employee is the same as select FROM EMPLOYEE Except when comparing character strings All character strings in SQL are surrounded by single quotes where lname='Smith' However, table names in some RDBMSs (e.g., MySQL) are case sensitive
Our Oracle Server We have an Oracle Server set up for this class: Our server is: vrbsky-oracle.ua-net.ua.edu In order to access this server you need Oracle client software installed on your laptop. One possibility is: SQL developer from the Oracle website http://www.oracle.com/technetwork/developer-tools/sql-developer/downloads/index.html NOTE: to download SQL developer you may need to create an account for the Oracle website. This is NOT the same as your Oracle Server login/pw
www.oracle.com Create your own account for www.oracle.com Download SQL Developer Oracle Client software Oracle Server Located at UA in OIT Use login/PW given to connect to our local Oracle Server vrbsky-oracle.ua-net.ua.edu Using SQL Developer: Create DB tables – commands given Start running SQL queries
Once you have installed the Oracle client you need to tell it where to find the server and your login/pw Your Oracle Server username is your bama id Your Password is your CWID Info about setting up the Oracle client
Once you can connect to our Oracle Server you should create the Company DB by cutting/pasting these SQL statements into your Oracle client. If you need to delete a table: Drop table table_name;
[Figure: the Oracle client sends an SQL query to the Oracle Server (vrbsky-oracle.ua-net.ua.edu), which returns the query result]
What if you want a list of the employee SSNs and department number of the departments they work for? Select SSN, dno From Employee What if you want a list of the employee SSNs and name of the departments they work for?
Combining tuples using where clause To retrieve data that is in more than one table can use: a Cartesian product × List ssn and dname of department employee works for Select ssn, dname From Employee, Department A join operation ⋈ Select ssn, dname From Employee, Department Where dno=dnumber
Combining tuples in from clause A cartesian product combines each tuple in one table, with all the tuples in the second table (and all columns unless specified in select clause) A join combines a tuple from the first table with tuple(s) in the second table if the specified (join) condition is satisfied (again, all columns included unless specified in select clause) This join is also referred to as an inner join
Alternative SQL notation for Join Select ssn, dname From Employee Join Department on dno=dnumber Select lname, relationship From Employee Join Dependent on ssn=essn Where dno=5
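The difference between a Cartesian product and a join, and the equivalence of the two join notations, can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle and a handful of made-up rows shaped like the Employee and Department tables.

```python
import sqlite3

# Hypothetical sample rows, loosely modeled on the Company DB in these slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, dno INTEGER)")
conn.execute("CREATE TABLE department (dname TEXT, dnumber INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("123456789", 5), ("333445555", 5), ("987654321", 4)])
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("Research", 5), ("Administration", 4)])

# Cartesian product: every employee paired with every department (3 x 2 rows).
product = conn.execute("SELECT ssn, dname FROM employee, department").fetchall()

# Join written with a condition in the WHERE clause.
where_join = conn.execute(
    "SELECT ssn, dname FROM employee, department "
    "WHERE dno = dnumber ORDER BY ssn").fetchall()

# The same join written with JOIN ... ON.
on_join = conn.execute(
    "SELECT ssn, dname FROM employee JOIN department "
    "ON dno = dnumber ORDER BY ssn").fetchall()

print(len(product))
print(where_join)
print(on_join == where_join)
```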
Select * From Employee, Dependent
Where clause Select ssn, fname, lname From Employee, Department Where mgrssn=ssn and sex='F' Mgrssn=ssn is a join condition Sex='F' is a select condition Select ssn, fname, lname From Employee Join Department on dno=dnumber Where dname='Research'
Additional characteristics In SQL we can use the same name for 2 or more attributes in different relations. Must qualify the attributes names: employee.lname department.*
Additional characteristics Aliases are used to rename relations List dname of each department and its locations Select dname as dept_name, dlocation From Department D, Dept_Locations DL Where DL.dnumber = D.dnumber NOTE: cannot use the 'as' keyword for table aliases in Oracle
Predicates Predicates evaluate to either T or F. Many of the previous queries can be specified in an alternative form using nesting.
In predicate The in predicate tests set membership for a single value at a time. In predicate: expr [not] in (select | val {, val}) Select <attribute list> From <table list> Where expr in (select | val {, val})
In predicate Select SSN of employees who work in departments located in Houston The outer query selects an Employee tuple if its dno value is in the result of the nested query. Select SSN of employees who do not work in departments located in Houston
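One way the two Houston queries might be written with in and not in is sketched below. The sketch runs the SQL through Python's built-in sqlite3 module rather than Oracle, and the rows are invented sample data.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, dno INTEGER)")
conn.execute("CREATE TABLE dept_locations (dnumber INTEGER, dlocation TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("123456789", 5), ("987654321", 4), ("888665555", 1)])
conn.executemany("INSERT INTO dept_locations VALUES (?, ?)",
                 [(1, "Houston"), (4, "Stafford"), (5, "Bellaire"), (5, "Houston")])

# The nested query returns the set of department numbers located in Houston;
# the outer query keeps an employee if its dno is a member of that set.
in_houston = [r[0] for r in conn.execute(
    "SELECT ssn FROM employee WHERE dno IN "
    "(SELECT dnumber FROM dept_locations WHERE dlocation = 'Houston') "
    "ORDER BY ssn")]

# NOT IN keeps the employees whose dno is not a member of that set.
not_in_houston = [r[0] for r in conn.execute(
    "SELECT ssn FROM employee WHERE dno NOT IN "
    "(SELECT dnumber FROM dept_locations WHERE dlocation = 'Houston') "
    "ORDER BY ssn")]

print(in_houston)
print(not_in_houston)
```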
Quantified predicate Quantified predicate compares a single value with a set according to the predicate. Quantified predicate: expr § [all | any] (select) Select <attribute list> From <table list> Where expr § [all | any] (select) § is < > = <> <= >=
Quantified predicate Write using quantified predicate: Select SSN of employees who work in departments located in Houston Which predicate should be used? = all, = any, > all, etc.?
Quantified predicate What does the following query do? Select * From Employee Where salary > all (Select salary From Employee Where sex = 'F') = any is equivalent to in; not in is equivalent to <> all or != all
Exists predicate The exists predicate tests if a set of rows is non-empty Exists predicate: [not] exists (select) Select <attribute list> From <table list> Where exists (select)
Exists predicate Exists is used to check whether the result of the inner query is empty or not. If the result is NOT empty, then the tuple in the outer query is in the result.
Exists predicate Write using exists predicate: Select SSN of employees who work in departments located in Houston
Exists predicate Exists is used to implement difference ('not in' also used) and intersection.
Exists predicate Retrieve all the names of employees who do not work in a department located in Houston. Retrieves the locations of the department Employee works for to see if one of them is Houston. If none exist (the inner query is empty and not exists is true) the Employee tuple is in the result.
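A possible form of the exists and not exists queries is sketched here, run with Python's built-in sqlite3 module in place of Oracle. The table contents are hypothetical sample rows.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, dno INTEGER)")
conn.execute("CREATE TABLE dept_locations (dnumber INTEGER, dlocation TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Smith", 5), ("Wallace", 4), ("Borg", 1)])
conn.executemany("INSERT INTO dept_locations VALUES (?, ?)",
                 [(1, "Houston"), (4, "Stafford"), (5, "Bellaire"), (5, "Houston")])

# Correlated inner query: for each employee, look for a Houston location
# of that employee's department. EXISTS is true when at least one row is found.
works_in_houston = [r[0] for r in conn.execute(
    "SELECT lname FROM employee e WHERE EXISTS "
    "(SELECT dnumber FROM dept_locations "
    " WHERE dlocation = 'Houston' AND e.dno = dnumber) "
    "ORDER BY lname")]

# NOT EXISTS is true when the inner query returns no rows for that employee.
not_in_houston = [r[0] for r in conn.execute(
    "SELECT lname FROM employee e WHERE NOT EXISTS "
    "(SELECT dnumber FROM dept_locations "
    " WHERE dlocation = 'Houston' AND e.dno = dnumber) "
    "ORDER BY lname")]

print(works_in_houston)
print(not_in_houston)
```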
SSN of employees who work for a department located in Houston
select ssn from employee, dept_locations where dno=dnumber and dlocation='Houston';
select ssn from employee join dept_locations on dno=dnumber where dlocation='Houston';
select ssn from employee where dno in (select dnumber from dept_locations where dlocation='Houston');
select ssn from employee where dno = any (select dnumber from dept_locations where dlocation='Houston');
select ssn from employee where exists (select dnumber from dept_locations where dlocation='Houston' and dno=dnumber);
Note: if the query asks for the Research department you MUST use ‘Research’ in your query NOT dno=5 Why??
Correlated Nested Queries If a condition in the where-clause of a nested query references an attribute of a relation declared in an outer query, the two queries are said to be correlated. The result of a correlated nested query is different for each tuple (or combination of tuples) of the relation in the outer query. Which takes longer to execute? a correlated nested query or a non-correlated nested query?
Join Conditions For every project located in 'Stafford' list the project number, the controlling department number and department manager's last name, address and birthdate. How many join conditions in the above query? How many selection conditions?
Write queries to do the following: List the pno of the project and essn of employees who work more than 20 hours on a project List the pname of the project and essn of employees who work more than 20 hours on a project Optional – for fun: Using one of the predicates List the fname, lname of employees who have a dependent. Email screenshot to tmahjabin@crimson.ua.edu Include HW#1 CS301 and Section 1 in the subject
Join queries List the pname of projects and dname of their departments. List the name of employees who have dependents with the same birthday as they do.
Join Queries List all employee names and their supervisor names select e.lname, s.lname from employee e, employee s where e.superssn=s.ssn;
Single block queries An Expression written using =any or IN may almost always be expressed as a single block query. Find example where this is not true in your textbook
List all employees and the name of any department they manage. Employees who don't manage any department should still be included.
Outer Join Outer Join - extension of join and union In a regular join, tuples in R1 or R2 that do not have matching tuples in the other relation do not appear in the result. Some queries require all tuples in R1 (or R2 or both) to appear in the result When no matching tuples are found, nulls are placed for the missing attributes.
Outer Join You can use the keywords left, right, full (works in Oracle) The following is a left outer join Select lname, dname From Employee Left Outer Join Department on ssn=mgrssn The keyword Outer is optional
LNAME      DNAME
---------- ---------------
Wong       Research
Wallace    Administration
Borg       Headquarters
Jabbar
English
Zelaya
Narayan
Smith
Outer Join You can also use a + to indicate an outer join The following example indicates a left outer join in Oracle Select lname, dname From Employee, Department Where ssn=mgrssn(+) Select lname, dname From Employee Left Outer Join Department on ssn=mgrssn
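The effect of the left outer join can be seen in a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle with a few made-up rows. Only the Left Outer Join form is run, since the (+) notation is Oracle-specific and SQLite does not accept it.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, ssn TEXT)")
conn.execute("CREATE TABLE department (dname TEXT, mgrssn TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Wong", "333445555"), ("Smith", "123456789"), ("Borg", "888665555")])
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("Research", "333445555"), ("Headquarters", "888665555")])

# Inner join: Smith manages no department, so Smith does not appear.
inner = conn.execute(
    "SELECT lname, dname FROM employee JOIN department "
    "ON ssn = mgrssn ORDER BY lname").fetchall()

# Left outer join: every employee appears; dname is NULL (None in Python)
# when no department has that employee as manager.
left_outer = conn.execute(
    "SELECT lname, dname FROM employee LEFT OUTER JOIN department "
    "ON ssn = mgrssn ORDER BY lname").fetchall()

print(inner)
print(left_outer)
```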
Nested queries In general we can have several levels of nested queries. An inner query can reference an attribute in an outer query An outer query cannot reference an attribute in an inner query (like scope rules in higher level languages). A reference to an unqualified attribute refers to the relation declared in the inner most nested query. A reference to an attribute must be qualified if its name is ambiguous.
Will this work? Select ssn, dname from employee Suppose you want the ssn and dname of each employee not in the Research department: Select ssn, dname from employee where dno not in (select dnumber from department where dname='Research')
Nested select statements Select lname, dno From employee Where dno = (select dnumber from department where dname = 'Research'); You need to be careful using this. Result must be a single value Where dno = (select dnumber from dept_locations where dlocation= 'Houston');
More SQL Anything missing to answer typical queries?
Aggregate functions Aggregate Functions (set functions, aggregates): Include COUNT, SUM, MAX, MIN and AVG aggr (col) Find the maximum salary, the minimum salary and the average salary among all employees. Select MAX(salary), MIN(salary), AVG(salary) From Employee
Aggregates Retrieve the total number of employees in the company Select COUNT(*) From Employee Retrieve the number of employees in the research department. Select COUNT(*) From Employee, Department Where dno=dnumber and dname='Research'
Aggregates Note that: Select COUNT(*) from Employee Will give you the same result as: Select COUNT(salary) from Employee Unless there are nulls - not counted To count the number of distinct salaries. Select COUNT(distinct salary) From Employee
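The three counts behave differently once a null is present, as this small sketch suggests. It assumes Python's built-in sqlite3 module in place of Oracle; the rows, including the null salary, are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows; Borg's salary is NULL on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Smith", 30000), ("Zelaya", 25000), ("English", 25000), ("Borg", None)],
)

# COUNT(*) counts rows, COUNT(salary) skips NULLs,
# COUNT(DISTINCT salary) skips NULLs and duplicates.
count_rows, count_salary, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(salary), COUNT(DISTINCT salary) FROM employee"
).fetchone()

print(count_rows, count_salary, count_distinct)
```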
Aggregates Additional aggregates have been added to RDBMS - variance Read the Oracle documentation to see what has been added
List average salary, max salary over all employees List lname, salary for employees with salaries > average salary List lname, salary for employees with salaries > average salary for their department
SELECT dno, lname, salary FROM employee e WHERE salary > (SELECT AVG(salary) FROM employee WHERE e.dno=dno); What if we get rid of the ‘e’ in e.dno?
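The role of the alias can be explored with a sketch like the one below, which runs both versions of the query through Python's built-in sqlite3 module (an assumption; the class uses Oracle) on made-up rows. Without the alias, dno=dno compares the inner table's column with itself, so the subquery becomes the overall average.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER, dno INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Smith", 30000, 5), ("Wong", 40000, 5), ("Narayan", 38000, 5),
     ("Zelaya", 25000, 4), ("Wallace", 33000, 4)],
)

# Correlated: e.dno refers to the outer row, so each employee is compared
# with the average salary of his or her own department.
correlated = [r[0] for r in conn.execute(
    "SELECT lname FROM employee e WHERE salary > "
    "(SELECT AVG(salary) FROM employee WHERE e.dno = dno) ORDER BY lname")]

# Alias removed: both dno references bind to the inner employee table,
# so the condition is always true and AVG is taken over all employees.
uncorrelated = [r[0] for r in conn.execute(
    "SELECT lname FROM employee WHERE salary > "
    "(SELECT AVG(salary) FROM employee WHERE dno = dno) ORDER BY lname")]

print(correlated)
print(uncorrelated)
```

With these rows the department averages are 36000 (dno 5) and 29000 (dno 4), while the overall average is 33200, which is why Wallace appears only in the correlated result.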
List each department name and average salary Difficult to write?
Grouping We can apply the aggregate functions to subgroups of tuples in a relation. Each subgroup of tuples consists of the set of tuples that have the same value for the grouping attribute(s). The aggregate is applied to each subgroup independently. SQL has a group-by clause for specifying the grouping attributes. Group By col {, col}
Grouping For each department, retrieve department name, the average salary Select dname, AVG(salary) From Employee, department where dno=dnumber Group By dname The tables are joined The tuples are divided into groups with the same dname AVG is then applied to each group.
Oracle group by – STANDARD SQL Only grouping attribute(s) and aggregate functions can be listed in the SELECT clause. Expressions in the GROUP BY clause can contain any columns of the tables or views in the FROM clause, regardless of whether the columns appear in the SELECT clause. Some DBMS (e.g. MySQL) do not implement standard SQL In this class everyone will use standard SQL
For each department, retrieve the department number, department name, the average salary and total number of employees. Select dno, dname, AVG(salary), count(*) From Employee, department where dno=dnumber Group By dno, dname
List each department name and average salary We will try writing it without group by later
Write the following SQL queries: list employee name, their department name and number, and salary for employees with salary > $32,000. list department name, department number and average salary
3. What if you want those departments with average salary > 32000?
Having Clause Sometimes we want to retrieve those tuples with certain values for the aggregates (Group By). The having clause is used to specify a selection condition on a group (rather than individual tuples). If a having is specified, you must specify a group by. Having search_condition
With group by / Having List departments with average salary > 32000 select dname, avg(salary) from department, employee where dno=dnumber group by dname having avg(salary) > 32000;
Which departments have an avg salary greater than the overall avg salary? select dname, avg(salary) from department, employee where dno=dnumber group by dname having avg(salary) > (select avg(salary) From employee);
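Both having queries can be tried on a small data set. The sketch below is an illustration only: it uses Python's built-in sqlite3 module instead of Oracle, and the salaries are made up so that one department falls below each threshold.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dname TEXT, dnumber INTEGER)")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER, dno INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("Research", 5), ("Administration", 4), ("Headquarters", 1)])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Smith", 30000, 5), ("Wong", 40000, 5),
     ("Zelaya", 25000, 4), ("Wallace", 35000, 4), ("Borg", 55000, 1)],
)

# HAVING filters whole groups after the aggregate has been computed.
above_32000 = conn.execute(
    "SELECT dname, AVG(salary) FROM department, employee "
    "WHERE dno = dnumber GROUP BY dname "
    "HAVING AVG(salary) > 32000 ORDER BY dname").fetchall()

# The HAVING condition can itself contain a nested query.
above_overall = conn.execute(
    "SELECT dname, AVG(salary) FROM department, employee "
    "WHERE dno = dnumber GROUP BY dname "
    "HAVING AVG(salary) > (SELECT AVG(salary) FROM employee) "
    "ORDER BY dname").fetchall()

print(above_32000)
print(above_overall)
```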
Subselect formal definition Select called Subselect Select expr {, expr} From tablename [alias] {, tablename [alias]} [Where search_condition] [Group By col {, col}] [Having search_condition]
Select Select is really: Subselect {Set_Operation [all] Subselect} [Order By col [asc | desc] {, col [asc | desc]}]
Order By To sort the tuples in a query result based on the values of some attribute: Order by col_list Default is ascending order (asc), but can specify descending order (desc)
Order by Retrieve names of the employees, their department, and salary and order it by department and within each department order the employees salary in descending order. Select lname, fname, dname, salary From department, employee Where dno=dnumber Order by dname, salary desc
Set Operations The Set Operations are: UNION, INTERSECT and MINUS NOTE: MySQL does not have minus The resulting relations are sets of tuples; duplicate tuples are eliminated. Operations apply only to union compatible relations. The two relations must have the same number of attributes and the attributes must be of the same type.
Union SELECT bdate FROM employee UNION SELECT bdate FROM dependent
List essn of employee who work on both pno=1 and pno=2 Select essn from works_on where pno=1 Intersect Select essn from works_on where pno=2
Set operations - Intersect List all project names for projects that are worked on by an employee whose last name is Smith and has Wong as a manager of the department that controls the project (Select pname From Project, Works_on, Employee Where pnumber=pno and essn=ssn and lname='Smith') Intersect (Select pname From Project, Department, Employee Where dnum=dnumber and mgrssn=ssn and lname='Wong')
Minus Example using minus: Select departments that are not located in 'Houston' We wrote using 'in' predicate: Select dnumber from dept_locations where dnumber not in ( Select dnumber from dept_Locations Where dlocation = 'Houston'); Now use minus: Select dnumber from department Minus Select dnumber from dept_locations Where dlocation = 'Houston';
Minus Select departments that are not located in ‘Houston’ Without minus or ‘in’ – does this work? Select dnumber from dept_locations Where dlocation <> 'Houston';
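The two approaches can be compared on a department that has several locations. This sketch uses Python's built-in sqlite3 module as a stand-in for Oracle; note that SQLite spells the operator EXCEPT, whereas Oracle uses MINUS. The rows are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: department 5 has three locations, one of them Houston.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_locations (dnumber INTEGER, dlocation TEXT)")
conn.executemany(
    "INSERT INTO dept_locations VALUES (?, ?)",
    [(1, "Houston"), (4, "Stafford"),
     (5, "Bellaire"), (5, "Sugarland"), (5, "Houston")],
)

# Set difference (EXCEPT in SQLite, MINUS in Oracle): all departments
# minus those that have a Houston location.
with_difference = [r[0] for r in conn.execute(
    "SELECT dnumber FROM dept_locations "
    "EXCEPT "
    "SELECT dnumber FROM dept_locations WHERE dlocation = 'Houston' "
    "ORDER BY dnumber")]

# The <> test is applied row by row, so department 5 slips through
# via its Bellaire and Sugarland rows.
with_not_equal = [r[0] for r in conn.execute(
    "SELECT DISTINCT dnumber FROM dept_locations "
    "WHERE dlocation <> 'Houston' ORDER BY dnumber")]

print(with_difference)
print(with_not_equal)
```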
1:1, 1:N, N:M relationships
Select employees who do not work for dno=4
Select dno of departments that don't have ssn=123456789 as an employee (i.e., ssn=123456789 doesn't work for that department)
//N side of 1:N
Select * from employee where dno<>4;
//1 side of 1:N: NOT correct
Select dno from employee where ssn<>123456789;
//Correct
select dno from employee minus select dno from employee where ssn=123456789;
The difference is: 1:N (on the N-side) or 1:1 //can use <> versus 1:N (on the 1-side) or N:M //cannot use <> What are all the 1:1, 1:N, N:M relationships in the Company DB?
Example Compute the average number of employees over all departments There are several ways to do this, but note that you can do: aggr(aggr(col)) But you will need a group by somewhere in your query select avg(count(essn)) from dependent,employee where ssn=essn(+) group by ssn;
Earlier we considered List each department name and average salary Now try writing it without group by
Try to nest select in select clause select dname, (select avg(salary) as avgsal from employee where dno=dnumber) from department;
Grouping Without the having: list department name, average salary for departments with average salary > $32,000. Will this work? Select dname, avg(salary) From department, employee Where dno=dnumber and avg(salary) > 32000 Group by dname;
//instead these work select dname, avg(salary) from department, employee where dno=dnumber and (select avg(salary) from employee where dno=dnumber) > 32000 group by dname; //try to write it using a nested query in the from clause
select dname, avgsal from (select dno, avg(salary) as avgsal from employee group by dno), department where dno=dnumber and avgsal > 32000
Try to nest select in select clause //Does NOT work!! - can't recognize avgsal if inside () or outside () But it can be written without group by or having select dname, (select avg(salary) as avgsal from employee where dno=dnumber) from department where avgsal > 32000;
Listed as extra credit in homework
Queries Using the student and dept tables Student(ID, Age, GPA, Major) Dept(Name, DeptHead, Location, College) List the dept name and number of majors Assume some departments may not have any majors:
Example - Queries Compute the number of dependents List the essn and number of dependents for employee with dependents List the essn and number of dependents for employees with or without dependents Compute the average number of dependents over employees with dependents select avg(count(essn)) from dependent,employee where ssn=essn(+) group by ssn;
How do we count over all employees?
SQL> select count(*)
  2  from employee, dependent
  3  where ssn=essn
  4  group by essn;
COUNT(*)
----------
3
1
SQL> select essn, count(*)
  2  from employee, dependent
  3  where ssn=essn
  4  group by essn;
ESSN       COUNT(*)
---------- ----------
333445555  3
123456789  3
987654321  1
SQL> select avg(count(*))
  2  from employee, dependent
  3  where ssn=essn
  4  group by essn;
AVG(COUNT(*))
-------------
2.33333333
SQL> select avg(cnt)
  2  from (select count(*) as cnt
  3  from employee, dependent
  4  where ssn=essn group by essn);
AVG(CNT)
----------
2.33333333
select avg(select count(*) from employee, dependent where ssn=essn group by essn) from employee;
Gives an error
DDL – Data Definition in SQL Used to CREATE, DROP and ALTER the descriptions of the relations of a database CREATE TABLE Specifies a new base relation by giving it a name, and specifying each of its attributes and their data types CREATE TABLE name (col1 datatype, col2 datatype, ..)
Data Types Data types: (ANSI SQL vs. Oracle) There are differences between SQL and Oracle, but Oracle will convert the SQL types to its own internal types int, smallint, integer converted to NUMBER Can specify the precision and scale Float and real converted to number Character is char(l) or varchar2(l), varchar(l) still works Have date, blob, etc.
Constraints Constraints are used to specify primary keys, referential integrity constraints, etc. [CONSTRAINT constr_name] PRIMARY KEY need to name it if want to alter it later CONSTRAINT constr_name REFERENCES table (col) The table(col) referenced must exist Constraint names must be unique across database You can also specify NOT NULL for a column You can also specify UNIQUE for a column
Create table – In line constraint definition Create table Project1 (pname varchar2(9) CONSTRAINT pk PRIMARY KEY, pnumber int not null, plocation varchar2(15), dnum int CONSTRAINT fk REFERENCES Department (dnumber), phead int);
Create table To create a table with a composite primary key must use out of line definition: Create table Works_on (essn char(9), pno int, hours number(4,1), PRIMARY KEY (essn, pno));
Oracle Specifics A foreign key may also have more than one column so you need to specify an out of line definition There are differences with the in line When you specify a foreign key constraint out of line, you must specify the FOREIGN KEY keywords and one or more columns. When you specify a foreign key constraint inline, you need only the REFERENCES clause.
Create table – out of line constraint definition Create table Project2 (pname varchar2(9), pnumber int not null, plocation varchar(15), dnum int, phead int, PRIMARY KEY (pname), CONSTRAINT fk FOREIGN KEY (dnum) REFERENCES Department (dnumber));
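How an out of line foreign key constraint is enforced can be sketched as follows. The sketch is an assumption-laden stand-in: it uses Python's built-in sqlite3 module rather than Oracle, SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and the project rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# The referenced table (and its key column) must exist first.
conn.execute("CREATE TABLE department (dname VARCHAR(15), dnumber INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE project2 (
        pname VARCHAR(9),
        pnumber INT NOT NULL,
        plocation VARCHAR(15),
        dnum INT,
        phead INT,
        PRIMARY KEY (pname),
        CONSTRAINT fk FOREIGN KEY (dnum) REFERENCES department (dnumber)
    )""")

# Hypothetical sample rows.
conn.execute("INSERT INTO department VALUES ('Research', 5)")
conn.execute("INSERT INTO project2 VALUES ('ProductX', 1, 'Bellaire', 5, NULL)")

# dnum = 99 matches no department, so the constraint rejects the row.
try:
    conn.execute("INSERT INTO project2 VALUES ('ProductQ', 9, 'Houston', 99, NULL)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

project_count = conn.execute("SELECT COUNT(*) FROM project2").fetchone()[0]
print(fk_rejected, project_count)
```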
DROP TABLE Used to remove a relation and its definition The relation can no longer be used in queries, updates or any other commands since its description no longer exists Drop table dependent;
ALTER TABLE To alter the definition of a table in the following ways: to add a column to add an integrity constraint to redefine a column (datatype, size, default value) – there are some limits to this to enable, disable or drop an integrity constraint or trigger other changes relate to storage, etc.
Alter table - Oracle The table you modify must have been created by you, or you must have the ALTER privilege on the table. If used to add an attribute to one of the base relations, the new attribute will have NULLS in all the tuples of the relation after command is executed; hence, NOT NULL constraint is not allowed for such an attribute. Alter table employee add job varchar(12); The database users must still enter a value for the new attribute job for each employee tuple using the update command. Oracle alter
How to create a table when? CONSTRAINT constr_name REFERENCES table (col) The table(col) referenced must exist Department mgrssn references Employee ssn Employee dno references Department dnumber
Alter is useful when … You have two tables that reference each other. A table must be defined before it is referenced, so how to define them? Department mgrssn references Employee ssn; Employee dno references Department dnumber. Create the Employee table without the referential constraint for dno, then create the Department table with mgrssn referencing Employee ssn, then alter Employee and add the dno referential constraint. Or, when you create the tables, you can disable the references and enable them later.
Updates (DML) Insert, delete and update INSERT Insert into table_name ( [(col1 {, colj})] values (val1 {, valj}) | (col1 {, colj}) subselect ) add a single tuple attribute values must be in the same order as the CREATE table
Insert Insert into Employee values ('Richard', 'K', 'Marini', '654298343', '30-DEC-52', '98 Oak Forest, Katy, TX', 'M', 37000, '987654321', 4); Use null for null values in ORACLE
Insert Alternative form - specify attributes and leave out the attributes that are null Insert into Employee (fname, lname, ssn) values ('Richard', 'Marini', '654298343'); Constraints specified in DDL are enforced when updates are applied.
Insert To insert multiple tuples from existing table: create table ename (name varchar(15)); Table created. insert into ename (select lname from employee); 8 rows created. select * from ename; NAME --------------- Smith Wong Zelaya Wallace Narayan English Jabbar Borg
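The insert-from-select pattern can be sketched with Python's built-in sqlite3 module (an assumption; the slides use Oracle). SQLite expects the subselect without the surrounding parentheses shown in the Oracle session, and the three employee rows are made up.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname VARCHAR(15))")
conn.executemany("INSERT INTO employee VALUES (?)",
                 [("Smith",), ("Wong",), ("Zelaya",)])

conn.execute("CREATE TABLE ename (name VARCHAR(15))")

# Copy one tuple into ename for every row the subselect returns.
cursor = conn.execute("INSERT INTO ename SELECT lname FROM employee")
rows_created = cursor.rowcount

names = [r[0] for r in conn.execute("SELECT name FROM ename ORDER BY name")]
print(rows_created, names)
```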
Importing Data Obviously there are many utilities to import large amounts of data In Oracle SQL developer, use Tools -> Preferences -> Database -> Utilities -> Import Then set File Formats
Delete Delete from table_name [Where search_condition] If a where clause is included to select tuples, they are deleted from the table one at a time The number of tuples deleted depends on the where clause If no where clause is included, all tuples are deleted and the table is empty
Delete Examples: Delete From Employee Where dno = 5; Delete From Employee Where ssn = '123456789'; Delete from Employee Where dno in (Select dnumber From Department Where dname = 'Research'); Delete from Employee;
Update Modifies values of one or more tuples Where clause used to select tuples Set clause specifies the attribute and its new value Only modifies tuples in one relation at a time Update <table name> Set attribute = value {, attribute = value} Where <search conditions>
Update Examples: Update Project Set plocation = 'Bellaire', dnum = 5 Where pnumber = 10 Update Employee Set salary = salary * 1.5 Where dno = (Select dnumber From department Where dname = 'Headquarters')
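The second update, whose where clause uses a nested query, can be tried on a tiny data set. This is a sketch that assumes Python's built-in sqlite3 module in place of Oracle and two invented employee rows.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dname TEXT, dnumber INTEGER)")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER, dno INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("Headquarters", 1), ("Research", 5)])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("Borg", 55000, 1), ("Smith", 30000, 5)])

# The nested query must return a single value (one dnumber) for the = comparison.
cursor = conn.execute(
    "UPDATE employee SET salary = salary * 1.5 "
    "WHERE dno = (SELECT dnumber FROM department WHERE dname = 'Headquarters')")
rows_updated = cursor.rowcount

salaries = dict(conn.execute("SELECT lname, salary FROM employee"))
print(rows_updated, salaries)
```

Only the Headquarters employee is modified; tuples that do not satisfy the where clause keep their old values.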
Metadata To get information about a specific table: Describe employee Lists all attributes and type To get information about all user tables, can query user_tables Select table_name from user_tables
System tables user_tables user_tab_columns user_constraints user_cons_columns user_triggers user_views user_tab_privs user_tab_privs_made (lists privileges granted to others) user_col_privs
Standard SQL What is the deal with MySQL vs. standard SQL? Oracle has standard SQL MySQL does not https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
Example Queries Suppose you have created a table QtrSales (ID, Q1, Q2, Q3, Q4) SQL to compute the total sales for each quarter? SQL to compute the total sales for each ID?