Published by Peregrine Foster. Modified over 8 years ago.
1 Theory, Practice & Methodology of Relational Database Design and Programming
Basic SQL
Copyright © Ellis Cohen 2002-2008
These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db
2 © Ellis Cohen 2001-2008 Overview of Lecture
    The SELECT Statement
    Creating Tables
    Performance & Indexing
    Duplicate Elimination
    Aggregate Functions
    Distinct Aggregation
3 The SELECT Statement
4 Emps Table
    empno  ename   job        hiredate   sal   comm
    -----  ------  ---------  ---------  ----  ----
    7369   SMITH   CLERK      17-DEC-80   800
    7499   ALLEN   SALESMAN   20-FEB-81  1600   300
    7521   WARD    SALESMAN   22-FEB-81  1250   500
    7566   JONES   DEPTMGR    02-APR-81  2975
    7654   MARTIN  SALESMAN   28-SEP-81  1250  1400
    7698   BLAKE   DEPTMGR    01-MAY-81  2850
    7782   CLARK   DEPTMGR    09-JUN-81  2450
    7788   SCOTT   ANALYST    19-APR-87  3000
    7839   KING    PRESIDENT  17-NOV-81  5000
    7844   TURNER  SALESMAN   08-SEP-81  1500     0
    7876   ADAMS   CLERK      23-MAY-87  1100
    7900   JAMES   CLERK      03-DEC-81   950
    7902   FORD    ANALYST    03-DEC-81  3000
    7934   MILLER  CLERK      23-JAN-82  1300
Emps( empno, ename, job, hiredate, sal, comm ), with empno as the primary key
5 Queries
Note the symmetry of lookups.
    Query:   SELECT ename FROM Emps WHERE empno = 7499
    Result:  ENAME
             ------
             ALLEN
    Query:   SELECT empno FROM Emps WHERE ename = 'ALLEN'
    Result:  EMPNO
             -----
             7499
    Query:   SELECT empno, ename FROM Emps WHERE sal > 2975 ORDER BY ename
    Result:  EMPNO  ENAME
             -----  ------
             7902   FORD
             7839   KING
             7788   SCOTT
6 Basic Query Sequence
    SELECT empno, ename      (2. Projection)
    FROM Emps
    WHERE sal > 2000         (1. Restriction)
    ORDER BY ename           (3. Ordering)
It is possible in Oracle to order by renamed / computed fields.
It is possible in SQL Server to also project and restrict using renamed / computed fields.
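The query sequence can be mirrored step by step outside the database. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption, since the slides target Oracle), loads some rows from the Emps slide, and compares the SQL result with restriction, projection, and ordering done by hand.

```python
import sqlite3

# A subset of the rows on the Emps slide: (empno, ename, sal).
EMPS = [
    (7369, "SMITH", 800), (7499, "ALLEN", 1600), (7566, "JONES", 2975),
    (7698, "BLAKE", 2850), (7782, "CLARK", 2450), (7788, "SCOTT", 3000),
    (7839, "KING", 5000), (7902, "FORD", 3000),
]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", EMPS)

sql_rows = conn.execute(
    "SELECT empno, ename FROM Emps WHERE sal > 2000 ORDER BY ename"
).fetchall()

# The same query as three explicit steps.
restricted = [row for row in EMPS if row[2] > 2000]              # 1. restriction (WHERE)
projected = [(empno, ename) for empno, ename, _ in restricted]   # 2. projection (SELECT)
ordered = sorted(projected, key=lambda row: row[1])              # 3. ordering (ORDER BY)

print(sql_rows)
print(ordered == sql_rows)
```

Both routes should produce the same six employees, sorted by name.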
7 Naming Query Result Columns
    SELECT empno AS "Employee Number", ename AS Name
    FROM Emps
    WHERE sal > 2000
    ORDER BY ename

    Employee Number  NAME
    ---------------  --------
    7698             BLAKE
    7782             CLARK
    7902             FORD
    7566             JONES
    7839             KING
    7788             SCOTT
Use double quotes for case-sensitive names or names with embedded blanks.
8 IN
    SELECT empno, ename FROM Emps
    WHERE job IN ('CLERK', 'SALESMAN', 'ANALYST')
What would be the equivalent SQL if you couldn't use IN?
9 IN vs OR
    SELECT empno, ename FROM Emps
    WHERE job IN ('CLERK', 'SALESMAN', 'ANALYST')
is equivalent to
    SELECT empno, ename FROM Emps
    WHERE job = 'CLERK' OR job = 'SALESMAN' OR job = 'ANALYST'
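A quick way to convince yourself that the two forms are equivalent is to run both against the same rows. This is a minimal sketch using Python's sqlite3 module in place of Oracle (an assumption); the ORDER BY empno is added here only so the two result lists can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (7369, "SMITH", "CLERK"), (7499, "ALLEN", "SALESMAN"), (7566, "JONES", "DEPTMGR"),
    (7788, "SCOTT", "ANALYST"), (7839, "KING", "PRESIDENT"),
])

with_in = conn.execute(
    "SELECT empno, ename FROM Emps "
    "WHERE job IN ('CLERK', 'SALESMAN', 'ANALYST') ORDER BY empno"
).fetchall()

with_or = conn.execute(
    "SELECT empno, ename FROM Emps "
    "WHERE job = 'CLERK' OR job = 'SALESMAN' OR job = 'ANALYST' ORDER BY empno"
).fetchall()

print(with_in)
print(with_in == with_or)
```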
10 LIKE
    SELECT empno, ename FROM Emps WHERE job LIKE 'C%'
    SELECT empno, ename FROM Emps WHERE ename LIKE 'SMIT_'
What do these do?
11 LIKE Answer
    SELECT empno, ename FROM Emps WHERE job LIKE 'C%'
        job starts with C
    SELECT empno, ename FROM Emps WHERE ename LIKE 'SMIT_'
        ename has 5 characters and starts with SMIT
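The two wildcards can be tried out on a handful of rows. The sketch below uses Python's sqlite3 module as a stand-in for Oracle (an assumption). The employee SMITHSON is hypothetical and was added only to show that `_` matches exactly one character. One difference to keep in mind: SQLite's LIKE ignores ASCII case by default, while Oracle's is case-sensitive; with all-uppercase data the results agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (7369, "SMITH", "CLERK"),
    (7782, "CLARK", "DEPTMGR"),
    (7934, "MILLER", "CLERK"),
    (9001, "SMITHSON", "CLERK"),   # hypothetical row, not on the slides
])

# '%' matches any run of characters (including none): jobs starting with C.
job_starts_with_c = conn.execute(
    "SELECT empno, ename FROM Emps WHERE job LIKE 'C%' ORDER BY empno"
).fetchall()

# '_' matches exactly one character: 5-character names starting with SMIT.
smit_plus_one = conn.execute(
    "SELECT empno, ename FROM Emps WHERE ename LIKE 'SMIT_' ORDER BY empno"
).fetchall()

print(job_starts_with_c)
print(smit_plus_one)
```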
12 Computation
    SELECT empno, sal, 52*sal FROM Emps
    SELECT empno, sal, 52*sal AS yrsal FROM Emps
13 Functions
    SELECT empno, sal, sqrt(sal) FROM Emps
    SELECT empno, sal, sqrt(sal) AS sqrtsal FROM Emps
Oracle has a wide variety of built-in functions.
14 Parameterless Functions
    SELECT empno, sysdate - hiredate AS daysHired FROM Emps
sysdate: a parameterless function that returns the current date (and time)
hiredate: the date an employee was hired
sysdate - hiredate: the number of days since an employee was hired
15 NULLs in Tables
Each cell in a column has a single value of the datatype specified for that column.
    Emps
    empno  ename    deptno  sal      comm
    int    varchar  int     decimal  decimal    (datatypes)
    -----  -------  ------  -------  -------
    7499   ALLEN    30      1600      300
    7654   MARTIN   30      1250     1400
    7698   BLAKE    30      2850
    7839   KING     10      5000
    7844   TURNER   30      1500        0
    7986   STERN    50      1300
The blank comm cells hold the NULL value.
16 NULLs
NULLs are different from other data values
    - NULL is not the same as 0
    - NULL is not the same as an empty string
NULL is typically used in the following situations
    - A value is Not Applicable
    - A value is known, but is currently Missing or Not Provided
    - A value is Unknown
If a column allows NULLs, document what NULL represents!
17 Nulls in Conditions
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300
SELECT * FROM Emps WHERE …
    comm = 0
    comm != 0
    comm IS NULL
    comm IS NOT NULL
How do we get all employees except those who definitely get a commission?
18 Answer: Nulls in Conditions
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300
How do we get all employees except those who definitely get a commission?
    SELECT * FROM Emps WHERE (comm = 0) OR (comm IS NULL)
or, equivalently,
    SELECT * FROM Emps WHERE nvl( comm, 0 ) = 0
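The four conditions and the two answers can be checked against the sample rows. This is a sketch under one loud assumption: it runs on SQLite through Python's sqlite3 module rather than on Oracle, and SQLite has no nvl, so the standard COALESCE function is used in its place. The helper `names` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, "
    "deptno INTEGER, sal INTEGER, comm INTEGER)"
)
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?, ?, ?)", [
    (7499, "ALLEN", 30, 1600, 300),
    (7654, "MARTIN", 30, 1250, 1400),
    (7698, "BLAKE", 30, 2850, None),   # None is stored as NULL
    (7839, "KING", 10, 5000, None),
    (7844, "TURNER", 30, 1500, 0),
    (7986, "STERN", 50, 1300, None),
])


def names(condition):
    """Hypothetical helper: enames matching a WHERE condition, in empno order."""
    sql = f"SELECT ename FROM Emps WHERE {condition} ORDER BY empno"
    return [ename for (ename,) in conn.execute(sql)]


zero = names("comm = 0")               # NULL rows are not matched
nonzero = names("comm != 0")           # NULL rows are not matched here either
is_null = names("comm IS NULL")
answer_or = names("(comm = 0) OR (comm IS NULL)")
answer_coalesce = names("COALESCE(comm, 0) = 0")   # COALESCE replaces Oracle's nvl

print(zero, nonzero, is_null, answer_or, sep="\n")
```

Note that `comm = 0` and `comm != 0` together do not cover the table: rows where comm is NULL satisfy neither.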
19 Creating Tables
20 Table Creation & Insertion
    CREATE TABLE Emps (
        empno   int primary key,
        ename   varchar(30),
        deptno  number(3),
        sal     number(7,2) DEFAULT 700,   -- the default starting salary
        comm    number(7,2)                -- defaults to NULL
    )
    INSERT INTO Emps VALUES( 7499, 'ALLEN', 30, 1600, 300 )
The following two inserts give the same result:
    INSERT INTO Emps( empno, deptno, ename ) VALUES( 8614, 30, 'LUPIN' )
    INSERT INTO Emps VALUES( 8614, 'LUPIN', 30, 700, NULL )
21 Primary Key
Primary key of a table
    - a column in the table, or a combination of columns (sometimes more than one column is needed)
    - chosen by the database designer
    - uniquely identifies tuples
Requirements
    - All values in the column must be unique
    - None of the values can be NULL
Example
    - empno is the primary key of Emps
22 Check Constraints
    CREATE TABLE Emps (
        empno   int primary key,
        ename   varchar2(30),
        deptno  number(3),
        sal     number(7,2) DEFAULT 700 check( sal >= 100 ),
        comm    number(7,2)
    )
check( sal >= 100 ) ensures that every employee's salary is $100 or more. If an attempt is made to insert a new employee with a salary less than $100, or to change an existing employee's salary so it is less than $100, the database will not allow the operation.
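The effect of DEFAULT and check can be observed directly. The sketch below is an approximation, not the slides' Oracle session: it runs on SQLite via Python's sqlite3 module (an assumption), where the type names are accepted but not enforced as in Oracle. The employee NEWHIRE and the helper `is_rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("""
    CREATE TABLE Emps (
        empno   INTEGER PRIMARY KEY,
        ename   VARCHAR(30),
        deptno  NUMERIC(3),
        sal     NUMERIC(7,2) DEFAULT 700 CHECK (sal >= 100),
        comm    NUMERIC(7,2)
    )
""")

# sal and comm are omitted, so sal takes its default and comm becomes NULL.
conn.execute("INSERT INTO Emps (empno, deptno, ename) VALUES (8614, 30, 'LUPIN')")
lupin = conn.execute("SELECT * FROM Emps WHERE empno = 8614").fetchone()


def is_rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Both statements would leave an employee with a salary below 100.
insert_rejected = is_rejected("INSERT INTO Emps VALUES (8700, 'NEWHIRE', 30, 50, NULL)")
update_rejected = is_rejected("UPDATE Emps SET sal = 99 WHERE empno = 8614")

print(lupin)
print(insert_rejected, update_rejected)
```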
23 Query-Based Create & Insert
    CREATE TABLE RichEmps AS
        SELECT empno, ename FROM Emps WHERE sal > 3000
    INSERT INTO RichEmps
        SELECT empno, ename FROM Emps WHERE (sal > 2000) AND (sal 200)
Both statements copy empno, ename from Emps into RichEmps.
CREATE TABLE is an example of a DDL command.
24 Bare CREATEs
What's the effect of
    CREATE TABLE LikeEmps AS
        SELECT * FROM Emps WHERE empno IS NULL
25 Oracle String Data Types
    CHAR( size )                              Fixed length character string (up to 2K)
    VARCHAR2( maxsize ), VARCHAR( maxsize )   Variable length char string (up to 4K)
    LONG                                      Variable length char string (up to 2G)
    CLOB                                      Character large object (up to 4G)
26 Oracle Numeric Data Types
    NUMBER( dprec )                   Integer
    NUMBER( dprec, scale )            Fixed point
    NUMBER or FLOAT, FLOAT( bprec )   Floating point
27 Oracle Date/Time Types
The DATE datatype holds a date plus a time (which is usually invisible)
Initialization
    - Dates can be initialized from strings using a default format model, e.g. '12-NOV-2003'. Since no time is specified, it is set to midnight.
    - The SYSDATE or CURRENT_DATE functions return the current date/time
Conversion
    - The TO_CHAR function converts a DATE to a string using a customizable format model (which can show hours, minutes & seconds). Without an explicit conversion, dates are automatically converted using the default format model.
    - The TO_DATE function converts a string with a customizable format model to a DATE
Other Date/Time datatypes
    - TIMESTAMP datatypes provide flexibility in the precision of seconds & the way timezones are treated.
    - INTERVAL datatypes are used to represent the difference between date/time values.
28 Performance & Indexing
29 Query Performance
    SELECT empno, ename FROM Emps WHERE sal = 2000
Suppose there are 1M employees.
This requires scanning through all 1M of them to find the ones whose sal = 2000.
If the database is stored on a hard disk, with 10 employees in each disk block, this requires loading 100K disk blocks into memory in order to look through them.
30 Automatic Indexing
    SELECT empno, ename FROM Emps WHERE empno = 8197
The database automatically maintains a data structure (called an index) which it uses to look up the location of a tuple given its primary key (or any attribute whose values are known to be unique).
No need to scan through all 1M employees to find the one whose empno = 8197.
If the database (including the index) is stored on a hard disk, ~4 disk blocks need to be examined to find 8197 in the index, which identifies the one disk block where employee 8197 is stored.
Result: Load ~5 blocks instead of 100K blocks!
31 Creating Indexes
An index can be explicitly created for any attribute (or combination of attributes).
    CREATE INDEX Emp_Index_By_Sal ON Emps( sal )
creates an index for the Emps table based on the sal attribute.
This generally makes it possible to more quickly find the location of all the employees who have a specific salary!
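One way to see an index being picked up is to ask the database for its query plan before and after creating it. The sketch below does this on SQLite via Python's sqlite3 module, which is an assumption: Oracle has its own EXPLAIN PLAN facility and a different optimizer, so this only illustrates the idea. The generated employees and the helper `plan` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER)")

# 1000 synthetic employees with salaries 1000, 1100, ..., 5900 (hypothetical data).
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(n, f"EMP{n}", 1000 + (n % 50) * 100) for n in range(1, 1001)],
)

QUERY = "SELECT empno, ename FROM Emps WHERE sal = 2000"


def plan(sql):
    """Hypothetical helper: SQLite's plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)   # last column holds the description


plan_before = plan(QUERY)   # no index on sal yet, so the table is scanned
conn.execute("CREATE INDEX Emp_Index_By_Sal ON Emps( sal )")
plan_after = plan(QUERY)    # the plan should now mention the new index

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so treat it as diagnostic output rather than a stable format.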
32 Index-Based Performance
    SELECT empno, ename FROM Emps WHERE sal = 2000
Suppose there are 1M employees, but only 1K have a salary of 2000.
If the database (including the sal index) is stored on a hard disk, ~20 disk blocks need to be examined to get the locations of all 1K employees whose sal = 2000.
If all 1K employees are in different blocks, ~1K blocks need to be loaded (actually 20 + 1K), which is certainly better than 100K blocks.
Even better if some of those employees are stored in the same block.
Even better if you can arrange to have the database store all the employees with the same salary together (this is called clustering).
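The block counts quoted on the performance slides follow from simple arithmetic, worked through below in Python. The figures of ~4 and ~20 index blocks are the slides' own estimates; the clustered case is an added assumption (matching employees packed perfectly, 10 per block) that the slides describe only qualitatively.

```python
EMPLOYEES = 1_000_000
EMPS_PER_BLOCK = 10

# Full scan: every block holding employees must be loaded.
full_scan_blocks = EMPLOYEES // EMPS_PER_BLOCK

# Lookup by primary key: ~4 index blocks (slide estimate) plus the 1 data block.
PK_INDEX_BLOCKS = 4
pk_lookup_blocks = PK_INDEX_BLOCKS + 1

# Lookup by sal through an index: ~20 index blocks (slide estimate),
# plus one data block per match if every match lives in a different block.
SAL_INDEX_BLOCKS = 20
MATCHING_EMPS = 1_000
sal_lookup_worst_case = SAL_INDEX_BLOCKS + MATCHING_EMPS

# Assumption for illustration: with perfect clustering by sal, the matches
# fill whole blocks, so only MATCHING_EMPS / EMPS_PER_BLOCK data blocks load.
sal_lookup_clustered = SAL_INDEX_BLOCKS + MATCHING_EMPS // EMPS_PER_BLOCK

print(full_scan_blocks, pk_lookup_blocks, sal_lookup_worst_case, sal_lookup_clustered)
```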
33 Cost of Ordering
    SELECT empno, ename, sal FROM Emps ORDER BY sal
This requires sorting all 1M employees by salary.
Unless they are clustered by salary, indexing doesn't really help.
Because they don't all fit in memory, the sorting must be done using the disk, which could be an order of magnitude slower than just scanning through all the employees.
34 Duplicate Elimination
35 DISTINCT
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300
Selection of a non-unique field:
    SELECT deptno FROM Emps
    DEPTNO
    ------
    30
    30
    30
    10
    30
    50
DISTINCT eliminates duplicates after projecting deptno:
    SELECT DISTINCT deptno FROM Emps
    DEPTNO
    ------
    10
    30
    50
36 Exercise: Restricted Duplicate Elimination
List the departments which have employees who make > 1550.
What is the equivalent SQL?
37 Restricted Duplicate Elimination
List the departments which have employees who make > 1550
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300

    SELECT DISTINCT deptno FROM Emps WHERE sal > 1550
    DEPTNO
    ------
    10
    30
1) Restrict: sal > 1550
2) Project: deptno
3) Eliminate duplicates
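The restrict, project, eliminate-duplicates sequence can be replayed by hand and compared with the SQL result. This is a sketch that uses Python's sqlite3 module in place of Oracle (an assumption); ORDER BY deptno is added only to make the output order predictable.

```python
import sqlite3

# Sample rows from the slide: (empno, ename, deptno, sal).
EMPS = [
    (7499, "ALLEN", 30, 1600), (7654, "MARTIN", 30, 1250), (7698, "BLAKE", 30, 2850),
    (7839, "KING", 10, 5000), (7844, "TURNER", 30, 1500), (7986, "STERN", 50, 1300),
]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER, sal INTEGER)"
)
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?, ?)", EMPS)

sql_depts = [
    deptno for (deptno,) in conn.execute(
        "SELECT DISTINCT deptno FROM Emps WHERE sal > 1550 ORDER BY deptno"
    )
]

restricted = [row for row in EMPS if row[3] > 1550]   # 1) restrict: sal > 1550
projected = [row[2] for row in restricted]            # 2) project: deptno
deduplicated = sorted(set(projected))                 # 3) eliminate duplicates

print(projected)      # duplicates still present after projection
print(deduplicated)
```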
38 Duplicate Elimination Keeps NULLs
    Emps
    empno  ename   deptno  …
    -----  ------  ------
    7219   SOLWAY          …
    8444   TOAD            …
    7839   KING    10      …
    7698   BLAKE   30      …
    7844   TURNER  30      …
    7986   STERN   50      …

    SELECT DISTINCT deptno FROM Emps
    deptno
    ------
    (NULL)
    10
    30
    50
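To see that DISTINCT collapses all NULL deptno values into a single NULL row instead of dropping them, the query can be run on the rows from this slide. The sketch uses Python's sqlite3 module as a stand-in for Oracle (an assumption). The ORDER BY is added for a predictable order; SQLite happens to sort NULL first, whereas Oracle by default sorts it last in ascending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (7219, "SOLWAY", None),   # no department assigned (NULL)
    (8444, "TOAD", None),     # no department assigned (NULL)
    (7839, "KING", 10),
    (7698, "BLAKE", 30),
    (7844, "TURNER", 30),
    (7986, "STERN", 50),
])

distinct_depts = [
    deptno for (deptno,) in conn.execute(
        "SELECT DISTINCT deptno FROM Emps ORDER BY deptno"
    )
]

print(distinct_depts)   # the two NULLs appear once, as a single None
```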
39 Composite Duplicate Elimination
List the distinct jobs within each department
    Emps
    empno  deptno  job
    -----  ------  --------
    7839   10      CLERK
    7499   30      ANALYST
    7654   30      ANALYST
    7698   30      CLERK
    7844   30      SALESMAN
    7986   50      CLERK
    7214   50      CLERK
    7586   50      SALESMAN

    SELECT DISTINCT deptno, job FROM Emps
    deptno  job
    ------  --------
    10      CLERK
    30      ANALYST
    30      CLERK
    30      SALESMAN
    50      CLERK
    50      SALESMAN
Some SQL implementations will order the results in the same way they are grouped, but use ORDER BY to guarantee this:
    SELECT DISTINCT deptno, job FROM Emps ORDER BY deptno, job
40 Distinct Tuples
What is the effect of
    SELECT DISTINCT * FROM Emps
41 Distinct Tuple Answer
What is the effect of
    SELECT DISTINCT * FROM Emps
It lists Emps, eliminating duplicate tuples.
This is the same as Emps, since Emps has a primary key, which ensures that all values of empno, and therefore all tuples, are unique.
42 Performance of Duplicate Elimination
    SELECT DISTINCT deptno FROM Emps
As the database scans through the employees, it maintains a list of deptno's. If the deptno is not already in the list, it has to be added to the list. This all adds a small overhead to the cost of scanning.
If an index for the Emps table based on deptno has already been created, then a list of the distinct deptno's can be obtained directly from the index very efficiently!
43 Aggregate Functions
44 Aggregate Functions
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300
What are the answers?
    SELECT count(*) FROM Emps                   6
    SELECT count(deptno) FROM Emps              ?
    SELECT count(DISTINCT deptno) FROM Emps     ?
    SELECT count(comm) FROM Emps                ?
    SELECT count(DISTINCT comm) FROM Emps       ?
    SELECT max(comm) FROM Emps                  ?
    SELECT sum(comm) FROM Emps                  ?
    SELECT avg(comm) FROM Emps                  ?
    SELECT round(avg(comm),2) FROM Emps         ?
45 Aggregate Function Answers
    Emps
    empno  ename   deptno  sal   comm
    -----  ------  ------  ----  ----
    7499   ALLEN   30      1600   300
    7654   MARTIN  30      1250  1400
    7698   BLAKE   30      2850
    7839   KING    10      5000
    7844   TURNER  30      1500     0
    7986   STERN   50      1300

    SELECT count(*) FROM Emps                   6
    SELECT count(deptno) FROM Emps              6
    SELECT count(DISTINCT deptno) FROM Emps     3
    SELECT count(comm) FROM Emps                3
    SELECT count(DISTINCT comm) FROM Emps       3
    SELECT max(comm) FROM Emps                  1400
    SELECT sum(comm) FROM Emps                  1700
    SELECT avg(comm) FROM Emps                  566.666666
    SELECT round(avg(comm),2) FROM Emps         566.67
The aggregates over comm ignore NULLs.
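The answers can be reproduced by running the same aggregates on the sample rows. This sketch uses Python's sqlite3 module in place of Oracle (an assumption); the helper `scalar` is made up for the example, and the number of digits printed for avg(comm) differs from Oracle's display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, "
    "deptno INTEGER, sal INTEGER, comm INTEGER)"
)
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?, ?, ?)", [
    (7499, "ALLEN", 30, 1600, 300),
    (7654, "MARTIN", 30, 1250, 1400),
    (7698, "BLAKE", 30, 2850, None),
    (7839, "KING", 10, 5000, None),
    (7844, "TURNER", 30, 1500, 0),
    (7986, "STERN", 50, 1300, None),
])


def scalar(expression):
    """Hypothetical helper: evaluate one aggregate expression over Emps."""
    return conn.execute(f"SELECT {expression} FROM Emps").fetchone()[0]


EXPRESSIONS = [
    "count(*)", "count(deptno)", "count(DISTINCT deptno)",
    "count(comm)", "count(DISTINCT comm)",
    "max(comm)", "sum(comm)", "avg(comm)", "round(avg(comm),2)",
]
results = {expression: scalar(expression) for expression in EXPRESSIONS}

for expression, value in results.items():
    print(f"{expression:26} {value}")
```

The average is 1700 / 3, not 1700 / 6: the three NULL commissions are left out of both the sum and the count.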
46 How SQL Treats NULLs
    Unknown          SELECT comm + 3 FROM Emps
    Special Value    SELECT DISTINCT comm FROM Emps
    Ignore           SELECT avg(comm) FROM Emps
47 Restriction & Aggregation
What is the meaning of
    SELECT sum(sal), max(sal) FROM Emps WHERE deptno = 10
assuming every employee has a salary?
48 Answer: Restriction & Aggregation
What is the meaning of
    SELECT sum(sal), max(sal) FROM Emps WHERE deptno = 10
It generates the total & maximum salary of the employees in dept #10.
    SUM(SAL)  MAX(SAL)
    --------  --------
        1700      1400
Generates a result set with 1 row and 2 columns.
49 SQL Query Sequence w Aggregation
    SELECT sum(sal), max(sal)    (2. Projection)
    FROM Emps
    WHERE deptno = 10            (1. Restriction)
50 Aggregates over NULL
Assuming dept 50 has employees but no employees in dept 50 have commissions:
    SELECT count(*) FROM Emps WHERE deptno = 50
        The # of employees in dept 50
What is the result of
    SELECT max(comm) FROM Emps WHERE deptno = 50
    SELECT count(comm) FROM Emps WHERE deptno = 50
    SELECT sum(comm), avg(comm) FROM Emps WHERE deptno = 50
51 Answer: Aggregates over NULL
Assuming dept 50 has employees but no employees in dept 50 have commissions:
    SELECT count(*) FROM Emps WHERE deptno = 50
        The # of employees in dept 50
    SELECT count(comm) FROM Emps WHERE deptno = 50
        0 (zero), there are no employees w commissions
    SELECT max(comm) FROM Emps WHERE deptno = 50
        NULL, nothing to take max of
    SELECT sum(comm), avg(comm) FROM Emps WHERE deptno = 50
        NULL & NULL
        nothing to sum [NOT zero!]
        nothing to average [definitely NOT zero!]
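These answers can be confirmed on a small table in which the only dept 50 employee has no commission. The sketch runs on SQLite through Python's sqlite3 module rather than Oracle (an assumption); Python shows SQL NULL as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute(
    "CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER, comm INTEGER)"
)
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?, ?)", [
    (7499, "ALLEN", 30, 300),
    (7844, "TURNER", 30, 0),
    (7986, "STERN", 50, None),   # dept 50 has an employee, but no commissions
])

count_all, count_comm, max_comm, sum_comm, avg_comm = conn.execute(
    "SELECT count(*), count(comm), max(comm), sum(comm), avg(comm) "
    "FROM Emps WHERE deptno = 50"
).fetchone()

print(count_all)    # employees in dept 50
print(count_comm)   # zero: no non-NULL commissions to count
print(max_comm, sum_comm, avg_comm)   # all NULL (None), not zero
```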
52 No Aggregation in WHERE
Suppose we want to know the departments in which the average employee salary is greater than 2000.
    SELECT deptno FROM Emps WHERE avg(sal) > 2000     WRONG
WHERE specifies a test to be applied to each individual tuple, so it CAN'T include aggregation!
We'll see how to do this LATER!
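Databases reject this query outright rather than returning a wrong answer. The sketch below shows the rejection on SQLite via Python's sqlite3 module (an assumption; Oracle reports its own error for the same mistake, and the message text here is SQLite's).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (7839, 10, 5000), (7698, 30, 2850), (7986, 50, 1300),
])

try:
    conn.execute("SELECT deptno FROM Emps WHERE avg(sal) > 2000")
    error = None
except sqlite3.OperationalError as exc:
    # The statement never runs: the aggregate in WHERE is rejected up front.
    error = str(exc)

print(error)
```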
53 Naming the Result Column
    SELECT max(sal) AS maxsal FROM Emps WHERE deptno = 10
    MAXSAL
    ------
      1400
Not required, but generally a good idea to name the result column when doing any computation.
Generates a result set with 1 row and 1 column.
54 Aggregate Function Exercise
Using Emps( empno, ename, deptno, sal, comm )
Assume sal is the weekly salary, and that all employees work 40 hrs/week.
Write SQL to determine the average hourly salary.
55 SQL Answers: Aggregate Functions
Determine the average hourly salary.
    SELECT avg(sal/40) AS avghsal FROM Emps
or
    SELECT avg(sal)/40 AS avghsal FROM Emps
56 Attribute Aggregation Problem
Using Emps( empno, ename, deptno, job, sal, comm )
If only count(*) were allowed in SQL, but not count( attribute ), how would you write
    SELECT count(job) AS jknt FROM Emps
57 Attribute Aggregation Answer
If only count(*) were allowed in SQL, but not count( attribute ), how would you write
    SELECT count(job) AS jknt FROM Emps
Answer:
    SELECT count(*) AS jknt FROM Emps WHERE job IS NOT NULL
58 Using count(*)
    SELECT count(*) FROM Emps
gives the same result as
    SELECT count(empno) FROM Emps
Use count(*). It is clearer, and in many systems, has better performance.
59 Multiple Aggregation Exercise
Using Emps( empno, ename, deptno, sal, comm )
Write a single SQL statement that determines the average salary of all employees, and the average salary of those in dept 20.
(HINT: Use CASE)
60 Multiple Aggregation Answer
Determine the average salary of all employees, and the average salary of just those in dept 20 (in one statement)
    SELECT avg(sal) AS asal,
           avg(CASE WHEN deptno=20 THEN sal END) AS nsal
    FROM Emps
Notes:
    - CASE without ELSE implies ELSE NULL, so this CASE expression gives sal when deptno = 20, and NULL otherwise
    - avg ignores NULLs, so avg(CASE …) gives the average of the sals in department 20
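The CASE technique can be tried on a few rows. The sample data below is hypothetical (the slides' sample table has no dept 20 employees), and the sketch runs on SQLite through Python's sqlite3 module instead of Oracle, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, deptno INTEGER, sal INTEGER)")

# Hypothetical rows chosen so the two averages differ.
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (1, 10, 5000),
    (2, 20, 1000),
    (3, 20, 3000),
    (4, 30, 3000),
])

asal, nsal = conn.execute("""
    SELECT avg(sal) AS asal,
           avg(CASE WHEN deptno = 20 THEN sal END) AS nsal
    FROM Emps
""").fetchone()

# What the CASE expression yields per row: sal for dept 20, NULL otherwise.
case_values = [
    value for (value,) in conn.execute(
        "SELECT CASE WHEN deptno = 20 THEN sal END FROM Emps ORDER BY empno"
    )
]

print(asal, nsal)
print(case_values)
```

Because avg skips the NULLs produced for the other departments, nsal averages only the two dept 20 salaries.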
61 Distinct Aggregation
62 Distinct Aggregation
How many different departments do employees work in?
    SELECT count(DISTINCT deptno) AS knt FROM Emps
Distinct aggregation can be used with any aggregation function, though it is primarily used with count.
63 Aggregate Function Exercise
Using Emps( empno, ename, deptno, sal, comm )
Write SQL to determine: How many departments have employees whose salary is more than 1500?
64 Answers: Aggregate Function Exercise
How many departments have employees whose salary is more than 1500?
    SELECT count(DISTINCT deptno) FROM Emps WHERE sal > 1500
65 Problem: Distinct Counts With NULLs
Suppose you executed
    CREATE TABLE JustDepts AS SELECT DISTINCT deptno FROM Emps
What's the difference between
    SELECT count(DISTINCT deptno) FROM Emps
    SELECT count(deptno) FROM JustDepts
    SELECT count(*) FROM JustDepts
66 Distinct Counts With NULLs
Suppose you executed
    CREATE TABLE JustDepts AS SELECT DISTINCT deptno FROM Emps
What's the difference between
    SELECT count(DISTINCT deptno) FROM Emps
        this ignores tuples with NULL deptno's (i.e. unassigned employees)
    SELECT count(deptno) FROM JustDepts
        this also ignores unassigned employees
    SELECT count(*) FROM JustDepts
        the count will be one higher if any employees are unassigned
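The three counts can be compared on a table that contains unassigned employees. The sketch reuses the rows from the "Duplicate Elimination Keeps NULLs" slide and runs on SQLite via Python's sqlite3 module instead of Oracle (an assumption); the helper `scalar` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Oracle
conn.execute("CREATE TABLE Emps (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?)", [
    (7219, "SOLWAY", None),   # unassigned employee (NULL deptno)
    (8444, "TOAD", None),     # unassigned employee (NULL deptno)
    (7839, "KING", 10),
    (7698, "BLAKE", 30),
    (7844, "TURNER", 30),
    (7986, "STERN", 50),
])
conn.execute("CREATE TABLE JustDepts AS SELECT DISTINCT deptno FROM Emps")


def scalar(sql):
    """Hypothetical helper: return the single value a query produces."""
    return conn.execute(sql).fetchone()[0]


distinct_in_emps = scalar("SELECT count(DISTINCT deptno) FROM Emps")
non_null_in_justdepts = scalar("SELECT count(deptno) FROM JustDepts")
rows_in_justdepts = scalar("SELECT count(*) FROM JustDepts")

print(distinct_in_emps, non_null_in_justdepts, rows_in_justdepts)
```

JustDepts holds one NULL row for all unassigned employees, which only count(*) includes.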