2
OCL2 Oracle 10g: SQL & PL/SQL, Session #3
Matthew P. Johnson, CISDD, CUNY, January 2005
3
Agenda
Last time: FDs
This time:
1. Anomalies
2. Normalization
4
Review examples: finding FDs
Product(name, price, category, color)
  name, category → price
  category → color
  Keys are: {name, category}
Enrollment(student, address, course, room, time)
  student → address
  room, time → course
  student, course → room, time
  Keys are: [in class]
5
Next topic: Anomalies (3.6)
Identify anomalies in existing schema
How to decompose a relation
Boyce-Codd Normal Form (BCNF)
Recovering information from a decomposition
Third Normal Form
6
Types of anomalies
Redundancy: repeat info unnecessarily in several tuples
Update anomalies: change info in one tuple but not in another
Deletion anomalies: delete some values and lose other values too
Insert anomalies: inserting a row means NULL-ing some fields
7
Example of anomalies
Name     SSN  Mailing-address  Phone
Michael  123  NY               212-111-1111
Michael  123  NY               917-111-1111
Hilary   456  DC               202-222-2222
Hilary   456  DC               914-222-2222
Bill     789  Chappaqua        914-222-2222
Bill     789  Chappaqua        212-333-3333
SSN → Name, Mailing-address, but SSN does not determine Phone
Redundancy: name, address
Update anomaly: Bill moves
Delete anomaly: Bill doesn't pay bills, loses his phones, and we lose Bill!
Underlying cause: SSN-phone is many-many
Effect: partial dependency ssn → name, address
8
Decomposition by projection
Soln: replace anomalous R with projections of R onto two subsets of attributes
Projection: an operation in Relational Algebra
  projection = the SELECT clause in SQL
Projecting R onto attributes (A1, …, An) means removing all other attributes
Result of projection is another relation
  Yields tuples whose fields are A1, …, An
  Resulting duplicates are eliminated
9
Projection for decomposition
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
R1 = projection of R on A1, ..., An, B1, ..., Bm
R2 = projection of R on A1, ..., An, C1, ..., Cp
{A1, ..., An} ∪ {B1, ..., Bm} ∪ {C1, ..., Cp} = all attributes, usually disjoint sets
R1 and R2 may (or may not) be reassembled to produce the original R.
10
Decomposition example
Name     SSN  Mailing-address  Phone
Michael  123  NY               212-111-1111
Michael  123  NY               917-111-1111
Hilary   456  DC               202-222-2222
Hilary   456  DC               914-222-2222
Bill     789  Chappaqua        914-222-2222
Bill     789  Chappaqua        212-333-3333
Break the relation into two:
Name     SSN  Mailing-address
Michael  123  NY
Hilary   456  DC
Bill     789  Chappaqua
SSN  Phone
123  212-111-1111
123  917-111-1111
456  202-222-2222
456  914-222-2222
789  914-222-2222
789  212-333-3333
The anomalies are gone
  No more redundant data
  Easy for Bill to move
  Okay for Bill to lose all phones
11
Thus: high-level strategy
Conceptual Model: [E/R diagram: Person buys Product; attributes name, price, name, ssn]
Relational Model: plus FDs
Normalization: eliminates anomalies
12
Using FDs to produce good schemas
Start with set of relations
Define FDs (and keys) for them based on real world
Transform your relations to "normal form" (normalize them)
  Do this using "decomposition"
Intuitively, good design here means
  No anomalies
  Can reconstruct all original information
13
Decomposition terminology
Projection: eliminating certain attributes from a relation
Decomposition: separating a relation into two by projection
Join: (re)assembling two relations
  Whenever a row from R1 and a row from R2 have the same value for att A, join them to form a row of R3
  We join on the attributes R1 and R2 have in common (As)
If the original data can be reproduced after a decomposition by joining the relations, then the decomposition was lossless
If it can't, the decomposition was lossy
14
Lossless Decompositions
A decomposition is lossless if we can recover:
  R(A,B,C)
  Decompose: R1(B,C), R2(B,A)
  Recover: R'(A,B,C) should be the same as R(A,B,C)
R' is in general larger than R. Must ensure R' = R
15
Lossless decomposition
Name    Price  Category
Word    100    WP
Oracle  1000   DB
Access  100    DB
Decomposed into:
Name    Price          Name    Category
Word    100            Word    WP
Oracle  1000           Oracle  DB
Access  100            Access  DB
Sometimes the data can be reproduced:
  (Word, 100) + (Word, WP) = (Word, 100, WP)
  (Oracle, 1000) + (Oracle, DB) = (Oracle, 1000, DB)
  (Access, 100) + (Access, DB) = (Access, 100, DB)
16
Lossy decomposition
Name    Price  Category
Word    100    WP
Oracle  1000   DB
Access  100    DB
Decomposed into:
Name    Category       Price  Category
Word    WP             100    WP
Oracle  DB             1000   DB
Access  DB             100    DB
Sometimes it's not:
  (Word, WP) + (100, WP) = (Word, 100, WP)
  (Oracle, DB) + (1000, DB) = (Oracle, 1000, DB)
  (Oracle, DB) + (100, DB) = (Oracle, 100, DB)
  (Access, DB) + (1000, DB) = (Access, 1000, DB)
  (Access, DB) + (100, DB) = (Access, 100, DB)
What's wrong?
17
Ensuring lossless decomposition
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
If A1, ..., An → B1, ..., Bm
Then the decomposition is lossless
Note: don't necessarily need A1, ..., An → C1, ..., Cp
Examples:
  name → price, so first decomposition was lossless
  category does not determine name (or price), so second decomposition was lossy
18
Quick lossless/lossy example
X  Y  Z
1  2  3
4  2  5
At a glance: can we decompose into R1(Y,X), R2(Y,Z)?
At a glance: can we decompose into R1(X,Y), R2(X,Z)?
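The two questions above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the helper names `project`, `natural_join`, and `is_lossless_on` are made up for illustration, and the check only tests one particular instance (it says nothing about other instances of the same schema).

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(r1, r2):
    """Join two projected relations on whatever attributes they share."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined

def is_lossless_on(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly this instance."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original

# The X, Y, Z instance from the slide.
xyz = [{"X": 1, "Y": 2, "Z": 3}, {"X": 4, "Y": 2, "Z": 5}]
split_on_y = is_lossless_on(xyz, ["Y", "X"], ["Y", "Z"])  # join on Y
split_on_x = is_lossless_on(xyz, ["X", "Y"], ["X", "Z"])  # join on X

# The Product instance from the lossless/lossy slides.
product = [
    {"Name": "Word", "Price": 100, "Category": "WP"},
    {"Name": "Oracle", "Price": 1000, "Category": "DB"},
    {"Name": "Access", "Price": 100, "Category": "DB"},
]
split_on_name = is_lossless_on(product, ["Name", "Price"], ["Name", "Category"])
split_on_category = is_lossless_on(product, ["Name", "Category"], ["Price", "Category"])
```

Splitting on Y produces spurious tuples because Y = 2 pairs with both X values and both Z values; splitting on X does not, since every X value appears only once.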
19
Next topic: Normal Forms
First Normal Form = all attributes are atomic
  As opposed to set-valued
  Assumed all along
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
20
Most important: BCNF
A simple condition for removing anomalies from relations:
  A relation R is in BCNF if: whenever As → Bs is a non-trivial dependency in R, As is a superkey for R
I.e.: the left side must always contain a key
I.e.: if a set of attributes determines other attributes, it must determine all the attributes
Codd: Ted Codd, IBM researcher, inventor of relational model, 1970
Boyce: Ray Boyce, IBM researcher, helped develop SQL in the 1970s
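The BCNF condition can be tested with attribute closure. Below is a small Python sketch under the assumption that FDs are given as (left side, right side) pairs of attribute tuples; the function names `closure` and `bcnf_violations` are illustrative, not from the slides. It checks only the FDs that are listed, which is enough to decide whether the relation itself is in BCNF.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the listed FDs that are non-trivial and whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        non_trivial = not set(rhs) <= set(lhs)
        is_superkey = set(relation) <= closure(lhs, fds)
        if non_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

# The name/phone relation used earlier in the deck.
person = {"name", "ssn", "address", "phone"}
person_fds = [(("ssn",), ("name", "address"))]

bad = bcnf_violations(person, person_fds)
key_closure = closure({"ssn", "phone"}, person_fds)
```

Here `ssn` alone does not reach `phone`, so `ssn → name, address` is reported as a violation, while `{ssn, phone}` closes over every attribute and is therefore a key.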
21
BCNF decomposition algorithm
Repeat
  choose A1, …, Am → B1, …, Bn that violates the BCNF condition
  split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others])
  continue with both R1 and R2
Until no more violations
// Heuristic: choose Bs as large as possible
[Diagram: R1 covers the As and the Bs; R2 covers the As and the others]
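One way to turn the loop above into running code is sketched below in Python. This is an illustrative sketch rather than the course's implementation: `find_violation` and `bcnf_decompose` are hypothetical names, violations are found by brute force over all attribute subsets (exponential, fine for classroom-sized schemas), and the right side is taken as large as possible, following the heuristic on the slide.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(rel, fds):
    """Find As -> Bs inside rel where As is not a superkey of rel, or None."""
    for size in range(1, len(rel)):
        for lhs in combinations(sorted(rel), size):
            reach = closure(lhs, fds) & rel        # what As determines within rel
            if set(lhs) < reach < rel:             # non-trivial, and not everything
                return set(lhs), reach - set(lhs)  # Bs as large as possible
    return None

def bcnf_decompose(rel, fds):
    """Recursively split rel until no sub-relation has a BCNF violation."""
    rel = frozenset(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]
    lhs, rhs = violation
    r1 = frozenset(lhs | rhs)   # As together with Bs
    r2 = rel - rhs              # As together with the others
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)

# Movie schema and FDs from the larger example later in the deck.
movies = {"title", "year", "studio", "president", "pres_address"}
movie_fds = [
    (("title", "year"), ("studio",)),
    (("studio",), ("president",)),
    (("president",), ("pres_address",)),
]
result = bcnf_decompose(movies, movie_fds)
```

The order in which violations are picked is fixed here by sorting attribute names; a different order can give a different, equally valid BCNF decomposition.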
22
Boyce-Codd Normal Form
Name     SSN  Mailing-address  Phone
Michael  123  NY               212-111-1111
Michael  123  NY               917-111-1111
Name/phone example is not BCNF:
  {ssn, phone} is key
  FD: ssn → name, mailing-address holds
  Violates BCNF: ssn is not a superkey
Its decomposition is BCNF
  Only superkeys → anything else
Name     SSN  Mailing-address
Michael  123  NY
SSN  PhoneNumber
123  212-111-1111
123  917-111-1111
23
BCNF Decomposition
Larger example: multiple decompositions
{Title, Year, Studio, President, Pres-Address}
FDs:
  Title, Year → Studio
  Studio → President
  President → Pres-Address
  Studio → President, Pres-Address (why?)
No many-many this time
Problem cause: transitive FDs:
  Title, Year → Studio → President
24
BCNF Decomposition
Illegal: As → Bs, where As don't include key
Decompose: Studio → President, Pres-Address
  As = {studio}
  Bs = {president, pres-address}
  Cs = {title, year}
Result:
1. Studios(studio, president, pres-address)
2. Movies(studio, title, year)
Is (2) in BCNF?
Is (1) in BCNF?
  Key: Studio
  FD: President → Pres-Address
  Q: Does president → studio?
    If so, president is a key
    But if not, it violates BCNF
25
BCNF Decomposition
Studios(studio, president, pres-address)
Illegal: As → Bs, where As don't include key
Decompose: President → Pres-Address
  As = {president}
  Bs = {pres-address}
  Cs = {studio}
{Studio, President, Pres-Address} becomes
  {President, Pres-Address}
  {Studio, President}
26
BCNF and two-att relations
Must a two-attribute relation be in BCNF?
  Case 1: there are no non-trivial FDs
  Case 2: A → B but not B → A
  Case 3: B → A but not A → B
  Case 4: Both A → B and B → A
Note that relations may have multiple keys
  BCNF requires a key on the left, not all keys
27
Agenda
Last time: Normalization
Homework 1 due now
Project part 2 is up, due on the 19th (Thurs.)
This time:
1. Finish BCNF
2. 3NF
3. 4NF
28
BCNF Review
Q: What's required for BCNF?
Q: What's the slogan for BCNF?
Q: Who are B & C?
Q: What are the two types of violations?
29
BCNF Review
Q: How do we fix a non-BCNF relation?
Q: If As → Bs violates BCNF, what do we do?
Q: In this case, could the decomposition be lossy?
Q: Under what circumstances could a decomposition be lossy?
Q: How do we combine two relations?
30
Decomposition algorithm example
R(N,O,R,P)
F = {N → O, O → R, R → N}
Key: N,P
Name    Office  Residence  Phone
George  Pres.   WH         202-…
George  Pres.   WH         486-…
Dick    VP      NO         202-…
Dick    VP      NO         307-…
Violations of BCNF: N → O, O → R, N → OR
  Which kinds of violations are these?
Pick N → OR (on board)
Can we rejoin? (on board)
What happens if we pick N → O instead?
Can we rejoin? (on board)
31
Lossless BCNF decomposition
Consider simple relation: R(A,B,C)
  Diff vars from text!
Only FD: A → B (assume C does not determine A)
  Also goes through if the assumption is false
Key: A,C
BCNF violation (which kind?): no key on the left
  Q: Since C does not determine A, what kind of bad FD do we have?
  Q: If C → A, then what kind do we have?
Thus: Decomposition to BCNF:
  Create R1(A,B) and R2(A,C)
Could this be lossy?
We will join R1 and R2 on A to find out
32
Lossless BCNF decomposition
Suppose R contains (b,a,c) and (b',a,c')
In projection onto (B,A): (b,a,c) → (b,a), (b',a,c') → (b',a)
In projection onto (A,C): (b,a,c) → (a,c), (b',a,c') → (a,c')
In joining, (b',a), (a,c) → (b',a,c)
Q: Is/must/can this be correct?
A: Yes! A → B, so b = b'
So this was lossless
We assumed C does not determine A, but the argument also goes through when C → A
Moral: BCNF decomp alg is always lossless
33
BCNF summary
BCNF decomposition is lossless
  Can reproduce original by joining
Saw last time: Every 2-attribute relation is in BCNF
Final set of decomposed relations might be different depending on
  Order of bad FDs chosen
But all results will be in BCNF
34
A problem with BCNF
Relation: R(Title, Theater, Neighborhood)
FDs:
  Title, N'hood → Theater
    Assume movie can't play twice in same neighborhood
  Theater → N'hood
Keys:
  {Title, N'hood}
  {Theater, Title}
Title        Theater   N'hood
City of God  Angelica  Village
Fog of War   Angelica  Village
35
A problem with BCNF
BCNF violation: Theater → N'hood
Decompose:
  {Theater, N'hood}
  {Theater, Title}
Resulting relations:
R1:
Theater   N'hood
Angelica  Village
R2:
Theater   Title
Angelica  City of God
Angelica  Fog of War
36
Problem - continued
Suppose we add new rows to R1 and R2:
R1:
Theater     N'hood
Angelica    Village
Film Forum  Village
R2:
Theater     Title
Angelica    City of God
Angelica    Fog of War
Film Forum  City of God
Their join (R'):
Theater     N'hood   Title
Angelica    Village  City of God
Angelica    Village  Fog of War
Film Forum  Village  City of God
R1 and R2 could not enforce the FD Title, N'hood → Theater
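The lost dependency can be seen by replaying the example in a few lines of Python. This is an illustrative sketch with made-up helper names (`holds_theater_to_nhood`, `holds_title_nhood_to_theater`); the rows are the ones shown on the slide.

```python
# R1(theater, nhood) and R2(theater, title) after the new rows are added.
r1 = {("Angelica", "Village"), ("Film Forum", "Village")}
r2 = {
    ("Angelica", "City of God"),
    ("Angelica", "Fog of War"),
    ("Film Forum", "City of God"),
}

def holds_theater_to_nhood(rows):
    """Check Theater -> N'hood on R1 alone (the only FD local to R1)."""
    seen = {}
    return all(seen.setdefault(theater, nhood) == nhood for theater, nhood in rows)

# Natural join of R1 and R2 on theater.
joined = {
    (title, theater, nhood)
    for theater, nhood in r1
    for theater2, title in r2
    if theater == theater2
}

def holds_title_nhood_to_theater(rows):
    """Check Title, N'hood -> Theater on the rejoined relation."""
    seen = {}
    return all(
        seen.setdefault((title, nhood), theater) == theater
        for title, theater, nhood in rows
    )

local_ok = holds_theater_to_nhood(r1)
global_ok = holds_title_nhood_to_theater(joined)
```

Each decomposed relation passes its own local check, yet the join contains City of God playing at two theaters in the Village, so the cross-relation FD is only detectable after joining.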
37
Third normal form: motivation
There are some situations in which
  BCNF is not dependency-preserving, and
  Efficient checking for FD violation on updates is important
In these cases BCNF is too severe a req.
Solution: define a weaker normal form, called Third Normal Form
  in which FDs can be checked on individual relations without performing a join (no inter-relational FDs)
  to which relations can be converted, preserving both data and FDs
38
Third Normal Form
BCNF decomposition is not dependency-preserving!
We now define the (weaker) Third Normal Form
  Turns out: this example was already in 3NF
A relation R is in 3rd normal form if:
  For every nontrivial dependency A1, A2, ..., An → B for R,
  {A1, A2, ..., An} is a super-key for R,
  or B is part of a key, i.e., B is prime
Tradeoff:
  BCNF = no FD anomalies, but may lose some FDs
  3NF = keeps all FDs, but may have some anomalies
39
BCNF: vices and virtues
Be clear on the problem just described v. the arg. that BCNF decomp is lossless
BCNF decomp does not lose data
  Resulting relations can be rejoined to obtain the original
But: it can lose dependencies
  After decomp, possible to add rows whose corresponding rows would be illegal in the (rejoined) original
40
Recap: goals of normalization
When we decompose a relation R with FDs F into R1..Rn we want:
1. Lossless-join decomposition: no data lost
2. No/little redundancy: the relations Ri should be in either BCNF or at least 3NF
3. Dependency preservation: if Fi is the set of dependencies in F+ that include only attributes in Ri:
  F is the "sum" of the FDs of the new relations
  (F1 ∪ F2 ∪ F3 ∪ … ∪ Fn)+ = F+
  Otherwise checking updates for violation of FDs may require computing joins, which is expensive
41
Dependency preservation
Saw that last req. didn't hold in the movie-theater example
Did it hold in the R(N,O,R,P) example? (on board)
42
Testing for 3NF
For each dependency X → Y, use attribute closure to check if X is a superkey
If X is not a superkey, verify that each attribute in Y is prime
  This test is rather more expensive, since it involves finding candidate keys
  Testing for 3NF is NP-complete (in what?)
  Interestingly, decomposition into 3NF can be done in polynomial time
  Testing for 3NF is harder than decomposing into 3NF!
Optimization: need to check only FDs in F, need not check all FDs in F+ (why?)
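The test described above can be sketched in Python as follows. The names `candidate_keys` and `is_3nf` are illustrative, and the key search is a brute-force enumeration of attribute subsets, which reflects why the test is expensive in general.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(rel, fds):
    """All minimal superkeys, found by trying subsets in increasing size."""
    keys = []
    for size in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), size):
            attrs = set(combo)
            is_superkey = set(rel) <= closure(attrs, fds)
            if is_superkey and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys

def is_3nf(rel, fds):
    """For each X -> Y in fds: X is a superkey, or every new attribute in Y is prime."""
    prime = set().union(*candidate_keys(rel, fds))
    for lhs, rhs in fds:
        if set(rel) <= closure(lhs, fds):
            continue                      # X is a superkey
        if not (set(rhs) - set(lhs)) <= prime:
            return False                  # some attribute of Y is not prime
    return True

# R(J, K, L) with JK -> L and L -> K, the 3NF example from the deck.
jkl = {"J", "K", "L"}
jkl_fds = [(("J", "K"), ("L",)), (("L",), ("K",))]
jkl_keys = candidate_keys(jkl, jkl_fds)
jkl_ok = is_3nf(jkl, jkl_fds)

# The movie schema, where studio -> president has a non-prime right side.
movies = {"title", "year", "studio", "president", "pres_address"}
movie_fds = [
    (("title", "year"), ("studio",)),
    (("studio",), ("president",)),
    (("president",), ("pres_address",)),
]
movies_ok = is_3nf(movies, movie_fds)
```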
43
3NF Example
R = (J, K, L)
F = (JK → L, L → K)
Two candidate keys: JK and JL
R is in 3NF
  JK → L: JK is a superkey
  L → K: K is prime
BCNF decomposition yields
  R1 = (L,K), R2 = (L,J)
  testing for JK → L requires a join
There is some redundancy in R
44
BCNF and 3NF Comparison
Example of problems due to redundancy in 3NF
R = (J, K, L)
F = (JK → L, L → K)
J     K   L
j1    k1  l1
j2    k1  l1
j3    k1  l1
NULL  k2  l2
A schema that is in 3NF but not BCNF has the problems of:
  redundancy (e.g., the relationship between l1 and k1)
  need to use null values (if allowed!), e.g. to represent the relationship between l2 and k2 when there is no corresponding value for attribute J
45
Comparison of BCNF and 3NF
It is always possible to decompose a relation into relations in 3NF such that:
  the decomposition is lossless
  the dependencies are preserved
It is always possible to decompose a relation into relations in BCNF such that:
  the decomposition is lossless
  but it may not be possible to preserve dependencies
  But may eliminate more redundancy
46
The Normal Forms (so far)
1NF: every attribute has an atomic value
2NF: 1NF and no partial dependencies
3NF: for each FD X → Y either
  it is trivial, or
  X is a superkey, or
  Y is a part of some key
BCNF: 3NF and third 3NF option disallowed
  I.e., 2NF and no transitive dependencies
47
Distinguishing examples
1NF but not 2NF: R(Name, SSN, Mailing-address, Phone)
  Key: SSN, Phone
  Partial: ssn → name, address
2NF but not 3NF: R(Title, Year, Studio, Pres, Pres-Addr)
  Key: Title, Year
  Transitive: studio → president
3NF but not BCNF: R(Title, Theater, N'hood)
  Title, N'hood → Theater
  Prime-on-right: Theater → N'hood
48
Design Goals
Goal for a relational database design is:
  No redundancy
  Lossless Join
  Dependency Preservation
If we cannot achieve this, we accept one of
  dependency loss
  use of more expensive inter-relational methods to preserve dependencies
  data redundancy due to use of 3NF
Interesting: SQL does not provide a direct way of specifying FDs other than superkeys
  can specify FDs using assertions, but they are expensive to test
49
3NF
3NF means we may have anomalies
Example: TEACH(student, teacher, subject)
  student, subject → teacher (students not allowed in the same subject with two teachers)
  teacher → subject (each teacher teaches one subject)
Subject is prime, so this is 3NF
  But we have anomalies:
  Insertion: cannot insert a teacher until we have a student taking his subject
If we convert to BCNF, we lose student, subject → teacher
50
BCNF and over-normalization
What is the problem?
Schema overload: trying to capture two meanings:
  1) subject X can be taught by teacher Y
  2) student Z takes subject W from teacher V
What to do?
  3NF has anomalies, normalizing to BCNF loses FDs
One soln: keep the 3NF TEACH and another (BCNF) relation SUBJECT-TAUGHT(teacher, subject)
  Still (more!) redundancy, but no more insert and delete anomalies
51
Normalization Review
Q: What's required for BCNF?
Q: What are the two types of violations?
Q: What's the loophole for 3NF?
Q: How do we fix a non-BCNF relation?
52
Normalization Review
Q: If As → Bs violates BCNF, what do we do?
Q: In this case, could the decomposition be lossy?
Q: How do we combine two relations?
Q: Can BCNF decomp. lose FDs?
Q: Can 3NF decomp. lose FDs?
53
Redundancy in BCNF
Name     Streets               Cities      Jobs
Michael  111 East 60th Street  New York    Mayor
Michael  222 Brompton Road     London      Mayor
Michael  111 East 60th Street  New York    CEO
Michael  222 Brompton Road     London      CEO
Hilary   333 Some Street       Chappaqua   Senator
Hilary   444 Embassy Row       Washington  Senator
Hilary   333 Some Street       Chappaqua   First Lady
Hilary   444 Embassy Row       Washington  First Lady
Hilary   333 Some Street       Chappaqua   Lawyer
Hilary   444 Embassy Row       Washington  Lawyer
Lots of redundancy!
Key? All fields
  None determined by others!
Non-trivial FDs? None!
In BCNF? Yes!
Now what?
New concept, leading to another normal form: Multivalued dependencies
54
E/R to relational model
[E/R diagram: labels courses, Depts, Computer-allocation, room, number, givenBy, name, chair]
55
High-level agenda
Review keys, FDs
Install Oracle
Start SQL
Lab on Oracle system info/SQL
56
Key/FD review
Product(name, price, category, color)
  name, category → price
  category → color
  Keys are: {name, category}
Enrollment(student, address, course, room, time)
  student → address
  room, time → course
  student, course → room, time
  Keys are: [in class]
57
Next topic: SQL (6.1)
Standard language for querying and manipulating data
  Structured Query Language
Many standards: ANSI SQL, SQL92/SQL2, SQL3/SQL99
Vendors support various subsets/extensions
  We'll do Oracle's version
Basic form (many more bells and whistles in addition):
  SELECT attributes
  FROM relations (possibly multiple, joined)
  WHERE conditions (selections)
58
Data Types in SQL
Characters:
  CHAR(20) -- fixed length
  VARCHAR(40) -- variable length
Numbers:
  BIGINT, INT, SMALLINT, TINYINT
  REAL, FLOAT -- differ in precision
  MONEY
Times and dates:
  DATE
  DATETIME -- SQL Server
59
"Tables"
Table name: Product
Attribute names (header row), then tuples or rows:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
60
Simple SQL Query
Product:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
SELECT * FROM Product WHERE category='Gadgets'
Result ("selection"):
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
61
Simple SQL Query
Product:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100
Result ("selection" and "projection"):
PName        Price    Manufacturer
SingleTouch  $149.99  Canon
MultiTouch   $203.99  Hitachi
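The two queries above can be tried without an Oracle install. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, which is an assumption of this example and not what the course uses; prices are stored as plain numbers rather than the `$`-formatted strings shown in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# Selection: keep whole rows that satisfy the WHERE condition.
gadgets = conn.execute(
    "SELECT * FROM Product WHERE Category = 'Gadgets'"
).fetchall()

# Selection plus projection: filter rows, then keep only three columns.
expensive = conn.execute(
    "SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100"
).fetchall()
```

Without an ORDER BY clause the row order is up to the engine, so any comparison of results should ignore order.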
62
A Notation for SQL Queries
SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100
Input Schema: Product(PName, Price, Category, Manufacturer)
Output Schema: (PName, Price, Manufacturer)
63
SQL
SQL SELECT
  Sometimes called a "projection"
What goes in the WHERE clause:
  x = y, x < y, x <= y, etc.
  For numbers, they have the usual meanings
  For CHAR and VARCHAR: lexicographic ordering
    Expected conversion between CHAR and VARCHAR
  For dates and times, what you expect
64
SQL e.g.
Movies(Title, Year, Length, inColor, Studio, Prdcr#)
Q: How long was Star Wars (1977), in SQL?
Q: Which Fox movies are at least 100 minutes long, in SQL?
65
SQL e.g.
Reps(ssn, name, etc.)
Clients(ssn, name, rssn)
Q: Who are George's clients, in SQL?
Conceptually:
  π_{Clients.name}(σ_{Reps.name='George' AND Reps.ssn=rssn}(Reps × Clients))
66
The LIKE operator
s LIKE p: pattern matching on strings
p may contain two special symbols:
  _ = any single character
  % = zero or more chars
Product(Name, Price, Category, Manufacturer)
Find all products whose name contains 'gizmo':
  SELECT * FROM Product WHERE Name LIKE '%gizmo%'
67
The LIKE operator
Q: What if we want to search for values containing a '%'?
PName LIKE '%%' won't work
Instead, must use escape chars
  In C/C++/Java, prepend '\'
  In SQL, prepend an arbitrary escape char:
    PName LIKE '%x%%' ESCAPE 'x'
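The escape rule is easy to get wrong, so here is a small check using Python's `sqlite3` module as a stand-in engine (an assumption of this sketch; the table contents are made up). With `x` declared as the escape character, `x%` stands for a literal percent sign, so the pattern needs a wildcard on each side of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (PName TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?)",
    [("Gizmo",), ("50% off gizmo",), ("Powergizmo",)],
)

# Two wildcards in a row still match every string, not just those with a '%'.
naive = conn.execute(
    "SELECT PName FROM Product WHERE PName LIKE '%%'"
).fetchall()

# wildcard, escaped literal '%', wildcard: matches names containing a '%'.
escaped = conn.execute(
    "SELECT PName FROM Product WHERE PName LIKE '%x%%' ESCAPE 'x'"
).fetchall()
```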
68
More on escape chars
SQL: no official default escape char
In SQL*Plus: default escape char == '\'
  Can set with SQL> set escape x
Other tools, DBMSs: your mileage may vary
SQL string literals put in ' ':
  'mystring'
Single-quote literals escaped with single-quotes:
  'George''s string'
69
More on single-quotes
Dates with DATE:
  DATE '1948-05-14'
Timestamps with TIMESTAMP:
  TIMESTAMP '1948-05-14 12:00:00'
70
Eliminating Duplicates
SELECT DISTINCT category FROM Product
Category
Gadgets
Photography
Household
Compare to:
SELECT category FROM Product
Category
Gadgets
Gadgets
Photography
Household
71
Ordering the Results
SELECT pname, price, manufacturer FROM Product WHERE category='gizmo' AND price > 50 ORDER BY price, pname
Ordering is ascending, unless you specify the DESC keyword per attribute.
SELECT pname, price, manufacturer FROM Product WHERE category='gizmo' AND price > 50 ORDER BY price DESC, pname ASC
72
Ordering the Results
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
SELECT Category FROM Product ORDER BY PName
Result: ?
73
Ordering the Results
SELECT DISTINCT category FROM Product ORDER BY category
Category
Gadgets
Household
Photography
Compare to:
SELECT DISTINCT category FROM Product ORDER BY PName
Result: ?
74
Joins in SQL
Connect two or more tables:
Product:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
Company:
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan
What is the connection between them?
75
Joins in SQL
Product(pname, price, category, manufacturer)
Company(cname, stockPrice, country)
Find all products under $200 manufactured in Japan; return their names and prices.
SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200
Manufacturer=CName is the join between Product and Company
76
Joins in SQL
Product:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
Company:
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan
SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country='Japan' AND Price <= 200
Result:
PName        Price
SingleTouch  $149.99
77
Joins in SQL
Product(pname, price, category, manufacturer)
Company(cname, stockPrice, country)
Find all countries that manufacture some product in the 'Gadgets' category.
SELECT Country FROM Product, Company WHERE Manufacturer=CName AND Category='Gadgets'
78
Joins in SQL
Product:
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
Company:
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan
SELECT Country FROM Product, Company WHERE Manufacturer=CName AND Category='Gadgets'
Result:
Country
??
What is the problem? What's the solution?
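Both join queries from these slides can be replayed with Python's `sqlite3` module standing in for Oracle (an assumption of this sketch; prices are stored as numbers). The second query shows the problem the slide asks about: one result row per matching product, so a country can appear more than once unless DISTINCT is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.execute("CREATE TABLE Company (CName TEXT, StockPrice INTEGER, Country TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)
conn.executemany(
    "INSERT INTO Company VALUES (?, ?, ?)",
    [("GizmoWorks", 25, "USA"), ("Canon", 65, "Japan"), ("Hitachi", 15, "Japan")],
)

# Products up to $200 made in Japan.
japan_cheap = conn.execute(
    "SELECT PName, Price FROM Product, Company "
    "WHERE Manufacturer = CName AND Country = 'Japan' AND Price <= 200"
).fetchall()

# Countries making some 'Gadgets' product: one row per matching product.
countries = conn.execute(
    "SELECT Country FROM Product, Company "
    "WHERE Manufacturer = CName AND Category = 'Gadgets'"
).fetchall()

# DISTINCT collapses the repeated country into a single row.
distinct_countries = conn.execute(
    "SELECT DISTINCT Country FROM Product, Company "
    "WHERE Manufacturer = CName AND Category = 'Gadgets'"
).fetchall()
```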
79
Joins
Product(pname, price, category, manufacturer)
Purchase(buyer, seller, store, product)
Person(persname, phoneNumber, city)
Find names of Seattleites who bought Gadgets, and the names of the stores they bought such products from.
SELECT DISTINCT persname, store FROM Person, Purchase, Product WHERE persname=buyer AND product=pname AND city='Seattle' AND category='Gadgets'
80
Disambiguating Attributes
Sometimes two relations have the same attr:
  Person(pname, address, worksfor)
  Company(cname, address)
SELECT DISTINCT pname, address FROM Person, Company WHERE worksfor = cname
  Which address?
SELECT DISTINCT Person.pname, Company.address FROM Person, Company WHERE Person.worksfor = Company.cname
81
Tuple Variables
Product(pname, price, category, manufacturer)
Purchase(buyer, seller, store, product)
Person(persname, phoneNumber, city)
Find all stores that sold at least one product that the store 'BestBuy' also sold:
SELECT DISTINCT x.store FROM Purchase AS x, Purchase AS y WHERE x.product = y.product AND y.store = 'BestBuy'
Answer: (store)
82
Tuple Variables
Tuple variables introduced automatically:
  Product(name, price, category, manufacturer)
  SELECT name FROM Product WHERE price > 100
Becomes:
  SELECT Product.name FROM Product AS Product WHERE Product.price > 100
Doesn't work when Product occurs more than once
In that case the user needs to define variables explicitly
83
SQL Query Semantics
SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions
1. Nested loops:
Answer = {}
for x1 in R1 do
  for x2 in R2 do
    …..
      for xn in Rn do
        if Conditions
          then Answer = Answer ∪ {(a1,…,ak)}
return Answer
84
SQL Query Semantics
SELECT a1, a2, …, ak FROM R1 AS x1, R2 AS x2, …, Rn AS xn WHERE Conditions
2. Parallel assignment
Answer = {}
for all assignments x1 in R1, …, xn in Rn do
  if Conditions then Answer = Answer ∪ {(a1,…,ak)}
return Answer
Doesn't impose any order!
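The pseudocode above translates almost directly into Python. In this sketch `evaluate` is a hypothetical helper, relations are lists of tuples, and `itertools.product` enumerates every assignment of one tuple per relation. Like the pseudocode, it collects a set; real SQL keeps duplicates unless DISTINCT is given.

```python
from itertools import product

def evaluate(relations, condition, select):
    """Evaluate SELECT select(...) FROM relations WHERE condition(...)."""
    answer = set()
    for assignment in product(*relations):   # one tuple from each relation
        if condition(*assignment):
            answer.add(select(*assignment))
    return answer

# Rows follow Product(pname, price, category, manufacturer)
# and Company(cname, stockPrice, country) from the join slides.
products = [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
]
companies = [("GizmoWorks", 25, "USA"), ("Canon", 65, "Japan"), ("Hitachi", 15, "Japan")]

japan_cheap = evaluate(
    [products, companies],
    condition=lambda p, c: p[3] == c[0] and c[2] == "Japan" and p[1] <= 200,
    select=lambda p, c: (p[0], p[1]),
)

# If any relation in FROM is empty there are no assignments at all.
nothing = evaluate(
    [products, []],
    condition=lambda p, c: True,
    select=lambda p, c: p[0],
)
```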
85
First Unintuitive SQLism
SELECT R.A FROM R, S, T WHERE R.A=S.A OR R.A=T.A
Looking for R ∩ (S ∪ T)
But what happens if T is empty?
See transcript of this in Oracle on salestranscript
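The effect can be reproduced with Python's `sqlite3` module standing in for Oracle (an assumption of this sketch; the single-column tables and their values are made up). Because the FROM clause ranges over R × S × T, an empty T leaves no combinations to test, even though R and S share a value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("R", "S", "T"):
    conn.execute(f"CREATE TABLE {table} (A INTEGER)")
conn.executemany("INSERT INTO R VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO S VALUES (?)", [(1,)])

QUERY = "SELECT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A"

def run_query():
    """Run the slide's query and return the selected values, sorted."""
    return sorted(row[0] for row in conn.execute(QUERY))

# T is empty: the cross product is empty, so nothing comes back,
# although 1 is in both R and S.
with_empty_t = run_query()

# After T gets a row, both matches appear.
conn.execute("INSERT INTO T VALUES (2)")
with_nonempty_t = run_query()
```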
86
Review
Examples from sqlzoo.net
SELECT L FROM R1, …, Rn WHERE C
corresponds to
π_L(σ_C(R1 × … × Rn))
87
Set/bag ops in SQL
Orthodox SQL has set operators:
  UNION, INTERSECT, EXCEPT
And bag operators:
  UNION ALL, INTERSECT ALL, EXCEPT ALL
These operators are applied to queries:
(SELECT name FROM Person WHERE City='Seattle')
UNION
(SELECT name FROM Person, Purchase WHERE buyer=name AND store='The Bon')
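A small experiment with these operators, using Python's `sqlite3` module as a stand-in engine, is sketched below. Several things here are assumptions of the sketch: the sample rows are invented, SQLite does not accept parentheses around the operand queries (so they are omitted), and it offers UNION ALL but not INTERSECT ALL or EXCEPT ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, city TEXT)")
conn.execute("CREATE TABLE Purchase (buyer TEXT, store TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("Ann", "Seattle"), ("Bob", "Portland"), ("Cid", "Seattle")],
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?)",
    [("Bob", "The Bon"), ("Cid", "The Bon")],
)

SEATTLE = "SELECT name FROM Person WHERE city = 'Seattle'"
BON = "SELECT name FROM Person, Purchase WHERE buyer = name AND store = 'The Bon'"

def run(operator):
    """Combine the two queries with the given operator; return sorted names."""
    sql = f"{SEATTLE} {operator} {BON}"
    return sorted(row[0] for row in conn.execute(sql))

union = run("UNION")          # set union: duplicates removed
union_all = run("UNION ALL")  # bag union: Cid appears once per operand
both = run("INTERSECT")       # in Seattle and bought at The Bon
only_seattle = run("EXCEPT")  # in Seattle but did not buy at The Bon
```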
88
Set/bag ops in Oracle SQL
Oracle SQL uses MINUS rather than EXCEPT
Oracle SQL supports bag op UNION ALL but not INTERSECT ALL or MINUS ALL
See the Ullman page on more differences
89
Disambiguation in Oracle SQL
Can rename fields by
  Select name as n …
  Select name n …
But not by
  Select name=n …
Can rename relations only by
  … from tab t1, tab t2
Lesson: if you get errors, remove all =s, ASs
90
Disambiguation in Oracle SQL
Every selected field must be unambiguous
For R(A,B),
  Select A from R, R
  Select R1.A from R R1, R R2
Consider:
  SQL> Select * from R, R;
  Select * from R, R
  *
  ERROR at line 1:
  ORA-00918: column ambiguously defined
Why?
  * is shorthand for all fields, each must be unambiguous
  Select * from R R1, R R2