Download presentation
Presentation is loading. Please wait.
Published byAda Johnston Modified over 9 years ago
1
Lecture 03: Normal Forms, Constraints, Views Wednesday, October 12, 2011 Dan Suciu -- CSEP544 Fall 20111
2
2 Announcements HW1: was due Monday HW2: due next Monday Dan Suciu -- CSEP544 Fall 2011
3
Outline and Reading Material Schema Normalization: 19.1 – 19.7 –Read only about BCNF, skip 3NF –Note: you need BCNF for HW2 Constraints and triggers: 3.2, 3.3, 5.8 Views: 3.6 –Answering queries using views: Sec. 1,2,3 Dan Suciu -- CSEP544 Fall 20113 Some material is NOT covered in the book. Read the slides carefully: we will not cover all in class!
4
Schema Normalization Dan Suciu -- CSEP544 Fall 20114
5
5 Normal Forms 1st Normal Form = all tables are flat 2nd Normal Form = obsolete Boyce Codd Normal Form = will study 3rd Normal Form = see book 4 th Normal Form = golden standard Dan Suciu -- CSEP544 Fall 2011
6
6 First Normal Form (1NF) A database schema is in First Normal Form if all tables are flat NameGPA Course s Alice3.8 Bob3.7 Carol3.9 Math DB OS DB OS Math OS Student NameGPA Alice3.8 Bob3.7 Carol3.9 Student Course Math DB OS StudentCourse AliceMath CarolMath AliceDB BobDB AliceOS CarolOS Takes Course May need to add keys
7
7 Relational Schema Design Person buys Product name pricenamessn Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies Dan Suciu -- CSEP544 Fall 2011
8
8 Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Updated anomalies: need to change in several places Delete anomalies: may lose data when we don’t want Dan Suciu -- CSEP544 Fall 2011
9
9 Relational Schema Design Anomalies: Redundancy = repeat data Update anomalies = Fred moves to “Bellevue” Deletion anomalies = Joe deletes his phone number: what is his city ? Recall set attributes (persons with several phones): NameSSNPhoneNumberCity Fred123-45-6789206-555-1234Seattle Fred123-45-6789206-555-6543Seattle Joe987-65-4321908-555-2121Westfield One person may have multiple phones, but lives in only one city
10
10 Relation Decomposition Break the relation into two: NameSSNCity Fred123-45-6789Seattle Joe987-65-4321Westfield SSNPhoneNumber 123-45-6789206-555-1234 123-45-6789206-555-6543 987-65-4321908-555-2121 Anomalies have gone: No more repeated data Easy to move Fred to “Bellevue” (how ?) Easy to delete all Joe’s phone number (how ?) NameSSNPhoneNumberCity Fred123-45-6789206-555-1234Seattle Fred123-45-6789206-555-6543Seattle Joe987-65-4321908-555-2121Westfield
11
11 Relational Schema Design (or Logical Design) Main idea: Start with some relational schema Find out its functional dependencies Use them to design a better relational schema Dan Suciu -- CSEP544 Fall 2011
12
12 Functional Dependencies What is a functional dependency? It is a kind of constraint In theory one should find the FDs during requirement analysis, then apply rigorously the normalization steps that we discuss next In practice one rarely follows this strict prescription Dan Suciu -- CSEP544 Fall 2011
13
13 Functional Dependencies Meaning: If two tuples agree on the attributes then they must also agree on the attributes Notation: A 1, A 2, …, A n B 1, B 2, …, B m A 1, A 2, …, A n B 1, B 2, …, B m Dan Suciu -- CSEP544 Fall 2011
14
14 When Does an FD Hold Definition: A 1,..., A m B 1,..., B n holds in R if: t, t’ R, (t.A 1 =t’.A 1 ... t.A m =t’.A m t.B 1 =t’.B 1 ... t.B n =t’.B n ) A1A1...AmAm B1B1 nmnm if t, t’ agree here then t, t’ agree here t t’ R
15
15 Examples EmpID Name, Phone, Position Position Phone but not Phone Position An FD holds, or does not hold on an instance: EmpIDNamePhonePosition E0045Smith1234Clerk E3542Mike9876Salesrep E1111Smith9876Salesrep E9999Mary1234Lawyer Dan Suciu -- CSEP544 Fall 2011
16
16 Example Position Phone EmpIDNamePhonePosition E0045Smith1234Clerk E3542Mike 9876 Salesrep E1111Smith 9876 Salesrep E9999Mary1234Lawyer Dan Suciu -- CSEP544 Fall 2011
17
17 Example EmpIDNamePhonePosition E0045Smith 1234 Clerk E3542Mike9876Salesrep E1111Smith9876Salesrep E9999Mary 1234 Lawyer but not Phone Position Dan Suciu -- CSEP544 Fall 2011
18
18 Example FD’s are constraints: On some instances they hold On others they don’t namecategorycolordepartmentprice GizmoGadgetGreenToys49 TweaksGadgetGreenToys99 Does this instance satisfy all the FDs ? name color category department color, category price name color category department color, category price Dan Suciu -- CSEP544 Fall 2011
19
19 Example namecategorycolordepartmentprice GizmoGadgetGreenToys49 TweaksGadgetBlackToys99 GizmoStationaryGreen Office- supp. 59 What about this one ? name color category department color, category price name color category department color, category price
20
20 An Interesting Observation If all these FDs are true: name color category department color, category price name color category department color, category price Then this FD also holds: name, category price Why ??
21
21 Goal: Find ALL Functional Dependencies Anomalies occur when certain “bad” FDs hold We know some of the FDs Need to find all FDs, then look for the bad ones Dan Suciu -- CSEP544 Fall 2011
22
22 Armstrong’s Rules (1/3) Is equivalent to Splitting rule and Combing rule A1...AmB1...Bm A 1, A 2, …, A n B 1, B 2, …, B m A 1, A 2, …, A n B 1 A 1, A 2, …, A n B 2..... A 1, A 2, …, A n B m A 1, A 2, …, A n B 1 A 1, A 2, …, A n B 2..... A 1, A 2, …, A n B m Dan Suciu -- CSEP544 Fall 2011
23
23 Armstrong’s Rules (2/3) Trivial Rule Why ? A1A1 …AmAm where i = 1, 2,..., n A 1, A 2, …, A n A i Dan Suciu -- CSEP544 Fall 2011
24
24 Armstrong’s Rules (3/3) Transitive Closure Rule If and then Why ? A 1, A 2, …, A n B 1, B 2, …, B m B 1, B 2, …, B m C 1, C 2, …, C p A 1, A 2, …, A n C 1, C 2, …, C p Dan Suciu -- CSEP544 Fall 2011
25
25 A1A1 …AmAm B1B1 …BmBm C1C1...CpCp Dan Suciu -- CSEP544 Fall 2011
26
26 Example (continued) Start with these: Infer these: Inferred FD Which Rule did we apply ? 4. name, category name 5. name, category color 6. name, category category 7. name, category color, category 8. name, category price 1. name color 2. category department 3. color, category price 1. name color 2. category department 3. color, category price
27
27 Example (continued) Answers: Inferred FD Which Rule did we apply ? 4. name, category nameTrivial rule 5. name, category colorTransitivity on 4, 1 6. name, category categoryTrivial rule 7. name, category color, categorySplit/combine on 5, 6 8. name, category priceTransitivity on 3, 7 1. name color 2. category department 3. color, category price 1. name color 2. category department 3. color, category price THIS IS TOO HARD ! There is an easier way.
28
28 Closure of a set of Attributes Given a set of attributes A 1, …, A n The closure, {A 1, …, A n } + = the set of attributes B s.t. A 1, …, A n B Given a set of attributes A 1, …, A n The closure, {A 1, …, A n } + = the set of attributes B s.t. A 1, …, A n B name color category department color, category price name color category department color, category price Example: name + = {name, color} {name, category} + = {name, category, color, department, price} color + = {color}
29
29 Closure Algorithm X={A1, …, An}. repeat until X doesn’t change do: if B 1, …, B n C is a FD and B 1, …, B n are all in X then add C to X. X={A1, …, An}. repeat until X doesn’t change do: if B 1, …, B n C is a FD and B 1, …, B n are all in X then add C to X. {name, category} + = ? [ …in class…] name color category department color, category price name color category department color, category price Example:
30
30 Practice at Home… Compute {A,B} + X = {A, B, } Compute {A, F} + X = {A, F, } R(A,B,C,D,E,F) A, B C A, D E B D A, F B A, B C A, D E B D A, F B Dan Suciu -- CSEP544 Fall 2011
31
31 Why Do We Need Closure With closure we can find all FD’s easily To check if X A –Compute X + –Check if A X + Dan Suciu -- CSEP544 Fall 2011
32
Practice at Home A, B C A, D B B D A, B C A, D B B D Find all FD’s implied by:
33
Practice at Home A, B C A, D B B D A, B C A, D B B D Step 1: Compute X +, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD + = ABCD (no need to compute– why ?) BCD + = BCD, ABCD+ = ABCD A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD + = ABCD (no need to compute– why ?) BCD + = BCD, ABCD+ = ABCD Find all FD’s implied by:
34
Practice at Home A, B C A, D B B D A, B C A, D B B D Step 1: Compute X +, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD + = ABCD (no need to compute– why ?) BCD + = BCD, ABCD+ = ABCD A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD + = ABCD (no need to compute– why ?) BCD + = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X Y, s.t. Y X + and X Y = : AB CD, AD BC, ABC D, ABD C, ACD B Find all FD’s implied by:
35
Keys A superkey is a set of attributes X = A 1,..., A n s.t. for any other attribute B, we have X B Note: X is a superkey if X + = [all attributes] A key is a minimal superkey X How do we compute all keys X ?
36
36 Example Product(name, price, category, color) name, category price category color name, category price category color What are the keys ? Dan Suciu -- CSEP544 Fall 2011
37
37 Example Product(name, price, category, color) name, category price category color name, category price category color (name, category) + = name, category, price, color Hence X = (name, category) is a key What are the keys ?
38
Practice at Home Dan Suciu -- CSEP544 Fall 201138 student address room, time course student, course room, time student address room, time course student, course room, time Find all keys Enrollment(student, address, course, room, time)
39
39 Read at Home We can have more than one key. Example: relation R(A,B,C) satisfies AB C BC A A BC B AC or Find all keys
40
40 Boyce-Codd Normal Form There are no “bad” FDs: A relation R is in BCNF if: Whenever X B is a non-trivial dependency, then X is a superkey. A relation R is in BCNF if: Whenever X B is a non-trivial dependency, then X is a superkey. Equivalently: Dan Suciu -- CSEP544 Fall 2011 A relation R is in BCNF if: X, either X + = X or X + = [all attributes] A relation R is in BCNF if: X, either X + = X or X + = [all attributes]
41
41 Example The only key is: {SSN, PhoneNumber} Hence SSN Name, City is a “bad” dependency SSN Name, City Alternatively: SSN+ = Name, City and is neither SSN nor all attributes NameSSNPhoneNumberCity Fred123-45-6789206-555-1234Seattle Fred123-45-6789206-555-6543Seattle Joe987-65-4321908-555-2121Westfield Joe987-65-4321908-555-1234Westfield
42
42 BCNF Decomposition Algorithm Normalize(R) find X s.t.: X ≠ X + ≠ [all attributes] if (not found) then “R is in BCNF” let Y = X + - X; Z = [all attributes] - X + decompose R into R1(X Y) and R2(X Z) Normalize(R1); Normalize(R2); YXZ X+X+
43
43 Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Find X s.t.: X ≠X + ≠ [all attributes] Dan Suciu -- CSEP544 Fall 2011
44
44 Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Find X s.t.: X ≠X + ≠ [all attributes] Dan Suciu -- CSEP544 Fall 2011 Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) SSN name, age, hairColor phoneNumber
45
45 Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Find X s.t.: X ≠X + ≠ [all attributes] Dan Suciu -- CSEP544 Fall 2011 Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) Iteration 2: P: age+ = age, hairColor Decompose: People(SSN, name, age) Hair(age, hairColor) Phone(SSN, phoneNumber) Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) Iteration 2: P: age+ = age, hairColor Decompose: People(SSN, name, age) Hair(age, hairColor) Phone(SSN, phoneNumber) What are the keys ?
46
46 Practice at Home A B B C R(A,B,C,D) A + = ABC ≠ ABCD R(A,B,C,D) Dan Suciu -- CSEP544 Fall 2011
47
47 Practice at Home What are the keys ? A B B C R(A,B,C,D) A + = ABC ≠ ABCD R(A,B,C,D) What happens if in R we first pick B + ? Or AB + ? R 1 (A,B,C) B + = BC ≠ ABC R 2 (A,D) R 11 (B,C) R 12 (A,B) Dan Suciu -- CSEP544 Fall 2011
48
48 Decompositions in General R 1 = projection of R on A 1,..., A n, B 1,..., B m R 2 = projection of R on A 1,..., A n, C 1,..., C p R(A 1,..., A n, B 1,..., B m, C 1,..., C p ) R 1 (A 1,..., A n, B 1,..., B m ) R 2 (A 1,..., A n, C 1,..., C p ) Dan Suciu -- CSEP544 Fall 2011
49
49 Theory of Decomposition Sometimes it is correct: NamePriceCategory Gizmo19.99Gadget OneClick24.99Camera Gizmo19.99Camera NamePrice Gizmo19.99 OneClick24.99 Gizmo19.99 NameCategory GizmoGadget OneClickCamera GizmoCamera Lossless decomposition
50
50 Incorrect Decomposition Sometimes it is not: NamePriceCategory Gizmo19.99Gadget OneClick24.99Camera Gizmo19.99Camera NameCategory GizmoGadget OneClickCamera GizmoCamera PriceCategory 19.99Gadget 24.99Camera 19.99Camera What’s incorrect ?? Lossy decomposition
51
51 Decompositions in General R(A 1,..., A n, B 1,..., B m, C 1,..., C p ) If A 1,..., A n B 1,..., B m Then the decomposition is lossless R 1 (A 1,..., A n, B 1,..., B m ) R 2 (A 1,..., A n, C 1,..., C p ) BCNF decomposition is always lossless. WHY ? Note: don’t need A 1,..., A n C 1,..., C p
52
Constraints Dan Suciu -- CSEP544 Fall 201152
53
Integrity Constraints in SQL Constraint = a property we want to hold System enforces constraints at updates Constraints in SQL: –Keys, foreign keys –Attribute-level constraints –Tuple-level constraints –Global constraints: assertions Dan Suciu -- CSEP544 Fall 201153
54
54 Keys OR: CREATE TABLE Product ( name CHAR(30) PRIMARY KEY, price INT) CREATE TABLE Product ( name CHAR(30) PRIMARY KEY, price INT) CREATE TABLE Product ( name CHAR(30), price INT, PRIMARY KEY (name)) CREATE TABLE Product ( name CHAR(30), price INT, PRIMARY KEY (name)) Product(name, price) Dan Suciu -- CSEP544 Fall 2011
55
55 Keys with Multiple Attributes CREATE TABLE Product ( name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (name, category)) CREATE TABLE Product ( name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (name, category)) NameCategoryPrice GizmoGadget10 CameraPhoto20 GizmoPhoto30 GizmoGadget40 Product(name, category, price)
56
56 Other Keys CREATE TABLE Product ( productID CHAR(10), name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (productID), UNIQUE (name, category)) CREATE TABLE Product ( productID CHAR(10), name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (productID), UNIQUE (name, category)) There is at most one PRIMARY KEY; there can be many UNIQUE Dan Suciu -- CSEP544 Fall 2011
57
57 Foreign Key Constraints CREATE TABLE Purchase ( buyer CHAR(30), seller CHAR(30), product CHAR(30) REFERENCES Product(name), store VARCHAR(30)) CREATE TABLE Purchase ( buyer CHAR(30), seller CHAR(30), product CHAR(30) REFERENCES Product(name), store VARCHAR(30)) Foreign key Purchase(buyer, seller, product, store) Product(name, price) Dan Suciu -- CSEP544 Fall 2011
58
58 NameCategory Gizmogadget CameraPhoto OneClickPhoto ProdNameStore GizmoWiz CameraRitz CameraWiz ProductPurchase Dan Suciu -- CSEP544 Fall 2011
59
Foreign Key Constraints 59 Purchase(buyer, seller, product, category, store) Product(name, category, price) CREATE TABLE Purchase( buyer VARCHAR(50), seller VARCHAR(50), product CHAR(20), category VARCHAR(20), store VARCHAR(30), FOREIGN KEY (product, category) REFERENCES Product(name, category) ); CREATE TABLE Purchase( buyer VARCHAR(50), seller VARCHAR(50), product CHAR(20), category VARCHAR(20), store VARCHAR(30), FOREIGN KEY (product, category) REFERENCES Product(name, category) ); Dan Suciu -- CSEP544 Fall 2011
60
60 NameCategory Gizmogadget CameraPhoto OneClickPhoto ProdNameStore GizmoWiz CameraRitz CameraWiz ProductPurchase What happens during updates ? Types of updates: In Purchase: insert/update In Product: delete/update
61
61 What happens during updates ? SQL has three policies for maintaining referential integrity: Reject violating modifications (default) Cascade: after a delete/update do a delete/update Set-null set foreign-key field to NULL Dan Suciu -- CSEP544 Fall 2011
62
Constraints on Attributes and Tuples 62 CREATE TABLE Purchase (... store VARCHAR(30) NOT NULL,... ) CREATE TABLE Purchase (... store VARCHAR(30) NOT NULL,... ) CREATE TABLE Product (... price INT CHECK (price >0 and price < 999)) CREATE TABLE Product (... price INT CHECK (price >0 and price < 999)) Attribute level constraints: Tuple level constraints: Dan Suciu -- CSEP544 Fall 2011... CHECK (price * quantity < 10000)...
63
63 CREATE TABLE Purchase ( prodName CHAR(30) CHECK (prodName IN SELECT Product.name FROM Product), date DATETIME NOT NULL) CREATE TABLE Purchase ( prodName CHAR(30) CHECK (prodName IN SELECT Product.name FROM Product), date DATETIME NOT NULL) What is the difference from Foreign-Key ? Dan Suciu -- CSEP544 Fall 2011
64
64 General Assertions CREATE ASSERTION myAssert CHECK NOT EXISTS( SELECT Product.name FROM Product, Purchase WHERE Product.name = Purchase.prodName GROUP BY Product.name HAVING count(*) > 200) CREATE ASSERTION myAssert CHECK NOT EXISTS( SELECT Product.name FROM Product, Purchase WHERE Product.name = Purchase.prodName GROUP BY Product.name HAVING count(*) > 200) Dan Suciu -- CSEP544 Fall 2011
65
65 Comments on Constraints Can give them names, and alter later We need to understand exactly when they are checked We need to understand exactly what actions are taken if they fail Dan Suciu -- CSEP544 Fall 2011
66
Semantic Optimization using Constraints 66 SELECT Purchase.store FROM Product, Purchase WHERE Product.name=Purchase.product Purchase(buyer, seller, product, store) Product(name, price) SELECT Purchase.store FROM Purchase Why ? and When ?
67
Views Dan Suciu -- CSEP544 Fall 201167
68
Overview Views are ubiquitous in data management: Used in SQL as names for predefined queries More generally, any derived data is a view Dan Suciu -- CSEP544 Fall 201168
69
69 CREATE VIEW CustomerPrice AS SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname CREATE VIEW CustomerPrice AS SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname View Basics CustomerPrice(customer, price) “virtual table” Dan Suciu -- CSEP544 Fall 2011 Views are relations, defined by a query Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price)
70
View Basics Dan Suciu -- CSEP544 Fall 201170 SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 We can later use the view: Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price)
71
71 View Basics SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 CREATE VIEW CustomerPrice AS SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname CREATE VIEW CustomerPrice AS SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname View: Query: Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price) Dan Suciu -- CSEP544 Fall 2011
72
72 View Basics SELECT DISTINCT u.customer, v.store FROM (SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname) u, Purchase v WHERE u.customer = v.customer AND u.price > 100 SELECT DISTINCT u.customer, v.store FROM (SELECT DISTINCT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname) u, Purchase v WHERE u.customer = v.customer AND u.price > 100 Modified query: Dan Suciu -- CSEP544 Fall 2011 Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price) Next, unnest the query…
73
73 View Basics SELECT DISTINCT x.customer, v.store FROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname SELECT DISTINCT x.customer, v.store FROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname Modified and unnested query: Dan Suciu -- CSEP544 Fall 2011 Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price)
74
74 Practice at Home… SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 ?? Dan Suciu -- CSEP544 Fall 2011 Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price)
75
75 Answer SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 SELECT DISTINCT u.customer, v.store FROM CustomerPrice u, Purchase v WHERE u.customer = v.customer AND u.price > 100 Dan Suciu -- CSEP544 Fall 2011 Purchase(customer, product, store) Product(pname, price) CustomerPrice(customer, price) SELECT DISTINCT x.customer, v.store FROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname SELECT DISTINCT x.customer, v.store FROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname
76
76 Types of Views Virtual views: –Pros/cons ? Materialized views –Pros/cons ? Dan Suciu -- CSEP544 Fall 2011
77
77 Types of Views Virtual views: –Used in databases –Computed only on-demand – slow at runtime –Always up to date Materialized views –Used in databases and data warehouses –Pre-computed offline – fast at runtime –May have stale data or expensive synchronization Dan Suciu -- CSEP544 Fall 2011
78
Technical Aspects View inlining, or query modification Query answering using views Updating views Incremental view update DbView Answer V Q DbView Answer V Q DbView V Update ?? DbView V Update??
79
79 Applications Views have lots of applications Physical and logical data independence Security Indexes Denormalization Semantic caching … Dan Suciu -- CSEP544 Fall 2011
80
Physical Data Independence: Vertical Partitioning SSNNameAddressResumePicture 234234MaryHustonClob1…Blob1… 345345SueSeattleClob2…Blob2… 345343JoanSeattleClob3…Blob3… 234234AnnPortlandClob4…Blob4… Resumes SSNNameAddress 234234MaryHuston 345345SueSeattle... SSNResume 234234Clob1… 345345Clob2… SSNPicture 234234Blob1… 345345Blob2… T1 T2T3
81
81 Vertical Partitioning CREATE VIEW Resumes AS SELECT T1.ssn, T1.name, T1.address, T2.resume, T3.picture FROM T1,T2,T3 WHERE T1.ssn=T2.ssn and T2.ssn=T3.ssn CREATE VIEW Resumes AS SELECT T1.ssn, T1.name, T1.address, T2.resume, T3.picture FROM T1,T2,T3 WHERE T1.ssn=T2.ssn and T2.ssn=T3.ssn Dan Suciu -- CSEP544 Fall 2011
82
82 Vertical Partitioning SELECT address FROM Resumes WHERE name = ‘Sue’ SELECT address FROM Resumes WHERE name = ‘Sue’ We want the system to query only table T1. Will that happen ? Dan Suciu -- CSEP544 Fall 2011
83
Vertical Partitioning Hot trend in databases today for analytics Main idea: –Storage = Column(TID, value) pairs –Sort by TID enables reconstructing the table –Compress great compression, minimize I/O –Updates = VERY, VERY expensive Companies: C-Store and Vertica Dan Suciu -- CSEP544 Fall 201183
84
Horizontal Partitioning SSNNameCity 234234MaryHuston 345345SueSeattle 345343JoanSeattle 234234AnnPortland --FrankCalgary --JeanMontreal Customers SSNNameCity 234234MaryHuston CustomersInHuston SSNNameCity 345345SueSeattle 345343JoanSeattle CustomersInSeattle.....
85
85 Horizontal Partitioning CREATE VIEW Customers AS CustomersInHuston UNION ALL CustomersInSeattle UNION ALL... CREATE VIEW Customers AS CustomersInHuston UNION ALL CustomersInSeattle UNION ALL... Dan Suciu -- CSEP544 Fall 2011
86
86 Horizontal Partitioning SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM Customers WHERE city = ‘Seattle’ Which tables are queried by the system ? WHY ??? Dan Suciu -- CSEP544 Fall 2011
87
87 Horizontal Partitioning SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM Customers WHERE city = ‘Seattle’ Now even humans can’t tell which table contains customers in Seattle Dan Suciu -- CSEP544 Fall 2011 CREATE VIEW Customers AS CustomersInXXX UNION ALL CustomersInYYY UNION ALL... CREATE VIEW Customers AS CustomersInXXX UNION ALL CustomersInYYY UNION ALL...
88
88 Horizontal Partitioning CREATE VIEW Customers AS (SELECT SSN, name, ‘Huston’ as city FROM CustomersInHuston) UNION ALL (SELECT SSN, name, ‘Seattle’ as city FROM CustomersInSeattle) UNION ALL... CREATE VIEW Customers AS (SELECT SSN, name, ‘Huston’ as city FROM CustomersInHuston) UNION ALL (SELECT SSN, name, ‘Seattle’ as city FROM CustomersInSeattle) UNION ALL... A hack around the problem: Dan Suciu -- CSEP544 Fall 2011
89
89 Horizontal Partitioning SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM CustomersInSeattle SELECT name FROM CustomersInSeattle Dan Suciu -- CSEP544 Fall 2011
90
Data Integration Terminology Local DB 1 Local DB k … Integrated Data Local DB 1 Local DB k … Integrated Data Global as View V V1V1 VkVk Local as View Which one needs query expansion, which one needs query answering using views ?
91
Horizontal Partitioning as LAV CREATE VIEW CustomersInSeattle AS (SELECT * FROM Customers WHERE city = ‘Seattle’) CREATE VIEW CustomersInHuston AS (SELECT * FROM Customers WHERE city = ‘Huston’) …. CREATE VIEW CustomersInSeattle AS (SELECT * FROM Customers WHERE city = ‘Seattle’) CREATE VIEW CustomersInHuston AS (SELECT * FROM Customers WHERE city = ‘Huston’) …. SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM Customers WHERE city = ‘Seattle’ SELECT name FROM CustomersInSeattle
92
Views and Security NameAddressBalance MaryHuston450.99 SueSeattle-240 JoanSeattle333.25 AnnPortland-520 Customers: Fred is not allowed to see Balance CREATE VIEW PublicCustomers SELECT Name, Address FROM Customers CREATE VIEW PublicCustomers SELECT Name, Address FROM Customers NameAddressBalance MaryHuston450.99 SueSeattle-240 JoanSeattle333.25 AnnPortland-520 John is not allowed to see >0 balances CREATE VIEW BadCreditCustomers SELECT * FROM Customers WHERE Balance < 0 CREATE VIEW BadCreditCustomers SELECT * FROM Customers WHERE Balance < 0
93
93 Indexes REALLY important to speed up query processing time. SELECT * FROM Person WHERE name = 'Smith' SELECT * FROM Person WHERE name = 'Smith' CREATE INDEX myindex05 ON Person(name) Person (name, age, city) May take too long to scan the entire Person table Now, when we rerun the query it will be much faster
94
B+ Tree Index 94 AdamBettyCharles….Smith…. We will discuss them in detail in a later lecture. Dan Suciu -- CSEP544 Fall 2011
95
95 Creating Indexes Indexes can be created on more than one attribute: CREATE INDEX doubleindex ON Person (age, city) Example:
96
96 Creating Indexes Indexes can be created on more than one attribute: SELECT * FROM Person WHERE age = 55 AND city = 'Seattle' Helps in: CREATE INDEX doubleindex ON Person (age, city) Example:
97
97 Creating Indexes Indexes can be created on more than one attribute: SELECT * FROM Person WHERE age = 55 AND city = 'Seattle' Helps in: CREATE INDEX doubleindex ON Person (age, city) Example: SELECT * FROM Person WHERE age = 55 and even in:
98
98 Creating Indexes Indexes can be created on more than one attribute: SELECT * FROM Person WHERE age = 55 AND city = 'Seattle' Helps in: SELECT * FROM Person WHERE city = 'Seattle' But not in: CREATE INDEX doubleindex ON Person (age, city) Example: SELECT * FROM Person WHERE age = 55 and even in:
99
CREATE INDEX W ON Product(weight) CREATE INDEX P ON Product(price) CREATE INDEX W ON Product(weight) CREATE INDEX P ON Product(price) Indexes are Materialized Views SELECT weight, price FROM Product WHERE weight > 10 and price < 100 SELECT weight, price FROM Product WHERE weight > 10 and price < 100 Product(pid, name, weight, price, …) SELECT x.weight, y.price FROM W x, P y WHERE x.weight > 10 and y.price < 100 and x.pid = y.pid SELECT x.weight, y.price FROM W x, P y WHERE x.weight > 10 and y.price < 100 and x.pid = y.pid CREATE VIEW W AS SELECT weight, pid FROM Product y CREATE VIEW P AS SELECT price, pid FROM Product y CREATE VIEW W AS SELECT weight, pid FROM Product y CREATE VIEW P AS SELECT price, pid FROM Product y Indexes as LAV: “Covering indexes”: query uses only the indexes
100
Denormalization Compute a view that is the join of several tables The view is now a relation that is not in BCNF (why not?) Dan Suciu -- CSEP544 Fall 2011100 CREATE VIEW CustomerPurchase AS SELECT x.customer, x.store, y.pname, y.price FROM Purchase x, Product y WHERE x.product = y.pname CREATE VIEW CustomerPurchase AS SELECT x.customer, x.store, y.pname, y.price FROM Purchase x, Product y WHERE x.product = y.pname Purchase(customer, product, store) Product(pname, price)
101
101 Semantic Caching Queries Q1, Q2, … have been executed, and their results are stored at the client Now we need to compute a new query Q Sometimes we can use the prior results in answering Q These queries can be seen as materialized views The problem becomes: answer Q using the views Q1,Q2,…
102
Technical Aspects of Views Simplifying queries after the views have been in-lined –Query un-nesting –Query minimization Handling updates –Updating virtual views –Incremental update of materialized views 102Dan Suciu -- CSEP544 Fall 2011
103
103 Set v.s. Bag Semantics SELECT DISTINCT a,b,c FROM R, S, T WHERE... SELECT DISTINCT a,b,c FROM R, S, T WHERE... SELECT a,b,c FROM R, S, T WHERE... SELECT a,b,c FROM R, S, T WHERE... Set semantics Bag semantics Dan Suciu -- CSEP544 Fall 2011
104
104 Unnesting: Sets/Sets SELECT DISTINCT a,b,c FROM (SELECT DISTINCT u,v FROM R,S WHERE …), T WHERE... SELECT DISTINCT a,b,c FROM (SELECT DISTINCT u,v FROM R,S WHERE …), T WHERE... SELECT DISTINCT a,b,c FROM R, S, T WHERE... SELECT DISTINCT a,b,c FROM R, S, T WHERE... Dan Suciu -- CSEP544 Fall 2011
105
105 Unnesting: Sets/Bags SELECT DISTINCT a,b,c FROM (SELECT u,v FROM R,S WHERE …), T WHERE... SELECT DISTINCT a,b,c FROM (SELECT u,v FROM R,S WHERE …), T WHERE... SELECT DISTINCT a,b,c FROM R, S, T WHERE... SELECT DISTINCT a,b,c FROM R, S, T WHERE... Dan Suciu -- CSEP544 Fall 2011
106
106 Unnesting: Bags/Bags SELECT a,b,c FROM (SELECT u,v FROM R,S WHERE …), T WHERE... SELECT a,b,c FROM (SELECT u,v FROM R,S WHERE …), T WHERE... SELECT a,b,c FROM R, S, T WHERE... SELECT a,b,c FROM R, S, T WHERE... Dan Suciu -- CSEP544 Fall 2011
107
107 Unnesting: Bags/Sets SELECT a,b,c FROM (SELECT DISTINCT u,v FROM R,S WHERE …), T WHERE... SELECT a,b,c FROM (SELECT DISTINCT u,v FROM R,S WHERE …), T WHERE... NO Dan Suciu -- CSEP544 Fall 2011
108
Query Minimization Replace a query Q with Q’ having fewer tables in the FROM clause When Q has fewest number of tables in the FROM clause, then we say it is minimized Usually (but not always) users write queries that are already minimized But the result of rewriting a query over view is often not minimized Dan Suciu -- CSEP544 Fall 2011108
109
Query Minimization under Bag Semantics Rule 1: If: x, y are tuple variables over the same table and: The condition x.key = y.key is in the WHERE clause Then combine x, y into a single variable query Dan Suciu -- CSEP544 Fall 2011109
110
Query Minimization under Bag Semantics SELECT x.date, y.name FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.price 150 and y.pid = z.pid and x.cid = z.cid SELECT x.date, y.name FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.price 150 and y.pid = z.pid and x.cid = z.cid Order(cid, pid, weight, date) Product(pid, name, price) SELECT x.date, y.name FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 SELECT x.date, y.name FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 What constraints do we need to have for this optimization ?
111
111 Query Minimization under Bag Semantics Rule 2: If x ranges over S, y ranges over T, and The condition x.fk = y.key is in the WHERE clause, and there is a not null constraint on x.fk y is not used anywhere else, and Then remove T (and y) from the query Dan Suciu -- CSEP544 Fall 2011
112
Query Minimization under Bag Semantics Order(cid, pid, weight, date) Product(pid, name, price) SELECT x.cid, x.date FROM Order x WHERE x.weight > 20 SELECT x.cid, x.date FROM Order x WHERE x.weight > 20 SELECT x.cid, x.date FROM Order x, Product y WHERE x.pid = y.pid and x.weight > 20 SELECT x.cid, x.date FROM Order x, Product y WHERE x.pid = y.pid and x.weight > 20 Q: Where do we encounter non- minimized queries ? What constraints do we need to have for this optimization ?
113
Query Minimization under Bag Semantics CREATE VIEW CheapOrders AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid and y.price < 99 CREATE VIEW HeavyOrders AS SELECT a.cid,a.pid,a.date,b.name,b.price FROM Order a, Product b WHERE a.pid = b.pid and a.weight > 150 CREATE VIEW CheapOrders AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid and y.price < 99 CREATE VIEW HeavyOrders AS SELECT a.cid,a.pid,a.date,b.name,b.price FROM Order a, Product b WHERE a.pid = b.pid and a.weight > 150 Order(cid, pid, weight, date) Product(pid, name, price) SELECT u.cid FROM CheapOrders u, HeavyOrders v WHERE u.pid = v.pid and u.cid = v.cid SELECT u.cid FROM CheapOrders u, HeavyOrders v WHERE u.pid = v.pid and u.cid = v.cid A: in queries resulting from view inlining Customers who ordered cheap, heavy products
114
Query Minimization CREATE VIEW CheapOrders AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid and y.price < 99 CREATE VIEW HeavyOrders AS SELECT a.cid,a.pid,a.date,b.name,b.price FROM Order a, Product b WHERE a.pid = b.pid and a.weight > 150 CREATE VIEW CheapOrders AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid and y.price < 99 CREATE VIEW HeavyOrders AS SELECT a.cid,a.pid,a.date,b.name,b.price FROM Order a, Product b WHERE a.pid = b.pid and a.weight > 150 Order(cid, pid, weight, date) Product(pid, name, price) SELECT u.cid FROM CheapOrders u, HeavyOrders v WHERE u.pid = v.pid and u.cid = v.cid SELECT u.cid FROM CheapOrders u, HeavyOrders v WHERE u.pid = v.pid and u.cid = v.cid SELECT a.cid FROM Order x, Product y Order a, Product b WHERE.... SELECT a.cid FROM Order x, Product y Order a, Product b WHERE.... Redundant Orders and Products
115
SELECT a.cid FROM Order x, Product y, Order a, Product b WHERE x.pid = y.pid and a.pid = b.pid and y.price 150 and x.cid = a.cid and x.pid = a.pid SELECT a.cid FROM Order x, Product y, Order a, Product b WHERE x.pid = y.pid and a.pid = b.pid and y.price 150 and x.cid = a.cid and x.pid = a.pid SELECT x.cid FROM Order x, Product y, Product b WHERE x.pid = y.pid and x.pid = b.pid and y.price 150 SELECT x.cid FROM Order x, Product y, Product b WHERE x.pid = y.pid and x.pid = b.pid and y.price 150 x = a SELECT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 SELECT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 y = b
116
Query Minimization under Set Semantics Rules 1 and 2 still apply Rule 3 involves homomorphisms 116Dan Suciu -- CSEP544 Fall 2011
117
Definition of a Homomorphism A homomorphism from Q’ to Q is a mapping h : {y1, …, ym} {x1, …, xk} such that: (a)If h(yi) = xj, then Ri’ = Rj (b)C logically implies h(C’) and (c)h(A’) = A A homomorphism from Q’ to Q is a mapping h : {y1, …, ym} {x1, …, xk} such that: (a)If h(yi) = xj, then Ri’ = Rj (b)C logically implies h(C’) and (c)h(A’) = A SELECT DISTINCT A FROM R1 x1, …, Rk xk WHERE C SELECT DISTINCT A FROM R1 x1, …, Rk xk WHERE C SELECT DISTINCT A’ FROM R1’ y1, …, Rm’ ym WHERE C’ SELECT DISTINCT A’ FROM R1’ y1, …, Rm’ ym WHERE C’ Q Q’
118
Definition of a Homomorphism Theorem If there exists a homomorphism from Q’ to Q, then every answer returned by Q is also returned by Q’. We say that Q is contained in Q’ If there exists a homomorphism from Q’ to Q, and a homomorphism from Q to Q’, then Q and Q’ are equivalent
119
Find Homomorphism Dan Suciu -- CSEP544 Fall 2011119 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 Q SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 Q’ Order(cid, pid, weight, date) Product(pid, name, price)
120
Homomorphism Q Q’ Dan Suciu -- CSEP544 Fall 2011120 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 Q SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 Q’ Order(cid, pid, weight, date) Product(pid, name, price) Every answer to Q is also an answer to Q’ WHY ?
121
Homomorphism Q Q’ Dan Suciu -- CSEP544 Fall 2011121 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 SELECT DISTINCT x.cid FROM Order x, Product y WHERE x.pid = y.pid and y.price 150 Q SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 SELECT DISTINCT x.cid FROM Order x, Product y, Order z WHERE x.pid = y.pid and y.pid = z.pid and y.price 150 and z.weight > 100 Q’ Order(cid, pid, weight, date) Product(pid, name, price) Q and Q’ are equivalent !
122
Query Minimization under Set Semantics SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x.pid FROM Product x, Product z WHERE x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x.pid FROM Product x, Product z WHERE x.category = z.category and z.price > 500 and z.weight > 10 Same as:
123
123 Query Minimization under Set Semantics Rule 3: Let Q’ be the query obtained by removing the tuple variable x from Q. If: Q has set semantics (and same for Q’) there exists a homomorphism from Q to Q’ Then Q’ is equivalent to Q. Hence one can safely remove x. Dan Suciu -- CSEP544 Fall 2011
124
Example SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x’.pid FROM Product x’, Product z’ WHERE x’.category = z’.category and z’.price > 500 and z’.weight > 10 SELECT DISTINCT x’.pid FROM Product x’, Product z’ WHERE x’.category = z’.category and z’.price > 500 and z’.weight > 10 Q Q’ Find a homomorphism h: Q Q’
125
Example SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x.pid FROM Product x, Product y, Product z WHERE x.category = y.category and y.price > 100 and x.category = z.category and z.price > 500 and z.weight > 10 SELECT DISTINCT x’.pid FROM Product x’, Product z’ WHERE x’.category = z’.category and z’.price > 500 and z’.weight > 10 SELECT DISTINCT x’.pid FROM Product x’, Product z’ WHERE x’.category = z’.category and z’.price > 500 and z’.weight > 10 Q Q’ Answer: H(x) = x’, H(y) = H(z) = z’
126
126 CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100 CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100 Updating Views INSERT INTO Expensive-Product VALUES(‘Gizmo’) INSERT INTO Expensive-Product VALUES(‘Gizmo’) Purchase(customer, product, store) Product(pname, price) Updateable view Dan Suciu -- CSEP544 Fall 2011
127
Updatable Views Have a virtual view V(A1, A2, …) over tables R1, R2, … User wants to update a tuple in V –Insert/modify/delete Can we translate this into updates to R1, R2, … ? If yes: V = “an updateable view” If not: V = “a non-updateable view” Dan Suciu -- CSEP544 Fall 2011127
128
128 CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100 CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100 Updating Views INSERT INTO Product VALUES(‘Gizmo’, NULL) INSERT INTO Product VALUES(‘Gizmo’, NULL) Purchase(customer, product, store) Product(pname, price) Updateable view Dan Suciu -- CSEP544 Fall 2011 INSERT INTO Expensive-Product VALUES(‘Gizmo’) INSERT INTO Expensive-Product VALUES(‘Gizmo’)
129
129 CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’ CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’ Updating Views INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’) INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’) Purchase(customer, product, store) Product(pname, price) Updateable view Dan Suciu -- CSEP544 Fall 2011
130
130 CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’ CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’ Updating Views INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’) INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’) INSERT INTO Purchase VALUES(‘Joe’,’Gizmo’,NULL) INSERT INTO Purchase VALUES(‘Joe’,’Gizmo’,NULL) Note this Purchase(customer, product, store) Product(pname, price) Updateable view Dan Suciu -- CSEP544 Fall 2011
131
131 Updating Views INSERT INTO CustomerPrice VALUES(‘Joe’, 200) ? ? ? ? ? Non-updateable view Most views are non-updateable CREATE VIEW CustomerPrice AS SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname CREATE VIEW CustomerPrice AS SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname Purchase(customer, product, store) Product(pname, price) Dan Suciu -- CSEP544 Fall 2011
132
132 Incremental View Update Also known as view synchronization Immediate synchronization = after each update Deferred synchronization –Lazy = at query time –Periodic –Forced = manual
133
133 CREATE VIEW FullOrder AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid CREATE VIEW FullOrder AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid Incremental View Update UPDATE Product SET price = price / 2 WHERE pid = ‘12345’ UPDATE Product SET price = price / 2 WHERE pid = ‘12345’ Order(cid, pid, date) Product(pid, name, price) UPDATE FullOrder SET price = price / 2 WHERE pid = ‘12345’ UPDATE FullOrder SET price = price / 2 WHERE pid = ‘12345’ No need to recompute the entire view ! Dan Suciu -- CSEP544 Fall 2011
134
134 CREATE VIEW Categories AS SELECT DISTINCT category FROM Product CREATE VIEW Categories AS SELECT DISTINCT category FROM Product Incremental View Update DELETE Product WHERE pid = ‘12345’ DELETE Product WHERE pid = ‘12345’ Product(pid, name, category, price) DELETE Categories WHERE category in (SELECT category FROM Product WHERE pid = ‘12345’) DELETE Categories WHERE category in (SELECT category FROM Product WHERE pid = ‘12345’) It doesn’t work ! Why ? How can we fix it ?
135
135 CREATE VIEW Categories AS SELECT category, count(*) as c FROM Product GROUP BY category CREATE VIEW Categories AS SELECT category, count(*) as c FROM Product GROUP BY category Incremental View Update DELETE Product WHERE pid = ‘12345’ DELETE Product WHERE pid = ‘12345’ Product(pid, name, category, price) UPDATE Categories SET c = c-1 WHERE category in (SELECT category FROM Product WHERE pid = ‘12345’); DELETE Categories WHERE c = 0 UPDATE Categories SET c = c-1 WHERE category in (SELECT category FROM Product WHERE pid = ‘12345’); DELETE Categories WHERE c = 0
136
136 Answering Queries Using Views [See the paper] We have several materialized views: –V1, V2, …, Vn Given a query Q –Answer it by using views instead of base tables Variation: Query rewriting using views –Answer it by rewriting it to another query first Example: if the views are indexes, then we rewrite the query to use indexes Dan Suciu -- CSEP544 Fall 2011
137
Rewriting Queries Using Views 137 Purchase(buyer, seller, product, store) Person(pname, city) CREATE VIEW SeattleView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x.pname = y.buyer CREATE VIEW SeattleView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x.pname = y.buyer SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ Goal: rewrite this query in terms of the view Have this materialized view:
138
Rewriting Queries Using Views 138 SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT buyer, seller FROM SeattleView WHERE product= ‘gizmo’ SELECT buyer, seller FROM SeattleView WHERE product= ‘gizmo’ Dan Suciu -- CSEP544 Fall 2011
139
Rewriting is not always possible 139 CREATE VIEW DifferentView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y, Product z WHERE x.city = ‘Seattle’ AND x.pname = y.buyer AND y.product = z.name AND z.price < 100 CREATE VIEW DifferentView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y, Product z WHERE x.city = ‘Seattle’ AND x.pname = y.buyer AND y.product = z.name AND z.price < 100 SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT y.buyer, y.seller FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT buyer, seller FROM DifferentView WHERE product= ‘gizmo’ SELECT buyer, seller FROM DifferentView WHERE product= ‘gizmo’ “Maximally contained rewriting”
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.