Published by Sunniva Magnussen. Modified over 5 years ago.
1
Equivalence of Aggregate Queries in Conjunctive QL
David DeHaan CS 848 February 22, 2003 4/25/2019
2
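Dialects of QL
[Diagram: QL dialects arranged along two axes, semantics and expressiveness. Labels: Conjunctive QL, Positive QL, First order QL; Conjunctive QL with bag semantics†, Positive QL with bag semantics, First order QL with bag semantics‡]
†[Khizder et al., 1999], ‡[Lui et al., 2002]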
3
Conjunctive QL
Q ::= D as A                (quantification)
    | A1 = A2.R             (unnest)
    | A1.Pf1 = A2.Pf2       (selection)
    | elim A1, …, An Q      (projection)
    | true                  (null tuple)
    | from Q1, Q2           (natural join)
    | ( Q )
D ::= THING | C             (basic description)
Pf ::= id | A.Pf            (path function)
4
Conjunctive QL with bag semantics
Q ::= D as A                (quantification)
    | A1.Pf1 = A2.Pf2       (selection)
    | select A1, …, An Q    (projection)
    | elim Q                (duplicate elimination)
    | true                  (null tuple)
    | from Q1, Q2           (natural join)
    | ( Q )
D ::= THING | C             (basic description)
Pf ::= id | A.Pf            (path function)
5
Aggregate Conjunctive QL
Q ::= D as A                (quantification)
    | A1.Pf1 = A2.Pf2       (selection)
    | select A1, …, An Q    (projection)
    | elim Q                (duplicate elimination)
    | agg A1, …, An, α(B) Q (aggregate)
    | true                  (null tuple)
    | from Q1, Q2           (natural join)
    | ( Q )
D ::= THING | C             (basic description)
Pf ::= id | A.Pf            (path function)
6
“Deciding Equivalences among Aggregate Queries”
W. Nutt, Y. Sagiv, S. Shurin
PODS 1998
Equivalence of conjunctive queries containing a single aggregate operator with comparison predicates
7
Nutt et al.
In other words, SQL queries of the form
  SELECT A1, …, An, α(B)
  FROM R1, …, Rm
  WHERE [Equality Conditions] AND [Binary Comparisons]
  GROUP BY A1, …, An
where α(B) ∈ {count(*), cntd(B), sum(B), max(B), min(B)}
Define the core qc of q(x, α(y)) as q(x, y)
8
Count(*) Queries
q ≡ q′ ⇔ qc ≡bs q′c
Relational (no comparisons):
  qc ≡bs q′c ⇔ qc, q′c are isomorphic
  Complexity: NP [Chaudhuri, Vardi; PODS 1993]
9
Count(*) Queries
With Comparisons:
  qc, q′c isomorphic → qc ≡bs q′c
e.g. bag-set equivalent but not isomorphic:
  q ← p(x) ∧ p(y) ∧ p(z) ∧ x<y ∧ x<z
  q′ ← p(x) ∧ p(y) ∧ p(z) ∧ x<z ∧ y<z
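The claim that these two queries are bag-set equivalent can be spot-checked by brute force. The sketch below is not from the slides: it assumes p is a unary relation stored as a Python set of integers, and the helper names (`count_q`, `count_q_prime`, `agree_on_all_subsets`) are hypothetical. It only tests small databases, so it is evidence rather than a proof.

```python
from itertools import product


def count_q(p):
    """count(*) for q <- p(x), p(y), p(z), x<y, x<z over the set p."""
    return sum(1 for x, y, z in product(p, repeat=3) if x < y and x < z)


def count_q_prime(p):
    """count(*) for q' <- p(x), p(y), p(z), x<z, y<z over the set p."""
    return sum(1 for x, y, z in product(p, repeat=3) if x < z and y < z)


def agree_on_all_subsets(domain):
    """Compare both counts on every subset of `domain` (each subset is one database)."""
    items = sorted(domain)
    for mask in range(1 << len(items)):
        p = {v for i, v in enumerate(items) if (mask >> i) & 1}
        if count_q(p) != count_q_prime(p):
            return False
    return True


if __name__ == "__main__":
    print(count_q({1, 2, 3}), count_q_prime({1, 2, 3}))
    print(agree_on_all_subsets(range(6)))
```

Intuitively, q counts pairs (y, z) above each x and q′ counts pairs (x, y) below each z; over a set of n distinct values both sums are 0² + 1² + … + (n−1)², which is why the counts agree even though no isomorphism maps one body onto the other.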
10
Count(*) Queries
Compatible linearizations
  qc: {(x<y=z), (x<y<z)}
  q′c: {(x=y<z), (x<y<z)}
Resulting linear expansions
  qL: { [q ← p(x) ∧ p(z) ∧ p(z) ∧ x<z], [q ← p(x) ∧ p(y) ∧ p(z) ∧ x<y<z] }
  q′L: { [q′ ← p(y) ∧ p(y) ∧ p(z) ∧ y<z], [q′ ← p(x) ∧ p(y) ∧ p(z) ∧ x<y<z] }
qc ≡bs q′c ⇔ qL, q′L isomorphic
Complexity: PSPACE
11
Count Distinct Queries
Sufficient: qc ≡s q′c → q ≡ q′
e.g. q ≡ q′ but qc ≢s q′c
  q(cntd(y)) ← p(y) ∧ p(z) ∧ y<z
  q′(cntd(y)) ← p(y) ∧ p(z) ∧ y>z
qc returns all elements except the greatest.
q′c returns all elements except the least.
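A small sketch of this counterexample, assuming p is a unary relation held as a Python set; the function names are hypothetical and chosen only for illustration. It shows the two cores returning different sets while the count-distinct aggregates coincide.

```python
def core_q(p):
    """Core of q: every y in p that has some strictly larger z in p."""
    return {y for y in p for z in p if y < z}


def core_q_prime(p):
    """Core of q': every y in p that has some strictly smaller z in p."""
    return {y for y in p for z in p if y > z}


def cntd(values):
    """Count distinct: the number of different values returned by a core."""
    return len(set(values))


if __name__ == "__main__":
    p = {3, 7, 10}
    print(core_q(p), core_q_prime(p))              # different sets
    print(cntd(core_q(p)), cntd(core_q_prime(p)))  # same count
```

Each core drops exactly one element of p (the greatest or the least), so the cores are not set-equivalent, yet both leave the same number of distinct values.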
12
Count Distinct Queries
Necessary: qc ≡s q′c ⇔ q ≡ q′ only when
  q, q′ are reduced
  no variable in same position as y occurs in strict comparison (cf. previous example)
  one of:
    q, q′ range over rationals
    q, q′ don’t contain constants
    no variable in same position as y occurs in any comparison
13
Sum Queries
Relational, without Constants:
  q ≡ q′ ⇔ qc ≡bs q′c
  Complexity: NP
With Comparisons, without Constants:
  Complexity: PSPACE
14
Sum Queries
With Constants:
  q ≡ q′ if and only if qc ≡ws q′c and qc, q′c have variable-isomorphic linear expansions
  Complexity: PSPACE
15
Max/Min Queries
Definition: q dominates q′ if for all databases:
  whenever q returns tuple (x, y), q′ returns tuple (x, y′) where y ≥ y′ (for Max, ≤ for Min)
q ≡ q′ ⇔ qc dominates q′c and q′c dominates qc
16
Max/Min Queries
Relational:
  p dominates p′ ⇔ p′ ⊆s p
  Complexity: NP-complete
With Comparisons:
  p dominates p′ ⇔ ∀ linearizations p′L of p′, p dominates p′L
  Complexity: Π₂ᵖ-complete
17
Summary - Nutt et al.
Considers equivalence of CQL queries only where agg occurs at top level
Necessary & sufficient conditions differ depending upon the aggregate operator
Only considers the most general case, where no schema information is used
In reality, schema information is often present
18
“Exploiting Uniqueness in Query Optimization”
G. Paulley, P. Larson
ICDE 1994
Use schema information to remove the DISTINCT operator from conjunctive SQL queries (i.e. the elim operator from CQL with bag semantics)
19
Paulley et al.
Q: select distinct W from C1 as A1, …, Cn as An where R
RW(Q): select W from C1 as A1, …, Cn as An where R
R = {constraints over W ∪ {attributes of A1, …, An}}
20
Paulley et al.
Theorem: Q ≡ RW(Q) if and only if
  C1, …, Cn all have candidate keys
    Define K = key(C1) ∘ … ∘ key(Cn)
    K is a candidate key for C1 × … × Cn
  One of:
    K ⊆ W
    Some K′ ⊆ K exists such that:
      K′ ⊆ W
      Unique values for (K – K′) can be inferred from R + Schema (CHECK + KEY constraints)
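The simplest case of the theorem, K ⊆ W, can be illustrated on one concrete instance with Python's built-in `sqlite3` module. The tables `cust` and `purch`, their columns, and the sample rows are hypothetical and not from the slides; comparing results on a single database illustrates the condition but does not prove it.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purch (pid INTEGER PRIMARY KEY, cust_id INTEGER, item TEXT);
    INSERT INTO cust  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO purch VALUES (10, 1, 'pen'), (11, 1, 'ink'), (12, 2, 'pen');
""")

JOIN = "FROM cust AS c, purch AS p WHERE c.id = p.cust_id"


def bag(sql):
    """Return the query result as a bag (multiset) of rows."""
    return Counter(conn.execute(sql).fetchall())


# W = {c.id, p.pid} contains K = key(cust) followed by key(purch): DISTINCT is redundant.
with_key = bag(f"SELECT DISTINCT c.id, p.pid {JOIN}") == bag(f"SELECT c.id, p.pid {JOIN}")

# W = {c.name} contains no key: dropping DISTINCT changes the bag of answers.
without_key = bag(f"SELECT DISTINCT c.name {JOIN}") == bag(f"SELECT c.name {JOIN}")

if __name__ == "__main__":
    print(with_key, without_key)
```

The second comparison fails because Ann has two purchases, so her name appears twice once DISTINCT is removed.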
21
Paulley et al.
Testing this condition
  = satisfaction of arbitrary Boolean expression
  = NP-complete
22
“Reasoning about Duplicate Elimination with Description Logic”
V. Khizder, D. Toman, G. Weddell
DOOD 2002
Incrementally remove conjuncts from the scope of the elim operator in CQL
Map equivalence to a DL membership problem instead of Boolean satisfiability
23
Khizder et al.
Q: select V from C1 as A1, …, Cm as Am, (elim select W from Cm+1 as Am+1, …, Cn as An, R)
RW(Q): select V from C1 as A1, …, Cm+1 as Am+1, (elim select W ∪ {Am+1} from Cm+2 as Am+2, …, Cn as An, R)
R = {equality constraints over Pf’s on W ∪ {A1, …, An}}
This normal form can always be achieved
24
Khizder et al.
Define:
  S = Database Schema
    Expressed in CFD (CLASSIC + Functional Dependencies)
    C(Pf1, …, Pfn → Pf)
  SQ = “Query Schema” = {CQ ⊑ (A1:C1), …, CQ ⊑ (An:Cn), CQ ⊑ R}
25
Reformulating Paulley et al.
Schema ⊨ Q ≡ RW(Q)
⇕
Schema + R + instance of W ⊨ unique instance of K
S ∪ SQ ⊨ CQ ⊑ CQ(W → K)
S ∪ SQ ⊨ CQ ⊑ CQ(W → A1, …, An)
26
Khizder et al.
Theorem: Q ≡ RW(Q) if and only if
  S ∪ SQ ⊨ CQ ⊑ CQ(A ∪ W → Am+1), where A = {A1, …, Am}
Am+1 ∈ W:
  FD obviously true
Am+1 ∉ W:
  Am+1 existentially quantified
  FD guarantees no duplicates introduced
27
Khizder et al.
CQL: S ⊨ Q ≡ RW(Q)
⇕
CFD: S ∪ SQ ⊨ CQ ⊑ CQ({A1, …, Am} ∪ W → Am+1)
Apply rewrite iteratively from m=0 to m=n-1 iff:
  S ∪ SQ ⊨ CQ ⊑ CQ(W → A1, …, An)
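The relationship between the step-by-step test and the single test W → A1, …, An can be sketched with ordinary attribute closure. This is a deliberate simplification and not CFD reasoning: it assumes flat functional dependencies over variable names, ignores path functions and class descriptions, and all names in the code are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs_set, rhs_attr) pairs."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def all_steps_succeed(w, variables, fds):
    """Check (A ∪ W) -> A_{m+1} for m = 0 .. n-1, moving A_{m+1} out of elim each time."""
    known = set(w)  # plays the role of A ∪ W; starts as W because A is empty at m = 0
    for a in variables:
        if a not in closure(known, fds):
            return False
        known.add(a)  # A_{m+1} now sits outside elim and is also added to W
    return True


def single_check(w, variables, fds):
    """Check W -> A1, ..., An in one closure computation."""
    return set(variables) <= closure(w, fds)


if __name__ == "__main__":
    fds = [({"w"}, "a1"), ({"a1"}, "a2")]
    print(all_steps_succeed({"w"}, ["a1", "a2"], fds), single_check({"w"}, ["a1", "a2"], fds))
```

Under these assumptions the two checks always agree, which mirrors the slide's statement that the rewrite can be applied for every m exactly when W determines all of A1, …, An.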
28
Complexity
Membership in CLASSIC is P-time
Holds for CFD assuming:
  S ∪ SQ only contain regular path FDs
    C(Pf1, …, Pfn → Pf), Pf prefix of some Pfi
  S does not contain equation constraints
SQL CHECK constraints allow disjunction
  Not expressible in CFD
  P-time bound does not apply
29
Usefulness of Incremental Re-write
Move out of elim:
  Increase search space for join
  Shrink size of intermediate result requiring sorting
Move into elim:
  When W ∩ V = ∅:
    Values of W not important; only existence
    Replace subquery with probe to an index
30
Rewriting with Aggregates
Using aggregate views
“A view will be usable to answer a query only if there is an isomorphism between the view and a subset of the query” [Halevy, 2002]
Note that the above quote is incorrect. It states a condition as necessary that is actually sufficient (but not necessary).
Using schema information can increase the number of usable views.
31
Simple Example
Schema:
  Cust(Id, Name)
  Purch(Id, Item, Price)
  Cust ⊑ Cust(Id → Name)
  View1(Id, Name, Sum(Price)) ← Cust(Id, Name) ∧ Purch(Id, Item, Price)
Query:
  Q(Name, Tot) ← Cust(Id, Name) ∧ Spend(Id, Tot)
  Spend(Id, Sum(P)) ← Purch(Id, P)
32
Rewriting
  Q(Name, Tot) ← Cust(Id, Name) ∧ Spend(Id, Tot)
  Spend(Id, Sum(P)) ← Purch(Id, P)
⇕
Rewrite:
  Q′(Name, Tot) ← Spend′(Id, Name, Tot)
  Spend′(Id, N, Sum(P)) ← Cust(Id, N) ∧ Purch(Id, Z, P)
Valid because S ∪ SQ ⊨ Cust ⊑ Cust(Id → Name)
Now Spend′ and View1 are isomorphic (sufficient condition for using View1).
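As a sanity check on one instance, the original query and its rewriting can be run side by side with Python's built-in `sqlite3` module. The SQL below is an assumed translation of the rule notation, the sample rows are invented, and the dependency Id → Name is modeled by declaring Id a primary key; agreement on one database illustrates the rewriting but does not establish equivalence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cust  (Id INTEGER PRIMARY KEY, Name TEXT);  -- enforces Id -> Name
    CREATE TABLE Purch (Id INTEGER, Item TEXT, Price INTEGER);
    INSERT INTO Cust  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Ann');
    INSERT INTO Purch VALUES (1, 'pen', 2), (1, 'ink', 5), (2, 'pen', 2), (3, 'cap', 9);
""")

# Q: join Cust with the per-customer totals (the Spend subquery).
Q = """
    SELECT c.Name, s.Tot
    FROM Cust AS c,
         (SELECT Id, SUM(Price) AS Tot FROM Purch GROUP BY Id) AS s
    WHERE c.Id = s.Id
"""

# Q': aggregate over the join, grouping by Id and Name (same shape as View1).
Q_PRIME = """
    SELECT v.Name, v.Tot
    FROM (SELECT c.Id AS Id, c.Name AS Name, SUM(p.Price) AS Tot
          FROM Cust AS c, Purch AS p
          WHERE c.Id = p.Id
          GROUP BY c.Id, c.Name) AS v
"""

result_q = sorted(conn.execute(Q).fetchall())
result_q_prime = sorted(conn.execute(Q_PRIME).fetchall())

if __name__ == "__main__":
    print(result_q)
    print(result_q_prime)
```

Two customers share the name Ann here on purpose: grouping by Id as well as Name keeps their totals separate, and adding Name to the grouping columns is harmless only because Id determines Name.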