Published by Widyawati Chandra. Modified over 6 years ago.
Optimal Query Processing Meets Information Theory
Dan Suciu, University of Washington
Joint work with Mahmoud Abo-Khamis and Hung Ngo (RelationalAI Inc)

Basic Question
What is the optimal runtime to compute a query Q on a database D?
Data complexity: Q is fixed, runtime = f(D)
Applications:
  CSP: the nodes of Q are the variables of the CSP problem, its hyperedges are the constraints, and D represents the valid assignments
  Databases: …
Example select * -- natural join from R, S, T
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X) Will prove later: |Q(D)| ≤ N3/2 select * natural join from R, S, T where R.Y = S.Y and S.Z = T.Z and T.X = R.X 0 tuples R: X Y 1 2 3 ... N/2 S: (same as R) Y Z 1 2 3 ... N/2 T: (same as R) Z X 1 2 3 ... N/2 Query plan ⨝ Θ(N2) tuples ⨝ T(Z,X) N R(X,Y) S(Y,Z) Every plan costs Θ(N2)!
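The blow-up described on this slide can be reproduced on a small instance. The sketch below assumes a hypothetical "star" instance (pairs (0, j) and (i, 0) for i, j in 1..n, so N = 2n). It is chosen for illustration and is not necessarily the instance drawn on the slide; the helper names `make_star` and `join_on_middle` are made up for this example.

```python
def make_star(n):
    """Hypothetical instance: pairs (0, j) and (i, 0) for i, j in 1..n, so N = 2n."""
    return {(0, j) for j in range(1, n + 1)} | {(i, 0) for i in range(1, n + 1)}


def join_on_middle(left, right):
    """Binary join left(A,B) with right(B,C); returns a set of (a, b, c)."""
    index = {}
    for b, c in right:
        index.setdefault(b, []).append(c)
    return {(a, b, c) for a, b in left for c in index.get(b, [])}


n = 50
R = S = T = make_star(n)

# First join of the plan: R(X,Y) joined with S(Y,Z).
intermediate = join_on_middle(R, S)

# Second join: keep only the tuples that close the triangle through T(Z,X).
output = {(x, y, z) for (x, y, z) in intermediate if (z, x) in T}

print(len(R), len(intermediate), len(output))
```

On this instance the intermediate result has n^2 + n tuples while the final output is empty. Because the three relations are identical and the query is cyclic, starting the plan with S ⨝ T or T ⨝ R gives the same intermediate size.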

Outline
  Motivation
  Full CQ
  Boolean CQ
  Conclusions

Maximum Output Size
max |Q(D)| over all databases D that satisfy the given statistics
E.g. R(X,Y) ∧ S(Y,Z), with |R|, |S| ≤ N:
  No other info: |Q(D)| ≤ N^2
  S.Y is a key: |Q(D)| ≤ N
  S.Y has degree ≤ d: |Q(D)| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X):
  No other info: |Q(D)| ≤ N^(3/2)
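Each of these bounds is met by some instance. The sketch below builds one small instance per case; the instances and the helper names `join` and `triangles` are assumptions made for illustration, not taken from the talk.

```python
import math


def join(R, S):
    """R(X,Y) joined with S(Y,Z); returns a set of (x, y, z)."""
    by_y = {}
    for y, z in S:
        by_y.setdefault(y, []).append(z)
    return {(x, y, z) for x, y in R for z in by_y.get(y, [])}


def triangles(R, S, T):
    """R(X,Y) ∧ S(Y,Z) ∧ T(Z,X), computed naively."""
    return {(x, y, z) for (x, y, z) in join(R, S) if (z, x) in T}


N = 64

# No other info: every tuple shares a single Y value, giving N^2 outputs.
no_info = len(join({(i, 0) for i in range(N)}, {(0, j) for j in range(N)}))

# S.Y is a key: each R tuple matches at most one S tuple, giving at most N outputs.
key = len(join({(i, i) for i in range(N)}, {(i, i) for i in range(N)}))

# S.Y has degree d: each Y value occurs in exactly d tuples of S, giving d*N outputs.
d = 4
R_deg = {(i, i % (N // d)) for i in range(N)}
S_deg = {(j % (N // d), j) for j in range(N)}
degree = len(join(R_deg, S_deg))

# Triangle query: R = S = T = [k] x [k] with k = sqrt(N), giving k^3 = N^(3/2) outputs.
k = math.isqrt(N)
grid = {(a, b) for a in range(k) for b in range(k)}
tri = len(triangles(grid, grid, grid))

print(no_info, key, degree, tri)
```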

Main Principle
  Find an information-theoretic proof of the upper bound, or of the submodular width
  Convert the proof to an algorithm

Proof of Upper Bound
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
Database D → entropic function H
[Tables: database D with relations R(X,Y), S(Y,Z), T(Z,X), and the output Q(D). Each output tuple is assigned probability 1/5; the induced marginal probabilities on the tuples of R, S, T are 1/5 or 2/5. Individual rows lost in extraction.]
H(XYZ) = log |Q(D)|
H(XY) ≤ log N_R
H(YZ) ≤ log N_S
H(XZ) ≤ log N_T
H(Z|Y) ≤ log deg_S(z|y)
The constraints on H come from cardinalities, functional dependencies, and max degrees
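The construction of H can be checked numerically: take the uniform distribution on Q(D) and compute marginal entropies. The sketch below uses a small hypothetical database (not the tables from the slide, whose rows did not survive extraction); the function name `entropy` and the position-based indexing are choices made for this example.

```python
import math
from collections import Counter


def entropy(outputs, positions):
    """Entropy in bits of the uniform distribution on `outputs`,
    marginalized onto the given tuple positions."""
    counts = Counter(tuple(t[i] for i in positions) for t in outputs)
    total = len(outputs)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical relations, chosen so that the output has 5 tuples.
R = {("a", 3), ("a", 2), ("b", 2), ("d", 3)}
S = {(3, "m"), (2, "q"), (3, "q")}
T = {("m", "a"), ("q", "a"), ("q", "b"), ("m", "d")}

# Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X), computed naively.
Q = [(x, y, z) for (x, y) in R for (y2, z) in S if y == y2 and (z, x) in T]

H_xyz = entropy(Q, (0, 1, 2))
H_xy = entropy(Q, (0, 1))
H_yz = entropy(Q, (1, 2))
H_xz = entropy(Q, (0, 2))
H_z_given_y = H_yz - entropy(Q, (1,))

# Maximum number of Z values per Y value in S.
max_deg = max(Counter(y for y, _ in S).values())

print(len(Q), H_xyz, H_xy, H_yz, H_xz, H_z_given_y)
```

Positions 0, 1, 2 stand for X, Y, Z. The marginal on (X,Y) is supported on R, which is why its entropy cannot exceed log N_R; the same reasoning applies to S and T.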

Proof of Upper Bound
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
|R|,|S|,|T| ≤ N implies |Q(D)| ≤ N^(3/2)
3 log N ≥ h(XY) + h(YZ) + h(XZ)
        ≥ h(XYZ) + h(Y) + h(XZ)       (submodularity)
        ≥ h(XYZ) + h(XYZ) + h(∅)      (submodularity)
        = 2 h(XYZ)
        = 2 log |Q(D)|
Shearer’s inequality: h(XY) + h(YZ) + h(XZ) ≥ 2 h(XYZ)
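Each step of this chain can be spot-checked on concrete data. The sketch below draws small random relations (the sizes, the seed, and the names `entropy` and `check_chain` are arbitrary choices for illustration) and verifies every inequality of the derivation for the uniform distribution on the query output.

```python
import math
import random
from collections import Counter


def entropy(outputs, positions):
    """Entropy in bits of the uniform distribution on `outputs`,
    marginalized onto the given tuple positions."""
    counts = Counter(tuple(t[i] for i in positions) for t in outputs)
    total = len(outputs)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def check_chain(R, S, T, eps=1e-9):
    """Check log|R| + log|S| + log|T| >= h(XY)+h(YZ)+h(XZ)
    >= h(XYZ)+h(Y)+h(XZ) >= 2 h(XYZ) on the output of the triangle query."""
    Q = [(x, y, z) for (x, y) in R for (y2, z) in S if y == y2 and (z, x) in T]
    if not Q:
        return True  # nothing to check on an empty output
    h = lambda *pos: entropy(Q, pos)
    bound = math.log2(len(R)) + math.log2(len(S)) + math.log2(len(T))
    step1 = h(0, 1) + h(1, 2) + h(0, 2)
    step2 = h(0, 1, 2) + h(1) + h(0, 2)   # submodularity on XY, YZ
    step3 = 2 * h(0, 1, 2)                # submodularity on Y, XZ (h(∅) = 0)
    return bound + eps >= step1 and step1 + eps >= step2 and step2 + eps >= step3


rng = random.Random(0)


def random_relation(size=20, domain=6):
    return {(rng.randrange(domain), rng.randrange(domain)) for _ in range(size)}


all_ok = all(
    check_chain(random_relation(), random_relation(), random_relation())
    for _ in range(100)
)
print(all_ok)
```

A passing check is only a sanity test on sampled instances; the inequalities themselves hold for every entropic function.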

Proof to Algorithm
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
h(XY) + h(YZ) + h(XZ) ≥ 2 h(XYZ)
Proof step                      Algorithm step
h(XY) + h(YZ) + h(XZ)           R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
h(XYZ) + h(Y) + h(XZ)           R_light(X,Y) ∧ S(Y,Z)   [N^(3/2)]
h(XYZ) + h(XYZ)                 ∪ R_heavy(Y) ∧ T(Z,X)   [N^(3/2)]
R_light or R_heavy: degree(Y) ≤ or > N^(1/2)
Runtime Õ(N^(3/2))
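The heavy/light split can be written out directly. The following is a minimal sketch under the assumption that relations are Python sets of pairs and that N is taken as the largest input size; the function name `triangles_heavy_light` and the use of hash indexes are choices made for this example, not details from the talk.

```python
import math
from collections import defaultdict


def triangles_heavy_light(R, S, T):
    """Enumerate Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X) by splitting R on the degree of Y."""
    n = max(len(R), len(S), len(T), 1)
    threshold = math.isqrt(n)  # plays the role of N^(1/2)

    xs_by_y = defaultdict(list)
    for x, y in R:
        xs_by_y[y].append(x)
    zs_by_y = defaultdict(list)
    for y, z in S:
        zs_by_y[y].append(z)
    R_set, S_set, T_set = set(R), set(S), set(T)

    out = set()
    for y, xs in xs_by_y.items():
        if len(xs) <= threshold:
            # Light Y: R_light(X,Y) ∧ S(Y,Z), at most N^(1/2) values of X
            # per tuple of S, then filter with T.
            for z in zs_by_y.get(y, []):
                for x in xs:
                    if (z, x) in T_set:
                        out.add((x, y, z))
        else:
            # Heavy Y: fewer than N^(1/2) such values; pair each with every
            # tuple of T(Z,X), then filter with R and S.
            for z, x in T_set:
                if (x, y) in R_set and (y, z) in S_set:
                    out.add((x, y, z))
    return out


R = {(1, 2), (1, 3), (2, 3)}
S = {(2, 3), (3, 1)}
T = {(3, 1), (1, 2), (1, 1)}
demo = sorted(triangles_heavy_light(R, S, T))
print(demo)
```

Both branches do at most about N^(3/2) work: the light branch because each tuple of S meets at most N^(1/2) values of X, the heavy branch because fewer than N^(1/2) values of Y can have degree above N^(1/2).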

Another Quick Example
Q(X,Y,Z,U) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,U) ∧ A(X,Z,U) ∧ B(X,Y,U)
3 log N ≥ h(XY) + h(YZ) + h(ZU) + h(U|XZ) + h(X|YU)
        ≥ h(XY) + h(YZ) + h(ZU) + h(U|XYZ) + h(X|YZU)
        ≥ h(XYZ) + h(Y) + h(ZU) + h(U|XYZ) + h(X|YZU)
        = h(XYZU) + h(Y) + h(ZU) + h(X|YZU)
        ≥ h(XYZU) + h(YZU) + h(X|YZU)
        = 2 h(XYZU)
Runtime Õ(N^(3/2))

Enumeration Problem: Discussion
Cardinalities [Atserias,Grohe,Marx’08, Ngo,Re,Rudra’13]:
  Entropic bound = polymatroid bound
  Algorithm for Q(D) has a single log factor
Cardinalities + FDs + max degrees:
  Entropic bound ≨ polymatroid bound
  Algorithm for Q(D) has a polylog factor

Background: Tree Decomposition
Informally: TD = a tree where each node t represents an enumeration problem
Fractional hypertree width [Grohe,Marx’14]: min_tree max_{node t} max_D
Submodular width [Marx’2013]: max_D min_tree max_{node t}
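The only difference between the two widths is the order of min and max, which a toy computation makes concrete. The numbers below are invented purely for illustration (two hypothetical inputs `h1`, `h2` and two hypothetical trees `TA`, `TB`, each with two nodes); they do not come from any particular query.

```python
# bag_cost[h][tree] lists the cost of each node of `tree` on input `h`.
# All values are made up for this illustration.
bag_cost = {
    "h1": {"TA": [2.0, 1.0], "TB": [1.0, 1.0]},
    "h2": {"TA": [1.0, 1.0], "TB": [2.0, 1.0]},
}
trees = ["TA", "TB"]


def fhtw_style(bag_cost, trees):
    """min over trees of (max over nodes and inputs): one tree must work for every input."""
    return min(max(max(bag_cost[h][t]) for h in bag_cost) for t in trees)


def subw_style(bag_cost, trees):
    """max over inputs of (min over trees of max over nodes): the tree may depend on the input."""
    return max(min(max(bag_cost[h][t]) for t in trees) for h in bag_cost)


fixed_tree = fhtw_style(bag_cost, trees)
adaptive_tree = subw_style(bag_cost, trees)
print(fixed_tree, adaptive_tree)
```

Here each tree is bad on one of the two inputs, so fixing a single tree costs 2.0, while choosing the tree per input costs 1.0. In general the max-min quantity is never larger than the min-max quantity.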

Q() = ∃x∃y∃z∃u R(x,y) ∧ S(y,z) ∧ T(z,u) ∧ K(u,x)
|R|,|S|,|T|,|K| ≤ N
O(N^(3/2)) algorithm [Alon,Yuster,Zwick’97]
Tree decompositions (two nodes each):
  {R(x,y), S(y,z)} and {T(z,u), K(u,x)}
  {S(y,z), T(z,u)} and {K(u,x), R(x,y)}
min_tree max_{node t} max_D: Runtime Õ(N^2) (suboptimal)

Q() = ∃x∃y∃z∃u R(x,y) ∧ S(y,z) ∧ T(z,u) ∧ K(u,x)
|R|,|S|,|T|,|K| ≤ N
O(N^(3/2)) algorithm [Alon,Yuster,Zwick’97]
max_D min_tree max_{node t}
Tree decompositions:
  T1: {R(x,y), S(y,z)} and {T(z,u), K(u,x)}
  T2: {S(y,z), T(z,u)} and {K(u,x), R(x,y)}
min( max(h(xyz),h(zux)), max(h(yzu),h(uxy)) )
  = max( min(h(xyz),h(yzu)), min(h(xyz),h(uxy)), min(h(zux),h(yzu)), min(h(zux),h(uxy)) )
  ≤ 3/2 log N
E.g. for the term min(h(xyz),h(yzu)):
3 log N ≥ h(xy) + h(yz) + h(zu)
        ≥ h(xyz) + h(y) + h(zu)
        ≥ h(xyz) + h(yzu) + h(∅)
        ≥ 2 min(h(xyz),h(yzu))
subw(Q) = 3/2 log N

Proof to Algorithm
Use the proof of:
  h(xyz) + h(yzu) ≤ h(xy) + h(yz) + h(zu)
to compute the disjunctive datalog rule:
  A(x,y,z) ∨ B(y,z,u) ← R(x,y) ∧ S(y,z) ∧ T(z,u)
(details omitted)
Runtime Õ(N^(3/2))
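Since the slide omits the details, the following is only one possible way to satisfy such a rule, assuming a heavy/light split of R on y in the same spirit as the triangle example; it is not claimed to be the authors' algorithm. The function name `disjunctive_rule` and the set-of-pairs representation are assumptions made for this sketch.

```python
import math
from collections import defaultdict


def disjunctive_rule(R, S, T):
    """Return tables (A, B) such that every (x, y, z, u) with R(x,y), S(y,z), T(z,u)
    has (x, y, z) in A or (y, z, u) in B."""
    n = max(len(R), len(S), len(T), 1)
    threshold = math.isqrt(n)  # plays the role of N^(1/2)

    xs_by_y = defaultdict(list)
    for x, y in R:
        xs_by_y[y].append(x)
    us_by_z = defaultdict(list)
    for z, u in T:
        us_by_z[z].append(u)

    A, B = set(), set()
    for y, z in S:
        xs = xs_by_y.get(y, [])
        if len(xs) <= threshold:
            # Light y: store the (x, y, z) prefixes, at most N^(1/2) per tuple of S.
            for x in xs:
                A.add((x, y, z))
        else:
            # Heavy y: fewer than N^(1/2) such values, each paired with tuples of T.
            for u in us_by_z.get(z, []):
                B.add((y, z, u))
    return A, B


R = {(i, i % 3) for i in range(50)}
S = {(y, z) for y in range(3) for z in range(10)}
T = {(z, u) for z in range(10) for u in range(5)}
A, B = disjunctive_rule(R, S, T)
print(len(A), len(B))
```

Every path through a light y is covered by A and every path through a heavy y is covered by B, and both tables stay within about N^(3/2) tuples.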

Conclusions
Query evaluation summary:
  Information theory → Proof
  Proof → Algorithm
Open problems:
  Better “Proof → Algorithm”
  Fine-grained lower bounds
Thank You! Questions?