Published by Millicent Garrison. Modified over 9 years ago.
1
Answering Conjunctive Queries with Inequalities
Paris Koutris (1), Tova Milo (2), Sudeepa Roy (1), Dan Suciu (1)
ICDT 2015
(1) University of Washington, (2) Tel Aviv University
2
Problem
What is the combined complexity of computing conjunctive queries with inequalities (CQ≠)?
Query (q, I):
  q = R(x,y), S(y,z), T(z,w)
  I = {x ≠ z, y ≠ w}
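As a point of reference, the query (q, I) on this slide can be evaluated naively by joining the three relations and then filtering on the inequalities. The sketch below assumes small in-memory relations stored as Python sets; the sample tuples and the name `eval_q_with_ineq` are made up for illustration and do not come from the talk.

```python
def eval_q_with_ineq(R, S, T):
    """Naive evaluation of q = R(x,y), S(y,z), T(z,w) with I = {x != z, y != w}."""
    answers = set()
    for (x, y) in R:
        for (y2, z) in S:
            if y2 != y:
                continue              # join condition: R.y = S.y
            for (z2, w) in T:
                if z2 != z:
                    continue          # join condition: S.z = T.z
                if x != z and y != w:  # the inequalities in I
                    answers.add((x, y, z, w))
    return answers


# Hypothetical sample instance (not from the slides).
R = {(1, 2), (1, 1)}
S = {(2, 1), (2, 3), (1, 2)}
T = {(1, 2), (3, 2), (3, 1), (2, 3)}
result = eval_q_with_ineq(R, S, T)
```

On this instance the join without inequalities has four results; two of them violate x ≠ z or y ≠ w and are filtered out.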
3
Example: Path Query
Path query (of length k):
  P_k = R_1(x_1, x_2), R_2(x_2, x_3), …, R_k(x_k, x_{k+1})
Acyclic query, hence polynomial combined complexity.
[Figure: path x_1, x_2, x_3, ..., x_k, x_{k+1}, with consecutive variables joined by edges R_1, R_2, R_3, ..., R_k]
4
Example: Path Query
Path query + inequalities:
  P_k = R_1(x_1, x_2), R_2(x_2, x_3), …, R_k(x_k, x_{k+1})
  I = {x_i ≠ x_j, for all i < j}
Equivalent to Hamiltonian path: NP-hard.
[Figure: path x_1, ..., x_{k+1} with edges R_1, ..., R_k, together with the inequality graph]
5
Example: Path Query
Path query + inequalities:
  P_k = R_1(x_1, x_2), R_2(x_2, x_3), …, R_k(x_k, x_{k+1})
  I = {x_i ≠ x_{i+2}, for all i}
Polynomial combined complexity.
[Figure: path x_1, ..., x_{k+1} with edges R_1, ..., R_k]
6
Contribution
How does the combined complexity of computing CQs change when we add inequalities?
Given any black-box algorithm that computes q, we can compute (q, I) with a g(q, I) log(|D|) blowup.
Given any Selection-Projection-Join plan that computes q, we can compute (q, I) with an f(q, I) blowup.
7
Outline
  Color Coding
  The Main Technique
  Query Plans for Inequalities
8
Background
[Papadimitriou, Yannakakis ‘97] Let q be a Boolean acyclic CQ≠ and D be a database instance. Then q can be evaluated in time [bound not preserved in this transcript], where k = number of variables in the inequality graph.
This is a fixed-parameter tractability result.
9
Color Coding: Idea [Alon, Yuster, Zwick ‘97]
Pick a random coloring h: Dom → {1, …, k}, which maps values to k colors.
If a tuple t belongs to the answer of the full query, then the colors satisfy the inequalities with probability ≥ e^{-k}.
Example: q = R(x,y), S(y,z), T(z,w), I = {x ≠ z, y ≠ w}
  tuple         a  b  c  d
  coloring #1   1  2  1  4   (not valid: a and c share color 1, but x ≠ z is required)
  coloring #2   1  2  3  3   (valid)
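The coloring test on this slide can be sketched in a few lines. The names `colors_satisfy` and `all_distinct_probability` are illustrative assumptions, not from the talk. The second function shows one standard way to see the e^{-k} bound: if k distinct values all receive distinct colors, every inequality is certainly satisfied, and under a uniform random coloring that happens with probability k!/k^k ≥ e^{-k}.

```python
import math

INEQS = [("x", "z"), ("y", "w")]   # I = {x != z, y != w}


def colors_satisfy(h, binding, ineqs=INEQS):
    """True if, for every inequality u != v, the values bound to u and v get different colors."""
    return all(h[binding[u]] != h[binding[v]] for u, v in ineqs)


# The answer tuple (a, b, c, d) binds x, y, z, w in that order.
binding = {"x": "a", "y": "b", "z": "c", "w": "d"}
coloring_1 = {"a": 1, "b": 2, "c": 1, "d": 4}   # a and c collide, so x != z is not witnessed
coloring_2 = {"a": 1, "b": 2, "c": 3, "d": 3}   # c and d collide, but z != w is not required


def all_distinct_probability(k):
    """Chance that k distinct values receive k distinct colors under a uniform random coloring."""
    return math.factorial(k) / k**k
```

Note that coloring #2 is accepted even though two values share a color: only pairs that appear in an inequality need distinct colors.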
10
Color Coding: Theorem
Theorem. Let q be a CQ that can be computed in time T(|q|, |D|). Then (q, I) can be computed in time [bound not preserved in this transcript].
Color coding requires the construction of a k-perfect hash family for every instance.
There is an additional log(|D|) factor.
The algorithm is oblivious to the combined structure of the query and the inequalities.
11
Outline
  Color Coding
  The Main Technique
  Query Plans for Inequalities
12
Main Technique
q = R(x_1, …, x_m), S(y_1, …, y_l) + inequalities
How do we compute (q, I)?
Naive approach: compute the Cartesian product, then apply the inequalities, in time O(ml|R||S|).
IDEA: compress R to a representation R’ whose size is independent of |R|, then compute the product of R’ and S.
13
Running Example
R(x_1, x_2) = {(1,1), (1,2), (1,4), (1,8), (2,3), (2,1), (3,2), (5,2), (2,2), (2,4)}
[Figure: bipartite inequality graph H between x_1, x_2 and y_1, y_2, y_3]
14
H-Accepted Tuples
A tuple t over the schema of S is H-accepted by R if, for some t’ in R, t and t’ satisfy the inequalities in H.
With R(x_1, x_2) and H from the running example:
  t = (2,1,3) is H-accepted.
  t = (2,1,2) is not!
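The definition can be checked directly on the running example. The transcript does not preserve the drawing of H, so the edge set below (x_1–y_1, x_1–y_2, x_2–y_2, x_2–y_3) is an assumption, chosen because it agrees with the examples given on the slides. The function name `h_accepted` is illustrative.

```python
# ASSUMPTION: edges of H as (column of R, column of S) index pairs, i.e.
# x1-y1, x1-y2, x2-y2, x2-y3. Inferred from the slide examples, not shown in the transcript.
H = [(0, 0), (0, 1), (1, 1), (1, 2)]

R = [(1, 1), (1, 2), (1, 4), (1, 8), (2, 3),
     (2, 1), (3, 2), (5, 2), (2, 2), (2, 4)]


def h_accepted(t, R, H=H):
    """True if some tuple r in R satisfies every inequality r[i] != t[j] of H."""
    return any(all(r[i] != t[j] for i, j in H) for r in R)


accepted = h_accepted((2, 1, 3), R)   # witnessed by (3,2) and (5,2)
rejected = h_accepted((2, 1, 2), R)   # every tuple of R clashes on some edge
```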
15
H-Equivalence
Relations R_1, R_2 are H-equivalent if, for any tuple t, t is H-accepted by R_1 if and only if t is H-accepted by R_2.
Lemma. There exists a sub-instance R’ of R such that:
  R’ and R are H-equivalent;
  |R’| ≤ f(H), independent of R;
  R’ can be computed in time O(f(H) |R|).
16
H-Forbidden Tuples
A tuple t over Dom ∪ {-} is H-forbidden for R if, for every tuple t’ in R, some inequality between t and t’ is violated.
With R(x_1, x_2) from the running example:
  t = (1,2,3) is H-forbidden.
  t = (1,2,-) is also H-forbidden.
There are infinitely many H-forbidden tuples, but only finitely many minimally H-forbidden ones.
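A partial tuple can be tested the same way, with `None` standing in for "-". As before, the edge set of H is an assumption inferred from the slide examples (the figure is not in the transcript), and `h_forbidden` is an illustrative name.

```python
# ASSUMPTION: H has edges x1-y1, x1-y2, x2-y2, x2-y3, written as
# (column of R, column of S) index pairs; inferred from the slide examples.
H = [(0, 0), (0, 1), (1, 1), (1, 2)]

R = [(1, 1), (1, 2), (1, 4), (1, 8), (2, 3),
     (2, 1), (3, 2), (5, 2), (2, 2), (2, 4)]


def h_forbidden(t, R, H=H):
    """True if every r in R violates some inequality on a position of t that is set.

    Positions holding None play the role of '-' and never cause a violation.
    """
    return all(
        any(t[j] is not None and r[i] == t[j] for i, j in H)
        for r in R
    )


full = h_forbidden((1, 2, 3), R)
partial = h_forbidden((1, 2, None), R)
not_forbidden = h_forbidden((None, None, 1), R)   # (1,2) in R does not clash with y3 = 1
```

Since (1,2,-) is already forbidden, every completion of it, including (1,2,3), is forbidden as well.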
17
The Algorithm (slides 17–26)
The slides build a tree of H-forbidden tuples step by step, processing the tuples of R(x_1, x_2) in order:
R = (1,1) (1,2) (1,4) (1,8) (2,3) (2,1) (3,2) (5,2) (2,2) (2,4)
  Start: root (-,-,-).
  Tuple (1,1): new nodes (1,-,-), (-,1,-), (-,-,1).
  Tuple (1,2): (1,-,-) remains H-forbidden; (-,1,-) remains H-forbidden; (-,-,1) is not. New nodes (-,2,1), (-,1,1), (1,-,1).
  Tuple (1,4): only the rightmost node needs expansion. New node (1,2,1).
  Tuple (1,8): the tuple expands no node.
  Tuple (2,3): new nodes (2,1,1), (1,2,-), (1,3,-), (1,-,3), (2,1,-), (-,1,3), (1,2,1), (1,3,1).
  Tuple (2,1): new nodes (1,3,1), (1,1,3), (1,2,3).
  Tuple (3,2): new nodes (2,1,2), (3,1,3). Some node should be expanded, but has no “space”.
  Tuple (5,2): added to the tree; no new nodes.
  Tuples (2,2), (2,4): the tree does not change.
  The last slide lists the tuples (1,2,-), (1,2,3), (2,1,2), (1,2,1).
27
Analysis
Relations with the same tree are H-equivalent.
Tuples that do not expand a node can be removed.
The tree has only f(H) nodes.
E_H(R) = constant-size relation that is H-equivalent to R.
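The compression step can be sketched as follows. This is a simplified reading of the slides, not the authors' implementation: it keeps only the current leaves of the tree (as a set) rather than the full tree, and it treats a tuple of R as "expanding a node" whenever some leaf is not forbidden by it, even if that leaf has no free position left. The edge set of H is again an assumption inferred from the slide examples, and the names `violates`, `compress`, and `h_accepted` are illustrative.

```python
# ASSUMPTION: H has edges x1-y1, x1-y2, x2-y2, x2-y3, as (R column, S column) pairs.
H = [(0, 0), (0, 1), (1, 1), (1, 2)]

R = [(1, 1), (1, 2), (1, 4), (1, 8), (2, 3),
     (2, 1), (3, 2), (5, 2), (2, 2), (2, 4)]


def violates(r, t):
    """True if r clashes with the partial tuple t on some edge of H (None means '-')."""
    return any(t[j] is not None and t[j] == r[i] for i, j in H)


def compress(R, arity_s=3):
    """Return (kept tuples of R, remaining forbidden leaves)."""
    leaves = {(None,) * arity_s}          # root (-,-,-)
    kept = []
    for r in R:
        open_leaves = {t for t in leaves if not violates(r, t)}
        if not open_leaves:
            continue                      # r accepts nothing new: safe to drop
        kept.append(r)
        leaves -= open_leaves
        for t in open_leaves:
            # Each child fixes one free position so that r clashes with it.
            for i, j in H:
                if t[j] is None:
                    leaves.add(t[:j] + (r[i],) + t[j + 1:])
    return kept, leaves


def h_accepted(t, rel):
    """True if some tuple of rel satisfies all inequalities of H against t."""
    return any(not violates(r, t) for r in rel)


kept, leaves = compress(R)
```

On the running example this keeps seven of the ten tuples, dropping (1,8), (2,2), and (2,4), and ends with the leaves (1,2,-), (1,2,3), (2,1,2), (1,2,1).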
28
Outline
  Color Coding
  The Main Technique
  Query Plans for Inequalities
29
The H-Projection
Let R(A_1, …, A_m) be a relation, X a subset of A = {A_1, …, A_m}, and H a bipartite graph with vertex sets A \ X and some set B.
The size of the H-projection is at most f(H) times the size of the projection.
30
SPJ Plans
q(w) = R(x,y,’a’), S(y,z), T(z,w)
I = {x ≠ z, y ≠ w, x ≠ w}
[Figure: query plan over R(A,B,E), S(B’,C), T(C’,D) with operators σ_{E=‘a’}, join B=B’, Π_{C,E}, join C=C’, Π_D]
Inequalities cannot be trivially added to the plan.
31
SPJ Plans: Step One
Push projections to the top of the plan.
[Figure: the original plan (σ_{E=‘a’}, join B=B’, Π_{C,E}, join C=C’, Π_D over R(A,B,E), S(B’,C), T(C’,D)) next to the rewritten plan, which no longer contains Π_{C,E}]
32
SPJ Plans: Step Two
Add the inequalities after the projection: σ_{A≠C, B≠D, A≠D}.
Introduce the H-projection with the empty graph H0: Π_D^{H0}.
[Figure: plan over R(A,B,E), S(B’,C), T(C’,D) with operators σ_{E=‘a’}, join B=B’, join C=C’, σ_{A≠C, B≠D, A≠D}, Π_D^{H0}]
33
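SPJ Plans: Step Three
Push projections to their initial place.
[Figure: the plan from step two (Π_D^{H0} and σ_{A≠C, B≠D, A≠D} above the joins B=B’, C=C’ over σ_{E=‘a’}, R(A,B,E), S(B’,C), T(C’,D)) next to the rewritten plan with operators Π_D^{H0}, σ_{B≠D, A≠D}, Π_{C,E}^{H2}, σ_{A≠C}; the graph H2 is drawn on the vertices A, B, D]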
34
SPJ Plans: Step Three
Push projections to their initial place.
[Figure: two copies of the plan over R(A,B,E), S(B’,C), T(C’,D) with operators σ_{E=‘a’}, join B=B’, σ_{A≠C}, Π_{C,E}^{H2}, join C=C’, σ_{B≠D, A≠D}, Π_D^{H0}; the graph H2 is drawn on the vertices A, B, D]
35
Main Result
Theorem. Let q be a CQ that can be evaluated in time T(|q|, |D|) using a Select-Project-Join plan. Then we can compute (q, I) in time [bound not preserved in this transcript].
The function g depends on the joint structure of the query plan and the inequalities.
[Figure: path query x_1, x_2, x_3, ..., x_k, x_{k+1} with edges R_1, R_2, R_3, ..., R_k]
36
Conclusion
What is the complexity of computing CQ≠?
  Color coding for any CQ≠
  SPJ query plans with inequalities
In the paper: analysis of other structural properties.
Open questions:
  Can we apply the technique to arbitrary join algorithms?
  Other classes of queries: UCQs, Datalog
37
Thank you!
38
Color Coding: Algorithm
For any (valid) k-coloring c of the inequality graph, and any hash function h:
  For each relation R, compute the sub-relation R_{c,h} that satisfies the colors of c.
  Apply the black-box join algorithm on the sub-instance with relations R_{c,h}.
Output the union over all possible colorings and hash functions.
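The steps above can be sketched for the earlier example query q = R(x,y), S(y,z), T(z,w) with I = {x ≠ z, y ≠ w}. This is a toy version under stated assumptions: instead of a k-perfect hash family it simply enumerates every function h from the active domain to k colors, which is only feasible for tiny domains, and `join_q` stands in for the black-box join algorithm. The sample relations and all function names are made up for illustration.

```python
from itertools import product

VARS = ("x", "y", "z", "w")
INEQS = [("x", "z"), ("y", "w")]   # I = {x != z, y != w}


def join_q(R, S, T):
    """Stand-in black box: evaluates q = R(x,y), S(y,z), T(z,w) without inequalities."""
    return {(x, y, z, w)
            for (x, y) in R
            for (y2, z) in S if y2 == y
            for (z2, w) in T if z2 == z}


def restrict(rel, u, v, c, h):
    """Sub-relation R_{c,h}: tuples whose hashed values match the colors c gives to variables u, v."""
    return {t for t in rel if h[t[0]] == c[u] and h[t[1]] == c[v]}


def color_coding_eval(R, S, T, k=len(VARS)):
    dom = sorted({val for rel in (R, S, T) for t in rel for val in t})
    colorings = (dict(zip(VARS, cs)) for cs in product(range(k), repeat=len(VARS)))
    # A coloring is valid if variables joined by an inequality get different colors.
    valid = [c for c in colorings if all(c[u] != c[v] for u, v in INEQS)]
    out = set()
    # ASSUMPTION: brute force over all h: Dom -> colors replaces the k-perfect hash family.
    for hs in product(range(k), repeat=len(dom)):
        h = dict(zip(dom, hs))
        for c in valid:
            out |= join_q(restrict(R, "x", "y", c, h),
                          restrict(S, "y", "z", c, h),
                          restrict(T, "z", "w", c, h))
    return out


# Hypothetical sample instance (not from the slides).
R = {(1, 2), (1, 1)}
S = {(2, 1), (2, 3), (1, 2)}
T = {(1, 2), (3, 2), (3, 1), (2, 3)}
result = color_coding_eval(R, S, T)
```

Every tuple returned by a sub-instance satisfies the inequalities, because variables with different colors must carry values with different hashes, hence different values. Enumerating all h guarantees that each true answer is found by at least one pair (c, h).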