Published by Molly Nash. Modified over 9 years ago.
1
A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies
Benny Kimelfeld, IBM Research – Almaden
PODS 2012 (2012 ACM SIGMOD/PODS Conference), Scottsdale, Arizona, USA
2
Deletion Propagation
Translate a tuple deletion on the view back to the source relations … properly
Classic database problem
– Specializes the more general view-update problem
– [Dayal & Bernstein 1982; Cosmadakis & Papadimitriou 1984; Keller 1986; Cui & Widom 2001; Buneman & Khanna & Tan 2002; Cong & Fan & Geerts 2006; …]
Renewed motivation: debugging and causality for false positives [K, Vondrák, Williams, 2011]
Various definitions of “properly” were studied
– Minimize the view side effect: the number of view tuples lost, except the intentional one (this work!)
– Minimize the source side effect: the number of source tuples to delete; this equals maximal “responsibility” for an answer [Meliou et al., 2010]
3
Example: File Access
Access(u,f) :– UserGroup(u,g), GroupFile(g,f), i.e., Access = UserGroup ⋈ GroupFile

UserGroup         GroupFile         Access
user    group     group   file      user    file
Emma    ai        ai      a.txt     Emma    a.txt
Emma    db        ai      b.txt     Emma    b.txt
Olivia  os        db      a.txt     Olivia  a.txt
Olivia  db        db      b.txt     Olivia  b.txt
Jacob   ai        os      a.txt     Jacob   a.txt
                                    Jacob   b.txt

Delete source rows s.t. Emma won’t access a.txt. But, maintain maximum access permissions! [Cui & Widom 2001; Buneman et al. 2002]
4
Example: File Access
Access(u,f) :– UserGroup(u,g), GroupFile(g,f), with the UserGroup, GroupFile, and Access instances of slide 3
Delete source rows s.t. Emma won’t access a.txt. But, maintain maximum access permissions! [Cui & Widom 2001; Buneman et al. 2002]
5
Example: File Access
Access(u,f) :– UserGroup(u,g), GroupFile(g,f), with the UserGroup, GroupFile, and Access instances of slide 3
Delete source rows s.t. Emma won’t access a.txt. But, maintain maximum access permissions! [Cui & Widom 2001; Buneman et al. 2002]
For this instance there is a solution that is side-effect free (& hence has minimal side effect)
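The example can be checked mechanically. The following is a minimal Python sketch, not part of the talk: it recomputes Access as the join of the two source tables and tests one candidate deletion. The candidate (removing Emma's membership in ai and the db permission on a.txt) was picked by hand for illustration; the transcript does not say which tuples the slide highlighted.

```python
# Source relations of the File Access example (slide 3).
USER_GROUP = {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
              ("Olivia", "db"), ("Jacob", "ai")}
GROUP_FILE = {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
              ("db", "b.txt"), ("os", "a.txt")}


def access(user_group, group_file):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f): join on group, project (user, file)."""
    return {(u, f) for (u, g) in user_group for (g2, f) in group_file if g == g2}


full_view = access(USER_GROUP, GROUP_FILE)
unwanted = ("Emma", "a.txt")

# Hand-picked candidate deletion (an assumption of this sketch, not read off the slide).
kept_user_group = USER_GROUP - {("Emma", "ai")}
kept_group_file = GROUP_FILE - {("db", "a.txt")}
new_view = access(kept_user_group, kept_group_file)

# View tuples lost in addition to the intended one.
side_effect = (full_view - {unwanted}) - new_view
print(sorted(new_view), side_effect)
```

An empty `side_effect` together with `unwanted` missing from `new_view` means the candidate is side-effect free in the sense defined on the next slide.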
6
Formal Definitions
Schema S: relation symbols R1, …, Rm + functional dependencies (FDs) of the form Ri: attribute-set → attribute
Conjunctive Query (CQ) Q, with head variables (y1, y2, y3 below) and existential variables (x1, …, x4 below); each Ri(…) in the body is an atom. No self joins!
  Q(y1, y2, y3) :– R1(x1, y1), R2(x1, 'ibm'), R3(x2, y1, y2, x3), R4(x4, y3)
Input: DB D over S; answer a ∈ Q(D) to delete
Solution: E ⊆ D s.t. a ∉ Q(E)
Side-effect free: Q(E) = Q(D) ∖ {a}
Optimal: |Q(E)| is maximal
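To make the definitions of solution and optimal solution concrete, here is a brute-force Python sketch that tries every subset of source tuples to delete and keeps a solution E with the largest |Q(E)|. It is exponential in the database size and only meant for toy instances; the function and variable names are illustrative and do not come from the paper.

```python
from itertools import combinations

# File Access instance from the example slides.
DB = {
    "UserGroup": {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
                  ("Olivia", "db"), ("Jacob", "ai")},
    "GroupFile": {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
                  ("db", "b.txt"), ("os", "a.txt")},
}


def access(db):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f)."""
    return {(u, f) for (u, g) in db["UserGroup"]
            for (g2, f) in db["GroupFile"] if g == g2}


def optimal_solution(db, query, a):
    """Return (deleted facts, Q(E)) for a solution E maximizing |Q(E)|.

    Deletion sets are tried in order of increasing size, so among equally
    good solutions one with the fewest deletions is returned.
    """
    facts = [(rel, row) for rel, rows in db.items() for row in rows]
    best = None
    for k in range(len(facts) + 1):
        for deleted in combinations(facts, k):
            gone = set(deleted)
            kept = {rel: {row for row in rows if (rel, row) not in gone}
                    for rel, rows in db.items()}
            answers = query(kept)
            if a in answers:
                continue  # E is not a solution: the answer survives
            if best is None or len(answers) > len(best[1]):
                best = (gone, answers)
    return best


deleted, answers = optimal_solution(DB, access, ("Emma", "a.txt"))
print(deleted, len(answers))
```

On this instance the optimum keeps every answer except (Emma, a.txt), so the optimal solution is also side-effect free.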
7
Complexity Questions
What is the complexity of
– Deciding if a side-effect-free solution exists?
– Finding an optimal solution?
  – Or one with approximately minimal side effect?
  – Or one with an approximately maximal number of surviving answers?
  – The last two are not the same [K, Vondrák, Williams, 2011]
Data complexity:
– Fixed: schema S, CQ Q
– Input: DB D over S, answer a ∈ Q(D) to delete
8
Unirelation Algorithm (1Rel): Example [Buneman et al., 2002]
Delete a = (Emma, a.txt)
Access(u,f) :– UserGroup(u,g), GroupFile(g,f), with the UserGroup, GroupFile, and Access instances of slide 3
9
Unirelation Algorithm (1Rel): Example [Buneman et al., 2002]
Delete a = (Emma, a.txt)
Access(u,f) :– UserGroup(u,g), GroupFile(g,f), with the UserGroup, GroupFile, and Access instances of slide 3
This candidate is better than the previous one ⇒ selected solution
Recall: there is an even better solution (side-effect free)
10
1Rel: General Case
Undesired answer: a ∈ Q(D); Q has k atoms, over relations R1, R2, …, Rk of D
Solution i (i = 1, …, k): delete from Ri each tuple consistent with a
Select the best among solution 1, solution 2, …, solution k
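The general case can be sketched in a few lines of Python. This is an illustrative reading of the slide, not the authors' code, and it assumes that every term in the query body is a variable (no constants) and that "consistent with a" means the tuple agrees with a on every head variable of its atom. The helper names `evaluate` and `one_rel` are made up for this sketch.

```python
def evaluate(head, body, db):
    """Naive CQ evaluation: extend partial assignments one atom at a time."""
    assignments = [{}]
    for rel, terms in body:
        extended = []
        for mu in assignments:
            for row in db[rel]:
                nu = dict(mu)
                # setdefault binds a fresh variable or returns its earlier binding.
                if all(nu.setdefault(t, v) == v for t, v in zip(terms, row)):
                    extended.append(nu)
        assignments = extended
    return {tuple(mu[y] for y in head) for mu in assignments}


def one_rel(head, body, db, a):
    """Try one candidate per atom; return (relation, E, Q(E)) with the largest Q(E)."""
    target = dict(zip(head, a))  # value that a assigns to each head variable
    best = None
    for rel, terms in body:
        def consistent(row):
            # Existential variables impose no constraint (get falls back to v).
            return all(target.get(t, v) == v for t, v in zip(terms, row))
        kept = dict(db)
        kept[rel] = {row for row in db[rel] if not consistent(row)}
        answers = evaluate(head, body, kept)
        if best is None or len(answers) > len(best[2]):
            best = (rel, kept, answers)
    return best


DB = {
    "UserGroup": {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
                  ("Olivia", "db"), ("Jacob", "ai")},
    "GroupFile": {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
                  ("db", "b.txt"), ("os", "a.txt")},
}
HEAD = ("u", "f")
BODY = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]

rel, kept, answers = one_rel(HEAD, BODY, DB, ("Emma", "a.txt"))
print(rel, sorted(answers))
```

Under these assumptions the UserGroup candidate keeps four answers and the GroupFile candidate keeps three, so the former is selected; neither matches the side-effect-free solution, which deletes from both relations.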
11
Head Domination [K, Vondrák, Williams, 2011]
Q: a CQ over a schema S
G∃[Q]: nodes = atoms(Q); edges = “sharing ≥1 existential var.”
Head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. headVars(C) ⊆ vars(φ), where CC = connected components
Examples:
Q(y1, y2, y3) :– R1(x1, y1), R2(x1, y2), R3(y1, y2), R4(x2, y2, y3)
Q(y1, y2) :– R1(x, y1), R2(x, y2) (the shape of Access(u, f))
Q(y1, y2) :– R1(x1, y1), R2(x1, y2), R3(x1, y1, y2)
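The head-domination test is purely syntactic, so it is easy to sketch in Python. The code below builds the connected components of G∃[Q] and checks the containment condition for each; the query encoding (a head tuple plus a list of (relation, terms) atoms, all terms being variables) is an assumption of this sketch.

```python
def existential_components(head, body):
    """Connected components of G_exists[Q], as sets of atom indices."""
    head_vars = set(head)
    components = []
    for i, (_, terms) in enumerate(body):
        exist_i = set(terms) - head_vars
        merged, rest = {i}, []
        for comp in components:
            comp_exist = set().union(*(set(body[j][1]) for j in comp)) - head_vars
            if comp_exist & exist_i:
                merged |= comp  # atom i shares an existential variable with comp
            else:
                rest.append(comp)
        components = rest + [merged]
    return components


def has_head_domination(head, body):
    """Every component's head variables must all occur in a single atom of Q."""
    head_vars = set(head)
    for comp in existential_components(head, body):
        comp_head = set().union(*(set(body[j][1]) for j in comp)) & head_vars
        if not any(comp_head <= set(terms) for _, terms in body):
            return False
    return True


# The three example queries from the slide.
QA = (("y1", "y2", "y3"), [("R1", ("x1", "y1")), ("R2", ("x1", "y2")),
                           ("R3", ("y1", "y2")), ("R4", ("x2", "y2", "y3"))])
QB = (("y1", "y2"), [("R1", ("x", "y1")), ("R2", ("x", "y2"))])
QC = (("y1", "y2"), [("R1", ("x1", "y1")), ("R2", ("x1", "y2")),
                     ("R3", ("x1", "y1", "y2"))])

print([has_head_domination(*q) for q in (QA, QB, QC)])
```

The first and third queries have head domination (R3 covers {y1, y2} in both), while the second, which has the shape of the Access query, does not.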
12
Previous Dichotomy Theorem [KVW 2011]
Let Q be a CQ over a schema S (no self joins)
[K, Vondrák, Williams, 2011], no FDs:
Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)
otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (α_Q-approx.) optimal solution
Examples:
Q(y1, y2, y3) :– R1(x1, y1), R2(x1, y2), R3(y1, y2), R4(x2, y2, y3): PTime (1Rel)
Q(y1, y2) :– R1(x, y1), R2(x, y2) (the shape of Access(u, f)): NP-hard
Q(y1, y2) :– R1(x1, y1), R2(x1, y2), R3(x1, y1, y2): PTime
13
Access Example Revisited
Delete (Emma, a.txt) from Access = UserGroup ⋈ GroupFile (instances as on slide 3)
No FDs: NP-hard
group ← file: PTime
14
Access Example Revisited
Delete (Emma, a.txt) from Access = UserGroup ⋈ GroupFile (instances as on slide 3)
No FDs: NP-hard
user → group: PTime
group ← file: PTime
15
Access Example Revisited
Delete (Emma, a.txt) from Access = UserGroup ⋈ GroupFile (instances as on slide 3)
No FDs: NP-hard
user → group: PTime
group ← file: PTime
user ← group: PTime
16
Access Example Revisited
Delete (Emma, a.txt) from Access = UserGroup ⋈ GroupFile (instances as on slide 3)
No FDs: NP-hard
user → group: PTime
group ← file: PTime
user ← group: PTime
group → file: PTime
Every nontrivial set of FDs brings the problem to PTime
17
Additional Examples
Q(y, y1, y2) :– R1(y1, x1), R(x1, y, x2), R2(y2, x2)
Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)
Labels: PTime; NP-hard
18
Dichotomy with FDs
Let Q be a CQ over a schema S (no self joins)
[K, Vondrák, Williams, 2011], no FDs:
– Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)
– otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (α_Q-approx.) optimal solution
This paper (FDs):
– Q+ has functional head domination ⇒ 1Rel* returns an optimal solution (in PTime)
  (1Rel*: remove a tuple only if it is used for the undesired answer)
– otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (α_Q-approx.) optimal solution
Depending on the CQ and FDs, the problem is either straightforward or hard!
19
FDs Among Variables
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
FD: group → file gives g → f, and hence {u,g} → f
FD: user → group gives u → g
Both FDs together give u → f
Definition: for a CQ Q over schema S and U, V ⊆ variables(Q), U → V holds if ∀ D ∈ db(S) and ∀ μ1, μ2 ∈ hom(Q→D): μ1 = μ2 on U ⇒ μ1 = μ2 on V
20
The CQ Q+
Definition: for a CQ Q over schema S and U, V ⊆ variables(Q), U → V holds if ∀ D ∈ db(S) and ∀ μ1, μ2 ∈ hom(Q→D): μ1 = μ2 on U ⇒ μ1 = μ2 on V
Q+: add to Q’s head every x s.t. headVars → x
Example: Access(u,f) :– UserGroup(u,g), GroupFile(g,f) with group ← file: {u,f} → g ⇒ Access+(u,g,f) :– UserGroup(u,g), GroupFile(g,f)
Tractability Condition: Q+ has functional head domination
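The construction of Q+ can be sketched in Python as a closure computation. This sketch assumes that the FDs among variables are exactly those obtained by reading each schema FD through the atom of its relation and then closing under the usual attribute-closure rule; the slides define U → V semantically, so this is an illustrative shortcut, not a statement of the paper's procedure. The encoding of schema FDs as (relation, left-hand positions, right-hand position) with 0-based positions is also made up here.

```python
def variable_fds(body, schema_fds):
    """Translate schema FDs into FDs among the variables of the query body."""
    fds = []
    for rel, terms in body:
        for fd_rel, lhs_positions, rhs_position in schema_fds:
            if fd_rel == rel:
                fds.append((frozenset(terms[i] for i in lhs_positions),
                            terms[rhs_position]))
    return fds


def closure(start, var_fds):
    """All variables functionally determined by the variables in start."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in var_fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def q_plus_head(head, body, schema_fds):
    """Head of Q+: the head of Q plus every variable x with headVars -> x."""
    determined = closure(head, variable_fds(body, schema_fds))
    body_vars = dict.fromkeys(t for _, terms in body for t in terms)
    extra = [x for x in body_vars if x in determined and x not in head]
    return tuple(head) + tuple(extra)


BODY = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
FILE_TO_GROUP = [("GroupFile", (1,), 0)]   # group <- file
GROUP_TO_FILE = [("GroupFile", (0,), 1)]   # group -> file

print(q_plus_head(("u", "f"), BODY, FILE_TO_GROUP))
print(q_plus_head(("u", "f"), BODY, GROUP_TO_FILE))
```

With group ← file the head grows to contain u, f, and g (the slide writes it as Access+(u,g,f); the variable order here differs), whereas group → file leaves the head unchanged because neither u nor f determines g.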
21
Functional Head Domination
Q: a CQ over a schema S
G∃[Q]: nodes = atoms(Q); edges = “sharing ≥1 existential var.”
Head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. vars(φ) ⊇ headVars(C)
Functional head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. vars(φ) → headVars(C)
Example: Access(u,f) :– UserGroup(u,g), GroupFile(g,f) with group → file: g → f ⇒ {u,g} → {u,f}
Tractability Condition: Q+ has functional head domination
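Functional head domination replaces the containment test of head domination with a closure test. The Python sketch below takes the FDs among variables as given input, as a list of (left-hand variables, right-hand variable) pairs, and assumes they can be closed with the standard attribute-closure rule; the function names are illustrative.

```python
def closure(start, var_fds):
    """All variables functionally determined by the variables in start."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in var_fds:
            if set(lhs) <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def existential_components(head, body):
    """Connected components of G_exists[Q], as sets of atom indices."""
    head_vars = set(head)
    components = []
    for i, (_, terms) in enumerate(body):
        exist_i = set(terms) - head_vars
        merged, rest = {i}, []
        for comp in components:
            comp_exist = set().union(*(set(body[j][1]) for j in comp)) - head_vars
            if comp_exist & exist_i:
                merged |= comp
            else:
                rest.append(comp)
        components = rest + [merged]
    return components


def has_functional_head_domination(head, body, var_fds):
    """For each component C, some atom's variables must determine headVars(C)."""
    head_vars = set(head)
    for comp in existential_components(head, body):
        comp_head = set().union(*(set(body[j][1]) for j in comp)) & head_vars
        if not any(comp_head <= closure(terms, var_fds) for _, terms in body):
            return False
    return True


ACCESS = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
G_TO_F = [(("g",), "f")]   # group -> file
U_TO_G = [(("u",), "g")]   # user -> group

print(has_functional_head_domination(("u", "f"), ACCESS, []))
print(has_functional_head_domination(("u", "f"), ACCESS, G_TO_F))
print(has_functional_head_domination(("u", "f"), ACCESS, U_TO_G))
print(has_functional_head_domination(("u", "g", "f"), ACCESS, U_TO_G))
```

The last two calls illustrate why the tractability condition is stated for Q+ rather than Q: under user → group the original head (u, f) fails the test, while the extended head (u, g, f) of Access+ passes it because no existential variables remain.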
22
Examples
Tractability Condition: Q+ has functional head domination
Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2): PTime (1Rel*)
  {y, y1, y2} → x2, so Q+(y, y1, y2, x2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)
Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2): NP-hard
The two cases use the same CQ and differ only in the FDs of the schema.
23
Example: Key-Preserving Views
Theorem [Cong, Fan, Geerts, 2006]: Q preserves keys* ⇒ deletion propagation in PTime
*Each relation has a key; none of the key attributes are projected out
Tractability Condition: Q+ has functional head domination
For CQs w/o self joins, the theorem follows directly from our positive side:
Q preserves keys ⇒ Q+ has no existential vars ⇒ G∃[Q+] has no edges ⇒ Q+ trivially has functional head domination (every connected component is a node, dominated by itself…) ⇒ 1Rel* returns an optimal solution
24
About the Proof
The positive side is fairly simple
– … once the tractability condition is found
The negative side is intricate
– Reduction from the special case of the Access CQ
– Challenge: simulating Access(u,f) by an instance that satisfies all the FDs
– Central concept: graph separation on the variable graph of the CQ
Q(y1, y2) :– R1(y1, x), R2(x, y2) → Q'(y1, y2) :– R1(y1, x1, x), R2(x, x2, y2), with R3(x1, x2)
25
Conclusions & Ongoing Work
Studied deletion propagation in the presence of functional dependencies
Established a dichotomy in complexity:
– PTime by a straightforward algorithm vs.
– Hardness (of approximation)
Generalizes previously established special cases: no FDs, key-preserving views
Ongoing work: deletion of multiple answers
– Preview: trichotomy
  – Straightforward
  – Hard but approximable (by a constant factor)
  – Hard to approximate
Questions?