Operator Placement for In-Network Stream Query Processing.


1 Operator Placement for In-Network Stream Query Processing

2 Outline: Introduction; Preliminaries; Filter placement; Extensions; Conclusions

3 Introduction: In-network query processing. Consider a video surveillance application. Environment: (figure). Target: suspicious activity, i.e. a dark scene with movement. Needed: a filter that calculates intensity (F1) and a filter that detects sufficient motion (F2).

4 Introduction: Previous work pushes all filters down, since CPU cost << communication cost. But what if the queries involve expensive predicates? Objective: place each filter at the "best" node based on its selectivity and cost, so as to minimize the overall cost.

5 Introduction: Operator placement problem. Tradeoff: to lower computational cost, put filters on the nodes higher up; to lower transmission cost, put them on the nodes lower down. Candidates: an m-level hierarchy and n filters give m^n possible solutions. In this paper: the key idea is to model network links as filters. Contents: define the problem; give a greedy algorithm and show that it is not optimal; present a polynomial-time optimal algorithm; extend to multiway stream joins; …
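To make the m^n search space concrete, the following is a minimal brute-force sketch. It assumes the illustrative numbers that appear in the worked examples later in the deck (four filters with selectivity 1/2, three links, per-hop cost scaling factors 1/5, 1/2, 1/4); the names `plan_cost` and `brute_force` are hypothetical helpers, not part of the presentation.

```python
from fractions import Fraction as Fr
from itertools import product

# Illustrative data (assumed; read off the arithmetic of the later examples).
COST_AT_N1 = [Fr(200), Fr(400), Fr(1300), Fr(2500)]  # per-tuple filter cost on N_1
SELECTIVITY = [Fr(1, 2)] * 4                         # s(F_k)
LINK_COST = [Fr(700), Fr(500), Fr(300)]              # l_1, l_2, l_3
GAMMA = [Fr(1, 5), Fr(1, 2), Fr(1, 4)]               # cost scaling from N_i to N_{i+1}


def plan_cost(placement):
    """Expected per-tuple cost when filter k runs on node placement[k] (1-based).

    Filters sharing a node run in increasing rank c / (1 - s).
    """
    m = len(LINK_COST) + 1
    total, rate, scale = Fr(0), Fr(1), Fr(1)
    for node in range(1, m + 1):
        here = [k for k, p in enumerate(placement) if p == node]
        here.sort(key=lambda k: COST_AT_N1[k] / (1 - SELECTIVITY[k]))
        for k in here:
            total += rate * scale * COST_AT_N1[k]  # filter cost, scaled for this node
            rate *= SELECTIVITY[k]                 # surviving fraction of tuples
        if node < m:
            total += rate * LINK_COST[node - 1]    # ship survivors one hop up
            scale *= GAMMA[node - 1]
    return total


def brute_force():
    """Enumerate all m^n placements and return (count, best placement, best cost)."""
    m, n = len(LINK_COST) + 1, len(COST_AT_N1)
    plans = list(product(range(1, m + 1), repeat=n))
    best = min(plans, key=plan_cost)
    return len(plans), best, plan_cost(best)
```

With m = 4 nodes and n = 4 filters this enumerates 4^4 = 256 placements, which is fine for a toy instance but grows exponentially in n, hence the interest in a polynomial-time algorithm.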

6 Preliminaries: Consider a linear chain of nodes. Notation: S = data acquired by node N_1; F = { F_1, F_2, …, F_n }. Query: [formula not captured in the transcript]

7 Cost Model: Three quantities. (1) Selectivity of filter F, s(F): the fraction of the tuples in stream S that are expected to satisfy F, so an input rate r becomes an output rate s(F)·r. (2) Cost of filter F, c(F, i): the per-tuple cost of executing F on node N_i, with c(F, i+1) = γ_i c(F, i) and γ_i ≤ 1 (if γ_i > 1 ). (3) Cost of network transmission, l_i: the per-tuple cost of transmitting from N_i to N_{i+1}.

8 Cost Model: Notation. P(F) = i if filter F is executed on N_i; F_i = { F | P(F) = i }; for a sequence F' = F'_1, F'_2, …, F'_n', c(F', i) = the cost per tuple of executing F' at node N_i; r(F_i) = F_i in rank order (ref. J. Hellerstein and M. Stonebraker, "Predicate migration: Optimizing queries with expensive predicates", 1993). Cost on a single node: [formula not captured]. Overall cost: [formula not captured].
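The two cost formulas on this slide were images and are missing from the transcript. The block below is a reconstruction, not the authors' exact notation: it is chosen so that it reproduces the arithmetic of Example 2.2 and the rank values quoted in Example 3.3. The convention l_m = 0 (no link above the last node N_m) and the calligraphic \mathcal{F}_i for the set of filters placed on N_i are assumptions.

```latex
% Rank of a filter on node N_i (e.g. 200 / (1 - 1/2) = 400 for F_1 on N_1)
r(F) = \frac{c(F, i)}{1 - s(F)}

% Cost per tuple of running the sequence F' = F'_1, ..., F'_{n'} on node N_i
c(F', i) = \sum_{j=1}^{n'} \Bigl( \prod_{k=1}^{j-1} s(F'_k) \Bigr)\, c(F'_j, i)

% Overall cost of a placement P on the chain N_1, ..., N_m
c(P) = \sum_{i=1}^{m} \Bigl( \prod_{F \,:\, P(F) < i} s(F) \Bigr)
       \Bigl[ c\bigl(r(\mathcal{F}_i), i\bigr)
              + \Bigl( \prod_{F \in \mathcal{F}_i} s(F) \Bigr)\, l_i \Bigr],
\qquad l_m = 0
```

Here r(F) denotes the rank of a single filter, while r(\mathcal{F}_i) denotes the filters of node N_i arranged in rank order, following the slide's own overloaded use of r.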

9 Example 2.2 (s(F) = 1/2 for every filter): c(P) = c(F_1, 1) + s(F_1) c(F_2, 1) + s(F_1) s(F_2) [ l_1 + l_2 + c(F_3, 3) ] + s(F_1) s(F_2) s(F_3) [ l_3 + c(F_4, 4) ] = 200 + (1/2) 400 + (1/2)(1/2) [ 700 + 500 + (1/5)(1/2) 1300 ] + (1/2)(1/2)(1/2) [ 300 + (1/5)(1/2)(1/4) 2500 ] = 200 + 200 + 332.5 + 45.3125 = 777.8125
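The arithmetic of Example 2.2 can be checked mechanically. The sketch below assumes the input values implied by the numbers on the slide (filter costs 200, 400, 1300, 2500 on N_1; link costs 700, 500, 300; cost scaling factors 1/5, 1/2, 1/4 between consecutive nodes); `plan_cost` is a hypothetical helper name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as Fr

# Values implied by the arithmetic of Example 2.2 (assumed, not stated as a table).
COST_AT_N1 = [Fr(200), Fr(400), Fr(1300), Fr(2500)]  # c(F_k, 1)
SELECTIVITY = [Fr(1, 2)] * 4                         # s(F_k) = 1/2
LINK_COST = [Fr(700), Fr(500), Fr(300)]              # l_1, l_2, l_3
GAMMA = [Fr(1, 5), Fr(1, 2), Fr(1, 4)]               # c(F, i+1) = gamma_i * c(F, i)


def plan_cost(placement):
    """Expected per-tuple cost of a plan.

    placement[k] is the 1-based node that runs filter k; filters are applied
    in index order, so the placement must be non-decreasing.
    """
    assert list(placement) == sorted(placement)
    total = Fr(0)
    rate = Fr(1)   # fraction of tuples that survived the filters so far
    scale = Fr(1)  # product of the gammas up to the current node
    k = 0
    for node in range(1, len(LINK_COST) + 2):
        while k < len(placement) and placement[k] == node:
            total += rate * scale * COST_AT_N1[k]
            rate *= SELECTIVITY[k]
            k += 1
        if node <= len(LINK_COST):
            total += rate * LINK_COST[node - 1]  # transmit survivors to the next node
            scale *= GAMMA[node - 1]
    return total


# The plan of Example 2.2: F_1 and F_2 on N_1, F_3 on N_3, F_4 on N_4.
EXAMPLE_2_2 = plan_cost([1, 1, 3, 4])
```

`float(EXAMPLE_2_2)` evaluates to 777.8125, matching the slide.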

10 Filter Placement: 1. Greedy algorithm 2. Optimal algorithm

11 Greedy algorithm: Notation. c(P, i) = the part of the total cost c(P) incurred at N_i, including transmission from N_i to N_{i+1}. The network link from N_i to N_{i+1} is modeled as a filter F^l_i with s(F^l_i) = 0 and c(F^l_i, 1) = l_i.

12 Example 3.3: At N_1, r(F_1) = 400, r(F_2) = 800, r(F_3) = 2600, r(F_4) = 5000, and r(F^l_1) = 700 > r(F_1), so only F_1 runs at N_1. At N_2, r(F_2) = 160, r(F_3) = 520, r(F_4) = 1000, and r(F^l_2) = 500 > r(F_2), so only F_2 runs at N_2. At N_3, r(F_3) = 260, r(F_4) = 500, and r(F^l_3) = 300 > r(F_3), so only F_3 runs at N_3. At N_4, only F_4 remains. c(P) = 200 + 350 + 40 + 125 + 32.5 + 37.5 + 7.8125 = 792.8125
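The greedy rule walked through in Example 3.3 can be sketched as follows. This is a reconstruction from the slide, under the assumptions that rank is c / (1 - s), that the outgoing link counts as a filter with selectivity 0 (so its rank equals l_i), and that a filter whose rank ties with the link is sent onward; the tie rule and the function names are hypothetical.

```python
from fractions import Fraction as Fr


def rank(cost, sel):
    """Rank of a filter: cost per unit of filtered-out fraction (requires sel < 1)."""
    return cost / (1 - sel)


def greedy(cost_at_n1, selectivity, link_cost, gamma):
    """At each node, run the pending filters whose local rank is below the link's rank."""
    pending = sorted(range(len(cost_at_n1)),
                     key=lambda k: rank(cost_at_n1[k], selectivity[k]))
    placement = [None] * len(cost_at_n1)
    total, rate, scale = Fr(0), Fr(1), Fr(1)
    m = len(link_cost) + 1
    for node in range(1, m + 1):
        # Link modeled with selectivity 0, so its rank is just l_i; no link after N_m.
        link_rank = link_cost[node - 1] if node < m else None
        while pending:
            k = pending[0]
            local_cost = scale * cost_at_n1[k]
            if link_rank is not None and rank(local_cost, selectivity[k]) >= link_rank:
                break  # cheaper (by rank) to transmit first
            pending.pop(0)
            placement[k] = node
            total += rate * local_cost
            rate *= selectivity[k]
        if node < m:
            total += rate * link_cost[node - 1]
            scale *= gamma[node - 1]
    return placement, total


# Inputs implied by Example 3.3 (assumed).
PLACEMENT, COST = greedy(
    [Fr(200), Fr(400), Fr(1300), Fr(2500)],
    [Fr(1, 2)] * 4,
    [Fr(700), Fr(500), Fr(300)],
    [Fr(1, 5), Fr(1, 2), Fr(1, 4)],
)
```

On these inputs the sketch places one filter per node (`PLACEMENT == [1, 2, 3, 4]`) and `float(COST)` is 792.8125, the figure on the slide. That is higher than the 777.8125 of the plan in Example 2.2, so greedy is not optimal here.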

13 Optimal algorithm: Notation for the network link from N_i to N_{i+1} [definitions not captured in the transcript]

14 Optimal algorithm: Short-circuiting; Rank; Cost scaleup

15 Optimal algorithm

16 Example 3.7: Model links as filters. r(F_1) = 400, r(F_2) = 800, r(F_3) = 2600, r(F_4) = 5000, r(F^l_1) = 875, r(F^l_{2,4}) = 4571.4. Rank order: r(F_1) < r(F_2) < r(F^l_1) < r(F_3) < r(F^l_{2,4}) < r(F_4). c(P) = 200 + 200 + 175 + 65 + 100 + 7.8125 = 747.8125
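The link definitions for the optimal algorithm did not survive in this transcript, so the sketch below is a reconstruction rather than the authors' code. It assumes that link i acts as a filter whose selectivity is the cost scaling factor γ_i between N_i and N_{i+1} (1/5, 1/2, 1/4 in the examples) and whose cost is l_i divided by γ_1···γ_{i-1} (the "cost scaleup"), and that consecutive links whose ranks are out of order are merged ("short-circuiting"). These assumptions reproduce the ranks 875 and 4571.4 quoted above. All function names are hypothetical, and every selectivity and γ_i is assumed to be strictly below 1.

```python
from fractions import Fraction as Fr


def rank(cost, sel):
    return cost / (1 - sel)


def link_filters(link_cost, gamma):
    """Model links as filters, merging neighbors whose ranks are out of order."""
    groups = []  # entries: (cost, selectivity, number of links covered)
    scale = Fr(1)
    for l, g in zip(link_cost, gamma):
        groups.append((l / scale, g, 1))  # cost scaleup by 1 / (gamma_1 ... gamma_{i-1})
        scale *= g
        # Short-circuit: a later link may not precede an earlier one, so merge them.
        while len(groups) >= 2 and rank(*groups[-1][:2]) < rank(*groups[-2][:2]):
            c2, s2, n2 = groups.pop()
            c1, s1, n1 = groups.pop()
            groups.append((c1 + s1 * c2, s1 * s2, n1 + n2))
    return groups


def optimal(cost_at_n1, selectivity, link_cost, gamma):
    """Order filters and link filters by rank; a filter runs on the node reached so far."""
    groups = link_filters(link_cost, gamma)
    items = [(rank(c, s), 0, k)
             for k, (c, s) in enumerate(zip(cost_at_n1, selectivity))]
    items += [(rank(c, s), 1, j) for j, (c, s, _) in enumerate(groups)]
    items.sort()
    placement = [None] * len(cost_at_n1)
    node, total, passed = 1, Fr(0), Fr(1)
    for _, is_link, idx in items:
        if is_link:
            c, s, hops = groups[idx]
            node += hops
        else:
            c, s = cost_at_n1[idx], selectivity[idx]
            placement[idx] = node
        total += passed * c
        passed *= s
    return placement, total


LINKS = link_filters([Fr(700), Fr(500), Fr(300)], [Fr(1, 5), Fr(1, 2), Fr(1, 4)])
PLACEMENT, COST = optimal(
    [Fr(200), Fr(400), Fr(1300), Fr(2500)],
    [Fr(1, 2)] * 4,
    [Fr(700), Fr(500), Fr(300)],
    [Fr(1, 5), Fr(1, 2), Fr(1, 4)],
)
```

On the example inputs the second and third links are merged into one pseudo-filter of rank 32000/7 ≈ 4571.4, the resulting placement is F_1 and F_2 on N_1, F_3 on N_2 and F_4 on N_4, and `float(COST)` is 747.8125, as on the slide.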

17 Extensions: Correlated filters; Tree hierarchies; Joins; Other extensions

18 Correlated filters: Definition: conditional selectivity s(F|Q) = the fraction of tuples that satisfy F given that they satisfy all the filters in Q. Reference result: optimal ordering of correlated filters at a single node is NP-hard; a known algorithm is guaranteed to find an ordering whose cost is at most 4 times the optimal cost, and an approximation ratio of 4 is the best possible unless P = NP.

19 Correlated filters: Definition; Short-circuiting; Optimal solution. Tree hierarchy: each of the queries operates on different data, so no computation or transmission is shared among them.

20 Joins: Problem: k different data streams acquired by N_1. Solution: the reference approach is a sliding-window join using the MJoin operator at a single node; using a join tree is left as future work. Query: W_1 and W_2 represent the lengths of the windows (time-based or tuple-based) on streams S_1 and S_2.

21 Joins: Join operator. Illustration: input rates r_1 and r_2, output rate s(⋈) r_1 r_2. Selectivity s(⋈) = the fraction of the cross product that occurs in the join result. Cost: [formula not captured]

22 Joins: Notation. F_i = filters that can be applied on S_i either before or after the join; |F_i| = n_i; F_12 = filters that can be applied only after the join.

23 Joins: Time complexity: O(n_2 n_1 m (n+m) log(n+m))

24 Extensions: Constrained nodes. Per-filter cost scaling: c(F, i+1) / c(F, i) may differ from filter to filter; modeling network links as filters then no longer applies, and the problem becomes NP-hard.

25 Conclusion: Environment. Operator placement problem. Tradeoff: to lower computational cost, put filters on the nodes higher up; to lower transmission cost, put them on the nodes lower down. Provided: a greedy algorithm and an optimal algorithm. Extensions.

26 Lemma 3.1 by (2)

27 Theorem 3.2 (proof): F_1 in P is chosen according to the theorem. By Lemma 3.1 and s(F^l_1) = 0, there exists F'_1 in P' s.t. c(P', 1) < c(P, 1). By Theorem 2.1, c(P, 1) ≤ c(P', 1), a contradiction.

28 Lemma 3.4

29 Theorem 3.5 (proof): F_1 in P is chosen according to the theorem. By Lemma 3.4, there exists P' s.t. c(P', 1) < c(P, 1). By Theorem 2.1, c(P, 1) ≤ c(P', 1), a contradiction.

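30 Lemma 3.6 (proof): Suppose [condition not captured] and P is the best plan. Let P' be obtained by moving the filters on node N_i to N_{i-1}, and P'' by moving the filters on node N_i to N_{i+1}. Since P is the best plan, c(P) < c(P') and c(P) < c(P''), which implies [not captured], a contradiction.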

