Adaptive Ordering of Pipelined Stream Filters S. Babu, R. Motwani, K. Munagala, I. Nishizawa, and J. Widom In Proc. of SIGMOD 2004, June 2004
Outline Introduction Ordering of filters Adaptive ordering algorithm Preliminaries Algorithms Experimental evaluation Conclusion
Introduction Why “Stream”? Many modern applications deal with data that is updated continuously and needs to be processed in real time. Why “Adaptive”? Arrival characteristics of streams may vary significantly over time. Characteristics of filters: Direct, Commutative
Ordering of filters Challenges: Selectivities across filters may be correlated. An exhaustive algorithm becomes infeasible. → Greedy approach. Arrival characteristics of streams may vary significantly over time. → Adaptive approach.
Adaptive ordering algorithm Three-way tradeoff. Run-time overhead: monitor changes in statistics and determine when to change the current order. Convergence properties: if data and filter characteristics stabilize, the adaptive algorithm should converge to a solution with desirable properties. Speed of adaptivity: the time convergence takes when data and filter characteristics change.
Preliminaries Symbol | Meaning: query, input stream, filters, stream tuple, ordering, conditional drop probability, processing time per tuple, total cost
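The quantities listed above can be tied together with a small sketch. It measures the average per-tuple cost of an ordering by running sample tuples through the pipeline and charging each filter's processing time until a filter drops the tuple. The representation of a filter as a (predicate, cost) pair and the names `Filter` and `average_cost` are assumptions made for illustration, not notation from the slides.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical representation: (predicate, per-tuple processing cost).
# The predicate returns True when the tuple passes the filter.
Filter = Tuple[Callable[[int], bool], float]


def average_cost(filters: Sequence[Filter],
                 ordering: Sequence[int],
                 tuples: Sequence[int]) -> float:
    """Average cost per tuple when filters are applied in `ordering`."""
    total = 0.0
    for e in tuples:
        for idx in ordering:
            passes, cost = filters[idx]
            total += cost          # the filter is evaluated, so its cost is paid
            if not passes(e):
                break              # tuple dropped; later filters are skipped
    return total / len(tuples)


# Illustrative data: two unit-cost filters over the integers 0..9.
example_filters = [
    (lambda e: e % 2 == 0, 1.0),   # passes even numbers
    (lambda e: e % 10 == 0, 1.0),  # passes multiples of 10
]
example_tuples = list(range(10))
cost_f0_first = average_cost(example_filters, [0, 1], example_tuples)
cost_f1_first = average_cost(example_filters, [1, 0], example_tuples)
```

In this toy data, placing the more selective filter first lowers the average cost, which is the effect the ordering algorithms aim to exploit.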
Algorithms A-GREEDY algorithm The SWEEP algorithm The INDEPENDENT algorithm The LOCALSWAP algorithm
(Static) Greedy algorithm Greedy approach
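A minimal sketch of a static greedy ordering, assuming the rule is to repeatedly pick the remaining filter with the highest ratio of conditional drop probability to processing time, where the drop probability is measured on sample tuples that survived the filters already chosen. The input layout (`drops[k][i]` is True when filter i drops sample tuple k) and the function name `greedy_order` are assumptions for illustration.

```python
def greedy_order(drops, costs):
    """Return a filter ordering chosen greedily from sample data.

    drops[k][i] is True if filter i drops sample tuple k.
    costs[i] is the per-tuple processing time of filter i.
    """
    remaining = list(range(len(costs)))
    alive = list(range(len(drops)))   # sample tuples not yet dropped
    order = []
    while remaining:
        def ratio(i):
            # Conditional drop probability on surviving tuples, per unit cost.
            if not alive:
                return 0.0
            dropped = sum(1 for k in alive if drops[k][i])
            return dropped / len(alive) / costs[i]

        best = max(remaining, key=ratio)
        order.append(best)
        remaining.remove(best)
        alive = [k for k in alive if not drops[k][best]]
    return order


# Illustrative sample: filters 0 and 1 are correlated (they drop the same
# tuples), so filter 2 is the better second choice despite its low
# unconditional drop rate.
sample_drops = [
    [True, True, False],
    [True, True, False],
    [True, False, False],
    [False, False, True],
]
sample_order = greedy_order(sample_drops, [1.0, 1.0, 1.0])
```

The example is chosen so that correlation matters: ranking by unconditional drop rate alone would put filter 1 second, while the conditional view puts filter 2 second.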
A-GREEDY Profiler
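A rough sketch of what a profiler of this kind could look like, assuming it samples incoming tuples with some probability, evaluates every filter on each sampled tuple, and keeps the resulting drop bits in a sliding window. The class name `Profiler`, its parameters, and the `drop_rate` helper are hypothetical and only illustrate the idea of a profile window.

```python
import random
from collections import deque


class Profiler:
    """Keeps a sliding window of profile tuples (one drop bit per filter)."""

    def __init__(self, predicates, window_size, sample_prob, rng=None):
        self.predicates = predicates              # predicate(e) -> True if e passes
        self.window = deque(maxlen=window_size)   # oldest profile tuples fall out
        self.sample_prob = sample_prob
        self.rng = rng or random.Random()

    def observe(self, e):
        """Possibly profile tuple e by evaluating all filters on it."""
        if self.rng.random() < self.sample_prob:
            self.window.append(tuple(not p(e) for p in self.predicates))

    def drop_rate(self, i, survived=()):
        """Estimated probability that filter i drops a tuple, given that
        none of the filters in `survived` dropped it."""
        rows = [r for r in self.window if not any(r[j] for j in survived)]
        if not rows:
            return 0.0
        return sum(1 for r in rows if r[i]) / len(rows)


# Illustrative use: profile every tuple (sample_prob=1.0) of the stream 1..8.
profiler = Profiler([lambda e: e % 2 == 0, lambda e: e % 4 == 0],
                    window_size=4, sample_prob=1.0)
for value in range(1, 9):
    profiler.observe(value)
```

A bounded `deque` is used so that window maintenance is a constant-time append; a real profiler would likely keep aggregate counts up to date instead of rescanning the window on each query.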
A-GREEDY Reoptimizer violation
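The violation check can be sketched as follows, assuming the greedy invariant (GI) requires that the filter at each position has the highest ratio of conditional drop probability to cost among itself and all later filters. This sketch rescans the profile window directly, whereas the slides mention an incrementally maintained matrix view; the function name `first_gi_violation` and the data layout are assumptions.

```python
def first_gi_violation(window, costs, ordering):
    """Return (filter, later_filter) for the first violation found, else None.

    window is a list of profile tuples; row[i] is True if filter i drops it.
    """
    alive = list(window)
    for pos, f in enumerate(ordering):
        if not alive:
            return None   # no surviving profile tuples left to compare on

        def ratio(g):
            dropped = sum(1 for r in alive if r[g])
            return dropped / len(alive) / costs[g]

        for later in ordering[pos + 1:]:
            if ratio(later) > ratio(f):
                return (f, later)
        # Condition the next position on tuples that filter f did not drop.
        alive = [r for r in alive if not r[f]]
    return None


# Illustrative profile window with three unit-cost filters.
gi_window = [
    (True, True, False),
    (True, True, False),
    (True, False, False),
    (False, False, True),
]
gi_costs = [1.0, 1.0, 1.0]
violation_in_bad_order = first_gi_violation(gi_window, gi_costs, [0, 1, 2])
violation_in_good_order = first_gi_violation(gi_window, gi_costs, [0, 2, 1])
```

On detecting a violation, a reoptimizer would move the offending later filter forward and recheck the positions after it; that correction step is omitted here.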
A-GREEDY Algorithm (1/2)
A-GREEDY Algorithm (2/2)
A-GREEDY properties Convergence: good. Run-time overhead: profile-tuple creation, profile-window maintenance, matrix-view update, detection and correction of GI violations. Adaptivity: rapid.
The SWEEP Algorithm Rotating over
SWEEP properties Convergence Good Run-time overhead A-Greedy: Sweep: Adaptivity Violations may remain undetected for a relatively long time Up to stages
The INDEPENDENT Algorithm Assumes the filters are independent: an assumption made frequently in the database literature, but seldom true in practice.
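Under the independence assumption, only each filter's unconditional drop probability and cost are needed. A minimal sketch, assuming the ordering is by decreasing ratio of drop probability to processing time (the function name `independent_order` is hypothetical):

```python
def independent_order(drop_rates, costs):
    """Order filters by unconditional drop rate per unit cost, highest first."""
    return sorted(range(len(costs)),
                  key=lambda i: drop_rates[i] / costs[i],
                  reverse=True)


# Illustrative statistics for three filters.
unit_cost_order = independent_order([0.75, 0.5, 0.25], [1.0, 1.0, 1.0])
mixed_cost_order = independent_order([0.75, 0.5, 0.25], [4.0, 1.0, 1.0])
```

Because no conditional statistics are tracked, this is cheap to maintain, but correlated filters can end up next to each other even when the later one drops almost nothing that the earlier one let through.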
INDEPENDENT properties Convergence: if the filters are independent, converges to the optimal ordering; if they are dependent, can be worse than the GI ordering by a multiplicative factor. Run-time overhead: lower than A-Greedy. Adaptivity: rapid.
The LOCALSWAPS Algorithm Detects whether a swap between adjacent filters in the current ordering would improve performance.
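The adjacent-swap test can be illustrated with a short sketch. It compares the expected cost of evaluating two neighbouring filters in their current order against the swapped order, using only tuples that reach that position. Working from full profile tuples, and the name `swap_improves`, are assumptions for illustration; an actual implementation may track narrower statistics.

```python
def swap_improves(window, costs, ordering, pos):
    """True if swapping ordering[pos] and ordering[pos + 1] lowers expected cost.

    window is a list of profile tuples; row[i] is True if filter i drops it.
    """
    a, b = ordering[pos], ordering[pos + 1]
    # Only tuples not dropped by earlier filters reach position pos.
    alive = [r for r in window if not any(r[f] for f in ordering[:pos])]
    if not alive:
        return False
    d_a = sum(1 for r in alive if r[a]) / len(alive)
    d_b = sum(1 for r in alive if r[b]) / len(alive)
    cost_ab = costs[a] + (1 - d_a) * costs[b]   # a first, then b on survivors
    cost_ba = costs[b] + (1 - d_b) * costs[a]   # b first, then a on survivors
    return cost_ba < cost_ab


# Illustrative profile window with three unit-cost filters.
swap_window = [
    (True, True, False),
    (True, True, False),
    (True, False, False),
    (False, False, True),
]
swap_costs = [1.0, 1.0, 1.0]
swap_first_pair = swap_improves(swap_window, swap_costs, [0, 1, 2], 0)
swap_second_pair = swap_improves(swap_window, swap_costs, [0, 1, 2], 1)
```

Since each decision looks at one adjacent pair only, a filter that belongs much earlier in the ordering has to move there one position at a time.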
LOCALSWAP properties Convergence: depends on the way characteristics change; in some cases, times higher cost than the GI ordering. Run-time overhead: lower than A-Greedy. Adaptivity: may take more time to converge; may get stuck in a local optimum.
Experimental evaluation Three parts Convergence experiments Run-time overhead experiments Adaptivity experiments
Convergence experiments Factors: number of filters, filter selectivities, cost of filters, correlation among filters. Comparison: Optimal, A-Greedy, Independent.
Convergence experiments (number) Factors Number of filters
Convergence experiments (selectivity) Factors Filter selectivities
Convergence experiments (correlation) Factors Correlation among filters
Run-time overhead experiments Factors: number of filters, filter selectivities, cost of filters. Comparison: Optimal, A-Greedy, Sweep, LocalSwaps, Independent.
Run-time overhead experiments (time) Factors Spending time ≥ 98%
Run-time overhead experiments (number) Factors Number of filters
Run-time overhead experiments (cost) Factors Cost of filters
Adaptivity experiments Factors: rate of change, cost of filters. Comparison: A-Greedy, Sweep, LocalSwaps, Independent.
Adaptivity experiments (speed)
Adaptivity experiments (rate) Factors Rate of change
Adaptivity experiments (cost) Tradeoff Run-time overhead ↔ adaptivity
Conclusion Different points along the tradeoff spectrum. A-Greedy: fast adaptivity. Sweep: low run-time overhead, slow adaptivity, good convergence. Independent: assumes independent filters. LocalSwaps: swaps between adjacent filters, unpredictable convergence.
Appendix
stable
Correlation factor Γ: filters are divided into groups, each containing Γ filters. Filters in different groups are independent; filters in the same group produce the same result on 80% of input tuples. Settings range from most correlated to completely independent.
Example 1
Example 2 For, Total cost = 20 Input total Cost
A-GREEDY one of + +permutation of others cost: Example … … 1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, … … INDEPENDENT permutation of + cost:
A-GREEDY + permutation of others cost: Example 7 1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, … LOCALSWAP cost: x x xx x x x Before, After, /100 ?