State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries Song Wang Elke Rundensteiner Database Systems Research Group Worcester.


1 State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries
Song Wang, Elke Rundensteiner: Database Systems Research Group, Worcester Polytechnic Institute, Worcester, MA, USA.
Samrat Ganguly, Sudeept Bhatnagar: NEC Laboratories America Inc., Princeton, NJ, USA.

2 Computation Sharing for Stream Processing (32nd VLDB Conference, Seoul, Korea, 2006)
[Figure: Register Continuous Queries; Streaming Data in, Streaming Result out; SPJA Query Network of σ, П and Agg operators with windows w1, w2, w3.]
New Challenges:
- In-memory processing of stateful operators
- Stateful operators with various window constraints

3 Window Constraints for Stateful Operators
Time-based sliding window constraints:
- Each tuple has a timestamp
- Only tuples within the W timeframe can form an output
[Figure: join A ⋈ B with states A[w] and B[w], fed from Buffer A and Buffer B.]
Observations:
- States in the operator dominate memory usage
- State size is proportional to the input rate and window length
- Join CPU cost is proportional to the state size
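The window constraint on this slide can be illustrated with a minimal sketch of a symmetric sliding window join. The function name `window_join`, the tuple layout `(stream, timestamp, key)` and the strict bound `|Ta - Tb| < w` are assumptions made for illustration; this is not the CAPE implementation.

```python
from collections import deque

def window_join(events, w):
    """events: time-ordered (stream, timestamp, key), stream is "A" or "B".
    Returns (ts_a, ts_b, key) for equal keys with |ts_a - ts_b| < w."""
    state = {"A": deque(), "B": deque()}   # one state per stream, oldest first
    out = []
    for stream, ts, key in events:
        other = "B" if stream == "A" else "A"
        # Purge tuples of the opposite state that fell out of the window.
        while state[other] and ts - state[other][0][0] >= w:
            state[other].popleft()
        # Probe the opposite state.
        for ots, okey in state[other]:
            if okey == key:
                out.append((ts, ots, key) if stream == "A" else (ots, ts, key))
        # Insert the new tuple into its own state.
        state[stream].append((ts, key))
    return out
```

Each state holds at most the tuples that arrived during the last `w` time units, which is why its size grows with both the input rate and the window length, and why the probe loop costs time proportional to the state size.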

4 A Motivation Example
Q1: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId WINDOW w1 min
Q2: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId AND A.Value > Threshold WINDOW w2 min
Let: w1 < w2
[Figure: plan for Q1 is the join A[w1] ⋈ B[w1]; plan for Q2 is the join A[w2] ⋈ B[w2] with the selection σA.]
Observations:
- State A[w1] overlaps with state A[w2]
- State B[w1] overlaps with state B[w2]
- Joined results of Q1 and Q2 overlap

5 Sharing with Selection Pull-up [CDF02, HFA+03]
- Selection pull-up
- Using the larger window (w2)
[Figure: the plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA with A[w2] ⋈ B[w2]) are replaced by one shared join A[w2] ⋈ B[w2] followed by a router (R): results with |Ta - Tb| < w1 go to Q1, and all results go through σA to Q2.]
[CDF02]: J. Chen, D. J. DeWitt, and J. F. Naughton. Design and evaluation of alternative selection placement strategies in optimizing continuous queries. In ICDE'02.
[HFA+03]: M. A. Hammad, M. J. Franklin, W. G. Aref, and A. K. Elmagarmid. Scheduling for shared window joins over data streams. In VLDB'03.

6 Sharing with Selection Pull-up [CDF02, HFA+03]
Pros:
- Single Join Operator
Cons:
- Wasted Computation without Early Filtering
- Wasted State Memory without Early Filtering
- Per Output-Tuple Routing Cost
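The pull-up strategy and its drawbacks can be sketched as follows. This is an illustrative model only: the shared join is a brute-force nested loop rather than a streaming operator, and the function name `pullup_sharing`, the tuple layouts and the `wasted` counter are assumptions.

```python
def pullup_sharing(a_tuples, b_tuples, w1, w2, threshold):
    """a_tuples: (timestamp, location, value); b_tuples: (timestamp, location).
    Assumes w1 < w2. Returns (q1, q2, joined, wasted)."""
    q1, q2 = [], []
    joined = wasted = 0
    for ta, loc_a, value in a_tuples:
        for tb, loc_b in b_tuples:
            # Shared join: larger window w2, selection not applied yet.
            if loc_a != loc_b or abs(ta - tb) >= w2:
                continue
            joined += 1
            # Router: one decision per output tuple.
            to_q1 = abs(ta - tb) < w1
            to_q2 = value > threshold      # pulled-up selection sigma_A
            if to_q1:
                q1.append((ta, tb))
            if to_q2:
                q2.append((ta, tb))
            if not (to_q1 or to_q2):
                wasted += 1                # joined, but no query wants it
    return q1, q2, joined, wasted
```

The `wasted` counter makes the "wasted computation" point concrete: a result that lies outside w1 and fails the selection is still produced by the shared join and then discarded by the router.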

7 Stream Partition with Selection Pushdown [KFH04]
- Split stream A by A.Value
- Route shared join results
[Figure: a split operator (S) partitions stream A by A.Value > Threshold versus A.Value <= Threshold into A1 and A2. Join 1 uses states A[w1] and B[w1], join 2 uses states A[w2] and B[w2]. A router (R) sends all results of the w2 join to Q2 and those with |Ta - Tb| < w1 to a union (U) that feeds Q1.]
[KFH04]: S. Krishnamurthy, M. J. Franklin, J. M. Hellerstein, and G. Jacobson. The case for precision sharing. In VLDB'04.

8 Stream Partition with Selection Pushdown [KFH04]
Pros:
- Selection pushdown: no wasted Join Computation
Cons:
- Multiple Join Operators
- Duplicated State Memory in Multiple Join Operators
- Per Output-Tuple Routing Cost
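A rough sketch of the partition strategy, under the same simplifications (brute-force joins, hypothetical names and tuple layouts), shows where the duplicated state and the routing step come from.

```python
def partitioned_sharing(a_tuples, b_tuples, w1, w2, threshold):
    """a_tuples: (timestamp, location, value); b_tuples: (timestamp, location).
    Assumes w1 < w2. Returns (q1, q2, computed)."""
    # Split stream A by the selection predicate.
    a_low = [a for a in a_tuples if a[2] <= threshold]
    a_high = [a for a in a_tuples if a[2] > threshold]

    def join(a_part, window):
        # Each call stands for a separate join operator with its own copy
        # of stream B's state (the duplicated memory on the slide).
        return [(ta, tb) for ta, loc_a, _ in a_part for tb, loc_b in b_tuples
                if loc_a == loc_b and abs(ta - tb) < window]

    low_results = join(a_low, w1)      # only Q1 needs these
    high_results = join(a_high, w2)    # Q2 needs all of these
    # Router after the w2 join: results inside w1 also belong to Q1.
    routed = [(ta, tb) for ta, tb in high_results if abs(ta - tb) < w1]
    q1 = sorted(low_results + routed)  # union
    q2 = sorted(high_results)
    return q1, q2, len(low_results) + len(high_results)
```

Every computed join result is used by at least one query, so no join work is wasted, but stream B is stored once per join and the output of the w2 join still passes through a router.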

9 State-Slice: New Sharing Paradigm
Key Ideas:
- State-Slice Concept for Sliding Window Join
- Pipelined Chain of Join Slices
Prospective Benefits:
- Fine-grained Selection Push-down
- Pipelined Join Operators
- Avoiding Per-tuple Routing Cost

10 One-way State Sliced Window Join
[Figure: the operator holds the state of stream A for the slice [w1, w2]. Inputs: A Tuple, B Tuple (probe). Outputs: Joined-Result, Purged-A-Tuple, Propagated-B-Tuple.]
- Lower bound of the sliding window: [w1, w2]
- A B tuple only probes A tuples that are at least w1, but at most w2, "older" than itself
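A single one-way sliced join could be sketched as below. The class name `OneWaySlice`, its method names and the half-open interval `[start, end)` are assumptions; the half-open choice is made here so that adjacent slices never produce the same pair twice.

```python
from collections import deque

class OneWaySlice:
    """Hypothetical one-way sliced join: stream A state for [start, end)."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.state_a = deque()             # (timestamp, key), oldest first

    def insert_a(self, ts, key):
        self.state_a.append((ts, key))

    def process_b(self, ts, key):
        """Returns (joined_results, purged_a_tuples, propagated_b_tuple)."""
        purged = []
        # Cross-purge: A tuples older than the slice's upper bound leave.
        while self.state_a and ts - self.state_a[0][0] >= self.end:
            purged.append(self.state_a.popleft())
        # Probe: only A tuples at least `start` older than the B tuple match.
        joined = [(ats, ts, key) for ats, akey in self.state_a
                  if akey == key and ts - ats >= self.start]
        # The B tuple itself is passed on unchanged.
        return joined, purged, (ts, key)
```

The three return values correspond to the three outputs named on the slide: Joined-Result, Purged-A-Tuple and Propagated-B-Tuple.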

11 The Chain of One-way State-Sliced Joins
- Split state memory into a chain of joins
- No overlap of state memory in the chain of joins
[Figure: J1 holds the state of stream A for [0, w1] and J2 holds it for [w1, w2]; the joins are connected by queue(s). A tuples and B tuples enter J1, and the union (U) of the joined results of J1 and J2 equals (=) the result of the unsliced window join.]
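The chaining idea can be sketched with a synchronous simulation in which a B tuple visits the slices in order and purged A tuples are handed to the next slice before the B tuple probes it. The function name `one_way_chain` and the `bounds` list `[0, w1, w2, ...]` are assumptions, and the inter-operator queues are collapsed into direct hand-offs.

```python
from collections import deque

def one_way_chain(events, bounds):
    """events: time-ordered (stream, ts, key); bounds: [0, w1, w2, ...].
    Returns one result list per slice, each holding (ts_a, ts_b, key)."""
    n = len(bounds) - 1
    states = [deque() for _ in range(n)]       # stream A state of each slice
    results = [[] for _ in range(n)]
    for stream, ts, key in events:
        if stream == "A":
            states[0].append((ts, key))        # A tuples enter the first slice
            continue
        for i in range(n):                     # the B tuple flows down the chain
            end = bounds[i + 1]
            while states[i] and ts - states[i][0][0] >= end:
                old = states[i].popleft()      # purged from slice i ...
                if i + 1 < n:
                    states[i + 1].append(old)  # ... and handed to slice i+1
            results[i] += [(ats, ts, key)
                           for ats, akey in states[i] if akey == key]
    return results
```

Because an A tuple lives in exactly one slice at a time, the states are disjoint, and concatenating the per-slice results gives the same pairs as one join over the full window `[0, wN)`.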

12 From One-way to Two-way Binary Join
- Intuitively a combination of two one-way joins
- Two references for each A or B tuple: male and female
  Male tuples are used to probe states
  Female tuples are inserted and cross-purged to respective states
[Figure: J1 holds the states of stream A and stream B for [0, w1]; J2 holds both states for [w1, w2]. The joins are connected by queue(s), and a union (U) combines their joined results.]
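One way to picture the male and female references is the following simulation sketch, in which the male copy of each arriving tuple walks down the chain and the female copy is stored in the first slice. The function name `two_way_chain`, the `bounds` list and the synchronous hand-off between slices are assumptions for illustration.

```python
from collections import deque

def two_way_chain(events, bounds):
    """events: time-ordered (stream, ts, key), stream is "A" or "B".
    bounds: [0, w1, ..., wN]. Returns per-slice lists of (ts_a, ts_b, key)."""
    n = len(bounds) - 1
    # One female state per stream and slice, oldest tuple first.
    states = [{"A": deque(), "B": deque()} for _ in range(n)]
    results = [[] for _ in range(n)]
    for stream, ts, key in events:
        other = "B" if stream == "A" else "A"
        # Male copy: travels down the chain, cross-purging and probing the
        # opposite stream's female state in every slice.
        for i in range(n):
            opposite = states[i][other]
            while opposite and ts - opposite[0][0] >= bounds[i + 1]:
                female = opposite.popleft()
                if i + 1 < n:
                    states[i + 1][other].append(female)
            for ots, okey in opposite:
                if okey == key:
                    pair = (ts, ots) if stream == "A" else (ots, ts)
                    results[i].append(pair + (key,))
        # Female copy: inserted into this stream's state in the first slice.
        states[0][stream].append((ts, key))
    return results
```

Slice i then produces exactly the pairs whose timestamp distance falls in `[bounds[i], bounds[i+1])`, so a query with window `bounds[k]` reads the union of the first k slices.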

13 State-Sliced Join Chain: The Example
- States of sliced joins in a chain are disjoint with each other: Minimize State Memory Usage
- Selection can be pushed down into the middle of the join chain: Avoid Unnecessary Resource Waste
- No routing step is needed: Avoid Per Output-Tuple Routing Cost Completely
[Figure: the plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA with A[w2] ⋈ B[w2]) are combined into a chain of two sliced joins over A and B, slice 1 for [0, w1] and slice 2 for [w1, w2], with σA in the chain. Slice 1 feeds Q1; a union (U) feeds Q2.]
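The decomposition behind this example can be checked declaratively. The sketch below does not model the pipelined execution; it only computes what each slice would output when σA (here `value > threshold`) is applied to the A tuples entering the second slice and to the first slice's output before the union. The function name and tuple layouts are assumptions.

```python
def chain_with_selection(a_tuples, b_tuples, w1, w2, threshold):
    """a_tuples: (timestamp, location, value); b_tuples: (timestamp, location).
    Assumes w1 < w2. Returns (q1, q2, computed)."""
    def pairs(a_part, lo, hi):
        # Results of one sliced join: timestamp distance in [lo, hi).
        return [(ta, tb, value) for ta, la, value in a_part
                for tb, lb in b_tuples
                if la == lb and lo <= abs(ta - tb) < hi]

    selected = [a for a in a_tuples if a[2] > threshold]  # sigma_A pushed down
    slice1 = pairs(a_tuples, 0, w1)    # slice [0, w1): sees every A tuple
    slice2 = pairs(selected, w1, w2)   # slice [w1, w2): sees only survivors
    q1 = sorted((ta, tb) for ta, tb, _ in slice1)
    # Q2: sigma_A on the output of slice 1, then union with slice 2.
    q2 = sorted([(ta, tb) for ta, tb, v in slice1 if v > threshold]
                + [(ta, tb) for ta, tb, _ in slice2])
    return q1, q2, len(slice1) + len(slice2)
```

No router appears anywhere: which query a result belongs to is determined by the slice that produced it, and the second slice never stores or joins an A tuple that fails the selection.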

14 Summary: State-Sliced Join Chain
Pros:
- Minimized Memory Usage
- Reduced Routing Cost
- No Need of Operator Synchronization in the Chain
Cons:
- Stream traffic between pipelined joins
- Purge cost

15 Sharing via Chains: Memory-Optimal Chain
- No Selection: [Figure: chain of N sliced joins over streams A and B with slices [0, w1], [w1, w2], [w2, w3], ..., [wN-1, wN]; Q1 reads slice 1 and unions (U) feed Q2, Q3, ..., QN.]
- With Selection: [Figure: the same chain with selections σ1, σ2, σ3, ..., σN and σ'1, σ'2, σ'3, ... placed along the chain and its outputs.]
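Building the slices of such a chain from the query windows is mechanical, as the following sketch suggests. The function names and the dictionary-based plan are hypothetical. The second helper lists, for each slice, the queries that still need it; treating the pushed-down filter in front of a slice as the disjunction of those queries' selections is an assumption consistent with the figure, not a detail stated on the slide.

```python
def mem_optimal_chain(windows):
    """windows: {query_name: window_length}.
    Returns (slices, plan): slices is a list of (start, end) and plan maps
    each query to the indexes of the slices whose results it unions."""
    ends = sorted(set(windows.values()))   # one slice boundary per distinct window
    starts = [0] + ends[:-1]
    slices = list(zip(starts, ends))
    plan = {q: [i for i, (_, end) in enumerate(slices) if end <= w]
            for q, w in windows.items()}
    return slices, plan

def queries_needing_slice(windows, slices):
    """For each slice, the queries whose window reaches its upper bound."""
    return [sorted(q for q, w in windows.items() if w >= end)
            for _, end in slices]
```

For example, windows of 5, 10 and 30 time units give the slices (0, 5), (5, 10) and (10, 30), and the query with the 10-unit window unions the first two.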

16 Is the Mem-Optimal Chain also the CPU-Optimal Chain?
[Figure: mem-optimal chain for five queries over streams A and B, with slices [0, w1], [w1, w2], [w2, w3], [w3, w4], [w4, w5] and a union (U) in front of each of Q2 to Q5.]
Overheads:
- Too many operators may increase system context switch cost
- Too many sliced states increase purging cost

17 Merging Sliced Joins
Tradeoff:
- Gain from Merging
  Reduce number of Join operators
  Reduce extra purging cost
- Loss from Merging
  Introduce routing cost
  Increase memory usage due to selection pullup
Cost Model for CPU Usage
[Figure: the consecutive slices [wi-1, wi], ..., [wj-1, wj] feeding Qi, ..., Qj are merged into one slice [wi-1, wj]; a router (R) then splits the merged results by |Ta - Tb| (< wi, ..., ≥ wj-1) ahead of the unions (U) for Qi, ..., Qj.]

18 CPU-Opt. Chain: Search Space & Solution
[Figure: a graph with nodes v0, ..., v5 at the window boundaries w0, ..., w5. One path yields the chain [0, w2], [w2, w3], [w3, w5] over streams A and B, with routers (R) on |Ta - Tb| < w1 and |Ta - Tb| < w4 and unions (U) serving Q1 to Q5.]
Legend:
- vi: window start/end time
- vi to vj: one slice window
Shortest path problem
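Since every edge runs from an earlier window boundary to a later one, the graph is acyclic and the shortest path can be found with a simple dynamic program. The sketch below assumes a caller-supplied `edge_cost(i, j)` giving the CPU cost of one merged slice from boundary i to boundary j; the slides do not spell out the cost model, so the cost function used in the usage note is purely a made-up stand-in.

```python
def cpu_optimal_chain(boundaries, edge_cost):
    """boundaries: sorted [w0 = 0, w1, ..., wN].
    edge_cost(i, j): assumed CPU cost of the slice [w_i, w_j], for i < j.
    Returns (total_cost, slices) for the cheapest chain covering [0, wN]."""
    n = len(boundaries) - 1
    best = [0.0] + [float("inf")] * n   # best[j]: cheapest way to reach v_j
    prev = [None] * (n + 1)
    for j in range(1, n + 1):           # nodes in topological order
        for i in range(j):
            cost = best[i] + edge_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk back from v_N to v_0 to recover the chosen slices.
    path, j = [], n
    while j > 0:
        path.append((boundaries[prev[j]], boundaries[j]))
        j = prev[j]
    return best[n], path[::-1]
```

As a usage illustration, `cpu_optimal_chain([0, 1, 2, 3, 4, 5], lambda i, j: 4 + 3 * (j - i - 1) ** 2)` charges a fixed per-operator overhead plus a routing penalty that grows with the number of merged slices, so the cheapest chain merges some neighbouring slices but not all of them.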

19 Summary: Mem-Opt. vs. CPU-Opt. Join Chain
Mem-Optimal: Minimized Memory Usage
- Higher System Overhead
- Higher Purging Cost
CPU-Optimal: Minimized CPU Usage
- More Memory Usage if Selection is Pulled Up to Merge Slices
[Figure labels: Selection PullUp Sharing, Mem-Opt. Chain, CPU-Opt. Chain, State Slice, State Merge.]

20 Experimental WPI Stream Engine: CAPE
Software Demonstration VLDB'04

21 Experiment Study 1: Memory Consumption

22 Experiment Study 2: Total Service Rate

23 Experiment Study 3: Mem-Opt. vs. CPU-Opt.
[Figure panels: Window Distributions Used for 12 Queries; Small-Large: 12 Queries; Small-Large: 24 Queries.]

24 Conclusion
- Pipelined state sliced join chain
- Mem-Optimal chain construction
- CPU-Optimal chain construction
- Implemented in CAPE
- Performance evaluation

25 Thank You!
Visit CAPE Homepage: http://davis.wpi.edu/dsrg/CAPE/index.html
Supported by: CRI grant CNS 05-51584

