State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries
Song Wang, Elke Rundensteiner (Database Systems Research Group, Worcester Polytechnic Institute, Worcester, MA, USA)
Samrat Ganguly, Sudeept Bhatnagar (NEC Laboratories America Inc., Princeton, NJ, USA)
32nd VLDB Conference, Seoul, Korea

Computation Sharing for Stream Processing
[Figure: continuous queries are registered with a shared SPJA query network built from selections (σ), a projection (Π), aggregates (Agg), and windows w1, w2, w3; streaming data flows in and streaming results flow out.]
New challenges:
  In-memory processing of stateful operators
  Stateful operators with various window constraints

Window Constraints for Stateful Operators
Time-based sliding window constraints:
  Each tuple has a timestamp.
  Only tuples within a time frame of length W can form an output.
[Figure: join A ⋈ B with buffers (states) A[w] and B[w].]
Observations:
  States in the operator dominate memory usage.
  State size is proportional to the input rate and the window length.
  Join CPU cost is proportional to the state size.
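
The window semantics above can be sketched in a few lines of Python. This is a minimal illustration, not the CAPE implementation: the function name `window_join`, the `(timestamp, key)` tuple layout, and the half-open window test `T_new - T_old < w` are all assumptions made for this example.

```python
from collections import deque

def window_join(events, w):
    """Symmetric sliding-window equi-join over two streams, "A" and "B".

    events: list of (stream, (timestamp, key)) in timestamp order.
    w: window length; two tuples join only if their timestamps differ by < w.
    Returns joined pairs as (a_tuple, b_tuple).
    """
    state = {"A": deque(), "B": deque()}  # one buffer per stream, oldest first
    out = []
    for stream, (ts, key) in events:
        other = "B" if stream == "A" else "A"
        # Cross-purge: drop tuples of the other stream that fell out of the window.
        while state[other] and ts - state[other][0][0] >= w:
            state[other].popleft()
        # Probe: CPU cost grows with the size of the other stream's state.
        for old in state[other]:
            if old[1] == key:
                new = (ts, key)
                out.append((new, old) if stream == "A" else (old, new))
        # Insert: the new tuple becomes part of its own stream's state.
        state[stream].append((ts, key))
    return out

events = [("A", (0, 1)), ("B", (1, 1)), ("B", (5, 1)), ("A", (6, 1))]
print(window_join(events, 3))
```

Because every arriving tuple scans the opposite buffer, the sketch makes the observations concrete: a longer window or a higher input rate means larger buffers and more probe work.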

A Motivation Example
Q1: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId WINDOW w1 min
Q2: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId AND A.Value > Threshold WINDOW w2 min
Let w1 < w2.
[Figure: Q1 as the join A[w1] ⋈ B[w1]; Q2 as the selection σA followed by the join A[w2] ⋈ B[w2].]
Observations:
  State A[w1] overlaps with state A[w2].
  State B[w1] overlaps with state B[w2].
  The joined results of Q1 and Q2 overlap.

Sharing with Selection Pull-up [CDF02, HFA+03]
The selection is pulled up above a single shared join that uses the larger window (w2).
[Figure: the plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA, then A[w2] ⋈ B[w2]) are merged into one join A[w2] ⋈ B[w2] followed by a router. Results with |Ta - Tb| < w1 go to Q1; all results go through σA to Q2.]
[CDF02]: J. Chen, D. J. DeWitt, and J. F. Naughton. Design and evaluation of alternative selection placement strategies in optimizing continuous queries. In ICDE'02.
[HFA+03]: M. A. Hammad, M. J. Franklin, W. G. Aref, and A. K. Elmagarmid. Scheduling for shared window joins over data streams. In VLDB'03.
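
The routing step after the shared join can be sketched as follows. This is only an illustration of the idea on the slide: the function name `route_pullup`, the dictionary fields `ts` and `value`, and the sample numbers are hypothetical.

```python
def route_pullup(joined, w1, threshold):
    """Route results of the shared join A[w2] x B[w2] to Q1 and Q2.

    joined: list of (a, b) pairs, each a dict with a "ts" field;
            A tuples also carry a "value" field.
    Q1 needs the smaller window w1; Q2 needs the pulled-up selection on A.
    """
    q1, q2 = [], []
    for a, b in joined:
        # Router predicate for Q1: keep only pairs within the smaller window.
        if abs(a["ts"] - b["ts"]) < w1:
            q1.append((a, b))
        # Pulled-up selection for Q2: applied only after the join has run.
        if a["value"] > threshold:
            q2.append((a, b))
    return q1, q2

joined = [
    ({"ts": 10, "value": 30}, {"ts": 11}),  # close in time, high value
    ({"ts": 10, "value": 5}, {"ts": 17}),   # far apart, low value
]
q1, q2 = route_pullup(joined, w1=5, threshold=20)
print(len(q1), len(q2))
```

Every joined pair is inspected once per downstream query, which is the per-output-tuple routing cost, and pairs that fail the selection were still joined and buffered.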

Sharing with Selection Pull-up [CDF02, HFA+03]
Pros:
  Single join operator
Cons:
  Wasted computation without early filtering
  Wasted state memory without early filtering
  Per-output-tuple routing cost

Stream Partition with Selection Pushdown [KFH04]
  Split stream A by A.Value.
  Route the shared join results.
[Figure: a split operator partitions stream A on A.Value (<= Threshold vs. > Threshold); two joins, with states A[w1], B[w1] and A[w2], B[w2], process the partitions; a router that tests |Ta - Tb| < w1 and a union assemble the results for Q1 and Q2.]
[KFH04]: S. Krishnamurthy, M. J. Franklin, J. M. Hellerstein, and G. Jacobson. The case for precision sharing. In VLDB'04.

Stream Partition with Selection Pushdown [KFH04]
Pros:
  Selection pushdown: no wasted join computation
Cons:
  Multiple join operators
  Duplicated state memory in multiple join operators
  Per-output-tuple routing cost

State-Slice: New Sharing Paradigm
Key ideas:
  State-slice concept for sliding window joins
  Pipelined chain of join slices
Prospective benefits:
  Fine-grained selection push-down
  Pipelined join operators
  Avoiding per-tuple routing cost

One-way State Sliced Window Join
The sliding window has a lower bound as well as an upper bound: [w1, w2].
A B tuple probes only those A tuples that are at least w1 and at most w2 older than itself.
[Figure: a one-way join holding the state of stream A for [w1, w2]. Inputs: A tuples and B tuples (probe). Outputs: joined results, purged A tuples, and propagated B tuples.]

The Chain of One-way State-Sliced Joins
  Split the state memory into a chain of joins.
  No overlap of state memory among the joins in the chain.
[Figure: join J1 holds the state of stream A for [0, w1] and join J2 holds the state of stream A for [w1, w2]. A tuples and B tuples enter J1; queue(s) connect J1 to J2; a union merges the joined results of J1 and J2.]
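
A small simulation may help show how the chain works and why its combined output matches a single one-way join over the whole window. It is a sketch under assumptions, not the authors' code: the names `SlicedOneWayJoin` and `run_chain`, the `(timestamp, key)` tuple layout, and the half-open slice test `T_b - T_a < end` are chosen for illustration, and the queue between joins is replaced by a direct hand-over that preserves the same order (purged A tuples first, then the B tuple).

```python
from collections import deque

class SlicedOneWayJoin:
    """One slice of the one-way join: holds A tuples whose age lies in [start, end)."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.state = deque()  # A tuples as (timestamp, key), oldest first

    def insert_a(self, a):
        self.state.append(a)

    def process_b(self, b):
        """Cross-purge with b, then probe. Returns (joined pairs, purged A tuples)."""
        ts_b, key_b = b
        purged = []
        while self.state and ts_b - self.state[0][0] >= self.end:
            purged.append(self.state.popleft())
        joined = [(a, b) for a in self.state if a[1] == key_b]
        return joined, purged

def run_chain(boundaries, events):
    """Run a chain of slices [b0, b1), [b1, b2), ... over events in timestamp order."""
    chain = [SlicedOneWayJoin(lo, hi) for lo, hi in zip(boundaries, boundaries[1:])]
    out = []
    for stream, tup in events:
        if stream == "A":
            chain[0].insert_a(tup)  # new A tuples enter the first slice only
            continue
        carried = []  # A tuples purged from the previous slice
        for join in chain:
            for a in carried:
                join.insert_a(a)  # purged A tuples arrive before the B tuple
            joined, carried = join.process_b(tup)
            out.extend(joined)
        # A tuples purged from the last slice leave the chain entirely.
    return out

events = [("A", (0, "x")), ("A", (1, "x")), ("B", (2, "x")),
          ("A", (3, "y")), ("B", (4, "x")), ("B", (7, "y"))]
sliced = run_chain([0, 2, 5], events)   # J1 = [0, 2), J2 = [2, 5)
regular = run_chain([0, 5], events)     # one unsliced join over [0, 5)
print(sorted(sliced) == sorted(regular))
```

Each A tuple lives in exactly one slice at any time, so the slices together hold no more state than the single join does. The lower bound of a slice needs no explicit check in the chain, because a slice only ever receives A tuples that the previous slice has already purged.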

From One-way to Two-way Binary Join
  Intuitively, a two-way join is a combination of two one-way joins.
  Each A or B tuple has two references: a male one and a female one.
  Male tuples are used to probe the states.
  Female tuples are inserted into, and cross-purged from, the respective states.
[Figure: join J1 holds the states of stream A and stream B for [0, w1]; join J2 holds the states of stream A and stream B for [w1, w2]. A tuples and B tuples enter J1; queue(s) carry male and female tuples from J1 to J2; a union merges the joined results.]

State-Sliced Join Chain: The Example
  The states of the sliced joins in a chain are disjoint from each other: state memory usage is minimized.
  Selection can be pushed down into the middle of the join chain: unnecessary resource waste is avoided.
  No routing step is needed: per-output-tuple routing cost is avoided completely.
[Figure: the plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA, then A[w2] ⋈ B[w2]) are combined into a chain of two sliced joins, [0, W1] and [W1, W2], with σA placed between the slices and a union producing the result for Q2.]

Summary: State-Sliced Join Chain
Pros:
  Minimized memory usage
  Reduced routing cost
  No need for operator synchronization in the chain
Cons:
  Stream traffic between pipelined joins
  Purge cost

Sharing via Chains: Memory-Optimal Chain
[Figure, no selection: a chain of N sliced joins over streams A and B with windows [0, w1], [w1, w2], [w2, w3], ..., [wN-1, wN]. Q1 reads the output of the first slice; each later query Qi reads a union that includes the output of slice i.]
[Figure, with selection: the same chain with selections σ1, ..., σN and σ'1, σ'2, σ'3, ... placed along the chain and on the query outputs.]
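
For the case without selections, the slice boundaries of the chain follow directly from the registered query windows. The sketch below illustrates that construction; the function names `mem_optimal_slices` and `slices_for_query` and the example window values are hypothetical.

```python
def mem_optimal_slices(windows):
    """Return slices [0, w1], [w1, w2], ... from the distinct query windows."""
    bounds = [0] + sorted(set(windows))
    return list(zip(bounds, bounds[1:]))

def slices_for_query(window, slices):
    """Slices whose joined results are unioned to answer a query with this window."""
    return [s for s in slices if s[1] <= window]

slices = mem_optimal_slices([10, 5, 20])
print(slices)
print(slices_for_query(10, slices))
```

Queries that share a window length share all of their slices, so the number of sliced joins equals the number of distinct windows rather than the number of queries.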

Mem-Optimal Chain vs. CPU-Optimal Chain?
[Figure: a memory-optimal chain of five sliced joins over streams A and B with windows [0, w1], [w1, w2], [w2, w3], [w3, w4], [w4, w5], with unions producing the results for Q2 to Q5.]
Overheads:
  Too many operators may increase the system context switch cost.
  Too many sliced states increase the purging cost.

Merging Sliced Joins
Tradeoff:
  Gain from merging:
    Reduces the number of join operators
    Reduces the extra purging cost
  Loss from merging:
    Introduces routing cost
    Increases memory usage due to selection pull-up
A cost model for CPU usage guides the decision.
[Figure: the sliced joins [wi-1, wi], ..., [wj-1, wj] serving Qi, ..., Qj are merged into a single sliced join [wi-1, wj] followed by a router that uses predicates on |Ta - Tb| (such as < wi and ≥ wj-1) to direct results to the unions of Qi, ..., Qj.]
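
CPU-Opt. Chain: Search Space & Solution
[Figure: a directed graph with nodes v0, ..., v5 corresponding to the window boundaries w0, ..., w5.]
Legend:
  vi: a window start/end time
  Edge from vi to vj: one sliced window
Finding the CPU-optimal chain is a shortest path problem.
[Figure, example solution: three sliced joins over streams A and B with windows [0, w2], [w2, w3], and [w3, w5]. A router after the first slice sends results with |Ta - Tb| < w1 to Q1 and all results to Q2; the second slice feeds the union for Q3; a router after the third slice sends results with |Ta - Tb| < w4 to the union for Q4 and all results to the union for Q5.]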
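
The shortest path formulation can be sketched with a simple dynamic program over the boundary nodes v0, ..., vN. The slides refer to a CPU cost model but do not spell it out, so the edge cost here is supplied by the caller; `cpu_optimal_chain` and the `toy_cost` function are hypothetical and stand in for the real model.

```python
def cpu_optimal_chain(n, edge_cost):
    """Shortest path from v0 to vn in the DAG with an edge (vi, vj) for every i < j.

    edge_cost(i, j): assumed CPU cost of one sliced join covering [w_i, w_j].
    Returns (total cost, list of boundary indices on the chosen path).
    """
    best = [0.0] + [float("inf")] * n  # best[j] = cheapest way to reach vj
    prev = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + edge_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk the predecessor links back from vn to v0.
    path = [n]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    path.reverse()
    return best[n], path

def toy_cost(i, j):
    # Made-up numbers: a fixed per-operator overhead of 3, plus a penalty
    # that grows with the number of windows merged into one slice.
    return 3 + (j - i - 1) ** 2

cost, path = cpu_optimal_chain(4, toy_cost)
print(cost, path)  # each consecutive pair in path is one slice of the chain
```

Since edges only go from lower to higher boundaries, the graph is acyclic and the nested loop visits each of the O(N^2) edges once. A path that uses every node corresponds to the memory-optimal chain, and the single edge from v0 to vN corresponds to one fully merged join.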

Summary: Mem-Opt. vs. CPU-Opt. Join Chain
Mem-Optimal:
  Minimized memory usage
  Higher system overhead
  Higher purging cost
CPU-Optimal:
  Minimized CPU usage
  More memory usage if selection is pulled up to merge slices
[Figure: selection pull-up sharing, the Mem-Opt. chain, and the CPU-Opt. chain, related by state slicing and state merging.]

Experimental WPI Stream Engine: CAPE
Software demonstration at VLDB'04

Experiment Study 1: Memory Consumption

Experiment Study 2: Total Service Rate

Experiment Study 3: Mem-Opt. vs. CPU-Opt.
[Figure: window distributions used for 12 queries; results for the Small-Large distribution with 12 queries and with 24 queries.]

Conclusion
  Pipelined state-sliced join chain
  Mem-Optimal chain construction
  CPU-Optimal chain construction
  Implemented in CAPE
  Performance evaluation

Thank You!
Visit CAPE Homepage
Supported by: CRI grant CNS