1
State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries
Song Wang, Elke Rundensteiner: Database Systems Research Group, Worcester Polytechnic Institute, Worcester, MA, USA.
Samrat Ganguly, Sudeept Bhatnagar: NEC Laboratories America Inc., Princeton, NJ, USA.
2
32nd VLDB Conference, Seoul, Korea, 2006
Computation Sharing for Stream Processing
[Figure: SPJA query network. Labels: Register Continuous Queries; Streaming Data; Streaming Result; operators σ, Π, Agg; windows w1, w2, w3.]
New challenges:
In-memory processing of stateful operators.
Stateful operators with various window constraints.
3
Window Constraints for Stateful Operators
Time-based sliding window constraints:
Each tuple has a timestamp.
Only tuples within a timeframe of length W can form an output.
[Figure: join A ⋈ B with states A[w] and B[w], fed by Buffer A and Buffer B.]
Observations:
States in the operator dominate memory usage.
State size is proportional to the input rate and the window length.
Join CPU cost is proportional to the state size.
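The window constraint above can be illustrated with a minimal Python sketch of a symmetric sliding-window join. It is not taken from the talk: the function name, the `(timestamp, stream, payload)` event layout, and the strict `|Ta - Tb| < w` boundary are assumptions made for illustration.

```python
from collections import deque

def sliding_window_join(events, w, key=lambda payload: payload):
    """Symmetric time-based sliding-window join of streams 'A' and 'B'.

    events: iterable of (timestamp, stream, payload), in timestamp order.
    Yields (a_tuple, b_tuple) pairs with equal keys and |Ta - Tb| < w.
    The event layout and the strict '<' boundary are illustrative assumptions.
    """
    state = {"A": deque(), "B": deque()}  # one state per stream, oldest first
    for ts, stream, payload in events:
        other = "B" if stream == "A" else "A"
        # Purge: drop tuples of the opposite state that fell out of the window.
        while state[other] and ts - state[other][0][0] >= w:
            state[other].popleft()
        # Probe: match the new tuple against the opposite state.
        for old_ts, old_payload in state[other]:
            if key(old_payload) == key(payload):
                a = (ts, payload) if stream == "A" else (old_ts, old_payload)
                b = (old_ts, old_payload) if stream == "A" else (ts, payload)
                yield a, b
        # Insert: keep the new tuple in its own state for future probes.
        state[stream].append((ts, payload))
```

Each state holds roughly (input rate × w) tuples, and every probe scans the opposite state, which is why both memory and CPU cost grow with the window length.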
4
A Motivation Example
Q1: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId WINDOW w1 min
Q2: SELECT A.* FROM Temperature A, Humidity B WHERE A.LocationId = B.LocationId AND A.Value > Threshold WINDOW w2 min
Let w1 < w2.
[Figure: Q1 as the join A[w1] ⋈ B[w1]; Q2 as the selection σA followed by the join A[w2] ⋈ B[w2].]
Observations:
State A[w1] overlaps with state A[w2].
State B[w1] overlaps with state B[w2].
The joined results of Q1 and Q2 overlap.
5
Sharing with Selection Pull-up [CDF02, HFA+03]
Selection pull-up, using the larger window (w2).
[Figure: the separate plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA, then A[w2] ⋈ B[w2]) are replaced by one shared join A[w2] ⋈ B[w2] followed by a Router (R). Results with |Ta - Tb| < w1 go to Q1; all results go through σA to Q2.]
[CDF02]: J. Chen, D. J. DeWitt, and J. F. Naughton. Design and evaluation of alternative selection placement strategies in optimizing continuous queries. In ICDE’02.
[HFA+03]: M. A. Hammad, M. J. Franklin, W. G. Aref, and A. K. Elmagarmid. Scheduling for shared window joins over data streams. In VLDB’03.
6
Sharing with Selection Pull-up [CDF02, HFA+03]
Pros:
Single join operator.
Cons:
Wasted computation without early filtering.
Wasted state memory without early filtering.
Per-output-tuple routing cost.
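To make the costs of selection pull-up concrete, here is a small Python sketch of the shared plan for Q1 and Q2. It is only an illustration under stated assumptions: the function name, the `(timestamp, location_id, value)` tuple layout, and the nested-loop evaluation over complete lists are hypothetical and not from the cited papers.

```python
def shared_join_with_pullup(a_tuples, b_tuples, w1, w2, threshold):
    """One shared join with the larger window w2, then per-result routing.

    a_tuples, b_tuples: lists of (timestamp, location_id, value).
    Returns (q1_results, q2_results, joined_count).
    Tuple layout and nested-loop evaluation are illustrative assumptions.
    """
    q1, q2, joined = [], [], 0
    for a in a_tuples:
        for b in b_tuples:
            # Shared join: equi-join on location_id within the larger window w2.
            if a[1] == b[1] and abs(a[0] - b[0]) < w2:
                joined += 1
                # Router: Q1 only wants results inside its smaller window w1.
                if abs(a[0] - b[0]) < w1:
                    q1.append((a, b))
                # Pulled-up selection: Q2's filter on A runs after the join.
                if a[2] > threshold:
                    q2.append((a, b))
    return q1, q2, joined
```

With `a_tuples = [(0, 1, 10), (0, 1, 50)]`, `b_tuples = [(2, 1, 0), (8, 1, 0)]`, `w1 = 5`, `w2 = 10`, and `threshold = 20`, the shared join produces four results, but the pair built from `(0, 1, 10)` and `(8, 1, 0)` is delivered to neither query. That result is the wasted computation, and every result pays the routing check.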
7
Stream Partition with Selection Pushdown [KFH04]
Split stream A by A.Value.
Route shared join results.
[Figure: the separate plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σ A.Value>Threshold, then A[w2] ⋈ B[w2]) are replaced by a partitioned plan. A Split operator divides stream A on A.Value (> Threshold, <= Threshold) into A1 and A2; two joins (1 and 2) keep states A[w1], B[w1] and A[w2], B[w2]; a Router (R) on |Ta - Tb| (< w1, all) and a Union (U) deliver the results to Q1 and Q2.]
[KFH04]: S. Krishnamurthy, M. J. Franklin, J. M. Hellerstein, and G. Jacobson. The case for precision sharing. In VLDB’04.
8
Stream Partition with Selection Pushdown [KFH04]
Pros:
Selection pushdown: no wasted join computation.
Cons:
Multiple join operators.
Duplicated state memory in multiple join operators.
Per-output-tuple routing cost.
9
State-Slice: New Sharing Paradigm
Key ideas:
State-slice concept for sliding window join.
Pipelined chain of join slices.
Prospective benefits:
Fine-grained selection push-down.
Pipelined join operators.
Avoiding per-tuple routing cost.
10
One-way State Sliced Window Join
The sliding window has a lower bound: [w1, w2].
A B tuple probes only A tuples that are at least w1, but at most w2, “older” than itself.
[Figure: one sliced join holding the state of stream A for [w1, w2]. Inputs: A Tuple, B Tuple (Probe). Outputs: Joined-Result, Purged-A-Tuple, Propagated-B-Tuple.]
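The behaviour of a single one-way sliced join can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the class and method names, the `(timestamp, key)` tuple layout, and the half-open reading of the slice as `[w_start, w_end)` are all assumptions.

```python
from collections import deque

class OneWaySlicedJoin:
    """One-way sliced window join: the state of A covers ages [w_start, w_end).

    Names, the (timestamp, key) layout, and the half-open window are
    illustrative assumptions, not details given on the slide.
    """

    def __init__(self, w_start, w_end):
        self.w_start, self.w_end = w_start, w_end
        self.state_a = deque()  # A tuples as (timestamp, key), oldest first

    def insert_a(self, a):
        # A tuples are only stored; in a one-way join they never probe.
        self.state_a.append(a)

    def process_b(self, b):
        tb, kb = b
        # Cross-purge: A tuples at least w_end older than b leave the state.
        purged = []
        while self.state_a and tb - self.state_a[0][0] >= self.w_end:
            purged.append(self.state_a.popleft())
        # Probe: b joins A tuples that are at least w_start older than itself.
        joined = [(a, b) for a in self.state_a
                  if a[1] == kb and tb - a[0] >= self.w_start]
        # Return Joined-Result, Purged-A-Tuple list, and the Propagated-B-Tuple.
        return joined, purged, b
```

For example, with a slice `[2, 5)` holding A tuples stamped 0, 3, and 5, a B tuple stamped 6 purges the tuple stamped 0, joins with the tuple stamped 3, and skips the tuple stamped 5 because it is not yet old enough.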
11
The Chain of One-way State-Sliced Joins
Split the state memory into a chain of joins.
No overlap of state memory in the chain of joins.
[Figure: join J1 holds the state of stream A for [0, w1] and join J2 holds the state of stream A for [w1, w2]. A Tuples and B Tuples (Probe) enter J1; Queue(s) connect J1 to J2; a Union (U) combines the Joined-Results of both joins, marked "=" on the slide.]
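A short Python sketch can show how such a chain is wired: A tuples purged from one slice become the input of the next, and each B tuple visits the slices in order. The function name, the `(timestamp, stream, key)` event layout, and the half-open slices `[w_{i-1}, w_i)` are assumptions for illustration; the queues between operators are collapsed into a single loop.

```python
from collections import deque

def chain_one_way_join(events, boundaries):
    """Run a one-way window join as a chain of slices [w_{i-1}, w_i).

    events: (timestamp, stream, key) in timestamp order, stream in {'A', 'B'}.
    boundaries: [0, w_1, ..., w_N], strictly increasing.
    Returns one result list per slice; each result is (ta, tb, key).
    Event layout and half-open slices are illustrative assumptions.
    """
    ends = boundaries[1:]
    states = [deque() for _ in ends]   # disjoint A states, one per slice
    results = [[] for _ in ends]
    for ts, stream, key in events:
        if stream == "A":
            states[0].append((ts, key))      # new A tuples enter the first slice
            continue
        for i, w_end in enumerate(ends):     # the B tuple visits slices in order
            # Cross-purge: A tuples too old for slice i move on to slice i + 1
            # (or leave the chain) before the B tuple is propagated.
            while states[i] and ts - states[i][0][0] >= w_end:
                old = states[i].popleft()
                if i + 1 < len(states):
                    states[i + 1].append(old)
            # Probe the slice state with the B tuple.
            results[i].extend((ta, ts, k) for ta, k in states[i] if k == key)
    return results
```

With boundaries `[0, w1, w2]`, a query with window w1 reads only `results[0]`, while a query with window w2 reads the union of `results[0]` and `results[1]`. Every A tuple lives in exactly one slice at a time, so the slice states never overlap.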
12
From One-way to Two-way Binary Join
Intuitively, a combination of two one-way joins.
Each A or B tuple has two references, male and female.
Male tuples are used to probe states.
Female tuples are inserted into their respective states and cross-purged from them.
[Figure: joins J1 and J2 connected by Queue(s). J1 holds the states of stream A and stream B for [0, w1]; J2 holds the states of stream A and stream B for [w1, w2]. A Tuples and B Tuples enter J1; a Union (U) combines the Joined-Results.]
13
State-Sliced Join Chain: The Example
States of sliced joins in a chain are disjoint from each other: minimizes state memory usage.
Selection can be pushed down into the middle of the join chain: avoids unnecessary resource waste.
No routing step is needed: avoids per-output-tuple routing cost completely.
[Figure: the separate plans for Q1 (A[w1] ⋈ B[w1]) and Q2 (σA, then A[w2] ⋈ B[w2]) are replaced by a chain of two sliced joins, 1 over [0, w1] and 2 over [w1, w2], with σA placed between the slices and on the slice-1 results for Q2; a Union (U) assembles Q2, and Q1 reads the output of slice 1.]
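The example can be sketched in Python for the two queries Q1 (window w1, no selection) and Q2 (window w2, A.Value > Threshold). This is a deliberately simplified sketch, not the plan from the talk: it is one-way (only B tuples probe, only A tuples are stored), and the function name, the `(timestamp, stream, location_id, value)` event layout, and the half-open slices are assumptions.

```python
from collections import deque

def sliced_chain_two_queries(events, w1, w2, threshold):
    """One-way simplification: only B tuples probe, only A tuples are stored.

    events: (timestamp, stream, location_id, value) in timestamp order.
    Returns (q1_results, q2_results, peak_slice2_state).
    Layout, names, and the one-way restriction are illustrative assumptions.
    """
    slice1, slice2 = deque(), deque()   # A states for [0, w1) and [w1, w2)
    q1, q2, peak2 = [], [], 0
    for ts, stream, loc, value in events:
        if stream == "A":
            slice1.append((ts, loc, value))
            continue
        # Slice 1: purge A tuples of age >= w1. The selection sits between the
        # slices, so only tuples passing it are stored in slice 2.
        while slice1 and ts - slice1[0][0] >= w1:
            a = slice1.popleft()
            if a[2] > threshold:
                slice2.append(a)
        for a in slice1:
            if a[1] == loc:
                q1.append((a[0], ts))          # Q1 takes every slice-1 result
                if a[2] > threshold:
                    q2.append((a[0], ts))      # Q2 filters slice-1 results
        # Slice 2: purge A tuples of age >= w2, then probe.
        while slice2 and ts - slice2[0][0] >= w2:
            slice2.popleft()
        peak2 = max(peak2, len(slice2))
        q2.extend((a[0], ts) for a in slice2 if a[1] == loc)
    return q1, q2, peak2
```

No router appears in this plan: slice 1 serves both queries, slice 2 serves only Q2, and A tuples that fail the selection are never stored in slice 2.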
14
Summary: State-Sliced Join Chain
Pros:
Minimized memory usage.
Reduced routing cost.
No need for operator synchronization in the chain.
Cons:
Stream traffic between pipelined joins.
Purge cost.
15
Sharing via Chains: Memory-Optimal Chain
[Figure, panel "No Selection": a chain of sliced joins 1, 2, 3, ..., N over streams A and B with slice windows [0, w1], [w1, w2], [w2, w3], ..., [wN-1, wN]; Union (U) operators assemble the outputs for Q1, Q2, Q3, ..., QN.]
[Figure, panel "With Selection": the same chain with selections σ1, σ2, σ3, ..., σN and σ'1, σ'2, σ'3 placed along the chain.]
16
Mem-Optimal Chain CPU-Optimal Chain?
[Figure: a chain of five sliced joins over streams A and B with slice windows [0, w1], [w1, w2], [w2, w3], [w3, w4], [w4, w5]; Union (U) operators assemble the outputs for Q1 through Q5.]
Overheads:
Too many operators may increase the system context switch cost.
Too many sliced states increase the purging cost.
17
Merging Sliced Joins
Tradeoff:
Gain from merging: reduces the number of join operators; reduces extra purging cost.
Loss from merging: introduces routing cost; increases memory usage due to selection pull-up.
Cost model for CPU usage.
[Figure: the slices from [wi-1, wi] (join i, output Qi) through [wj-1, wj] (join j, output Qj) are merged into a single slice [wi-1, wj]; a Router (R) on |Ta - Tb| (< wi, ..., ≥ wj-1) and Union (U) operators deliver the results to Qi, ..., Qj.]
18
CPU-Opt. Chain: Search Space & Solution
[Figure: a graph with nodes v0, v1, ..., v5 placed at the window boundaries w0, w1, ..., w5, and an example chain over streams A and B with three slices [0, w2], [w2, w3], [w3, w5]. Routers (R) on |Ta - Tb| (< w1 and < w4) and Union (U) operators deliver the results to Q1 through Q5.]
Legend:
vi: window start/end time.
vi to vj: one slice window.
Shortest path problem.
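Because edges only run from a smaller to a larger window boundary, the search can be sketched as a shortest path over a DAG. The Python below is a minimal sketch under stated assumptions: `cheapest_chain` and `toy_cost` are hypothetical names, and `toy_cost` is a made-up stand-in for the CPU cost model, which the slides do not spell out.

```python
def cheapest_chain(windows, edge_cost):
    """Pick slice boundaries by a shortest path over v_0 .. v_N.

    windows: [w_0 = 0, w_1, ..., w_N], increasing.
    edge_cost(i, j): hypothetical CPU cost of one slice covering [w_i, w_j].
    Returns (total_cost, [(w_i, w_j), ...]).
    """
    n = len(windows) - 1
    best = [0.0] + [float("inf")] * n   # best[j]: cheapest path v_0 -> v_j
    prev = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):              # edges only go forward, so this is a DAG
            cost = best[i] + edge_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk back from v_N to recover the chosen slices.
    slices, j = [], n
    while j > 0:
        slices.append((windows[prev[j]], windows[j]))
        j = prev[j]
    return best[n], slices[::-1]

# Made-up cost for illustration only: a fixed per-operator overhead of 5, plus
# a routing penalty that grows with the merged boundaries and the slice length.
WINDOWS = [0, 1, 2, 10, 11, 12]

def toy_cost(i, j):
    return 5 + (j - i - 1) * (WINDOWS[j] - WINDOWS[i])

total, slices = cheapest_chain(WINDOWS, toy_cost)
```

Under this toy cost, the unmerged chain of five slices costs 25, while the search returns the three slices `(0, 2)`, `(2, 10)`, `(10, 12)` at a total of 19: merging the short neighbouring slices saves operator overhead, and the long slice stays unmerged because routing over it would cost more.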
19
Summary: Mem-Opt. vs. CPU-Opt. Join Chain
Mem-Optimal:
Minimized memory usage.
Higher system overhead.
Higher purging cost.
CPU-Optimal:
Minimized CPU usage.
More memory usage if selection is pulled up to merge slices.
[Figure: a spectrum with Selection PullUp Sharing, Mem-Opt. Chain, and CPU-Opt. Chain; the two directions are labeled State Slice and State Merge.]
20
Experimental WPI Stream Engine: CAPE
Software demonstration at VLDB’04.
21
Experiment Study 1: Memory Consumption
22
Experiment Study 2: Total Service Rate
23
Experiment Study 3: Mem-Opt. vs. CPU-Opt.
[Figure: Window Distributions Used for 12 Queries. Panels: Small-Large: 12 Queries; Small-Large: 24 Queries.]
24
Conclusion
Pipelined state-sliced join chain.
Mem-Optimal chain construction.
CPU-Optimal chain construction.
Implemented in CAPE.
Performance evaluation.
25
Thank You!
Visit the CAPE homepage: http://davis.wpi.edu/dsrg/CAPE/index.html
Supported by: CRI grant CNS 05-51584