
Slide 1: Minimizing Latency and Memory in DSMS
CS240B Notes by Carlo Zaniolo, CSD, UCLA

Slide 2: Query Optimization in DSMS, Opportunities and Challenges
- Simple DBMS-like opportunities exist, e.g., pushing selections down.
- Sharing of operators and buffers can be important.
- Except for these cases, there is no major saving in execution time from reordering, indexes, or operator implementation: total execution time is determined by the query graph and the buffer contents.
[Diagram: example query graphs, including a chain Source → O1 → O2 → O3 → Sink and graphs combining sources, a union, a selection σ, aggregates ∑1 and ∑2, and sinks.]

Slide 3: Optimization Objectives
- Rate-based optimization [VN02]: the overall objective is to maximize the tuple output rate of a query.
- Minimize memory consumption: with large buffers, memory can become scarce.
- Minimize response time (latency):
  - the time from source to sink;
  - maximizes user satisfaction.

Slide 4: Rate-Based Optimization
- Rate-based optimization [VN02]:
  - takes into account the rates of the streams in the query evaluation tree during optimization;
  - rates can be known and/or estimated.
- The overall objective is to maximize the tuple output rate for a query:
  - instead of seeking the least-cost plan, seek the plan with the highest tuple output rate;
  - maximizing the output rate normally leads to optimal response time, but there is no actual proof of that.
- This is in contrast with Chain, which guarantees optimality for memory [Babcock et al. 2003].

Slide 5: Progress Charts
- Each step represents an operator.
- The i-th operator takes (t_i − t_{i−1}) units of time to process a tuple of size s_{i−1}; the result is a tuple of size s_i.
- We can define selectivity as the drop in tuple size from operator i to operator i+1.
[Diagram: progress chart for Source → O1 → O2 → O3, plotting tuple size against time.]
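
One way to make the chart's quantities explicit (a sketch; the slides do not fix the notation): if operator i consumes a tuple of size $s_{i-1}$ in time $t_i - t_{i-1}$ and emits a tuple of size $s_i$, its selectivity and the steepness of its chart segment can be written as

$$\sigma_i = \frac{s_i}{s_{i-1}}, \qquad \text{drop rate}_i = \frac{s_{i-1} - s_i}{t_i - t_{i-1}},$$

so a segment with a high drop rate belongs to an operator that sheds many bytes of state per unit of processing time.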

Slide 6: Chain Scheduling Algorithm
The original query graph is partitioned into subgraphs, which are then prioritized eagerly.
[Diagram: query graph Source → O1 → O2 → O3 → Sink and its memory-vs-time progress chart, with the lower envelope marking the partition into subgraphs.]
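
The chart partitioning is easy to sketch in code. The Python below is an illustrative reconstruction (function and variable names are mine, not from the slides): starting from each point of the progress chart, it extends the current segment to the point with the steepest average drop, which traces out the lower envelope; at run time, each segment is scheduled eagerly, steepest slope first.

```python
def chain_partition(points):
    """Partition a progress chart into lower-envelope segments (Chain-style).

    points: list of (t, s) pairs, one per operator boundary, with t
            strictly increasing and s non-increasing.
    Returns (start_index, end_index, slope) triples in schedule order;
    by construction the slopes are non-increasing.
    """
    segments = []
    i = 0
    while i < len(points) - 1:
        t_i, s_i = points[i]
        best_j, best_slope = i + 1, float("-inf")
        for j in range(i + 1, len(points)):
            t_j, s_j = points[j]
            slope = (s_i - s_j) / (t_j - t_i)  # memory released per unit time
            if slope > best_slope:
                best_j, best_slope = j, slope
        segments.append((i, best_j, best_slope))
        i = best_j
    return segments

# Example: operators O1..O4; the first envelope segment groups O1 and O2,
# because their combined drop is steeper than O1's drop alone.
chart = [(0, 10), (1, 9), (2, 4), (4, 1), (5, 0)]
print(chain_partition(chart))  # [(0, 2, 3.0), (2, 3, 1.5), (3, 4, 1.0)]
```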

Slide 7: State of the Art
Chain limitations:
- Latency minimization is not supported, only memory minimization.
- Its generalization to arbitrary graphs leaves much to be desired.
- It assumes every tuple behaves in the same way.
- Optimality is achieved only under this assumption: what if tuples behave differently?

Slide 8: Query Graphs Are Arbitrary DAGs
[Diagram: example query graphs forming arbitrary DAGs, built from sources, selections σ, unions, aggregates ∑1 and ∑2, operators O1–O3, and sinks.]

Slide 9: Chain for Latency Minimization?
Chain contributions:
- It uses the efficient chart-partitioning algorithm to break up each component into subgraphs.
- The resulting subgraphs are scheduled greedily, steepest slope first.
How can that be used to minimize latency on arbitrary graphs (assuming that the idle-waiting problem has been solved, or does not occur because of massive and balanced arrivals)?

Slide 10: Latency and the Output Completion Chart
Example with one operator: suppose we have 3 input tuples at operator O1.
- The horizontal axis is time; the vertical axis is the number of outputs remaining to be produced.
- With many waiting tuples, the staircase curve smooths into the dotted slope.
- The slope is the average tuple processing rate.
[Diagram: completion charts for Source → O1 → Sink, showing the remaining output dropping as tuples 1, 2, 3 complete, and the smoothed slope S for N tuples.]
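
The "area under the curve" reading of the chart can be made precise with a one-line derivation (assuming all N inputs are available at time 0): if the k-th output completes at time $t_k$ and $R(t)$ counts the outputs still pending at time $t$, then

$$\int_0^{t_N} R(t)\,dt \;=\; \sum_{k=1}^{N} t_k,$$

i.e., the area under the remaining-output curve equals the total latency summed over all outputs. For a single operator with per-tuple cost $c$, $t_k = k\,c$ and the smoothed slope is $S = 1/c$.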

Slide 11: Latency Minimization
Example of latency optimization with multiple independent operators:
- The total area under the remaining-output curve represents the total latency over time.
- Minimizing the total area under the curve is the same as following the lower envelope: order the operators by non-increasing slopes (see the sketch below).
[Diagram: two chains, A (Source → O1 → Sink) and B (Source → O2 → Sink); the completion charts show that running the chain with the steeper slope first yields the smaller area, and similarly for operators O1–O4.]
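
A minimal sketch of the ordering rule in Python (chain names and per-tuple costs are made up for illustration). The exchange argument behind it: for two adjacent chains, running the steeper-sloped one first delays fewer pending outputs, so sorting by non-increasing slope minimizes the total area.

```python
def order_by_slope(chains):
    """Order independent source->sink chains for minimum total latency.

    chains: list of (name, per_tuple_cost) pairs.  A chain's
    remaining-output curve drops with slope 1/cost, so chains are run
    in non-increasing slope order, steepest first.
    """
    return sorted(chains, key=lambda c: 1.0 / c[1], reverse=True)

# B has the cheapest per-tuple cost, hence the steepest slope: run it first.
print(order_by_slope([("A", 2.0), ("B", 1.0), ("C", 4.0)]))
# -> [('B', 1.0), ('A', 2.0), ('C', 4.0)]
```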

Slide 12: Latency Optimization at a Tuple-Sharing Fork
When tuples are shared by multiple branches, there are scheduling choices at each fork:
- Finish all tuples on the fastest branch first (break the fork): this achieves the fastest rate over the first branch.
- Take each input through all branches (no break): FIFO, which achieves the average rate of all the branches.
A small numeric comparison of the two policies follows below.
[Diagram: completion charts for the two policies on the fork Source → O1 → {O2, O3} → Sink with N shared input tuples; partitioning at the fork yields the lower remaining-output curve.]
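
To make the trade-off concrete, here is a tiny simulation under assumed costs (N tuples; 1 time unit per tuple on the O1→O2 branch, 3 on the O3 branch; all numbers are illustrative, not taken from the slides):

```python
# Compare the two fork policies by summing output completion times
# (i.e., the area under the remaining-output curve).
N, c_fast, c_slow = 100, 1, 3

# Break the fork: all N tuples through the fast branch, then the slow one.
t = latency_break = 0
for _ in range(N):
    t += c_fast
    latency_break += t
for _ in range(N):
    t += c_slow
    latency_break += t

# FIFO (no break): each input visits both branches before the next input.
t = latency_fifo = 0
for _ in range(N):
    t += c_fast
    latency_fifo += t  # output of the fast branch
    t += c_slow
    latency_fifo += t  # output of the slow branch

print(latency_break, latency_fifo)  # 30200 40100: breaking the fork wins here
```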

Slide 13: Latency Optimization on Nested Forks
- Recursively apply the partitioning algorithm bottom-up, starting from the forks closest to the sink buffers.
- Similar algorithms can be used for memory minimization:
  1. the slopes become memory-reduction rates;
  2. branch segmentation is required, which is more complicated.
[Diagram: a nested-fork query graph over operators A–E, G, H, O, P, and the remaining-output chart of its partition into G+H+P, O, E, D, and C+A+B.]

Slide 14: Toward an Optimal Algorithm for Latency Minimization
- So far we have chosen which buffer to process next based on the average cost of the tuples in each buffer; for the simple case above (three independent chains), the complete schedule is thus a permutation of A, B, and C.
- Scheduling based on individual tuple costs: make a scheduling decision for each tuple on the basis of its individual cost.
  - Scheduling is still constrained by tuple-arrival order, so at each step we choose among the heads of the buffers.
  - A greedy approach that always selects the least expensive head is not optimal!
[Diagram: three independent chains, Source → A → Sink, Source → B → Sink, and Source → C → Sink.]

Slide 15: Optimal Algorithm When the Cost of Each Tuple Is Known
1. For each buffer, chart the costs of the tuples in the buffer, and partition each chart into groups of tuples.
2. Schedule the groups of tuples eagerly, i.e., by decreasing slope.
3. Optimality follows from minimizing the resulting area = cost × time.
[Diagram: the same three chains A, B, and C, each with a per-tuple cost chart.]
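
A hedged Python reconstruction of steps 1-3 (data layout and names are mine; per-tuple costs are assumed known): each buffer's cost chart is partitioned into lower-envelope groups, and the groups from all buffers are then run by decreasing slope. Since envelope slopes within one buffer are non-increasing, and the sort below is stable, arrival order inside each buffer is preserved.

```python
def optimal_schedule(buffers):
    """Sketch of the known-cost optimal scheduler.

    buffers: dict mapping a buffer name to its tuples' processing costs,
    in arrival order.  A group's slope is its tuple count divided by its
    total cost (outputs completed per unit of processing time).
    Returns (slope, buffer, tuple_indices) triples in execution order.
    """
    groups = []
    for name, costs in buffers.items():
        i, n = 0, len(costs)
        while i < n:
            best_j, best_rate, total = i, 0.0, 0.0
            for j in range(i, n):  # steepest prefix group starting at i
                total += costs[j]
                rate = (j - i + 1) / total
                if rate >= best_rate:
                    best_j, best_rate = j, rate
            groups.append((best_rate, name, list(range(i, best_j + 1))))
            i = best_j + 1
    return sorted(groups, key=lambda g: -g[0])  # steepest slope first

# Costs invented so as to reproduce the order on the next slide:
# SA1 = A1..A4, then SB1 = B1,B2, then SB2 = B3, finally SA2 = A5.
print(optimal_schedule({"A": [1, 1, 1, 1, 9], "B": [2, 2, 8]}))
```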

Slide 16: Example of the Optimal Algorithm
- A naive greedy scheduler would take B1 before A1.
- Optimal order: group SA1 = A1, A2, A3, A4; then SB1 = B1, B2; then SB2 = B3; and finally SA2 = A5.
[Diagram: per-tuple cost charts for buffer A (tuples A1–A5, envelope groups SA1, SA2) and buffer B (tuples B1–B3, envelope groups SB1, SB2).]

Slide 17: Experiments, Practical vs. Optimal
Over many tuples, the practical component-based algorithm for latency minimization closely matches the performance of the (unrealistic) optimal algorithm.
[Graph: latency-minimization results for the practical and optimal algorithms.]

Slide 18: Results
- Unified scheduling algorithms for both latency and memory optimization:
  - the proposed algorithms are based on the chart-partitioning method first used by Chain for memory minimization;
  - they also improve memory minimization for tuple-sharing forks.
- Optimal algorithms are derived under the assumption that the processing cost of each individual tuple is known.
- Experimental evaluation shows that optimization based on the average costs of the tuples in a buffer (instead of their individual costs) produces nearly optimal results.

Slide 19: References
[VN02] S. Viglas, J. F. Naughton: Rate-Based Query Optimization for Streaming Information Sources. SIGMOD Conference 2002: 37-48.
[Babcock et al. 2003] B. Babcock, S. Babu, M. Datar, R. Motwani: Chain: Operator Scheduling for Memory Minimization in Data Stream Systems. SIGMOD Conference 2003: 253-264. (The Chain paper.)
Yijian Bai, Carlo Zaniolo: Minimizing Latency and Memory in DSMS: A Unified Approach to Quasi-Optimal Scheduling. Second International Workshop on Scalable Stream Processing Systems, Nantes, France, March 29, 2008.

