SWIM 1/9/2003: QoS in Data Stream Systems. Rajeev Motwani, Stanford University.

1 QoS in Data Stream Systems
Rajeev Motwani, Stanford University
SWIM, 1/9/2003

2 Need for QoS Guarantees?
Characteristics of Data Stream Systems
– High data rates
– Variable, bursty load
– Large number of continuous queries
– Realtime/online processing requirements
else, use traditional solutions (e.g., DBMS+triggers)
Consequently
– Periods of heavy load, exceeding system capacity
– Cannot process all registered CQs fully
– Timely response for critical queries → load-shedding

3 QoS in Data Stream Systems
QoS
– Enable intelligent load-shedding mechanisms
– Prioritization of queries/streams
– Specify method/degree of load-shedding
– React to runtime events
Key Goals
– QoS Mechanisms: techniques for load-shedding
– QoS Models: enable user to specify, control, and understand impact of load-shedding
– QoS Delivery: adaptive management of system resources to optimize a global metric of QoS

4 Load-Shedding Mechanisms
Sampling: input data reduction
Alertness: false positives/negatives
Approximation: incomplete processing
Focusing: value/group-based reduction
Skipping: output reduction
Jumping Windows: instead of sliding windows
Latency: reducing timeliness
Granularity?
– Output-level
– Subquery- or Operator-level
– Input-level

5 Sampling
Typically applied to input streams to reduce rate
Positives
– simple queries → clean error-semantics
– useful for slowly-varying signals (sensors, stocks)
– easy to implement, compose
Negatives
– complex queries (joins, group-by): large errors, murky semantics
– may miss critical events (security)
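
As an illustration of input-level sampling for a simple query, the following minimal sketch drops each tuple independently and scales a running SUM by the inverse sampling rate. The function names and the choice of Bernoulli sampling are assumptions made for this example; the slides do not prescribe a particular sampling scheme.

```python
import random

def bernoulli_sample(stream, p, rng):
    """Yield each item of the stream independently with probability p."""
    for item in stream:
        if rng.random() < p:
            yield item

def estimate_sum(stream, p, rng):
    """Estimate SUM over the full stream from a Bernoulli sample.

    Scaling by 1/p makes the estimate unbiased for a simple aggregate.
    """
    if not 0 < p <= 1:
        raise ValueError("sampling rate p must be in (0, 1]")
    return sum(bernoulli_sample(stream, p, rng)) / p

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only so the demo is repeatable
    readings = list(range(1, 1001))
    print("exact sum:    ", sum(readings))
    print("estimated sum:", estimate_sum(readings, 0.1, rng))
```

For a plain aggregate the error semantics are clean because the estimator is unbiased; the same scaling does not carry over directly to joins or group-by queries, which is the negative the slide points out.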

6 Alertness
Alerts are common stream queries
– selection predicates, usually false but interesting when true
Metric
– fraction of false positives or negatives
– special case of approximation
Positives
– clear error semantics, works well for simple queries
– load-shedding via simple sampling
Negatives
– not suitable for complex queries
– inapplicable when alerts are critical events (security)
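
The alertness metric can be sketched as follows. This example assumes one particular reading of the metric: false negatives as a fraction of the true alerts, and false positives as a fraction of the reported alerts. The slides do not fix the denominators, and the function name is hypothetical.

```python
def alert_error_fractions(true_alerts, reported_alerts):
    """Return (false_positive_fraction, false_negative_fraction).

    Assumed definitions for this sketch:
      false positives = reported alerts that are not true, over all reported
      false negatives = true alerts that were not reported, over all true
    """
    true_alerts = set(true_alerts)
    reported_alerts = set(reported_alerts)
    spurious = reported_alerts - true_alerts
    missed = true_alerts - reported_alerts
    fp = len(spurious) / len(reported_alerts) if reported_alerts else 0.0
    fn = len(missed) / len(true_alerts) if true_alerts else 0.0
    return fp, fn

if __name__ == "__main__":
    fp, fn = alert_error_fractions({1, 2, 3, 4}, {1, 2, 5})
    print("false positive fraction:", fp)
    print("false negative fraction:", fn)
```

When load is shed by sampling the input, a selection-predicate alert can only be missed, not invented, so the false negative fraction is the quantity that sampling degrades.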

7 Approximation
Incomplete processing, giving “approximate” results
– via sampling, synopses, approximate operators, …
– introduce windows or reduce window size
Metrics
– Quantitative: relative/absolute numerical error
– Set-valued: symmetric set difference
Positives
– sampling good for quantitative, unless group-by aggregate
Negatives
– murky error semantics, especially under composition
– sampling works for simple queries only
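
The two families of approximation metrics named above can be written down directly. This is a small sketch with hypothetical function names; reporting the symmetric difference as a raw count rather than a normalized ratio is an assumption of the example.

```python
def absolute_error(exact, approx):
    """Absolute numerical error of a quantitative answer."""
    return abs(approx - exact)

def relative_error(exact, approx):
    """Relative numerical error; undefined when the exact answer is zero."""
    if exact == 0:
        raise ValueError("relative error is undefined for an exact answer of 0")
    return abs(approx - exact) / abs(exact)

def symmetric_difference_size(exact_set, approx_set):
    """Number of elements in exactly one of the two answer sets."""
    return len(set(exact_set) ^ set(approx_set))

if __name__ == "__main__":
    print(absolute_error(100, 90))
    print(relative_error(100, 90))
    print(symmetric_difference_size({1, 2, 3}, {2, 3, 4}))
```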

8 Focusing
User specifies relative importance or desired granularity for groups or attributes or domain values
Possibly, event-driven focusing
Metrics
– weighted average error
– weighted granularity
Positives
– easy load-shedding via filtering, selective sampling
– clear semantics
– composes easily
Negatives
– limited applicability
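
A minimal sketch of focusing, assuming tuples carry a group key and the user supplies per-group weights. The helper names, the tuple layout, and the use of a hard filter (rather than selective sampling) are illustrative choices, not something the slides specify.

```python
def focus_filter(stream, key, focus_groups):
    """Shed load by keeping only tuples whose group is in the focus set."""
    for item in stream:
        if key(item) in focus_groups:
            yield item

def weighted_average_error(errors, weights):
    """Average per-group error, weighted by user-assigned importance."""
    total_weight = sum(weights[g] for g in errors)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    return sum(weights[g] * errors[g] for g in errors) / total_weight

if __name__ == "__main__":
    stream = [("a", 1), ("b", 2), ("a", 3)]
    print(list(focus_filter(stream, key=lambda t: t[0], focus_groups={"a"})))
    print(weighted_average_error({"a": 0.1, "b": 0.5}, {"a": 3, "b": 1}))
```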

9 Skipping
Instead of continuous output, throttle frequency
Metrics
– fraction of outputs skipped
Positives
– load-shedding is easy
– clear application semantics
– applicable to complex queries
Negatives
– may miss critical events
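
Skipping can be sketched as emitting only every k-th result of a continuous query. The parameter `k` and the function names are assumptions for illustration.

```python
def skip_outputs(outputs, k):
    """Emit every k-th output (starting with the first) and drop the rest."""
    if k < 1:
        raise ValueError("k must be at least 1")
    for i, out in enumerate(outputs):
        if i % k == 0:
            yield out

def fraction_skipped(total, emitted):
    """The metric from the slide: share of outputs that were not emitted."""
    return (total - emitted) / total if total else 0.0

if __name__ == "__main__":
    results = list(range(10))
    emitted = list(skip_outputs(results, 3))
    print(emitted)
    print(fraction_skipped(len(results), len(emitted)))
```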

10 Jumping Windows
Move windows multiple steps at a time
Metrics
– time-averaged relative errors
Positives
– easy load-shedding
– clear semantics, easy to quantify
Negatives
– not event-oriented
Open: other window-based approaches?
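
The difference between sliding and jumping windows comes down to the step size. The sketch below uses a count-based window and an average aggregate, both of which are assumptions made for the example.

```python
def window_averages(values, size, step):
    """Average over each window of `size` items, advancing `step` items at a time.

    step == 1 gives a sliding window; step > 1 gives a jumping window,
    which evaluates the aggregate less often and so sheds load.
    """
    if size < 1 or step < 1:
        raise ValueError("size and step must be at least 1")
    return [
        sum(values[start:start + size]) / size
        for start in range(0, len(values) - size + 1, step)
    ]

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print("sliding:", window_averages(data, size=3, step=1))
    print("jumping:", window_averages(data, size=3, step=3))
```

With a step equal to the window size, the aggregate is computed once per window instead of once per tuple; the outputs between jumps are the ones the time-averaged error metric accounts for.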

11 Latency
Sacrifice timeliness of output: produce it within delay D of the earliest time it can be produced
Metrics
– maximum/average/weighted delay
– deadline-based
Positives
– buffer inputs at times of bursty load
– clear semantics, easy to quantify
Negatives
– requires sophisticated system resource management
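
Buffering under bursty load can be illustrated with a toy discrete-time simulation: tuples arrive per tick, a FIFO buffer absorbs the burst, and a fixed number of tuples is processed per tick. The tick model, the fixed capacity, and the function name are all assumptions of this sketch.

```python
from collections import deque

def simulate_delays(arrivals, capacity):
    """Return the delay in ticks for every tuple, in processing order.

    arrivals[t] is the number of tuples arriving at tick t;
    capacity is the number of tuples processed per tick.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    queue = deque()  # holds the arrival tick of each buffered tuple
    delays = []
    tick = 0
    while tick < len(arrivals) or queue:
        if tick < len(arrivals):
            queue.extend([tick] * arrivals[tick])
        for _ in range(min(capacity, len(queue))):
            delays.append(tick - queue.popleft())
        tick += 1
    return delays

if __name__ == "__main__":
    delays = simulate_delays([4, 0, 0], capacity=2)
    print("delays:       ", delays)
    print("maximum delay:", max(delays))
    print("average delay:", sum(delays) / len(delays))
```

Comparing the maximum delay against a bound D gives the deadline-style check; the maximum and average correspond to the delay metrics listed on the slide.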

12 Expressing QoS Requirements
Load-shedding mechanisms
– which forms are acceptable
Granularity of load-shedding
– Output, subquery, operator, input streams
Relative priorities of queries/streams
– Weights, partial orders, …
– Static or dynamic (triggered by events, specified at runtime)
Penalty for load-shedding
– Trade-off curve: accuracy/latency vs utility (Aurora model)
– Rate chart: allowable error/latency vs input rate
– Threshold: maximum accuracy/latency of select queries, best-effort on rest

13 Sample QoS Models
1. Given trade-off curve for utility-vs-delay: maximize total utility
2. Given latency bound, query weights: minimize weighted error subject to meeting latency bound
3. Given query partial order, query weights: best-effort to minimize weighted error subject to latency, respecting partial order
4. Given event set, critical query list: when events trigger, guarantee performance for critical queries only
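
The first model's objective can be made concrete with a small sketch. It assumes each query's utility-vs-delay trade-off curve is given as piecewise-linear breakpoints and that total utility is the plain sum over queries; both are assumptions for illustration, and the code only evaluates the objective rather than optimizing it.

```python
def utility_at(curve, delay):
    """Evaluate a piecewise-linear utility curve at the given delay.

    curve is a list of (delay, utility) breakpoints sorted by delay.
    Outside the breakpoints the nearest endpoint's utility is used.
    """
    if delay <= curve[0][0]:
        return curve[0][1]
    for (d0, u0), (d1, u1) in zip(curve, curve[1:]):
        if delay <= d1:
            return u0 + (delay - d0) / (d1 - d0) * (u1 - u0)
    return curve[-1][1]

def total_utility(curves, delays):
    """Sum of per-query utilities for the delays each query experiences."""
    return sum(utility_at(curves[q], delays[q]) for q in curves)

if __name__ == "__main__":
    # Hypothetical curve: full utility up to 10 time units, zero from 20 on.
    curve = [(0, 1.0), (10, 1.0), (20, 0.0)]
    curves = {"q1": curve, "q2": curve}
    print(total_utility(curves, {"q1": 5, "q2": 15}))
```

A scheduler or load shedder would compare candidate resource allocations by the total utility they yield; how that search is performed is outside what this sketch covers.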

14 Delivering QoS
Load Shedding
– location in query plan affects latency and error
Memory Allocation
– synopsis size affects approximation error
– queues/buffers to handle high rates and rate mismatch
Operator Scheduling
– affects latency
– interaction with memory allocation to buffers
Multi-Query Optimization
– interacts with all of the above
– could constrain QoS delivery, e.g., when combining queries with differing latency/error requirements

15 Summary
QoS is critical to success of stream systems
Many competing approaches for load-shedding and specifying QoS requirements
Delivering QoS is hard and largely unexplored; it should be a fruitful research area

