Optimizing Interactive Analytics Engines for Heterogeneous Clusters


1 Optimizing Interactive Analytics Engines for Heterogeneous Clusters
Ashwini Raina, MS Thesis. Part of the EuroSys ’18 paper titled “Popular is Cheaper: Curtailing Memory Costs in Interactive Analytics Engines”

2 Talk Outline
Chronological order of work: Baseline Getafix, Query Routing, Segment Balancing, Capacity-aware Getafix, Stragglers, Auto-tiering

3 Interactive Analytics Engines

4 Druid

5 Replication Goal Better Goal

6 Baseline Getafix
Goal: provide a load-balanced assignment with the least amount of replication
Segments and query loads: S1 = 6, S2 = 3, S3 = 2, S4 = 1
Compute nodes (CNs): CN1, CN2, CN3
Capacity per CN: (6 + 3 + 2 + 1) / 3 = 4
Assumption: equal-capacity CNs

7 Baseline Getafix
Segments and query loads: S1 = 6, S2 = 3, S3 = 2, S4 = 1
Compute nodes: CN1, CN2, CN3
Best Fit provably achieves optimal replica count
Figure labels: 1 replica, 2 replicas, 2 replicas
Are there any side effects of such an allocation?
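The slides do not spell out the Best Fit procedure, so the following is only a minimal sketch under stated assumptions: segments are handled in decreasing order of remaining load, each goes to the node that would be left with the least spare capacity, and a segment that fits nowhere is split across nodes (which is what creates extra replicas). The function name `best_fit_allocate` and the tie-breaking rules are hypothetical, not taken from the talk.

```python
import heapq
import math


def best_fit_allocate(loads, nodes):
    """Assign segment load to nodes; returns {node: {segment: load}}.

    Assumption: every node gets the same capacity, ceil(total load / #nodes).
    """
    capacity = math.ceil(sum(loads.values()) / len(nodes))
    remaining = {cn: capacity for cn in nodes}
    allocation = {cn: {} for cn in nodes}

    # Max-heap on load (negated), ties broken by segment name.
    heap = [(-load, seg) for seg, load in loads.items() if load > 0]
    heapq.heapify(heap)

    while heap:
        neg_load, seg = heapq.heappop(heap)
        load = -neg_load
        fits = [cn for cn in nodes if remaining[cn] >= load]
        if fits:
            # Best fit: the node left with the least spare capacity.
            cn = min(fits, key=lambda c: remaining[c])
        else:
            # Nothing fits fully: fill the emptiest node, split the rest.
            cn = max(nodes, key=lambda c: remaining[c])
        placed = min(load, remaining[cn])
        allocation[cn][seg] = allocation[cn].get(seg, 0) + placed
        remaining[cn] -= placed
        if load > placed:
            # Re-queue the unplaced part; it becomes another replica.
            heapq.heappush(heap, (-(load - placed), seg))
    return allocation


example = best_fit_allocate(
    {"S1": 6, "S2": 3, "S3": 2, "S4": 1}, ["CN1", "CN2", "CN3"]
)
```

On the slide's example this sketch places S1 on CN1 and CN3, S2 and S4 on CN2, and S3 on CN3, so every node carries a load of 4 while holding 1, 2, and 2 segment replicas respectively.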

8 Baseline Getafix
Plots: Average Query Latency vs. Replication Factor; Tail (99th) Query Latency vs. Replication Factor
Tail latency 30% worse compared to Scarlett

9 Query Routing

10 Building Druid diagnostics

11 Query Routing
Load based
  Minimum Load
    CNs piggyback “load indicators” on query responses to the broker
    Broker routes a new query to the lowest-loaded CN
  Connection Count
    Each broker maintains a count of its open connections (outstanding query responses) to each CN
    Broker routes a new query to the CN with the lowest connection count
Allocation based
  Potion routing
    In each round, Getafix outputs a segment-to-CN allocation map
    Each broker routes queries preserving the ratios in that allocation map
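The Connection Count scheme can be sketched as a small broker-side counter. This is an illustration only, not Druid's broker code: the class name `ConnectionCountRouter`, its methods, and the idea that the caller passes in the CNs holding the queried segment are all assumptions.

```python
class ConnectionCountRouter:
    """Broker-local router: pick the candidate CN with the fewest open connections."""

    def __init__(self, nodes):
        # Outstanding (unanswered) queries this broker has sent to each CN.
        self.open_connections = {cn: 0 for cn in nodes}

    def route(self, candidates):
        """Choose among CNs that hold a replica of the queried segment."""
        cn = min(candidates, key=lambda c: self.open_connections[c])
        self.open_connections[cn] += 1
        return cn

    def on_response(self, cn):
        """Call when a query response arrives from `cn`."""
        self.open_connections[cn] -= 1


router = ConnectionCountRouter(["CN1", "CN2", "CN3"])
first = router.route(["CN1", "CN3"])   # both idle, first candidate wins
second = router.route(["CN1", "CN3"])  # CN1 now has one open connection
router.on_response(first)
third = router.route(["CN1", "CN3"])   # CN1 is idle again
```

The counter needs no information from the CNs themselves, which may be part of why the scheme avoids the staleness that affects piggybacked load indicators.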

12 Query Routing Observations
Load based
  Minimum Load: load information inaccurate/stale
  Connection Count: simple scheme works surprisingly well
Allocation based
  Potion routing: lags segment popularity trends
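For comparison, allocation-based routing can be sketched as weighted random choice over the most recent allocation map. The map layout (segment to per-CN load share) and the function name `route_by_allocation` are assumptions made for illustration. Because the weights only change when a new round's map arrives, a scheme like this would naturally trail shifts in segment popularity.

```python
import random


def route_by_allocation(segment, allocation_map, rng):
    """Pick a CN for `segment` with probability proportional to its share.

    allocation_map: {segment: {cn: load assigned to that cn}} (assumed layout).
    """
    shares = allocation_map[segment]
    nodes = list(shares)
    weights = [shares[cn] for cn in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Hypothetical map from one round: S1 is split 4:2 between CN1 and CN3.
allocation_map = {"S1": {"CN1": 4, "CN3": 2}, "S2": {"CN2": 3}}
rng = random.Random(0)
picks = [route_by_allocation("S1", allocation_map, rng) for _ in range(6000)]
cn1_fraction = picks.count("CN1") / len(picks)  # expected to be near 4/6
```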

13 Segment imbalance

14 Segment imbalance

15 Segment Balancer
Greedy algorithm
  Reduces max memory utilization of a CN
  Reduces query latency
Figure labels: Baseline Getafix, Segment Balanced Getafix
  32% reduction in max memory
  15% reduction in query latency
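The slide only says the balancer is greedy, so the sketch below is one plausible greedy rule, not the talk's algorithm. It assumes that the number of segments on a node is a proxy for its memory use, and it repeatedly moves the least-loaded segment from the fullest node to the emptiest one. The function name `balance_segments` is hypothetical. Note that this simple rule also shifts query load, which a real balancer would need to account for.

```python
def balance_segments(allocation):
    """Even out per-node segment counts; returns a new {node: {segment: load}}."""
    result = {cn: dict(segs) for cn, segs in allocation.items()}
    while True:
        most = max(result, key=lambda cn: len(result[cn]))
        least = min(result, key=lambda cn: len(result[cn]))
        if len(result[most]) - len(result[least]) <= 1:
            break  # counts are already as even as they can get
        # Only move segments the destination does not already hold.
        movable = [s for s in result[most] if s not in result[least]]
        if not movable:
            break
        # Greedy choice: the least popular segment disturbs load the least.
        seg = min(movable, key=lambda s: (result[most][s], s))
        result[least][seg] = result[most].pop(seg)
    return result


skewed = {
    "CN1": {"A": 10},
    "CN2": {"B": 3, "C": 3, "D": 2, "E": 2},
    "CN3": {"F": 5, "G": 5},
}
balanced = balance_segments(skewed)
```

In this made-up input the largest per-node segment count drops from 4 to 3 after a single move of segment D to CN1.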

16 Capacity-aware Getafix
Segments and query loads: S1 = 6, S2 = 3, S3 = 2, S4 = 1
Compute nodes: CN1, CN2, CN3
Capacity per CN: (6 + 3 + 2 + 1) / 3 = 4
Equal capacity assumption does not hold in heterogeneous environments

17 Capacity-aware Getafix
Estimates capacities of compute nodes dynamically
  CPU time spent on processing queries
  Higher CPU time implies higher capacity
Does weighted allocation based on capacity
Up to 23% reduction in tail latency
Up to 27% reduction in memory
Up to 16% reduction in makespan
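One way to read "weighted allocation based on capacity" is to split the total query load across nodes in proportion to the CPU time each reported in the previous round. The sketch below assumes exactly that; the function name `capacity_weighted_shares` and the sample CPU times are hypothetical, and the talk may compute the weights differently.

```python
def capacity_weighted_shares(cpu_time, total_load):
    """Split `total_load` across nodes in proportion to reported CPU time.

    cpu_time: {node: CPU time spent processing queries in the last round}.
    """
    total_cpu = sum(cpu_time.values())
    if total_cpu <= 0:
        raise ValueError("need positive CPU time to estimate capacities")
    return {cn: total_load * t / total_cpu for cn, t in cpu_time.items()}


# Same total load as the slide example (6 + 3 + 2 + 1 = 12), but CN1
# reported twice the CPU time of the other two nodes.
shares = capacity_weighted_shares(
    {"CN1": 200.0, "CN2": 100.0, "CN3": 100.0}, total_load=12
)
```

With these inputs CN1 receives a capacity of 6 and the other two nodes 3 each, instead of the uniform 4 used under the equal-capacity assumption. A node that reports less CPU time, such as a straggler, would receive a proportionally smaller share under the same rule.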

18 Stragglers
Capacity-awareness automatically addresses stragglers
  Straggling nodes report lower CPU time
  Classified as lower capacity nodes
  Get lower segment query time allocation
55% reduction in tail latency
18% reduction in memory

19 Cluster auto-tiering
Sysadmins manually tier the cluster
  Assign hot (popular) segments to powerful nodes
  Rule-based assignment, not fully tuned to changes in popularity
  Laborious and costly
Capacity-awareness auto-tiers the cluster
  75% tiering accuracy
  80% better than baseline Getafix
Figure: Baseline Getafix vs. Getafix-H. Time slice is one Getafix round. Darker color => higher popularity

20 Results Summary
% improvement over Getafix baseline
  Heterogeneous Cluster: tail latency (99th) 23%, memory 27%, makespan 16%, tiering accuracy 80%
  Stragglers: tail latency (99th) 55%, memory 28%, remaining metrics: did not measure
Improvement over Scarlett
  Tail latency (99th): Getafix baseline -30%, Getafix-H 9%
  Memory: Getafix baseline X, Getafix-H 2-3X

21 Lessons
“A plot is worth a thousand logs”
Expand your work from the core
Replay-able runs => consistent results
Don’t push AWS keys to GitHub

22 Thank You.

