Optimizing Interactive Analytics Engines for Heterogeneous Clusters
Ashwini Raina, MS Thesis
Part of the EuroSys'18 paper "Popular is Cheaper: Curtailing Memory Costs in Interactive Analytics Engines"

Talk Outline (chronological order of work)
- Baseline Getafix
- Query Routing
- Segment Balancing
- Capacity-aware Getafix
- Stragglers
- Auto-tiering
Interactive Analytics Engines
Druid
Replication: Goal vs. a Better Goal
Baseline Getafix
Assumption: equal-capacity CNs.
Goal: provide a load-balanced assignment with the least amount of replication.
Example: segments S1-S4 with query loads 6, 3, 2, 1; compute nodes CN1-CN3, each with capacity ceil((6 + 3 + 2 + 1) / 3) = 4.
Baseline Getafix
Best Fit provably achieves the optimal replica count.
In the example (loads 6, 3, 2, 1 packed onto CN1-CN3 with capacity 4 each), the three CNs end up hosting 1, 2, and 2 segment replicas respectively.
Are there any side-effects of such an allocation?
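A minimal sketch of a best-fit style allocation in the spirit of the slide above (the function name and exact tie-breaking are assumptions for illustration, not Getafix's actual implementation):

```python
import math

def best_fit_allocate(segment_loads, num_nodes):
    """Greedy best-fit packing of segment query loads onto equal-capacity nodes.

    Returns one list per node of (segment, load placed there); a segment
    placed on k nodes needs k replicas.
    """
    capacity = math.ceil(sum(segment_loads.values()) / num_nodes)
    free = [capacity] * num_nodes
    placement = [[] for _ in range(num_nodes)]

    # Most loaded segments first; a segment is split across nodes (gaining a
    # replica) only when no single node can absorb its remaining load.
    for seg, load in sorted(segment_loads.items(), key=lambda kv: -kv[1]):
        while load > 0:
            fits = [n for n in range(num_nodes) if free[n] >= load]
            # Best fit: the tightest node that fits, else the node with most room.
            node = (min(fits, key=lambda n: free[n]) if fits
                    else max(range(num_nodes), key=lambda n: free[n]))
            placed = min(load, free[node])
            placement[node].append((seg, placed))
            free[node] -= placed
            load -= placed
    return placement

# Slide example: loads 6, 3, 2, 1 on 3 CNs, capacity ceil(12 / 3) = 4.
for node, segs in enumerate(best_fit_allocate({"S1": 6, "S2": 3, "S3": 2, "S4": 1}, 3), 1):
    print(f"CN{node}: {segs}")
```

Running this on the slide's example yields 1, 2, and 2 segment replicas on the three CNs, i.e. only segment S1 is replicated.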
Baseline Getafix
[Plots: Average Query Latency vs. Replication Factor; Tail (99th) Query Latency vs. Replication Factor]
Tail latency is 30% worse compared to Scarlett.
Query Routing
Building Druid diagnostics
Query Routing
Load based:
- Minimum Load: CNs piggyback "load indicators" on query responses to the broker; the broker routes a new query to the lowest-loaded CN.
- Connection Count: each broker maintains a count of total open connections (outstanding query responses); the broker routes a new query to the CN with the lowest connection count.
Allocation based:
- Potion routing: in each round Getafix outputs a segment-to-CN allocation map; each broker routes queries preserving the ratios in that allocation map.
Query Routing Observations
Load based:
- Minimum Load: load information is inaccurate/stale.
- Connection Count: the simple scheme works surprisingly well (see the sketch below).
Allocation based:
- Potion routing: lags segment popularity trends.
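A minimal sketch of the connection-count scheme described above (class and method names are illustrative, not Druid's broker API):

```python
import random

class ConnectionCountBroker:
    """Route each new query to the CN with the fewest outstanding responses,
    breaking ties randomly."""

    def __init__(self, compute_nodes):
        self.open = {cn: 0 for cn in compute_nodes}

    def route(self, query):
        fewest = min(self.open.values())
        cn = random.choice([c for c, n in self.open.items() if n == fewest])
        self.open[cn] += 1          # one more outstanding query on this CN
        return cn

    def on_response(self, cn):
        self.open[cn] -= 1          # the CN answered; the connection is closed

broker = ConnectionCountBroker(["CN1", "CN2", "CN3"])
print(broker.route("SELECT count(*) FROM wikipedia"))
```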
Segment imbalance
Segment Balancer
Greedy algorithm that reduces the maximum memory utilization of any CN, which in turn reduces query latency.
Compared to baseline Getafix: 32% reduction in max memory, 15% reduction in query latency with segment-balanced Getafix.
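A minimal sketch of a greedy balancing pass in the spirit described above (the move-off-the-hottest-node heuristic is an assumption about the general approach, not the thesis's exact algorithm):

```python
def balance_segments(assignment, segment_sizes):
    """Greedily move segments from the most memory-loaded CN to the least
    loaded one while doing so lowers the maximum memory utilization.

    assignment: {cn: set of segment ids}; segment_sizes: {segment id: bytes}.
    """
    def mem(cn):
        return sum(segment_sizes[s] for s in assignment[cn])

    improved = True
    while improved:
        improved = False
        hottest = max(assignment, key=mem)
        coldest = min(assignment, key=mem)
        for seg in sorted(assignment[hottest], key=lambda s: segment_sizes[s], reverse=True):
            # Move a segment only if it strictly shrinks the hottest node's load
            # without making the coldest node the new maximum.
            if mem(coldest) + segment_sizes[seg] < mem(hottest):
                assignment[hottest].remove(seg)
                assignment[coldest].add(seg)
                improved = True
                break
    return assignment
```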
Capacity-aware Getafix
The equal-capacity assumption does not hold in heterogeneous environments.
(Recall the example: loads 6, 3, 2, 1 across CN1-CN3 with an assumed uniform capacity of ceil((6 + 3 + 2 + 1) / 3) = 4.)
Capacity-aware Getafix
Estimates compute-node capacities dynamically from the CPU time spent processing queries: higher CPU time implies higher capacity.
Performs weighted allocation based on the estimated capacities.
Up to 23% reduction in tail latency, up to 27% reduction in memory, up to 16% reduction in makespan.
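A minimal sketch of capacity-weighted allocation (the proportional-split rule and function name are assumptions for illustration; the thesis's estimator works from per-node CPU time as stated above):

```python
import math

def node_capacities(cpu_time_per_node, total_load):
    """Estimate per-node capacities by splitting the total query load in
    proportion to the CPU time each node spent on queries last round."""
    total_cpu = sum(cpu_time_per_node.values())
    return {cn: math.ceil(total_load * t / total_cpu)
            for cn, t in cpu_time_per_node.items()}

# Example: CN1 is roughly twice as fast as CN2 and CN3.
loads = {"S1": 6, "S2": 3, "S3": 2, "S4": 1}
caps = node_capacities({"CN1": 2.0, "CN2": 1.0, "CN3": 1.0}, sum(loads.values()))
print(caps)  # {'CN1': 6, 'CN2': 3, 'CN3': 3}
```

These per-node capacities then replace the uniform ceil(total / n) capacity used in the best-fit packing sketched earlier.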
Stragglers
Capacity-awareness automatically addresses stragglers: straggling nodes report lower CPU time, are classified as lower-capacity nodes, and receive a smaller segment query-time allocation.
55% reduction in tail latency, 18% reduction in memory.
Cluster Auto-tiering
Sysadmins manually tier the cluster by assigning hot (popular) segments to powerful nodes; this rule-based assignment is not fully tuned to changes in popularity, and it is laborious and costly.
Capacity-awareness auto-tiers the cluster.
[Heatmaps, Baseline Getafix vs. Getafix-H: each time slice is one Getafix round; darker color means higher popularity.]
75% tiering accuracy, 80% better than baseline Getafix.
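A small illustration of the auto-tiering effect, combining the two sketches above (illustrative only; Getafix-H's tiering behaviour is measured empirically in the thesis, not computed this way). With capacity-weighted packing, the hottest segments naturally land on the highest-capacity nodes:

```python
def allocate_weighted(segment_loads, capacities):
    """Pack the most popular segments first onto the node with the most
    remaining capacity, so hot segments gravitate to powerful nodes."""
    free = dict(capacities)
    placement = {cn: [] for cn in capacities}
    for seg, load in sorted(segment_loads.items(), key=lambda kv: -kv[1]):
        while load > 0:
            cn = max(free, key=free.get)                      # most spare capacity
            placed = min(load, free[cn]) if free[cn] > 0 else load
            placement[cn].append(seg)
            free[cn] = max(0, free[cn] - placed)
            load -= placed
    return placement

# Hot segment S1 lands on the high-capacity node CN1.
print(allocate_weighted({"S1": 6, "S2": 3, "S3": 2, "S4": 1},
                        {"CN1": 6, "CN2": 3, "CN3": 3}))
```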
Results Summary

% improvement over Getafix baseline:
Scenario               | Tail latency (99th) | Memory | Makespan        | Tiering accuracy
Heterogeneous Cluster  | 23%                 | 27%    | 16%             | 80%
Stragglers             | 55%                 | 28%    | Did not measure |

Improvement over Scarlett:
Metric                 | Getafix baseline | Getafix-H
Tail latency (99th)    | -30%             | 9%
Memory                 | 1.45-2.15X       | 2-3X
Lessons
- "A plot is worth a thousand logs."
- Expand your work from the core.
- Replay-able runs => consistent results.
- Don't push AWS keys to GitHub.
Thank You.