GRASS: Trimming Stragglers in Approximation Analytics

Slides:



Advertisements
Similar presentations
TAU Agent Team: Yishay Mansour Mariano Schain Tel Aviv University TAC-AA 2010.
Advertisements

Energy-efficient Task Scheduling in Heterogeneous Environment 2013/10/25.
Combating Outliers in map-reduce Srikanth Kandula Ganesh Ananthanarayanan , Albert Greenberg, Ion Stoica , Yi Lu, Bikas Saha , Ed Harris   1.
Aggressive Cloning of Jobs for Effective Straggler Mitigation Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, Ion Stoica.
Effective Straggler Mitigation: Attack of the Clones Ganesh Ananthanarayanan, Ali Ghodsi, Srikanth Kandula, Scott Shenker, Ion Stoica.
Varys Efficient Coflow Scheduling Mosharaf Chowdhury,
GRASS: Trimming Stragglers in Approximation Analytics Ganesh Ananthanarayanan, Michael Hung, Xiaoqi Ren, Ion Stoica, Adam Wierman, Minlan Yu.
Energy Efficiency through Burstiness Athanasios E. Papathanasiou and Michael L. Scott University of Rochester, Computer Science Department Rochester, NY.
Effective Straggler Mitigation: Attack of the Clones [1]
Fast Algorithms For Hierarchical Range Histogram Constructions
Trace Analysis Chunxu Tang. The Mystery Machine: End-to-end performance analysis of large-scale Internet services.
1 Schedule Risk Assessment (SRA) Overview July 2013 NAVY CEVM.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
Matei Zaharia, Dhruba Borthakur *, Joydeep Sen Sarma *, Khaled Elmeleegy +, Scott Shenker, Ion Stoica UC Berkeley, * Facebook Inc, + Yahoo! Research Delay.
Three heuristics for transmission scheduling in sensor networks with multiple mobile sinks Damla Turgut and Lotzi Bölöni University of Central Florida.
Near-optimal Nonmyopic Value of Information in Graphical Models Andreas Krause, Carlos Guestrin Computer Science Department Carnegie Mellon University.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
UC Berkeley Improving MapReduce Performance in Heterogeneous Environments Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica University.
Scatter search for project scheduling with resource availability cost Jia-Xian Zhu.
The Power of Choice in Data-Aware Cluster Scheduling
VOLTAGE SCHEDULING HEURISTIC for REAL-TIME TASK GRAPHS D. Roychowdhury, I. Koren, C. M. Krishna University of Massachusetts, Amherst Y.-H. Lee Arizona.
Bold Stroke January 13, 2003 Advanced Algorithms CS 539/441 OR In Search Of Efficient General Solutions Joe Hoffert
Window-Based Greedy Contention Management for Transactional Memory Gokarna Sharma (LSU) Brett Estrade (Univ. of Houston) Costas Busch (LSU) 1DISC 2010.
임규찬. 1. Abstract 2. Introduction 3. Design Goals 4. Sample-Based Scheduling for Parallel Jobs 5. Implements.
Scheduling policies for real- time embedded systems.
Department of Computer Science, UIUC Presented by: Muntasir Raihan Rahman and Anupam Das CS 525 Spring 2011 Advanced Distributed Systems Cloud Scheduling.
Low Latency Geo-distributed Data Analytics Qifan Pu, Ganesh Ananthanarayanan, Peter Bodik, Srikanth Kandula, Aditya Akella, Paramvir Bahl, Ion Stoica.
Reining in the Outliers in Map-Reduce Clusters using Mantri Ganesh Ananthanarayanan, Srikanth Kandula, Albert Greenberg, Ion Stoica, Yi Lu, Bikas Saha,
MC 2 : Map Concurrency Characterization for MapReduce on the Cloud Mohammad Hammoud and Majd Sakr 1.
Matei Zaharia, Dhruba Borthakur *, Joydeep Sen Sarma *, Khaled Elmeleegy +, Scott Shenker, Ion Stoica UC Berkeley, * Facebook Inc, + Yahoo! Research Fair.
MROrder: Flexible Job Ordering Optimization for Online MapReduce Workloads School of Computer Engineering Nanyang Technological University 30 th Aug 2013.
DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters Nanyang Technological University Shanjiang Tang, Bu-Sung Lee, Bingsheng.
CSCI1600: Embedded and Real Time Software Lecture 24: Real Time Scheduling II Steven Reiss, Fall 2015.
Multi-Resource Packing for Cluster Schedulers Robert Grandl Aditya Akella Srikanth Kandula Ganesh Ananthanarayanan Sriram Rao.
1 Low Latency Multimedia Broadcast in Multi-Rate Wireless Meshes Chun Tung Chou, Archan Misra Proc. 1st IEEE Workshop on Wireless Mesh Networks (WIMESH),
Determining Optimal Processor Speeds for Periodic Real-Time Tasks with Different Power Characteristics H. Aydın, R. Melhem, D. Mossé, P.M. Alvarez University.
Multi-Resource Packing for Cluster Schedulers Robert Grandl, Ganesh Ananthanarayanan, Srikanth Kandula, Sriram Rao, Aditya Akella.
PACMan: Coordinated Memory Caching for Parallel Jobs Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Wang, Dhruba Borthakur, Srikanth Kandula, Scott Shenker,
Scheduling Jobs Across Geo-distributed Datacenters Chien-Chun Hung, Leana Golubchik, Minlan Yu Department of Computer Science University of Southern California.
Embedded System Scheduling
Big Data Analytics with Parallel Jobs
R-Storm: Resource Aware Scheduling in Storm
Packing Tasks with Dependencies
Data Driven Resource Allocation for Distributed Learning
Trading Timeliness and Accuracy in Geo-Distributed Streaming Analytics
Chris Cai, Shayan Saeed, Indranil Gupta, Roy Campbell, Franck Le
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Edinburgh Napier University
Monte-Carlo Planning:
Altruistic Scheduling in Multi-Resource Clusters
PA an Coordinated Memory Caching for Parallel Jobs
Measuring Service in Multi-Class Networks
Altruistic Scheduling in Multi-Resource Clusters
Feedback directed optimization in Compaq’s compilation tools for Alpha
Optimizing Expected Time Utility in Cyber-Physical Systems Schedulers
 Real-Time Scheduling via Reinforcement Learning
Is Dynamic Multi-Rate Worth the Effort?
Reining in the Outliers in MapReduce Jobs using Mantri
Richard Anderson Lecture 6 Greedy Algorithms
Cost-effective Outbreak Detection in Networks
 Real-Time Scheduling via Reinforcement Learning
Lecture 18 CSE 331 Oct 9, 2017.
Richard Anderson Lecture 7 Greedy Algorithms
Cloud Computing Large-scale Resource Management
Cloud Computing MapReduce in Heterogeneous Environments
Job-aware Scheduling in Eagle: Divide and Stick to Your Probes
CherryPick: Adaptively Unearthing the Best
What is The Optimal Number of Features
Greg Knowles ECE Fall 2004 Professor Yu Hu Hen
Tiresias A GPU Cluster Manager for Distributed Deep Learning
Presentation transcript:

GRASS: Trimming Stragglers in Approximation Analytics Ganesh Ananthanarayanan, Michael Hung, Xiaoqi Ren, Ion Stoica, Adam Wierman, Minlan Yu

Next Generation of Analytics Timely results, even if approximate Data deluge makes this necessary

Approximation Dimensions Deadline: Maximize accuracy within deadline “Pick the best ad to display within 2s” Error: Minimize time to get desired accuracy “#cars sold to the nearest thousand” Optimal Scheduler Improve accuracy by 48% Speedup by 40% *w.r.t. state-of-the-art schedulers (production workloads from Facebook and Bing)

Scheduling Challenge Prioritize tasks Straggler tasks Subset of tasks to complete #tasks » #slots (multi-waved jobs) (NP-Hard but many known heuristics…) Straggler tasks Slowest task can be 8x slower than median task Speculation: Spawn a duplicate, earliest wins Google[OSDI’04], FB[OSDI’08], Microsoft[OSDI’10]

Challenge: dynamically prioritize between speculative & unscheduled tasks to meet deadline/error bound

Is speculation worth the payoff? Opportunity Cost Speculative copies consume extra resources Slot 1 T1 Is speculation worth the payoff? Slot 2 T2 ST: deadline matters for prioritization Slot 3 T1 T3 time 5 9 10

Roadmap Two natural scheduling designs GRASS: Combining the two designs Evaluation of GRASS

Greedy Scheduling (GS) Greedily improve accuracy, i.e., earliest finishing task Accuracy = 7/9 Straggler Slot 1 T1 T1 Deadline = 6 Slot 2 T2 T3 T4 T5 T6 T7 time 1 6 (at time =1 ) Task ID T1 T2 T3 T4 T5 T6 T7 T8 T9 Time remaining 5 --- New copy 2 1 3

Resource Aware Scheduling (RAS) Speculate only if it saves time and resources Straggler Accuracy = 8/9 Slot 1 T1 T1 T4 T6 T7 Deadline = 6 Slot 2 T2 T1 T3 T5 T8 time 1 3 6 One copy for 5s (vs.) Two copies for 2s (at time =1 ) Task ID T1 T2 T3 T4 T5 T6 T7 T8 T9 Time remaining 5 --- New copy 2 1 3

Neither GS nor RAS is uniformly better GS vs. RAS Accuracy = 3/9 Accuracy = 7/9 Slot 1 T1 T1 GS Deadline = 6 Deadline = 3 Slot 2 T2 T3 T4 T5 T6 T7 time 1 3 Neither GS nor RAS is uniformly better 6 Accuracy = 8/9 Accuracy = 2/9 Slot 1 T1 T4 T6 T7 RAS Deadline = 6 Slot 2 T2 Deadline = 3 T1 T3 T5 T8 time 1 3 6

Intuition: Use RAS early in the job (be “conservative”), switch to GS towards the end (be “aggressive”) “time remaining”

Theoretical Scheduling Model Multi-waved scheduling of tasks Constant wave-width Agnostic to fairness policies Heavy-tailed (Pareto) distribution of task durations Speculation: GS, RAS, Switching, Optimal Theorem: Using RAS when >2 waves of tasks remain, and GS when ≤2 waves of tasks remain is “near-optimal”

How to estimate two remaining waves? Wave boundaries are not strict Non-uniform task durations Wave-width is not constant Start with RAS and switch to GS close to the deadline/error-bound

Learning the switching point RAS[4s]+GS[2s] RAS RAS RAS GS GS RAS[5s]+GS[1s] RAS[6s] 4 5 6 Deadline GS-only and RAS-only job samples “Exploration vs. Exploitation” Multi-armed bandit solution, ɛ = 0.1

GRASS (= GS + RAS) Scheduler Opportunity Cost in speculation for stragglers GS  Greedy Scheduling RAS  Resource Aware Scheduling Switch RASGS close to deadline/error-bound Learn switching point empirically from job samples Provably near-optimal in theoretical model

Implementation Hadoop 0.20.2 and Spark 0.7.3 Task Estimators Modified Fair Scheduler Job bins with GS-only and RAS-only samples Task Estimators Remaining time is extrapolated from data-to-process progress reports at 5% intervals New copy’s time is sampled from completed tasks

How well does GRASS perform? Workload from Facebook and Bing traces Hadoop and Dryad production jobs Added deadlines and error bounds Baselines: LATE & Mantri 200 node EC2 deployment (m2.2xlarge instances)

Accuracy of deadline-bound jobs improve by 47% Gains hold across deadlines (lenient and stringent )

GRASS is 22% better than statically picking GS or RAS … and is near-optimal

Error-bound Jobs Overall speedup of 38% (optimal is 40%) Gains hold across all error bounds Exact jobs (0% error-bound) speed up by 34% Unified Straggler Mitigation

Conclusion Next gen. of analytics: Approximate but timely results Challenge: Dynamic and unpredictable stragglers GRASS – Conservative speculation early in the job; aggressive towards its end Evaluation with Hadoop & Spark Accuracy of deadline-bound jobs improve by 47% Error-bound jobs speed up by 38%