DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters Nanyang Technological University Shanjiang Tang, Bu-Sung Lee, Bingsheng.


1 DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters. Shanjiang Tang, Bu-Sung Lee, Bingsheng He. School of Computer Engineering, Nanyang Technological University. 22/12/2015

2 Outline: Background and Motivation; DynamicMR Overview; Experimental Evaluation; Conclusion

3 Big Data is Everywhere Lots of data is being collected and warehoused. – Web data, e-commerce – Purchases at department/grocery stores – Bank/credit card transactions – Social networks – Astronomical image processing – Bioinformatics

4 MapReduce is a Promising Choice A popular parallel programming model. [Figure: input data is processed by map tasks (map-phase computation) into intermediate results, which reduce tasks (reduce-phase computation) turn into output results that make up the final result.]
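
The two phases on this slide can be illustrated with a small in-memory sketch. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_word_count`) are hypothetical and chosen only for illustration, and word counting is assumed as the example job.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to every input record and collect (key, value) pairs."""
    intermediate = []
    for record in records:
        intermediate.extend(map_fn(record))
    return intermediate


def shuffle(intermediate):
    """Group intermediate values by key between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Apply reduce_fn to each key and its list of values."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    # Sum the partial counts for one word.
    return sum(counts)


def run_word_count(lines):
    intermediate = map_phase(lines, word_count_map)
    return reduce_phase(shuffle(intermediate), word_count_reduce)
```

In a real cluster the map and reduce calls run as separate tasks on different machines; here they run sequentially so the data flow is easy to follow.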

5 Hadoop Apache Hadoop is an open-source framework for reliable, scalable, and distributed computing. It implements the computational paradigm named MapReduce. – Scales up to 6,000-10,000 machines – Supports multi-tenancy Useful links: – http://hadoop.apache.org/ – http://hadoop.apache.org/docs/r0.20.2/mapred_tutorial.html – http://apache.panu.it/hadoop/common/stable/

6 Challenges in Distributed Environment Node failures and stragglers (slow nodes) – Mean time between failures for 1000 nodes = 1 day → affects performance. Commodity network = low bandwidth – Push computation to the data (Data Locality Optimization) → affects performance. Resource contention in a shared cluster environment – Performance isolation and fair resource sharing → affects performance and fairness. Performance and fairness optimization are important!

7 Our Work Challenge: How to improve the performance of Hadoop while guaranteeing fairness? Our Solution: DynamicMR: A Dynamic Resource Allocation System for Hadoop. – Improve the resource utilization as much as possible. – Improve the utilization efficiency as much as possible.

8 Outline: Background and Motivation; DynamicMR Overview; Experimental Evaluation; Conclusion

9 Observation #1: Poor Resource Utilization Hadoop abstracts resources into map slots and reduce slots. – Configured statically by the Hadoop administrator. – Resource constraint: map tasks can only use map slots, and reduce tasks can only use reduce slots. [Figure: slot usage over time during the map phase and the reduce phase, with 8 map slots and 4 reduce slots.] Slot resources are wasted during computation!

10 Technique #1: Dynamic Hadoop Slot Allocation (DHSA) Core idea of DHSA: – Slots are generic and can be used by either map or reduce tasks, although the numbers of map and reduce slots are still pre-configured. – Map tasks prefer to use map slots, and likewise reduce tasks prefer to use reduce slots. [Figure: slot usage over time during the map phase and the reduce phase, with 8 map slots and 4 reduce slots.]
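
The preference rule above can be sketched as a small allocation function. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `allocate_slots` and its parameters are hypothetical, it looks at a single scheduling round, and it ignores fairness between jobs or pools.

```python
def allocate_slots(idle_map_slots, idle_reduce_slots, pending_maps, pending_reduces):
    """Decide how many pending tasks of each type run on each slot type.

    Each task type first fills its own slot type. Slots that would
    otherwise stay idle are then lent to the other task type.
    """
    # Step 1: preferred placement (map on map slots, reduce on reduce slots).
    maps_on_map_slots = min(idle_map_slots, pending_maps)
    reduces_on_reduce_slots = min(idle_reduce_slots, pending_reduces)

    # Step 2: slots left over after the preferred placement.
    spare_map_slots = idle_map_slots - maps_on_map_slots
    spare_reduce_slots = idle_reduce_slots - reduces_on_reduce_slots

    # Step 3: lend spare slots to the other task type if it still has work.
    maps_on_reduce_slots = min(spare_reduce_slots, pending_maps - maps_on_map_slots)
    reduces_on_map_slots = min(spare_map_slots, pending_reduces - reduces_on_reduce_slots)

    return {
        "maps_on_map_slots": maps_on_map_slots,
        "maps_on_reduce_slots": maps_on_reduce_slots,
        "reduces_on_reduce_slots": reduces_on_reduce_slots,
        "reduces_on_map_slots": reduces_on_map_slots,
    }
```

With 8 map slots, 4 reduce slots, and 12 pending map tasks but no reduce tasks, a static configuration would run 8 tasks and leave 4 slots idle, whereas this sketch runs all 12.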

11 Observation #2: Speculative Execution is a Double-edged Sword Speculative Scheduling – Runs a backup task for a straggling task. – Pros: can improve the performance of a single job. – Cons: resource utilization efficiency is reduced, especially when there are other pending tasks. [Figure: two task schedules, one in which a backup task runs for a straggler ("Benefit J1") and one in which the slot serves other work instead ("Benefit the whole workload").] A performance tradeoff between a single job and batch jobs!

12 Technique #2: Speculative Execution Performance Balancing (SEPB) Key idea of SEPB: – Instead of running a speculative task immediately when a straggler of a job is detected, we check a subset of jobs (maxNumOfJobsCheckedForPendingTasks) for pending tasks. – If there are pending tasks, allocate a pending task. Otherwise, allocate the speculative task. [Figure: job queue J1 to J6 with a window of maxNumOfJobsCheckedForPendingTasks jobs.]
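
The decision rule above can be sketched as follows. Only the parameter name maxNumOfJobsCheckedForPendingTasks comes from the slides; the `Job` class, the function `choose_task_for_idle_slot`, and the assumption that jobs arrive already ordered by scheduling priority are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    pending_tasks: int = 0       # tasks that have not started yet
    has_straggler: bool = False  # a running task was detected as slow


def choose_task_for_idle_slot(jobs, max_num_of_jobs_checked_for_pending_tasks):
    """Return (job_name, kind) for the next task to launch, or None.

    jobs is assumed to be ordered by scheduling priority. Pending tasks in
    the first max_num_of_jobs_checked_for_pending_tasks jobs take precedence
    over speculative (backup) tasks.
    """
    window = jobs[:max_num_of_jobs_checked_for_pending_tasks]
    for job in window:
        if job.pending_tasks > 0:
            return (job.name, "pending")

    # No pending task inside the window: fall back to a speculative task.
    for job in jobs:
        if job.has_straggler:
            return (job.name, "speculative")
    return None
```

A larger window favors overall workload throughput because more jobs are searched for pending work, while a window that covers only the straggling job lets its backup task run immediately.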

13 Observation #3: Load Balance Requirement Harms Data Locality Load balancing is adopted by Hadoop. – Hadoop tries to keep the load (i.e., the number of running tasks) on each node as even as possible. Load balancing makes J1 fail to achieve data locality!

14 Technique #3: Slot PreScheduling Key idea: Improve data locality at the expense of load balance. – When a machine has idle slots and local data, we preschedule the task on that machine first. – Otherwise, we keep the load balance constraint.
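
One way to read the rule above is sketched below. The `Node` and `Task` classes, the function `pick_task`, and the `load_limit` parameter (the number of running tasks a node may hold under load balancing) are all hypothetical names introduced for illustration; the slides do not specify how the load balance constraint is computed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    total_slots: int
    running_tasks: int


@dataclass
class Task:
    task_id: str
    local_nodes: set = field(default_factory=set)  # nodes holding the input data


def pick_task(node, pending_tasks, load_limit):
    """Pick a task for node, preferring data-local work over load balance."""
    if node.running_tasks >= node.total_slots:
        return None  # no idle slot on this node

    # Preschedule: a task with local data runs here even if the node is
    # already at or above its load-balancing limit.
    for task in pending_tasks:
        if node.name in task.local_nodes:
            return task

    # Otherwise respect the load balance constraint.
    if node.running_tasks < load_limit and pending_tasks:
        return pending_tasks[0]
    return None
```

The design choice is that the load limit only gates non-local tasks, so a node can temporarily carry more than its balanced share when doing so avoids a remote read.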

15 DynamicMR A combination of the aforementioned three techniques. – DHSA: Slot Utilization Optimization. – SEPB, Slot PreScheduling: Utilization Efficiency Optimization. [Figure: DynamicMR architecture. An idle slot is handled by (1) Slot Utilization Optimization (DHSA) and (2) Utilization Efficiency Optimization (SEPB and Slot PreScheduling) and is assigned to a map task or a reduce task.]

16 Outline: Background and Motivation; DynamicMR Overview; Experimental Evaluation; Conclusion

17 Experimental Setup Hadoop Cluster – 10 nodes, each with two Intel X5675 CPUs (6 cores per CPU at 3.07 GHz), 24 GB DDR3 memory, and 56 GB hard disks. Benchmark and Data Sets.

18 DynamicMR Performance Evaluation

19 DynamicMR vs. YARN DynamicMR achieves better performance than YARN. – DynamicMR benefits from controlling the ratio of concurrently running map and reduce tasks, whereas YARN has no such control.

20 Outline: Background and Motivation; DynamicMR Overview; Experimental Evaluation; Conclusion

21 Conclusion We propose the DynamicMR framework to improve the performance of MapReduce workloads while maintaining fairness. – It consists of three techniques: DHSA, SEPB, and Slot PreScheduling. Experimental results show that: – It improves the performance of Hadoop by 46%~115% for single jobs and 49%~112% for batch jobs. – It outperforms YARN by about 2%~9% for multiple jobs.

23 DHSA Evaluation DHSA achieves better performance than Hadoop. Hadoop is sensitive to the slot configuration, whereas DHSA is not.

24 SEPB Evaluation SEPB improves performance for the whole workload (Figure a). With SEPB there is a performance tradeoff between an individual job and the whole workload (Figure b).

25 Slot PreScheduling Evaluation Data Locality and Performance Improvement

