MapReduce Scheduling in Cloud Computing


MapReduce Scheduling in Cloud Computing Prof. Jenn-Wei Lin Department of Computer Science and Information Engineering Fu Jen Catholic University, Taiwan

Outline: Introduction (Cloud Computing, MapReduce Scheduling); Proposed MapReduce Scheduling Scheme; Experience Sharing for My Short-Term Research at Iowa State University

What is Cloud Computing? Cloud computing is a general term for a class of network-based computing that uses the network to provide hardware and software services to users, hiding the complexity and details of the underlying infrastructure from users and applications, and exposing them through very simple graphical interfaces, APIs (Application Programming Interfaces), and web-based applications.

Delivery Models: SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service).

Hadoop MapReduce Hadoop MapReduce is a software framework for easily writing applications that process vast amounts of data (multi-terabyte data sets) stored in HDFS, in parallel on large clusters (thousands of nodes) of commodity hardware, in a reliable, fault-tolerant manner. Typical application areas: statistical analysis of massive data sets, sorting and aggregation of large volumes of data, and analysis of web access logs.

MapReduce Framework [Figure: the master node runs the JobTracker, which takes submitted jobs (Job 1, Job 2, Job 3, ...) from a FIFO queue and assigns their map tasks (Map 1 ... Map m) and reduce tasks (Reduce 1 ... Reduce n) to worker nodes 1 through K, each running a TaskTracker.]

MapReduce Framework [Figure: flow of the input data through the MapReduce framework.]

MapReduce Program Example WordCount

Map Function

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      word.set(tokenizer.nextToken());
      output.collect(word, one);   // emit (word, 1) for each token
    }
  }
}

Reduce Function

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();  // sum the counts collected for this word
    }
    output.collect(key, new IntWritable(sum));
  }
}
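
The slide shows only the Map and Reduce classes; a driver is also needed to configure and submit the job. The following is a minimal sketch using the same old-style (org.apache.hadoop.mapred) API; the enclosing class name WordCount and the command-line input/output paths are assumptions for illustration, not part of the original slide.

// Hypothetical driver for the WordCount example above (old mapred API),
// assuming Map and Reduce are nested classes of WordCount.
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCount {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);          // key type emitted by map/reduce
    conf.setOutputValueClass(IntWritable.class); // value type emitted by map/reduce

    conf.setMapperClass(Map.class);              // Map class from the slide
    conf.setCombinerClass(Reduce.class);         // local aggregation before the shuffle
    conf.setReducerClass(Reduce.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));  // HDFS output directory

    JobClient.runJob(conf);  // submit to the JobTracker and wait for completion
  }
}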

MapReduce Program Example WordCount

FIFO scheduler The default approach to scheduling users' jobs in early versions of Hadoop: all jobs run in order of submission. Typically, each job uses the resources of the whole cluster, so all incoming jobs have to wait their turn. Although a shared cluster offers great potential for providing large resources to many users, sharing resources fairly between users requires a better scheduler: production jobs need to complete in a timely manner, while users making smaller ad hoc queries should still get results back in a reasonable time. (Refer to Hadoop: The Definitive Guide, 2nd edition, by Tom White, O'Reilly Media.)

Fair scheduler All jobs get, on average, an equal share of resources over time. When a single job is running, it uses the entire cluster; as other jobs are submitted, free task slots are given to the new jobs so that each job gets roughly the same amount of resources. Unlike the default FIFO scheduler, which forms a queue of jobs, this lets short jobs finish in a reasonable time without starving long jobs. Jobs are placed in pools, and by default each user gets their own pool; a user who submits more jobs than another user will not, on average, get more cluster resources. Custom pools can also be defined with guaranteed minimum capacities (in numbers of map and reduce slots) and per-pool weightings. The Fair Scheduler supports preemption: if a pool has not received its fair share for a certain period of time, the scheduler kills tasks in pools running over capacity to give slots to the pool running under capacity. (See http://hadoop.apache.org/docs/r1.2.1/fair_scheduler.html.)
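
To make the fair-share idea concrete, here is a minimal sketch, not the actual Hadoop FairScheduler code, that splits free task slots across pools while honoring each pool's guaranteed minimum; the Pool class and its field names are assumptions made for this illustration.

import java.util.List;

// Minimal fair-share sketch: distribute free slots across pools,
// honoring guaranteed minimums first, then splitting the rest evenly.
class Pool {
  final String name;
  final int minShare;   // guaranteed minimum number of slots
  int allocated;        // slots assigned in this allocation round

  Pool(String name, int minShare) {
    this.name = name;
    this.minShare = minShare;
  }
}

class FairShareSketch {
  static void allocate(List<Pool> pools, int freeSlots) {
    // 1) Satisfy guaranteed minimums as far as the free slots allow.
    for (Pool p : pools) {
      int give = Math.min(p.minShare, freeSlots);
      p.allocated = give;
      freeSlots -= give;
    }
    // 2) Hand out remaining slots one at a time, always to the pool that
    //    currently holds the fewest slots, so shares stay roughly equal.
    while (freeSlots > 0) {
      Pool poorest = pools.get(0);
      for (Pool p : pools) {
        if (p.allocated < poorest.allocated) poorest = p;
      }
      poorest.allocated++;
      freeSlots--;
    }
  }
}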

Speculative execution (LATE Scheduler) Consider heterogeneous environments. A key benefit of MapReduce is that it automatically handles failures, hiding the complexity of fault tolerance from the programmer: if a node crashes, MapReduce re-runs its tasks on a different machine. Equally importantly, if a node is available but is performing poorly (we call it a straggler), MapReduce runs a backup task (also called a speculative task) on another machine to finish the computation faster; this mechanism is called speculative execution. Without speculative execution, a job would be as slow as its misbehaving task. Stragglers can arise for many reasons, including faulty hardware and misconfiguration. Google has noted that speculative execution can improve job response times by 44%. (Refer to "Improving MapReduce Performance in Heterogeneous Environments".)

Speculative execution (LATE Scheduler) When a node has an empty task slot, Hadoop chooses a task for it from one of three categories. First, any failed tasks are given highest priority; this makes it possible to detect a task that fails repeatedly because of a bug and stop the job. Second, non-running tasks are considered; for maps, tasks with data local to the node are chosen first. Finally, Hadoop looks for a task to execute speculatively.
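
A minimal sketch of this three-level selection order follows. The Task class, its fields, and the straggler test (progress below the average minus 0.2, Hadoop's default threshold as described in the LATE paper) are illustrative assumptions, not Hadoop's actual classes.

import java.util.List;
import java.util.Set;

// Sketch of Hadoop's task-selection order for an empty slot:
// failed tasks first, then non-running tasks (data-local maps preferred),
// then a candidate for speculative execution.
class Task {
  boolean failed;
  boolean running;
  boolean isMap;
  Set<String> inputLocations;   // nodes holding this task's input block
  double progressScore;         // 0..1, used to spot stragglers

  boolean isDataLocal(String nodeId) {
    return isMap && inputLocations != null && inputLocations.contains(nodeId);
  }
}

class TaskSelectionSketch {
  static Task chooseTask(List<Task> tasks, String nodeId, double avgProgress) {
    for (Task t : tasks) if (t.failed) return t;                            // 1) failed tasks
    for (Task t : tasks) if (!t.running && t.isDataLocal(nodeId)) return t; // 2a) data-local maps
    for (Task t : tasks) if (!t.running) return t;                          // 2b) any non-running task
    for (Task t : tasks)                                                    // 3) speculative candidate
      if (t.running && t.progressScore < avgProgress - 0.2) return t;       //    (0.2 below average)
    return null;  // nothing to schedule
  }
}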

Speculative execution (LATE Scheduler) Hadoop’s scheduler starts speculative tasks based on a simple heuristic comparing each task’s progress to the average progress. To select speculative tasks, Hadoop monitors task progress using a progress score between 0 and 1. For a map task, the progress score is the fraction of input data read. For a reduce task, the execution is divided into three phases (copy phase, sort phase, and reduce phase), each of which accounts for 1/3 of the score.
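
A minimal sketch of this progress-score computation, together with the time-left estimate that the LATE paper derives from it; the method and parameter names are assumptions, and the 1/3 weighting of the copy, sort, and reduce phases follows the description above.

// Sketch of the progress score described above, and LATE's time-left estimate.
class ProgressSketch {
  // Map task: fraction of input data read so far.
  static double mapProgress(long bytesRead, long totalInputBytes) {
    return (double) bytesRead / totalInputBytes;
  }

  // Reduce task: copy, sort, and reduce phases each contribute 1/3 of the score.
  // 'phase' is 0 (copy), 1 (sort), or 2 (reduce); 'phaseFraction' is progress within it.
  static double reduceProgress(int phase, double phaseFraction) {
    return (phase + phaseFraction) / 3.0;
  }

  // LATE ranks running tasks by estimated time left:
  //   progressRate = progressScore / elapsedTime
  //   timeLeft     = (1 - progressScore) / progressRate
  static double estimatedTimeLeft(double progressScore, double elapsedSeconds) {
    double progressRate = progressScore / elapsedSeconds;
    return (1.0 - progressScore) / progressRate;
  }
}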

Deadline-Constrained MapReduce Scheduling Based on Graph Modelling Chien-Hung Chen (1), Jenn-Wei Lin (2), and Sy-Yen Kuo (1). (1) Department of Electrical Engineering, National Taiwan University; (2) Department of Computer Science & Information Engineering, Fu Jen Catholic University. sykuo@ntu.edu.tw, jwlin@csie.fju.edu.tw

Outline Introduction Background Proposed Scheduling Scheme Performance Evaluation Conclusion

Introduction MapReduce is a software framework for processing data-intensive applications in a parallel manner in cloud computing systems. Many data-intensive jobs may be issued simultaneously in a cloud computing system. When users run important data-intensive jobs, they usually specify the expected deadlines of the jobs in their Service Level Agreements (SLAs) with the cloud provider.

Introduction In this paper, we propose a new scheduler that utilizes bipartite graph modelling to integrate the following points into MapReduce scheduling: slot performance heterogeneity, adaptive task deadline setting, combining data locality and job deadline, and minimizing the number of deadline-over jobs. The proposed MapReduce scheduler is called the BGMRS. In the BGMRS, a weighted bipartite graph is first formed; based on this graph, an optimal deadline-aware MapReduce scheduling strategy can be obtained by solving the Minimum Weighted Bipartite Matching (MWBM) problem.

Outline Introduction Background Proposed Scheduling Scheme Performance Evaluation Conclusion

System Model We refer to the Hadoop cluster architecture to design our MapReduce scheduling scheme. The Hadoop cluster architecture can be used to implement a cloud system platform consisting of a master node and multiple worker (slave) nodes. A JobTracker process runs on the master node; it coordinates all the jobs on the cluster and places their map and reduce tasks on worker nodes. Each worker node runs a TaskTracker process to control the execution of map and reduce tasks on that node. The TaskTracker also sends progress reports for its tasks to the JobTracker.

System Model In MapReduce, the execution resources of a node are divided into a number of slots. A slot holds a portion of a node's execution resources and runs one map or reduce task. Because nodes are heterogeneous, our system model explicitly considers slots with heterogeneous performance.
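
As a simple illustration of performance-heterogeneous slots, here is a small sketch; the Slot class, the speedFactor field, and the way the estimated execution time is derived are assumptions made for this sketch, not the paper's exact model.

// Sketch of performance-heterogeneous slots: a task's estimated execution
// time on a slot depends on the slot's relative speed.
class Slot {
  final String nodeId;
  final double speedFactor;   // 1.0 = reference slot, 2.0 = twice as fast

  Slot(String nodeId, double speedFactor) {
    this.nodeId = nodeId;
    this.speedFactor = speedFactor;
  }

  // Estimated time to run a task whose reference execution time is given.
  double estimatedTime(double referenceTaskTime) {
    return referenceTaskTime / speedFactor;
  }
}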

Related Work (1/2) L.-Y. Ho, J.-J. Wu, and P. Liu, "Optimal Algorithms for Cross-Rack Communication Optimization in MapReduce Framework," in Proc. IEEE CLOUD, Jul. 2011, pp. 420-427. The authors presented two optimal reduce placement algorithms to mitigate network traffic among racks during a job's shuffle phase; by reducing this data communication, job performance can be improved. The authors assume an all-to-all communication model between the map and reduce tasks of a single job, and the reduction of input data traffic to map tasks was not discussed.

Related Work (2/2) X. Dong, Y. Wang, and H. Liao, "Scheduling Mixed Real-Time and Non-real-Time Applications in MapReduce Environment," in Proc. IEEE ICPADS, Dec. 2011, pp. 9-16. The authors focused on scheduling mixed real-time and non-real-time jobs. To meet job deadlines, real-time jobs that cannot obtain resources within a given time interval may preempt resources from non-real-time jobs. This work handles different types of MapReduce jobs well; however, the heterogeneity of computing resources is not discussed.

Outline Introduction Background Proposed Scheduling Scheme Performance Evaluation Conclusion

The DAMRS problem We investigate the Deadline-Aware MapReduce Scheduling (DAMRS) problem. The main objective of the DAMRS problem is to find an optimal task scheduling strategy that minimizes both the total number of deadline-over tasks and the total task execution time. Unlike the traditional MapReduce scheduling problem, the DAMRS problem considers tasks with different deadline requirements and slots with different execution performance.

Proposed Scheduling Scheme The proposed scheme, the BGMRS, considers multiple deadline requirements from different MapReduce jobs and varying slot performance in the cloud computing system. The scheme consists of the following three steps: deadline partition, bipartite graph formation, and scheduling problem transformation.

Proposed Scheduling Scheme The deadline partition step divides a job deadline into a map deadline and a reduce deadline. As shown in the flowchart, a job deadline is divided according to the job's execution phase. The details are described in the following slides.

Proposed Scheduling Scheme Ready phase: the job deadline is divided into an initial map deadline and an initial reduce deadline. The partition uses the estimated map execution time and the estimated reduce execution time to determine two partition ratios; a task deadline equals its partition ratio times the job deadline:
Initial map deadline = job deadline × (estimated map execution time) / (estimated map execution time + estimated reduce execution time)
Initial reduce deadline = job deadline × (estimated reduce execution time) / (estimated map execution time + estimated reduce execution time)

Proposed Scheduling Scheme Map phase: the deadline partition in the map phase also determines two partition ratios, in the same way as the ready phase, but the average map execution time observed so far is used instead of the estimated map execution time, and the original job deadline is replaced by the remaining job deadline:
Map deadline = remaining job deadline × (average map execution time) / (average map execution time + estimated reduce execution time)
Reduce deadline = remaining job deadline × (estimated reduce execution time) / (average map execution time + estimated reduce execution time)
Reduce phase: the remaining job deadline is fully assigned to the reduce tasks.
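
A minimal sketch of the deadline partition described on the last two slides; the class, method, and parameter names are assumptions, and the proportional split follows the partition-ratio description above.

// Sketch of the deadline partition: split a (remaining) job deadline between
// map and reduce tasks in proportion to their estimated or average times.
class DeadlinePartitionSketch {
  // Ready phase: use the estimated map and reduce execution times.
  static double[] readyPhase(double jobDeadline, double estMapTime, double estReduceTime) {
    double mapRatio = estMapTime / (estMapTime + estReduceTime);
    double mapDeadline = jobDeadline * mapRatio;
    double reduceDeadline = jobDeadline - mapDeadline;
    return new double[] { mapDeadline, reduceDeadline };
  }

  // Map phase: use the average map time observed so far and the remaining deadline.
  static double[] mapPhase(double remainingDeadline, double avgMapTime, double estReduceTime) {
    double mapRatio = avgMapTime / (avgMapTime + estReduceTime);
    double mapDeadline = remainingDeadline * mapRatio;
    double reduceDeadline = remainingDeadline - mapDeadline;
    return new double[] { mapDeadline, reduceDeadline };
  }

  // Reduce phase: the whole remaining deadline goes to the reduce tasks.
  static double reducePhase(double remainingDeadline) {
    return remainingDeadline;
  }
}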

Proposed Scheduling Scheme For the bipartite graph formation, we first give an example. In this example, 3 jobs run concurrently on the cloud system. Considering data locality and slot performance, we first find the feasible slots of each map (reduce) task and the estimated execution time of the task on each of those slots. Based on the given tasks and their feasible slots, a weighted bipartite graph is formed: the edges represent the relationship between tasks and their feasible slots, and the edge costs are set to the estimated execution times. [Figures: an execution scenario of MapReduce jobs; the corresponding bipartite graph modelling.]
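
A minimal sketch of how such a weighted bipartite graph might be built, reusing the Slot sketch from the system-model slide; the Edge class and the feasibility matrix are assumptions for illustration, while the edge costs are the estimated execution times, as described above.

import java.util.ArrayList;
import java.util.List;

// Sketch of bipartite graph formation: one side holds tasks, the other holds
// slots; an edge (task, slot) exists if the slot is feasible for the task,
// and its cost is the task's estimated execution time on that slot.
class Edge {
  final int taskIndex;
  final int slotIndex;
  double cost;   // estimated execution time of the task on the slot

  Edge(int taskIndex, int slotIndex, double cost) {
    this.taskIndex = taskIndex;
    this.slotIndex = slotIndex;
    this.cost = cost;
  }
}

class BipartiteGraphSketch {
  // referenceTime[i]: reference execution time of task i
  // slots: available slots (see the Slot sketch earlier)
  // feasible[i][j]: whether slot j is feasible for task i
  //                 (e.g., data locality, can finish before the task deadline)
  static List<Edge> build(double[] referenceTime, Slot[] slots, boolean[][] feasible) {
    List<Edge> edges = new ArrayList<>();
    for (int i = 0; i < referenceTime.length; i++) {
      for (int j = 0; j < slots.length; j++) {
        if (feasible[i][j]) {
          edges.add(new Edge(i, j, slots[j].estimatedTime(referenceTime[i])));
        }
      }
    }
    return edges;
  }
}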

Proposed Scheduling Scheme To reflect that reduce tasks have higher priority than map tasks in obtaining feasible slots, the costs of the map edges in the graph are re-labelled: if a slot has one or more reduce edges, the maximum cost among those reduce edges is added to the cost of each map edge of that slot. Finally, based on the weighted bipartite graph, an existing algorithm for the Minimum Weighted Bipartite Matching (MWBM) problem is applied to obtain the optimal task scheduling.
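
A minimal sketch of the cost re-labelling step, reusing the Edge list from the previous sketch; solving the resulting MWBM instance is left to an existing matching algorithm (for example a Kuhn-Munkres / Hungarian implementation), which is not reproduced here.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the re-labelling step: give reduce tasks priority by inflating
// the cost of every map edge on a slot by that slot's maximum reduce-edge cost.
class RelabelSketch {
  static void relabel(List<Edge> edges, boolean[] isReduceTask) {
    // Maximum reduce-edge cost per slot.
    Map<Integer, Double> maxReduceCost = new HashMap<>();
    for (Edge e : edges) {
      if (isReduceTask[e.taskIndex]) {
        maxReduceCost.merge(e.slotIndex, e.cost, (a, b) -> Math.max(a, b));
      }
    }
    // Add that maximum to every map edge of the same slot.
    for (Edge e : edges) {
      if (!isReduceTask[e.taskIndex] && maxReduceCost.containsKey(e.slotIndex)) {
        e.cost += maxReduceCost.get(e.slotIndex);
      }
    }
    // The re-labelled graph is then fed to a minimum weighted bipartite
    // matching solver to obtain the task-to-slot schedule.
  }
}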

Outline Introduction Background Proposed Scheduling Scheme Performance Evaluation Conclusion

Performance Evaluation The simulations are performed using MATLAB. We assume the cloud computing system has 400 servers, and the number of MapReduce jobs is set from 25 to 50. The cloud performance settings refer to Amazon EC2. The schemes presented in the related works are compared with our proposed scheme: Optimal Reduce Placement (ORP), extended with the FIFO and Fair schedulers to handle multiple jobs (called ORP_FIFO and ORP_FAIR, respectively), and Approximately Uniform Minimum Degree of parallelism (AUMD). Simulation metrics: normalized total job elapsed time, deadline-over job ratio, and average excess ratio with respect to the job deadline.
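
As a rough illustration of how the latter two metrics could be computed: the slide does not define them precisely, so the formulas below (the fraction of jobs missing their deadline, and the average of (elapsed - deadline) / deadline over those jobs) are assumptions and may differ from the paper's exact definitions.

// Hypothetical computation of two simulation metrics; the exact definitions
// used in the paper may differ.
class MetricsSketch {
  // Fraction of jobs whose elapsed time exceeds their deadline.
  static double deadlineOverJobRatio(double[] elapsed, double[] deadline) {
    int over = 0;
    for (int i = 0; i < elapsed.length; i++) {
      if (elapsed[i] > deadline[i]) over++;
    }
    return (double) over / elapsed.length;
  }

  // Average of (elapsed - deadline) / deadline over the jobs that miss their deadline.
  static double averageExcessRatio(double[] elapsed, double[] deadline) {
    double sum = 0;
    int over = 0;
    for (int i = 0; i < elapsed.length; i++) {
      if (elapsed[i] > deadline[i]) {
        sum += (elapsed[i] - deadline[i]) / deadline[i];
        over++;
      }
    }
    return over == 0 ? 0.0 : sum / over;
  }
}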

Simulation Results In Fig. (a), the total job elapsed time of the BGMRS is about 24% of that of the other schemes on average. From Fig. (b), we can see that the other schemes have longer total job elapsed times than our proposed scheme. [Figure: normalized total job elapsed time; (a) 25 jobs, (b) 50 jobs.]

Simulation Results Comparing the AUMD with the BGMRS, the BGMRS also significantly improves the deadline-over job ratio of the AUMD; the improvement ratio is at least 75%. [Figure: deadline-over job ratio; (a) 25 jobs, (b) 50 jobs.]

Simulation Results The results show that the BGMRS has the smallest average excess ratio with respect to the job deadline. [Figure: average excess ratio with respect to the job deadline; (a) 25 jobs, (b) 50 jobs.]

Outline Introduction Background Proposed Scheduling Scheme Performance Evaluation Conclusion

Conclusion We used bipartite graph modelling to solve the DAMRS problem. Compared to previous MapReduce scheduling schemes, the BGMRS achieves significant improvements in both the total job elapsed time and the deadline-over job ratio. In the future, we will improve the computational time of the BGMRS in large-scale cloud computing systems, and we also plan to implement the proposed scheme in a real-life cloud system.

Thank you for your attention!