Robust Task Scheduling in Non-deterministic Heterogeneous Computing Systems. Zhiao Shi, Asim YarKhan, Jack Dongarra. Followed by GridSolve, FT-MPI, and Open MPI updates.


Robust Task Scheduling in Non-deterministic Heterogeneous Computing Systems
Zhiao Shi, Asim YarKhan, Jack Dongarra
Followed by GridSolve, FT-MPI, and Open MPI updates
VGrADS Workshop, September 2006

2 General Task Scheduling Problem
Task scheduling: allocation of the tasks of a parallel program to processors so as to optimize certain goals, e.g. the overall execution time (makespan)
–Application model: task graph (DAG)
–Node: a computational task with an execution cost
–Edge: a dependency between tasks, with a data-transfer cost
–Computing system model: a network of processing elements (each a processor with memory, communicating via message passing)
The optimal task scheduling problem is NP-complete
–Heuristics run in polynomial time
–Some assume every task has the same computation cost (homogeneous tasks); others allow arbitrary costs
–Some ignore communication costs
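The application model above can be sketched in code; the following is an illustrative DAG container (the class name, task names, and costs are all hypothetical, not taken from the talk):

```python
# Minimal task-graph (DAG) container: nodes carry computation costs,
# edges carry data-transfer (communication) costs.
class TaskGraph:
    def __init__(self):
        self.cost = {}   # task -> computation cost
        self.comm = {}   # (parent, child) -> communication cost
        self.succ = {}   # task -> list of successor tasks

    def add_task(self, name, cost):
        self.cost[name] = cost
        self.succ.setdefault(name, [])

    def add_edge(self, parent, child, comm_cost):
        self.comm[(parent, child)] = comm_cost
        self.succ[parent].append(child)

# A 4-task fork-join example: A feeds B and C, which both feed D.
g = TaskGraph()
for name, c in [("A", 2), ("B", 3), ("C", 4), ("D", 1)]:
    g.add_task(name, c)
for p, ch, w in [("A", "B", 1), ("A", "C", 2), ("B", "D", 2), ("C", "D", 1)]:
    g.add_edge(p, ch, w)
```

This fork-join shape is the simplest graph that exercises both dependency edges and communication costs.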

3 Robust Static Task Scheduling
Heterogeneous and non-deterministic resources
Application DAG with task execution time distributions
Traditional goal: minimize the overall makespan based on expected system performance
Our goal: find static schedules that are more robust to varying task execution times
–Scheduled performance should be relatively stable with respect to the expected makespan
Approach:
–Use "slack" to absorb task execution time increases caused by uncertainties
–Employ a genetic algorithm for optimization

4 Robustness – definition (I)
Relative schedule tardiness
–M_e: makespan of the schedule obtained with the expected task execution times
–M_r: realized makespan under one realization of the task execution times
–Relative tardiness: max(0, (M_r − M_e) / M_e)
–Each realization of the execution times gives a different schedule tardiness
Robustness 1: based on the relative tardiness over realizations
–Reflects the amount by which the realized makespan exceeds the expected makespan
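Given a set of simulated realizations, this quantity can be computed directly; a minimal sketch (the function name and the plain averaging over realizations are assumptions for illustration, not the paper's exact robustness formula):

```python
def relative_tardiness(expected_makespan, realized_makespans):
    """Average relative schedule tardiness: how far, proportionally,
    each realized makespan overshoots the expected one (realizations
    that finish on time or early contribute zero)."""
    lags = [max(0.0, (m - expected_makespan) / expected_makespan)
            for m in realized_makespans]
    return sum(lags) / len(lags)
```

A schedule whose tardiness stays small across realizations counts as more robust under this definition.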

5 Robustness – definition (II)
Schedule miss rate
Robustness 2
–A simpler definition: it simply counts how often the realized makespan exceeds the expected makespan
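The miss-rate metric is a straightforward count; a small sketch (the optional relative `tolerance` margin is a hypothetical generalization — the slide's definition corresponds to a tolerance of 0):

```python
def schedule_miss_rate(expected_makespan, realized_makespans, tolerance=0.0):
    """Fraction of realizations whose makespan exceeds the expected
    makespan by more than a relative tolerance margin."""
    limit = expected_makespan * (1.0 + tolerance)
    misses = sum(1 for m in realized_makespans if m > limit)
    return misses / len(realized_makespans)
```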

6 Calculating Makespan
Given a workflow, processors, and a schedule (an assignment of tasks to processors)
–Adjust the communication costs (e.g. drop them between tasks placed on the same processor)
–Makespan = longest path from source to sink
[Figure (a)–(d): an example DAG mapped onto processors P1–P4, showing the adjusted communication costs and the resulting longest path]
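The longest-path computation above can be sketched as follows; this assumes any per-processor execution order has already been encoded as extra edges in `succ`, and all names here are illustrative rather than from the talk:

```python
from collections import deque

def makespan(cost, comm, succ, assign):
    """Makespan = longest source-to-sink path, after dropping the
    communication cost on edges whose endpoints share a processor."""
    indeg = {t: 0 for t in cost}
    for p in succ:
        for c in succ[p]:
            indeg[c] += 1
    # Kahn's algorithm for a topological order
    queue = deque(t for t in cost if indeg[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for c in succ[t]:
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    # Forward sweep: earliest start/finish times along the adjusted DAG
    start = {t: 0.0 for t in cost}
    finish = {}
    for t in order:
        finish[t] = start[t] + cost[t]
        for c in succ[t]:
            w = 0.0 if assign[t] == assign[c] else comm[(t, c)]
            start[c] = max(start[c], finish[t] + w)
    return max(finish.values())
```

Moving a task between processors changes which communication costs are dropped, which is exactly how the assignment influences the makespan.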

7 Defining Slack
The slack of a task node i is defined as: slack(i) = M − tl(i) − bl(i), where M is the makespan
–bl(i): bottom level (length of the longest path from node i to the exit node)
–tl(i): top level (length of the longest path from the entry node to node i)
Average slack of a schedule: the mean slack over all task nodes
Usefulness of slack in improving robustness
–The slack at each task of a schedule reflects the "wiggle room" that task has
–A large slack means a task node can tolerate a large increase in execution time without increasing the makespan
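Top levels, bottom levels, and slacks can be computed with one forward and one backward pass over a topological order; an illustrative sketch on an already-adjusted graph (all names are hypothetical, and the convention that tl excludes while bl includes the task's own cost is an assumption):

```python
def slacks(cost, comm, succ):
    """slack(i) = makespan - tl(i) - bl(i); tl excludes and bl
    includes the task's own execution cost, so critical-path
    tasks end up with slack 0."""
    pred = {t: [] for t in cost}
    for p in succ:
        for c in succ[p]:
            pred[c].append(p)
    # simple topological order: repeatedly emit tasks whose predecessors are done
    order, done = [], set()
    while len(order) < len(cost):
        for t in cost:
            if t not in done and all(p in done for p in pred[t]):
                order.append(t)
                done.add(t)
    tl = {t: 0.0 for t in cost}             # top level: longest entry-to-task path
    for t in order:
        for c in succ[t]:
            tl[c] = max(tl[c], tl[t] + cost[t] + comm[(t, c)])
    bl = {t: float(cost[t]) for t in cost}  # bottom level: longest task-to-exit path
    for t in reversed(order):
        for c in succ[t]:
            bl[t] = max(bl[t], cost[t] + comm[(t, c)] + bl[c])
    makespan = max(tl[t] + bl[t] for t in cost)
    return {t: makespan - tl[t] - bl[t] for t in cost}
```

The average slack of a schedule is then just the mean of these per-task values.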

8 Bi-objective optimization
We want to optimize both makespan and robustness (as represented by slack)
–These turn out to be conflicting goals
ε-constraint method
–Optimize one objective, subject to constraints imposed on the other objectives
–Maximize the average slack of the schedule
–Subject to: makespan ≤ ε × the HEFT makespan
–HEFT is a well-known, efficient algorithm for serializing a DAG and assigning a schedule
Use a genetic algorithm to do the optimization
–The fitness function is the average slack
–For a solution violating the constraint, the fitness is penalized by the degree of violation of the constraint
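The penalized GA fitness described above can be sketched in a few lines; the linear penalty form and the `penalty_weight` parameter are assumptions for illustration, not the paper's exact choice:

```python
def fitness(avg_slack, makespan, heft_makespan, epsilon, penalty_weight=1.0):
    """epsilon-constraint fitness: maximize average slack, subject to
    makespan <= epsilon * HEFT makespan; constraint violations are
    penalized in proportion to their size."""
    limit = epsilon * heft_makespan
    if makespan <= limit:
        return avg_slack
    return avg_slack - penalty_weight * (makespan - limit)
```

Setting ε = 1.0 forces the schedule to be no slower than HEFT's expected makespan; larger ε trades expected makespan for slack.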

9 Experiment settings
Task graphs: number of tasks (N), shape parameter (alpha), average computation cost (cc), communication-to-computation ratio (CCR)
A best-case execution time matrix beta is generated taking into account task heterogeneity and machine heterogeneity
–b_ij: the best-case execution time of task v_i on processor p_j
Uncertainty level: the degree of uncertainty of the actual execution time
–UL_ij: the uncertainty level of the execution time of task i on processor j
–The real execution time is a uniformly distributed random variable between b_ij and (1 + UL_ij) × b_ij
–The graph has an average uncertainty level UL
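Under this model, a realized execution time can be drawn per task/processor pair; a minimal sketch (the function name is illustrative, and the uniform range follows the best-case-plus-uncertainty description above):

```python
import random

def sample_execution_time(b_ij, ul_ij, rng=random):
    """Realized execution time of task i on processor j: uniformly
    distributed between the best case b_ij and (1 + UL_ij) * b_ij."""
    return rng.uniform(b_ij, (1.0 + ul_ij) * b_ij)
```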

10 Bi-objective optimization improves both makespan and robustness
Performance improvement over HEFT (ε = 1.0)

11 Relaxing ε improves robustness
R1 improvement over ε = 1.0; R2 improvement over ε = 1.0

12 Review
We want to schedule a DAG on a set of resources
–Resources may be shared or have variable performance
–Thus, for each resource, a task has a distribution for its execution time on that resource
We want a bounded makespan for the DAG
–Scheduled performance should be relatively stable with respect to the expected makespan
–The schedule is statically generated (not modified dynamically at runtime)
–The schedule should, as much as possible, be able to withstand variations in the execution times of the individual tasks
–Optimize the slack subject to constraints on the makespan

13 Conclusions
We developed an algorithm for scheduling DAG-structured applications with the goals of both minimizing the makespan and maximizing the robustness
Because the two goals conflict, the ε-constraint method is used to solve the bi-objective optimization problem
We proposed two definitions of robustness
Slack is an effective metric for adjusting robustness
The algorithm is flexible in choosing the ε value within a user-provided range so that the best overall performance is achieved

14 GridSolve Update
GridSolve 0.15 released 05/2006
–Support for NATs and firewalls
–Based on GridRPC
–Support for batch queues
–Improved problem descriptions (gsIDL)
–Matlab, C, Fortran client interfaces
–History-based execution models
–Scheduling using a perturbation model
–Win32 native client
Lots of work left
–Client bindings (Mathematica, IDL, ...)
–Service libraries (ScaLAPACK, ARPACK, ...)
–Backend resource managers (Condor, VGrADS, ...)

15 FT-MPI Update
FT-MPI: version 1.01
–Fast, scalable fault-tolerant MPI implementation
–The system reports failures, recovers processes and messages
–The user must recover the program (via checkpoint/restart, ...)
Status update
–In maintenance mode
–Improvements in stability, performance

16 Open MPI Update
Stable MPI-2 library, version 1.1
–A mixture of ideas coming from previous MPI libraries (FT-MPI, LA-MPI, LAM, PACX-MPI)
–Team: Indiana U, U Tennessee, LANL, HLRS, U Houston, Cisco, Voltaire, Mellanox, Sun, Sandia, Myricom, IBM
–Multiple simultaneous transports (Ethernet, Myrinet, ...)
–Resource managers (ssh, BProc, PBS, XGrid, Slurm, ...)
–Production quality, thread safe, high performance
–Tuned collective operations
Fault tolerance
–Involuntary coordinated checkpointing
–Application unaware that it was checkpointed (by SC06)
–FT-MPI technologies
–A new FT framework in progress

The End