- DAG Scheduling with Reliability - - GridSolve - - Fault Tolerance In Open MPI - Asim YarKhan, Zhiao Shi, Jack Dongarra VGrADS Workshop April 2007.

1 - DAG Scheduling with Reliability - - GridSolve - - Fault Tolerance In Open MPI - Asim YarKhan, Zhiao Shi, Jack Dongarra VGrADS Workshop April 2007

2 Task Graph Scheduling with Reliability
Scheduling task graphs in a heterogeneous system is proven to be NP-hard
As the size of the system increases, the reliability of the system should be addressed
– Application failure prevention
  – Task duplication
  – Checkpointing
– Application failure avoidance
  – Guarantee that the probability of application failure is kept as low as possible
Jack Dongarra, Emmanuel Jeannot, Erik Saule, Zhiao Shi, Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems, in Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '07)

3 Bi-objective Optimization Problem
Task graph $G = (V, E)$, a DAG
– Task $v_i \in V$ has $o_i$ operations, and $|V| = n$
– Edge $e_i = e(i,j) \in E$ has an associated $l_i$ (communication cost from $v_i$ to $v_j$)
Processor $p_j \in P$, $|P| = m$
– $\tau_j$ is the unit execution time on $p_j$
– $\lambda_j$ is the failure rate of $p_j$ (constant failure model)
– Success rate of task $i$ (with $o_i$ ops) on processor $j$ is $p_{succ}^{ij} = e^{-\lambda_j \tau_j o_i}$
Failure model follows an exponential law
$C^j_{\max}$ is the completion time on processor $j$
The success rate of the application is $p_{succ} = e^{-\sum_j \lambda_j C^j_{\max}}$
Goal: minimize the makespan and maximize $p_{succ}$
– Conflicting objectives
– Optimizing one objective will adversely affect the other
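
The success rates on slide 3 can be illustrated with a short Python sketch. It assumes the exponential law named on the slide, so that a processor with constant failure rate lambda_j that stays busy for time t survives with probability exp(-lambda_j * t). The function names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def task_success(ops, tau, lam):
    """Probability that a task with `ops` operations completes on a processor
    with unit execution time `tau` and constant failure rate `lam`."""
    return math.exp(-lam * tau * ops)


def application_success(completion_times, failure_rates):
    """Probability that every processor survives until its own completion
    time: exp(-sum_j lam_j * Cmax_j)."""
    exponent = sum(lam * c for lam, c in zip(failure_rates, completion_times))
    return math.exp(-exponent)


# Hypothetical values, not taken from the talk.
c_max = [6.0, 9.0]    # completion time of each processor
lam = [1e-3, 5e-4]    # failure rate of each processor
print(application_success(c_max, lam))
```

Because the exponent is a weighted sum of completion times, shortening the schedule on an unreliable processor raises the success probability more than shortening it on a reliable one, which is why makespan and reliability can pull in different directions.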

4 Generalizing Scheduling Heuristics
HEFT (Heterogeneous Earliest Finish Time)
– Well-known task graph scheduling heuristic
– At each scheduling step, assign the task to a processor using Earliest Finish Time (EFT)
RHEFT (Reliable HEFT) to minimize failure costs
– At each step, assign task $i$ to the processor $j$ that has the minimum product $T_{end,j} \cdot \lambda_j$
– Easily extended to other heuristics
– $T_{end,1} = 6$; $T_{end,1} \cdot \lambda_1 = 12$
– $T_{end,2} = 9$; $T_{end,2} \cdot \lambda_2 = 9$ (RHEFT picks processor 2, whereas HEFT would pick processor 1)
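
The two selection rules on slide 4 can be compared on the slide's own numbers. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the failure rates 2 and 1 are inferred from the products 12 and 9 given on the slide.

```python
def heft_choice(t_end):
    """HEFT rule: pick the processor with the earliest finish time."""
    return min(t_end, key=t_end.get)


def rheft_choice(t_end, lam):
    """RHEFT rule: pick the processor minimizing finish time * failure rate."""
    return min(t_end, key=lambda j: t_end[j] * lam[j])


t_end = {1: 6, 2: 9}   # finish times from the slide
lam = {1: 2, 2: 1}     # inferred: 12 / 6 = 2 and 9 / 9 = 1

print(heft_choice(t_end))        # processor 1 (finish time 6 < 9)
print(rheft_choice(t_end, lam))  # processor 2 (product 9 < 12)
```

Only the key function differs between the two rules, which is what makes the same change easy to apply to other list-scheduling heuristics.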

5 Reliability/Makespan Tradeoff
Choose a subset of processors (by some ordering)
– Choose k out of 100 processors
Ordering the processors reliable first -> high makespan for a low number of processors
Ordering the processors fastest first -> reliability increases as the number of processors increases to 15

6 Reliability/Makespan Tradeoff
Tradeoff between HEFT and RHEFT
Ordering the processors smallest $\lambda\tau$ first -> makespan OK, and reliability decreases from its optimum at 1 processor
Tradeoff variable at k = 100: a value of 0 gives RHEFT, a value of 1 gives HEFT
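
The processor-subset idea from slides 5 and 6 can be sketched as sorting the processors by a chosen key and keeping the first k. The `Processor` type, the ordering names, and the sample values below are hypothetical. The third ordering assumes that the key on slide 6 is the product of failure rate and unit execution time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    tau: float   # unit execution time
    lam: float   # failure rate


# Sort keys for the three orderings discussed on the slides.
ORDERINGS = {
    "reliable_first": lambda p: p.lam,
    "fastest_first": lambda p: p.tau,
    "smallest_lam_tau_first": lambda p: p.lam * p.tau,
}


def choose_subset(processors, k, ordering):
    """Return the first k processors under the named ordering."""
    return sorted(processors, key=ORDERINGS[ordering])[:k]


# Hypothetical processors, for illustration only.
procs = [
    Processor("A", tau=1.0, lam=4.0),
    Processor("B", tau=2.0, lam=1.0),
    Processor("C", tau=4.0, lam=0.75),
]
for name in ORDERINGS:
    print(name, [p.name for p in choose_subset(procs, 2, name)])
```

A scheduler such as HEFT or RHEFT would then be run on the chosen subset only, so k becomes the knob that trades makespan against reliability.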

7 Task Graphs: Conclusion
Studied the problem of optimizing DAGs with 2 objectives
– Minimize makespan
– Maximize reliability
Created a simple way to generalize heuristics (e.g. HEFT to RHEFT) to allow for reliability
Characterized the role of $\lambda\tau$ as a way to let the user trade off reliability and makespan by choosing a subset of processors

8 GridSolve
Grid-based, client-agent-server RPC system
– Resource discovery, dynamic problem solving capabilities, load balancing, fault tolerance, asynchronous calls, disconnected operation, network data storage, ...
– Dynamic service bindings
  – Client does not need to have stubs for services
– API provides a variety of methods
  – Blocking, non-blocking, task farms, disconnected, ...
– Easy to add additional services by wrapping libraries
Prime goal: ease of use
– Supports the proposed standard GridRPC client API
– Matlab, Octave, IDL: enable desktop-based Grid computing

9 GridSolve: Scheduling
Joint work with Emmanuel Jeannot, INRIA
– History-based performance estimation
  – Uses the historical performance of a specific problem on a specific server with a template-based linear model to build a more accurate service performance model
– Communication cost estimates
  – Client estimates communication costs for a subset of servers via a simple probe
– Perturbation model for scheduling
  – The agent uses an interaction model of the currently executing jobs on the servers to schedule jobs (includes estimated completion times)
  – Accounts for the effect of one job on another
Emmanuel Jeannot, Keith Seymour, Asim YarKhan, and Jack J. Dongarra, Improved Runtime and Transfer Time Prediction Mechanisms in a Network Enabled Servers Middleware, Parallel Processing Letters, 2007
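
The history-based estimate on slide 9 can be illustrated by fitting a linear model to past runs of one problem on one server. This is only a sketch under assumptions, not GridSolve's actual code: the single feature x (for example, a complexity term computed from the problem size), the history layout, the service and server names, and the function names are all hypothetical.

```python
def fit_linear(samples):
    """Least-squares fit of runtime = a + b * x over (x, runtime) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_t = sum(t for _, t in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in samples)
    b = sxt / sxx
    a = mean_t - b * mean_x
    return a, b


def predict(model, x):
    """Estimated runtime for feature value x."""
    a, b = model
    return a + b * x


# Hypothetical history keyed by (problem, server); values are (x, runtime).
history = {
    ("solve", "server1"): [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
}
model = fit_linear(history[("solve", "server1")])
print(predict(model, 4.0))
```

Keeping a separate model per (problem, server) pair lets the estimate absorb server-specific effects that a generic performance model would miss.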

10 GridSolve: Workflow support
Work in progress, with Alexey Lastovetsky (UC Dublin) et al.
GridSolve is being enhanced to support workflows
– Described by a DAG submitted at the client
– DAG parsed by the client into GridSolve calls
– Data transferred from server to server
Scheduling is currently static
– Done at DAG submission time; uses agent knowledge to create a smart mapping of the services to the resources
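
A static mapping of a workflow DAG, as described on slide 10, could look roughly like the sketch below. It is a simple greedy earliest-finish-time mapping, not GridSolve's scheduler. It assumes the runtime estimates are already known and ignores communication costs. The function name, task names, server names, and estimates are hypothetical.

```python
from graphlib import TopologicalSorter


def static_schedule(dag, est_time, servers):
    """Map each task to a server before execution starts.

    dag:      task -> set of predecessor tasks
    est_time: (task, server) -> estimated runtime
    Returns (mapping, finish) dictionaries keyed by task.
    """
    ready_at = {s: 0.0 for s in servers}   # when each server becomes free
    finish, mapping = {}, {}
    for task in TopologicalSorter(dag).static_order():
        # A task cannot start before all of its predecessors have finished.
        deps_done = max((finish[d] for d in dag.get(task, ())), default=0.0)
        best = min(
            servers,
            key=lambda s: max(ready_at[s], deps_done) + est_time[(task, s)],
        )
        finish[task] = max(ready_at[best], deps_done) + est_time[(task, best)]
        ready_at[best] = finish[task]
        mapping[task] = best
    return mapping, finish


# Hypothetical diamond-shaped workflow: a -> (b, c) -> d.
dag = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
servers = ["s1", "s2"]
est = {(t, "s1"): 1.0 for t in dag}
est.update({(t, "s2"): 1.5 for t in dag})
mapping, finish = static_schedule(dag, est, servers)
print(mapping, finish["d"])
```

Because the whole mapping is computed at submission time, it cannot react to load changes during execution, which is the limitation implied by "currently static".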

11 GridSolve: Summary
Beta versions of GridSolve have been released
– Currently v0.15
– New version with scheduling options to be released soon
Integration possibility
– As in NetSolve/GrADS
– vgES can act as a resource manager for the GridSolve agent
– Software and data are migrated on demand
GrADS/NetSolve integration for SC05

12 Open MPI
Open source (BSD-style) MPI-2 implementation (v1.2)
Thread safe
Dynamic process spawning
Uses multiple network interfaces simultaneously via a single library
Tunable (e.g. collective operations can be tuned to the platform)
In development: network and process fault tolerance

13 Open MPI: Fault Tolerance Notes: blue=not maintained; white=still functional; red=maintained and to be incorporated in Open MPI

14 Open MPI: Fault Tolerance
Open MPI plans to support the following fault tolerance techniques:
– LAM/MPI-style automatic, coordinated process checkpoint and restart (expected SC 07)
– MPICH-V-style automatic, uncoordinated process checkpoint with message logging techniques (expected SC 07)
– User-directed and communicator-driven fault tolerance, similar to that implemented in FT-MPI (expected 2008)

15 The End

