Published by Hugh Mason. Modified over 8 years ago.
1
Scheduling Strategies for Mapping Application Workflows Onto the Grid
A. Mandal, K. Kennedy, C. Koelbel, G. Marin, J. Mellor-Crummey, B. Liu, L. Johnsson
2
The Forest
Performance Prediction + Scheduling Heuristics
Static Schedule for Workflow Components
G. Marin, 2004; T. Braun, 1999
3
Environment
GrADSoft
–Runs on top of Globus
–Facilitates scheduling, launching, and monitoring of grid apps
Extend GrADSoft to deal with workflows (not only tightly coupled apps)
4
What’s a workflow?
A set of applications (workflow components) that must be run in a specific order
DAG – Directed Acyclic Graph
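To make the DAG idea concrete, here is a minimal sketch of a workflow stored as a dependency map, with one valid run order derived from it. The component names are hypothetical and do not come from the slides. The ordering uses Python's standard `graphlib` module (Python 3.9+), not anything from GrADSoft.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each component maps to the set of components
# that must finish before it can start.
workflow = {
    "preprocess": set(),
    "classify": {"preprocess"},
    "align": {"preprocess"},
    "reconstruct": {"classify", "align"},
}

def run_order(dag):
    """Return one valid execution order for the components.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dag).static_order())

order = run_order(workflow)
print(order)
```

Several orders can be valid for the same DAG ("classify" and "align" may run in either order, or in parallel), and that freedom is what a scheduler exploits.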
5
Workflow Scheduling
Condor DAGMan – dynamic, effectively random scheduling
The approach here is static scheduling
–Classic problem: given a set of machines, a set of jobs, and the performance of each job on each machine, schedule all jobs so as to minimize the total makespan
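As a small illustration of the objective, the sketch below computes the makespan of a given job-to-machine assignment. It assumes jobs on the same machine run back to back, and it ignores dependencies and data transfer. The job names, machine names, and timings are invented for the example.

```python
def makespan(assignment, etc):
    """Finish time of the busiest machine.

    assignment: job -> machine
    etc: etc[job][machine] -> predicted run time of that job on that machine
    """
    load = {}
    for job, machine in assignment.items():
        load[machine] = load.get(machine, 0.0) + etc[job][machine]
    return max(load.values(), default=0.0)

# Invented timings (seconds) for three jobs on two machines.
etc = {
    "a": {"m1": 3.0, "m2": 5.0},
    "b": {"m1": 4.0, "m2": 2.0},
    "c": {"m1": 2.0, "m2": 6.0},
}

spread = makespan({"a": "m1", "b": "m2", "c": "m1"}, etc)
piled = makespan({"a": "m1", "b": "m1", "c": "m1"}, etc)
print(spread, piled)
```

Spreading the jobs over both machines gives a shorter makespan than piling them all onto one, and this is the quantity a static scheduler tries to minimize.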
6
Determining Machine Fitness
Marin and Mellor-Crummey’s performance models
–For each workflow component and target machine, produce a performance model
–Advantage of performance models over cycle-accurate simulations!
Add data transfer penalty (using Network Weather Service)
We now have the expected time to completion (ETC) of every task on every machine.
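One way to picture how the ETC table is assembled is sketched below. It assumes the transfer penalty is modeled as latency plus data size divided by bandwidth, which is an assumption of this example and not a formula stated on the slide. The compute times, data sizes, and link figures are made-up stand-ins for performance-model output and Network Weather Service forecasts.

```python
def transfer_time(size_mb, latency_s, bandwidth_mb_s):
    """Assumed transfer model: fixed latency plus size over bandwidth."""
    return latency_s + size_mb / bandwidth_mb_s

def build_etc(compute, input_mb, links):
    """Combine predicted compute time with a data transfer penalty.

    compute:  compute[task][machine] -> predicted seconds (performance model)
    input_mb: input_mb[task] -> megabytes that must be moved to the machine
    links:    links[machine] -> (latency_s, bandwidth_mb_s) forecast
    """
    etc = {}
    for task, per_machine in compute.items():
        etc[task] = {}
        for machine, seconds in per_machine.items():
            latency, bandwidth = links[machine]
            penalty = transfer_time(input_mb[task], latency, bandwidth)
            etc[task][machine] = seconds + penalty
    return etc

# Made-up numbers: the Opteron is faster but sits behind a slower link.
compute = {"classify": {"itanium": 100.0, "opteron": 80.0}}
input_mb = {"classify": 200.0}
links = {"itanium": (0.0, 100.0), "opteron": (0.5, 10.0)}

etc = build_etc(compute, input_mb, links)
print(etc)
```

With these numbers the transfer penalty almost cancels the faster machine's advantage. A CPU-speed-only estimate would miss that effect.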
7
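Minimum Multiprocessor Scheduling Problem
Classic problem is NP-complete
Use traditional heuristics:
–Min-Min – Schedule first the job whose best completion time is smallest
–Max-Min – Schedule first the job whose best completion time is largest
–Sufferage – Schedule first the job with the most to lose by waiting (largest gap between its best and second-best completion times)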
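The three heuristics above differ only in which unscheduled job they commit next, so they can share one loop. The sketch below follows the common textbook formulation for independent jobs. It is not the presenters' implementation, and it leaves out workflow dependencies, which a workflow scheduler would handle by applying the heuristic only to components whose predecessors are already placed. The function name, job names, and timings are assumptions for illustration.

```python
def schedule(etc, heuristic="min-min"):
    """Greedy list scheduling of independent jobs.

    etc[job][machine] -> expected time to completion of the job on that machine.
    Returns (plan, makespan), where plan maps job -> machine.
    """
    ready = {m: 0.0 for row in etc.values() for m in row}  # time each machine frees up
    unscheduled = set(etc)
    plan = {}
    while unscheduled:
        best = None
        for job in sorted(unscheduled):  # sorted for deterministic tie-breaking
            finish = sorted((ready[m] + t, m) for m, t in etc[job].items())
            first, machine = finish[0]
            second = finish[1][0] if len(finish) > 1 else first
            if heuristic == "min-min":
                key = -first            # smallest best completion time wins
            elif heuristic == "max-min":
                key = first             # largest best completion time wins
            elif heuristic == "sufferage":
                key = second - first    # biggest loss if denied its best machine
            else:
                raise ValueError(f"unknown heuristic: {heuristic}")
            if best is None or key > best[0]:
                best = (key, job, machine, first)
        _, job, machine, first = best
        plan[job] = machine
        ready[machine] = first
        unscheduled.remove(job)
    return plan, max(ready.values(), default=0.0)

# Invented ETC table: three jobs, two machines.
etc = {
    "a": {"m1": 2, "m2": 4},
    "b": {"m1": 3, "m2": 8},
    "c": {"m1": 10, "m2": 11},
}

for name in ("min-min", "max-min", "sufferage"):
    print(name, schedule(etc, name))
```

No single heuristic wins on every input. That is one reason to run all three against the ETC table and keep the schedule with the smallest makespan.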
8
Is This a Workflow Problem?
Scheduling only one component is easy (Marin already showed this works)
Scheduling many components may not be tractable
9
Evaluation
EMAN – Electron Micrograph Analysis
Almost the entire time is spent here
10
Evaluation
–RN: Random Scheduling (DAGMan)
–RA: Weighted Random
–HC: Heuristic Scheduling with crude performance models (CPU speed)
–HA: Heuristic Scheduling with accurate performance models (this scheme)
11
Evaluation Testbed
147 machines, 4 types
–64 dual-processor 900MHz Itanium IA-64 nodes (RTC – Houston)
–16 Opteron 2009MHz nodes (Medusa – Houston)
–60 dual-processor 1300MHz Itanium IA-64 nodes (acrl – Houston)
–7 Pentium IA-32 nodes (Knoxville) – used?
12
Results 2.2x improvement over random
13
Discussion
Static vs Dynamic Scheduling
–Problems?
–Why not use performance models dynamically?
Application to workflows or more to parameter sweeps?
How did they achieve load balance?
Barriers to adoption?