Scheduling Mixed Parallel Applications with Reservations
Henri Casanova
Information and Computer Science Dept., University of Hawai`i at Manoa
henric@hawaii.edu

Mixed Parallelism
- Both task- and data-parallelism
- “Malleable tasks with precedence constraints”...
[Figure: axes: time, procs]

Mixed Parallelism
- Mixed parallelism arises in many applications, many of them scientific workflows
  - Example: image processing applications that apply a graph of data-parallel filters, e.g., [Hastings et al., 2003]
- Many workflow toolkits support mixed-parallel applications, e.g., [Stef-Praun et al., 2007], [Kanazawa, 2005], [Hunold et al., 2003]

Mixed-Parallel Scheduling
- Mixed-parallel scheduling has been studied by several researchers
  - NP-hard, with guaranteed algorithms [Lepère et al., 2001] [Jansen et al., 2006]
- Several heuristics have been proposed in the literature
  - One-step algorithms [Boudet et al., 2003] [Vydyanathan et al., 2006]
    - Task allocation and task mapping decisions happen concurrently
  - Two-step algorithms [Radulescu et al., 2001] [Bandala et al., 2006] [Rauber et al., 1998] [Suter et al., 2007]
    - First, compute task allocations
    - Second, map tasks to processors using a standard list-scheduling approach

The Allocation Problem
- We can give each task very few (one?) processors
  - Tasks then run for a long time
  - But we can run many of them in parallel
- We can give each task many (all?) processors
  - Tasks then run quickly, but typically with diminishing returns because parallel efficiency is below 1
  - But we cannot run many tasks in parallel
- Trade-off: parallelism vs. task execution times
- Question: how do we achieve a good trade-off?

Critical Path and Work
[Figure: tasks drawn as rectangles; axes: time, processors]
- Two constraints:
  - Makespan × # procs ≥ total work
  - Makespan ≥ critical path length
- Total work = sum of the rectangle surfaces
- Critical path length = execution time of the longest path in the DAG
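The two constraints combine into a single lower bound on the makespan: the larger of total work / # procs and the critical path length. A minimal Python sketch of that bound follows. The function name, the dictionary-based DAG encoding, and the four-task example are illustrative assumptions, not part of the talk.

```python
from functools import lru_cache


def makespan_lower_bound(exec_time, procs, children, num_procs):
    """Return max(total work / num_procs, critical path length).

    exec_time[t]: execution time of task t under its current allocation
    procs[t]:     number of processors allocated to task t
    children[t]:  successors of task t in the DAG (hypothetical encoding)
    """
    # Total work: sum of rectangle areas (execution time x processors).
    total_work = sum(exec_time[t] * procs[t] for t in exec_time)

    @lru_cache(maxsize=None)
    def longest_path_from(task):
        # Execution time of the longest path that starts at `task`.
        return exec_time[task] + max(
            (longest_path_from(c) for c in children.get(task, ())),
            default=0.0,
        )

    critical_path = max(longest_path_from(t) for t in exec_time)
    return max(total_work / num_procs, critical_path)


# Hypothetical diamond-shaped DAG: A -> {B, C} -> D.
exec_time = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 2.0}
procs = {"A": 2, "B": 1, "C": 1, "D": 2}
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
bound = makespan_lower_bound(exec_time, procs, children, num_procs=4)
```

In this example the total work is 12 processor-time units, so total work / # procs is 3, while the longest path A, B, D takes 7; the critical path is the binding constraint.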

Work vs. CP Trade-off
[Figure: critical path length and total work / # procs plotted against task allocations (small to large); the best lower bound on the makespan is marked]

The CPA 2-Step Algorithm
- Original algorithm [Radulescu et al., 2001]
  - For a homogeneous platform
  - Start by allocating 1 processor to every task
  - Then pick a task and increase its allocation by 1 processor
    - Pick the task whose execution time benefits the most from one extra processor
  - Repeat until the critical path length and the total work / # procs become approximately equal
- Improved algorithm [Suter et al., 2007]
  - Uses an empirically better stopping criterion
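The allocation phase described on this slide can be sketched as below. This is a simplified illustration, not the published CPA implementation: it follows the slide's wording (pick the task with the largest execution-time gain), uses "critical path no longer exceeds total work / # procs" as the stopping test, and assumes an Amdahl's law cost model. All names and the example values are hypothetical.

```python
def amdahl_time(t1, alpha, p):
    """Execution time on p processors; alpha is the sequential fraction."""
    return t1 * (alpha + (1.0 - alpha) / p)


def cpa_allocations(t1, alpha, children, num_procs):
    """Grow allocations one processor at a time, as sketched on the slide."""
    alloc = {t: 1 for t in t1}  # start with 1 processor per task

    def duration(t, p=None):
        return amdahl_time(t1[t], alpha[t], alloc[t] if p is None else p)

    def critical_path():
        memo = {}

        def longest(t):
            if t not in memo:
                memo[t] = duration(t) + max(
                    (longest(c) for c in children.get(t, ())), default=0.0
                )
            return memo[t]

        return max(longest(t) for t in t1)

    def average_work():
        return sum(duration(t) * alloc[t] for t in t1) / num_procs

    # Assumed stopping test: stop once the critical path is no longer
    # larger than total work / # procs.
    while critical_path() > average_work():
        candidates = [t for t in t1 if alloc[t] < num_procs]
        if not candidates:
            break
        # Task whose execution time shrinks the most with one more processor.
        best = max(candidates,
                   key=lambda t: duration(t) - duration(t, alloc[t] + 1))
        alloc[best] += 1
    return alloc


# Hypothetical two-task chain A -> B, perfectly parallel tasks, 4 processors.
t1 = {"A": 8.0, "B": 8.0}
alpha = {"A": 0.0, "B": 0.0}
children = {"A": ["B"]}
alloc = cpa_allocations(t1, alpha, children, num_procs=4)
```

For a chain of perfectly parallel tasks there is no task parallelism to preserve, so the loop ends with every task on all 4 processors; with a wider DAG or a nonzero sequential fraction it stops earlier.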

Presentation Outline
- Mixed-Parallel Scheduling
- The Scheduling Problem with Reservations
- Models and Assumptions
- Algorithms for Minimizing Makespan
- Algorithms for Meeting a Deadline
- Conclusion

Batch Scheduling and Reservations
- Platforms are shared by users, today typically via batch schedulers
- Batch schedulers have known drawbacks
  - Non-deterministic queue waiting times
- In many scenarios, one needs guarantees regarding application completion times
- As a result, most batch schedulers today support advance reservations
  - One can reserve some number of processors for some period of time

Reservations
[Figure: reservation schedule; axes: time, processors]
- We have to schedule around the holes in the reservation schedule
- One reservation per task

Complexity
- The makespan minimization problem is NP-hard at several levels (and thus so is meeting a deadline)
  - Mixed-parallel scheduling is NP-hard
    - Guaranteed algorithms [Lepère et al., 2001] [Jansen et al., 2006]
  - Scheduling independent tasks with reservations is NP-hard and unapproximable in general [Eyraud-Dubois et al., 2007]
    - Guaranteed algorithms with restrictions
  - Guaranteed algorithms for mixed-parallel scheduling with reservations are an open problem
- In this work we focus on developing heuristics

Models and Assumptions
- Application
  - We assume that the application is fully specified and static
    - Conservative reservations can be used to be safe
  - Random DAGs are generated using the method in [Suter et al., 2007]
  - Data-parallelism is modeled based on Amdahl’s law
- Platform
  - We assume that the reservation schedule does not change while we compute the schedule
  - We assume that we know the reservation schedule
    - Sometimes not enabled by cluster administrators
  - We ignore communication between tasks
    - Since a parent task may complete well before one of its children can start, data must be written to disk anyway
    - Can be modeled via the task execution time and/or the Amdahl’s law parameter
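For reference, the standard form of Amdahl's law for a task's execution time on p processors is given below. The symbols T(p) and α (the non-parallelizable fraction, presumably the "Amdahl's law parameter" mentioned above) are notation assumed here, not taken from the slides.

```latex
T(p) = T(1)\left(\alpha + \frac{1-\alpha}{p}\right), \qquad 0 \le \alpha \le 1,
\qquad\text{so}\qquad
E(p) = \frac{T(1)}{p\,T(p)} = \frac{1}{1 + \alpha\,(p-1)} \le 1 .
```

The parallel efficiency E(p) drops below 1 as soon as α > 0 and p > 1, which is the diminishing return behind the allocation trade-off.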

Minimizing Makespan
- Natural approach: adapt the CPA algorithm
  - It’s a simple algorithm
    - First phase: compute allocations
    - Second phase: list-scheduling
  - Problem: allocations are computed without considering reservations
    - Considering reservations would involve considering time, which is only done in the second phase
- Greedy approach:
  - Sort the tasks by decreasing bottom-level
  - For each task in this order, determine the best feasible processor allocation, i.e., the one that has the earliest completion time
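The greedy approach can be sketched as follows, under assumptions that the slides do not spell out: reservations are modeled as (start, end, procs) triples on a homogeneous pool of processors, each scheduled task becomes one new reservation, and ties in completion time go to the smaller allocation. Function names and the example are hypothetical.

```python
def free_procs(reservations, num_procs, t):
    """Processors not covered by any (start, end, procs) reservation at time t."""
    return num_procs - sum(p for (s, e, p) in reservations if s <= t < e)


def fits(reservations, num_procs, start, duration, p):
    """True if p processors are free during all of [start, start + duration)."""
    end = start + duration
    # Availability only drops where a reservation starts, so checking `start`
    # and every reservation start inside the interval is sufficient.
    points = [start] + [s for (s, _, _) in reservations if start < s < end]
    return all(free_procs(reservations, num_procs, t) >= p for t in points)


def earliest_start(reservations, num_procs, ready, duration, p):
    """Earliest feasible start >= ready (candidates: ready, reservation ends)."""
    candidates = sorted({ready} | {e for (_, e, _) in reservations if e > ready})
    for start in candidates:
        if fits(reservations, num_procs, start, duration, p):
            return start
    return None  # only reachable if p exceeds the free capacity forever


def greedy_schedule(tasks, parents, bottom_level, exec_time,
                    reservations, num_procs, now=0.0):
    """Map each task to (start, end, procs), earliest completion time first."""
    reservations = list(reservations)
    schedule = {}
    # Decreasing bottom-level places parents before their children.
    for task in sorted(tasks, key=lambda t: -bottom_level[t]):
        ready = max([now] + [schedule[q][1] for q in parents.get(task, ())])
        best = None
        for p in range(1, num_procs + 1):
            d = exec_time(task, p)
            s = earliest_start(reservations, num_procs, ready, d, p)
            if s is not None and (best is None or s + d < best[1]):
                best = (s, s + d, p)
        schedule[task] = best
        reservations.append(best)  # one reservation per task
    return schedule


# Hypothetical example: 4 processors, 3 of them reserved during [0, 10).
t1 = {"A": 8.0, "B": 8.0}
exec_time = lambda task, p: t1[task] * (0.0 + (1.0 - 0.0) / p)  # Amdahl, alpha = 0
schedule = greedy_schedule(
    tasks=["A", "B"], parents={"B": ["A"]},
    bottom_level={"A": 16.0, "B": 8.0}, exec_time=exec_time,
    reservations=[(0.0, 10.0, 3)], num_procs=4,
)
```

In the example, task A runs immediately on the single free processor because waiting for the reservation to end would finish later, while task B waits until time 10 and then takes all 4 processors.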

Example
[Figure: possible task configurations A, B, C, and D placed in the reservation schedule; axes: time, processors]

Computing Bottom-Levels
- Problem:
  - Computing bottom-levels (BLs) requires that we know task execution times
  - Task execution times depend on allocations
  - But we compute the allocations after using the bottom-levels
- We compare four ways to compute BLs
  1. Use 1-processor allocations
  2. Use “all”-processor allocations
  3. Use CPA-computed allocations, using all processors
  4. Use CPA-computed allocations, using the historical average number of non-reserved processors
- We find that the 4th method is marginally better
  - It wins in 78.4% of our simulations (more details on simulations later)
- All results hereafter use this method for computing BLs
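Whichever of the four allocation choices is used, the bottom-level computation itself is the same: a task's BL is its own execution time plus the largest BL among its children. A minimal sketch, with a hypothetical dictionary-based DAG encoding and made-up execution times:

```python
def bottom_levels(exec_time, children):
    """Bottom-level of each task: its execution time plus the longest
    chain of execution times down to an exit task."""
    memo = {}

    def bl(task):
        if task not in memo:
            memo[task] = exec_time[task] + max(
                (bl(c) for c in children.get(task, ())), default=0.0
            )
        return memo[task]

    return {t: bl(t) for t in exec_time}


# Hypothetical diamond DAG; exec_time would come from whichever allocation
# is assumed (1 processor, all processors, or CPA-computed allocations).
exec_time = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 2.0}
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
bl = bottom_levels(exec_time, children)
order = sorted(bl, key=lambda t: -bl[t])  # decreasing bottom-level
```

Because every execution time is positive, sorting by decreasing bottom-level always lists a task before its children.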

Bounding Allocations
- A known problem with such a greedy approach is that allocations are too large
  - The resulting reduction in parallelism ends up being detrimental to the makespan
- Let’s try to bound allocations
- Three methods
  - BD_HALF: bound to half of the processors
  - BD_CPA: bound by the allocations in the CPA schedule computed using all processors
  - BD_CPAR: bound by the allocations in the CPA schedule computed using the historical average number of non-reserved processors

Reservation Schedule Model?
- We conduct our experiments in simulation
  - Cheap, repeatable, controllable
- We need to simulate environments for given reservation schedules
- Question: what does a typical reservation schedule look like?
- Answer: we don’t really know yet
  - There is no “reservation schedule” archive
- Let’s look at what people have done in the past...

Synthetic Reservation Schedules
- We have schedules of batch jobs
  - e.g., the “parallel workload archive”, by D. Feitelson
- Typical approach, e.g., in [Smith et al., 2000]
  - Take a batch job schedule
  - Mark some jobs as “reserved”
  - Remove all other jobs
- Problem: the amount of reservation is then approximately constant over time, while in the real world we expect it to decrease with time
  - And we see it behave this way in a real-world 2.5-year trace from the Grid5K platform
- We should generate reservation schedules where the amount of reservation decreases with time

Synthetic Reservation Schedules
- Three methods to “drop” reservations after the simulated application start time
  - Linearly or exponentially, so that there are no reservations after 7 days
  - Based on job submission time
- Preliminary evaluations indicate that the exponential method leads to schedules that are more correlated with the Grid5K data
  - For 4 logs from the “parallel workload archive”
  - But this is not conclusive because we have only one (good) data set at this point
- We run simulations with the 4 logs, the 3 methods above, and the Grid5K data
- Bottom line for this work: for our purposes, we observe no discrepancies in our results across any of the above
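The slides do not give the exact dropping rules, so the following is only one plausible reading of the linear and exponential methods: each reservation is kept with a probability that decays with its start time after the application start and reaches zero at 7 days. The decay rate, the random thinning, and all names are assumptions made for illustration.

```python
import math
import random

WEEK = 7 * 24 * 3600.0  # seconds


def keep_probability(offset, method, horizon=WEEK, rate=5.0):
    """Probability of keeping a reservation that starts `offset` seconds
    after the application start. `rate` is a hypothetical decay constant."""
    if offset <= 0:
        return 1.0  # assumption: reservations already in place are kept
    if offset >= horizon:
        return 0.0  # no reservations after the horizon (7 days)
    x = offset / horizon
    if method == "linear":
        return 1.0 - x
    if method == "exponential":
        return math.exp(-rate * x)
    raise ValueError(f"unknown method: {method}")


def thin_reservations(reservations, app_start, method, seed=0):
    """Randomly drop (start, end, procs) reservations so that the amount of
    reservation decreases with time after `app_start`."""
    rng = random.Random(seed)
    return [r for r in reservations
            if rng.random() < keep_probability(r[0] - app_start, method)]


# Hypothetical input: one reservation before the start, one 8 days later.
HOUR = 3600.0
reservations = [(-HOUR, 2 * HOUR, 16), (8 * 24 * HOUR, 9 * 24 * HOUR, 32)]
kept = thin_reservations(reservations, app_start=0.0, method="exponential")
```

The third method on the slide, based on job submission time, is not sketched here because the slides give no detail on it.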

Simulation Procedure
- We use 40 application specifications
  - DAG size, width, regularity, etc.
  - 20 samples
- We use 36 reservation schedule specifications
  - Batch log, generation method, etc.
  - 50 samples
- Total: 1,440 x 1,000 = 1,440,000 experiments
  - 40 x 36 = 1,440 combinations of specifications, 20 x 50 = 1,000 combinations of samples
- Two metrics:
  - Makespan
  - CPU-hour consumption

Simulation Results

Algorithm   Makespan                                 CPU-hours
            avg. degradation from best   # of wins   avg. degradation from best   # of wins
BD_ALL      33.75%                       36          42.48%                       0
BD_HALF     28.38%                       337.83% (two cells fused in the source: 33 and 7.83%, or 3 and 37.83%)   1
BD_CPA      0.29%                        1,026       0.75%                        6
BD_CPAR     0.21%                        386         0.00%                        1,434

- Similar results for Grid5K reservation schedules

Meeting a Deadline
- A simple approach for meeting a deadline is to schedule backwards from the deadline
  - Picking tasks by increasing bottom-level
- The way to be as safe as possible is to find, for each task, the feasible allocation that starts as late as possible, given that:
  - The exit task must complete before the deadline
  - The task must complete before all of its children begin
- Let’s see this on a simple example
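Backward scheduling can be sketched as the mirror image of the greedy forward pass. As before, this is an illustration under assumptions not stated in the slides: reservations are (start, end, procs) triples, each placed task becomes a reservation, and all names and the example are hypothetical.

```python
def free_procs(reservations, num_procs, t):
    """Processors not covered by any (start, end, procs) reservation at time t."""
    return num_procs - sum(p for (s, e, p) in reservations if s <= t < e)


def fits(reservations, num_procs, start, duration, p):
    """True if p processors are free during all of [start, start + duration)."""
    end = start + duration
    points = [start] + [s for (s, _, _) in reservations if start < s < end]
    return all(free_procs(reservations, num_procs, t) >= p for t in points)


def latest_start(reservations, num_procs, now, finish_by, duration, p):
    """Latest feasible start such that the task ends by `finish_by`.
    The task ends either at `finish_by` or right where a reservation starts."""
    ends = {finish_by} | {s for (s, _, _) in reservations if s < finish_by}
    for end in sorted(ends, reverse=True):
        start = end - duration
        if start >= now and fits(reservations, num_procs, start, duration, p):
            return start
    return None


def schedule_backwards(tasks, children, bottom_level, exec_time,
                       reservations, num_procs, now, deadline):
    """Map each task to (start, end, procs), or return None on failure."""
    reservations = list(reservations)
    schedule = {}
    # Increasing bottom-level places children before their parents.
    for task in sorted(tasks, key=lambda t: bottom_level[t]):
        # A task must complete before all of its children begin.
        finish_by = min([deadline] +
                        [schedule[c][0] for c in children.get(task, ())])
        best = None
        for p in range(1, num_procs + 1):
            d = exec_time(task, p)
            s = latest_start(reservations, num_procs, now, finish_by, d, p)
            if s is not None and (best is None or s > best[0]):
                best = (s, s + d, p)
        if best is None:
            return None  # the deadline cannot be met by this heuristic
        schedule[task] = best
        reservations.append(best)
    return schedule


# Hypothetical example: 4 processors, 3 of them reserved during [12, 20).
t1 = {"A": 8.0, "B": 8.0}
exec_time = lambda task, p: t1[task] * (0.0 + (1.0 - 0.0) / p)  # Amdahl, alpha = 0
args = dict(tasks=["A", "B"], children={"A": ["B"]},
            bottom_level={"A": 16.0, "B": 8.0}, exec_time=exec_time,
            reservations=[(12.0, 20.0, 3)], num_procs=4, now=0.0)
schedule = schedule_backwards(deadline=20.0, **args)
```

In the example the exit task B fits on the single free processor right up to the deadline, and its parent A then takes all 4 processors so that it can start as late as possible, which illustrates why this rule tends to pick large allocations.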

Meeting a Deadline Example
[Figure, animated over several slides; axes: time, processors, with the deadline marked. Two tasks, Task 1 and Task 2, each with five possible configurations labeled A to E. Configurations A to E are tried in turn against the deadline and Task 2 is placed; configurations A to E are then tried in turn for Task 1, which is placed before Task 2.]

Algorithms
- We can employ the same techniques for bounding allocations as for the makespan minimization algorithms
  - BD_ALL, BD_HALF, BD_CPA, BD_CPAR
- Problem: the algorithms do not consider the tightness of the deadline
  - If the deadline is loose, the above algorithms will consume unnecessarily many CPU-hours
  - For a very loose deadline there should be no data-parallelism, and thus no parallel efficiency loss due to Amdahl’s law
- Question: how can we reason about deadline tightness?

Deadline Tightness
- For each task we have a choice of allocations:
  - Ones that use too many processors may be wasteful
  - Ones that use too few processors may be dangerous
- Idea: consider the CPA-computed schedule assuming an empty reservation schedule
  - Using all processors, or the historical average number of non-reserved processors
  - Determine when the task would start in that schedule, i.e., at which fraction of the overall makespan
  - Pick the allocation that allows the task to start at the same fraction of the time interval between “now” and the deadline
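One way to read this idea in code: scale the task's CPA start time to the interval between "now" and the deadline, then choose the cheapest candidate allocation that starts no earlier than that scaled time. The sketch below assumes that "cheapest" means fewest CPU-hours (processors × duration) and that the candidate start times have already been computed by a backward pass; both are assumptions, and all names are hypothetical.

```python
def rc_pick_allocation(candidates, cpa_start, cpa_makespan, now, deadline):
    """Pick the cheapest allocation that starts no earlier than the CPA
    start time scaled to the interval [now, deadline].

    candidates: list of (procs, start, duration) triples, where `start` is
                the latest feasible start found for that allocation.
    Returns the chosen triple, or None if no candidate starts late enough.
    """
    fraction = cpa_start / cpa_makespan           # position in the CPA schedule
    target = now + fraction * (deadline - now)    # scaled CPA start time
    late_enough = [c for c in candidates if c[1] >= target]
    if not late_enough:
        return None
    # Assumed cost: CPU-hours = processors x duration.
    return min(late_enough, key=lambda c: c[0] * c[2])


# Hypothetical candidates for one task: more processors, later start, higher cost.
candidates = [(1, 4.0, 8.0), (2, 7.0, 5.0), (4, 8.5, 3.5)]
choice = rc_pick_allocation(candidates, cpa_start=5.0, cpa_makespan=10.0,
                            now=0.0, deadline=12.0)
```

Here the task starts halfway through the CPA schedule, so the target start is time 6. The 1-processor option starts too early, and the 2-processor option is the cheaper of the two that remain.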

Matching the CPA schedule
[Figure: two panels; axes: time, processors. “CPA Schedule”: the task runs on q procs, with time spans a and b marked. “Schedule with Reservation”: the task runs on p procs, with time spans c and d marked and the task “deadline” shown.]
- Pick the cheapest allocation such that: b / (a+b) > d / (c+d)

Simulation Experiments
- We call this new approach “resource conservative” (RC)
- We conduct simulations similar to those for the makespan minimization algorithms
- Issue: the RC approach can be in trouble when it tries to schedule the first tasks if the reservation schedule is non-stationary and/or tight
  - Could be addressed via some tunable parameter (e.g., pick an allocation that starts at least x% after the scaled CPA start time)
  - We do not use such a parameter in our results
- We use two metrics:
  - Tightest deadline achieved
    - Necessary because deadline tightness depends on the instance
    - Determined via binary search
  - CPU-hours consumption for a deadline that’s 50% later than the tightest deadline
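The binary search for the tightest deadline can be sketched as below. It assumes a feasibility test `meets_deadline(d)` that runs the scheduling heuristic for deadline `d` and reports success, and that this test is monotone (every deadline later than a feasible one is feasible), which a heuristic working around reservation holes does not strictly guarantee. The names, bounds, and tolerance are hypothetical.

```python
def tightest_deadline(meets_deadline, lo, hi, tol=1e-3):
    """Smallest deadline in [lo, hi] (within `tol`) that the heuristic meets.

    meets_deadline: callable deadline -> bool, assumed monotone.
    lo: a deadline known (or assumed) to be infeasible, e.g. "now".
    hi: a deadline loose enough to be feasible.
    """
    if not meets_deadline(hi):
        return None  # even the loosest deadline considered cannot be met
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if meets_deadline(mid):
            hi = mid  # feasible: try a tighter deadline
        else:
            lo = mid  # infeasible: loosen
    return hi


# Hypothetical stand-in for a scheduler run: feasible from 42.0 time units on.
result = tightest_deadline(lambda d: d >= 42.0, lo=0.0, hi=1000.0)
```

The value returned is always a deadline that was actually tested and met, so it can be used directly as the reference point for the second metric.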

Simulation Results

Algorithm   Tightest deadline (average degradation from best)   CPU-hours consumed for a loose deadline
            Reservation schedule                                 Reservation schedule
            sparse    medium    tight     Grid5K                 sparse   medium   tight   Grid5K
BD_ALL      178%      175%      188%      227%                   3556     3486     3768    2006
BD_CPAR     6.52%     6.44%     6.91%     8.38%                  231      236      243     179
RC_CPA      13.17%    13.27%    17.36%    19.51%                 6.39     6.80     7.98    2.15
RC_CPAR     4.12%     4.27%     8.26%     15.14%                 0.16     0.15     0.16    0.09

Conclusions
- Makespan minimization
  - Bounding task allocations based on the CPA schedule works well
- Meeting a deadline
  - Using the CPA schedule for determining task start times works well, at least when the reservation schedule isn’t too tight
    - Some tuning parameter may help for tight schedules
    - Or, one can use the same approach as for makespan minimization, but backwards
- In both cases, using the historical number of unreserved processors leads to marginal improvements

Possible Future Directions
- Use a recent one-step algorithm instead of CPA
  - iCASLB [Vydyanathan et al., 2006]
- Experiments in a real-world setting
  - What kind of interface should a batch scheduler expose if the full reservation schedule must remain hidden?
- Reservation schedule archive
  - Needs to be a community effort

Scheduling Mixed-Parallel Applications with Advance Reservations, Kento Aida and Henri Casanova, to appear in Proc. of HPDC 2008

Questions?
