Scheduling for MW applications. E. Heymann, M.A. Senar and E. Luque. Computer Architecture and Operating Systems Group, Universitat Autònoma de Barcelona, Spain
Contents Statement of the problem. Simulation framework. Simulation results. Implementation on MW. Future work.
Objective To develop and evaluate efficient scheduling policies for parallel applications that run in an opportunistic environment and follow this MW (master-worker) model:
    for i = 1 to M
        for j = 1 to NTasks do
            F(j)
        end
        [computation]
The expected execution time of F(j) is the same, up to some variance, in every iteration of the external loop.
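The program model above can be mimicked with a small generator of task execution times. This is only a sketch under stated assumptions: the slides do not say which distribution the variance follows, so a normal distribution whose standard deviation is a percentage of each task's mean is assumed here, and the names `sample_iteration_times` and `run_model` are hypothetical.

```python
import random

def sample_iteration_times(base_times, deviation_pct, rng):
    """Draw one execution time per task F(j) for a single outer-loop iteration.

    Assumption: times are normally distributed around the task's expected
    time, with a standard deviation given as a percentage of that mean.
    """
    times = []
    for mean in base_times:
        sigma = mean * deviation_pct / 100.0
        t = rng.gauss(mean, sigma)
        # Clamp to a small positive value so that no task has a negative duration.
        times.append(max(t, 0.01 * mean))
    return times

def run_model(base_times, n_iterations, deviation_pct, seed=0):
    """Return the sampled execution times for each of the M outer iterations."""
    rng = random.Random(seed)
    return [sample_iteration_times(base_times, deviation_pct, rng)
            for _ in range(n_iterations)]
```

With a deviation of 0 every iteration reproduces the expected times exactly; larger deviations make successive iterations differ while keeping the same expectation per task.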
Objective How many workers? How to assign tasks to workers? Efficiency without forgetting performance. How sensitive is efficiency to changes in variance? The aim is to optimize the use of processors to get the best efficiency while keeping the efficiency/execution time ratio in view.
Example Four tasks with execution times 8, 5, 4 and 1 on two workers. Assignment without care (worker loads 6 and 12): Efficiency = (6 + 12) / (12 + 12) = 3/4. LPTF policy (worker loads 9 and 9): Efficiency = (9 + 9) / (9 + 9) = 1.
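The arithmetic of this example can be checked with a short sketch. It assumes the usual list-scheduling reading of LPTF (tasks sorted by decreasing time, each given to the worker that becomes free first) and takes efficiency as total busy time divided by the number of workers times the makespan; the function names are illustrative only.

```python
import heapq

def lptf_loads(task_times, n_workers):
    """Assign tasks largest-first to the least-loaded worker; return worker loads."""
    heap = [(0.0, w) for w in range(n_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    loads = [0.0] * n_workers
    for t in sorted(task_times, reverse=True):
        load, w = heapq.heappop(heap)  # worker that becomes free first
        loads[w] = load + t
        heapq.heappush(heap, (loads[w], w))
    return loads

def efficiency(loads):
    """Busy time divided by (number of workers x makespan)."""
    makespan = max(loads)
    return sum(loads) / (len(loads) * makespan)

careless = [6.0, 12.0]               # worker loads from the first case of the example
balanced = lptf_loads([8, 5, 4, 1], 2)
print(efficiency(careless), efficiency(balanced))
```

For the four tasks of the example, the LPTF assignment gives both workers a load of 9, so no worker waits for the other.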
Platforms 1) Dedicated homogeneous machines (fixed number of machines). 2) Non-dedicated homogeneous machines (an opportunistic environment such as Condor). 3) Non-dedicated heterogeneous machines (an opportunistic environment such as Condor).
Policies Simulated LPTF (Largest Processing Time First); LPTF on Average; Random; Random and Average.
LPTF => implies sorting the tasks by execution time. Rand&Avg => implies computing the average of the previous execution times of each task i, and then sorting the tasks by that average.
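A minimal sketch of how a Random and Average ordering could work is given below. It assumes that tasks are ordered randomly while some task has no recorded execution time yet, and by decreasing average of the recorded times afterwards; the class and method names are hypothetical and are not part of MW.

```python
import random

class RandAvgPolicy:
    """Random order until every task has history, then largest average first."""

    def __init__(self, n_tasks, seed=0):
        self.history = [[] for _ in range(n_tasks)]  # past times per task
        self.rng = random.Random(seed)

    def record(self, task_id, exec_time):
        """Store the measured execution time of one task in one iteration."""
        self.history[task_id].append(exec_time)

    def order(self):
        """Return task ids in the order they should be handed to workers."""
        ids = list(range(len(self.history)))
        if any(not h for h in self.history):
            self.rng.shuffle(ids)  # no complete history yet: random order
            return ids
        avg = [sum(h) / len(h) for h in self.history]
        return sorted(ids, key=lambda i: avg[i], reverse=True)
```

Because only averages are kept, the cost per iteration is one sort of the task list, which matches the remark that the policy implies averaging and sorting.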
Simulation Response Variables: Efficiency; Execution Time.
Simulation. Factors. Variation [Figure: task execution times with a 20% deviation.]
Simulation. Factors. Workload
Simulation. Factors Processor Number: 31, 100, 300. Standard Deviation: 0, 10, 30, 60, 100. External Loop: 10, 35, 50, 100. Workload: 20%load: 30, 40, 50, 60, 70, 80, 90; 20%dist: equal, decreasing; 80%dist: equal, decreasing.
Simulation Results Dedicated homogeneous machines Efficiency: few big tasks and lots of machines => waste of resources. Loop: the number of external-loop iterations does not matter. Deviation: separates the curves when there are many big tasks. Few big tasks: nothing to be done.
Simulation Conclusions Dedicated homogeneous machines Rough analysis: Variance does not seem to make efficiency significantly worse. The number of external-loop iterations does not affect efficiency. To achieve an efficiency above 80% with an execution time below 1.1 times the LPTF execution time, the number of workers should range between 15% of the number of tasks (90% load) and 40% (30% load). Few big tasks and lots of machines => waste of resources. With 30% load, about 33% of the number of tasks is needed as workers; with 90% load, only 16% is needed.
Simulation Non-dedicated homogeneous machines Factors: Processor Number; Standard Deviation; External Loop (only 35); Workload; Probability of losing and getting machines; Checkpoint (always; never; only for "big" tasks that have been "a long time" in execution). The response variables are the same as in (A): efficiency and execution time. Checkpoint times: SAVE_OVERHEAD = Minute / 2, RESTORE_OVERHEAD = Minute / 2, CHECK_MACHINE_STATE = 2 * Minute (lose a machine with p = 15%, gain a machine with p = 15%).
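The availability part of this setup can be sketched as follows. The constants come from the slide; everything else is an assumption made for illustration: the slide does not say whether the 15% applies per machine or per check, so at most one machine is lost and at most one gained at each check, and the eviction cost model in `time_lost_on_eviction` is hypothetical.

```python
import random

MINUTE = 60.0                       # seconds
SAVE_OVERHEAD = MINUTE / 2
RESTORE_OVERHEAD = MINUTE / 2
CHECK_MACHINE_STATE = 2 * MINUTE

def simulate_machine_count(initial, max_machines, duration,
                           p_lose=0.15, p_gain=0.15, seed=0):
    """Return [(time, machine count)] sampled every CHECK_MACHINE_STATE seconds.

    Assumption: at each check at most one machine is lost (probability
    p_lose) and at most one is gained (probability p_gain).
    """
    rng = random.Random(seed)
    count = initial
    trace = [(0.0, count)]
    t = CHECK_MACHINE_STATE
    while t <= duration:
        if count > 0 and rng.random() < p_lose:
            count -= 1
        if count < max_machines and rng.random() < p_gain:
            count += 1
        trace.append((t, count))
        t += CHECK_MACHINE_STATE
    return trace

def time_lost_on_eviction(elapsed_work, use_checkpoint):
    """Assumed cost model: a checkpointed task pays save plus restore,
    a task without checkpoint loses all the work done so far."""
    if use_checkpoint:
        return SAVE_OVERHEAD + RESTORE_OVERHEAD
    return elapsed_work
```

Under this cost model, checkpointing pays off only once a task has run for longer than one minute, which is one way to read the policy of checkpointing only big tasks that have been running for a long time.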
Simulation Results Non-dedicated homogeneous machines Largest task = 857.143; smallest task = 21.538.
Simulation Results Non-dedicated homogeneous machines This is only an example; we have not yet run many executions and averaged the results. Probability of getting a machine = probability of losing a machine = 15%. Checkpoint save time = checkpoint restore time = 30 s. Dist30-100-1-1: larger task = 285, smaller tasks = 3.
Implementation on MW Support for the desired program model. Computation of the efficiency. Scheduling policies: Random; Random & Average. Efficiency: always measured with the master clock.
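One way to compute efficiency using only the master clock is sketched below. It is a standalone illustration, not the MW API: the class `EfficiencyTracker` and its methods are hypothetical, and all timestamps are assumed to be taken on the master when a task is sent, when its result arrives, and when a worker joins or leaves.

```python
class EfficiencyTracker:
    """Accumulates busy and available worker time from master-side timestamps."""

    def __init__(self):
        self.busy = 0.0        # total time workers spent on tasks
        self.available = 0.0   # total time workers were held by the master

    def task_done(self, sent_at, result_at):
        """Record one task, timed from send to result as seen by the master."""
        self.busy += result_at - sent_at

    def worker_interval(self, joined_at, left_at):
        """Record an interval during which a worker belonged to the master."""
        self.available += left_at - joined_at

    def efficiency(self):
        return self.busy / self.available if self.available > 0 else 0.0

tracker = EfficiencyTracker()
for joined, left in [(0, 12), (0, 12)]:            # two workers held for 12 units
    tracker.worker_interval(joined, left)
for sent, done in [(0, 5), (5, 6), (0, 8), (8, 12)]:
    tracker.task_done(sent, done)
print(tracker.efficiency())
```

Using a single clock avoids comparing timestamps from machines whose clocks are not synchronized, at the price of counting communication time as busy time.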
Implementation on MW [Diagram: iterations 1, 2, ..., M, with statistics (S) gathered at each iteration to compute the efficiency. Policy labels shown in the diagram: "Rand" and "Rand || Rand & Avg".]
Future Non-dedicated homogeneous machines: complete the simulations; duplication of large tasks (?). Non-dedicated heterogeneous machines: use dynamic load information (provided by Condor) to rank machines. Implementation on MW: test the MW scheduling policies with large applications. Safety labels on the machines would have to be provided by Condor, so Condor should be asked for information about machine safety. Completing the simulations includes running tests with different CKPT_OVERHEAD times, with more machines lost than gained, and with other probabilities of gaining and losing machines.