Scheduling for MW applications

E. Heymann, M.A. Senar and E. Luque
Computer Architecture and Operating Systems Group
Universitat Autònoma de Barcelona, Spain

Contents
Statement of the problem.
Simulation framework.
Simulation results.
Implementation on MW.
Future work.

Objective
To develop and evaluate efficient scheduling policies for parallel applications that follow this MW (master-worker) model:

    for i = 1 to M
        for j = 1 to NTasks do
            F(j)
        end
        [computation]
    end

The applications run in an opportunistic environment.
The expected execution time of F(j) is the same (with some variance) in every iteration of the external loop.

Objective
How many workers?
How to assign tasks to workers?
Efficiency without forgetting performance.
How sensitive is efficiency to changes in variance?
To optimize the use of processors so as to get the best efficiency, while taking the efficiency/execution time ratio into account.

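Example
[Figure: tasks with execution times 8, 5, 4 and 1 assigned to workers, first in a case labelled "Without preoccupation" and then with the LPTF policy. For the LPTF case the surviving labels read "9 + 9" and "Efficiency = 1"; the other efficiency value is not recoverable from the transcript.]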
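The example can be reproduced with a short sketch. It assumes that efficiency means total task time divided by (number of workers x finishing time of the busiest worker); the slides do not state the formula, so this definition and the names `lptf_schedule` and `efficiency` are illustrative only.

```python
import heapq


def lptf_schedule(task_times, n_workers):
    """Largest Processing Time First: hand out tasks in decreasing order
    of duration, always to the worker that becomes free earliest."""
    # Min-heap of (accumulated busy time, worker id).
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    for t in sorted(task_times, reverse=True):
        busy, w = heapq.heappop(heap)
        assignment[w].append(t)
        heapq.heappush(heap, (busy + t, w))
    return assignment


def efficiency(assignment):
    """Assumed definition: useful time / (workers * makespan).
    Expects at least one task with a positive duration."""
    loads = [sum(ts) for ts in assignment.values()]
    return sum(loads) / (len(loads) * max(loads))


two = lptf_schedule([8, 5, 4, 1], 2)    # worker loads 9 and 9
three = lptf_schedule([8, 5, 4, 1], 3)  # worker loads 8, 5 and 5
print(efficiency(two), efficiency(three))
```

With two workers both finish at time 9 and no processor time is idle; adding a third worker does not shorten the run (the task of length 8 still dominates) and only lowers efficiency, which is the trade-off the deck is concerned with.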

Platforms
1) Dedicated homogeneous machines: fixed number of machines.
2) Non-dedicated homogeneous machines: an opportunistic environment like Condor.
3) Non-dedicated heterogeneous machines: an opportunistic environment like Condor.

Policies Simulated
LPTF: Largest Processing Time First.
LPTF on Average.
Random.
Random and Average.

LPTF => implies sorting the tasks.
Rand&Avg => implies computing the average of the previous execution times of task i, and sorting.
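A minimal sketch of the bookkeeping that Rand&Avg implies follows. The slides only say that the policy needs a per-task average of previous execution times plus a sort; the class name `RandAvgOrder`, the choice of a random order while some task has no history, and the descending sort are assumptions made for illustration.

```python
import random


class RandAvgOrder:
    """Keeps a running average of each task's execution time and
    proposes the order in which tasks are handed to workers."""

    def __init__(self, n_tasks, seed=None):
        self.totals = [0.0] * n_tasks
        self.counts = [0] * n_tasks
        self.rng = random.Random(seed)

    def record(self, task, elapsed):
        # Called when a worker reports that `task` took `elapsed` time units.
        self.totals[task] += elapsed
        self.counts[task] += 1

    def average(self, task):
        return self.totals[task] / self.counts[task]

    def next_order(self):
        tasks = list(range(len(self.totals)))
        if not all(self.counts):
            # Assumption: no complete history yet, so fall back to random.
            self.rng.shuffle(tasks)
            return tasks
        # Largest average first, mirroring LPTF on estimated times.
        return sorted(tasks, key=self.average, reverse=True)


order = RandAvgOrder(3, seed=0)
first = order.next_order()          # random permutation of 0, 1, 2
for task, elapsed in [(0, 2.0), (1, 9.0), (2, 5.0)]:
    order.record(task, elapsed)
second = order.next_order()         # sorted by average: task 1, then 2, then 0
```

Keeping totals and counts rather than a list of samples makes each update constant-time; the sort is paid once per iteration of the external loop.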

Simulation Response Variables: efficiency and execution time.

Simulation. Factors. Variation
[Figure: example of variation in task execution times with a 20% deviation; the individual values are not recoverable from the transcript.]

Simulation. Factors. Workload

Simulation. Factors
Processor Number: 31, 100, 300
Standard Deviation: [first value missing in the transcript], 10, 30, 60, 100
External Loop: [first value missing in the transcript], 35, 50, 100
Workload:
%load: 30, 40, 50, 60, 70, 80, 90
20%dist: equal, decreasing
80%dist: equal, decreasing

Simulation Results Dedicated homogeneous machines
Efficiency: few big tasks and lots of machines => waste of resources.
Loop: does not matter.
Deviation: separates the curves when there are many big tasks.
Few big tasks: nothing can be done.

Simulation Conclusions Dedicated homogeneous machines
Rough analysis:
Variance does not seem to make efficiency significantly worse.
The external loop does not affect efficiency.
To achieve an efficiency > 80% and an execution time below 1.1 times the LPTF execution time, the number of workers should range between 15% of the number of tasks (90% load) and 40% (30% load).
Few big tasks and lots of machines => waste of resources.
With 30% load, about 33% of the number of tasks is needed. With 90% load, only 16% is needed.

Simulation Non-dedicated homogeneous machines
Factors:
Processor number.
Standard deviation.
External loop: only 35.
Workload.
Probability of losing and getting machines.
Checkpoint: always; never; only for "big" tasks that have been executing for "a long time".
The response variables are the same as in (A): efficiency and execution time.
Checkpoint times:
SAVE_OVERHEAD = Minute / 2
RESTORE_OVERHEAD = Minute / 2
CHECK_MACHINE_STATE = 2 * Minute (lose a machine with p = 15%, gain a machine with p = 15%)
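The machine-availability part of this setup can be sketched as below. The constants come from the slide; everything else is an assumption made for illustration: the time unit (seconds), that at most one machine is lost and at most one gained per check, the `max_machines` cap, and the cost model in which every interruption of a checkpointed task costs one save plus one restore. The function names are hypothetical and are not part of MW or Condor.

```python
import random

MINUTE = 60.0                      # assumed unit: seconds
SAVE_OVERHEAD = MINUTE / 2
RESTORE_OVERHEAD = MINUTE / 2
CHECK_MACHINE_STATE = 2 * MINUTE   # interval between availability checks
P_LOSE = 0.15
P_GAIN = 0.15


def machine_trace(initial, horizon, max_machines, seed=None):
    """Return [(time, machines available)] sampled at every check."""
    rng = random.Random(seed)
    n, t, trace = initial, 0.0, []
    while t <= horizon:
        trace.append((t, n))
        # Assumption: one independent lose draw and one gain draw per check.
        if n > 0 and rng.random() < P_LOSE:
            n -= 1
        if n < max_machines and rng.random() < P_GAIN:
            n += 1
        t += CHECK_MACHINE_STATE
    return trace


def checkpointed_cost(work_time, interruptions):
    """Assumed cost model: each interruption adds one save and one restore."""
    return work_time + interruptions * (SAVE_OVERHEAD + RESTORE_OVERHEAD)


trace = machine_trace(initial=10, horizon=20 * MINUTE, max_machines=20, seed=1)
print(len(trace), checkpointed_cost(10 * MINUTE, interruptions=2))
```

A per-machine loss probability would be an equally plausible reading of the slide; the slide does not say which one the simulator used.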

Simulation Results Non-dedicated homogeneous machines
[Figure: results plot; the values for "Biggest task" and "Smallest task" are missing in the transcript.]

Simulation Results Non-dedicated homogeneous machines
This is only an example; we have not yet run many executions and averaged them.
Probability of getting a machine = probability of losing a machine = 15%.
Checkpoint time to save = checkpoint time to restore = 30 seconds.
Distribution: biggest task = 285, smallest tasks = 3.

Implementation on MW
Support for the desired program model.
Computation of the efficiency: always with the master clock.
Scheduling policies: Random; Random & Average.

Implementation on MW
[Figure: iterations 1, 2, ..., M, each with a statistics step (S) used to compute the efficiency. Policy labels: "Rand" and "Rand || Rand & Avg".]

Future
Non-dedicated homogeneous machines:
Complete the simulations.
Duplication of large tasks (?).
Non-dedicated heterogeneous machines:
Use dynamic load information (provided by Condor) to rank machines.
Implementation on MW:
Test the MW scheduling policies with large applications.
Notes:
Safety labels on the machines would have to be provided by Condor; ask Condor for information about machine safety.
Completing the simulations includes running tests with different CKPT_OVERHEAD times, with more machines being lost than gained, and with other probabilities of gaining and losing machines.
