Scheduling for a Climate Forecast Application
Andreea Chis, under the guidance of Frédéric Desprez and Eddy Caron
ANR-05-CIGC-11
Contents
1 Introduction
2 Related Works
3 Scheduling Heuristics
4 Simulation Results
5 Conclusions and Future Works
General Purpose
- Context: global warming and climate fluctuations
- Numerical simulations using general circulation models of a climate system:
  - atmosphere
  - ocean
  - continental surfaces
- Climatologists' purpose: estimate the sensitivity of global warming simulations with respect to the model's parameterization
- Climate forecast application provided by CERFACS within the LEGO project
Our Goal
- Analyze the application
- Model its needs:
  - execution model
  - data access pattern
  - computing needs
- Elaborate, test and compare appropriate scheduling heuristics
- Provide generic scheduling schemes for applications with similar dependence graphs
Application Description
- "Scenario" simulations: current climate followed by the 21st century, covering 150 years (1800 months)
- Different parameterizations of the atmospheric model
Application Description
One monthly simulation:
- concatenate_atmospheric_input_files (1)
- modify_parameters (1)
- process_coupled_run, which couples:
  - atmospheric model (ARPEGE)
  - ocean and sea-ice model (OPA)
  - runoff pathway (TRIP)
  - coupler (OASIS)
- convert_output_format (60)
- compress_diagonals (30)
- extract_minimun_information (30)
Related Works
- Multiple DAGs scheduling
- Mixed parallelism
- Pipelined data-parallel tasks
Multiple DAGs Scheduling
- Directed Acyclic Graph (DAG):
  - nodes: tasks
  - edges: precedence constraints
Multiple DAGs Scheduling
- Composite DAG
Multiple DAGs Scheduling
- Group the DAGs' tasks into levels of independent tasks
Multiple DAGs Scheduling
- Composite DAG with a round-robin scheduling policy among the DAGs
- Composite DAG with ranking-based composition
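The two ideas on the preceding slides (composite DAG, levels of independent tasks) can be sketched in a few lines. This is only an illustration under assumptions: the dictionary representation of a DAG, the function names `compose` and `levels`, and the `entry`/`exit` node names are all hypothetical and do not come from the works cited here.

```python
from collections import defaultdict


def compose(dags):
    """Merge several DAGs into one composite DAG.

    Each DAG is a dict mapping a task name to the list of its successors
    (every task must appear as a key). Task names are prefixed with the
    DAG index so that they stay unique in the composite graph.
    """
    composite = {"entry": [], "exit": []}
    for k, dag in enumerate(dags):
        has_pred = {s for succs in dag.values() for s in succs}
        for task, succs in dag.items():
            name = f"d{k}.{task}"
            composite[name] = [f"d{k}.{s}" for s in succs]
            if task not in has_pred:      # entry task of this DAG
                composite["entry"].append(name)
            if not succs:                 # exit task of this DAG
                composite[name].append("exit")
    return composite


def levels(dag):
    """Group tasks into levels; tasks within one level are independent.

    The level of a task is the length of the longest path reaching it.
    """
    indeg = defaultdict(int)
    for succs in dag.values():
        for s in succs:
            indeg[s] += 1
    level = {t: 0 for t in dag if indeg[t] == 0}
    frontier = list(level)
    while frontier:                       # Kahn-style topological sweep
        t = frontier.pop()
        for s in dag[t]:
            level[s] = max(level.get(s, 0), level[t] + 1)
            indeg[s] -= 1
            if indeg[s] == 0:
                frontier.append(s)
    grouped = defaultdict(list)
    for t, lvl in level.items():
        grouped[lvl].append(t)
    return [sorted(grouped[lvl]) for lvl in sorted(grouped)]


# Two small hypothetical DAGs: a chain and a diamond.
chain = {"a": ["b"], "b": []}
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
composite = compose([chain, diamond])
for i, tasks in enumerate(levels(composite)):
    print(i, tasks)
```

A scheduler could then treat each level as a batch of independent tasks, or apply a single-DAG heuristic to the composite graph.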
Mixed Parallelism
- Parallel scientific applications:
  - data parallelism
  - task parallelism
  - mixed parallelism
- Scheduling a DAG on a finite number of resources is NP-complete, even for the simple case of mono-processor tasks
- Heuristic approaches
Mixed Parallelism
- A. Radulescu & A. van Gemund (2001): CPA (Critical Path and Area-based Scheduling), a 2-step heuristic
  - Step 1, processor allocation to tasks: based on a compromise between the critical path length and the processor utilization
  - Step 2, task allocation on processors: list scheduling heuristic
Pipelined Data Parallel Tasks
- Computations consisting of a chain of data-parallel tasks that process successive data sets in a pipeline fashion: a particular case of mixed parallelism
- 2 key metrics to be optimized:
  - Latency: the time needed to process one data set
  - Throughput: the rate at which data sets can be processed
Pipelined Data Parallel Tasks
Aspects to be considered:
- Clustering of successive stages into modules
  - reduces communications
  - improves latency
- Replicating modules
  - improves throughput
  - increases latency
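The replication trade-off can be illustrated with a toy model. Everything in it is an assumption made for illustration: a module owns a fixed number of processors, replication splits them evenly among the replicas, and the time of one replica follows an Amdahl-style law with a non-parallelizable fraction `alpha`. The function names are hypothetical.

```python
def module_time(t1, alpha, procs):
    """Assumed time model: t1 is the time on one processor,
    alpha the fraction of the work that does not parallelize."""
    return t1 * (alpha + (1.0 - alpha) / procs)


def replicate(t1, alpha, procs, replicas):
    """Latency and throughput of one module split into `replicas` copies.

    Each replica gets procs // replicas processors and handles every
    `replicas`-th data set, so the copies work on different data sets
    at the same time.
    """
    per_replica = procs // replicas
    if per_replica < 1:
        raise ValueError("not enough processors for that many replicas")
    latency = module_time(t1, alpha, per_replica)   # time for one data set
    throughput = replicas / latency                 # data sets per time unit
    return latency, throughput


# Hypothetical module: 100 time units on one processor, 20% sequential.
for r in (1, 2, 4):
    lat, thr = replicate(t1=100.0, alpha=0.2, procs=8, replicas=r)
    print(f"replicas={r}: latency={lat:.1f}, throughput={thr:.4f}")
```

In this model each replica runs on fewer processors, so a single data set takes longer, while more data sets are in flight at once.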
Scheduling Heuristics
- Climate application scheduling
- Generic scheduling heuristics
Climate Application Scheduling
- Homogeneous platform composed of R resources
- Communication goes through NFS and is assumed contention-free
- A task's execution time is assumed to include the time needed to:
  - access the data
  - redistribute it to processors
  - perform the actual computation
  - store the data back
Climate Application Scheduling
- Main processing:
  - concatenate_atmospheric_input_files (1)
  - modify_parameters (1)
  - process_coupled_run
- Post processing:
  - convert_output_format (60)
  - compress_diagonals (30)
  - extract_minimun_information (30)
Climate Application Scheduling
- We divide the processors into disjoint sets on which multi-processor tasks can execute
- All multi-processor tasks execute on the same number of resources G, which defines a grouping of the resources
- For the given application, the parameter G has 8 possible values (4 to 11)
Climate Application Scheduling
- Case 1 and Case 2 (figures)
Climate Application Scheduling
- The makespan is computed analytically as a function of:
  - the number of resources R
  - the grouping G
  - the number of months in an independent simulation (NM)
  - the number of independent simulations (NS)
- The grouping G yielding the smallest makespan is chosen
Climate Application Scheduling
- Scheduling all multi-processor tasks on the same number of resources is a restrictive constraint
- Example: for R = 53, NS = 10, NM = 1800, the optimal grouping found is G = 7:
  - 49 resources for main processing
  - 1 resource for the corresponding post-processing
  - 3 resources unused
- However, using 3 groups of 8 resources and 4 groups of 7 resources gives a 4.5% gain
Climate Application Scheduling
Possibilities for improvement:
- Heuristic 1
  - distribute the unused resources evenly among the existing groups
- Heuristic 2
  - use all resources for multi-processor tasks (distributing the extra resources evenly among the processor groups)
  - run all post-processing at the end
- Heuristic 3
  - use all resources for multi-processor tasks and model the problem as an instance of the knapsack problem
  - run all post-processing at the end
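The resource accounting of the R = 53 example and the even redistribution of Heuristic 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the single post-processing resource is taken from the example rather than derived, and the upper bound of 11 resources per group is assumed to apply to redistributed groups as well.

```python
def uniform_grouping(R, G, NS, post=1):
    """Base scheme: at most NS groups of G resources each.

    `post` resources are set aside for post-processing (assumption:
    the value 1 is simply the one used in the R = 53 example).
    Returns the list of group sizes and the number of unused resources.
    """
    groups = min(NS, (R - post) // G)
    unused = R - post - groups * G
    return [G] * groups, unused


def spread_unused(groups, unused, g_max=11):
    """Heuristic 1 sketch: hand out the unused resources one at a time,
    round-robin over the groups, never exceeding g_max per group."""
    groups = list(groups)
    i = 0
    while unused > 0 and any(g < g_max for g in groups):
        if groups[i] < g_max:
            groups[i] += 1
            unused -= 1
        i = (i + 1) % len(groups)
    return groups, unused


base, left = uniform_grouping(R=53, G=7, NS=10)
print(base, left)            # seven groups of 7, three resources left over
better, left = spread_unused(base, left)
print(better, left)          # three groups of 8 and four groups of 7
```

With the example values, the sketch reproduces the allocation quoted on the previous slide: 3 groups of 8 resources and 4 groups of 7.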
Climate Application Scheduling
Knapsack problem model:
- Items: the 8 possibilities (groupings of resources) for allocating processors to multi-processor tasks (4 to 11)
- Cost of an item: the number of resources of that grouping
- Value of a grouping G: 1/T[G], the fraction of a multi-processor task that gets executed in one time unit on G resources
- Unknowns: n_i (i = 4 to 11), the number of groups with i resources in the final solution
- Constraints: the groups in the solution must fit within the R available resources
- Goal: maximize the total value of the groups in the solution
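One plausible reading of this model is an unbounded knapsack: maximize the sum of n_i / T[i] subject to the sum of i * n_i being at most R. The sketch below solves that reading by dynamic programming. Several points are assumptions: the optional cap of NS groups, the function name `best_grouping`, and the timing table `T` used in the usage example, which is made up and does not come from the application.

```python
def best_grouping(R, T, NS=None):
    """Choose how many groups of each size to create.

    R  : number of available resources (knapsack capacity)
    T  : dict mapping a group size g to T[g], the time of one
         multi-processor task on g resources
    NS : optional cap on the number of groups (assumption: at most
         one group per independent simulation)
    Returns ({size: count}, total value).
    """
    sizes = sorted(T)
    max_groups = R // sizes[0] if NS is None else NS
    # value[k][r]: best total value with at most k groups on at most r resources
    value = [[0.0] * (R + 1) for _ in range(max_groups + 1)]
    pick = [[None] * (R + 1) for _ in range(max_groups + 1)]
    for k in range(1, max_groups + 1):
        for r in range(R + 1):
            value[k][r] = value[k - 1][r]          # option: no k-th group
            for g in sizes:
                if g > r:
                    break
                cand = value[k - 1][r - g] + 1.0 / T[g]
                if cand > value[k][r]:
                    value[k][r] = cand
                    pick[k][r] = g
    # Walk the recorded choices back to recover the counts n_g.
    counts, k, r = {}, max_groups, R
    while k > 0:
        g = pick[k][r]
        if g is not None:
            counts[g] = counts.get(g, 0) + 1
            r -= g
        k -= 1
    return counts, value[max_groups][R]


# Made-up timing table for group sizes 4 to 11 (illustration only).
T = {g: 1000.0 * (0.1 + 0.9 / g) for g in range(4, 12)}
counts, total = best_grouping(R=53, T=T, NS=10)
print(counts, round(total, 5))
```

The dynamic program is exact for this formulation; since there are only 8 item types, it stays cheap even for large R.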
Generic Scheduling Heuristics
- We propose generic scheduling heuristics for a class of applications consisting of independent identical chains of identical DAGs
Generic Scheduling Heuristics
First approach:
- Create a composite DAG: link all entry nodes to a common entry node and all exit nodes to a common exit node
- Apply mixed-parallelism scheduling heuristics to the composite DAG
- CPA:
  - reduced complexity, O(V(V+E)R)
  - drawback of being a 2-step algorithm
Generic Scheduling Heuristics
Second approach:
- Exploit the knowledge of the specific structure of the application
- Exploit the pipelined structure of the application
- Separate the independent pre- and post-processing tasks and schedule them with algorithms for independent malleable tasks (5/4 approximation in constant time)
Generic Scheduling Heuristics
- Heuristic 1
  - Schedule all pre-processing tasks at the beginning
  - Schedule inter- and main-processing tasks as an interval on the same number of resources
  - Schedule all post-processing tasks at the end
- Heuristic 2
  - Schedule all pre-processing tasks at the beginning
  - Schedule inter- and main-processing tasks separately, as a pipeline
  - Schedule all post-processing tasks at the end
Generic Scheduling Heuristics
- Heuristic 3
  - Schedule inter- and main-processing tasks as an interval pipeline on the same number of resources
  - Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources unused by the pipeline
  - Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively
Generic Scheduling Heuristics
- Heuristic 4
  - Schedule inter- and main-processing tasks separately, as a pipeline
  - Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources unused by the pipeline
  - Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively
Simulation Results
- Behavior of the 4 heuristics tested against CPA applied to the composite DAG
- Task execution times modeled by Amdahl's law
- Several configurations tested
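The formula on this slide did not survive, so the sketch below uses the usual statement of Amdahl's law, T(p) = T1 * (α + (1 - α) / p), where α is read as the fraction of a task that cannot be parallelized. That reading is an assumption, although it agrees with the configuration that sets α = 1.0 for a task meant not to benefit from extra processors. The function name `amdahl_time` is hypothetical.

```python
def amdahl_time(t1, alpha, p):
    """Execution time of a task on p processors under Amdahl's law.

    t1    : execution time on one processor
    alpha : fraction of the task that does not parallelize (assumed
            convention; 0 = perfectly parallel, 1 = fully sequential)
    p     : number of processors allotted to the task
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return t1 * (alpha + (1.0 - alpha) / p)


# Values in the style of the configurations: T1 = 500, alpha = 0.1.
for p in (1, 4, 8, 11):
    print(p, round(amdahl_time(500.0, 0.1, p), 2))

# With alpha = 1.0 extra processors bring no speed-up at all.
print(amdahl_time(500.0, 1.0, 8))
```

Under this model the speed-up is bounded by 1/α, which is why the value of α for inter-processing tasks is the parameter varied across the configurations.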
Simulation Results
Configuration 1:
- The execution time on 1 processor is identical for all tasks (500)
- The coefficient α is identical for all tasks (0.1)
Simulation Results
Configuration 2:
- Same as Configuration 1, except α(inter-processing) = 0.8
Simulation Results
Configuration 3:
- T1(pre-processing) = T1(post-processing) = 50
- T1(main-processing) = T1(inter-processing) = 500
- α = 0.1, except α(inter-processing) = 0.6
Simulation Results
Configuration 4:
- T1(pre-processing) = T1(post-processing) = 50
- T1(main-processing) = T1(inter-processing) = 500
- α = 0.1, except α(inter-processing) = 1.0
Conclusions
- We found a model for the given real application
- We proposed a basic heuristic for this model and 3 improved versions
- We proposed 4 pipeline-based heuristics for the generalized problem and compared them with the approach of applying a mixed-parallelism algorithm to the composite DAG of the application
Future Works
- Enhance the heuristics by taking into account a more precise communication model
- Run real experiments on Grid'5000 in order to validate the theoretical results
- Analyze other applications using a similar approach, with the long-term goal of deriving application-dependent scheduling schemes that could eventually be implemented as DIET plug-in schedulers