1 Andreea Chis under the guidance of Frédéric Desprez and Eddy Caron Scheduling for a Climate Forecast Application ANR-05-CIGC-11.


1 Andreea Chis, under the guidance of Frédéric Desprez and Eddy Caron: Scheduling for a Climate Forecast Application (ANR-05-CIGC-11)

2 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Simulation Results, 5 Conclusions and Future Works

3 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Experimental Results, 5 Conclusions and Future Works

4 General Purpose
- Context: global warming and climate fluctuations
- Numerical simulations using general circulation models of a climate system
  - atmosphere
  - ocean
  - continental surfaces
- Climatologists' purpose
  - estimate the sensitivity of global warming simulations with respect to the model's parameterization
- Climate forecast application provided by CERFACS within the LEGO project

5 Our Goal
- Analyze the application
- Model its needs
  - Execution model
  - Data access pattern
  - Computing needs
- Elaborate, test and compare appropriate scheduling heuristics
- Provide generic scheduling schemes for applications with similar dependence graphs

6 Application Description
- "Scenario" simulations
  - current climate followed by the 21st century, for 150 years (1800 months)
  - different parameterizations of the atmospheric model

7 Application Description
- One monthly simulation: concatenate_atmospheric_input_files (1), modify_parameters (1), process_coupled_run, convert_output_format (60), compress_diagonals (30), extract_minimun_information (30)
- atmospheric model (ARPEGE)
- ocean and sea-ice model (OPA)
- runoff pathway (TRIP)
- coupler (OASIS)

8 Application Description

9 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Experimental Results, 5 Conclusions

10 Related Works
- Multiple DAGs Scheduling
- Mixed Parallelism
- Pipelined Data Parallel Tasks

11 Multiple DAGs Scheduling
- Directed Acyclic Graph (DAG)
  - Nodes: tasks
  - Edges: precedence constraints
- Multiple DAGs Scheduling

12 Multiple DAGs Scheduling
- Composite DAG

13 Multiple DAGs Scheduling
- Group the DAGs' tasks into levels of independent tasks
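The level grouping on this slide can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function name `group_into_levels`, the task names, and the rule "a task's level is one more than the deepest of its predecessors" are assumptions chosen for the example.

```python
from collections import defaultdict


def group_into_levels(tasks, edges):
    """Split the tasks of one or several DAGs into levels of independent tasks.

    tasks: list of task names (unique across all DAGs, an assumption).
    edges: list of (u, v) pairs meaning "u must finish before v starts".
    """
    preds = defaultdict(set)
    succs = defaultdict(set)
    for u, v in edges:
        preds[v].add(u)
        succs[u].add(v)

    # Number of predecessors not yet placed in a level.
    remaining = {t: len(preds[t]) for t in tasks}
    level = sorted(t for t in tasks if remaining[t] == 0)
    levels = []
    while level:
        levels.append(level)
        released = []
        for u in level:
            for v in succs[u]:
                remaining[v] -= 1
                if remaining[v] == 0:
                    released.append(v)
        level = sorted(released)

    if sum(len(lv) for lv in levels) != len(tasks):
        raise ValueError("the precedence graph contains a cycle")
    return levels


if __name__ == "__main__":
    # Two small hypothetical DAGs: a diamond and a chain.
    tasks = ["a1", "b1", "c1", "d1", "a2", "b2", "c2"]
    edges = [("a1", "b1"), ("a1", "c1"), ("b1", "d1"), ("c1", "d1"),
             ("a2", "b2"), ("b2", "c2")]
    for i, lv in enumerate(group_into_levels(tasks, edges)):
        print(i, lv)
```

Two tasks in the same level can never depend on each other, because a successor is only released after the level of its last predecessor has been closed.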

14 Related Works: Multiple DAGs Scheduling
- Composite DAG with a round-robin scheduling policy among the DAGs
- Composite DAG with ranking-based composition

15 Mixed Parallelism
- Parallel scientific application
  - Data parallelism
  - Task parallelism
  - Mixed parallelism
- Scheduling a DAG on a finite number of resources is NP-complete, even in the simple case of mono-processor tasks
- Heuristic approaches

16 Mixed Parallelism
- A. Radulescu & A. van Gemund (2001): two-step heuristic, CPA (Critical Path and Area-based Scheduling)
  - Processor allocation to tasks: based on a compromise between the critical path length and the processor utilization
  - Task allocation on processors: list scheduling heuristic

17 Pipelined Data Parallel Tasks
- Computations consisting of a chain of data-parallel tasks that process successive data sets in a pipeline fashion; a particular case of mixed parallelism
- 2 key metrics to be optimized:
  - Latency: the time needed to process one data set
  - Throughput: the rate at which data sets can be processed

18 Related Works: Pipelined Data Parallel Tasks
- Aspects to be considered:
  - Clustering of successive stages into modules
    - Reduces communications
    - Improves latency
  - Replicating modules
    - Improves throughput
    - Increases latency
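The replication trade-off listed above can be illustrated with a small sketch. Everything in it is an assumption made for the example: the function `pipeline_metrics`, the rule that a module's processors are split evenly among its replicas, and the execution-time tables, which are made-up numbers rather than measurements. Communication costs are ignored, so the effect of clustering is not modeled.

```python
def pipeline_metrics(modules):
    """Return (latency, throughput) of a linear pipeline.

    modules: list of (time_on, procs, replicas) where
      time_on  maps a processor count to the time one data set takes on it,
      procs    is the total number of processors given to the module,
      replicas is the number of copies of the module; each copy gets
               procs // replicas processors and handles every replicas-th
               data set (assumption of this sketch).
    """
    if not modules:
        raise ValueError("at least one module is required")
    latency = 0.0
    period = 0.0  # time between two consecutive results of the slowest module
    for time_on, procs, replicas in modules:
        if replicas < 1 or procs % replicas != 0:
            raise ValueError("procs must be a positive multiple of replicas")
        t = time_on[procs // replicas]
        latency += t                       # one data set crosses every module
        period = max(period, t / replicas)  # replicas work on different data sets
    return latency, 1.0 / period


if __name__ == "__main__":
    # Hypothetical execution times (processor count -> time per data set).
    stage_a = {4: 32.5, 2: 55.0}
    stage_b = {2: 20.0}
    plain = pipeline_metrics([(stage_a, 4, 1), (stage_b, 2, 1)])
    replicated = pipeline_metrics([(stage_a, 4, 2), (stage_b, 2, 1)])
    print("no replication:", plain)
    print("stage_a replicated twice:", replicated)
```

With these numbers, replicating the first module raises the throughput and also raises the latency, since each replica runs on fewer processors.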

19 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Experimental Results, 5 Conclusions

20 Scheduling Heuristics
- Climate Application Scheduling
- Generic Scheduling Heuristics

21 Climate Application Scheduling
- Homogeneous platform composed of R resources
- Communication through NFS, assumed contention-free
- A task's execution time is assumed to include the time needed to
  - access the data
  - redistribute it to the processors
  - perform the actual computation
  - store the data back

22 Climate Application Scheduling
- Tasks: concatenate_atmospheric_input_files (1), modify_parameters (1), process_coupled_run, convert_output_format (60), compress_diagonals (30), extract_minimun_information (30)
- Figure labels: Main processing, Post processing

23 Climate Application Scheduling
- We divide the processors into disjoint sets on which multi-processor tasks can execute
- All multi-processor tasks execute on the same number of resources G, which defines a grouping of the resources
- For the given application, there are 8 possible values for the parameter G (4 to 11)

24 Climate Application Scheduling
- Case 1
- Case 2

25 Climate Application Scheduling
- The makespan is computed analytically as a function of
  - the number of resources (R)
  - the grouping (G)
  - the number of months in an independent simulation (NM)
  - the number of independent simulations (NS)
- The grouping G yielding the smallest makespan is chosen

26 Climate Application Scheduling
- The constraint of scheduling all multi-processor tasks on the same number of resources is restrictive
- E.g. for R = 53, NS = 10, NM = 1800, the optimal grouping found is G = 7:
  - 49 resources for main processing
  - 1 resource for the corresponding post-processing
  - 3 resources unused
- However, using 3 groups of 8 resources and 4 groups of 7 resources yields a gain of 4.5%

27 Climate Application Scheduling
- Possibilities for improvement:
  - Heuristic 1
    - distribute the unused resources evenly among the existing groups
  - Heuristic 2
    - use all resources for multi-processor tasks (evenly distributing the extra resources among the processor groups)
    - all post-processing at the end
  - Heuristic 3
    - use all resources for multi-processor tasks and model the problem as an instance of the knapsack problem
    - all post-processing at the end
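Heuristic 1 can be sketched in a few lines. The function name `distribute_unused`, its parameters, and the round-robin order are assumptions made for illustration; the cap of 11 resources per group is taken from the range of G given in these slides (4 to 11). With the R = 53, G = 7 example and one resource kept for post-processing, the sketch gives 3 groups of 8 and 4 groups of 7.

```python
def distribute_unused(R, G, post, max_group=11):
    """Spread the resources left over by a uniform grouping G over the groups.

    R:         total number of resources
    G:         size of every group in the uniform grouping
    post:      resources reserved for post-processing (taken as given here)
    max_group: largest group size considered useful (assumption: 11)
    Returns (group sizes, resources still unused).
    """
    n_groups = (R - post) // G
    sizes = [G] * n_groups
    spare = R - post - G * n_groups

    # Hand out the spare resources one at a time, round-robin over the groups.
    i = 0
    while spare > 0 and any(s < max_group for s in sizes):
        if sizes[i] < max_group:
            sizes[i] += 1
            spare -= 1
        i = (i + 1) % n_groups
    return sizes, spare


if __name__ == "__main__":
    sizes, spare = distribute_unused(R=53, G=7, post=1)
    print(sizes, spare)
```

The sketch does not decide how many resources post-processing needs, and it does not cap the number of groups by the number of simulations; both would have to come from the analytical makespan model.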

28 Climate Application Scheduling
- Modeling as a knapsack problem
  - Items: the 8 possibilities (groupings of resources) for allocating processors to multi-processor tasks (4 to 11)
  - Cost of an item: the number of resources of that grouping
  - Value of a grouping G: 1/T[G], the fraction of a multi-processor task that gets executed in a time unit on G resources
  - Unknowns n_i (i = 4 to 11): the number of groups with i resources in the final solution
  - Constraints: [formulas missing from the transcript]
  - Goal: maximize [expression missing from the transcript]
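Read as a standard knapsack, the definitions above suggest maximizing the sum of n_i / T[i] subject to the sum of i * n_i not exceeding R. The sketch below solves that formulation by dynamic programming. It is a reconstruction under assumptions, not the authors' formulation: the extra bound `max_groups` on the number of groups (for instance the number of independent simulations), the function name, and the T values in the example are all hypothetical.

```python
def best_group_mix(R, T, max_groups):
    """Choose how many groups of each size to create.

    R:          number of resources available for multi-processor tasks
    T:          dict mapping a group size i to the task time T[i] on i resources
    max_groups: upper bound on the total number of groups (assumption)
    Returns (total value, {group size: n_i}) maximizing sum(n_i / T[i])
    subject to sum(i * n_i) <= R and sum(n_i) <= max_groups.
    """
    # best[k][c]: best value with at most k groups and at most c resources.
    best = [[0.0] * (R + 1) for _ in range(max_groups + 1)]
    choice = [[None] * (R + 1) for _ in range(max_groups + 1)]
    for k in range(1, max_groups + 1):
        for c in range(R + 1):
            best[k][c] = best[k - 1][c]  # option: do not add a k-th group
            for size, time in T.items():
                if size <= c:
                    value = best[k - 1][c - size] + 1.0 / time
                    if value > best[k][c]:
                        best[k][c] = value
                        choice[k][c] = size

    # Walk back through the recorded choices to recover the n_i.
    counts = {size: 0 for size in T}
    c = R
    for k in range(max_groups, 0, -1):
        size = choice[k][c]
        if size is not None:
            counts[size] += 1
            c -= size
    return best[max_groups][R], counts


if __name__ == "__main__":
    # Hypothetical task times, NOT measurements of the climate application.
    T = {4: 10.0, 5: 9.0}
    print(best_group_mix(R=9, T=T, max_groups=2))
```

The table has (max_groups + 1) * (R + 1) entries and each entry tries every group size, so the cost stays small for the 8 group sizes and the resource counts discussed in these slides.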

29 Climate Application Scheduling

30 Generic Scheduling Heuristics
- We propose generic scheduling heuristics for a class of applications consisting of independent identical chains of identical DAGs

31 Generic Scheduling Heuristics
- First approach
  - Create a composite DAG: link all entry nodes to a common entry node and all exit nodes to a common exit node
  - Apply mixed-parallelism scheduling heuristics on the composite DAG
    - CPA
      - reduced complexity (O(V(V+E)R))
      - drawback of being a 2-step algorithm
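Building the composite DAG described above can be sketched as follows. The function name `make_composite`, the node names "ENTRY" and "EXIT", and the requirement that task names be unique across DAGs are assumptions made for this illustration.

```python
def make_composite(dags, entry="ENTRY", exit_="EXIT"):
    """Merge several DAGs into one by adding a common entry and exit node.

    dags: list of (tasks, edges) pairs; task names must be unique across
          all DAGs (assumption of this sketch).
    Returns (tasks, edges) of the composite DAG.
    """
    all_tasks = [entry]
    all_edges = []
    for tasks, edges in dags:
        has_pred = {v for _, v in edges}
        has_succ = {u for u, _ in edges}
        all_tasks.extend(tasks)
        all_edges.extend(edges)
        # Entry nodes of this DAG have no predecessor; exit nodes no successor.
        all_edges.extend((entry, t) for t in tasks if t not in has_pred)
        all_edges.extend((t, exit_) for t in tasks if t not in has_succ)
    all_tasks.append(exit_)
    return all_tasks, all_edges


if __name__ == "__main__":
    fork = (["a", "b", "c"], [("a", "b"), ("a", "c")])
    chain = (["x", "y"], [("x", "y")])
    tasks, edges = make_composite([fork, chain])
    print(tasks)
    print(edges)
```

The two added nodes would be given zero execution time, so that a single-DAG heuristic such as CPA can be applied to the result without changing the makespan.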

32 Generic Scheduling Heuristics
- Second approach
  - Exploit the knowledge of the specific structure of the application
    - Exploit the pipelined structure of the application
    - Separate the independent pre- and post-processing tasks and schedule them with algorithms for independent malleable tasks (5/4 approximation in constant time)

33 Generic Scheduling Heuristics

34 Generic Scheduling Heuristics

35 Generic Scheduling Heuristics
- Heuristic 1
  - Schedule all pre-processing tasks at the beginning
  - Schedule inter- and main-processing tasks as an interval on the same number of resources
  - Schedule all post-processing tasks at the end
- Heuristic 2
  - Schedule all pre-processing tasks at the beginning
  - Schedule inter- and main-processing tasks separately as a pipeline
  - Schedule all post-processing tasks at the end

36 Generic Scheduling Heuristics
- Heuristic 3
  - Schedule inter- and main-processing tasks as an interval pipeline on the same number of resources
  - Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources unused by the pipeline
  - Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively

37 Generic Scheduling Heuristics
- Heuristic 4
  - Schedule inter- and main-processing tasks separately as a pipeline
  - Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources unused by the pipeline
  - Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively

38 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Simulation Results, 5 Conclusions

39 Simulation Results
- The behavior of the 4 heuristics is tested against CPA applied on the composite DAG
- Task execution times are modeled by Amdahl's law
- Several configurations are tested
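In its usual form, Amdahl's law gives the time of a task on p processors as T(p) = α·T1 + (1 − α)·T1/p, where T1 is the time on one processor. The sketch below assumes that α in these slides denotes the non-parallelizable fraction; this reading is consistent with the configurations using α = 1.0 for a task that does not speed up, but it is an assumption. The function name `amdahl_time` is hypothetical.

```python
def amdahl_time(t1, alpha, p):
    """Execution time on p processors under Amdahl's law.

    t1:    execution time on one processor
    alpha: fraction of the work that cannot be parallelized (assumed meaning)
    p:     number of processors, at least 1
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * t1 + (1.0 - alpha) * t1 / p


if __name__ == "__main__":
    # T1 = 500 and alpha = 0.1 are the values quoted for configuration 1.
    for p in (1, 2, 4, 8, 16):
        print(p, round(amdahl_time(500, 0.1, p), 2))
```

Under this model the time can never drop below α·T1, which is why a large α on one kind of task limits how much extra processors help it.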

40 Simulation Results
- Configuration 1
  - The execution time on 1 processor is identical for all tasks (500)
  - The coefficient α is identical for all tasks (0.1)

41 Simulation Results
- Configuration 2
  - Same as before, with α(inter-processing) = 0.8

42 Simulation Results
- Configuration 3
  - T1(pre-processing) = T1(post-processing) = 50, T1(main-processing) = T1(inter-processing) = 500
  - α = 0.1, α(inter-processing) = 0.6

43 Simulation Results
- Configuration 4
  - T1(pre-processing) = T1(post-processing) = 50, T1(main-processing) = T1(inter-processing) = 500
  - α = 0.1, α(inter-processing) = 1.0

44 Contents: 1 Introduction, 2 Related Works, 3 Scheduling Heuristics, 4 Experimental Results, 5 Conclusions and Future Works

45 Conclusions
- We found a model for the given real application
- We proposed a basic heuristic for this model and 3 improved versions
- We proposed 4 pipeline-based heuristics for the generalized problem and compared them with the approach of applying a mixed-parallelism algorithm on the composite DAG of the application

46 Future Works
- Enhance the heuristics by taking into account a more precise communication model
- Perform real experiments on Grid'5000 in order to validate the theoretical results
- Analyze other applications using a similar approach, with the long-term goal of deriving application-dependent scheduling schemes that could eventually be implemented as DIET plug-in schedulers

