Evaluating Coupling Strategies
  Mike Hobson
  20th April 2015
Evaluation of coupling strategies
  IS-ENES2 WP10 task 3. This task followed on from work started at CW2013.
  Sophie Valcke, Graham Riley, Rupert Ford & Mike Hobson
  Aims:
    Find a method for evaluating different coupling strategies.
    Provide benchmark problems as a way of comparing coupling technologies.
    Provide “Reference Implementations” using common technologies.
Met Office interest/experience
  The Met Office is investigating replacing its current Unified Model (UM) with a new, more scalable model. The project is called LFRic.
  This model is likely to:
    Move away from using lon/lat meshes.
    Use a finite-element formulation.
Met Office interest/experience
  We have a future requirement for semi-structured or unstructured meshes, but will still need to work with regular lon/lat models.
  The Met Office took on responsibility for providing a benchmark for coupling models with the following meshes:
    a regular longitude/latitude mesh
    a semi-structured mesh
Structured rectilinear mesh: longitude-latitude
  The mesh in use in the current Met Office Unified Model (UM).
  Can use direct addressing: the cell at (i+1,j) is always to the right of that at (i,j).
  The mesh will be useful in future for outputting diagnostics.
  Singularity at the poles.
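Direct addressing can be illustrated with a minimal sketch. The mesh size and the function names `east_neighbour` and `flat_index` are hypothetical and chosen only for illustration; nothing here is taken from the UM.

```python
# Sketch of direct addressing on a structured lon/lat mesh.
# NLON, NLAT and both function names are assumptions for illustration.
NLON, NLAT = 8, 4  # a tiny mesh: 8 cells in longitude, 4 in latitude


def east_neighbour(i, j, nlon=NLON):
    """Cell to the right of (i, j); longitude wraps around the globe."""
    return ((i + 1) % nlon, j)


def flat_index(i, j, nlon=NLON):
    """Position of cell (i, j) in a row-major 1D array."""
    return j * nlon + i


print(east_neighbour(2, 1))  # (3, 1)
print(east_neighbour(7, 1))  # (0, 1): wraps across the date line
print(flat_index(3, 1))      # 11
```

The neighbour is found by index arithmetic alone, so no connectivity table has to be stored or looked up.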
Semi-structured mesh: cube-sphere
  An approximation to an unstructured mesh: appears as a 1D list of cells.
  Requires indirect addressing.
  Likely to be similar to the sorts of mesh that will be used in future models to avoid the singularity at the poles.
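Indirect addressing can be sketched on the coarsest cube-sphere imaginable, with one cell per cube face. The face numbering, the `NEIGHBOURS` table and the `neighbour_mean` helper are assumptions made for this illustration, not part of any benchmark code.

```python
# Sketch of indirect addressing: cells live in a 1D list and neighbours
# are found through a lookup table rather than by index arithmetic.
# Hypothetical numbering: (0, 1), (2, 3) and (4, 5) are opposite faces.
OPPOSITE = {0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4}

# Each face touches every face except itself and the one opposite it.
NEIGHBOURS = [
    [n for n in range(6) if n != cell and n != OPPOSITE[cell]]
    for cell in range(6)
]


def neighbour_mean(field, cell):
    """Average the field over the neighbours of one cell."""
    ids = NEIGHBOURS[cell]
    return sum(field[n] for n in ids) / len(ids)


field = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]  # one value per cell
print(NEIGHBOURS[0])             # [2, 3, 4, 5]
print(neighbour_mean(field, 0))  # 45.0
```

Every neighbour access goes through the table, which is the extra cost and extra flexibility that indirect addressing brings.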
Coupling benchmark
  Should be technology agnostic.
  Benchmarks are defined fully in the document “Benchmark definition for evaluation of coupling strategies”.
  For my simple reference implementations: take a field from a model with one of the above meshes and transfer it to another model with the other mesh, and then the reverse.
Reference implementation no.1: using OASIS3-MCT
  Model 1
    oasis_init_comp
    oasis_def_partition
    oasis_def_var
    oasis_enddef
    Time-step loop
      oasis_put
      oasis_get
    oasis_terminate
  Model 2
    oasis_init_comp
    oasis_def_partition
    oasis_def_var
    oasis_enddef
    Time-step loop
      oasis_put
      oasis_get
    oasis_terminate
Reference implementation no.1: using OASIS3-MCT
  The system consists of two separate model components in two executables, with calls to the OASIS3-MCT library from each.
  Very little code intrusion.
Reference implementation no.2: using ESMF
  Grid/Cpl Component
    Register
    Init(state,…
    Run(state,…
    Finalize(state,…
  ESMFDriver
    ESMF_Initialize
    ESMF_Grid/CplCompCreate
    ESMF_Grid/CplCompSetServices
    ESMF_Grid/CplCompInitialize
    Time-step loop
      ESMF_Grid/CplCompRun
    ESMF_Grid/CplCompFinalize
    ESMF_Finalize
Reference implementation no.2: using ESMF (Init)
  Model 1 init
    ESMF_GridCreate
    ESMF_FieldCreate
    ESMF_StateAdd
  Model 2 init
    ESMF_MeshCreate
    ESMF_FieldCreate
    ESMF_StateAdd
  Coupler init
    ESMF_StateReconcile
    ESMF_FieldRegridStore
Reference implementation no.2: using ESMF (Run)
  Driver
    Model 1 run
      ESMF_StateGet
      (perform a time step of model 1)
    Model 2 run
      ESMF_StateGet
      (perform a time step of model 2)
    Coupler (model1 → model2) run
      ESMF_StateGet
      ESMF_FieldRegrid
    Coupler (model2 → model1) run
      ESMF_StateGet
      ESMF_FieldRegrid
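The split between ESMF_FieldRegridStore at init time and ESMF_FieldRegrid at run time reflects a common pattern: compute the interpolation weights once, then apply them cheaply at every coupling step. The sketch below shows that pattern for a 1D conservative remap in plain Python. It is not the ESMF API; `regrid_store` and `regrid_apply` are hypothetical names, and the 1D meshes are an assumption made to keep the example short.

```python
# Sketch of the "store weights once, apply many times" pattern.
# regrid_store / regrid_apply are hypothetical helpers, not ESMF calls.


def regrid_store(src_edges, dst_edges):
    """Precompute conservative weights as (dst, src, weight) triples."""
    weights = []
    for d in range(len(dst_edges) - 1):
        d_lo, d_hi = dst_edges[d], dst_edges[d + 1]
        for s in range(len(src_edges) - 1):
            # Length of the overlap between destination and source cells.
            overlap = min(d_hi, src_edges[s + 1]) - max(d_lo, src_edges[s])
            if overlap > 0.0:
                weights.append((d, s, overlap / (d_hi - d_lo)))
    return weights


def regrid_apply(weights, src_field, n_dst):
    """Apply stored weights: a sparse matrix-vector product."""
    dst_field = [0.0] * n_dst
    for d, s, w in weights:
        dst_field[d] += w * src_field[s]
    return dst_field


src_edges = [0.0, 0.25, 0.5, 0.75, 1.0]  # four source cells
dst_edges = [0.0, 0.5, 1.0]              # two destination cells
weights = regrid_store(src_edges, dst_edges)          # done once, at init
print(regrid_apply(weights, [1.0, 3.0, 5.0, 7.0], 2))  # [2.0, 6.0]
```

Only `regrid_apply` sits inside the time-step loop, which is why the per-step regridding cost can be measured separately from the setup cost.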
Reference implementation no.2: using ESMF (Finalize)
  Model 1 final
    ESMF_FieldDestroy
    ESMF_GridDestroy
  Model 2 final
    ESMF_FieldDestroy
    ESMF_MeshDestroy
  Coupler final
    ESMF_FieldRegridRelease
Reference implementation no.2: using ESMF
  Using ESMF as a coupler is much more about using the whole framework.
  It would generally require much more of a rewrite than OASIS3-MCT.
  But it will deliver all the benefits of better-structured code.
What was the point of all that?
  But we already knew all of that, so what was the point?
  Well, that was the point.
  Implementing the benchmark has given us a good handle on the non-functional characteristics of the coupling technologies.
Assessment of functional characteristics
  It is easy to use the benchmark to answer the “Does it...” type questions.
  It is also reasonably straightforward to answer the correctness questions.
  It is more difficult to answer the performance (MOTR, “Measure of Time Required”) questions.
Performance benchmarking
  Care must be taken to make sure benchmarks are:
    Representative
    Fair
    Correct
  There are many technical challenges in achieving the above.
What to measure?
  The first question is: what do you measure?
  Overall runtime to complete a coupling task will not provide detailed enough information.
  Isolating specific characteristics can be difficult: the technologies can approach the problem in very different ways.
Technical challenges
  Compile each tool in a comparable way.
  Make sure all debugging information is turned off in each case.
  Ensure the processor configurations are the same.
MOTR for regridding
  Choose a representative problem:
    0.25° lon/lat mesh (1440 x 720)
    cube-sphere mesh with 1.5 million cells
    (Both these meshes lead to about 20 km resolution at UK latitudes.)
  Isolate regridding from redistribution.
  Add calls to Fortran intrinsics to time the relevant calls to the coupling library in each case.
  Optimise.
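The timing approach can be sketched as follows. This is Python rather than Fortran, so `time.perf_counter` stands in for the Fortran timing intrinsics, and `fake_regrid` is a placeholder workload, not a call into either coupling library. Taking the best of several repeats is an assumption about how to reduce noise, not necessarily what was done for the benchmark.

```python
import time


def time_call(func, repeats=5):
    """Return the best wall-clock time in seconds over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()  # timer read just before the call
        func()
        best = min(best, time.perf_counter() - start)
    return best


def fake_regrid():
    """Placeholder for the regridding call being measured (hypothetical)."""
    return sum(0.5 * x for x in range(100_000))


elapsed = time_call(fake_regrid)
print(f"{elapsed:.6f} s per regridding call")
```

Wrapping only the regridding call keeps setup and redistribution outside the measured interval.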
Result
  My regridding problem was quite big, with only a small number of processors, so both technologies were quite slow: about 0.25 s per regridding operation.
  Both results were within a few percent of each other.
  My optimisations were probably not comprehensive, so I’m sure both times could be improved.
Further work: finite-element meshes
  The Met Office also agreed to go through the same process for models with finite elements on similar meshes.
  This hasn’t been done yet because:
    The Met Office hasn’t yet built the architecture that could be used to generate a finite-element benchmark.
    The coupling technologies don’t yet support finite-element meshes.
Conclusions
  The benchmarks are a useful tool.
  They allow us to answer straightforward questions (both functional and non-functional) easily.
  Care is needed when using the benchmarks to measure performance.
Questions?