PT Evaluation of the Dycore Parallel Phase (EDP2)


1 PT Evaluation of the Dycore Parallel Phase (EDP2)
X. Lapillonne, M. Baldauf, P. Spörri, O. Fuhrer, C. Barbu, C. Osuna, U. Schättler, A. Walser

2 Main goal
Two implementations of the RK dynamical core coexist in COSMO:
  Fortran
  C++, based on the STELLA library (required to run on GPU)
The C++ version will soon be in the official version.
Evaluate the cost and give a recommendation to the STC regarding the parallel phase, in which the two implementations coexist.
(Stop short presentation at this slide.)

3 Cost evaluation
#    Type                  Cost/Change   Cost      Description
20   Minor changes         0.005 FTE     0.1 FTE   Adapting configuration variables or defaults, or changing a line in the computation
5    Medium changes        0.02 FTE                Changes that require new stencils but fit into the existing context
1    Major changes         0.2 FTE                 E.g. porting a new Fast Waves solver
     General maintenance                           Keeping up with recent compilers, investigating performance issues
Total maintenance time: 0.5 FTE
  0.3 FTE for basic maintenance and small to medium changes
  0.2 FTE for integrating major developments
Based on previous years; this may vary now that the focus is switching to ICON and the C++ code is distributed (more small changes, fewer large ones).
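The cost figures on this slide can be cross-checked with a little arithmetic. The sketch below multiplies each change count by its cost per change and compares the result with the stated 0.5 FTE total. The variable names are illustrative, and the general-maintenance share is not given as a number on the slide: it is inferred here as the remainder of the stated total, which is an assumption.

```python
from decimal import Decimal

# (number of changes, cost per change in FTE), as listed on the slide
changes = {
    "minor": (20, Decimal("0.005")),
    "medium": (5, Decimal("0.02")),
    "major": (1, Decimal("0.2")),
}
TOTAL_FTE = Decimal("0.5")  # total maintenance time stated on the slide


def cost_of(count, cost_per_change):
    """Cost of one category: number of changes times cost per change."""
    return count * cost_per_change


costs = {name: cost_of(n, c) for name, (n, c) in changes.items()}
change_driven = sum(costs.values())

# Assumption: whatever remains of the stated total is general maintenance.
general_maintenance = TOTAL_FTE - change_driven

# "Basic maintenance and small to medium changes" as grouped on the slide
basic = costs["minor"] + costs["medium"] + general_maintenance

for name, cost in costs.items():
    print(f"{name:>6}: {cost.normalize()} FTE")
print(f"general maintenance (inferred): {general_maintenance.normalize()} FTE")
print(f"basic + small/medium: {basic.normalize()} FTE")
```

Under that assumption the split reproduces the slide's breakdown: 0.3 FTE for basic maintenance plus small to medium changes, and 0.2 FTE for major developments.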

4 Impact and evaluation
Impact (on users/developers): need for additional communication between the main dycore developers and the C++ dycore maintainer.
Performance of the C++/STELLA dycore on the latest architecture, compared to the Fortran dynamics: about 3x faster on GPU than the original code on CPU.

Dycore   Hardware                                                Compiler            Double precision   Single precision
F90      9x Haswell sockets; 96x compute, 4x I/O nodes           GCC 4.9             316 s              169 s
C++      9x Haswell sockets; 96x compute, 4x I/O nodes           GCC 4.9             214 s              152 s
C++      1x Haswell, 8x K80 sockets; 8x compute, 2x I/O nodes    GCC 4.9, CUDA 7.0   101 s              65 s

Consequences and experience of using the C++ dynamics for operational weather prediction: no adverse consequences in terms of maintenance, stability or complexity for operations.
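The "3x faster" statement can be checked against the timings reported on this slide. The sketch below computes time-to-solution ratios relative to the Fortran dycore on CPU. The mapping of the three timings to F90/CPU, C++/CPU and C++/GPU follows the order in which they appear and is an assumption, as are the dictionary keys. The CPU and GPU runs use different hardware setups, so these are ratios between two configurations, not per-device speedups.

```python
# Timings in seconds from the slide; column assignment is assumed from their order.
timings = {
    "double": {"f90_cpu": 316.0, "cpp_cpu": 214.0, "cpp_gpu": 101.0},
    "single": {"f90_cpu": 169.0, "cpp_cpu": 152.0, "cpp_gpu": 65.0},
}


def speedup(baseline_s, candidate_s):
    """Ratio of baseline run time to candidate run time (> 1 means faster)."""
    return baseline_s / candidate_s


for precision, t in timings.items():
    gpu = speedup(t["f90_cpu"], t["cpp_gpu"])
    cpu = speedup(t["f90_cpu"], t["cpp_cpu"])
    print(f"{precision}: C++/GPU vs F90/CPU {gpu:.2f}x, "
          f"C++/CPU vs F90/CPU {cpu:.2f}x")
```

With these numbers the GPU ratio is about 3.1x in double precision and 2.6x in single precision, so the "3x" figure matches the double-precision case.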

5 Developer experience (using STELLA)
C++ dycore source code maintainer (P. Spörri, MCH): nice and efficient for writing single-source, performance-portable HPC code.
C++ dycore users:
  Experience at DWD: very different from higher-level programming languages (like Fortran, C++, Python, ...); permanent support is needed for development. => Seen as a large obstacle for model development.
  Experience at IMGW (port of EULAG with GridTools): although this is at an early stage, progress is good and there are currently no major obstacles to porting the Fortran code to C++.
=> Divergent opinions!

6 Impact on support and installation of the COSMO code
NMA is involved in the support activity and will provide the second-level support.
Impact of using a DSL for the dynamical core:
  Faster and easier-to-maintain code when targeting multiple architectures
  Comparison with an OpenACC GPU implementation of some dycore components shows the STELLA implementation is 1.4x to 1.8x faster on

7 Possible development workflow
Working in C++ only (no Fortran dynamics):
  A reference implementation, either using STELLA directly or in plain C++ code (with explicit loops), can be written directly inside the existing C++ dycore.
  The dycore maintainer integrates and optimizes the reference code in STELLA.
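To illustrate the workflow described on this slide, here is a minimal sketch of an explicit-loop reference stencil and a restructured variant that is checked against it. It is written in Python for brevity, whereas the slide refers to plain C++ and STELLA. The 5-point Laplacian, the function names and the test field are all illustrative assumptions and are not taken from the COSMO dycore.

```python
def laplacian_reference(field):
    """Reference version: 5-point Laplacian with explicit loops.

    Operates on the interior of a 2D grid with unit spacing; boundary
    points are left at zero.
    """
    ni, nj = len(field), len(field[0])
    out = [[0.0] * nj for _ in range(ni)]
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            out[i][j] = (field[i - 1][j] + field[i + 1][j]
                         + field[i][j - 1] + field[i][j + 1]
                         - 4.0 * field[i][j])
    return out


def laplacian_rowwise(field):
    """Same stencil, restructured to work row by row.

    A stand-in for the version the maintainer would integrate and optimize.
    """
    ni, nj = len(field), len(field[0])
    out = [[0.0] * nj for _ in range(ni)]
    for i in range(1, ni - 1):
        up, mid, down = field[i - 1], field[i], field[i + 1]
        out[i][1:nj - 1] = [
            up[j] + down[j] + mid[j - 1] + mid[j + 1] - 4.0 * mid[j]
            for j in range(1, nj - 1)
        ]
    return out


def max_abs_diff(a, b):
    """Largest pointwise difference between two 2D grids of equal shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Test field f(i, j) = i^2 + j^2, whose discrete Laplacian is 4 in the interior.
field = [[float(i * i + j * j) for j in range(6)] for i in range(5)]
ref = laplacian_reference(field)
print(max_abs_diff(ref, laplacian_rowwise(field)))
```

The point of the split is that the explicit-loop version stays easy to read and serves as the ground truth, while the restructured version only has to reproduce its results.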

8 Consequence of discontinuing the Fortran dynamics
Training and a change of workflow for the main developer (several weeks of investment)
Additional training for developers at other universities would be required
New training also needed for the user community: compiling the C++ + Fortran code requires some additional know-how.
Consequence of discontinuing the C++ dynamics
Issue for members and universities using the C++ dycore for production (MCH, ETH, EMPA)
Would lose the ability to run on GPU architectures, which is a strength of the COSMO model

9 Recommendation
Extend the parallel phase of maintaining both the C++ and Fortran implementations.
Evaluate again after a period of at least two years.
Cost of having two dynamics: 0.5 FTE/year
=> Extension accepted by the STC

