
1 CGAM Running the Met Office Unified Model on HPCx Paul Burton CGAM, University of Reading Paul@met.rdg.ac.uk www.cgam.nerc.ac.uk/~paul

2 Overview: CGAM – who, what, why and how; The Met Office Unified Model; Ensemble Climate Models; High Resolution Climate Models; Unified Model Performance; Future Challenges and Directions

3 Who is CGAM? (Slide diagram: CGAM sits within the NERC Centres for Atmospheric Science, alongside other N.E.R.C. centres and facilities: Atmospheric Chemistry Modelling Support Unit, Universities’ Weather and Environment Research Network, Distributed Institute for Atmospheric Composition, British Atmospheric Data Centre, University Facilities for Atmospheric Measurement, Facility for Airborne Atmospheric Measurements, Data Assimilation Research Centre, British Geological Survey, Centre for Ecology and Hydrology, Proudman Oceanographic Laboratory, Southampton Oceanography Centre, Centre for Terrestrial Carbon Dynamics, Environmental Systems Science Centre, British Antarctic Survey, Tyndall Centre for Climate Change Research, National Institute for Environmental e-Science, Centre for Polar Observations and Modelling.)

4 What does CGAM do? Climate Science –UK centre of expertise for climate science –Lead UK research in climate science: understand and simulate the highly non-linear dynamics and feedbacks of the climate system –Earth System Modelling, from seasonal to 100s of years –Close links to the Met Office. Computational Science –Support scientists using the Unified Model –Porting and optimisation –Development of new tools

5 Why does CGAM exist? Will there be an El Niño this year? –How severe will it be? Are we seeing increases in extreme weather events in the UK? –2000 Autumn floods –Drought? Will the milder winters of the last decade continue? Can we reproduce and understand past abrupt changes in climate?

6 How does CGAM answer such questions? Models are our laboratory –Investigate predictability –Explore forcings and feedbacks –Test hypotheses

7 Met Office Unified Model Standardise on using a single model. The Met Office’s Hadley Centre is recognised as a world leader in climate research. Two-way collaboration with the Met Office. Very flexible model –Forecast –Climate –Global or Limited Area –Coupled ocean model –Easy configuration via a GUI –User configurable diagnostic output

8 Unified Model : Technical Details Climate configuration uses “old” vn4.5 –Vn5 has an updated dynamical core –Next generation “HadGEM” climate configuration will use this Grid-point model –Regular latitude/longitude grid Dynamics –Split-explicit finite-difference scheme –Diffusion and polar filtering Physical Parameterisation –Almost all constrained to a vertical column

9 Unified Model : Parallelisation Domain decomposition –Atmosphere : 2D regular decomposition –Ocean : 1D (latitude) decomposition GCOM library for communications –Interface to selectable communications library: MPI, SHMEM, ??? –Basic communication primitives –Specialised communications for UM Communication Patterns –Halo update (SWAPBOUNDS) –Gather/scatter –Global/partial summations Designed/optimised for Cray T3E!
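To make the halo-update pattern concrete, here is a minimal sketch of a north-south halo swap on a 2D-decomposed field using plain MPI. It is illustrative only, not the UM's GCOM/SWAPBOUNDS code: the real routine also handles east-west halos, wider halos, polar rows and many fields at once, and the array layout and neighbour arguments below are assumptions for the sketch.

```c
/* Minimal sketch of a north-south halo update on a 2D-decomposed field,
 * in the spirit of the UM's SWAPBOUNDS (GCOM over MPI).
 *
 * field: (ny+2) x (nx+2) local array, row-major, with a 1-point halo.
 * Row 0 is the north halo, rows 1..ny are interior, row ny+1 is the
 * south halo.  north/south are neighbour ranks (e.g. from
 * MPI_Cart_shift), or MPI_PROC_NULL at the edge of the domain. */
#include <mpi.h>

void swap_ns_halos(double *field, int nx, int ny,
                   int north, int south, MPI_Comm comm)
{
    const int row = nx + 2;    /* length of one padded row */

    /* Send our top interior row to the north neighbour; receive the
     * south neighbour's top interior row into our south halo. */
    MPI_Sendrecv(&field[1 * row],        row, MPI_DOUBLE, north, 0,
                 &field[(ny + 1) * row], row, MPI_DOUBLE, south, 0,
                 comm, MPI_STATUS_IGNORE);

    /* Send our bottom interior row to the south neighbour; receive the
     * north neighbour's bottom interior row into our north halo. */
    MPI_Sendrecv(&field[ny * row], row, MPI_DOUBLE, south, 1,
                 &field[0],        row, MPI_DOUBLE, north, 1,
                 comm, MPI_STATUS_IGNORE);
}
```

The east-west swap is analogous but needs a strided datatype (e.g. MPI_Type_vector), since columns are not contiguous in memory.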

10 Model Configurations Currently –HadAM3 / HadCM3, low resolution (270km : 96 x 73 x 19L) –Running on ~10-40 CPUs –Turing (T3E1200), Green (O3800), Beowulf cluster Over the next year –More of the same –Ensembles: low resolution (HadAM3/HadCM3), 10-100 members –High resolution: 90km (288 x 217 x 30L) and 60km (432 x 325 x 40L)
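For a sense of scale (simple arithmetic from the grid sizes above, not a figure quoted in the talk): the low-resolution grid has 96 x 73 x 19 ≈ 133,000 points, while the 90km and 60km grids have roughly 1.87 million and 5.6 million points, i.e. about 14 and 42 times more grid points per field, before allowing for the shorter timestep a finer grid requires.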

11 Ensemble Methods in Weather Forecasting Have been used operationally for many years (e.g. ECMWF) –Perturbed starting conditions –Reduced resolution Multi-model ensembles –Perturbed starting conditions –Different models Why are they used? –Give some indication of predictability –Allow objective assessment of weather-related risks –More chance of seeing extreme events


13 Climate Ensembles Predictability What confidence do we have in climate change? What effect do different forcings have? –CO2 – different scenarios –Volcanic eruptions –Deforestation How sensitive is the model? –Twiddle the knobs and see what happens How likely are extreme events? –Allows governments to take defensive action now

14 Ensembles Implementation Setup –Allow users to specify and design an ensemble experiment Runtime –Allow the ensemble to run as a single job on the machine for easy management Analysis –How to view and process the vast amounts of data produced

15 Setup : Normal UM workflow (Diagram: the UMUI produces a UM Job – a shell script [which runs “poe executable”] plus Fortran namelists; the job reads input Data [starting data, forcing data] and writes Output [diagnostics, restart data].)

16 Setup : UM Ensemble workflow (Diagram: the standard UM Job [shell script with “poe executable” plus Fortran namelists] is combined with a Config [N_MEMBERS=3, plus per-member differences] to generate one job directory per member – Job.1, Job.2, Job.3 – each with its own shell script and Fortran namelists. A Control step calls poe, which runs UM_Job with $MEMBERid set and does cd “Job.$MEMBERid” before the run script. Each member reads its own Data.N [starting data, forcing data] and writes its own Out.N [diagnostics, restart data].)

17 UM Ensemble : Runtime (1) “poe” called at top level – calls a “top_level_script” –Works out which CPU it’s on –Hence which member it is –Hence which directory/model SCRIPT to run Model scripts run in a separate directory for each member Each model script calls the executable

18 UM Ensemble : Runtime (2) Uses “MPH” to change the global communicator –http://www.nersc.gov/research/SCG/acpi/MPH/ –Freely available tool from NERSC –MPH designed for running coupled multi-model experiments Each member has a unique MPI communicator replacing the global communicator
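MPH does the communicator bookkeeping for the UM, but the underlying idea can be sketched with plain MPI: derive a member id from the global rank (the same logic the top_level_script uses to pick a directory), split the world communicator so each member gets its own, and run in that member's job directory. The PROCS_PER_MEMBER value and the Job.N directory naming below are assumptions for the sketch, not the actual MPH or UM interface.

```c
/* Sketch of the per-member communicator idea behind the UM ensemble
 * (the real implementation uses MPH from NERSC).  Each member ends up
 * with its own communicator, which then replaces the global
 * communicator inside that member's model run. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

#define PROCS_PER_MEMBER 8   /* e.g. 8 CPUs per low-resolution member */

int main(int argc, char **argv)
{
    int world_rank, member, member_rank;
    MPI_Comm member_comm;
    char jobdir[32];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Which ensemble member does this process belong to? */
    member = world_rank / PROCS_PER_MEMBER;

    /* One communicator per member, standing in for MPI_COMM_WORLD. */
    MPI_Comm_split(MPI_COMM_WORLD, member, world_rank, &member_comm);
    MPI_Comm_rank(member_comm, &member_rank);

    /* Run in this member's own job directory (cf. cd "Job.$MEMBERid"). */
    snprintf(jobdir, sizeof(jobdir), "Job.%d", member + 1);
    if (chdir(jobdir) != 0)
        perror(jobdir);

    printf("member %d, local rank %d (world rank %d)\n",
           member + 1, member_rank, world_rank);

    /* ... run the model here, passing member_comm as its communicator ... */

    MPI_Comm_free(&member_comm);
    MPI_Finalize();
    return 0;
}
```

Inside each member, the model then uses member_comm wherever it would otherwise use the global communicator.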

19 UM Ensemble : Future Work Runtime tools –Control and monitoring of ensemble members Real-time production of diagnostics –Currently each member writes its own diagnostics files (lots of disk space; I/O performance?) –Have a dedicated diagnostics process –Only output statistical analysis

20 UK-HIGEM National “Grand Challenge” Programme for High Resolution Modelling of the Global Environment Collaboration between a number of academic groups and the Met Office’s Hadley Centre Develop high resolution version of HadGEM (~1° atmosphere, 1/3° ocean) Better understanding and prediction of –Extreme events –Predictability –Feedbacks and interactions –Climate “surprises” –Regional impacts of climate change

21 UK HiGEM Status Project only just starting Plan to use Earth Simulator for production runs Preliminary runs carried out –Earth Simulator –Very encouraging results HPCx is a useful platform –For development –Possibly for some production runs

22 UM Performance Two configurations –Low resolution 96x73x19L –High resolution 288x217x30L Built-in comprehensive timer diagnostics –Wallclock time –Communications –Not yet implemented: I/O, memory, hardware counters, ??? Outputs an XML file Analysed using a PHP web page
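The talk does not show the timer interface or the XML schema, so the following is only a rough sketch of the idea: time a named code section with MPI_Wtime, reduce across ranks to get a first look at load imbalance, and have rank 0 write an XML fragment. The element and attribute names are invented for illustration.

```c
/* Illustrative sketch only: time a named section and have rank 0
 * append an XML fragment.  The UM's real timer also separates out
 * communication time; the tag and attribute names here are made up. */
#include <mpi.h>
#include <stdio.h>

static double t_start;

void timer_start(void)
{
    t_start = MPI_Wtime();
}

void timer_stop_and_log(const char *section, FILE *xml)
{
    double local = MPI_Wtime() - t_start;
    double tmax, tmin;
    int rank;

    /* Max/min across processors gives a first look at load imbalance. */
    MPI_Reduce(&local, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&local, &tmin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        fprintf(xml, "  <section name=\"%s\" wall_max=\"%.3f\" wall_min=\"%.3f\"/>\n",
                section, tmax, tmin);
}
```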

23 LowRes Scalability (chart)

24 LowRes : Communication Time (chart)

25 LowRes : Load Imbalance (chart)

26 LowRes : Relative Costs (chart)

27 HiRes Scalability (chart)

28 HiRes Communication Time (chart)

29 HiRes Load Imbalance (chart)

30 HiRes Relative Costs (chart)

31 HiRes Exclusive Timer QT_POS has large “Collective” time –Unexpected! Call to global_MAX routine in gather/scatter –Not needed, so deleted!

32 HiRes : After “optimisation” QT_POS reduced from 65s to 35s Improved scalability And repeat…

33 Optimisation Strategy Low Res –Aiming for 8 CPU runs as ensemble members (typically ~50 members) –Physics optimisation a priority: load imbalance (SW radiation), single processor optimisation Hi Res –As many CPUs as is feasible –Dynamics optimisation a priority: remove/optimise collective operations, increase average message length
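One way to increase the average message length, sketched below with plain MPI rather than the actual GCOM change, is to pack the halo rows of several fields into a single buffer per neighbour and exchange once, so each neighbour receives one large message instead of many small ones. The array layout matches the earlier halo-swap sketch; the packing scheme itself is an assumption for illustration.

```c
/* Sketch of "increase average message length": pack the halo rows of
 * several fields into one buffer per neighbour and exchange once,
 * instead of sending one small message per field.  Illustrative only;
 * not the actual GCOM optimisation.  Row 0 = north halo, rows 1..ny
 * interior, row ny+1 = south halo, as in the earlier sketch. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

void swap_halos_packed(double **fields, int nfields, int nx, int ny,
                       int north, int south, MPI_Comm comm)
{
    const int row = nx + 2;
    double *sendbuf = malloc((size_t)nfields * row * sizeof(double));
    double *recvbuf = malloc((size_t)nfields * row * sizeof(double));
    int f;

    /* Pack the bottom interior row of every field into one buffer. */
    for (f = 0; f < nfields; f++)
        memcpy(&sendbuf[f * row], &fields[f][ny * row], row * sizeof(double));

    /* One large message per neighbour instead of nfields small ones:
     * send south, receive the corresponding rows from the north. */
    MPI_Sendrecv(sendbuf, nfields * row, MPI_DOUBLE, south, 0,
                 recvbuf, nfields * row, MPI_DOUBLE, north, 0,
                 comm, MPI_STATUS_IGNORE);

    /* Unpack into each field's north halo row. */
    for (f = 0; f < nfields; f++)
        memcpy(&fields[f][0], &recvbuf[f * row], row * sizeof(double));

    free(sendbuf);
    free(recvbuf);
}
```

Fewer, larger messages reduce the per-message latency cost, which is likely to matter more on HPCx than on the low-latency Cray T3E the code was originally tuned for; the reverse-direction swap is analogous.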

34 Future Challenges Diagnostics and I/O –UM does huge amounts of diagnostic I/O in a typical climate run –All I/O through a single processor: cost of gather, non-parallel I/O Ocean models –Only 1D decomposition, so limited scalability –T3E optimised! Next generation UM5.x –Much more expensive –Better parallelisation for dynamics scheme
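The single-processor I/O pattern described above can be sketched as follows: each output field is gathered onto one rank, which then does a serial write while everyone else waits. The equal-sized local portions and the write_field() routine are placeholders, not the UM's actual I/O interface.

```c
/* Sketch of the UM's single-processor I/O pattern: gather each output
 * field onto one rank, then do a serial write from there.  Both the
 * gather and the serial write become bottlenecks at high processor
 * counts.  Equal local sizes and write_field() are placeholders. */
#include <mpi.h>
#include <stdlib.h>

void write_field(const double *global, int n);  /* serial write, rank 0 only */

void output_field(const double *local, int nlocal, MPI_Comm comm)
{
    int rank, nprocs;
    double *global = NULL;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nprocs);

    if (rank == 0)
        global = malloc((size_t)nprocs * nlocal * sizeof(double));

    /* Every diagnostic field pays for a full gather onto rank 0 ...   */
    MPI_Gather(local, nlocal, MPI_DOUBLE,
               global, nlocal, MPI_DOUBLE, 0, comm);

    /* ... followed by a serial write while the other ranks wait.      */
    if (rank == 0) {
        write_field(global, nprocs * nlocal);
        free(global);
    }
}
```

A dedicated diagnostics/I/O process (as already suggested for the ensemble work) or genuinely parallel I/O would remove the serial write; the gather cost itself grows with both resolution and processor count.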

