WWOSC 2014, Montreal, Canada, 2014-08: Running operational Canadian NWP models on next-generation supercomputers. Michel Desgagné, Abdessamad Qaddouri, Janusz Pudykiewicz et al.


WWOSC 2014, Montreal, Canada. Running operational Canadian NWP models on next-generation supercomputers. Michel Desgagné, Abdessamad Qaddouri, Janusz Pudykiewicz, Stéphane Gaudreault, Michel Valin, Martin Charron. Recherche en Prévisions Numériques (RPN – Dorval, Canada)

Plan
- Global Environmental Multiscale (GEM) NWP model
- Current large GEM configurations at CMC
- Scaling-up scenarios
- From global lat-lon to Yin-Yang
- Investigating new numerics
- Investigating new horizontal geometry
- Conclusions

Current NWP Systems (1)

Global Deterministic Prediction System (GDPS)
- EnVar data assimilation
- 25 km horizontal resolution, 80 levels
- 00 and 12Z runs out to 10 days
- 1280 IBM P7 cores

Regional Deterministic Prediction System (RDPS)
- EnVar regional data assimilation
- 10 km horizontal resolution, 80 levels
- 48-hour forecast, 4x per day
- 1024 IBM P7 cores

Current NWP Systems (2)

High-resolution Deterministic Prediction System (HRDPS)
- Downscaling from the 10 km RDPS
- 2.5 km horizontal resolution, 65 levels
- 48-hour forecast, 4x per day
- 2976 IBM P7 cores

Scaling up those NWP systems (GDPS, RDPS, HRDPS)
- Yin-Yang 10 km, 120 levels: 12K "IBM P7" cores (4.5X current cluster size at 33%)
- 2.5 km, 80 levels, with EnVar and a larger domain: 5-9K "IBM P7" cores
- Experimental system at urban scale, 250 m: strategic interests, special contracts
- Yin-Yang ~7 km??: 36K "IBM P7" cores (3-fold)
- Yin-Yang ~2.5 km: 800K "IBM P7" cores (22-fold)
- ~1.0 km, 120 levels: 200K "IBM P7" cores

The GEM model
1. Grid-point lat/lon model
2. Finite differences on an Arakawa C grid
3. Semi-Lagrangian advection (poles are an issue)
4. Implicit time discretization
   1. Direct solver (Nk 2D horizontal elliptic problems)
   2. Full 3D iterative solver based on FGMRES
5. Global lat-lon, Yin-Yang and LAM configurations
6. Hybrid MPI/OpenMP
   1. Halo exchanges
   2. Array transposes for elliptic problems
7. PE block partitioning for I/O

Global lat-lon grid:
1) a challenge for DM implementation
2) many more elliptic problems to solve due to implicit horizontal diffusion (transposes)
3) semi-Lagrangian near the poles
4) current DM implementation will not scale
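The direct solver above reduces the implicit step to Nk independent 2D horizontal elliptic (Helmholtz-type) problems. As a minimal sketch of that idea, and not GEM's actual solver (which combines one transform direction with a banded solve on its own grids), here is a doubly periodic Helmholtz solve by 2D FFT:

```python
import numpy as np

def helmholtz_fft(f, lam, Lx=2 * np.pi, Ly=2 * np.pi):
    """Solve (laplacian - lam) u = f on a doubly periodic grid via 2D FFT.
    Illustration only: each Fourier mode is divided by the symbol of
    the operator, -(kx^2 + ky^2) - lam, which is nonzero for lam > 0."""
    ny, nx = f.shape
    kx = np.fft.fftfreq(nx, d=Lx / nx) * 2 * np.pi
    ky = np.fft.fftfreq(ny, d=Ly / ny) * 2 * np.pi
    KX, KY = np.meshgrid(kx, ky)           # KX varies along axis 1, KY along axis 0
    fh = np.fft.fft2(f)
    uh = fh / (-(KX**2 + KY**2) - lam)
    return np.real(np.fft.ifft2(uh))
```

For a manufactured solution u = sin(x)cos(2y), applying (laplacian - lam) gives f = (-5 - lam)u, and the solver recovers u to machine precision.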

Yin-Yang grid configuration
- Implemented as 2 LAMs communicating at their boundaries
- Optimized Schwarz iterative method for solving the elliptic problem
- Scales much better than the global lat-lon grid
- Operational implementation due in spring 2015
- Communications are an issue: the global lat-lon scalability problem (the poles) is traded for another scalability problem

Abdessamad Qaddouri, Vivian Lee (2011) The Canadian Global Environmental Multiscale model on the Yin-Yang grid system, Q. J. R. Meteorol. Soc. 137: 1913–1926
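The Schwarz iteration between the two overlapping panels can be illustrated in one dimension. This sketch uses classical alternating Schwarz with Dirichlet transmission conditions on a toy Poisson problem; the model uses an optimized Schwarz variant (tuned transmission conditions) on the two Yin-Yang panels, but the structure of the iteration, solving each subdomain with boundary data taken from the other, is the same:

```python
import numpy as np

def solve_dirichlet(x, f, ua, ub):
    """Direct second-order solve of -u'' = f on the uniform grid x,
    with u(x[0]) = ua and u(x[-1]) = ub."""
    n = len(x)
    h = x[1] - x[0]
    A = np.zeros((n - 2, n - 2))
    rhs = f[1:-1] * h**2
    for i in range(n - 2):
        A[i, i] = 2.0
        if i > 0:
            A[i, i - 1] = -1.0
        if i < n - 3:
            A[i, i + 1] = -1.0
    rhs[0] += ua
    rhs[-1] += ub
    u = np.empty(n)
    u[0], u[-1] = ua, ub
    u[1:-1] = np.linalg.solve(A, rhs)
    return u

def schwarz(f, n=101, overlap=(0.4, 0.6), iters=30):
    """Alternating Schwarz for -u'' = f(x) on [0,1], u(0) = u(1) = 0,
    with two overlapping subdomains [0, 0.6] and [0.4, 1]."""
    x = np.linspace(0.0, 1.0, n)
    u = np.zeros(n)
    iL = np.searchsorted(x, overlap[0])   # left edge of right subdomain
    iR = np.searchsorted(x, overlap[1])   # right edge of left subdomain
    for _ in range(iters):
        # left subdomain: boundary value taken from the current right solution
        u[:iR + 1] = solve_dirichlet(x[:iR + 1], f(x[:iR + 1]), 0.0, u[iR])
        # right subdomain: boundary value taken from the updated left solution
        u[iL:] = solve_dirichlet(x[iL:], f(x[iL:]), u[iL], 0.0)
    return x, u
```

With f = 1 the exact solution is u = x(1-x)/2; the iteration contracts the error geometrically at a rate set by the overlap width, which is why the choice of transmission conditions (the "optimized" part) matters for the Yin-Yang elliptic solve.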

Yin-Yang 10 km scalability (Ni=3160, Nj=1078, Nk=158). Runs on H = CMC/hadar (IBM Power7) at 960 to 5056 cores and Y = NCAR/yellowstone (IBM iDataPlex) at 3200 to 30968 cores, with varying MPI-by-OpenMP decompositions (e.g. 32x32x8, 79x49x4).

Yin-Yang 10 km scalability: timing breakdown of the dynamics components, including the semi-Lagrangian scheme, the transposes, the Yin-Yang exchanges and the FFT solver.

The future of GEM
- Yin-Yang 2 km on the order of 100K cores is already feasible on P7 processors or similar
- Solver and Yin-Yang exchanges will need work
- Using GPU capabilities is on the table
- Improve OpenMP scalability
- Reshape main memory for better cache usage
- Export SL interpolations to reduce halo size
- Processor mapping to reduce the need to communicate through the switch
- Partition Nk
- MIMD approach for I/O: I/O server

Revisiting time discretization: exponential integration methods for meteorological models. Linearizing the model equations yields the Jacobian operator; introducing the corresponding integrating factor leads to the variation-of-constants formula. Clancy C., Pudykiewicz J. (2013) On the use of exponential time integration methods in atmospheric models, Tellus A, vol. 65

Integrating over the time step and multiplying by the matrix exponential gives the update formula. Depending on the quadrature used, different versions of the exponential integration schemes are obtained. Their common property is that the desired solution can be expressed as a weighted sum of "phi functions". Tokman (2006) Efficient integration of large stiff systems of ODEs with exponential propagation iterative methods, J. Comp. Phys. 213.
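The equations on these two slides were lost in transcription; the standard exponential-integrator construction they describe (as in Clancy and Pudykiewicz 2013) runs as follows, with generic symbols that may differ from those on the original slides:

```latex
% Semi-discretized model, split into linear (Jacobian) and nonlinear parts
\frac{d\psi}{dt} = \mathbf{L}\,\psi + N(\psi)

% Introducing the integrating factor e^{-t\mathbf{L}} yields
\frac{d}{dt}\!\left(e^{-t\mathbf{L}}\psi\right) = e^{-t\mathbf{L}}\, N(\psi)

% Integrating over [t_n, t_n + \Delta t] and multiplying by e^{(t_n+\Delta t)\mathbf{L}}:
\psi_{n+1} = e^{\Delta t\,\mathbf{L}}\,\psi_n
  + \int_0^{\Delta t} e^{(\Delta t - \tau)\mathbf{L}}\, N\big(\psi(t_n + \tau)\big)\, d\tau

% Quadrature of the integral produces the phi functions, e.g.
\varphi_k(z) = \int_0^1 e^{(1-\theta)z}\,\frac{\theta^{k-1}}{(k-1)!}\, d\theta
\quad (k \ge 1), \qquad
\varphi_1(z) = \frac{e^z - 1}{z}
```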

Exploring exponential integrators with Eulerian finite-volume schemes in GEM

The main difficulty in implementing exponential integrators is the evaluation of the φ functions. Recent advances in computational linear algebra now allow their efficient computation:
- fast, matrix-free Krylov methods
- only the action of the matrix on a vector (matvec product) is needed
- completely removes the need to evaluate the φ functions of the full matrix
- reduced-size problem: the large system matrix is projected onto a small Hessenberg matrix on which calculating the exponential is easy

The finite-volume schemes bring:
- inherent local and global conservation
- non-oscillatory advection (monotonic, positivity-preserving)
- geometric flexibility (any type of grid system)
- stability even at large Courant numbers
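The Krylov projection described above can be sketched compactly. This is an illustrative Arnoldi-based evaluation of dt·φ1(dt·A)v, not RPN's implementation; the function names are ours, and only matvec products with A are used, with the exponential taken of the small Hessenberg matrix:

```python
import numpy as np
from scipy.linalg import expm

def phi1(M):
    """phi_1(M) = M^{-1} (e^M - I), computed via the augmented-matrix trick:
    expm([[M, I], [0, 0]]) carries phi_1(M) in its upper-right block."""
    n = M.shape[0]
    A = np.zeros((2 * n, 2 * n))
    A[:n, :n] = M
    A[:n, n:] = np.eye(n)
    return expm(A)[:n, n:]

def arnoldi_phi1(matvec, v, dt, m=30):
    """Approximate dt * phi_1(dt*A) v with an m-dimensional Krylov subspace.
    Only the action of A on a vector is required (matrix-free)."""
    n = v.size
    beta = np.linalg.norm(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / beta
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):              # modified Gram-Schmidt
            H[i, j] = np.dot(V[:, i], w)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:             # happy breakdown: exact subspace
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    Hm = H[:m, :m]                          # small Hessenberg projection of A
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * dt * V[:, :m] @ (phi1(dt * Hm) @ e1)
```

For the linear test problem w' = Aw, one exponential-Euler step w0 + dt·φ1(dt·A)(A·w0) reproduces expm(dt·A)·w0 exactly when the Krylov dimension equals the system size.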

Major Findings
- Exponential methods can resolve high frequencies to the required level of tolerance without the severe time-step restriction of explicit numerical schemes
- In contrast, the usual implicit integration with large time steps either damps high frequencies or maps them to one and the same frequency (or nearly so)
- The discretization scheme is free of noise even when a low-order spatial discretization is used
- Computational efficiency is very good on linear problems and looks very promising for the full model
- Expected to scale very well because of data locality

Investigating scalability and accuracy on an icosahedral geodesic grid
- Spatial discretization: finite-volume method on an icosahedral geodesic grid
- Time discretization: exponential integration methods, which resolve high frequencies to the required level of tolerance without severe time-step restriction
- A shallow-water implementation already shows great scalability
- Vertical coordinate: generalized quasi-Lagrangian with conservative monotonic remapping

Pudykiewicz J. (2011), J. Comp. Phys., 230. Qaddouri A. et al. (2012), Q. J. Roy. Met. Soc., 138.
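The conservative monotonic remapping used for the quasi-Lagrangian vertical coordinate can be illustrated in one dimension. This is a minimal first-order (piecewise-constant) sketch, not the model's scheme, which uses higher-order monotonic reconstruction, but it shows the two defining properties, exact conservation and no creation of new extrema:

```python
import numpy as np

def remap1d(xs, q, xt):
    """Conservatively remap cell averages q from source cell edges xs
    to target cell edges xt (both spanning the same interval).
    Piecewise-constant reconstruction: each target cell receives the
    overlap-weighted mass of the source cells it intersects."""
    qt = np.zeros(len(xt) - 1)
    for k in range(len(xt) - 1):
        a, b = xt[k], xt[k + 1]
        mass = 0.0
        for i in range(len(xs) - 1):
            lo = max(a, xs[i])
            hi = min(b, xs[i + 1])
            if hi > lo:                    # cells overlap
                mass += q[i] * (hi - lo)
        qt[k] = mass / (b - a)
    return qt
```

Because every target cell integrates the same piecewise-constant field, the total mass on the target grid equals the total mass on the source grid to roundoff, and a constant field is reproduced exactly.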

Conclusions
- Yin-Yang 2 km on the order of 100K cores is already feasible on P7 processors or similar
- No real worries for the next 4-6 years
- Ready to address architecture changes: GPUs, larger vectors, larger numbers of cores per node
- Many development items on the agenda
- Keep investigating new numerics that enhance data locality and limit communications, and that are better suited to upcoming architectures

END