The ROMS model and its use on NYMPHEA
Patrick Marchesiello, IRD Centre of Brittany (Centre IRD de Bretagne)
Brest, 13 January 2005
ROMS History
Descendant of SPEM & SCRUM (relative of POM) (Song & Haidvogel 1994; Barnier et al., 1998)
UCLA: more of a developers' code (Shchepetkin et al., 1998, 2003, 2004; Marchesiello et al., 2001, 2003 …)
Rutgers: larger user community & support
IRD Brest & UCLA & INRIA
  - AGRIF: Adaptive Grid Refinement In Fortran (Debreu 1999)
  - Pre-processing tools (Penven, Marchesiello)
Collaborators and Users
FRANCE
  - IRD Brest: Penven, Marchesiello et al.
  - LMC Grenoble: Debreu et al.
  - LPO Brest: Le Gentil et al.
USA
  - UCLA: McWilliams, Shchepetkin, et al.
  - JPL: Chao et al.
  - Rutgers U.: Arango et al.
USERS
  - France: Brest, Paris, Toulouse, Noumea
  - Europe: Germany (U. Bremerhaven), Italy (JRC), Portugal (IPIMAR), Spain (AZTI)
  - Africa: Morocco (INRH), Senegal (LPA), South Africa (U. Cape Town)
  - America: California, Peru (IMARPE), Chile (U. Concepción), Brazil
ROMS Main features
Hydrostatic, Boussinesq primitive equations
Free surface
Generalized vertical s-coordinate
Horizontal curvilinear coordinates
High-order, low-dispersion numerics
Embedded domains: AGRIF
Open boundary conditions
Boundary layer parameterizations
Parallelization: OpenMP, MPI
Domain partitioning
Optimized for vector computers
Fortran 95
UNIX/Linux
C preprocessor
NetCDF library, used for all I/O
Numerics: Motivation Kantha and Clayson (2000) after Durran (1991)
Numerics: Strategy
High-order accurate methods: Sanderson (1998): the optimal choice (lowest cost for a given accuracy) for general ocean circulation models is 3rd- or 4th-order accurate methods
With special care for:
  - Numerical dispersion
  - Pressure gradient
  - Mode splitting
Combination of methods
Numerics in ROMS (Shchepetkin & McWilliams, 1998, 2003, 2004)
Horizontal (“C”) and vertical staggered grids
Time stepping
  - Split-explicit barotropic and baroclinic modes with 2-way time filter
  - Predictor-corrector Leapfrog/Adams-Moulton 3rd-order scheme with feedback between momentum & tracer equations
  - Non-uniform density in barotropic mode
  - Conservative & constancy-preserving advection for tracers
Advection
  - 3rd-order upstream-biased (QUICK)
Vertical terms
  - Parabolic spline reconstruction for horizontal pressure gradient and advection terms (equivalent to 8th order)
  - Implicit Crank-Nicolson scheme for vertical mixing terms
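To make the "3rd-order upstream-biased" and "conservative & constancy-preserving" items concrete, here is a minimal sketch in Python. It is not ROMS code. It assumes a 1D periodic uniform grid and a constant velocity, and it uses the common third-order upstream-biased face interpolation (centered average minus one sixth of the upstream curvature). The function names are hypothetical.

```python
import math


def upstream3_face_value(q, i, u):
    """Value of q at face i+1/2 on a periodic grid (illustrative, not ROMS code).

    Centered average corrected by 1/6 of the curvature taken on the
    upstream side of the face, which gives third-order accuracy.
    """
    n = len(q)
    centered = 0.5 * (q[i] + q[(i + 1) % n])
    if u >= 0.0:
        # Flow from left to right: curvature centered on cell i.
        curv = q[(i + 1) % n] - 2.0 * q[i] + q[(i - 1) % n]
    else:
        # Flow from right to left: curvature centered on cell i+1.
        curv = q[(i + 2) % n] - 2.0 * q[(i + 1) % n] + q[i]
    return centered - curv / 6.0


def advective_tendency(q, u, dx):
    """dq/dt = -d(u q)/dx in flux form, constant velocity u (assumption)."""
    n = len(q)
    flux = [u * upstream3_face_value(q, i, u) for i in range(n)]
    # flux[i] sits at face i+1/2; flux[i - 1] wraps around for i = 0.
    return [-(flux[i] - flux[i - 1]) / dx for i in range(n)]


n = 64
dx = 2.0 * math.pi / n
q = [math.sin(i * dx) for i in range(n)]
tendency = advective_tendency(q, 1.0, dx)
```

Because the tendency is a difference of face fluxes, its sum over the periodic domain cancels (conservation), and a uniform field produces a zero tendency (constancy preservation).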
Numerics: Performances
Figure panels: POG deg; ROMS – 0.25 deg. Map label: C. Blanc
ROMS_AGRIF
Each domain has its own input/output files
Grid locations are specified in AGRIF_FixedGrids.in
Works with OpenMP/MPI
Forcings and initial conditions are generated with an interactive MATLAB tool: "nesting GUI"
The same model (executable) runs on grids with different space/time resolutions
Nymphea
Implementation
Compilation
  - Software required: Fortran 95, Unix, C preprocessor, NetCDF library
  - Compilation interface in ROMS which defines machine-dependent options (Tru64 UNIX)
Parallelization
  - OpenMP: 1 node of 4 processors
  - MPI: for process studies (S. Le Gentil); needs work for realistic applications
Applications
  - Realistic: coastal regions of West Africa (Morocco and Senegal), Iroise Sea, Bay of Brest
  - Process studies at high resolution
ROMS_AGRIF for West Africa
242*252*32 points, dt=720s
Figure labels: W. Africa 25 km, C. Vert, C. Blanc, Sahara, 5 km, Mercator, Levitus, Clipper
PERFORMANCES: COST
CONFIGURATION
  - 2 embedded grids with refinement coef=5
  - Size (child grid): 242*252*32 points with dt=720s
  - Duration of simulation: 10 model years
  - Processors: 1 node of 4 processors Alpha EV68 (1 GHz)
  - Parallelization with OpenMP
  - Partitioning: 4*8
Cost: c = CPU seconds / grid point / time step (total run time = 15 days)
Comparisons:
  - PC Xeon 2.8 GHz: c=
  - SGI/CRAY Origin2000: c=
  - Earth Simulator (NEC SX): c=
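The cost metric defined above can be written as a small helper. This is a sketch under stated assumptions: "CPU seconds" is taken as wall-clock time multiplied by the number of processors, the quoted run time is taken as wall-clock time, and a model year is taken as 360 days. None of these is stated on the slide, and the function names are hypothetical.

```python
def n_time_steps(model_days, dt_seconds):
    """Number of time steps needed to cover model_days with step dt_seconds."""
    return int(model_days * 86400 // dt_seconds)


def cost_per_point_per_step(wall_seconds, n_procs, nx, ny, nz, n_steps):
    """c = CPU seconds / grid point / time step.

    Assumption: CPU seconds = wall-clock seconds * number of processors.
    """
    cpu_seconds = wall_seconds * n_procs
    return cpu_seconds / (nx * ny * nz * n_steps)


# Slide configuration, child grid only. The parent grid is ignored, so this
# overestimates c and is NOT the figure reported in the talk.
steps = n_time_steps(model_days=10 * 360, dt_seconds=720)  # 360-day years (assumption)
c = cost_per_point_per_step(
    wall_seconds=15 * 86400, n_procs=4, nx=242, ny=252, nz=32, n_steps=steps
)
```

Normalizing by grid points and time steps is what allows runs of different size and length to be compared across machines.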
PERFORMANCES: SCALABILITY
Nymphea: 95% for 1-4 proc.
SGI/CRAY Origin2000: 95% with saturation above 128 proc.
Earth Simulator: 95-60% for proc.
Figure legend: OMP opt. part.; OMP (1 sub/proc); MPI (1 sub/proc)
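Assuming the percentages above are the usual parallel efficiency (speedup divided by processor count), which the slide does not state explicitly, they can be computed from timings as in this sketch. The function names and the sample timings are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical timings: a run taking 100 s on 1 processor and 100/3.8 s on 4
# processors has a speedup of 3.8, i.e. an efficiency of 95%.
eff = parallel_efficiency(100.0, 100.0 / 3.8, 4)
```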
Partitioning
Senegal ideal case on Nymphea (P. Estrade)
  - Domain: 150*500*40 with dt=480s
  - Partitioning 1*1: Cost =
  - Partitioning 1*64: Cost =
  - (units = CPU s / grid point / time step)
  - 25% gain due to optimal cache use
New Caledonia region on PC (J. Lefêvre)
  - Domain: 159*171*20 with dt=480s
  - 100% gain due to optimal cache use
Figure axes: NSUB_X, NSUB_E
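The idea behind the partitioning can be sketched as follows: the horizontal domain is cut into NSUB_X by NSUB_E tiles so that each tile's working set is small enough to stay in cache. This is an illustration only. The splitting rule (near-equal contiguous ranges) is an assumption, not the rule ROMS itself uses, and the function names are hypothetical.

```python
def tile_bounds(n, nsub):
    """Split indices 0..n-1 into nsub contiguous, near-equal [start, end) ranges."""
    base, extra = divmod(n, nsub)
    bounds = []
    start = 0
    for k in range(nsub):
        # The first `extra` tiles take one additional point.
        size = base + (1 if k < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def partition(nx, ny, nsub_x, nsub_e):
    """List of ((x_start, x_end), (y_start, y_end)) tiles covering the domain."""
    return [
        (xr, yr)
        for yr in tile_bounds(ny, nsub_e)
        for xr in tile_bounds(nx, nsub_x)
    ]


# Senegal case from the slide: 150*500 horizontal points, partitioning 1*64.
tiles = partition(150, 500, 1, 64)
```

With this rule, the 1*64 partitioning yields tiles of 150 by 7 or 8 points, compared with a single 150 by 500 tile for 1*1.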
CONCLUSION
ROMS is well optimized (code and methods) and well adapted to Nymphea, which makes it possible to perform large runs in a reasonable time without excessive queuing
The model is ready for faster, more numerous processors (provided AGRIF is fully tested with MPI)
More storage would be welcome