1
Is RRTMGP suited for GPU?
2
Expectations
Embarrassingly parallel: columns can be split up and computed in parallel.
Memory-intensive computations: memory is faster on GPU than on CPU.
Answer to the title: Yes.
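The column independence claimed above can be illustrated with a minimal sketch. It assumes that each column is processed without reading any other column. The function `column_flux` is a hypothetical stand-in for the real per-column work, not RRTMGP code, and Python threads are used only to show that the results do not depend on how the columns are distributed.

```python
from concurrent.futures import ThreadPoolExecutor


def column_flux(column):
    """Hypothetical per-column work: accumulate layer values from top to bottom."""
    total = 0.0
    for layer_value in column:
        total += layer_value
    return total


def compute_all(columns, workers=4):
    """Process every column independently; pool.map keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(column_flux, columns))


# Hypothetical toy data: 8 columns with 5 layers each.
columns = [[float(c + l) for l in range(5)] for c in range(8)]
serial = [column_flux(col) for col in columns]
parallel = compute_all(columns)
```

Because no column reads another column's data, `parallel` matches `serial` regardless of the number of workers. That property is what makes the problem embarrassingly parallel.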
3
Context
4
Speed-up: GPU vs. CPU
The expected speed-up is determined largely by memory performance.
GDDR is faster than DDR, and this advantage is expected to grow further in future GPU generations.
Memory bandwidth is approximately [...] times higher on GPU.
Drawback: GDDR capacity is typically smaller than DDR capacity.
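The reasoning behind a memory-bound speed-up estimate can be sketched as simple arithmetic. If a kernel is limited purely by memory traffic, its run time is roughly the bytes moved divided by the memory bandwidth, so the expected speed-up approaches the bandwidth ratio. The bandwidth and data-volume values below are placeholders chosen for illustration only. They are not the figures from the presentation and not measurements.

```python
def memory_bound_time(bytes_moved, bandwidth_bytes_per_s):
    """Run-time estimate for a kernel limited only by memory traffic."""
    return bytes_moved / bandwidth_bytes_per_s


def expected_speedup(cpu_bandwidth, gpu_bandwidth):
    """For a purely memory-bound kernel the speed-up equals the bandwidth ratio."""
    return gpu_bandwidth / cpu_bandwidth


# Placeholder values (assumptions, not data from the slides).
bytes_moved = 8 * 10**9   # hypothetical traffic of one kernel, in bytes
cpu_bw = 50e9             # hypothetical CPU memory bandwidth, bytes/s
gpu_bw = 200e9            # hypothetical GPU memory bandwidth, bytes/s

t_cpu = memory_bound_time(bytes_moved, cpu_bw)
t_gpu = memory_bound_time(bytes_moved, gpu_bw)
speedup = t_cpu / t_gpu
```

The estimate ignores host-to-device transfers and any compute-bound phases, so it is an upper bound under these assumptions rather than a prediction.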
5
Computations in RRTMGP
Multiple components to parallelize: gas optics, flux solver, etc.
Multiple sub-components, each with its own logic.
Computations are relatively lightweight: terms and factors from multiple sources, often arrays, are combined using basic arithmetic.
Static data, e.g. k-coefficients, can be kept resident in GPU memory.
6
Scale Dimensions Approx Columns 100 Layers 250 Pseudo-spectral 10 other
7
Memory Access Patterns
Memory access is mostly sequential.
Local interpolations interfere with perfectly sequential memory access, but these disruptions occur at a local scale only.
The index order of arrays changes between components:
Gas optics: (pseudo-spectral, layer, column)
Flux solver: (column, layer, pseudo-spectral)
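The effect of the two index orders on memory strides can be sketched as follows. The sketch assumes Fortran-style column-major storage, where the first index varies fastest, and assumes that the tuples above list the indices from first to last. The array sizes are hypothetical and tiny, and the helper names are illustrative.

```python
def strides_first_index_fastest(shape):
    """Element strides for a column-major (Fortran-style) array of this shape."""
    strides = []
    step = 1
    for extent in shape:
        strides.append(step)
        step *= extent
    return tuple(strides)


def flat_offset(index, strides):
    """Zero-based position of a multi-index in the flattened array."""
    return sum(i * s for i, s in zip(index, strides))


# Hypothetical tiny sizes, for illustration only.
ncol, nlay, ngpt = 4, 3, 2

gas_strides = strides_first_index_fastest((ngpt, nlay, ncol))   # gas optics order
flux_strides = strides_first_index_fastest((ncol, nlay, ngpt))  # flux solver order

# Distance in memory between two neighbouring columns in each layout.
column_step_gas = gas_strides[2]    # column is the last index here
column_step_flux = flux_strides[0]  # column is the first index here
```

Under these assumptions, neighbouring columns are `ngpt * nlay` elements apart in the gas optics layout but adjacent in the flux solver layout, so a loop order that is sequential for one component is strided for the other.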
8
Lessons learned
9
Overview
Compilers struggle with newer Fortran, OpenACC, and libraries.
Fortran 2003
NetCDF library for I/O
With OpenACC we tested: PGI and Cray.
Without OpenACC we tested: Intel, PGI, GNU, Cray, and NAG.
10
Success: Cray and OpenACC
We got gas optics to work to the extent that it compiled and computed the correct answers to 15-digit precision on GPU.
This works with !$ACC PARALLEL; !$ACC KERNELS crashes.
Error messages could be better.
Issues:
Member variables do not work with OpenACC.
Function calls within parallel regions are not supported by the compiler.
Optional arguments do not work with OpenACC.
Defining dynamic dimensions of variables in member functions.
11
PGI and NetCDF
Failure: PGI and NetCDF do not play nicely together.
ERROR: Segmentation fault
pgi/15.3 netcdf/ on rc.colorado.edu
This prevented us from testing OpenACC with PGI, even though the PGI compiler is one of the prime choices for OpenACC.
Q: What is the standard NetCDF library for Python? netCDF4, scipy.io.netcdf, or Scientific.IO.NetCDF?
12
Intel
Does not support OpenACC for practical purposes.
A few hiccups with the Fortran 2003 standard, but overall “thumbs up”.
Side note: the compiler is sometimes too lenient in the syntax it accepts.
intel/ netcdf/
13
GNU
Does not support OpenACC for practical purposes.
A few hiccups with the Fortran 2003 standard, but overall “thumbs up”.
Does not support some Fortran 2003 implicit memory allocations.
Expected to be slower than other compilers.
gnu/4.9.2 netcdf/
14
Extra slides
15
Parallelism in RRTMGP
Columns
Layers
Pseudo-spectral (gpts)
other
16
Strategies for OpenACC Parallelism
Solver Gas Optics
17
OpenACC – example gas optics
18
Future Outlook
C++ implementation, hackathon, etc.