Published by Haden Walkup. Modified over 9 years ago.
1
June 11, 2008 National Center for Atmospheric Research TeraGrid 08, Las Vegas
Early Experience of Running WRF and CAM on Ranger
Siddhartha Ghosh, Wei Huang, Juli Rew
National Center for Atmospheric Research
June 11, 2008
Contact: huangwei@ucar.edu
2
Ranger Nodes
  4x quad-core 2 GHz AMD Barcelona processors
  InfiniBand interconnect
  3,936 16-way SMP nodes
Memory Hierarchy
  64 KB L1 cache (per core)
  2 MB L2 cache
  2 MB L3 cache (shared)
  32 GB (per node)
4 FP/clock, 8 GF TP
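The 8 GF figure on this slide is the clock rate multiplied by the floating-point operations per clock. A minimal Python sketch of that arithmetic follows. The function name is illustrative, and the per-node and whole-system totals are derived here from the node counts on the slide (assuming all 16 cores of all 3,936 nodes are counted); they are not quoted from the presentation.

```python
def peak_gflops(clock_ghz, flops_per_clock, cores=1, nodes=1):
    """Theoretical peak in GFLOP/s: clock rate x FP ops per clock x core count."""
    return clock_ghz * flops_per_clock * cores * nodes


# Values from the slide: 2 GHz clock, 4 floating-point operations per clock.
per_core = peak_gflops(2.0, 4)

# Derived (assumption): 4 sockets x 4 cores = 16 cores per node, 3,936 nodes.
per_node = peak_gflops(2.0, 4, cores=16)
system = peak_gflops(2.0, 4, cores=16, nodes=3936)

print(f"per core: {per_core:.0f} GF")
print(f"per node: {per_node:.0f} GF")
print(f"system:   {system / 1000.0:.1f} TF")
```

These are upper bounds; sustained application performance is typically well below the theoretical peak.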
3
Pingpong Test
Ranger vs IBM HPS (Federation)
4
WRF Variables and Domain Size

Variables defined in the Registry:

    # table entries are of the form
    #
    state real u ikjb dyn_em 2 X i01rhusdf=(bdy_interp:dt) "U" "x-wind component" "m s-1"
    state real v ikjb dyn_em 2 Y i01rhusdf=(bdy_interp:dt) "V" "y-wind component" "m s-1"
    state real w ikjb dyn_em 2 Z irhusdf=(bdy_interp:dt) "W" "z-wind component" "m s-1"

Domain size defined in "namelist.input":

    &domains
    max_dom = 1,
    s_we = 1, 1, 1,
    e_we = 74, 112, 94,
    s_sn = 1, 1, 1,
    e_sn = 61, 97, 91,
    s_vert = 1, 1, 1,
    e_vert = 28, 28, 28,

Developed a Python program that reads the Registry and processes "namelist.input"
Used it to estimate memory usage before a run
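The authors' Python program is not shown in the slides. The sketch below only illustrates the general idea under strong simplifying assumptions: it counts the "state real" entries from the Registry excerpt above, takes the first domain's extents from the namelist excerpt, assumes 4-byte reals, and ignores halos, staggering, boundary arrays, and per-process decomposition. All function names are hypothetical.

```python
import re
import shlex

# Sample inputs copied from the slide; a real tool would read the files instead.
REGISTRY = '''
# table entries are of the form
state real u ikjb dyn_em 2 X i01rhusdf=(bdy_interp:dt) "U" "x-wind component" "m s-1"
state real v ikjb dyn_em 2 Y i01rhusdf=(bdy_interp:dt) "V" "y-wind component" "m s-1"
state real w ikjb dyn_em 2 Z irhusdf=(bdy_interp:dt) "W" "z-wind component" "m s-1"
'''

NAMELIST = '''
&domains
 max_dom = 1,
 s_we = 1, 1, 1,
 e_we = 74, 112, 94,
 s_sn = 1, 1, 1,
 e_sn = 61, 97, 91,
 s_vert = 1, 1, 1,
 e_vert = 28, 28, 28,
/
'''


def parse_registry(text):
    """Return (symbol, dims, time_levels) for each 'state real' entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # keeps quoted descriptions together
        if len(fields) > 5 and fields[0] == "state" and fields[1] == "real":
            levels = int(fields[5]) if fields[5].isdigit() else 1
            entries.append((fields[2], fields[3], levels))
    return entries


def parse_domains(text):
    """Return {key: [int, ...]} for every 'key = v1, v2, ...' line."""
    values = {}
    for key, raw in re.findall(r"^\s*(\w+)\s*=\s*(.+)$", text, flags=re.M):
        values[key] = [int(v) for v in raw.split(",") if v.strip()]
    return values


def estimate_bytes(entries, dom, domain=0, bytes_per_real=4):
    """Bytes for all listed fields on one domain (no halos, no staggering)."""
    extent = {
        "i": dom["e_we"][domain] - dom["s_we"][domain] + 1,
        "j": dom["e_sn"][domain] - dom["s_sn"][domain] + 1,
        "k": dom["e_vert"][domain] - dom["s_vert"][domain] + 1,
    }
    total = 0
    for _symbol, dims, levels in entries:
        points = 1
        for d in dims:
            points *= extent.get(d, 1)  # non-spatial flags such as 'b' count as 1
        total += points * levels * bytes_per_real
    return total


entries = parse_registry(REGISTRY)
domains = parse_domains(NAMELIST)
total_bytes = estimate_bytes(entries, domains)
print(f"{len(entries)} fields, {total_bytes / 2**20:.2f} MB")
```

A fuller estimate would also need the decomposition across processes, which is what makes the per-process grid memory shrink as the processor count grows.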
5
Memory Usage (MB) for WRF Dataset 425*300*35

Processors     Total      Grid   Global   Comm.
         1   3204.94   3153.29    17.85   33.77
         2   1679.82   1635.02    17.85   26.92
         4    885.25    850.17    17.85   17.20
         8    487.99    457.76    17.85   12.35
        16    284.24    257.76    17.85    8.92
        32    178.74    154.37    17.85    6.49
        64    123.44    100.78    17.85    4.77
       128     93.98     72.53    17.85    3.56
       256     77.97     57.38    17.85    2.70
       512     69.06     49.08    17.85    2.09
      1024     63.96     44.41    17.85    1.67
6
Average Wall-Clock (seconds) Used for Each WRF Integration

Processors   Seconds/step   Speedup   Efficiency
         1          22.56
         2          11.71      1.92         0.96
         4           8.26      2.73         0.68
         8           4.19      5.38         0.67
        16           2.56      8.81         0.55
        32           1.38     16.34         0.51
        64           0.83     27.18         0.42
       128           0.45     50.13         0.39
       256           0.32     70.50         0.27
       512           0.24     94            0.18
      1024           0.41     55.02         0.05
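The Speedup and Efficiency columns are consistent with the usual definitions: speedup is the single-processor time divided by the time on p processors, and efficiency is speedup divided by p. A short Python sketch, using the seconds-per-step values from the table, reproduces the columns to within rounding; the function names are illustrative.

```python
def speedup(t_base, t_p):
    """Speedup relative to the baseline timing."""
    return t_base / t_p


def efficiency(t_base, t_p, p, p_base=1):
    """Parallel efficiency: speedup divided by the increase in processor count."""
    return speedup(t_base, t_p) * p_base / p


# Seconds per step from the table above, keyed by processor count.
timings = {
    1: 22.56, 2: 11.71, 4: 8.26, 8: 4.19, 16: 2.56, 32: 1.38,
    64: 0.83, 128: 0.45, 256: 0.32, 512: 0.24, 1024: 0.41,
}

t1 = timings[1]
for p, t in timings.items():
    print(f"{p:5d} {t:6.2f} {speedup(t1, t):7.2f} {efficiency(t1, t, p):5.2f}")

# Processor count with the shortest time per step.
best = min(timings, key=timings.get)
print("fastest at", best, "processors")
```

The `p_base` parameter allows the same functions to be used when the smallest run is not a single processor, in which case the speedup is relative to that baseline.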
7
Memory Usage (MB) for WRF Dataset 1501*1201*35

Processors      Total       Grid   Global    Comm.
         1   44134.82   43758.25   252.38   124.15
         2   22718.82   22369.70   252.38    96.70
         4   11737.85   11423.05   252.38    62.39
         8    6234.97    5938.25   252.38    44.30
        16    3433.84    3149.92   252.38    31.51
        32    1998.15    1723.27   252.38    22.47
        64    1255.46     986.98   252.38    16.07
       128     866.55     602.59   252.38    11.55
       256     659.67     398.91   252.38     8.35
       512     547.45     288.95   252.38     6.09
      1024     485.13     228.23   252.38     4.49
      2048     449.58     193.81   252.38     3.36
      4096     428.70     173.72   252.38     2.56
      8192     416.06     161.65   252.38     1.99
8
Average Wall-Clock (seconds) Used for Each WRF Integration

Number of processors   Seconds/step
                  64          11.65
                 128           5.53
                 256           4.39
                 512           2.39
                1024           1.21
                2048           0.67
                4096           2.39
                8192            TBD
9
WRF Timing on Blueice and Ranger (x: processors, y: seconds per time step)
10
Community Atmosphere Model (CAM)
A global atmospheric model developed at NCAR in collaboration with researchers elsewhere
Supports many dynamical cores
The FV core is becoming popular
Supports a few standard out-of-the-box resolutions
In this study we considered 1x1.25 and 0.5x0.625 (lat x lon in degrees)
2D domain decomposition
11
Performance
–Reported in model-years estimated to be computed per day
–Compared with the IBM 1.9 GHz POWER5 HPS system at NCAR
NCAR System (Blueice)
–Dual-core 16-way 1.9 GHz IBM POWER5+ nodes
–IBM HPS, 2 SNI/node, with 8 microseconds latency and 2 GB/s peak bandwidth per SNI in each direction
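The model-years-per-day metric divides the simulated time (in years) by the wall-clock time (in days). The slides do not say how the authors computed it, so the Python sketch below is only one plausible form; the 365-day model year and the example numbers are assumptions for illustration, not measurements from this study.

```python
SECONDS_PER_DAY = 86400.0


def model_years_per_day(simulated_days, wallclock_seconds, days_per_year=365.0):
    """Simulated years that would be completed in one wall-clock day."""
    simulated_years = simulated_days / days_per_year
    wallclock_days = wallclock_seconds / SECONDS_PER_DAY
    return simulated_years / wallclock_days


# Hypothetical example: 30 simulated days finishing in 1 wall-clock hour.
rate = model_years_per_day(simulated_days=30, wallclock_seconds=3600)
print(f"{rate:.2f} model-years/day")
```

Because the metric is an extrapolation from a short run, it says "estimated to be computed per day": a short benchmark is scaled up rather than run for a full day.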
12
CAM perturbation growth test
* Give the model a small perturbation
* The perturbation should not grow fast
13
Performance comparison at resolution of 1x1.25
14
Performance comparison at resolution of 0.5x0.625
15
Conclusion
Able to run WRF, CAM, and CCSM on Ranger successfully
WRF scales well on Ranger
–Small data set to 512 processors
–Larger data set to 2048 processors
CAM and CCSM do not scale as well
16
Acknowledgement and Reference
Acknowledgement
–We'd like to thank Rich Loft for providing encouragement in this study and including us in the team of early Ranger users.
–We'd also like to thank John R. Boisseau for providing early user access and Karl W. Schulz for providing performance tips and the initial hand-holding needed to compute on Ranger.
Reference
–http://www.tacc.utexas.edu
–http://www.ccsm.ucar.edu
–http://www.wrf-model.org
Contact
–Siddhartha Ghosh: sghosh@ucar.edu
–Wei Huang: huangwei@ucar.edu
–Juli Rew: juliana@ucar.edu
17
Add a small perturbation to the initial field
The perturbation should not grow fast (on the order of 10^-9 within 2 days)
Used to check the impact of compiler/platform and optimization on the model
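One common way to quantify perturbation growth is the root-mean-square (RMS) difference between a control run and a perturbed run at each output time. The slides do not give the authors' procedure, so the following Python sketch is an assumption-laden illustration: the function names, the toy field values, and the use of 1e-9 as a pass/fail limit (taken from the order of magnitude quoted above) are all hypothetical.

```python
import math


def rms_difference(a, b):
    """Root-mean-square difference between two equally sized fields."""
    if len(a) != len(b):
        raise ValueError("fields must have the same number of points")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def growth_curve(control_steps, perturbed_steps):
    """RMS difference at each output time."""
    return [rms_difference(c, p) for c, p in zip(control_steps, perturbed_steps)]


def grows_too_fast(curve, limit=1e-9):
    """True if the difference exceeds the limit at any output time."""
    return any(d > limit for d in curve)


# Toy data (hypothetical): two output times, three grid points each.
control = [[280.0, 281.0, 282.0], [280.5, 281.5, 282.5]]
perturbed = [[280.0 + 1e-10, 281.0, 282.0], [280.5 + 1e-8, 281.5, 282.5]]

curve = growth_curve(control, perturbed)
print(curve, grows_too_fast(curve))
```

In a port-validation setting, the "perturbed" run would be replaced by the same case built with a different compiler, platform, or optimization level, and its difference from the control compared against the growth expected from a small perturbation.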