Global Hydrology Modelling and Uncertainty: Running Multiple Ensembles with the University of Reading Campus Grid
Simon Gosling (1), Dan Bretherton (2), Nigel Arnell (1) & Keith Haines (2)
(1) Walker Institute for Climate System Research, University of Reading
(2) Environmental Systems Science Centre (ESSC), University of Reading
Outline
Uncertainty in climate change impact assessment
The NERC QUEST-GSI project & requirement for high-throughput computing (HTC)
Modification to the climate change (CC) impact model & Campus Grid
Results: impact on global river runoff & water resources
Conclusions & future developments
Uncertainty in Climate Change Impact Assessment
Uncertainty in climate change impact assessment
Global climate models (GCMs) use different but plausible parameterisations to represent the climate system. Parameterisation is needed sometimes because processes act at sub-grid scale (<250 km) and sometimes because understanding is limited.
Uncertainty in climate change impact assessment
Therefore climate projections differ by institution.
[Figure: two panels, each labelled 2°C]
The NERC QUEST-GSI Project and the Requirement for HTC
The NERC QUEST-GSI project
Overall aim: To examine and assess the implications of different rates and degrees of climate change for a wide range of ecosystem services across the globe
Our specific aims for global hydrology & water resources:
A) To assess the global-scale consequences of different degrees of climate change on river runoff and water resources
B) To characterise the uncertainty in the impacts associated with a given degree of climate change
The NERC QUEST-GSI project
A) achieved by investigating impacts associated with 9 prescribed degrees of global warming (ºC) relative to present
B) achieved by exploring impacts with the climate change patterns associated with 21 different GCMs (climate model structural uncertainty)
Impacts were assessed by applying the above climate change scenarios to the global hydrological model (GHM) Mac-PDM.09
–A global water balance model operating on a 0.5°x0.5° grid
–Reads climate data on precipitation, temperature, humidity, windspeed & cloud cover as input
The challenge
[Matrix: prescribed temperature against the GCM used to provide climate data]
Running on Linux Desktop: 1 run = 4 hours
–1st priority runs: 9 runs = 36 hours
–2nd & 3rd priority runs: 63 runs = 252 hours (~11 days)
–4th priority runs: 189 runs = 756 hours (~32 days)
Running on Campus Grid: 189 runs = 9 hours
GCMs: CSIRO MK, GISS AOM, CCCMA CGCM31T, BCCR BCM, NCAR PCM, GFDL CM, MRI CGCM232A, INM CM, GISS MODELER, GISS MODELEH, GFDL CM, CNRM CM, CCSR MIROC32MED, CCSR MIROC32HI, CSIRO MK, UKMO HadGEM, NCAR CCSM, MPI ECHAM, IPSL CM, CCCMA CGCM, UKMO HadCM
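The run counts and desktop times on this slide follow from simple arithmetic. The sketch below reproduces them, assuming the full ensemble is 9 warming levels crossed with 21 GCMs and that every run takes the quoted 4 hours; the function and variable names are illustrative and not part of Mac-PDM.09.

```python
HOURS_PER_RUN = 4  # desktop time for one run, as quoted on the slide

def desktop_time(n_runs, hours_per_run=HOURS_PER_RUN):
    """Return (hours, days) needed to run n_runs one after another."""
    hours = n_runs * hours_per_run
    return hours, hours / 24

n_warming_levels = 9   # prescribed degrees of global warming
n_gcms = 21            # GCM climate change patterns
full_ensemble = n_warming_levels * n_gcms  # 189 runs

for label, n_runs in [("1st priority", 9),
                      ("2nd & 3rd priority", 63),
                      ("4th priority", full_ensemble)]:
    hours, days = desktop_time(n_runs)
    print(f"{label}: {n_runs} runs = {hours} h ({days:.1f} days)")
```

The computed 10.5 and 31.5 days correspond to the slide's rounded ~11 and ~32 days.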
Modifications to Mac-PDM.09 and the Campus Grid
Modifications to Mac-PDM.09
Climate change scenarios were previously downloaded from the Climatic Research Unit (CRU) at UEA and reformatted to be compatible with Mac-PDM.09
–Around 800 MB of climate forcing data needed for 1 Mac-PDM.09 simulation
–Therefore ~160 GB needed for 189 simulations
–Integrated ClimGen code within Mac-PDM.09 as a subroutine to avoid downloading
–Ensured all FORTRAN code was compatible with the GNU FORTRAN compiler
But the large data requirements meant the Campus Grid storage was not adequate…
Campus Grid data management
Total Grid storage only 600 GB, shared by all users; 160 GB not always available.
Solution chosen was SSH File System (SSHFS)
Scientist's own file system was mounted on the Grid server via SSH.
–Data transferred on demand to/from compute nodes via Condor's remote I/O mechanism.
Campus Grid data management (2)
Using SSHFS to run models on Grid with I/O to remote file system...
[Diagram: Campus Grid compute nodes exchange data with the Grid server via Condor; the Grid server mounts the remote file system of the scientist's data server in Reading (large file system) using SSHFS, with data transfer via SSH; Grid storage is not needed]
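To see why shared storage was a constraint, the forcing data volume can be estimated from the per-run figure. This is a rough sketch using the approximate 800 MB per simulation and the 600 GB total quoted in the slides; the function name and the choice of 1000 MB per GB are assumptions.

```python
MB_PER_RUN = 800        # approximate forcing data per simulation
N_RUNS = 189            # full ensemble size
GRID_STORAGE_GB = 600   # total Campus Grid storage, shared by all users

def forcing_volume_gb(n_runs, mb_per_run=MB_PER_RUN, mb_per_gb=1000):
    """Estimate the total forcing data volume in GB for n_runs simulations."""
    return n_runs * mb_per_run / mb_per_gb

volume = forcing_volume_gb(N_RUNS)
share = volume / GRID_STORAGE_GB
print(f"{volume:.1f} GB needed, about {share:.0%} of total Grid storage")
```

The estimate is of the same order as the ~160 GB on the slide, which starts from a rounded per-run size; either way a single ensemble would claim roughly a quarter of storage that all users share.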
Campus Grid data management (3) SSHFS advantages: Model remained unmodified, accessing data via file system interface. It is easy to mount remote data with SSHFS, using a single Linux command.
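As an illustration of the single mount command, the sketch below assembles an `sshfs` invocation of the standard form `sshfs user@host:remote_dir mount_point`. The user, host name and paths are hypothetical placeholders, and the command is only printed, not executed.

```python
import shlex

def sshfs_mount_command(user, host, remote_dir, mount_point):
    """Build the argument list that mounts remote_dir from host at mount_point."""
    return ["sshfs", f"{user}@{host}:{remote_dir}", mount_point]

def unmount_command(mount_point):
    """Build the argument list that unmounts a FUSE file system on Linux."""
    return ["fusermount", "-u", mount_point]

# Hypothetical values, for illustration only.
mount = sshfs_mount_command("scientist", "dataserver.example.org",
                            "/data/forcing", "/mnt/forcing")
print(shlex.join(mount))
print(shlex.join(unmount_command("/mnt/forcing")))
```

Once mounted, the model reads and writes under the mount point as if the data were local, which is why it can remain unmodified.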
Campus Grid data management (4)
Limitations of the SSHFS approach
Maximum number of simultaneous model runs was 60 for our models, implemented using a Condor Group Quota
–Can submit all jobs, but only 60 are allowed to run simultaneously.
–Limited by Grid and data server CPU load (Condor load and SSH load)
Software requires a system administrator to install it.
Linux is the only supported platform.
Campus Grid data management (5)
Other approaches tried and failed
Lighter SSH encryption (Blowfish)
–No noticeable difference in performance
Models work on local copies of files
–Files transferred to compute nodes before runs
–Resulted in even more I/O for Condor
–Jobs actually failed
Mount data on each compute node separately
–Jobs failed because data server load too high
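A sketch of how all jobs might be submitted at once while leaving the concurrency limit to the pool: the script below writes a Condor submit description that tags every job with an accounting group. The group name, user, executable and run arguments are hypothetical, and the use of the standard universe (for Condor's remote I/O) is an assumption rather than something the slides state.

```python
def build_submit_description(executable, run_arguments, group, user):
    """Return Condor submit description text with one queued job per entry."""
    lines = [
        # Assumption: standard universe, which provides remote I/O.
        "universe = standard",
        f"executable = {executable}",
        # Tag jobs with an accounting group so a group quota can apply.
        f'+AccountingGroup = "{group}.{user}"',
        "log = ensemble.log",
    ]
    for args in run_arguments:
        lines.append(f"arguments = {args}")
        lines.append("queue")
    return "\n".join(lines) + "\n"

# Hypothetical run list: one job per (GCM, warming level) pair.
runs = [f"{gcm} {level}" for gcm in ("GCM_A", "GCM_B") for level in (1, 2, 3)]
text = build_submit_description("macpdm", runs, "group_hydrology", "scientist")
print(text)
```

The quota itself is not set here: it would be defined by the pool administrator in the Condor configuration (for example a `GROUP_QUOTA_<group>` entry), so users can queue every job and let Condor hold back the excess.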
Results Global Average Annual Runoff
Multiple ensembles for various prescribed temperature changes
[Figure panels: 9 model runs; 18 model runs; 81 model runs]
The ensemble mean
But what degree of uncertainty is there?
[Figure: Global Average Annual Runoff Change from Present (%)]
Uncertainty in simulations
[Figure: number of models agreeing on an increase in runoff]
Results Catchment-scale Seasonal Runoff
The Liard, the Okavango, the Yangtze
Seasonal Runoff
–Agreement on increased snowmelt-induced runoff
–Agreement on the dry season becoming drier
–Less certainty regarding wet-season changes
–Large uncertainty throughout the year
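The agreement measure can be illustrated with a small sketch: for each grid cell, count how many ensemble members simulate a positive runoff change. The data layout (one list of percentage changes per GCM) and the numbers are made up for illustration and are not results from the study.

```python
def count_models_agreeing_on_increase(changes_by_model):
    """For each grid cell, count the models whose runoff change is positive.

    changes_by_model: one list per GCM, each holding the % change per cell.
    """
    n_cells = len(changes_by_model[0])
    return [sum(1 for model in changes_by_model if model[cell] > 0)
            for cell in range(n_cells)]

# Illustrative values only: 3 models, 4 grid cells.
changes = [
    [5.0, -2.0, 0.5, -1.0],
    [3.0, 1.0, -0.5, -4.0],
    [1.0, -3.0, 2.0, -0.2],
]
print(count_models_agreeing_on_increase(changes))
```

Cells where nearly all or nearly none of the models show an increase indicate agreement on the sign of change; counts near half the ensemble size indicate disagreement.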
Results Global Water Resources Stresses
Calculating stresses
A region is stressed if water availability is less than 1000 m³/capita/year
Therefore stress will vary according to population growth and water extraction:
–Stress calculated for 3 population scenarios in the 2080s: SRES A1B, SRES A2, SRES B2
–Calculated for different prescribed warming (ºC)
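The stress criterion is a simple threshold test on per-capita availability. The sketch below assumes availability is given as an annual volume for the region; the function names and the example figures are hypothetical.

```python
STRESS_THRESHOLD = 1000.0  # m3 per capita per year

def per_capita_availability(availability_m3_per_year, population):
    """Water available to each person in the region, in m3/capita/year."""
    return availability_m3_per_year / population

def is_water_stressed(availability_m3_per_year, population,
                      threshold=STRESS_THRESHOLD):
    """A region is stressed if per-capita availability is below the threshold."""
    return per_capita_availability(availability_m3_per_year, population) < threshold

# Illustrative region with 5 billion m3/year under two population scenarios.
print(is_water_stressed(5e9, 2_000_000))  # 2500 m3/capita/year
print(is_water_stressed(5e9, 8_000_000))  # 625 m3/capita/year
```

With availability held fixed, the larger population pushes the region below the threshold, which is why the result depends on the population scenario as well as on the climate change pattern.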
Global water resources stresses Global Increase in Water Stress with 2080s A1B Population
The range of uncertainty Global Increase in Water Stress with 2080s A1B Population
Conclusions
HTC on the Campus Grid has reduced total simulation time from 32 days to 9 hours
–This allowed for a comprehensive investigation of uncertainty in climate change impacts
–Previous assessments have only partly addressed climate modelling uncertainty, e.g. 7 GCMs for global runoff, or 21 GCMs for a single catchment (we looked at 65,000)
Results demonstrate:
–GCM structure is a major source of uncertainty
–Sign and magnitude of runoff changes vary across GCMs
–For water resources stresses, population change uncertainty is relatively minor
Further developments
Several other simulations have just been completed on the Campus Grid & are now being analysed:
–NERC QUEST-GSI project: 204-member simulation
  3 future time periods, 4 emissions scenarios, 17 GCMs (3x4x17=204)
  816 hours on Linux Desktop, 10 hours on Campus Grid
–AVOID research programme: 420-member simulation
  Uses climate change scenarios included in the Committee on Climate Change report
  4 future time periods, 5 emissions scenarios, 21 GCMs (4x5x21=420)
  70 days on Linux Desktop, 24 hours on Campus Grid
–1,000-member simulation planned to explore GHM uncertainty
Further developments
Forcing repositories at other institutes
Forcing = hydrological model input
Avoid making local copies in Reading
Additional technical challenges:
–Larger volume of data (GCMs not run locally)
–Slower network connections (for some repositories)
–Sharing storage infrastructure with more users
–No direct SSH access to data
Possible solutions
–Mount repositories on compute nodes with Parrot
  This technique is used by CamGrid
  Parrot talks to FTP, GridFTP, HTTP, Chirp + others
  No SSH encryption overheads
–May need to stage in a subset of forcing data before runs
  Options include Stork
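The desktop and Campus Grid times quoted for each ensemble can be turned into speedup factors. This sketch assumes the 4 hours per desktop run stated earlier in the talk, which is consistent with the 816 hours and 70 days given here; the dictionary keys are labels chosen for illustration.

```python
HOURS_PER_RUN = 4  # desktop time per run, from the earlier slide

def speedup(desktop_hours, grid_hours):
    """Ratio of sequential desktop time to Campus Grid wall-clock time."""
    return desktop_hours / grid_hours

# ensemble label -> (number of members, Campus Grid hours)
ensembles = {
    "QUEST-GSI 189-member": (9 * 21, 9),
    "QUEST-GSI 204-member": (3 * 4 * 17, 10),
    "AVOID 420-member": (4 * 5 * 21, 24),
}

for name, (members, grid_hours) in ensembles.items():
    desktop_hours = members * HOURS_PER_RUN
    factor = speedup(desktop_hours, grid_hours)
    print(f"{name}: {desktop_hours} h on desktop, {grid_hours} h on Grid, "
          f"speedup {factor:.1f}x")
```

The 420-member ensemble shows a somewhat lower factor than the other two, although the slides do not say what caused the difference.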
Thank you for your time Visit Acknowledgements The authors would like to thank David Spence and the Reading Campus Grid development team at the University of Reading for their support of this project.