CESD 1 SAGES Scottish Alliance for Geoscience, Environment & Society The challenges of geo-simulation data Centre For Earth System Dynamics
This talk: perspectives from CESD’s climate modelling
How climate modelling is done
–Why model the climate?
–NetCDF
–CF: climate and forecast
–Archives and metadata
Current challenges
Imminent challenges

What is “the climate”?
Statistical concepts such as:
–Typical seasonal rainfall distribution
–Global mean annual outgoing shortwave radiation
–Monthly mean surface temperature
…arising from physical processes:
–Fluid dynamics on a rotating sphere
–Interactions of radiation
–…

Why use a computer model of the climate?
1. Explore the climate:
–Test hypotheses about how the climate works
–Interpret observations
–Express scientific community understanding
–Generate possible past and future climates
2. Use climate model output data:
–To drive other models
–To inform mitigation/adaptation
–Where observations are sparse at best, e.g. the future

Modelling the Climate System
Source: Karl and Trenberth 2003
Main message: lots of things going on!

A climate model
Diagram labels:
–Modelled processes: δ this/δ that = something; δ that/δ other = something else
–Initial state
–Ancillary data (can be time series)
–Files of means: 6hr, daily … decadal
–New process
–New “diagnostic”
Toolbox, not a black box!

Data volumes and typical analyses
Typically we make 1-5 GB per model year
–40 model years/day (coarse coupled model, HadCM3, using 40 cores)
Our biggest project: 14 TB
Researcher selects/slices data, then does:
–Global/regional analyses, e.g. global means
–Comparisons with related runs and observations, …
–Using NCL, IDL, NCO, … (tools built on data standards)

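As an illustration of the "global means" analysis mentioned above, here is a minimal sketch in plain Python. It assumes a regular latitude-longitude grid and weights each row by the cosine of its latitude, which is a common approximation to grid-cell area. The function name, the grid and the temperature values are hypothetical and are not taken from CESD's tools.

```python
import math

def area_weighted_mean(field, lats_deg):
    """Mean of field[lat][lon] on a regular lat-lon grid.

    Each latitude row is weighted by cos(latitude), an approximation
    to the relative area of its grid cells (assumption of this sketch).
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, lats_deg):
        w = math.cos(math.radians(lat))
        total += w * sum(row)
        weight_sum += w * len(row)
    return total / weight_sum

# Hypothetical 3 x 2 grid of temperatures in K.
lats = [-60.0, 0.0, 60.0]
temps = [[270.0, 270.0], [300.0, 300.0], [270.0, 270.0]]

global_mean = area_weighted_mean(temps, lats)
plain_mean = sum(sum(row) for row in temps) / 6
```

The weighted mean comes out warmer than the plain average because the equatorial row covers more area than the rows at 60 degrees.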
NetCDF
“NetCDF is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.”
A file contains dimensions, variables, and attributes.
Ed Hartnett’s talk at: df/papers/nasa_data_workshop_2010.pdf

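To make "dimensions, variables, and attributes" concrete, the following sketch models a file's header with plain Python dictionaries and prints it in a CDL-like layout. It is a toy picture of the data model only: it does not use or imitate the NetCDF library API, and the dataset contents and function names are hypothetical.

```python
# Toy in-memory picture of the NetCDF data model (not the NetCDF library API).
dataset = {
    "dimensions": {"time": 3, "lat": 2, "lon": 2},
    "variables": {
        "temp": {
            "type": "float",
            "dims": ("time", "lat", "lon"),
            "attributes": {"units": "K"},
        },
    },
    "attributes": {"title": "hypothetical example run"},
}

def variable_shape(ds, name):
    """Look up the size of each dimension a variable is defined on."""
    return tuple(ds["dimensions"][d] for d in ds["variables"][name]["dims"])

def to_cdl_header(ds, name="example"):
    """Render the header in a CDL-like text layout."""
    lines = ["netcdf " + name + " {", "dimensions:"]
    for dim, size in ds["dimensions"].items():
        lines.append(f"    {dim} = {size} ;")
    lines.append("variables:")
    for var, info in ds["variables"].items():
        dims = ", ".join(info["dims"])
        lines.append(f"    {info['type']} {var}({dims}) ;")
        for key, value in info["attributes"].items():
            lines.append(f'        {var}:{key} = "{value}" ;')
    lines.append("// global attributes:")
    for key, value in ds["attributes"].items():
        lines.append(f'        :{key} = "{value}" ;')
    lines.append("}")
    return "\n".join(lines)

header = to_cdl_header(dataset)
print(header)
```

The point of the sketch is that a variable's shape is not stored with the variable itself but follows from the named dimensions it refers to.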
Climate and Forecast (CF) conventions
http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.4/cf-conventions.html
Define metadata that provide a definitive description of what the data in each variable represents
–E.g. a variable called temp
  Long name (ad hoc): near-surface daily mean
  Standard name: air_temperature
  Units: K

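The temp example above can be expressed as a dictionary of attributes and checked mechanically. The sketch below is a deliberately simplified check that only looks for `standard_name` and `units`; the actual conventions are more nuanced about which attributes are required, and the function name is hypothetical.

```python
# Attributes of the slide's example variable "temp".
temp_attrs = {
    "long_name": "near-surface daily mean",
    "standard_name": "air_temperature",
    "units": "K",
}

def missing_cf_attributes(attrs, required=("standard_name", "units")):
    """Return the names in `required` that are absent or empty in attrs.

    The choice of required attributes is an assumption of this sketch,
    not a statement of what the CF conventions mandate.
    """
    return [key for key in required if not attrs.get(key)]

print(missing_cf_attributes(temp_attrs))
print(missing_cf_attributes({"long_name": "near-surface daily mean"}))
```

A variable carrying only an ad hoc long name is readable by a person but cannot be matched automatically against other datasets, which is the gap the standard name fills.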
CF: time, two examples
Example 1 (units may be days, hours, min or sec):
double time(time) ;
    time:long_name = "time" ;
    time:units = "days since :0:0" ;
Example 2 (all data are for the same date):
    time:units = "days since :0:0" ;
    time:calendar = "none" ;
data:
    time = 0., 1., 2., ... ;

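A short sketch of how a "days since" time axis can be turned into calendar dates. The reference date below is hypothetical, since the slide's own date did not survive extraction. The sketch handles only "days since YYYY-MM-DD hh:mm:ss" with Python's built-in calendar, so it does not cover other units, model calendars, or the `calendar = "none"` case.

```python
from datetime import datetime, timedelta

def decode_days_since(values, units):
    """Convert offsets in days to datetimes, given a 'days since ...' units string."""
    prefix = "days since "
    if not units.startswith(prefix):
        raise ValueError("this sketch only handles 'days since ...' units")
    reference = datetime.strptime(units[len(prefix):], "%Y-%m-%d %H:%M:%S")
    return [reference + timedelta(days=v) for v in values]

# Hypothetical reference date, chosen for illustration only.
dates = decode_days_since([0.0, 1.0, 2.5], "days since 1990-01-01 00:00:00")
for d in dates:
    print(d.isoformat())
```

Storing time as a number plus a units string keeps the axis compact and lets fractional values express sub-daily steps.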
How are data made accessible?
Publish data in data centres:
–Provide “experiment” metadata
–Upload NetCDF data
–Metadata are harvested from files into a catalogue
Web services
–E.g. ncWMS

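The harvesting step can be pictured as building an index from per-file variable metadata. The sketch below assumes the file headers have already been read into dictionaries. The file names, the variables and the choice to index by `standard_name` are all hypothetical, and this is not a description of how any particular data centre builds its catalogue.

```python
from collections import defaultdict

def harvest(files):
    """Build a catalogue mapping standard_name -> [(file name, variable name), ...].

    `files` maps a file name to {variable name: attribute dict}.
    Variables without a standard_name are skipped in this sketch.
    """
    catalogue = defaultdict(list)
    for filename, variables in files.items():
        for var, attrs in variables.items():
            name = attrs.get("standard_name")
            if name:
                catalogue[name].append((filename, var))
    return dict(catalogue)

# Hypothetical, already-parsed file headers.
files = {
    "run_a.nc": {"temp": {"standard_name": "air_temperature", "units": "K"}},
    "run_b.nc": {
        "tas": {"standard_name": "air_temperature", "units": "K"},
        "scratch": {"long_name": "undocumented field"},
    },
}

catalogue = harvest(files)
print(catalogue)
```

Because both runs use the same standard name, a search for air temperature finds them even though the variables are named differently in each file.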
Some challenges

Current trends
Data
Diversity
Volume
Computation
Legacy analyses (IDL, …)
Collaboration
Cooperation across groups
Ensembles
Global + Regional
Publish more than papers
Build research ecosystem

Future
Lifecycle of research data
Researcher
Project
Research community
Archives: BADC, ECDF
Tools to capture metadata: instrument current codes + workflow
Easy transitions personal-project-world
Provenance: re-use/modify analyses
Public Web services

Current challenges
Wrap/instrument tools to give metadata + provenance in post-model analyses, impact modelling, …
Learn from:
–SYSMO (Univ. of Manchester)
–e-Science Central (Univ. of Newcastle)
–Steve!
Workflow with wrapped legacy tools?

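One way to picture "wrapping" an analysis tool so that it records provenance is a Python decorator that logs what was run, with which inputs, and what came out. This is only a sketch of the idea under simple assumptions: it is not how SYSMO or e-Science Central work, and the log structure and the `annual_mean` tool are hypothetical.

```python
import functools
from datetime import datetime, timezone

PROVENANCE_LOG = []  # in-memory record; a real system would persist this

def record_provenance(func):
    """Wrap a function so that each call appends a provenance record."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = datetime.now(timezone.utc).isoformat()
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "tool": func.__name__,
            "args": repr(args),
            "kwargs": repr(kwargs),
            "started_utc": started,
            "result": repr(result),
        })
        return result
    return wrapper

@record_provenance
def annual_mean(monthly_values):
    """Hypothetical post-model analysis step."""
    return sum(monthly_values) / len(monthly_values)

mean = annual_mean([280.0, 282.0, 284.0])
print(mean)
print(PROVENANCE_LOG[0]["tool"], PROVENANCE_LOG[0]["args"])
```

The analysis code itself is unchanged; the record is produced as a side effect of calling it, which is the appeal of instrumenting existing tools rather than rewriting them.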
Imminent challenges: impact / adaptation
Diagram labels:
–Models: socio-economics, climate, land, crops, …, ecologies, flood, urban, biodiversity
–Data: NetCDF; census (sociopolitical area); regular/nested grids; triangulated irregular networks
–Data synthesis: climate downscaling; point->area modelling; probabilistic data