Data: way forward
- GO-ESSP meeting next week in Paris
  - Will decide whether our ensemble definitions can become CF-approved ‘standard names’
  - May give other feedback
- Need to agree data gridding strategy
- Need to prioritize variables to be “served” online
- Need to give APCC some volume estimates
The plan … Distributed data serving
- Should allow easy scaling as more models join
- Allows us to get off the ground with limited resources
- Reduces political issues / accommodates requirements of funders
- Original idea: everyone serve their own data
  - This is perfectly scalable
  - BUT we want to allow data producers to do deals with data centres – some regional concentrations of data may be helpful
The plan … CF compliant netCDF
- There was a strong consensus from the scientists to use netCDF (rather than, e.g., GRIB)
- The aim is to use CF compliant netCDF
- We want to create netCDF files containing multi-model ensemble forecast data. This means we need the concept of an ensemble member number, and also a “multi-model” dimension that allows an adequate characterization of the models concerned.
- A prototype implementation has been created, and is presently being used to serve ENSEMBLES seasonal data at ECMWF, via a THREDDS aggregation server.
- Note: The same metadata could be encoded in other ways in the future
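As a rough illustration of the two concepts named on this slide (ensemble member number and a multi-model characterization), the sketch below flattens several models' members onto one ordered axis while keeping a label for each entry. It is not the prototype's encoding: the class `EnsembleEntry`, the helper names, and the model labels `model_a` and `model_b` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsembleEntry:
    """One position on a combined multi-model ensemble axis (hypothetical)."""
    model: str   # label identifying the contributing model (assumed naming)
    member: int  # ensemble member number within that model


def build_ensemble_axis(members_per_model):
    """Flatten {model label: number of members} into one ordered axis."""
    axis = []
    for model, n_members in members_per_model.items():
        for member in range(n_members):
            axis.append(EnsembleEntry(model=model, member=member))
    return axis


def find_index(axis, model, member):
    """Return the position on the axis of a given model/member pair."""
    return axis.index(EnsembleEntry(model, member))


# Two hypothetical models contributing 3 and 2 members respectively.
axis = build_ensemble_axis({"model_a": 3, "model_b": 2})
```

With this layout, a data array indexed along the combined axis can be traced back to its model and member number by looking up the corresponding entry.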
The plan … netCDF structure
- Document from Paco Doblas-Reyes outlines the approach taken
- Two points to highlight:
  - “forecast_reference_time” and “forecast_period” are independent time variables, needed to cope with forecasts which overlap in time.
  - An “ensemble” dimension is used to distinguish the ensemble members, and is used in several auxiliary variables, which between them encode sufficient metadata to identify which integration the data come from.
- … (But if you look at Paco’s document in detail, you will be able to judge better than me what are the critical points ….)
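To see why two independent time variables are needed, the sketch below computes the time a forecast value applies to as forecast_reference_time plus forecast_period. The start dates and lead times are invented for illustration, and the helper `valid_time` is hypothetical; it is a minimal sketch, not taken from the prototype.

```python
from datetime import datetime, timedelta


def valid_time(reference_time, forecast_period):
    """Time a forecast value applies to: start date plus lead time."""
    return reference_time + forecast_period


# Two hypothetical forecasts started one month apart (illustrative dates).
nov_start = datetime(2000, 11, 1)
dec_start = datetime(2000, 12, 1)

# Different start dates and different lead times ...
a = valid_time(nov_start, timedelta(days=45))
b = valid_time(dec_start, timedelta(days=15))

# ... can land on the same valid time, so a single time axis
# cannot tell the two forecasts apart.
overlap = a == b
```

Because both forecasts cover the same date, the pair (reference time, forecast period) is what identifies a value uniquely; the valid time alone does not.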
Reference http://www.ecmwf.int/research/EU_projects/ENSEMBLES/data/data_dissemination.html