Accessing Data and Products
Miha Razinger (ECMWF)
I would like to show you how to access the data and products generated by CAMS.
CAMS Datasets in Numbers
- Combined volume of datasets: 300 TB
- 180 individual products
- 33 distinct atmospheric parameters
- 9 data servers
These numbers sketch the scale of the datasets and the challenge of sharing them.
The challenge
- Diversity of data producers
- Variety of data formats and dataset volumes
- Existing data serving methods
[Note to self: find a better picture.]
Search query: "global surface ozone data June 2010"
In our day-to-day work we mostly deal with knowledgeable users. What would be the ideal journey of a user without any prior awareness of CAMS and Copernicus? -> semantic web search queries. Of course everybody would be happy if our datasets could be discovered as simply as this. If you type something like this into a search engine, CAMS products don't appear on the first two pages, so this is something we definitely need to improve. OK, let's take the easier case of someone who has already heard about CAMS and wants to know more. What kind of questions does such a potential user have?
Questions from a potential user
- What products and services are available from CAMS? Single entry point.
- How can I access the data? It depends.
- How can I read and interpret the data? It depends.
- Can I use a product in our commercial service? Free, full and open data policy; single data licence.
- What is the quality of a product? It depends.
- Is a product suitable for operational use? It depends.
We'll hear more about what users are actually asking in Karl's talk, but here are some questions that surely interest a potential user.
CAMS product catalogue
- Comprehensive and up-to-date inventory of CAMS products
- 180 individual products and services, each accompanied by a description, details about geographical and temporal coverage, data access links, previews, documentation, validation reports, contact details ...
- Interoperable: INSPIRE and WMO Core Profile compliant metadata
- Can be harvested through the OGC Catalogue Service for the Web (CSW), implemented using the pycsw library
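To make the CSW harvesting point concrete, here is a minimal sketch of how a client might build a CSW 2.0.2 GetRecords request using key-value parameters. The endpoint URL and the search term are assumptions for illustration only; the slides do not give the catalogue's actual address.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real catalogue URL is not given in the slides.
CSW_ENDPOINT = "https://catalogue.example.org/csw"


def build_getrecords_url(search_text, start=1, max_records=10, endpoint=CSW_ENDPOINT):
    """Return a CSW 2.0.2 GetRecords URL (key-value encoding) for a free-text search."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # Free-text match against all indexed metadata fields.
        "constraint": f"csw:AnyText like '%{search_text}%'",
        "startPosition": start,
        "maxRecords": max_records,
    }
    return f"{endpoint}?{urlencode(params)}"


url = build_getrecords_url("ozone")
print(url)
```

A harvester would typically page through the results by increasing `start` until all matching records have been retrieved.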
New catalogue BETA
Distributed data access
[Diagram labels: Product catalogue, FTP servers, WCS, WMS, Regional data servers, Solar radiation services, ECMWF data servers, ECPDS service]
Federated approach to data delivery. A zoo of data-serving solutions; some of the popular datasets are available in more than one way. Different levels of operational support. Different kinds of access: interactive and batch. The real problem is that this was done ad hoc and out of convenience rather than from a strategic plan, which makes it difficult to harmonize. Common user registration and statistics monitoring are still a distant dream.
Accessing global data
- New map application
- ECMWF data servers
- ECPDS data delivery
New interactive map application BETA
- OGC WMS and Leaflet library
- Fast development and publishing of maps
- Maps can be composed from various WMS sources
- Global or regional views, panning, zoom, fullscreen view
- Point forecast chart generation from the maps
- Regional forecasts, observations
Take it for a spin and send us your feedback at
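As an illustration of what a WMS client such as Leaflet sends to a map server, here is a minimal sketch that builds a WMS 1.3.0 GetMap request. The endpoint, the layer name and the time value are hypothetical; the slides do not list the actual servers or layers.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name: neither is given in the slides.
WMS_ENDPOINT = "https://maps.example.org/wms"
LAYER = "surface_ozone"


def build_getmap_url(layer, bbox, width=800, height=400, time=None, endpoint=WMS_ENDPOINT):
    """Return a WMS 1.3.0 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        # CRS:84 uses longitude/latitude axis order, matching the bbox tuple.
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    if time is not None:
        # Optional time dimension for forecast or analysis layers.
        params["TIME"] = time
    return f"{endpoint}?{urlencode(params)}"


url = build_getmap_url(LAYER, (-180, -90, 180, 90), time="2010-06-01T00:00:00Z")
print(url)
```

Because every source is addressed through the same request shape, layers from different WMS servers can be stacked in one map simply by pointing each layer at its own endpoint.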
Talking points
- catalogue as main entry point, distributed data servers
- find, preview, access
- use cases
- compo announcement: service changes, timing, schedule, data volume
- data usage numbers and trends
- DOIs
- direct feedback, but another survey might be useful
- data volumes and disk write speed limitations: subsetting, slicing, extracting
- maybe for the end
Highly available infrastructure, shared with ECMWF's ecCharts system. The data is kept in MARS, ECMWF's tape-based meteorological archive, or on disk in the case of smaller volumes. The size of a request is limited to 20 GB; for larger requests, Web API access should be used.
3 March 2015
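One rough way to stay under the 20 GB request limit is to estimate the daily volume in advance and split a long period into smaller date ranges. The sketch below is only an illustration under stated assumptions: the 3 GB per day figure is invented, the limit is taken as decimal gigabytes, and the helper `split_date_range` is hypothetical rather than part of any ECMWF tool.

```python
from datetime import date, timedelta

# 20 GB limit from the notes above, taken here as decimal gigabytes (assumption).
MAX_REQUEST_BYTES = 20 * 10**9


def split_date_range(start, end, bytes_per_day, limit=MAX_REQUEST_BYTES):
    """Split [start, end] (inclusive) into date ranges whose estimated size fits the limit."""
    if bytes_per_day <= 0:
        raise ValueError("bytes_per_day must be positive")
    if bytes_per_day > limit:
        raise ValueError("a single day already exceeds the request limit")
    days_per_chunk = limit // bytes_per_day
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


# Invented example: one month of data at an estimated 3 GB per day.
chunks = split_date_range(date(2015, 1, 1), date(2015, 1, 31), bytes_per_day=3 * 10**9)
for first, last in chunks:
    print(first.isoformat(), "to", last.isoformat())
```

If even a single day exceeds the limit, the helper raises an error, which is the case where the Web API route mentioned above is the better fit.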
Delivering global model results
- Hosting the global real-time analyses and forecasts, MACC reanalysis, GFAS and GHG flux inversion datasets
- Very capable of serving big volumes of data
- Services built on top of existing ECMWF systems
- Most of the data is archived in MARS, some on disk
- Not well suited to data browsing and occasional or lightweight data usage
ECPDS FTP data service
- Serves global model, boundary conditions and GFAS data streams
- FTP push and pull service
- Suitable for large data transfers
- 24/7 monitored and supported
- Highly available, load balanced, scalable service
- User account management
Global model upgrade
- On 21 June we will upgrade the model version
- New horizontal resolution (from about 80 km to about 40 km)
- Two forecast runs per day (00 UTC run available at 10 UTC, 12 UTC run at 22 UTC)
- 24/7 monitoring
- Test data available
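For users planning storage and transfers, a back-of-the-envelope estimate of how the resolution change affects field size may help. The sketch assumes a regular horizontal grid and an unchanged set of levels and parameters, neither of which is stated in the slides, and uses the approximate 80 km and 40 km figures as given.

```python
def horizontal_size_factor(old_spacing_km, new_spacing_km):
    """Approximate growth in grid points per 2D field when the grid spacing changes."""
    # Points per field scale with the inverse square of the grid spacing.
    return (old_spacing_km / new_spacing_km) ** 2


factor = horizontal_size_factor(80, 40)
print(f"Each horizontal field grows by roughly a factor of {factor:g}")
```

Under these assumptions, halving the grid spacing roughly quadruples the size of each field, before counting the effect of the second daily forecast run.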
Future plans
- Better data subsetting and slicing services (custom plots, subareas, time series, vertical profiles ...)
- Exposing catalogue to web crawlers
- Dataset DOIs: improved data citation, better usage tracking
- Follow CCCS Climate Data Store development
At the beginning there was a clash of cultures between the NWP community on one side and the atmospheric composition and air quality communities on the other. We made a few attempts to introduce the CF convention. As you know, it's a bit like Perl: there's more than one correct way to do it, and the problem was that each group was using its own (completely valid) but incompatible flavour. The lack of good common format validation tools made this difficult; producers are interested in science, not in studying specification documents in detail. If you are really paranoid, provide sample code.
By including more observational data and improving the model, we produce a lot of experimental datasets which are not suitable for wider distribution. However, there are users who can understand and accept these limitations and give us valuable feedback.