EGI workshop for E-infrastructure, April 2016 Amsterdam

Background, ocean observation e-infrastructure

Who is the community/project the use case belongs to?
  - ENVRIPLUS EU project
  - Ocean observations: Euro-Argo, SeaDataNet, Copernicus Marine service
What's the timeline for development, tests and large-scale operation?
  - First implementation in 2016
  - Large-scale operations from 2017
What's your role in the use case? Any experience/link to EGI?
  - Implement the Euro-Argo and Copernicus use case
  - Involve SeaDataNet, SOCAT and ESONET in a next step
  - Link with EGI through ENVRIPLUS
Users

Who will be the users of the planned community-specific e-infrastructure? How many of them?
  - Environmental monitoring and forecasting: EU ocean-atmosphere models
  - Calibration and validation observations: SMOS, Sentinel-3 satellite missions
  - Trans-disciplinary communities (ocean - atmosphere - biology - solid earth)
  - Convenient space to associate storage and CPU for advanced services (such as geo-spatial analysis)
  - Input stream or digital collaborative space for researchers or data providers
How will the users interact with the system?
  - Show this on a system architecture diagram
Who and how would validate the system?
  - Who: ENVRIPLUS project, Euro-Argo ERIC, SeaDataNet infrastructure
  - How: use case review
ENVRIPLUS data subscription use case

  - The user provides their criteria
      - time, spatial, parameter, data type
      - update period for delivery (daily, monthly, yearly, on the spot)
  - The relevant data are extracted from the ENVRIPLUS cloud
  - Data may be converted/transformed on the ENVRIPLUS grid
      - WPS?
  - The user's cloud account is updated regularly with the new data provided above
  - An accounting of data delivery is performed
      - MDC?
  - A citation scheme is attached to the delivered data
      - DOI
      - bibliographic surveys can track the use of these data in publications
      - reproducibility is possible
  - A user identification scheme is implemented
      - Federation of identities: Marine-ID, Shibboleth, OpenID
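The criteria-based extraction step of the subscription use case can be sketched as follows. This is only an illustration under stated assumptions: the `Subscription` class, the `matches` and `extract` functions, and the dictionary keys are hypothetical names, and the parameter codes in the comments are example values. The slide does not specify how the service is implemented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Subscription:
    """Hypothetical container for the criteria a user provides."""
    t_min: datetime          # start of the time window
    t_max: datetime          # end of the time window
    bbox: tuple              # (lon_min, lat_min, lon_max, lat_max), assumed layout
    parameters: frozenset    # example: {"TEMP", "PSAL"}
    data_types: frozenset    # example: {"profile"}
    period: str = "daily"    # daily, monthly, yearly or on the spot


def matches(obs: dict, sub: Subscription) -> bool:
    """Return True when one observation satisfies every criterion.

    Boxes crossing the antimeridian are not handled in this sketch.
    """
    lon_min, lat_min, lon_max, lat_max = sub.bbox
    return (
        sub.t_min <= obs["t"] <= sub.t_max
        and lon_min <= obs["x"] <= lon_max
        and lat_min <= obs["y"] <= lat_max
        and obs["parameter"] in sub.parameters
        and obs["dataType"] in sub.data_types
    )


def extract(observations, sub: Subscription) -> list:
    """Keep only the observations relevant to the subscription."""
    return [obs for obs in observations if matches(obs, sub)]
```

A delivery job could call `extract` at each update period and copy the result to the user's cloud account; scheduling, accounting and DOI assignment are outside this sketch.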
ENVRIPLUS cloud service

  - Euro-Argo, SeaDataNet, Copernicus datasets
      - 5 billion ocean observations
      - 300 parameters
      - observing platforms from 1900 to today
  - This "cloud" of observations is pushed and continuously updated on the ENVRIPLUS cloud (EGI, EUDAT, …)
  - Copies are replicated in different places, close to users' locations (EU, US, Australia, Japan)
Argo floats observations in 2015 (& others)

[Figure: vertical profiles, 2015 observations. Platforms shown: Argo floats - sea-mammals - XBT and CTDs from vessels - gliders]
Ocean observation cloud data model

  - Observation data model: a flat table of 5 billion records
      - ID, platformCode, dataType, x, y, z, t, parameter, value, value_qc
  - Observation metadata
      - JSON collections of metadata (platform codes, parameter codes, data types)
  - Observations are hosted in a workplace such as
      - Hadoop, NoSQL files or PostgreSQL
      - The use of in-memory features would provide the best reactivity (instant answers)
  - Metadata are indexed with ElasticSearch
  - The workplace is activated in a virtual server, replicated on the cloud
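To make the flat-table model concrete, here is a minimal sketch that loads a few records into an in-memory table and queries them by parameter and bounding box. It uses SQLite from the Python standard library purely as a stand-in for the stores named on the slide. The column types, the reading of x/y/z as longitude/latitude/depth, the ISO time strings and the function names are assumptions, not part of the original data model.

```python
import sqlite3


def build_table(rows):
    """Create an in-memory flat observation table and load `rows` into it.

    Each row follows the slide's column order:
    (ID, platformCode, dataType, x, y, z, t, parameter, value, value_qc).
    Column types are assumptions made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        "id INTEGER PRIMARY KEY, platformCode TEXT, dataType TEXT, "
        "x REAL, y REAL, z REAL, t TEXT, "
        "parameter TEXT, value REAL, value_qc INTEGER)"
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    # An index on the most selective criteria keeps lookups fast.
    conn.execute("CREATE INDEX idx_param_t ON observation (parameter, t)")
    return conn


def query(conn, parameter, lon_min, lat_min, lon_max, lat_max):
    """Return matching observations ordered by time (x = lon, y = lat assumed)."""
    cur = conn.execute(
        "SELECT platformCode, x, y, z, t, value FROM observation "
        "WHERE parameter = ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY t",
        (parameter, lon_min, lon_max, lat_min, lat_max),
    )
    return cur.fetchall()
```

One row per (platform, position, time, parameter) keeps the schema identical for floats, gliders and vessel casts, at the cost of repeating the position and time for every parameter measured at the same point.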
Federated cloud data distribution

[Diagram: three Research Infrastructures, a data VM, and a "my cloud" VM]
ENVRIPLUS e-infrastructure architecture

  - Data and metadata workplace: Drill, HBase
  - Geographic index and facets: ElasticSearch, Drill
  - Data discovery and visualization: OpenLayers 3, AngularJS, Bootstrap, HTML5/CSS3, Material Design
  - Data extraction: Drill, SQL, Pig
  - Data subscription service: OwnCloud
  - Log management and accounting: Kibana
  - CSV & JSON data upload: Drill
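As an illustration of what "facets" means for the metadata index, the sketch below counts how many metadata records share each value of a given field. It is plain Python over a JSON collection, not the ElasticSearch or Drill API; the function name and the sample field names are hypothetical.

```python
import json
from collections import Counter


def facet_counts(metadata_json: str, field: str) -> Counter:
    """Count metadata records per distinct value of `field`.

    `metadata_json` is assumed to be a JSON array of objects, one per
    platform; records lacking the field are ignored.
    """
    records = json.loads(metadata_json)
    return Counter(rec[field] for rec in records if field in rec)
```

A discovery interface would typically show such counts next to each filter value (for example, the number of platforms per data type) so that users can narrow a search step by step.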
Ocean observation e-services

Data services to be developed around the Euro-Argo cloud
  - Ocean observations API: profiles, time-series, trajectories
  - Metadata visualization: WMS map services
  - Data visualization: graphics
  - Data products such as mixed-layer depth maps
Agile and incremental implementation
  - Step 1: Euro-Argo data file on the cloud (1 TB file, updated daily)
  - Step 2: VM for indexing the data file (ElasticSearch)
  - Step 3: data file generation service (CSV, then NetCDF)
  - Step 4: data subscription/distribution service to OwnCloud accounts
  - Step 5: replicate data and VM in mirror sites (EU, US, AU, JP)
Next steps: promote the development of various services around an ENVRIPLUS cloud
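The CSV half of Step 3 can be sketched with the Python standard library. This is a minimal illustration, assuming each observation is a dictionary keyed by the column names of the flat data model; the `to_csv` function and the `FIELDS` constant are hypothetical names, and NetCDF generation is not covered.

```python
import csv
import io

# Column order taken from the flat observation data model.
FIELDS = ["id", "platformCode", "dataType", "x", "y", "z", "t",
          "parameter", "value", "value_qc"]


def to_csv(records) -> str:
    """Serialize observation dictionaries to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

The returned text could then be written to a file in the subscriber's OwnCloud folder, which is the hand-off point to Step 4.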
Current status

Which components/services already exist in your architecture?
  - The data and metadata files are available
  - The ElasticSearch index is available
Which components/services are under development (and by whom)?
  - The web interface for discovery and data access (Ifremer)
  - The data processing services (Ifremer)
Which components/services do you expect to get from EGI?
  - Data storage and distribution
  - Data processing
Plans for today

What questions would you like to get answered today?
  - Can we replicate data, metadata and index on the EGI infrastructure?
  - Do you have equivalent services based on these technologies?
What issues would you like to solve today?
  - How do we push our data and metadata files?
  - Do you have a federation of identities?
Other outcomes that you would like to get out of the workshop?
  - Meet groups interested in interdisciplinary studies