EUDAT Collaborative Data Infrastructure

Presentation transcript:

EUDAT Collaborative Data Infrastructure www.eudat.eu/services This presentation introduces the B2 Service Suite and is aimed at a general public. It also provides information on the organization that develops, maintains and deploys the services: the EUDAT initiative. The B2 Service Suite is a number of data services supporting researchers, research communities and research infrastructures. Detailed information on the services can be found at eudat.eu/services. This presentation consists of 22 slides and will take 10 minutes. This work is licensed under the Creative Commons CC-BY 4.0 licence. Attribution: EUDAT – www.eudat.eu Version 2016-1

Outline Mission of the infrastructure Structure and governance Service portfolio Long-term vision and sustainability Relevant sessions during DI4R How to engage with the initiative after DI4R

Mission of the e-Infrastructure To build a Collaborative Data Infrastructure (CDI) as a pan-European solution to the challenge of data proliferation in Europe’s scientific and research communities. To allow researchers, through the CDI, to share data within and between communities and to carry out their research effectively. To provide a solution that is affordable, trustworthy, robust, persistent, open and easy to use. To support multiple research communities by working closely with them to deliver effective technical solutions. The B2 Service Suite is developed, maintained and deployed by the EUDAT initiative, a pan-European initiative building a sustainable cross-disciplinary and cross-national data infrastructure that provides a set of shared services for accessing and preserving research data. EUDAT supports multiple research communities by working closely with them to deliver these technical services as part of the EUDAT Collaborative Data Infrastructure, the CDI.

Collaborative Data Infrastructure (CDI) A collaboration between Service Providers and Research Communities. EUDAT: generic data service provider (storage, workflows, processing, archive). Community Repositories (thematic data centres). A Partnership Agreement specifying the mutual obligations between the EUDAT centres. A portfolio of data management services. A data and service model that ensures the CDI’s interoperability, extensibility and stability. Looking to the future we have the CDI partnership agreement and a secretariat as a basis for future coordination.

A truly pan-European Infrastructure Geographically distributed, resilient network of 35 European organisations Data are safely stored alongside some of Europe’s most powerful supercomputers. The EUDAT consortium consists of 35 partners which include high performance computing centers, data centers, libraries, scientific communities and data scientists. The EUDAT vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment.

Community-Driven Solutions EUDAT services are designed, built and implemented based on user community requirements. PHYSICAL SCIENCES & ENGINEERING SOCIAL SCIENCES & HUMANITIES MATERIALS & ANALYTICAL FACILITIES ENVIRONMENTAL SCIENCES MAPPER BIOMEDICAL & MEDICAL SCIENCES EUDAT cooperates with a wide variety of research communities, such as the medical and biomedical sciences, environmental sciences, materials and analytical facilities, social sciences and humanities, and physical sciences and engineering. EUDAT has concrete agreements with 7 core communities, an integral part of the initiative, namely: CLARIN: Common Language Resources and Technology Infrastructure; ELIXIR: A distributed infrastructure for life-science information; ENES: European Network for Earth System Modelling; EPOS: European Plate Observing System; ICOS: Integrated Carbon Observation System; LTER Europe: European Long Term Ecological Research Network; VPH: Virtual Physiological Human. EUDAT works on a Collaborative Data Infrastructure conceived as a network of collaborating, cooperating centers, combining the richness of numerous community specific data repositories with the permanence and persistence of some of Europe’s largest scientific data centers.

CDI Data Domain EUDAT Data Domain modeled on the ANDS Data Curation Continuum (ANDS: Australian National Data Service, www.ands.org.au)

B2 Service Suite B2ACCESS B2HANDLE Integration / Usability /

Relevant sessions during DI4R EUDAT presentation EUDAT B2FIND: A Cross-Discipline Metadata Service and Discovery Portal, 28 September 2016 @11:30 am CEST (1D) EUDAT presentation Coupling Data and HPC resources together: the EUDAT - PRACE Collaboration activity, 28 September 2016 @11:30 am CEST (1B) EUDAT presentation B2SHARE - Record lifecycle and HTTP API, 30 September 2016 @9:00 am CEST (7D) EUDAT training session Explore and design your own workflow for safe data replication, 30 September 2016 @9:00 am CEST (7D) EUDAT will also be present with an exhibition booth for the full duration of the conference.

For more info: B2DROP: b2drop.eudat.eu, eudat.eu/services/userdoc/b2drop. B2SHARE: b2share.eudat.eu, eudat.eu/services/userdoc/b2share. B2SAFE: eudat.eu/services/userdoc/b2safe. B2STAGE: eudat.eu/services/userdoc/b2stage. B2FIND: b2find.eudat.eu, eudat.eu/services/userdoc/b2find. B2ACCESS: b2access.eudat.eu, eudat.eu/services/userdoc/b2access-usage. Most of the EUDAT services have their own dedicated service domain and documentation. For all services, documentation, tutorials and training are either in preparation or already available. The service suite will be enhanced and expanded, so keep in touch. If you have any questions or remarks concerning the services or the EUDAT initiative, please use the contact form available at the EUDAT website: eudat.eu.

Don’t forget to visit our booth Venue 2 - Exhibition and Catering Area For more information visit: www.eudat.eu Find out more about EUDAT User Documentation: www.eudat.eu/services/userdoc EUDAT User Training: https://eudat.eu/training

Thank you! Hilary Hanahoe, Trust-IT Kostas Kavoussanakis, EPCC René van Horik, DANS Hans van Piggelen, SURFsara Mark van de Sanden, SURFsara Giuseppe Fiameni, CINECA

Supporting Metadata Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information. Provides additional quality to the data Makes data findable and interpretable Metadata is essential to communities

B2DROP: EUDAT’s Personal Cloud Storage Service B2DROP is a secure and trusted data exchange service for researchers and scientists to keep their research data synchronized and up-to-date and to exchange it with others. The B2 Service Suite consists of five services. Each service is presented in the following slides, starting with B2DROP. The B2DROP service can be characterized as a personal cloud storage service. It is a secure and trusted data exchange service.

An ideal solution for researchers and scientists to: Store and exchange data with colleagues and team members, including research data not finalized for publishing share data with fine-grained access controls synchronize multiple versions of data across different devices The B2DROP service is a cloud solution to store and share data in the early stage of the research data life cycle. It is aimed at individual researchers and enables the storage and exchange of data with colleagues and team members. Data can be shared with fine-grained access controls. B2DROP synchronizes multiple versions of data across different devices and platforms. B2DROP users are offered up to 20 GB of storage space for their data. The B2DROP service can be found at b2drop.eudat.eu. Features: 20GB storage per user Living objects, so no PIDs Versioning and offline use Desktop synchronisation

B2SHARE B2SHARE is a user-friendly, reliable and trustworthy way for researchers, scientific communities and scientists to store and share small-scale research data from diverse contexts. The next service of the B2 Service Suite is the B2SHARE service to store and share small-scale research data from diverse contexts.

A winning solution for researchers, scientists and communities to: store data safely at a trusted and certified data centre preserve data to guarantee long-term persistence control access and share data with colleagues and the world The B2SHARE service is aimed at individual researchers. It has been integrated in a number of research infrastructures and EUDAT defines custom made community based metadata schema templates to facilitate users. B2SHARE facilitates data storage in a trusted and certified repository that guarantees long-term persistence of the data. Data objects get a persistent identifier. Depositors can document their data objects and give the data a usage license, preferably an open access license. The B2SHARE service can be found at b2share.eudat.eu. Features: metadata management permanent PIDs Open Access support

B2SAFE B2SAFE is a robust, safe and highly available service which allows community and departmental repositories to implement data management policies on research data across multiple administrative domains in a trustworthy manner. The third service of the B2 Service Suite is the B2SAFE service. This service allows community and department repositories to implement data management policies on research data across multiple administrative domains.

The ideal solution for communities with no facility for archival to: replicate research data into secure data stores archive and preserve research data in the long-term bring data close to powerful compute resources co-locate data with different communities benefit from economies of scale The B2SAFE service is aimed at research communities that have no facilities for archival data storage. The service supports a number of procedures such as data replication and the co-location of data with different communities. B2SAFE facilitates petabyte-scale storage. More information on the B2SAFE service can be found at eudat.eu/b2safe. Features: large-scale storage robust and highly available permanent PIDs
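As a rough illustration of what "replicating research data into secure data stores" involves, the sketch below copies a file into several target directories and checks each copy against a checksum. It is a generic example built on local directories only: the function names, the directory names `site-a` and `site-b`, and the sample file are assumptions made for illustration, and nothing here reflects how B2SAFE itself is implemented.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, stores: List[Path]) -> Dict[str, str]:
    """Copy `source` into every store and verify each replica's checksum."""
    expected = sha256(source)
    replicas: Dict[str, str] = {}
    for store in stores:
        store.mkdir(parents=True, exist_ok=True)
        target = store / source.name
        shutil.copy2(source, target)
        actual = sha256(target)
        if actual != expected:
            raise IOError(f"checksum mismatch for {target}")
        replicas[str(target)] = actual
    return replicas


# Hypothetical stand-ins for two remote data stores.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    data = root / "run-001.dat"
    data.write_bytes(b"simulation output")
    result = replicate(data, [root / "site-a", root / "site-b"])
    print(len(result), "verified replicas")
```

Verifying the checksum after every copy is the design point: a replica is only recorded once it has been shown to match the source.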

B2STAGE B2STAGE is a reliable, efficient, light-weight and easy-to-use service to transfer research data sets between EUDAT storage resources and high-performance computing (HPC) workspaces. The B2STAGE service enables the movement of large amounts of data between data stores and high-performance computing resources.

Facilitating communities to: move large amounts of data between data stores and high-performance compute resources re-ingest computational results back into EUDAT deposit large data sets onto EUDAT resources for long-term preservation The B2STAGE service is aimed at research communities and infrastructures to move large amounts of data between data stores and high-performance computing resources, to re-ingest computational results back into EUDAT and to deposit large data sets onto EUDAT resources for long-term preservation. More information on the B2STAGE service can be found at: eudat.eu/b2stage. Features: high-speed transfer reliable and light-weight manages permanent PIDs

B2FIND B2FIND is a simple, user-friendly metadata catalogue of research data collections stored in EUDAT data centres and other repositories. The last service of the B2 Service Suite is the B2FIND service. It can be characterized as a simple, user-friendly metadata catalogue of research data collections stored in EUDAT data centers and other repositories.

A metadata catalogue service to: seek data objects and collections using powerful metadata searches catalogue community data by means of selected metadata browse through multi-disciplinary data collections filtered by content, provenance and temporal keywords The B2FIND service enables the searching and browsing for data objects and collections and supports a number of metadata formats. B2FIND facilitates browsing through multi-disciplinary data collections. More information on the B2FIND service can be found at: https://b2find.eudat.eu. Features: simple to use standards-based comprehensive catalogue
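A metadata search of this kind might be scripted roughly as follows. The sketch assumes the catalogue exposes a CKAN-style action API (`/api/3/action/package_search`) under b2find.eudat.eu; the slides do not name the API, so treat the endpoint as an assumption. The sample response is made up, and no network request is sent.

```python
from typing import List
from urllib.parse import urlencode

# Assumption: a CKAN-style search endpoint; not stated in the presentation.
SEARCH_ENDPOINT = "https://b2find.eudat.eu/api/3/action/package_search"


def search_url(query: str, rows: int = 10) -> str:
    """Build the URL for a free-text metadata search returning `rows` hits."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query, "rows": rows})


def titles(response: dict) -> List[str]:
    """Extract dataset titles from a CKAN-style search response."""
    if not response.get("success"):
        return []
    return [record.get("title", "") for record in response["result"]["results"]]


# Made-up response in the CKAN package_search shape, for illustration only.
SAMPLE_RESPONSE = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-1", "title": "Example ocean temperature series"},
            {"name": "example-2", "title": "Example seismic event catalogue"},
        ],
    },
}

print(search_url("sea surface temperature", rows=5))
print(titles(SAMPLE_RESPONSE))
```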

B2HANDLE B2HANDLE provides an abstraction layer between a globally unique persistent identifier and a physical location of a data object allowing researchers to reliably cite and refer in the long term.

Reliability through mutual PID mirroring Provides an abstraction layer between a globally unique persistent identifier and the physical location of data objects Follows policies to register data and make it referable and citable in the long term The service provides high reliability and availability and can be easily integrated into any other service or application using an HTTP RESTful API. The service is therefore technology-agnostic. For more information see: eudat.eu/b2handle Features: Reliability through mutual PID mirroring Machine readable via HTTP RESTful API Simple integration with any service Technology agnostic
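The idea of a persistent identifier that stays fixed while the physical location changes can be sketched as follows. The example assumes the public Handle System proxy REST interface (`https://hdl.handle.net/api/handles/<handle>`), which is not mentioned in the slides. The handle `12345/example-object` and the sample record are invented, and B2HANDLE's own tooling may work differently.

```python
import json
from typing import Optional
from urllib.parse import quote

# Assumption: the public Handle System proxy; not taken from the presentation.
PROXY = "https://hdl.handle.net/api/handles/"


def resolution_url(handle: str) -> str:
    """Build the REST URL that asks the proxy for a handle record."""
    return PROXY + quote(handle, safe="/")


def location_of(record: dict) -> Optional[str]:
    """Return the current location (the URL value) stored in a handle record."""
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None


# Invented record in the JSON shape returned by the proxy.
SAMPLE_RECORD = json.loads("""
{"responseCode": 1, "handle": "12345/example-object",
 "values": [{"index": 1, "type": "URL",
             "data": {"format": "string",
                      "value": "https://repository.example.org/objects/42"}}]}
""")

print(resolution_url(SAMPLE_RECORD["handle"]))
print(location_of(SAMPLE_RECORD))
```

A citation only ever contains the handle. If the object moves, the URL value in the record is updated and existing citations keep resolving.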

B2ACCESS B2ACCESS is an easy-to-use and secure authentication and authorization platform which can be integrated with any service and supports different methods of authentication.

An easy-to-use and secure authentication and authorization platform integrated with any service. The user may log in by using different methods of authentication: home organisation identity provider, social ID, or EUDAT ID. Allows group, community and service managers to specify authorisation decisions. B2ACCESS provides an easy-to-use and secure authentication and authorization platform integrated in all other services. It provides different methods of authentication through the home organisation identity provider, but also allows social IDs like Google and Facebook as well as the EUDAT ID. Managers can specify authorisation decisions in the dedicated interface. For more information see: b2access.eudat.eu Features: easy integration in any service reliable and light-weight powerful management interface

Minimize data transfers: move tools, not data. Enable data processing close to the data. Users provide community-specific execution engines (e.g. Docker containers). Stimulate reproducible results and reuse of execution templates. Integration with CDI data services: containers to access B2SHARE, B2DROP, B2SAFE. Integration with B2ACCESS. Provide prototype.

Make implicit assumptions explicit and persistent The Problem The Solution Understanding scientific data and metadata is hard Researcher 1: “Could you tell me what column 12 means in the CSV file you referenced in paper A from 5 years ago?” Researcher 2: “Uh, I believe it’s a number” R1: “I can see that. Could it be a temperature?” R2: “Probably” R1: “Fahrenheit? Celsius?” R2: “Maybe Kelvin or Rankine?” R1: “Kelvin?” R2: “On second thought, maybe it’s not really a temperature” R1: “…” Make implicit assumptions explicit and persistent Make data and metadata interpretable and interoperable, for humans and machines Make data types shareable and reusable
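The "column 12" exchange can be turned into a small sketch of what an explicit, machine-readable column description might look like. The `ColumnType` class, its field names and the unit labels are hypothetical; they illustrate the idea of recording quantity and unit alongside the data and are not an EUDAT data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnType:
    """Hypothetical description of one CSV column: what it measures and in which unit."""
    name: str
    quantity: str
    unit: str


def to_kelvin(value: float, column: ColumnType) -> float:
    """Convert a temperature reading to kelvin using the column's declared unit."""
    if column.quantity != "temperature":
        raise ValueError(f"{column.name} is not a temperature column")
    if column.unit == "K":
        return value
    if column.unit == "degC":
        return value + 273.15
    if column.unit == "degF":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    if column.unit == "degR":
        return value * 5.0 / 9.0
    raise ValueError(f"unknown temperature unit: {column.unit}")


# With the unit declared, "what does column 12 mean?" has a recorded answer.
column_12 = ColumnType(name="column_12", quantity="temperature", unit="degC")
print(to_kelvin(8.25, column_12))
```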

Based on CNRI’s Digital Object Repository and Registry software CORDRA + EPIC handles Definition of primitive and derived Data Types (via composition of primitive types) Data Types are assigned unique and resolvable EPIC handles for persistent identification and retrieval Data Types are validated against pre- configured JSON schemas Data Types are indexed to allow content-based queries Data Type Versioning WUI and REST API access Pilot service available
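The composition of primitive types into derived types, each with its own identifier and validation, can be sketched in a few lines. This is a toy in-memory registry and not the CORDRA or EPIC API: the class, the identifier prefix `hypothetical-prefix` and the field names are assumptions, and a hand-written structural check stands in for the JSON schema validation the slide mentions.

```python
from typing import Dict

# Primitive types, identified by reserved names.
PRIMITIVES = {"string": str, "number": (int, float), "boolean": bool}


class TypeRegistry:
    """Toy registry: derived types are compositions of already known types."""

    def __init__(self, prefix: str = "hypothetical-prefix") -> None:
        self._prefix = prefix
        self._derived: Dict[str, Dict[str, str]] = {}

    def register(self, fields: Dict[str, str]) -> str:
        """Register a derived type and return its identifier."""
        for type_id in fields.values():
            if type_id not in PRIMITIVES and type_id not in self._derived:
                raise KeyError(f"unknown type: {type_id}")
        pid = f"{self._prefix}/{len(self._derived) + 1}"
        self._derived[pid] = dict(fields)
        return pid

    def validate(self, type_id: str, value: object) -> bool:
        """Check a value against a primitive or derived type, recursively."""
        if type_id in PRIMITIVES:
            if isinstance(value, bool) and type_id != "boolean":
                return False  # bool is a subclass of int in Python; reject it as a number
            return isinstance(value, PRIMITIVES[type_id])
        fields = self._derived[type_id]
        if not isinstance(value, dict) or set(value) != set(fields):
            return False
        return all(self.validate(t, value[name]) for name, t in fields.items())


registry = TypeRegistry()
temperature = registry.register({"value": "number", "unit": "string"})
measurement = registry.register({"station": "string", "temperature": temperature})

good = {"station": "A1", "temperature": {"value": 281.4, "unit": "K"}}
bad = {"station": "A1", "temperature": 281.4}  # unit left implicit
print(registry.validate(measurement, good), registry.validate(measurement, bad))
```

Because `measurement` refers to `temperature` by identifier, the unit requirement is inherited by every type that reuses it, which is the sense in which data types become shareable and reusable.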

An annotation is “a note added to a text, book, drawing, etc., as a comment or an explanation” (from Merriam Webster) Provide a service to add an assertion to a digital asset Manual annotations via WUI, automatic via a REST API Integrate with B2FIND and B2SHARE Prototype available at http://b2note.bsc.es

Requirements Communities and users (e.g. PRACE) want a deposit area for digital entities from computing simulations Connect to existing, community-specific access services Support multiple protocols: GridFTP, WebDAV, POSIX (full, light, like) Integrated within the CDI domain and services