Towards a pan-European Collaborative Data Infrastructure

Presentation transcript:

EUDAT: Towards a pan-European Collaborative Data Infrastructure
Martin Hellmich (slides adapted from Damien Lecarpentier)
DCH-RP workshop, Manchester, 10 April 2013

Research Infrastructures
Research infrastructure trends: internationalisation; diversification; increasing reliance on ICT.
The data deluge is a common challenge.
European RIs: around 500; €100 billion investment.
Timeline (figure): Middle Ages, 19th century, 20th century, 21st century.

Data trends
Exponential growth: Gigabytes, Terabytes, Petabytes, Exabytes, Zettabytes.
Increasing complexity and variety.
Where to store it?
How to find it?
How to make the most of it?
How to ensure interoperability?

Collaborative Data Infrastructure: a framework for the future?
Layers (figure): Data Generators and Users; Community Support Services; Common Data Services.
Cross-cutting concerns: Data Curation; Trust.
Functions shown in the figure:
User functionalities, data capture & transfer, virtual research environments
Data discovery & navigation, workflow generation, annotation, interpretability
Persistent storage, identification, authenticity, workflow execution, mining
What a CDI offers:
Fosters synergies
Economies of scale
Fosters collaboration (and interoperability) between the different infrastructures

Data Centers and Communities

Five research communities on board
EPOS: European Plate Observing System. Data and observatories for earthquakes, volcanoes and tectonics, based on sensor data.
CLARIN: Common Language Resources and Technology Infrastructure. Making language resources and technology usable.
ENES: Service for Climate Modelling in Europe. Simulations of the climate system using HPC.
LifeWatch: Biodiversity Data and Observatories. Biodiversity research.
VPH: The Virtual Physiological Human. Biomedical modelling and simulation of the human body.
All share common challenges:
Reference models and architectures
Persistent data identifiers
Metadata management
Distributed data sources
Data interoperability
Project partners represent the data scientists in these consortia.

Communities ↔ Data Centers
Requirements, service cases
Technology appraisal & matching
Service provision

Building Blocks of the CDI
EUDAT Portal: integrated APIs and harmonized access to EUDAT facilities
Metadata Catalogue: aggregated EUDAT metadata domain; data inventory
AAI: network of trust among authentication and authorization actors
Data Staging: dynamic replication to HPC workspace for processing
Safe Replication: data curation and access optimization
Simple Store: researcher data store (simple upload, share and access)

Infrastructure – first pilots
Figure: pilot sites for the communities ENES, VPH, CLARIN, LifeWatch and EPOS.
Legend: EUDAT service provider; community service provider.
Services piloted: Safe Replication; Data Staging.

Principles – where we want to be (1)
1: Data deposited with the EUDAT CDI will be preserved in perpetuity.
Costs will (probably) have to fall on data producers (or their funders) at the time of data deposit.
2: Access to data in the EUDAT CDI is free at the point of use.
Charging users for access or use creates a barrier that runs counter to the principles of open access.
Subject to legal constraints etc.

Principles – where we want to be (2)
3: Data are best curated in their own communities.
Data producers are central to discussions about their data's long-term preservation.
4: EUDAT will operate as a federation of community-facing repositories and “back-office” hosting providers.
EUDAT must operate as a partnership model with a distributed infrastructure hosting common services.

Principles – where we want to be (3)
5: EUDAT services and infrastructure must be a suitable target for “Trustworthy Digital Repository” (TDR) outsourcing (cf. datasealofapproval.org).
This opens the door for existing TDRs to join the EUDAT federation.
6: EUDAT will not assert ownership of any data that it holds.
Note: a TDR is an organisation set up to preserve digital data forever. It satisfies a stringent list of requirements, developed by the main governmental and research libraries in the world, to ensure the continuing availability, security and usability of the data. It follows standards such as the Data Seal of Approval.

Work plan
Moving the services to a production environment
Integrating new partners into EUDAT (in particular research communities): working groups, pilots, observers and associate partners
Collaborating with other initiatives:
European e-Infrastructures: EGI, PRACE, DANTE, HELIX NEBULA, SCIDIPS-ES, TERENA, etc.
Global initiatives: RDA, CODATA, etc.
Defining EUDAT’s path to sustainability: cost and funding models; governance

Your Questions
Contact: Damien Lecarpentier, damien.lecarpentier@csc.fi
How to submit feedback? EUDAT User Forums, service contact lists
http://eudat.eu/2nd-eudat-user-forum -> Fact sheets
We welcome use cases from DCH-RP! We want to know how you want to be involved.
D7.2.1: Managing data curation and long-term preservation in a federated environment (not available yet)

eudat-info@postit.csc.fi

Hierarchy of data needs
Pyramid levels (figure), from top to bottom:
Combining data: combining data in imaginative new ways to solve problems
Sharing data: sharing data across communities
Improving data usability and reusability: metadata catalogue, persistent identifiers
Keeping data safe and accessible: data security, data standards, data curation
Storing and archiving data: data archive, data storage facilities
Value creation through openness and sharing relies on more basic data needs being met first.
Note: Maslow’s hierarchy of needs is a psychological model of human needs that starts with the most fundamental. A person cannot achieve their full potential unless fundamental needs are met. The main motivation for a collaborative data infrastructure comes from the belief that sharing and combining data will allow European researchers to work together to tackle societal challenges.