Data Repositories and Science Gateways for Open Science Presenter: Roberto Barbera – UNICT and INFN EGI Community Forum Bari – 11 November 2015.

Presentation transcript:

Data Repositories and Science Gateways for Open Science Presenter: Roberto Barbera – UNICT and INFN EGI Community Forum Bari – 11 November 2015

Outline
- Introductory concepts, definitions and driving considerations
- A viable approach to Open Science
- Summary and conclusions

The Scientific Method
Examples of IR: Classical Mechanics, Newton’s Gravitation Theory
Examples of DR: General Relativity, Standard Model of Particle Physics
G. Galilei

The Pillars of the Scientific Method
Repeatability: the closeness of agreement between independent results obtained with the same method on identical test material, under the same conditions (same operator, same apparatus, same laboratory and after short intervals of time). Affected by random errors.
Reproducibility: the closeness of agreement between independent results obtained with the same method on identical test material but under different conditions (different operators, different apparatus, different laboratories and/or after different intervals of time). Affected by systematic errors.
Is science really reproducible?
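The two definitions above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the measured quantity, the size of the random error and the spread of laboratory offsets are invented for illustration and do not come from the slides.

```python
import random
import statistics


def measure(true_value, systematic_offset, random_sigma, rng):
    """One measurement = true value + lab-specific offset + random noise."""
    return true_value + systematic_offset + rng.gauss(0.0, random_sigma)


rng = random.Random(42)  # fixed seed so the sketch is deterministic
TRUE_VALUE = 9.81        # illustrative quantity (assumption)
RANDOM_SIGMA = 0.02      # assumed random error of a single measurement
LAB_SIGMA = 0.10         # assumed spread of systematic offsets across labs

# Repeatability: one laboratory (one fixed offset), 50 repeated measurements.
single_lab_offset = rng.gauss(0.0, LAB_SIGMA)
same_conditions = [
    measure(TRUE_VALUE, single_lab_offset, RANDOM_SIGMA, rng) for _ in range(50)
]

# Reproducibility: 50 laboratories, each with its own offset, one measurement each.
different_conditions = [
    measure(TRUE_VALUE, rng.gauss(0.0, LAB_SIGMA), RANDOM_SIGMA, rng)
    for _ in range(50)
]

# The sample standard deviation stands in for "closeness of agreement".
repeatability_sd = statistics.stdev(same_conditions)
reproducibility_sd = statistics.stdev(different_conditions)

print(f"repeatability spread:   {repeatability_sd:.3f}")
print(f"reproducibility spread: {reproducibility_sd:.3f}")
```

Under these assumptions the spread under repeatability conditions reflects only the random error, while the spread under reproducibility conditions also picks up the laboratory-to-laboratory offsets.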

Challenges in irreproducible research

The “reproducibility crisis”
Out of 18 microarray papers, results from 10 could not be reproduced
1. Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14
2. Science publishing: The trouble with retractions
3. Bjorn Brembs: Open Access and the looming crisis in science

Repeatability and Reproducibility are not all

How e-Infrastructures support the (e-)Scientific Method
[Diagram labels: Data Infrastructures; Open Access Doc. Repos.; Data Repos.; Semantic-web enrichment of linked data; Data preservation; HTC/HPC Clusters; Grids, Clouds]
Challenge: «walk» across the knowledge path both ways

Open Science

An INFN approach to Open Science: the “grand” view
[Diagram labels: Digital Repository of Research Products (pilot); arXiv; CNR S&T DL; CINECA VQR; INFN Multimedia; SINGLE – MANDATORY – DEPOSIT; SCIENCE PRODUCTS REPRODUCIBILITY; ORCID; INFN Gray Lit.; SCOAP3]

The INFN Open Access Repository
[Slide labels: papers; data; Automatic ingestion in place from: …; federated authentication]

Alternative reputation systems: possibility to add researcher IDs

Examples of document and data resources
Data stored on:

Example of software resources: the ALICE Virtual Research Environment

Example of research “package”

The OAR Knowledge Workflow

The OAR Knowledge Workflow: ALEPH data search & discovery

The OAR Knowledge Workflow: ALEPH “packages” inspection
1. From the OAR, an “analysis” can be selected as simply as any other resource in the archive.
2. Clicking on RUN PAGE, the researcher can either reproduce or extend that particular analysis using a Catania Science Gateway.

The OAR Knowledge Workflow: ALEPH data analysis (1/2)
The Science Gateway collects from the OAR the metadata associated with the dataset(s) needed to run that particular analysis, and lets the user browse them.

The OAR Knowledge Workflow: ALEPH data analysis (2/2)
- Using the JSAGA adaptor for all OCCI-compliant cloud middleware, the Science Gateway starts a dedicated VM already configured with all the experiment software.
- Both the CHAIN-REDS Cloud Testbed and the EGI Federated Cloud can be used as e-Infrastructures.
[Figure labels: “Data are retrieved from”; “Jobs run both on … and”]

Remember: repeatability and reproducibility are not all
Reusability and «extensibility» matter!

Reusability of ALICE data with the CHAIN-REDS Science Gateway (1/3)
1. From within the CHAIN-REDS Science Gateway, entitled researchers can start VMs already configured to re-use/extend ALICE data analyses.
2. The VMs are deployed both on the CHAIN-REDS Cloud Testbed and on the EGI Federated Cloud using the features of the EGI AppDB.

Reusability of ALICE data with the CHAIN-REDS Science Gateway (2/3)
1. The VM is available for a customizable amount of time, during which the user has full access to the dataset(s), analysis algorithm(s) and source code(s) of the experiment.
2. The user can access the VM using different protocols (e.g., SSH, VNC); by clicking on the SSH or VNC icons, the user can directly access the VM instantiated on the cloud from within the Science Gateway.

Reusability of ALICE data with the CHAIN-REDS Science Gateway (3/3)
New stable analyses (and their results), generated by running the VM, may be registered in the OAR (with DOIs) to further extend the analysis catalogue shared within the Virtual Research Community.
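The idea of registering new analyses with DOIs so that they extend a shared catalogue can be sketched as a small data structure. This is a hypothetical model, not the OAR's actual schema or API: the class names, the `extends` field and the DOIs (which use a placeholder prefix) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AnalysisRecord:
    """Hypothetical catalogue entry: one analysis identified by its DOI."""
    doi: str
    title: str
    extends: Optional[str] = None  # DOI of the analysis this one builds on


class AnalysisCatalogue:
    """Hypothetical in-memory stand-in for a shared analysis catalogue."""

    def __init__(self) -> None:
        self._records: Dict[str, AnalysisRecord] = {}

    def register(self, record: AnalysisRecord) -> None:
        """Add a record; DOIs must be unique and parents must already exist."""
        if record.doi in self._records:
            raise ValueError(f"DOI already registered: {record.doi}")
        if record.extends is not None and record.extends not in self._records:
            raise KeyError(f"unknown parent analysis: {record.extends}")
        self._records[record.doi] = record

    def lineage(self, doi: str) -> List[str]:
        """Return the chain of DOIs from this analysis back to the original."""
        chain: List[str] = []
        current: Optional[str] = doi
        while current is not None:
            record = self._records[current]
            chain.append(record.doi)
            current = record.extends
        return chain


catalogue = AnalysisCatalogue()
# Placeholder DOIs, not real identifiers.
catalogue.register(AnalysisRecord("10.5555/original", "Original analysis"))
catalogue.register(
    AnalysisRecord("10.5555/extended", "Extended analysis", extends="10.5555/original")
)
print(catalogue.lineage("10.5555/extended"))
```

Keeping a pointer from each new analysis to the one it extends is one possible way to make the provenance of a result traceable; the slides do not say how the OAR records this link.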

“Whose science is this?”
How to attribute authorship to research products?

ORCID (becoming a “de facto” standard)
More than 1.74 million ORCID IDs so far
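A repository that lets researchers attach ORCID iDs may want to check them before storing them. The sketch below validates the final check character of an ORCID iD, which follows the ISO 7064 MOD 11-2 scheme; the function names are illustrative, and nothing here is taken from the OAR itself.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check format (16 characters once hyphens are removed) and checksum."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if any(ch not in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]


# Sample iDs used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1694-233X"))
```

The checksum only catches typing errors; it does not confirm that the iD is actually registered or that it belongs to the person claiming it.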

ORCID: search & link your works in/from DataCite

ORCID: add your research products to your profile

Summary and conclusions
- The Open Science vision can be implemented only if the “openness” paradigm becomes pervasive in research
- The reproducibility of science outputs, but also their re-usability and extensibility, are key to walking the “knowledge path” in both directions
- The INFN Open Access Repository is a pilot knowledge preservation repository meant to serve both researchers and citizen scientists
- What makes the INFN OAR different from other repositories is:
  - Its capability to connect to Science Gateways and exploit cloud resources worldwide to easily reproduce/extend scientific analyses
  - Its capability to provide full authorship (and hence credit, reputation and visibility) for all products of a scientist; this is key for a correct evaluation of research (…and of researchers)

Authors
- R. Barbera (University of Catania and INFN, Italy)
- S. Bianco (INFN LNF, Italy)
- T. Boccali (INFN Pisa, Italy)
- C. Carrubba (University of Catania, Italy)
- G. Inserra (University of Catania, Italy)
- M. Maggi (INFN Bari, Italy)
- D. Menasce (INFN Milano Bicocca, Italy)
- R. Ricceri (University of Catania, Italy)

Thank you!