PaNdata Photon and Neutron Data Infrastructure I2S2Meeting 1 April 2011 Juan Bicarregui.

Slides:



Advertisements
Similar presentations
DELOS Highlights COSTANTINO THANOS ITALIAN NATIONAL RESEARCH COUNCIL.
Advertisements

Joint Information Systems Committee Digital Library Services BL/JISC Workshop Rachel Bruce JISC Programme Director The Digital Library and its Services,
INFRASTRUCTURE CALLS IN Support to existing research infrastructures Integrating Activities Topics opened in Call FP7-INFRASTRUCTURES
Slide: 1 Welcome to the workshop ESRFUP-WP7 User Single Entry Point.
Introduction on WP7/WP9 Dominique PORTE 29/05/2008 Menu What is WP7? What is WP9? Goal of the brainstorming Introduction on WP7/WP9.
PaN-data WP7 - Integration Brian Matthews STFC-e-Science.
Building an Open Data Infrastructure for Science: Policy and Practice Juan Bicarregui STFC e-Science WIFI: BMA Guest “apa2 conference” “visitor” APA conference,
Federated data catalogues supporting cross-facility, cross- discipline interaction at the scale of atoms and molecules Neutron diffraction X-ray diffraction.
Co-funded by the European Union under FP7-ICT Co-ordinated by aparsen.eu #APARSEN Welcome to the Conference !! Juan Bicarregui Chair, APA Executive.
Data Catalogue Service Work Package 4. Main Objective: Deployment, Operation and Evaluation of a cataloguing service for scientific data. Why: Potential.
1 23/05/2015 Networking Research Infrastructure NCPs National Hellenic Research Foundation (EKT/NHRF) Project Monitoring & Coordination Unit EuroRIs-NET.
Federated Identity Management for Research Communities (FIM4R) David Kelsey (STFC-RAL) EGI TF, AAI workshop 19 Sep 2012.
©STFC/Keith G Jeffery Metadata in the European e-Infrastructure Metadata in the European e-Infrastructure Keith G Jeffery Science and Technology.
Preservation and Long-term access through Networked Services Adam Farquhar, The British Library iPres2006 Cornell University, October 2006.
Research Infrastructures The Research Infrastructures in FP7.
Co-funded by the European Union under FP7-ICT Alliance Permanent Access to the Records of Science in Europe Network Co-ordinated by aparsen.eu #APARSEN.
THE JOINED UP WORLD OF E-RESEARCH Professor Neil McLean National Technical Standards Adviser to the Department of Education Science and Training (DEST)
PaNdata Europe Midpoint workshop 8-10 February 2011 Soleil, Paris PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories.
The Preparatory Phase Proposal a first draft to be discussed.
EGI-Engage EGI-Engage Engaging the EGI Community towards an Open Science Commons Project Overview 9/14/2015 EGI-Engage: a project.
Data Infrastructures Opportunities for the European Scientific Information Space Carlos Morais Pires European Commission Paris, 5 March 2012 "The views.
Research Infrastructures in FP7 On behalf of : Anna Maria Johansson, Hervé Péro European Commission, DG RTD/B3 MarineERA workshop Feb
1 Common Challenges Across Scientific Disciplines Laurence Field CERN 18 th November 2013.
1 INFRA : INFRA : Scientific Information Repository supporting FP7 “The views expressed in this presentation are those of the author.
Caring and Sharing Collaboration in Digital Curation outside North America Ross Harvey Simmons College, Boston Curation Matters: 17 June 2010.
Towards a European network for digital preservation Ideas for a proposal Mariella Guercio, University of Urbino.
1 Web: Steve Brewer: Web: EGI Science Gateways Initiative.
ICSTI Annual Members’ Meeting & Workshop Dr. Stefan Winkler-Nees; Paris, 5. March 2012 The Alliance of German Science Organisations - Recommendations on.
Context and Linking in the Research Lifecycle CERIF and other standards Catherine Jones Scientific Information Group Scientific Computing Department STFC.
ESIP Federation Air Quality Cluster Partner Agencies.
ATTRACT – From Open Science to Open Innovation Information Sharing Meeting Brussels, June 19, 2014 Markus Nordberg (CERN) Development and Innovation Unit.
The Faster Research Cycle Interoperability for better science Brian Matthews, Leader, Information Management Group, E-Science Centre, STFC Rutherford Appleton.
Jamie Hall (ILL). SciencePAD Persistent Identifiers Workshop PANData Software Catalogue January 30th 2013 Jamie Hall Developer IT Services, Institut Laue-Langevin.
WP5 – Virtual Laboratories. WP5 Deliverables  D5.1: Specific requirements for the virtual laboratories M6  D5.2: Deployment of Specification of the.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
ESFRI & e-Infrastructure Collaborations, EGEE’09 Krzysztof Wrona September 21 st, 2009 European XFEL.
Ruth Pordes November 2004TeraGrid GIG Site Review1 TeraGrid and Open Science Grid Ruth Pordes, Fermilab representing the Open Science.
The DEER The Distributed European Electronic Resource.
26/05/2005 Research Infrastructures - 'eInfrastructure: Grid initiatives‘ FP INFRASTRUCTURES-71 DIMMI Project a DI gital M ulti M edia I nfrastructure.
PaNdata ODI Open Data Infrastructure INFRA : Data infrastructures for e-Science PaNdata-ODI will develop, deploy and operate an Open Data Infrastructure.
SEE-GRID-2 The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no
INFSO-RI Enabling Grids for E-sciencE The EGEE Project Owen Appleton EGEE Dissemination Officer CERN, Switzerland Danish Grid Forum.
Data Preservation at Rutherford Lab David Corney 9 th July 2010 KEK.
19-20 October 2010 IT Directors’ Group meeting 1 Item 6 of the agenda ISA programme Pascal JACQUES Unit B2 - Methodology/Research Local Informatics Security.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI strategy and Grand Vision Ludek Matyska EGI Council Chair EGI InSPIRE.
EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number EGI vision for the EOSC Tiziana.
E-Infrastructure the FP7 prospects Mário Campolargo European Commission - DG INFSO Head of Unit Research Infrastructures TERENA Networking Conference 2006.
Thomas Gutberlet HZB User Coordination NMI3-II Neutron scattering and Muon spectroscopy Integrated Initiative WP5 Integrated User Access.
Fedora Commons Overview and Background Sandy Payette, Executive Director UK Fedora Training London January 22-23, 2009.
Usecases: 1.ISIS Neutron Source 2.DP for HEP Matthew Viljoen STFC, UK APARSEN-EGI workshop: preserving big data for research Amsterdam Science Park 4-6.
Our digital memory accessible tomorrow Sharon McMeekin APARSEN Alliance for Permanent Access to the Records of Science.
European Perspective on Distributed Computing Luis C. Busquets Pérez European Commission - DG CONNECT eInfrastructures 17 September 2013.
E-infrastructure requirements from the ESFRI Physics, Astronomy and Analytical Facilities cluster Provisional material based on outcome of workshop held.
Building an Open Data Infrastructure for Science: Policy and Practice Juan Bicarregui STFC e-Science Terena 2012, Reykjavik, May 2012.
EGI-InSPIRE RI EGI Compute and Data Services for Open Access in H2020 Tiziana Ferrari Technical Director, EGI.eu
GISELA & CHAIN Workshop Digital Cultural Heritage Network
Budget JRA2 Beneficiaries Description TOT Costs incl travel
Introduction the IT and DM Topic
Summit 2017 Breakout Group 2: Data Management (DM)
Research Data Context Preservation in SCAPE
National e-Infrastructure Vision
Pandata Service Verification
EGI-Engage Engaging the EGI Community towards an Open Science Commons
PaNdata Photon and Neutron Data Infrastructure Juan Bicarregui
EGI Webinar - Introduction -
Common Solutions to Common Problems
Integrating social science data in Europe
Brian Matthews STFC EOSCpilot Brian Matthews STFC
GISELA & CHAIN Workshop Digital Cultural Heritage Network
Presentation transcript:

PaNdata Photon and Neutron Data Infrastructure I2S2Meeting 1 April 2011 Juan Bicarregui

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

The PaNdata Collaboration Established 2007 with ESRF, ILL, ISIS and Diamond Expanded since to 11 organisations (see next slide) Aims: – “...to construct and operate a shared data infrastructure for Neutron and Photon laboratories...”

PaN-data bring together 11 major European Research Infrastructures PaN-data is coordinated by the e-Science Department at the Rutherford Appleton Laboratory, UK ISIS is the world’s leading pulsed spallation neutron source ILL operates the most intense slow neutron source in the world PSI operates the Swiss Light Source, SLS, and Neutron Spallation Source, SINQ, and is developing the SwissFEL Free Electron Laser HZB operates the BER II research reactor the BESSY II synchrotron CEA/LLB operates neutron scattering spectrometers from the Orphée fission reactor ESRF is a third generation synchrotron light source jointly funded by 19 European countries Diamond is new 3rd generation synchrotron funded by the UK and the Wellcome Trust DESY operates two synchrotrons, Doris III and Petra III, and the FLASH free electron laser Soleil is a 2.75 GeV synchrotron radiation facility in operation since 2007 ELETTRA operates a GeV synchrotron and is building the FERMI Free Electron Laser ALBA is a new 3 GeV synchrotron facility due to become operational in 2010 PaN-data Partners

PaN-data Applications The partners operate hundreds of instruments used by over 30,000 scientists each year These instruments support scientific fields as varied as: Physics, Chemistry, Biology, Material sciences, Energy technology, Environmental science, Medical technology and Cultural heritage Applications include: crystallography that reveals the structures of viruses and proteins important for the development of new drugs neutron scattering that identifies stresses within engineering components such as turbine blades tomography that can image microscopic details of the 3D-structure of the brain Industrial applications include pharmaceuticals, petrochemicals and microelectronics PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

What is e-Infrastructure? Data Creation Archival Access Storage Compute Network Services Curation the researcher acts through ingest and access Virtual Research Environment the researcher shouldn’t have to worry about the information infrastructure Information Infrastructure

EDNS - European Data Infrastructure for Neutron and Synchrotron Sources PaNdata Vision Single Infrastructure  Single User Experience Capacity Storage Publications Repositories Data Repositories Software Repositories Raw Data Data Analysis Analysed Data Publication Data Publication s Facility 1 Raw Data Data Analysis Analysed Data Publication Data Publication s Facility 2 Raw Data Data Analysis Analysed Data Publication Data Publication s Facility 3 Different Infrastructures  Different User Experiences Raw Data Catalogue Data Analysis Analysed Data Catalogue Publication Data Catalogue Publications Catalogue

In words: PANdata will provide our user communities with data repositories and data management tools to: deal with large sets and large data rates from the experiments, enable easy and standardised annotation of data, allow transparent and secure remote access to data, establish sustainable and compatible data catalogues, allow long-term preservation of data, and provide compatible open source data analysis software. This will have a major impact on our scientific user community because it will offer: cross facility and cross discipline data analysis, secure access to large data sets over the network instead of using portable media, maintaining the records of science by having properly annotated data, linking publications to data, allowing efficient software developments, and efficient scientific collaborations across Europe by providing compatible data formats and analysis software.

Metadata and Digital Curation Proposal Approval Scheduling Experiment Data cleansing Record Publication Scientist submits application for beamtime Facility committee approves application Facility registers, trains, and schedules scientist’s visit Scientists visits, facility run’s experiment Subsequent publication registered with facility Raw data filtered and cleansed Data analysis Tools for processing made available

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

PaN-data Standardisation PaN-data Europe is undertaking 5 standardisation activities: 1.Development of a common data policy framework 2.Agreement on protocols for shared user information exchange 3.Definition of standards for common scientific data formats 4.Strategy for the interoperation of data analysis software enabling the most appropriate software to be used independently of where the data is collected 5.Integration and cross-linking of research outputs completing the lifecycle of research, linking all information underpinning publications, and supporting the long-term preservation of the research outputs PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

PaN-data Europe Timeline PaN-data Europe runs from June 2010 until December 2011 with workshops in Spring and Autumn PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

Dependencies

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

ERA Open Access Sharing Initiatives (examples, etc) ERA Infrastructure Platform Initiatives (EGI, etc) PaNdata Support Action (Ends 30 Nov 11) Policies and Standards PaNdata ODI (begins end2011) JRAs Users Data Software Integration Provenance Preservation Scalability PaNdata ODI (begins end2011) Services Users Data PaNdata ODI Virtual Labs Policies Powder Diff SAXS & SANS Tomography

Objectives Objective 2 – Users To deploy, operate and evaluate a system for pan-European user identification across the participating facilities and implement common processes for the joint maintenance of that system. Objective 3 – Data To deploy, operate and evaluate a generic catalogue of scientific data across the participating facilities and promote its integration with other catalogues beyond the project. Objective 4 – Provenance To research and develop a conceptual framework, defined as a metadata model, which can record the analysis process, and to provide a software infrastructure which implements that model to record analysis steps hence enabling the tracing of the derivation of analysed data outputs. Objective 5 – Preservation To add to the PaNdata infrastructure extra capabilities oriented towards long-term preservation and to integrate these within selected virtual laboratories of the project to demonstrate benefits. These capabilities should, as for the developments in the provenance JRA, be integrated into the normal scientific lifecycle as far as possible. The conceptual foundations will be the OAIS standard and the NeXus file format. Objective 6 – Scalability To develop a scalable data processing framework, combining parallel filesystems with a parallelized standard data formats (pNexus pHDF5) to permit applications to make most efficient use of dedicated multi-core environments and to permit simultaneous ingest of data from various sources, while maintaining the possibility for real-time data processing. Objective 7 – Demonstration To deploy and operate the services and technology developed in the project in virtual laboratories for three specific techniques providing a set of integrated end-to-end data services.

PaNdata ODI Joint Research Activities PaNdata ODI Service Activities PaNdata ODI Service Releases Standards from PaNdata Support Action uCat dCat vLabs Prov Pres Scale Rel 1Rel 2Rel 3Rel 4 users data s/w Integ Jun 2014 Jun 2013 Dec 2013Dec 2012

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

Data The Research Lifecycle the researcher acts through ingest and access Research Environment Creation Archival Access Storage Compute Network Data Services the researcher shouldn’t have to worry about the information infrastructure Information Infrastructure ICAT TopCAT EGI GEANT Local resources User Info feed DAQ feed Data Analysis feed Provenanced Data

Science driver – Data Integration Neutron diffraction X-ray diffraction } NMR High-quality structure refinement }

OECD Principles and Guidelines for Access to Research Data from Public Funding 13 principles A – Openness Openness means access on equal terms for the international research community at the lowest possible cost,.... B – Flexibility, C – Transparency, D – Legal conformity, E – Protection of intellectual property, F – Formal responsibility, G – Professionalism H – Interoperability Technological and semantic interoperability is a key consideration in enabling and promoting international and interdisciplinary access to and use of research data.... I – Quality, J – Security, K – Efficiency, L – Accountability M – Sustainability... taking administrative responsibility for the measures to guarantee permanent access to data that have been determined to require long-term retention. [

The 7 C’s Creation Collection Capacity Computation Curation Collaboration Communication PaNdata Europe SA PaNdata ODI PaNdata VRE Data Creation Archival Access Storage Compute Network Services Curation

Overview The PaNdata Collaboration The Vision The PaNdata Europe Project The PaNdata Open Data Infrastructure Proposal Looking Forwards Project

Thank You

An invitation Very large overlap between ERF and PaNdata Any ERF members are welcome to join in PaNdata activities

Federated data catalogues supporting cross-facility, cross- discipline interaction at the scale of atoms and molecules Neutron diffraction X-ray diffraction High-quality structure refinement Unification of data management policies Shared protocols for exchange of user information Common scientific data formats Interoperation of data analysis software Linking Data and Publications and supporting the long-term preservation of the research outputs PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

Related projects CRISP ESFRI Cluster on Physical Science and Astronomy (EU proposal) (See for example (See for example Projects involved – ESRFup: Upgrade of the European Synchrotron Radiation Facility – ILL20/20: Upgrade of the European Neutron Spectroscopy Facility technology/the-esfri-projects/ – SLHC: “super” Large Hadron Collider – EuroFEL: Complementary Free Electron Lasers in the Infrared to soft X-ray range – European XFEL: Hard X-ray Free Electron Laser in Hamburg – ESS: European Spallation Source for neutron spectroscopy – FAIR: Facility for Antiproton and Ion Research Data Workpackage (among others) – Authorisation and authentication system permitting access to data and IT resources – Solutions for high-speed recording of data to permanent storage and its long-term archival – A metadata management and data mining service across participating ESFRI RIs – High-speed and reliable data access mechanisms to facilitate processing of data and its access to large- scale user communities distributed across Europe and beyond – Establish an environment permitting a data continuum from raw data through to publications

Related projects IRUVX-PP Project – preparatory project to support the foundation of the EuroFEL Consortium preparatory project to support the foundation of the EuroFEL Consortium ESRFUp – ESRF Upgrade program ESRF Upgrade program ILL 20/20 – Upgrade program for ILL Upgrade program for ILL Open Source Development of the ICAT metadata catalogue NMI3 – Integrated Infrastructure Initiative for Neutron Scattering and Muon Spectroscopy Integrated Infrastructure Initiative for Neutron Scattering and Muon Spectroscopy IA-SFS – Integrating Activity on Synchrotron and Free Electron Laser Science Integrating Activity on Synchrotron and Free Electron Laser Science e-IRG - e-Infrastructure Reflection Group – e-IRG Report on Data Management e-IRG Report on Data Management – e-IRG Roadmap e-IRG Roadmap ESFRI - European Strategy Forum on Research Infrastructures

Digital Curation Projects SCAPE : Scalable Preservation Environments – automated, quality-assured workflows – policy-based planning and watch system – Case studies from: Library, Web Archiving, Scientific Research Data Sets – Partners include: Microsoft Research, The British Library ODE: Opportunities for Data Exchange – interoperability of data layers and data sharing – re-use and preservation – Partners include: Alliance for Permanent Access, CERN, Helmholtz Association APARSEN – Alliance Permanent Access to the Records of Science Network – To create a shared vision and framework for a sustainable digital information infrastructure providing permanent access to digitally encoded information – 30 Partners including: European Space Agency, Philips, Airbus, Microsoft Research – STFC leading CASPAR and ENSURE - Enabling knowledge Sustainability Usability and Recovery – scalable preservation solutions primarily for the commercial sector – Based on Open Archival Information System (OAIS) reference model (ISO:14721:2003), – 11 partners, IBM leading