West-Life: A VRE for Structural Biology
Alexandre Bonvin, Utrecht University
Chris Morris, STFC
EGI Community Forum, Bari, November 9-13 2015



Background West-Life: Life Sciences in the Cloud

Structural biologists are mature computer users
First use of digital computers in the 1940s
Figure: Protein Data Bank, new entries by year (log scale)

New scientific goals
Larger macromolecular machines
Membrane association
4D (structure + dynamics)
Transient interactions

INSTRUCT user survey
73% work on eukaryotic rather than prokaryotic systems
84% work on complexes rather than single gene products
Each research team routinely uses three or four different techniques
83% would use combined SB techniques more often if it were easier to get access to experimental facilities
73% found it hard to combine software tools for different techniques in integrated workflows

New experimental methods
Combined techniques
Users are not always experts
Small samples
Data noisy and incomplete
Deliver results to other life scientists
Calls for integrative, user-friendly solutions

Crowdsourcing from the middle tier
Community includes:
– Life scientists who use computers
– End-user programmers
– Algorithm developers
We aim to ease the process of creating web-based services

The Project

The project
10 Partners:
o STFC (UK) (lead partner; Martyn Winn, Coordinator)
o Dutch Cancer Institute (NKI) (NL)
o EMBL (DE)
o Masaryk University (MU) (CZ)
o Consejo Superior de Investigaciones Científicas (CSIC) (ES)
o Consorzio Interuniversitario Risonanze Magnetiche di Metallo Proteine (CIRMPP) (IT)
o INSTRUCT (UK)
o Utrecht University (NL)
o Luna (FR) – (SME)
o INFN (IT)
Budget: €
Duration: 36 months
Started: 1 Nov 2015
Proposal ID

Main Concepts
Support for combined techniques:
– Multiple facilities visited for one project
– Data management challenges, including provenance
– New algorithms needed for integrative approaches
– Extends WeNMR; uses ICAT, EGI and EUDAT resources
– Will integrate and connect the already available services

Main objectives
1. Provide analysis solutions for the different Structural Biology approaches
2. Provide automated pipelines to handle multi-technique datasets in an integrative manner
3. Provide integrated data management for single and multi-technique projects, based on existing e-infrastructure
4. Foster best practices, collaboration and training of end users

Main Concepts

Ideas, Challenges, Requirements

Structural Biology Work Bench
Seamless data transfer between stages
Accumulate metadata without user intervention
No installation effort
Extensible
Data management should be combined with data processing

Reinvent nothing
Existing best practice includes:
– WeNMR
– PaNData
– Diamond: pipelines and archives
– Scipion
– Data Life Cycle Lab
Integration, not competition

Processing requirements
Datasets may be scattered
NMR, MX: parameter sweeps and embarrassingly parallel models
EM class assignment: I/O intensive
One can estimate the total demand, but it is hard to predict peak demand
E-infrastructure requirements submitted to EGI, together with the MoBrain CC, WeNMR and N4U
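A minimal sketch of the parameter-sweep pattern mentioned on this slide. The scoring function `score_model` and the parameter names `temperature` and `weight` are hypothetical and not taken from the talk. Each parameter combination is an independent task, which is what makes the workload embarrassingly parallel; on a grid or cloud each task would typically be a separate job, while here a local thread pool stands in for that.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def score_model(temperature, weight):
    """Hypothetical stand-in for one independent modelling run."""
    return (temperature - 300.0) ** 2 * 1e-4 + (weight - 0.5) ** 2


def run_sweep(temperatures, weights, max_workers=4):
    """Evaluate every parameter combination independently and return the scores."""
    grid = list(product(temperatures, weights))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each task depends only on its own parameters, so no task
        # needs to communicate with any other.
        scores = list(pool.map(lambda params: score_model(*params), grid))
    return dict(zip(grid, scores))


def best_parameters(results):
    """Return the parameter combination with the lowest score."""
    return min(results, key=results.get)
```

Calling `run_sweep([280.0, 300.0, 320.0], [0.25, 0.5, 0.75])` evaluates nine independent combinations, and `best_parameters` then picks the lowest-scoring one.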

New data challenges
Data volume:
– Combined output of European SB facilities > LHC
– XFEL will double it
Improve archiving of data and metadata
Support for data moving / replication
Improve automated pipelines for MX
… create pipelines for other techniques

New data challenges
Reproducibility
– Keywords, version numbers
– Archive software to ensure reproducibility, e.g. in Cloud VMs?
Combined algorithms
Quality indications

Data requirements
Raw experimental data -> reduced data -> structure
Large experimental facilities have their own resources
… small ones need help
Automatically record provenance metadata when data is used
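One way to picture automatic provenance capture along the raw data -> reduced data -> structure chain is sketched below. The record layout (`tool`, `version`, `inputs`, `output`) and the function names are assumptions made for illustration, not a schema used by West-Life or ICAT. Each processing step logs content hashes of what it read and wrote, so the history of a result can later be traced without user intervention.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Content hash used to identify a dataset independently of its location."""
    return hashlib.sha256(data).hexdigest()


def record_step(log, tool, version, inputs, output):
    """Append one provenance entry for a processing step (hypothetical schema)."""
    entry = {
        "tool": tool,
        "version": version,
        "inputs": [sha256_of(blob) for blob in inputs],
        "output": sha256_of(output),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    log.append(entry)
    return entry


def lineage(log, output_hash):
    """Walk back from an output hash through the steps that produced it."""
    by_output = {entry["output"]: entry for entry in log}
    chain, frontier, seen = [], [output_hash], set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue  # guard against revisiting the same dataset
        seen.add(current)
        entry = by_output.get(current)
        if entry is not None:
            chain.append(entry)
            frontier.extend(entry["inputs"])
    return chain
```

Recording the tool version alongside the hashes also covers the "keywords, version numbers" aspect of reproducibility.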

AAI requirements
Saved sessions, data access
– “I am the person you gave these credentials to…”
Collaborations
– “I am the person you think I am”
Remote experiments
– “I am definitely the person you think I am”
Personal certificates
– Implausible that our community would use them broadly (but there are examples within WeNMR)
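The three statements on this slide can be read as increasing levels of identity assurance. The sketch below is only an illustration of that reading: the level names, the `REQUIRED_LEVEL` mapping and the `is_permitted` function are all assumptions and not part of any West-Life, EGI or eduGAIN API.

```python
from enum import IntEnum


class Assurance(IntEnum):
    """Three identity-assurance levels, named here for illustration only."""
    SAME_HOLDER = 1   # "I am the person you gave these credentials to"
    IDENTIFIED = 2    # "I am the person you think I am"
    VERIFIED = 3      # "I am definitely the person you think I am"


# Hypothetical mapping from use case to the minimum level it needs.
REQUIRED_LEVEL = {
    "saved_session": Assurance.SAME_HOLDER,
    "data_access": Assurance.SAME_HOLDER,
    "collaboration": Assurance.IDENTIFIED,
    "remote_experiment": Assurance.VERIFIED,
}


def is_permitted(use_case: str, level: Assurance) -> bool:
    """Allow a use case only if the session meets its minimum assurance level."""
    return level >= REQUIRED_LEVEL[use_case]
```

Under this reading, a session that is sufficient for resuming saved work would not by itself be sufficient for running a remote experiment.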

? Cryo-EM ISGC 2016 in Taipei

Supplementary material

Current AAI status
WeNMR uses SSO
– Accepts eduGAIN and social media IDs
Experimental facilities issue userids
– Moving to Moonshot / Umbrella integration with eduGAIN
– Check passports at gate
– … but moving to remote access
Instruct issues userids
– Moving to Moonshot / Umbrella integration with eduGAIN
– Verifies identity by phone call to PI (small community)
West-Life not started yet

AAI solutions
Solution for homeless users
– Create a local id without associating it with a home id
– We have colleagues not in eduGAIN
Solutions to handle user attributes
– Stored locally
– Updates are checked by administrator
Preferred technology
– Shibboleth has become standard
– SAML probably sufficient for authorization
Web access, with delegation
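A rough sketch of the first two points on this slide, under stated assumptions: the class names `LocalAccount` and `AccountStore` and their methods are hypothetical and do not describe an existing West-Life component. A "homeless" user gets a local id with no home identity attached, attributes are kept locally, and a requested attribute change only takes effect once an administrator approves it.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LocalAccount:
    local_id: str
    home_id: Optional[str] = None                    # None for a "homeless" user
    attributes: dict = field(default_factory=dict)   # approved, locally stored values
    pending: dict = field(default_factory=dict)      # updates awaiting an administrator


class AccountStore:
    """In-memory stand-in for a local attribute store (illustrative only)."""

    def __init__(self):
        self.accounts = {}

    def create(self, home_id=None):
        """Create a local id, optionally linked to a federated home identity."""
        account = LocalAccount(local_id=uuid.uuid4().hex, home_id=home_id)
        self.accounts[account.local_id] = account
        return account

    def request_update(self, local_id, key, value):
        """Queue an attribute change; it is not visible until approved."""
        self.accounts[local_id].pending[key] = value

    def approve(self, local_id, key):
        """Administrator step: move a pending change into the stored attributes."""
        account = self.accounts[local_id]
        account.attributes[key] = account.pending.pop(key)
```

Keeping `home_id` optional lets the same store serve users who arrive through a federation and colleagues whose institutions are not in eduGAIN.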
