Integrated e-Infrastructure for Distributed, Data-driven, Data-intensive High Performance Computing: Biomedical Requirements
Peter V Coveney, Centre for Computational Science, University College London
Integrating the Strengths of the e-Research Community, NeSC, Thursday 10th March 2011

Contents
– Computational Biomedicine: HIV/AIDS, cardiovascular medicine, cancer
– ICT, e-Health and the Virtual Physiological Human
– Infrastructure support
– Shortcomings in UK infrastructure
– Major policy hurdles
– UCL CLMS initiative
– Conclusions

UCL Projects
– VPH Network of Excellence – EU (€8M); no HPC
– ContraCancrum – EU (€3.4M); no HPC
– VPH-Share – EU (€10.7M); no HPC
– P-Medicine – EU (€13.7M); no HPC
– INBIOMEDVision – EU (€2M)
– MAPPER – EU (€2M); no HPC
– A New Approach to Science at the Life Sciences Interface – EPSRC (£4M) + HECToR
– Large Scale Lattice-Boltzmann Simulation of Liquid Crystals – EPSRC (£800K) + HECToR

Patient-specific medicine
'Personalised medicine': use the patient's specific profile to better manage disease, or a predisposition towards a disease, by tailoring medical treatments to the characteristics of the individual patient.
Patient-specific medical simulation: use of genotypic and/or phenotypic simulation to customise treatment for each particular patient, where computational simulation predicts the outcome of courses of treatment and/or surgery.
Why use patient-specific approaches? Treatments can be assessed for their effectiveness with respect to the patient before being administered, saving the potential expense of ineffective treatments.
See: P. V. Coveney et al. (eds), Interface Focus, Theme Issue on the VPH, Vol. 1, No. 3, online 25 April 2011.

Medical/clinical domain I: HIV/AIDS
HIV-1 protease is a common target for HIV drug therapy:
– an enzyme of HIV responsible for protein maturation
– a target for anti-retroviral inhibitors
– an example of structure-assisted drug design: 9 FDA-approved inhibitors of HIV-1 protease
So what's the problem? Drug-resistant mutations emerge in the protease and render drugs ineffective; drug-resistant mutants have emerged for all FDA-approved inhibitors.
[Figure: saquinavir bound in the P2 subsite of the HIV-1 protease dimer, labelling monomers A and B, the flaps, the N- and C-termini, and key residues: catalytic aspartic acids 25/125, glycines 48/148, leucines 90/190.]
→ Integrate simulation with conventional clinical decision support systems to refine results.
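A minimal sketch of the kind of calculation behind simulation-supported decisions about protease inhibitors: ranking candidate drug/target pairs by a binding free energy averaged over independent simulation replicas. All numbers and names below are illustrative placeholders, not results from the project's actual pipeline; in practice each value would come from post-processing many independent molecular dynamics replicas.

```python
# Rank HIV-1 protease inhibitor scenarios by ensemble-averaged binding
# free energy. The per-replica DeltaG values are invented placeholders.
import statistics

replica_dg = {  # hypothetical DeltaG estimates (kcal/mol) per replica
    "saquinavir_wildtype": [-11.2, -10.8, -11.5, -10.9, -11.1],
    "saquinavir_G48V":     [-8.9,  -9.3,  -8.6,  -9.0,  -8.8],
}

def summarise(dgs):
    """Mean and standard error over independent replicas."""
    mean = statistics.fmean(dgs)
    sem = statistics.stdev(dgs) / len(dgs) ** 0.5
    return mean, sem

# Most negative mean first, i.e. strongest predicted binding first.
for name, dgs in sorted(replica_dg.items(), key=lambda kv: statistics.fmean(kv[1])):
    mean, sem = summarise(dgs)
    print(f"{name:22s} DeltaG = {mean:6.2f} +/- {sem:.2f} kcal/mol")
```

Averaging over replicas rather than trusting a single long trajectory is what makes the error bars meaningful enough to feed into a clinical decision support system.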

Medical/clinical domain II: Grid-enabled neurosurgical imaging using simulation
The goal: to simulate large-scale, patient-specific cerebral blood flow in clinically relevant time frames.
Objectives:
– study cerebral blood flow using patient-specific, image-based models;
– provide insights into cerebral blood flow and its anomalies;
– develop tools and policies by means of which users can better exploit the ability to reserve and co-reserve HPC resources;
– develop interfaces which permit users to easily deploy and monitor simulations across multiple computational resources;
– visualise and steer the results of distributed simulations in real time;
– yield patient-specific information which helps plan embolisation of arterio-venous malformations, aneurysms, etc.
M. D. Mazzeo and P. V. Coveney, Computer Physics Communications, 178 (12), 894-914 (2008). DOI: 10.1016/j.cpc.2008.02.013
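The cited solver (HemeLB) is built on the lattice-Boltzmann method. The toy sketch below shows the D2Q9 collide-and-stream core of such a solver on a periodic grid, relaxing a sinusoidal shear flow. It is a didactic illustration only: a real haemodynamic code adds sparse vessel geometry, inlet/outlet and wall boundary conditions, and massive MPI parallelism.

```python
# Minimal D2Q9 lattice-Boltzmann (BGK) collide-and-stream loop.
import numpy as np

nx, ny, tau, steps = 64, 64, 0.8, 500
# D2Q9 lattice velocities and weights.
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def equilibrium(rho, ux, uy):
    """Second-order Maxwell-Boltzmann equilibrium on the D2Q9 lattice."""
    cu = 3.0 * (c[:, 0, None, None] * ux + c[:, 1, None, None] * uy)
    usq = 1.5 * (ux**2 + uy**2)
    return w[:, None, None] * rho * (1.0 + cu + 0.5 * cu**2 - usq)

# Initial condition: uniform density, sinusoidal shear velocity.
rho = np.ones((nx, ny))
ux = 0.05 * np.sin(2 * np.pi * np.arange(ny) / ny)[None, :] * np.ones((nx, 1))
uy = np.zeros((nx, ny))
f = equilibrium(rho, ux, uy)

for _ in range(steps):
    # Collision: relax towards local equilibrium (BGK, relaxation time tau).
    rho = f.sum(axis=0)
    ux = (c[:, 0, None, None] * f).sum(axis=0) / rho
    uy = (c[:, 1, None, None] * f).sum(axis=0) / rho
    f += (equilibrium(rho, ux, uy) - f) / tau
    # Streaming: shift each population one cell along its lattice velocity.
    for i in range(9):
        f[i] = np.roll(f[i], shift=(c[i, 0], c[i, 1]), axis=(0, 1))

print("max |u| after viscous decay:", np.abs(ux).max())
```

The appeal of this scheme for patient-specific flow is that collision is purely local and streaming is nearest-neighbour, so the algorithm scales well across thousands of cores.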

Medical/clinical domain III: ContraCancrum
Clinically Oriented Translational Cancer Multilevel Modelling: from multi-level data to multi-level modelling. Two dedicated clinical studies in ContraCancrum, one in glioma and one in lung cancer (200 cases/year).

Virtual Physiological Human
Funded under EU FP7; ~€250M across 20 projects: 1 NoE, 5 IPs, 11 STREPs, 3 CAs.
"a methodological and technological framework that, once established, will enable collaborative investigation of the human body as a single complex system..."
Networking across the initiative is provided by the NoE.

VPH-Share Overview
VPH-Share will provide the organisational fabric (the infostructure), realised as a series of services offered in an integrated framework, to expose and manage data, information and tools; to enable the composition and operation of new VPH workflows; and to facilitate collaborations between members of the VPH community.
Workflows: HIV, heart, aneurysms, musculoskeletal.
€11M, EU FP7; promotes cloud technologies.

p-medicine
Predictive disease modelling in p-medicine will contribute to the optimisation of cancer treatment by fully exploiting the individual data of the patient. p-medicine focuses on Wilms tumour, breast cancer and acute lymphoblastic leukaemia.
The p-medicine infrastructure supports both generic, seamless multi-level data integration and a VPH-specific, multi-level cancer data repository, to facilitate model validation and clinical translation through trials. The infrastructure is scalable to any disease for which predictive modelling is clinically significant at one or more levels (from molecular to tissue) and the development of such models is feasible (i.e. there is enough understanding of the biological mechanisms involved to develop them).
Led by a clinical oncologist, Prof. Norbert Graf!
[Figure: disease modelling at the molecular, cellular (cell-cycle phases G0, G1, S, G2, M) and tissue/organ levels, feeding multi-scale therapy predictions and disease evolution results.]
€13M, EU FP7

Large scale data & computing
– Seamless access and integration of distributed, heterogeneous data in a data warehouse, repeatedly over time (≈200 GB per patient and time point; see the worked estimate below).
– Models are built for use in clinical decision support, so results are needed in a timely fashion.
– It is necessary to be able to seamlessly "plug in" resources for parallel and large scale computing "here and now".
– Petascale computing is needed to perform, for example, drug binding affinity determination and simulation of blood flow through tumours.
– Gratis, via VPH-NoE-supervised VPH Virtual Community allocations of time on DEISA and, in future, PRACE via MAPPER, …?
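A back-of-envelope check of what the ≈200 GB per patient per time point figure implies at trial scale; the cohort size and number of follow-up visits below are assumptions for illustration only.

```python
# Storage implied by ~200 GB per patient per time point (slide figure).
GB_PER_PATIENT_TIMEPOINT = 200
patients = 200    # assumed cohort, e.g. ContraCancrum-scale 200 cases/year
timepoints = 5    # assumed follow-up visits per patient

total_gb = GB_PER_PATIENT_TIMEPOINT * patients * timepoints
print(f"{total_gb / 1024:.0f} TiB for {patients} patients x {timepoints} visits")
# -> ~195 TiB: warehouse-scale storage is needed before a single job runs.
```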

MAPPER: Objectives and Challenges
MAPPER will develop computational strategies, software and services for distributed multiscale simulations across disciplines, exploiting existing and evolving European e-Infrastructure. Driven by seven exemplar multiscale applications, MAPPER will deploy a computational science infrastructure for distributed multiscale computing on and across European e-Infrastructures. By taking advantage of existing software and services, it will deliver high quality components aimed at large-scale, heterogeneous, high performance multidisciplinary multiscale computing, while maintaining ease of use and transparency for end users. MAPPER will advance the state of the art in high performance computing on e-Infrastructures by enabling distributed execution of multiscale models across all European e-Infrastructures.
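The control flow behind distributed multiscale computing can be shown with a toy macro-micro coupling: a macro-scale solver repeatedly delegates to a micro-scale model for a quantity it cannot compute itself. In a MAPPER-style deployment the two submodels would run on different e-Infrastructure resources; both models below are trivial stand-ins, included only to show the scale-bridging loop.

```python
# Schematic macro-micro coupling loop (heterogeneous multiscale pattern).
def micro_model(strain):
    """Stand-in micro model returning an effective stress for a strain.
    In a real deployment this call would be a remote job on another
    resource, possibly with its own time-stepping."""
    return 2.0 * strain + 0.1 * strain**3   # toy constitutive law

def macro_solver(steps=10, dt=0.1):
    """Toy macro update driven by the micro model's response."""
    strain = 0.0
    for _ in range(steps):
        stress = micro_model(strain)     # scale-bridging call
        strain += dt * (1.0 - stress)    # relax towards stress balance
    return strain

print("final macro state:", macro_solver())
```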

VPH ToolKit

VPH Virtual Community on DEISA
– VPH was awarded 2 million standard DEISA core hours for 2009, renewed for 2010 and 2011.
– Machines: HECToR (Cray, UK) and SARA (IBM Power 6, Netherlands).
– euHeart joins in the second wave, together with other non-VPH EU projects.
– The DEISA-TeraGrid interoperability project has additional access to LRZ.

VPH requires HPC and data integration
– Computational experiments integrated seamlessly into current clinical practice.
– Clinical decisions influenced by patient-specific computations: turnaround time covers data acquisition, simulation, post-processing, visualisation, final results and reporting.
– Fitting the computational time scale to the clinical time scale:
  – capture the clinical workflow;
  – get results which will influence clinical decisions: 1 day? 1 week?
  – this project: 15 to 30 minutes.
– Development of procedures and software in consultation with clinicians.
– Security/access is a major concern.
– Need to integrate data and compute via workflows.
– On-demand availability of storage, networking and computational resources.

Many of the projects we are involved in have non-standard requirements of HPC service providers:
– ability to co-reserve resources → HARC (see the sketch below)
– launch of emergency simulations → SPRUCE
– consistent interfaces for federated access → AHE
– access to back-end nodes: steering, visualisation
– lightpath network connections
– data integration from multiple sources → IMENSE
– support for software (ReG steering toolkit, etc.)
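To make the co-reservation bullet concrete: the scheduling kernel of a co-allocation service such as HARC has to find a time window that is free on every resource at once. The sketch below does this for invented availability data; a real service would query each site's local scheduler and then commit the reservations atomically (all-or-nothing), which is the hard part this toy omits.

```python
# Toy co-reservation: earliest common free window across resources.
def common_window(availabilities, duration):
    """availabilities: per-resource lists of (start, end) free slots in
    arbitrary time units. Returns the earliest start at which every
    resource is simultaneously free for `duration`, or None."""
    candidates = sorted(s for slots in availabilities for s, _ in slots)
    for start in candidates:
        if all(any(s <= start and start + duration <= e for s, e in slots)
               for slots in availabilities):
            return start
    return None

# Invented availability windows for two machines.
hector = [(0, 4), (10, 20)]
sara   = [(2, 8), (12, 18)]
print(common_window([hector, sara], duration=5))   # -> 12
```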

Individualized MEdiciNe Simulation Environment → IMENSE
– Data repository: the key store for project data, containing all patient data and the simulation data derived from it.
– Integrated web portal: the central interface from which users upload and access data sets and analysis services; it provides the facility to search for patient data on a number of criteria.
– Web services: the web services platform implements the required data processing functions.
– Workflow environment: provides a virtual experiment system from which users can launch pre-defined workflows to automate moving data between the data environment and multiple data processing services (see the sketch below).
Coveney et al., "An e-Infrastructure Environment for Patient Specific Multiscale Modelling and Treatment", preprint, 2011
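A hypothetical sketch of the kind of pre-defined patient workflow the environment launches: fetch imaging data from the repository, chain processing services, store the derived result. Every class, function and service name below is an invented placeholder to show the data flow, not the real IMENSE or AHE API.

```python
# Invented stand-ins for the IMENSE components, wired into one workflow.
class Repository:
    """Placeholder for the project data repository."""
    def __init__(self):
        self._store = {("p001", "imaging"): "raw-scan"}
    def fetch(self, patient_id, kind):
        return self._store[(patient_id, kind)]
    def store(self, patient_id, kind, data):
        self._store[(patient_id, kind)] = data

def segment(image):
    return f"mesh({image})"        # stand-in segmentation web service

def simulate(mesh):
    return f"flow-field({mesh})"   # stand-in HPC job behind an AHE-like interface

def run_patient_workflow(patient_id, repo):
    """Pre-defined workflow: repository -> services -> repository."""
    image = repo.fetch(patient_id, kind="imaging")
    result = simulate(segment(image))
    repo.store(patient_id, kind="simulation", data=result)
    return result

repo = Repository()
print(run_patient_workflow("p001", repo))
```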

[Screenshots: the IMENSE interface and the IMENSE environment.]

Workflows
– GSEngine is a workflow orchestration engine developed by the ViroLab project.
– It can be used to orchestrate applications launched by AHE.
– It allows services to be orchestrated through both point-and-click and scripting interfaces.
– Workflows are stored in a repository and shared between users.
– Many of the aims of ViroLab are similar to those of the VPH-I projects, so GSEngine will be useful here.
Malawski et al., Future Generation Computer Systems, 26 (1), 138-146, 2010

Inside IMENSE: Integrating the components Coveney et al, “An e-Infrastructure Environment for Patient Specific Multiscale Modelling and Treatment”, preprint, 2011

UK infrastructural failures
– The UK computing e-Infrastructure is crumbling.
– Not a hosting partner in PRACE → no Tier-0 site in the works.
– Only one Tier-1 machine, with issues: HECToR has had several major failures, and researchers seem to have trouble using or trusting it, judging by its usage. What happens next?
– Tier-2 facilities are also being dismantled; NGS core nodes are being shut down!!
– We cannot maintain a good level of e-Science research without the infrastructure to support it.
– Relative to other countries we are in full-scale retreat!

Infrastructure in the UK is fragmented
[Diagram: disconnected islands of data, HPC, the NGS and networks, with a question mark over how they join up.]

TeraGrid → eXtreme Digital (XD)
Two sets of services:
– XES will provide a set of well-known (and standard) protocol specifications and profiles.
– CPS will support the diversity of services and capabilities required by the community.
From the desktop to the largest machines! The XD design is firmly tied to the user requirements of the science and engineering research community. It presents the individual user with a common user environment, and caters both to researchers whose computations require very little data movement and to those performing very data-intensive computations. It will offer a highly capable service interface to "community user accounts" such as science gateways.

We face major policy hurdles
– For our projects to be successful, we need integrated compute, storage, networks and services. HPC providers' antediluvian policies prevent this from happening → they still have a batch-job mentality!
– There are no coordinated allocation policies in the EU: one must apply for a project and then, if successful, apply separately for compute access → the project cannot proceed if the compute application is rejected!

Importance of connectivity
– With limited national facilities, connectivity to other countries becomes crucial.
– 1-10 Gbit/s wide area networks are needed for large simulations and data movements (see the worked example below).
– However, network provisioning is currently extremely difficult and time-consuming: researchers end up having to request the links, rather than the resource providers.
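A worked example of why dedicated lightpaths matter, using the ≈200 GB per-patient dataset figure from the data slide and an assumed ~70% effective throughput on the wide-area path (both the rates and the efficiency are illustrative assumptions).

```python
# Transfer time for one ~200 GB patient dataset at various line rates.
def transfer_hours(gigabytes, gbit_per_s, efficiency=0.7):
    """Hours to move `gigabytes` at `gbit_per_s` nominal line rate,
    assuming the given fraction of the link is actually achieved."""
    bits = gigabytes * 8e9
    return bits / (gbit_per_s * 1e9 * efficiency) / 3600

for rate in (0.1, 1, 10):   # 100 Mbit/s campus link vs 1-10 Gbit/s lightpaths
    print(f"{rate:5.1f} Gbit/s -> {transfer_hours(200, rate):5.2f} h")
# ~6.3 h at 100 Mbit/s, ~0.6 h at 1 Gbit/s, ~4 min at 10 Gbit/s:
# only the lightpath rates fit a 15-30 minute clinical turnaround.
```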

Policy issues
E-science research has always required changes in resource provider policies to thrive:
– support for advance machine and network reservations, including urgent computing;
– improvements in accessibility and usability;
– support for Audited Credential Delegation;
– interoperability between machines and infrastructures.
DEISA's failure to address these augurs poorly for the future.

Political issues
– Streamlined procedures for UK or EU scientific projects: all-in proposals which, when accepted, grant everything needed for a research project, including funding for research as well as HPC resource allocations.
– More sensible service level agreements: if a simulation uses multiple machines and one fails, a full allocation refund should be given.
MAPPER Policy Document: copies available.

Computational Life and Medical Science (CLMS)
– The CLMS Network is a 3-year initiative, running from September 2010 and supported by the Provost's Strategic Fund.
– Management: Dean's Committee and Steering Committee.
– Director: P. V. Coveney

CLMS goals
1. Expand UCL's world-leading position in the life and biomedical sciences.
2. Steer collaboration with academic institutions: within UCL, with UCLP and the NHS, UK-CMRI, Yale, and others.
3. Exploit initiatives in integrative biomedical systems science from the UK Research Councils, the EU and others around the world.
4. Grow collaborations with industry, create business and commercial opportunities, and promote UCL IP licensing.
5. Plan for the next stages of activity in computational life and medical sciences at UCL.
CLMS brings together UCL researchers with clinicians from UCL partners to develop shared data + compute + data transfer + application support services → integrated e-Infrastructure and services.

Conclusions
– Biomedical projects all put pressure on resource providers to offer new services and new ways of working.
– For interactive and urgent work, the batch processing model does not work.
– The very conservative model adopted by HPC providers prevents their resources from being used in innovative ways to do new science and to engage new and different kinds of users.
– If HPC is to be exploited in computational biomedicine, it needs to be used in a way that fits in with the medical and clinical workflow.
– VPH and similar initiatives will only increase the pressure for non-standard services from resource providers.