WISDOM/HealthGrid: overview of grid applications in life sciences

WISDOM/HealthGrid: overview of grid applications in life sciences
V. Breton, CNRS Grid Institute, IN2P3
EUMEDGRID-Support Launch Event, 26/01/2010
Credit: A. Da Costa, P. De Vlieger, J. Salzemann

Introduction
- Grid technology provides services to do science differently and opens new avenues for:
  - large-scale on-demand computing
  - secure data sharing
  - dynamic data analysis
- Goals of my talk:
  - share some of our ideas for using grid services in life sciences and healthcare
  - share my enthusiasm for what is ahead of us
- All grid applications described in this talk use gLite as grid middleware

Table of contents
- Introduction
- Grid added value for:
  - large-scale computing
  - distributed data management
  - dynamic data analysis
- WISDOM, grid-enabled in silico drug discovery
- Cancer surveillance network
- Emerging disease surveillance network
- Conclusion

Grid services have made huge progress
- Distributed computing has been available for 5 years for scientific production
  - access to a very large number of CPUs (> 20,000 for the biomed Virtual Organization)
  - web service APIs lead to improved interoperability (EGEE, OSG, Digital Ribbon, …)
  - job efficiency and resource stability are still a problem
  - MPI is still available on only a limited number of clusters (< 10% of CPUs on the EGEE biomed VO)
- Distributed data management has recently become available (AMGA)
  - secured access
  - easy installation
  - good performance
  - critical mass of developers for software maintenance and evolution
- What can I do with these services that I could not do before?

Think bigger!
- Possibility to scale up the volume of computations by one or two orders of magnitude
  - on-demand access to > 20,000 CPU cores instead of a single cluster
  - freedom to think big
- Use cases
  - protein structure computations (e-NMR)
  - from docking 1,000 drug-like molecules to testing all the compounds currently available on the market
  - from updating a molecular biology database monthly to updating it daily
  - from studying the impact of single DNA mutations (SNPs) on diseases to studying multiple correlated mutations (haplotypes)

Genome Wide Haplotype analyses of human complex diseases with the EGEE grid
- Goal: study the impact of DNA mutations on human coronary diseases
- Very CPU-demanding analysis to study the impact of correlated (double, triple) DNA mutations
- Deployment on the EGEE grid
  - 1,926 CAD (Coronary Artery Disease) patients and 2,938 healthy controls
  - 378,000 SNPs (Single Nucleotide Polymorphisms = local DNA mutations)
  - 8.1 million combinations tested in less than 45 days (instead of more than 10 years on a single Pentium 4)
- Results published in Nature Genetics, March 2009 (D. Tregouet et al.)
- Major role of mutations on chromosome 6 was confirmed
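
The figures quoted on this slide imply a lower bound on the speedup obtained from the grid. The short calculation below is only a back-of-the-envelope sketch: it assumes a 365.25-day year and treats "more than 10 years" and "less than 45 days" as exactly 10 years and 45 days, so the true values are at least as large as the ones printed.

```python
# Back-of-the-envelope figures derived from the numbers quoted on the slide.
DAYS_PER_YEAR = 365.25  # assumption: average calendar year

single_cpu_days = 10 * DAYS_PER_YEAR  # "more than 10 years" on one Pentium 4
grid_days = 45                        # "less than 45 days" on EGEE
combinations = 8.1e6                  # haplotype combinations tested


def speedup(sequential_days: float, parallel_days: float) -> float:
    """Ratio of sequential to parallel wall-clock time."""
    return sequential_days / parallel_days


lower_bound = speedup(single_cpu_days, grid_days)
throughput = combinations / grid_days

print(f"speedup: at least {lower_bound:.0f}x")
print(f"throughput: at least {throughput:,.0f} combinations per day")
```

Because both inputs are bounds, the result is a floor, not a measurement of the actual run.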

Application: recalculating protein 3D structures in PDB
- The PDB database gathers publicly available 3D protein structures
  - full of bugs
- Goal: redo the structures by recalculating the diffraction patterns

  PDB files                   42,752
  X-ray structures            36,124
  Successfully recalculated   ~36,000
  Improved R-free             12,500/17,000
  CPU time estimate           21.7 CPU years
  Real time estimate          1 month on the Embrace VO on EGEE

R. P. Joosten et al., Journal of Applied Crystallography (2009) 42, 1-9
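
The CPU time and real time estimates can be combined into a rough average number of cores kept busy during the run. This is a sketch only: it assumes that "1 month" means one twelfth of a year and that the load was spread evenly, which the slide does not state.

```python
def average_concurrency(cpu_years: float, wall_clock_years: float) -> float:
    """Average number of cores busy at once, given total CPU time and elapsed time."""
    if wall_clock_years <= 0:
        raise ValueError("wall_clock_years must be positive")
    return cpu_years / wall_clock_years


cpu_years = 21.7          # CPU time estimate from the slide
wall_clock_years = 1 / 12 # assumption: "1 month" taken as 1/12 of a year

cores = average_concurrency(cpu_years, wall_clock_years)
print(f"about {cores:.0f} CPU cores busy on average")
```

Real grid runs are bursty, so the peak number of concurrent jobs would typically be higher than this average.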

Share my data while keeping them!
- Share data securely without having to put them in a central repository
  - data are left where they are produced
  - authorized users have a customized view of a subset of the data
  - data owners keep full control of their data
- Use cases
  - federation of mammography databases (MammoGrid) to improve cancer detection
  - federation of brain medical image databases (BIRN, NeuroLog, NeuGrid) for neurosciences
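
The idea of leaving data at the producing site while giving each authorized user a customized view can be illustrated with a small in-memory model. Everything here is hypothetical: the `Site` class, the per-user field list and the toy records are invented for illustration and do not describe how AMGA, MammoGrid or any other system mentioned in the talk is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One data owner. Records are only ever returned in filtered form."""
    name: str
    records: list
    # Hypothetical access-control list: user -> fields that user may read here.
    acl: dict = field(default_factory=dict)

    def query(self, user):
        """Return this site's records restricted to the fields `user` may see."""
        allowed = self.acl.get(user, set())
        if not allowed:
            return []  # unknown or unauthorized users see nothing
        return [{k: v for k, v in rec.items() if k in allowed}
                for rec in self.records]


def federated_query(sites, user):
    """Fan the query out to every site and merge the per-site views."""
    merged = []
    for site in sites:
        for row in site.query(user):
            merged.append({"site": site.name, **row})
    return merged


# Toy records, invented for the example.
hospital_a = Site("A", [{"patient": "p1", "age": 61, "finding": "benign"}],
                  acl={"researcher": {"age", "finding"}})
hospital_b = Site("B", [{"patient": "p2", "age": 57, "finding": "malignant"}],
                  acl={"researcher": {"finding"}})

view = federated_query([hospital_a, hospital_b], "researcher")
print(view)
```

Each site decides on its own which fields a user may read, so the merged view differs per user and no site has to copy its records into a central repository.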

Monitoring and alert
- Coupling grid data management and computing services allows continuous:
  - data collection
  - data analysis
  - updated modeling
- Towards decision making
- Use cases
  - tsunami alert system
  - flood alert system
  - epidemiology

WISDOM
WISDOM (World-wide In Silico Docking On Malaria) is an initiative aiming to demonstrate the relevance and the impact of the grid approach to address drug discovery for neglected and emerging diseases.
- Timeline, 2005 to 2008: Wisdom-I (malaria, plasmepsin); Data Challenge (avian flu, neuraminidase); Wisdom-II (4 targets); diabetes (alpha-amylase, maltase)
- 2009: Malaria Journal 2009, 8:88 (1 May 2009)
- Grids: EGEE, Auvergrid, TwGrid, EELA, EuChina, EuMedGrid
- European projects: Embrace, EGEE, BioInfoGrid
- Institutes: SCAI, CNU, Academia Sinica of Taiwan, ITB, Unimo Univ., LPC, CMBA, CERN-Arda, Healthgrid, KISTI

WISDOM partners
- LPC Clermont-Ferrand: Biomedical grid
- SCAI Fraunhofer: Knowledge extraction, Chemoinformatics
- KISTI: Grid technology
- CEA, Acamba project: Biological targets, Chemogenomics
- Univ. Modena: Biological targets, Molecular Dynamics
- Chonnam Nat. Univ.: In vitro tests
- HealthGrid: Biomedical grid, Dissemination
- ITB CNR: Bioinformatics, Molecular modelling
- Academia Sinica: Grid user interface
- Univ. Los Andes: Biological targets, Malaria biology
- Univ. Pretoria: Bioinformatics, Malaria biology

Virtual screening pipeline
- Molecular docking: FLEXX / AUTODOCK
- Molecular dynamics: AMBER
- Complex visualization: CHIMERA
- Wet laboratory: in vitro, in vivo

WISDOM production environment
- WISDOM Information System: File IS and Bio IS (FTP, HTTP), AMGA
- Data management: Data Management APIs, Data Manager, Transfer Manager, Database Service, Local Data Repositories
- Task management: Tasks Management APIs, Task Manager, Job Manager, Job Submitter
- Client Services
- Grid resources: storage elements (SE) and computing elements (CE) on EGEE, OSG, NGS/EGI and DIGITAL RIBBON
- Flows: data, metadata, jobs
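
A task manager for a docking campaign has to cut the full set of target and compound pairs into units of work that a job can pick up. The sketch below shows one simple way to do that with fixed-size batches. It is an illustration under assumptions, not the WISDOM implementation: the function name `make_tasks`, the task dictionary layout and the compound and target names are all hypothetical.

```python
from itertools import islice, product


def make_tasks(compounds, targets, batch_size):
    """Yield tasks, each holding up to `batch_size` (target, compound) dockings."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pairs = product(targets, compounds)  # every target against every compound
    task_id = 0
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            break  # all pairs have been handed out
        yield {"id": task_id, "dockings": batch}
        task_id += 1


# Hypothetical inputs, for illustration only.
compounds = [f"compound_{i}" for i in range(5)]
targets = ["target_a", "target_b"]

tasks = list(make_tasks(compounds, targets, batch_size=4))
for task in tasks:
    print(task["id"], len(task["dockings"]))
```

Smaller batches give finer-grained recovery when a job fails, at the cost of more scheduling overhead per docking; the right size depends on how long one docking takes and how stable the resources are.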

Breast cancer screening
- Screening structure
- Invitation: all women > 50
- Mammography
- Problem: data transfer from labs to screening structure
- Biopsy
- Cancer treatment loop
- Anatomical pathology report
Cancer surveillance network on a grid infrastructure

Sentinel network
- Anatomic pathology labs and screening structures connected through the grid
- National / regional epidemiology
- Crisap database

Architecture

Technical architecture
- Certification Authority + GIPCPS
- Reference: Grid-enabled sentinel network for cancer surveillance, Proceedings of Healthgrid conference 2009, Studies in Health Technology and Informatics
- Collaboration: CNRS, MAAT, RSCA

Grid-enabled influenza surveillance network
- Is there a way to improve the response to emerging diseases using grids?

Elements for Epidemiologists
- Where does the virus come from?
- How does the virus spread?
- How does the virus evolve?
- Epidemiologist: « The only way to track down a virus history is through its imprint on the viral genome »

Influenza A Virus
- Genome sequenced:
  >ABV25634
  MKAILLVLLCAFAATNADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLGGIAPLQLG
  KCNIAGWLLGNPECDLLLTVSSWSYIVETSNSDNGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKTSSW
  PNHETTRGVTAACPYAGASSFYRNLLWLVKKENSYPKLSKSYVNNKGKEVLVLWGVHHPPTSTDQQSLYQ
  NADAYVSVGSSKYDRRFTPEIAARPKVRGQAGRMNYYWTLLEPGDTITFEATGNLVAPRYAFALNRGSES
  GIITSDAPVHDCDTKCQTPHGAINSSLPFQNIHPVTIGECPKYVKSTKLRMVTGLRNIPSIQSRGLFGAI
  AGFIEGGWTGLIDGWYGYHHQNGQGSGYAADQKSTQNAIDGITNKVNSVIEKMNTQFTVVGKEFNNLERR
  IKNLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVKNLYEKARSQLRNNAKEIGNGCFEFYHKCDD
  ACMESVRNGTYDYPKYSEESKLNREEIDGVKLESMMVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
  QCRICI
- Sequence added to public or private DB (NCBI, FTP)
- Automatic link to Grid DataBase (Grid DB), with daily updates of:
  - metadata
  - sequences (protein, nucleotide, coding region)
  - IDs

Grid enabled influenza surveillance network
- Grid added values: dynamics, security
- Daily download: NCBI to Grid DBs and local databases
- Grid DBs parsing
- Daily updated data on virus evolution
- Phylogenetic analysis
- Virtual screening
- New candidate drugs
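
Parsing the downloaded sequences is the first step before any analysis. The ABV25634 record shown on the Influenza A Virus slide is in FASTA format: a header line starting with `>` followed by wrapped sequence lines. The minimal parser below is a sketch, not the code used in the project; `parse_fasta` is a hypothetical helper name and the example input is only a truncated prefix of the ABV25634 sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record ID to the joined sequence."""
    chunks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""  # ID is the first header token
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)  # sequence lines are concatenated later
    return {rec_id: "".join(parts) for rec_id, parts in chunks.items()}


# Truncated prefix of the ABV25634 record, re-wrapped for the example.
example = """>ABV25634
MKAILLVLLCAFAATNADTLCIGYHANNSTD
TVDTVLEKNVTVTHSVNLLEDSHNGKLCRLG
"""

records = parse_fasta(example)
print(records["ABV25634"][:10])
```

For production use an established library parser would normally be preferable, since real FASTA headers carry more metadata than this sketch keeps.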

Molecular epidemiology pipeline to monitor influenza A virus
- Input from AMGA: metadata, sequences (protein, nucleotide, coding region), IDs
- Run the phylogenetic workflow on the grid:
  - prepare data in the correct format
  - alignment + curation (Muscle + Gblocks)
  - phylogenetic analysis (PhyML)
- Output: daily updated data on virus evolution
- Grid portal: http://g-info.healthgrid.org/
- Next: avian flu monitoring (Vietnam)
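
The daily update logic of such a pipeline can be sketched as "rerun the workflow only when new sequences have arrived". In the sketch below the three stage functions are deliberately trivial stand-ins: `align_and_curate` and `build_tree` do not call Muscle, Gblocks or PhyML and do not compute a real alignment or tree. All function names and the toy sequences are hypothetical.

```python
def select_new(remote, known_ids):
    """Keep only the sequences whose ID has not been seen before."""
    return {sid: seq for sid, seq in remote.items() if sid not in known_ids}


def prepare(seqs):
    """Normalise the records (here: strip whitespace and upper-case)."""
    return {sid: seq.strip().upper() for sid, seq in seqs.items()}


def align_and_curate(seqs):
    """Stand-in for Muscle + Gblocks: pads to equal length, no real alignment."""
    width = max(len(s) for s in seqs.values())
    return {sid: s.ljust(width, "-") for sid, s in seqs.items()}


def build_tree(aligned):
    """Stand-in for PhyML: returns the sorted IDs instead of a real tree."""
    return sorted(aligned)


def daily_update(remote, known_ids, stages):
    """Run all stages on the full data set, but only if something new arrived."""
    fresh = select_new(remote, known_ids)
    if not fresh:
        return None  # nothing new today, keep yesterday's result
    known_ids.update(fresh)  # remember the new IDs for the next run
    result = dict(remote)
    for stage in stages:
        result = stage(result)
    return result


stages = [prepare, align_and_curate, build_tree]
known = {"seq1"}
remote = {"seq1": "mkai", "seq2": "mkaill"}  # toy sequences

first = daily_update(remote, known, stages)
second = daily_update(remote, known, stages)
print(first, second)
```

Keeping the stages as a plain list makes it easy to swap a stand-in for a wrapper around the real tool without touching the update logic.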

Conclusion
- Grid services are better than they have ever been
  - opportunities to do science differently or at a larger scale
- Many opportunities for collaboration with Mediterranean countries in the field of life sciences
  - virtual screening
  - surveillance networks (influenza)
- Interest from Institut des Grilles du CNRS for other fields
  - Earth sciences (climate change)
  - particle physics
  - any other field
- HealthGrid conference 2010: Paris, June 28-30, 2010
  - call for papers: http://paris2010.healthgrid.org

Figure: complexity versus time (2007, 2009, 2011, 2014, 2016)
- Adopter groups: Visionaries, Early Adopters, Majority, Late
- Stages: Infrastructure Interoperability; Quality of Service; On Demand Access; User Friendliness; Improved Distributed Data Management; Data Models; Data Integration Tools and Standards; Knowledge Management Tools and Standards; Domain Specific Knowledge Management and Ontologies
Source: T. Solomonides et al., Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare, IGI Global, May 2009