INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org WHO, 2/11/04 Grid enabled in silico drug discovery Vincent Breton CNRS/IN2P3 Credit for the slides: N. Jacq

Presentation transcript:

INFSO-RI Enabling Grids for E-sciencE WHO, 2/11/04 Grid enabled in silico drug discovery Vincent Breton CNRS/IN2P3 Credit for the slides: N. Jacq

Enabling Grids for E-sciencE INFSO-RI GGF, 28/06/05
Phases of a pharmaceutical development
Target discovery: Target Identification, Target Validation
Lead discovery: Lead Identification, Lead Optimization
Clinical Phases (I-III)
Duration: 12 – 15 years, Costs: million US $
Computer Aided Drug Design (CADD): vHTS, similarity analysis, database filtering, de novo design, diversity selection, biophores, alignment, combinatorial libraries, ADMET, QSAR

Selection of the potential drugs
28 million compounds currently known
Drug company biologists screen up to 1 million compounds against a target using ultra-high throughput technology
Chemists select compounds for follow-up
Chemists work on these compounds, developing new, more potent compounds
Pharmacologists test compounds for pharmacokinetic and toxicological profiles
1-2 compounds are selected as potential drugs

Dataflow and workflow in a virtual screening
Inputs: crystal structure, ligand database
Steps: Docking, Structure optimization, Reranking, MD-simulation
Outputs: hit, junk

Computational aspects of drug discovery: virtual screening
Enable scientists to quickly and easily find ligands binding to a particular target protein
–growth in the number of targets
–growth in 3D structure determination (PDB database)
–growth in computing power
–growth in the prediction quality of protein-compound interactions
Experimental screening is very expensive: difficult for academic groups or small companies
Enrichment = Active molecules / Tested molecules
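The enrichment ratio on this slide can be illustrated with a minimal sketch. The function name and the numbers below are hypothetical and chosen only to show how pre-selecting compounds in silico raises the ratio for the same number of actives; they are not results from the presentation.

```python
def enrichment(n_active: int, n_tested: int) -> float:
    """Ratio of active molecules to tested molecules, as defined on the slide."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_active <= n_tested:
        raise ValueError("n_active must lie between 0 and n_tested")
    return n_active / n_tested


# Hypothetical numbers, for illustration only.
unfiltered = enrichment(n_active=10, n_tested=10_000)  # test everything
preselected = enrichment(n_active=10, n_tested=500)    # test only top-ranked compounds
print(unfiltered, preselected)
```

With the same number of actives found, testing fewer molecules gives a higher ratio, which is the motivation for ranking compounds computationally before experimental screening.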

Grid added value for the first steps of in silico drug discovery
Target identification and validation
–Volume of molecular biology data is exponentially increasing
–Grid added value: interoperability, sharing of data content and tools
Large scale virtual screening to select the most promising compounds
–Distributed computing
–Output data management
Molecular dynamics to further assess selected compounds
–Parallel computing

Grid infrastructures vs pervasive grids
A grid infrastructure uses an identified set of resources properly administered behind firewalls
Grid infrastructures vs pervasive grids
–Large scale docking on pervasive grids already achieved (Grid.org, Decrypthon, World Community Grid)
   –Centralized job submission and data management
   –Limited security model
   –No output data distribution (web portal)
   –Limited quality of service (no user support)
Grid infrastructures vs clusters
–Sharing of computing resources
–Data management: distribution/replication of data
–Sharing of services (participating groups bring their expertise)

Potential grid services
Grid service providers: cheminformatics teams, biology teams, bioinformatics teams
Services: virtual docking services, MD service, annotation services
Grid service customers: chemist/biologist teams
Data exchanged: target, hits, selected hits
Grid infrastructure

WISDOM: Wide In Silico Docking On Malaria
Scientific objectives
–Start enabling in silico drug discovery in a grid environment to address the deadliest infectious disease on earth: malaria
–Demonstrate to the research communities active in the area of drug discovery the relevance of grid infrastructures
Goals of the first “data challenge” (July - September 2005)
–Biological goal: proposal of new inhibitors for a family of proteins produced by Plasmodium falciparum
–Biomedical informatics goal: deployment of in silico virtual screening on the grid
–Grid goal: deployment of a CPU-consuming application generating large data flows to test the grid infrastructure and services
Partners
–Fraunhofer SCAI
–CNRS/IN2P3
–CMBA (Center for Bio-Active Molecules screening)
representing different projects:
–EGEE (EU FP6)
–Simdat (EU FP6)
–Instruire and Campus Grid (French and German regional grids)
–Accamba project (French ACI project)

WISDOM workflow
Deployment of a virtual screening workflow on grid infrastructures
Inputs: crystal structure, ligand db
Steps: Docking, Reranking, MD-simulation
Outputs: hit, junk
Execution: workflow manager running on grids

WISDOM elements
Biological information
–Plasmepsin is a promising aspartic protease target involved in the hemoglobin degradation of P. falciparum. 5 different structures are prepared (PDB source)
–ZINC is an open source library of 3.3 million selected compounds. They are made available by chemistry companies and are ready to be used
Biomedical informatics tools
–Autodock is free for academic use, with grid based empirical potential and flexible docking via MC search and incremental construction
–FlexX requires a license, made available for this data challenge for 1 week, with Boehm potential and fragment assembly energy function
Grid tools
–wisdom_env is an environment for an automatic, optimized and fault-tolerant workflow using the grid resources and services
–The biomedical VO will be the infrastructure, with dedicated/non-dedicated resources

WISDOM: Deployment on a grid environment
Docking is easily distributed once the compound database is available on the grid nodes. Each computing element computes docking probability for a different sample of ligands
In a first step, docking scores are returned to the user and compared on the user's local machine. Later on, data management services can handle the storage and the post-processing of the output files
Diagram: a User interface submits work to Site1 and Site2, each with a StorageElement and a ComputingElement
Inputs: compounds database, software, parameter settings, target structures
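The distribution scheme described on this slide (each computing element handles a different sample of ligands, and scores are compared on the user's machine) can be sketched as follows. This is a minimal local simulation under stated assumptions: the function names, the toy scoring function, and the convention that a lower score is better are all hypothetical, and no grid middleware is involved.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def split_into_jobs(ligands: Sequence[str], n_jobs: int) -> List[List[str]]:
    """Assign ligands to jobs round-robin so each job gets a different sample."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    chunks = [list(ligands[i::n_jobs]) for i in range(n_jobs)]
    return [chunk for chunk in chunks if chunk]  # drop empty jobs


def run_job(chunk: Sequence[str], score_fn: Callable[[str], float]) -> Dict[str, float]:
    """Stand-in for one docking job on a computing element."""
    return {ligand: score_fn(ligand) for ligand in chunk}


def merge_and_rank(results: Sequence[Dict[str, float]]) -> List[Tuple[str, float]]:
    """Collect the scores returned by all jobs and sort them on the user's machine.

    Assumption: lower score means better predicted binding.
    """
    merged: Dict[str, float] = {}
    for job_result in results:
        merged.update(job_result)
    return sorted(merged.items(), key=lambda item: item[1])


def toy_score(ligand: str) -> float:
    """Toy scoring function, NOT a real docking score."""
    return -float(len(ligand))


ligands = ["a", "bb", "ccc", "dddd", "eeeee"]
jobs = split_into_jobs(ligands, n_jobs=2)
ranked = merge_and_rank([run_job(job, toy_score) for job in jobs])
print(jobs)
print(ranked[0])
```

Because each ligand is docked independently, the jobs share no state, which is why the slide describes docking as easily distributed.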

Results of the preliminary tests
Docking application deployed since the summer [...]
[...],000 jobs since January 2005
Tests performed with the software Autodock on the biomedical VO: 100,000 compounds, 500 jobs
Total CPU time for jobs: 6 months CPU
User script time: 40 h
Gain of time for the user: 150
CPU time for 1 job: 9 h
Input and output transfer time between SE and CE for 1 job: 2.5 mn
Waiting time for 1 job due to the grid: 30 mn
Resubmitted jobs: [...]
Aborted jobs: 16 (3%)
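The figures on this slide can be cross-checked with a short calculation. The sketch below only uses the values listed above (500 jobs, 9 h CPU per job, 2.5 mn transfer, 30 mn waiting); the 30-day month is an assumption made for the conversion, and the variable names are illustrative.

```python
N_JOBS = 500
CPU_HOURS_PER_JOB = 9.0
TRANSFER_MIN_PER_JOB = 2.5
WAITING_MIN_PER_JOB = 30.0
DAYS_PER_MONTH = 30.0  # assumption used only for the unit conversion

# Total CPU time across all jobs.
total_cpu_hours = N_JOBS * CPU_HOURS_PER_JOB
total_cpu_days = total_cpu_hours / 24
total_cpu_months = total_cpu_days / DAYS_PER_MONTH

# Grid overhead per job (transfer plus waiting) relative to its CPU time.
overhead_fraction = (TRANSFER_MIN_PER_JOB + WAITING_MIN_PER_JOB) / (CPU_HOURS_PER_JOB * 60)

print(total_cpu_hours, total_cpu_days, total_cpu_months)
print(round(overhead_fraction, 3))
```

500 jobs at 9 h each give 4,500 CPU hours, about 187.5 days, which is consistent with the reported total of roughly 6 months of CPU time. Transfer and waiting add about 6% on top of the CPU time of a single job.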

Data challenge scenario
Scenario 1
Duration: 3 weeks
CPU time: 80 years CPU
Grid performance: 70%
Number of CPU: 2,000
Number of grid jobs (20h): 30,000
Storage: 2*6 TB
Docking workflow description
–Number of compounds: 500,000
–Number of parameter settings: 4
Objective: selection of the best hits with short analysis
FlexX
–Running time: 1 mn
–Output size: 1 MB
–Job output size: 1.2 GB
–Job compressed output size: 250 MB
Autodock
–Running time: 2.5 mn
–Output size: 1 MB
–Job output size: 0.5 GB
–Job compressed output size: 100 MB
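The scenario numbers are mutually consistent, as the following sketch shows. It assumes that "grid performance 70%" means that 70% of the nominal CPU capacity is effectively used, and that a job's output size is the number of dockings in a 20 h job times the per-docking output size; both are interpretations, not statements from the slide. The function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24


def wall_clock_days(cpu_years: float, n_cpus: int, efficiency: float) -> float:
    """Elapsed days needed to deliver cpu_years on n_cpus at a given efficiency."""
    cpu_hours = cpu_years * HOURS_PER_YEAR
    return cpu_hours / (n_cpus * efficiency) / 24


def dockings_per_job(job_hours: float, minutes_per_docking: float) -> int:
    """Number of dockings that fit in one grid job of the given length."""
    return int(job_hours * 60 / minutes_per_docking)


duration_days = wall_clock_days(cpu_years=80, n_cpus=2000, efficiency=0.70)

# One output file of 1 MB per docking, 20 h jobs.
flexx_job_mb = dockings_per_job(20, 1.0) * 1      # FlexX: 1 mn per docking
autodock_job_mb = dockings_per_job(20, 2.5) * 1   # Autodock: 2.5 mn per docking

print(round(duration_days, 1))
print(flexx_job_mb, autodock_job_mb)
```

Under these assumptions, 80 CPU years on 2,000 CPUs at 70% efficiency take about 21 days, matching the 3-week duration. A 20 h job holds 1,200 FlexX dockings (about 1.2 GB of output) or 480 Autodock dockings (about 0.5 GB), matching the job output sizes listed.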

Output analysis (Fraunhofer)
Post filtering
Clustering of similar conformations
Checking pharmacophoric points of each conformation
Doing statistics on the score distribution
Re-ranking for interesting compounds
Sorting and assembly of data
Figure: Ligand plot of 1LEE (Plasmepsin II) with inhibitor R
Figure: Ligand plot of 1LF3 (Plasmepsin II) with inhibitor EH5 332

Follow-up of the DC
The best hits found by post-treatment will be published and available on a permanent grid storage via a portal
–Experimental screening of the most promising hits
A knowledge space will be progressively built around these results
–to extract and process the most interesting information
–to enrich the data with the results found later by other in silico drug discovery processes
The in silico drug discovery will be further extended
–to include more precise molecular dynamics computations using molecular dynamics software like NAMD

From drug discovery to drug delivery
Drug discovery is about finding new drugs
However, even the best drugs are useful only if they are made available to the sick
Drug delivery is a huge challenge for developing countries
–Lack of healthcare infrastructures
–Lack of resources to buy drugs
–Lack of education to deliver them
–Lack of information on drug efficiency
For drug delivery, grids have a real added value
–To collect data in endemic areas
–To provide data and tools to endemic areas (local research, training)

Grids for neglected diseases of the developing world
The grid impact:
–Computing and storage resources for genomics research and in silico drug discovery
–Cross-organizational collaboration space to progress research work
–Federation of patient databases for clinical trials and epidemiology in developing countries
In silico drug discovery process (EGEE, SwissBioGRID, …): Clermont-Ferrand, SCAI Fraunhofer, Swiss Biogrid consortium
Support to local centres in plagued areas (data collection, genomics research, clinical trials and vector control): local research centres in plagued areas

Grid federation of databases for epidemiology
Diagram: an analysis center (Country A) queries hospitals in Countries B, C, D and E for epidemiology
Added value:
- no central repository
- queries on federation of databases
- privacy protected
- telemedicine
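The idea of querying a federation without a central repository can be illustrated with a small sketch. Everything here is hypothetical: the record fields, the site names, and the functions are invented for illustration, and real deployments would run the query remotely at each hospital rather than in one process. The point shown is that only aggregate counts leave a site, while patient records stay local.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def local_count(records: List[Record], predicate: Callable[[Record], bool]) -> int:
    """Run the query at one site and return only an aggregate count."""
    return sum(1 for record in records if predicate(record))


def federated_count(
    sites: Dict[str, List[Record]], predicate: Callable[[Record], bool]
) -> Dict[str, int]:
    """Send the same query to every site and gather per-site counts."""
    return {name: local_count(records, predicate) for name, records in sites.items()}


# Toy data standing in for hospital databases (hypothetical).
sites = {
    "Country B": [
        {"diagnosis": "malaria", "year": 2005},
        {"diagnosis": "flu", "year": 2005},
    ],
    "Country C": [
        {"diagnosis": "malaria", "year": 2005},
        {"diagnosis": "malaria", "year": 2004},
    ],
}


def is_malaria(record: Record) -> bool:
    return record["diagnosis"] == "malaria"


counts = federated_count(sites, is_malaria)
print(counts, sum(counts.values()))
```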

Grid federation of databases for clinical trials
Diagram: a pharmaceutical laboratory / international organization (Country A) queries hospitals in Countries B, C, D and E for drug / vaccine assessment
Added value:
- no central repository
- queries on federation of databases
- privacy protected

Projects starting on EGEE in relation to drug delivery and telemedicine
Grid-enabled telemedicine for medical development
–Development of neurosurgery in poverty regions of western China
–Ophthalmology in Burkina Faso
   –Collaboration with Schiphra dispensary (Ouagadougou, Burkina Faso)

Grid-enabled telemedicine for medical development
Collaboration: NPO Chain of Hope, No. 9 Hospital Shanghai (neurosurgery unit), Chuxiong Hospital (Yunnan), CNRS-IN2P3, Clermont-Ferrand hospitals
Goal: improve patient follow-up by French clinicians
Method: grid-enabled telemedicine web application

Conclusion
Grid technologies promise to change the way organizations tackle complex problems by offering unprecedented opportunities for resource sharing and collaboration
Grids should provide the services needed for in silico drug discovery
Applied to world health development, grids should also
–Help monitor epidemics
–Strengthen R&D on neglected diseases
–Grant easier access to eHealth
We are looking for joint pilot projects with a pharmaceutical lab
–Develop a grid-enabled drug discovery pipeline for malaria
–Build a federation of databases to address 1 infectious disease (epidemiology, clinical trials, vector control)
–Study grid added value for drug delivery