In silico docking on grid infrastructures


In silico docking on grid infrastructures
Jean Salzemann, LPC of Clermont-Ferrand, France (CNRS/IN2P3)
Embrace Workshop, Helsinki, 2006/06/17
Credit: Nicolas Jacq, Vincent Breton

Content
- WISDOM initiative
- Challenges of high throughput virtual docking
- Development of a grid environment for large-scale deployment
- Achieved deployments on the EGEE infrastructure
  - Wide In Silico Docking On Malaria
  - Accelerate drug design against H5N1 neuraminidase
- Perspectives

Tuesday 2 April 2019 In silico docking on grid infrastructures, NGN Helsinki, 2006/06/01

WISDOM initiative
- The WISDOM initiative aims to demonstrate the relevance and impact of the grid approach to drug discovery for neglected and emerging diseases.
- First completed data challenges:
  - Summer 2005: Wide In Silico Docking On Malaria (WISDOM)
  - Spring 2006: Accelerate drug design against H5N1 neuraminidase
- Partners:
  - Grid infrastructures: EGEE, Auvergrid, TWGrid
  - European projects: Embrace, BioinfoGrid, Share, Simdat
  - Institutes and associations: Fraunhofer SCAI, Academia Sinica of Taiwan, ITB, Unimo University, LPC, CMBA, CERN-ARDA, HealthGrid

There is a need…
- to develop new drugs for the diseases of the developing world
  - HIV/AIDS, malaria and tuberculosis account for 5.6 million deaths
  - Permanent need to develop new drugs to fight emerging drug resistance (malaria)
  - Pharmacopeia unchanged for decades against trypanosomiasis, leishmaniasis, Chagas disease, ...
- to be able to quickly develop new drugs against emerging diseases
  - H5N1, SARS, dengue… are recent examples of emerging diseases
  - Many factors, such as worldwide exchanges, can help such diseases spread on a large scale
  - Need to adapt quickly to emerging resistance

Phases of a pharmaceutical development
- Target discovery: Target Identification, Target Validation
- Lead discovery: Lead Identification, Lead Optimization
- Clinical Phases (I-III)
- Computer Aided Drug Design (CADD) methods: vHTS, similarity analysis, database filtering, de novo design, diversity selection, biophores, alignment, combinatorial libraries, ADMET, QSAR
- Molecular docking: predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure
- Duration: 12-15 years. Costs: 500-800 million US $

The development of a new drug in pharmaceutical research is extremely time consuming and expensive. On average, the complete research process from the early screening experiments to clinical testing takes 10 to 15 years and costs about 150-300 million dollars. Recent efforts to minimize development time and cost have led to the automation and miniaturization of modern drug testing systems. With the rapid progress in combinatorial chemistry and high-throughput screening (HTS), there is an increasing need to interpret large amounts of "noisy" data. Today it is possible to test the binding affinity of tens of thousands of potential drug candidates per day. The results of these experiments are analyzed with the help of special computer software (CADD).

Grid-enabled virtual screening workflow
[Diagram; recoverable labels only]
- Grid service customers: chemist/biologist teams, biology teams
- Grid service providers: cheminformatics teams, bioinformatics teams
- Grid infrastructure services: MD services, docking services, annotation services
- Data flow labels: Target, Hits, Selected hits, with three check points
- Data access for expert teams in the world

Challenges for high throughput virtual docking
Example: data challenge against H5N1 NA
- Millions of chemical compounds available in laboratories
- In vitro high throughput screening: 1 $/compound, nearly impossible
- 300,000 chemical compounds: ZINC & chemical combinatorial library
- Target (PDB): neuraminidase (8 structures)
- Molecular docking (Autodock): ~100 CPU years, 600 GB of data
- Data challenge on EGEE, Auvergrid, TWGrid: ~6 weeks on ~2000 computers
- Hits sorting and refining
- In vitro screening of 100 hits
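The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that every compound is docked against each of the 8 target structures (the slide does not say so explicitly, but the product is close to the roughly 2.5 million dockings reported later in the talk). All variable names are illustrative and the inputs are the slide's rounded values.

```python
# Back-of-the-envelope check of the H5N1 data challenge figures.
# Inputs are the slide's approximate values; names are illustrative.
MINUTES_PER_YEAR = 365.25 * 24 * 60

compounds = 300_000        # ZINC + combinatorial library
target_structures = 8      # neuraminidase structures
cpu_years = 100            # approximate total docking cost
computers = 2000           # approximate number of machines used

# Assumption: each compound is docked once against each structure.
dockings = compounds * target_structures

# Average CPU cost of a single docking, in minutes.
cpu_minutes_per_docking = cpu_years * MINUTES_PER_YEAR / dockings

# Wall-clock time if all machines were busy 100% of the time.
ideal_days = cpu_years * 365.25 / computers

print(f"dockings: {dockings:,}")
print(f"CPU minutes per docking: {cpu_minutes_per_docking:.1f}")
print(f"ideal wall-clock time on {computers} CPUs: {ideal_days:.1f} days")
```

Under these assumptions the ideal duration is under three weeks, so the reported ~6 weeks suggests that a substantial part of the elapsed time went to grid overheads such as queuing, transfers and resubmissions, and that not all machines were in use all the time.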

Issues for the grid-enabled high throughput virtual docking
- Computer-based in silico screening can help to identify the most promising leads for biological tests
  - systematic and productive
  - reduces the cost of the trial-and-error approach
- In silico docking is well suited to grid deployment
  - CPU-intensive application
  - Huge amount of output
  - No communication between tasks
- Issues of a large-scale grid deployment
  - The rate of submitted jobs must be carefully monitored
    - Robust and fault-tolerant environment
    - Testing services and resources for the application
  - The amount of transferred data impacts grid performance
    - Install application on grid
    - Subsets of the database instead of large unique compound files
  - Grid process introduces significant delays
    - Submission is limited by the number of Resource Brokers
    - All resources must be filled but saturation must be avoided
  - Licensed software requires a license distribution strategy on the grid
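The point about using subsets of the database instead of one large compound file can be illustrated with a short sketch. This is not the WISDOM code: the function name, the subset size and the compound and target identifiers are all hypothetical, and a real deployment would split files on storage elements rather than in-memory lists.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def split_into_subsets(compounds: Iterable[str], subset_size: int) -> Iterator[List[str]]:
    """Yield consecutive subsets of at most `subset_size` compounds."""
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    iterator = iter(compounds)
    while True:
        subset = list(islice(iterator, subset_size))
        if not subset:
            return
        yield subset


def build_jobs(compounds: List[str], targets: List[str], subset_size: int) -> List[Tuple[str, int, List[str]]]:
    """One independent job per (target, subset) pair; tasks never communicate."""
    subsets = list(split_into_subsets(compounds, subset_size))
    return [(target, index, subset)
            for target in targets
            for index, subset in enumerate(subsets)]


# Hypothetical identifiers, for illustration only.
compounds = [f"compound-{i}" for i in range(10)]
jobs = build_jobs(compounds, targets=["target-A", "target-B"], subset_size=4)
print(len(jobs), [len(subset) for _, _, subset in jobs])
```

The subset size is the tuning knob: smaller subsets mean less data per transfer and less work lost when a job fails, but more jobs to push through the Resource Brokers.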

Grid tools of the data challenges
- WISDOM
  - a workflow of grid job handling: automated job submission, status check and report, error recovery
  - push model job scheduling
  - batch mode job handling
  - http://wisdom.eu-egee.fr
- DIANE
  - a framework for applications with a master-worker model
  - pull mode job scheduling
  - interactive mode job handling with a flexible failure recovery feature
  - http://cern.ch/diane
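The difference between push and pull scheduling is easiest to see in code. The following is a minimal sketch of the pull (master-worker) idea using local threads and a shared queue. It does not use or imitate the DIANE API; `run_pull_model`, its parameters and the retry policy are assumptions made for illustration.

```python
import queue
import threading
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def run_pull_model(tasks: Iterable[Hashable],
                   worker_count: int,
                   process: Callable[[Hashable], object],
                   max_attempts: int = 3) -> Tuple[Dict[Hashable, object], List[Hashable]]:
    """Workers pull tasks from a shared queue until it is empty.

    A task whose processing raises is put back on the queue until it has
    been tried `max_attempts` times, then recorded as failed.
    """
    pending: "queue.Queue[Tuple[Hashable, int]]" = queue.Queue()
    for task in tasks:
        pending.put((task, 1))

    results: Dict[Hashable, object] = {}
    failed: List[Hashable] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                task, attempt = pending.get_nowait()  # the worker asks for work
            except queue.Empty:
                return  # nothing left to pull
            try:
                outcome = process(task)
            except Exception:
                if attempt < max_attempts:
                    pending.put((task, attempt + 1))  # simple failure recovery
                else:
                    with lock:
                        failed.append(task)
            else:
                with lock:
                    results[task] = outcome

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, failed


# Stand-in for a docking computation.
done, lost = run_pull_model(range(5), worker_count=3, process=lambda t: t * t)
print(sorted(done.items()), lost)
```

In a push model the scheduler decides in advance which resource receives which job. In the pull model above, fast or lightly loaded workers simply take more tasks, which is one reason it pairs well with interactive failure recovery.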

WISDOM components
[Diagram; recoverable labels only]
- Roles: Installer, Tester, User, Supervisor
- Scripts: wisdom_install, wisdom_test, wisdom_execution, wisdom_collect, wisdom_db, wisdom_site
- Functions handled for a set of jobs: workload definition, job submission, job monitoring, job bookkeeping, fault tracking, fault fixing, job resubmission
- GRID:
  - Grid services (RB, RLS…)
  - Grid resources (CE, SE)
  - Application components (software, database)
- License server
- Accounting data
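The submission, monitoring, bookkeeping and resubmission functions named in the diagram can be sketched as a small batch loop. This is an illustration only, not the behaviour of the wisdom_* scripts: `run_batch`, the `submit` and `check` callables, and the status strings are hypothetical stand-ins for real grid middleware calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class JobRecord:
    """Bookkeeping entry for one job."""
    name: str
    attempts: int = 0
    state: str = "NEW"  # NEW -> SUBMITTED -> DONE or FAILED
    history: List[str] = field(default_factory=list)


def run_batch(job_names: Iterable[str],
              submit: Callable[[str], object],
              check: Callable[[object], str],
              max_attempts: int = 3) -> Dict[str, JobRecord]:
    """Submit all jobs, check their status, and resubmit the failures."""
    book = {name: JobRecord(name) for name in job_names}
    todo = list(book)
    while todo:
        # Job submission: push every pending job to the grid.
        handles = {}
        for name in todo:
            record = book[name]
            record.attempts += 1
            record.state = "SUBMITTED"
            handles[name] = submit(name)
        # Job monitoring and fault tracking.
        todo = []
        for name, handle in handles.items():
            record = book[name]
            status = check(handle)
            record.history.append(status)
            if status == "DONE":
                record.state = "DONE"
            elif record.attempts < max_attempts:
                todo.append(name)  # job resubmission in the next round
            else:
                record.state = "FAILED"
    return book


# Simulated grid: "job-b" fails on its first attempt only.
attempt_counter: Dict[str, int] = {}

def fake_submit(name: str) -> tuple:
    attempt_counter[name] = attempt_counter.get(name, 0) + 1
    return (name, attempt_counter[name])

def fake_check(handle: tuple) -> str:
    name, attempt = handle
    return "FAILED" if name == "job-b" and attempt == 1 else "DONE"

book = run_batch(["job-a", "job-b"], fake_submit, fake_check)
for record in book.values():
    print(record.name, record.state, record.attempts, record.history)
```

Keeping the full status history per job is what makes later fault analysis and accounting possible.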

Simplified grid workflow for WISDOM
[Diagram; recoverable labels only]
- User interface, WISDOM production system, Resource Broker
- Site1 and Site2, each with a Computing Element and a Storage Element
- Inputs: parameter settings, target structures, compounds database (as subsets), software
- Outputs: results, statistics
- FlexX license server: 3000 floating licenses given by BioSolveIT to SCAI
- The maximum number of licenses in use was 1008

Grid resources of the data challenges
- EGEE-II, AuverGrid, TWGrid
- A worldwide infrastructure freely providing more than 5,000 CPUs and 21 TB for biomedical applications

First biomedical data challenge: World-wide In Silico Docking On Malaria (WISDOM)
- Significant biological parameters
  - 2 different docking applications (Autodock and FlexX)
  - About 1 million virtual compounds selected
  - Target proteins from the parasite responsible for malaria
- Significant numbers
  - Total of about 46 million ligands docked in 6 weeks
  - 1 TB of data produced
  - Up to 1,700 computers in 15 countries used simultaneously
  - About 80 CPU years
  - Average crunching factor ~600
- [Plots: number of docked compounds vs time; number of running and waiting jobs vs time]

Second biomedical data challenge: Accelerate drug design against H5N1 neuraminidase
- Significant biological parameters
  - 1 docking application (Autodock)
  - About 300,000 virtual compounds selected
  - Target proteins with predicted mutations involved in the virus multiplication
- Significant numbers
  - Total of about 2.5 million ligands docked in 6 weeks
  - 600 GB of data produced
  - Up to 2,000 computers in 17 countries used simultaneously, corresponding to about 105 CPU years
  - Average crunching factor ~900
- Share of jobs by EGEE federation
  - France: 25%
  - UKI: 23%
  - SouthWestern Europe: 16%
  - SouthEastern Europe: 14%
  - Italy: 6%
  - Asia Pacific: 5%
  - Northern Europe: 4%
  - Russia: 4%
  - Central Europe: 2%
  - Germany-Switzerland: 1%
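The slides do not define the "crunching factor". A common reading, assumed here, is the total CPU time consumed divided by the elapsed wall-clock time, i.e. the average number of CPUs kept busy. The sketch below applies that assumed definition to the rounded figures of both data challenges.

```python
def crunching_factor(cpu_years: float, wall_clock_weeks: float) -> float:
    """Assumed definition: total CPU time divided by elapsed wall-clock time."""
    wall_clock_years = wall_clock_weeks * 7 / 365.25
    return cpu_years / wall_clock_years


# Rounded figures from the two data challenges.
malaria = crunching_factor(cpu_years=80, wall_clock_weeks=6)
h5n1 = crunching_factor(cpu_years=105, wall_clock_weeks=6)

print(f"malaria data challenge: {malaria:.0f}")
print(f"H5N1 data challenge:    {h5n1:.0f}")
```

With these inputs the H5N1 value lands close to the reported ~900, while the malaria value comes out somewhat above the reported ~600. The gap may come from rounding in "about 80 CPU years" and "6 weeks", or from a slightly different definition than the one assumed here.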

Selecting the promising compounds
- In silico screening provides not only the docking poses of a compound against the target but also the docking energy
- By ranking this information, chemists can select the promising compounds to carry forward into structure-based drug design for potential drugs
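The ranking step can be sketched in a few lines. The example assumes that a lower (more negative) docking energy indicates a better predicted binding, as is the convention for Autodock binding energies, and that each compound may have several poses of which only the best is kept. The function names and the energy values are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def best_energy_per_compound(poses: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Keep the lowest docking energy seen for each compound."""
    best: Dict[str, float] = {}
    for compound, energy in poses:
        if compound not in best or energy < best[compound]:
            best[compound] = energy
    return best


def top_hits(poses: Iterable[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Rank compounds by their best energy and return the n most promising."""
    best = best_energy_per_compound(poses)
    return sorted(best.items(), key=lambda item: item[1])[:n]


# Made-up (compound, docking energy in kcal/mol) pairs, several poses each.
poses = [
    ("cmpd-1", -6.2), ("cmpd-1", -7.9),
    ("cmpd-2", -9.4), ("cmpd-2", -8.8),
    ("cmpd-3", -5.1),
    ("cmpd-4", -8.1),
]
for compound, energy in top_hits(poses, n=3):
    print(compound, energy)
```

A docking score is only a first filter. The hits selected this way still need the refinement and in vitro testing described elsewhere in the talk.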

Perspectives
- Second large-scale docking on EGEE in fall 2006
  - Several new targets foreseen for malaria, dengue and other neglected diseases
  - Resources needed: ~80 CPU years per target
  - Supported by the EGEE-II and EELA European projects and the Swiss BioGrid initiative
  - Collaboration is open for new targets, software & infrastructures
- Reranking of WISDOM hits by Molecular Dynamics simulations
  - Supported by the BioinfoGrid & EGEE-II European projects
  - Interest in resources on supercomputers (contact with DEISA)
- Best hits further processed through in vitro testing and structure-activity relationships