INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Using Grid Computing to Accelerate Structure-Based Design Against Influenza A Neuraminidases.

Presentation transcript:

INFSO-RI, Enabling Grids for E-sciencE
Using Grid Computing to Accelerate Structure-Based Design Against Influenza A Neuraminidases
Hurng-Chun Lee, Li-Yung Ho, and Ying-Ta Wu*
*Genomics Research Center, Academia Sinica, Taiwan
EGEE User Forum, CERN

Outline
Influenza A Pandemic
[Figure labels: H5N1; H1N1, H2N2, H3N2, H1N1; H9N2, H7N7, H5N1; NA, HA; deaths /170 cases, Feb 26, 2006]

Neuraminidases cleave host receptors and help release new virions

Neuraminidase and Inhibitors
Structure-Based Drug Design
[Figure labels: Zanamivir; R=guanidine; Oseltamivir; R=H; R’=amine; binding pocket]

Drug-resistant variants and Point Mutation
Mutation: entries for N1 and N2, in source order
– R292K: oseltamivir, Zanamivir
– H274Y(F): oseltamivir
– N294S: oseltamivir ? oseltamivir
– E119V: oseltamivir ? oseltamivir
– E119(G;A;D): oseltamivir ? Zanamivir
Legend: predicted mutation site (by structure overlay and sequence alignment); reported mutation site

Docking Engine: AutoDock 3.0.5
1. Prepare the Target Protein
   -- add polar hydrogen atoms
   -- assign charges to atoms
   -- decide range of binding site
2. Run AutoGrid
3. Prepare the Ligand
   -- assign charges to atoms
   -- decide flexible bonds (run AutoTors)
4. Run AutoDock
5. Evaluate Results and Rank Score
Tools: AutoGrid, AutoTors, AutoDock
Garrett M. Morris, David S. Goodsell, Ruth Huey, William E. Hart, Scott Halliday, Rik Belew, Arthur J. Olson
Morris et al. (1998), J. Computational Chemistry, 19
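Steps 2 and 4 of the workflow above can be sketched as two command lines per docking task. This is only an illustration: the file stems are hypothetical, and the `-p` (parameter file) and `-l` (log file) flags follow the usual AutoDock 3 command-line convention, which should be checked against the installed version. The helper only builds the commands and does not run them.

```python
def docking_commands(target_stem, ligand_stem):
    """Build argv lists for the AutoGrid step and the AutoDock step.

    target_stem and ligand_stem are hypothetical file stems; the .gpf/.dpf
    parameter files are assumed to exist already (steps 1 and 3).
    """
    # Step 2: AutoGrid reads a grid parameter file and writes a grid log.
    autogrid = ["autogrid3", "-p", f"{target_stem}.gpf", "-l", f"{target_stem}.glg"]
    # Step 4: AutoDock reads a docking parameter file and writes a docking log.
    autodock = ["autodock3", "-p", f"{ligand_stem}.dpf", "-l", f"{ligand_stem}.dlg"]
    return [autogrid, autodock]


cmds = docking_commands("target", "ligand")
for cmd in cmds:
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where the AutoDock binaries are installed.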

Application Characteristic
Virtual screening based on molecular docking is the most time-consuming part of the structure-based drug design workflow
Number of docking tasks = N x M
– N: number of ligands
– M: number of target structures
CPU-bound application, huge amount of output, no communication between tasks
Task complexity is unpredictable
– difficult to split the tasks with a trivial domain decomposition method
The pitiful …
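The N x M task count can be made concrete with a small sketch. The identifiers are hypothetical; the sizes (100 ligands, 5 target conformations) are the ones used in the performance test case of this presentation.

```python
from itertools import product


def enumerate_docking_tasks(ligands, targets):
    """Return one (ligand, target) pair per independent docking task."""
    return list(product(ligands, targets))


# Hypothetical identifiers for N = 100 ligands and M = 5 target structures.
ligands = [f"ligand_{i:03d}" for i in range(100)]
targets = [f"conformation_{j}" for j in range(5)]

tasks = enumerate_docking_tasks(ligands, targets)
print(len(tasks))  # N x M = 500
```

Because the tasks do not communicate, each pair can be handed to any free worker in any order.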

Issues of the Grid applications
Because of the loosely coupled nature of the Grid, distributing application jobs on it is not trivial
– extra work is needed for efficient job handling and result gathering
– effort is also needed to handle transient network or site problems
– complexities should be hidden, and the interface to the end user should be application oriented
The significant Grid system overhead means that only jobs with long computing times benefit from the Grid
– not suitable for pilot jobs used for decision making

What is DIANE?
DIANE = Distributed Analysis Environment
A lightweight framework for parallel scientific applications in the master-worker model
– ideal for applications without communication between parallel tasks (e.g. most bioinformatics applications that analyze large amounts of independent data)
The framework takes care of all synchronization, communication and workflow management details on behalf of the application
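The master-worker model can be illustrated with a minimal sketch in plain Python threads. This is not DIANE's API: `run_master_worker` and `fake_dock` are hypothetical names, and the "docking" function is a placeholder. The point is that workers pull tasks from a shared queue, so a worker that finishes a short task immediately takes the next one, which suits tasks of unpredictable length.

```python
import queue
import threading


def run_master_worker(tasks, work_fn, n_workers=4):
    """Run work_fn on every task with a pool of workers that pull from a queue."""
    todo = queue.Queue()
    for task in tasks:
        todo.put(task)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = todo.get_nowait()
            except queue.Empty:
                return  # no work left, the worker exits
            outcome = work_fn(task)
            with lock:  # protect the shared result table
                results[task] = outcome

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_dock(task):
    """Placeholder for a docking run; returns a dummy score."""
    ligand, target = task
    return len(ligand) + len(target)


demo_tasks = [("lig_a", "conf_1"), ("lig_b", "conf_1"), ("lig_a", "conf_2")]
results = run_master_worker(demo_tasks, fake_dock, n_workers=2)
print(sorted(results.items()))
```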

Distributing AutoDock tasks on the Grid using DIANE

DIANE/AutoDock
A generic framework into which applications can easily plug in

autodock.job (application-specific job attributes and job-level failure recovery definition):

    # -*- python -*-
    Application = 'Autodock'
    JobInitData = {'macro_repos' :'/home/hclee/diane_demo/autodock/macro',
                   'ligand_repos':'/home/hclee/diane_demo/autodock/ligand',
                   'ftprotocol':'gass',
                   'output_prefix':'autodock_test' }
    ## The input files will be staged in to workers
    InputFiles = []
    ## The definition of failure recovery
    def failRecovery(self):
        print '*'*30
        for t in self.master.tasks.failed():
            print "ignoring failed task:",t
            t.ignore()
        print '*'*30
        return 1

Intuitive job execution command:

    % diane.startjob –-job autodock.job –ganga –w

Possible to mix heterogeneous computing backends

DIANE/AutoDock – integrated user interface

Performance Evaluation
Test case
– target: 1 protein in 5 conformations
– ligands: 100 small compounds (with 7 positives)
→ 500 docking tasks in total
Test environment
– DIANE backend handler: SSH
– Hardware spec: traditional PC cluster with NFS (2 x Intel Xeon 2.8 GHz + 2 GB memory per node)
– Grid: LCG

Test Results: DIANE/AutoDock framework on Cluster
Duration time: total elapsed time of a DIANE job
Each DIANE job contains 500 tasks (5 protein conformations x 100 compounds)

Test Results: DIANE/AutoDock framework on Cluster
Handling docking jobs on a traditional PC cluster
[Figure annotations: good load balance; a DIANE/AutoDock task]

DIANE/AutoDock framework on LCG-GRID
[Figure annotation: terminated]

Without redundant scheduling

With redundant scheduling
[Figure annotation: job was reassigned to other nodes]
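One reading of the reassignment shown here is that the master hands a task to another node when the node first chosen does not respond. The sketch below simulates that reading only; it is not how DIANE implements it. The function name, the round-robin placement and the `is_stalled` check (standing in for a timeout) are all assumptions.

```python
def schedule_with_reassignment(tasks, workers, is_stalled, max_attempts=3):
    """Place tasks round-robin and move a task on if its worker is stalled.

    is_stalled(worker) is a hypothetical health check standing in for a
    timeout. Returns (assignment, reassigned), where assignment maps each
    task to a worker (or None if every attempt failed) and reassigned lists
    the tasks that had to be moved.
    """
    assignment = {}
    reassigned = []
    for i, task in enumerate(tasks):
        assignment[task] = None
        for attempt in range(max_attempts):
            worker = workers[(i + attempt) % len(workers)]
            if not is_stalled(worker):
                assignment[task] = worker
                if attempt > 0:
                    reassigned.append(task)
                break
    return assignment, reassigned


# Hypothetical nodes; node_b never answers.
workers = ["node_a", "node_b", "node_c"]
stalled = {"node_b"}
assignment, reassigned = schedule_with_reassignment(
    ["t0", "t1", "t2", "t3"], workers, lambda w: w in stalled
)
print(assignment)
print(reassigned)
```

Without the reassignment step, task `t1` would wait on the stalled node and hold up the whole job.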

Compound library enrichment
AutoDock parameters:
– translation step = 2.0 Å
– quaternion step = 20 degrees
– torsion step = 20 degrees
– number of energy evaluations = 1.5 × 10^6
– max. number of generations = 2.7 × 10^4
– number of runs = 10
[Figure legend: red = positives]
All positives were docked within RMSD < 1.5 Å
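Enrichment of a compound library is commonly summarized with an enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate in the whole library. The sketch below assumes that definition. The library size (100 compounds, 7 positives) matches the test case, but the ranking is an idealized, hypothetical one in which all positives rank first; it is not a result from this presentation.

```python
def enrichment_factor(ranked_ids, positives, top_fraction):
    """EF = (hit rate in the top fraction) / (hit rate in the whole library)."""
    n_total = len(ranked_ids)
    n_top = max(1, round(n_total * top_fraction))
    hits_top = sum(1 for cid in ranked_ids[:n_top] if cid in positives)
    return (hits_top / n_top) / (len(positives) / n_total)


# Hypothetical ranking: compound ids 0..99 sorted best score first,
# with the 7 positives (ids 0..6) at the top of the list.
ranked_ids = list(range(100))
positives = set(range(7))

ef = enrichment_factor(ranked_ids, positives, top_fraction=0.10)
print(f"EF at 10%: {ef:.1f}")
```

A value near 1 would mean the docking score ranks positives no better than random selection.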

Probe effects due to minor changes in target’s binding sites

Summary
– Modeling of compound-protein complexes can be sped up by distributing molecular docking processes on the Grid.
– With the DIANE framework, distributing molecular docking tasks on the Grid is easy to implement and gives the end user an intuitive interface.
– The DIANE framework also provides functionality for tuning the system to tackle the issues that arise when distributing molecular docking tasks on the loosely coupled Grid.
– This simple test case demonstrated that huge compound databases can be effectively enriched by executing docking tasks on the Grid.
– However, more resources are required to build a real HTP docking service for the life science community.

Acknowledgements
Li-Yung Ho, Hurng-Chun Lee, Hsing-Yen Chen, Dr. Simon Lin, Jakub Moscicki, Dr. Massimo Lamanna
Support from the Genomics Research Center, Academia Sinica and the National Science Council, Taiwan is highly appreciated
LCG-ARDA, CERN

Interacting Complexes
A key step to structure-based inhibitor design
PDB: 1F8B