2003/10/3 UK-Japan N+N Meeting 1
"Grid Platform for Drug Discovery" Project
Mitsuhisa Sato
Center for Computational Physics, University of Tsukuba, Japan

2003/10/3 UK-Japan N+N Meeting 2
Our Grid Project
JST-ACT program: "Grid platform for drug discovery", funded by JST (Japan Science and Technology Corporation), $1.3M over 3 years, started in 2001
– Tokushima University, Toyohashi Inst. of Tech., University of Tsukuba, Fuji Res. Inst. Corp.
ILDG: International Lattice QCD Data Grid
– CCP, U. of Tsukuba; EPCC, UK; SciDAC, US
– Design of QCDML
– QCD metadata database via web services; QCD data sharing via SRM and Globus replica …

2003/10/3 UK-Japan N+N Meeting 3
High Throughput Computing for drug discovery
Exhaustive parallel conformation search and docking over the Grid
Accumulation of computing results into a large-scale database for reuse
High-performance ab initio MO calculation for large molecules on clusters
"Combinatorial Computing" using the Grid

2003/10/3 UK-Japan N+N Meeting 4
Grid applications of our drug discovery
Conformation search: find possible conformations
Docking search: compute the energy of combinations of molecules
Quantitative Structure-Activity Relationship (QSAR) analysis: finding rules for drug design
[Diagram: drug libraries → conformation search (CONFLEX-G, a Grid-enabled conformation search application) → conformations → docking search against the target (using ab initio MO calculation on clusters; job submission for MO; coarse-grain MO for the Grid: REMD, FMO) → computation results → QSAR analysis; design of an XML format for results and a web service interface]

2003/10/3 UK-Japan N+N Meeting 5
CONFLEX
Algorithm: tree search
– Local conformation changes (Stepwise Rotation, Corner Flap, Edge Flip)
– Initial conformation selection
We are implementing it with OmniRPC
– The tree search action is dynamic!
[Figure: conformation search tree; Anti ΔE = 0.0 kcal/mol, Gauche+ ΔE = 0.9 kcal/mol, Gauche− ΔE = 0.9 kcal/mol]

2003/10/3 UK-Japan N+N Meeting 6
Grid Platform for drug discovery
[Diagram: Univ. of Tsukuba, AIST, Toyohashi Inst. of Tech., and Tokushima Univ. connected by a wide-area network. Roles shown: design of Grid middleware; control & monitoring (scheduling and monitoring of computations, distributed database management); development of a large-scale ab initio MO program; a cluster for CONFLEX; development of the conformation search program (CONFLEX); database for CONFLEX results; database of MO calculation results; 3D structure database for drug design]

2003/10/3 UK-Japan N+N Meeting 7
What can the Grid do? Parallel applications, programming, and our view of the Grid
"Typical" Grid applications
– Parametric execution: execute the same program with different parameters, using a large amount of computing resources
– Master-worker type of parallel program (a sketch follows below)
"Typical" Grid resources
– A cluster of clusters: several PC clusters are available
– Dynamic resources: load and status change from time to time
[Figure: our view — several PC clusters combined into one Grid environment]
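As a concrete illustration of this master-worker, parametric pattern, here is a minimal C sketch using the OmniRPC asynchronous API shown on a later slide. The header name, the remote entry "energy", and its argument list are assumptions for illustration, not project code.

    /* Minimal master-worker parametric sweep (hedged sketch; the remote
       entry "energy" and its argument list are hypothetical). */
    #include <OmniRpc.h>   /* header name assumed */
    #define N 100

    int main(int argc, char **argv) {
        double param[N], result[N];
        OmniRpcRequest reqs[N];
        int i;

        OmniRpcInit(&argc, &argv);
        for (i = 0; i < N; i++) {
            param[i] = 0.01 * i;   /* one parameter per worker task */
            reqs[i] = OmniRpcCallAsync("energy", &param[i], &result[i]);
        }
        OmniRpcWaitAll(N, reqs);   /* collect all results before proceeding */
        OmniRpcFinalize();
        return 0;
    }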

2003/10/3 UK-Japan N+N Meeting 8
Parallel programming in the Grid
Using the Globus shell (GSH)
– Submit batch job scripts to remote nodes
– Staging and workflow
Grid MPI (MPICH-G, PACX-MPI, …)
– General-purpose, but difficult and error-prone
– No support for dynamic resources and fault tolerance
– No support for firewalls or clusters on private networks
Grid RPC
– A good and intuitive programming interface
– Ninf, NetSolve, … OmniRPC

2003/10/3 UK-Japan N+N Meeting 9
Overview of OmniRPC
A Grid RPC system for parallel computing
Provides a seamless parallel programming environment from clusters to the Grid
– It uses "rsh" for a cluster, "GRAM" for a Grid managed by Globus, and "ssh" for conventional remote nodes
– Program development and testing on PC clusters; production runs in the Grid to exploit huge computing resources
– Users can switch configurations via a "host file" without any modification to the program
Makes use of remote PC/SMP clusters as Grid computing resources
– Supports clusters behind firewalls and on private addresses
Host file:
    <Agent invoker="globus" mxio="on"/>
    <JobScheduler type="rr" maxjob="20"/>
[Figure: a client coordinating several PC clusters in the Grid environment]

2003/10/3 UK-Japan N+N Meeting 10
Overview of OmniRPC (cont.)
Easy-to-use parallel programming interface
– A GridRPC based on the Ninf Grid RPC
– Parallel programming using an asynchronous call API
– The thread-safe RPC design allows the use of OpenMP in client programs (see the sketch below)
Supports master-worker parallel programs for parametric search grid applications
– Persistent data support in remote workers for applications which require large data
Monitoring and performance tools

Client code from the slide (reformatted, comments added):

    int main(int argc, char **argv) {
        int i, A[100][100], B[100][100][100], C[100][100][100];
        OmniRpcRequest reqs[100];

        OmniRpcInit(&argc, &argv);
        /* Issue 100 asynchronous RPCs: each computes C[i] = B[i] * A */
        for (i = 0; i < 100; i++)
            reqs[i] = OmniRpcCallAsync("mul", 100, B[i], A, C[i]);
        OmniRpcWaitAll(100, reqs);   /* block until all workers finish */
        ...
        OmniRpcFinalize();
        return 0;
    }
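Because the slide says the RPC layer is thread-safe, a client can also drive workers from OpenMP threads with blocking calls. This is a hedged sketch: a synchronous OmniRpcCall is assumed here by analogy with the asynchronous API above.

    #include <omp.h>

    /* Each OpenMP thread issues one blocking RPC at a time; the thread
       safety of the RPC layer (per the slide) is what makes this legal.
       OmniRpcCall is assumed to be the synchronous counterpart of
       OmniRpcCallAsync. */
    void multiply_all(int B[][100][100], int A[][100], int C[][100][100]) {
        int i;
    #pragma omp parallel for
        for (i = 0; i < 100; i++)
            OmniRpcCall("mul", 100, B[i], A, C[i]);
    }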

2003/10/3 UK-Japan N+N Meeting 11
OmniRPC features
Does it need Globus?
– No, you can use "ssh" as well as "globus"
– This is very useful for application people
– "ssh" can also solve the "firewall" problem
Data persistence model?
– Parametric search applications need to share the initial data; OmniRPC supports this
Can it use many (remote) clusters?
– Yes, OmniRPC supports a "cluster of clusters"
How is it used on different machines and environments?
– You can switch the configuration via a "config file" without modifying the source program
Why not the "GridRPC" standard?
– OmniRPC provides a higher-level interface that hides "scheduling" and "fault tolerance" from users

2003/10/3 UK-Japan N+N Meeting 12
OmniRPC Home Page

2003/10/3 UK-Japan N+N Meeting 13
CONFLEX: from Cluster to Grid
For large biomolecules, the number of combinatorial trial structures becomes huge!
Geometry optimization of large molecular structures requires ever more compute time
– The geometry optimization phase takes more than 90% of total execution time
So far, CONFLEX has been executed on a PC cluster using MPI
The Grid lets us use huge computing resources to overcome these problems!

2003/10/3 UK-Japan N+N Meeting 14
Our Grid Platform
Univ. of Tsukuba: Dennis Cluster (dual P4 Xeon 2.4 GHz, 10 nodes), Alice Cluster (dual Athlon nodes)
AIST: UME Cluster (dual P3 1.4 GHz, 32 nodes)
Tokushima Univ.: Toku Cluster (P3 1.0 GHz, 8 nodes)
Toyohashi Univ. of Tech.: Toyo Cluster (dual Athlon nodes)
Sites connected via Tsukuba WAN and SINET

2003/10/3 UK-Japan N+N Meeting 15
Summary of Our Grid Environment

    Cluster   Machine overview       # of nodes   Throughput (MB/s) #   RTT* (ms) #
    Dennis    Dual P4 Xeon 2.4 GHz   10           –                     –
    Alice     Dual Athlon
    UME       Dual P3 1.4 GHz        32
    Toku      P3 1.0 GHz             8
    Toyo      Dual Athlon

* Round-trip time
# All measurements are between the Dennis cluster and each cluster

2003/10/3 UK-Japan N+N Meeting 16
CONFLEX-G: Grid-enabled CONFLEX
Parallelizes the molecular geometry optimization phase using the master/worker model (a sketch follows below)
The OmniRPC persistent data model (the automatic initializable remote module facility) allows workers to be reused across calls
– Eliminates re-initializing the worker program at every RPC
[Diagram: CONFLEX loop — selection of initial structure → local perturbation → geometry optimization (farmed out to PC clusters A, B, C) → comparison & store → conformation database]
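To make the master/worker structure concrete, here is a hedged sketch of one optimization phase in the style of the client code on slide 10. The Structure type and the names opt_geometry and compare_and_store are hypothetical; the real CONFLEX-G surely differs in detail.

    #define MAX_TRIALS 160   /* largest degree of parallelism in our runs */

    /* One geometry-optimization phase (hedged sketch, hypothetical names):
       farm out every trial structure produced by local perturbation, wait,
       then update the conformation database. */
    void optimize_phase(Structure trial[], Structure opt[], int n_trials) {
        OmniRpcRequest reqs[MAX_TRIALS];
        int i;

        for (i = 0; i < n_trials; i++)   /* one RPC per trial structure */
            reqs[i] = OmniRpcCallAsync("opt_geometry", &trial[i], &opt[i]);
        OmniRpcWaitAll(n_trials, reqs);  /* persistent workers keep their
                                            data loaded between calls */
        for (i = 0; i < n_trials; i++)
            compare_and_store(&opt[i]);  /* comparison & store step */
    }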

2003/10/3 UK-Japan N+N Meeting 17
Experiment Setting
CONFLEX version: 402q
Test data: two molecular samples
– C17 (51 atoms)
– AlaX16a (181 atoms)
Authentication method: SSH
The CONFLEX-G client program was executed on the server node of the Dennis cluster
We used all nodes in the clusters of our grid

2003/10/3 UK-Japan N+N Meeting 18
Sample Molecules
Data reported per molecule: # of trial structures at one optimization phase (degree of parallelism), average exec. time to optimize a trial structure (s), and estimated total exec. time for all trial structures on a single Dennis CPU
– C17 (51 atoms): degree of parallelism 48
– AlaX16a (181 atoms): degree of parallelism 160, estimated total exec. time 26.7 h

2003/10/3 UK-Japan N+N Meeting 19
Comparison between OmniRPC and MPI on the Dennis Cluster
C17 (51 atoms, degree of parallelism 48)
Overhead of on-demand initialization of the worker program in OmniRPC
10× speedup using OmniRPC

2003/10/3 UK-Japan N+N Meeting 20
Execution time of AlaX16a (181 atoms, degree of parallelism 160)
64× speedup

2003/10/3 UK-Japan N+N Meeting 21
Discussion
The performance of CONFLEX-G was observed to be almost equal to that of CONFLEX with MPI
– Overhead in initializing the workers was found; this needs improvement
We achieved a performance improvement using multiple clusters
– A speedup of 64 on 112 workers for AlaX16a (181 atoms)
– However, in our experiment each worker received only one or two trial structures: too few!
– Load imbalance occurs because the execution time of each optimization varies
We expect more speedup for larger molecules

2003/10/3 UK-Japan N+N Meeting 22
Discussion (cont'd)
Possible improvements:
– Exploit more parallelism: parallelize the outer loop to increase the number of structure optimizations performed at a time
– Efficient job scheduling: heavy jobs to fast machines, light jobs to slow machines (can we estimate execution time? a sketch of this heuristic follows below)
– Parallelize the worker program with SMP (OpenMP): increase the performance of each worker and reduce the number of workers
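As an illustration of the heavy-jobs-to-fast-machines idea (not from the slides), here is a hedged sketch of a longest-processing-time-first assignment; the per-job cost estimates and per-machine speeds are assumed inputs.

    #include <stdlib.h>

    /* Hedged sketch of LPT (longest processing time first) scheduling:
       sort jobs by estimated cost, then greedily give each job to the
       machine that would finish it earliest. */
    typedef struct { int id; double est_cost; } Job;

    static int by_cost_desc(const void *a, const void *b) {
        double d = ((const Job *)b)->est_cost - ((const Job *)a)->est_cost;
        return (d > 0) - (d < 0);
    }

    void schedule(Job jobs[], int n_jobs, double speed[],
                  double busy_until[], int assign[], int n_machines) {
        int i, j;
        qsort(jobs, n_jobs, sizeof(Job), by_cost_desc);  /* heaviest first */
        for (i = 0; i < n_jobs; i++) {
            int best = 0;
            double best_finish = busy_until[0] + jobs[i].est_cost / speed[0];
            for (j = 1; j < n_machines; j++) {
                double finish = busy_until[j] + jobs[i].est_cost / speed[j];
                if (finish < best_finish) { best_finish = finish; best = j; }
            }
            busy_until[best] += jobs[i].est_cost / speed[best];
            assign[jobs[i].id] = best;   /* job id -> chosen machine */
        }
    }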

2003/10/3 UK-Japan N+N Meeting 23
Summary and Future Work
CONFLEX-G: Grid-enabled molecular conformation search
– We used OmniRPC to make it Grid-enabled
– We are now doing production runs
For MO simulation (docking), we are working on coarse-grain MO as well as job submission
– REMD (a replica exchange program using NAMD)
– FMO (Fragment MO)
For QSAR
– Design of a markup language to describe computation results
– A web service interface to access the database