Biosciences Working Group Update
Wilfred W. Li, Ph.D., UCSD, USA
Habibah Wahab, Ph.D., USM, Malaysia
You-Qiang Song, Ph.D., HKU, PRC
Hosted by HKU, Hong Kong, PRC, March 2-4, 2010
PRAGMA: model for international collaboration in Technology and Science
Broadening Impact of Technology Engaging Future Generations PRIME Student 2009: Jessica Hsieh, USM
Scientific Drivers and Use Cases: Influenza A Virus Harris et al, PNAS, 2006
2009 H1N1 Pandemic Influenza
Source: cumulative cases represented in Google Map as of 21 Apr, 2010
– WHO: deaths to date
– US: 4642 deaths; total cases
– Malaysia: 74 deaths; 6463 total cases
– China: 650 deaths; total cases
– Postpandemic period as of Aug
– ~1% death rate, similar to seasonal flu
– Targets younger and healthy individuals, unlike seasonal flu (90% > 65 years old)
WHO Status Update Week 17 (Apr 19, 2009) to Week 13 (Apr 3, 2010)
Transparent access to applications on Avian Flu Grid through middleware
– CNIC Duckling Portal
– Konkuk/Kookmin Glyco-M*Grid
– NBCR CADD
Relaxed Complex Scheme and Ensemble-based Virtual Screening Contributed to HIV Integrase Inhibitor Development
“Exploration of the structural basis for this unexpected result … suggests an approach to the development of integrase inhibitors with unique resistance profiles.” D. Hazuda et al., Proc. Natl. Acad. Sci. USA (Aug. 2004), referring to Schames et al. (2004).
Discovery of unexpected binding site in HIV-1 Integrase using MD and AutoDock: Schames, … & McCammon, J. Med. Chem. (released on web, early 2004)
New Class of HIV Drugs: Merck & Co. MK-0518
– February, 2006: Phase III Clinical Trials
– February, 2007: Name announced: Isentress (raltegravir)
– October, 2007: FDA “fast track” approval
Source: A. McCammon
Ensemble-based Virtual Screening with Relaxed Complex Scheme
– MD simulation: NAMD2, Amber
– Docking: AutoDock4
– NCI Diversity Set: 3.3 MB, 2000 compounds; required at each site
– ZINC subset: 200,000 compounds, a few hundred MB
– Multiple targets: HA, NA subtypes
– Each target: 30~50 MD snapshots, 1~2 MB each
– Simulation data: hundreds of GB
– Docking data: hundreds of MB
– Total data to date: ~5 TB in long-term storage
– Each experiment costs about 1 Petaflops of cumulative computation
Source: Amaro
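The sizes quoted on this slide imply a workload that can be estimated with simple arithmetic. The sketch below assumes that every compound is docked against every MD snapshot of a target; the slide does not state the exact protocol, so this is an assumption, and the function and variable names are hypothetical.

```python
def ensemble_screening_estimate(n_compounds, n_snapshots, snapshot_mb):
    """Estimate the workload for ONE target.

    Assumes (not stated in the slides) that each compound is docked
    against each MD snapshot exactly once.
    Returns (docking_runs, receptor_storage_mb).
    """
    docking_runs = n_compounds * n_snapshots
    receptor_storage_mb = n_snapshots * snapshot_mb
    return docking_runs, receptor_storage_mb


# NCI Diversity Set (2000 compounds) at the low and high ends of the
# quoted ranges: 30~50 snapshots per target, 1~2 MB per snapshot.
low = ensemble_screening_estimate(2000, 30, 1)
high = ensemble_screening_estimate(2000, 50, 2)
print("low estimate: ", low)
print("high estimate:", high)
```

Under these assumptions a single target needs tens of thousands of independent docking runs, which is why the work is spread across grid sites while the small compound library is simply copied to each one.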
Advances in Computing Infrastructure Enable Complex Simulations of Biomolecular Systems
Amaro & Li, CTMC, 2010
Opal 2 for SaaS
Opal WS: Transparent Access Layer for Applications
– Clients and workflow tools: CADD, MGLTools, Kepler, Vistrails, Taverna
– Opal Application Services (Opal App)
– Schedulers and middleware: Globus, Condor, CSF4
– Grid/Cloud Resources: Condor pool, TeraGrid/PRAGMA Grid, PBS/SGE clusters
Opal Plugins for Popular Workflow Software
CADD: Opal Web Services for Biomedical Applications
Ren et al, NAR 2010, Web Server Issue
– Modules supporting MD simulation and analysis, Virtual Screening, Docking, Visualization
– Project management under development
Opal MetaService: Transparent Access to Workflows and Applications
Social Networks and Collaborative Environment
Social Network Site | Number of Users | Features | API Examples
Google | 170 million (Gmail) | Google Integrated Suite of Tools | Google App Engine
LinkedIn | 65 million | Professional | Huddle/Zoho Office Online
Twitter | 100 million | Short MMS/SMS | TwitPic
Google Wave | 100,000 X 7? | Upload any file | Google Wave Robot
Facebook | 500 million+ | Social network | Facebook Apps
Are these too big to fail? Utility Computing finally?
New OPAL-CSF4 Cloud model
– OPAL as resource manager of CSF4
– CSF4 allocates service instances of OPAL for jobs
PRAGMA 19 workshop, Changchun, Jilin, China, Sep. 13-15, 2010.
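The allocation idea on this slide, a metascheduler handing each job to one of several Opal service instances, can be illustrated with a toy model. This is only a sketch under stated assumptions: the "most free slots" policy, the `OpalInstance` class, and the instance names are hypothetical and are not taken from CSF4 or Opal.

```python
from dataclasses import dataclass, field


@dataclass
class OpalInstance:
    """Hypothetical stand-in for one Opal service instance."""
    name: str
    capacity: int                      # max concurrent jobs (assumed)
    jobs: list = field(default_factory=list)

    def free_slots(self):
        return self.capacity - len(self.jobs)


def allocate(instances, job_id):
    """Assign job_id to the instance with the most free slots.

    Returns the chosen instance name, or None when every instance is
    full. Ties go to the instance listed first. Illustrative policy
    only; not CSF4's actual resource selection.
    """
    candidates = [inst for inst in instances if inst.free_slots() > 0]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda inst: inst.free_slots())
    chosen.jobs.append(job_id)
    return chosen.name


pool = [OpalInstance("opal-a", capacity=2), OpalInstance("opal-b", capacity=1)]
placements = [allocate(pool, f"job-{n}") for n in range(4)]
print(placements)
```

With three slots in total, the fourth job is not placed; a real metascheduler would queue it or request another service instance.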
2 – 4 March 2010, PRAGMA 18, San Diego
Nornisah Mohamed, USM
Integrating Visualization Workflows using Real-time bioMEdical data Streaming and visualization (RIMES) Kevin Dong, CNIC
VM Replication Experiment
– VM hosting servers (Rocks 5.3, Xen roll): SDSC, AIST, NBCR
– AFG VM (original) replicated as AFG VM (copy) on other VM hosting servers
– Avian Flu Grid VM: Rocks VM with Globus/SGE and Autodock
– Replication updates: hostname and IP, compute nodes, network configurations, Globus configuration, SGE configuration
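The replication step that updates the hostname and IP can be pictured as a text rewrite over the copied VM's configuration files. The sketch below is an assumption about how such a rewrite might look, not the procedure used in the experiment; the function name, the hostnames, and the documentation-range IP addresses are all hypothetical.

```python
import re


def retarget_config(text, old_host, new_host, old_ip, new_ip):
    """Replace the original VM's hostname and IP in a config file's text.

    Matches whole tokens only, so 192.0.2.10 does not match inside
    192.0.2.100 and a hostname does not match inside a longer one.
    """
    host_pat = re.compile(r"(?<![\w.-])" + re.escape(old_host) + r"(?![\w.-])")
    ip_pat = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?![\d.])")
    text = host_pat.sub(lambda m: new_host, text)
    return ip_pat.sub(lambda m: new_ip, text)


# Hypothetical hosts-file content from the original VM.
original = (
    "192.0.2.10 afg.example.org afg\n"
    "192.0.2.100 other.example.org\n"
)
updated = retarget_config(
    original,
    old_host="afg.example.org", new_host="afg-copy.example.org",
    old_ip="192.0.2.10", new_ip="198.51.100.7",
)
print(updated)
```

The short alias `afg` is left untouched here; a real replication script would also need to handle aliases and any service-specific files for Globus and SGE.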
Lau, Haga and Date ViewDock TDW
Other Examples of Continued Software Development at Member Institutions
– Drugscreener-G – KISTI, Korea
– Grid Enabled Virtual Screening Service – ASGC, Taiwan
– CADD Pipeline – NBCR, USA
– WISDOM project – CNRS, EU
– Glyco-M*Grid – Kookmin & Konkuk U, Korea
Meeting the New Challenges
– Virtualization: What does it mean to us? Virtual machines, CSF server, Gfarm server and virtual clusters
– Production environment: Where is it? What form should it take? EC2, VC replication
– Collaboration: How to stay in touch better? PRIME, MURPA, research in general?
Look Around Session
– Heru Suhartanto, Indonesia, Faculty of Computer Science, University of Indonesia: Molecular Dynamics Simulation of disordered regions of the RGK-family of small GTPases revealed no GTPase activity
– Suntae Hwang, South Korea, Kookmin University: MGrid Service on Nationwide Consortium of Supercomputing Infrastructure (PLSI) in Korea
PRAGMA 19: Look ahead sessions
Progress Reported at PRAGMA 20!
Day 2
– Duckling portal as a new generation user portal
– Current focus: better user management, online editing, status notification
– Possible features:
  – Support for Opal service? Compute cloud access?
  – Support for larger data size? Or Data cloud access?
  – Support for Open ID? Social network access?
  – Continued support for RIMES?
Looking ahead – M*Grid portal
– Current status: pending deployment in PLSI e-science project, with Gfarm filesystem browser
– Possible features:
  – Duckling portal as the new portlet framework?
  – Possible metascheduler in resource selection?
  – Possible Opal service support?
– M*Grid job execution environment is quite feature rich, and specific for simulation jobs. Can Opal service support provide more benefits?
Looking ahead
CSF4
– Current focus: CSF4 support for Opal services (maybe Globus no longer needed for job execution), cloud service metascheduler, bug fixes and release of
– Possible features: more efficient/advanced resource selection policies
Gfarm
– Current focus: Gfarm 2.4 deployment and integration with Opal 2.3
Looking ahead
NBCR CADD
– Current focus: Release of 0.1 beta, documentation, and RCS rescoring workflow
– Possible features: Data cloud service, metadata and job history