NAMD Algorithms and HPC Functionality. David Hardy, BTRC for Macromolecular Modeling and Bioinformatics, Beckman Institute, UIUC.

Presentation transcript:

1 NAMD Algorithms and HPC Functionality. David Hardy. NAIS: State-of-the-Art Algorithms for Molecular Dynamics. BTRC for Macromolecular Modeling and Bioinformatics, Beckman Institute, UIUC.

2 Beckman Institute, University of Illinois at Urbana-Champaign. Theoretical and Computational Biophysics Group.

3 Acknowledgments
–Jim Phillips, lead NAMD developer
–John Stone, lead VMD developer
–David Tanner, implemented GBIS
–Klaus Schulten, director of the TCB group

4 NAMD and VMD: The Computational Microscope. Study the molecular machines in living cells. (Images: ribosome, which synthesizes proteins from genetic information and is a target for antibiotics; silicon nanopore, a bionanodevice for sequencing DNA efficiently.)

5 VMD – “Visual Molecular Dynamics”. Visualization and analysis of:
–molecular dynamics simulations
–quantum chemistry calculations
–particle systems and whole cells
–sequence data
–volumetric data
User extensible with scripting and plugins. (Image captions: Electrons in Vibrating Buckyball; Cellular Tomography, Cryo-electron Microscopy; Poliovirus; Ribosome; Sequences; Whole Cell Simulations.)

6 VMD Interoperability – Linked to Today’s Key Research Areas. Unique in its interoperability with a broad range of modeling tools: AMBER, CHARMM, CPMD, DL_POLY, GAMESS, GROMACS, HOOMD, LAMMPS, NAMD, and many more. Supports key data types, file formats, and databases, e.g. electron microscopy, quantum chemistry, MD trajectories, sequence alignments, super-resolution light microscopy. Incorporates tools for simulation preparation, visualization, and analysis.

7 NAMD: Scalable Molecular Dynamics. 51,000 users, 2,900 citations. (Slide montage: Blue Waters target application; 2002 Gordon Bell Award; Illinois Petascale Computing Facility; PSC Lemieux; ATP synthase; Computational Biophysics Summer School; GPU acceleration; NCSA Lincoln; NVIDIA Tesla.)

8 Larger machines enable larger simulations

9 NAMD features are chosen for scalability
–CHARMM, AMBER, OPLS force fields
–Multiple time stepping (see the sketch after this list)
–Hydrogen bond constraints
–Efficient PME full electrostatics
–Conjugate-gradient minimization
–Temperature and pressure controls
–Steered molecular dynamics (many methods)
–Interactive molecular dynamics (with VMD)
–Locally enhanced sampling
–Alchemical free energy perturbation
–Adaptive biasing force potential of mean force
–User-extendable in Tcl for forces and algorithms
All features run in parallel and scale to millions of atoms!
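The multiple time stepping item above is what keeps full electrostatics affordable: cheap bonded and short-range forces are computed every step, while the expensive long-range (PME) contribution is computed only every k steps and applied as an impulse. Below is a minimal sketch in C++ of an impulse-style (r-RESPA/Verlet-I) loop, with illustrative names and a plain leapfrog-style update; it is not NAMD's actual integrator.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

struct Vec3 { double x = 0, y = 0, z = 0; };
using Forces = std::function<std::vector<Vec3>(const std::vector<Vec3>&)>;

// Impulse-style multiple time stepping (simplified sketch): cheap short-range
// forces every step, expensive long-range forces (e.g. PME) every k steps,
// applied as an impulse scaled by k.
void mtsRun(std::vector<Vec3>& pos, std::vector<Vec3>& vel,
            const std::vector<double>& invMass,
            const Forces& shortRange, const Forces& longRange,
            double dt, int k, int nSteps) {
    for (int step = 0; step < nSteps; ++step) {
        std::vector<Vec3> f = shortRange(pos);
        if (step % k == 0) {                        // full-electrostatics step
            std::vector<Vec3> fl = longRange(pos);
            for (std::size_t i = 0; i < f.size(); ++i) {
                f[i].x += k * fl[i].x;              // impulse carries k steps
                f[i].y += k * fl[i].y;
                f[i].z += k * fl[i].z;
            }
        }
        for (std::size_t i = 0; i < pos.size(); ++i) {  // leapfrog-style update
            vel[i].x += dt * invMass[i] * f[i].x;
            vel[i].y += dt * invMass[i] * f[i].y;
            vel[i].z += dt * invMass[i] * f[i].z;
            pos[i].x += dt * vel[i].x;
            pos[i].y += dt * vel[i].y;
            pos[i].z += dt * vel[i].z;
        }
    }
}
```

With k = 4 this is the "PME every 4 steps" schedule that the benchmark slides later in the deck refer to.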

10 NAMD 2.9 Release. Public beta released March 19, final version in May.
Capabilities:
–New scalable replica-exchange implementation
–QM/MM interface to OpenAtom plane-wave QM code
–Knowledge-based Go potentials to drive folding and assembly
–Multilevel Summation Method electrostatics (serial prototype)
Performance:
–Cray XE6/XK6 native multi-threaded network layer
–Communication optimizations for wider multicore nodes
–GPU acceleration of energy minimization
–GPU-oriented shared-memory optimizations
–GPU Generalized Born (OBC) implicit solvent
–Faster grid force calculation for MDFF maps (enables Desktop MDFF)

11 NAMD 2.9 Desktop MDFF. Cyanovirin-N, 1,500 atoms; initial structure PDB 2EZM, fitted to a simulated EM map from PDB 3EZM; 1 Å final RMSD with GPU-accelerated implicit solvent and CPU-optimized cryo-EM forces.
–Explicit solvent, 8 cores (1X): fast, 2 ns/day
–Implicit solvent, 8 cores (6X): faster, 12 ns/day
–Implicit solvent, 1 GPU (20X): fastest, 40 ns/day
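For readers unfamiliar with MDFF (molecular dynamics flexible fitting), the method behind this slide adds a potential derived from the EM density map on top of the normal MD force field, so the simulation is pulled into the map while stereochemistry is preserved. A hedged sketch of the published MDFF energy decomposition, with symbols defined here rather than taken from the slide:

```latex
U_{\mathrm{total}} = U_{\mathrm{MD}} + U_{\mathrm{EM}} + U_{\mathrm{SS}},
\qquad
U_{\mathrm{EM}} = \sum_i w_i \, V_{\mathrm{EM}}(\mathbf{r}_i)
```

Here U_MD is the usual force-field energy, V_EM is a grid potential obtained from the (inverted and scaled) EM density, w_i is a per-atom weight, and U_SS restrains secondary structure during fitting. The "faster grid force calculation" listed for NAMD 2.9 corresponds to evaluating the gradient of U_EM on such maps more quickly.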

12 NAMD 2.9 Scalable Replica Exchange (released in NAMD 2.9; enabled for the Roux group).
Easier to use and more efficient:
–Eliminates complex, machine-specific launch scripts
–Scalable pair-wise communication between replicas
–Fast communication via high-speed network
Basis for many enhanced sampling methods:
–Parallel tempering (temperature exchange; see the acceptance-rule sketch below)
–Umbrella sampling for free-energy calculations
–Hamiltonian exchange (alchemical or conformational)
–Finite Temperature String method
–Nudged elastic band
Great power and flexibility:
–Enables petascale simulations of modestly sized systems
–Leverages features of the Collective Variables module
–Tcl scripts can be highly customized and extended
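As a reminder of what the parallel-tempering variant actually exchanges: neighboring replicas i and j, at inverse temperatures β_i and β_j with instantaneous potential energies E_i and E_j, swap configurations with the standard Metropolis probability (this is the textbook rule, not a statement about NAMD's exact implementation):

```latex
p_{\mathrm{acc}}(i \leftrightarrow j)
  = \min\!\left\{\, 1,\; \exp\!\big[(\beta_i - \beta_j)\,(E_i - E_j)\big] \,\right\}
```

Only the two energies and an accept/reject decision need to cross between a pair of replicas, which is consistent with the scalable pair-wise communication described above.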

13 NAMD 2.9 QM/MM Calculations. Car-Parrinello MD (OpenAtom) and NAMD in one software: a method combining OpenAtom (parallel Car-Parrinello MD; DOE- and NSF-funded for 10 years; Martyna/Kale collaboration) with NAMD through Charm++.
–Parallel electrostatic embedding permits …-atom QM regions
–Synchronous load-balancing of QM and MD maximizes processor utilization
–OpenAtom (100 atoms, 70 Ry, on 1K cores): 120 ms/step; NAMD (50,000 atoms on 512 cores): 2.5 ms/step
Harrison & Schulten, Quantum and classical dynamics of ATP hydrolysis in solvent. Submitted.
(Plot: OpenAtom time step in sec/step vs. number of cores on BG/L, BG/P, and Cray XT3.)
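As background on the "electrostatic embedding" term, a hedged, textbook-level sketch of the QM/MM energy partition (not a description of the OpenAtom/NAMD code itself): the total energy splits into QM, MM, and coupling pieces, and in electrostatic embedding the MM point charges q_a enter the QM Hamiltonian, so the electron density ρ relaxes in the field of the classical environment:

```latex
E_{\mathrm{total}} = E_{\mathrm{QM}}[\rho;\{q_a\}] + E_{\mathrm{MM}} + E_{\mathrm{QM/MM}}^{\mathrm{vdW}},
\qquad
E_{\mathrm{QM}} \;\supset\; \sum_{a\,\in\,\mathrm{MM}} q_a \int \frac{\rho(\mathbf{r})}{|\mathbf{r}-\mathbf{R}_a|}\,\mathrm{d}\mathbf{r}
```

(plus Coulomb terms between MM charges and QM nuclei; sign conventions depend on whether ρ denotes electron or total charge density). Splitting this work between the plane-wave QM code and the classical MD code is what the synchronous load-balancing bullet above refers to.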

14 NAMD impact is broad and deep.
Comprehensive, industrial-quality software:
–Integrated with VMD for simulation setup and analysis
–Portable extensibility through Tcl scripts (also used in VMD)
–Consistent user experience from laptop to supercomputer
Large user base – 51,000 users:
–9,100 (18%) are NIH-funded; many in other countries
–14,100 have downloaded more than one version
Leading-edge simulations:
–“most-used software” on NICS Cray XT5 (largest NSF machine)
–“by far the most used MD package” at TACC (2nd and 3rd largest)
–NCSA Blue Waters early science projects and acceptance test
–Argonne Blue Gene/Q early science project

15 Outside researchers choose NAMD and succeed. 2,100 external citations since 2007.
Recent NAMD simulations in Nature:
–Corringer, et al., Nature, 2011: a …K-atom, 30 ns study of anesthetic binding to a bacterial ligand-gated ion channel provided “complementary interpretations … that could not have been deduced from the static structure alone.” (Image: bound propofol anesthetic.)
–Voth, et al., PNAS: a 500K-atom, 500 ns investigation of the effect of actin depolymerization factor/cofilin on the mechanical properties and conformational dynamics of the actin filament. (Images: bare actin; cofilactin.)
–M. Koeksal, et al., Taxadiene synthase structure and evolution of modular architecture in terpene biosynthesis. (2011)
–C.-C. Su, et al., Crystal structure of the CusBA heavy-metal efflux complex of Escherichia coli. (2011)
–D. Slade, et al., The structure and catalytic mechanism of a poly(ADP-ribose) glycohydrolase. (2011)
–F. Rose, et al., Mechanism of copper(II)-induced misfolding of Parkinson’s disease protein. (2011)
–L. G. Cuello, et al., Structural basis for the coupling between activation and inactivation gates in K(+) channels. (2010)
–S. Dang, et al., Structure of a fucose transporter in an outward-open conformation. (2010)
–F. Long, et al., Crystal structures of the CusA efflux pump suggest methionine-mediated metal transport. (2010)
–R. H. P. Law, et al., The structural basis for membrane binding and pore formation by lymphocyte perforin. (2010)
–P. Dalhaimer and T. D. Pollard, Molecular Dynamics Simulations of Arp2/3 Complex Activation. (2010)
–J. A. Tainer, et al., Recognition of the Ring-Opened State of Proliferating Cell Nuclear Antigen by Replication Factor C Promotes Eukaryotic Clamp-Loading. (2010)
–D. Krepkiy, et al., Structure and hydration of membranes embedded with voltage-sensing domains. (2009)
–N. Yeung, et al., Rational design of a structural and functional nitric oxide reductase. (2009)
–Z. Xia, et al., Recognition Mechanism of siRNA by Viral p19 Suppressor of RNA Silencing: A Molecular Dynamics Study. (2009)

16 Challenges of New Hardware. The number of transistors on a chip keeps increasing (and will, for 10 years), BUT the clock frequency has stopped increasing (since 2003 or so) due to power limits. (Plot: transistor count in thousands, clock speed in MHz, and power in W, vs. year.)

17 Harnessing Future Hardware.
Challenge: a panoply of complex and powerful hardware
–Complex multicore chips and accelerators: Intel MIC (TACC Stampede), Kepler GPU (Blue Waters), AMD Interlagos (Blue Waters)
Solution: BTRC computer science expertise
–Parallel Programming Lab: leading research group in scalable parallel computing

18 Parallel Programming Lab, University of Illinois at Urbana-Champaign, Siebel Center for Computer Science.

19 Parallel Objects, Adaptive Runtime System, Libraries and Tools. The enabling CS technology of parallel objects and intelligent runtime systems has led to several collaborative applications in CSE; abstractions are developed in the context of full-scale applications: NAMD (molecular dynamics), STM virus simulation, crack propagation, space-time meshes, computational cosmology, rocket simulation, protein folding, dendritic growth, quantum chemistry (QM/MM).

20 Computing research drives NAMD. Parallel Programming Lab – directed by Prof. Laxmikant Kale.
–Charm++ is an adaptive parallel runtime system: Gordon Bell Prize 2002; three publications at Supercomputing 2011; four panels discussing the future necessity of our ideas
–20 years of co-design for NAMD performance, portability, productivity, and adaptivity
–Recent example: implicit solvent deployed in NAMD by 1 RA in 6 months, 4x more scalable than similar codes
Yesterday’s supercomputer is tomorrow’s desktop.

21 NAMD 2.8 Highly Scalable Implicit Solvent Model. NAMD implicit solvent is 4x more scalable than traditional implicit solvent for all system sizes; implemented by one GRA in 6 months. Tanner et al., J. Chem. Theory and Comp., 7, 2011. (Plot: speed in pairs/sec vs. processors, NAMD vs. traditional implicit solvent, for systems of 2,016 / 2,412 / 27,600 / 29,500 / 138,000 (65M interactions) / 149,000 atoms.)
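To show what the per-pair work counted above ("interactions") looks like in a Generalized Born implicit-solvent model, here is a minimal sketch of the classic Still-style GB pair energy. It is illustrative only: NAMD's GBIS (OBC) implementation differs in how the Born radii are computed and in its cutoff and phase scheme.

```cpp
#include <cmath>

// Still-style Generalized Born cross term for one distinct pair (i != j);
// the 1/2 from the symmetric double sum is absorbed, and Born self-energy
// terms (i == j) are not included. A sketch, not NAMD's GBIS code.
// qi, qj: charges (e); r: distance (Angstrom); Ri, Rj: Born radii (Angstrom);
// epsIn/epsOut: solute and solvent dielectric constants.
double gbPairEnergy(double qi, double qj, double r,
                    double Ri, double Rj,
                    double epsIn = 1.0, double epsOut = 78.5,
                    double ke = 332.0636 /* kcal*A/(mol*e^2) */) {
    // Effective distance f_GB interpolates between r (well separated atoms)
    // and sqrt(Ri*Rj) (strongly overlapping atoms).
    double r2 = r * r;
    double fGB = std::sqrt(r2 + Ri * Rj * std::exp(-r2 / (4.0 * Ri * Rj)));
    // Screening prefactor: how much the high-dielectric solvent weakens
    // the Coulomb interaction relative to the solute interior.
    double prefactor = -ke * (1.0 / epsIn - 1.0 / epsOut);
    return prefactor * qi * qj / fGB;
}
```

Because each such term involves only two atoms and a handful of flops, the work parallelizes much like the ordinary short-range non-bonded sum, which is what makes the scaling claim on this slide plausible.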

22 Cray Gemini Optimization. The new Cray machine has a better network (called Gemini), but MPI-based NAMD scaled poorly on it, so BTRC implemented a direct port of Charm++ to Cray uGNI, the lowest-level interface for the Gemini network.
–Removes MPI from the NAMD call stack
–Gemini then provides at least a 2x increase in usable nodes for strong scaling
(Plot: ns/day vs. nodes for ApoA1, 92,000 atoms, on the Cray Blue Waters prototype.)

23 100M-Atom Benchmark Performance. 100M-atom performance on Jaguar and Blue Waters. (Plots: ns/day vs. number of cores and vs. number of nodes, for Tsubame (Tokyo Institute of Tech, GPUs), Cray Jaguar XT5 (ORNL, 224,000 cores), and Blue Waters ESS XE6 (UIUC; final: 380,000 cores, 3,000 GPUs).)

24 100M Atoms on Titan vs Jaguar. 5x5x4 STMV grid, PME every 4 steps.
New optimizations:
–Charm++ uGNI port
–Node-aware optimizations
–Priority messages in critical path
–Persistent FFT messages for PME
–Shared-memory parallelism for PME
Paper to be submitted to SC12.

25 1M Atom Virus on TitanDev GPU. Single STMV, PME every 4 steps.

26 100M Atoms on TitanDev.

27 TitanDev Cray XK6 Results (100 stmv, s/step):
                                    32 nodes    768 nodes
  XK6, Fermi                        …           …
  XK6, without Fermi                …           …
  XE6                               …           …
  XK6 Fermi vs XK6 without Fermi    3.8x        2.6x
  XK6 Fermi vs XE6                  1.8x        1.4x

28 TitanDev Cray XK6 Results (strong scaling: 100 stmv on 32 vs. 768 nodes; weak scaling: 4 stmv on 30 nodes vs. 100 stmv on 768 nodes):
                              time step   non-PME step   PME step   sum PME computation   kernel
  GPU, 100 stmv, 32 nodes     1.249 s     1.009 s        1.833 s    0.796 s               0.964 s
  GPU, 4 stmv, 30 nodes       0.057 s     0.049 s        0.065 s    0.028 s               0.042 s
  GPU, 100 stmv, 768 nodes    0.085 s     0.046 s        0.176 s    0.034 s               0.042 s

29 Tsubame (Tokyo) Application of GPU-Accelerated NAMD. 8,400 cores; 20-million-atom proteins + membrane. (Image: AFM image of flat chromatophore membrane, Scheuring 2009.)

30 GPU Computing in NAMD and VMD.
NAMD algorithms to be discussed:
–Short-range non-bonded interactions (a minimal cutoff-pair sketch follows below)
–Generalized Born Implicit Solvent
–Multilevel Summation Method
VMD algorithms to be discussed:
–Electrostatic potential maps
–Visualizing molecular orbitals
–Radial distribution functions
–“QuickSurf” representation
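To anchor the first item, here is a minimal, hedged sketch of the short-range non-bonded evaluation: Lennard-Jones plus cutoff Coulomb summed over a precomputed pair list. The real NAMD GPU kernels work on patch pairs with tiles, exclusion handling, interpolation tables, and switching functions; none of that is reproduced here, and all names are illustrative.

```cpp
#include <cmath>
#include <utility>
#include <vector>

struct Atom { double x, y, z, charge, sigma, epsilon; };

// Sum Lennard-Jones + Coulomb energy over a precomputed pair list, skipping
// pairs beyond the cutoff. A sketch only: no exclusions, no switching
// function, no PME short-range correction, no periodic wrapping.
double shortRangeEnergy(const std::vector<Atom>& atoms,
                        const std::vector<std::pair<int, int>>& pairList,
                        double cutoff,
                        double ke = 332.0636) {   // Coulomb constant, kcal*A/(mol*e^2)
    const double cutoff2 = cutoff * cutoff;
    double energy = 0.0;
    for (const auto& p : pairList) {
        const Atom& a = atoms[p.first];
        const Atom& b = atoms[p.second];
        const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        const double r2 = dx * dx + dy * dy + dz * dz;
        if (r2 > cutoff2) continue;                // outside short-range cutoff
        const double r = std::sqrt(r2);
        // Lorentz-Berthelot combining rules for the LJ parameters.
        const double sigma = 0.5 * (a.sigma + b.sigma);
        const double eps = std::sqrt(a.epsilon * b.epsilon);
        const double sr6 = std::pow(sigma * sigma / r2, 3);
        energy += 4.0 * eps * (sr6 * sr6 - sr6);   // Lennard-Jones 12-6
        energy += ke * a.charge * b.charge / r;    // cutoff Coulomb
    }
    return energy;
}
```

On a GPU the same arithmetic is performed per pair; the engineering effort goes into building the pair lists, laying out atoms so that warps read memory coalesced, and overlapping the kernel with the PME work shown in the earlier timing tables.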