Presentation transcript:

In-Situ Visualization and Analysis of Petascale Molecular Dynamics Simulations with VMD
John Stone
Theoretical and Computational Biophysics Group
Beckman Institute for Advanced Science and Technology
University of Illinois at Urbana-Champaign
Accelerated HPC Symposium, San Jose Convention Center, San Jose, CA, May 17, 2012

VMD – "Visual Molecular Dynamics"
Visualization and analysis of:
–molecular dynamics simulations
–quantum chemistry calculations
–particle systems and whole cells
–sequence data
User extensible w/ scripting and plugins
[Slide images: electrons in a vibrating buckyball; cellular tomography, cryo-electron microscopy; poliovirus; ribosome; sequences; whole-cell simulations]
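As an illustration of the scripting extensibility mentioned above, a minimal sketch using the molecule and atomsel modules of VMD's embedded Python interpreter; exact call signatures vary across VMD versions, and the input file name is hypothetical:

```python
# Run inside VMD's embedded Python interpreter; the molecule and atomsel
# modules ship with VMD, but exact signatures vary by version.
import molecule               # molecule loading/management
from atomsel import atomsel   # atom-selection language bindings

molecule.load("pdb", "mystructure.pdb")      # hypothetical input file
sel = atomsel("protein and name CA")         # selection on the top molecule
xs = sel.get("x")                            # per-atom attribute list
print(len(xs), "alpha carbons; x range", min(xs), "to", max(xs))
```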

Molecular Visualization and Analysis Challenges for Petascale Simulations
Very large structures (10M to over 100M atoms)
–12 bytes per atom per trajectory frame
–100M-atom trajectory frame: 1200MB!
Long-timescale simulations produce huge trajectories
–MD integration timesteps are on the femtosecond (10^-15 sec) timescale, but many important biological processes occur on microsecond to millisecond timescales
–Even when trajectory frames are stored infrequently, the resulting trajectories often contain millions of frames
Petabytes of data to analyze, far too large to move
Viz and analysis must be done primarily on the supercomputer where the data already resides
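To make the storage figures concrete, a quick back-of-the-envelope check in Python; the 10 ps frame-save interval is an assumed example, not a number from the slides:

```python
atoms = 100_000_000          # 100M-atom structure
bytes_per_atom = 12          # x, y, z stored as 4-byte floats
frame_mb = atoms * bytes_per_atom / 1e6
print(f"{frame_mb:.0f} MB per frame")          # -> 1200 MB

# Assuming frames are saved every 10 ps, one microsecond of simulated
# time yields 100,000 stored frames:
frames = 100_000
print(f"{frames * frame_mb / 1e6:.0f} TB per microsecond sampled")  # -> 120 TB
```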

Approaches for Visualization and Analysis of Petascale Molecular Simulations with VMD
Abandon conventional approaches, e.g. bulk download of trajectory data to remote viz/analysis machines:
–In-place processing of trajectories on the machine running the simulations
–Use remote visualization techniques: split-mode VMD with a remote front-end instance and a back-end viz/analysis engine running in parallel on the supercomputer
Large-scale parallel analysis and visualization via distributed-memory MPI version of VMD
Exploit GPUs and other accelerators to increase per-node analytical capabilities, e.g. NCSA Blue Waters Cray XK6
In-situ on-the-fly viz/analysis and event detection through direct communication with the running MD simulation

Improved Support for Large Datasets in VMD
New structure building tools, file formats, and data structures enable VMD to operate efficiently on structures of up to 150M atoms:
–Up to 30% more memory efficient
–Analysis routines optimized for large structures: up to 20x faster for calculations on 100M-atom complexes, where molecular structure traversal can represent a significant fraction of runtime
–New and revised graphical representations support smooth trajectory animation for multi-million-atom complexes; VMD remains interactive even when displaying surface representations for a 20M-atom membrane patch
Uses multi-core CPUs and GPUs for the most demanding computations
[Slide image: 20M atoms, membrane patch and solvent]

New Interactive Display & Analysis of Terabytes of Data: Out-of-Core Trajectory I/O w/ Solid State Disks
Timesteps loaded on-the-fly (out-of-core):
–Eliminates memory capacity limitations, even for multi-terabyte trajectory files
–High performance achieved by new trajectory file formats, optimized data structures, and efficient I/O
Analyze long trajectories significantly faster
New SSD trajectory file format: 2x faster vs. existing formats
Commodity SSD and SSD RAID sustain 450MB/sec to 4GB/sec: a DVD movie per second!
Immersive out-of-core visualization of large-size and long-timescale molecular dynamics trajectories. J. Stone, K. Vandivort, and K. Schulten. Lecture Notes in Computer Science, 6939:1-12, 2011.
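A minimal sketch of the out-of-core access pattern this slide describes: seek to and read a single timestep on demand instead of loading the whole trajectory. The fixed-size raw-float frame layout and file name are hypothetical stand-ins, not VMD's actual SSD-optimized format:

```python
import numpy as np

N_ATOMS = 20_000_000              # membrane patch example
FRAME_BYTES = N_ATOMS * 3 * 4     # x, y, z as 4-byte floats -> 240 MB per frame

def load_frame(path, frame_index):
    """Read a single timestep on demand instead of loading the whole file."""
    with open(path, "rb") as f:
        f.seek(frame_index * FRAME_BYTES)   # jump directly to the frame
        raw = f.read(FRAME_BYTES)           # one large sequential read
    return np.frombuffer(raw, dtype=np.float32).reshape(N_ATOMS, 3)

# Only one frame is resident at a time, so a multi-terabyte trajectory
# can be animated or analyzed on a node with modest RAM:
coords = load_frame("membrane.traj", 1000)  # hypothetical file/format
```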

VMD Out-of-Core Trajectory I/O Performance: SSD-Optimized Trajectory Format, 8-SSD RAID
Ribosome w/ solvent (3M atoms): 3 frames/sec w/ HD, 60 frames/sec w/ SSDs
Membrane patch w/ solvent (20M atoms): 0.4 frames/sec w/ HD, 8 frames/sec w/ SSDs
New SSD trajectory file format: 2x faster vs. existing formats
VMD I/O rate ~2.1 GB/sec w/ 8 SSDs
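The frame rates above are consistent with the measured bandwidth; assuming 12 bytes per atom per frame:

```python
io_rate = 2.1e9                      # bytes/sec measured with 8 SSDs
for name, atoms in [("ribosome", 3_000_000), ("membrane patch", 20_000_000)]:
    frame_bytes = atoms * 12         # 3 single-precision coords per atom
    print(f"{name}: {io_rate / frame_bytes:.1f} frames/sec")
# ribosome: 58.3 frames/sec        (~60 observed)
# membrane patch: 8.8 frames/sec   (~8 observed)
```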

Parallel VMD Analysis w/ MPI
Analyze trajectory frames, structures, or sequences in parallel on supercomputers:
–Parallelize user-written analysis scripts with minimum difficulty
–Parallel analysis of independent trajectory frames
–Parallel structural analysis using custom parallel reductions
–Parallel rendering, movie making
Dynamic load balancing:
–Recently tested with up to 15,360 CPU cores
Supports GPU-accelerated clusters and supercomputers
[Slide diagram: sequence/structure data, trajectory frames, etc. feed data-parallel analysis in VMD; results are gathered]
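A sketch of the frame-parallel pattern this slide describes, written here with mpi4py rather than VMD's built-in MPI support; the frame count and per-frame analysis are placeholders:

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, nprocs = comm.Get_rank(), comm.Get_size()

N_FRAMES = 100_000        # hypothetical trajectory length

def analyze(frame_index):
    """Placeholder per-frame analysis; a real script would read the frame
    (e.g. out-of-core, as sketched earlier) and compute a property such as
    a radius of gyration."""
    coords = np.random.default_rng(frame_index).random((1000, 3))  # stand-in data
    center = coords.mean(axis=0)
    return float(np.sqrt(((coords - center) ** 2).sum(axis=1).mean()))

# Independent frames decomposed round-robin across MPI ranks:
local = [analyze(i) for i in range(rank, N_FRAMES, nprocs)]

# Gather per-rank results on rank 0 (a custom reduction could instead
# combine partial histograms, sums, etc.):
results = comm.gather(local, root=0)
if rank == 0:
    print("analyzed", sum(len(r) for r in results), "frames")
```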

[Slide diagram: VMD internal architecture. The User Interface Subsystem (Tcl/Python scripting, mouse + windows, VR "tools", Interactive MD) and the Graphical Representations and Non-Molecular Geometry components operate on Molecular Structure Data and Global VMD State; DrawMolecule builds a Scene Graph that the Display Subsystem renders through DisplayDevice back-ends: OpenGLRenderer (windowed OpenGL, CAVE, FreeVR), TachyonDisplayDevice, and OptiXDisplayDevice.]

GPU-Accelerated Trajectory Analysis and Visualization in VMD

GPU-accelerated feature            Speedup vs. single CPU core
Molecular orbital display          120x
Radial distribution function       92x
Electrostatic field calculation    44x
Molecular surface display          40x
Ion placement                      26x
MDFF density map synthesis         26x
Implicit ligand sampling           25x
Root mean squared fluctuation      25x
Radius of gyration                 21x
Close contact determination        20x
Dipole moment calculation          15x

NCSA Blue Waters Early Science System: Cray XK6 nodes w/ NVIDIA Tesla X2090

Time-Averaged Electrostatics Analysis on NCSA Blue Waters Early Science System
Preliminary performance for VMD time-averaged electrostatics w/ Multilevel Summation Method on the NCSA Blue Waters Early Science System.

NCSA Blue Waters node type                                                      Seconds per trajectory frame (one compute node)
Cray XE6 compute node: 32 CPU cores (2x AMD 6200 CPUs)                          9.33
Cray XK6 GPU-accelerated compute node: 16 CPU cores + NVIDIA X2090 (Fermi) GPU  2.25

Speedup, GPU XK6 nodes vs. CPU XE6 nodes: GPU nodes are 4.15x faster overall

In-Situ Visualization and Analysis with VMD
Early prototype and testing phase
VMD supports a live socket connection to the running MD code
Custom user-written analysis scripts are triggered by callbacks as incoming frames arrive
Separate threads handle asynchronous network I/O between the MD code and the "master" VMD instance, with MPI broadcast or decomposition among peer VMD nodes
Perform real-time analysis of incoming frames to find rare events:
–Store the most "interesting" timesteps
–Build summary analyses useful for accelerating interactive "Timeline" displays and subsequent detailed batch-mode analysis runs
[Slide diagram: a live, running MD simulation (e.g. NAMD on thousands of compute nodes) streams frames to data-parallel analysis and visualization in VMD; only "interesting" trajectory frames are stored]
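A sketch of the callback-driven in-situ pattern described above, with a hypothetical raw-socket frame stream standing in for VMD's actual connection to the MD code; the host, port, wire format, and rare-event criterion are all illustrative assumptions:

```python
import socket
import threading
import queue

import numpy as np

frames = queue.Queue(maxsize=8)      # bounded buffer between I/O and analysis

def receiver(host, port, n_atoms):
    """Async I/O thread: read fixed-size coordinate frames off the socket."""
    frame_bytes = n_atoms * 3 * 4    # x, y, z as 4-byte floats (assumed format)
    with socket.create_connection((host, port)) as s:
        while True:
            buf = bytearray()
            while len(buf) < frame_bytes:
                chunk = s.recv(frame_bytes - len(buf))
                if not chunk:
                    return            # simulation ended
                buf.extend(chunk)
            frames.put(np.frombuffer(bytes(buf), np.float32).reshape(-1, 3))

def on_frame(coords):
    """User-written callback: flag 'interesting' timesteps for storage."""
    rg = np.sqrt(((coords - coords.mean(0)) ** 2).sum(1).mean())
    return rg > 50.0                 # hypothetical rare-event criterion

threading.Thread(target=receiver, args=("md-host", 3000, 20_000_000),
                 daemon=True).start()
while True:
    if on_frame(frames.get()):       # callback fires as each frame arrives
        print("storing interesting frame")
```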

Acknowledgements
Theoretical and Computational Biophysics Group, University of Illinois at Urbana-Champaign
NCSA Blue Waters Team
NCSA Innovative Systems Lab
NVIDIA CUDA Center of Excellence, University of Illinois at Urbana-Champaign
The CUDA team at NVIDIA
NIH support: P41-RR005969

GPU Computing Publications
Fast Visualization of Gaussian Density Surfaces for Molecular Dynamics and Particle System Trajectories. M. Krone, J. Stone, T. Ertl, and K. Schulten. In Proceedings of EuroVis 2012 (in press).
Immersive Out-of-Core Visualization of Large-Size and Long-Timescale Molecular Dynamics Trajectories. J. Stone, K. Vandivort, and K. Schulten. G. Bebis et al. (Eds.): 7th International Symposium on Visual Computing (ISVC 2011), LNCS 6939, pp. 1-12, 2011.
Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units – Radial Distribution Functions. B. Levine, J. Stone, and A. Kohlmeyer. J. Comp. Physics, 230(9), 2011.

GPU Computing Publications
Quantifying the Impact of GPUs on Performance and Energy Efficiency in HPC Clusters. J. Enos, C. Steffen, J. Fullop, M. Showerman, G. Shi, K. Esler, V. Kindratenko, J. Stone, J. Phillips. International Conference on Green Computing, 2010.
GPU-accelerated molecular modeling coming of age. J. Stone, D. Hardy, I. Ufimtsev, K. Schulten. J. Molecular Graphics and Modeling, 29, 2010.
OpenCL: A Parallel Programming Standard for Heterogeneous Computing. J. Stone, D. Gohara, G. Shi. Computing in Science and Engineering, 12(3):66-73, 2010.
An Asymmetric Distributed Shared Memory Model for Heterogeneous Computing Systems. I. Gelado, J. Stone, J. Cabezas, S. Patel, N. Navarro, W. Hwu. ASPLOS '10: Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010.

GPU Computing Publications
GPU Clusters for High Performance Computing. V. Kindratenko, J. Enos, G. Shi, M. Showerman, G. Arnold, J. Stone, J. Phillips, W. Hwu. Workshop on Parallel Programming on Accelerator Clusters (PPAC), in Proceedings of IEEE Cluster 2009, pp. 1-8, Aug. 2009.
Long time-scale simulations of in vivo diffusion using GPU hardware. E. Roberts, J. Stone, L. Sepulveda, W. Hwu, Z. Luthey-Schulten. IPDPS'09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Computing, pp. 1-8, 2009.
High Performance Computation and Interactive Display of Molecular Orbitals on GPUs and Multi-core CPUs. J. Stone, J. Saam, D. Hardy, K. Vandivort, W. Hwu, K. Schulten. 2nd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-2), ACM International Conference Proceeding Series, volume 383, pp. 9-18, 2009.
Probing Biomolecular Machines with Graphics Processors. J. Phillips, J. Stone. Communications of the ACM, 52(10):34-41, 2009.
Multilevel summation of electrostatic potentials using graphics processing units. D. Hardy, J. Stone, K. Schulten. J. Parallel Computing, 35, 2009.

GPU Computing Publications
Adapting a message-driven parallel application to GPU-accelerated clusters. J. Phillips, J. Stone, K. Schulten. Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, IEEE Press, 2008.
GPU acceleration of cutoff pair potentials for molecular modeling applications. C. Rodrigues, D. Hardy, J. Stone, K. Schulten, and W. Hwu. Proceedings of the 2008 Conference on Computing Frontiers, 2008.
GPU computing. J. Owens, M. Houston, D. Luebke, S. Green, J. Stone, J. Phillips. Proceedings of the IEEE, 96, 2008.
Accelerating molecular modeling applications with graphics processors. J. Stone, J. Phillips, P. Freddolino, D. Hardy, L. Trabuco, K. Schulten. J. Comp. Chem., 28, 2007.
Continuous fluorescence microphotolysis and correlation spectroscopy. A. Arkhipov, J. Hüve, M. Kahms, R. Peters, K. Schulten. Biophysical Journal, 93, 2007.