MDSimAid: Automatic optimization of fast electrostatics in molecular simulations Jesús A. Izaguirre, Michael Crocker, Alice Ko, Thierry Matthey and Yao.

Slides:



Advertisements
Similar presentations
Simulazione di Biomolecole: metodi e applicazioni giorgio colombo
Advertisements

Design Rule Generation for Interconnect Matching Andrew B. Kahng and Rasit Onur Topaloglu {abk | rtopalog University of California, San Diego.
Molecular Dynamics: Review. Molecular Simulations NMR or X-ray structure refinements Protein structure prediction Protein folding kinetics and mechanics.
LINCS: A Linear Constraint Solver for Molecular Simulations Berk Hess, Hemk Bekker, Herman J.C.Berendsen, Johannes G.E.M.Fraaije Journal of Computational.
Introduction to Molecular Orbitals
Course Outline Introduction in algorithms and applications Parallel machines and architectures Overview of parallel machines, trends in top-500 Cluster.
An image-based reaction field method for electrostatic interactions in molecular dynamics simulations Presented By: Yuchun Lin Department of Mathematics.
Geometric Algorithms for Conformational Analysis of Long Protein Loops J. Cortess, T. Simeon, M. Remaud- Simeon, V. Tran.
Prof. Jesús A. Izaguirre Department of Computer Science and Engineering Computational Biology and Bioinformatics at Notre Dame.
An improved hybrid Monte Carlo method for conformational sampling of large biomolecules Department of Computer Science and Engineering University of Notre.
Evaluation of Fast Electrostatics Algorithms Alice N. Ko and Jesús A. Izaguirre with Thierry Matthey Department of Computer Science and Engineering University.
1 Parallel multi-grid summation for the N-body problem Jesús A. Izaguirre with Thierry Matthey Department of Computer Science and Engineering University.
1 Improved Hybrid Monte Carlo method for conformational sampling Jesús A. Izaguirre with Scott Hampton Department of Computer Science and Engineering University.
. Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
Solving the Protein Threading Problem in Parallel Nocola Yanev, Rumen Andonov Indrajit Bhattacharya CMSC 838T Presentation.
1 Nonlinear Instability in Multiple Time Stepping Molecular Dynamics Jesús Izaguirre, Qun Ma, Department of Computer Science and Engineering University.
1 An improved hybrid Monte Carlo method for conformational sampling of proteins Jesús A. Izaguirre and Scott Hampton Department of Computer Science and.
Joo Chul Yoon with Prof. Scott T. Dunham Electrical Engineering University of Washington Molecular Dynamics Simulations.
A Fault-tolerant Architecture for Quantum Hamiltonian Simulation Guoming Wang Oleg Khainovski.
RAPID: Randomized Pharmacophore Identification for Drug Design PW Finn, LE Kavraki, JC Latombe, R Motwani, C Shelton, S Venkatasubramanian, A Yao Presented.
Bioinf. Data Analysis & Tools Molecular Simulations & Sampling Techniques117 Jan 2006 Bioinformatics Data Analysis & Tools Molecular simulations & sampling.
Molecular Modeling Part I Molecular Mechanics and Conformational Analysis ORG I Lab William Kelly.
Parallel Performance of Hierarchical Multipole Algorithms for Inductance Extraction Ananth Grama, Purdue University Vivek Sarin, Texas A&M University Hemant.
CompuCell Software Current capabilities and Research Plan Rajiv Chaturvedi Jesús A. Izaguirre With Patrick M. Virtue.
Algorithms and Software for Large-Scale Simulation of Reactive Systems _______________________________ Ananth Grama Coordinated Systems Lab Purdue University.
Materials Process Design and Control Laboratory ON THE DEVELOPMENT OF WEIGHTED MANY- BODY EXPANSIONS USING AB-INITIO CALCULATIONS FOR PREDICTING STABLE.
Process Flowsheet Generation & Design Through a Group Contribution Approach Lo ï c d ’ Anterroches CAPEC Friday Morning Seminar, Spring 2005.
ChE 452 Lecture 24 Reactions As Collisions 1. According To Collision Theory 2 (Equation 7.10)
CZ5225 Methods in Computational Biology Lecture 4-5: Protein Structure and Structural Modeling Prof. Chen Yu Zong Tel:
Scheduling Many-Body Short Range MD Simulations on a Cluster of Workstations and Custom VLSI Hardware Sumanth J.V, David R. Swanson and Hong Jiang University.
Combined Central and Subspace Clustering for Computer Vision Applications Le Lu 1 René Vidal 2 1 Computer Science Department, Johns Hopkins University,
E-science grid facility for Europe and Latin America E2GRIS1 André A. S. T. Ribeiro – UFRJ (Brazil) Itacuruça (Brazil), 2-15 November 2008.
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
An FPGA Implementation of the Ewald Direct Space and Lennard-Jones Compute Engines By: David Chui Supervisor: Professor P. Chow.
Overcoming Scaling Challenges in Bio-molecular Simulations Abhinav Bhatelé Sameer Kumar Chao Mei James C. Phillips Gengbin Zheng Laxmikant V. Kalé.
A Technical Introduction to the MD-OPEP Simulation Tools
1 Complex Images k’k’ k”k” k0k0 -k0-k0 branch cut   k 0 pole C1C1 C0C0 from the Sommerfeld identity, the complex exponentials must be a function.
Molecular simulation methods Ab-initio methods (Few approximations but slow) DFT CPMD Electron and nuclei treated explicitly. Classical atomistic methods.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
Molecular Dynamics Simulations of Compressional Metalloprotein Deformation Andrew Hung 1, Jianwei Zhao 2, Jason J. Davis 2, Mark S. P. Sansom 1 1 Department.
Molecular Simulation of Reactive Systems. _______________________________ Sagar Pandit, Hasan Aktulga, Ananth Grama Coordinated Systems Lab Purdue University.
MODELING MATTER AT NANOSCALES 3. Empirical classical PES and typical procedures of optimization Classical potentials.
A Software Framework for Distributed Services Michael M. McKerns and Michael A.G. Aivazis California Institute of Technology, Pasadena, CA Introduction.
Molecular Modelling - Lecture 2 Techniques for Conformational Sampling Uses CHARMM force field Written in C++
LSM3241: Bioinformatics and Biocomputing Lecture 6: Fundamentals of Molecular Modeling Prof. Chen Yu Zong Tel:
Using Cache Models and Empirical Search in Automatic Tuning of Applications Apan Qasem Ken Kennedy John Mellor-Crummey Rice University Houston, TX Apan.
ApproxHadoop Bringing Approximations to MapReduce Frameworks
1 Statistical Mechanics and Multi- Scale Simulation Methods ChBE Prof. C. Heath Turner Lecture 20 Some materials adapted from Prof. Keith E. Gubbins:
Molecular dynamics (4) Treatment of long-range interactions Computing properties from simulation results.
Materials Process Design and Control Laboratory ON THE DEVELOPMENT OF WEIGHTED MANY- BODY EXPANSIONS USING AB-INITIO CALCULATIONS FOR PREDICTING STABLE.
Developing a Force Field Molecular Mechanics. Experimental One Dimensional PES Quantum mechanics tells us that vibrational energy levels are quantized,
A Pattern Language for Parallel Programming Beverly Sanders University of Florida.
Dynameomics: Protein Mechanics, Folding and Unfolding through Large Scale All-Atom Molecular Dynamics Simulations INCITE 6 David A. C. Beck Valerie Daggett.
Multipole-Based Preconditioners for Sparse Linear Systems. Ananth Grama Purdue University. Supported by the National Science Foundation.
Molecular dynamics (MD) simulations  A deterministic method based on the solution of Newton’s equation of motion F i = m i a i for the ith particle; the.
Image Charge Optimization for the Reaction Field by Matching to an Electrostatic Force Tensor Wei Song Donald Jacobs University of North Carolina at Charlotte.
A Computational Study of RNA Structure and Dynamics Rhiannon Jacobs and Dr. Harish Vashisth Department of Chemical Engineering, University of New Hampshire,
A Computational Study of RNA Structure and Dynamics Rhiannon Jacobs and Harish Vashisth Department of Chemical Engineering, University of New Hampshire,
Multi-cellular paradigm The molecular level can support self- replication (and self- repair). But we also need cells that can be designed to fit the specific.
Implementation of the TIP5P Potential
for more information ... Performance Tuning
Extra Tree Classifier-WS3 Bagging Classifier-WS3
Algorithms and Software for Large-Scale Simulation of Reactive Systems
Performance Evaluation of the Parallel Fast Multipole Algorithm Using the Optimal Effectiveness Metric Ioana Banicescu and Mark Bilderback Department of.
Course Outline Introduction in algorithms and applications
Supported by the National Science Foundation.
Molecular simulation methods
Algorithms and Software for Large-Scale Simulation of Reactive Systems
Computer simulation studies of forced rupture kinetics of
Presentation transcript:

MDSimAid: Automatic optimization of fast electrostatics in molecular simulations Jesús A. Izaguirre, Michael Crocker, Alice Ko, Thierry Matthey and Yao Wang Corresponding author: Department of Computer Science and Engineering University of Notre Dame, USA

Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

Motivation u Create a recommender system/ self- adaptive software for running molecular dynamics (MD) simulations u Importance of MD n Protein dynamics, e.g., protein folding n Sampling and thermodynamics, e.g., drug design u Complicated to use effectively n Too many algorithms for performance critical sections, with complex parameter relationships

Classical molecular dynamics u Newton’s equations of motion: u Molecules u CHARMM force field (Chemistry at Harvard Molecular Mechanics) Bonds, angles and torsions

Energy Functions Ubond = oscillations about the equilibrium bond length Uangle = oscillations of 3 atoms about an equilibrium angle Udihedral = torsional rotation of 4 atoms about a central bond Unonbond = non-bonded energy terms (electrostatics and Lennard-Jones)

Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

Performance critical section: Electrostatics computation (1) This is a conditionally convergent sum. Ewald (1921) figured how to split this sum into two rapidly convergent sums:

Derivation of fast electrostatics methods I Ewald used the following function: The short range part is usually solved directly. The smooth part may be solved by a Fourier series, giving rise to Ewald methods:

Derivation of fast electrostatics methods II Ewald chooses splitting parameter so that work is evenly spaced as O(N 3/2 ) Particle Mesh Ewald (PME) chooses splitting parameter such that short- range part is O(N), and interpolates Fourier series to a mesh, thus allowing the use of O(N log N) FFT

Fast electrostatics algorithms There are many methods. Which one to use for a given system and accuracy? Ewald summation O(N 3/2 )Ewald, 1921 Fast Multipole Method O(N)Greengard, 1987 Particle Mesh Ewald O(N log N)Darden, 1993 Multi-grid summation (dense mat-vec as a sum of sparse mat-vec) O(N)Brandt et al., 1990 Skeel et al., 2002 Izaguirre et al., 2003

Particle Mesh Ewald u Following Ewald, separates the electrostatic interactions into two parts: n Direct-space short range evaluation n Fourier-space evaluation u The Fourier term is approximated by using fast Fourier transforms on a grid u Method parameters are grid size and cutoff of direct-space

Multigrid I

Multigrid II

Multigrid III u Complex relationship among method parameters: n Cutoff and softening distances for potential evaluation at the particle and grid levels n Grid size and interpolation order n Number of levels u Rules extracted from extensive evaluation encapsulated in MDSimAid n Fine tuned at run-time by running selected tests n Makes these methods easier to use

Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

Related Work I: Performance Models n Darden et al., J. Chem. Phys l effect of varying parameters of Particle Mesh Ewald n Petersen et al., J. Chem. Phys l accuracy and efficiency of Particle Mesh Ewald n Krasny et al., J. Chem. Phys l used FMM to compute direct part of Ewald sum n Skeel et al., J. Comp. Chem l study of parameters for multigrid (MG) method. Compared MG to Fast Multipole Method (FMM). MG faster than FMM for low accuracy

Related Work II: Limitations u Most published results n fail to suggest how to determine the specific values n provide general trends only n contain unknown constants in equations that model performance n Do not account for modern computer architectures, particularly memory subsystem u For example, using parameters for MG suggested by MDSimAid gave an order of magnitude better accuracy and between 2 to 4 times faster execution than suggested by analytical results (Ko, 2002)

Summary u General contributions of this study n Practical guidelines for choosing parameters for each fast electrostatics algorithms, and to choose among different algorithms l Implemented important algorithms with reasonable efficiency in ProtoMol l Tested algorithms for various system sizes and accuracy l Tested quality of these methods for MD of solvated proteins n Encapsulated results on a tool called MDSimAid n Hybrid rule-based and run-time optimization for choosing algorithm and parameter for MD

Experimental protocol u These methods were tested and implemented in a common generic and OO framework, ProtoMol: 1. Smooth Particle Mesh Ewald 2. Multigrid summation 3. Ewald summation u Testing protocol: n Methods (1) and (2) above were compared against (3) to determine accuracy and relative speedup n Tested on water boxes and protein systems ranging from 1,000 to 100,000 atoms, and low and high accuracies n For selected protein systems, structural and transport properties were computed (e.g., Melittin, pdb id 2mlt, in water, atoms)

Software adaptation I (1) Domain specific language to define MD simulations in ProtoMol (Matthey and Izaguirre, ) (2) “JIT” generation of prototypes: factories of template instantiations

Software adaptation II (3) Timing and comparison facilities: force compare time force Coulomb -algorithm PMEwald -interpolation BSpline -cutoff 6.5 -gridsize force compare time force Coulomb -algorithm PMEwald -interpolation BSpline -cutoff 6.5 -gridsize (4) Selection at runtime – extensible module using Python trial_param1[i]=cutoffvalue+1 test_PME(cutoffvalue+1,gridsize,..., accuracy)

Optimization strategies I (1) Rules generated from experimental data & analytical models (thousands of data points) (2A) At run-time, M tests are tried by exploring the parameter that changes accuracy or time most rapidly (biased search), or (2B) At run-time, M tests are tried by randomly exploring all valid method parameter combinations (3) Choose the fastest algorithm/parameter combination within accuracy constraints

Example of parameter space exploration for PME

Examples of rules in MDSimAid u Rules can be generated automatically using inductive decision trees or regression rules (or any machine learning technique). Example: n Regression for PME: cutoff = * N * rPE n Regression tree for MG-Ewald: rPE 1E-5 : | rPE 1E-4 : LM3 (41/54.3%) n Models at the leaves: LM1: level = 2.96 LM2: level = 1.83 LM3: level = 1.27 u Rules can be evaluated by correlation coefficients and other metrics: n Correlation coefficient for PME n Correlation coefficient for MG-Ewald

Fastest algorithms found I Platform: Solaris; Biased search with 3 trials and unbiased with 5 trials

Fastest algorithms found II Platform: Linux; Biased search with 3 trials and unbiased with 5 trials

Platform: Solaris; Biased search with 3 trials and unbiased with 5 trials Fastest algorithms found III

Fastest algorithms found IV Platform: Linux; Biased search with 3 trials and unbiased with 5 trials

Optimization strategies II u Hybrid optimization strategies work well for algorithmic tuning: n Possible to obtain rules for general trends n Runtime optimization provides fine-tuning and may signal invalid rules u Rules provide initial guess; unbiased local search in parameter space provides improvements n In our tests, local search found three times faster fast electrostatic algorithms, at a cost of 3 times more computational search

Hybrid optimization speedup Platform: Solaris

Hybrid optimization speedup Platform: Linux

Results (10 -4 rPE)

Results (10 -5 rPE)

Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

Related Work III u Algorithm or software selection problem (Rice, 1976) u Optimization at higher level than PHIPAC, ATLAS, etc. u Optimization approach similar to SPIRAL (Moura et al., 2000) u Some rules are generated by inductive methods, similar to PYTHIA II (Houstis et al., 2000) u Some rules come from performance models, similar to SALSA (Dongarra & Eijkhout, 2002) and BeBop (Vuduc, Yelick, Demmel, )

Discussion u Extensions n Fit in framework for SANS l Timing facilities (Autopilot at UIUC, etc.) l Algorithmic description metadata on XML l History database, agents for update of rules n Extend to more parts of MD simulation protocol: l Multiple time stepping integrators l Parallelism, clusters, grids l Lower level parameters (matrix-vector block size, etc.)? Or use ATLAS! u For further reference: n n n

Acknowledgements u NSF Biocomplexity grant IBN u NSF ACI CAREER Award ACI u NSF REU grant u Prof. Ricardo Vilalta and students, Univ. of Houston, for work on inductive learning