Probabilistic Roadmaps: A Tool for Computing Ensemble Properties of Molecular Motions Serkan Apaydin, Doug Brutlag 1 Carlos Guestrin, David Hsu 2 Jean-Claude.

Presentation transcript:

Probabilistic Roadmaps: A Tool for Computing Ensemble Properties of Molecular Motions
Serkan Apaydin, Doug Brutlag (1), Carlos Guestrin, David Hsu (2), Jean-Claude Latombe, Chris Varma
Computer Science Department, Stanford University
(1) Department of Biochemistry, Stanford University
(2) Computer Science Department, University of North Carolina

Goal of our Research
Develop efficient computational representations and algorithms to study molecular pathways for protein folding and ligand-protein binding
- Protein folding → RECOMB '02
- Ligand-protein binding → ECCB '02

Acknowledgements
People: Leo Guibas; Michael Levitt (Structural Biology); Itay Lotan; Vijay Pande (Chemistry); Fabian Schwarzer; Amit Singh; Rohit Singh
Funding: NSF-ITR ACI; Stanford's Bio-X and Graduate Fellowship programs

Analogy with Robotics

Configuration Space
Approximate the free space by random sampling → Probabilistic Roadmaps

Probabilistic Roadmap [Kavraki, Svestka, Latombe, Overmars, 95]
[Figure: roadmap of milestones in the free space]

Probabilistic Completeness
The probability that a roadmap fails to correctly capture the connectivity of the free space goes to 0 exponentially in the number of milestones (~ running time).
→ Random sampling is a convenient incremental scheme for approximating the free space
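A minimal sketch of roadmap construction by random sampling, under assumptions that are not in the slides: a 2D unit-square configuration space, circular obstacles, a fixed connection radius, and straight-line local paths checked at a fixed resolution. The function names (`is_free`, `segment_free`, `build_roadmap`) are hypothetical.

```python
import math
import random

def is_free(p, obstacles):
    """True if point p lies outside every circular obstacle (cx, cy, r)."""
    return all(math.hypot(p[0] - cx, p[1] - cy) > r for cx, cy, r in obstacles)

def segment_free(p, q, obstacles, step=0.01):
    """Approximate check of a straight segment: test points spaced about `step` apart."""
    n = max(1, int(math.dist(p, q) / step))
    return all(
        is_free((p[0] + (q[0] - p[0]) * t / n, p[1] + (q[1] - p[1]) * t / n), obstacles)
        for t in range(n + 1)
    )

def build_roadmap(n_milestones, radius, obstacles, rng):
    """Sample free milestones in the unit square and link nearby visible pairs."""
    nodes = []
    while len(nodes) < n_milestones:
        p = (rng.random(), rng.random())
        if is_free(p, obstacles):          # rejection sampling of the free space
            nodes.append(p)
    edges = {i: set() for i in range(n_milestones)}
    for i in range(n_milestones):
        for j in range(i + 1, n_milestones):
            close = math.dist(nodes[i], nodes[j]) <= radius
            if close and segment_free(nodes[i], nodes[j], obstacles):
                edges[i].add(j)
                edges[j].add(i)
    return nodes, edges

# Assumed toy workspace: one disc-shaped obstacle in the middle of the square.
obstacles = [(0.5, 0.5, 0.2)]
radius = 0.15
nodes, edges = build_roadmap(200, radius, obstacles, random.Random(0))
```

Adding more milestones only extends `nodes` and `edges`, which is the incremental property the slide refers to; the all-pairs neighbor search is kept quadratic here for brevity.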

Computed Examples

Biology vs. Robotics
- Energy field, instead of joint control
- Continuous energy field, instead of binary free and in-collision spaces
- Multiple pathways, instead of single collision-free path
- Potentially many more degrees of freedom
- Relation to real world is more complex

Initial Work [Singh, Latombe, Brutlag, 99]
- Study of ligand-protein binding
- Probabilistic roadmaps with edges weighted by energetic plausibility
- Search of most plausible paths
- Study of energy profiles along such paths
- Extensions to protein folding [Song and Amato, 01] [Apaydin et al., 01]

New Idea: Capture the stochastic nature of molecular motion by assigning probabilities to edges
[Figure: roadmap edge between nodes v_i and v_j, labeled with transition probability P_ij]

Why is this a good idea?
1) We can approximate Monte Carlo simulation as closely as we wish
2) Unlike with MC simulation, we avoid the local-minima problem
3) We can consider all pathways in the roadmap at once to compute ensemble properties

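Edge probabilities
The edge probabilities P_ij follow the Metropolis criteria; each node also carries a self-transition probability P_ii.
[Figure: nodes v_i and v_j with transition probability P_ij and self-transition probability P_ii]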
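The formulas on this slide did not survive extraction, so the following is only a sketch of one common way to apply the Metropolis criterion on a graph. It assumes that a move from node i is proposed uniformly among its N_i neighbors, accepted with probability min(1, exp(-ΔE/kT)), and that the leftover probability becomes the self-transition P_ii. The function name, the `kT` parameter, and the toy energies are hypothetical.

```python
import math

def transition_probabilities(energies, neighbors, kT=1.0):
    """Return P[i][j] for every roadmap node i, including the self-transition P[i][i]."""
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        for j in nbrs:
            dE = energies[j] - energies[i]
            # Metropolis acceptance: downhill moves always, uphill moves with Boltzmann weight.
            accept = math.exp(-dE / kT) if dE > 0 else 1.0
            row[j] = accept / len(nbrs)      # assumed uniform proposal over neighbors
        row[i] = 1.0 - sum(row.values())     # rejected proposals stay at node i
        P[i] = row
    return P

# Assumed toy roadmap: three nodes in a line, with an energy barrier at node 1.
energies = {0: 0.0, 1: 1.0, 2: 0.0}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
P = transition_probabilities(energies, neighbors)
```

In this toy case node 0 climbs to node 1 with probability exp(-1) and otherwise stays put, while node 1 always moves downhill, so its self-transition probability is zero.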

Stochastic Roadmap Simulation
Stochastic simulation on the roadmap and Monte Carlo simulation converge to the same Boltzmann distribution
[Figure: roadmap with a set S of nodes and edges labeled with transition probabilities P_ij]

Problems with Monte Carlo Simulation
- Much time is wasted in local minima
- Each run generates a single pathway

Solution
Treat roadmap as a Markov chain and use the First-Step Analysis tool
[Figure: roadmap edges labeled with transition probabilities P_ij]

Example #1: Probability of Folding p_fold
[Figure: a conformation reaches the folded set with probability p_fold and the unfolded set with probability 1 - p_fold; HIV integrase, Du et al. '98]
"We stress that we do not suggest using p_fold as a transition coordinate for practical purposes as it is very computationally intensive."
Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding", Journal of Chemical Physics (1998).

First-Step Analysis
[Figure: node i with neighbors j, k, l, m and transition probabilities P_ii, P_ij, P_ik, P_il, P_im; F: folded set, U: unfolded set]
Let $f_i = p_{fold}(i)$. After one step:
$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$
with $f = 1$ for nodes in F and $f = 0$ for nodes in U.
- One linear equation per node
- Solution gives p_fold for all nodes
- No explicit simulation run
- All pathways are taken into account
- Sparse linear system
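As a sketch of how the first-step equations can be solved, the code below isolates f_i in each equation and sweeps over the nodes until the values stop changing (a Gauss-Seidel iteration, chosen here because it only touches the nonzero P_ij). The solver choice, the function name `solve_pfold`, and the 5-node chain are assumptions for illustration, not the authors' implementation.

```python
def solve_pfold(P, folded, unfolded, tol=1e-12, max_iter=100_000):
    """Solve f_i = sum_j P_ij f_j with f = 1 on `folded` and f = 0 on `unfolded`."""
    f = {i: (1.0 if i in folded else 0.0) for i in P}
    interior = [i for i in P if i not in folded and i not in unfolded]
    for _ in range(max_iter):
        delta = 0.0
        for i in interior:
            # Move the P_ii f_i term to the left-hand side and divide by (1 - P_ii).
            rest = sum(p * f[j] for j, p in P[i].items() if j != i)
            new = rest / (1.0 - P[i].get(i, 0.0))
            delta = max(delta, abs(new - f[i]))
            f[i] = new
        if delta < tol:
            break
    return f

# Assumed toy roadmap: a 5-node chain, unfolded set {0}, folded set {4},
# and a lazy symmetric walk (stay 0.5, left 0.25, right 0.25) in between.
chain = {
    0: {0: 1.0},
    1: {0: 0.25, 1: 0.5, 2: 0.25},
    2: {1: 0.25, 2: 0.5, 3: 0.25},
    3: {2: 0.25, 3: 0.5, 4: 0.25},
    4: {4: 1.0},
}
f = solve_pfold(chain, folded={4}, unfolded={0})
```

For this chain the exact answer is p_fold(i) = i/4, and one solve yields the value at every node at once without running any simulation.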

In Contrast …
Computing p_fold with MC simulation requires:
- Performing many MC simulation runs
- Counting the number of times F is attained first for every conformation of interest
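For comparison, a simulation-based estimate of p_fold for a single conformation can be sketched as below: run many independent walks from the start node and count the fraction that reach F before U. The walk runs on a small transition table rather than a real energy landscape; the table, the function name `estimate_pfold`, and the number of runs are assumptions.

```python
import random

def estimate_pfold(P, start, folded, unfolded, n_runs, rng):
    """Fraction of random walks from `start` that hit `folded` before `unfolded`."""
    hits = 0
    for _ in range(n_runs):
        node = start
        while node not in folded and node not in unfolded:
            targets = list(P[node])
            weights = [P[node][t] for t in targets]
            node = rng.choices(targets, weights=weights)[0]
        hits += node in folded   # True counts as 1
    return hits / n_runs

# Assumed toy chain: unfolded set {0}, folded set {4}, lazy symmetric walk in between.
chain = {
    0: {0: 1.0},
    1: {0: 0.25, 1: 0.5, 2: 0.25},
    2: {1: 0.25, 2: 0.5, 3: 0.25},
    3: {2: 0.25, 3: 0.5, 4: 0.25},
    4: {4: 1.0},
}
estimate = estimate_pfold(chain, 2, {4}, {0}, 5000, random.Random(0))
```

The estimate carries sampling noise on the order of 1/sqrt(n_runs), and the whole procedure has to be repeated for every conformation of interest.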

Computational Tests
- 1ROP (repressor of primer): 2 α-helices, 6 DOF
- 1HDD (Engrailed homeodomain): 3 α-helices, 12 DOF
H-P energy model with steric clash exclusion [Sun et al., 95]

1ROP Correlation with MC Approach

1HDD

Computation Times (1ROP)
Monte Carlo:
- 49 conformations
- Over 11 days of computer time
- Over 10^6 energy computations
Roadmap:
- 5000 conformations
- hours of computer time
- ~15,000 energy computations
~4 orders of magnitude speedup!

Example #2: Ligand-Protein Interaction
Computation of escape time from funnels of attraction around potential binding sites (funnel = ball of 10 Å RMSD)

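Computing Escape Time with Roadmap
[Figure: funnel of attraction containing node i, with neighbors j, k, l, m and transition probabilities P_ii, P_ij, P_ik, P_il, P_im]
$\tau_i = 1 + P_{ii} \tau_i + P_{ij} \tau_j + P_{ik} \tau_k + P_{il} \tau_l + P_{im} \tau_m$
with $\tau = 0$ for nodes outside the funnel
(escape time is measured as number of steps of stochastic simulation)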
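The escape-time equations have the same structure as the p_fold system, with an extra constant 1 per step. A minimal sketch, assuming an iterative sweep as the solver and a hypothetical three-node roadmap in which nodes 0 and 1 are inside the funnel and node 2 is outside:

```python
def escape_times(P, funnel, tol=1e-12, max_iter=100_000):
    """Solve tau_i = 1 + sum_j P_ij tau_j for funnel nodes, with tau = 0 outside."""
    tau = {i: 0.0 for i in P}
    for _ in range(max_iter):
        delta = 0.0
        for i in funnel:
            # Isolate tau_i: (1 - P_ii) tau_i = 1 + sum over j != i of P_ij tau_j.
            rest = sum(p * tau[j] for j, p in P[i].items() if j != i)
            new = (1.0 + rest) / (1.0 - P[i].get(i, 0.0))
            delta = max(delta, abs(new - tau[i]))
            tau[i] = new
        if delta < tol:
            break
    return tau

# Assumed toy roadmap: 0 - 1 - 2, where leaving the funnel means reaching node 2.
P = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.5, 2: 0.25},
    2: {2: 1.0},
}
tau = escape_times(P, funnel=[0, 1])
```

Solving the two equations by hand gives tau_1 = 6 and tau_0 = 8 steps, which is what the iteration converges to.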

Similar Computation Through Simulation [Sept, Elcock and McCammon '99]
10K to 30K independent simulations

Applications
1) Distinguishing catalytic site: Given several potential binding sites, which one is the catalytic site?

Complexes Studied
Ligand          Protein   # random nodes   # DOFs
oxamate         1ldm      8000             7
Streptavidin    1stp
Hydroxylamine   4ts1
COT             1cjw
THK             1aid
IPM             1ao5
PTI             3tpi      8000             13

Distinction Based on Energy
Columns: Protein, Bound state (kcal/mol), Best potential binding site (kcal/mol)
Proteins listed: 1stp, 4ts1, 3tpi, 1ldm, 1cjw, 1aid, 1ao5
Legend: Able to distinguish catalytic site / Not able

Distinction Based on Escape Time
Protein   Bound state (# steps)   Best potential binding site (# steps)
1stp      3.4E+9                  1.1E+7
4ts1      3.8E+10                 1.8E+6
3tpi      1.3E+11                 5.9E+5
1ldm      8.1E+5                  3.4E+6
1cjw      5.4E+8                  4.2E+6
1aid      9.7E+5                  1.6E+8
1ao5      6.6E+7                  5.7E+6
Legend: Able to distinguish catalytic site / Not able

Applications
1) Distinguishing catalytic site
2) Computational mutagenesis
Some amino acids are deleted entirely, replaced by other amino acids, or sidechains altered
[Figure: Chemical environment of the LDH-NADH-substrate complex; LDH catalyzes conversion of pyruvate to lactate in the presence of NADH. Labels: pyruvate, NADH, loop, GLN-101, ARG-106, ASP-166, ARG-169, HIS-193, ASP-195]

Binding of Pyruvate to LDH
[Figure: pyruvate bound to LDH with NADH. Labels: GLN-101, ARG-106, ASP-166, ARG, HIS-193, ASP-195, THR-245, loop]

Results
Mutant                    Escape Time   Change
Wildtype                  3.216E6       N/A
His193→Ala, Arg106→Ala    4.126E2       ↓
His193→Ala                3.381E3       ↓
Arg106→Ala                2.550E2       ↓
Asp195→Asn                5.221E7       ↑
Gln101→Arg                1.669E6       No change
Thr245→Gly                4.607E5       ↓
[Figure: LDH binding site with pyruvate and NADH. Labels: GLN-101, ARG-106, ASP-166, ARG-169, HIS-193, ASP-195, GLY-245, loop]

Conclusion
Probabilistic roadmaps are a promising computational tool for studying ensemble properties of molecular pathways
Current and future work:
- Better kinetic/energetic models
- Experimentally verifiable tests
- Non-uniform sampling strategies
- Encoding MD simulation

Stochastic Roadmap Simulation
Stochastic simulation on a roadmap and MC simulation converge to the same distribution π (Boltzmann):
For any set S and any ε > 0, δ > 0, γ > 0, there exists N such that a roadmap with N milestones has bounded error with probability at least 1 - γ
[Figure: roadmap with nodes v_s and v_g and a set S]

Energy Function [Sun et al. '95]
Based on pairwise sidechain centroid distances
H-P model:
- Amino acids classified as either hydrophobic or polar
- Contacts between hydrophobic residues rewarded
Exclusion term to prevent steric clashes

Ligand-Protein Modeling
DOF = 10
- 3 coordinates to position root atom;
- 2 angles to specify first bond;
- Angles for all remaining non-terminal atoms;
- Bond angles are assumed constant;
Protein assumed rigid
[Singh, Latombe and Brutlag '99]
[Figure: ligand with root-atom position x, y, z and its angular degrees of freedom]

Energy of Interaction
Energy = van der Waals interaction ($E_v$) + electrostatic interaction ($E_c$)
$E_v = 0.2\,[(R_0/R_{ij})^{12} - 2\,(R_0/R_{ij})^{6}]$
$E_c = 332\, Q_i Q_j / (\epsilon R_{ij})$
[Figure: plots of $E_v$ and $E_c$ against the interatomic distance $R_{ij}$]
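A minimal sketch of the two-term interaction energy, under stated assumptions: the van der Waals term uses the standard 12-6 form with R_0 as the minimum-energy separation (the exponents are garbled in the slide text), a single R_0 is shared by all atom pairs, and atoms are given as hypothetical `(x, y, z, charge)` tuples. The constant 332 yields kcal/mol when charges are in electron units and distances in Å.

```python
import math

def van_der_waals(r, r0, well_depth=0.2):
    """12-6 term: equals -well_depth at r == r0 and rises steeply for r < r0."""
    ratio6 = (r0 / r) ** 6
    return well_depth * (ratio6 * ratio6 - 2.0 * ratio6)

def coulomb(qi, qj, r, dielectric=1.0):
    """Electrostatic term for point charges in a uniform dielectric."""
    return 332.0 * qi * qj / (dielectric * r)

def interaction_energy(ligand_atoms, protein_atoms, r0=4.0, dielectric=1.0):
    """Sum both terms over all ligand-protein atom pairs (x, y, z, charge)."""
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for px, py, pz, pq in protein_atoms:
            r = math.dist((lx, ly, lz), (px, py, pz))
            total += van_der_waals(r, r0) + coulomb(lq, pq, r, dielectric)
    return total

# Assumed toy pair: opposite unit charges placed 4 Å apart, exactly at r0.
energy = interaction_energy([(0.0, 0.0, 0.0, 1.0)], [(4.0, 0.0, 0.0, -1.0)])
```

For the toy pair the van der Waals term sits at its minimum of -0.2 and the electrostatic term contributes -83, so the total is -83.2 kcal/mol.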

Solvent Effects
$E_c = 332\, Q_i Q_j / (\epsilon R_{ij})$
- Is only valid for an infinite medium of uniform dielectric;
- Dielectric discontinuities result in induced surface charges;
Solution: Poisson-Boltzmann equation
$\nabla \cdot [\epsilon(r) \nabla \phi(r)] - \epsilon(r)\, \kappa(r)^2 \sinh[\phi(r)] + 4\pi \rho^f(r)/kT = 0$
- Use DelPhi [Rocchia et al. '01]
- Finite-difference solution is based on discretizing the workspace into a uniform grid.