Stochastic Roadmap Simulation: An efficient representation and algorithm for analyzing molecular motion. Mehmet Serkan Apaydın, May 27th, 2004.

Presentation transcript:

Molecular motion is an essential process of life. Protein (mis)folding: Bovine Spongiform Encephalopathy (BSE). Ligand-protein binding: drug molecules act by binding to proteins. [Images: Stanford Bio-X cluster; an NMR spectrometer (CS273).]

Computing p_fold, the best order parameter in protein folding, is expensive using classical simulation techniques. [Figure: HIV integrase; a conformation reaches the folded set with probability p_fold and the unfolded set with probability 1 - p_fold.] "We stress that we do not suggest using p_fold as a transition coordinate for practical purposes as it is very computationally intensive." Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding," Journal of Chemical Physics (1998) [Du et al. '98].

Stochastic Roadmap Simulation (SRS): develop efficient computational representations and algorithms to study molecular motion pathways for protein folding and ligand-protein binding.

Contributions
New computational framework for studying molecular motion:
– Transition probabilities
– Correspondence to Monte Carlo
– First-step analysis
– Extension to non-uniform sampling
Computation of ensemble properties:
– Protein folding: p_fold parameter; comparison with Monte Carlo; quantitative predictions of experimental values
– Ligand-protein binding: escape time; qualitative predictions about the role of amino acids in the active site of a protein; application to distinguish the catalytic site from a set of potential binding sites

Outline: Background; Stochastic Roadmap Simulation; Applications (protein folding, ligand-protein binding); Extension of basic framework; Quantitative prediction of experimental results on protein folding.

Proteins and their structure Macromolecule Building block of life.

Ligand-Protein Binding

Simulating molecular motion Monte Carlo (MC) or Molecular Dynamics

Molecular Representations. Atomistic model. Linkage model: internal parameter representation (bond angles, bond lengths, torsional angles), or each secondary structure element as a vector [Lotan '04].

Analogy with Robotics. [Figure: articulated linkage with axes X0, Y0, X1, X2, X3 and three joint angles.]

Molecular Energetics. $E = E_S + E_\theta + E_{S\text{-}B} + E_{Tor} + E_{vdW} + E_{dipole}$, where the first four terms are bonded terms and the last two are non-bonded terms. Force fields; Gō models; Hydrophobic-Polar models (CS273).

MC simulation

Problems with Monte Carlo Simulation: each run generates a single pathway; much time is wasted in local minima.
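Both problems can be seen on a toy example. The following is a minimal sketch, not the simulation used in the talk: it assumes a one-dimensional lattice of states, a hypothetical double-well energy function `double_well`, and the standard Metropolis acceptance rule. One call produces exactly one pathway, and at low temperature most of its steps stay near the minimum where it started.

```python
import math
import random


def double_well(x):
    """Hypothetical energy landscape: minima at x=2 and x=8, barrier at x=5."""
    return 0.05 * (x - 2) ** 2 * (x - 8) ** 2


def metropolis_walk(energy, n_states, start, steps, kT=1.0, seed=0):
    """Run one Metropolis Monte Carlo trajectory on the states 0..n_states-1."""
    rng = random.Random(seed)
    x = start
    path = [x]
    for _ in range(steps):
        candidate = x + rng.choice((-1, 1))  # propose a neighboring state
        if 0 <= candidate < n_states:
            dE = energy(candidate) - energy(x)
            # Always accept downhill moves; accept uphill with Boltzmann probability.
            if dE <= 0 or rng.random() < math.exp(-dE / kT):
                x = candidate
        path.append(x)  # a rejected move repeats the current state
    return path


if __name__ == "__main__":
    path = metropolis_walk(double_well, n_states=11, start=2, steps=500, kT=0.5)
    near_start = sum(1 for x in path if x <= 4)
    print("fraction of steps spent in the starting well:", near_start / len(path))
```

Running it with several seeds gives several different pathways, which is the cost the roadmap approach is meant to avoid.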

A path planning technique: Probabilistic Roadmaps (PRM) [Kavraki et al. '96]. [Figure: configuration space with C-obstacles. Preprocessing builds the roadmap nodes and edges; a query connects Qinit and Qgoal through the roadmap.]
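The two phases named on this slide can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the planner of [Kavraki et al. '96]: the configuration space is the unit square, the single circular `OBSTACLE` is hypothetical, edges join samples closer than a fixed `radius`, and the query is a plain breadth-first search.

```python
import math
import random
from collections import deque

OBSTACLE = (0.5, 0.5, 0.2)  # hypothetical circular C-obstacle: center x, center y, radius


def is_free(p):
    """True if configuration p lies outside the obstacle."""
    cx, cy, r = OBSTACLE
    return math.hypot(p[0] - cx, p[1] - cy) > r


def segment_free(p, q, checks=20):
    """Approximate collision check: test evenly spaced points on the segment p-q."""
    return all(
        is_free((p[0] + (q[0] - p[0]) * t / checks, p[1] + (q[1] - p[1]) * t / checks))
        for t in range(checks + 1)
    )


def build_roadmap(n_samples, radius, extra=(), seed=0):
    """Preprocessing: sample free configurations and connect nearby pairs."""
    rng = random.Random(seed)
    nodes = list(extra)
    while len(nodes) < n_samples + len(extra):
        p = (rng.random(), rng.random())
        if is_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            close = math.hypot(nodes[i][0] - nodes[j][0], nodes[i][1] - nodes[j][1]) <= radius
            if close and segment_free(nodes[i], nodes[j]):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges


def query(edges, start, goal):
    """Query: breadth-first search for a path of node indices from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in edges[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # start and goal are not connected in this roadmap
```

A typical use passes Qinit and Qgoal as `extra` nodes, for example `build_roadmap(150, 0.2, extra=[(0.1, 0.1), (0.9, 0.9)])`, and then calls `query(edges, 0, 1)`.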

Application of PRM to molecular motion. Study of ligand-protein binding [Singh, Latombe, Brutlag, '99]: probabilistic roadmaps with edges weighted by energetic plausibility; search for the minimum-weight paths. Extensions to protein folding [Song and Amato, '01] [Apaydın et al., '01].

How many pathways are there in a roadmap? [Table: number of self-avoiding walks on a 2D grid, indexed by grid size n/m up to 12x12. Surviving entries: 2, 12, 184, 8512; the larger entries are missing.]
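The surviving table entries can be reproduced by brute force. The sketch below assumes the table counts self-avoiding paths between opposite corners of a square lattice with n nodes per side; that reading is an assumption, chosen because it matches the entries 2, 12, 184, 8512. The function name is illustrative.

```python
def count_self_avoiding_paths(n):
    """Count self-avoiding paths between opposite corners of an n x n lattice of nodes."""
    goal = (n - 1, n - 1)
    visited = {(0, 0)}

    def dfs(x, y):
        if (x, y) == goal:
            return 1
        total = 0
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < n and 0 <= ny < n and (nx, ny) not in visited:
                visited.add((nx, ny))
                total += dfs(nx, ny)
                visited.remove((nx, ny))  # backtrack so other paths may reuse the node
        return total

    return dfs(0, 0)


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, count_self_avoiding_paths(n))
```

The count grows so quickly that enumeration is already slow for lattices only slightly larger than these, which is the point of the slide: a roadmap encodes far more pathways than could ever be simulated one at a time.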

Outline: Background; Stochastic Roadmap Simulation; Applications (protein folding, ligand-protein binding); Future work.

New Idea: Stochastic Conformational Roadmaps. Capture the stochastic nature of molecular motion by assigning probabilities to edges: the edge from node v_i to node v_j carries a transition probability P_ij. [Apaydın et al., RECOMB '02, WAFR '02] Collaborators: C. Guestrin, D. Hsu

Edge probabilities. The transition probabilities P_ij from node v_i to node v_j follow the Metropolis criterion and correspond to the probabilities in Monte Carlo simulation. Self-transition probabilities: P_ii.
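The formula itself is not in the transcript. A minimal sketch of one common way to write Metropolis edge probabilities follows; it assumes each of the N_i neighbors of node i is proposed with probability 1/N_i, that uphill moves are accepted with probability exp(-ΔE/kT), and that the leftover probability becomes the self-transition P_ii. The function name and the dictionary layout are illustrative.

```python
import math


def transition_probabilities(energies, neighbors, kT=1.0):
    """Return P as a dict of dicts, where P[i][j] is the probability of moving i -> j.

    energies:  dict mapping node -> energy
    neighbors: dict mapping node -> list of adjacent nodes (not including itself)
    """
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        for j in nbrs:
            dE = energies[j] - energies[i]
            accept = math.exp(-dE / kT) if dE > 0 else 1.0  # Metropolis criterion
            row[j] = accept / len(nbrs)  # uniform proposal over the neighbors
        row[i] = 1.0 - sum(row.values())  # self-transition absorbs rejected moves
        P[i] = row
    return P


if __name__ == "__main__":
    energies = {0: 0.0, 1: 1.0, 2: 0.5}
    neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    for node, row in transition_probabilities(energies, neighbors).items():
        print(node, row)
```

Each row sums to one by construction, so the roadmap with these weights is a valid Markov chain.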

Relationship to MC simulation. Each path on the graph = a path of MC simulation. The roadmap represents many MC simulation paths simultaneously. Stochastic Roadmap Simulation and Monte Carlo simulation converge to the same stationary distribution (the Boltzmann distribution).
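The convergence claim can be checked numerically on a toy chain. The sketch below makes two assumptions that are not on the slide: every node has the same number of neighbors (so the proposal is symmetric) and the transition rule is the Metropolis one. Under those assumptions, repeatedly applying the transition matrix to any starting distribution should approach the Boltzmann weights.

```python
import math


def metropolis_matrix(energies, adjacency, kT=1.0):
    """Dense Metropolis transition matrix for a small roadmap."""
    n = len(energies)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in adjacency[i]:
            dE = energies[j] - energies[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)
            P[i][j] = accept / len(adjacency[i])
        P[i][i] = 1.0 - sum(P[i])  # remaining mass stays on node i
    return P


def stationary(P, iterations=2000):
    """Power iteration: push a uniform distribution through the chain."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def boltzmann(energies, kT=1.0):
    """Normalized Boltzmann weights exp(-E/kT) / Z."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    energies = [0.0, 1.0, 0.5, 2.0]
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # every node has 2 neighbors
    print(stationary(metropolis_matrix(energies, ring)))
    print(boltzmann(energies))
```

When neighbor counts differ between nodes the simple rule above no longer satisfies detailed balance exactly, so this check is a statement about the toy chain only.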

Using SRS to compute ensemble properties: treat the roadmap as a Markov chain with transition probabilities P_ij and use first-step analysis.

Application of SRS to protein folding: probability of folding, p_fold. [Figure: HIV integrase; a conformation reaches the folded set with probability p_fold and the unfolded set with probability 1 - p_fold.] "We stress that we do not suggest using p_fold as a transition coordinate for practical purposes as it is very computationally intensive." Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding," Journal of Chemical Physics (1998) [Du et al. '98].

First-Step Analysis. F: folded set; U: unfolded set. Let f_i = p_fold(i). For a node i with neighbors j, k, l, m, after one step: $f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$, where f = 1 for a node in F and f = 0 for a node in U.
– One linear equation per node
– Solution gives p_fold for all nodes
– No explicit simulation run
– All pathways are taken into account
– Sparse linear system
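The linear system above can be written out directly. The following sketch is not the SRS code: it assumes a small dense transition matrix `P` given as a list of rows, and uses plain Gaussian elimination where a real roadmap would call for a sparse solver. The function names are illustrative.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def p_fold(P, folded, unfolded):
    """First-step analysis: f_i = sum_j P[i][j] f_j, with f=1 on folded, f=0 on unfolded."""
    n = len(P)
    interior = [i for i in range(n) if i not in folded and i not in unfolded]
    index = {node: k for k, node in enumerate(interior)}
    A = [[0.0] * len(interior) for _ in interior]
    b = [0.0] * len(interior)
    for i in interior:
        r = index[i]
        A[r][r] = 1.0
        for j in range(n):
            if j in index:
                A[r][index[j]] -= P[i][j]  # unknown f_j moves to the left-hand side
            elif j in folded:
                b[r] += P[i][j]  # known f_j = 1 contributes to the right-hand side
    f = [0.0] * n
    for j in folded:
        f[j] = 1.0
    for node, value in zip(interior, solve_linear(A, b)):
        f[node] = value
    return f


if __name__ == "__main__":
    # Toy chain 0-1-2-3-4: stay with probability 0.5, step left or right with 0.25 each.
    P = [[0.0] * 5 for _ in range(5)]
    for i in (1, 2, 3):
        P[i][i], P[i][i - 1], P[i][i + 1] = 0.5, 0.25, 0.25
    P[0][0] = P[4][4] = 1.0
    print(p_fold(P, folded={4}, unfolded={0}))
```

On the toy chain the answer is the classic gambler's-ruin result, p_fold(i) = i/4, and it is obtained for every node at once without running a single trajectory.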

In contrast, computing p_fold with MC simulation requires, for every conformation of interest: performing many MC simulation runs; counting the number of times F is attained first.
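For comparison, the counting procedure looks like this on the same kind of toy chain. It is a sketch under the assumption that the Monte Carlo simulation can be represented by sampling from a transition matrix `P`; the function names and the number of runs are illustrative.

```python
import random


def reaches_folded_first(P, start, folded, unfolded, rng):
    """Simulate one trajectory until it hits the folded or the unfolded set."""
    state = start
    while state not in folded and state not in unfolded:
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return state in folded


def p_fold_monte_carlo(P, start, folded, unfolded, runs=2000, seed=0):
    """Estimate p_fold(start) as the fraction of runs that reach the folded set first."""
    rng = random.Random(seed)
    hits = sum(reaches_folded_first(P, start, folded, unfolded, rng) for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    # Toy chain 0-1-2-3-4: stay with probability 0.5, step left or right with 0.25 each.
    P = [[0.0] * 5 for _ in range(5)]
    for i in (1, 2, 3):
        P[i][i], P[i][i - 1], P[i][i + 1] = 0.5, 0.25, 0.25
    P[0][0] = P[4][4] = 1.0
    for start in (1, 2, 3):
        print(start, p_fold_monte_carlo(P, start, folded={4}, unfolded={0}))
```

Each estimate covers a single starting conformation and carries sampling noise that shrinks only with the square root of the number of runs, whereas first-step analysis yields all nodes from one linear solve.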

Comparison: SRS vs. MC (on synthetic landscape). [Plot: L1 distance vs. number of nodes.]

Computational tests on two real proteins: 1ROP (repressor of primer), 2 α helices, 6 DOF; 1HDD (Engrailed homeodomain), 3 α helices, 12 DOF. H-P energy model with steric clash exclusion [Sun et al., '95].

Differences in p_fold values obtained by SRS and MC for 1ROP and 1HDD. [Plot: L1 distance vs. number of nodes.]

p_fold on a real protein: β hairpin. Immunoglobulin binding protein (Protein G), last 16 amino acids. Cα-based representation; Gō-model-based energy; 42 DOFs [Zhou and Karplus, '99].

Comparison between SRS and MC for β hairpin. [Plot: L1 distance vs. number of nodes.]

Computation times (β hairpin). Monte Carlo (30 simulations): 1 conformation; ~10 hours of computer time; over 10^7 energy computations. Roadmap: 2000 conformations; 23 seconds of computer time; ~50,000 energy computations. ~6 orders of magnitude speedup!

Outline: Background; Stochastic Roadmap Simulation; Applications (protein folding, ligand-protein binding); Extension of basic framework; Quantitative prediction of experimental results on protein folding.

Application of SRS to Ligand-Protein Interactions [Apaydın et al., ECCB '02]. Distinguishing the catalytic site: among several potential binding sites, which one is the catalytic site? Studying the effect of catalytic amino acids upon binding/unbinding. Collaborators: C. Guestrin, C. Varma

Funnels of attraction and escape time from a funnel. Funnel = energy gradient around a potential binding site that guides the ligand to that site; defined as all ligand conformations within 10 Å RMSD of the site [Camacho and Vajda '01]. Computation of escape time from funnels of attraction around potential binding sites.

Computing Escape Time with Roadmap. For a node i in the funnel of attraction with neighbors j, k, l, m: $\tau_i = 1 + P_{ii}\tau_i + P_{ij}\tau_j + P_{ik}\tau_k + P_{il}\tau_l + P_{im}\tau_m$, where τ = 0 for a node outside the funnel. (Escape time is measured as the number of steps of stochastic simulation.)
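The escape-time equations have the same first-step structure as p_fold and can be solved the same way. The sketch below is illustrative rather than the SRS implementation: it assumes a dense transition matrix `P`, treats every node not listed in `funnel` as having escape time zero, and uses a simple fixed-point iteration that only touches the nonzero entries a sparse solver would also use.

```python
def escape_times(P, funnel, tol=1e-12, max_iter=100000):
    """Expected number of steps to leave the funnel from each funnel node.

    Iterates tau_i = 1 + sum_{j in funnel} P[i][j] * tau_j; nodes outside the
    funnel contribute nothing because their escape time is zero.
    """
    funnel = list(funnel)
    tau = {i: 0.0 for i in funnel}
    for _ in range(max_iter):
        updated = {i: 1.0 + sum(P[i][j] * tau[j] for j in funnel) for i in funnel}
        change = max(abs(updated[i] - tau[i]) for i in funnel)
        tau = updated
        if change < tol:
            break
    return tau


if __name__ == "__main__":
    # Toy chain 0-1-2-3-4 with funnel {1, 2, 3}: stay with probability 0.5,
    # step left or right with probability 0.25 each.
    P = [[0.0] * 5 for _ in range(5)]
    for i in (1, 2, 3):
        P[i][i], P[i][i - 1], P[i][i + 1] = 0.5, 0.25, 0.25
    P[0][0] = P[4][4] = 1.0
    print(escape_times(P, funnel=[1, 2, 3]))
```

On the toy chain the middle node takes longest to escape, matching the intuition that conformations deep inside a funnel stay bound longer.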

Results on lactate dehydrogenase. [Figure: active site showing GLN-101, ARG-106, ASP-195, HIS-193, ASP-166, ARG-169, residue 245 (THR in the wildtype, GLY in the mutant), NADH, and the loop.]
Table columns: Mutant; Escape Time; Change.
– Wildtype: change N/A
– His193→Ala + Arg106→Ala
– His193→Ala
– Arg106→Ala
– Asp195→Asn
– Gln101→Arg
– Thr245→Gly: no change
[Escape-time values (of order E6 simulation steps) and the direction of change for the remaining mutants are not recoverable.]

Outline: Background; Stochastic Roadmap Simulation; Applications (protein folding, ligand-protein binding); Extension of basic framework; Quantitative prediction of experimental results on protein folding.

A non-uniform sampling strategy: sampling local minima and saddles of the landscape [Henkelman, Jonsson '99]

Adding critical points to the roadmap obtains the same quality in p_fold values with fewer nodes. [Plot: L1 distance vs. number of nodes.]

Outline: Background; Stochastic Roadmap Simulation; Applications (protein folding, ligand-protein binding); Extension of basic framework; Quantitative prediction of experimental results on protein folding.

Using p_fold to make quantitative predictions. Connecting theory with experiment: rates; Φ values. Transition state computation using: energy barriers considering monotonic pathways; p_fold considering all pathways. [Garbuzynskiy, Finkelstein, Galzitskaya '04] [Fersht '99] Collaborators: TH Chiang, D. Hsu (N.U. Singapore)

Φ value results using p_fold are better for 3 (out of 5) proteins. Table columns: Protein; Correlation to experiment in [Garbuzynskiy et al., '04]; Correlation to experiment with p_fold. Proteins: B1 IgG-binding domain of protein G; Src SH3 domain; SH3 domain of α-spectrin; Sso7d; CI. [Correlation values not recoverable.]

Computing rates with p_fold results in better correlation with experiment. [Plots: experimental rate and computed rate, log(k_f) vs. protein #, for [Garbuzynskiy et al., '04] and using p_fold. One correlation value, 0.67, survives; the other is missing.]

Contributions
New computational framework for studying molecular motion:
– Transition probabilities
– Correspondence to Monte Carlo
– First-step analysis
– Extension to non-uniform sampling
Computation of ensemble properties:
– Protein folding: p_fold parameter; comparison with Monte Carlo; quantitative predictions of experimental values
– Ligand-protein binding: escape time; qualitative predictions about the role of amino acids in the active site of a protein; application to distinguish the catalytic site from a set of potential binding sites

Future work: non-uniform sampling on high-dimensional examples; computing and reducing the error in the computed parameters; estimating the number of nodes needed; exploring larger systems and pushing the experiment

SRS code available! Visit:

Acknowledgements
My advisors: Prof. Latombe, Prof. Brutlag, Prof. Van Roy, Prof. McCluskey
My committee: Prof. Motwani, Prof. Vuckovic
Coauthors: D. Hsu, C. Guestrin, S. Kasif, A. Singh, C. Varma
Collaborators: TH Chiang, J. Greenberg, S. Ieong, F. Schwarzer, R. Singh, A. Tellez
Faculty: Prof. Altman, Prof. Baldwin, Prof. Guibas, Prof. Pande, Prof. Kavraki (Rice), Prof. Zell (Tuebingen), Prof. Snoeyink (UNC)
Funding: David L. Cheriton Stanford Graduate Fellowship, NSF Biogeometry grant, Stanford's Bio-X program
Resources: Bio-X SGI Supercomputer, Bio-X PC computer cluster
Colleagues: N. Batada, A. Ben-Hur, S. Bennett, E. Boas, T. Bretl, J. Brown, F. Buron, L. Chong, A. Collins, S. Elmer, P. Fong, A. Garg, S. Gokturk, H. Gonzales-Banos, K. Hauser, G. Henkelman, P. Isto, G. Jayachandran, J. Kuffner, S. Larson, M. Liang, B. Naughton, X. Liu, I. Lotan, H. Mandyam, N. Mitra, S. Mitra, A. Nguyen, YM Rhee, D. Russel, M. Saha, G. Sanchez-Ante, S. Saxonov, S. Schmidler, J. Shapiro, J. Shin, P. Shirvani, M. Shirts, C. Snow, C. Yu, B. Zagrovic, A. Zomorodian
Staff: I. Contreras, P. Cook, J. Engelson, K. Hedjasi, J. McCormick, H. Nguyen, N. Riewerts, D. Shankle
Friends and family

Thank you!