1
Stochastic Roadmap Simulation: An efficient representation and algorithm for analyzing molecular motion. Mehmet Serkan Apaydın, May 27th, 2004
2
Molecular motion is an essential process of life. [Figures: Stanford Bio-X cluster; an NMR spectrometer (CS273); Bovine Spongiform Encephalopathy (BSE), http://www.usd.edu/eric/; protein (mis)folding; drug molecules act by binding to proteins, http://www.the-scientist.com; ligand-protein binding]
3
Computing p_fold, the best order parameter in protein folding, is expensive using classical simulation techniques. [Figure: from a given conformation, the folded set is reached first with probability p_fold and the unfolded set with probability 1 - p_fold; HIV integrase [Du et al. '98]] "We stress that we do not suggest using p_fold as a transition coordinate for practical purposes as it is very computationally intensive." Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding," Journal of Chemical Physics (1998).
4
Stochastic Roadmap Simulation (SRS): Develop efficient computational representations and algorithms to study molecular motion pathways for protein folding and ligand-protein binding
5
Contributions
New computational framework for studying molecular motion
–Transition probabilities
–Correspondence to Monte Carlo
–First step analysis
–Extension to non-uniform sampling
Computation of ensemble properties:
–protein folding: p_fold parameter; comparison with Monte Carlo; quantitative predictions of experimental values
–ligand-protein binding: escape time; qualitative predictions about the role of amino acids in the active site of a protein; application to distinguish the catalytic site from a set of potential binding sites
6
Outline Background Stochastic Roadmap Simulation Applications –Protein folding –Ligand-protein binding Extension of basic framework Quantitative prediction of experimental results on protein folding
7
Proteins and their structure Macromolecule Building block of life.
8
Ligand-Protein Binding
9
Simulating molecular motion Monte Carlo (MC) or Molecular Dynamics http://folding.stanford.edu
10
Molecular Representations Atomistic model Linkage model –Internal parameter representation (bond angles, lengths, torsional angles) –Each secondary structure element as a vector [Lotan `04]
11
Analogy with Robotics [Figure: articulated linkage with three joints and coordinate axes X0, Y0, X1, X2, X3]
12
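Molecular Energetics: $E = E_S + E_\theta + E_{S-B} + E_{Tor} + E_{vdW} + E_{dipole}$, where $E_S$, $E_\theta$, $E_{S-B}$, and $E_{Tor}$ are the bonded terms and $E_{vdW}$ and $E_{dipole}$ are the non-bonded terms. Energy models: force fields, Gō models, Hydrophobic-Polar models (cs273)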
13
MC simulation
15
Problems with Monte Carlo Simulation Each run generates a single pathway Much time is wasted in local minima
16
A path planning technique: Probabilistic Roadmaps (PRM) [Kavraki et al. '96] Preprocessing: build a roadmap of nodes and edges in the configuration space, avoiding C-obstacles. Query: connect Qinit and Qgoal through the roadmap.
17
Application of PRM to molecular motion Study of ligand-protein binding Probabilistic roadmaps with edges weighted by energetic plausibility Search for the minimum weight paths [Singh, Latombe, Brutlag, `99]
18
Application of PRM to molecular motion Study of ligand-protein binding Probabilistic roadmaps with edges weighted by energetic plausibility Search for the minimum weight paths Extensions to protein folding [Singh, Latombe, Brutlag, `99] [Song and Amato, `01] [Apaydın et al., `01]
19
How many pathways are there in a roadmap? Number of Self-Avoiding Walks on a 2D Grid (http://mathworld.wolfram.com/Self-AvoidingWalk.html)
n/m      2      3      4      5        6
2        2
3        4     12
4        8     38    184
5       16    125    976   8512
6       32    414   5382  79384  1262816
Square grids (1x1, 2x2, ...): 1, 2, 12, 184, 8512, 1262816, 575780564, 789360053252, 3266598486981642, 41044208702632496804 (10x10), 1568758030464750013214100 (11x11), 182413291514248049241470885236 (12x12)
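The counts on this slide can be reproduced for small grids by brute force. The sketch below is an illustration only, not part of the talk: it assumes the table counts self-avoiding walks between opposite corners of a grid of points, and the function name `count_corner_walks` is hypothetical.

```python
def count_corner_walks(rows: int, cols: int) -> int:
    """Count self-avoiding walks between opposite corners of a rows x cols grid of points."""
    goal = (rows - 1, cols - 1)
    visited = {(0, 0)}

    def extend(r: int, c: int) -> int:
        # A walk ends as soon as it reaches the goal corner.
        if (r, c) == goal:
            return 1
        total = 0
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                visited.add((nr, nc))
                total += extend(nr, nc)
                visited.remove((nr, nc))  # backtrack
        return total

    return extend(0, 0)


# Square grids from 1x1 to 5x5; compare with the first terms listed on the slide.
square_counts = [count_corner_walks(n, n) for n in range(1, 6)]
print(square_counts)
```

The number of walks grows so quickly that enumerating pathways one at a time becomes hopeless even for modest roadmaps, which is the point the slide makes.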
20
Outline Background Stochastic Roadmap Simulation Applications –Protein folding –Ligand-protein binding Future work
21
New Idea: Stochastic Conformational Roadmaps. Capture the stochastic nature of molecular motion by assigning probabilities to edges: the edge from node v_i to node v_j carries transition probability P_ij. [Apaydın et al., RECOMB '02, WAFR '02] Collaborators: C. Guestrin, D. Hsu
22
Edge probabilities: the transition probabilities P_ij between nodes v_i and v_j, and the self-transition probabilities P_ii, follow the Metropolis criterion and correspond to the probabilities in Monte Carlo simulation.
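The formulas on this slide did not survive extraction. As a minimal sketch, assuming the common Metropolis form in which a move from node i to one of its N_i neighbors is proposed uniformly and accepted with probability min(1, exp(-(E_j - E_i)/kT)), the edge probabilities could be built as follows. The function name, the `kT` parameter, and the toy energies are assumptions for illustration, not the author's code.

```python
import math


def transition_probabilities(energies, neighbors, kT=1.0):
    """Return P as a list of dicts, where P[i][j] is the probability of stepping i -> j.

    Assumed form: uniform proposal over the neighbors of i, Metropolis acceptance.
    The self-transition P[i][i] collects the leftover (rejected) probability.
    """
    P = []
    for i, nbrs in enumerate(neighbors):
        row = {}
        for j in nbrs:
            dE = energies[j] - energies[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)
            row[j] = accept / len(nbrs)
        row[i] = 1.0 - sum(row.values())
        P.append(row)
    return P


# Toy roadmap: three fully connected nodes with made-up energies.
energies = [0.0, 1.0, 0.5]
neighbors = [[1, 2], [0, 2], [0, 1]]
P = transition_probabilities(energies, neighbors)
for i, row in enumerate(P):
    print(i, {j: round(p, 3) for j, p in sorted(row.items())})
```

Each row sums to one, so the roadmap can be treated directly as a Markov chain.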
23
Relationship to MC simulation: Each path on the graph = a path of MC simulation. The roadmap represents many MC simulation paths simultaneously. Stochastic Roadmap Simulation and Monte Carlo Simulation converge to the same distribution (the Boltzmann distribution).
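The convergence claim can be checked numerically on a toy roadmap. The sketch below assumes Metropolis edge probabilities with a uniform proposal and a ring-shaped roadmap in which every node has exactly two neighbors, so detailed balance with respect to the Boltzmann weights holds exactly. The energies, the ring topology, and the function names are illustrative assumptions.

```python
import math


def metropolis_matrix(energies, neighbors, kT=1.0):
    """Dense transition matrix with uniform proposals and Metropolis acceptance (assumed form)."""
    n = len(energies)
    P = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            dE = energies[j] - energies[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)
            P[i][j] = accept / len(nbrs)
        P[i][i] = 1.0 - sum(P[i])  # rejected moves stay at node i
    return P


def stationary_distribution(P, steps=5000):
    """Propagate a uniform start distribution through the chain for a fixed number of steps."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


energies = [0.0, 1.0, 0.3, 2.0, 0.7]                    # made-up node energies, kT = 1
ring = [[(i - 1) % 5, (i + 1) % 5] for i in range(5)]   # every node has two neighbors
pi = stationary_distribution(metropolis_matrix(energies, ring))

Z = sum(math.exp(-e) for e in energies)
boltzmann = [math.exp(-e) / Z for e in energies]
max_gap = max(abs(a - b) for a, b in zip(pi, boltzmann))
print(max_gap)
```

On roadmaps where nodes have different numbers of neighbors, this simple proposal no longer satisfies detailed balance exactly, so the check above should be read as a best-case illustration.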
24
Using SRS to compute ensemble properties: Treat the roadmap, with its edge probabilities P_ij, as a Markov chain and use First-Step Analysis
25
Application of SRS to protein folding: Probability of Folding p_fold. [Figure: from a given conformation, the folded set is reached first with probability p_fold and the unfolded set with probability 1 - p_fold; HIV integrase [Du et al. '98]] "We stress that we do not suggest using p_fold as a transition coordinate for practical purposes as it is very computationally intensive." Du, Pande, Grosberg, Tanaka, and Shakhnovich, "On the Transition Coordinate for Protein Folding," Journal of Chemical Physics (1998).
26
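First-Step Analysis. F: folded set; U: unfolded set. [Figure: node i with self-transition probability P_ii and neighbors j, k, l, m reached with probabilities P_ij, P_ik, P_il, P_im] Let $f_i = p_{fold}(i)$. After one step: $f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$, with $f = 1$ for nodes in F and $f = 0$ for nodes in U. One linear equation per node. The solution gives p_fold for all nodes. No explicit simulation run. All pathways are taken into account. Sparse linear system.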
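The first-step equations can be solved without any simulation run. The sketch below uses plain Gauss-Seidel sweeps over a sparse row representation. The solver choice, the function name `compute_p_fold`, and the five-node toy chain are assumptions for illustration; the talk does not say which linear solver was used.

```python
def compute_p_fold(P, folded, unfolded, tol=1e-12, max_sweeps=100_000):
    """Solve f_i = sum_j P[i][j] * f_j with f = 1 on the folded set and f = 0 on the unfolded set.

    P is a list of dicts (sparse rows). Assumes every free node has P[i][i] < 1.
    """
    n = len(P)
    f = [1.0 if i in folded else 0.0 for i in range(n)]
    free = [i for i in range(n) if i not in folded and i not in unfolded]
    for _ in range(max_sweeps):
        largest_change = 0.0
        for i in free:
            # Rearranged first-step equation: f_i * (1 - P_ii) = sum over j != i of P_ij * f_j
            inflow = sum(p * f[j] for j, p in P[i].items() if j != i)
            new_value = inflow / (1.0 - P[i].get(i, 0.0))
            largest_change = max(largest_change, abs(new_value - f[i]))
            f[i] = new_value
        if largest_change < tol:
            break
    return f


# Toy chain 0 - 1 - 2 - 3 - 4: node 0 is unfolded, node 4 is folded.
chain = [{0: 1.0}] + [{i - 1: 0.25, i: 0.5, i + 1: 0.25} for i in (1, 2, 3)] + [{4: 1.0}]
f = compute_p_fold(chain, folded={4}, unfolded={0})
print([round(x, 6) for x in f])
```

For this symmetric chain the exact answer is f_i = i/4, which gives a quick sanity check. One solve yields p_fold for every node at once.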
27
In Contrast … Computing p_fold with MC simulation requires: performing many MC simulation runs from every conformation of interest, and counting the number of times F is attained first, so that p_fold is estimated as (runs that reach F first) / (total runs).
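For comparison, the counting procedure for a single start conformation might look like the sketch below, which runs random walks on a toy chain until they hit the folded or the unfolded set. The function name, run count, seed, and toy chain are illustrative assumptions.

```python
import random


def estimate_p_fold_mc(P, start, folded, unfolded, runs=4000, seed=0):
    """Estimate p_fold(start) as the fraction of runs that reach the folded set first."""
    rng = random.Random(seed)
    reached_folded_first = 0
    for _ in range(runs):
        state = start
        while state not in folded and state not in unfolded:
            targets = list(P[state].keys())
            weights = list(P[state].values())
            state = rng.choices(targets, weights=weights)[0]
        reached_folded_first += state in folded
    return reached_folded_first / runs


# Toy chain 0 - 1 - 2 - 3 - 4: node 0 is unfolded, node 4 is folded.
chain = [{0: 1.0}] + [{i - 1: 0.25, i: 0.5, i + 1: 0.25} for i in (1, 2, 3)] + [{4: 1.0}]
estimate = estimate_p_fold_mc(chain, start=2, folded={4}, unfolded={0})
print(estimate)
```

The estimate is noisy and covers only one start node, so the whole procedure has to be repeated for every conformation of interest.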
28
Comparison: SRS vs. MC (on synthetic landscape) [Plot axes: number of nodes, L1 distance]
29
Computational tests on two real proteins: 1ROP (repressor of primer), 2 helices, 6 DOF; 1HDD (Engrailed homeodomain), 3 helices, 12 DOF. H-P energy model with steric clash exclusion [Sun et al., '95]
30
Differences in p_fold values obtained by SRS and MC for 1ROP and 1HDD [Plot axes: number of nodes, L1 distance]
31
p_fold on a real protein: β hairpin. Immunoglobulin binding protein (Protein G), last 16 amino acids. C-α based representation. Gō model based energy. 42 DOFs [Zhou and Karplus, '99]
32
Comparison between SRS and MC for β hairpin [Plot axes: number of nodes, L1 distance]
33
Computation Times (β hairpin). Monte Carlo (30 simulations): 1 conformation, ~10 hours of computer time, over 10^7 energy computations. Roadmap: 2000 conformations, 23 seconds of computer time, ~50,000 energy computations. ~6 orders of magnitude speedup!
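The quoted speedup follows from the per-conformation cost on each side. The short calculation below uses only the figures on this slide; treating "~10 hours" as exactly 10 hours is an approximation.

```python
import math

# Monte Carlo: about 10 hours of computer time for 1 conformation.
mc_seconds_per_conformation = 10 * 3600 / 1

# Roadmap: 23 seconds of computer time for 2000 conformations.
roadmap_seconds_per_conformation = 23 / 2000

speedup = mc_seconds_per_conformation / roadmap_seconds_per_conformation
orders_of_magnitude = math.log10(speedup)
print(f"speedup ~ {speedup:.3g}, about {orders_of_magnitude:.1f} orders of magnitude")
```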
34
Outline Background Stochastic Roadmap Simulation Applications –Protein folding –Ligand-protein binding Extension of basic framework Quantitative prediction of experimental results on protein folding
35
Application of SRS to Ligand-Protein Interactions. Distinguishing the catalytic site: among several potential binding sites, which one is the catalytic site? Studying the effect of catalytic amino acids upon binding/unbinding. [Apaydın et al., ECCB '02] Collaborators: C. Guestrin, C. Varma
36
Funnels of attraction and escape time from a funnel. Funnel = energy gradient around a potential binding site that guides the ligand to that site, defined as all ligand conformations within 10 Å RMSD of the site [Camacho and Vajda '01]. Computation of escape time from the funnels of attraction around potential binding sites.
37
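Computing Escape Time with Roadmap. [Figure: node i inside the funnel of attraction, with self-transition probability P_ii and neighbors j, k, l, m reached with probabilities P_ij, P_ik, P_il, P_im] $\tau_i = 1 + P_{ii}\tau_i + P_{ij}\tau_j + P_{ik}\tau_k + P_{il}\tau_l + P_{im}\tau_m$, with $\tau = 0$ for nodes outside the funnel (escape time is measured as number of steps of stochastic simulation).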
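The escape-time equations have the same structure as the p_fold system, with a constant 1 added for each step taken. The sketch below solves them by Gauss-Seidel sweeps on a two-node toy funnel. The symbol tau, the function name `escape_times`, the solver choice, and the toy probabilities are all illustrative assumptions.

```python
def escape_times(P, funnel, tol=1e-12, max_sweeps=100_000):
    """Solve tau_i = 1 + sum_j P[i][j] * tau_j for nodes in the funnel, with tau = 0 outside it.

    P maps each funnel node to a dict of transition probabilities.
    Assumes every funnel node has P[i][i] < 1 and can eventually leave the funnel.
    """
    tau = {i: 0.0 for i in funnel}
    for _ in range(max_sweeps):
        largest_change = 0.0
        for i in funnel:
            # Nodes outside the funnel contribute tau = 0, so only funnel neighbors are summed.
            inflow = sum(p * tau[j] for j, p in P[i].items() if j != i and j in tau)
            new_value = (1.0 + inflow) / (1.0 - P[i].get(i, 0.0))
            largest_change = max(largest_change, abs(new_value - tau[i]))
            tau[i] = new_value
        if largest_change < tol:
            break
    return tau


# Toy funnel with nodes 0 and 1; node 2 lies outside the funnel.
P = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.25, 2: 0.5},
}
tau = escape_times(P, funnel=[0, 1])
print({i: round(t, 6) for i, t in tau.items()})
```

Solving the two equations by hand gives tau_0 = 5 and tau_1 = 3 steps, which the iteration should reproduce.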
38
Results on lactate dehydrogenase [Figure: active-site diagram showing the ligand (C, O, CH3 atoms), NADH, the loop, and residues GLN-101, ARG-106, ASP-195, HIS-193, ASP-166, ARG-169, THR-245]
Mutant      Change   Escape Time
Wildtype    N/A      3.216E6
39
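Results on lactate dehydrogenase [Figure: active-site diagram of the double mutant showing the ligand (C, O, CH3 atoms), NADH, the loop, and residues GLN-101, ALA-106, ASP-195, ALA-193, ASP-166, ARG-169]
Mutant                     Escape Time
Wildtype                   3.216E6
His193→Ala, Arg106→Ala     4.126E2
Change column: N/A for the wildtype.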
40
Results on lactate dehydrogenase [Figure: active-site diagram showing the ligand (C, O, CH3 atoms), NADH, the loop, and residues GLN-101, ARG-106, ASP-195, HIS-193, ASP-166, ARG-169, GLY-245]
Mutant                     Escape Time
Wildtype                   3.216E6
His193→Ala, Arg106→Ala     4.126E2
His193→Ala                 3.381E3
Arg106→Ala                 2.550E2
Asp195→Asn                 5.221E7
Gln101→Arg                 1.669E6
Thr245→Gly                 4.607E5
Change column: N/A for the wildtype; one mutant is marked "No change".
41
Outline Background Stochastic Roadmap Simulation Applications –Protein folding –Ligand-protein binding Extension of basic framework Quantitative prediction of experimental results on protein folding
42
A non-uniform sampling strategy: sampling local minima and saddles of the landscape [Henkelman, Jonsson '99]
43
Adding critical points to the roadmap achieves the same quality of p_fold values with fewer nodes [Plot axes: number of nodes, L1 distance]
44
Outline Background Stochastic Roadmap Simulation Applications –Protein folding –Ligand-protein binding Extension of basic framework Quantitative prediction of experimental results on protein folding
45
Using p_fold to make quantitative predictions. Connecting theory with experiment: –Rates –Φ values [Fersht '99]. Transition State computation using: –Energy barriers considering monotonic pathways [Garbuzynskiy, Finkelstein, Galzitskaya '04] –p_fold considering all pathways. Collaborators: TH Chiang, D. Hsu (N.U. Singapore)
46
Φ Value Results using p_fold are better for 3 (out of 5) proteins
Correlation to experiment:
Protein                               [Garbuzynskiy et al., '04]   with p_fold
B1 IgG-binding domain of protein G    0.74                         0.78
Src SH3 domain                        0.63                         0.65
SH3 domain of α-spectrin              0.81                         0.78
Sso7d                                 0.58                         0.28
CI2                                   0.35                         0.51
47
Computing rates with p_fold results in better correlation with experiment. [Plots: experimental rate and computed rate, log(k_f) vs. protein number. [Garbuzynskiy et al., '04]: correlation 0.67; using p_fold: correlation 0.83]
48
Contributions
New computational framework for studying molecular motion
–Transition probabilities
–Correspondence to Monte Carlo
–First step analysis
–Extension to non-uniform sampling
Computation of ensemble properties:
–protein folding: p_fold parameter; comparison with Monte Carlo; quantitative predictions of experimental values
–ligand-protein binding: escape time; qualitative predictions about the role of amino acids in the active site of a protein; application to distinguish the catalytic site from a set of potential binding sites
49
Future work
Non-uniform sampling on high-dimensional examples
Computing and reducing the error in the computed parameters
Estimating the number of nodes needed
Exploring larger systems and pushing the experiment
[Figure: roadmap with nodes q1, q2, q3, q4, q5]
50
SRS code available! Visit: http://robotics.stanford.edu/~apaydin/software.html
51
Acknowledgements
My advisors: Prof. Latombe, Prof. Brutlag Prof. Van Roy Prof. McCluskey
My committee: Prof. Motwani, Prof. Vuckovic
Coauthors: D. Hsu, C. Guestrin, S. Kasif, A. Singh, C. Varma
Collaborators: TH Chiang, J. Greenberg, S. Ieong, F. Schwarzer, R. Singh, A. Tellez
Faculty: Prof. Altman, Prof. Baldwin, Prof. Guibas, Prof. Pande, Prof. Kavraki (Rice), Prof. Zell (Tuebingen), Prof. Snoeyink (UNC)
Funding: David L. Cheriton Stanford Graduate Fellowship, NSF Biogeometry grant, Stanford's Bio-X program
Resources: Bio-X SGI Supercomputer, Bio-X PC computer cluster
Colleagues: N. Batada, A. Ben-Hur, S. Bennett, E. Boas, T. Bretl, J. Brown, F. Buron, L. Chong, A. Collins, S. Elmer, P. Fong, A. Garg, S. Gokturk, H. Gonzales-Banos, K. Hauser, G. Henkelman, P. Isto, G. Jayachandran, J. Kuffner, S. Larson, M. Liang, B. Naughton, X. Liu, I. Lotan, H. Mandyam, N. Mitra, S. Mitra, A. Nguyen, YM Rhee, D. Russel, M. Saha, G. Sanchez-Ante, S. Saxonov, S. Schmidler, J. Shapiro, J. Shin, P. Shirvani, M. Shirts, C. Snow, C. Yu, B. Zagrovic, A. Zomorodian
Staff: I. Contreras, P. Cook, J. Engelson, K. Hedjasi, J. McCormick, H. Nguyen, N. Riewerts, D. Shankle
Friends and family
52
Thank you!