Using Motion Planning to Map Protein Folding Landscapes


Using Motion Planning to Map Protein Folding Landscapes
Nancy M. Amato
Parasol Lab, Texas A&M University

Paper Folding via Motion Planning
Polyhedron: 25 dof (10 samples, 2 sec)
Soccer ball: 31 dof (10 samples, 6 sec)
Box: 12 (5) dof (218 samples, 3 sec)
Periscope: 11 dof (450 samples, 6 sec)

Protein Folding via Motion Planning
Folding paths for proteins G and L
[Figure panels: protein G; protein L]

Protein Folding
Sequence: TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN
We are interested in the folding process: how the protein folds to its native structure.
This is different from protein structure prediction, which predicts the native structure given the amino acid sequence.
The native 3D structure is important because it influences function.

Why Study Folding Pathways?
Importance of studying pathways:
  Insight into protein interactions and function
  May lead to better structure prediction algorithms
  Diseases such as Alzheimer's and mad cow disease are related to misfolded proteins
Computational techniques are critical:
  Folding is hard to study experimentally (it happens too fast)
  Folding can be studied for thousands of already solved structures
  Results can help guide and design future experiments
[Figure: normal and misfolded prion protein]

Folding Landscapes
Each conformation has a potential energy.
The native state is the global minimum.
The set of all conformations forms the landscape.
The shape of the landscape reflects folding behavior.
Different proteins have different landscapes and therefore different folding behaviors.
[Figure: potential over configuration space, with the native state marked]

Using Motion Planning to Map Folding Landscapes [RECOMB 01, 02, 04; PSB 03]
Use the Probabilistic Roadmap (PRM) method from motion planning to build a roadmap.
The roadmap approximates the folding landscape:
  It characterizes the main features of the landscape
  Multiple folding pathways can be extracted from the roadmap
  Population kinetics can be computed for the roadmap
[Figure: potential over configuration space, with a conformation and the native state marked]
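
As a rough illustration of how folding pathways might be read off a roadmap, the sketch below runs Dijkstra's algorithm outward from the native state on a small weighted graph, then follows parent links from another conformation back to the native state. The toy roadmap, its node names and weights, and the choice of Dijkstra as the extraction method are assumptions made for illustration; the slides do not specify them.

```python
import heapq

def shortest_path_tree(roadmap, source):
    """Dijkstra from `source`; returns (dist, parent) dictionaries."""
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in roadmap.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def folding_path(parent, start):
    """Follow parent links from `start` to the root of the tree (the native state)."""
    path = []
    node = start
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

# Hypothetical toy roadmap: node -> {neighbor: edge weight}, lower weight = more feasible.
ROADMAP = {
    "unfolded": {"a": 4.0, "b": 1.0},
    "a": {"unfolded": 4.0, "native": 1.0},
    "b": {"unfolded": 1.0, "c": 1.0},
    "c": {"b": 1.0, "native": 1.0},
    "native": {"a": 1.0, "c": 1.0},
}

dist, parent = shortest_path_tree(ROADMAP, "native")
print(folding_path(parent, "unfolded"))  # ['unfolded', 'b', 'c', 'native']
```

Because the tree is rooted at the native state, one run yields a path from every reachable roadmap node, which is one simple way to obtain many pathways from a single roadmap.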

Related Work
Methods compared: Molecular Dynamics, Monte Carlo, Statistical Model, and our PRM approach (RECOMB 01, 02, 04; PSB 03).
Comparison criteria: folding landscape; trajectory (number of paths); path quality; time dependence (running time); folding kinetics; native state needed.
Table cells in slide order: Molecular Dynamics | No | Yes (1) | good | Yes (very long) | Monte Carlo | Yes | Statistical Model | N/A | No (short) | Yes (only average) | Our PRM approach | (many) | approximate | Yes, multiple kinetics
Other PRM-based approaches for studying molecular motions:
  Other work on protein folding [Apaydin et al., ICRA'01, RECOMB'02]
  Ligand binding [Singh, Latombe, Brutlag, ISMB'99], [Bayazit, Song, Amato, ICRA'01]
  RNA folding [Tang, Kirkpatrick, Thomas, Song, Amato, RECOMB'04]
Speaker note: Time dependent, ok?

Modeling Proteins
Primary structure: TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN (each letter is one amino acid)
Secondary structure: α helix, β sheet
Tertiary structure: α helices + β sheets + variable loops
We model an amino acid with 2 torsional degrees of freedom, which is standard practice among biochemists.

Roadmap Construction: Node Generation
Sample using the known native state:
  Sample around it and gradually grow out
  Generate conformations by randomly selecting phi/psi angles
Criterion for accepting a node:
  Compute the potential energy E of each node and retain it with a probability that depends on E [formula not reproduced in the transcript]
This gives a denser distribution of nodes around the native state N.
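
The node generation step can be sketched roughly as follows. Everything specific here is an assumption for illustration: the acceptance rule (always keep below `e_min`, never keep above `e_max`, linear in between), the thresholds, the Gaussian spreads, the toy native angles, and the placeholder energy function. The slide's actual acceptance formula is not in the transcript, so this should not be read as the authors' rule.

```python
import random

def acceptance_probability(energy, e_min, e_max):
    """Assumed rule: keep if energy < e_min, reject if > e_max, linear ramp between."""
    if e_max <= e_min:
        raise ValueError("e_max must exceed e_min")
    if energy < e_min:
        return 1.0
    if energy > e_max:
        return 0.0
    return (e_max - energy) / (e_max - e_min)

def sample_nodes(native, energy_fn, sigmas, per_level, e_min, e_max, rng):
    """Perturb the native phi/psi angles with growing spread and filter by energy."""
    nodes = [list(native)]
    for sigma in sigmas:  # increasing sigma = gradually growing out from the native state
        for _ in range(per_level):
            # Angles are in degrees and are not wrapped to [-180, 180), for brevity.
            conf = [angle + rng.gauss(0.0, sigma) for angle in native]
            if rng.random() < acceptance_probability(energy_fn(conf), e_min, e_max):
                nodes.append(conf)
    return nodes

# Hypothetical native state: (phi, psi) pairs for three residues, flattened.
NATIVE = [-60.0, -45.0, -60.0, -45.0, -120.0, 130.0]

def toy_energy(conf):
    """Placeholder energy: mean squared deviation from NATIVE (not a real potential)."""
    return sum((a - b) ** 2 for a, b in zip(conf, NATIVE)) / len(NATIVE)

nodes = sample_nodes(NATIVE, toy_energy, [5.0, 10.0, 20.0, 40.0], 200,
                     100.0, 1000.0, random.Random(0))
print(len(nodes), "nodes kept")
```

With a rule of this shape, low-energy conformations near the native state are almost always kept while high-energy ones far away are mostly discarded, which reproduces the denser distribution around the native state described above.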

Ramachandran Plots for Different Sampling Techniques
[Figure panels: uniform sampling; Gaussian sampling; iterative Gaussian sampling]

Distributions for different types: Potential Energy vs. RMSD for roadmap nodes
[Figure panels: all alpha; alpha + beta; all beta]
Speaker note: say 'landscape sampled by our methods'.

Roadmap Construction: Node Connection
1. Find the k closest nodes for each roadmap node (k = 20), using Euclidean distance.
2. Assign each edge (u, v) a weight that reflects its energetic feasibility, based on the intermediate conformations c1, c2, c3, …, cn between u and v:
   w(u,v) = f(E(c1), E(c2), …, E(cn))
   Lower weight means more feasible.
[Figure: roadmap nodes connected toward the native state]
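
The two connection steps above can be sketched as below. The slide leaves the function f unspecified, so the choice here (the sum of uphill energy steps across the intermediate conformations) is purely an assumption, as are the linear interpolation between u and v, the brute-force neighbor search, and the toy energy used in the example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two conformations given as angle vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def interpolate(u, v, steps):
    """Intermediate conformations c1..cn strictly between u and v (linear interpolation)."""
    return [[a + (b - a) * i / (steps + 1) for a, b in zip(u, v)]
            for i in range(1, steps + 1)]

def edge_weight(u, v, energy_fn, steps):
    """Assumed f: total uphill energy change along c1..cn; lower = more feasible."""
    energies = [energy_fn(c) for c in interpolate(u, v, steps)]
    return sum(max(0.0, e2 - e1) for e1, e2 in zip(energies, energies[1:]))

def connect(nodes, energy_fn, k=20, steps=5):
    """Return {i: {j: weight}} with a directed edge from each node to its k nearest nodes."""
    edges = {i: {} for i in range(len(nodes))}
    for i, u in enumerate(nodes):
        others = sorted((j for j in range(len(nodes)) if j != i),
                        key=lambda j: euclidean(u, nodes[j]))
        for j in others[:k]:
            edges[i][j] = edge_weight(u, nodes[j], energy_fn, steps)
    return edges

def toy_energy(conf):
    """Placeholder energy: squared distance from the origin (not a real potential)."""
    return sum(a * a for a in conf)

nodes = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]]
roadmap = connect(nodes, toy_energy, k=1, steps=3)
print(roadmap)
```

Note that with this assumed f the weight from u to v generally differs from the weight from v to u, because moving uphill in energy is penalized and moving downhill is not; that is why the sketch stores directed edges.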

PRMs for Protein Folding: Key Issues
Energy functions:
  How accurately the roadmap reflects the folding landscape depends on the quality of the energy calculation.
  We use our own coarse potential (fast) and a well-known all-atom potential (slow).
Validation:
  In [ICRA'01, RECOMB'01, JCB'02], results were validated against experimental results [Li & Woodward 1999].

One Folding Path of Protein A
B domain of staphylococcal protein A
[Figure: movie of a folding path, shown as a ribbon model and a space-fill model]
A nice movie... but so what?

Roadmap Analysis: Secondary Structure Formation Order [RECOMB'01, JCB'02, RECOMB'02, JCB'03, PSB'03]
The order in which secondary structure forms during folding.
[Figure: structure with hairpin 1,2 and helix labeled]
Q: Which forms first?

Formation Time Calculation
A secondary structure has formed when x% of its native contacts are present.
Native contact: less than 7 Å between Cα atoms in the native state.
Example: five native contacts form at time steps 10, 30, 20, 40, and 50. If we pick x = 60%, then at time step 30 three contacts are present and the structure is considered formed.
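
A minimal sketch of this calculation, using the slide's own numbers as the example. The function name and the use of an integer percentage are illustrative choices, not taken from the slides.

```python
def formation_time(contact_times, percent):
    """Earliest time step at which `percent`% of the native contacts have formed.

    contact_times: time step at which each native contact forms.
    percent: integer threshold x, e.g. 60 for 60%.
    """
    if not contact_times:
        raise ValueError("need at least one native contact")
    n = len(contact_times)
    # Integer ceiling of percent% of n, with at least one contact required.
    needed = max(1, (percent * n + 99) // 100)
    return sorted(contact_times)[needed - 1]

print(formation_time([10, 30, 20, 40, 50], 60))  # 30
```

Sorting the contact times and taking the entry at the required count gives the first time step at which enough contacts are present, assuming a contact stays formed once it appears.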

Contact Map
A contact map is a triangular matrix that identifies all the native contacts among residues.
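
One way such a contact map could be computed from native-state Cα coordinates is sketched below, using the 7 Å cutoff given for native contacts in this talk. The coordinates are made up, and the `min_separation` parameter (for optionally skipping residues that are close in sequence) is a hypothetical addition that the slides do not mention.

```python
import math

def native_contact_map(ca_coords, cutoff=7.0, min_separation=1):
    """Upper-triangular boolean matrix: entry [i][j] (i < j) is True when the
    C-alpha atoms of residues i and j are less than `cutoff` angstroms apart."""
    n = len(ca_coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                cmap[i][j] = True
    return cmap

def contact_pairs(cmap):
    """List the (i, j) residue index pairs that are in contact."""
    return [(i, j) for i, row in enumerate(cmap) for j, hit in enumerate(row) if hit]

# Made-up C-alpha coordinates (angstroms) for four residues.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 5.0, 0.0)]
cmap = native_contact_map(coords)
print(contact_pairs(cmap))  # [(0, 1), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Only the upper triangle is filled because the map is symmetric; residues 0 and 2 are 7.6 Å apart and so are not in contact.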

Contact Maps

Secondary Structure Formation Order: Timed Contact Map of a Path [JCB'02]
Protein G (domain B1)
[Figure: timed contact map; both axes are residue number; contact regions α, β1-2, β3-4, and β1-4 are annotated with the time steps at which contacts form]
Formation order: α, β3-4, β1-2, β1-4
Average T = 142
Speaker notes: 1. Explain the contact map; X and Y are residue numbers. 2. Regions of contacts. 3. How time is calculated.

Secondary Structure Formation Order: Validation
Sample summary (PDB, number of residues, number of orders, % of paths per secondary structure formation order, comparison with experiment):
1GB1 (56 residues, 2 orders):
  66%: α, β3-4, β1-2, β1-4
  34%: α, β1-2, β3-4, β1-4
  Exp.: Agreed
1BDD (60 residues, 1 order):
  100%: α2, α3, α1, α2-3, α1-3
1COA (64 residues):
  90%: α, β3-4, β2-3, β1-4, α-β4
  10%: α, β3-4, β2-3, α-β4, β1-4
2AIT (74 residues):
  9.1%: β4-5, β1-2 …
  7.4%: β1-2, β4-5 …
1UBQ (76 residues, 3 orders):
  80%: α, β3-4, β1-2, β3-5, β1-5
  15%: β3-4, α, β1-2, β3-5, β1-5
1BRN (110 residues, 4 orders):
  75%: α1, α2, α3 …
  8.3%: α1, α3, α2 …
  Exp.: Not sure

Detailed Study of Proteins G & L [PSB'03]
Protein G and protein L:
  Similar structure (one helix and two β-hairpins), but only 15% sequence identity
  They fold differently:
    Protein G: helix, beta 3-4, beta 1-2, beta 1-4 [Kuszewski et al. 1994, Orban et al. 1995]
    Protein L: helix, beta 1-2, beta 3-4, beta 1-4 [Yi & Baker 1996, Yi et al. 1997]
Can our approach detect the difference? Yes!
  75% of protein G paths and 80% of protein L paths have the "right" order.
  This increases to 90% and 100%, respectively, when the all-atom potential is used.

Helix and Beta Strands: Coarse Potential [PSB'03]
Protein G: β3-β4 forms first (over 2k paths analyzed)
Protein L: β1-β2 forms first (over 2k paths analyzed)
[Figures: one per protein, with strands β1 to β4 labeled]

Helix and Beta Strands: All-atom Potential
Protein G (β3-β4 forms first). Values for each secondary structure (SS) formation order when analyzing the first x% of contacts:

Contacts      SS formation order         x=20   40   60   80   100
all           α, β3-β4, β1-β2, β1-β4       79   79   74   82    90
all           α, β1-β2, β3-β4, β1-β4       21   21   26   18    10
hydrophobic   α, β3-β4, β1-β2, β1-β4       77   74   71   77    81
hydrophobic   α, β1-β2, β3-β4, β1-β4       23   26   29   23    19

Protein L (β1-β2 forms first). Same table layout (contacts: all, hydrophobic; first x% of contacts: 20 to 100); the surviving entries are an order beginning α, β1-β2 and the values 99 and 1.
[Figures: one per protein, with strands β1 to β4 labeled]

Summary: PRM-Based Protein Folding
PRM roadmaps approximate energy landscapes.
They efficiently produce multiple folding pathways:
  Secondary structure formation order (e.g. proteins G and L)
  Better than trajectory-based simulation methods such as Monte Carlo and molecular dynamics
They provide a good way to study folding kinetics:
  Multiple folding kinetics in the same landscape (roadmap)
  A natural way to study the statistical behavior of folding
  More realistic than statistical models (e.g. lattice models, Baker's model [PNAS'99], Munoz's model [PNAS'99])

RNA Folding Results
X. Tang, B. Kirkpatrick, S. Thomas, G. Song [RECOMB'04]
The RNA energy landscape can be completely described by huge roadmaps.
Heuristics are used to approximate the energy landscape using small roadmaps.
Our roadmaps contain many folding pathways.
Population kinetics analysis on the roadmaps shows that heuristic 1 can efficiently describe the energy landscape using a small subset of nodes:
  Map 1 (complete): 142 nodes
  Map 2 (heuristic 1): 15 nodes
  Map 3 (heuristic 2): 33 nodes
[Figures: energy profile vs. folding steps; population vs. folding steps for each of the three maps]

Ligand Binding [IEEE ICRA'01]
Docking: find a configuration of the ligand near the protein that satisfies geometric, electrostatic, and chemical constraints.
PRM approach (Singh, Latombe, Brutlag, 1999):
  Rapidly explores the high-dimensional space
  We use OBPRM, which is better suited for generating conformations in the binding site (near the protein surface)
Haptic user interaction:
  Haptics (sense of touch) helps the user understand molecular interaction
  The user assists the planner by suggesting promising regions, and the planner post-processes and 'improves' them

Contact Information
For more information, check out our website: http://parasol.tamu.edu/~amato/
Credits: my students Guang Song (now a postdoc at Iowa State), Shawna Thomas, and Xinyu Tang; and Ken Dill (UCSF) and Marty Scholtz (Texas A&M)