Using Motion Planning to Map Protein Folding Landscapes


1 Using Motion Planning to Map Protein Folding Landscapes
Nancy M. Amato, Parasol Lab, Texas A&M University

2 Paper Folding via Motion Planning
Polyhedron: 25 dof (10 samples, 2 sec)
Soccer Ball: 31 dof (10 samples, 6 sec)
Box: 12 (5) dof (218 samples, 3 sec)
Periscope: 11 dof (450 samples, 6 sec)

3 Protein Folding via Motion Planning
Folding paths for proteins G & L (figure panels: Protein G, Protein L).

4 Protein Folding
We are interested in the folding process: how the protein folds to its native structure.
Example sequence: TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN
This is different from protein structure prediction, which predicts the native structure given the amino acid sequence.
The native 3D structure is important because it influences function.

5 Why Study Folding Pathways?
Importance of studying pathways:
  Insight into protein interactions and function.
  May lead to better structure prediction algorithms.
  Diseases such as Alzheimer’s and Mad Cow are related to misfolded proteins.
Computational techniques are critical:
  Folding is hard to study experimentally (it happens too fast).
  Folding can be studied for thousands of already solved structures.
  Computation can help guide and design future experiments.
Figure: normal vs. misfolded prion protein.

6 Folding Landscapes
Each conformation has a potential energy.
The native state is the global minimum.
The set of all conformations forms the landscape.
The shape of the landscape reflects folding behavior.
Different proteins → different landscapes → different folding behaviors.
Figure: potential plotted over configuration space, with the native state marked.

7 Using Motion Planning to Map Folding Landscapes [RECOMB 01, 02, 04; PSB 03]
Use the Probabilistic Roadmap (PRM) method from motion planning to build a roadmap.
The roadmap approximates the folding landscape and characterizes its main features.
Multiple folding pathways can be extracted from the roadmap.
Population kinetics can be computed on the roadmap.
Figure: potential over configuration space, with a conformation and the native state marked.
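A minimal sketch of pathway extraction, assuming the roadmap is stored as a weighted adjacency dictionary and that a folding pathway is read off as the lowest-weight path from some conformation to the native state. The function name `extract_pathway`, the toy graph, and the choice of Dijkstra's algorithm are illustrative assumptions, not details given on the slide.

```python
import heapq


def extract_pathway(roadmap, start, native):
    """Return (total_weight, path) for the lowest-weight path from start to native.

    roadmap maps each node to a dict {neighbor: edge_weight}. This layout is an
    assumption made for the sketch, not the authors' data structure.
    """
    best = {start: 0.0}
    parent = {start: None}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == native:
            # Walk the parent links back to the start to recover the pathway.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in roadmap.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    return float("inf"), []  # native state not reachable from start


# Hypothetical four-node roadmap; the weights are made up for illustration.
toy_roadmap = {
    "unfolded": {"a": 2.0, "b": 5.0},
    "a": {"b": 1.0, "native": 6.0},
    "b": {"native": 2.0},
    "native": {},
}

print(extract_pathway(toy_roadmap, "unfolded", "native"))
# (5.0, ['unfolded', 'a', 'b', 'native'])
```

Running the same query from many different start nodes would give many pathways from one roadmap, which is one way to read the slide's "multiple folding pathways".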

8 Related Work
Comparison table. Columns: folding landscape; trajectory (path #); path quality; time dependent (running time); folding kinetics; native state needed. Cell values are listed per row in the order they appear on the slide.
Molecular Dynamics: No; Yes (1); good; Yes (very long)
Monte Carlo: Yes
Statistical Model: N/A; No (short); Yes (only average)
Our PRM approach (RECOMB 01, 02, 04; PSB 03): (many); approximate; Yes, multiple kinetics
Other PRM-based approaches for studying molecular motions:
  Other work on protein folding [Apaydin et al., ICRA’01, RECOMB’02]
  Ligand binding [Singh, Latombe, Brutlag, ISMB’99], [Bayazit, Song, Amato, ICRA’01]
  RNA folding [Tang, Kirkpatrick, Thomas, Song, Amato, RECOMB’04]
Time dependent, ok?

9 Modeling Proteins
Primary structure: TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN (each letter is one amino acid).
Secondary structure: α helix, β sheet.
Secondary structure + variable loops = tertiary structure.
We model each amino acid with 2 torsional degrees of freedom (phi and psi), which is standard practice among biochemists.

10 Roadmap Construction: Node Generation
Sample using the known native state: sample around it and gradually grow outward.
Generate conformations by randomly selecting phi/psi angles.
Criterion for accepting a node: compute the potential energy E of each node and retain it with an energy-dependent probability (the formula is not preserved in this transcript).
Result: a denser distribution of nodes around the native state N.
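A minimal sketch of energy-biased node generation around the native state. The acceptance formula did not survive in the transcript, so the piecewise-linear rule below (always accept below `e_min`, never accept above `e_max`, linear in between) is an assumption, as are the thresholds, the Gaussian perturbation, the toy energy function, and every function name.

```python
import random


def acceptance_probability(energy, e_min, e_max):
    """ASSUMED rule: 1 below e_min, 0 above e_max, linear in between."""
    if energy < e_min:
        return 1.0
    if energy > e_max:
        return 0.0
    return (e_max - energy) / (e_max - e_min)


def perturb(conformation, sigma, rng):
    """Add Gaussian noise to every phi/psi angle and wrap back into [-180, 180)."""
    return [((angle + rng.gauss(0.0, sigma) + 180.0) % 360.0) - 180.0
            for angle in conformation]


def sample_nodes(native, energy_fn, n_attempts, sigma, e_min, e_max, seed=0):
    """Grow a node set outward from the native state.

    Each attempt perturbs a node that was already accepted, so samples spread
    gradually away from the native conformation while staying denser near it.
    """
    rng = random.Random(seed)
    nodes = [list(native)]
    for _ in range(n_attempts):
        base = rng.choice(nodes)
        candidate = perturb(base, sigma, rng)
        p = acceptance_probability(energy_fn(candidate), e_min, e_max)
        if rng.random() < p:
            nodes.append(candidate)
    return nodes


# Hypothetical three-residue "protein": (phi, psi) per residue, in degrees.
native_angles = [-60.0, -45.0, -60.0, -45.0, -60.0, -45.0]


def toy_energy(conformation):
    """Stand-in potential: squared angular deviation from the native angles."""
    return sum((a - b) ** 2 for a, b in zip(conformation, native_angles))


nodes = sample_nodes(native_angles, toy_energy, n_attempts=200, sigma=10.0,
                     e_min=0.0, e_max=5000.0)
print(len(nodes), "nodes kept out of 200 attempts plus the native state")
```

A real run would replace `toy_energy` with the coarse or all-atom potential mentioned later in the talk.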

11 Ramachandran Plots for Different Sampling Techniques
Panels: uniform sampling, Gaussian sampling, iterative Gaussian sampling.

12 Distributions for Different Types: Potential Energy vs. RMSD for Roadmap Nodes
Panels: all alpha, alpha + beta, all beta.
Say ‘landscape sampled by our methods’.

13 Roadmap Construction: Node Connection
1. Find the k closest nodes for each roadmap node (k = 20), using Euclidean distance.
2. Assign each edge (u, v) a weight that reflects its energetic feasibility, computed from the intermediate conformations c1, c2, c3, ..., cn between u and v:
   w(u, v) = f(E(c1), E(c2), ..., E(cn))
   Lower weight → more feasible.
Figure labels: 1, 13, 152, 681; native state.
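The two connection steps can be sketched as follows. The slide only states that w(u, v) is some function f of the intermediate energies, so the specific choices here are assumptions: intermediates come from straight-line interpolation of the angle vectors (ignoring angle wrap-around), and f sums a Metropolis-style penalty, -log(P_i) with P_i = exp(-ΔE/kT) for uphill steps and P_i = 1 otherwise. All function names are hypothetical.

```python
import math


def k_closest(nodes, index, k):
    """Indices of the k nodes closest to nodes[index] by Euclidean distance."""
    others = [i for i in range(len(nodes)) if i != index]
    others.sort(key=lambda i: math.dist(nodes[index], nodes[i]))
    return others[:k]


def interpolate(u, v, n_steps):
    """Conformations c1..cn on the straight line from u to v, endpoints included."""
    return [[a + (b - a) * t / (n_steps - 1) for a, b in zip(u, v)]
            for t in range(n_steps)]


def edge_weight(u, v, energy_fn, n_steps=10, kT=1.0):
    """ASSUMED form of f: sum over consecutive intermediates of -log(P_i).

    With P_i = exp(-delta/kT) for uphill steps and 1 for downhill steps,
    -log(P_i) simplifies to max(0, delta) / kT, which is what is accumulated.
    """
    energies = [energy_fn(c) for c in interpolate(u, v, n_steps)]
    weight = 0.0
    for e_prev, e_next in zip(energies, energies[1:]):
        weight += max(0.0, e_next - e_prev) / kT
    return weight


def bowl(conformation):
    """Toy energy with its minimum at the origin (stand-in for a real potential)."""
    return sum(x * x for x in conformation)


toy_nodes = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.0, 2.0]]
print(k_closest(toy_nodes, 0, 2))                            # [1, 3]
print(edge_weight([0.0, 0.0], [2.0, 0.0], bowl, n_steps=3))  # 4.0 (uphill)
print(edge_weight([2.0, 0.0], [0.0, 0.0], bowl, n_steps=3))  # 0.0 (downhill)
```

Under this assumed f the weight is direction dependent: moving downhill in energy costs nothing, while moving uphill is penalized, so lower weight corresponds to a more feasible transition.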

14 PRMs for Protein Folding: Key Issues
Energy functions: how accurately the roadmap reflects the folding landscape depends on the quality of the energy calculation. We use our own coarse potential (fast) and a well-known all-atom potential (slow).
Validation: in [ICRA’01, RECOMB’01, JCB’02], results were validated against experimental results [Li & Woodward 1999].

15 One Folding Path of Protein A
A nice movie... but so what?
Ribbon model and space-fill model of the B domain of staphylococcal protein A.

16 Roadmap Analysis: Secondary Structure Formation Order
[RECOMB’01, JCB’02, RECOMB’02, JCB’03, PSB’03]
The order in which secondary structure forms during folding.
Figure labels: hairpin 1,2; helix.
Q: Which forms first?

17 Formation Time Calculation
A secondary structure has formed when x% of its native contacts are present.
Native contact: a pair of Cα atoms less than 7 Å apart in the native state.
Example: five native contacts form at time steps 10, 30, 20, 40, and 50. If we pick x = 60%, then at time step 30 three of the five contacts are present, so the structure is considered formed.
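The formation-time rule can be sketched in a few lines, assuming each native contact of a secondary structure element has a recorded time step at which it first forms along a path. The function name and the integer-percent argument are illustrative choices.

```python
def formation_time(contact_times, percent):
    """Earliest time step at which `percent`% of the native contacts are present.

    contact_times holds the time step at which each native contact forms.
    """
    n = len(contact_times)
    # Ceiling of percent * n / 100 in integer arithmetic, with at least one contact.
    needed = max(1, -(-percent * n // 100))
    return sorted(contact_times)[needed - 1]


# The slide's example: five contacts, threshold x = 60%.
print(formation_time([10, 30, 20, 40, 50], 60))  # 30
```

With five contacts and x = 60%, three contacts are required, and the third contact to form does so at time step 30, matching the slide.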

18 Contact Map
A contact map is a triangular matrix that identifies all the native contacts among residues.
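A minimal sketch of building such a map, using the 7 Å Cα-Cα cutoff from the formation-time slide. The coordinates are made up, and the `min_separation` parameter (for skipping residues that are close in sequence) is a hypothetical convenience that the slides do not mention.

```python
import math


def native_contact_map(ca_coords, cutoff=7.0, min_separation=1):
    """Upper-triangular boolean matrix of native contacts.

    contacts[i][j] is True (for j >= i + min_separation) when the C-alpha atoms
    of residues i and j are less than `cutoff` angstroms apart in the native
    structure. The diagonal and lower triangle stay False.
    """
    n = len(ca_coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                contacts[i][j] = True
    return contacts


# Four made-up C-alpha positions, in angstroms.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 5.0, 0.0)]
for row in native_contact_map(toy_coords):
    print(["x" if c else "." for c in row])
```

Only the upper triangle is filled because contact (i, j) and contact (j, i) are the same pair of residues.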

19 Contact Maps

20 Secondary Structure Formation Order: Timed Contact Map of a Path [JCB’02]
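Timed contact map of one folding path for protein G (domain B1); both axes are residue number.
Contact regions: α, β3-4, β1-2, and β1-4 (region IV).
Times shown on the map: 114, 142, 135, 131; figure annotation: 3-4, average T = 142.
Speaker notes: 1. Explain the contact map (X and Y are residue numbers). 2. Regions of contacts. 3. How time is calculated.
Formation order: α, β3-4, β1-2, β1-4.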

21 Secondary Structure Formation Order: Timed Contact Map of a Path [JCB’02]
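Timed contact map of one folding path for protein G (domain B1); both axes are residue number.
Contact regions: α, β3-4, β1-2, and β1-4 (region IV).
Times shown on the map: 114, 142, 135, 131; figure annotation: 3-4, average T = 142.
Speaker notes: 1. Explain the contact map (X and Y are residue numbers). 2. Regions of contacts. 3. How time is calculated.
Formation order: α, β3-4, β1-2, β1-4.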
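One way a formation order could be derived from a timed contact map is sketched below, assuming each contact region is summarized by the average time step of its contacts and regions are then sorted by that average. The averaging rule is an assumption suggested by the "average T" annotation, and the contact times are hypothetical, not the values from the figure.

```python
def formation_order(region_contact_times):
    """Sort contact regions by the average time step at which their contacts form.

    region_contact_times maps a region name to the list of time steps at which
    its native contacts appear along one folding path.
    """
    averages = {name: sum(times) / len(times)
                for name, times in region_contact_times.items()}
    order = sorted(averages, key=averages.get)
    return order, averages


# Hypothetical contact times for one path (NOT taken from the slide).
toy_path = {
    "beta1-4": [140, 150],
    "alpha": [100, 110],
    "beta1-2": [128, 140],
    "beta3-4": [120, 130],
}
order, averages = formation_order(toy_path)
print(order)  # ['alpha', 'beta3-4', 'beta1-2', 'beta1-4']
```

Repeating this over every path extracted from a roadmap and counting how often each order occurs would give per-order percentages of paths.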

22 Secondary Structure Formation Order: Validation Sample Summary
Table columns: PDB; # of residues; # of orders; % of paths; secondary structure formation order; Exp.
1GB1 (56 residues, 2 orders): 66% α, β3-4, β1-2, β1-4; 34% α, β1-2, β3-4, β1-4. Exp.: Agreed.
1BDD (60 residues, 1 order): 100% α2, α3, α1, α2-3, α1-3.
1COA (64 residues): 90% α, β3-4, β2-3, β1-4, α-β4; 10% α, β3-4, β2-3, α-β4, β1-4.
2AIT (74 residues): 9.1% β4-5, β1-2 …; 7.4% β1-2, β4-5 …
1UBQ (76 residues, 3 orders): 80% α, β3-4, β1-2, β3-5, β1-5; 15% β3-4, α, β1-2, β3-5, β1-5.
1BRN (110 residues, 4 orders): 75% α1, α2, α3 …; 8.3% α1, α3, α2 … Exp.: Not sure.

23 Detailed Study of Proteins G & L [PSB’03]
Protein G and Protein L have similar structure (1 helix, 2 beta strands) but only 15% sequence identity, and they fold differently:
  Protein G: helix, beta 3-4, beta 1-2, beta 1-4 [Kuszewski et al. 1994, Orban et al. 1995]
  Protein L: helix, beta 1-2, beta 3-4, beta 1-4 [Yi & Baker 1996, Yi et al. 1997]
Can our approach detect the difference? Yes!
  75% of Protein G paths and 80% of Protein L paths have the “right” order.
  This increases to 90% and 100%, respectively, when the all-atom potential is used.

24 Helix and Beta Strands: Coarse Potential [PSB’03]
Protein G: β3-β4 forms first (over 2k paths analyzed).
Protein L: β1-β2 forms first (over 2k paths analyzed).
Figure labels for each protein: β1, β2, β3, β4.

25 Helix and Beta Strands: All-atom Potential
Protein G (β3-β4 forms first). Percentage of paths with each secondary structure (SS) formation order when analyzing the first x% of contacts:
Contacts; SS formation order; x = 20; 40; 60; 80; 100
all; α, β3-β4, β1-β2, β1-β4; 79; 79; 74; 82; 90
all; α, β1-β2, β3-β4, β1-β4; 21; 21; 26; 18; 10
hydrophobic; α, β3-β4, β1-β2, β1-β4; 77; 74; 71; 77; 81
hydrophobic; α, β1-β2, β3-β4, β1-β4; 23; 26; 29; 23; 19
Protein L (β1-β2 forms first). Same table layout (contacts: all, hydrophobic; x = 20, 40, 60, 80, 100); the only surviving entries are an order beginning α, β1-β2 and the values 99 and 1.

26 Summary: PRM-Based Protein Folding
PRM roadmaps approximate energy landscapes.
They efficiently produce multiple folding pathways:
  Secondary structure formation order (e.g., G and L).
  Better than trajectory-based simulation methods such as Monte Carlo and molecular dynamics.
They provide a good way to study folding kinetics:
  Multiple folding kinetics in the same landscape (roadmap).
  A natural way to study the statistical behavior of folding.
  More realistic than statistical models (e.g., lattice models, Baker’s model [PNAS’99], Munoz’s model [PNAS’99]).

27 RNA Folding Results
X. Tang, B. Kirkpatrick, S. Thomas, G. Song [RECOMB’04]
An RNA energy landscape can be completely described by huge roadmaps.
Heuristics are used to approximate the energy landscape using small roadmaps.
Our roadmaps contain many folding pathways.
Population kinetics analysis on the roadmaps shows that heuristic 1 can efficiently describe the energy landscape using a small subset of nodes.
Figure: energy profile vs. folding steps, and population vs. folding steps for Map1 (Complete): 142 nodes, Map2 (Heuristic 1): 15 nodes, and Map3 (Heuristic 2): 33 nodes.

28 Ligand Binding [IEEE ICRA’01]
Docking: find a configuration of the ligand near the protein that satisfies geometric, electrostatic, and chemical constraints.
PRM approach (Singh, Latombe, Brutlag, 1999): rapidly explores the high-dimensional space.
We use OBPRM, which is better suited for generating conformations in the binding site (near the protein surface).
Haptic user interaction:
  Haptics (sense of touch) helps the user understand molecular interaction.
  The user assists the planner by suggesting promising regions, and the planner post-processes and ‘improves’ them.

29 Contact Information
For more information, check out our website:
Credits: My students: Guang Song (now a postdoc at Iowa State), Shawna Thomas, Xinyu Tang & Ken Dill (UCSF) and Marty Scholtz (Texas A&M)

