Protein Structure Prediction Samantha Chui Oct. 26, 2004.


1 Protein Structure Prediction Samantha Chui Oct. 26, 2004

2 Central Dogma of Biology
Question: Given a protein sequence, to what conformation will it fold?
DNA sequence -> (transcription & translation) -> Protein sequence -> (folding) -> Protein structure

3 How does nature do it?
- Hydrophobicity vs. hydrophilicity
- Van der Waals interaction
- Electrostatic interaction
- Hydrogen bonds
- Disulfide bonds

4 Current Approaches
- Experimental methods
  - X-ray crystallography
  - NMR spectroscopy
- Computational methods
  - Homology modeling: similar sequences fold into similar structures
  - Threading: dissimilar sequences may fold into similar structures
  - Ab initio: no similarity assumptions; conformational search

5 Assembly of sub-structural units
[Diagram labels: known structures …; fragment library; protein sequence; predicted structure]

6 “Small Libraries of Protein Fragments Model Native Protein Structures Accurately”
Rachel Kolodny, Patrice Koehl, Leonidas Guibas, and Michael Levitt, 2002
Goal: find a finite set of protein fragments that can be used to construct accurate discrete conformations for any protein
1. Generate fragments from known proteins
2. Cluster fragments to identify common structural motifs
3. Test library accuracy on proteins not in the initial set

7 Datasets of protein fragments
- 200 unique protein domains from the Protein Data Bank (PDB)
  - 36,397 residues
- Four sets of backbone fragments
  - 4-, 5-, 6-, and 7-residue-long fragments
- Divide each protein domain into consecutive fragments of length f, beginning at a random initial position
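
The splitting step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `consecutive_fragments` is hypothetical, and drawing the random start from 0 to f-1 is an assumption, since the slide does not say how the initial position is chosen.

```python
import random


def consecutive_fragments(chain, f, rng=None):
    """Cut a chain into non-overlapping fragments of length f.

    Assumption: the random initial position is drawn from 0..f-1.
    Residues left over at either end are dropped.
    """
    rng = rng or random.Random()
    start = rng.randrange(f)
    return [chain[i:i + f] for i in range(start, len(chain) - f + 1, f)]
```

Here `chain` can be any sequence, for example a list of Cα coordinates.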

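8 Fragment structural similarity
- Coordinate root-mean-square (cRMS) deviation of Cα atoms
- $\mathrm{cRMS}(A,B) = \sqrt{\frac{1}{N}\sum_{i=1}^{N} d_i^2}$, where $d_i$ is the distance between the i-th pair of matched atoms and N is the number of atom pairs
- One-to-one mapping between atoms in structure A and structure B
- Translate and rotate to find the best alignment
- cRMS is 0 if the structures superimpose perfectly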
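
A minimal sketch of this measure, assuming NumPy is available and that the best alignment is found with the Kabsch algorithm (an SVD-based superposition). The slide does not name the superposition method, so that choice and the function name `crms` are assumptions.

```python
import numpy as np


def crms(a, b):
    """cRMS between two (N, 3) coordinate arrays after optimal superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Translate both structures so their centroids sit at the origin.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = a0 @ rot.T - b0
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

Row i of `a` is matched to row i of `b`, which is the one-to-one mapping the slide refers to.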

9 Pruning and clustering
- Outliers have a large cRMS deviation from all other fragments
  - Discard according to a fragment-length-specific threshold
- k-means simulated annealing clustering
  - Repeatedly run k-means clustering, merge nearby clusters, and split dispersed clusters
  - Scoring function: total variance = $\sum (x - \mu)^2$
  - Less sensitive to the initial choice of cluster centers than k-means
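
To make the scoring function concrete, here is a sketch of plain k-means (Lloyd's algorithm) together with the total variance score. It deliberately omits the merge and split moves of the simulated annealing variant, and it treats each fragment as a plain coordinate vector with Euclidean distance standing in for cRMS. Both simplifications are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def total_variance(points, labels, centers):
    """Sum of squared distances from each point to its cluster center."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Running `kmeans` with several seeds and keeping the run with the lowest `total_variance` is the simplest way to reduce dependence on the initial centers.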

10 Compiling the libraries
- Select cluster centroids as library entries
  - Minimum sum of cRMS deviations from all the other cluster fragments
  - Form a representative set of protein fragments
- Library contents are highly dependent upon the clustering procedure
  - For each set of fragments, start with 50 random seeds and choose the library with the minimal total variance score
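
The selection of a library entry from each cluster can be sketched as below. The names `cluster_representative` and `build_library` are hypothetical, and `distance` is a caller-supplied function (cRMS in the presentation's setting).

```python
def cluster_representative(cluster, distance):
    """Return the member with the minimum summed distance to all other members."""
    return min(cluster, key=lambda a: sum(distance(a, b) for b in cluster))


def build_library(clusters, distance):
    """One representative fragment per cluster."""
    return [cluster_representative(c, distance) for c in clusters]
```

Because the representative is always an actual cluster member, every library entry is a fragment observed in a known structure rather than an averaged set of coordinates.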

11 Evaluating quality of a library
- Local-fit: how well the library fits the local conformation of all proteins in the test set
- Global-fit: how well the library fits the global three-dimensional conformation of all proteins in the test set

12 Local-fit method
- Protein structures are broken into the set of all overlapping fragments of length f
- For each protein fragment, find the most similar fragment in the library (by cRMS)
- Score = average cRMS value over all fragments in all proteins in the test set
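
A sketch of the local-fit score under these definitions. The function names are hypothetical, and `distance` is a caller-supplied fragment comparison (cRMS in the presentation).

```python
def overlapping_fragments(chain, f):
    """All length-f windows of a chain, shifted by one residue each time."""
    return [chain[i:i + f] for i in range(len(chain) - f + 1)]


def local_fit_score(proteins, library, f, distance):
    """Average, over all fragments of all proteins, of the distance to the
    closest library entry."""
    deviations = [min(distance(frag, entry) for entry in library)
                  for chain in proteins
                  for frag in overlapping_fragments(chain, f)]
    return sum(deviations) / len(deviations)
```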

13 Local-fit results

14 Global-fit method
- Concatenate the best local-fit library fragments just found
- Determine each fragment’s orientation by superimposing its first three Cα atoms onto the last three Cα atoms of the preceding fragment

15 Global-fit method
- The number of possible sequences of fragments is exponential in the protein’s length
- A greedy algorithm finds a good rather than the best global-fit approximation
  - Start at the N terminus and approximate increasingly larger segments of the protein
  - Concatenate the library fragment that yields the structure of minimal cRMS deviation from the corresponding segment
  - Deterministic, linear time
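
The greedy construction can be sketched as follows, assuming NumPy and a Kabsch superposition. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, and this naive version recomputes the cRMS of the whole prefix at every step, so it is quadratic rather than linear in the chain length.

```python
import numpy as np


def kabsch(p, q):
    """Rotation r and translation t minimising |p @ r.T + t - q|."""
    pc, qc = p.mean(axis=0), q.mean(axis=0)
    u, _, vt = np.linalg.svd((p - pc).T @ (q - qc))
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, qc - r @ pc


def crms(a, b):
    """cRMS of two (N, 3) arrays after optimal superposition."""
    r, t = kabsch(a, b)
    return float(np.sqrt(((a @ r.T + t - b) ** 2).sum() / len(a)))


def extend(chain, frag):
    """Superimpose frag's first three atoms on the chain's last three,
    then append the remaining atoms of frag."""
    r, t = kabsch(frag[:3], chain[-3:])
    return np.vstack([chain, frag[3:] @ r.T + t])


def greedy_global_fit(target, library):
    """Build an approximation of target from library fragments, greedily."""
    target = np.asarray(target, dtype=float)
    library = [np.asarray(frag, dtype=float) for frag in library]
    f = len(library[0])
    # Start at the N terminus with the best-fitting single fragment.
    chain = np.array(min(library, key=lambda frag: crms(frag, target[:f])))
    # Each further fragment adds f - 3 residues; trailing residues that do
    # not fill a whole step are left unmodelled in this sketch.
    while len(chain) + (f - 3) <= len(target):
        candidates = [extend(chain, frag) for frag in library]
        chain = min(candidates, key=lambda c: crms(c, target[:len(c)]))
    return chain
```

Each step commits to the locally best fragment and never revisits earlier choices, which is why the result is a good approximation rather than the optimal one.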

16 Global-fit results
- 100 fragments, 5 residues, 10 states/residue: 0.91 Å
- 20 fragments, 5 residues, 4.47 states/residue: 1.85 Å
- 50 fragments, 7 residues, 2.66 states/residue: 2.78 Å
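
The states/residue figures on this slide are consistent with each fragment contributing f - 3 new residues, since it overlaps the preceding fragment by three Cα atoms (slide 14). Under that reading, a library of s fragments of length f has s^(1/(f-3)) states per residue. The sketch below checks the arithmetic; the function name is hypothetical and the formula is inferred from the slide's numbers rather than stated on it.

```python
def states_per_residue(library_size, fragment_length, overlap=3):
    """Effective number of choices per residue for a fragment library.

    Assumption: consecutive fragments share `overlap` atoms, so each
    fragment adds fragment_length - overlap new residues.
    """
    return library_size ** (1.0 / (fragment_length - overlap))
```

For the three libraries listed, this gives 100^(1/2) = 10, 20^(1/2) ≈ 4.47, and 50^(1/4) ≈ 2.66.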

17 Assembly of sub-structural units
[Diagram labels: known structures …; fragment library; protein sequence; predicted structure]

18 “Protein structure prediction via combinatorial assembly of sub-structural units”
Yuval Inbar, Hadar Benyamini, Ruth Nussinov, and Haim J. Wolfson, 2003

19 CombDock
- Input: structural units (SUs) with known 3D conformations
  - SUs are considered rigid bodies, rotated and translated with respect to each other
- Goal: predict the overall structure
- Constraints
  - Penetration: avoid steric clashes
  - Backbone: restriction on the maximum distance between consecutive SUs

20 All pairs docking
- N(N-1)/2 pairs of SUs
- Calculate candidate transformations by matching complementary local features on the surfaces of the SUs
  - Apply each transformation to the 2nd SU of the pair
- Keep the K best for each pair
  - Clustering ensures that all K transformations yield significantly different complexes
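
The bookkeeping of this stage can be sketched as below. The geometric matching itself is not shown: `candidates` is a hypothetical caller-supplied function returning `(score, transformation)` pairs for two SUs, higher scores are assumed to be better, and the clustering that removes near-duplicate transformations is omitted.

```python
import heapq
from itertools import combinations


def all_pairs_docking(units, candidates, k):
    """Map each unordered pair (i, j), i < j, to its k best-scoring candidates.

    `candidates(a, b)` is a placeholder for the surface-matching step.
    """
    return {
        (i, j): heapq.nlargest(k, candidates(units[i], units[j]),
                               key=lambda c: c[0])
        for i, j in combinations(range(len(units)), 2)
    }
```

For N units the returned dictionary has N(N-1)/2 keys, one per pair of SUs.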

21 Combinatorial assembly
- Multigraph representation
  - Vertices = SUs
  - Edges = transformations between two SUs
  - K parallel edges between any two vertices
- Final protein conformation = spanning tree
  - N SUs, one connected component, no cycles
- [Figure: vertices i, j, k joined by parallel edges labeled 1, 2, …, K]
- The transformation between i and k is induced by the transformations (ij, jk)

22 Combinatorial Assembly
- $N^{N-2} K^{N-1}$ different spanning trees
  - Not all spanning trees are valid complexes
- Use a heuristic algorithm
- Two subtrees are adjacent iff there exists an index i such that vertex i is in one subtree and vertex i+1 is in the other
- Sequential tree: recursive definition
  - One vertex, or
  - A tree with an edge that connects two adjacent sequential trees
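
Two small helpers make these definitions concrete. The count combines Cayley's formula (N^(N-2) labeled spanning trees on N vertices) with K choices of transformation for each of the N-1 edges. The function names are hypothetical.

```python
def spanning_tree_count(n, k):
    """Number of spanning trees of a complete multigraph with n vertices
    and k parallel edges between every pair of vertices."""
    if n < 2:
        return 1
    return n ** (n - 2) * k ** (n - 1)


def adjacent(a, b):
    """True if some index i lies in one vertex set and i + 1 in the other."""
    return any(i + 1 in b for i in a) or any(i + 1 in a for i in b)
```

Even modest inputs make exhaustive search impractical: with N = 10 and K = 100 the count is 10^26, which motivates the heuristic.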

23 Combinatorial Assembly
- Hierarchical algorithm of N stages
  - i-th stage: generate sequential trees with i vertices
  - Construct trees by connecting adjacent sequential trees of smaller sizes generated earlier
- Keep the D best sequential trees at each step
  - Discard trees that do not meet the backbone and penetration constraints
  - Score = sum of the scores of the transformations
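
The hierarchical search can be sketched as a beam search over intervals of consecutive SUs, since a sequential tree always covers a contiguous index range. Everything named here is an assumption made for illustration: `edge_scores[(u, v)]` holds the K transformation scores for the pair u < v, higher scores are taken to be better, the D best trees are kept per interval, and `is_valid` stands in for the backbone and penetration checks, which are not implemented.

```python
from itertools import product


def assemble(n, edge_scores, d, is_valid=lambda edges: True):
    """Return up to d best (score, edges) sequential trees spanning SUs 0..n-1.

    Each edge is (u, v, t): SUs u < v joined by transformation index t.
    """
    # Stage 1: every single vertex is a sequential tree with no edges.
    best = {(i, i): [(0.0, ())] for i in range(n)}
    for size in range(2, n + 1):
        for lo in range(n - size + 1):
            hi = lo + size - 1
            found = {}
            # Join two adjacent sequential trees covering lo..mid and mid+1..hi.
            for mid in range(lo, hi):
                pairs = product(best[(lo, mid)], best[(mid + 1, hi)])
                for (l_score, l_edges), (r_score, r_edges) in pairs:
                    for u in range(lo, mid + 1):
                        for v in range(mid + 1, hi + 1):
                            scores = edge_scores.get((u, v), [])
                            for t, s in enumerate(scores):
                                edges = tuple(sorted(l_edges + r_edges + ((u, v, t),)))
                                if is_valid(edges):
                                    # Sorted edges de-duplicate trees that are
                                    # reachable through different splits.
                                    found[edges] = l_score + r_score + s
            ranked = sorted(found.items(), key=lambda kv: kv[1], reverse=True)
            best[(lo, hi)] = [(score, edges) for edges, score in ranked[:d]]
    return best[(0, n - 1)]
```

Pruning to D trees per interval keeps the work polynomial in N, at the cost of possibly discarding a partial tree that would have led to the best full assembly.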

24 Combinatorial Assembly

25 CombDock Results

26 Conclusion
- Experimental methods
  - X-ray crystallography
  - NMR spectroscopy
- Computational methods
  - Homology modeling: similar sequences fold into similar structures
  - Threading: dissimilar sequences may fold into similar structures
  - Ab initio: no similarity assumptions; conformational search
[Diagram labels: known structures …; fragment library; protein sequence; predicted structure]

27 References
- Kolodny et al., “Small libraries of protein fragments model native protein structures accurately”
- Inbar et al., “Protein structure prediction via combinatorial assembly of sub-structural units”

