Published by Thomas Sherman. Modified over 8 years ago.
1
Structural Bioinformatics
Elodie Laine
Master BIM-BMC, Semester 3, 2013-2014
Genomics of Microorganisms, UMR 7238, CNRS-UPMC
e-documents: http://www.lgm.upmc.fr/laine/STRUCT
e-mail: elodie.laine@upmc.fr
2
Elodie Laine – 18.12.2013
3
General principles
Comparing structures:
- structural similarity
- structural alignment
- segment shapes
- secondary structure elements
Classifying structures:
- evolutionary relationships
- functional motifs
4
Measuring structure similarity
Root Mean Square Deviation (RMSD) after superimposition
RMSD(a, b) = sqrt( (1/n) Σ_i [ (x_ai − x_bi)² + (y_ai − y_bi)² + (z_ai − z_bi)² ] )
The RMSD expresses the minimal global mean distance between the n corresponding atoms of the superimposed structures a and b, where (x, y, z) are the atomic Cartesian coordinates.
The RMSD can be computed on a selection of atoms (backbone, heavy atoms…).
The RMSD computation requires that exactly n atoms from structure a correspond to n atoms from structure b.
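As a minimal sketch of "RMSD after superimposition", the function below centres both coordinate sets, finds the optimal rotation with the Kabsch algorithm (an SVD of the covariance matrix), and returns the root of the mean squared coordinate difference. The slides do not say which superposition algorithm is used; Kabsch is an assumption here, and the function name `kabsch_rmsd` is illustrative. NumPy is assumed to be available.

```python
import numpy as np

def kabsch_rmsd(a, b):
    """RMSD between two (n, 3) coordinate sets after optimal superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("a and b must both have shape (n, 3)")
    # Remove the translation: centre both sets on their centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # SVD of the 3x3 covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    # Rotate a onto b and measure the remaining deviation.
    diff = (rot @ a0.T).T - b0
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

The function requires the two arrays to list corresponding atoms in the same order, which is exactly the n-to-n correspondence the slide asks for.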
5
Measuring structure similarity
How can we establish the correspondence between the two structures?
6
Structural alignment: an example
1/ Identification of segments with similar shapes
2/ Combination of segment pairs
3/ Extension of the alignment in 3D
7
Structural alignment: an example
1/ Identification of segments with similar shapes
Each protein is subdivided into:
- segments of n residues that can overlap
- secondary structure elements
- parts of secondary structure elements
Similarity between segments is estimated by using:
- a filter on the end-to-end distance
- a filter on the distance from the N-terminus
8
Structural alignment: an example
2/ Combination of segment pairs
Final optimal alignment
9
Structural alignment: an example
3/ Extension of the alignment in 3D
- Extend the current alignment by one residue in the forward or backward direction
- Compute RMSDs after superposition
- The extension with the smallest RMSD is retained
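The extension step can be sketched as follows, under simplifying assumptions not stated on the slide: the alignment is a single ungapped stretch described by half-open index ranges, coordinates are one point per residue (for example Cα atoms), and superposition uses the Kabsch algorithm. The names `superposed_rmsd` and `extend_once` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def superposed_rmsd(a, b):
    """RMSD of two (n, 3) coordinate sets after Kabsch superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a0.T @ b0)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rot @ a0.T).T - b0
    return float(np.sqrt((diff ** 2).sum() / len(a)))

def extend_once(coords_a, coords_b, span_a, span_b):
    """Try a one-residue forward and backward extension; keep the better one.

    span_a and span_b are (start, end) pairs, end exclusive, of equal length.
    Returns (new_span_a, new_span_b, rmsd), or None if neither extension fits.
    """
    (sa, ea), (sb, eb) = span_a, span_b
    candidates = []
    if ea < len(coords_a) and eb < len(coords_b):   # forward extension
        candidates.append(((sa, ea + 1), (sb, eb + 1)))
    if sa > 0 and sb > 0:                           # backward extension
        candidates.append(((sa - 1, ea), (sb - 1, eb)))
    best = None
    for new_a, new_b in candidates:
        r = superposed_rmsd(coords_a[new_a[0]:new_a[1]],
                            coords_b[new_b[0]:new_b[1]])
        # Retain the extension with the smallest RMSD.
        if best is None or r < best[2]:
            best = (new_a, new_b, r)
    return best
```

Calling `extend_once` repeatedly, and stopping when it returns None or when the RMSD exceeds a chosen cutoff, gives one possible greedy loop; the stopping rule is not specified on the slide.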
10
Distance-matrix ALIgnment (DALI)
DALI is a structural alignment method based on intra-molecular distances.
Background idea:
- Represent each protein as a 2D matrix storing all Cα-Cα distances.
- Slide one matrix onto the other to find the common sub-matrix with the best match.
Implementation (greedy):
- Break each matrix into elementary contact patterns
- Pair up similar contact patterns (one from each protein)
- Assemble pairs in the correct order to yield the overall alignment
Distance matrix for Protein A (4 residues):
      1     2     3     4
1     0     d12   d13   d14
2     d12   0     d23   d24
3     d13   d23   0     d34
4     d14   d24   d34   0
Figure: distance matrix pair for Protein A and Protein B.
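A distance matrix of this kind can be built directly from Cα coordinates. The sketch below uses only the Python standard library (`math.dist`, Python 3.8+); the function name `ca_distance_matrix` and the example coordinates are illustrative.

```python
import math

def ca_distance_matrix(ca_coords):
    """Return the symmetric matrix of pairwise C-alpha distances."""
    n = len(ca_coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each distance is stored twice because d_ij == d_ji.
            d = math.dist(ca_coords[i], ca_coords[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix

# Three made-up positions forming a 3-4-5 right triangle.
example = ca_distance_matrix([(0, 0, 0), (3, 0, 0), (3, 4, 0)])
```

The matrix depends only on internal distances, so it is unchanged by rotating or translating the protein. That is why two such matrices can be compared without superposing the structures first.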
11
Distance-matrix ALIgnment (DALI)
Similarity measure, with terms: similarity threshold; deviation from the arithmetic mean; envelope function.
Assembly of alignment: a non-trivial combinatorial problem. Each new alignment has one segment overlapping with the previous one: (S1S2) – (S1’S2’), (S2S3) – (S2’S3’), …
Available alignment methods:
- Monte Carlo optimization
- Branch-and-bound
- Neighbor walk
Figure: myoglobin distance matrix.
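The slide's formula is not available in this transcript. The sketch below follows the elastic similarity score as commonly described for DALI (Holm & Sander, 1993): each pair of equivalenced residues contributes the similarity threshold minus the relative deviation of the two distances from their arithmetic mean, weighted by an envelope that damps long distances. The parameter values (threshold 0.2, envelope length 20 Å) and all function names are assumptions, not taken from the slides.

```python
import math

THETA_E = 0.2   # similarity threshold (assumed value)
ALPHA = 20.0    # envelope decay length in angstroms (assumed value)

def elastic_term(d_a, d_b):
    """Contribution of one pair of intra-molecular distances."""
    mean = 0.5 * (d_a + d_b)          # arithmetic mean of the two distances
    if mean == 0.0:
        return THETA_E                # diagonal term (a residue with itself)
    deviation = abs(d_a - d_b) / mean # relative deviation from the mean
    envelope = math.exp(-(mean / ALPHA) ** 2)
    return (THETA_E - deviation) * envelope

def dali_elastic_score(dist_a, dist_b, pairs):
    """Sum elastic terms over all pairs of equivalenced residues.

    dist_a, dist_b: distance matrices of proteins A and B.
    pairs: list of (index_in_A, index_in_B) residue equivalences.
    """
    total = 0.0
    for ia, ib in pairs:
        for ja, jb in pairs:
            total += elastic_term(dist_a[ia][ja], dist_b[ib][jb])
    return total
```

With this form, distance pairs that agree to better than the threshold add to the score and pairs that disagree subtract from it. Matching a conserved core is therefore rewarded, and stretching the alignment over divergent regions is penalised.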
12
Distance-matrix ALIgnment (DALI)
Figure panels: 3D (spatial), 2D (distance matrix), 1D (sequence).
Holm & Sander (1993) J. Mol. Biol.
13
Structural alignment tools
14
Protein structural classification
What is the motivation for classifying protein structures?
- to better understand protein biological functions
- to determine the evolutionary relationship between proteins
Structures tend to diverge less than sequences. Proteins displaying a certain degree of sequence similarity adopt similar shapes. Generally, above 40% sequence identity, the structures are very much alike.
Example: decarboxylases with 21% sequence identity: convergent or divergent evolution?
15
Protein structural classification
Levels, from local to global, with increasing similarity:
- secondary structure: α-helix, β-sheet, loop…
- domain: protein structural unit
- class: secondary structure content
- fold/topology: global shape
- superfamily: homology & similar function
16
Secondary structure local assignment
Backbone torsion angles:
- φ = angle between the C-Cα-N and Cα-N-C planes
- ψ = angle between the N-C-Cα and C-Cα-N planes
- ω = angle between the Cα-N-C and N-C-Cα planes
The ω angle is normally 0° (cis) or 180° (trans).
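Each of these angles is a dihedral defined by four consecutive backbone atoms: C(i−1), N, Cα, C for φ; N, Cα, C, N(i+1) for ψ; Cα, C, N(i+1), Cα(i+1) for ω. A minimal standard-library sketch of the dihedral computation, using the usual atan2 formulation, is given below. The function name `dihedral` is illustrative, and the sign convention (positive for a clockwise rotation when looking along the central bond) is the common IUPAC one, assumed rather than taken from the slides.

```python
import math

def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def _dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)   # central bond, the rotation axis
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Applied to the four atoms listed for ω, a planar trans peptide bond gives a value close to 180° (or −180°), and a cis bond a value close to 0°.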
17
Secondary structure local assignment
Ramachandran diagram, region labels:
- 3₁₀ = 3₁₀-helix
- α = α-helix
- β = extended β-strand
- Lα = left-handed α-helix
- polyP = extended polyproline
Ramachandran et al. (1963) J. Mol. Biol.
Ramachandran et al. (1968) Adv. Protein Chem.
18
Domain assignment to structural classes
Protein domains are stable units of protein structure that can fold autonomously. Small proteins and most medium-sized ones have just one domain. In the past, protein domains have been described in terms of structure compactness, function and evolution, or folding.
Classes:
- All alpha
- All beta
- Alpha and beta, mixed (a/b)
- Alpha and beta, segregated (a+b)
- Small: metal ligand, heme and/or disulfide bridges
- …
Figure: pyruvate kinase.
19
Folds and superfamilies
Domains belonging to the same fold have the same major secondary structures in the same arrangement with the same topological connections.
Examples: Globin-like, Long alpha-hairpin, Type I dockerin domain…
The domains within a fold are further classified into superfamilies. Domains belonging to the same superfamily have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.
Example: Globin-like and Alpha-helical ferredoxin are the two superfamilies of the Globin-like fold.
20
Folds and superfamilies
PA superfamily
21
Evolutionary processes
Do all proteins displaying identical folds share a common ancestor?
- Divergent evolution: homology
- Convergent evolution: analogy
Indications of homology:
- Above a certain level of structural similarity
- Conservation of rare structural characteristics, e.g. left-handed βαβ
- Low sequence identity, yet significant
- Key residues in the active site
- Transitivity: if A & B are homologous, and B & C also, then A & C are homologous
22
Superfolds/supersites
Superfolds are folds shared between different superfamilies. Their existence is indicative of convergent evolution.
Proteins displaying the same superfold tend to bind their ligand in the same location, called a supersite.
Examples shown: βαβ barrel, β propeller, Rossmann-like.
23
Protein structure classification resources
- CATH/Gene3D: 16 million protein domains classified into 2,626 superfamilies. http://www.cathdb.info/
- SCOP/SCOPe: 59,514 PDB entries representing 167,547 domains. http://scop.berkeley.edu/
- Superfamily-level annotations based on a collection of hidden Markov models for 2,478 completely sequenced genomes.
Reference: Csaba G, Birzele F, Zimmer R. (2009) Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis. BMC Struct Biol. 9:23
24
CATH example
25
Conclusion
- Root Mean Square Deviation (RMSD) is a very popular measure of the global drift between two protein conformations.
- Structural alignment methods can rely on secondary structure, Cα-Cα distances, fragments… to find the best match between two protein structures.
- Structure comparison and structural alignment are very complex problems that remain unresolved. Concepts regarding methods and measures are disputed.
- Structure classification can help to understand the function of proteins, and to detect divergent or convergent evolutionary processes.