Structural Bioinformatics. Elodie Laine, Master BIM-BMC, Semester 3, 2013-2014. Genomics of Microorganisms, UMR 7238, CNRS-UPMC.


1 Structural Bioinformatics. Elodie Laine, Master BIM-BMC, Semester 3, 2013-2014. Genomics of Microorganisms, UMR 7238, CNRS-UPMC. e-documents: http://www.lgm.upmc.fr/laine/STRUCT e-mail: elodie.laine@upmc.fr

2 Elodie Laine – 18.12.2013

3 General principles. Comparing structures: structural similarity, structural alignment, segment shapes, secondary structure elements. Classifying structures: evolutionary relationships, functional motifs.

4 Measuring structure similarity. Root Mean Square Deviation (RMSD) after superimposition: $\mathrm{RMSD}(a,b) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(x_{a,i}-x_{b,i})^2+(y_{a,i}-y_{b,i})^2+(z_{a,i}-z_{b,i})^2\right]}$. It expresses the minimal root-mean-square distance between the n corresponding atoms of the superimposed structures a and b, where (x, y, z) are the atomic Cartesian coordinates. The RMSD can be computed on a selection of atoms (backbone, heavy atoms…). The RMSD computation requires that exactly n atoms from structure a correspond to n atoms from structure b.
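As an illustration of the RMSD definition, here is a minimal Python sketch. It assumes the two structures have already been superimposed (for instance with an optimal rotation and translation computed elsewhere) and that atom i of one list corresponds to atom i of the other. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that atom i of
    coords_a corresponds to atom i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need the same non-zero number of atoms in both structures")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the two corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example (made-up coordinates): structure b is a shifted by 1 Å along x.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(a, b))
```

Without a prior superposition the value is only an upper bound on the minimal RMSD, which is why the correspondence and the superposition both have to be settled first.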

5 Measuring structure similarity. How can we establish the correspondence between the two structures?

6 Structural alignment: an example. 1/ Identification of segments with similar shapes. 2/ Combination of segment pairs. 3/ Extension of the alignment in 3D.

7 Structural alignment: an example. 1/ Identification of segments with similar shapes. Each protein is subdivided into: segments of n residues that can overlap; secondary structure elements; parts of secondary structure elements. Similarity between segments is estimated by using: a filter on the end-to-end distance; a filter on the distance from the N-terminus.
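The segment identification step can be sketched as follows. This is only an illustration under stated assumptions: proteins are given as lists of Cα coordinates, segments are overlapping windows of n residues, and only the end-to-end distance filter is shown. The function names, the window length `n` and the tolerance `tol` are hypothetical choices, not values from the lecture.

```python
import math

def segments(ca_coords, n):
    """Return (start_index, coords) for every overlapping window of n residues."""
    return [(i, ca_coords[i:i + n]) for i in range(len(ca_coords) - n + 1)]

def similar_segment_pairs(ca_a, ca_b, n=8, tol=1.0):
    """Pairs of segment start indices (i in A, j in B) whose end-to-end
    distances differ by at most tol (in Å)."""
    pairs = []
    for i, seg_a in segments(ca_a, n):
        d_a = math.dist(seg_a[0], seg_a[-1])      # end-to-end distance in A
        for j, seg_b in segments(ca_b, n):
            d_b = math.dist(seg_b[0], seg_b[-1])  # end-to-end distance in B
            if abs(d_a - d_b) <= tol:
                pairs.append((i, j))
    return pairs

# Toy chains (made-up coordinates): A is straight, B is straight then bent.
ca_a = [(3.8 * k, 0.0, 0.0) for k in range(5)]
ca_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
print(similar_segment_pairs(ca_a, ca_b, n=3, tol=1.0))
```

Such a cheap filter only discards obviously dissimilar segment pairs; the surviving pairs still have to be compared in more detail.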

8 Structural alignment: an example. 2/ Combination of segment pairs. Final optimal alignment.

9 Structural alignment: an example. 3/ Extension of the alignment in 3D. Extend the current alignment by one residue in the forward or backward direction. Compute the RMSDs after superposition. The extension with the smallest RMSD is retained.
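One extension step might look like the sketch below. To stay within the Python standard library, the RMSD after superposition is replaced here by the distance RMSD (dRMSD), which compares intra-molecular distances and needs no superposition. This is a substitution made for the example, not the measure named on the slide. The names `drmsd` and `extend_once` are hypothetical, and the alignment is assumed to be a gap-free list of residue index pairs.

```python
import math

def drmsd(ca_a, ca_b, pairs):
    """Distance RMSD over aligned residue pairs [(i, j), ...]."""
    total, count = 0.0, 0
    for k in range(len(pairs)):
        for m in range(k + 1, len(pairs)):
            (i1, j1), (i2, j2) = pairs[k], pairs[m]
            diff = math.dist(ca_a[i1], ca_a[i2]) - math.dist(ca_b[j1], ca_b[j2])
            total += diff * diff
            count += 1
    return math.sqrt(total / count) if count else 0.0

def extend_once(ca_a, ca_b, pairs):
    """Try a one-residue extension backward and forward; keep the one
    with the smallest score. Returns pairs unchanged if neither is possible."""
    (i0, j0), (i1, j1) = pairs[0], pairs[-1]
    candidates = []
    if i0 > 0 and j0 > 0:                               # backward extension
        candidates.append([(i0 - 1, j0 - 1)] + pairs)
    if i1 + 1 < len(ca_a) and j1 + 1 < len(ca_b):       # forward extension
        candidates.append(pairs + [(i1 + 1, j1 + 1)])
    if not candidates:
        return pairs
    return min(candidates, key=lambda c: drmsd(ca_a, ca_b, c))

# Toy chains (made-up coordinates): B equals A except for a bent last residue.
ca_a = [(3.8 * k, 0.0, 0.0) for k in range(5)]
ca_b = ca_a[:4] + [(11.4, 3.8, 0.0)]
start = [(1, 1), (2, 2), (3, 3)]
print(extend_once(ca_a, ca_b, start))
```

Repeating this step until the score exceeds a chosen cutoff gives a greedy extension; a greedy procedure of this kind is not guaranteed to find the globally best alignment.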

10 Distance-matrix ALIgnment (DALI). DALI is a structural alignment method based on intra-molecular distances. Background idea: represent each protein as a 2D matrix storing all Cα-Cα distances; slide one matrix onto the other to find the common sub-matrix with the best match. Implementation (greedy): break each matrix into elementary contact patterns; pair up similar contact patterns (one from each protein); assemble pairs in the correct order to yield the overall alignment.
Distance matrix for Protein A (residues 1 to 4):
      1     2     3     4
1     0     d12   d13   d14
2     d12   0     d23   d24
3     d13   d23   0     d34
4     d14   d24   d34   0
Figure: Protein A and Protein B, and their distance matrix pair.
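The distance-matrix representation itself is straightforward to build. The sketch below assumes a protein is given as a list of Cα coordinates; the function name `distance_matrix` and the toy coordinates are hypothetical.

```python
import math

def distance_matrix(ca_coords):
    """Symmetric matrix of all pairwise C-alpha distances (zero diagonal)."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]

# Toy example with three made-up C-alpha positions.
ca = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
for row in distance_matrix(ca):
    print(["%.1f" % d for d in row])
```

Because the matrix depends only on internal distances, it is unchanged by any rotation or translation of the protein, which is what allows two structures to be compared without superimposing them first.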

11 Distance-matrix ALIgnment (DALI). Similarity measure: combines a similarity threshold, the deviation from the arithmetic mean, and an envelope function. Assembly of alignment: a non-trivial combinatorial problem. The new alignment has one overlapping segment with the previous one: (S1S2) – (S1’S2’), (S2S3) – (S2’S3’), … Available alignment methods: Monte Carlo optimization, branch-and-bound, neighbor walk. Figure: myoglobin distance matrix.
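A similarity measure built from these three ingredients can be sketched as below. The functional form and the constants (threshold 0.20, Gaussian envelope with a 20 Å decay length) follow our reading of the elastic score in Holm & Sander (1993); they should be treated as assumptions and checked against the paper. The function name `elastic_score` is hypothetical, and the two distance matrices are assumed to list residues in aligned order.

```python
import math

THETA = 0.20   # similarity threshold (assumed value, after Holm & Sander 1993)
ALPHA = 20.0   # decay length of the envelope function in Å (assumed value)

def elastic_score(dist_a, dist_b):
    """Sum of pairwise similarity terms over two equally sized distance
    matrices whose rows and columns are in aligned order."""
    n = len(dist_a)
    score = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                score += THETA                   # diagonal term
                continue
            mean = 0.5 * (dist_a[i][j] + dist_b[i][j])
            # Relative deviation from the arithmetic mean of the two distances.
            deviation = abs(dist_a[i][j] - dist_b[i][j]) / mean
            # Envelope function: down-weights pairs that are far apart.
            envelope = math.exp(-(mean / ALPHA) ** 2)
            score += (THETA - deviation) * envelope
    return score

# Toy 2-residue matrices (made-up distances).
same = [[0.0, 10.0], [10.0, 0.0]]
stretched = [[0.0, 14.0], [14.0, 0.0]]
print(elastic_score(same, same), elastic_score(same, stretched))
```

Pairs whose relative deviation exceeds the threshold contribute negatively, so extending an alignment with poorly matching residues lowers the score.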

12 Distance-matrix ALIgnment (DALI). 3D (spatial), 2D (distance matrix), 1D (sequence). Holm & Sander (1993) J. Mol. Biol.

13 Structural alignment tools

14 Protein structural classification. What is the motivation for classifying protein structures? To better understand the biological functions of proteins, and to determine the evolutionary relationships between proteins. Structures tend to diverge less than sequences. Proteins displaying a certain degree of sequence similarity adopt similar shapes. Generally, above 40% sequence identity, the structures are very much alike. Decarboxylases with 21% sequence identity: convergent or divergent evolution?

15 Protein structural classification. Levels, in order of increasing similarity:
secondary structure: α-helix, β-sheet, loop…
domain: protein structural unit
class: secondary structure content
fold/topology: global shape
superfamily: homology & similar function
Local/global.

16 Secondary structure local assignment. Backbone torsion angles: φ = angle between the C-Cα-N and Cα-N-C planes; ψ = angle between the N-C-Cα and C-Cα-N planes; ω = angle between the Cα-N-C and N-C-Cα planes. The ω angle is normally 0° (cis) or 180° (trans).
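A torsion angle is the signed angle between two planes that share a bond, and it can be computed from four atom positions. The sketch below uses the usual atan2 formulation; the helper names are hypothetical. With the standard atom ordering, φ uses C(i-1), N(i), Cα(i), C(i); ψ uses N(i), Cα(i), C(i), N(i+1); ω uses Cα(i), C(i), N(i+1), Cα(i+1).

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(p0, p1, p2, p3):
    """Signed angle in degrees between planes (p0, p1, p2) and (p1, p2, p3)."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b0, b1)            # normal of the first plane
    n2 = cross(b1, b2)            # normal of the second plane
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Toy geometry (made-up points): the central bond lies along z.
p1, p2 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p0 = (1.0, 0.0, 0.0)
print(dihedral(p0, p1, p2, (1.0, 0.0, 1.0)))   # cis arrangement
print(dihedral(p0, p1, p2, (-1.0, 0.0, 1.0)))  # trans arrangement
```

Using atan2 rather than an arccosine keeps the sign of the angle, which is needed to distinguish, for example, right-handed from left-handed helical regions.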

17 Secondary structure local assignment. Ramachandran diagram: 3₁₀ = 3₁₀-helix; α = α-helix; β = extended β-strand; Lα = left-handed α-helix; polyP = extended polyproline. Ramachandran et al. (1963) J. Mol. Biol.; Ramachandran et al. (1968) Adv. Protein Chem.

18 Domain assignment to structural classes. Protein domains are stable units of protein structure that can fold autonomously. Small proteins and most medium-sized ones have just one domain. In the past, protein domains have been described in terms of structure compactness, function and evolution, or folding. Classes: (1) all alpha; (2) all beta; (3) alpha and beta, mixed (a/b); (4) alpha and beta, segregated (a+b); (5) small: metal ligand, heme and/or disulfide bridges … Figure: pyruvate kinase, with domains labelled (2) and (3).

19 Folds and superfamilies. Domains belonging to the same fold have the same major secondary structures in the same arrangement with the same topological connections. Examples: Globin-like, Long alpha-hairpin, Type I dockerin domain… The domains within a fold are further classified into superfamilies. Domains belonging to the same superfamily have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology. Example: Globin-like and Alpha-helical ferredoxin are the two superfamilies of the Globin-like fold.

20 Folds and superfamilies. PA superfamily.

21 Evolutionary processes. Do all proteins displaying identical folds share a common ancestor? Divergent evolution: homology. Convergent evolution: analogy. Indications of homology: structural similarity above a certain level; conservation of rare structural characteristics, e.g. left-handed βαβ; low, yet significant, sequence identity; key residues in the active site; transitivity: if A & B are homologous, and B & C also, then A & C are homologous.
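The transitivity rule means that pairwise homology statements can be merged into groups. A minimal sketch with a union-find structure follows; the function name `homology_groups` and the protein labels are hypothetical, and the input pairs are assumed to be established by other evidence.

```python
def homology_groups(proteins, homologous_pairs):
    """Group proteins so that homology is transitive: if A~B and B~C,
    then A, B and C end up in the same group."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the group representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in homologous_pairs:
        parent[find(a)] = find(b)   # merge the two groups

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())

# Toy example: A~B and B~C are given, so A~C follows; D stays alone.
print(homology_groups(["A", "B", "C", "D"], [("A", "B"), ("B", "C")]))
```

Transitivity holds for homology (common ancestry) but not for mere similarity, so the input pairs should be genuine homology assignments rather than raw similarity hits.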

22 Superfolds/supersites. Proteins displaying the same superfold tend to bind their ligand in the same location, called a supersite. Superfolds are folds shared between different superfamilies. Their existence is indicative of convergent evolution. Examples: βαβ barrel, β propeller, Rossmann-like.

23 Protein structure classification resources. CATH/Gene3D: 16 million protein domains classified into 2,626 superfamilies. http://www.cathdb.info/ SCOP/SCOPe: 59,514 PDB entries representing 167,547 domains. http://scop.berkeley.edu/ Superfamily-level annotations based on a collection of hidden Markov models, for 2,478 completely sequenced genomes. Reference: Csaba G, Birzele F, Zimmer R. (2009) Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis. BMC Struct Biol. 9:23.

24 CATH example

25 Conclusion. Root Mean Square Deviation (RMSD) is a very popular measure of the global drift between two protein conformations. Structural alignment methods can rely on secondary structure, Cα-Cα distances, fragments… to find the best match between two protein structures. Structure comparison and structural alignment are very complex problems that remain unresolved; concepts regarding methods and measures are disputed. Structure classification can help to understand the function of proteins, and to detect divergent or convergent evolutionary processes.

