Lecture 11, CS566: Structural Bioinformatics – Structure Comparison (Motivation, Concepts, Structure Comparison)


1 Lecture 11, CS566: Structural Bioinformatics – Structure Comparison
Motivation
Concepts
Structure Comparison

2 Motivation
Given structures A and B, are they similar?
– Implication: A and B might share the same set of functions
Given structure A, is a similar structure already known?
– Implication: each new experimentally solved structure can be placed in the context of the existing body of structural knowledge

3 Concepts
For n sequences and s corresponding structural equivalence classes, n >> s. Possible reasons:
– Structural divergence is slower than sequence divergence in evolution (à la RNA sequence alignment)
– Convergent evolution: some structures are preferred for a functional reason
– Coincidence: only so many structures are possible for a given threshold of similarity
Terms used to describe structure:
– Architecture, Class, Fold, Superfamily, Family, ...

4 Superposition versus Alignment
Structural superposition versus structural alignment
– Superposition
  Residue correspondence is already known, based on a statistically significant sequence alignment
  The problem is to find the optimal transformation (rotation and translation) that maps one set of points onto the other, given a subset of equivalent points between the two sets
  Optimality is measured by the lowest value of the root mean square deviation (RMSD)
– Alignment
  Residue correspondence is unknown
  Needs a structure-based scoring function, evaluated over all possible superpositions of the structures
  The optimal solution is frequently impractical because of high complexity (NP-hard; why?)
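The superposition case above can be sketched in a few lines. The slide does not name a method; the block below uses the Kabsch algorithm (an SVD of the covariance matrix) as one common choice, and assumes NumPy is available. The function name kabsch_rmsd and the convention that row i of P is equivalent to row i of Q are assumptions made for this illustration.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Superpose P onto Q (both N x 3, row i of P paired with row i of Q).

    Returns (rmsd, R, t) such that R @ p + t approximates the paired q.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both be N x 3 with the same N")
    # Remove the translation by centring both point sets on their centroids.
    p_cen, q_cen = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_cen, Q - q_cen
    # 3 x 3 covariance matrix between the centred sets, and its SVD.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_cen - R @ p_cen
    # RMSD over the paired points after applying the fitted transformation.
    P_fit = P @ R.T + t
    rmsd = float(np.sqrt(np.mean(np.sum((P_fit - Q) ** 2, axis=1))))
    return rmsd, R, t
```

In practice P and Q would hold the coordinates of the residues paired by the sequence alignment. The alignment case is harder precisely because that pairing is not given and has to be searched for.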

5 Heuristic Structure Alignment
General strategy
– Summarized/reduced representation of each structure
  Consider only subsets of atoms (just Cα or Cβ)
  Use summarized vectors to represent organized substructures
– Approaches
  Dynamic programming with empirical scoring functions
  Distance-matrix correspondence in internal coordinate space
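As an illustration of the reduced representation, the sketch below keeps only the Cα atom of each residue from PDB-format text. The function name ca_trace is hypothetical. The parser relies on the fixed column layout of PDB ATOM records (atom name in columns 13-16, x/y/z in columns 31-54) and, as a simplification, ignores alternate locations, chains and multiple models.

```python
def ca_trace(pdb_text):
    """Return [(x, y, z), ...] for every C-alpha ATOM record, in file order."""
    coords = []
    for line in pdb_text.splitlines():
        # Only ATOM records; HETATM is skipped so that calcium ions
        # (also named "CA") are not mistaken for C-alpha atoms.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords
```

The resulting list of points is the kind of input that the superposition, vector and distance-matrix methods on the following slides operate on.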

6 Heuristic Structure Alignment
Vector-based strategies
– VAST (Vector Alignment Search Tool): compare summarized vector representations
– SSAP (Sequential Structure Alignment Program): compare nearest-neighbor vectors by double dynamic programming
Distance matrix comparisons (à la dot matrix)
– DALI (Distance-matrix ALIgnment): compare subsets of internal coordinate space

7 VAST alignment (Fig. 10.13)
Use only a subset of atom coordinates
Replace atom coordinates with vector coordinates corresponding to secondary structure elements (“structural words”)
Compare sets of vectors to assess similarity
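The idea of replacing atoms with one vector per secondary structure element can be sketched as follows. This is not the VAST implementation: the end-to-end vector, the angle comparison and the 20-degree tolerance are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def sse_vector(ca_coords):
    """Crude axis of one secondary structure element: first C-alpha to last."""
    (x0, y0, z0), (x1, y1, z1) = ca_coords[0], ca_coords[-1]
    return (x1 - x0, y1 - y0, z1 - z0)

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to protect acos from rounding error.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

def compatible(pair_a, pair_b, tol_deg=20.0):
    """True if two pairs of element vectors enclose similar angles."""
    return abs(angle_deg(*pair_a) - angle_deg(*pair_b)) <= tol_deg
```

Comparing a handful of vectors per structure instead of hundreds of atoms is what makes this kind of search fast. A realistic scheme would also compare the distances between elements and their order along the chain.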

8 Double dynamic programming (SSAP/CATH, Fig. 10.14)
“First level”:
– Represent each residue by a neighborhood vector for Cα
– Compare n versus m neighborhood vectors
– Generate an optimal alignment based on vector differences and dynamic programming
“Second level”:
– Add matrix scores to a cumulative matrix where paths cross
– Generate the optimal alignment based on the cumulative matrix
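The two levels can be sketched on small inputs as below. This is a simplified illustration, not SSAP itself: it compares inter-residue distances rather than vectors in a local residue frame, uses a score of the form num / (difference + den) with arbitrary constants, applies no gap penalty by default, and keeps every first-level path instead of only high-scoring ones. The names dp_align and double_dp are hypothetical.

```python
import math

def dp_align(score, gap=0.0):
    """Global alignment over a score matrix; returns (total, [(row, col), ...])."""
    n = len(score)
    m = len(score[0]) if n else 0
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] - gap
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],
                             best[i - 1][j] - gap,
                             best[i][j - 1] - gap)
    # Trace back from the corner, recording the matched (row, col) cells.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    return best[n][m], path[::-1]

def double_dp(a, b, num=50.0, den=2.0):
    """Align two C-alpha traces (lists of 3D points) by double dynamic programming."""
    n, m = len(a), len(b)
    summary = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # First level: compare the structure as "seen" from residue i of A
            # with the structure as seen from residue j of B.
            low = [[num / (abs(math.dist(a[i], a[k]) - math.dist(b[j], b[l])) + den)
                    for l in range(m)] for k in range(n)]
            # Force the path through (i, j): align what precedes and what follows.
            _, before = dp_align([row[:j] for row in low[:i]])
            _, after = dp_align([row[j + 1:] for row in low[i + 1:]])
            path = before + [(i, j)] + [(k + i + 1, l + j + 1) for k, l in after]
            # Accumulate the scores along this path into the cumulative matrix.
            for k, l in path:
                summary[k][l] += low[k][l]
    # Second level: one final alignment over the cumulative matrix.
    return dp_align(summary)
```

Cells that lie on many first-level paths collect a high cumulative score, so the second-level alignment favours residue pairs that look equivalent from many viewpoints. The cost grows roughly as n²m², which is one reason the real method restricts which residue pairs are compared.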

9 Distance matrix based alignment (DALI, Fig. 10.15)
Generate a dot matrix of inter-Cα distances, using a threshold
Pick out secondary structure elements based on matrix patterns
Compare the two matrices to generate a structural alignment
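A minimal sketch of the first and last steps follows: building a thresholded inter-Cα distance matrix, and comparing two distance matrices for a proposed set of equivalent residues. The 8 Å cutoff and the mean absolute difference are assumptions made for illustration; DALI's actual scoring function is more elaborate, and the function names here are hypothetical.

```python
import math

def distance_matrix(coords):
    """All-against-all distances between points (for example, C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def contact_map(coords, cutoff=8.0):
    """Dot matrix: 1 where two residues lie within the cutoff, else 0."""
    return [[1 if d <= cutoff else 0 for d in row]
            for row in distance_matrix(coords)]

def mean_distance_difference(dist_a, dist_b, pairs):
    """Average |dA - dB| over all pairs of the proposed equivalent residues.

    pairs is a list of (index in A, index in B) tuples.
    """
    total, count = 0.0, 0
    for idx, (ia, ib) in enumerate(pairs):
        for ja, jb in pairs[idx + 1:]:
            total += abs(dist_a[ia][ja] - dist_b[ib][jb])
            count += 1
    return total / count if count else 0.0
```

Because the matrices hold only internal distances, they are unchanged by rotation and translation, so two structures can be compared without superposing them first.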

10 Structure Comparison Databases
Several databases (CATH, MMDB, FSSP) maintain a hierarchical classification of known structures, based on pairwise structural alignment scores
The high complexity of the algorithms requires incremental additions
The actual classification is algorithm-dependent: there is some consensus, but significant differences exist

11 Summary
Sequence similarity (> 50% identity) implies structural similarity. The converse is not necessarily true (evolutionary convergence/information convergence)
Structural similarity algorithms are heuristic ways to assess structural similarity, independent of sequence similarity
Structural variation is smaller than that suggested by the number of possible sequences
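To make the 50% identity threshold concrete, the sketch below computes percent identity from two rows of an existing alignment. The choice of denominator (columns where neither sequence has a gap) is an assumption; other conventions exist and give different percentages. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)
```

A pair scoring above 50 under this measure would be expected to share a structure. A pair scoring well below it may still do so, which is where the structure-based methods above come in.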

