Published by Jocelyn Hood. Modified over 8 years ago.
1
Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures
Rachel Kolodny, Patrice Koehl, Michael Levitt
Stanford University
2
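[Figure: motivation for structure comparison. Historical example: comparing myoglobin with hemoglobin (Perutz 1960). Today: a query structure is compared, Compare( , ), against each entry of the database (PDB, 25115 structures; e.g. human growth hormone, human prolactin, …, oxy-myoglobin) to generate the wanted ordered list of similar structures.]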
3
The Structural Alignment Problem
Comparison of structures
Two chains in $\mathbb{R}^3$: $A=(a_1,a_2,\ldots,a_n)$, $B=(b_1,b_2,\ldots,b_m)$
Find sub-chains $s_a$ and $s_b$ s.t.
–$(a_{s_a(1)},a_{s_a(2)},\ldots,a_{s_a(k)})$ and $(b_{s_b(1)},b_{s_b(2)},\ldots,b_{s_b(k)})$ are similar
–$k$ is maximal
There is a tradeoff between the two goals
4
Similarity in Structure
Two common similarity measures: cRMS and dRMS
Captures how close the corresponding pairs are in space (Euclidean similarity)
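The formulas on this slide did not survive extraction. As a sketch, the usual textbook definitions for $k$ aligned pairs $(a_i, b_i)$ are given below. The notation ($T$ for a rigid-body transformation, $d^A_{ij}=\lVert a_i-a_j\rVert$ and $d^B_{ij}=\lVert b_i-b_j\rVert$ for intra-chain distances) is an assumption here and is not taken from the slide.

```latex
\mathrm{cRMS} = \min_{T}\sqrt{\frac{1}{k}\sum_{i=1}^{k}\bigl\lVert a_i - T(b_i)\bigr\rVert^{2}}
\qquad
\mathrm{dRMS} = \sqrt{\frac{2}{k(k-1)}\sum_{1\le i<j\le k}\bigl(d^{A}_{ij}-d^{B}_{ij}\bigr)^{2}}
```

cRMS needs an optimal superposition of the two sub-chains; dRMS compares the two internal distance matrices and needs no superposition.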
5
Alignment Scores
A single number that allows comparison of two alignments
Given two alignments, judge which is better
We have: length k, cRMS
Use SI, MI [Kleywegt & Jones 1994]
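A minimal sketch of how length k and cRMS can be folded into one number. The slide does not give the formulas; the forms below (SI scaled by the shorter chain length, MI written so that lower is better, with w0 = 1.5 Å, and the SAS score that appears on later slides) are assumptions and should be checked against Kleywegt & Jones 1994 before use. The chain lengths and alignments in the example are made up.

```python
def similarity_index(crms, k, n, m):
    """SI = cRMS * min(n, m) / k. Lower is better (assumed form)."""
    return crms * min(n, m) / k


def match_index(crms, k, n, m, w0=1.5):
    """MI = 1 - (1 + k) / ((1 + cRMS / w0) * (1 + min(n, m))).

    Written so that lower is better; w0 = 1.5 is an assumed constant.
    """
    return 1.0 - (1.0 + k) / ((1.0 + crms / w0) * (1.0 + min(n, m)))


def sas(crms, k):
    """SAS = 100 * cRMS / k. Lower is better (assumed form)."""
    return 100.0 * crms / k


# Hypothetical example: chains of length 150 and 120, two candidate alignments.
n, m = 150, 120
x = {"k": 100, "crms": 2.0}  # longer alignment, slightly worse cRMS
y = {"k": 60, "crms": 1.5}   # shorter alignment, better cRMS

for name, a in (("X", x), ("Y", y)):
    print(name,
          similarity_index(a["crms"], a["k"], n, m),
          match_index(a["crms"], a["k"], n, m),
          sas(a["crms"], a["k"]))
```

Under these assumed forms, alignment X scores lower (better) than Y on all three measures, which illustrates the tradeoff between k and cRMS.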
6
Easy To See Best Alignment
[Table: rows are structure pairs (1,2), (1,3), …, (N,N-1); columns are methods A, B, C, D]
Distribution of stars shows which is best
7
‘Best-of-All’ Can Join Efforts
[Table: rows are structure pairs (1,2), (1,3), …, (N,N-1); columns are methods A, B, C, D and Best-of-All]
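One way to read the table is sketched below: for every pair, keep the alignment with the best score across methods, and count how often each method wins. This assumes a score for which lower is better; the method names A to D are the slide's placeholders and the numbers are invented for illustration.

```python
def best_of_all(scores):
    """scores: {pair: {method: score}}, lower score = better alignment.

    Returns {pair: (winning_method, winning_score)}.
    """
    best = {}
    for pair, by_method in scores.items():
        method = min(by_method, key=by_method.get)
        best[pair] = (method, by_method[method])
    return best


def win_counts(best):
    """Count how many pairs each method wins."""
    counts = {}
    for method, _score in best.values():
        counts[method] = counts.get(method, 0) + 1
    return counts


# Hypothetical scores for three pairs and four placeholder methods.
example = {
    (1, 2): {"A": 3.1, "B": 2.4, "C": 5.0, "D": 2.9},
    (1, 3): {"A": 1.8, "B": 2.2, "C": 1.9, "D": 4.0},
    (2, 3): {"A": 6.5, "B": 3.3, "C": 7.1, "D": 3.4},
}

best = best_of_all(example)
print(best)
print(win_counts(best))
```

By construction the Best-of-All column is at least as good as every single method on every pair.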
8
Large Scale Comparison
2930 CATH domains
–769 fold classes
–Sequence diverse
All against all: 8,581,970 alignments
Over 800 CPU days on 2.8GHz processors
CATH hierarchy: class, architecture, topology (www.biochem.ucl.ac.uk/bsm/cath)
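A quick arithmetic check of the numbers above: the alignment count equals the number of ordered pairs of distinct domains. The per-pair time is only a rough lower bound, since the slide says "over" 800 CPU days and does not say how that time is split across methods.

```python
n_domains = 2930

# Each domain against every other domain, in both directions.
ordered_pairs = n_domains * (n_domains - 1)
print(ordered_pairs)

# Rough lower bound on average CPU seconds per ordered pair.
cpu_seconds = 800 * 24 * 3600
avg_seconds_per_pair = cpu_seconds / ordered_pairs
print(round(avg_seconds_per_pair, 2))
```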
9
Methods Compared
SSAP: Taylor & Orengo, 1989
STRUCTAL: Subbiah, Laurents & Levitt, 1993; Gerstein & Levitt, 1998
DALI: Holm & Sander, 1993; Holm & Park, 2000
DEJAVU/LSQMAN: Kleywegt, 1996
CE: Shindyalov & Bourne, 1998
SSM: Krissinel & Henrick, 2003
Best-of-All: Best of above methods
10
Previous Comparisons
Sierk & Pearson [2004]
–ROC curves using CATH
Novotny et al. [2004]
–Checked a few dozen cases
–Use CATH as gold standard
Leplae & Hubbard [2002]
–ROC curves using SCOP
11
Comparison Using ROC Curves
1 Gold standard: each pair is labeled 1 (positive) or 0 (negative)
2 Sort by similarity (Score/SAS), e.g. scores 0.2, 1.2, 5.3, …, 9.4 with labels 1, 0, 1, 0
3 Draw ROC curves: True Positives % (sensitivity) against False Positives % (100 – specificity)
[Figure: ROC plot with reference curves for a random and a perfect measure]
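The three steps can be sketched as follows, using the four scores and labels shown on the slide. The sketch assumes that a lower score means more similar (as with SAS) and does not treat tied scores specially; the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate).

    scores: lower = more similar; labels: 1 = positive, 0 = negative.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:  # sweep the threshold from most to least similar
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.2, 1.2, 5.3, 9.4]
labels = [1, 0, 1, 0]
pts = roc_points(scores, labels)
print(pts)
print(auc(pts))
```

A random measure gives an area near 0.5 and a perfect measure gives 1.0.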
12
ROC Curve Issues
Uses only internal ordering
–Estimation of similarity can be very wrong
Converts a classification gold standard into binary truth
Example (native scores or SAS): 0.2, 1.2, 5.3, …, 9.4 and 200, 1200, 5300, …, 9400 have the same ordering
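A small sketch of the "internal ordering" point: the two score lists on this slide differ by a factor of 1000 but rank the pairs identically, so any ROC-based summary is the same for both. The labels 1, 0, 1, 0 are borrowed from the preceding ROC slide as an assumption, and the area is computed by pair counting (lower score = more similar).

```python
def auc_by_pairs(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly.

    A pair is ranked correctly when the positive has the lower score;
    ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p < q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0]              # assumed, taken from the ROC slide
small = [0.2, 1.2, 5.3, 9.4]
large = [200, 1200, 5300, 9400]    # same ordering, very different values

auc_small = auc_by_pairs(small, labels)
auc_large = auc_by_pairs(large, labels)
print(auc_small, auc_large)
```

The two areas are identical, so the ROC curve cannot tell a method that estimates similarity well from one whose values are far off but ordered the same way.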
13
SAS & Native ROC Curves
14
Comparing SAS Values Directly
[Figure: SAS values per method, including Best-of-All]
15
Cross Fold Similarities
[Figure: GSAS and SAS panels; labels: CAT, CA]
16
GSAS & SAS Distributions
[Figure: Best-of-All GSAS and SAS distributions for same-CAT pairs and for all pairs; axis: percent]
17
More Tests
Cases that are hard for all methods but one
–STRUCTAL and LSQMAN do well
–Relies on comparing alignments directly
Time
–SSM does best, then LSQMAN
–To be fast: give up quickly in hard cases
18
Summary
A new methodology for comparing structural alignment methods
Allows defining a ‘Best-of-All’ method
Best-of-All data can now be used
–Maybe to improve database-wide comparisons to new structures?
19
Thank You