Published by Jessie French. Modified over 8 years ago.
1
k-Nearest-Neighbors Problem
2
cRMSD
cRMSD(c, c') is the minimized RMSD between the two sets of atom centers:
$$\mathrm{cRMSD}(c, c') = \min_T \left[ \frac{1}{n} \sum_{i=1}^{n} \left\| a_i(c) - T(a_i(c')) \right\|^2 \right]^{1/2}$$
where the minimization is over all possible rigid-body transforms T.
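The minimization over T has a closed-form solution, so it does not need a search. The sketch below is one common way to compute it (the Kabsch algorithm with NumPy); the slides do not say which method was used, and the function name `crmsd` and the (n, 3) array layout are assumptions made for illustration.

```python
import numpy as np

def crmsd(a, b):
    """Minimized RMSD between two (n, 3) arrays of atom centers.

    Hypothetical helper: uses the Kabsch algorithm, which is one
    standard way to solve the minimization over rigid-body transforms.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (n, 3)")
    n = a.shape[0]

    # The optimal translation superimposes the two centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)

    # SVD of the 3x3 covariance matrix gives the optimal rotation.
    h = b0.T @ a0
    u, s, vt = np.linalg.svd(h)

    # Flip the smallest singular value if the best orthogonal map
    # would be a reflection rather than a proper rotation.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0

    # Residual after optimal superposition, from the singular values.
    msd = (np.sum(a0 ** 2) + np.sum(b0 ** 2)
           - 2.0 * (s[0] + s[1] + d * s[2])) / n
    return float(np.sqrt(max(msd, 0.0)))
```

Two copies of the same structure that differ only by a rotation and a translation should give a value close to zero, up to floating-point error.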
3
k-Nearest-Neighbors Complexity
$O(N^2 (\log k + L))$
– N: number of protein conformations to be compared
– k: number of nearest neighbors
– L: time to compare two conformations (cRMSD takes time linear in the number of atom centers)
Solution: reduce L by reducing the number of centers to compare -> m-averaging
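The bound can be read off a brute-force search: every pair of conformations is compared once (cost L), and each comparison updates a size-k heap (cost log k). The sketch below illustrates this under the assumption that a plain all-pairs search is meant; the function name `k_nearest_neighbors` and the `distance` callback are hypothetical.

```python
import heapq

def k_nearest_neighbors(items, k, distance):
    """For each item, return the indices of its k nearest items, closest first.

    Brute force over all pairs: O(N^2) distance calls, each followed by
    O(log k) heap updates.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(items)
    # One bounded heap per item; distances are negated so that the
    # current worst neighbor sits at the top of Python's min-heap.
    heaps = [[] for _ in range(n)]

    def offer(i, j, d):
        heap = heaps[i]
        if len(heap) < k:
            heapq.heappush(heap, (-d, j))
        elif -d > heap[0][0]:
            # j is closer than the current worst neighbor of i.
            heapq.heapreplace(heap, (-d, j))

    for i in range(n):
        for j in range(i + 1, n):
            d = distance(items[i], items[j])  # cost L
            offer(i, j, d)                    # cost log k
            offer(j, i, d)

    return [[j for _, j in sorted(heap, reverse=True)] for heap in heaps]
```

Any distance function can be plugged in, for example a cRMSD routine on full or m-averaged structures.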
4
m-Averaged Approximation
Cut the backbone into fragments of m Cα atoms
Replace each fragment by the centroid of its Cα atoms
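A minimal sketch of the averaging step, assuming the backbone is given as an ordered list of (x, y, z) atom positions. The function name `m_average` is hypothetical, and keeping a shorter trailing fragment when the chain length is not a multiple of m is an assumption; the slides do not say how the remainder is handled.

```python
def m_average(coords, m):
    """Replace each run of m consecutive atoms by its centroid.

    coords: ordered list of (x, y, z) tuples along the backbone.
    Assumption: a shorter final fragment is averaged on its own.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    averaged = []
    for start in range(0, len(coords), m):
        fragment = coords[start:start + m]
        # Average each coordinate axis over the fragment.
        averaged.append(tuple(sum(axis) / len(fragment) for axis in zip(*fragment)))
    return averaged
```

With n atoms this leaves about n/m centers, so a linear-time comparison such as cRMSD becomes roughly m times cheaper.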
5
Evaluation: Test Sets [Lotan and Schwarzer, 2003]
FOLDTRAJ: random, partially unfolded structures -> good correlation with small m (few long segments)
Park-Levitt set [Park et al, 1997]: compact, native-like structures -> good correlation with large m (many short segments)
Use smaller m on unfolded proteins for greater time savings
6
Flexible m-averaging
ProteinA, 47 residues
14 < r_gyr < 24
6 < m < 12
[Figure: axis labeled r_gyr]
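A sketch of how m might be chosen from the radius of gyration. The radius of gyration is computed here as the unweighted root-mean-square distance of the atoms from their centroid. The slides give only the two ranges, not the rule that maps r_gyr to m, so the linear interpolation rounded up in `choose_m` is purely an assumption, as are both function names and the default bounds taken from the ranges above.

```python
import math

def radius_of_gyration(coords):
    """Unweighted RMS distance of the atoms from their centroid."""
    n = len(coords)
    center = [sum(axis) / n for axis in zip(*coords)]
    total = sum(
        sum((x - c) ** 2 for x, c in zip(point, center)) for point in coords
    )
    return math.sqrt(total / n)

def choose_m(r_gyr, r_min=14.0, r_max=24.0, m_min=6, m_max=12):
    """Map r_gyr to m: compact structures get m_max, extended ones m_min.

    Assumed rule (not stated in the slides): clamp r_gyr to
    [r_min, r_max], interpolate linearly, and round up.
    """
    r = min(max(r_gyr, r_min), r_max)
    m = m_max - (r - r_min) * (m_max - m_min) / (r_max - r_min)
    return math.ceil(m)
```

For example, `choose_m(radius_of_gyration(coords))` would pick one m per conformation under this assumed rule.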
7
Results
r_gyr   m      k=100, %correct   k=50, %correct   k=10, %correct
14      12     71                80               90
16.5    >=11   94                90               100
18.5    >=10   64                                 100
20.5    >=9    61                54               90
24      >=6    68                78               100
Overhead for calculating the m-averaged structures and the radius of gyration is too high:
Without averaging: 28 sec; for all constant m's: 1 min
With flexible averaging: 2 mins 20 sec
Easily fixed by precalculating r_gyr and the structures
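The precalculation could look like the sketch below: compute each conformation's radius of gyration and its m-averaged structures once, before any neighbor search, so that no averaging happens inside the pairwise comparison loop. All names here are hypothetical, and storing one averaged copy per candidate m is an assumption about how the lookup would be organized.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def radius_of_gyration(points):
    """Unweighted RMS distance of the points from their centroid."""
    c = centroid(points)
    total = sum(sum((x - y) ** 2 for x, y in zip(p, c)) for p in points)
    return math.sqrt(total / len(points))

def m_average(points, m):
    """Centroid of each run of m consecutive points (last run may be shorter)."""
    return [centroid(points[i:i + m]) for i in range(0, len(points), m)]

def precompute(conformations, m_values):
    """Build lookup tables once, before any pairwise comparison.

    Returns (r_gyr, averaged) where r_gyr[i] is the radius of gyration
    of conformation i and averaged[m][i] is its m-averaged structure.
    """
    r_gyr = [radius_of_gyration(c) for c in conformations]
    averaged = {m: [m_average(c, m) for c in conformations] for m in m_values}
    return r_gyr, averaged
```

The tables cost one pass over the conformations per value of m, after which each comparison only reads from them.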
8
Uses
[Figure with labels U and F]
9
Conclusions
Flexible m-averaging can save time (without sacrificing accuracy?)
Useful for quickly finding k nearest neighbors and building roadmaps
Precalculate m-averaged structures and the radius of gyration for greater speedup