k-Nearest-Neighbors Problem

Presentation transcript:

k-Nearest-Neighbors Problem

cRMSD
• cRMSD(c, c') is the minimized RMSD between the two sets of atom centers:
$$\mathrm{cRMSD}(c, c') = \min_{T} \left[ \frac{1}{n} \sum_{i=1}^{n} \left\| a_i(c) - T\big(a_i(c')\big) \right\|^2 \right]^{1/2}$$
where the minimization is over all possible rigid-body transforms T.
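The minimization over rigid-body transforms has a closed-form solution. The sketch below is one common way to compute it (the Kabsch method via an SVD), not necessarily what the presenters used. It assumes NumPy is available, that both conformations are given as (n, 3) arrays with matching atom order, and the function name `crmsd` is made up for illustration.

```python
import numpy as np

def crmsd(a, b):
    """Minimized RMSD between two (n, 3) arrays of atom centers (Kabsch method)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # The optimal translation superimposes the two centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # The optimal rotation follows from the SVD of the 3x3 covariance matrix.
    u, s, vt = np.linalg.svd(b0.T @ a0)
    # Exclude reflections: negate the smallest singular value if needed.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    # Minimized sum of squared distances = |a0|^2 + |b0|^2 - 2 * sum(s).
    msd = ((a0 ** 2).sum() + (b0 ** 2).sum() - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))
```

The cost is linear in the number of atom centers n, since the SVD is always taken on a 3x3 matrix.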

k-Nearest-Neighbors Complexity
• O(N^2 (log k + L))
  – N: number of protein conformations to be compared
  – k: number of nearest neighbors
  – L: time to compare two conformations (cRMSD takes time linear in the number of atom centers)
• Solution: reduce L by reducing the number of centers to compare -> m-averaging
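A brute-force search makes the O(N^2 (log k + L)) bound concrete: every pair is compared once per query (cost L), and each candidate is pushed through a size-k heap (cost log k). This is a minimal sketch; `k_nearest_neighbors` and its `distance` parameter are illustrative names, and `distance` can be any callable such as a cRMSD implementation.

```python
import heapq

def k_nearest_neighbors(conformations, k, distance):
    """Return, for each conformation, the indices of its k nearest neighbors."""
    n = len(conformations)
    neighbors = []
    for i in range(n):
        # One distance evaluation (cost L) per candidate j.
        candidates = (
            (distance(conformations[i], conformations[j]), j)
            for j in range(n)
            if j != i
        )
        # nsmallest keeps a bounded heap of size k: O(log k) per candidate.
        nearest = heapq.nsmallest(k, candidates)
        neighbors.append([j for _, j in nearest])
    return neighbors
```

Because L multiplies the N^2 term, shrinking the per-pair comparison cost is where the savings come from.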

m-Averaged Approximation
• Cut the backbone into fragments of m Cα atoms
• Replace each fragment by the centroid of its Cα atoms
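The two steps above can be sketched as follows. The function name `m_average` is hypothetical, coordinates are assumed to be (x, y, z) tuples in backbone order, and letting the last fragment be shorter when the chain length is not a multiple of m is an assumption, not something the slides specify.

```python
def m_average(ca_coords, m):
    """Replace each run of m consecutive C-alpha positions by its centroid."""
    reduced = []
    for start in range(0, len(ca_coords), m):
        fragment = ca_coords[start:start + m]  # last fragment may be shorter
        centroid = tuple(sum(axis) / len(fragment) for axis in zip(*fragment))
        reduced.append(centroid)
    return reduced
```

A chain of n Cα atoms is reduced to roughly n/m points, so a linear-time comparison such as cRMSD becomes about m times cheaper.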

Evaluation: Test Sets [Lotan and Schwarzer, 2003]
• FOLDTRAJ: random, partially unfolded structures -> good correlation with small m (few long segments)
• Park-Levitt set [Park et al., 1997]: compact, native-like structures -> good correlation with large m (many short segments)
• Use smaller m on unfolded proteins for greater time savings

Flexible m-averaging
• ProteinA: 47 residues
• 14 < r_gyr < 24
• 6 < m < 12
[Figure: axis label r_gyr]
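The slides do not say how m is derived from r_gyr, so the following is only a sketch under stated assumptions: r_gyr is computed with equal weights for all atoms, the ranges 14..24 and 6..12 are used as interpolation endpoints, the mapping is linear, and m decreases as r_gyr grows (following the guidance to use smaller m on unfolded proteins). Both function names are hypothetical.

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their centroid (equal weights)."""
    n = len(coords)
    centroid = [sum(p[d] for p in coords) / n for d in range(3)]
    mean_sq = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)

def choose_m(r_gyr, r_min=14.0, r_max=24.0, m_min=6, m_max=12):
    """Assumed linear map: compact (small r_gyr) -> m_max, extended -> m_min."""
    t = (r_gyr - r_min) / (r_max - r_min)
    t = min(1.0, max(0.0, t))  # clamp values outside the observed range
    return round(m_max - t * (m_max - m_min))
```

Swapping the direction of the mapping only requires exchanging `m_min` and `m_max` in the return expression.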

Results
[Table: columns r_gyr | m | k=100, % correct | k=50, % correct | k=10, % correct; four rows with r_gyr thresholds (">="); numeric values not preserved in the transcript]
• Overhead for calculating the m-averaged structures and r_gyr is too high
• Without averaging: 28 sec; for all constant m's: 1 min
• With flexible averaging: 2 min 20 sec
• Easily fixed by precalculating r_gyr and the averaged structures

Uses
[Figure: labels U and F]

Conclusions
• Flexible m-averaging can save time (without sacrificing accuracy?)
• Useful for quickly finding k nearest neighbors and building roadmaps
• Precalculate m-averaged structures and r_gyr for greater speedup