Alignment of Flexible Molecular Structures
Motivation Proteins are flexible. One would like to align proteins modulo the flexibility. Hinge and shear protein domain motions (Gerstein, Lesk, Chothia). Conformational flexibility in drugs.
Problem definition
Flexible Geometric Hashing Exploit the fact that neighboring parts share the joint - accumulate mutual information at the joint. Achieve complexity of the same order of magnitude as in rigid alignment.
Flexible protein alignment without prior hinge knowledge FlexProt: the algorithm automatically detects flexible regions and exploits the amino acid sequence order.
Motivation
Geometric Representation 3-D curve {v_i}, i = 1…n
Experimental Results
FlexProt Algorithm Input: two protein molecules A and B, each represented by the sequence of the 3-D coordinates of its Cα atoms. Task: find the largest flexible alignment by decomposing the two molecules into a minimal number of rigid fragment pairs having similar 3-D structure.
FlexProt Main Steps Detection of Congruent Rigid Fragment Pairs; Joining Rigid Fragment Pairs; Rigid Structural Comparison; Clustering (removing ins/dels)
Structural Similarity Matrix Congruent Rigid Fragment Pair
Detection of Congruent Rigid Fragment Pairs Frag_kt(l) = (v_k … v_i … v_{k+l-1}; w_t … w_j … w_{t+l-1}). A fragment pair is congruent when RMSD(Frag_kt(l)) is below a fixed threshold. [Figure: diagonal of the similarity matrix running from (k, t) to (k+l-1, t+l-1)]
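The congruence test can be written out as follows. This is a sketch that assumes the usual least-squares definition of RMSD, minimized over rigid transformations T (rotation plus translation); the summation index m is introduced here for illustration only.

```latex
\mathrm{Frag}_{kt}(l) = \bigl( (v_k, \dots, v_{k+l-1}),\ (w_t, \dots, w_{t+l-1}) \bigr),
\qquad
\mathrm{RMSD}\bigl(\mathrm{Frag}_{kt}(l)\bigr)
  = \min_{T} \sqrt{ \frac{1}{l} \sum_{m=0}^{l-1}
      \bigl\| T(v_{k+m}) - w_{t+m} \bigr\|^{2} }
```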
RMSD Computation P = (v_i … v_{i+l}; w_j … w_{j+l}), Q = (v_k … v_{k+m}; w_t … w_{t+m}). RMSD(P), RMSD(Q) → RMSD(P ∪ Q) in O(1) time, NOT O(|P|+|Q|).
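One way to obtain the O(1) merge is to keep, for every fragment pair, a constant-size set of additive sums from which the optimal-superposition RMSD can be recovered. The sketch below assumes this bookkeeping and a Kabsch-style SVD evaluation using NumPy; the class name `PairStats` and its fields are hypothetical and are not taken from FlexProt itself.

```python
import numpy as np


class PairStats:
    """Additive sums over matched point pairs (v_i, w_i).

    The stored state has constant size, independent of fragment length.
    """

    def __init__(self, n, sum_v, sum_w, sum_vv, sum_ww, sum_vw):
        self.n = n              # number of matched pairs
        self.sum_v = sum_v      # sum of v_i (3-vector)
        self.sum_w = sum_w      # sum of w_i (3-vector)
        self.sum_vv = sum_vv    # sum of |v_i|^2
        self.sum_ww = sum_ww    # sum of |w_i|^2
        self.sum_vw = sum_vw    # sum of outer products v_i w_i^T (3x3)

    @classmethod
    def from_points(cls, v, w):
        v = np.asarray(v, dtype=float)
        w = np.asarray(w, dtype=float)
        return cls(len(v), v.sum(axis=0), w.sum(axis=0),
                   float((v * v).sum()), float((w * w).sum()), v.T @ w)

    def merge(self, other):
        """Sums for the union of two disjoint sets of pairs, in O(1)."""
        return PairStats(self.n + other.n,
                         self.sum_v + other.sum_v, self.sum_w + other.sum_w,
                         self.sum_vv + other.sum_vv, self.sum_ww + other.sum_ww,
                         self.sum_vw + other.sum_vw)

    def rmsd(self):
        """RMSD after optimal rigid superposition, from the sums alone."""
        n = self.n
        mean_v, mean_w = self.sum_v / n, self.sum_w / n
        # Total squared spread of both centred point sets.
        e0 = (self.sum_vv - n * mean_v @ mean_v) + (self.sum_ww - n * mean_w @ mean_w)
        # Centred cross-covariance matrix.
        h = self.sum_vw - n * np.outer(mean_v, mean_w)
        s = np.linalg.svd(h, compute_uv=False)
        if np.linalg.det(h) < 0:
            s[2] = -s[2]        # exclude reflections
        return float(np.sqrt(max(e0 - 2.0 * s.sum(), 0.0) / n))


# Usage: P and Q are generated with the same rigid motion.
rng = np.random.default_rng(0)
angle = 0.7
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle), np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])
shift = np.array([5.0, -3.0, 2.0])
v_p = rng.uniform(0, 10, size=(12, 3))
v_q = rng.uniform(0, 10, size=(8, 3))
w_p = v_p @ rot.T + shift
w_q = v_q @ rot.T + shift

p = PairStats.from_points(v_p, w_p)
q = PairStats.from_points(v_q, w_q)
merged = p.merge(q)                       # O(1), no pass over the points
direct = PairStats.from_points(np.vstack([v_p, v_q]), np.vstack([w_p, w_q]))
```

Because the sums are not centred, precision degrades when coordinates lie far from the origin; shifting all coordinates by a common offset beforehand would mitigate this.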
How to Join Rigid Fragment Pairs?
Graph Representation Graph Node Graph Edge
Graph Representation An edge a → b is created when: the fragments are in ascending order; the gaps (ins/dels) are limited; some overlapping is allowed. Edge weight W: + size of the rigid fragment pair (node b); penalties: - gaps (ins/dels), - overlapping.
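The edge rule can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the limits `MAX_GAP` and `MAX_OVERLAP`, the penalty factors, and the exact way gaps and overlaps are measured are hypothetical, since the slide only states that the weight rewards the size of node b and penalizes gaps and overlapping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FragmentPair:
    a_start: int   # index of the first residue in molecule A
    b_start: int   # index of the first residue in molecule B
    length: int    # number of matched residues


# Hypothetical parameters, chosen only for this example.
MAX_GAP = 30
MAX_OVERLAP = 3
GAP_PENALTY = 0.1
OVERLAP_PENALTY = 1.0


def edge_weight(a: FragmentPair, b: FragmentPair) -> Optional[float]:
    """Weight of the edge a -> b, or None when no edge is allowed."""
    # Ascending order: b must start after a in both molecules.
    if b.a_start <= a.a_start or b.b_start <= a.b_start:
        return None
    # Signed distance between the end of a and the start of b;
    # positive values are gaps, negative values are overlaps.
    gap_a = b.a_start - (a.a_start + a.length)
    gap_b = b.b_start - (a.b_start + a.length)
    if max(gap_a, gap_b) > MAX_GAP:
        return None
    overlap = max(0, -gap_a, -gap_b)
    if overlap > MAX_OVERLAP:
        return None
    gaps = max(0, gap_a) + max(0, gap_b)
    return b.length - GAP_PENALTY * gaps - OVERLAP_PENALTY * overlap


first = FragmentPair(a_start=0, b_start=0, length=10)
second = FragmentPair(a_start=12, b_start=15, length=8)
overlapping = FragmentPair(a_start=8, b_start=9, length=6)
far_away = FragmentPair(a_start=100, b_start=100, length=5)
```

Since an edge exists only when both start indices strictly increase, the resulting graph cannot contain a cycle.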
Graph Representation Edge weights W_i, W_k, W_t, W_m, W_n. DAG (directed acyclic graph)
Optimal Solution? “All Shortest Paths”: O(|E|·|V| + |V|²) (for DAG). “Single-source shortest paths”: O(|E| + |V|).
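The O(|E| + |V|) bound for single-source shortest paths on a DAG comes from relaxing edges in topological order. The following is a minimal sketch of that standard technique, not FlexProt's own code; it assumes that the weights of the graph are rewards, so they are negated to turn the heaviest chain into a shortest path. The node numbering and the toy rewards are made up.

```python
from collections import deque


def dag_shortest_paths(num_nodes, edges, source):
    """Single-source shortest paths on a DAG in O(|V| + |E|).

    edges is a list of (u, v, cost) triples; costs may be negative.
    Returns (dist, pred), where pred gives the previous node on a best path.
    """
    adjacency = [[] for _ in range(num_nodes)]
    in_degree = [0] * num_nodes
    for u, v, cost in edges:
        adjacency[u].append((v, cost))
        in_degree[v] += 1

    # Kahn's algorithm for a topological order.
    queue = deque(i for i in range(num_nodes) if in_degree[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adjacency[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle")

    dist = [float("inf")] * num_nodes
    pred = [None] * num_nodes
    dist[source] = 0.0
    # Each edge is relaxed exactly once, in topological order.
    for u in order:
        if dist[u] == float("inf"):
            continue
        for v, cost in adjacency[u]:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
    return dist, pred


def path_to(pred, target):
    """Walk the predecessor links back from target to the source."""
    path = []
    while target is not None:
        path.append(target)
        target = pred[target]
    return path[::-1]


# Toy graph: node 0 is a virtual start node, rewards are made up.
rewards = [(0, 1, 5), (0, 2, 3), (1, 3, 4), (2, 3, 7), (3, 4, 2), (1, 4, 8)]
costs = [(u, v, -w) for u, v, w in rewards]
dist, pred = dag_shortest_paths(5, costs, source=0)
best_chain = path_to(pred, 4)
```

Running this once per start node gives the all-pairs variant, which is where the extra factor of |V| comes from.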
Clustering (removing ins/dels) If joining two fragment pairs with transformations T_1 and T_2 gives a small RMSD (T_1 ~ T_2), put them into one cluster.
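The clustering rule can be sketched with a union-find structure. This is an illustration under stated assumptions: the callback `joined_rmsd(i, j)` and the threshold `max_rmsd` are hypothetical names, and merging clusters transitively (single linkage) is a choice made here, not something the slide specifies.

```python
def cluster_fragment_pairs(num_pairs, joined_rmsd, max_rmsd):
    """Group fragment pairs whose joint superposition has a small RMSD.

    joined_rmsd(i, j) is a caller-supplied function returning the RMSD
    obtained when fragment pairs i and j are superimposed together.
    """
    parent = list(range(num_pairs))

    def find(x):
        # Find the cluster representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(num_pairs):
        for j in range(i + 1, num_pairs):
            if joined_rmsd(i, j) < max_rmsd:
                parent[find(i)] = find(j)   # same rigid motion: merge

    clusters = {}
    for i in range(num_pairs):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy stand-in: each fragment pair is summarized by one number playing
# the role of its transformation; close numbers mean T_1 ~ T_2.
toy_transforms = [0.0, 0.2, 5.0, 5.1, 0.1]


def toy_joined_rmsd(i, j):
    return abs(toy_transforms[i] - toy_transforms[j])


clusters = cluster_fragment_pairs(len(toy_transforms), toy_joined_rmsd, max_rmsd=0.5)
```

In a real setting the callback would evaluate the RMSD of the union of the two fragment pairs rather than compare toy numbers.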
Correspondence Problem
Molecular Surface Representation Applications to docking
Motivation Prediction of biomolecular recognition. Detection of drug binding ‘cavities’. Molecular Graphics.
1. Solvent Accessible Surface – SAS 2. Connolly Surface
Connolly’s MS algorithm A ‘water’ probe ball ( Å diameter) is rolled over the van der Waals surface. This smooths the surface and bridges narrow ‘inaccessible’ crevices.
Connolly’s MS algorithm - cont. Patches are classified as convex, concave or saddle according to the number of contact points between the surface atoms and the probe ball. Outputs points+normals according to the required sampling density (e.g. 10 pts/Å²).
Example - the surface of crambin
Critical points based on Connolly rep. (Lin, Wolfson, Nussinov) Define a single point+normal for each patch: caps for convex patches, pits for concave patches, belts for saddle patches.
Critical point definition
Connolly => Shou Lin
Solid Angle local extrema: knob, hole
Chymotrypsin surface colored by solid angle (yellow-convex, blue-concave)