Identifying similar surface patches on proteins using a spin-image surface representation M. E. Bock Purdue University, USA G. M. Cortelazzo, C. Ferrari, C. Guerra University of Padova, Italy
Protein Surface Matching. Problem: detect similar surface patches on two proteins. Motivation: identify similar binding sites in proteins; this is useful when the proteins are unrelated by sequence or overall fold.
Our approach: is purely geometric; is based on a representation of the protein surface as a set of two-dimensional (2D) images, called spin images; finds a collection of point pairs, one point from each protein, such that the points belonging to one protein form a surface patch and the spin images of the paired points are a "match".
Spin Images (Johnson, Hebert, 1997). A surface representation introduced in computer vision that uses 2D images to describe 3D oriented points. It allows powerful techniques from 2D template matching and pattern classification to be applied to the problem of 3D surface recognition.
An oriented point basis
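Spin Image. The spin image of an oriented point O is a 2D array, indexed by (i, j), that accumulates the pairs (α, β) of the surface points relative to O. In Johnson and Hebert's notation, α is the distance of a surface point from the line through O along its normal, and β is its signed distance from the tangent plane at O.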
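The accumulation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the Johnson and Hebert coordinates (distance from the normal line, signed height above the tangent plane), uses plain nearest-bin counting instead of interpolation, and the function name `spin_image` and the parameters `bin_size`, `n_alpha`, `n_beta` are hypothetical.

```python
import math

def spin_image(p, n, points, bin_size=1.0, n_alpha=8, n_beta=8):
    """Accumulate a 2D histogram of (alpha, beta) for `points`
    relative to the oriented point with position p and normal n.

    Rows index beta (top row = largest beta), columns index alpha.
    Bin sizes and image dimensions are illustrative assumptions.
    """
    norm = math.sqrt(sum(c * c for c in n))
    n = [c / norm for c in n]                      # unit normal
    image = [[0] * n_alpha for _ in range(n_beta)]
    half_height = n_beta * bin_size / 2.0          # beta range is [-half, +half]
    for x in points:
        d = [xc - pc for xc, pc in zip(x, p)]
        beta = sum(dc * nc for dc, nc in zip(d, n))            # height above tangent plane
        sq = sum(dc * dc for dc in d) - beta * beta
        alpha = math.sqrt(max(sq, 0.0))                        # distance from normal line
        i = math.floor((half_height - beta) / bin_size)        # row index
        j = math.floor(alpha / bin_size)                       # column index
        if 0 <= i < n_beta and 0 <= j < n_alpha:               # drop points outside the image
            image[i][j] += 1
    return image
```

Because both coordinates depend only on distances measured relative to the oriented point, rotating or translating the whole surface together with the oriented point leaves the image unchanged.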
Examples of spin images
Why spin images? Spin images are useful representations for the following reasons: they are invariant to rigid transformations, object-centered, simple to compute, and scalable from local to global representation.
Matching Algorithm. Step 1. Obtain Connolly’s surface representation of the two given proteins P and Q. Step 2. Label Connolly’s points as blocked, shadowed, or clear. Step 3. Find individual point correspondences on the two proteins based on the correlation value of their spin images, restricting correspondences to points with the same label. Step 4. Group point correspondences into patches using geometric consistency criteria.
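The correspondence step can be illustrated as follows. This is a sketch under stated assumptions: the slides say only that correspondences are based on a correlation value, so the Pearson correlation coefficient is used here as one plausible choice, and the names `correlation`, `find_correspondences` and the threshold `min_corr` are hypothetical.

```python
import math

def correlation(a, b):
    """Pearson correlation between two spin images of equal shape
    (each given as a list of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:        # a constant image carries no signal
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def find_correspondences(images_p, labels_p, images_q, labels_q, min_corr=0.9):
    """Return (i, j, corr) for points i on P and j on Q that share a label
    and whose spin images correlate at least `min_corr` (assumed threshold),
    sorted by decreasing correlation."""
    pairs = []
    for i, (img_p, lab_p) in enumerate(zip(images_p, labels_p)):
        for j, (img_q, lab_q) in enumerate(zip(images_q, labels_q)):
            if lab_p != lab_q:      # only compare points with the same label
                continue
            c = correlation(img_p, img_q)
            if c >= min_corr:
                pairs.append((i, j, c))
    pairs.sort(key=lambda t: t[2], reverse=True)
    return pairs
```

Filtering on the label before computing the correlation skips the more expensive image comparison for pairs that could never match.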
Labeling surface points. A surface point P is unblocked if no other surface point with positive value lies on the oriented line l through P parallel to the normal n and in the same orientation as n. A point that is not unblocked is blocked. The unblocked points that belong to the convex hull of the protein surface are labeled clear points; all others are shadowed. This labeling is easily obtained from the spin images.
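The blocked/unblocked test can be sketched directly from the definition above. This is an illustration only: on a sampled surface no point lies exactly on the line, so a hypothetical tolerance `tol` around the line is assumed, and the function name `is_blocked` is not from the slides. Separating clear from shadowed points additionally needs a convex hull test, which is not shown.

```python
import math

def is_blocked(index, points, normals, tol=0.5):
    """True if some other surface point lies within `tol` of the ray that
    starts at points[index] and runs along its outward normal.

    `tol` is an assumed tolerance for a sampled surface.
    """
    p, n = points[index], normals[index]
    norm = math.sqrt(sum(c * c for c in n))
    n = [c / norm for c in n]                      # unit normal
    for k, x in enumerate(points):
        if k == index:
            continue
        d = [xc - pc for xc, pc in zip(x, p)]
        along = sum(dc * nc for dc, nc in zip(d, n))           # position along the normal
        sq = sum(dc * dc for dc in d) - along * along
        off_axis = math.sqrt(max(sq, 0.0))                     # distance from the line
        if along > 0 and off_axis <= tol:                      # ahead of P and on the line
            return True
    return False
```

In spin image terms, the same test amounts to checking whether the column nearest the normal line has a nonzero entry on the side the normal points to.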
Examples of blocked and shadowed points
Grouping Point Correspondences. Geometric consistency: two point correspondences (m1, s1) and (m2, s2) are geometrically consistent if 1. the distance between m1 and m2 and the distance between s1 and s2 agree within a given tolerance; 2. the angle between the normals at m1 and m2 and the angle between the normals at s1 and s2 agree within a given tolerance.
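The two criteria above can be written as a small predicate. This sketch reads "within a given tolerance" as a bound on the difference between the two distances and between the two angles, which is an interpretation; the tuple layout, the function names and the default tolerances are hypothetical.

```python
import math

def angle(u, v):
    """Angle in radians between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (nu * nv)))     # clamp against rounding error
    return math.acos(cos)

def consistent(c1, c2, dist_tol=1.0, angle_tol=0.3):
    """Each correspondence is (m, normal_m, s, normal_s): a point and its
    normal on one protein, paired with a point and its normal on the other.
    Tolerances are assumed values (distance units and radians)."""
    m1, nm1, s1, ns1 = c1
    m2, nm2, s2, ns2 = c2
    dist_ok = abs(math.dist(m1, m2) - math.dist(s1, s2)) <= dist_tol
    angle_ok = abs(angle(nm1, nm2) - angle(ns1, ns2)) <= angle_tol
    return dist_ok and angle_ok
```

Both quantities are preserved by rigid motion, so two correspondences that come from the same rigid superposition of the patches always pass the test.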
Grouping strategy. A greedy approach that grows a patch of geometrically consistent points around a seed point, where the seed belongs to a pair of points on the two proteins whose spin images are highly correlated. The resulting patches are ranked by the number of points they contain.
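One way the greedy growth and ranking could look is sketched below. The slides do not specify the order in which candidates are visited or how many seeds are tried, so both are assumptions here: candidates are visited by decreasing correlation and the `n_seeds` best correspondences serve as seeds. The consistency test is passed in as a caller-supplied function, and all names are hypothetical.

```python
def grow_patch(seed, candidates, consistent):
    """Greedily add each candidate that is consistent with every
    correspondence already in the patch."""
    patch = [seed]
    for c in candidates:
        if c is seed:
            continue
        if all(consistent(c, member) for member in patch):
            patch.append(c)
    return patch

def rank_patches(correspondences, consistent, n_seeds=3):
    """Each correspondence is a tuple whose last element is the spin image
    correlation. Returns one patch per seed, largest patch first."""
    ordered = sorted(correspondences, key=lambda c: c[-1], reverse=True)
    patches = [grow_patch(seed, ordered, consistent) for seed in ordered[:n_seeds]]
    return sorted(patches, key=len, reverse=True)
```

Requiring consistency with every member, rather than with the seed alone, keeps a patch from drifting into a set of points that no single rigid motion could superimpose.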
Results. The active site is generally found among the top solutions. As an example, for proteins 1BCK and 1CYN (cyclophilin B and A, respectively), which bind the same ligand (cyclosporin), the first solution corresponds to the active site.
Active sites on 1BCK and 1CYN (from PDB)
Top solution of our approach
Future work: apply a spin image representation to protein-protein interfaces and docking.