Predicting ligand binding sites on protein surface Zengming Zhang 2010-5-12.



What is a binding site?
A concave, cleft- or hole-shaped region on the protein surface. Think of a key fitting into a lock:
- Key: the ligand
- Lock: the protein
- Keyhole: the binding site

Why do we need to find binding sites?
It is the first step in many structure analyses:
- Functional/catalytic site prediction
- Comparisons of protein atomic configurations
- Docking calculations
- Structure-based drug design
- …

Algorithms for finding binding sites
Grid-based
- The protein is mapped onto a 3D grid.
- Empty grid points are then defined as pockets if they satisfy a number of geometric or energetic conditions.
Sphere-based
- A set of probe spheres is placed on the protein surface.
- Pocket spheres are those probe spheres that satisfy a number of geometric conditions relative to the other generated probe spheres.
α-shape based
- The α-shape is defined as a subset of the Delaunay tessellation of the protein atoms, omitting edges longer than the sum of the radii of their two atoms.

Algorithms for finding binding sites: programs
- Grid-based: POCKET, LIGSITE, LIGSITEcs, LIGSITEcsc, ConCavity, PocketPicker and GHECOM
- Sphere-based: SURFNET, PASS, Q-SiteFinder, PHECOM
- α-shape based: CAST, Fpocket

α-shape
[Figure: the α-shape is the region enclosed by the black line; the lines are edges of the Delaunay tessellation.]

No edge is longer than the sum of the radii of its two atoms.

α-shape based: CAST
Computes a triangulation of the protein's surface atoms using α-shapes. Triangles are then grouped by letting small triangles flow toward neighboring larger triangles, which act as sinks.

Grid-based
The protein is projected onto a 3D grid. The method focuses on PSP (protein-solvent-protein) events at the grid points. When a straight line drawn through a grid point is enclosed on both sides by protein atoms, that line is termed a PSP event for the grid point. Grid points with more than a threshold number of PSP events are defined as pockets.
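The PSP counting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the protein is given as a set of integer grid points, lines are scanned along the 3 axes and 4 cube diagonals (a choice made for this sketch), and the names `psp_count` and `find_pockets` are hypothetical.

```python
from itertools import product

# Scan directions: 3 axes plus 4 cube diagonals (an assumption of this sketch).
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (1, 1, 1), (1, 1, -1), (1, -1, 1), (-1, 1, 1)]


def hits_protein(point, step, protein, max_steps):
    """Walk from `point` along `step`; True if a protein grid point is reached."""
    x, y, z = point
    for _ in range(max_steps):
        x, y, z = x + step[0], y + step[1], z + step[2]
        if (x, y, z) in protein:
            return True
    return False


def psp_count(point, protein, max_steps):
    """Number of scan lines through `point` enclosed by protein on both sides."""
    count = 0
    for d in DIRECTIONS:
        opposite = (-d[0], -d[1], -d[2])
        if (hits_protein(point, d, protein, max_steps)
                and hits_protein(point, opposite, protein, max_steps)):
            count += 1
    return count


def find_pockets(protein, size, threshold):
    """Empty grid points in a size^3 grid with more than `threshold` PSP events."""
    return {p for p in product(range(size), repeat=3)
            if p not in protein and psp_count(p, protein, size) > threshold}


# Toy "protein": a hollow box, one grid point thick, inside a 7x7x7 grid.
shell = {p for p in product(range(1, 6), repeat=3) if 1 in p or 5 in p}
pockets = find_pockets(shell, size=7, threshold=6)
```

In this toy case only the 27 grid points enclosed by the box are enclosed along all seven lines, so they are the only points reported as pockets.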

Sphere-based
SURFNET: places a sphere (called a gap sphere) between two protein atoms. If the sphere contains any other atoms, its radius is reduced until it just touches one protein atom. The resulting set of gap spheres defines the pockets.
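The gap-sphere construction can be sketched as follows. This is a simplified illustration, not the SURFNET program itself: atoms are assumed to be `(x, y, z, radius)` tuples, the sphere is centered midway between the two atom surfaces, and the function name `gap_sphere` is hypothetical.

```python
import math


def gap_sphere(a, b, atoms):
    """Return (center, radius) of the gap sphere between atoms a and b, or None.

    atoms: list of (x, y, z, radius) tuples; a, b: indices into that list.
    """
    ax, ay, az, ar = atoms[a]
    bx, by, bz, br = atoms[b]
    d = math.dist((ax, ay, az), (bx, by, bz))
    gap = d - ar - br
    if gap <= 0:
        return None  # the two atoms touch or overlap: no room for a sphere
    # Center the sphere midway between the two atom surfaces.
    t = (ar + gap / 2) / d
    center = (ax + t * (bx - ax), ay + t * (by - ay), az + t * (bz - az))
    radius = gap / 2
    # Shrink the sphere until it no longer contains any other atom.
    for i, (x, y, z, r) in enumerate(atoms):
        if i not in (a, b):
            radius = min(radius, math.dist(center, (x, y, z)) - r)
    return (center, radius) if radius > 0 else None


atoms = [(0.0, 0.0, 0.0, 1.0), (6.0, 0.0, 0.0, 1.0), (3.0, 2.0, 0.0, 1.0)]
center, radius = gap_sphere(0, 1, atoms)
```

Here the initial sphere of radius 2 between the first two atoms is shrunk to radius 1 because the third atom intrudes.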

Grid-based: GHECOM
By Takeshi Kawabata.
Kawabata T. (2010) Detection of multi-scale pockets on protein surfaces using mathematical morphology. Proteins, 78.
Goal: to define pocket regions on the protein surface.

Primary points:
1. A new definition of pockets using the basic operations of mathematical morphology
2. A proposed algorithm for finding pockets
3. Construction of a useful dataset for algorithm testing
4. A new method for evaluating binding site predictions
5. Some useful discoveries about how ligands bind to binding sites

Some background
Multiscale pockets:
- Deep and shallow pockets are calculated simultaneously.
- “Multiscale pockets” need “multiscale probes”: many probes of different sizes are used to define pockets.
“Size” and “depth” of pockets:
- These are two properties of pockets.
- The definition of pockets using small and large spherical probes comes from the author's previous work, PHECOM.
- A pocket region is a space into which a small spherical probe can enter but a large spherical probe cannot.

Pocket definition
Mathematical morphology:
- It is a theory, based on rigorous set theory, used in the analysis of geometric features of digital images.
- Morphology can provide the boundaries of objects, their skeletons, and their convex hulls. It is also useful for many pre- and post-processing techniques, especially edge thinning and pruning.

Mathematical morphology (con.)
Four operations: dilation, erosion, opening, closing
a: Molecular shape X
b: The shape of the probe P
c: X ⊕ P: dilation of X by P
d: X ⊖ P: erosion of X by P
e: X ○ P: opening of X by P
f: X • P: closing of X by P
The shape X is the vdW volume of a protein.
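As a rough illustration of the four operations, the sketch below applies them to shapes stored as sets of integer grid points. It uses the textbook set definitions, not code from GHECOM, and the helper names (`shift`, `dilate`, `erode`, `opening`, `closing`) are chosen for this example.

```python
def shift(x, p, sign=1):
    """Translate grid point x by sign * p."""
    return tuple(a + sign * b for a, b in zip(x, p))


def dilate(X, P):
    """X ⊕ P: every point reachable as x + p."""
    return {shift(x, p) for x in X for p in P}


def erode(X, P):
    """X ⊖ P: every point c such that c + p lies in X for all p in P."""
    p0 = next(iter(P))
    candidates = {shift(x, p0, -1) for x in X}
    return {c for c in candidates if all(shift(c, p) in X for p in P)}


def opening(X, P):
    """X ○ P: erosion followed by dilation (removes parts thinner than P)."""
    return dilate(erode(X, P), P)


def closing(X, P):
    """X • P: dilation followed by erosion (fills gaps the probe cannot enter)."""
    return erode(dilate(X, P), P)


# 2D toy example: a 3x3 square probe centered on the origin.
probe = {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
square = {(x, y) for x in range(3) for y in range(3)}
ring = {(x, y) for x in range(5) for y in range(5)} - {(2, 2)}

opened = opening(square | {(5, 1)}, probe)   # the isolated point is removed
closed = closing(ring, probe)                # the one-cell hole is filled
```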

Mathematical morphology (con.)
In the language of mathematical morphology:
- The translation of the shape X by the vector p (the p-translated X) is denoted by (X)_p and is defined by: (X)_p = { x + p | x ∈ X }

Mathematical morphology (con.)
Here X^c is the complement of shape X: X^c = E^3 − X. In other words, the closing of X by P is the space that the probe P cannot enter when any overlap between X and P is prohibited. The closing of X by P is called the “molecular volume” of molecule X defined by probe P.
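The equations on these slides did not survive in the transcript. For reference, the standard textbook definitions of the four operations are given below, in the (X)_p notation introduced above. They are not copied from the slides, whose numbering and conventions may differ; the last identity assumes a probe P that is symmetric about the origin, such as a sphere.

```latex
\begin{align}
X \oplus P  &= \bigcup_{p \in P} (X)_{p}   && \text{(dilation)} \\
X \ominus P &= \bigcap_{p \in P} (X)_{-p}  && \text{(erosion)} \\
X \circ P   &= (X \ominus P) \oplus P      && \text{(opening)} \\
X \bullet P &= (X \oplus P) \ominus P
             = \left( X^{c} \circ P \right)^{c} && \text{(closing)}
\end{align}
```

The complement form of the closing matches the verbal description: X^c ○ P is everything the probe can sweep without overlapping X, so its complement is the space the probe cannot enter.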

Pocket definition (con.) Eq.(12) is introduced by Masuya and Doi using mathematical morphological operations:

Pocket definition (con.)
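The pocket equations themselves are missing from the transcript. The sketch below therefore implements only the verbal definition given earlier (a space that a small probe S can enter but a large probe P cannot), under the assumption that it can be written as the opening by S of the closing of X by P with X removed. That formula is this example's assumption, and all function names are hypothetical.

```python
def shift(x, p, sign=1):
    return tuple(a + sign * b for a, b in zip(x, p))


def dilate(X, P):
    return {shift(x, p) for x in X for p in P}


def erode(X, P):
    if not X:
        return set()
    p0 = next(iter(P))
    candidates = {shift(x, p0, -1) for x in X}
    return {c for c in candidates if all(shift(c, p) in X for p in P)}


def pocket(X, large_probe, small_probe):
    """Assumed definition: ((X • P) minus X) ○ S."""
    # Space the large probe cannot enter, excluding the protein itself.
    blocked = erode(dilate(X, large_probe), large_probe) - X
    # Keep only the part of that space that the small probe can sweep.
    return dilate(erode(blocked, small_probe), small_probe)


# 2D toy protein: a U-shaped wall around a cavity 2 cells wide and 3 cells deep.
wall = ({(0, y) for y in range(4)} | {(3, y) for y in range(4)}
        | {(x, 0) for x in range(4)})
P = {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}   # 3x3 large probe
S = {(0, 0), (1, 0), (0, 1), (1, 1)}                       # 2x2 small probe

cavity = pocket(wall, P, S)
```

The 3x3 probe is too wide for the cavity while the 2x2 probe fits, so all six cavity cells are reported. If the small probe is also too large to enter, the pocket is empty.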

Algorithm
Multiscale closing, or multiscale molecular volume: K types of large probe spheres P1, P2, …, Pk and one small probe S are used. The large probes must satisfy the opening condition [Eq. 16], which means that a large probe Pj can be reconstructed from a set of translated smaller probes Pi.

Algorithm (con.) If the opening condition [Eq. 16] is satisfied for all the probes {Pi}, then the following relation will hold: But …

Algorithm (con.)
[Figure: a case that does not satisfy Eq. (16).]

Algorithm (con.)
Is the assumption wrong?
- No.
- The assumption of Eq. (16) is still safe, because digitized pseudo-spheres are used as approximations of real spheres in continuous space, and therefore the digitized pseudo-spheres should have the properties of real spheres.

Algorithm (con.)
Only one index for the 3D grid, I(x), is needed to store the K types of dilations, molecular volumes and pockets. Here x is a 3D point, and I_D(x), I_C(x) and I_P(x) are integers determined by x:
- I_D(x): multiscale dilation
- I_C(x): multiscale closing, or multiscale molecular volume
- I_P(x): multiscale pocket

Algorithm (con.)
R_inaccess:
- The minimum inaccessible radius: the minimum radius of the spheres that cannot touch the point x.
- It serves as a measure of shallowness for probes on the protein surface.
R_pocket:
- The minimum pocket radius: the minimum radius of the spheres for which the point x is within the pocket.
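Given pockets already computed for each large-probe radius, R_pocket can be read off as the smallest radius whose pocket contains the point. The sketch below assumes the per-radius pockets are supplied as sets of grid points; the function name `minimum_pocket_radius` and the toy data are hypothetical.

```python
def minimum_pocket_radius(radii, pockets_by_radius):
    """Map each grid point to the smallest probe radius whose pocket contains it.

    radii: probe radii in ascending order.
    pockets_by_radius: one set of grid points per radius, in the same order.
    """
    r_pocket = {}
    for radius, pocket_points in zip(radii, pockets_by_radius):
        for point in pocket_points:
            # Radii are visited in ascending order, so the first hit is the minimum.
            r_pocket.setdefault(point, radius)
    return r_pocket


radii = [2.0, 2.5, 3.0]
pockets_by_radius = [
    {(1, 1, 1)},
    {(1, 1, 1), (1, 1, 2)},
    {(1, 1, 1), (1, 1, 2), (1, 1, 3)},
]
r_pocket = minimum_pocket_radius(radii, pockets_by_radius)
```

Points that even a small "large" probe cannot reach get a small R_pocket, so a low value indicates a deep pocket.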

Algorithm (con.)
Eqs. (17)-(19) suggest an efficient algorithm for calculating multiscale dilations, molecular volumes and pockets. To implement it, a shell H_k is defined as the difference between the kth and (k−1)th probes: H_k = P_k − P_(k−1)

Algorithm (con.)
A general strategy for an efficient algorithm is to process a shape X using a series of shells, progressing in size from the smallest to the largest shell (H1, H2, …, Hk). The algorithm is shown in Figure 4. In this study, the grid width was set to 0.8 Å, the radius of the small probe S was set to 1.87 Å, and 17 large probes Pk were used, with radii of 2.0, 2.5, 3.0, 3.5, …, 10 Å.
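The probe parameters quoted above can be turned into digitized pseudo-spheres and shells as sketched below. Figure 4 is not available in the transcript, so this illustrates only the probe and shell construction, not the full algorithm. The function names are hypothetical, and treating the first shell as the whole first probe is an assumption.

```python
GRID_WIDTH = 0.8  # Å, as stated in the slide


def probe_radii(start=2.0, stop=10.0, step=0.5):
    """Radii of the large probes: 2.0, 2.5, ..., 10.0 Å."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


def digitized_sphere(radius, grid=GRID_WIDTH):
    """Grid offsets (i, j, k) whose distance from the origin is <= radius."""
    n = int(radius / grid) + 1
    r2 = radius * radius
    g2 = grid * grid
    return {(i, j, k)
            for i in range(-n, n + 1)
            for j in range(-n, n + 1)
            for k in range(-n, n + 1)
            if (i * i + j * j + k * k) * g2 <= r2}


def probe_shells(radii):
    """H_k = P_k minus P_(k-1); the first shell is the whole first probe."""
    shells, previous = [], set()
    for radius in radii:
        probe = digitized_sphere(radius)
        shells.append(probe - previous)
        previous = probe
    return shells


radii = probe_radii()
shells = probe_shells(radii)
```

Because dilation distributes over unions, the dilation of X by P_k equals the dilation by P_(k−1) together with the dilation by H_k, so each step only needs to process the new shell rather than the full probe.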

Algorithm (con.)
Calculation of R_inaccess for ligand atoms
A measure of pocket shallowness for probes or atoms of binding ligands is useful for characterizing binding pockets. |L| is the number of points in the shape L of the ligand.
Examples:
A: 1/((1/3 + 1/4 + 1/4)/3) = 3.6 Å
B: 1/((1/6 + 1/5 + 1/5)/3) = 5.3 Å
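The two worked examples are harmonic means of per-point R_inaccess values. The sketch below reproduces that arithmetic; the function name `ligand_r_inaccess` is hypothetical, and reading examples A and B as the general formula is an inference from the numbers shown.

```python
def ligand_r_inaccess(point_values):
    """Harmonic mean of the per-point R_inaccess values (in Å) of a ligand."""
    return len(point_values) / sum(1.0 / value for value in point_values)


example_a = ligand_r_inaccess([3, 4, 4])
example_b = ligand_r_inaccess([6, 5, 5])
print(round(example_a, 1), round(example_b, 1))
```

The harmonic mean weights small values more heavily than an arithmetic mean would, so a ligand with even a few deeply buried points gets a noticeably lower R_inaccess.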

Algorithm (con.)
Calculation of R_inaccess and pocketness for protein atoms and residues
A measure of the depth of a protein atom or residue is useful for analyzing the relationship between ligand types and the surrounding protein atom types. To characterize the depth of protein atoms, the concept of an “accessible shell volume” around a part Y of the protein is introduced, where Y is a part of the protein shape X (Y ⊂ X) and S is a spherical probe.

Algorithm (con.)
The pocketness of a protein atom or residue indicates how much it contributes to binding ligands. Generally speaking, deep and large pockets tend to bind ligands, so pocketness is defined to reflect both the size and the depth of a pocket: a residue in a deeper and larger pocket has a larger pocketness value.

Algorithm (con.)
Clustering grid points and filtering out small clusters
- Most ligands are bound in the largest pockets.
- The procedure of clustering pockets and extracting only the large pocket clusters has been widely used by researchers.
- In this study, the use of multiscale pocket boundaries requires a threshold value of the R_pocket measure for the boundary between the pocket and the open outer space (shown in the “Results” section).
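Clustering pocket grid points and discarding small clusters can be sketched as a connected-component search. The 6-neighbor connectivity, the size cutoff and the function names below are assumptions made for illustration; the slides do not specify them.

```python
from collections import deque

# Face-sharing neighbors on a cubic grid (an assumption of this sketch).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cluster_points(points):
    """Group grid points into connected clusters, largest first."""
    unvisited = set(points)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    cluster.add(neighbor)
                    queue.append(neighbor)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)


def keep_large_clusters(points, min_size):
    """Drop clusters with fewer than min_size grid points."""
    return [c for c in cluster_points(points) if len(c) >= min_size]


pocket_points = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 5)}
large = keep_large_clusters(pocket_points, min_size=2)
```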

Dataset
Prepared from the SCOP database, v1.73. Included protein chains with mutual sequence identities of 40% or less.
Excluded:
- Small proteins with fewer than 40 residues
- Protein chains with domains of class f, h, i, j or k
Total: 7375 chains.
The chains bound to “proper” small molecules were then extracted, excluding:
- Tiny molecules
- Unnatural precipitants: BOG, DTT, EPE, GOL, MES, MPD, MRD, PG4 and TRS
- DNA and RNA (>= 3 nucleotides) and proteins (>= 10 amino acids)
- Chains with more than 10,000 heavy atoms
As a result:
- 1817 chains were included.
- Each of them contacted at least one proper small molecule.
- Only bound chains were used.

Evaluation of binding site predictions using recall-precision plots
For the purpose of comparison, calculated pockets and binding ligands were both represented on a grid with 0.8 Å width; each grid point was checked to determine whether it was inside the pockets or the binding ligands. N_P is the number of grid points in pockets, N_L is the number of grid points overlapping with ligands, and N_PL is the number of grid points in pockets that overlap with ligands.
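The slide's formulas are not preserved. The sketch below uses the standard definitions, recall = N_PL / N_L and precision = N_PL / N_P, applied to sets of grid points. The function name and the toy data are hypothetical.

```python
def recall_precision(pocket_points, ligand_points):
    """Recall and precision of predicted pocket grid points against ligand grid points."""
    n_p = len(pocket_points)                    # N_P: grid points in pockets
    n_l = len(ligand_points)                    # N_L: grid points overlapping ligands
    n_pl = len(pocket_points & ligand_points)   # N_PL: grid points in both
    recall = n_pl / n_l if n_l else 0.0
    precision = n_pl / n_p if n_p else 0.0
    return recall, precision


pocket_points = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
ligand_points = {(2, 0, 0), (3, 0, 0), (4, 0, 0)}
recall, precision = recall_precision(pocket_points, ligand_points)
```

Sweeping a pocket threshold (for example on R_pocket) and recomputing both values at each setting would give the points of a recall-precision plot.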

Results
[Figure: example for PDB entry 1dwd.]

Results

Useful discoveries
- The majority of molecules binding in deep pockets were coenzymes.
- In contrast, adenine and guanine mononucleotides tend to bind in medium-to-shallow pockets.
- Macromolecules tend to bind in shallow pockets or protruding regions.

Useful discoveries
- In the typical binding pose of the HEM molecule in the dataset, the aromatic atoms CBB and CMC face the protein, whereas the carboxyl atoms O1A and O2A face water.
- In the ADP molecule, the atom N6 in the adenine ring and the atoms O1B, O2B and O3B of the phosphate group favor deep pockets, whereas the sugar atoms, such as O2’ and O3’, favor shallow pockets. The N6 side of the adenine ring and the phosphate terminus face the protein, while the sugar atoms face water.

Summary:
1. A new definition of pockets using the basic operations of mathematical morphology
2. An efficient algorithm for finding pockets
3. Construction of a useful dataset for algorithm testing
4. A new method for evaluating binding site predictions with precision and recall
5. Some useful discoveries

Thanks! Any questions? Please feel free to ask me!