Taking Geometry to its Edge: Fast Rigid (and Hinge-Bent) Docking Algorithms. Haim Wolfson 1, Dina Duhovny 1, Yuval Inbar 1, Vladimir Polak 1, Ruth Nussinov.

Slides:



Advertisements
Similar presentations
A 3-D reference frame can be uniquely defined by the ordered vertices of a non- degenerate triangle p1p1 p2p2 p3p3.
Advertisements

Iterative Relaxation of Constraints (IRC) Can’t solve originalCan solve relaxed PRMs sample randomly but… start goal C-obst difficult to sample points.
Seminar in structural bioinformatics Multiple structural alignment of proteins By Elad Kaspani.
Protein Tertiary Structure Prediction
Two Examples of Docking Algorithms With thanks to Maria Teresa Gil Lucientes.
Docking Algorithm Scheme Part 1: Molecular shape representation Part 2: Matching of critical features Part 3: Filtering and scoring of candidate transformations.
Alignment of Flexible Molecular Structures. Motivation Proteins are flexible. One would like to align proteins modulo the flexibility. Hinge and shear.
Docking of Protein Molecules
FLEX* - REVIEW.
An Integrated Approach to Protein-Protein Docking
BL5203: Molecular Recognition & Interaction Lecture 5: Drug Design Methods Ligand-Protein Docking (Part I) Prof. Chen Yu Zong Tel:
QSD – Quadratic Shape Descriptors Surface Matching and Molecular Docking Using Quadratic Shape Descriptors Goldman BB, Wipke WT. Quadratic Shape Descriptors.
Object Recognition. Geometric Task : find those rotations and translations of one of the point sets which produce “large” superimpositions of corresponding.
CAPRI Critical Assessment of Prediction of Interactions.
Unbound Docking of Rigid Molecules. Problem Definition Given two molecules find their correct association: +=
Protein Structure Prediction Samantha Chui Oct. 26, 2004.
Protein-protein and Protein- ligand Docking The geometric filtering.
Model Database. Scene Recognition Lamdan, Schwartz, Wolfson, “Geometric Hashing”,1988.
Identifying similar surface patches on proteins using a spin-image surface representation M. E. Bock Purdue University, USA G. M. Cortelazzo, C. Ferrari,
ClusPro: an automated docking and discrimination method for the prediction of protein complexes Stephen R. Comeau, David W.Gatchell, Sandor Vajda, and.
Combinatorial docking approach for structure prediction of large proteins and multi-molecular assemblies Yuval Inbar 1, Hadar Benyamini 2, Ruth Nussinov.
Recurring conformation of the human immunodeficiency virus type 1 gp120 V3 loop Robyn L. Stanfield, Jayant B. Ghiara, Erica O. Saphire, Albert T. Profy,
Hyperthermophile subtilases
Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine Ajay N. Jain UCSF Cancer Research Institute and Comprehensive.
PatchFinder. The ConSurf web-server calculates the evolutionary rate for each position in the protein. Surface clusters of spatially close & conserved.
Protein Tertiary Structure Prediction Structural Bioinformatics.
Local Flexibility Aids Protein Multiple Structure Alignment Matt Menke Bonnie Berger Lenore Cowen.
A new protein-protein docking scoring function based on interface residue properties Reporter: Yu Lun Kuo (D )
Rong Chen Boston University
Structure of an LDLR-RAP Complex Reveals a General Mode for Ligand Recognition by Lipoprotein Receptors  Carl Fisher, Natalia Beglova, Stephen C. Blacklow 
An Integrated Approach to Protein-Protein Docking
Symmetry Recognizing Asymmetry
R. Elliot Murphy, Alexandra B. Samal, Jiri Vlach, Jamil S. Saad 
Giovanni Settanni, Antonino Cattaneo, Paolo Carloni 
AnchorDock: Blind and Flexible Anchor-Driven Peptide Docking
Chaperone-Assisted Crystallography with DARPins
Yeast RNA Polymerase II at 5 Å Resolution
Near-Atomic Resolution for One State of F-Actin
Volume 34, Issue 4, Pages (May 2009)
Solution structure of the donor site of a trans-splicing RNA
Structure of CheA, a Signal-Transducing Histidine Kinase
Volume 4, Issue 5, Pages (November 1999)
Structure of Bax  Motoshi Suzuki, Richard J. Youle, Nico Tjandra  Cell 
Volume 20, Issue 12, Pages (September 2017)
Moosa Mohammadi, Joseph Schlessinger, Stevan R Hubbard  Cell 
Volume 22, Issue 2, Pages (February 2005)
Volume 16, Issue 4, Pages (April 2008)
Structural View of the Ran–Importin β Interaction at 2.3 Å Resolution
David Jeruzalmi, Mike O'Donnell, John Kuriyan  Cell 
A Putative Mechanism for Downregulation of the Catalytic Activity of the EGF Receptor via Direct Contact between Its Kinase and C-Terminal Domains  Meytal.
What Does It Take to Bind CAR?
Protein Structure Alignment
David Jeruzalmi, Mike O'Donnell, John Kuriyan  Cell 
NSF N-Terminal Domain Crystal Structure
E.Radzio Andzelm, J Lew, S Taylor  Structure 
OmpT: Molecular Dynamics Simulations of an Outer Membrane Enzyme
Volume 11, Issue 1, Pages (January 2003)
Volume 12, Issue 11, Pages (November 2004)
Structure of the InlB Leucine-Rich Repeats, a Domain that Triggers Host Cell Invasion by the Bacterial Pathogen L. monocytogenes  Michael Marino, Laurence.
Gydo C.P. van Zundert, Adrien S.J. Melquiond, Alexandre M.J.J. Bonvin 
Volume 25, Issue 6, Pages e5 (June 2019)
Peter König, Rafael Giraldo, Lynda Chapman, Daniela Rhodes  Cell 
Structure of an IκBα/NF-κB Complex
Three protein kinase structures define a common motif
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Volume 20, Issue 3, Pages (March 2012)
Morgan Huse, Ye-Guang Chen, Joan Massagué, John Kuriyan  Cell 
Presentation transcript:

Taking Geometry to its Edge: Fast Rigid (and Hinge-Bent) Docking Algorithms. Haim Wolfson 1, Dina Duhovny 1, Yuval Inbar 1, Vladimir Polak 1, Ruth Nussinov 2,3 1 School of Computer Science, 2 School of Medicine Tel Aviv University, Israel 3 NCI-Frederick, USA

CAPRI: Critical Assessment of PRediction of Interactions First docking contest: 19 groups from all over the world. Round 1 – 3 targets. Round 2 – 4 targets. Only 5 predictions per target can be submitted.

Targets Round 1, Round 2Any good prediction? Our group Revision Hpr kinase / hpr Rotavirus VP6 / Fab (antibody) Hemagglutinin (virus capsid) / Fab HC63 (antibody) Amylase / camelid antibody VH1 Amylase / camel antibody VH2 Amylase / camel antibody VH3 T-Cell Receptor/exotoxin

Molecular Surface Representation Local Critical Feature Selection Geometric Matching of Critical Features Filtering and Scoring Active site knowledge Candidate Transformations PDB files Geometric Docking Algorithms

PPD – Norel et al Surface representation – Connolly MS. Critical features – local extrema of surface curvature ‘knobs’ / ‘holes’. Matching – pairs of critical points + associated normals are matched using Geometric Hashing. Scoring – shape complementarity (allowing moderate penetration), electrostatics, aromatic residues.

BUDDA – V. Polak (M.Sc. Thesis 2002) Surface representation – Connolly patch centers (caps/pits/belts), distance transform grid. Critical features – ‘knobs’ / ‘holes’+ caps/pits/belts. Option to focus on backbone residue related points. Matching – knob/hole +a pair of neighboring caps/pits are matched using Geometric Hashing. Scoring – shape complementarity, allowing moderate penetration.

PatchDock – Duhovny et al Surface representation – distance transform grid, multi-resolution surface. Critical features – three types of surface patches: convex, concave and flat. Focus on active site : hot spot rich patches. Matching – patch points are matched by Geometric Hashing. Scoring – shape complementarity, allowing moderate penetration.

Our Docking Algorithms PPDBUDDAPatchDock Surface representation Connolly’s MSCaps/pits/belts, distance transform grid distance transform grid, multi-resolution surface Critical features ‘knobs’/‘holes’ point+normal pairs backbone ‘knob/hole’+pair of ‘caps/pits’ surface patches: convex, concave and flat (point+normal pairs in a patch) Matching algorithm Geometric Hashing Filtering and scoring shape complementarity, electrostatics, aromatic residues shape complementarity Active Site Focusing in the matching step in matching and scoring steps

Automatic CDR detection The light and heavy chains of CDRs have conserved patterns that enable us to align a given sequence to a consensus sequence which was derived using statistical data. This alignment is used further to locate the CDRs area.

Round 1 Docking Algorithms: PPD (Norel) BUDDA (Polak)

HPr kinase / phosphatase is a key regulatory enzyme controlling carbon metabolism in bacteria. The protein is a hexamer. HprK/P contains the Walker motif - characteristic of nucleotide-binding proteins. Target 1 – HPR Kinase

It catalyses the ATP-dependent phosphorelation/dephosphorelation of Ser46 in HPr. Target 1 – HPR Kinase

What was done: –Distance constraint of 10.0 Å between the oxygen atom of Ser(Asp)-46 and the closest phosphate oxygen.

Target 1 – HPr Kinase/HPr What was done: –Distance constraint of 10.0 Å between the oxygen atom of Ser(Asp)-46 of the HPr and the closest phosphate oxygen. Results: –Best result within top 10 ranked 7; RMSD from native ~8.0 Å –Explanation: A considerable part of the interface surface area is between the HPR and the enzyme flexible helix.

Target 1 – Lessons Learned Flexible hinge-bent docking: –Two rigid parts of enzyme: (i) Helix of chain C (ii) the body of chain A without the helix. New results: –2 nd best scoring result: ~ 3.0 Å, run- time: 2 min. –In our solution the phosphocarrier protein is in red and the helix of the kinase is in orange. –These results were achieved without using the distance constraint.

The position of the helix in the uncomplexed structure (dark green color) The position of the helix in the solution obtained by flexible docking (orange) The position of the helix in the structure of the complex (purple)

VP6 protein of rotavirus that causes gastroenteritis in children. Target 2 – Biological Background

Trimmer (symmetry) The surface of the B (helices) domain is buried in rotavirus capsid. The H-domain interacts with the antibody. A ‘hint’ was given- to use the trimmer in the docking, meaning that active site is expanded to more than one chain. Target 2 – Biological Background

Target 2 – what we did The antibody potential binding site was restricted to CDRs. The antigen VP6 potential binding site was restricted to the β domain. We selected solutions with interfaces that include: 1. at least 4 CDRs of the antibody with high TYR,TRP concentration. 2. at least 2 chains of the antigen. clustering of the solutions obtained for the different chains of the trimer.

Target 2 – our best hit Our solution in blue vs. original complex (RMSD 15A, rank 7) (within 5 results that were not submitted due to technical problems)

Target 2 – Lessons Learned 1 Search only loop regions of the antigen Restrict even further the antigen to the exposed part of the virus capsid Loops of the “cap” region are in spacefill Side view Top view

Target 2 – Lessons Learned 2 Filter out results that cause steric clash of the 3 (symmetric) antibodies binding to the antigen trimer.

Target 2 : a-posteriori best hits Our best hit: RMSD 3.08 rank 76 First 10: RMSD 5.54 A, ranked 9 Run-time: 7 min VP6 molecule in spacefill, original complex Fab is in blue superimposed on our solution (rank 9) in yellow.

Target 2 – analysis of geometric shape complementarity The area of the interface of the original (blue) complex: ~400A 2 The area of the interface in our highest ranked solution (yellow) is ~600A 2. In this result the light chain of the antibody is shifted towards the center of virus capsid, enlarging shape complementarity. The heavy chain is very close to it’s original location. Heavy chains Light chains

Target 2 – shape complementarity Heavy chains only Light chains

Hexamer: 3 dimmers (symmetry) One chain of the dimmer(s) is buried in the capsid. Other antibody- antigen complexes of this antigen also imply that the epitope is on the ‘external’ chains (A,C and E). Target 3 – influenza hemagglutinin:

Target 3 – what we did The antibody potential binding site was restricted to CDRs. We selected solutions with interface that includes: 1. at least 4 CDRs of the antibody with high TYR,TRP concentration. 2. only 1 chain of the antigen  main reason to failure! Clustering of the solutions obtained for the different chains of the trimer.

Target 3 – Lessons Learned Restrict antigen potential binding site to the exposed domain of the virus capsid Filter out results that cause steric clash of the 3 antibodies (symmetry constraint). Filter out results that include only one chain of the virus capsid in the interface. Detect structurally conserved regions of Influenza virus Hemagglutinin to reduce the effective protein surface. MultiProt – a tool for multiple alignment and detection of structurally conserved patterns. Applied to 25 structures of Hemagglutinin from the PDB.

Target 3 – MultiProt Results 138 structurally conserved residues out of 320 residues in domain HA1. Some of those residues exhibit significant sequence variability. HA1 domain HA2 domain Structurally conserved residues

Three sites of antibody binding: Capri Target 3 1QFU 2VIR Hemagglutinin molecule Structurally conserved residues

Target 3 – MultiProt Results 1qfucapri3 2vir

Target 3 – final results RMSD 3.10 A, rank 6, run-time: 5 min The original complex antibody is in red and our solution is in green. Virus capsid protein hemagglutinin antibody from complex docking solution

Round 2 Docking Algorithms: BUDDA (Polak) PatchDock (Duhovny)

Targets 4,5,6 – alpha-amylase 3 Catalytic Residues in largest cavity: Ca (stabilizer ion) Cl (activator ion) Asp 197,300 Glu 233 Gly rich flexible loop ( )

Targets 4,5,6 – what we did Non conserved regions, based on multiple sequence alignment of mammalian amylase, were extracted. These regions were marked as the amylase potential binding site for the camel antibody. Favor results with wider interface area of CDR loop H3. Favor results with wider interface area of variable regions – reason to failure in all 3 targets.

Why non-conserved? amylase antibody ? The camel has it’s own amylase. He can only produce antibodies for the residues that differ between the two amylases. The interface must include some of those different residues. We don’t know the sequence of camel amylase, so we simply consider variable regions.

Targets 4,5,6 – non conserved regions in the interfaces Target 4: 15% of the interface Target 5: 13% of the interface Target 6: 20% of the interface ConSurf output:

Targets 4,5,6 – automatic CDR detection Target 4: 89% of the interface Target 5: 88% of the interface Target 6: 83% of the interface Target 4 Target 5 amylase antibody CDRs

Targets 4,5,6 – new results Only restriction – at least 70% of the interface in the candidate complexes belongs to CDRs. Average running time: 25 min TargetBest RMSD RankInterface Area of the Correct solution Interface Area of the Highest Ranked Solution ~ 405A 2 ~ 765A ~ 435A 2 ~ 700A ~ 570A 2 ~ 600A 2

Target 7: T-cell receptor with streptococcal pyrogenic exotoxin TCR-SAG complex in blue, Streptococcal toxin in yellow In the PDB search we found a complex of TCR with staphylococcal enterotoxin. The toxins have high structural similarity  alignment is our first solution

Target 7: Docking Active site focusing: TCR: only loops that are relevant for SAG binding were selected. Toxin: loops from the interface of the complex computed by alignment were selected.

Target 7: Docking Results Active site focusing for TCR and SAG: Best result : RMSD 3.37, Rank 3, Running Time: 1 min Active site focusing for TCR only: Best result : RMSD 3.37, Rank 36, Running Time: 7 min

Conclusions 1 We have presented results of fast rigid docking algorithms, which are based on geometric shape complementarity only. The algorithms can be easily extended to include main-chain flexibility (hinge bending). Successful approximate focusing on the binding sites of the proteins “almost” ensures ranking of a “correct” solution at the top.

Conclusions 2 Despite the heuristic nature of the algorithms, which are based on local shape complementarity and not on exhaustive search of the transformation space, “correct solutions” are not lost. A “correct solution” always appears among the first few hundred, yet the “best solution” might exhibit significantly higher shape complementarity than the “native” one.

Conclusions 3 Re-ranking by a GOOD energy function of the top few hundred geometric solutions, would result in obtaining a correct solution. Biological knowledge of “similar” interactions can assist in the focusing on the binding sites. A fully automatic prediction can help to evaluate the relative merit of the various algorithms employed.

Acknowledgements The CAPRI organizers and evaluators. Maxim Shatsky, Hadar Benyamini, Inbal Halperin, Adi Barzilay, Snait Tamir. Raquel Norel.