8. Protein Docking

Presentation transcript:

8. Protein Docking

Prediction of protein-protein interactions
1. How do proteins interact?
2. Can we predict and manipulate those interactions?
- Prediction of structure: docking
- Prediction of binding
- Design: creation of new interactions

Docking vs. ab initio modeling
- De novo structure prediction (ROSETTA): from sequence (ADEFFGKLSTKK…….) to structure. Building blocks: backbone and side chains. Community assessment: CASP.
- Docking (ROSETTADOCK): from monomers to complex. Rigid-body degrees of freedom: 3 translations, 3 rotations. Community assessment: CAPRI.

Protein-protein docking
- Aim: predict the structure of a protein complex from its partners (monomers to complex)
- Rigid-body degrees of freedom: 3 translations, 3 rotations
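The six rigid-body degrees of freedom can be made concrete with a small sketch. The Z-Y-X Euler convention and the function names below are assumptions chosen for illustration; docking programs differ in how they parameterize rotations.

```python
import math

def rotation_matrix(ax, ay, az):
    """Rotation Rz(az) * Ry(ay) * Rx(ax); angles in radians (assumed convention)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_rigid_body(coords, angles, translation):
    """Move every atom by the same rotation (3 DOF) and translation (3 DOF)."""
    rot = rotation_matrix(*angles)
    moved = []
    for p in coords:
        rotated = [sum(rot[i][j] * p[j] for j in range(3)) for i in range(3)]
        moved.append(tuple(rotated[i] + translation[i] for i in range(3)))
    return moved

# Hypothetical two-atom "ligand": internal distances are unchanged by the move.
ligand = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
moved = apply_rigid_body(ligand, (0.0, 0.0, math.pi / 2), (0.0, 0.0, 5.0))
```

Because the same transform is applied to all atoms, the ligand's internal geometry is preserved; only its placement relative to the receptor changes.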

Problem: monomers change structure upon binding to a partner.
- Solution 1: tolerate clashes. Fast, but weak discrimination of the correct solution.
- Solution 2: model the changes. Slow, but precise.

Protein-protein docking: sampling strategies
- Initial approaches: techniques for fast detection of shape complementarity
  1. Fast Fourier Transform (FFT)
  2. Geometric hashing
- Advanced high-resolution approaches: model changes explicitly
  3. RosettaDock
- Data-driven docking
  4. HADDOCK

Find shape complementarity: 1. Fast Fourier Transform (FFT) (Ephraim Katzir)

Find shape complementarity: Fast Fourier Transform (FFT) (Ephraim Katzir)
Test all possible positions of ligand and receptor: for each rotation (R) of the ligand, evaluate all translations (T) of the ligand grid over the receptor grid. This is a correlation product, which can be calculated by FFT.
[Figure: correlation as a function of X and Y translation]

Find shape complementarity: Fast Fourier Transform (FFT) (Ephraim Katzir)
1. Discretize receptor (R) and ligand (L) onto grids a and b. Surface cells: 1. Interior cells: < 0 for R, > 0 for L.
2. Rotate the ligand.
3. Fast Fourier Transform of both grids: A = DFT(a), B = DFT(b)
4. Correlation function: C = A* · B, where A* is the complex conjugate of A
5. Inverse transform: S = iDFT(C)
Computational cost: N^3 log N^3 (instead of N^6)
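A one-dimensional toy version of this scheme, as a sketch only: the grid values (1 on the surface, -15 inside R, 1 inside L), the grid length, and the function names are illustrative assumptions, and real docking codes work on 3D grids with an optimized FFT library. It checks that S = iDFT(conj(A) · B) reproduces the brute-force sum over all translations.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_scores(a, b):
    """score[t] = sum_x a[x] * b[(x + t) % n], computed as iDFT(conj(A) * B)."""
    n = len(a)
    A, B = fft(a), fft(b)
    C = [A[k].conjugate() * B[k] for k in range(n)]
    return [round((v / n).real, 6) for v in fft(C, inverse=True)]

def direct_scores(a, b):
    """Same quantity by brute force: O(n^2) in 1D, N^6 in the 3D case."""
    n = len(a)
    return [sum(a[x] * b[(x + t) % n] for x in range(n)) for t in range(n)]

# Toy grids (assumed values): receptor surface = 1, receptor interior = -15,
# ligand cells = 1, empty space = 0.
receptor = [1, -15, -15, 1, 0, 0, 0, 0]
ligand = [1, 1, 1, 0, 0, 0, 0, 0]
scores = fft_scores(receptor, ligand)
```

Shifts that push the ligand into the receptor interior pick up the large negative penalty, while shifts that only touch the surface score positively, which is why the maximum of the correlation marks candidate docked positions.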

Find shape complementarity: Fast Fourier Transform (FFT)
[Figure: grid cells labeled surface, interior and binding site; the inverse FFT (IFFT) yields the correlation as a function of X and Y translation]
The FFT increases the speed by 10^7.

Some FFT-based docking protocols
- ZDOCK (Weng)
- ClusPro (Vajda, Camacho)
- PIPER (Vajda, Kozakov)
- MolFit (Eisenstein)
- DOT (TenEyck)
- HEX (Ritchie): FFT in rotation space

Shape complementarity: 2. Geometric hashing (PatchDock, Wolfson & Nussinov)
Matching of puzzle pieces:
1. Define geometric patches (concave, convex, flat)
2. Surface patch matching
3. Filtering and scoring

Hashing: alpha shapes
Formalizes the idea of "shape". In 2D, an "edge" between two points is "alpha-exposed" if there exists a circle of radius alpha such that the two points lie on the circle and the circle contains no other points from the point set.
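The 2D definition translates almost directly into code. The following is a minimal sketch under the assumption that "contains no other points" means no other point lies strictly inside the circle; the function name and the tolerance are hypothetical.

```python
import math

def is_alpha_exposed(p, q, points, alpha, eps=1e-9):
    """True if some circle of radius alpha through p and q is empty of other points."""
    d = math.dist(p, q)
    if d == 0 or d > 2 * alpha:
        return False  # no circle of radius alpha passes through both points
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    h = math.sqrt(max(alpha * alpha - (d / 2) ** 2, 0.0))  # midpoint-to-center distance
    ux, uy = -(q[1] - p[1]) / d, (q[0] - p[0]) / d         # unit normal to the edge
    for side in (1, -1):  # the two candidate circle centers
        center = (mx + side * h * ux, my + side * h * uy)
        others = (r for r in points if r != p and r != q)
        if all(math.dist(center, r) >= alpha - eps for r in others):
            return True
    return False

# Hypothetical point set: unit square corners plus the center point.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
outer = is_alpha_exposed(pts[0], pts[1], pts, alpha=0.6)
inner = is_alpha_exposed(pts[0], pts[4], pts, alpha=0.6)
```

With alpha = 0.6 the outer square edge is exposed, because an empty circle can sit outside the square, while the edge to the center point is not: both candidate circles contain another corner.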

Hashing: sparse surface representation
(Slide from Jens Meiler)

Docking with geometric hashing: PATCHDOCK
- Fast and versatile approach
- Its speed allows easy extension to multiple-protein docking, flexible hinge docking, etc.
- An extension of this protocol, FIREDOCK, includes side chain optimization (RosettaDock-like): a very flexible, fast and accurate protocol

High-resolution docking with Rosetta: RosettaDock
[Flowchart: Random start position, Low-resolution Monte Carlo search, Filters, High-resolution refinement, Clustering, Predictions; annotated 10^5]

Choosing starting orientations
1. Global search
- Random translation
- Random rotation (Euler angles):
  1. Tilt direction [ ° ]
  2. Tilt angle [0:90°]
  3. Spin angle [ ° ]
Euler angles are independent and guarantee a non-biased search.

Choosing starting orientations
2. Local refinement
- Translation: 3 Å normal, 8 Å parallel
- Rotation:
  1. Tilt direction [0±8°]
  2. Tilt angle
  3. Spin angle

Overview of docking algorithm
[Flowchart: Random start position, Low-resolution Monte Carlo search, Filters, High-resolution refinement, Clustering, Predictions; annotated 10^5]

Low-resolution search
Mimics a physical diffusion process:
1. Perturbation
2. Monte Carlo search
3. Rigid body translations and rotations
4. Residue-scale interaction potentials
Protein representation: backbone atoms + average centroids
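The perturb-and-accept loop behind such a search can be sketched with the standard Metropolis criterion. This is a toy under stated assumptions, not the RosettaDock implementation: the pose is a plain list of numbers, `toy_score` is a hypothetical stand-in for the residue-scale potentials, and the step size and temperature are arbitrary.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng):
    """Always accept downhill moves; accept uphill moves with probability exp(-dE/kT)."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / kT)

def mc_search(score, start, steps=2000, step_size=0.5, kT=1.0, seed=0):
    """Random perturbations of a pose vector, filtered by the Metropolis criterion."""
    rng = random.Random(seed)
    current, e_cur = list(start), score(start)
    best, e_best = list(current), e_cur
    for _ in range(steps):
        trial = [v + rng.gauss(0.0, step_size) for v in current]  # perturbation
        e_trial = score(trial)
        if metropolis_accept(e_trial - e_cur, kT, rng):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best

def toy_score(pose):
    """Hypothetical smooth score with its minimum at (1, -2, 0.5)."""
    target = (1.0, -2.0, 0.5)
    return sum((v - t) ** 2 for v, t in zip(pose, target))

best_pose, best_energy = mc_search(toy_score, [5.0, 5.0, 5.0])
```

Occasionally accepting uphill moves is what lets the search escape shallow local minima instead of stopping at the first one it reaches.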

Residue-scale scoring
Score                 Representation                               Physical force
Contacts              r(centroid-centroid) < 6 Å                   Attractive van der Waals
Bumps                 (r - R_ij)^2                                 Repulsive van der Waals
Residue environment   -ln(P_env)                                   Solvation
Residue pair          -ln(P_ij)                                    Hydrogen bonding, electrostatics, solvation
Alignment             -1 for interface residues in antibody CDR    (bioinformatic)
Constraints           varies                                       (biochemical)
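The first two terms are simple enough to sketch on centroid coordinates. Several things here are assumptions rather than details from the slides: a single bump radius stands in for the pair-specific R_ij, the bump term is applied only when r is below that radius, and no weights are applied to either term.

```python
import math

def contact_count(centroids_a, centroids_b, cutoff=6.0):
    """Number of inter-protein centroid pairs closer than the cutoff (in Å)."""
    return sum(1 for a in centroids_a for b in centroids_b
               if math.dist(a, b) < cutoff)

def bump_penalty(centroids_a, centroids_b, radius):
    """Sum of (r - R)^2 over pairs with r < R; one shared R is an assumption."""
    total = 0.0
    for a in centroids_a:
        for b in centroids_b:
            r = math.dist(a, b)
            if r < radius:
                total += (r - radius) ** 2
    return total

# Hypothetical centroid coordinates in Å.
receptor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(4.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
n_contacts = contact_count(receptor, ligand)
bumps = bump_penalty(receptor, ligand, radius=5.0)
```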

High-resolution optimization: Monte Carlo with Minimization (MCM)
Cycles of iterative optimization: START, random perturbation, side chain optimization, rigid body minimization, MC (accept or reject), repeat, FINISH
[Figure: energy as a function of rigid body orientation]
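The MCM cycle can be illustrated on a rugged one-dimensional energy. This is a sketch only: the energy function, the crude derivative-free minimizer, and all parameters are hypothetical, and the side chain optimization step is folded into the single `minimize` call.

```python
import math
import random

def minimize(energy, x, step=0.1, tol=1e-6):
    """Crude local descent in 1D: walk downhill, halving the step when stuck."""
    e = energy(x)
    while step > tol:
        for cand in (x - step, x + step):
            e_cand = energy(cand)
            if e_cand < e:
                x, e = cand, e_cand
                break
        else:
            step /= 2
    return x, e

def mcm(energy, x0, cycles=50, kick=1.0, kT=0.5, seed=0):
    """Perturb, minimize, then apply the Metropolis test to the minimized energy."""
    rng = random.Random(seed)
    x, e = minimize(energy, x0)
    best = (x, e)
    for _ in range(cycles):
        trial_x, trial_e = minimize(energy, x + rng.uniform(-kick, kick))
        if trial_e <= e or rng.random() < math.exp(-(trial_e - e) / kT):
            x, e = trial_x, trial_e
            if e < best[1]:
                best = (x, e)
    return best

def rugged(x):
    """Hypothetical energy: a broad well with many local minima."""
    return 0.1 * x * x + math.sin(3 * x)

best_x, best_e = mcm(rugged, x0=4.0)
```

Because every trial is minimized before the Metropolis test, the walk effectively hops between local minima rather than between arbitrary points on the landscape.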

Filters
Low resolution:
- Antibody profiles: antigen-binding residues at the interface
- Contact filters
- Biological information: interface residues, interacting residue pairs
High resolution:
- Energy filters speed up the creation of low-energy models
- Stages in the diagram: random perturbation, minimization of rigid body orientation, 5 cycles of MC optimization, 45 cycles of MC optimization, final scoring, with Filter1, Filter2 and Filter3 applied between stages

Clustering
- Compare all top-scoring decoys pairwise
- Cluster decoys hierarchically
- Decoys within e.g. 2.5 Å form a cluster
- Cluster size represents ENTROPY
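A small sketch of hierarchical clustering with a distance cutoff. Average linkage is an assumption (the slides do not say which linkage is used), and the decoys are reduced to hypothetical positions on a line so that the absolute difference plays the role of the pairwise RMSD.

```python
def cluster_decoys(dist, n, cutoff=2.5):
    """Agglomerative clustering: merge the closest pair of clusters until the
    smallest average inter-cluster distance exceeds the cutoff."""
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        return sum(dist(i, j) for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cutoff:
            break
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return sorted(clusters, key=len, reverse=True)  # largest cluster first

# Hypothetical decoys placed on a line; |x_i - x_j| stands in for pairwise RMSD (Å).
positions = [0.0, 1.0, 1.5, 10.0, 11.0, 30.0]
clusters = cluster_decoys(lambda i, j: abs(positions[i] - positions[j]),
                          len(positions))
```

Sorting by cluster size reflects the point above: a well-populated cluster indicates a broad, frequently sampled energy basin.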

Assessment 1: Benchmark studies
- Bound-Bound: start with the bound complex structure, but remove the side chain configurations so they must be predicted
- Bound-Unbound (Semibound)
- Unbound-Unbound: start with the individually crystallized component proteins in their unbound conformation
The benchmark set contains 54 targets for which bound and unbound structures are known, for example:
- trypsin + inhibitor
- barnase + barstar
- α-chymotrypsin + inhibitor
- subtilisin + inhibitor
- lysozyme + antibodies
- hemagglutinin + antibody
- actin + deoxyribonuclease I
- subtilisin + prosegment

Assessment of method on benchmark (54 proteins, Gray et al., 2003)
Funnel: 3/5 top-scoring models within 5 Å rmsd ……..
Overall performance:
Protocol                        Criterion   Success
Bound docking, perturbation     1           42/54
Unbound docking, perturbation   2           32/54
Unbound docking, global         3           28/32
Criteria:
1. More than three of the top five decoys (by score) have rmsd less than 5 Å
2. More than three of the top five decoys (by score) predict more than 25% of native residue contacts
3. The rank of the first cluster with >25% native residue contacts
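The funnel test is easy to state in code. In this sketch the decoy list is invented, lower score is assumed to be better, and the threshold is left as a parameter because the slide gives it both as "3/5" and as "more than three".

```python
def has_funnel(decoys, top_n=5, rmsd_cutoff=5.0, min_hits=3):
    """decoys: list of (score, rmsd) pairs; lower score is assumed to be better.

    Returns True if at least min_hits of the top_n decoys by score
    have an rmsd below rmsd_cutoff (in Å).
    """
    top = sorted(decoys, key=lambda d: d[0])[:top_n]
    hits = sum(1 for _, rmsd in top if rmsd < rmsd_cutoff)
    return hits >= min_hits

# Hypothetical (score, rmsd) pairs for six decoys.
decoys = [(-50.0, 1.2), (-49.0, 2.0), (-48.0, 12.0),
          (-47.0, 3.1), (-46.0, 20.0), (-10.0, 0.5)]
funnel = has_funnel(decoys)
```

Here three of the five best-scoring decoys are near-native, so the toy target counts as having a funnel under the 3/5 reading but not under a stricter threshold of four.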

Score and performance are correlated with binding affinity
[Figure: Δ score (calculated) for bound backbone docking vs. -log Ka (experimental); targets with funnels and targets without funnels are marked separately]

Limitation of "rotamer-based" modeling
- Near-native model with clash
- Non-native model without clash
[Figure: Trp 172 and Trp 215, PDB code 1CHO. Orange and red: native complex; blue: docking model]

Improved side chain modeling at the interface
Rtmin: rotamer trial with minimization
1. Randomly pick one residue.
2. Screen a list of rotamers.
3. Minimize each of these rotamers.
4. Accept the one that yields the lowest energy.
Additional rotamers: include the free side chain conformation in the rotamer library.
[Figure: Rot I, Rot II, minimization, native]
Wang, OSF & Baker,

RosettaDock simulation
- 1 model per simulation, plotted as energy vs. RMSD (structural similarity to the starting model)
- Final model selected based on energy (and/or sample density)
[Figure: energy vs. rigid body orientation, measured as RMSD to an arbitrary starting structure (Å)]

RosettaDock simulation
1. Initial search [Figure: energy vs. RMSD to arbitrary starting structure]
2. Refinement [Figure: energy vs. RMSD to starting structure of refinement (Å)]

CAPRI Target 12: Cohesin-Dockerin
Side chain flexibility is important.
- 0.27 Å interface rmsd
- 87% native contacts
- 6% wrong contacts
- Overall rank 1
[Figure: red, orange: X-ray; blue: model; green: unbound]
Carvalho et al. (2003) PNAS

Details of T12 interface
[Figure: Dockerin and Cohesin interface residues D39, N37, S45, L83, E86, Y74, L22, R53. Red, orange: X-ray; blue: model]

Similar landscapes for different Rosetta predictions
[Figure: docking energy landscape; folding energy landscape (Phil Bradley)]
The energy function describes well the principles underlying the correct structure of both monomers and complexes.
Schueler-Furman et al. (2005) Science

A Challenging Target: RF1-HemK (T20)
Challenge:
- Large complex
- RF1 to be modeled from RF2
- Disordered Q-loop
Hope:
- Q235 methylated
- A Gln analog in the HemK crystal
Strategy: Trimming, Docking, Loop Modeling, Refining
Keys to success:
- Location of the interface with the truncated protein
- Separate modeling of the large conformational change in the key loop
[Figure labels: RF1, HemK, Q-235, Q252, loop1, loop2, Q-loop]

Prediction of large conformational change (Gln235)
- I_rmsd 2.34 Å
- F_nat 34.2%
- Gln235 Cα atom shift: 14.13 Å to 3.91 Å
- Q-loop global Cα rmsd: 11.8 Å to 4.8 Å
[Figure: red, orange: bound; green: unbound; blue: model]

Docking with backbone minimization
Docking Monte Carlo Minimization (MCM): START, random perturbation, repack, minimization, FINISH
Degrees of freedom: rigid-body, backbone, side chain
[Figure: fold tree; interface energy vs. interface RMSD for 2SNI; number of "hits" in the top 10 models. Red: bound rigid; green: unbound rigid; blue: unbound flexible]

Docking with loop minimization
Flexible docking: minimize rigid-body and loop degrees of freedom simultaneously
Result: correctly predicted loop conformation
[Figure: fold tree; all-atom energy vs. interface RMSD. Red, orange: bound (1T6G, Sansen, S. et al., J.B.C. (2004)); blue: model; green: unbound (1UKR, Krengel U. et al., JMB (1996))]

Docking with loop rebuilding
[Figure: all-atom energy vs. ligand RMSD for 1BTH; panels: bound rigid, unbound rigid, unbound flexible loop]

Flexible backbone protein-protein docking using ensembles
- Incorporate backbone flexibility by using a set of different templates
- Generation of the ensembles: with the Rosetta relax protocol, from NMR ensembles, etc.
Chaudhury & Gray (2008)

Sampling among conformers during docking
Exchange between templates during the protocol

Evaluation of 4 different protocols
1. Key-lock (KL) model: rigid-backbone docking
2. Conformer selection (CS) model: ensemble docking algorithm
3. Induced fit (IF) model: energy-gradient-based backbone minimization
4. Combined conformer selection/induced fit (CS/IF) model
Can teach us about the possible binding mechanism (e.g. induced fit vs. key-lock).
[Figure legend: brown: high-quality decoys; orange: medium-quality decoys]

RosettaDock: summary
- First program to introduce general (side chain) flexibility during docking
- Advanced the docking field towards unbiased high-resolution modeling
- Many other protocols have since incorporated RosettaDock as a high-resolution final step
- Targeted introduction of backbone flexibility can improve modeling dramatically

4. Data-driven docking
Challenges:
- Large conformational space to sample
- Conformational changes of proteins upon binding
Approach: restrict the search space using previous information
- HADDOCK (High Ambiguity Driven protein-protein Docking)

Scheme of HADDOCK (Bonvin, JACS 2003)
Information about the complex can be retrieved from several sources.

HADDOCK computational scheme
1. Derive Ambiguous Interaction Restraints (AIRs):
- Active residues: involved in the interaction, and solvent accessible
- Passive residues: neighbors of active residues
2. Create a CNS restraints file (as used in NMR structure determination)
Rationale: include the AIRs in the energy function and find the protein complex structure with minimum energy.
Similar to:
- Solving a structure by NMR
- Homology modeling with constraints (e.g. MODELLER)

Overview of HADDOCK
Start position
1. Rigid body energy minimization
- Rotational minimization, then rotational and translational minimization
- Align molecules if anisotropic data is available
- Satisfy the maximum number of AIRs
- Retain the top 200
2. Semi-flexible simulated annealing (SA)
- High-temperature rigid body search
- Rigid body SA
- Semi-flexible SA with flexible side chains at the interface
- Semi-flexible SA with a fully flexible interface (both backbone and side chains)
3. Flexible explicit solvent refinement: improves energy ranking
4. Clustering
Predictions

Docking: Summary & Outlook
Efficient search using:
- Fast sampling techniques (e.g. FFT, geometric hashing), and/or
- Restraints to the relevant region (e.g. biological constraints)
Challenge: conformational changes in the partners
Introduction of flexibility has improved modeling to high resolution:
- Full side chain flexibility (Rosetta)
- Targeted introduction of backbone flexibility
Larger changes can be incorporated using techniques such as Normal Mode Analysis.