Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine Ajay N. Jain UCSF Cancer Research Institute and Comprehensive.

Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine Ajay N. Jain UCSF Cancer Research Institute and Comprehensive Cancer Center, University of California Presentation by Susan Tang CS 379a January 23, 2006

Protein-Ligand Docking Overview Goal - To predict how well a given set of ligands will bind to a protein structure - To predict the structure of bound protein-ligand complexes Components - Search method: explore different ways that ligand can interact/fit with protein - Scoring function: assign a quantitative value to each ligand/protein fit

Protein-Ligand Docking Overview Criteria 1) Docking accuracy Measures ability to find a conformation + alignment (pose) of a ligand in the protein that is close to reality 2) Scoring accuracy Ability to rank a correct pose of a molecule higher than an incorrect one 3) Screening utility Ability to pick out the true ligands from a set that also contains non-binding molecules 4) Speed How fast the algorithm can screen a library of ligands

Surflex: A new docking methodology Combines Hammerhead’s empirical scoring function with a molecular similarity method to generate putative poses of ligand fragments Like Hammerhead, Surflex has one mode that uses an incremental construction search approach. Surflex also has a second mode: a whole molecule approach that is faster and more accurate Surflex is designed primarily as a screening tool for small molecule libraries

Surflex: Computational Design Protomol Generation First create an ideal active site ligand from the protein structure of interest Input: (a) protein structure (b) list of residues to identify protein active site Output: A protomol, or target to which potential ligands or ligand fragments are aligned based on molecular similarity Procedure: Molecular fragments are put into the protein binding site in multiple positions → optimized for interaction with protein → select high-scoring nonredundant fragments → protomol formation

Surflex: Computational Design Protomol for streptavidin compared with the native pose of biotin (green) The bond being pointed to is broken by Surflex to make fragments of biotin for docking.

Surflex: Computational Design Docking Ligands are docked into the protein to optimize scoring function Input: (a) protein structure, (b) protomol, (c) ligand(s) Output: The optimized poses of docked ligands along with corresponding scores Procedure: Divide input ligand into 1-10 molecular fragments → search each fragment in terms of conformation → each conformation of each fragment is aligned to protomol to get poses with maximum molecular similarity to protomol → score aligned fragments and keep those with highest score and minimal protein interpenetration → construct full ligand molecule from the aligned fragments using either an incremental construction approach or whole molecule approach → highest scoring poses undergo further refinement of conformation and alignment
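The step "score aligned fragments and keep those with highest score and minimal protein interpenetration" can be illustrated with a small sketch. This is not Surflex's implementation: the `FragmentPose` type, the penetration cutoff, and the number of poses kept are hypothetical choices made only to show the idea of filtering on penetration and then ranking by score.

```python
from typing import NamedTuple


class FragmentPose(NamedTuple):
    """Hypothetical record for one aligned fragment pose."""
    label: str
    score: float        # assumed convention: higher is better
    penetration: float  # overlap with the protein; lower is better


def keep_best_poses(poses, max_penetration=1.0, top_n=2):
    """Drop poses that penetrate the protein too much, then keep the top scorers.

    max_penetration and top_n are illustrative values, not Surflex defaults.
    """
    acceptable = [p for p in poses if p.penetration <= max_penetration]
    return sorted(acceptable, key=lambda p: p.score, reverse=True)[:top_n]


poses = [
    FragmentPose("a", 7.2, 0.3),
    FragmentPose("b", 8.1, 2.5),  # best score, but clashes with the protein
    FragmentPose("c", 6.0, 0.1),
    FragmentPose("d", 5.5, 0.0),
]
best = keep_best_poses(poses)
print([p.label for p in best])
```

Treating score and penetration as two separate criteria mirrors the slides, which list combining them into a single score as a possible improvement.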

Surflex: Computational Design Incremental Construction vs. Whole Molecule Algorithm Incremental Construction - Makes strong assumption that maximizing the similarity of tiny fragments to the protomol will generate good poses Whole Molecule Algorithm - bypasses the strong independence assumption made in incremental construction - “dead” pieces are carried with the “live” piece during conformation search - when creating putative poses to protomol, the “dead” pieces in their arbitrary initial conformation are carried into the molecular similarity computation → eliminate those with worst protein interpenetration - for remaining poses, score on basis of individual fragments - recursive search yields whole molecules that consist of fragments selected from different docked poses - these whole molecules score well in total, over all fragments

Illustrates the process of docking biotin to streptavidin (blue) Gray indicates the “live” fragment Magenta indicates the “dead” fragment Green lines show the result of merging the two well-docked fragments at the atoms indicated by yellow circles The merged pose closely follows the parent fragments’ original configurations

Surflex: Evaluation 1) Evaluation of reliability and accuracy of dockings - Comparison with experimental results on 81 protein/ligand pairs - The pairs were selected to represent structural diversity 2) Evaluation of Surflex’s utility as a screening tool - Performed on 2 protein targets (thymidine kinase and estrogen receptor) - Competing docking methods were tested side by side using the same data set for comparison purposes (GOLD, Dock, FlexX) 3) Evaluation of Surflex’s docking speed - Investigate relationship between docking time and # of rotatable bonds

Surflex: Evaluation Data Set Construction Filtering Criteria: (1) 15 or fewer rotatable bonds → Most small molecules have <= 15 rotatable bonds (2) no covalent attachments between ligand and protein → Since Surflex’s scoring function was developed strictly on noncovalent complexes (3) ligands with no obvious errors in structure → Undesirable to modify an existing protein-ligand complex prior to testing Starting point: the data set used for the GOLD docking program, 134 protein-ligand complexes → filter → 81 protein-ligand complexes
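The three filtering criteria amount to a simple predicate over each complex. A minimal sketch follows, assuming each complex has already been annotated with a rotatable bond count and two flags; the `Complex` record and the identifiers in the example are hypothetical and are not entries from the actual data set.

```python
from dataclasses import dataclass


@dataclass
class Complex:
    """Hypothetical annotation of one protein-ligand complex."""
    name: str
    rotatable_bonds: int
    covalent_attachment: bool   # ligand covalently bound to the protein
    structure_errors: bool      # obvious errors in the ligand structure


def passes_filter(c: Complex, max_rotatable: int = 15) -> bool:
    """Apply the three criteria from the slide to one complex."""
    return (
        c.rotatable_bonds <= max_rotatable
        and not c.covalent_attachment
        and not c.structure_errors
    )


# Made-up examples, one failing each criterion in turn.
candidates = [
    Complex("cplx_a", 6, False, False),
    Complex("cplx_b", 21, False, False),   # too flexible
    Complex("cplx_c", 4, True, False),     # covalent attachment
    Complex("cplx_d", 9, False, True),     # structural error
]
kept = [c.name for c in candidates if passes_filter(c)]
print(kept)
```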

Surflex: Evaluation Results 1) Evaluation of reliability and accuracy of dockings Describes how thorough the search procedure is and to what extent the scoring function can recognize good dockings Surflex returned a pose within 2.5 angstroms rmsd in 94% of cases Surflex returned a BEST scoring pose that was within 2.5 angstroms in 86% of cases With a single docking from a random initial pose, the chance of finding a correct or nearly correct pose averaged ~70%
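The 2.5 angstrom criterion refers to the root-mean-square deviation between the docked ligand pose and the experimental one. A minimal sketch of that check is given below, assuming both poses list the same atoms in the same order and share the protein's coordinate frame; symmetry-equivalent atoms are not handled, and the function names are illustrative.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Each argument is a list of (x, y, z) tuples in angstroms. No superposition
    is done, because docked poses are compared in the protein's frame.
    """
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p[i] - r[i]) ** 2 for p, r in zip(pose, reference) for i in range(3)
    )
    return math.sqrt(squared / len(pose))


def is_successful_docking(pose, reference, cutoff=2.5):
    """True when the pose lies within the cutoff (2.5 angstroms on the slide)."""
    return rmsd(pose, reference) <= cutoff


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # shifted by 1 angstrom
far_pose = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]    # shifted by 3 angstroms
print(rmsd(near_pose, reference), is_successful_docking(far_pose, reference))
```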

Surflex: Evaluation Results 2) Evaluation of Surflex’s utility as a screening tool Tests ability of program to detect true positives against a background of random molecules (sensitivity vs. specificity) Surflex had a True Positive rate of > 80% at a False Positive rate of < 1% Surflex had the best performance (lowest FP rate for a given TP rate) out of the different individual and combined methods assayed
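A figure such as "TP rate at a given FP rate" can be computed directly from the docking scores of known ligands and background molecules. The sketch below shows one way to do so, assuming higher scores mean better predicted binding; the function name and the scores in the example are made up for illustration.

```python
import math


def tpr_at_fpr(active_scores, decoy_scores, max_fpr):
    """True positive rate at the loosest cutoff that keeps the FP rate <= max_fpr.

    Assumes higher score = better predicted binder. A molecule is called a hit
    when its score is strictly above the cutoff.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    decoys = sorted(decoy_scores, reverse=True)
    # Number of decoys allowed above the cutoff (small epsilon guards float error).
    allowed = math.floor(max_fpr * len(decoys) + 1e-9)
    cutoff = decoys[allowed] if allowed < len(decoys) else float("-inf")
    hits = sum(1 for s in active_scores if s > cutoff)
    return hits / len(active_scores)


actives = [9.0, 8.0, 4.5, 2.0]        # made-up scores of known ligands
decoys = [5.0, 4.0, 3.0, 1.0, 0.0]    # made-up scores of background molecules
print(tpr_at_fpr(actives, decoys, 0.0), tpr_at_fpr(actives, decoys, 0.2))
```

Sweeping `max_fpr` from 0 to 1 traces out the sensitivity vs. specificity trade-off that the slide summarizes with a single operating point.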

Surflex: Evaluation Results 3) Evaluation of Surflex’s docking speed Docking speed becomes very important in screening large compound libraries. Surflex demonstrated a docking time that was approx. linear in the number of rotatable bonds Rigid molecules took a few seconds and each additional rotatable bond took an additional ~10 seconds Surflex yielded a mean running time of 44 seconds for the 81 protein-ligand complexes in the test set used earlier Docking speeds for FlexX, DOCK, and GOLD are reported in seconds per molecule (Surflex speed is comparable to these times) Quantitative comparison across methods is difficult due to differences in hardware and methodology
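The approximately linear timing behavior can be turned into a rough back-of-the-envelope estimate for a screening run. In the sketch below the ~10 seconds per rotatable bond comes from the slide, while the 3 second base cost for a rigid molecule is an assumed stand-in for "a few seconds"; actual times depend on hardware and settings.

```python
def estimate_docking_seconds(rotatable_bonds, base_seconds=3.0, per_bond_seconds=10.0):
    """Linear time model: a fixed base cost plus a per-rotatable-bond cost.

    base_seconds is an assumption standing in for "a few seconds";
    per_bond_seconds is the ~10 s per bond quoted on the slide.
    """
    if rotatable_bonds < 0:
        raise ValueError("rotatable_bonds must be non-negative")
    return base_seconds + per_bond_seconds * rotatable_bonds


def estimate_library_hours(rotatable_bond_counts):
    """Total estimated time, in hours, for docking a whole library serially."""
    total = sum(estimate_docking_seconds(n) for n in rotatable_bond_counts)
    return total / 3600.0


library = [0, 2, 4, 6, 8]   # made-up rotatable bond counts
print(estimate_docking_seconds(4), estimate_library_hours(library))
```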

Conclusions Surflex marks a step forward in flexible molecular docking programs Compared to the best docking methods available, Surflex is: - as fast - as accurate in terms of docked ligand RMSD - much more accurate in terms of scoring Assaying the top scoring 1% of compounds in the screening library should yield a large proportion of true positives Potential areas of improvement - scoring and penetration terms should be combined into a single score - scoring function should include training on non-binding ligands (negative examples) - effect of nonbonded self-interactions within ligands should be accounted for explicitly - allow a degree of protein flexibility (side chain movement)