Modelling binding site with 3DLigandSite Mark Wass

Slides:



Advertisements
Similar presentations
Using the SmartPLS Software “Structural Model Assessment”
Advertisements

Functional Site Prediction Selects Correct Protein Models Vijayalakshmi Chelliah Division of Mathematical Biology National Institute.
Protein Structure Prediction using ROSETTA
The Cochrane Library High quality evidence about the effectiveness of treatments.
Pfam a resource for remote homology domain identification et al NAR 2014.
© Wiley Publishing All Rights Reserved. Analyzing Protein Sequences.
Protein Tertiary Structure Prediction
Protein Functional Site Prediction The identification of protein regions responsible for stability and function is an especially important post-genomic.
Kate Milova MolGen retreat March 24, Microarray experiments. Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
Applications of Homology Modeling Hanka Venselaar.
Modelling Workshop - Some Relevant Questions Prof. David Jones University College London Where are we now? Where are we going? Where should.
Kate Milova MolGen retreat March 24, Microarray experiments. Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
Protein Tertiary Structure Prediction Structural Bioinformatics.
Week 14 Chapter 16 – Partial Correlation and Multiple Regression and Correlation.
Application Process USAJOBS – Application Manager USA STAFFING ® —OPM’S AUTOMATED HIRING TOOL FOR FEDERAL AGENCIES.
Using 3D-SURFER. Before you start 3D-Surfer can be accessed at For visualization.
Current Status of Homology Modeling Using MCSG Structures 319 MCSG structures in PDB have over 400,000 sequence homologues. These structures represent.
Protein Tertiary Structure Prediction
DockoMatic: Automated Tool for Homology Modeling and Docking Studies DOCKOMATIC-Student Procedure-Homology Modeling & Molecular Docking Tutorial.
Practical session 2b Introduction to 3D Modelling and threading 9:30am-10:00am 3D modeling and threading 10:00am-10:30am Analysis of mutations in MYH6.
COMPARATIVE or HOMOLOGY MODELING
CRB Journal Club February 13, 2006 Jenny Gu. Selected for a Reason Residues selected by evolution for a reason, but conservation is not distinguished.
Author: Sali Allister Date: 10/01/2012 COASTAL Google Analytics Report September 2011– December /09/2011 – 08/12/11.
1 PyMOL Evolutionary Trace Viewer 1.1 Lichtarge Lab Sept. 13, 2010.
Representations of Molecular Structure: Bonds Only.
Classifier Evaluation Vasileios Hatzivassiloglou University of Texas at Dallas.
Protein Folding Programs By Asım OKUR CSE 549 November 14, 2002.
Multiple Mapping Method with Multiple Templates (M4T): optimizing sequence-to-structure alignments and combining unique information from multiple templates.
BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha.
Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:
Protein Sequence Analysis - Overview - NIH Proteomics Workshop 2007 Raja Mazumder Scientific Coordinator, PIR Research Assistant Professor, Department.
Manually Adjusting Multiple Alignments Chris Wilton.
Sketches and prototypes for the Orlando Six Degrees of Separation Project.
PROTEIN PATTERN DATABASES. PROTEIN SEQUENCES SUPERFAMILY FAMILY DOMAIN MOTIF SITE RESIDUE.
Regression Analysis: Part 2 Inference Dummies / Interactions Multicollinearity / Heteroscedasticity Residual Analysis / Outliers.
Tweaking BLAST Although you normally see BLAST as a web page with boxes to place data in and tick boxes, etc., it is actually a command line program that.
Guidelines for sequence reports. Outline Summary Results & Discussion –Sequence identification –Function assignment –Fold assignment –Identification of.
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
Protein Tertiary Structure Prediction Structural Bioinformatics.
EBI is an Outstation of the European Molecular Biology Laboratory. A web based integrated search service to understand ligand binding and secondary structure.
FM Model Assessment: Old scores and New Combinations ShuoYong Shi Nick Grishin Lab.
Sequence: PFAM Used example: Database of protein domain families. It is based on manually curated alignments.
T3/Tutorials: Data Submission
Step 1: Specify a null hypothesis
MassMatrix PC Graphical User Interface (GUI) Manual
PDBemotif A web based integrated search service to understand ligand binding and secondary structure properties in macromolecular structures.
MassMatrix Search Results Explained
USAJOBS – Application Manager
Week 14 Chapter 16 – Partial Correlation and Multiple Regression and Correlation.
(Residuals and
Volume 19, Issue 8, Pages (August 2011)
Prediction of IgE-binding epitopes by means of allergen surface comparison and correlation to cross-reactivity  Fabio Dall'Antonia, PhD, Anna Gieras,
Protein Folding and Protein Threading
BLAST.
Volume 86, Issue 6, Pages (June 2004)
Protein structure prediction.
Yang Liu, Perry Palmedo, Qing Ye, Bonnie Berger, Jian Peng 
Darren Thompson, Mark B Pepys, Steve P Wood  Structure 
Mechanism of Triton binding to ShHTL7
Complementarity of Structure Ensembles in Protein-Protein Binding
Volume 20, Issue 6, Pages (June 2012)
Ligand Binding to the Voltage-Gated Kv1
Volume 19, Issue 8, Pages (August 2011)
Figure 1. Workflow of the HawkDock server that is divided into three major steps: (i) input of unbound or bound protein ... Figure 1. Workflow of the HawkDock.
Protein Structure Prediction: Inroads to Biology
Volume 86, Issue 6, Pages (June 2004)
Alignment of the deduced amino acid sequences of the myosin light chain 2 (MLC2) proteins. Alignment of the deduced amino acid sequences of the myosin.
Volume 12, Issue 11, Pages (November 2004)
Homology modeling in short…
Gene regulatory regions of the insect/crustacean egr-B homologs.
Presentation transcript:

Modelling binding site with 3DLigandSite Mark Wass

CASP MEEYKVVVCGSGPVALGCFTarget sequence (2 per day) Predicted structure Predicted Binding site Assessment Results Performance Compared to other groups Human predictors Server predictors 3 days 3 weeks

3DLigandSite Developed as a result of participation in CASP MTEYKLVVVGAGGVGKSALTIQLIQ

3DLigandSite Homologous structures 2oai 2pls 2p4p Calcium Magnesium Wass & Sternberg Proteins 2009

3DLigandSite Homologous structures 2pls 2p4p 2oai Calcium Magnesium Wass & Sternberg Proteins 2009

3DLigandSite Homologous structures 2p4p 2pls 2oai Calcium Magnesium Wass & Sternberg Proteins 2009

3DLigandSite Homologous structures 2p4p 2pls 2oai Calcium Magnesium Wass & Sternberg Proteins 2009

3DLigandSite Homologous structures Calcium Magnesium Wass & Sternberg Proteins 2009

3DLigandSite CalciumTrue positiveFalse Positive Wass & Sternberg Proteins 2009

MCC Sternberg group LEE group Assessed using Matthews’ Correlation coefficient – -range -1 – +1 LEE & Sternberg not statistically different. LEE & Sternberg are statistically different to all other predictors (p <0.01). Adapted from Lopez et al., 2009 Groups Performance at CASP8

3DLigandSite Automating our CASP8 approach Wass et al., NAR 2010

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters 3DLigandSite

Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters 3DLigandSite

Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters 3DLigandSite

Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters 3DLigandSite

Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters Predictions 3DLigandSite

Predicting the residues contacting the ligand Predicting contacting residues Multiple molecules in cluster but where is the actual binding site? Threshold for prediction = Contact 25% ligands

CASP8 targets (28) Measure3DLigandSiteHuman CASP8 MCC Recall71%83% Precision60%56% FINDSITE set (617) Measure3DLigandSite MCC0.68 Recall70% Precision70% 3DLigandSite Benchmarking MCC – Matthews Correlation Coefficient Recall– percentage of binding sites that are predicted (TP/(TP+FN)) Precision– percentage of predicted residues that are correct (TP/(TP+FP)) 3DLigandSite Benchmarking

Using 3DLigandSite

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters

Homepage - submission Paste sequence and Run Or submit your own structure Retrieve results

Homepage - submission Upload structure

Results page

Submission details JOB ID Submission type – sequence/structure

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters

Structural model JOB ID –same as 3DLig job id Model confidence 0 (low) -100 (high) Similarity of structural hits (higher value = structures more similar) Min lnE value used = 7 Predictions using low LnE values e.g. < ~12-15 should be treated with caution

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters

Ligand Clusters Model confidence 0 (low) -100 (high) Clusters ranked by number of ligands. Mammoth scores for cluster displayed to indicate how similar the structures are that contributed the ligands in the cluster. Top cluster displayed as main prediction. Click on rows to view predictions for the other clusters.

3DLigandSite Phyre2 Align homologous structures MAMMOTH search against Structural library Structural model MTEYKLVVVGAGG User provided strcuture Cluster ligands Select clusters Wass et al., NAR 2010 Make prediction Using selected Clusters Predictions

Interpreting predictions – what ligands? Lists the ligands that are present in the cluster and the structures that they are from

Interpreting predictions Predicted residue table Residues in cluster that are < 0.5A +vdw of 25% of cluster predicted Number of ligand contact Av distance between residue and these ligands JS Divergence – conservation score (range 0 – 1). These values can be used to refine the prediction – e.g. residues that contact few of the ligands are further from the ligands Have low conservation scores

Interpreting predictions

Control: Colouring of protein – by prediction or conservation Display of protein: Spacefill/wireframe/cartoon Label predicted residues so they can be identified in the graphical view. Separate controls for display of predicted residues Modify display of ligands: Spacefill/wireframe Overall: Make protein rotate Change background colour

Interpreting predictions - Metals Metals found bound like this – with 3-6 residues Often the residues aren’t sequential Binding sites with a single residue contacting the ligand are likely to be wrong

Interpreting predictions - Metals Sometimes the cluster of residues might overlap with the protein structure as in the examples above. This is more likely where the cluster is close to a loop. The prediction may be good but it might also be slightly affected by the overlap of the cluster and the structure

Interpreting predictions - Metals 2 clusters for the same protein. Multiple ligands in cluster Multiple residues contacting ligand Looks like it could be a ligand binding site Divergence colouring help suggest residue that might not be part of the binding site. Single ligand in cluster Single residue binds the ligand Unlikely to be a ligand binding site

Interpreting predictions - Oligomers Site only binds part of cluster Prediction viewed with other chain of dimer from one of the templates When predictions only seem to contact part of the ligand in some example this is because the ligand is bound between chains in an oligomer. Therefore part of the binding site might be missed. Different clusters predicted for the binding site may predict different residues that when combined contain the full binding site

Interpreting predictions – large Clusters Large cluster of many different ligands. This is unlikely to be a binding site

Interpreting predictions Suggestions for interpreting results: Consider the similarity between the structure and the hits The number of ligands in a cluster may be indicative of how likely it is for the region to be a binding site Use of the JS Divergence score may help refine predictions Metal binding site predictions can have high levels of false positive. Especially if there are many clusters and the clusters only contain a single metal ion Metal ions a generally contact multiple residues Checking the conservation score may be helpful here to remove false predictions Clusters can occasionally become very large – with many ligands covering a large are of the protein. Such a large site is likely to be incorrect, although part of it may be ligand binding.

Modelling binding site with 3DLigandSite Mark Wass