1 correlating graph-theoretical centrality indices with interface residue propensity or: where do things stick together? Stefan Maetschke Teasdale Group.

2 …a bit more specific
- Prediction of interface residues
- Protein-RNA interfaces
- Machine learning methods
- Structural information
- Graph-topological features

3 something for the visual cortex
[Figure panels: protein-RNA complex, binding site, contact graph]
[Terribilini et al. 2006] [JMol, 1R3E_A] [Jung Library]

4 questions
Most predictors are sequence-based:
- What impact does structural information have on prediction accuracy?
- Which features are predictive of interface residues?

5 obvious features
Interface residue…
- is on surface => accessible surface area
- has to bind => physico-chemical properties
- must be stabilized => contact graph topology
- prefers flat surface => not really
- is conserved => maybe not that much

6 accessible surface area (ASA)

7 physico-chemical properties
[Figure panels: Hydrophobicity, Inside/Outside, Partition Coefficient, Conformation]
- AAindex database
- approx. 400 indices
- AUC over 144 protein chains, 4304 binding and non-binding, sequence similarity < 30%
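
One way to score a single index as a stand-alone predictor is the area under the ROC curve. The sketch below assumes the AUC is computed from per-residue index values and binary interface labels; the function name and the toy data are illustrative and do not come from the slides.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): index value per residue, 1 = interface residue.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to a random ranking; the pairwise loop is quadratic and is meant for clarity, not for large data sets.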

8 patch types

9 patch type comparison
- Naïve Bayes
- PSI-BLAST profiles
- AUC
- 5-fold cross-validation
- RB144 data set
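
A 5-fold cross-validation split can be sketched as follows. Splitting at the level of whole chains (rather than individual residues) is an assumption here, and the chain identifiers are placeholders, not the actual RB144 entries.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists so that every item is tested exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# 144 placeholder identifiers standing in for the chains of the data set.
toy_chains = [f"chain_{i:03d}" for i in range(144)]
toy_splits = list(k_fold_splits(toy_chains))
```

Keeping all residues of a chain in the same fold avoids training and testing on residues of the same protein.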

10 features over patches

11 betweenness centrality (BC)
[Figure: graph with nodes s, t, and v]
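
The formula on this slide did not survive extraction. The sketch below assumes the standard shortest-path definition, in which the BC of a node v sums, over all node pairs (s, t), the fraction of shortest s-t paths that pass through v. It uses Brandes' algorithm on an unweighted, undirected graph; whether the talk used this exact variant or any normalisation is not stated.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalised BC for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s] = 1
        dist[s] = 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair (s, t) was counted once from each end.
    return {v: c / 2.0 for v, c in bc.items()}


# Toy path graph s - v - t: the only shortest s-t path runs through v.
toy_graph = {"s": {"v"}, "v": {"s", "t"}, "t": {"v"}}
toy_bc = betweenness_centrality(toy_graph)
```

In the toy graph only v lies between another pair of nodes, so it is the only node with non-zero BC.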

12 BC for contact graph
- 1FJG_K
- AUC = 0.71
- Red: interface residue
- Size: betweenness centrality
- Histogram: binned BC over RB144
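
A contact graph of this kind can be built from residue coordinates with a distance threshold. The slides do not say how contacts were defined, so the sketch below makes two assumptions: each residue is represented by one point (for example its C-alpha atom), and two residues are in contact when those points are at most 8.0 angstroms apart. The function name and the toy coordinates are illustrative.

```python
import math


def contact_graph(coords, cutoff=8.0):
    """Connect residues whose representative points lie within `cutoff`.

    coords is a list of (x, y, z) tuples, one per residue, in angstroms.
    The default cutoff is an assumed value, not taken from the slides.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


# Toy coordinates (hypothetical): four residues on a line, 5 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
toy_contacts = contact_graph(toy_coords)
```

The resulting adjacency dictionary has the shape expected by a graph-centrality routine, so per-residue BC values can be computed directly from it.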

13 combined features
- WRC: distance-weighted retention coefficient
- BC: betweenness centrality
- ASA: accessible surface area
- 5-fold cross-validation, RB144
- Patch sizes: sequential -> 11, topological -> 19, spatial -> 19
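
The three patch types can be sketched as neighbourhood selectors around a central residue. The exact definitions are not given in the transcript, so the following are assumptions: a sequential patch is a window along the chain, a spatial patch takes the nearest residues in 3D, and a topological patch takes the residues reached first by breadth-first search in the contact graph. Only the default sizes (11, 19, 19) come from the slide.

```python
import math
from collections import deque


def sequential_patch(index, n_residues, size=11):
    """Window of `size` consecutive residues centred on `index`, clipped at chain ends."""
    half = size // 2
    return list(range(max(0, index - half), min(n_residues, index + half + 1)))


def spatial_patch(index, coords, size=19):
    """The `size` residues closest in space to `index`, including itself."""
    ranked = sorted(
        range(len(coords)),
        key=lambda j: (math.dist(coords[index], coords[j]), j),
    )
    return ranked[:size]


def topological_patch(index, adj, size=19):
    """The first `size` residues reached by breadth-first search from `index`."""
    seen = [index]
    visited = {index}
    queue = deque([index])
    while queue and len(seen) < size:
        v = queue.popleft()
        for w in sorted(adj[v]):
            if w not in visited:
                visited.add(w)
                seen.append(w)
                queue.append(w)
                if len(seen) == size:
                    break
    return seen


# Toy chain (hypothetical): 30 residues on a line, 1 angstrom apart.
toy_line = [(float(i), 0.0, 0.0) for i in range(30)]
toy_adj = {i: {j for j in (i - 1, i + 1) if 0 <= j < 30} for i in range(30)}
toy_seq = sequential_patch(15, len(toy_line))
toy_spa = spatial_patch(15, toy_line, size=5)
toy_top = topological_patch(0, toy_adj, size=3)
```

Features such as ASA or BC could then be averaged or concatenated over the residues of a patch before being passed to the classifier.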

14 summary
- Patch size is critical for sequential patches
- Spatial/topological patches perform better
- Structural information helps, but not much: +5%
- Novelty: centrality indices as predictors
- SVM superior to NB
- Top prediction accuracy, as far as one can tell
- Accuracy in general is still low (MCC < 0.4)
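
For reference, the Matthews correlation coefficient (MCC) quoted above can be computed from the four confusion-matrix counts. This is the standard formula; the function name is illustrative, and returning 0.0 for an empty row or column is a common convention rather than something stated in the slides.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect), and stays informative when interface residues are a small minority of all residues.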

15 what's next…
- Prediction of disease-associated SNPs
- Graph-spectral methods
- Protein function prediction

16 acknowledgments
- Zheng Yuan: data sets and much more …
- Karin Kassahn: aminoacyl-tRNA synthetases

17 questions