Correlating graph-theoretical centrality indices with interface residue propensity, or: where do things stick together? Stefan Maetschke, Teasdale Group

Presentation transcript:

1 correlating graph-theoretical centrality indices with interface residue propensity or: where do things stick together? Stefan Maetschke Teasdale Group

2 …a bit more specific
• Prediction of interface residues
• Protein-RNA interfaces
• Machine learning methods
• Structural information
• Graph-topological features

3 something for the visual cortex
[Figure panels: Protein-RNA complex; Binding site; Contact graph. Sources: Terribilini et al. 2006; JMol, 1R3E_A; Jung Library]

4 questions
Most predictors are sequence-based:
• What impact does structural information have on prediction accuracy?
• Which features are predictive of interface residues?

5 obvious features
Interface residue…
• is on surface => accessible surface area
• has to bind => physico-chemical properties
• must be stabilized => contact graph topology
• prefers flat surface => not really
• is conserved => maybe not that much

6 accessible surface area (ASA)

7 physico-chemical properties
[Plot labels: Hydrophobicity; Inside/Outside; Partition Coefficient; Conformation]
• AAIndex database
• approx. 400 indices
• AUC over 144 protein chains
4304 binding and non-binding
sequence similarity < 30%

8 patch types

9 patch type comparison
• Naïve Bayes
• PSI-BLAST profiles
• AUC
• 5-fold cross-validation
• RB144 data set

10 features over patches

11 betweenness centrality (BC)
[Figure: graph with node labels s, t and v]
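The slide gives only the figure labels s, t and v. A common definition, assumed here, is that the betweenness centrality of a node v counts the fraction of shortest paths between every other pair of nodes s and t that pass through v. A minimal sketch of this for an unweighted, undirected graph, using Brandes' algorithm, could look like the following. The adjacency-dictionary input and the function name are illustrative choices, not taken from the talk.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps every node to a list of its neighbours (assumed symmetric).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember the predecessors on those paths.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s] = 1
        dist[s] = 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair (s, t) was visited from both ends.
    return {v: b / 2.0 for v, b in bc.items()}


# Toy path graph s - v - t: only v lies between another pair of nodes.
path = {"s": ["v"], "v": ["s", "t"], "t": ["v"]}
path_scores = betweenness_centrality(path)
```

In the toy path graph, v sits on the single shortest path between s and t, so it is the only node with a non-zero score.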

12 BC for contact graph
• 1FJG_K
• AUC = 0.71
• Red: interface residue
• Size: betweenness centrality
Histogram: binned BC over RB144
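An AUC such as the one quoted for 1FJG_K can be read as the probability that a randomly chosen interface residue receives a higher score than a randomly chosen non-interface residue. The sketch below computes it by direct pairwise comparison, counting ties as one half. It assumes a per-residue score (for example BC) and a binary interface label; the function name and the toy numbers are made up for illustration and are not the talk's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: one number per residue (higher = more likely interface).
    labels: truthy for interface residues, falsy otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four residues, two of them interface residues.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of residues, which is fine for a single chain; a rank-based formulation would be preferable for large data sets.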

13 combined features
• WRC: distance-weighted retention coefficient
• BC: betweenness centrality
• ASA: accessible surface area
• 5-fold cross-validation, RB144
• Patch sizes: sequential -> 11, topological -> 19, spatial -> 19
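The slides do not define the three patch types, so the following is only a sketch under assumed definitions: a sequential patch is a window of neighbouring positions along the chain, a spatial patch is the set of residues closest in 3D space, and a topological patch is the set of residues closest in the contact graph (collected by breadth-first search). All function names, the coordinate format, and the adjacency format are hypothetical.

```python
import math
from collections import deque


def sequential_patch(center, n_residues, size):
    """Window of `size` sequence positions centred on `center` (clipped at chain ends)."""
    half = size // 2
    return [i for i in range(center - half, center + half + 1) if 0 <= i < n_residues]


def spatial_patch(center, coords, size):
    """The `size` residues nearest to `center` by Euclidean distance (includes center)."""
    order = sorted(range(len(coords)), key=lambda i: math.dist(coords[center], coords[i]))
    return order[:size]


def topological_patch(center, adj, size):
    """Up to `size` residues reached first by breadth-first search in the contact graph."""
    patch = [center]
    seen = {center}
    queue = deque([center])
    while queue and len(patch) < size:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen and len(patch) < size:
                seen.add(w)
                patch.append(w)
                queue.append(w)
    return patch


# Toy chain of four residues on a line, with a chain-like contact graph.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
seq = sequential_patch(5, 20, 11)
spa = spatial_patch(0, toy_coords, 3)
top = topological_patch(0, toy_adj, 3)
```

Per-residue features such as WRC, BC or ASA would then be collected over the residues in a patch before being passed to the classifier.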

14 summary
• Patch size is critical for sequential patches
• Spatial/topological patches perform better
• Structural information helps, but not much: +5%
• Novelty: centrality indices as predictors
• SVM superior to NB
• Top prediction accuracy, as far as one can tell
• Accuracy in general is still low (MCC < 0.4)
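The MCC quoted in the summary is the Matthews correlation coefficient, which ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction). A minimal sketch of its standard definition from confusion-matrix counts is given below; the function name and the convention of returning 0 when the denominator vanishes are choices made here, and the example counts are invented.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention used here when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Invented counts: a perfect predictor and a coin-flip-like predictor.
perfect = mcc(tp=5, fp=0, tn=5, fn=0)
chance = mcc(tp=5, fp=5, tn=5, fn=5)
```

Unlike plain accuracy, the MCC stays informative when interface residues are much rarer than non-interface residues, which is the usual situation in binding-site prediction.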

15 what's next…
• Prediction of disease-associated SNPs
• Graph-spectral methods
• Protein function prediction

16 acknowledgments
• Zheng Yuan: data sets and much more …
• Karin Kassahn: aminoacyl-tRNA synthetases

17 questions