Protein threading algorithms 1.GenTHREADER Jones, D. T. JMB(1999) 287, 797-815 2.Protein Fold Recognition by Prediction-based Threading Rost, B., Schneider,

Presentation transcript:

Protein threading algorithms
1. GenTHREADER. Jones, D. T. JMB (1999) 287, 797-815
2. Protein Fold Recognition by Prediction-based Threading. Rost, B., Schneider, R. & Sander, C. JMB (1997) 270
Presented by Jian Qiu

Why do we need protein threading?
- To detect remote homologues
- Genome annotation
  Structures are better conserved than sequences. Remote homologues with low sequence similarity may still share significant structural similarity.
- To predict protein structure from a structural template
  If protein A shares structural similarity with protein B, the structure of protein A can be modelled using the structure of protein B as a starting point.

A successful example by GenTHREADER
- ORF MG276 from Mycoplasma genitalium was predicted to share structural similarity with 1HGX.
- MG276 has low sequence similarity to 1HGX (10% sequence identity).
Supporting evidence:
- MG276 is annotated as an adenine phosphoribosyltransferase, based on high sequence similarity to the Escherichia coli protein; 1HGX is a hypoxanthine-guanine-xanthine phosphoribosyltransferase from the protozoan parasite Tritrichomonas foetus.
- Four functionally important residues in 1HGX are conserved in MG276.
- The secondary structure predicted for ORF MG276 agrees very well with the observed secondary structure of 1HGX.

Structure of 1HGX

Functional residue conservation between 1HGX and MG276

GenTHREADER Protocol
Sequence alignment
- For each template structure in the fold library, related sequences are collected with the program BLASTP.
- A multiple sequence alignment of these sequences is generated with a simplified version of MULTAL.
- The optimal alignment between the target sequence and the sequence profile of each template structure is found by dynamic programming.
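The dynamic programming step can be illustrated with a minimal sketch. It is not the GenTHREADER implementation: the profile representation (one dictionary of residue scores per template column), the linear gap penalty, and the name `align_to_profile` are assumptions made for illustration only.

```python
from typing import Dict, List


def align_to_profile(target: str,
                     profile: List[Dict[str, float]],
                     gap: float = -2.0,
                     default: float = -1.0) -> float:
    """Return the best global alignment score of `target` against `profile`.

    profile[j] maps a residue letter to its score in template column j;
    residues missing from a column get `default`. Gaps cost `gap` each
    (a simple linear penalty, chosen here only to keep the sketch short).
    """
    n, m = len(target), len(profile)
    # dp[i][j]: best score for aligning target[:i] with profile[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + profile[j - 1].get(target[i - 1], default)
            skip_column = dp[i][j - 1] + gap   # gap in the target
            skip_residue = dp[i - 1][j] + gap  # gap in the profile
            dp[i][j] = max(match, skip_column, skip_residue)
    return dp[n][m]


# Toy three-column profile with made-up scores.
toy_profile = [{"A": 2.0}, {"C": 2.0}, {"D": 2.0}]
print(align_to_profile("ACD", toy_profile))
print(align_to_profile("AD", toy_profile))
```

A real profile would hold position-specific scores derived from the multiple sequence alignment, and a traceback would be added to recover the alignment itself rather than only its score.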

Threading Potentials
Pairwise potential (the pairwise model family):
- $k$: sequence separation
- $s$: distance interval
- $m_{ab}$: number of pairs $ab$ observed with sequence separation $k$
- $\sigma$: weight given to each observation
- $f_k(s)$: frequency of occurrence of all residue pairs
- $f_k^{ab}(s)$: frequency of occurrence of residue pair $ab$
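The formula itself did not survive in this transcript. The symbols listed match the Sippl-style potential of mean force used by THREADER-type methods, so, as an assumption about what the slide showed, the pairwise term would read as follows, with $\sigma$ denoting the weight given to each observation:

```latex
\Delta E_k^{ab}(s) = RT \ln\left[1 + m_{ab}\,\sigma\right]
                   - RT \ln\left[1 + m_{ab}\,\sigma\,\frac{f_k^{ab}(s)}{f_k(s)}\right]
```

When $m_{ab}$ is large the expression approaches $-RT \ln\left[f_k^{ab}(s)/f_k(s)\right]$, and when few pairs have been observed it shrinks towards zero, which damps the noise from sparse statistics.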

Solvation potential (the profile model family):
- $r$: the degree of residue burial, i.e. the number of other Cβ atoms located within 10 Å of the residue's Cβ atom
- $f^{a}(r)$: frequency of occurrence of residue $a$ with burial $r$
- $f(r)$: frequency of occurrence of all residues with burial $r$
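A short sketch of how these quantities could be computed. The formula is missing from the transcript; the code assumes the usual inverse Boltzmann form, $\Delta E^{a}(r) = -RT \ln\left[f^{a}(r)/f(r)\right]$. The function names, the coordinate format, and the default `rt = 1.0` (energies in units of RT) are hypothetical choices for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def burial(coords: Sequence[Coord], i: int, cutoff: float = 10.0) -> int:
    """Count the other atoms lying within `cutoff` angstroms of atom i.

    `coords` is assumed to hold one representative atom per residue.
    """
    return sum(1 for j, c in enumerate(coords)
               if j != i and math.dist(coords[i], c) <= cutoff)


def solvation_energy(f_a_r: float, f_r: float, rt: float = 1.0) -> float:
    """Assumed inverse Boltzmann form: -RT * ln(f_a(r) / f(r))."""
    return -rt * math.log(f_a_r / f_r)


# Toy coordinates: two atoms close together, one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print([burial(toy_coords, i) for i in range(len(toy_coords))])

# A residue seen twice as often as average at this burial is favourable (negative).
print(solvation_energy(0.2, 0.1))
```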

Variables considered to predict the relationship
- Pairwise energy score
- Solvation energy score
- Sequence alignment score
- Sequence alignment length
- Length of the structure
- Length of the target sequence

Artificial Neural Network A node

Neural network architecture in GenTHREADER
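The figures for these slides are not in the transcript, so the following is only a generic sketch of a feed-forward network that takes the six variables listed earlier as inputs. The hidden layer size, all weights, and the example input values are made up; this is not the trained GenTHREADER network.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """A single node: weighted sum of the inputs plus a bias, then sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def forward(features: Sequence[float],
            hidden_w: List[List[float]], hidden_b: List[float],
            out_w: List[float], out_b: float) -> float:
    """One hidden layer followed by one output node."""
    hidden = [node(features, w, b) for w, b in zip(hidden_w, hidden_b)]
    return node(hidden, out_w, out_b)


# Hypothetical, already-scaled inputs in the order: pairwise energy,
# solvation energy, alignment score, alignment length,
# structure length, target length.
features = [-0.8, -0.3, 0.6, 0.7, 0.5, 0.4]
hidden_w = [[0.5, -0.2, 1.0, 0.3, 0.0, 0.1],
            [-1.0, -0.5, 0.8, 0.2, 0.1, 0.0]]
hidden_b = [0.0, 0.1]
out_w = [1.2, 0.9]
out_b = -1.0

print(forward(features, hidden_w, hidden_b, out_w, out_b))
```

The output lies between 0 and 1, which is what allows a network score to be mapped to confidence levels.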

The effects of sequence alignment score and pairwise potential on the Network output

Confidence level with different network scores
- Low
- Medium (80%)
- High (99%)
- Certain (100%)

Genome analysis of Mycoplasma genitalium
All 468 ORFs were analyzed within one day.

Distribution of protein folds in M. genitalium

PHD: Predict 1D structure from sequence
Sequence -> MaxHom -> Multiple sequence alignment -> PHDsec, PHDacc
- PHDsec, secondary structure: H (helix), E (strand), L (rest)
- PHDacc, solvent accessibility: Buried (<15%), Exposed (>=15%)
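To make the 1D encoding concrete, here is a small sketch that combines a secondary structure state with an accessibility state for each residue. It assumes the two accessibility states are split at 15% relative accessibility; the state labels (`Hb`, `Ee`, ...) and the function names are hypothetical, not PHD output formats.

```python
from typing import List, Sequence


def six_state(ss: str, rel_acc: float, threshold: float = 15.0) -> str:
    """Combine one secondary structure state with one accessibility state.

    ss       -- 'H' (helix), 'E' (strand) or 'L' (rest)
    rel_acc  -- relative solvent accessibility in percent
    Returns a two-letter label such as 'Hb' (buried helix) or 'Le' (exposed rest).
    """
    if ss not in ("H", "E", "L"):
        raise ValueError(f"unknown secondary structure state: {ss!r}")
    acc = "b" if rel_acc < threshold else "e"
    return ss + acc


def encode(ss_string: str, acc_values: Sequence[float]) -> List[str]:
    """Encode a whole protein as a list of six-state labels."""
    if len(ss_string) != len(acc_values):
        raise ValueError("secondary structure and accessibility lengths differ")
    return [six_state(s, a) for s, a in zip(ss_string, acc_values)]


print(encode("HHEL", [3.0, 40.0, 14.9, 15.0]))
```

Three secondary structure states times two accessibility states give six possible labels per residue.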

Threading Protocol

Similarity matrix in dynamic programming
- Purely structure similarity matrix: six states (combination of three secondary structure states and two solvent accessibility states)
- Purely sequence similarity matrix: McLachlan or Blosum62
- Combination of structure and sequence similarity matrices:
  $M_{ij} = \kappa \, M_{ij}^{\text{1D structure}} + (100 - \kappa) \, M_{ij}^{\text{sequence}}$
  $\kappa = 0$: sequence alignment only; $\kappa = 100$: 1D structure alignment only
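The combined matrix is a percentage-weighted sum of the two component matrices. A minimal sketch follows, with the mixing weight called `kappa` here and toy matrix values that are made up for illustration:

```python
from typing import List

Matrix = List[List[float]]


def combine(struct_m: Matrix, seq_m: Matrix, kappa: float) -> Matrix:
    """Return kappa * struct_m + (100 - kappa) * seq_m, element by element.

    kappa is a percentage in [0, 100]: 0 uses only the sequence matrix,
    100 uses only the 1D structure matrix.
    """
    if not 0.0 <= kappa <= 100.0:
        raise ValueError("kappa must lie between 0 and 100")
    return [[kappa * s + (100.0 - kappa) * q for s, q in zip(s_row, q_row)]
            for s_row, q_row in zip(struct_m, seq_m)]


# Toy 2 x 2 scores for two target positions against two template positions.
struct_scores = [[1.0, -1.0], [-1.0, 1.0]]
seq_scores = [[4.0, 0.0], [-2.0, 5.0]]

print(combine(struct_scores, seq_scores, 50.0))
```

Because the weights sum to 100 rather than 1, the combined scores are 100 times larger than a normalised weighted average; gap penalties would need to be on the same scale.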

Performance of the algorithm

Results on the 11 targets of CASP1
- Correctly detected the remote homologues at first rank in four cases.
- Average percentage of correctly aligned residues: 21%.
- Average shift: nine residues.
Best performing methods in CASP1:
- Expert-driven usage of THREADER by David Jones and colleagues detected five out of nine proteins correctly at first rank.
- The best alignments of the potential-based threading method by Manfred Sippl and colleagues were clearly better than the best ones of this algorithm.