Fold Recognition Ole Lund, Associate professor, CBS.

Fold recognition
Find template for modeling
  – 1st step in comparative modeling
Can be used to predict function

Template identification
Search with sequence
  – BLAST against proteins with known structure
  – PSI-BLAST against all proteins
  – Fold recognition methods
Use biological information
Functional annotation in databases
Active site/motifs

BLAST derivatives: PDB-BLAST
Procedure
  1. Build a sequence profile by iterative PSI-BLAST search against a sequence database
  2. Use the profile to search a database of proteins with known structure
Advantage
  – Makes sure a hit to a protein with known structure is not hidden behind a lot of hits to other proteins
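The two-step procedure above can be sketched as a pair of command lines. This is a minimal sketch, assuming the NCBI BLAST+ `psiblast` program and locally formatted databases; the database names (`nr`, `pdbaa`), file names, and the helper `pdb_blast_commands` are illustrative assumptions and not part of the slides. The function only builds the commands, it does not run them.

```python
def pdb_blast_commands(query_fasta, sequence_db="nr", structure_db="pdbaa",
                       iterations=3, pssm_file="query.pssm",
                       hits_file="pdb_hits.txt"):
    """Build the two psiblast invocations of a PDB-BLAST style search.

    Database and file names are placeholders (assumptions).
    """
    # Step 1: iterate against a large sequence database and save the profile.
    build_profile = [
        "psiblast", "-query", query_fasta, "-db", sequence_db,
        "-num_iterations", str(iterations), "-out_pssm", pssm_file,
    ]
    # Step 2: restart from the saved profile against known structures only.
    search_structures = [
        "psiblast", "-in_pssm", pssm_file, "-db", structure_db,
        "-out", hits_file,
    ]
    return build_profile, search_structures


step1, step2 = pdb_blast_commands("query.fasta")
print(" ".join(step1))
print(" ".join(step2))
```

Each list can be passed to `subprocess.run` once the databases exist locally.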

BLAST derivatives: Transitive BLAST
Procedure
  1. Find homologues to query (your) sequence
  2. Find homologues to these homologues
  3. Etc.
  – Can be implemented with e.g. BLAST or PSI-BLAST
Also known as Intermediate Sequence Search (ISS)
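The transitive procedure above amounts to a breadth-first expansion over search hits. The sketch below assumes a caller-supplied function `find_homologues` (hypothetical; in practice it would wrap a BLAST or PSI-BLAST run) and uses a toy hit table in place of a real database.

```python
def transitive_search(query, find_homologues, max_rounds=3):
    """Return {sequence_id: round in which it was first reached}.

    find_homologues is a hypothetical callable: sequence id -> iterable of hit ids.
    """
    found = {query: 0}
    frontier = [query]
    for round_no in range(1, max_rounds + 1):
        next_frontier = []
        for seq_id in frontier:
            for hit in find_homologues(seq_id):
                if hit not in found:          # skip sequences already reached
                    found[hit] = round_no
                    next_frontier.append(hit)
        if not next_frontier:                 # nothing new: search has converged
            break
        frontier = next_frontier
    return found


# Toy hit table standing in for real search results (illustrative only).
TOY_HITS = {"query": ["A", "B"], "A": ["query", "C"], "B": [], "C": ["D"], "D": []}


def toy_find_homologues(seq_id):
    return TOY_HITS.get(seq_id, [])


print(transitive_search("query", toy_find_homologues, max_rounds=2))
```

Sequences first reached in round 2 or later are the intermediate-sequence hits that a single direct search from the query would miss.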

CASP
  – Critical Assessment of protein Structure Prediction
  – Every second year
  – Sequences from about-to-be-solved structures are given to groups, who submit their predictions before the structure is published
  – Modelers make predictions
  – Meeting in Asilomar where the correct answers are revealed

Target difficulty
CM: Comparative (homology) modeling
CM/FR: Not PSI-BLAST (but ISS) findable
FR(H): Homologous fold recognition
FR(A): Analogous fold recognition
NF/FR: Partly new fold
NF: New fold (used to be called ab initio, i.e. from first principles, prediction)

CASP5 overview

Successful fold recognition groups at CASP5
3D-Jury (Leszek Rychlewski)
3D-CAM (Krzysztof Ginalski)
Template recombination (Paul Bates)
HMAP (Barry Honig)
PROSPECT (Ying Xu)
ATOME (Gilles Labesse)

Barry Honig
Sequence and structure profile-profile based alignment
  – Database of template profiles
      Multiple structure alignment
      Sequence-based profiles
      Position-specific gap penalties derived from secondary structure
      Calibration to estimate statistical significance
  – Query profile
      Sequence-based profile
      Predicted secondary structure (consensus between PSIPRED, PHD, JNET)

Ying Xu
PROSPECT: optimal alignments for a given energy function with any combination of the following terms:
  1. mutation energy (including position-specific score matrix derived from multiple-sequence alignments)
  2. singleton energy (including matching scores to the predicted secondary structures)
  3. pairwise contact potential
  4. alignment gap penalties
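One way to read "any combination of the following terms" is as a weighted sum in which a term is switched off by giving it weight zero. The sketch below shows only that combination step; the term names, the weights, and the helper `threading_energy` are illustrative assumptions and do not reproduce PROSPECT's actual energy values or its alignment algorithm.

```python
TERMS = ("mutation", "singleton", "pairwise", "gap")


def threading_energy(term_values, weights):
    """Combine per-alignment energy terms as a weighted sum.

    term_values: {term name: energy of one candidate alignment}
    weights:     {term name: weight}; a missing term counts as weight 0.
    """
    return sum(weights.get(name, 0.0) * term_values[name] for name in TERMS)


# Made-up energies for one candidate alignment (lower is better here).
example = {"mutation": -10.0, "singleton": -4.0, "pairwise": -2.5, "gap": 3.0}
print(threading_energy(example, {name: 1.0 for name in TERMS}))
print(threading_energy(example, {"mutation": 1.0, "gap": 1.0}))
```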

3D-Jury (Rychlewski)
Inspired by ab initio modeling methods
  – The average of frequently obtained low-energy structures is often closer to the native structure than the lowest-energy structure
Find most abundant high-scoring models
  1. Use output from a set of servers
  2. Superimpose all pairs of structures
  3. Similarity score S_ij = number of Cα pairs within 3.5 Å (if this number > 40; else S_ij = 0)
  4. 3D-Jury score = Σ_i S_ij / (N+1)
Similar methods developed by A. Elofsson (Pcons) and D. Fischer (3D-SHOTGUN)
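The scoring steps above can be sketched in a few lines. This is a simplified sketch under stated assumptions: every model is a list of Cα coordinates with one entry per residue, all models cover the same residues in the same order, and the pairwise superposition has already been done (the real method superimposes each pair separately). N is taken here to be the number of other models compared, which the slide does not define.

```python
import math


def similarity(model_a, model_b, cutoff=3.5, min_pairs=40):
    """S_ij: number of matched C-alpha pairs within `cutoff` angstroms.

    Assumes both models are already superimposed and residue-matched.
    Returns 0 unless more than `min_pairs` pairs are within the cutoff.
    """
    close = sum(1 for p, q in zip(model_a, model_b) if math.dist(p, q) <= cutoff)
    return close if close > min_pairs else 0


def jury_scores(models):
    """3D-Jury style score for each model: sum of S_ij over the others / (N + 1)."""
    n_other = len(models) - 1          # assumption: N = number of other models
    scores = []
    for j, target in enumerate(models):
        total = sum(similarity(other, target)
                    for i, other in enumerate(models) if i != j)
        scores.append(total / (n_other + 1))
    return scores


# Toy models: a straight 50-residue chain, a near copy, and a distant outlier.
chain = [(3.8 * k, 0.0, 0.0) for k in range(50)]
near = [(x, y + 1.0, z) for x, y, z in chain]
far = [(x, y + 10.0, z) for x, y, z in chain]
print(jury_scores([chain, near, far]))
```

The two mutually similar models support each other and score highest; the outlier gets zero, which is the consensus effect the method relies on.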

3D-CAM (Krzysztof Ginalski)
3D-Consensus Alignment Method
  – Structural alignment for all members of fold from FSSP
  – Conservation of specific residues and contacts responsible for maintaining tertiary structure or critical for substrate binding and/or catalysis
  – Find homologues with iterative PSI-BLAST
  – Align with ClustalW and identify conserved residues
  – Structural integrity of alignments
  – Manual realignment
  – Fold recognition for homologues
  – Modelling
  – Verification
      Visually
      Computationally (Verify3D, ProsaII, WHAT_CHECK)

Paul A Bates - In Silico Recombination of Templates, Alignments and Models
Problems
  – Models are rarely better than templates
  – Manual intervention has marginal effect
Possible solution
  – Recombination of models

Paul A Bates - Modelling Procedure
Define domains
Make models (FAMS/Pmodeller/EsyPred3D)
  – Manual inspection/correction of alignments
  – Alignment of annotated residues (PFAM)
  – Preferably use alignment with >2 bits/aa
Select pair of models
  – Superimpose
  – Crossover or mutate (average coordinates)
Select best proportion
  – Contact pair potentials
  – Solvation energies (calculated from solvent accessible area)
Convergence
  – Minimization and final refinements
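The "crossover or mutate" step above can be illustrated with two generic recombination operators. This is a sketch only: the slide does not specify the operators in detail, so the single-point crossover, the reading of "mutate" as coordinate averaging, and the function names are assumptions. Both models are assumed to be superimposed already and to hold one coordinate per residue.

```python
def crossover(model_a, model_b, point):
    """Single-point crossover: residues before `point` from A, the rest from B."""
    return model_a[:point] + model_b[point:]


def average_coordinates(model_a, model_b):
    """'Mutation' by averaging the coordinates of two superimposed models."""
    return [tuple((a + b) / 2 for a, b in zip(p, q))
            for p, q in zip(model_a, model_b)]


def select_best(models, energy, keep):
    """Keep the `keep` lowest-energy models; `energy` is a caller-supplied function."""
    return sorted(models, key=energy)[:keep]


# Toy three-residue models (illustrative coordinates).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
print(crossover(model_a, model_b, 1))
print(average_coordinates(model_a, model_b))
```

In a full procedure the `energy` function would combine the contact pair potentials and solvation energies listed above, and the cycle would repeat until the population converges.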

Gilles Labesse
Meta Server
  – 3D-PSSM, PDB-BLAST, FUGUE, GenTHREADER, SAM-T99, JPRED-2
Tool for Incremental Threading Optimization (T.I.T.O.)
Consensus ranking

LiveBench
The LiveBench project is a continuous benchmarking program. Every week, sequences of newly released PDB proteins are submitted to the participating fold recognition servers. The results are collected and continuously evaluated using automated model assessment programs. A summary of the results is produced after several months of data collection. To participate, servers must delay updating their structural template libraries by one week.

Meta Server

Score # correct # wrong

Best servers?
FFA3, 3DS5, INBG, SHUM, 3DPS, 3DS3, FUG3, SHGU, FUG2, PCO2, PRO2, MGTH, SFPP, PMO3

Links to fold recognition servers
Databases of links
Meta server (Example: )
3DPSSM – good graphical output
GenTHREADER
FUGUE2
SAM
FOLD
FFAS/PDBBLAST