Protein Structure Prediction

Presentation transcript:

BCB 444/544 Lab 7: Protein Structure Prediction
  Oct 11, 2007
  BCB 444/544 F07 ISU Dobbs - Lab 3 - BLAST, 9/6/07

Chp 14 - Secondary Structure Prediction
  SECTION V STRUCTURAL BIOINFORMATICS
  Xiong: Chp 14 Protein Secondary Structure Prediction
    Secondary Structure Prediction for Globular Proteins
    Secondary Structure Prediction for Transmembrane Proteins
    Coiled-Coil Prediction

Secondary Structure Prediction
  Has become highly accurate in recent years (>85%)
  Usually 3 (or 4) state predictions:
    H = α-helix
    E = β-strand
    C = coil (or loop)
    (T = turn)

Secondary Structure Prediction Methods
  1st Generation methods: ab initio; used the relatively small dataset of structures then available
    Chou-Fasman - based on amino acid propensities (3-state)
    GOR - also propensity-based (4-state)
  2nd Generation methods: based on the much larger datasets of structures now available
    GOR II, III, IV, SOPM, GOR V, FDM
  3rd Generation methods: homology-based & neural network based
    PHD, PSIPRED, SSPRO, PROF, HMMSTR, CDM
  Meta-servers: combine several different methods (consensus & ensemble based)
    JPRED, PredictProtein, Proteus

Secondary Structure Prediction Servers
  Prediction evaluation?
    Q3 score - % of residues correctly predicted (3-state) in cross-validation experiments
  Best results? Meta-servers
    http://expasy.org/tools/ (scroll for secondary structure prediction)
    http://www.russell.embl-heidelberg.de/gtsp/secstrucpred.html
    JPred: www.compbio.dundee.ac.uk/~www-jpred
    PredictProtein: http://www.predictprotein.org/ (Rost, Columbia)
  Best "individual" programs? ??
    CDM: http://gor.bb.iastate.edu/cdm/ (Sen…Jernigan, ISU)
    FDM: not available separately as a server (Cheng…Jernigan, ISU)
    GOR V: http://gor.bb.iastate.edu/ (Kloczkowski…Jernigan, ISU)
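The Q3 score defined above can be computed directly from two strings of per-residue state labels. The sketch below is a minimal illustration; the function name q3_score and the example strings are made up here and do not come from any of the servers listed.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose 3-state label (H, E or C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example (invented labels): 14 of 16 residues agree.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
score = q3_score(predicted, observed)
print(f"Q3 = {score:.1f}%")
```

In a real evaluation the observed states would come from experimentally determined structures, and the score would be averaged over proteins that were held out of training.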

Consensus Data Mining (CDM)
  Developed by the Jernigan Group at ISU
  Basic premise: a combination of 2 complementary methods can enhance performance by harnessing the distinct advantages of both; CDM combines FDM & GOR V:
    FDM - Fragment Data Mining - exploits the availability of sequence-similar fragments in the PDB, which can lead to highly accurate prediction (much better than GOR V) for such fragments; but such fragments are not available in many cases
    GOR V - Garnier, Osguthorpe, Robson V - predicts the secondary structure of less similar fragments with good performance; these are the protein fragments for which FDM cannot find suitable structures
  For references & additional details: http://gor.bb.iastate.edu/cdm/
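One simplified way to picture the premise above: take the FDM label wherever a sequence-similar fragment was found, and fall back to GOR V elsewhere. The sketch below is not the published CDM algorithm; the per-residue similarity values, the 0.5 threshold and the function name are all assumptions made for illustration.

```python
def consensus_prediction(fdm_pred, gorv_pred, fragment_similarity, threshold=0.5):
    """Per residue, use the FDM label when fragment similarity is high, else GOR V.

    fragment_similarity is a hypothetical per-residue score in [0, 1];
    the threshold is an illustrative choice, not a published value.
    """
    if not (len(fdm_pred) == len(gorv_pred) == len(fragment_similarity)):
        raise ValueError("all inputs must cover the same residues")
    return "".join(
        f if s >= threshold else g
        for f, g, s in zip(fdm_pred, gorv_pred, fragment_similarity)
    )


# Invented inputs: similar fragments exist for residues 0-1 and 4-5 only.
fdm = "HHHHEEEE"
gorv = "CCCCCCCC"
similarity = [0.9, 0.9, 0.2, 0.2, 0.8, 0.8, 0.1, 0.1]
combined = consensus_prediction(fdm, gorv, similarity)
print(combined)
```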

Secondary Structure Prediction: for Different Types of Proteins/Domains
  For complete proteins:
    Globular proteins - use methods previously described
    Transmembrane (TM) proteins - use special methods (next slides)
  For structural domains: many methods under development:
    Coiled-coil domains (protein interaction domains)
    Zinc finger domains (DNA binding domains), others…

SS Prediction for Transmembrane Proteins
  Transmembrane (TM) proteins
    Only a few in the PDB, but ~30% of cellular proteins are membrane-associated!
    Hard to determine experimentally, so prediction is important
    TM domains are relatively 'easy' to predict! Why? Constraints due to the hydrophobic environment
  2 main classes of TM proteins:
    α-helical
    β-barrel

SS Prediction for TM α-Helices
  α-Helical TM domains:
    Helices are 17-25 amino acids long (span the membrane)
    Predominantly hydrophobic residues
    Helices oriented perpendicular to the membrane
    Orientation can be predicted using the "positive inside" rule:
      Residues at the cytosolic (inside or cytoplasmic) side of a TM helix, near the hydrophobic anchor, are more positively charged than those on the lumenal (inside an organelle in eukaryotes) or periplasmic side (space between inner & outer membrane in gram-negative bacteria)
    Alternating polar & hydrophobic residues provide clues to interactions among helices within the membrane
  Servers?
    TMHMM or HMMTOP - 70% accuracy - confused by hydrophobic signal peptides (short hydrophobic sequences that target proteins to the endoplasmic reticulum, ER)
    Phobius - 94% accuracy - uses distinct HMM models for TM helices & signal peptide sequences
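The two ideas above (hydrophobic stretches long enough to span the membrane, and the "positive inside" rule) can be sketched with a sliding-window hydropathy scan. This is not how TMHMM, HMMTOP or Phobius work (those use HMMs). The Kyte-Doolittle scale, the 19-residue window, the 1.6 cutoff and the 10-residue flank are assumptions introduced here, and the sequence is invented.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window; raises KeyError on non-standard residues."""
    values = [KD[aa] for aa in seq]
    return [sum(values[i:i + window]) / window for i in range(len(seq) - window + 1)]


def candidate_tm_helices(seq, window=19, threshold=1.6):
    """Greedily pick non-overlapping windows, most hydrophobic first, above the cutoff."""
    profile = hydropathy_profile(seq, window)
    ranked = sorted(range(len(profile)), key=lambda i: profile[i], reverse=True)
    chosen = []
    for i in ranked:
        if profile[i] < threshold:
            break  # every remaining window is less hydrophobic
        if all(i + window <= start or i >= end for start, end in chosen):
            chosen.append((i, i + window))
    return sorted(chosen)


def positive_flank(seq, start, end, flank=10):
    """Report which flank of a helix carries more Lys/Arg (the likely cytosolic side)."""
    before = sum(seq[max(0, start - flank):start].count(c) for c in "KR")
    after = sum(seq[end:end + flank].count(c) for c in "KR")
    if before == after:
        return "undetermined"
    return "N-terminal flank" if before > after else "C-terminal flank"


# Invented sequence: positively charged flank, hydrophobic core, acidic flank.
seq = "MGSKRKRKSD" + "L" * 19 + "DSEDTQNEGS"
helices = candidate_tm_helices(seq)
sides = [positive_flank(seq, s, e) for s, e in helices]
print(helices, sides)
```

A scan like this cannot tell a TM helix from a hydrophobic signal peptide, which is the confusion noted above for TMHMM and HMMTOP.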

SS Prediction for TM β-Barrels
  β-Barrel TM domains:
    β-strands are amphipathic (partly hydrophobic, partly hydrophilic)
    Strands are 10-22 amino acids long
    Every 2nd residue is hydrophobic, facing the lipid bilayer
    Other residues are hydrophilic, facing the "pore" or opening
  Servers? Harder problem, fewer servers…
    TBBPred - uses NN or SVM (more on these ML methods later)
    Accuracy?
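The "every 2nd residue is hydrophobic" pattern can be quantified with a simple alternation score. This is only a sketch of the pattern itself, not of TBBPred (which uses NN or SVM models). The hydrophobic residue set and the function name are assumptions.

```python
# Assumed set of hydrophobic residues; other classifications are equally defensible.
HYDROPHOBIC = set("AVILMFWY")


def alternation_score(segment):
    """Difference in hydrophobic fraction between even and odd positions (0 to 1).

    A value near 1 means one face of the strand is hydrophobic and the other is
    not, as expected for an amphipathic TM beta-strand.
    """
    even, odd = segment[0::2], segment[1::2]
    if not even or not odd:
        raise ValueError("segment must contain at least two residues")
    frac_even = sum(r in HYDROPHOBIC for r in even) / len(even)
    frac_odd = sum(r in HYDROPHOBIC for r in odd) / len(odd)
    return abs(frac_even - frac_odd)


# Invented 10-residue segments: perfectly alternating vs uniformly hydrophobic.
print(alternation_score("VTLSVNYQAT"))
print(alternation_score("LLLLLLLLLL"))
```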

Chp 15 - Tertiary Structure Prediction
  SECTION V STRUCTURAL BIOINFORMATICS
  Xiong: Chp 15 Protein Tertiary Structure Prediction
    Methods
    Homology Modeling
    Threading and Fold Recognition
    Ab Initio Protein Structural Prediction
    CASP

Protein Tertiary Structure Prediction
  3 major methods:
    1. Homology Modeling (easiest!)
    2. Threading and Fold Recognition (harder)
    3. Ab Initio Protein Structural Prediction (really hard)

Comparative Modeling?
  Comparative modeling - the term is sometimes used interchangeably with homology modeling, and sometimes used more broadly to cover both homology modeling and threading/fold recognition

Ab Initio Prediction
  1. Develop an energy function:
       bond energy
       bond angle energy
       dihedral angle energy
       van der Waals energy
       electrostatic energy
  2. Calculate the structure by minimizing the energy function (usually Molecular Dynamics or Monte Carlo methods)
  Ab initio prediction - impractical for most real (long) proteins
    Computationally? Very expensive
    Accuracy? Usually poor for all except short peptides (but much improvement recently!)
  Provides both folding pathway & folded structure
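The five terms listed above are often written in a form like the one below. This is a common textbook-style force-field expression given as an illustration; the slides do not specify a functional form, so the harmonic bond and angle terms, the cosine dihedral series, the 12-6 Lennard-Jones term and every symbol here are assumptions.

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
  + \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0\,\varepsilon_r\, r_{ij}}
```

Here r, θ and φ are bond lengths, bond angles and dihedral angles, r_ij is the distance between non-bonded atoms i and j, and q_i are partial charges. Minimizing this sum over all atomic coordinates is what makes the approach computationally expensive.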

Comparative Modeling
  Two types:
    1) Homology modeling
    2) Threading (fold recognition)
  Both rely on the availability of experimentally determined structures that are "homologous" or at least structurally very similar to the target
  Provide folded structure only

Homology Modeling
  1. Identify homologous protein sequences (PSI-BLAST)
  2. Among available structures (in the PDB), choose the one with the closest sequence to the target as template (can combine steps 1 & 2 by using PDB-BLAST)
  3. Build the model by placing target sequence residues in the corresponding positions of the homologous structure & refine by "tweaking" the modeled structure (energy minimization)
  Homology modeling - works "well"
    Computationally? "Relatively" inexpensive
    Accuracy? Higher sequence identity → better model
    Requires ~30% sequence identity with a sequence for which the structure is known
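Template choice by sequence identity, with the ~30% cutoff mentioned above, can be sketched as follows. The alignments are assumed to be precomputed (for example by a BLAST search); the template IDs and aligned strings are invented. Identity is computed here over columns where neither sequence has a gap, which is one of several conventions in use.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identical residues over alignment columns that contain no gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_target, aligned_template)
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


def pick_template(alignments, min_identity=30.0):
    """Return (template_id, identity) for the closest template, or None if below cutoff.

    alignments maps a template ID to (aligned_target, aligned_template).
    """
    scored = {tid: percent_identity(*pair) for tid, pair in alignments.items()}
    best = max(scored, key=scored.get)
    return (best, scored[best]) if scored[best] >= min_identity else None


# Hypothetical template IDs and toy alignments.
alignments = {
    "tmplA": ("KVFGRCELAA", "KVYGRCELAA"),
    "tmplB": ("KVFG-RCELAA", "RTWGQKADL-S"),
}
choice = pick_template(alignments)
print(choice)
```

When pick_template returns None, no template passes the cutoff and threading or ab initio methods become the fallback.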

Threading - Fold Recognition
  Identify the "best" fit between target sequence & template structure:
    1. Develop an energy function
    2. Develop a template library
    3. Align the target sequence with each template & score
    4. Identify the top scoring template (1D to 3D alignment)
    5. Refine the structure as in homology modeling
  Threading - works "sometimes"
    Computationally? Can be expensive or cheap, depending on the energy function & whether "all atom" or "backbone only" threading is used
    Accuracy? In theory, should not depend on sequence identity (should depend on quality of template library & "luck")
    Usually, higher sequence identity to a protein of known structure → better model
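Steps 3 and 4 above (score the target against every template, keep the top scorer) can be illustrated with a deliberately tiny toy. Each template is reduced to a string of burial states (B = buried, E = exposed), and the "energy function" just rewards hydrophobic residues in buried positions and polar residues in exposed ones. Real threading programs use far richer potentials and gapped alignments; the scoring rule, residue set, template names and sequences here are all assumptions.

```python
# Assumed hydrophobic residue set for the toy scoring rule.
HYDROPHOBIC = set("AVILMFWC")


def position_score(residue, environment):
    """+1 if the residue type suits the environment (B = buried, E = exposed), else -1."""
    is_hydrophobic = residue in HYDROPHOBIC
    is_buried = environment == "B"
    return 1 if is_hydrophobic == is_buried else -1


def thread_score(sequence, environments):
    """Best ungapped placement of the sequence on a template: (score, offset) or None."""
    if len(sequence) > len(environments):
        return None
    best = None
    for offset in range(len(environments) - len(sequence) + 1):
        score = sum(
            position_score(res, environments[offset + i])
            for i, res in enumerate(sequence)
        )
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def rank_templates(sequence, library):
    """Score the sequence against every template; highest score first."""
    results = []
    for name, environments in library.items():
        placed = thread_score(sequence, environments)
        if placed is not None:
            results.append((placed[0], name, placed[1]))
    return sorted(results, reverse=True)


# Hypothetical two-template library and target fragment.
library = {"fold_1": "EEBEBEBBEE", "fold_2": "BBBBEEEEEE"}
ranking = rank_templates("LKVEAL", library)
print(ranking)
```

Because the score depends only on how well the sequence fits each structural environment, it does not use sequence identity to the template at all, which is the property noted above under "Accuracy?".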

Today's Lab:
  Homology Modeling - using SWISS-MODEL
    http://swissmodel.expasy.org//SWISS-MODEL.html
  Threading - using 3-D JURY (BioinfoBank, a meta-server)
    http://meta.bioinfo.pl/submit_wizard.pl
  Take a look at the CASP contest:
    http://predictioncenter.gc.ucdavis.edu/
  CASP7 contest in 2006:
    http://www.predictioncenter.org/casp7/Casp7.html