Protein structure (Part 2 of 2).


Copyright notice: Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by Jonathan Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by John Wiley & Sons, Inc. These images and materials may not be used without permission from the publisher. We welcome instructors to use these PowerPoint slides for educational purposes, but please acknowledge the source. The book has a homepage at http://www.bioinfbook.org, including hyperlinks to the book chapters.

Many databases explore protein structures: SCOP, CATH, Dali Domain Dictionary, and FSSP. Page 293

Structural Classification of Proteins (SCOP): SCOP describes protein structures using a hierarchical classification scheme, from broadest to most specific: classes; folds; superfamilies (likely evolutionary relationship); families; domains; individual PDB entries. http://scop.mrc.lmb.cam.ac.uk/scop/ Page 293
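As a rough illustration of what a hierarchical classification buys you, the sketch below stores one path through the six SCOP levels and reports the most specific level two proteins share. The class name, field names, and all labels are hypothetical placeholders, not SCOP's own schema or identifiers.

```python
from dataclasses import dataclass, astuple

# SCOP levels from broadest to most specific, in the order given on the slide.
SCOP_LEVELS = ("class", "fold", "superfamily", "family", "domain", "pdb_entry")


@dataclass(frozen=True)
class ScopLineage:
    """One path through the hierarchy. Field names are illustrative only."""
    scop_class: str
    fold: str
    superfamily: str
    family: str
    domain: str
    pdb_entry: str


def deepest_shared_level(a, b):
    """Return the most specific level at which two lineages still agree, or None."""
    shared = None
    for level, x, y in zip(SCOP_LEVELS, astuple(a), astuple(b)):
        if x != y:
            break
        shared = level
    return shared


# Placeholder labels only; these are not real SCOP identifiers.
p = ScopLineage("class-1", "fold-1", "superfamily-1", "family-1", "domain-1", "entry-1")
q = ScopLineage("class-1", "fold-1", "superfamily-1", "family-2", "domain-7", "entry-9")
print(deepest_shared_level(p, q))  # superfamily
```

Two proteins that agree down to the superfamily level but not the family level would, in SCOP's terms, have a likely evolutionary relationship.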

Page 297

SCOP statistics (October, 2002)
Class    # folds   # superfamilies   # families
All α    151       252               393
All β    110       205               337
α/β      113       185               438
α+β      208       295               454
…
Total    686       1073              1827
Page 298

Class, Architecture, Topology, and Homologous Superfamily (CATH) database: CATH clusters proteins at four levels: C, Class (α, β, α&β folds); A, Architecture (shape of domain, e.g. jelly roll); T, Topology (fold families; not necessarily homologous); H, Homologous superfamily. http://www.biochem.ucl.ac.uk/basm/cath_new Page 293

Fig. 9.23 Page 298

Fig. 9.24 Page 299

Fig. 9.25 Page 300

Page 301

Fig. 9.27 Page 302

Fig. 9.28 Page 303

Dali Domain Dictionary: Dali contains a numerical taxonomy of all known structures in PDB. Dali integrates additional data for entries within a domain class, such as secondary structure predictions and solvent accessibility. Page 302

Fig. 9.29 Page 303

Fig. 9.30 Page 304

Fold classification based on structure-structure alignment of proteins (FSSP): FSSP is based on a comprehensive comparison of PDB proteins (greater than 30 amino acids in length). Representative sets exclude sequence homologs sharing >25% amino acid identity. The output includes a “fold tree.” http://www.ebi.ac.uk/dali/fssp Page 293
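One way to picture how a representative set could be built from the two thresholds above (length greater than 30, no pair sharing more than 25% identity) is a simple greedy filter. This is only a sketch: the greedy strategy, the alphabetical processing order, the function name, and the toy chain names and identity values are all assumptions, not FSSP's documented procedure.

```python
def representative_set(lengths, identity, max_identity=25.0, min_length=30):
    """Greedily pick chains so that no kept pair exceeds max_identity.

    lengths:  dict mapping chain name -> length in amino acids
    identity: dict mapping frozenset({a, b}) -> percent identity of that pair;
              pairs missing from the dict are treated as unrelated (0%),
              which is an assumption of this sketch.
    """
    kept = []
    for name in sorted(lengths):  # alphabetical order, chosen only for determinism
        if lengths[name] <= min_length:
            continue  # too short to be compared
        if all(identity.get(frozenset((name, k)), 0.0) <= max_identity for k in kept):
            kept.append(name)
    return kept


# Made-up chains and pairwise identities, for illustration only.
lengths = {"chainA": 120, "chainB": 118, "chainC": 95, "chainD": 22}
identity = {
    frozenset(("chainA", "chainB")): 80.0,
    frozenset(("chainA", "chainC")): 12.0,
    frozenset(("chainB", "chainC")): 15.0,
}
print(representative_set(lengths, identity))  # ['chainA', 'chainC']
```

Here chainB is dropped as a close homolog of chainA, and chainD is dropped for being 30 residues or shorter.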

Fig. 9.31 Page 305

FSSP: fold tree Fig. 9.32 Page 306

Fig. 9.33 Page 307

Fig. 9.34 Page 307

Approaches to predicting protein structures: There are more than 20,000 structures in PDB, and about 1 million protein sequences in SwissProt/TrEMBL. For most proteins, structural models derive from computational biology approaches rather than experimental methods. The most reliable method of modeling and evaluating new structures is by comparison to previously known structures. This is comparative modeling. An alternative is ab initio modeling. Page 303-305

Approaches to predicting protein structures: obtain sequence (target) → fold assignment → comparative modeling or ab initio modeling → build, assess model. Page 308

Comparative modeling of protein structures:
[1] Perform fold assignment (e.g. BLAST, CATH, SCOP); identify structurally conserved regions.
[2] Align the target (unknown protein) with the template. This is performed for >30% amino acid identity over a sufficient length.
[3] Build a model.
[4] Evaluate the model.
Page 305
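The identity check in step [2] can be sketched as follows, assuming the target and template are already aligned as two equal-length gapped strings. Several definitions of percent identity are in use; this sketch divides identical positions by the number of columns where neither sequence has a gap. The function names, the toy sequences, and the `min_aligned` default (standing in for "a sufficient length", which the slide does not quantify) are hypothetical.

```python
def alignment_identity(target, template, gap="-"):
    """Return (percent identity, aligned column count) for two aligned strings."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for t, s in zip(target, template):
        if t == gap or s == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if t == s:
            identical += 1
    if aligned == 0:
        return 0.0, 0
    return 100.0 * identical / aligned, aligned


def usable_template(target, template, min_identity=30.0, min_aligned=80):
    """Apply the >30% identity rule; min_aligned is a hypothetical length cutoff."""
    pid, n = alignment_identity(target, template)
    return pid > min_identity and n >= min_aligned


# Toy alignment, far shorter than any real target/template pair.
pid, n = alignment_identity("ACDEFG-HIK", "ACDQFGLH-K")
print(pid, n)  # 87.5 8
print(usable_template("ACDEFG-HIK", "ACDQFGLH-K", min_aligned=5))  # True
```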

Errors in comparative modeling: Errors may occur for many reasons:
[1] Errors in side-chain packing
[2] Distortions within correctly aligned regions
[3] Errors in regions of target that do not match template
[4] Errors in sequence alignment
[5] Use of incorrect templates
Page 306

Comparative modeling: In general, the accuracy of structure prediction depends on the percent amino acid identity shared between target and template. For >50% identity, the root-mean-square deviation (RMSD) between model and experimental structure is often only 1 Å. Page 306
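RMSD here is the square root of the mean squared distance between corresponding atoms of two structures. The sketch below computes it for two coordinate lists, assuming the structures have already been superimposed and that atoms are listed in matching order; it does not perform the superposition itself. The function name and the toy coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units (e.g. Å) between two pre-superimposed coordinate sets.

    Each argument is a sequence of (x, y, z) tuples with atoms in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom of the model sits exactly 1 Å from its reference position.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(model, reference))  # 1.0
```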

Baker and Sali (2000) Page 308

Comparative modeling: Many web servers offer comparative modeling services. Examples are SWISS-MODEL (ExPASy), the PredictProtein server (Columbia), and WHAT IF (CMBI, Netherlands). Page 309

Ab initio protein structure prediction: Ab initio prediction can be performed when a protein has no detectable homologs. Protein folding is modeled based on global free-energy minimum estimates. The “Rosetta” method was applied to sequence families lacking known structures. For 80 of 131 proteins, one of the top five ranked models successfully predicted the structure within 6.0 Å RMSD (Bonneau et al., 2002). Page 309-310
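The success criterion quoted above (at least one of the top five ranked models within 6.0 Å RMSD) can be written down as a small check. This is a sketch of the bookkeeping only, not of the prediction method; the function names are hypothetical, the RMSD lists are made up, and treating "within 6.0 Å" as "at or below 6.0 Å" is an assumption.

```python
def top_n_hit(ranked_rmsds, n=5, cutoff=6.0):
    """True if any of the first n ranked models has RMSD at or below the cutoff (Å)."""
    return any(r <= cutoff for r in ranked_rmsds[:n])


def success_fraction(per_protein_rmsds, n=5, cutoff=6.0):
    """Fraction of proteins with at least one top-n model under the cutoff."""
    hits = sum(top_n_hit(r, n, cutoff) for r in per_protein_rmsds)
    return hits / len(per_protein_rmsds)


# Made-up RMSD values, listed in rank order for two hypothetical proteins.
toy = [
    [9.1, 7.4, 5.2, 8.0, 10.3],      # third-ranked model is a hit
    [8.8, 7.9, 6.5, 9.2, 7.1, 3.0],  # the 3.0 Å model is ranked sixth, so no hit
]
print(success_fraction(toy))  # 0.5

# The reported result, 80 of 131 proteins, as a fraction.
print(round(80 / 131, 3))  # 0.611
```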

Protein structure and human disease: In some cases, a single amino acid substitution can induce a dramatic change in protein structure. For example, the ΔF508 mutation of CFTR alters the α-helical content of the protein, and disrupts intracellular trafficking. Other changes are subtle. The E6V mutation in the gene encoding hemoglobin beta causes sickle-cell anemia. The substitution introduces a hydrophobic patch on the protein surface, leading to clumping of hemoglobin molecules. Page 311
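Substitution names such as E6V read as wild-type residue, position, new residue. The sketch below applies such a substitution to a sequence string and checks that the wild-type residue matches. The helper name is hypothetical, and the eight-residue fragment is assumed to be the start of mature hemoglobin beta, numbered without the initiator methionine. Deletions such as ΔF508 are outside what this helper handles.

```python
import re


def apply_substitution(sequence, mutation):
    """Apply a single-residue substitution written like 'E6V' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a simple substitution: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]


# Assumed N-terminal fragment of mature hemoglobin beta (no initiator Met).
fragment = "VHLTPEEK"
print(apply_substitution(fragment, "E6V"))  # VHLTPVEK
```

The charged glutamate (E) at position 6 becomes a hydrophobic valine (V), which is the change the slide associates with the hydrophobic surface patch.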

Protein structure and human disease
Disease               Protein
Cystic fibrosis       CFTR
Sickle-cell anemia    hemoglobin beta
“mad cow” disease     prion protein
Alzheimer disease     amyloid precursor protein
Table 9.5 Page 312