Protein Structure Classification
Ole Lund, Associate Professor, CBS, DTU

Presentation transcript:


Why classify proteins?
The number of solved structures is growing rapidly
Generate an overview of structure types
Detect similarities (evolutionary relationships)
Set up prediction benchmarks

Classification schemes
SCOP – Manual classification (A. Murzin)
CATH – Semi-manual classification (C. Orengo)
FSSP – Automatic classification (L. Holm)

Levels in SCOP
1. Class: 10
2. Folds
3. Superfamilies
4. Families: 1699
Murzin et al.,
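The four levels can be read off a SCOP concise classification string (sccs), which has the form class.fold.superfamily.family, for example `a.1.1.2`. The following is a minimal sketch of how such a string could be split into the levels listed above. The names `ScopLevels`, `parse_sccs`, and `deepest_shared_level` are hypothetical helpers for illustration and are not part of any SCOP tool.

```python
from typing import NamedTuple


class ScopLevels(NamedTuple):
    """Cumulative identifiers for the four SCOP levels (hypothetical helper type)."""
    scop_class: str
    fold: str
    superfamily: str
    family: str


def parse_sccs(sccs: str) -> ScopLevels:
    """Split an sccs string such as 'a.1.1.2' into cumulative level identifiers."""
    parts = sccs.strip().split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    # Each level is identified by its own field plus all fields above it.
    return ScopLevels(*(".".join(parts[: i + 1]) for i in range(4)))


def deepest_shared_level(sccs_a: str, sccs_b: str) -> str:
    """Return the name of the deepest level two domains share, or 'none'."""
    shared = "none"
    for name, a, b in zip(ScopLevels._fields, parse_sccs(sccs_a), parse_sccs(sccs_b)):
        if a != b:
            break
        shared = name
    return shared


if __name__ == "__main__":
    print(parse_sccs("a.1.1.2"))
    print(deepest_shared_level("a.1.1.2", "a.1.1.3"))
```

Storing cumulative identifiers (for instance `a.1` for the fold rather than `1`) keeps each level unambiguous on its own, since fold number 1 exists in more than one class.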

Major classes in SCOP
Classes
– All alpha proteins
– All beta proteins
– Alpha and beta proteins (a/b)
– Alpha and beta proteins (a+b)
– Multi-domain proteins
– Membrane and cell surface proteins
– Small proteins

All alpha: Hemoglobin (1bab)

All beta: Immunoglobulin (8fab)

Alpha/beta: Triosephosphate isomerase (1hti)

Alpha+beta: Lysozyme (1jsf)

Folds*
Each class may be divided into one or more folds
Proteins that have the same (>~50%) secondary structure elements, arranged in the same order in the protein chain and in three dimensions, are classified as having the same fold
*Confusingly, also called fold classes

Superfamilies
Superfamilies are subdivisions of folds
A superfamily contains proteins that are thought to be evolutionarily related on the basis of
– Sequence
– Function
– Special structural features
Relationships between members of a superfamily may not be readily recognizable from the sequence alone

Families
Subdivisions of superfamilies
A family contains members whose relationship is readily recognizable from the sequence (>~25% sequence identity)
Families are further subdivided into Proteins
Proteins are divided into Species
– The same protein may be found in several species
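The >~25% sequence identity guideline can be illustrated with a small calculation. The sketch below assumes the two sequences have already been aligned (gaps written as `-`) and counts identity over aligned, non-gap positions only. Other definitions of percent identity use a different denominator, so this is one convention among several. The function names and the hard-coded threshold are illustrative assumptions, not SCOP's actual procedure, which relies on manual curation.

```python
FAMILY_IDENTITY_THRESHOLD = 25.0  # percent; the approximate guideline from the slide


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def readily_recognizable(aligned_a: str, aligned_b: str,
                         threshold: float = FAMILY_IDENTITY_THRESHOLD) -> bool:
    """True if the pair exceeds the identity guideline for family membership."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration only.
    print(percent_identity("ACDE-G", "ACDFHG"))
    print(readily_recognizable("ACDE-G", "ACDFHG"))
```

Pairs that fall below the threshold are not thereby unrelated: as the Superfamilies slide notes, such relationships may simply not be recognizable from sequence alone.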

Families

CATH levels
Class
Architecture
– This level is unique to CATH (~30 are defined)
Topology
– Approximately corresponds to Fold (/superfamily) in SCOP
Homologous Superfamily
– Approximately corresponds to Superfamily (/family) in SCOP
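CATH identifies a homologous superfamily with a four-part numeric code, one field per level (C.A.T.H), for example `3.40.50.300`. The sketch below splits such a code into the four levels and records the approximate SCOP counterparts given above. The helper names are hypothetical, and the class-to-class entry in the mapping is an assumption of this sketch rather than something the slide states.

```python
from typing import Dict, Optional

CATH_LEVELS = ("class", "architecture", "topology", "homologous_superfamily")

# Approximate SCOP counterparts, following the slide. None marks the level
# that is unique to CATH. The "class" entry is an assumption of this sketch.
APPROX_SCOP_EQUIVALENT: Dict[str, Optional[str]] = {
    "class": "class",
    "architecture": None,
    "topology": "fold",
    "homologous_superfamily": "superfamily",
}


def parse_cath_code(code: str) -> Dict[str, str]:
    """Split a code such as '3.40.50.300' into cumulative identifiers per level."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four numeric fields (C.A.T.H), got {code!r}")
    return {
        level: ".".join(parts[: i + 1])
        for i, level in enumerate(CATH_LEVELS)
    }


if __name__ == "__main__":
    for level, identifier in parse_cath_code("3.40.50.300").items():
        print(level, identifier, APPROX_SCOP_EQUIVALENT[level])
```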

Architecture
Same overall arrangement of secondary structures
– Example: the architecture "Two-layer beta sheet proteins" contains different folds, each with a distinct number and connectivity of strands

FSSP
Fully automated classification
Automatic update
Database contains structural alignments
Tree of protein structures
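A tree of structures can be derived automatically from pairwise structural comparisons. The slide does not say which clustering method FSSP uses, so the sketch below substitutes plain average-linkage clustering over a hypothetical table of pairwise distances. The structure labels, the distance values, and the function name are all invented for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def average_linkage_tree(names: List[str],
                         dist: Dict[FrozenSet[str], float]) -> Tree:
    """Build a binary tree (nested tuples) by repeatedly merging the two
    clusters with the smallest average pairwise distance."""
    if not names:
        raise ValueError("need at least one structure")
    # Each node is (subtree, list of leaf names under it).
    nodes: List[Tuple[Tree, List[str]]] = [(name, [name]) for name in names]
    while len(nodes) > 1:
        best = None
        for i, j in combinations(range(len(nodes)), 2):
            members_i, members_j = nodes[i][1], nodes[j][1]
            total = sum(dist[frozenset((a, b))]
                        for a in members_i for b in members_j)
            avg = total / (len(members_i) * len(members_j))
            if best is None or avg < best[0]:
                best = (avg, i, j)
        _, i, j = best
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        # Drop the two merged nodes and append the new cluster.
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0]


if __name__ == "__main__":
    # Hypothetical structural distances between three structures.
    distances = {
        frozenset(("A", "B")): 1.0,
        frozenset(("A", "C")): 4.0,
        frozenset(("B", "C")): 5.0,
    }
    print(average_linkage_tree(["A", "B", "C"], distances))
```

Because every step depends only on the pairwise table, the tree can be rebuilt without manual intervention whenever new structures are added, which is what makes a fully automated update feasible.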

FSSP classification (MHC molecules)

Links
PDB –
SCOP – scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html
CATH –
FSSP –