Localization prediction of transmembrane proteins Stefan Maetschke, Mikael Bodén and Marcus Gallagher The University of Queensland.

Presentation transcript:

Slide 2: Protein classes
Diagram labels: Protein; Soluble; Membrane; Peripheral; Integral; Anchored; Transmembrane; α-helical; β-barrel; Single-spanning; Multi-spanning

Slide 3: Transmembrane protein types
Figure labels: Type-I; Type-II; Type-III; Type-IV (multi-spanning); N- and C-termini; signal peptide; cytosol (inside)

Slide 4: Eukaryotic cell
Figure labels: Nucleus; Mitochondrion; Peroxisome; Lysosome; Endoplasmic Reticulum; Golgi Complex; ERGIC; Endosome; RNA; Ribosome

Slide 5: Secretory and endocytic pathway

Slide 6: Problem and hypothesis
- Sorting signals for transmembrane proteins serve multiple purposes (targeting, retention, retrieval, avoidance) and are largely unknown (the problem is challenging/multi-faceted).
- Current localization prediction of eukaryotic transmembrane proteins is poor; models based on soluble proteins are ill-suited (previous work is inadequate/incomplete).
- Localization prediction for transmembrane proteins is virtually unexplored (paucity/variance of data) (it is an open problem).
- Explicit modelling of protein topology should enhance localization prediction accuracy: parameter tuning receives explicit guidance to biologically sensible solutions (the way to do it!).

Slide 7: Hidden Markov model
Figure: three states S1, S2, S3 with self-transitions a11, a22, a33 and forward transitions a12, a23; each state Si has an observation distribution bi.
- Initial state probabilities
- State transition probabilities
- Observation probabilities: over single amino acids (A, R, ..., V)
- State sequence (example): s1 s1 s1 s2 s2 s2 s2 s2 s2 s3
- Observation sequence
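A minimal sketch of how a three-state, left-to-right HMM like the one on this slide assigns a likelihood to an amino acid sequence, using the forward algorithm in log space. All numeric parameters, the "hydrophobic" emission profile, and the example sequence are illustrative assumptions, not values from the presentation.

```python
import math

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | model) via the forward recursion.

    start[i]   : initial probability of state i
    trans[i][j]: transition probability a_ij
    emit[i][x] : observation probability b_i(x)
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][seq[0]]) for i in range(n)]
    for symbol in seq[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)

def normalised(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

# Hypothetical emission profiles (assumptions for illustration only).
uniform = normalised({aa: 1.0 for aa in AMINO_ACIDS})
hydrophobic = normalised({aa: (3.0 if aa in "AILMFVW" else 1.0) for aa in AMINO_ACIDS})

start = [1.0, 0.0, 0.0]            # always begin in S1
trans = [[0.8, 0.2, 0.0],          # a11, a12
         [0.0, 0.9, 0.1],          # a22, a23
         [0.0, 0.0, 1.0]]          # a33
emit = [uniform, hydrophobic, uniform]

print(forward_log_likelihood("MKTLLVAAILGS", start, trans, emit))
```

With uniform emissions in every state, the result reduces to the sequence length times log(1/20), which is a convenient sanity check.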

Slide 8: 2-order Hidden Markov model
Figure: same three-state structure as the first-order model (S1, S2, S3; transitions a11, a12, a22, a23, a33; observation distributions b1, b2, b3).
- Initial state probabilities
- State transition probabilities
- Observation probabilities: over di-peptides (AA, AR, AN, AD, ..., VV)
- State sequence (example): s1 s1 s1 s2 s2 s2 s2 s2 s2 s3
- Observation sequence

Slide 9: 3-order Hidden Markov model
Figure: same three-state structure as the first-order model (S1, S2, S3; transitions a11, a12, a22, a23, a33; observation distributions b1, b2, b3).
- Initial state probabilities
- State transition probabilities
- Observation probabilities: over tri-peptides (AAA, AAR, AAN, AAD, AAC, AAQ, ..., VVV)
- State sequence (example): s1 s1 s1 s2 s2 s2 s2 s2 s2 s3
- Observation sequence
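The higher-order slides appear to replace single-residue observations with overlapping di- and tri-peptides. The sketch below shows one way to derive such observation symbols from a sequence. The function names are hypothetical, and the 1-based indexing over the residue order A, R, N, D, C, Q, ... is an assumption read off the symbols shown on the slides.

```python
# Residue order assumed from the slide symbols (A, R, N, D, C, Q, ..., V).
ORDER = "ARNDCQEGHILKMFPSTWYV"

def kmers(seq, k):
    """Overlapping k-mers of seq, e.g. k=2 gives di-peptides."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def kmer_index(kmer):
    """1-based index of a k-mer, treating residues as base-20 digits."""
    idx = 0
    for residue in kmer:
        idx = idx * len(ORDER) + ORDER.index(residue)
    return idx + 1

seq = "MAVLR"
print(kmers(seq, 1))   # single amino acids
print(kmers(seq, 2))   # di-peptides
print(kmers(seq, 3))   # tri-peptides

# Size of the observation alphabet grows as 20**k.
for k in (1, 2, 3):
    print(k, len(ORDER) ** k)
```

The alphabet grows from 20 to 400 to 8000 symbols, so the number of emission parameters per state rises quickly with the order.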

Slide 10: Signal peptide
Figure labels: N-terminal region; hydrophobic core; cleavage region; mature protein

Slide 11: Transmembrane domain
Figure labels: icap; TMD; ocap

Slide 12: Protein topology model
Figure labels: SP; N-term; icap; TMD; ocap; C-term; outside; inside

Slide 13: Localization model (5 x topology models)
Figure labels: Nucleus; Mitochondrion; Peroxisome; Lysosome; Endoplasmic Reticulum; Golgi Complex; ERGIC; Endosome
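One plausible reading of "5 x topology models" is one generative model per location, with a protein assigned to the location whose model scores it highest. The sketch below illustrates only that decision rule. It substitutes a simple amino acid composition model for the topology HMM, and the training sequences are invented toy data, so every name and value here is an assumption.

```python
import math
from collections import Counter

ALPHABET = "ARNDCQEGHILKMFPSTWYV"

def train_composition_model(seqs, pseudo=1.0):
    """Log-probabilities of each residue, with add-one style smoothing.

    Stand-in for a per-location topology model (not the authors' HMM).
    """
    counts = Counter(ch for s in seqs for ch in s)
    total = sum(counts[a] for a in ALPHABET) + pseudo * len(ALPHABET)
    return {a: math.log((counts[a] + pseudo) / total) for a in ALPHABET}

def log_likelihood(model, seq):
    """Log-likelihood of seq under an independent-residue model."""
    return sum(model[ch] for ch in seq)

def predict_location(models, seq):
    """Pick the location whose model gives seq the highest log-likelihood."""
    return max(models, key=lambda loc: log_likelihood(models[loc], seq))

# Invented toy training data, one model per location.
training = {
    "PM": ["LLLLAV"],
    "ER": ["KKDEKK"],
}
models = {loc: train_composition_model(seqs) for loc, seqs in training.items()}

print(predict_location(models, "LLAL"))
print(predict_location(models, "KDEL"))
```

Replacing `log_likelihood` with the forward log-likelihood of a per-location HMM would keep the same argmax structure.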

Slide 14: LOCATE dataset
- Subset of the LOCATE database (FANTOM3, mouse proteome)
- Filtered for transmembrane proteins
- No multi-targeted proteins
- Redundancy reduced (<25%)
- TMDs and SPs are labeled (predicted)
- High-quality localization annotation

Location               Proteins
Plasma Membrane        873
Endoplasmic Reticulum  261
Golgi Complex          141
Lysosome               45
Endosome               31
Total                  1351
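The class counts on this slide are strongly imbalanced. As a small worked example (an addition for illustration, not part of the presentation), the snippet below computes the share of each location and the accuracy a trivial predictor would reach by always answering "Plasma Membrane".

```python
# Class counts as listed on the LOCATE dataset slide.
counts = {
    "Plasma Membrane": 873,
    "Endoplasmic Reticulum": 261,
    "Golgi Complex": 141,
    "Lysosome": 45,
    "Endosome": 31,
}

total = sum(counts.values())
fractions = {loc: n / total for loc, n in counts.items()}

# Accuracy of always predicting the largest class.
majority_class = max(counts, key=counts.get)
majority_baseline = counts[majority_class] / total

print(total)
for loc, frac in fractions.items():
    print(f"{loc}: {frac:.3f}")
print(majority_class, round(majority_baseline, 3))
```

A baseline this high is one reason a correlation-based measure is more informative than raw accuracy on this dataset.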

Slide 15: Prediction performance
- Prediction performance (MCC) on the LOCATE dataset
- Mean correlation coefficient; 10-fold, 10 times
- Five locations (ER, PM, GO, EN, LY)
- SVM: linear kernel
- 1-, 2- and 3-order HMMs
- Confusion matrix: HMM-2
=> Di-peptide composition superior to single amino acid composition
=> Topological model superior to non-topological model
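Assuming MCC here denotes the Matthews correlation coefficient computed per location in a one-vs-rest fashion (the slide does not spell this out), it can be derived from a confusion matrix as sketched below. The 3x3 matrix is made-up example data, not the confusion matrix from the presentation.

```python
import math

def mcc_one_vs_rest(confusion, k):
    """Matthews correlation coefficient for class k against all others.

    confusion[i][j] = number of items of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[i][k] for i in range(n)) - tp
    tn = total - tp - fn - fp
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a marginal is empty and MCC is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical confusion matrix for three locations (rows: true, cols: predicted).
labels = ["PM", "ER", "GO"]
confusion = [
    [80, 15, 5],
    [10, 25, 5],
    [6, 4, 10],
]

per_class = {lab: mcc_one_vs_rest(confusion, i) for i, lab in enumerate(labels)}
mean_mcc = sum(per_class.values()) / len(per_class)
print(per_class)
print(mean_mcc)
```

Averaging the per-class values, as done for `mean_mcc`, is one way to arrive at a single summary figure across locations.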

Slide 16: Predictor comparison
- Prediction accuracy in % for CELLO 2.5, WolfPSort, ProteomeAnalyst 2.5 and HMM-2 (values missing)
- Test set: 20 PM, 20 ER, 20 Golgi
- HMM: only three classes, but test set [relation symbol missing] train set
- Other predictors: more classes, but test set [relation symbol missing] train set
→ difficult to compare!

Slide 17: Conclusion
Novel predictor for subcellular localization of transmembrane proteins along the secretory pathway:
- Protein model has fewer states than topology predictors (TMHMM, HMMTOP, etc.) but is of second order
- Localization model is trained and tested using LOCATE, a recent, high-quality localization dataset
- Overall better performance than current localization predictors (transmembrane proteins, eukaryotic, secretory pathway)
  - Di-peptide composition superior to single amino acid composition
  - "Topological" model superior to "non-topological" baseline model