Mass spectrometry-based proteomics

Slides:



Advertisements
Similar presentations
Genomes and Proteomes genome: complete set of genetic information in organism gene sequence contains recipe for making proteins (genotype) proteome: complete.
Advertisements

Protein Quantitation II: Multiple Reaction Monitoring
Post-Translational Modifications: CrossTalk Robert Chalkley Chem 204.
Proteomics Informatics – Protein characterization I: post-translational modifications (Week 10)
In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry Lian Yang 2 ; Baozhen Shan 1 ; Bin Ma 2 1 Bioinformatics.
How to identify peptides October 2013 Gustavo de Souza IMM, OUS.
Peptide Identification by Tandem Mass Spectrometry Behshad Behzadi April 2005.
Previous Lecture: Regression and Correlation
Mass Spectrometry. What are mass spectrometers? They are analytical tools used to measure the molecular weight of a sample. Accuracy – 0.01 % of the total.
Scaffold Download free viewer:
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Goals in Proteomics 1.Identify and quantify proteins in complex mixtures/complexes 2.Identify global protein-protein interactions 3.Define protein localizations.
A combination of the words Proteomics and Genomics. Proteogenomics commonly refer to studies that use proteomic information, often derived from mass spectrometry,
Analysis of tandem mass spectra - II Prof. William Stafford Noble GENOME 541 Intro to Computational Molecular Biology.
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Karl Clauser Proteomics and Biomarker Discovery Taming Errors for Peptides with Post-Translational Modifications Bioinformatics for MS Interest Group ASMS.
Phosphoproteomics and motif mining Martin Miller Ph.d. student CBS DTU
The dynamic nature of the proteome
Common parameters At the beginning one need to set up the parameters.
Analysis of Complex Proteomic Datasets Using Scaffold Free Scaffold Viewer can be downloaded at:
A Phospho-Peptide Spectrum Library for Improved Targeted Assays Barbara Frewen 1, Scott Peterman 1, John Sinclair 2, Claus Jorgensen 2, Amol Prakash 1,
Laxman Yetukuri T : Modeling of Proteomics Data
INF380 - Proteomics-101 INF380 – Proteomics Chapter 10 – Spectral Comparison Spectral comparison means that an experimental spectrum is compared to theoretical.
Lecture 9. Functional Genomics at the Protein Level: Proteomics.
Proteomics What is it? How is it done? Are there different kinds? Why would you want to do it (what can it tell you)?
CSE182 CSE182-L11 Protein sequencing and Mass Spectrometry.
Multiple flavors of mass analyzers Single MS (peptide fingerprinting): Identifies m/z of peptide only Peptide id’d by comparison to database, of predicted.
Overview of Mass Spectrometry
Proteomics Informatics (BMSC-GA 4437) Instructor David Fenyö Contact information
Click to add Text Sample Preparation for Mass Spectrometry Sermin Tetik, PhD Marmara University July 2015, New Orleans.
Proteomics Informatics (BMSC-GA 4437) Course Directors David Fenyö Kelly Ruggles Beatrix Ueberheide Contact information
Constructing high resolution consensus spectra for a peptide library
김지형. Introduction precursor peptides are dynamically selected for fragmentation with exclusion to prevent repetitive acquisition of MS/MS spectra.
Identify proteins. Proteomic workflow Trypsin A typical sample We add a solution of 50 mM NH 4 HCO 3 (pH 7.8) containing trypsin ( µg/µl). Volume.
Estimating the False Discovery Rate in Genome-wide Studies BMI/CS 576 Colin Dewey Fall 2008.
Goals in Proteomics Identify and quantify proteins in complex mixtures/complexes Identify global protein-protein interactions Define protein localizations.
Database Search Algorithm for Identification of Intact Cross-Links in Proteins and Peptides Using Tandem Mass Sepctrometry 신성호.
Yiming Yang1,2, Abhay Harpale1 and Subramanian Ganaphathy1
MassMatrix Search Results Explained
Gene expression from RNA-Seq
Protein Identification via Database searching
Volume 4, Issue 6, Pages e4 (June 2017)
Learning Sequence Motif Models Using Expectation Maximization (EM)
Inferring Models of cis-Regulatory Modules using Information Theory
Interpolated Markov Models for Gene Finding
RNA Secondary Structure Prediction
Eukaryotic Gene Finding
Distribution of disorder in the cytosolic phosphoproteome
Linking Genetic Variation to Important Phenotypes
Proteomics Informatics David Fenyő
Interpretation of Mass Spectra I
A perspective on proteomics in cell biology
Volume 4, Issue 6, Pages e4 (June 2017)
Proteomics Informatics –
Top-down protein identification.
Schematic of MS1 filtering.
Schematic representation of proteogenomic annotation strategy.
Correction of translational start site by identification of N-terminal peptide. Correction of translational start site by identification of N-terminal.
Bioinformatics for Proteomics
Interaction networks of the regulated phosphoproteins.
Shotgun Proteomics in Neuroscience
High level view of the MAE algorithm.
Full-length AdIGFBP-5 and T47D cell-derived IGFBP-5 analyzed by mass spectrometry.a, intact AdIGFBP-5 was analyzed by ESI-MS. Full-length AdIGFBP-5 and.
Schematic of AIMS-to-MRM experiment.
(A) Design of the PhosphoPep database.
Proteomics Informatics David Fenyő
Interpretation of Mass Spectra
A Primer on Concepts and Applications of Proteomics in Neuroscience
Presentation transcript:

Mass spectrometry-based proteomics BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Anthony Gitter, Mark Craven, and Colin Dewey

Goals for lecture Key concepts Benefits of mass spectrometry Generating mass spectrometry data Computational tasks Matching spectra and peptides

Mass spectrometry uses Mass spectrometry is like the protein analog of RNA-seq Quantify abundance or state of all (many) proteins No need to specify proteins to measure in advance Other applications in biology Targeted proteomics Metabolomics Lipidomics

Advantages of proteomics Proteins are functional units in a cell Protein abundance directly relevant to activity Post-translational modifications Change protein state Phosphorylation in signaling Thermo Fisher Scientific Histone modifications Latham Nature Structural & Molecular Biology 2007; Katie Ris-Vicari

Estimating protein levels from gene expression Correlation between gene expression and protein abundance has been debated Gene expression tells us nothing about post-translational modifications Contribution to protein levels Li and Biggin Science 2015

Mass spectrometry workflow Total ion chromato-gram MS2 Filter based on MS1 Nesvizhskii Journal of Proteomics 2010

Amino Acids 20 amino acids Building blocks of proteins Known molecular weight Common template Amino-terminal Carboxy-terminal Wikipedia, Yassine Mrabet

Peptide fragmentation Peptide bond Select similar peptides from MS1 Fragment with high energy collisions Break peptide bonds Wikipedia, Yassine Mrabet Charge on amino-terminal (b) or carboxy-terminal fragment (y) Subscript = # R groups retained Steen and Mann Nat Rev Mol Cell Biol 2004

Fragment and analyze one precursor ion Mass spectra MS1 Fragment and analyze one precursor ion MS2 Steen and Mann Nat Rev Mol Cell Biol 2004 Mass-to-charge ratio Spectrum contains information about amino acid sequence, fragment at different bonds

From spectra to peptides Nesvizhskii Journal of Proteomics 2010

Sequence database search Need to define a scoring function Identify peptide-spectrum match (PSM) Steen and Mann Nat Rev Mol Cell Biol 2004

SEQUEST Cross correlation (xcorr) Similarity between theoretical spectrum (x) and acquired spectrum (y) Correction for mean similarity at different offsets Offsets Actual similarity Theoretical Acquired Eng, McCormack, Yates J Am Soc Mass Spectrom 1994

Fast SEQUEST SEQUEST originally only applied to top 500 peptides based on coarse filtering score Skip the 0 offset Eng et al J Proteome Res 2008

PSM significance E-value: expected number of null peptides with score ≥ observed score Compute FDR from E-value distribution Add decoy peptides to database Reversed peptide sequences Used to estimate false discoveries

Target-decoy strategy Nesvizhskii Journal of Proteomics 2010 target PSMs above score threshold = Nt (ST) decoy PSMs above score threshold = Nd (ST)

Identifying proteins Even after identifying PSM, still need to identify protein of origin Serang and Noble Stat Interface 2012

Mass spectrometry versus RNA-seq Transcript → RNA fragment → paired-end read Mass spectrometry Protein → peptides → ions → spectrum Mapping spectra to proteins more ambiguous than mapping reads to transcripts Spectra state space is enormous

Protein-protein interactions Affinity-purification mass spectrometry Purify protein of interest, identify complex members Smits and Vermeulen Trends in Biotechnology 2016

Protein-protein interactions Mass spectrometry identifies proteins in the complex Must control for contaminants MS2 Gingras et al Nature Reviews Molecular Cell Biology 2007

Post-translational modifications (PTMs) Shift the peptide mass by a known quantity what-when-how

Phosphoproteomics example Gene Modified Site Peptide Phosphorylation (Treatment / Control) AGRN S671 AGPC[160.03]EQAEC[160.03]GS[167.00]GGSGSGEDGDC[160.03]EQELC[160.03]R 4.54 ADAMTS10 S74 RGTGATAES[167.00]R 0.30 CABYR T16 T[181.01]LLEGISR 0.37 TTC7B T152 VIEQDET[181.01]R 5.97 STAT3 Y705 K.n[305.21]YC[160.03]RPESQEHPEADPGSAAPY[243.03]LK[432.30].T 4.50 Sychev et al PLoS Pathogens 2017

Phosphoproteomics interpretation Predict kinases/phosphatases for phospho sites Linding et al Cell 2007

Mass spectrometry replicates Doesn’t identify all proteins in the sample Data dependent acquisition has low overlap across replicates Partly due to biological variation New protocols to overcome this Phosphorylation PTMs are especially variable Grimsrud et al Cell Metabolism 2012 5 biological replicates 9,558 phosphoproteins identified 5.6% in all replicates

Data independent acquisition Not dependent on most abundance signals in MS1 Sliding m/z window Doerr Nature Methods 2015

Mass spectrometry summary Incredibly powerful for looking at biological processes beyond gene expression Protein abundance Post-translational modifications Metabolites Protein-protein interactions Typically reports relative abundance Labeling strategies for comparative analysis Compare relative abundance in multiple conditions Missing data was a big problem, but improving Fully probabilistic analysis pipelines are not the most popular tools Arguably greater diversity in software than RNA-seq