Proteomics Informatics (BMSC-GA 4437) Course Directors David Fenyö Kelly Ruggles Beatrix Ueberheide Contact information
Proteomics Informatics – Learning Objectives Be able to analyze proteomics data sets and understand the limitations of the results.
Proteomics Informatics – Syllabus
Lecture 1 Overview of proteomics (January 26, 2016 TRB 718 5pm)
Lecture 2 Overview of mass spectrometry (February 2, 2016 TRB 718 5pm)
Lecture 3 Protein identification I: searching protein sequence collections (February 9, 2016 TRB 718 5pm)
Lecture 4 Databases, data repositories and standardization (February 16, 2016 TRB 718 5pm)
Lecture 5 Presentations of Project 1: Trends in Proteomics (February 23, 2016 TRB 718 5pm)
Lecture 6 Interpretation of mass spectra (March 1, 2016 TRB 718 5pm)
Lecture 7 Protein identification II: de novo sequencing (March 8, 2016 TRB 718 5pm)
Lecture 8 Protein quantitation I: data dependent acquisition (March 15, 2016 TRB 718 5pm)
Lecture 9 Protein quantitation II: targeted and data-independent acquisition (March 22, 2016 TRB 718 5pm)
Lecture 10 Presentations of Project 2: Identification (April 12, 2016 TRB 718 5pm)
Lecture 11 Proteogenomics (April 19, 2016 TRB 718 5pm)
Lecture 12 Protein characterization I: post-translational modifications (April 26, 2016 TRB 718 5pm)
Lecture 13 Protein characterization II: protein interactions (May 3, 2016 TRB 718 5pm)
Lecture 14 Data analysis, visualization, and molecular markers (May 10, 2016 TRB 718 5pm)
Lecture 15 Presentations of Project 3: Quantitation (May 31, 2016 TRB 718 5pm)
Overview of Proteomics (Week 1) Why proteomics? Bioinformatics Overview of the course
Motivating Example: Protein Regulation Geiger et al., “Proteomic changes resulting from gene copy number variations in cancer cells”, PLoS Genet Sep 2;6(9). pii: e
Motivating Example: Protein Complexes Alber et al., Nature 2007
Motivating Example: Signaling Choudhary & Mann, Nature Reviews Molecular Cell Biology 2010
Bioinformatics [Diagram labels: Biological System, Samples, Measurements, Experimental Design, Raw Data, Data Analysis, Information]
Mass Spectrometry Based Proteomics [Workflow diagram. Sample handling: Lysis, Fractionation, Digestion, Mass spectrometry (MS). Data processing: Peak Finding, Charge determination, De-isotoping, Integrating Peaks, Searching. Result: Identified and Quantified Proteins]
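The "Charge determination" step can be illustrated with a small sketch. Neighbouring isotope peaks of an ion with charge z are spaced about 1.00335/z apart in m/z (the mass difference between 13C and 12C), so the charge can be estimated from the observed spacing. The function names, the tolerance and the maximum charge are illustrative assumptions, not part of the course material.

```python
# Approximate mass difference between 13C and 12C, in Da.
ISOTOPE_SPACING = 1.00335
PROTON = 1.00728  # mass of a proton, in Da


def estimate_charge(isotope_mzs, max_charge=6, tol=0.01):
    """Estimate the charge state from m/z values of consecutive isotope peaks.

    Returns None if no charge up to max_charge explains the mean spacing
    within tol (hypothetical defaults chosen for illustration).
    """
    mzs = sorted(isotope_mzs)
    if len(mzs) < 2:
        return None
    gaps = [b - a for a, b in zip(mzs, mzs[1:])]
    mean_gap = sum(gaps) / len(gaps)
    for z in range(1, max_charge + 1):
        if abs(mean_gap - ISOTOPE_SPACING / z) <= tol:
            return z
    return None


def neutral_mass(mz, z):
    """Convert an observed m/z at charge z to the neutral mass."""
    return (mz - PROTON) * z
```

For example, isotope peaks at 583.78, 584.28 and 584.78 are spaced about 0.5 apart, which this sketch assigns to charge 2.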
Overview of Mass spectrometry (Week 2) [Diagram: Ion Source → Mass Analyzer → Detector; output is a spectrum of intensity vs. mass/charge]
Overview of Mass spectrometry (Week 2) [Diagram: Ion Source → Mass Analyzer 1 → Fragmentation → Mass Analyzer 2 → Detector; fragment ions labelled b and y]
Overview of Mass spectrometry (Week 2) [Diagram: LC → Ion Source → Mass Analyzer 1 → Fragmentation → Mass Analyzer 2 → Detector; a series of spectra (intensity vs. mass/charge) is recorded over Time]
Signal processing I: Analysis of mass spectra (Week 2) [Figure: mass spectrum, Intensity vs. m/z]
Protein identification I: searching protein sequence collections and significance testing (Week 3)
Databases, data repositories and standardization (Week 4) Most proteins show very reproducible peptide patterns
Databases, data repositories and standardization (Week 4) [Figure: Query Spectrum compared with its Best match in GPMDB and Second best match in GPMDB]
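Ranking stored spectra against a query spectrum, as on the slide, can be sketched with a normalized dot product (cosine similarity) on binned peaks. This is a generic similarity measure used here as an assumption for illustration; the slides do not state how GPMDB scores matches, and all function names and the bin width are hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra (1.0 = same shape)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_matches(query, library):
    """Return (name, score) pairs for all library spectra, best match first."""
    scores = [(name, cosine_similarity(query, peaks))
              for name, peaks in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

The first two entries of the returned list correspond to the best and second-best match. Fixed-width binning is the simplest choice; a tolerance-based peak pairing would be an alternative.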
Presentations of project I (Week 5) Find trends over the last 10 years in the public proteomics data. Data sets: 10 min presentations
Interpretation of Mass Spectra (Week 6)
[Figure, built up over several slides: annotated MS/MS spectrum with b ions and y ions marked. Residue labels along the m/z axis read K L E D E E L F G S. b-ion labels: K 1166, L 1020, E 907, D 778, E 663, E 534, L 405, F 292, G 145, S 88. A label of 147 appears at K. The values 113 and 129 are highlighted in turn.]
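The numbers on these slides can be checked with a short calculation of singly charged b and y ions. The sketch assumes that the residue labels on the slide are printed C-terminus first, so that the peptide is SGFLEEDELK; this reading is an assumption, but it reproduces the b-ion values 88 through 1020, the 147 label as the y1 ion of lysine, and 1166 as the [M+H]+ of the whole peptide. The residue masses are standard monoisotopic values, and 113 and 129 are the nominal residue masses of L and E.

```python
# Monoisotopic residue masses (Da) for the amino acids in the example peptide.
RESIDUE_MASS = {
    "G": 57.02146, "S": 87.03203, "L": 113.08406, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "F": 147.06841,
}
WATER = 18.01056
PROTON = 1.00728


def b_ions(peptide):
    """Singly charged b-ion m/z values: N-terminal prefixes plus a proton."""
    ions, total = [], 0.0
    for residue in peptide[:-1]:
        total += RESIDUE_MASS[residue]
        ions.append(total + PROTON)
    return ions


def y_ions(peptide):
    """Singly charged y-ion m/z values: C-terminal suffixes plus water and a proton."""
    ions, total = [], 0.0
    for residue in reversed(peptide[1:]):
        total += RESIDUE_MASS[residue]
        ions.append(total + WATER + PROTON)
    return ions


def precursor_mh(peptide):
    """m/z of the singly protonated intact peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


# Assumption: the slide prints the sequence C-terminus first.
PEPTIDE = "SGFLEEDELK"
```

Truncating `b_ions(PEPTIDE)` to integers gives 88, 145, 292, 405, 534, 663, 778, 907, 1020; `y_ions(PEPTIDE)[0]` is about 147.11 and `precursor_mh(PEPTIDE)` about 1166.56.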
Protein identification II: de novo sequencing (Week 7) [Figure: MS/MS spectrum (% Relative Abundance vs. m/z) with the [M+2H] precursor marked. Steps: Mass Differences → Amino acid masses → Sequences consistent with spectrum]
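The step from mass differences to amino acid masses can be sketched as a lookup: each gap between consecutive peaks of an ion ladder is compared with the residue masses, and every residue within a tolerance is kept as a candidate. The function names and the tolerance are illustrative assumptions. Isoleucine is omitted from the table because it has the same mass as leucine, so "L" stands for either.

```python
# Monoisotopic residue masses (Da); "L" also stands for the isobaric isoleucine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_gap(gap, tol=0.02):
    """Return all residues whose mass lies within tol of the observed gap."""
    return [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol]


def read_ladder(peak_mzs, tol=0.02):
    """Translate consecutive peak-to-peak differences into residue candidates."""
    ordered = sorted(peak_mzs)
    return [residues_for_gap(b - a, tol) for a, b in zip(ordered, ordered[1:])]
```

With a tolerance of 0.02, a gap of 128.09 is assigned to K only, while a looser tolerance of 0.05 around 128.08 returns both Q and K. This is why several sequences can remain consistent with one spectrum.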
Protein quantitation I: Overview (Week 8)
[Workflow diagram labels: Light (L), Heavy (H), Lysis, Fractionation, Digestion, LC-MS, MS] Oda et al. PNAS 96 (1999) 6591; Ong et al. MCP 1 (2002) 376. Assumption: All losses after mixing are identical for the heavy and light isotopes. [Equations not recovered; indices: Sample i, Protein j, Peptide k]
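The consequence of this assumption can be sketched in a few lines: if heavy and light forms of a peptide suffer the same losses after mixing, the loss factor cancels in the heavy/light intensity ratio. Summarizing a protein by the median of its peptide ratios is a common choice but is an assumption here, not something the slide specifies; the function names are hypothetical.

```python
from statistics import median


def peptide_ratio(heavy_intensity, light_intensity):
    """Heavy/light intensity ratio for one peptide."""
    if light_intensity <= 0:
        raise ValueError("light intensity must be positive")
    return heavy_intensity / light_intensity


def protein_ratio(peptide_pairs):
    """Median heavy/light ratio over (heavy, light) intensity pairs of one protein."""
    return median(peptide_ratio(h, l) for h, l in peptide_pairs)
```

Scaling both intensities of a pair by the same factor, for instance a 50% loss during fractionation, leaves the ratio unchanged. The median keeps a single outlying peptide from dominating the protein-level value.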
Protein quantitation II: Targeted (Week 9) [Workflow diagram: Lysis, Fractionation, Digestion, LC-MS, MS]
Shotgun proteomics: Data Dependent Acquisition (DDA)
1. Records m/z (MS)
2. Selects peptides based on abundance and fragments them (MS/MS)
3. Protein database search for peptide identification
Targeted MS: uses a predefined set of peptides
1. Select precursor ion (MS)
2. Precursor fragmentation (MS/MS)
3. Use precursor-fragment pairs for identification
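The difference in precursor selection between the two acquisition modes can be sketched as follows. In the data dependent case the most intense MS peaks are chosen for fragmentation; in the targeted case only peaks matching a predefined list are chosen. Function names, the top-N value and the tolerance are illustrative assumptions, and real instruments add further logic (for example dynamic exclusion) that is not shown.

```python
def select_dda_precursors(ms1_peaks, top_n=3):
    """DDA-style selection: the top_n most intense (mz, intensity) peaks."""
    ranked = sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:top_n]]


def select_targeted_precursors(ms1_peaks, target_mzs, tol=0.01):
    """Targeted-style selection: peaks within tol of a predefined target m/z."""
    return [mz for mz, _ in ms1_peaks
            if any(abs(mz - target) <= tol for target in target_mzs)]
```

With the same MS scan, the first function returns whatever happens to be most abundant, while the second returns a low-abundance peptide as long as it is on the target list.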
Presentations of project II (Week 10) Protein identification Data sets: 10 min presentations
Proteogenomics (Week 11) [Figure (Kelly Ruggles): construction of a Tumor Specific Protein DB. Non-Tumor Sample: Genome sequencing, Identify germline variants. Tumor Sample: Genome sequencing and RNA-Seq, Identify alternative splicing, somatic variants and novel expression. Reference Human Database (Ensembl). Panels: Variants (TCGAGAGCTG vs. TCGATAGCTG), Alt. Splicing, Novel Expression, Fusion Genes (Gene X, Gene Y)]
Protein characterization I: post-translational modifications (Week 12) [Figure: a peptide with two possible modification sites is matched against an MS/MS spectrum (Intensity vs. m/z)] Which assignment does the data support? 1, 1 or 2, or 1 and 2?
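One simple way to ask which site the data support is to compute the fragment ions for each candidate placement of the modification and count how many are observed. The sketch below does this for b ions only, with a hypothetical peptide ASGSLK carrying one phosphorylation on either serine. The peptide, the peak list in the usage note, the function names and the counting score are all assumptions for illustration; practical tools use probabilistic scores instead of raw counts.

```python
# Monoisotopic residue masses (Da) for the hypothetical example peptide.
RESIDUE_MASS = {
    "A": 71.03711, "G": 57.02146, "S": 87.03203,
    "L": 113.08406, "K": 128.09496,
}
PROTON = 1.00728
PHOSPHO = 79.96633  # mass shift of phosphorylation (HPO3)


def b_ions(peptide, mod_site=None, mod_mass=0.0):
    """Singly charged b ions; mod_site is the 0-based index of the modified residue."""
    ions, total = [], 0.0
    for i, residue in enumerate(peptide[:-1]):
        total += RESIDUE_MASS[residue]
        if i == mod_site:
            total += mod_mass
        ions.append(total + PROTON)
    return ions


def count_matches(theoretical, observed, tol=0.02):
    """Number of theoretical ions with an observed peak within tol."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def score_sites(peptide, candidate_sites, observed, mod_mass=PHOSPHO):
    """Matched b-ion count for each candidate modification site."""
    return {site: count_matches(b_ions(peptide, site, mod_mass), observed)
            for site in candidate_sites}
```

For observed peaks at 72.044, 239.043, 296.064 and 383.096, `score_sites("ASGSLK", [1, 3], ...)` gives 4 matches for site 1 and 2 for site 3. Only the ions between the two candidate sites (here b2 and b3) distinguish the assignments; if none of them is observed, the data support "1 or 2" but not a specific site.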
Protein Characterization II: protein interactions (Week 13) [Diagram: proteins labelled A to F; steps: Digestion → Mass spectrometry → Identification]
Data analysis and visualization (Week 14)
Presentations of project III (Week 15) Protein quantitation Data sets: TBD 10 min presentations