Introduction to mass spectrometry-based protein identification and quantification
Austin Yang, Ph.D.
Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature Mar 13;422(6928): Review.
Mueller LN, Brusniak MY, Mani DR, Aebersold R. An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J Proteome Res Jan;7(1):51-61.
The typical proteomics experiment consists of five stages
Mass spectrometers used in proteome research.
Monoisotopic Mass =
Average Mass = (calculated)
As shown in Figure 1, the monoisotopic mass of this compound is
For a given compound, the monoisotopic mass is the mass of the isotopic peak whose elemental composition consists of the most abundant isotope of each element. The monoisotopic mass can be calculated using the atomic masses of those isotopes.
The average mass is the average of the isotopic masses weighted by the isotopic abundances. The average mass can be calculated using the atomic weights of the elements.
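The two definitions can be compared with a small calculation. This is a minimal sketch: glycine (C2H5NO2) is an illustrative compound chosen here, not the compound in Figure 1, and the element tables are rounded reference values.

```python
# Mass of the most abundant isotope of each element (rounded).
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}
# Standard atomic weights, i.e. abundance-weighted averages (rounded).
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def formula_mass(composition, table):
    """Sum element masses for a composition such as {"C": 2, "H": 5}."""
    return sum(table[element] * count for element, count in composition.items())


# Illustrative compound (an assumption, not taken from the slide): glycine.
glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
mono = formula_mass(glycine, MONOISOTOPIC)  # about 75.032
avg = formula_mass(glycine, AVERAGE)        # about 75.067
print(f"monoisotopic = {mono:.4f}, average = {avg:.3f}")
```

The average mass is slightly higher because it includes the contribution of the heavier, less abundant isotopes such as 13C.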
Electrospray Ionization (ESI)
Multiple charging
– More charges for larger molecules
– MW range > 150 kDa
Liquid introduction of analyte
– Interface with liquid separation methods, e.g. liquid chromatography
– Tandem mass spectrometry (MS/MS) for protein sequencing
[Figure: ESI source producing highly charged droplets that enter the MS; spectrum axis: mass/charge (m/z)]
Origin of the ES Spectra of Peptides
4+: m/z = (Mr + 4H)/4
3+: m/z = (Mr + 3H)/3
2+: m/z = (Mr + 2H)/2
1+: m/z = (Mr + H)
[Figure: ES-MS spectrum, relative intensity vs. m/z, with one peak per charge state]
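The charge-state relation m/z = (Mr + nH)/n can be sketched in a few lines. This sketch assumes H is the proton mass (1.007276 Da). The function names and the 10 kDa example mass are illustrative, and `infer_charge` is a standard rearrangement of the same formula for two adjacent peaks, not something shown on the slide.

```python
PROTON = 1.007276  # assumed mass of the charge carrier H+, in Da


def mz(neutral_mass, charge):
    """m/z of a molecule of mass Mr carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def charge_envelope(neutral_mass, max_charge):
    """Expected m/z for every charge state from 1+ to max_charge+."""
    return {z: mz(neutral_mass, z) for z in range(1, max_charge + 1)}


def infer_charge(mz_high, mz_low):
    """Charge of the higher-m/z peak, given the adjacent peak with one more charge."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


# Hypothetical 10 kDa protein: peaks for 1+ through 5+.
envelope = charge_envelope(10000.0, 5)
for z, value in envelope.items():
    print(f"{z}+ : m/z = {value:.3f}")

# Recover the charge state from two neighbouring peaks.
z = infer_charge(envelope[4], envelope[5])
```

Once the charge is known, the neutral mass follows from Mr = z * (m/z) - z * H.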
Theoretical CID of a Tryptic Peptide
[Figure: CID of the peptide FLGK into b1, b2, b3 and y1, y2, y3 fragment ions; MS/MS spectrum of relative intensity vs. m/z showing parent ions (464.29), daughter ions, and non-dissociated parent ions]
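The b and y fragment masses for FLGK can be worked out from residue masses. The sketch below assumes singly charged fragments and monoisotopic residue masses rounded to five decimals, and the function names are illustrative. The singly protonated peptide mass it computes rounds to the 464.29 quoted for the parent ion.

```python
# Monoisotopic residue masses (Da) for the residues in FLGK, rounded.
RESIDUE = {"F": 147.06841, "L": 113.08406, "G": 57.02146, "K": 128.09496}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Return singly charged b and y ion m/z lists (b1..bn-1, y1..yn-1)."""
    masses = [RESIDUE[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b ions hold the first i residues from the N-terminus.
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ions hold the last i residues plus the C-terminal water.
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


def precursor_mz(peptide, charge=1):
    """m/z of the intact peptide with `charge` protons."""
    neutral = sum(RESIDUE[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


b, y = fragment_ions("FLGK")
parent = precursor_mz("FLGK")
for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {b_mz:.2f}   y{i} = {y_mz:.2f}")
print(f"[M+H]+ = {parent:.2f}")
```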
Peptide Sequencing by LC/MS/MS
Web addresses of some representative internet resources for protein identification from mass spectrometry data
Data Mining through SEQUEST and PAULA

Database                                 Search Time
Yeast ORFs (6,351 entries)               52 sec: sec/s
Non-redundant protein (100k entries)     3500 min:
EST (100K entries, 3-frames)             5-10,000 min:
SEQUEST Algorithm
Step 1. Determine the parent ion molecular mass from the experimental MS/MS spectrum.
Step 2. The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database. The computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ1, 2, 3, 4, …).
Step 3. The experimental spectrum is compared with each theoretical spectrum and correlation scores are assigned.
Step 4. Scores are ranked and protein identifications are made based on these cross-correlation scores.
ZSA-charge assignment
Unified Scoring Function
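The four steps can be mimicked on a toy scale. This sketch is not the SEQUEST implementation: it replaces the cross-correlation score with a simple count of matched fragment peaks, and the 0.5 Da tolerance, the five-peptide database, and all function names are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def theoretical_ions(seq):
    """Singly charged b and y ion m/z values for one candidate sequence."""
    ions = []
    for i in range(1, len(seq)):
        ions.append(sum(RESIDUE[aa] for aa in seq[:i]) + PROTON)          # b_i
        ions.append(sum(RESIDUE[aa] for aa in seq[i:]) + WATER + PROTON)  # y_(n-i)
    return ions


def closest_candidates(parent_mass, database, n=500):
    """Step 2: keep the n peptides whose mass is closest to the parent mass."""
    return sorted(database, key=lambda s: abs(peptide_mass(s) - parent_mass))[:n]


def matched_peaks(observed, seq, tol=0.5):
    """Step 3 stand-in: count theoretical ions with an observed peak within tol."""
    return sum(any(abs(o - t) <= tol for o in observed)
               for t in theoretical_ions(seq))


def rank(observed, parent_mass, database, n=500):
    """Step 4: score every candidate and sort from best to worst."""
    candidates = closest_candidates(parent_mass, database, n)
    return sorted(((matched_peaks(observed, s), s) for s in candidates),
                  reverse=True)


# Toy run: the "experimental" spectrum is the ideal fragment list of FLGK.
database = ["FLGK", "KGLF", "LFGK", "AAAK", "GGGGK"]
observed = theoretical_ions("FLGK")
ranked = rank(observed, peptide_mass("FLGK"), database, n=3)
print(ranked)
```

In this toy run the three isobaric peptides FLGK, KGLF and LFGK pass the mass filter. The fragment comparison then separates them, which is why step 3 is needed even after step 2 has narrowed the list.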
Amplification of False Positive Error Rate from Peptide to Protein Level
[Figure: peptides marked correct (+) map to proteins in the sample, Prot A (Peptide 1, Peptide 2) and Prot B (Peptide 3, Peptide 4, Peptide 5), which are enriched for 'multi-hit' proteins; the remaining peptides (Peptide 6, Peptide 7, Peptide 8, Peptide 9, ...) map to proteins not in the sample, which are enriched for 'single hits']
Peptide Level: 50% False Positives
Protein Level: 71% False Positives
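The 50% and 71% figures can be reproduced with a short count. The sketch assumes one reading of the figure: five correct peptides grouped into Prot A and Prot B, plus five incorrect peptides that each point to a different protein. The names Prot C to Prot G and "Peptide 10" are hypothetical labels added for this example.

```python
# (peptide, protein it maps to, whether the peptide identification is correct)
hits = [
    ("Peptide 1", "Prot A", True),
    ("Peptide 2", "Prot A", True),
    ("Peptide 3", "Prot B", True),
    ("Peptide 4", "Prot B", True),
    ("Peptide 5", "Prot B", True),
    ("Peptide 6", "Prot C", False),   # hypothetical single-hit proteins
    ("Peptide 7", "Prot D", False),
    ("Peptide 8", "Prot E", False),
    ("Peptide 9", "Prot F", False),
    ("Peptide 10", "Prot G", False),
]


def false_positive_rates(hits):
    """Return (peptide-level, protein-level) false positive fractions."""
    peptide_fp = sum(1 for _, _, correct in hits if not correct)
    # A protein counts as correct if at least one of its peptides is correct.
    protein_correct = {}
    for _, protein, correct in hits:
        protein_correct[protein] = protein_correct.get(protein, False) or correct
    protein_fp = sum(1 for ok in protein_correct.values() if not ok)
    return peptide_fp / len(hits), protein_fp / len(protein_correct)


peptide_rate, protein_rate = false_positive_rates(hits)
print(f"peptide level: {peptide_rate:.0%}, protein level: {protein_rate:.0%}")
```

Correct peptides cluster on a few proteins while wrong peptides scatter, so 5 of 10 peptides being wrong becomes 5 of 7 proteins being wrong.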
Quantitative Mass Spec Analysis
1. Relative Quantitation
a. ICAT: Isotope-Coded Affinity Tags
b. Digestion with Oxygen-18 Water
c. Spectral Counting and Non-labeling Methodology
2. Absolute Quantitation
Alkylation of Cysteine Residue
Cysteine: C3H5NOS
Carboxymethyl Cys: C5H7NO3S
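The mass added by alkylation follows from the two formulas. The sketch below parses each formula and subtracts the monoisotopic masses. The small parser and the rounded element masses are illustrative assumptions, and the parser only handles flat formulas without parentheses.

```python
import re

# Monoisotopic element masses (Da), rounded.
MONO = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
        "O": 15.994915, "S": 31.972071}


def parse_formula(formula):
    """Turn 'C3H5NOS' into {'C': 3, 'H': 5, 'N': 1, 'O': 1, 'S': 1}."""
    composition = {}
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        composition[element] = composition.get(element, 0) + int(count or 1)
    return composition


def monoisotopic_mass(formula):
    """Monoisotopic mass of a flat elemental formula."""
    return sum(MONO[el] * n for el, n in parse_formula(formula).items())


cys = monoisotopic_mass("C3H5NOS")      # cysteine residue
cm_cys = monoisotopic_mass("C5H7NO3S")  # carboxymethyl cysteine residue
delta = cm_cys - cys                    # mass added by carboxymethylation
print(f"Cys = {cys:.4f}, carboxymethyl Cys = {cm_cys:.4f}, shift = {delta:.4f}")
```

The difference corresponds to a net gain of C2H2O2, about 58.005 Da, which is the value a search engine adds to every cysteine when Carboxymethyl (C) is set as a fixed modification.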
Mascot Example Slides ICAT
Trypsin Digestion with Oxygen-18 and Oxygen-16 Water
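Digestion in 18O water makes a labeled peptide heavier than its 16O counterpart by a fixed amount. This minimal sketch assumes that up to two 18O atoms are incorporated at the peptide C-terminus, which is the usual outcome of trypsin-catalyzed labeling. The function names are illustrative.

```python
O16 = 15.994915  # monoisotopic mass of oxygen-16 (Da)
O18 = 17.999161  # mass of oxygen-18 (Da)


def label_shift(n_18o):
    """Mass increase (Da) after replacing n_18o oxygen-16 atoms with oxygen-18."""
    return n_18o * (O18 - O16)


def mz_spacing(n_18o, charge):
    """Observed m/z gap between the 16O and 18O forms at a given charge."""
    return label_shift(n_18o) / charge


for n in (1, 2):
    print(f"{n} x 18O: +{label_shift(n):.4f} Da, "
          f"{mz_spacing(n, 2):.4f} m/z apart at 2+")
```

A fully labeled peptide therefore sits about 4 Da above the unlabeled one, and the pair appears about 2 m/z units apart for a doubly charged ion.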
Absolute Quantification
Johri et al. Nature Reviews Microbiology 4, 932-942 (December 2006) | doi: / nrmicro1552
Public Web Server
Class Data Download:
Local Web Server
Username: GPLS716
Password: GPLS716
MS1 PMF (peptide mass fingerprinting) Search Example
Data: testms1.txt, 210 MS1 peaks
Database: bovine
Fixed modifications: Carboxymethyl (C)
Variable modifications: Oxidation (M)
Peptide Tolerance: 0.1 Da
Monoisotopic mass
Mass Value: Mr
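The idea behind a PMF search with these settings can be sketched as follows: digest a protein in silico, add the fixed carboxymethyl mass to every cysteine, and match observed Mr values within 0.1 Da. This is a simplified illustration, not how Mascot scores a search. It ignores missed cleavages and the variable oxidation, and the protein sequence and peak list are made up for the example.

```python
# Monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
FIXED_MODS = {"C": 58.00548}  # Carboxymethyl (C)


def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mr(seq):
    """Neutral monoisotopic mass (Mr) including fixed modifications."""
    return sum(RESIDUE[aa] + FIXED_MODS.get(aa, 0.0) for aa in seq) + WATER


def match_peaks(observed_mr, peptides, tol=0.1):
    """Map each observed Mr to the peptides within the mass tolerance."""
    return {obs: [p for p in peptides if abs(peptide_mr(p) - obs) <= tol]
            for obs in observed_mr}


# Hypothetical protein and peak list, for illustration only.
peptides = tryptic_digest("FLGKMCRPEK")
matches = match_peaks([463.28, 500.00], peptides)
print(peptides)
print(matches)
```

A real search repeats this for every protein in the database and ranks proteins by how many of the observed peaks they explain.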
Quantification Search Example
Data: 18O_BSA_100fmol_1to5_01_ RAW.mgf
Database: bovine
Fixed modifications: Carbamidomethyl (C)
Peptide Tolerance: 8 Da (required for O18 labeling)
Fragment Tolerance: 0.2 Da
Quantification Method: 18O corrected multiplex
MS/MS Database Search Example
Data: BSA onespectra.mgf (one spectrum)
Database: bovine
Fixed modifications: Carboxymethyl (C)
Variable modifications: Oxidation (M)
Peptide Mass Tolerance: 0.1 Da
Fragment Mass Tolerance: 0.1 Da
tation_help.html
MS2 mixture example
Data: mixture10spectra.mgf
Database: yeast
Fixed modifications: Carbamidomethyl (C+57.02)
Variable modifications: Oxidation (M)
Peptide Mass Tolerance: 0.1 Da
Fragment Mass Tolerance: 0.1 Da
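Fixed and variable modifications affect candidate peptide masses differently. In this hedged sketch the fixed Carbamidomethyl mass is always added to cysteine, while each methionine may or may not carry an oxidation. The delta masses are rounded monoisotopic values, and the peptide "MCK" and the function names are made up for the example.

```python
# Monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
CARBAMIDOMETHYL = 57.02146  # fixed, applied to every C
OXIDATION = 15.99491        # variable, applied to zero or more M


def candidate_masses(seq):
    """All distinct neutral masses a peptide can have under these settings."""
    base = (sum(RESIDUE[aa] for aa in seq) + WATER
            + seq.count("C") * CARBAMIDOMETHYL)
    # Oxidizing any k of the methionines gives the same mass, so one
    # value per k is enough when only the precursor mass is compared.
    return [base + k * OXIDATION for k in range(seq.count("M") + 1)]


def within_tolerance(observed, theoretical, tol=0.1):
    """Peptide mass tolerance check, in Da."""
    return abs(observed - theoretical) <= tol


masses = candidate_masses("MCK")
print([round(m, 4) for m in masses])
print([within_tolerance(453.17, m) for m in masses])
```

Each variable modification increases the number of candidate masses per peptide. That enlarges the search space, whereas a fixed modification does not.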
Homework
1. You will have to download your datasets from the following url:
a. Identification of phosphorylation site:
Data: BIG RAW.mgf
Recommended parameters: Database: human. Variable modification: Phospho (ST). Fixed modification: Carbamidomethyl (C).
b. Quantification of oxygen-18/oxygen-16 digested BSA
Data: 18O_BSA_500fmol_1to5_ RAW.mgf
Submit your search results in pdf or html format to the following address:
Please include the following information when you submit your homework:
1. Your name and ID in the subject of your email
2. Search parameters
3. A short summary of your search results
Questions: Contact Yunhu Wan, Phone number: