Lecture 2.31 Mass Spectrometry: Applications to Proteomics David Wishart University of Alberta Edmonton, AB david.wishart@ualberta.ca

Lecture 2.32 MS Proteomics Applications: protein identification/confirmation; protein sample purity determination; detection of post-translational modifications; detection of amino acid substitutions; de novo peptide sequencing; determination of disulfide bonds (number and status); monitoring protein folding (H/D exchange); monitoring protein-ligand complexes/structures; 3D structure determination

Lecture 2.33 Disulfide Mapping: A (Oxidized), B (Reduced); trypsin digest

Lecture 2.34 Disulfide Mapping of IL13: RP-HPLC of tryptic digest, A) before DTT, B) after DTT

Lecture 2.35 Disulfide Mapping of IL13: Tryptic Digest + V8 Digest. J. Mass Spectrom. 35(3): 446-453

Lecture 2.36 Hydrogen Exchange & MS To MS Protein Sample

Lecture 2.37 Hydrogen Exchange

Lecture 2.38 Hydrogen Exchange & MS

Lecture 2.39 Hydrogen Exchange & MS: As deuterium (MW=2) is added to the peptide/protein, the isotopic abundances change and the masses increase. Peak intensities change and shift right. The average mass increases due to the addition of deuterium.
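As a rough illustration of the mass shift described above: each hydrogen replaced by deuterium adds about 1.006 Da, the difference between the two atomic masses. The sketch below uses standard atomic masses (not given on the slides), and the peptide mass and number of exchanged hydrogens are hypothetical values chosen only for the arithmetic.

```python
# Monoisotopic atomic masses in Da (standard values, not taken from the slides).
MASS_H = 1.007825
MASS_D = 2.014102

def mass_after_exchange(mass, n_exchanged):
    """Return the mass after n_exchanged hydrogens are replaced by deuterium."""
    return mass + n_exchanged * (MASS_D - MASS_H)

# Hypothetical example: a 1000.00 Da peptide with 5 exchanged amide hydrogens.
shifted = mass_after_exchange(1000.00, 5)
print(f"{shifted:.3f}")  # 1005.031
```

In a real experiment the number of exchanged sites is what is being measured, so the calculation is usually run in reverse: the observed shift divided by about 1.006 Da estimates how many hydrogens exchanged.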

Lecture 2.310 HMS-PCI & Protein Ligands: Ho Y, Gruhler A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415:180-183 (2002). High Throughput Mass Spectral Protein Complex Identification (HMS-PCI): 10% of yeast proteins used as “bait”; 3617 associated proteins identified; 3-fold higher sensitivity than yeast 2-hybrid

Lecture 2.311 Protein-Ligand Mapping

Lecture 2.312 3D Protein Structure Determination by Mass Spectrometry Shan Sundararaj University of Alberta
SEQ: rpdfcleppytgpckariiryfynakaglc
2ND: ccchhccccccccccccsssssssccccss
S-S: 1 2 3
TYR: b e e
HDE: FFFSSSSFSSSFSSFFFSSFFFFSSSSSSS

Lecture 2.313 How to do it? Use sequence & H-exchange data to identify secondary structure. Use chemical modification and H-exchange to ID interior/exterior residues. Use chemical cross-linking or disulfide connectivity to ID pairwise distances. Use additional constraints derived from bioinformatics predictions. Assemble structure using DG or GA.

Lecture 2.314 Tools of the Trade: D2O; DTT; N-bromosuccinimide; iodoacetate; tetranitromethane; EDAC; DSS; trypsin

Lecture 2.315

Lecture 2.316 Proteins Under Study: BPTI (58 aa); thioredoxin (108 aa); lysozyme (129 aa); ubiquitin (76 aa)

Lecture 2.317 MS Proteomics Applications: protein identification/confirmation; protein sample purity determination; detection of post-translational modifications; detection of amino acid substitutions; determination of disulfide bonds (number and status); de novo peptide sequencing; monitoring protein folding (H/D exchange); monitoring protein-ligand complexes/structures; 3D structure determination

Lecture 2.318 Protein Identification: 2D-GE + MALDI-MS (peptide mass fingerprinting, PMF); 2D-GE + MS-MS (MS peptide sequencing/fragment ion searching); multidimensional LC + MS-MS (ICAT methods, i.e. isotope labelling, and MudPIT, Multidimensional Protein Identification Technology); 1D-GE + LC + MS-MS; de novo peptide sequencing

Lecture 2.319 2D-GE + MALDI (PMF) Trx p53 G6PDH Trypsin + Gel punch

Lecture 2.320 2D-GE + MS-MS p53 Trypsin + Gel punch

Lecture 2.321 MudPIT Trypsin + proteins p53 IEX-HPLC RP-HPLC

Lecture 2.322 ICAT (Isotope Coded Affinity Tag)

Lecture 2.323 The ICAT Reagent

Lecture 2.324 ICAT Quantitation

Lecture 2.325 Peptide Mass Fingerprinting (PMF)

Lecture 2.326 Peptide Mass Fingerprinting: Used to identify protein spots on gels or protein peaks from an HPLC run. Depends on the fact that if a protein is cut up or fragmented in a known way, the resulting fragments (and their masses) are unique enough to identify the protein. Requires a database of known sequences. Uses software to compare observed masses with masses calculated from the database.

Lecture 2.327 Principles of Fingerprinting
Sequences:
>Protein 1
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
>Protein 2
acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
>Protein 3
acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Mass (M+H): 4842.05 for each sequence
Tryptic fragments:
Protein 1: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
Protein 2: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
Protein 3: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe

Lecture 2.328 Principles of Fingerprinting
Sequences:
>Protein 1
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
>Protein 2
acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
>Protein 3
acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Mass (M+H): 4842.05 for each sequence
Mass spectrum: [figure]

Lecture 2.329 Predicting Peptide Cleavages http://ca.expasy.org/tools/peptidecutter/

Lecture 2.330 http://ca.expasy.org/tools/peptidecutter/peptidecutter_enzymes.html#Tryps

Lecture 2.331 Protease Cleavage Rules
Trypsin         XXX[KR]--[!P]XXX
Chymotrypsin    XX[FYW]--[!P]XXX
Lys C           XXXXXK--XXXXX
Asp N endo      XXXXX--DXXXXX
CNBr            XXXXXM--XXXXX
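The trypsin rule above (cut after K or R unless the next residue is P) can be sketched with a regular expression. This is a minimal illustration, not the code behind PeptideCutter; the function name `tryptic_digest` is made up for the example, and zero-width splitting requires Python 3.7 or later.

```python
import re

def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    seq = sequence.upper()
    # Zero-width split point: preceded by K or R, not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", seq)
    # A sequence ending in K or R leaves an empty trailing piece; drop it.
    return [p for p in pieces if p]

# Protein 1 from the fingerprinting slides.
print(tryptic_digest("acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"))
# ['ACEDFHSAK', 'DFQEASDFPK', 'IVTMEEEWENDADNFEK', 'QWFE']
```

The other rules in the table follow the same pattern with a different character class, for example `(?<=[FYW])(?!P)` for chymotrypsin.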

Lecture 2.332 Why Trypsin? Robust, stable enzyme; works over a range of pH values and temperatures; quite specific and consistent in cleavage; cuts frequently to produce “ideal” MW peptides; inexpensive, easily available/purified; does produce “autolysis” peaks (which can be used in MS calibrations): 1045.56, 1106.03, 1126.03, 1940.94, 2211.10, 2225.12, 2283.18, 2299.18

Lecture 2.333 Calculating Peptide Masses: Sum the monoisotopic residue masses. Add the mass of H2O (18.01056). Add the mass of H+ (1.00728) to get M+H. If Met is oxidized, add 15.99491. If Cys has an acrylamide adduct, add 71.0371. If Cys is iodoacetylated (carboxymethylated), add 58.0055. Only consider peptides with masses > 400. Other modifications are listed at http://prowl.rockefeller.edu/aainfo/deltamassv2.html

Lecture 2.334 Masses in MS: Monoisotopic mass is the mass determined using the masses of the most abundant isotopes. Average mass is the abundance-weighted mass of all isotopic components.

Lecture 2.335 Amino Acid Residue Masses (Monoisotopic Mass)
Glycine          57.02147
Alanine          71.03712
Serine           87.03203
Proline          97.05277
Valine           99.06842
Threonine       101.04768
Cysteine        103.00919
Isoleucine      113.08407
Leucine         113.08407
Asparagine      114.04293
Aspartic acid   115.02695
Glutamine       128.05858
Lysine          128.09497
Glutamic acid   129.04264
Methionine      131.04049
Histidine       137.05891
Phenylalanine   147.06842
Arginine        156.10112
Tyrosine        163.06333
Tryptophan      186.07932
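The mass recipe (sum the residue masses, add water, add a proton) can be sketched as follows, using the monoisotopic table above. The proton mass of 1.00728 Da is an assumption of this sketch: it is the value that reproduces the M+H figures quoted on the fingerprinting slides. The function name and the optional `delta` argument for modifications are illustrative.

```python
# Monoisotopic residue masses (Da), copied from the table above.
MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056   # H2O, from the lecture
PROTON = 1.00728   # H+, assumed value (see the note above)

def mh_mass(peptide, delta=0.0):
    """Return the singly protonated (M+H) monoisotopic mass of a peptide.

    delta is an optional total modification mass, e.g. 15.99491 for one
    oxidized methionine.
    """
    residues = sum(MONO[aa] for aa in peptide.upper())
    return residues + WATER + PROTON + delta

print(f"{mh_mass('acedfhsak'):.3f}")  # close to the 1007.4251 quoted for this fragment
print(f"{mh_mass('qwfe'):.3f}")       # close to the 609.2667 quoted for this fragment
```

Swapping `MONO` for the average-mass table gives average M+H values instead; the two should not be mixed within one calculation.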

Lecture 2.336 Amino Acid Residue Masses (Average Mass)
Glycine          57.0520
Alanine          71.0788
Serine           87.0782
Proline          97.1167
Valine           99.1326
Threonine       101.1051
Cysteine        103.1448
Isoleucine      113.1595
Leucine         113.1595
Asparagine      114.1039
Aspartic acid   115.0886
Glutamine       128.1308
Lysine          128.1742
Glutamic acid   129.1155
Methionine      131.1986
Histidine       137.1412
Phenylalanine   147.1766
Arginine        156.1876
Tyrosine        163.1760
Tryptophan      186.2133

Lecture 2.337 Preparing a Peptide Mass Fingerprint Database: Take a protein sequence database (Swiss-Prot or nr-GenBank). Determine cleavage sites and identify the resulting peptides for each protein entry. Calculate the mass (M+H) for each peptide. Sort the masses from lowest to highest. Keep a pointer from each calculated mass to the protein accession number in the databank.

Lecture 2.338 Building A PMF Database
Sequence DB:
>P12345
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
>P21234
acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
>P89212
acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Calc. Tryptic Frags:
P12345: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
P21234: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
P89212: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
Mass List:
 450.2017 (P21234)
 609.2667 (P12345)
 664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
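The database-building steps (digest each entry, compute M+H for each peptide, sort, keep a pointer to the accession) might look like the sketch below. It is a toy illustration over the three example sequences, not how any production search engine stores its index. The helper names, the 1.00728 Da proton mass, and the use of the 400 Da minimum as a filter are assumptions of the sketch, and computed values can differ from the slide's list in the last decimal places.

```python
import re

# Monoisotopic residue masses (Da) from the lecture's table.
MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER, PROTON = 18.01056, 1.00728  # proton mass is an assumed value

# Example entries from the slide.
PROTEINS = {
    "P12345": "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe",
    "P21234": "acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe",
    "P89212": "acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe",
}

def tryptic_digest(sequence):
    """Cut after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence.upper()) if p]

def mh_mass(peptide):
    """Monoisotopic M+H mass of a peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON

def build_pmf_database(proteins, min_mass=400.0):
    """Return (mass, accession, peptide) tuples sorted by mass."""
    entries = []
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            mass = mh_mass(peptide)
            if mass > min_mass:
                entries.append((mass, accession, peptide))
    return sorted(entries)

for mass, accession, peptide in build_pmf_database(PROTEINS):
    print(f"{mass:10.4f}  {accession}  {peptide}")
```

Keeping the list sorted by mass means a lookup for a query mass can use binary search (for instance with the standard `bisect` module) rather than scanning every entry.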

Lecture 2.339 The Fingerprint (PMF) Algorithm: Take a mass spectrum of a trypsin-cleaved protein (from gel or HPLC peak). Identify as many masses as possible in the spectrum (avoid autolysis peaks). Compare query masses with database masses and calculate the number of matches or a matching score (based on length and mass difference). Rank hits and return the top-scoring entry; this is the protein of interest.

Lecture 2.340 Query (MALDI) Spectrum [Figure: MALDI spectrum, mass axis from 500 to 2500; labeled peaks at 450, 609, 698, 1007, 1199 and 2098, plus peaks at 1940 and 2211 marked "(trp)"]

Lecture 2.341 Query vs. Database
Query Masses: 450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909
Database Mass List:
 450.2017 (P21234)
 609.2667 (P12345)
 664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
Results: 2 unknown masses; 1 hit on P21234; 3 hits on P12345. Conclude the query protein is P12345.
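The comparison above can be reproduced with a simple match counter. This sketch scores by number of hits only, the simplest of the options the algorithm slide mentions. The 0.2 Da absolute tolerance is an assumption chosen because the example query masses differ from the database masses by up to about 0.11 Da; the function and variable names are illustrative.

```python
from collections import Counter

# Database mass list and query masses, copied from the slide.
DATABASE = [
    (450.2017, "P21234"), (609.2667, "P12345"), (664.3300, "P89212"),
    (1007.4251, "P12345"), (1114.4416, "P89212"), (1183.5266, "P12345"),
    (1300.5116, "P21234"), (1407.6462, "P21234"), (1526.6211, "P89212"),
    (1593.7101, "P89212"), (1740.7501, "P21234"), (2098.8909, "P12345"),
]
QUERY = [450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909]

def match_masses(query, database, tol_da=0.2):
    """Count database hits per accession; also return unmatched query masses."""
    hits = Counter()
    unmatched = []
    for q in query:
        matched = [acc for mass, acc in database if abs(mass - q) <= tol_da]
        if matched:
            hits.update(matched)
        else:
            unmatched.append(q)
    return hits, unmatched

hits, unmatched = match_masses(QUERY, DATABASE)
best = hits.most_common(1)[0][0]
print(dict(hits))   # {'P21234': 1, 'P12345': 3}
print(unmatched)    # [698.31, 1199.4916]
print(best)         # P12345
```

A plain hit count treats every match as equally informative; probability-based scores weight matches by how likely they are to occur by chance.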

Lecture 2.342 What You Need To Do PMF: a list of query masses (as many as possible); protease(s) or cleavage reagents used; databases to search (SWProt, Organism); estimated mass and pI of protein spot (optional); cysteine (or other) modifications; minimum number of hits for significance; mass tolerance (100 ppm = 1000.0 ± 0.1 Da); a PMF website (Prowl, ProFound, Mascot, etc.)

Lecture 2.343 PMF on the Web
ProFound: http://129.85.19.192/profound_bin/WebProFound.exe
MOWSE: http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
PeptideSearch: http://www.narrador.embl-heidelberg.de/GroupPages/Homepage.html
Mascot: www.matrixscience.com
PeptIdent: http://us.expasy.org/tools/peptident.html
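The mass tolerance entry is a relative quantity: a ppm value translates into a different absolute window at each mass. The conversion can be sketched as follows; the function names are illustrative.

```python
def ppm_to_da(mass, ppm):
    """Convert a relative tolerance in ppm to an absolute window in Da."""
    return mass * ppm / 1e6

def within_tolerance(observed, theoretical, ppm):
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= ppm_to_da(theoretical, ppm)

print(ppm_to_da(1000.0, 100))                 # 0.1, as in 1000.0 +/- 0.1 Da
print(within_tolerance(1000.05, 1000.0, 100))  # True
print(within_tolerance(1000.20, 1000.0, 100))  # False
```

Because the window scales with mass, 100 ppm allows about 0.2 Da at mass 2000 but only about 0.05 Da at mass 500.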

Lecture 2.344 ProFound

Lecture 2.345 ProFound (PMF)

Lecture 2.346 What Are Missed Cleavages?
Sequence:
>Protein 1
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Tryptic fragments (no missed cleavage):
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
Tryptic fragments (1 missed cleavage):
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
acedfhsakdfqeasdfpk (2171.9338)
ivtmeeewendadnfekqwfe (2689.1398)
dfqeasdfpkivtmeeewendadnfek (3263.3997)
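One way to enumerate peptides with missed cleavages is to join runs of adjacent fully cleaved fragments. The sketch below assumes the same trypsin rule as the cleavage-rule slide; the function names and the `max_missed` parameter are illustrative.

```python
import re

def tryptic_digest(sequence):
    """Cut after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence.upper()) if p]

def with_missed_cleavages(sequence, max_missed=1):
    """Return all peptides containing 0..max_missed uncut cleavage sites."""
    fragments = tryptic_digest(sequence)
    peptides = []
    for missed in range(max_missed + 1):
        # A peptide with `missed` uncut sites spans missed + 1 fragments.
        for start in range(len(fragments) - missed):
            peptides.append("".join(fragments[start:start + missed + 1]))
    return peptides

for peptide in with_missed_cleavages("acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"):
    print(peptide)
```

For Protein 1 this yields the four fully cleaved fragments followed by the three joined pairs listed above. Allowing more missed cleavages enlarges the candidate list quickly, which raises the chance of random matches.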

Lecture 2.347 ProFound Results

Lecture 2.348 MOWSE

Lecture 2.349 PeptIdent

Lecture 2.350 MASCOT

Lecture 2.351 MASCOT

Lecture 2.352 Mascot Scoring: The statistics of peptide fragment matching in MS (or PMF) are very similar to the statistics used in BLAST. The scoring probability follows an extreme value distribution. High scoring segment pairs (in BLAST) are analogous to high scoring mass matches in Mascot. Mascot scoring is much more robust than arbitrary match cutoffs (like % ID).

Lecture 2.353 Extreme Value Distribution $P(x) = 1 - e^{-e^{-x}}$

Lecture 2.354 Extending HSP’s [Figure: cumulative score vs. extension (# aa), with X, T and S marked] Number of HSP’s found purely by chance: $E = kNe^{-\lambda s}$

Lecture 2.355 Mascot/Mowse Scoring: The Mascot score is given as S = -10*log10(P), where P is the probability that the observed match is a random event. Aim for P < 0.05 (less than a 5% chance that the peptide mass match is random). Mascot scores greater than 72 are significant (p < 0.05).

Lecture 2.356 Advantages of PMF: uses a “robust” and inexpensive form of MS (MALDI); doesn’t require too much sample optimization; can be done by a moderately skilled operator (no need to be an MS expert); widely supported by web servers; improves as databases get larger and instrumentation gets better; very amenable to high-throughput robotics (up to 500 samples a day)
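The score formula is a plain base-10 logarithmic transform, which the sketch below evaluates in both directions. It only illustrates the arithmetic; it does not reproduce how Mascot derives P, and the function names are made up for the example.

```python
import math

def mascot_score(p):
    """Score for a match with random-match probability p: S = -10*log10(p)."""
    return -10.0 * math.log10(p)

def probability_from_score(score):
    """Inverse transform: the probability corresponding to a given score."""
    return 10.0 ** (-score / 10.0)

print(f"{mascot_score(0.05):.2f}")          # 13.01
print(f"{probability_from_score(72):.2e}")  # 6.31e-08
```

Taken on its own, P = 0.05 corresponds to a score of about 13, so the threshold of 72 quoted on the slide presumably reflects additional factors, such as the size of the database searched, that this sketch does not model.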

Lecture 2.357 High Throughput PMF Trx p53 G6PDH Trypsin + Gel punch

Lecture 2.358 2D-GE + MALDI (Manual)

Lecture 2.359 Robotic Gel Cutter

Lecture 2.360 HT Proteome Mapping

Lecture 2.361 Hi Throughput PMF (in gel tryptic digestion)

Lecture 2.362 Automated MALDI Processing

Lecture 2.363 HT Spotting on a MALDI Plate

Lecture 2.364 Limitations With PMF: Requires that the protein of interest already be in a sequence database. Spurious or missing critical mass peaks always lead to problems. Mass resolution/accuracy is critical; it is best to have <20 ppm mass accuracy. Generally found to be only about 40% effective in positively identifying gel spots.

Lecture 2.365 Can We Do Better? 2D-GE + MALDI-MS (peptide mass fingerprinting, PMF); 2D-GE + MS-MS (sequence tag/fragment ion searching); multidimensional LC + MS-MS (ICAT methods, i.e. isotope labelling, and MudPIT methods); 1D-GE + LC + MS-MS; de novo peptide sequencing (MS-MS)

Lecture 2.366 MS-MS & Proteomics

Lecture 2.367 MS-MS & Proteomics
Advantages: provides precise sequence-specific data; more informative than PMF methods (>90%); can be used for de novo sequencing (not entirely dependent on databases); can be used to ID post-translational modifications
Disadvantages: requires more handling, refinement and sample manipulation; requires more expensive and complicated equipment; requires high-level expertise; slower, not generally high throughput

Lecture 2.368 Conclusions: There are two main approaches to applying MS to protein identification: 1) peptide mass fingerprinting (PMF) and 2) sequence tagging (sequencing). Both depend on bioinformatics and sequence databases to succeed. Understanding the applications and limitations of MS in proteomics will help in understanding and meeting the growing bioinformatics needs in proteomics.
