Published by Baldwin Hancock. Modified over 8 years ago.
1
Lecture 2.3
Mass Spectrometry: Applications to Proteomics
David Wishart
University of Alberta, Edmonton, AB
david.wishart@ualberta.ca
2
MS Proteomics Applications
- Protein identification/confirmation
- Protein sample purity determination
- Detection of post-translational modifications
- Detection of amino acid substitutions
- De novo peptide sequencing
- Determination of disulfide bonds (number and status)
- Monitoring protein folding (H/D exchange)
- Monitoring protein-ligand complexes/structure
- 3D structure determination
3
Disulfide Mapping
[Figure: trypsin digest of (A) oxidized and (B) reduced protein]
4
Disulfide Mapping of IL13
[Figure: RP-HPLC of tryptic digest, (A) before DTT and (B) after DTT]
5
Disulfide Mapping of IL13
[Figure: tryptic digest + V8 digest]
Source: J. Mass Spectrom. 35(3): 446-453
6
Hydrogen Exchange & MS
[Figure labels: Protein Sample; To MS]
7
Hydrogen Exchange
8
Hydrogen Exchange & MS
9
Hydrogen Exchange & MS
- As deuterium (mass ≈ 2 Da) is added to the peptide/protein, the isotopic abundances change and the masses increase
- Peak intensities change and the peaks shift right (to higher m/z)
- Average mass increases due to the addition of deuterium
10
HMS-PCI & Protein Ligands
Ho Y, Gruhler A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415:180-183 (2002)
- High Throughput Mass Spectral Protein Complex Identification (HMS-PCI)
- 10% of yeast proteins used as “bait”
- 3617 associated proteins identified
- 3-fold higher sensitivity than yeast 2-hybrid
11
Protein-Ligand Mapping
12
3D Protein Structure Determination by Mass Spectrometry
Shan Sundararaj, University of Alberta
SEQ: rpdfcleppytgpckariiryfynakaglc
2ND: ccchhccccccccccccsssssssccccss
S-S:     1        2               3
TYR:          b          e e
HDE: FFFSSSSFSSSFSSFFFSSFFFFSSSSSSS
13
How to do it?
- Use sequence & H-exchange data to identify secondary structure
- Use chemical modification and H-exchange to identify interior/exterior residues
- Use chemical cross-linking or disulfide connectivity to identify pairwise distances
- Use additional constraints derived from bioinformatics predictions
- Assemble the structure using distance geometry (DG) or a genetic algorithm (GA)
14
Tools of the Trade
- DTT
- N-bromosuccinimide
- D2O
- Iodoacetate
- Tetranitromethane
- EDAC
- DSS
- Trypsin
16
Proteins Under Study
- BPTI (58 aa)
- Thioredoxin (108 aa)
- Lysozyme (129 aa)
- Ubiquitin (76 aa)
17
MS Proteomics Applications
- Protein identification/confirmation
- Protein sample purity determination
- Detection of post-translational modifications
- Detection of amino acid substitutions
- Determination of disulfide bonds (number and status)
- De novo peptide sequencing
- Monitoring protein folding (H/D exchange)
- Monitoring protein-ligand complexes/structure
- 3D structure determination
18
Protein Identification
- 2D-GE + MALDI-MS
  – Peptide Mass Fingerprinting (PMF)
- 2D-GE + MS-MS
  – MS Peptide Sequencing/Fragment Ion Searching
- Multidimensional LC + MS-MS
  – ICAT Methods (isotope labelling)
  – MudPIT (Multidimensional Protein Identification Technology)
- 1D-GE + LC + MS-MS
- De Novo Peptide Sequencing
19
2D-GE + MALDI (PMF)
[Figure labels: Trypsin + gel punch; Trx, p53, G6PDH]
20
2D-GE + MS-MS
[Figure labels: Trypsin + gel punch; p53]
21
MudPIT
[Figure labels: Trypsin + proteins; IEX-HPLC, RP-HPLC; p53]
22
ICAT (Isotope Coded Affinity Tag)
23
The ICAT Reagent
24
ICAT Quantitation
25
Peptide Mass Fingerprinting (PMF)
26
Peptide Mass Fingerprinting
- Used to identify protein spots on gels or protein peaks from an HPLC run
- Depends on the fact that if a protein is cut up or fragmented in a known way, the resulting fragments (and their masses) are unique enough to identify the protein
- Requires a database of known sequences
- Uses software to compare observed masses with masses calculated from the database
27
Principles of Fingerprinting
Sequence                                              Mass (M+H)  Tryptic fragments
>Protein 1  acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe  4842.05     acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
>Protein 2  acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe  4842.05     acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
>Protein 3  acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe  4842.05     acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
28
Principles of Fingerprinting
Sequence                                              Mass (M+H)
>Protein 1  acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe  4842.05
>Protein 2  acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe  4842.05
>Protein 3  acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe  4842.05
[Figure: mass spectrum]
29
Predicting Peptide Cleavages
http://ca.expasy.org/tools/peptidecutter/
30
http://ca.expasy.org/tools/peptidecutter/peptidecutter_enzymes.html#Tryps
31
Protease Cleavage Rules
Protease       Cleavage site ("--" marks the cut)
Trypsin        XXX[KR]--[!P]XXX
Chymotrypsin   XX[FYW]--[!P]XXX
Lys C          XXXXXK--XXXXX
Asp N endo     XXXXX--DXXXXX
CNBr           XXXXXM--XXXXX
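The trypsin rule in the table (cut after K or R unless the next residue is P) can be sketched in a few lines. This is a minimal illustration, not the PeptideCutter implementation; the function name `tryptic_digest` is hypothetical, and real tools apply further exceptions to this rule.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Cut after K or R unless the next residue is P (simplified trypsin rule)."""
    seq = sequence.replace(" ", "").upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing peptide that does not end in K or R (the protein C-terminus).
        peptides.append(seq[start:])
    return peptides


# "Protein 1" from the fingerprinting slides.
print(tryptic_digest("acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe"))
```

Applied to Protein 1 from the fingerprinting slides, this yields four peptides; a K or R followed by P (for example in "AKPGR") is left uncut.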
32
Why Trypsin?
- Robust, stable enzyme
- Works over a range of pH values and temperatures
- Quite specific and consistent in cleavage
- Cuts frequently to produce “ideal” MW peptides
- Inexpensive, easily available/purified
- Does produce “autolysis” peaks (which can be used in MS calibrations)
  – 1045.56, 1106.03, 1126.03, 1940.94, 2211.10, 2225.12, 2283.18, 2299.18
33
Calculating Peptide Masses
- Sum the monoisotopic residue masses
- Add the mass of H2O (18.01056)
- Add the mass of H+ (1.00728) to get M+H
- If Met is oxidized, add 15.99491
- If Cys has an acrylamide adduct, add 71.0371
- If Cys is carboxymethylated with iodoacetate, add 58.0055
- Other modifications are listed at
  – http://prowl.rockefeller.edu/aainfo/deltamassv2.html
- Only consider peptides with masses > 400
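The recipe above can be written out as a short calculation. This is a sketch under stated assumptions: the residue masses are the monoisotopic values tabulated in this lecture, the proton mass is taken as 1.00728 Da (the value that reproduces the lecture's example mass lists), and the names `RESIDUE_MASS` and `peptide_mh` are hypothetical.

```python
# Monoisotopic residue masses (Da), as tabulated in this lecture.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056          # H2O for the free N- and C-termini
PROTON = 1.00728          # H+ (assumed value, see lead-in)
MET_OXIDATION = 15.99491  # added per oxidized Met


def peptide_mh(peptide: str, oxidized_met: int = 0) -> float:
    """Return the monoisotopic M+H of a peptide given in one-letter code."""
    residues = sum(RESIDUE_MASS[aa] for aa in peptide.upper())
    return residues + WATER + PROTON + oxidized_met * MET_OXIDATION


# The lecture's mass list gives 1007.4251 for this peptide.
print(f"{peptide_mh('acedfhsak'):.3f}")
```

Other modifications would be handled the same way as `oxidized_met`, by adding the appropriate delta mass once per modified residue.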
34
Masses in MS
- Monoisotopic mass is the mass calculated from the most abundant isotope of each element
- Average mass is the abundance-weighted mass of all isotopic components
35
Amino Acid Residue Masses (Monoisotopic Mass)
Residue         Mass
Glycine         57.02147
Alanine         71.03712
Serine          87.03203
Proline         97.05277
Valine          99.06842
Threonine       101.04768
Cysteine        103.00919
Isoleucine      113.08407
Leucine         113.08407
Asparagine      114.04293
Aspartic acid   115.02695
Glutamine       128.05858
Lysine          128.09497
Glutamic acid   129.04264
Methionine      131.04049
Histidine       137.05891
Phenylalanine   147.06842
Arginine        156.10112
Tyrosine        163.06333
Tryptophan      186.07932
36
Amino Acid Residue Masses (Average Mass)
Residue         Mass
Glycine         57.0520
Alanine         71.0788
Serine          87.0782
Proline         97.1167
Valine          99.1326
Threonine       101.1051
Cysteine        103.1448
Isoleucine      113.1595
Leucine         113.1595
Asparagine      114.1039
Aspartic acid   115.0886
Glutamine       128.1308
Lysine          128.1742
Glutamic acid   129.1155
Methionine      131.1986
Histidine       137.1412
Phenylalanine   147.1766
Arginine        156.1876
Tyrosine        163.1760
Tryptophan      186.2133
37
Preparing a Peptide Mass Fingerprint Database
- Take a protein sequence database (Swiss-Prot or nr-GenBank)
- Determine cleavage sites and identify the resulting peptides for each protein entry
- Calculate the mass (M+H) for each peptide
- Sort the masses from lowest to highest
- Keep a pointer from each calculated mass to the protein accession number in the databank
38
Building A PMF Database
Sequence DB
>P12345  acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
>P21234  acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
>P89212  acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Calc. Tryptic Frags
P12345: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
P21234: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
P89212: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
Mass List
450.2017 (P21234)
609.2667 (P12345)
664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
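The database-building steps (digest, compute M+H, sort, keep a pointer to the accession) can be sketched end to end for the three example entries. This is a minimal illustration under assumptions: the simplified trypsin rule, the lecture's monoisotopic residue masses, a proton mass of 1.00728 Da, and hypothetical names (`build_pmf_index`, `PROTEINS`). Computed values may differ from the slide's list in the last decimal places.

```python
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER, PROTON = 18.01056, 1.00728  # proton mass is an assumed value


def tryptic_digest(seq: str) -> list[str]:
    """Cut after K or R unless the next residue is P."""
    seq = seq.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        if residue in "KR" and seq[i + 1:i + 2] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mh(peptide: str) -> float:
    """Monoisotopic M+H of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def build_pmf_index(proteins: dict[str, str], min_mass: float = 400.0):
    """Return (mass, accession, peptide) tuples sorted from lowest to highest mass."""
    index = []
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            mass = peptide_mh(peptide)
            if mass > min_mass:  # the lecture only considers masses > 400
                index.append((mass, accession, peptide))
    index.sort()
    return index


PROTEINS = {
    "P12345": "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe",
    "P21234": "acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe",
    "P89212": "acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe",
}

for mass, accession, peptide in build_pmf_index(PROTEINS):
    print(f"{mass:10.4f}  ({accession})  {peptide}")
```

A sorted list keeps lookups simple: a production index would typically be searched with binary search over the mass column rather than scanned.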
39
The Fingerprint (PMF) Algorithm
- Take a mass spectrum of a trypsin-cleaved protein (from a gel or HPLC peak)
- Identify as many masses as possible in the spectrum (avoid autolysis peaks)
- Compare query masses with database masses and calculate the number of matches or a matching score (based on length and mass difference)
- Rank hits and return the top-scoring entry: this is the protein of interest
40
Query (MALDI) Spectrum
[Figure: MALDI spectrum, m/z axis from 500 to 2500; labelled peaks at 450, 609, 698, 1007, 1199 and 2098, plus peaks marked (trp) at 1940 and 2211]
41
Query vs. Database
Query Masses
450.2201
609.3667
698.3100
1007.5391
1199.4916
2098.9909
Database Mass List
450.2017 (P21234)
609.2667 (P12345)
664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
Results
- 2 unknown masses
- 1 hit on P21234
- 3 hits on P12345
- Conclude the query protein is P12345
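The comparison step can be sketched as a simple hit count using the masses from this example. This is an illustration, not a real search engine's scoring: the function name `count_hits` is hypothetical, and the absolute tolerance of 0.2 Da is an assumption chosen because the example query masses differ from the database masses by up to about 0.11 Da.

```python
from collections import Counter

# (mass, accession) pairs copied from the example database mass list.
DATABASE = [
    (450.2017, "P21234"), (609.2667, "P12345"), (664.3300, "P89212"),
    (1007.4251, "P12345"), (1114.4416, "P89212"), (1183.5266, "P12345"),
    (1300.5116, "P21234"), (1407.6462, "P21234"), (1526.6211, "P89212"),
    (1593.7101, "P89212"), (1740.7501, "P21234"), (2098.8909, "P12345"),
]
QUERY = [450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909]


def count_hits(query, database, tol_da=0.2):
    """Count, per accession, how many query masses fall within tol_da of a database mass."""
    hits = Counter()
    unmatched = []
    for q in query:
        matched = {acc for mass, acc in database if abs(mass - q) <= tol_da}
        if matched:
            hits.update(matched)
        else:
            unmatched.append(q)
    return hits, unmatched


hits, unmatched = count_hits(QUERY, DATABASE)
print(hits.most_common())  # ranked accessions with hit counts
print(unmatched)           # query masses with no database match
```

With these inputs the count is 3 hits for P12345, 1 for P21234, and two unmatched masses (698.3100 and 1199.4916), matching the result stated on the slide.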
42
What You Need To Do PMF
- A list of query masses (as many as possible)
- Protease(s) used or cleavage reagents
- Databases to search (SWProt, Organism)
- Estimated mass and pI of protein spot (optional)
- Cysteine (or other) modifications
- Minimum number of hits for significance
- Mass tolerance (100 ppm = 1000.0 ± 0.1 Da)
- A PMF website (Prowl, ProFound, Mascot, etc.)
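The mass tolerance entry (100 ppm = 1000.0 ± 0.1 Da) is a relative tolerance, so the window in Da scales with the mass. A short sketch of the conversion, with hypothetical helper names `ppm_window` and `ppm_error`:

```python
def ppm_window(mass: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) mass window for a tolerance given in parts per million."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def ppm_error(observed: float, theoretical: float) -> float:
    """Return the signed error of an observed mass in ppm."""
    return (observed - theoretical) / theoretical * 1e6


low, high = ppm_window(1000.0, 100)
print(f"{low:.1f} to {high:.1f}")          # the 1000.0 +/- 0.1 Da case
print(f"{ppm_window(2000.0, 100)[1] - 2000.0:.1f}")  # same ppm, twice the window
```

At 2000 Da the same 100 ppm setting allows ± 0.2 Da, which is why search forms ask for ppm rather than a fixed Da value.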
43
PMF on the Web
- ProFound
  – http://129.85.19.192/profound_bin/WebProFound.exe
- MOWSE
  – http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
- PeptideSearch
  – http://www.narrador.embl-heidelberg.de/GroupPages/Homepage.html
- Mascot
  – www.matrixscience.com
- PeptIdent
  – http://us.expasy.org/tools/peptident.html
44
ProFound
45
ProFound (PMF)
46
What Are Missed Cleavages?
Sequence
>Protein 1  acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Tryptic Fragments (no missed cleavage)
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
Tryptic Fragments (1 missed cleavage)
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
acedfhsakdfqeasdfpk (2171.9338)
ivtmeeewendadnfekqwfe (2689.1398)
dfqeasdfpkivtmeeewendadnfek (3263.2997)
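A missed cleavage simply joins adjacent tryptic fragments, so the extended peptide list can be generated from the complete digest. A minimal sketch, assuming the simplified trypsin rule (cut after K or R unless followed by P); `with_missed_cleavages` is a hypothetical name.

```python
def tryptic_digest(seq: str) -> list[str]:
    """Cut after K or R unless the next residue is P."""
    seq = seq.replace(" ", "").upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        if residue in "KR" and seq[i + 1:i + 2] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def with_missed_cleavages(sequence: str, max_missed: int = 1) -> list[str]:
    """List peptides containing 0..max_missed uncut sites, fully cleaved peptides first."""
    fragments = tryptic_digest(sequence)
    peptides = []
    for missed in range(max_missed + 1):
        # Join each run of (missed + 1) adjacent fragments.
        for i in range(len(fragments) - missed):
            peptides.append("".join(fragments[i:i + missed + 1]))
    return peptides


for peptide in with_missed_cleavages("acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"):
    print(peptide)
```

For Protein 1 this gives the four fully cleaved peptides plus three peptides with one missed cleavage, the same seven sequences listed on the slide. Each extra allowed missed cleavage enlarges the candidate list, which raises the chance of random matches.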
47
ProFound Results
48
MOWSE
49
PeptIdent
50
MASCOT
51
MASCOT
52
Mascot Scoring
- The statistics of peptide fragment matching in MS (or PMF) are very similar to the statistics used in BLAST
- The scoring probability follows an extreme value distribution
- High scoring segment pairs (in BLAST) are analogous to high scoring mass matches in Mascot
- Mascot scoring is much more robust than arbitrary match cutoffs (like % ID)
53
Extreme Value Distribution
$P(x) = 1 - e^{-e^{-x}}$
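The formula can be evaluated directly. The sketch below assumes P(x) is read as the chance of seeing a normalized score of at least x purely at random, which is the usual BLAST-style interpretation and is not spelled out on the slide; the function name `evd_tail_probability` is hypothetical.

```python
import math


def evd_tail_probability(x: float) -> float:
    """Evaluate P(x) = 1 - exp(-exp(-x)) for a normalized score x."""
    # -expm1(-y) equals 1 - exp(-y) but keeps precision when y is tiny.
    return -math.expm1(-math.exp(-x))


for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x = {x:4.1f}  P = {evd_tail_probability(x):.3e}")
```

For large x the value approaches exp(-x), so each additional unit of normalized score cuts the chance probability by a factor of about e.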
54
Extending HSPs
$E = kNe^{-\lambda s}$
E = number of HSPs found purely by chance
[Figure: cumulative score vs. extension (# aa), with X, T and S marked]
55
Mascot/Mowse Scoring
- The Mascot score is given as S = -10*log10(P), where P is the probability that the observed match is a random event
- Aim for P < 0.05 (less than a 5% chance that the peptide mass match is random)
- Mascot scores greater than 72 are significant (p < 0.05)
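The score formula is easy to invert, which helps when reading search results. A small sketch, assuming the logarithm is base 10; the function names are hypothetical.

```python
import math


def mascot_score(p: float) -> float:
    """Convert a match probability P into a score S = -10 * log10(P)."""
    return -10.0 * math.log10(p)


def probability_from_score(score: float) -> float:
    """Invert the score: P = 10 ** (-S / 10)."""
    return 10.0 ** (-score / 10.0)


print(f"{mascot_score(0.05):.1f}")            # score for a single match with P = 0.05
print(f"{probability_from_score(72.0):.2e}")  # per-match P implied by a score of 72
```

Under this formula a score of 72 corresponds to a far smaller per-match P than 0.05. The difference presumably reflects a correction for the number of database entries tested, which is an assumption here and not stated in the lecture.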
56
Advantages of PMF
- Uses a “robust” & inexpensive form of MS (MALDI)
- Doesn’t require too much sample optimization
- Can be done by a moderately skilled operator (no need to be an MS expert)
- Widely supported by web servers
- Improves as databases get larger & instrumentation gets better
- Very amenable to high throughput robotics (up to 500 samples a day)
57
High Throughput PMF
[Figure labels: Trypsin + gel punch; Trx, p53, G6PDH]
58
2D-GE + MALDI (Manual)
59
Robotic Gel Cutter
60
HT Proteome Mapping
61
High Throughput PMF (in-gel tryptic digestion)
62
Automated MALDI Processing
63
HT Spotting on a MALDI Plate
64
Limitations With PMF
- Requires that the protein of interest already be in a sequence database
- Spurious peaks or missing critical mass peaks always lead to problems
- Mass resolution/accuracy is critical; best to have mass accuracy better than 20 ppm
- Generally found to be only about 40% effective in positively identifying gel spots
65
Can We Do Better?
- 2D-GE + MALDI-MS
  – Peptide Mass Fingerprinting (PMF)
- 2D-GE + MS-MS
  – Sequence Tag/Fragment Ion Searching
- Multidimensional LC + MS-MS
  – ICAT Methods (isotope labelling)
  – MudPIT methods
- 1D-GE + LC + MS-MS
- De Novo Peptide Sequencing (MS-MS)
66
MS-MS & Proteomics
67
MS-MS & Proteomics
Advantages
- Provides precise sequence-specific data
- More informative than PMF methods (>90%)
- Can be used for de novo sequencing (not entirely dependent on databases)
- Can be used to identify post-translational modifications
Disadvantages
- Requires more handling, refinement and sample manipulation
- Requires more expensive and complicated equipment
- Requires high-level expertise
- Slower, not generally high throughput
68
Conclusions
- There are two main approaches to applying MS to protein identification: 1) Peptide Mass Fingerprinting (PMF) and 2) Sequence tagging (sequencing)
- Both depend on bioinformatics and sequence databases to succeed
- Understanding the applications and limitations of MS in proteomics will help in understanding and meeting the growing bioinformatics needs in proteomics