Published by Rebecca Wilkins. Modified over 9 years ago.
1
Peptide Mass Fingerprinting Manimalha Balasubramani Genomics and Proteomics Core Laboratories
2
Genomics and Proteomics Core Lab website www.genetics.pitt.edu
3
GPCL Inventory ABI Voyager DE PRO, user operated ABI 4700 Proteomics Analyzer Thermoelectron LCQ Deca with Surveyor HPLC ABI Qstar Elite with Ultimate 3000 HPLC Bruker micrOTOF with Ultimate 3000 HPLC Bruker 12 Tesla FTMS with Ultimate 3000 HPLC
4
4700 Proteomics Analyzer, ABI Voyager DE PRO, ABI micrOTOF, Bruker
5
LCQ Deca XP, Thermofisher 12T FT MS, Bruker Qstar Elite, ABI
6
Peptide mass fingerprinting (PMF) is a technique for protein and peptide identification
7
Outline PMF Workflow: –Sample preparation –Mass spectra: MS, and MS/MS –Database searches Examples, hands-on exercises Contaminants, post-translational modifications, enzyme digestions Evaluating PMF analysis
8
PMF: Sample preparation Peptide fingerprint
9
Mass spectra are acquired with: MALDI TOF MS (Voyager DE PRO, ABI) MALDI TOF/TOF MS (4700 Proteomics Analyzer, ABI) MALDI – Matrix Assisted Laser Desorption Ionization TOF – Time Of Flight MS – Mass Spectrometry
10
Mass Spectrum: MS Mass to charge ratio (m/z) Intensity
11
FWHM: full width at half maximum of a peak. Source: wiki
12
Resolution and mass accuracy $R = M / \Delta m$ R = resolution M = mass of the peak of interest Δm = width in daltons of the peak Δm measured at 50% peak height is the full width at half maximum (FWHM)
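As a quick illustration of the definition R = M / Δm, the sketch below computes resolution from a peak position and its FWHM. The function name and the example peak (m/z 1500.0, FWHM 0.1 Da) are hypothetical and not taken from the slides.

```python
def resolution(mass: float, fwhm: float) -> float:
    """Return R = M / delta_m, with delta_m taken as the FWHM in daltons."""
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return mass / fwhm


# Hypothetical peak: m/z 1500.0 with a FWHM of 0.1 Da.
print(resolution(1500.0, 0.1))
```

A narrower peak at the same mass gives a proportionally higher resolution.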
13
Ubiquitin ESI Spectra on 12T FT-ICR Mass Error > 0.56 ppm
14
Ubiquitin ESI Spectra on 12T FT-ICR Mass Error < 0.56 ppm
15
Ubiquitin ESI Spectra 12T FT-ICR Resolution > 175,000
16
Mass accuracy is measured as a parts per million (ppm) value: $\mathrm{ppm} = 10^{6} \, \Delta m / M = 10^{6} / R$
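In practice a ppm mass error is usually reported between a measured mass and a theoretical one. The sketch below assumes that reading, with Δm taken as observed minus theoretical mass; the function names and example masses are hypothetical.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million: 1e6 * (observed - theoretical) / theoretical."""
    return 1e6 * (observed - theoretical) / theoretical


def within_tolerance(observed: float, theoretical: float, tol_ppm: float) -> bool:
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Hypothetical measurement: 0.01 Da off at mass 1000 Da.
print(ppm_error(1000.01, 1000.0))
print(within_tolerance(1000.05, 1000.0, 25))
```

The same absolute error in daltons corresponds to a smaller ppm value at higher mass, which is why search tolerances are often given in ppm.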
17
Peptide Mass Fingerprint
18
Mass spectrum processing, calibration External calibration Internal calibration –trypsin autodigestion peaks –Keratin peaks –Spiking with an internal standard
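Internal calibration can be sketched as fitting a correction from peaks of known mass (for example trypsin autodigestion or keratin peaks) and applying it to every other peak. The code below is a simplified illustration that assumes a linear correction; real instrument software may use a different calibration function, and the calibrant masses shown are hypothetical placeholders.

```python
def fit_linear_calibration(observed, reference):
    """Least-squares fit of reference = slope * observed + intercept."""
    n = len(observed)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two paired calibrant peaks")
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(mass, slope, intercept):
    """Correct a measured mass with the fitted line."""
    return slope * mass + intercept


# Hypothetical calibrants: measured masses and their known reference masses.
slope, intercept = fit_linear_calibration([1000.1, 2000.2], [1000.0, 2000.0])
print(apply_calibration(1500.15, slope, intercept))
```

With only two calibrants the fitted line passes exactly through both points; more calibrants spread across the mass range give a more robust correction.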
19
Peak List Spectrum viewer Compiled from the mass spectra –Mass list –Mass list and intensity Peak list is submitted for Database searching
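A peak list of either kind (mass only, or mass and intensity) can be read with a few lines of code. The sketch below assumes a plain-text layout of one peak per line with whitespace-separated fields; the function name is hypothetical.

```python
def parse_peak_list(text):
    """Parse lines of 'mass' or 'mass intensity' into (mass, intensity) tuples.

    Intensity is None when a line carries only a mass value.
    """
    peaks = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        mass = float(fields[0])
        intensity = float(fields[1]) if len(fields) > 1 else None
        peaks.append((mass, intensity))
    return peaks


example = "1075.513062\n1086.581177 1500\n"
print(parse_peak_list(example))
```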
20
Database searching
21
Description of database searching using Mascot program -At GPCL, 4800 Proteomics analyzer data is presented to the Mascot webserver through ProteinPilot -Mascot can be accessed through the web -http://www.matrixscience.com
22
Mascot scoring A frequency factor matrix, F, is created, in which each row represents an interval of 100 Da in peptide mass, and each column an interval of 10 kDa in intact protein mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are incremented so as to accumulate statistics on the size distribution of peptide masses as a function of protein mass. The elements of F are then normalised by dividing the elements of each 10 kDa column by the largest value in that column to give the Mowse factor matrix M: $m_{i,j} = f_{i,j} / \max_{i}(f_{i,j})$, with the maximum taken over column j. After searching the experimental mass values against a calculated peptide mass database, the score for each entry is calculated according to: $\mathrm{Score} = 50000 / (M_{Prot} \times \prod_{n} m_{i,j})$, where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated from the Mowse factor elements for each match between the experimental data and peptide masses calculated from the entry. Source: http://www.matrixscience.com/
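The Mowse description can be sketched in code as follows. This is a toy illustration, not Mascot's implementation: it assumes the commonly cited score form 50000 / (MProt × product of matched Mowse factors), uses a fixed hypothetical tolerance in daltons, and runs on a made-up two-entry database.

```python
from collections import defaultdict

PEPTIDE_BIN = 100.0     # Da per row of F
PROTEIN_BIN = 10_000.0  # Da per column of F


def mowse_factors(database):
    """Build normalised Mowse factors from (protein_mass, [peptide_masses]) entries."""
    freq = defaultdict(float)  # (row, col) -> count, i.e. the elements f_ij
    for prot_mass, peptides in database:
        col = int(prot_mass // PROTEIN_BIN)
        for pep in peptides:
            freq[(int(pep // PEPTIDE_BIN), col)] += 1
    col_max = defaultdict(float)
    for (row, col), count in freq.items():
        col_max[col] = max(col_max[col], count)
    # Divide each element by the largest value in its column.
    return {key: count / col_max[key[1]] for key, count in freq.items()}


def mowse_score(prot_mass, peptides, observed, factors, tol_da=0.5):
    """Score one database entry; assumes the entry was used to build `factors`."""
    col = int(prot_mass // PROTEIN_BIN)
    product = 1.0
    for obs in observed:
        match = next((p for p in peptides if abs(p - obs) <= tol_da), None)
        if match is not None:
            product *= factors[(int(match // PEPTIDE_BIN), col)]
    return 50_000.0 / (prot_mass * product)


# Hypothetical database: two proteins in the same 10 kDa column.
database = [(25_000.0, [1050.0, 1060.0, 1500.0]), (27_000.0, [1055.0, 2200.0])]
factors = mowse_factors(database)
print(mowse_score(25_000.0, database[0][1], [1500.2], factors))
print(mowse_score(25_000.0, database[0][1], [1050.1], factors))
```

In this toy database, a match in a sparsely populated peptide-mass bin has a smaller Mowse factor and therefore raises the score more than a match in a crowded bin.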
23
PMF search page
24
Parameters used in database searching Database searched Taxonomy Enzyme Missed cleavages Fixed versus variable modifications (PTMs) MW and pI Mass tolerance
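The enzyme and missed-cleavage parameters determine which theoretical peptides a search considers. The sketch below assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P); the function name is hypothetical, and the test sequence is the first 32 residues of the creatine kinase B sequence shown later in the deck. It needs Python 3.7 or later, where `re.split` accepts zero-width matches.

```python
import re


def tryptic_digest(sequence, missed_cleavages=1):
    """Return peptides from an in-silico tryptic digest.

    Assumed rule: cleave C-terminal to K or R, but not when followed by P.
    Peptides spanning up to `missed_cleavages` uncut sites are included.
    """
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


for peptide in tryptic_digest("MPFSNSHNALKLRFPAEDEFPDLSAHNNHMAK", missed_cleavages=1):
    print(peptide)
```

Raising the number of allowed missed cleavages enlarges the theoretical peptide list, which can explain more peaks but also increases the chance of random matches.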
25
Oxidation of methionine in proteins and peptides: +16 Da (methionine sulfoxide), +32 Da (methionine sulfone). From Ionsource.com
26
S-carboxymethylation of the amino acid residue cysteine with the alkylating agent iodoacetic acid (+58 Da), or S-carbamidomethylation with iodoacetamide (+57 Da). From Ionsource.com
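The mass shifts above translate into candidate peptide masses as sketched below. The example treats cysteine alkylation as a fixed modification and methionine oxidation (to the sulfoxide) as a variable one; the monoisotopic shift values are standard figures rather than numbers from the slides, and the base mass and sequence are hypothetical.

```python
OXIDATION = 15.9949        # Da, one added oxygen (nominal +16)
CARBAMIDOMETHYL = 57.0215  # Da, iodoacetamide alkylation (nominal +57)
CARBOXYMETHYL = 58.0055    # Da, iodoacetic acid alkylation (nominal +58)


def modified_masses(base_mass, sequence, cys_shift=CARBAMIDOMETHYL):
    """List candidate masses for a peptide.

    Every cysteine carries the fixed alkylation shift; each methionine may or
    may not be oxidised, giving one candidate per number of oxidised residues.
    """
    fixed = base_mass + sequence.count("C") * cys_shift
    n_met = sequence.count("M")
    return [round(fixed + k * OXIDATION, 4) for k in range(n_met + 1)]


# Hypothetical peptide "ACMK" with an unmodified mass of 1000.0 Da.
print(modified_masses(1000.0, "ACMK"))
```

A fixed modification shifts every candidate mass, while each variable modification multiplies the number of candidates the search must consider.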
27
Databases: NCBI nr.*tar.gz non-redundant protein sequence database with entries from GenPept, Swissprot, PIR, PRF, PDB, and NCBI RefSeq
28
Swiss-Prot, IPI, others
29
Submit a peak list to Mascot
1075.513062
1086.581177
1090.547241
1092.517822
1100.630249
1103.572754
1106.553223
1107.529663
1118.498779
1119.519531
1121.509644
1129.604492
1141.572388
1156.586792
1166.537231
1170.607422
1172.612183
1179.590332
1194.604126
1217.567749
1232.610474
1252.583740
1308.654297
1312.705811
1314.744385
1337.672485
1401.651245
1424.745728
1427.830566
1435.718872
1475.762695
1479.710327
1493.734131
1502.774780
1530.834717
1575.850952
1607.807007
1629.868408
1639.935425
1752.863892
1753.904663
1754.915161
1791.744507
1792.805054
1794.820801
1816.801392
1875.976196
1902.006104
1940.941650
1960.053345
1962.928955
2211.118652
2225.130371
2233.105225
2249.076660
http://matrixscience.com/cgi/search_form.pl?FORMVER=2&SEARCH=PMF
30
Mascot PMF report
31
Hands-on exercise –Go to Desktop, open the txt file –Copy and paste into the Mascot search page –Specify search parameters: allow 100 ppm error for PMFal_100.txt; allow 25 ppm error for PMFgd_25.txt
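The effect of the ppm setting in the exercise can be illustrated with a small matching routine. This is a sketch only: Mascot's own matching and scoring are more involved, and the observed and theoretical masses below are hypothetical.

```python
def match_peaks(observed, theoretical, tol_ppm):
    """Pair each observed mass with the closest theoretical mass within tol_ppm.

    Returns (matched, unmatched): matched is a list of (observed, theoretical)
    pairs, unmatched a list of observed masses with no candidate in tolerance.
    """
    matched, unmatched = [], []
    for obs in observed:
        hits = [t for t in theoretical if abs(obs - t) / t * 1e6 <= tol_ppm]
        if hits:
            best = min(hits, key=lambda t: abs(obs - t))
            matched.append((obs, best))
        else:
            unmatched.append(obs)
    return matched, unmatched


observed = [1000.05, 1500.0]      # hypothetical measured peaks
theoretical = [1000.0, 1500.01]   # hypothetical digest masses
print(match_peaks(observed, theoretical, 25))
print(match_peaks(observed, theoretical, 100))
```

A wider tolerance matches more peaks but also admits more chance matches, so the tolerance is best set from the mass accuracy the spectrum actually achieves.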
32
Not all peaks are matched –why? Theoretical peptide list –Peptide lengths vs. MS range –Enzyme – missed/non-specific cleavage –Incorrect ORF –Amino acid substitutions –Ion suppression/efficiency
33
Not all peaks are matched –why? Experimental peptide list –Contaminants: trypsin autolysis peptides; hair, skin keratins; matrix molecules, clusters; unknown contaminants –Modifications: PTMs, known and unknown, of biological origin; oxidized methionines, gel-induced artifacts; chemical, e.g. cysteine carbamidomethylation, introduced by sample handling; adducts –Amino acid substitutions –Splice variant
34
Database search takes into account contaminants and modifications, for example:
35
Evaluating PMF analysis Acceptable hit –High score –Major peaks accounted for No hit –Insufficient data – low intensity MS –Single gel band contains >2-3 proteins –Protein not represented in database – ORF/genome Further analysis –MS/MS confirmation of few major peaks, unaccounted peaks – Ideal –Low score, good spectrum – LC MS/MS –Low score, low intensity spectrum – concentrate sample, reacquire –High score, some unaccounted peaks – MS/MS
36
MS/MS Plot of m/z versus intensity At GPCL, –MALDI TOF/TOF MS –ESI QqTOF MS –ESI IT MS –MALDI/ESI FT ICR MS
37
Tandem MS 4700 Proteomics Analyzer, Applied Biosystems
38
MS MS, followed by precursor ion selection
39
Fragment ion spectrum Tandem MS
40
Tandem mass spectrum http://qbab.aber.ac.uk
41
Database Searching Peptide Mass Fingerprinting Sequence tag approach De novo sequencing inspect raw data http://qbab.aber.ac.uk Tandem mass spectra (MS/MS) can be used for peptide sequencing
42
Mascot Search Results
Search title : SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path=\Mani\102004\New Analysis 1
Database : NCBInr 20040606 (1846720 sequences; 611532004 residues)
Timestamp : 20 Oct 2004 at 14:52:50 GMT
Top Score : 681 for gi|180570, creatine kinase [Homo sapiens]
Probability Based Mowse Score: Score is -10*Log(P), where P is the probability that the observed match is a random event. Protein scores greater than 75 are significant (p<0.05).
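The relation Score = -10*Log(P) can be worked through numerically. The sketch below converts between score and probability; the last function is an assumption rather than something stated on the slide: it treats the significance threshold as the score for p = 0.05 divided by the number of database sequences, which gives a value close to the 75 quoted for this search.

```python
import math


def score_from_probability(p: float) -> float:
    """Score = -10 * log10(P)."""
    return -10.0 * math.log10(p)


def probability_from_score(score: float) -> float:
    """Invert the score: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


def significance_threshold(alpha: float, n_sequences: int) -> float:
    """Assumed threshold: score of alpha corrected for the number of entries searched."""
    return score_from_probability(alpha / n_sequences)


print(probability_from_score(681))
print(significance_threshold(0.05, 1846720))
```

Under this reading, the top score of 681 corresponds to a probability far below the significance cut-off for a database of this size.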
43
Top hits from Mascot Search – there are multiple accession numbers for the same protein
44
Search returns a cluster of proteins with the same matching peptides
45
Creatine kinase - B [Homo sapiens]
Match to: gi|21536286; Score: 681
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed Mass & pI: 43 kDa, 6.2-6.27
Sequence Coverage: 46%
1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
Creatine kinase B is the highest scoring protein
46
GPCL resources for Bioinformatic analysis Mascot version 2.1.0, Matrix Science Ltd –Mascot Daemon ProteinPilot software 2.0, Applied Biosystems/MDS Sciex –Paragon algorithm –And Mascot algorithm Sequest, Thermoelectron Selected list
47
Resources http://www.hsls.pitt.edu/guides/genetics/obrc/proteomics
48
1st Dimension – Isoelectric focussing 2nd Dimension – SDS PAGE Spot picking Trypsin gel digest... it's high-throughput...
49
Sample separation.. HPLC 1D or 2D LC MALDI In-solution Isoelectric focussing
50
GPCL services.. Fee for service model Support investigators –Scientific expertise –Technical expertise –Grant submission
51
Genomics and Proteomics Core Laboratories Paul Wood (Director), Billy W. Day (Scientific Director), Janette Lamb (Assistant Director) Proteomics Lab Chris Bolcato John Cardamone Emanuel M Schreiber Guy Ueichi James Porter Robert Wolfe Jason Sun
52
A mass spectrum Plot of m/z versus intensity MALDI TOF (/TOF) MS ESI TOF MS ESI QqTOF MS ESI IT MS MALDI/ESI FT ICR MS
53
Mass analyzers – several designs Aebersold and Mann, Nature review, 422, p198, 2003
54
QqTOF MS/MS
55
[Figure: overlap of spectra identified by Mascot, SEQUEST and X!tandem; region labels 9%, 19%, 7%, 34%, 5%, 4%, 22%] Each search engine identifies about the same number of spectra, but the overlap is surprisingly small. Different search engines match different spectra. Each search engine scores differently. Courtesy: Proteome Software Inc.
56
James Lyons-Weiler Scientific Director Bioinformatics Analysis Core (412) 393-2087 (office) (412) 728-8743 (cell) Fax: 412-648-1891