Peptide Mass Fingerprinting Manimalha Balasubramani Genomics and Proteomics Core Laboratories
Genomics and Proteomics Core Lab website
GPCL Inventory: ABI Voyager DE PRO (user operated); ABI 4700 Proteomics Analyzer; Thermoelectron LCQ Deca with Surveyor HPLC; ABI Qstar Elite with Ultimate 3000 HPLC; Bruker micrOTOF with Ultimate 3000 HPLC; Bruker 12 Tesla FTMS with Ultimate 3000 HPLC
4700 Proteomics Analyzer, ABI; Voyager DE PRO, ABI; micrOTOF, Bruker
LCQ Deca XP, Thermofisher; 12T FT MS, Bruker; Qstar Elite, ABI
Peptide mass fingerprinting (PMF) is a technique for protein and peptide identification
Outline PMF Workflow: –Sample preparation –Mass spectra: MS, and MS/MS –Database searches Examples, hands-on exercises Contaminants, post-translational modifications, enzyme digestions Evaluating PMF analysis
PMF: Sample preparation Peptide fingerprint
Mass spectra are acquired with: MALDI TOF MS (Voyager DE PRO, ABI); MALDI TOF/TOF MS (4700 Proteomics Analyzer, ABI). MALDI – Matrix Assisted Laser Desorption Ionization; TOF – Time Of Flight; MS – Mass Spectrometry
Mass Spectrum: MS Mass to charge ratio (m/z) Intensity
FWHM: full width at half maximum of a peak. Source: wiki
Resolution and mass accuracy: $R = M / \Delta m$, where $R$ = resolution, $M$ = mass of the peak of interest, and $\Delta m$ = width of the peak in daltons. $\Delta m$ measured at 50% peak height is the full width at half maximum (FWHM).
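The resolution formula on this slide can be tried out with a few lines of Python. This is a minimal sketch; the function names and the example peak (m/z 1000, FWHM 0.1 Da) are illustrative assumptions, not instrument data.

```python
def resolution(peak_mass: float, fwhm: float) -> float:
    """Resolution R = M / delta_m, with delta_m taken as the FWHM in Da."""
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return peak_mass / fwhm


def fwhm_for_resolution(peak_mass: float, r: float) -> float:
    """Peak width (Da) expected at a given mass for a given resolution."""
    if r <= 0:
        raise ValueError("resolution must be positive")
    return peak_mass / r


if __name__ == "__main__":
    # Hypothetical peak at m/z 1000 that is 0.1 Da wide at half height.
    print(f"R = {resolution(1000.0, 0.1):.0f}")
    # Width the same peak would have at the resolution quoted for the 12T FT-ICR.
    print(f"FWHM at R = 175,000: {fwhm_for_resolution(1000.0, 175_000):.5f} Da")
```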
Ubiquitin ESI Spectra on 12T FT-ICR Mass Error > 0.56 ppm
Ubiquitin ESI Spectra on 12T FT-ICR Mass Error < 0.56 ppm
Ubiquitin ESI Spectra 12T FT-ICR Resolution > 175,000
Mass accuracy is measured as a parts per million (ppm) value: $\text{ppm} = 10^6 \, \Delta m / M = 10^6 / R$
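A short sketch of the ppm arithmetic may help when choosing a search tolerance. Note one assumption: when reporting mass accuracy, Δm is commonly taken as the difference between the measured and theoretical mass rather than the peak width, and that is what `ppm_error` below computes. All function names and example values are hypothetical.

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def ppm_from_resolution(r: float) -> float:
    """Peak width expressed in ppm, i.e. 1e6 / R."""
    return 1e6 / r


def tolerance_window(mass: float, ppm: float) -> tuple[float, float]:
    """Lower and upper mass limits for a +/- ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


if __name__ == "__main__":
    # Hypothetical peptide: theoretical 1000.000 Da, measured 1000.001 Da.
    print(f"error: {ppm_error(1000.001, 1000.0):.2f} ppm")
    low, high = tolerance_window(2000.0, 100.0)
    print(f"100 ppm window at 2000 Da: {low:.1f} to {high:.1f}")
```

At 2000 Da, a 100 ppm tolerance corresponds to ±0.2 Da, which gives a feel for how loose or tight a search setting is.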
Peptide Mass Fingerprint
Mass spectrum processing, calibration External calibration Internal calibration –trypsin autodigestion peaks –Keratin peaks –Spiking with an internal standard
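Internal calibration can be illustrated with a simple least-squares line that maps observed masses of known peaks onto their reference masses. This is only a sketch under stated assumptions: real TOF calibration is normally done by the instrument software on flight times, and the reference values below (approximate masses often quoted for porcine trypsin autolysis peaks) and the observed values are placeholders that should be checked before use.

```python
def fit_linear_calibration(observed, reference):
    """Least-squares fit of reference = slope * observed + intercept."""
    n = len(observed)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two matched calibrant peaks")
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(masses, slope, intercept):
    """Correct every mass in a peak list with the fitted line."""
    return [slope * m + intercept for m in masses]


if __name__ == "__main__":
    # Assumed calibrant masses (approximate) and hypothetical observed values.
    reference = [842.51, 2211.10]
    observed = [842.60, 2211.35]
    slope, intercept = fit_linear_calibration(observed, reference)
    # Recalibrate a hypothetical sample peak lying between the calibrants.
    print(apply_calibration([1500.00], slope, intercept))
```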
Peak List Spectrum viewer Compiled from the mass spectra –Mass list –Mass list and intensity Peak list is submitted for Database searching
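A peak list is just text, so it is easy to read programmatically. The sketch below assumes one peak per line, with the mass first and an optional intensity second, separated by whitespace or a comma; this layout and the function name are assumptions, not a description of any particular vendor export.

```python
def parse_peak_list(text: str):
    """Return a mass-sorted list of (mass, intensity) tuples.

    Intensity is None when a line carries only a mass value.
    Blank lines and lines starting with '#' are skipped.
    """
    peaks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.replace(",", " ").split()
        mass = float(fields[0])
        intensity = float(fields[1]) if len(fields) > 1 else None
        peaks.append((mass, intensity))
    return sorted(peaks, key=lambda peak: peak[0])


if __name__ == "__main__":
    # Hypothetical peak list mixing both layouts.
    example = "# mass intensity\n1000.5 200\n\n842.51\n1500.75,35.5\n"
    for mass, intensity in parse_peak_list(example):
        print(mass, intensity)
```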
Database searching
Description of database searching using the Mascot program –At GPCL, 4800 Proteomics analyzer data are presented to the Mascot webserver through ProteinPilot –Mascot can be accessed through the web
Mascot scoring: A frequency factor matrix, $F$, is created, in which each row represents an interval of 100 Da in peptide mass, and each column an interval of 10 kDa in intact protein mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are incremented so as to accumulate statistics on the size distribution of peptide masses as a function of protein mass. The elements of $F$ are then normalised by dividing the elements of each 10 kDa column by the largest value in that column to give the Mowse factor matrix $M$: $m_{i,j} = f_{i,j} / \max_i f_{i,j}$. After searching the experimental mass values against a calculated peptide mass database, the score for each entry is calculated according to: $\text{Score} = 50{,}000 / (M_{Prot} \times \prod_n m_{i,j})$, where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated from the Mowse factor elements for each match between the experimental data and peptide masses calculated from the entry. Source:
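The Mowse bookkeeping described above can be sketched in a few lines. This is a toy illustration, not Mascot's implementation: the database is a hypothetical list of (protein mass, peptide masses) pairs, and the scaling constant of 50,000 is taken from the published Mowse score rather than from this slide.

```python
from collections import defaultdict
from math import prod

PEPTIDE_BIN_DA = 100.0      # row width: 100 Da of peptide mass
PROTEIN_BIN_DA = 10_000.0   # column width: 10 kDa of protein mass


def build_mowse_factors(database):
    """database: iterable of (protein_mass, [peptide_masses]) pairs."""
    freq = defaultdict(int)  # (row i, column j) -> count f_ij
    for protein_mass, peptides in database:
        j = int(protein_mass // PROTEIN_BIN_DA)
        for peptide_mass in peptides:
            i = int(peptide_mass // PEPTIDE_BIN_DA)
            freq[(i, j)] += 1
    # Normalise each column by its largest element.
    col_max = defaultdict(int)
    for (i, j), count in freq.items():
        col_max[j] = max(col_max[j], count)
    return {(i, j): count / col_max[j] for (i, j), count in freq.items()}


def mowse_score(protein_mass, matched_peptides, factors, scale=50_000.0):
    """Score = scale / (M_prot * product of Mowse factors of the matches).

    Matched peptides come from the entry itself, so their bins are
    expected to be present in `factors`.
    """
    j = int(protein_mass // PROTEIN_BIN_DA)
    product = prod(
        factors[(int(m // PEPTIDE_BIN_DA), j)] for m in matched_peptides
    )
    return scale / (protein_mass * product)


if __name__ == "__main__":
    # Hypothetical two-protein database.
    db = [(25_000.0, [850.0, 860.0, 1500.0]), (27_000.0, [870.0, 2300.0])]
    factors = build_mowse_factors(db)
    print(mowse_score(25_000.0, [850.0, 1500.0], factors))
```

Matches that fall in sparsely populated mass bins have small factors, so they shrink the product and raise the score more than matches in crowded bins.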
PMF search page
Parameters used in database searching Database searched Taxonomy Enzyme Missed cleavages Fixed versus variable modifications (PTMs) MW and pI Mass tolerance
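Two of these parameters, enzyme and missed cleavages, determine the theoretical peptide list. The sketch below is an assumption-laden illustration rather than Mascot's own code: it applies the usual trypsin rule (cleave after K or R, but not before P) and computes monoisotopic [M+H]+ masses from standard residue masses.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def cleavage_sites(sequence: str):
    """Cut positions: after K or R unless the next residue is P."""
    sites = [0]
    for idx in range(1, len(sequence)):
        if sequence[idx - 1] in "KR" and sequence[idx] != "P":
            sites.append(idx)
    sites.append(len(sequence))
    return sites


def tryptic_digest(sequence: str, missed_cleavages: int = 0):
    """All peptides containing up to `missed_cleavages` uncut sites."""
    sites = cleavage_sites(sequence)
    last = len(sites) - 1
    peptides = []
    for start in range(last):
        stop = min(start + 1 + missed_cleavages, last)
        for end in range(start + 1, stop + 1):
            peptides.append(sequence[sites[start]:sites[end]])
    return peptides


def peptide_mh(peptide: str) -> float:
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[res] for res in peptide) + WATER + PROTON


if __name__ == "__main__":
    # Hypothetical test sequence; note that R followed by P is not cut.
    for pep in tryptic_digest("AKRPGKLR", missed_cleavages=1):
        print(pep, round(peptide_mh(pep), 4))
```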
Oxidation of methionine in proteins and peptides: +16 Da (methionine sulfoxide), +32 Da (methionine sulfone). From Ionsource.com
S-carboxymethylation of the amino acid residue cysteine with the alkylating agent iodoacetic acid (+58 Da), or S-carbamidomethylation with iodoacetamide (+57 Da). From Ionsource.com
Databases: NCBI nr.*tar.gz non-redundant protein sequence database with entries from GenPept, Swissprot, PIR, PRF, PDB, and NCBI RefSeq
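The effect of these modifications on a peptide mass can be sketched as follows. The monoisotopic shifts are standard values; the function name is hypothetical, and the model is deliberately simple (cysteine alkylation treated as fixed, at most one oxidation per methionine, the +32 Da sulfone not modelled).

```python
OXIDATION = 15.99491          # Met sulfoxide, nominal +16 Da
CARBAMIDOMETHYL = 57.02146    # Cys + iodoacetamide, nominal +57 Da
CARBOXYMETHYL = 58.00548      # Cys + iodoacetic acid, nominal +58 Da


def modified_masses(sequence: str, base_mass: float,
                    fixed_cys_shift: float = CARBAMIDOMETHYL):
    """Candidate masses for one peptide.

    Every cysteine carries the fixed shift; methionine oxidation is
    variable, so 0..n oxidised methionines are all listed.
    """
    fixed = base_mass + sequence.count("C") * fixed_cys_shift
    n_met = sequence.count("M")
    return [fixed + k * OXIDATION for k in range(n_met + 1)]


if __name__ == "__main__":
    # Hypothetical peptide with one Cys and two Met; unmodified mass 1000 Da.
    for mass in modified_masses("MCM", 1000.0):
        print(round(mass, 4))
```

Each variable modification multiplies the number of candidate masses per peptide, which is why searches with many variable modifications produce more random matches.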
Swiss-Prot, IPI, others
Submit a peak list to Mascot
Mascot PMF report
Hands-on exercise Go to Desktop – open txt file copy and paste in Mascot search page – Specify search parameters »Allow 100ppm error for PMFal_100.txt »Allow 25ppm error for PMFgd_25.txt
Not all peaks are matched – why? Theoretical peptide list –Peptide lengths vs. MS range –Enzyme – missed/non-specific cleavage –Incorrect ORF –Amino acid substitutions –Ion suppression/efficiency
Not all peaks are matched – why? Experimental peptide list –Contaminants: trypsin autolysis peptides; hair, skin keratins; matrix molecules, clusters; unknown contaminants –Modifications: PTMs – known and unknown, biological origin; artifacts – gel induced (oxidized methionines); chemical – introduced by sample handling (cysteine carbamidomethylation); adducts –Amino acid substitutions –Splice variants
Database searches take contaminants and modifications into account.
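Sorting an experimental peak list into matched and unmatched peaks is the first step in asking why some peaks are unexplained. The sketch below assumes simple lists of masses and a ppm tolerance; the masses in the example are hypothetical.

```python
def match_peaks(experimental, theoretical, ppm_tol):
    """Split experimental masses into matched pairs and unmatched peaks.

    A peak matches when it lies within +/- ppm_tol of a theoretical
    mass; if several candidates qualify, the closest one is reported.
    """
    matched, unmatched = [], []
    for obs in experimental:
        hits = [t for t in theoretical if abs(obs - t) / t * 1e6 <= ppm_tol]
        if hits:
            best = min(hits, key=lambda t: abs(obs - t))
            matched.append((obs, best))
        else:
            unmatched.append(obs)
    return matched, unmatched


if __name__ == "__main__":
    # Hypothetical masses: one hit at 50 ppm, one miss at 150 ppm,
    # and one peak with no theoretical counterpart at all.
    observed = [1000.05, 1500.00, 2000.30]
    predicted = [1000.00, 2000.00]
    hits, misses = match_peaks(observed, predicted, ppm_tol=100.0)
    print("matched:", hits)
    print("unmatched:", misses)
```

Running the unmatched peaks against a second list, for example known trypsin autolysis or keratin masses, is one way to separate contaminants from genuinely unexplained signals.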
Evaluating PMF analysis Acceptable hit –High score –Major peaks accounted for No hit –Insufficient data – low intensity MS –Single gel band contains >2-3 proteins –Protein not represented in database – ORF/genome Further analysis –MS/MS confirmation of few major peaks, unaccounted peaks – Ideal –Low score, good spectrum – LC MS/MS –Low score, low intensity spectrum – concentrate sample, reacquire –High score, some unaccounted peaks – MS/MS
MS/MS Plot of m/z versus intensity At GPCL, –MALDI TOF/TOF MS –ESI QqTOF MS –ESI IT MS –MALDI/ESI FT ICR MS
Tandem MS 4700 Proteomics Analyzer, Applied Biosystems
MS MS, followed by precursor ion selection
Fragment ion spectrum Tandem MS
Tandem mass spectrum
Tandem mass spectra (MS/MS) can be used for peptide sequencing. Database searching: peptide mass fingerprinting; sequence tag approach. De novo sequencing: inspect raw data.
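Sequencing from a tandem mass spectrum relies on predictable fragment masses. As a minimal sketch, assuming singly charged b and y ions and unmodified residues, the expected fragment ladder for a peptide can be computed like this (the peptide "GAK" is a made-up example):

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide: str):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[k-1] is b_k (first k residues); y_ions[k-1] is y_k
    (last k residues, plus water).
    """
    b_ions, y_ions = [], []
    for k in range(1, len(peptide)):
        b_ions.append(sum(MONO[r] for r in peptide[:k]) + PROTON)
        y_ions.append(sum(MONO[r] for r in peptide[-k:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("GAK")
    print("b ions:", [round(m, 4) for m in b])
    print("y ions:", [round(m, 4) for m in y])
```

The mass difference between neighbouring ions in either series equals one residue mass, which is what de novo sequencing reads off the spectrum. Leucine and isoleucine share the same mass and cannot be told apart this way.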
Mascot Search Results
Search title : SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path=\Mani\102004\New Analysis 1
Database : NCBInr ( sequences; residues)
Timestamp : 20 Oct 2004 at 14:52:50 GMT
Top Score : 681 for gi|180570, creatine kinase [Homo sapiens]
Probability Based Mowse Score
Score is -10*Log(P), where P is the probability that the observed match is a random event. Protein scores greater than 75 are significant (p<0.05).
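The relation Score = -10*Log(P) is easy to invert. The sketch below converts between score and probability. The `significance_threshold` function rests on an assumption about how such a threshold can arise, namely that a match is significant when P is below alpha divided by the number of database entries; it is not taken from this report.

```python
import math


def score_to_probability(score: float) -> float:
    """Invert Score = -10 * log10(P)."""
    return 10 ** (-score / 10)


def probability_to_score(p: float) -> float:
    """Score = -10 * log10(P)."""
    return -10 * math.log10(p)


def significance_threshold(n_entries: int, alpha: float = 0.05) -> float:
    """Assumed threshold: score at which P equals alpha / n_entries."""
    return -10 * math.log10(alpha / n_entries)


if __name__ == "__main__":
    print(f"P at score 75:  {score_to_probability(75):.2e}")
    print(f"P at score 681: {score_to_probability(681):.2e}")
    # Hypothetical database of one million entries.
    print(f"threshold for 1e6 entries: {significance_threshold(1_000_000):.1f}")
```

Under that assumption, a threshold of 75 would correspond to a database of roughly 1.6 million entries.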
Top hits from Mascot Search – there are multiple accession numbers for the same protein
Search returns a cluster of proteins with the same matching peptides
Creatine kinase B is the highest scoring protein
Match to: gi| ; Score: 681
Creatine kinase - B [Homo sapiens]
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed Mass & pI: 43 kDa,
Sequence Coverage: 46%
  1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
 51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
GPCL resources for bioinformatic analysis (selected list): Mascot version 2.1.0, Matrix Science Ltd –Mascot Daemon; ProteinPilot software 2.0, Applied Biosystems/MDS Sciex –Paragon algorithm –and Mascot algorithm; Sequest, Thermoelectron
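Sequence coverage is the percentage of residues in the protein that fall inside at least one matched peptide. A minimal sketch, assuming the matched peptides are available as plain sequence strings (the toy protein and peptides below are hypothetical, not the matches from this report):

```python
def sequence_coverage(protein: str, peptides) -> float:
    """Percentage of protein residues covered by the given peptides.

    Overlapping peptides are counted once; peptides that do not occur
    in the protein sequence are ignored.
    """
    if not protein:
        raise ValueError("protein sequence is empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            for pos in range(start, start + len(peptide)):
                covered[pos] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


if __name__ == "__main__":
    # Toy example: two overlapping peptides cover 5 of 10 residues.
    print(sequence_coverage("ABCDEFGHIJ", ["ABC", "CDE", "XYZ"]))
```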
Resources /proteomics
1st Dimension – Isoelectric focussing; 2nd Dimension – SDS-PAGE; Spot picking; Trypsin gel digest... it's high-throughput…
Sample separation.. HPLC 1D or 2D LC MALDI In-solution Isoelectric focussing
GPCL services.. Fee for service model Support investigators –Scientific expertise –Technical expertise –Grant submission
Genomics and Proteomics Core Laboratories: Paul Wood, Director; Billy W. Day, Scientific Director; Janette Lamb, Assistant Director. Proteomics Lab: Chris Bolcato, John Cardamone, Emanuel M Schreiber, Guy Ueichi, James Porter, Robert Wolfe, Jason Sun
A mass spectrum Plot of m/z versus intensity MALDI TOF (/TOF) MS ESI TOF MS ESI QqTOF MS ESI IT MS MALDI/ESI FT ICR MS
Mass analyzers – several designs Aebersold and Mann, Nature review, 422, p198, 2003
QqTOF MS/MS
[Venn diagram: overlap of spectra identified by Mascot, SEQUEST and X!tandem; region labels 9%, 19%, 7%, 34%, 5%, 4%, 22%] Each search engine identifies about the same number of spectra, but the overlap is surprisingly small. Different search engines match different spectra. Each search engine scores differently. Courtesy: Proteome Software Inc.
James Lyons-Weiler, Scientific Director, Bioinformatics Analysis Core
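The overlap between search engines can be tabulated with ordinary set operations once each engine's identified spectra are available as sets of spectrum IDs. The sketch below uses made-up IDs and a hypothetical function name; it does not reproduce the percentages in the figure.

```python
def overlap_percentages(mascot: set, sequest: set, xtandem: set) -> dict:
    """Share of all identified spectra in each region of a 3-way Venn diagram."""
    union = mascot | sequest | xtandem
    if not union:
        raise ValueError("no identified spectra")
    regions = {
        "Mascot only": mascot - sequest - xtandem,
        "SEQUEST only": sequest - mascot - xtandem,
        "X!Tandem only": xtandem - mascot - sequest,
        "Mascot + SEQUEST": (mascot & sequest) - xtandem,
        "Mascot + X!Tandem": (mascot & xtandem) - sequest,
        "SEQUEST + X!Tandem": (sequest & xtandem) - mascot,
        "all three": mascot & sequest & xtandem,
    }
    return {name: 100.0 * len(ids) / len(union) for name, ids in regions.items()}


if __name__ == "__main__":
    # Hypothetical spectrum IDs identified by each engine.
    result = overlap_percentages({1, 2, 3, 4}, {3, 4, 5, 6}, {4, 6, 7, 8})
    for region, percent in result.items():
        print(f"{region}: {percent:.1f}%")
```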