Peptide Mass Fingerprinting Manimalha Balasubramani Genomics and Proteomics Core Laboratories.



Genomics and Proteomics Core Lab website

GPCL Inventory
– ABI Voyager DE PRO, user operated
– ABI 4700 Proteomics Analyzer
– Thermoelectron LCQ Deca with Surveyor HPLC
– ABI Qstar Elite with Ultimate 3000 HPLC
– Bruker micrOTOF with Ultimate 3000 HPLC
– Bruker 12 Tesla FTMS with Ultimate 3000 HPLC

4700 Proteomics Analyzer (ABI); Voyager DE PRO (ABI); micrOTOF (Bruker)

LCQ Deca XP (Thermofisher); 12T FT MS (Bruker); Qstar Elite (ABI)

Peptide mass fingerprinting (PMF) is a technique for protein and peptide identification

Outline
PMF workflow:
– Sample preparation
– Mass spectra: MS and MS/MS
– Database searches
Examples, hands-on exercises
Contaminants, post-translational modifications, enzyme digestions
Evaluating PMF analysis

PMF: Sample preparation Peptide fingerprint

Mass spectra are acquired with:
MALDI TOF MS (Voyager DE PRO, ABI)
MALDI TOF/TOF MS (4700 Proteomics Analyzer, ABI)
MALDI – Matrix Assisted Laser Desorption Ionization
TOF – Time Of Flight
MS – Mass Spectrometry

Mass spectrum (MS): intensity plotted against mass-to-charge ratio (m/z)

FWHM: full width at half maximum of a peak. Source: wiki

Resolution and mass accuracy
$R = \frac{M}{\Delta m}$
R = resolution
M = mass of the peak of interest
Δm = width of the peak in daltons
Δm measured at 50% peak height is the full width at half maximum (FWHM)

Ubiquitin ESI Spectra on 12T FT-ICR Mass Error > 0.56 ppm

Ubiquitin ESI Spectra on 12T FT-ICR Mass Error < 0.56 ppm

Ubiquitin ESI Spectra 12T FT-ICR Resolution > 175,000

Mass accuracy is measured as a parts per million (ppm) value: $\text{ppm} = 10^{6}\,\frac{\Delta m}{M} = \frac{10^{6}}{R}$
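The two relations above (R = M/Δm and ppm = 10^6 Δm/M) can be sketched in a few lines of Python. The function names and the example peak (m/z 1000 with a 0.1 Da FWHM) are illustrative assumptions, not values from the slides.

```python
def resolution(mass, fwhm):
    """Resolution R = M / Δm, with Δm taken as the FWHM in daltons."""
    return mass / fwhm


def ppm_from_resolution(res):
    """Peak width expressed in ppm: 1e6 * Δm / M = 1e6 / R."""
    return 1e6 / res


def ppm_error(observed, theoretical):
    """Signed mass error of an observed peak, in parts per million."""
    return 1e6 * (observed - theoretical) / theoretical


# Hypothetical peak: m/z 1000.0 with a FWHM of 0.1 Da
r = resolution(1000.0, 0.1)
print(r, ppm_from_resolution(r))

# Hypothetical measurement: observed 1000.05 against a theoretical 1000.00
print(ppm_error(1000.05, 1000.0))
```

Keeping the error signed, rather than absolute, makes systematic calibration offsets easier to spot.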

Peptide Mass Fingerprint

Mass spectrum processing, calibration
External calibration
Internal calibration
– Trypsin autodigestion peaks
– Keratin peaks
– Spiking with an internal standard

Peak List
Spectrum viewer
Compiled from the mass spectra
– Mass list
– Mass list and intensity
Peak list is submitted for database searching

Database searching

Description of database searching using the Mascot program
– At GPCL, 4800 Proteomics Analyzer data is presented to the Mascot webserver through ProteinPilot
– Mascot can be accessed through the web

Mascot scoring
A frequency factor matrix, F, is created, in which each row represents an interval of 100 Da in peptide mass, and each column an interval of 10 kDa in intact protein mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are incremented so as to accumulate statistics on the size distribution of peptide masses as a function of protein mass.
The elements of F are then normalised by dividing the elements of each 10 kDa column by the largest value in that column to give the Mowse factor matrix M:
$m_{i,j} = \frac{f_{i,j}}{\max_i f_{i,j}}$
After searching the experimental mass values against a calculated peptide mass database, the score for each entry is calculated according to:
$\text{Score} = \frac{50{,}000}{M_{Prot} \times \prod_n m_{i,j}}$
where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated from the Mowse factor elements for each match between the experimental data and peptide masses calculated from the entry.
Source:
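The scoring described above can be sketched in Python. This is a minimal sketch, not the Mascot implementation. It assumes the score takes the form 50,000 / (M_Prot × Π m_ij), as in the published Mowse description. The toy database, the function names and the sparse dictionary standing in for the matrix are all hypothetical.

```python
from collections import defaultdict

PEPTIDE_BIN = 100.0      # each row spans 100 Da of peptide mass
PROTEIN_BIN = 10_000.0   # each column spans 10 kDa of intact protein mass


def build_mowse_factors(database):
    """database: iterable of (protein_mass, [peptide_masses]) pairs.

    Returns a sparse Mowse factor matrix as {(row, column): m_ij}.
    """
    freq = defaultdict(int)
    for protein_mass, peptide_masses in database:
        col = int(protein_mass // PROTEIN_BIN)
        for peptide_mass in peptide_masses:
            row = int(peptide_mass // PEPTIDE_BIN)
            freq[(row, col)] += 1

    # Largest count in each 10 kDa column, used for normalisation
    col_max = defaultdict(int)
    for (row, col), count in freq.items():
        col_max[col] = max(col_max[col], count)

    return {(row, col): count / col_max[col] for (row, col), count in freq.items()}


def mowse_score(protein_mass, matched_peptide_masses, factors):
    """Assumed form: 50,000 / (M_Prot * product of matched Mowse factors)."""
    col = int(protein_mass // PROTEIN_BIN)
    product = 1.0
    for peptide_mass in matched_peptide_masses:
        product *= factors[(int(peptide_mass // PEPTIDE_BIN), col)]
    return 50_000.0 / (protein_mass * product)


# Toy database (hypothetical masses, for illustration only)
toy_db = [
    (25_000.0, [850.4, 1210.6, 1250.7]),
    (27_000.0, [1230.1, 2105.9]),
]
factors = build_mowse_factors(toy_db)
print(mowse_score(25_000.0, [850.4, 1210.6], factors))
```

Matches that fall in sparsely populated mass bins have small factors, so they shrink the product and raise the score more than matches in common bins do.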

PMF search page

Parameters used in database searching
– Database searched
– Taxonomy
– Enzyme
– Missed cleavages
– Fixed versus variable modifications (PTMs)
– MW and pI
– Mass tolerance
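To see how the enzyme and missed-cleavage parameters shape the theoretical peptide list, here is a minimal in-silico digest in Python. It assumes the commonly used trypsin rule (cleave after K or R, but not before P). The function names are illustrative, and the example sequence is the first 20 residues of the creatine kinase B entry shown later in this deck.

```python
def cleavage_fragments(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    fragments, start = [], 0
    for idx, residue in enumerate(sequence):
        next_residue = sequence[idx + 1] if idx + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:idx + 1])
            start = idx + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(sequence, missed_cleavages=1):
    """Peptides formed by joining up to `missed_cleavages` extra adjacent fragments."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for start in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end > len(fragments):
                break
            peptides.append("".join(fragments[start:end]))
    return peptides


example = "MPFSNSHNALKLRFPAEDEF"
print(tryptic_peptides(example, missed_cleavages=0))
print(tryptic_peptides(example, missed_cleavages=1))
```

Each additional allowed missed cleavage lengthens the theoretical list, which raises the chance of random matches. This is one reason the parameter is usually kept small.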

Oxidation of methionine in proteins and peptides: +16 Da (methionine sulfoxide), +32 Da (methionine sulfone). From Ionsource.com

S-carboxymethylation of the amino acid residue cysteine with the alkylating agent iodoacetic acid (+58 Da), or S-carbamidomethylation with iodoacetamide (+57 Da). From Ionsource.com
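The nominal shifts above (+16, +57, +58 Da) correspond to the monoisotopic deltas used when computing theoretical peptide masses. The sketch below uses standard monoisotopic residue masses. The function names and the split into "fixed" and "variable" handling are illustrative assumptions, and the example peptides are tryptic peptides taken from the creatine kinase B sequence shown later in this deck.

```python
# Standard monoisotopic residue masses (Da)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Modification name -> (target residue, monoisotopic mass shift in Da)
MOD_DELTA = {
    "oxidation": ("M", 15.99491),
    "carbamidomethyl": ("C", 57.02146),
    "carboxymethyl": ("C", 58.00548),
}


def peptide_mass(sequence, fixed_mods=()):
    """Neutral monoisotopic mass, with fixed mods applied to every target residue."""
    mass = WATER + sum(RESIDUE_MASS[aa] for aa in sequence)
    for name in fixed_mods:
        residue, delta = MOD_DELTA[name]
        mass += delta * sequence.count(residue)
    return mass


def variable_mod_masses(sequence, mod_name, fixed_mods=()):
    """All candidate masses when 0..n target residues carry a variable mod."""
    residue, delta = MOD_DELTA[mod_name]
    base = peptide_mass(sequence, fixed_mods)
    return [base + k * delta for k in range(sequence.count(residue) + 1)]


print(variable_mod_masses("VISMQK", "oxidation"))
print(peptide_mass("FCTGLTQIETLFK", fixed_mods=["carbamidomethyl"]))
```

A fixed modification replaces the unmodified mass outright, whereas a variable modification adds extra candidate masses per peptide and so enlarges the search space.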

Databases: NCBI nr.*tar.gz, the non-redundant protein sequence database with entries from GenPept, Swissprot, PIR, PRF, PDB, and NCBI RefSeq

Swiss-Prot, IPI, others

Submit a peak list to Mascot

Mascot PMF report

Hands-on exercise
Go to Desktop – open the txt file, then copy and paste its contents into the Mascot search page
– Specify search parameters
» Allow 100 ppm error for PMFal_100.txt
» Allow 25 ppm error for PMFgd_25.txt
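The effect of the two tolerance settings in this exercise can be illustrated with a small matching routine. This is a sketch under simple assumptions: each observed peak is compared only with its nearest theoretical mass. The peak values are hypothetical and are not taken from the exercise files.

```python
def match_peaks(observed, theoretical, tol_ppm):
    """Return (observed, theoretical, ppm_error) for peaks within the tolerance."""
    matches = []
    for obs in observed:
        nearest = min(theoretical, key=lambda theo: abs(obs - theo))
        error_ppm = 1e6 * (obs - nearest) / nearest
        if abs(error_ppm) <= tol_ppm:
            matches.append((obs, nearest, error_ppm))
    return matches


# Hypothetical peak list and theoretical peptide masses
observed_peaks = [1000.05, 1500.01, 1200.00]
theoretical_masses = [1000.00, 1500.00]

print(match_peaks(observed_peaks, theoretical_masses, tol_ppm=100))
print(match_peaks(observed_peaks, theoretical_masses, tol_ppm=25))
```

A wider tolerance accepts more peaks from a poorly calibrated spectrum but also admits more chance matches, so the tolerance should reflect the actual calibration quality of the data.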

Not all peaks are matched – why?
Theoretical peptide list
– Peptide lengths vs. MS range
– Enzyme – missed/non-specific cleavage
– Incorrect ORF
– Amino acid substitutions
– Ion suppression/efficiency

Not all peaks are matched – why?
Experimental peptide list
– Contaminants
» Trypsin autolysis peptides
» Hair, skin keratins
» Matrix molecules, clusters
» Unknown contaminants
– Modifications
» PTMs – known and unknown, biological origin
» Oxidized methionines – gel-induced artifacts
» Chemical – cysteine carbamidomethylation, introduced by sample handling
» Adducts
– Amino acid substitutions
– Splice variant

Database search takes into account contaminants and modifications, for example:

Evaluating PMF analysis
Acceptable hit
– High score
– Major peaks accounted for
No hit
– Insufficient data – low intensity MS
– Single gel band contains >2-3 proteins
– Protein not represented in database – ORF/genome
Further analysis
– MS/MS confirmation of a few major peaks and unaccounted peaks – ideal
– Low score, good spectrum – LC MS/MS
– Low score, low intensity spectrum – concentrate sample, reacquire
– High score, some unaccounted peaks – MS/MS

MS/MS
Plot of m/z versus intensity
At GPCL:
– MALDI TOF/TOF MS
– ESI QqTOF MS
– ESI IT MS
– MALDI/ESI FT ICR MS

Tandem MS 4700 Proteomics Analyzer, Applied Biosystems

MS MS, followed by precursor ion selection

Fragment ion spectrum Tandem MS

Tandem mass spectrum

Tandem mass spectra (MS/MS) can be used for peptide sequencing
– Database searching
– Peptide mass fingerprinting
– Sequence tag approach
– De novo sequencing
– Inspect raw data

Mascot Search Results
Search title: SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path=\Mani\102004\New Analysis 1
Database: NCBInr ( sequences; residues)
Timestamp: 20 Oct 2004 at 14:52:50 GMT
Top Score: 681 for gi|180570, creatine kinase [Homo sapiens]
Probability Based Mowse Score
Score is -10*Log(P), where P is the probability that the observed match is a random event. Protein scores greater than 75 are significant (p<0.05).
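The score definition quoted in the report, Score = -10*Log(P), can be inverted to see what probability a given score implies. The short Python sketch below assumes a base-10 logarithm. The function names are illustrative and are not part of any Mascot interface.

```python
import math


def score_from_probability(p):
    """Probability-based score: -10 * log10(P)."""
    return -10.0 * math.log10(p)


def probability_from_score(score):
    """Inverse of the score definition: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


# Scores quoted in the report above: the significance threshold and the top hit
for score in (75, 681):
    print(score, probability_from_score(score))
```

Because the scale is logarithmic, every 10 points of score corresponds to a tenfold drop in the probability that the match is a random event.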

Top hits from Mascot Search – there are multiple accession numbers for the same protein

Search returns a cluster of proteins with the same matching peptides

Creatine kinase B is the highest scoring protein
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed Mass & pI: 43 kDa,
Creatine kinase - B [Homo sapiens]
Match to: gi| ; Score: 681
Sequence Coverage: 46%
1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
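The sequence coverage figure in a report like this one is the percentage of residues in the protein that fall inside at least one matched peptide. The sketch below shows one straightforward way to compute it. The function name is illustrative, and the matched peptide list is hypothetical, since the slide does not list which peptides were matched.

```python
def sequence_coverage(protein, matched_peptides):
    """Percentage of residues covered by at least one matched peptide."""
    covered = [False] * len(protein)
    for peptide in matched_peptides:
        start = protein.find(peptide)
        while start != -1:
            for pos in range(start, start + len(peptide)):
                covered[pos] = True
            start = protein.find(peptide, start + 1)  # handle repeated occurrences
    return 100.0 * sum(covered) / len(protein)


# First 20 residues of the sequence above, with a hypothetical set of matches
fragment = "MPFSNSHNALKLRFPAEDEF"
print(sequence_coverage(fragment, ["MPFSNSHNALK"]))
print(sequence_coverage(fragment, ["MPFSNSHNALK", "MPFSNSHNALKLR"]))
```

Marking residues in a boolean mask, rather than summing peptide lengths, keeps overlapping peptides (for example a missed-cleavage variant) from being counted twice.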

GPCL resources for bioinformatic analysis (selected list)
Mascot version 2.1.0, Matrix Science Ltd
– Mascot Daemon
ProteinPilot software 2.0, Applied Biosystems/MDS Sciex
– Paragon algorithm
– And Mascot algorithm
Sequest, Thermoelectron

Resources /proteomics

1st Dimension – Isoelectric focussing
2nd Dimension – SDS PAGE
Spot picking
Trypsin gel digest
...it's high-throughput…

Sample separation
– HPLC, 1D or 2D
– LC MALDI
– In-solution isoelectric focussing

GPCL services
Fee for service model
Support investigators
– Scientific expertise
– Technical expertise
– Grant submission

Genomics and Proteomics Core Laboratories
Paul Wood, Director
Billy W. Day, Scientific Director
Janette Lamb, Assistant Director
Proteomics Lab: Chris Bolcato, John Cardamone, Emanuel M Schreiber, Guy Ueichi, James Porter, Robert Wolfe, Jason Sun

A mass spectrum
Plot of m/z versus intensity
– MALDI TOF (/TOF) MS
– ESI TOF MS
– ESI QqTOF MS
– ESI IT MS
– MALDI/ESI FT ICR MS

Mass analyzers – several designs Aebersold and Mann, Nature review, 422, p198, 2003

QqTOF MS/MS

Figure: overlap of spectra identified by Mascot, SEQUEST and X!tandem (region labels: 9%, 19%, 7%, 34%, 5%, 4%, 22%)
Each search engine identifies about the same number of spectra, but the overlap is surprisingly small. Different search engines match different spectra. Each search engine scores differently.
Courtesy: Proteome Software Inc.

James Lyons-Weiler Scientific Director Bioinformatics Analysis Core (412) (office) (412) (cell) Fax: