Lecture 2.31 Mass Spectrometry: Applications to Proteomics David Wishart University of Alberta Edmonton, AB

Slides:



Advertisements
Similar presentations
Proteins: Structure reflects function….. Fig. 5-UN1 Amino group Carboxyl group carbon.
Advertisements

Review.
MN-B-C 2 Analysis of High Dimensional (-omics) Data Kay Hofmann – Protein Evolution Group Week 5: Proteomics.
5’ C 3’ OH (free) 1’ C 5’ PO4 (free) DNA is a linear polymer of nucleotide subunits joined together by phosphodiester bonds - covalent bonds between.
How to identify peptides October 2013 Gustavo de Souza IMM, OUS.
Peptide Mass Fingerprinting
Mass Fingerprint. Protease A protease is any enzyme that conducts proteolysis, that is, begins protein catabolism by hydrolysis of the peptide bonds that.
Amino Acids, Peptides, Protein Primary Structure
Amino Acids, Peptides, Protein Primary Structure
Molecular Techniques in Molecular Systematics. DNA-DNA hybridisation -Measures the degree of genetic similarity between pools of DNA sequences. -Normally.
©CMBI 2006 Amino Acids “ When you understand the amino acids, you understand everything ”
You Must Know How the sequence and subcomponents of proteins determine their properties. The cellular functions of proteins. (Brief – we will come back.
Basics of 2-DE and MALDI-ToF MS
Proteomics Informatics – Protein identification II: search engines and protein sequence databases (Week 5)
Proteomics Informatics Workshop Part I: Protein Identification
Previous Lecture: Regression and Correlation
Mass Spectrometry. What are mass spectrometers? They are analytical tools used to measure the molecular weight of a sample. Accuracy – 0.01 % of the total.
My contact details and information about submitting samples for MS
Fa 05CSE182 CSE182-L9 Mass Spectrometry Quantitation and other applications.
Tryptic digestion Proteomics Workflow for Gel-based and LC-coupled Mass Spectrometry Protein or peptide pre-fractionation is a prerequisite for the reduction.
PROTEIN CHARACTERIZATION
Session III How we analyzed proteomic data? 台大生技教改暑期課程.
Lab 2.41 Peptide Mass Fingerprinting and MS/MS Fragment Ion analysis with MASCOT Gary Van Domselaar University of Alberta Edmtonton, AB
UPDATE! In-Class Wed Oct 6 Latil de Ros, Derek Buns, John.
INF380 - Proteomics-91 INF380 – Proteomics Chapter 9 – Identification and characterization by MS/MS The MS/MS identification problem can be formulated.
Common parameters At the beginning one need to set up the parameters.
. Sequence Alignment. Sequences Much of bioinformatics involves sequences u DNA sequences u RNA sequences u Protein sequences We can think of these sequences.
AMINO ACIDS.
Now playing: Frank Sinatra “My Way” A large part of modern biology is understanding large molecules like Proteins A large part of modern biology is understanding.
Laxman Yetukuri T : Modeling of Proteomics Data
Doug Raiford Lesson 19.  Framework model  Secondary structure first  Assemble secondary structure segments  Hydrophobic collapse  Molten: compact.
Learning Targets “I Can...” -State how many nucleotides make up a codon. -Use a codon chart to find the corresponding amino acid.
intro-VIRUSES Virus NamePDB ID HUMAN PAPILLOMAVIRUS 161DZL BACTERIOPHAGE GA1GAV L-A virus1M1C SATELLITE PANICUM MOSAIC VIRUS1STM SATELLITE TOBACCO NECROSIS2BUK.
Lecture 9. Functional Genomics at the Protein Level: Proteomics.
Genome of the week - Enterococcus faecalis E. faecalis - urinary tract infections, bacteremia, endocarditis. Organism sequenced is vancomycin resistant.
In-Gel Digestion Why In-Gel Digest?
Proteomics What is it? How is it done? Are there different kinds? Why would you want to do it (what can it tell you)?
INF380 - Proteomics-71 INF380 – Proteomics Chap 7 –Protein Identification and Characterization by MS Protein identification in our context means that we.
Amino Acids ©CMBI 2001 “ When you understand the amino acids, you understand everything ”
Chapter 3 Proteins.
Proteomics Informatics (BMSC-GA 4437) Instructor David Fenyö Contact information
The observed and theoretical peptide sequence information Cal.MassObserved. Mass ±da±ppmStart Sequence EndSequenceIon Score C.I%modification FLPVNEK.
Salamanca, March 16th 2010 Participants: Laboratori de Proteomica-HUVH Servicio de Proteómica-CNB-CSIC Participants: Laboratori de Proteomica-HUVH Servicio.
Lecture 6 Comparative analysis Oct 2011 SDMBT.
Oct 2011 SDMBT1 Lecture 11 Some quantitation methods with LC-MS a.ICAT b.iTRAQ c.Proteolytic 18 O labelling d.SILAC e.AQUA f.Label Free quantitation.
Mass Spectrometric Peptide Identification Using MASCOT
Computational Challenges in Metabolomics (Part 1)
Protein structure depends on amino acid sequence and interactions Amino acid sequence Local interactions Long distance interactions Interactions between.
MS-MS: Applications to Proteomics
Amino acids.
Protein Folding Notes.
BIOLOGY 12 Protein Synthesis.
Protein Sequence Alignments
Proteomics Lecture 4 Proteases.
Section 3-4: Translation
MCB test 2 Review M. Alex Miranda 11/5/16.
Fig. 5-UN1  carbon Amino group Carboxyl group.
Proteomics Informatics David Fenyő
Interpretation of Mass Spectra I
Protein Identification by Peptide Mass Fingerprinting
Proteomics Informatics –
Do now activity #6 What is the definition of: RNA?
Translation.
Do now activity #5 How many strands are there in DNA?
Protein Identification Using Tandem Mass Spectrometry
Bioinformatics for Proteomics
Example of regression by RBF-ANN
Proteins Proteins have many structures, resulting in a wide range of functions Proteins do most of the work in cells and act as enzymes 2. Proteins are.
Interpretation of Mass Spectra
Protein Identification by Sequence Database Search
Presentation transcript:

Lecture 2.31 Mass Spectrometry: Applications to Proteomics David Wishart University of Alberta Edmonton, AB

Lecture 2.32 MS Proteomics Applications Protein identification/confirmation Protein sample purity determination Detection of post-translational modifications Detection of amino acid substitutions De novo peptide sequencing Determination of disulfide bonds (# & status) Monitoring protein folding (H/D exchange) Monitoring protein-ligand complexes/struct. 3D Structure determination

Lecture 2.33 Disulfide Mapping A (Oxidized)B (Reduced) Trypsin digest

Lecture 2.34 Disulfide Mapping of IL13 RP-HPLC of tryptic digest A) Before DTT B) After DTT

Lecture 2.35 Disulfide Mapping of IL13 Tryptic Digest + V8 Digest J. Mass Spectrom. Vol.35, 3 Pages:

Lecture 2.36 Hydrogen Exchange & MS To MS Protein Sample

Lecture 2.37 Hydrogen Exchange

Lecture 2.38 Hydrogen Exchange & MS

Lecture 2.39 Hydrogen Exchange & MS As deuterium (MW=2) is added to the peptide/protein, the isotopic abundances change and their masses increase Peak intensities change and shift right Average mass increases due to addition of deuterium

Lecture HMS-PCI & Protein Ligands Ho Y, Gruhler A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415: (2002) High Throughput Mass Spectral Protein Complex Identification (HMS-PCI) 10% of yeast proteins used as “bait” 3617 associated proteins identified 3 fold higher sensitivity than yeast 2-hybrid

Lecture Protein-Ligand Mapping

Lecture D Protein Structure Determination by Mass Spectrometry Shan Sundararaj University of Alberta SEQ: rpdfcleppytgpckariiryfynakaglc 2ND: ccchhccccccccccccsssssssccccss S-S: TYR: b e e HDE: FFFSSSSFSSSFSSFFFSSFFFFSSSSSSS

Lecture How to do it? Use sequence & H-exchange data to identify secondary structure Use chemical modification and H-exchange to ID interior/exterior residues Use chemical cross-linking or disulfide connectivity to ID pairwise distances Use additional constraints derived from bioinformatics predictions Assemble structure using DG or GA

Lecture Tools of the Trade DD O DTTN-bromosuccinamide D2OD2O Iodoacetate Tetranitromethane EDAC DSS Trypsin

Lecture 2.315

Lecture Proteins Under Study BPTI 58 aa thioredoxin 108 aa lysozyme 129 aa ubiquitin 76 aa

Lecture MS Proteomics Applications Protein identification/confirmation Protein sample purity determination Detection of post-translational modifications Detection of amino acid substitutions Determination of disulfide bonds (# & status) De novo peptide sequencing Monitoring protein folding (H/D exchange) Monitoring protein-ligand complexes/struct. 3D Structure determination

Lecture Protein Identification 2D-GE + MALDI-MS –Peptide Mass Fingerprinting (PMF) 2D-GE + MS-MS –MS Peptide Sequencing/Fragment Ion Searching Multidimensional LC + MS-MS –ICAT Methods (isotope labelling) –MudPIT (Multidimensional Protein Ident. Tech.) 1D-GE + LC + MS-MS De Novo Peptide Sequencing

Lecture D-GE + MALDI (PMF) Trx p53 G6PDH Trypsin + Gel punch

Lecture D-GE + MS-MS p53 Trypsin + Gel punch

Lecture MudPIT Trypsin + proteins p53 IEX-HPLCRP-HPLC

Lecture ICAT (Isotope Coded Affinity Tag)

Lecture The ICAT Reagent

Lecture ICAT Quantitation

Lecture Peptide Mass Fingerprinting (PMF)

Lecture Peptide Mass Fingerprinting Used to identify protein spots on gels or protein peaks from an HPLC run Depends of the fact that if a peptide is cut up or fragmented in a known way, the resulting fragments (and resulting masses) are unique enough to identify the protein Requires a database of known sequences Uses software to compare observed masses with masses calculated from database

Lecture Principles of Fingerprinting >Protein 1 acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe >Protein 2 acekdfhsadfqea sdfpkivtmeeewe nkdadnfeqwfe >Protein 3 acedfhsadfqeka sdfpkivtmeeewe ndakdnfeqwfe Sequence Mass ( M+H ) Tryptic Fragments acedfhsak dfgeasdfpk ivtmeeewendadnfek gwfe acek dfhsadfgeasdfpk ivtmeeewenk dadnfeqwfe acedfhsadfgek asdfpk ivtmeeewendak dnfegwfe

Lecture Principles of Fingerprinting >Protein 1 acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe >Protein 2 acekdfhsadfqea sdfpkivtmeeewe nkdadnfeqwfe >Protein 3 acedfhsadfqeka sdfpkivtmeeewe ndakdnfeqwfe Sequence Mass ( M+H ) Mass Spectrum

Lecture Predicting Peptide Cleavages /

Lecture

Lecture Protease Cleavage Rules TrypsinXXX[KR]--[!P]XXX ChymotrypsinXX[FYW]--[!P]XXX Lys CXXXXXK-- XXXXX Asp N endoXXXXXD-- XXXXX CNBrXXXXXM--XXXXX

Lecture Why Trypsin? Robust, stable enzyme Works over a range of pH values & Temp. Quite specific and consistent in cleavage Cuts frequently to produce “ideal” MW peptides Inexpensive, easily available/purified Does produce “autolysis” peaks (which can be used in MS calibrations) – , , , , , , ,

Lecture Calculating Peptide Masses Sum the monoisotopic residue masses Add mass of H 2 O ( ) Add mass of H + ( to get M+H) If Met is oxidized add If Cys has acrylamide adduct add If Cys is iodoacetylated add Other modifications are listed at – Only consider peptides with masses > 400

Lecture Masses in MS Monoisotopic mass is the mass determined using the masses of the most abundant isotopes Average mass is the abundance weighted mass of all isotopic components

Lecture Amino Acid Residue Masses Glycine Alanine Serine Proline Valine Threonine Cysteine Isoleucine Leucine Asparagine Aspartic acid Glutamine Lysine Glutamic acid Methionine Histidine Phenylalanine Arginine Tyrosine Tryptophan Monoisotopic Mass

Lecture Amino Acid Residue Masses Glycine Alanine Serine Proline Valine Threonine Cysteine Isoleucine Leucine Asparagine Aspartic acid Glutamine Lysine Glutamic acid Methionine Histidine Phenylalanine Arginine Tyrosine Tryptophan Average Mass

Lecture Preparing a Peptide Mass Fingerprint Database Take a protein sequence database (Swiss- Prot or nr-GenBank) Determine cleavage sites and identify resulting peptides for each protein entry Calculate the mass (M+H) for each peptide Sort the masses from lowest to highest Have a pointer for each calculated mass to each protein accession number in databank

Lecture Building A PMF Database >P12345 acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe >P21234 acekdfhsadfqea sdfpkivtmeeewe nkdadnfeqwfe >P89212 acedfhsadfqeka sdfpkivtmeeewe ndakdnfeqwfe Sequence DBCalc. Tryptic Frags Mass List acedfhsak dfgeasdfpk ivtmeeewendadnfek gwfe acek dfhsadfgeasdfpk ivtmeeewenk dadnfeqwfe acedfhsadfgek asdfpk ivtmeeewendak dnfegwfe (P21234) (P12345) (P89212) (P12345) (P89212) (P12345) (P21234) (P21234) (P89212) (P89212) (P21234) (P12345)

Lecture The Fingerprint (PMF) Algorithm Take a mass spectrum of a trypsin- cleaved protein (from gel or HPLC peak) Identify as many masses as possible in spectrum (avoid autolysis peaks) Compare query masses with database masses and calculate # of matches or matching score (based on length and mass difference) Rank hits and return top scoring entry – this is the protein of interest

Lecture Query (MALDI) Spectrum (trp) 1940 (trp)

Lecture Query vs. Database Query Masses Database Mass List Results (P21234) (P12345) (P89212) (P12345) (P89212) (P12345) (P21234) (P21234) (P89212) (P89212) (P21234) (P12345) Unknown masses 1 hit on P hits on P12345 Conclude the query protein is P12345

Lecture What You Need To Do PMF A list of query masses (as many as possible) Protease(s) used or cleavage reagents Databases to search (SWProt, Organism) Estimated mass and pI of protein spot (opt) Cysteine (or other) modifications Minimum number of hits for significance Mass tolerance (100 ppm = ± 0.1 Da) A PMF website (Prowl, ProFound, Mascot, etc.)

Lecture PMF on the Web ProFound – MOWSE PeptideSearch heidelberg.de/GroupPages/Homepage.html Mascot PeptIdent

Lecture ProFound

Lecture ProFound (PMF)

Lecture What Are Missed Cleavages? >Protein 1 acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe SequenceTryptic Fragments ( no missed cleavage ) acedfhsak ( ) dfgeasdfpk ( ) ivtmeeewendadnfek ( ) gwfe ( ) Tryptic Fragments ( 1 missed cleavage ) acedfhsak ( ) dfgeasdfpk ( ) ivtmeeewendadnfek ) gwfe ( ) acedfhsakdfgeasdfpk ( ) ivtmeeewendadnfekgwfe ( ) dfgeasdfpkivtmeeewendadnfek ( )

Lecture ProFound Results

Lecture MOWSE

Lecture PeptIdent

Lecture MASCOT

Lecture MASCOT

Lecture Mascot Scoring The statistics of peptide fragment matching in MS (or PMF) is very similar to the statistics used in BLAST The scoring probability follows an extreme value distribution High scoring segment pairs (in BLAST) are analogous to high scoring mass matches in Mascot Mascot scoring is much more robust than arbitrary match cutoffs (like % ID)

Lecture Extreme Value Distribution P(x) = 1 - e -e -x

Lecture Extending HSP’s Extension (# aa) Cumulative Score E = kNe Number of HSP’s found purely by chance - s X T S

Lecture Mascot/Mowse Scoring The Mascot Score is given as S = -10*Log(P), where P is the probability that the observed match is a random event Try to aim for probabilities where P<0.05 (less than a 5% chance the peptide mass match is random) Mascot scores greater than 72 are significant (p<0.05).

Lecture Advantages of PMF Uses a “robust” & inexpensive form of MS (MALDI) Doesn’t require too much sample optimization Can be done by a moderately skilled operator (don’t need to be an MS expert) Widely supported by web servers Improves as DB’s get larger & instrumentation gets better Very amenable to high throughput robotics (up to 500 samples a day)

Lecture High Throughput PMF Trx p53 G6PDH Trypsin + Gel punch

Lecture D-GE + MALDI (Manual)

Lecture Robotic Gel Cutter

Lecture HT Proteome Mapping

Lecture Hi Throughput PMF (in gel tryptic digestion)

Lecture Automated MALDI Processing

Lecture HT Spotting on a MALDI Plate

Lecture Limitations With PMF Requires that the protein of interest already be in a sequence database Spurious or missing critical mass peaks always lead to problems Mass resolution/accuracy is critical, best to have <20 ppm mass resolution Generally found to only be about 40% effective in positively identifying gel spots

Lecture Can We Do Better? 2D-GE + MALDI-MS –Peptide Mass Fingerprinting (PMF) 2D-GE + MS-MS –Sequence Tag/Fragment Ion Searching Multidimensional LC + MS-MS –ICAT Methods (isotope labelling) –MudPIT methods 1D-GE + LC + MS-MS De Novo Peptide Sequencing (MS-MS)

Lecture MS-MS & Proteomics

Lecture MS-MS & Proteomics Provides precise sequence-specific data More informative than PMF methods (>90%) Can be used for de- novo sequencing (not entirely dependent on databases) Can be used to ID post- trans. modifications Requires more handling, refinement and sample manipulation Requires more expensive and complicated equipment Requires high level expertise Slower, not generally high throughput Advantages Disadvantages

Lecture Conclusions There are two main approaches to applying MS to protein identification: 1) Peptide Mass Fingerprinting (PMF) and 2) Sequence tagging (sequencing) Both depend on bioinformatics and sequence databases to succeed Understanding the applications and limitations of MS in proteomics will help in understanding and meeting the growing bioinformatics needs in proteomics