MS-MS: Applications to Proteomics

Presentation transcript:

MS-MS: Applications to Proteomics
David Wishart, University of Alberta, Edmonton, AB
david.wishart@ualberta.ca
June 2005
(c) CGDN 2005

MS-MS Methods
2D-GE + MALDI-MS: Peptide Mass Fingerprinting (PMF)
2D-GE + MS-MS: Sequence Tag/Fragment Ion Searching
Multidimensional LC + MS-MS: ICAT Methods (isotope labelling), MudPIT methods
1D-GE + LC + MS-MS
De Novo Peptide Sequencing (MS-MS)
Lecture 2.3

2D-GE + MS-MS
[Figure labels: Trypsin + Gel punch; p53]

MudPIT
[Figure labels: Trypsin + proteins; IEX-HPLC; RP-HPLC; p53]

ICAT (Isotope Coded Affinity Tag)

Some Interesting Examples

The E. coli Interactome
Butland et al., Nature, 433(7025):531-537 (2005)

E. coli Interactome
Created C-terminal, affinity-tagged constructs of 1,000 open reading frames (approximately 23% of the genome).
A total of 857 proteins, including 198 of the most highly conserved, soluble non-ribosomal proteins, were tagged successfully.
648 could be purified to homogeneity and their interacting protein partners identified by mass spectrometry.

SPA or TAP Tagging E. coli Proteins

Bait-Prey Selection & MS

Gel Analysis (Silver Stain)

LC-MS/MS (MudPIT)

The E. coli Interactome

Organellar Proteomics

Organellar Proteomics
Taylor SW, Fahy E, Ghosh SS. Trends Biotechnol. 2003 Feb;21(2):82-8.

MS-MS for Protein ID
Proteins are isolated (from gel or HPLC) and subjected to tryptic digestion.
Peptides are sent through an ionizer and into a collision cell, where the doubly charged ions are selected and fragmented through collision-induced dissociation (CID).
The resulting singly charged ions (daughter ions) are analyzed to determine the sequence or to ID the parent peptide.

Why Trypsin for MS-MS?
CID of peptides smaller than 2-3 kDa is most reliable for MS-MS studies. The frequency of tryptic cleavage ensures that most peptides will be of this size.
Trypsin cleaves on the C-terminal side of arginine and lysine. With the basic residue at the C-terminus, peptides fragment in a more predictable manner throughout their length.

Why Double Charges?
The easiest spectra to interpret are those obtained from doubly-charged peptide precursors, where the resulting fragment ions are mostly singly-charged.
Doubly-charged precursors also fragment such that most of the peptide bonds break with comparable frequency, so one is more likely to derive a complete sequence.

MS-MS & Peptide Fragments
When peptides or proteins are admitted to a collision cell, the peptide usually fragments at the weakest bond (the peptide bond, although some CH-NH and CH-CO breakage also occurs).
Collision conditions have to be optimized for each peptide.
Two main types of daughter ions are produced: “b” ions and “y” ions.

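MS-MS Peptide Fragmentation
[Figure: peptide backbone H2N-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-CO ... CO-NH-CH(Rn)-CO2H. The b ions (b1, b2, ..., bn-1) are numbered from the N-terminus and the y ions (yn-1, yn-2, ..., y1) from the C-terminus. A schematic spectrum (signal axis) shows peaks labelled b1 to b5 and y1 to y5.]

MS-MS Peptide Fragmentation
[Figure: the same schematic spectrum for the peptide Ala-Gly-His-Leu-...-Phe-Glu-Cys-Tyr, with signal peaks labelled b1 to b5 and y1 to y5.]

Tandem MS of BSA

MS-MS of Fibrinogen
[Figure: MS-MS spectrum, relative abundance (20 to 100) against m/z (400 to 1600), with peaks annotated by residue (E, A, X, F, D, G, S) for the peptide YADSGEGDFLAEGGGVR.]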

Amino Acid Residue Masses
Residue          Monoisotopic Mass
Glycine           57.02147
Alanine           71.03712
Serine            87.03203
Proline           97.05277
Valine            99.06842
Threonine        101.04768
Cysteine         103.00919
Isoleucine       113.08407
Leucine          113.08407
Asparagine       114.04293
Aspartic acid    115.02695
Glutamine        128.05858
Lysine           128.09497
Glutamic acid    129.04264
Methionine       131.04049
Histidine        137.05891
Phenylalanine    147.06842
Arginine         156.10112
Tyrosine         163.06333
Tryptophan       186.07932
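
As a rough illustration of how the residue masses above are used, the sketch below sums them into a neutral peptide mass and a precursor m/z. The water and proton masses are assumptions added here (they are not on the slide), and the function names are hypothetical.

```python
# Monoisotopic residue masses (u), copied from the table above.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056   # assumed monoisotopic mass of H2O (not from the slide)
PROTON = 1.00728   # assumed mass of a proton (not from the slide)


def peptide_mass(sequence):
    """Neutral monoisotopic mass: the residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def precursor_mz(sequence, charge):
    """m/z of the peptide carrying `charge` extra protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


print(round(peptide_mass("GPFNA"), 3), round(precursor_mz("GPFNA", 2), 3))
```

The doubly charged case (charge = 2) corresponds to the precursors that the earlier slides recommend selecting for fragmentation.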

MS/MS: The Movie (Kathleen Binns)
http://www.mshri.on.ca/pawson/ms/movie.html

Protein ID by MS-MS
Peptide fragments from the target protein are sequenced by MS-MS using a variety of algorithms (SEQUEST, Mascot) or via manual methods.
The peptide fragment sequences are sent to BLAST to be queried against a protein sequence database.
The protein having the highest number of sequence matches is ID’d as the target.

SEQUEST
Algorithm developed for MS-MS fragment ion identification by J. Eng (1994) in the John Yates Lab (Scripps, U Wash).
Compares predicted MS-MS spectra against observed daughter ion spectra to identify and rank matches (no “sequencing” per se).

SEQUEST and 2D-GE

SEQUEST Algorithm
SEQUEST correlates uninterpreted tandem mass (MS-MS) spectra of peptides with amino acid sequences from protein and nucleotide databases.
SEQUEST will determine the amino acid sequence and thus the protein(s) and organism(s) that correspond to the mass spectrum being analyzed.
SEQUEST is distributed by Finnigan Corp.

SEQUEST Algorithm
Sequence DB -> Calc. Tryptic Frags -> Calc. MS-MS Spec.
Sequence DB:
  acedfhsakdfqea sdfpkivtmeeewe ndadnfekgpfna
  >P21234
  acekdfhsadfqea nkdadnfeqwfe
  >P89212
  acedfhsadfqeka ndakdnfeqwfe
Calc. Tryptic Frags:
  acedfhsak dfqeasdfpk ivtmeeewendadnfek gpfna
  acek dfhsadfqeasdfpk ivtmeeewenk dadnfeqwfe
  acedfhsadfqek asdfpk ivtmeeewendak dnfeqwfe
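
The "Calc. Tryptic Frags" step can be sketched as an in-silico digest. This is a minimal illustration rather than SEQUEST's own code: it applies only the rule stated earlier in the lecture (cleave on the C-terminal side of lysine and arginine) and ignores missed cleavages and any other cleavage exceptions. The function name is hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after every K or R (simplified trypsin rule)."""
    peptides, current = [], ""
    for aa in sequence.upper():
        current += aa
        if aa in "KR":          # trypsin cuts on the C-terminal side of K/R
            peptides.append(current)
            current = ""
    if current:                 # C-terminal peptide that does not end in K/R
        peptides.append(current)
    return peptides


# First database entry from the slide, joined into one sequence.
print(tryptic_digest("acedfhsakdfqeasdfpkivtmeeewendadnfekgpfna"))
```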

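Creating a Synthetic MS-MS Spectrum for GPFNA
b ions (residue, mass, running sum):
  G   57    57
  P   97   154
  F  147   301
  N  114   415
  A   71   486
y ions (residue, mass, running sum):
  A   71    71
  N  114   185
  F  147   332
  P   97   429
  G   57   486
The b and y series are then combined into one synthetic spectrum.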
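
The running sums on this slide can be reproduced with a few lines of code. The sketch below uses the slide's nominal (integer) residue masses and, like the slide, leaves out the terminal offsets: a real singly charged b ion is roughly 1.007 u heavier than its residue sum, and a real y ion roughly 19.018 u heavier. The function name is hypothetical.

```python
from itertools import accumulate

# Nominal (integer) residue masses as used on the slide.
NOMINAL_MASS = {"G": 57, "P": 97, "F": 147, "N": 114, "A": 71}


def ladder_sums(sequence):
    """Running residue-mass sums from the N-terminus (b) and C-terminus (y)."""
    masses = [NOMINAL_MASS[aa] for aa in sequence]
    b_series = list(accumulate(masses))
    y_series = list(accumulate(reversed(masses)))
    return b_series, y_series


b_series, y_series = ladder_sums("GPFNA")
print("b:", b_series)
print("y:", y_series)
```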

SEQUEST Algorithm
Query Spectrum -> Spectral Database -> Result
Spectral Database: acedfhsak, mtlsyk, giqwemncyk, nmqtydr
Result: giqwemncyk, Score = 128, Accession P12345, Protein = p53, Org. Homo sapiens
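
To make the matching step concrete, here is a deliberately simplified sketch that ranks candidate peptides by counting predicted b and y ions that line up with observed peaks. This shared-peak count is a stand-in chosen for illustration; it is not SEQUEST's actual scoring function. The proton and water masses, the 0.5 u tolerance, and all function names are assumptions.

```python
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.04264, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
PROTON, WATER = 1.00728, 18.01056  # assumed constants, not from the slides


def fragment_mzs(peptide):
    """Predicted singly charged b and y ion m/z values for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide.upper()]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)           # b ion of length i
        ions.append(sum(masses[i:]) + WATER + PROTON)   # complementary y ion
    return ions


def shared_peak_count(observed, peptide, tol=0.5):
    """Number of predicted fragments with an observed peak within tol."""
    return sum(any(abs(mz - peak) <= tol for peak in observed)
               for mz in fragment_mzs(peptide))


def rank_candidates(observed, candidates, tol=0.5):
    """Candidates sorted from best to worst shared-peak count."""
    scored = [(shared_peak_count(observed, p, tol), p) for p in candidates]
    return sorted(scored, reverse=True)
```

For example, `rank_candidates(fragment_mzs("GIQWEMNCYK"), ["ACEDFHSAK", "MTLSYK", "GIQWEMNCYK", "NMQTYDR"])` treats a perfect predicted spectrum as the query and ranks the four peptides shown on the slide.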

Alternatives to SEQUEST
Web software and servers using algorithms based on manual methods
Sending your data to friends who have a SEQUEST license
Manual analysis of MS-MS spectra
  This is still the most reliable method for interpreting MS-MS spectra
  Also allows for de-novo sequencing

MS-MS on the Web
PepSea (disabled): http://195.41.108.38/PA_SequenceOnlyForm.html
ProteinProspector: http://prospector.ucsf.edu/
PeptideSearch (limited): http://www.narrador.embl-heidelberg.de/GroupPages/Homepage.html
Mascot (probably the best): www.matrixscience.com

Mascot MS-MS Form

Mascot MS-MS Input Format
  COM=10 pmol digest of Sample X15
  ITOL=1
  ITOLU=Da
  MODS=Met Ox,Cys B propionamide
  MASS=Monoisotopic
  USERNAME=Lou Scene
  USEREMAIL=leu@altered-state.edu
  CHARGE=2+ and 3+
  BEGIN IONS
  TITLE=Peak 1
  PEPMASS=983.6
  846.60 73
  846.80 44
  847.60 67
PEPMASS gives the parent ion mass (2+); each line after it gives a daughter ion mass and its intensity.
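
A file in this layout is easy to read programmatically. The sketch below is a minimal reader, not an official Mascot tool: it collects KEY=value lines before the first BEGIN IONS as header parameters and gathers each spectrum's parameters and peaks. The slide excerpt stops before the end of the spectrum, so the closing END IONS line in the example text is added here as an assumption about the complete file. The function name is hypothetical.

```python
def parse_mgf(text):
    """Parse Mascot-style input text into (header parameters, spectra)."""
    header, spectra, current = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif "=" in line:
            key, value = line.split("=", 1)
            # Outside a spectrum block the parameter applies to the whole search.
            target = header if current is None else current["params"]
            target[key] = value
        elif current is not None:
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
    return header, spectra


EXAMPLE = """COM=10 pmol digest of Sample X15
ITOL=1
ITOLU=Da
MASS=Monoisotopic
CHARGE=2+ and 3+
BEGIN IONS
TITLE=Peak 1
PEPMASS=983.6
846.60 73
846.80 44
847.60 67
END IONS
"""

header, spectra = parse_mgf(EXAMPLE)
print(header["ITOLU"], spectra[0]["params"]["PEPMASS"], spectra[0]["peaks"])
```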

Mascot MS-MS Output

Mascot MS-MS Output

A Real Example

Protocols for MS-MS Sequencing
Usually you can’t tell a “b” ion from a “y” ion.
Assume the lowest mass visible in the spectrum is a lysine or arginine (the y1 ion), because trypsin cuts after a lysine or arginine.
This y1 mass should be 147.113 for lysine or 175.119 for arginine.
The y1 ion mass is calculated by adding 19.018 u (three hydrogens and one oxygen) to the residue mass of lysine or arginine.
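
The two y1 values quoted above follow directly from the residue mass table. A small check, using the 19.018 u offset stated on the slide (the function name is hypothetical):

```python
Y_OFFSET = 19.018  # u: three hydrogens and one oxygen, as stated on the slide
RESIDUE_MASS = {"K": 128.09497, "R": 156.10112}  # from the residue mass table


def y1_mass(residue):
    """Expected y1 ion mass for a C-terminal lysine (K) or arginine (R)."""
    return RESIDUE_MASS[residue] + Y_OFFSET


for residue in "KR":
    print(residue, round(y1_mass(residue), 3))
```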

MS-MS Sequencing
Using the mass tables, look to the right of y1 and see if you can find another prominent peak that is equal to y1 + AA, where AA is the residue mass of any of the 20 amino acids. This is the y2 ion.
Proceed in a rightward direction, identifying other yn ions that differ by an AA residue mass (don’t expect to find all).
The yn series produces a “reverse” sequence.
Watch for possible dipeptide peaks that may fool you.
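
The ladder-walking procedure above can be sketched as a greedy search. This is a toy illustration under strong assumptions: a clean peak list, a fixed tolerance of 0.02 u, and the first matching residue taken at each step, so it will be fooled by exactly the dipeptide coincidences the slide warns about. Leucine and isoleucine share one mass and are both reported as L. The function name and tolerance are assumptions.

```python
# Monoisotopic residue masses (u); "L" stands for both Leu and Ile.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "L": 113.08407,
    "N": 114.04293, "D": 115.02695, "Q": 128.05858, "K": 128.09497,
    "E": 129.04264, "M": 131.04049, "H": 137.05891, "F": 147.06842,
    "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
Y_OFFSET = 19.018  # u added to the C-terminal residue mass to give y1


def find_peak(peaks, target, tol):
    """First peak within tol of target, or None."""
    return next((p for p in peaks if abs(p - target) <= tol), None)


def read_y_ladder(peaks, tol=0.02):
    """Walk the y series upward from y1 and return the forward sequence."""
    peaks = sorted(peaks)
    current, tag = None, ""
    for aa in ("K", "R"):                      # y1 must be Lys or Arg
        hit = find_peak(peaks, RESIDUE_MASS[aa] + Y_OFFSET, tol)
        if hit is not None:
            current, tag = hit, aa
            break
    if current is None:
        return ""
    while True:
        for aa, mass in RESIDUE_MASS.items():  # look for current + one residue
            hit = find_peak(peaks, current + mass, tol)
            if hit is not None:
                tag, current = tag + aa, hit
                break
        else:
            break                              # no residue fits: ladder ends
    return tag[::-1]                           # y ions read the sequence backwards
```

Because the y series is read from the C-terminus, the accumulated tag is reversed before it is returned.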

Things To Remember
Gly + Gly = 114.043 u and Asn = 114.043 u
Ala + Gly = 128.059 u, Gln = 128.059 u and Lys = 128.095 u
Gly + Val = 156.090 u and Arg = 156.101 u
Ala + Asp = Glu + Gly = 186.064 u and Trp = 186.079 u
Ser + Val = 186.100 u and Trp = 186.079 u
Leu = Ile = 113.084 u
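
These coincidences can be found mechanically by comparing every pair of residue masses against every single residue mass. The sketch below does that with an assumed tolerance of 0.05 u; isoleucine is omitted because it duplicates leucine. The function name is hypothetical.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (u); Ile is omitted because it equals Leu.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "L": 113.08407,
    "N": 114.04293, "D": 115.02695, "Q": 128.05858, "K": 128.09497,
    "E": 129.04264, "M": 131.04049, "H": 137.05891, "F": 147.06842,
    "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}


def dipeptide_lookalikes(tol=0.05):
    """Residue pairs whose summed mass is within tol of a single residue."""
    hits = []
    for a, b in combinations_with_replacement(sorted(RESIDUE_MASS), 2):
        pair_mass = RESIDUE_MASS[a] + RESIDUE_MASS[b]
        for single, mass in RESIDUE_MASS.items():
            if abs(pair_mass - mass) <= tol:
                hits.append((a + b, single, round(pair_mass - mass, 3)))
    return hits


for pair, single, delta in dipeptide_lookalikes():
    print(pair, "looks like", single, "difference", delta)
```

Tightening the tolerance shows which of these pairs a more accurate instrument could still separate.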

MS-MS Sequencing
Use the remaining “unassigned” peaks to see if you can construct a “b” ion series.
The highest mass peak corresponds to the parent ion, or to the parent minus 147 (K) or 175 (R).
The “b” ions give the “normal” sequence.
Both forward (b ion) and backward (y ion) sequences should be consistent.
Use the resulting sequence tag to search the databases using BLAST (remember to use a high Expect value, ~100) to see if the sequence matches something.

Tandem MS of BSA

Different MS-MS Instruments Yield Different Spectra
A typical QTOF or triple quad MS-MS spectrum of a tryptic peptide contains a continuous series of y-type ions. The b-type ions are usually seen only at lower masses, below the precursor m/z value.
Ion trap CID data of tryptic peptides is different in that one often finds a continuous series of both b-type and y-type ions throughout the spectrum.

Post-Translational Modifications (PTM)

PTM by MALDI (PMF)
[Figure labels: Trypsin; MS peaks at 600, 840, 1044, 1236 and 1730 Da, shown for both the sample and the database; tryptic peptide GTVQPASNFNDDSSQGLGTDEGSIVLTQR.]
Database:
  MKALSPVRGCYEAVCCLSERSLAIARGRGKSPSAEEPLSLLDDMNHCYSRLRELVPGVPRGTQLSQVEILQRVIDYILDLQVVLAEPAPGPPDGPHLPIQVREGARPGSSERAGWDAAGLPHRVLEYLG
  AVAKVELRGTVQPASNFNDDSSQGLGTDEGSIVLTQRSNAQAVEGAGTDESTLIELMATRNNQEIAAINEAYSLEDDLSSDTSGHFRILVSLALGNRDEGPENLTQAVVAETLNKPAFFADRLLALXGGDD
  MRWLTPFGMLFISGTYYGLIFFGLIMEVIHNALISLVLAFFVVFAWDLVLSLIYGLRFVKEGDYIALDWDGQFPDCYGLFASTCLSAVIWTYTDSLLLGLIVPVIIVFLGKQLMRGLYEKIKS

PTM by MS-MS
[Figure labels: Trypsin; MS/MS spectra; peptide GTVQPASNFNDDSSQGLGTDEGSIVLTQR.]
Protein sequence:
  AVAKVELRGTVQPASNFNDDSSQGLGTDEGSIVLTQRSNAQAVEGAGTDESTLIELMATRNNQEIAAINEAYSLEDDLSSDTSGHFRILVSLALGNRDEGPENLTQAVVAETLNKPAFFADRLLALXGGDDFKLMAAG

Phosphoserine Detection
[Figure: MS-MS spectrum, relative abundance (10 to 50) against m/z (200 to 1400), for the peptide NH-K E s* S N T D S A G A L G T L R-OH, where s* = phosphoserine. b-ions are numbered 1 to 16 from the N-terminus and y-ions 16 to 1 toward the C-terminus. Labelled ions: b1, b2, b3-phos, y1, y4 to y13, y14, y14-H3PO4, y15-H3PO4 and [M+2H]2+ - 49.]
Peak m/z values shown: 843.9, 794.7, 257.9, 128.9, 445.8, 686.4, 1331.2, 174.1, 559.0, 1262.5, 758.1, 326.3, 1061.6, 961.4, 629.9, 1459.8, 1175.2, 1430.1, 846.7

De Novo Sequencing (MS-MS)
Done when the sample is not amenable to Edman degradation.
Done when no sequence or PMF match seems to exist in databases.
Requires a very high resolution mass analyzer (FT-ICR, QTOF or Qstar instrument) with mass accuracy better than 20 ppm.
Usually requires multi-enzyme digestion.
Still a difficult process, but possible with much smaller sample amounts than Edman degradation.
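
The 20 ppm figure quoted above can be read as a relative mass error. A minimal sketch of that calculation follows; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, ppm=20.0):
    """True if the observed value lies within +/- ppm of the theoretical one."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= ppm


# At m/z 1000, a 20 ppm window corresponds to about +/- 0.02 u.
print(ppm_error(1000.02, 1000.0))
```

At this accuracy Gln (128.05858 u) and Lys (128.09497 u), which differ by about 0.036 u, are separable for small fragments but harder to tell apart as the fragment mass grows.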

MS-MS & Proteomics
Advantages:
  Provides precise sequence-specific data
  More informative than PMF methods (>90%)
  Can be used for de-novo sequencing (not entirely dependent on databases)
  Can be used to ID post-translational modifications
Disadvantages:
  Requires more handling, refinement and sample manipulation
  Requires more expensive and complicated equipment
  Requires high-level expertise
  Slower, not generally high throughput