Protein & Peptide Analysis Linda Breci Chemistry Mass Spectrometry Facility University of Arizona MS Summer Workshop
Overview: Using mass spectrometry for the measurement and/or identification of proteins. Measuring whole proteins –Information about proteins is available on the internet –Limits due to instrument resolution, protein mass, matrix –method: MALDI/TOF –method: ESI + various analyzers Measuring peptides from proteins by MS –peptide mass mapping Gel separation steps to prepare for protein identification by MS/MS Identifying proteins from peptides by MS/MS
Proteins versus peptides Protein Peptides
Analysis of whole proteins Good news & bad news MALDI-TOF = measure with 1 or 2 protons ESI-Ion Trap = measure with many protons (high charge state) Result = mass accuracy not good enough to identify protein (but still useful!) –Mass accuracy decreases as size increases
Same protein, 2 ionization methods
MALDI/TOF – whole protein detected
ESI: Protein MW can be calculated from a protein’s charge distribution
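As a rough illustration of how a molecular weight can be worked out from a charge distribution, the sketch below takes two adjacent peaks of an ESI charge envelope and solves for the charge and the neutral mass. It assumes the two peaks differ by exactly one proton. The function names and the 16,951.5 u example mass are hypothetical and are not taken from the workshop data.

```python
PROTON = 1.00728  # mass of a proton in u


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge z of the peak at mz_high, assuming the peak at mz_low carries z + 1 protons."""
    # From mz_high = (M + z*p)/z and mz_low = (M + (z+1)*p)/(z+1)
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass M from an [M + zH]z+ peak."""
    return z * (mz - PROTON)


if __name__ == "__main__":
    true_mass = 16951.5                # hypothetical protein mass in u
    mz_15 = true_mass / 15 + PROTON    # simulated 15+ peak
    mz_16 = true_mass / 16 + PROTON    # simulated 16+ peak
    z = charge_from_adjacent_peaks(mz_15, mz_16)
    print(z, round(neutral_mass(mz_15, z), 1))
```

In practice every pair of neighbouring peaks gives its own mass estimate, and averaging them over the envelope is one simple way to reduce the error.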
We measure ISOTOPES (not averages) Example: the mass of carbon is 12.000 (not the average, 12.011). About 1.1% of carbon atoms are 13C. Peak broadening in high mass measurement
Theoretical isotope distribution for a small protein [figure labels: 1st isotope, 9th isotope] Peak broadening in high mass measurement
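A simplified way to see why the isotope envelope widens with mass is to treat only carbon and model the number of 13C atoms as a binomial draw. The sketch below does this with a 13C abundance of 1.1%. Ignoring H, N, O and S isotopes is an assumption made for brevity, and the carbon counts used are hypothetical.

```python
from math import comb

P13C = 0.011  # fraction of carbon atoms that are 13C


def carbon_isotope_pattern(n_carbons, max_extra=10):
    """Probability of finding k = 0..max_extra atoms of 13C among n_carbons carbons."""
    return [
        comb(n_carbons, k) * P13C**k * (1 - P13C) ** (n_carbons - k)
        for k in range(max_extra + 1)
    ]


def most_abundant_peak(n_carbons, max_extra=60):
    """Index of the tallest peak (0 = monoisotopic), searched up to max_extra 13C atoms."""
    pattern = carbon_isotope_pattern(n_carbons, min(n_carbons, max_extra))
    return max(range(len(pattern)), key=lambda k: pattern[k])


if __name__ == "__main__":
    for n in (50, 750):  # hypothetical peptide-sized and protein-sized carbon counts
        print(n, most_abundant_peak(n))
```

With 50 carbons the monoisotopic peak is the tallest. With 750 carbons the tallest peak sits at index 8, the ninth peak when the monoisotopic peak is counted as the first, and the monoisotopic peak itself is a very small fraction of the signal.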
Resolution = Mass Accuracy Peak broadening in high mass measurement No reflectron for high masses = reduced resolution Only multiply charged proteins observed (more peaks/mass unit)
Examples of post translational modifications Mass changes are difficult to identify in high mass measurements
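To make the point about high mass measurements concrete, the sketch below compares a few well-known monoisotopic modification mass shifts with the error window of a measurement. The relative accuracy of 0.1% is a hypothetical figure chosen for illustration and is not a specification of any instrument in this workshop.

```python
# Monoisotopic mass shifts (u) of some common modifications
PTM_SHIFTS = {
    "methylation": 14.016,
    "oxidation": 15.995,
    "acetylation": 42.011,
    "phosphorylation": 79.966,
}


def error_window(mass, rel_accuracy=1e-3):
    """Absolute mass uncertainty (u) for a given relative accuracy (assumed 0.1%)."""
    return mass * rel_accuracy


def resolvable(ptm, mass, rel_accuracy=1e-3):
    """True if the modification shifts the mass by more than the error window."""
    return PTM_SHIFTS[ptm] > error_window(mass, rel_accuracy)


if __name__ == "__main__":
    for mass in (1500.0, 20000.0, 100000.0):  # peptide, small protein, large protein
        visible = [name for name in PTM_SHIFTS if resolvable(name, mass)]
        print(mass, visible)
```

Under this assumption every listed modification stands out on a 1,500 u peptide, while none of them exceeds the error window on a 100,000 u protein.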
Computer Exercises -- Goals Exercise #2, Whole protein analysis –Explore Expasy information available for a protein –Find the theoretical MW of a protein –Find the amino acid sequence of a protein in FASTA format for use in another exercise –Explore the X-ray crystal structure of a protein Open Webpage:
Proteins versus peptides Protein Peptides
Identification of proteins from peptide analysis
Separate by 2-D (or 1-D) Gel
Remove protein from gel after cutting into peptides with an enzyme (trypsin)
We can identify hundreds of proteins in one experiment
Extracting and Separating proteins Extracting proteins from biological organisms – Results in complex mixture of proteins – May require detergents, etc. that complicate Mass Spec analysis – Remove contaminants (filtration, dialysis, SPE, etc.) Separating proteins – 1D SDS-PAGE: cross-linking controls the MW range separated; low-resolution technique, a single band can contain 10's to 100's of proteins –2D SDS-PAGE: best method for complex protein mixtures (IEF + SDS-PAGE) –Preparative isoelectric focusing (IEF) –Reverse phase HPLC –Size exclusion chromatography –Ion exchange chromatography –Affinity chromatography
2-D Electrophoresis 1st Dimension: Isoelectric Focusing (IEF) –Requires maximal resolution of a target group of proteins –Uses Immobiline DryStrip gels (various lengths and pH gradients) –IPGphor programmed to hydrate and separate proteins by pI (e.g. overnight) 2nd Dimension: Gel Separation –Apply the Immobiline DryStrip to the top of a gel –Separation by molecular weight is rapid (6-10 hours)
2-D Electrophoresis Standard Method: –Separate proteins on 2-dimensional gels –Spots (and changes) can be observed (manual or with computer aid). –Method is reproducible (multiple runs required) –Cut out and identify spots of interest Difference Gel Electrophoresis (DIGE) –Two or more samples for comparative analysis are labelled with different fluorescent dyes, mixed together, run on the same 2D gel, and interrogated with a multi-wavelength fluorescent scanner –Allows quantitation of subtle changes in protein expression levels between samples, without inter-gel variability –Following example: analysis of a Bordetella bronchiseptica enzyme knockout cell line, compared to wild type.
10/29/03 Gel 2: Multiplexed gel image [axes: pH 3 to pH 7; molecular weight, small to large]
10/29/03 Gel 2: Side-by-side cropped grayscale images of WT (Cy3) and ∆dnt (Cy5) [each panel: pH 3 to pH 7]
nanoLC-MS/MS identification of two differentially expressed protein spots. BB3856 – AZURIN: L.AAECSVDIAGTDQM#QFDK.K; A.AEC*SVDIAGTDQM#QFDK.K; K.QFTVNLK.H; K.DGIAAGLDNQYLK.A [∆dnt (Cy5)]. BB3856 – AZURIN: K.TADMQAVEK.D; K.VLGGGESDSVTFDVAK.L; K.DGIAAGLDNQYLK.A [WT (Cy3)].
In-Gel enzymatic digestion (trypsin most common)
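The peptides produced by a tryptic digest can be predicted from a protein sequence. The sketch below applies the commonly used rule that trypsin cleaves after K or R unless the next residue is P, with an optional number of missed cleavages. The function name and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from cleaving after K/R (not before P), allowing missed cleavages."""
    sequence = sequence.upper()
    # Positions where the chain is cut, including both termini
    cuts = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))

    peptides = []
    for i in range(len(cuts) - 1):
        # Join up to missed_cleavages extra fragments onto fragment i
        last = min(i + 1 + missed_cleavages, len(cuts) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides


if __name__ == "__main__":
    example = "AAKVFGTDMDNSRPEPKLLR"  # hypothetical sequence
    print(tryptic_digest(example))
    print(tryptic_digest(example, missed_cleavages=1))
```

Note that the R followed by P in the example is not cleaved, so the second peptide runs through to the next K.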
Computer Exercises -- Goals Exercise #3, 2-D Gel Electrophoresis –Find a gel image online containing a protein spot of interest –Explore the gel images of various organisms Open Webpage:
Analysis of peptides from proteins MALDI-TOF = measure mass of peptides –peptide mass mapping ESI-Ion Trap = measure mass/charge of peptide –PLUS can select and fragment (MS/MS) for more information Result = possible to identify a protein, or identify SNPs or modifications made to a protein
Peptide Mass Mapping using MALDI-TOF
MS of a peptide mixture by MALDI/TOF
Data Analysis for Peptide Mass Mapping Important data –multiple peaks –mass accuracy –confirming information (pI, approx. mass, organism, etc.) [Workflow diagram: unknown protein (?) → peptides → MS → peptide MWs found in selected databases (NDALYFPT..., SWDLTAL..., PTDLDVSY...) → identify and rank]
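The matching and ranking step can be sketched in a few lines: compute a theoretical MH+ for each candidate peptide, count how many observed masses fall within a tolerance, and sort the candidates by that count. This is a minimal sketch, not the scoring used by any particular search engine. The 50 ppm tolerance, the function names, and the second database entry are assumptions; the first entry reuses three azurin peptide sequences from the gel example.

```python
# Monoisotopic residue masses (u)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def mh_plus(peptide):
    """Theoretical monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, peptides, tol_ppm=50.0):
    """Number of observed masses within tol_ppm of any theoretical peptide mass."""
    theoretical = [mh_plus(p) for p in peptides]
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
        for obs in observed
    )


def rank_candidates(observed, database, tol_ppm=50.0):
    """Candidate proteins sorted by number of matched peptide masses, best first."""
    scores = {name: count_matches(observed, peps, tol_ppm) for name, peps in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    database = {
        "azurin_BB3856": ["QFTVNLK", "DGIAAGLDNQYLK", "TADMQAVEK"],
        "hypothetical_protein": ["VFGTDMDNSR", "AAK", "LLR"],
    }
    observed = [mh_plus(p) for p in database["azurin_BB3856"]]  # simulated mass list
    print(rank_candidates(observed, database))
```

A real search would also weigh the confirming information listed above (pI, approximate mass, organism), since a match count alone can be misleading for large proteins with many peptides.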
Computer Exercises -- Goals Exercise #4, Peptide Mass Mapping –Identify a protein from a peptide mass list –Confirm this identity by producing a theoretical mass list –Optional (for the speedy ones) identify more unknowns from mass lists Open Webpage:
Unknown proteins: 66 = Bovine Serum Albumin; 116 = beta-galactosidase from E. coli; 55 = glutamic dehydrogenase from bovine liver; 36 = glyceraldehyde-3-phosphate dehydrogenase from rabbit muscle
LC/LC-MS/MS for Complex Mixtures SCX = Strong cation exchange RP = Reverse Phase (C-18) Alternate: an increasing salt gradient step (moves some peptides onto RP), followed by an RP gradient (separates peptides, sends them to the mass spec) [Diagram: peptides from many proteins → SCX → RP → MS/MS] Results in thousands of mass spectra A computational challenge!
MS/MS Method Using ESI
Ion current over 60 min [panels: MS, MS/MS]
Peptide precursor ions observed by MS: MH+ and [M+2H]2+. Calculation of MH+ from the doubly charged ion: measured m/z x 2 = 1,142.4 ([M+2H]); subtracting one proton gives 1,141.4 ([M+H]+)
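The same calculation can be written for any charge state. In the sketch below the measured m/z of 571.2 is an assumption, inferred as half of the 1,142.4 value above, and the function names are hypothetical. The second helper estimates the charge from the spacing between isotope peaks, which is roughly 1.00335/z.

```python
PROTON = 1.00728           # mass of a proton in u
ISOTOPE_SPACING = 1.00335  # mass difference between 13C and 12C in u


def mh_from_mz(mz, z):
    """Singly protonated mass [M+H]+ computed from an [M + zH]z+ peak."""
    neutral = z * (mz - PROTON)
    return neutral + PROTON


def charge_from_isotope_spacing(spacing):
    """Charge state estimated from the m/z gap between neighbouring isotope peaks."""
    return round(ISOTOPE_SPACING / spacing)


if __name__ == "__main__":
    print(round(mh_from_mz(571.2, 2), 1))    # doubly charged precursor
    print(charge_from_isotope_spacing(0.5))  # peaks 0.5 m/z apart
```

Using the exact proton mass gives 1,141.39, which agrees with the rounded 1,141.4 obtained by subtracting 1.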
MS-MS of
Data Analysis for MS/MS Sequencing Method [Workflow diagram: unknown protein (?) → peptides → MS → MS/MS; peptide MWs found in selected databases (NDALYFPT..., SWDLTAL..., PTDLDVSY...) → theoretical spectra → compare → identify and rank]
y series ions for the peptide V F G T D M D N S R [figure: y-ion ladder, ion labels include y5, y6, y7, y8; residues read from the ladder: F Phe, G Gly, T Thr, D Asp, M Met, D Asp, N Asn]
Peptide fragment ions: peptide bond fragment ions (a2, b2, c2, x2, y2, z2), internal immonium ion, amino acid immonium ion
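The y series for VFGTDMDNSR can be computed directly from residue masses, which is essentially what a search program does when it builds a theoretical spectrum. The sketch below lists singly charged b and y ions using monoisotopic masses. The function names are hypothetical, and no modifications or neutral losses are considered.

```python
# Monoisotopic residue masses (u)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def precursor_mh(peptide):
    """Monoisotopic [M+H]+ of the intact peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def y_ions(peptide):
    """Singly charged y1..y(n-1): C-terminal fragments, each keeping water and a proton."""
    ions, total = [], WATER + PROTON
    for aa in reversed(peptide[1:]):
        total += MONO[aa]
        ions.append(total)
    return ions


def b_ions(peptide):
    """Singly charged b1..b(n-1): N-terminal fragments carrying a proton."""
    ions, total = [], PROTON
    for aa in peptide[:-1]:
        total += MONO[aa]
        ions.append(total)
    return ions


if __name__ == "__main__":
    peptide = "VFGTDMDNSR"
    print("MH+", round(precursor_mh(peptide), 2))
    for n, mass in enumerate(y_ions(peptide), start=1):
        print(f"y{n}", round(mass, 2))
```

The y1 ion of a peptide ending in R comes out at 175.12, and each b ion plus its complementary y ion sums to MH+ plus one proton, which is a convenient consistency check on an assigned spectrum.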
Peptide Sequencing: mass differences between adjacent fragment ions identify residues, e.g. 71 u = Ala, 115 u = Asp
Computer Exercises -- Goals Exercise #5, Peptide Sequencing & Protein ID –Identify a peptide (and its protein) from an MS-MS mass list Open Webpage:
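Reading a sequence from a fragment ladder amounts to looking up each mass difference in a table of residue masses. The sketch below does this with a tolerance of 0.3 u, which is an assumed value, as are the function names and the three example peak masses.

```python
# Monoisotopic residue masses (u)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_for_delta(delta, tol=0.3):
    """Residues whose mass lies within tol of the observed mass difference."""
    return sorted(aa for aa, mass in MONO.items() if abs(mass - delta) <= tol)


def read_ladder(peaks, tol=0.3):
    """Candidate residues for each gap between consecutive peaks of one ion series."""
    ordered = sorted(peaks)
    return [residue_for_delta(high - low, tol) for low, high in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    print(residue_for_delta(71))    # Ala
    print(residue_for_delta(115))   # Asp
    print(residue_for_delta(113.08))  # Leu and Ile cannot be told apart by mass
    # Hypothetical y-ion ladder; read low to high it runs from the C-terminus inward
    print(read_ladder([175.119, 246.156, 361.183]))
```

Gaps that match more than one residue, such as Leu/Ile or Lys/Gln at this tolerance, stay ambiguous and are returned as a list of candidates.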
Homology search to find protein function BLAST: Computer Exercise #6 Peptide sequences found for an “unknown” protein by Sequest database searching Find a possible function of this protein. Protein columns: Locus, Spectrum Count, Sequence Coverage, Length, Descriptive Name. Entry: CL _fgenesh_1_aa, 13.20%, 1013, Unknown. Peptide columns: FileName, XCorr, DeltCN, M+H+, Sequence. Entries: lb100404_ RLVVVNAKPTAASAVGLAGPGAADVLP FVEADLKKS; lb100404_ RHFFAAAAGQPPPQY.L
Computer Exercises -- Goals Exercise #6, BLAST Search –Perform a BLAST search for a peptide sequence that was found in the previous exercise –Observe the other proteins with similar sequence –Not all organisms have full genomic information, so homology searching is useful for protein identification Open Webpage:
Computer Exercises -- Goals Exercise #7, Find an unknown protein –Use the same method as in #4 to find an unknown peptide –Information provided: MS spectrum, MS/MS spectrum Open Webpage:
Open source software for high-throughput proteomics: X! Tandem Current trend toward free software The Global Proteome Machine –X! Tandem –Sequenced peptide libraries –Software available to programmers
Computer Exercises -- Goals Exercise #8, Find an unknown protein –X! Tandem identification of the same spectra Open Webpage: