Global Internal Standard Technology (GIST) – A Tool for Protein Expression Analysis
Xiang Zhang
Bindley Bioscience Center, Purdue University
17 January 2019
Identification of changes in protein expression and modification is essential for understanding biological processes
  Cellular response to stimuli is reflected by changes in protein expression, post-translational modification, or processing
  Origin of stimuli
    Chemical (drug, toxin, etc.)
    Physical (cell interaction, changes in temperature, pressure, etc.)
    Combination of both (disease)
  The search for differences in protein expression and modification is called comparative proteomics.
The Key Element of Comparative Proteomics is Quantitation of Changes in Protein (Peptide) Levels
  It is not easy to determine a change in a single protein level (for example, by Western blot)
  In comparative proteomics, the challenge is to identify changes for as many proteins (peptides) as possible
  There are several approaches for quantitation
    Pattern recognition approach
    Isotopic labeling approach
Pattern Recognition
  Individual analysis of samples A and B by LC-MS
  Alignment, normalization and peak intensity comparison
  Works well but has some potential issues
    Strongly depends on LC-MS system reproducibility
    The intensity of any peak is not only a function of peptide concentration; it also depends on analyte composition
    It is difficult to obtain direct fold changes between samples
  Some of these potential issues could be overcome by using isotopic labeling strategies.
Isotopic Labeling
  Biosynthetic labeling
    In vivo incorporation of isotopically labeled species (growing cells in media enriched in 14N vs. 15N)
    Impossible to use with human subjects
  Post-biosynthetic labeling
    Labeling amino groups (GIST)
    Labeling cysteine residues (ICAT)
    18O incorporation during proteolysis
Isotopic Labeling – Basic principles (ICAT)
GIST – Isotopic Labeling Technique
  Concept was first introduced by the Fred Regnier lab at Purdue University in 2000
  Labeling reagents
    Succinimidyl propionate (12C vs. 13C)
    Succinimidyl acetate (1H vs. 2H)
  Target group
    Primary amine (N-terminus, lysine residue)
  Sample is labeled following digestion
GIST – Chemical structure of Labeling Reagents
  [Figure: structures of the acetate-based and propionate-based reagents]
  Light forms: H = 1H, C = 12C
  Heavy forms: H = 2H, C = 13C
Generating primary amines
  Trypsin cleaves polypeptides C-terminal to lysine and arginine
    -NH-CH(R1)-CO-NH-CH(R2)-CO-  --trypsin-->  -NH-CH(R1)-COOH + H2N-CH(R2)-CO-
  Primary amine groups are present globally: every peptide generated will be labeled by GIST reagents
  It will be shown in subsequent slides that amino groups generated in proteolysis can be labeled. 18O labeling of carboxyl groups can also be used. There are, however, some "tricks" associated with 18O labeling that we are still in the process of working out. It is not totally clear that the 18O process will be as quantitative as the acetate labeling described here.
Amino groups are easily acylated
  All primary amino groups are labeled. Note that Arg is not acetylated.
  This slide shows the use of N-acetoxysuccinimide for labeling peptides. N-hydroxysuccinimide esters have the great advantage that they can be used for acylation in aqueous solution and still provide quantitative derivatization.
  In the case of glycopeptides and Ser/Thr peptides, there is occasionally some acetylation of hydroxyl groups. These esters are easily cleaved by treating with hydroxylamine at pH 10, so addition of hydroxylamine is a standard step in the procedure to remove them.
GIST – Experimental Design
  Normal (State 1) and Disease (State 2) samples: digestion
  Light labeling (1H or 12C) and Heavy labeling (2H or 13C)
  Combine Light & Heavy
  RP LC/MS (Q-Tof)
  Ratio analysis using GISTool V1.1™
  Data dependent MS/MS and/or targeted ion MS/MS
Labeling samples from two different sources
  Peptide A, unlabeled: MW 500
  Sample 1, peptide A after labeling: MW 542
  Sample 2, peptide A after labeling: MW 545
  After derivatization these samples are mixed. An internal standard is created for each peptide.
  This is a global labeling strategy in that all tryptic peptides are derivatized except those that are amino-terminally blocked. Control samples are acetylated with 1H3-acetate and experimental samples with 2H3-acetate. After differential labeling the samples are mixed, separated by multidimensional chromatography and analyzed by mass spectrometry. It is important that no resolution of the isotopic species occurs prior to mass spectrometry.
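The mass arithmetic behind the 500 / 542 / 545 example can be sketched as follows. This is a minimal illustration, not GISTool code: the function names and the example sequences are hypothetical, and the sketch assumes that the N-terminus and every lysine are acetylated exactly once. The mass increments are the monoisotopic values for acetylation with 1H3- and 2H3-acetate.

```python
# Monoisotopic mass added per labeled primary amine (an acetyl group replaces one H).
LIGHT_ACETYL = 42.010565   # net C2H2O, from 1H3-acetate
HEAVY_ACETYL = 45.029395   # same group with three 2H in place of 1H, from 2H3-acetate


def count_label_sites(sequence, n_term_blocked=False):
    """Count primary amines: the N-terminus (unless blocked) plus each lysine."""
    return sequence.count("K") + (0 if n_term_blocked else 1)


def labeled_masses(peptide_mass, sequence, n_term_blocked=False):
    """Return (light, heavy) masses after acetylation of every primary amine."""
    sites = count_label_sites(sequence, n_term_blocked)
    return (peptide_mass + sites * LIGHT_ACETYL,
            peptide_mass + sites * HEAVY_ACETYL)


def doublet_spacing(sequence, charge, n_term_blocked=False):
    """Expected light/heavy separation on the m/z axis for a given charge state."""
    sites = count_label_sites(sequence, n_term_blocked)
    return sites * (HEAVY_ACETYL - LIGHT_ACETYL) / charge


# Hypothetical peptide of MW 500 with a single labeling site (N-terminus, no Lys).
# The sequence string is only used to count sites; it is not a real MW 500 peptide.
light, heavy = labeled_masses(500.0, "GASR")
print(round(light), round(heavy))  # prints 542 545
```

A peptide with one lysine carries two labels, so `doublet_spacing("GASK", 2)` gives roughly 3 m/z units, that is, a 6 Da difference observed on a 2+ ion.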
Example of GIST labeled peptide analyzed by Mass Spectrometry
  [Figure: mass spectrum, intensity vs. m/z]
  m/z 542: peptide from control sample labeled with light form
  m/z 545: peptide from experimental sample labeled with heavy form
How to Analyze GIST Data?
  Peak picking
  Identify peptide peak cluster
  Doublet identification
  Ratio calculation
  [Figure: example doublets at m/z 600.79 / 603.81 and 627.25 / 630.28]
Process Flow Chart of GISTool
Data Acquired on qTOF
  Profile data
  Centroid data
Chemical Noise Filter
  I. Peak Density Filter
    Spectrum is segmented
    Noise level of each segment is calculated based on local peak density
    Noise level is smoothed across the spectrum
  II. Spike Filter
    One third of the user-defined peak width is used as the minimum width of an isotopic peak
Charge Deconvolution – simple case
  A peptide can carry different charges in an ESI experiment
  Peptide charge can be used for doublet recognition
  Some overlapped peptides can be resolved
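In the simple case, the charge state can be read from the spacing between adjacent isotopic peaks, which is close to 1/z on the m/z axis. The sketch below illustrates that idea only; it is not the GISTool algorithm, and the function names, the tolerance, and the maximum charge are assumptions chosen for illustration.

```python
C13_SPACING = 1.003355   # mass difference between 13C and 12C, in Da
PROTON = 1.007276        # mass of a proton, in Da


def infer_charge(isotope_mzs, max_charge=4, tol=0.02):
    """Return the charge whose expected isotope spacing best matches the data, or None."""
    mzs = sorted(isotope_mzs)
    if len(mzs) < 2:
        return None
    spacings = [b - a for a, b in zip(mzs, mzs[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Pick the candidate charge with the closest expected spacing.
    best = min(range(1, max_charge + 1),
               key=lambda z: abs(mean_spacing - C13_SPACING / z))
    if abs(mean_spacing - C13_SPACING / best) > tol:
        return None
    return best


def neutral_mass(mz, charge):
    """Convert the m/z of a protonated ion with the given charge to its neutral mass."""
    return (mz - PROTON) * charge


# Hypothetical isotopic cluster with 0.5 m/z spacing.
cluster = [600.79, 601.29, 601.79]
charge = infer_charge(cluster)
print(charge)  # prints 2
```

Once the charge is known, `neutral_mass` places peaks from different charge states of the same peptide on a common mass scale, which is what allows the charge to be used in doublet recognition.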
Charge Deconvolution – complicated case
  White and red peptides both have +1 charge, but they are shifted by 0.5 Da
  Identify peak group
  Find base peak
  Try different charges
  Assign +2 to the group
  Check isotopic peak profile
  Flag white peaks
  Check m/z spacing of the white peaks
  Assign white peaks as +1
  Search for other white peaks in the group
  Try +1 on red peaks
  Search for other red peaks in the peak group
Deisotope
  Quantitatively resolve overlapped peptide peaks
  Simplify peak list
  [Figure: simple case with peaks M0, M1, M2, M3; overlapped peptides with peaks M0, M1, M2+M0, M3+M1, M4+M2]
How Do We Deisotope?
  Peptide isotopic peak profile can be calculated if the AA composition is known.
  For a peptide CmHnOoNpSq, the profile follows from expanding
    (12C+13C)^m (1H+2H)^n (16O+17O+18O)^o (14N+15N)^p (32S+33S+34S+36S)^q
  The peptide sequence is unknown during data processing, so the approach is
    in-silico prediction of the isotopic peak profile
    comparing the in-silico profile with experimental data
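Expanding the polynomial above amounts to convolving the isotope abundance vector of each element once per atom. The following is a minimal sketch of that calculation at nominal-mass resolution. The abundance values are approximate natural abundances, and the function names and example compositions are hypothetical; how GISTool predicts profiles when the composition is unknown is not specified here.

```python
# Approximate natural isotope abundances, indexed by nominal mass offset from
# the lightest isotope (for example, sulfur: 32S, 33S, 34S, 35S (none), 36S).
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}


def convolve(a, b, n_peaks):
    """Multiply two isotope polynomials, keeping only the first n_peaks terms."""
    out = [0.0] * n_peaks
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < n_peaks:
                out[i + j] += x * y
    return out


def isotope_profile(composition, n_peaks=6):
    """Return relative intensities of M0, M1, ... scaled so the base peak is 1.0."""
    profile = [1.0] + [0.0] * (n_peaks - 1)
    for element, count in composition.items():
        for _ in range(count):
            profile = convolve(profile, ABUNDANCE[element], n_peaks)
    base = max(profile)
    return [p / base for p in profile]


# Hypothetical small and large compositions.
small = isotope_profile({"C": 30, "H": 50, "N": 8, "O": 10})
large = isotope_profile({"C": 200, "H": 300, "N": 50, "O": 60, "S": 2})
print([round(p, 3) for p in small])
print([round(p, 3) for p in large])
```

For the small composition the monoisotopic peak M0 is the base peak, while for the large one a heavier isotopic peak dominates, which is the molecular-weight dependence the slides refer to.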
Peptide Isotopic Distribution Can Be Predicted
  [Figure: relative intensity vs. molecular weight (amu) for a small peptide and a large peptide; Peptide A and Peptide B compared]
  Peptides A and B have the same molecular weight but different AA composition
  Peptide isotopic peak profile varies significantly with MW
  At a given MW, the variation of the isotopic peak profile is not significant
Correlating in-silico Results with Experimental Results
  Detect the significantly intense isotopic peaks by comparing the experimentally measured isotopic peak profile with the in-silico predicted peak profile
    No: there is no other peptide
    Yes: a peak from another peptide contributes to the current peak
  [Figure: relative intensity vs. m/z and MW]
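The comparison step can be sketched as a test for isotopic peaks that are much more intense than predicted. This is an illustrative simplification, not the GISTool criterion: the function name and the tolerance are hypothetical, and the sketch assumes that the monoisotopic peak M0 is itself free of overlap so it can be used for scaling.

```python
def flag_overlapped_peaks(observed, predicted, tolerance=0.2):
    """Return indices of isotopic peaks whose scaled intensity exceeds the prediction.

    observed  -- measured intensities of M0, M1, M2, ...
    predicted -- in-silico relative intensities of the same peaks
    tolerance -- allowed excess, in units of the predicted relative intensity scale
    """
    if not observed or not predicted or observed[0] <= 0 or predicted[0] <= 0:
        raise ValueError("M0 must be present with positive intensity")
    scale = predicted[0] / observed[0]  # put both profiles on the same scale at M0
    flagged = []
    for i, (obs, pred) in enumerate(zip(observed, predicted)):
        if obs * scale - pred > tolerance:
            flagged.append(i)
    return flagged


# Hypothetical profiles: a second peptide starts contributing at M2.
predicted = [1.0, 0.6, 0.2, 0.05]
observed = [100.0, 61.0, 55.0, 30.0]
print(flag_overlapped_peaks(observed, predicted))  # prints [2, 3]
```

An empty result corresponds to the "No" branch above, and a non-empty one to the "Yes" branch, where the excess intensity would be attributed to another peptide.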
Charge Deconvolution and Deisotoping Process
Doublet Recognition
  Mass difference
    Shifting according to labeling reagent
    GIST acetate: +3, +6, …
    ICAT: +8, +16, …
  Retention time
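The two criteria above, mass difference and retention time, can be combined in a simple pairing routine. The sketch below is an assumption-laden illustration rather than GISTool itself: the `Peak` type, the tolerances, the maximum number of labels, and the assignment of charge 1 to the example peaks are all hypothetical. The default shift of 3.0188 Da corresponds to one 2H3-acetyl label; a different reagent would use a different `label_shift`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float
    charge: int
    rt: float  # retention time, in minutes


def find_doublets(peaks, label_shift=3.0188, max_labels=3, mass_tol=0.02, rt_tol=0.2):
    """Pair peaks whose mass difference is a whole number of label shifts.

    Returns a list of (light, heavy, n_labels) tuples.
    """
    doublets = []
    ordered = sorted(peaks, key=lambda p: p.mz)
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            if heavy.charge != light.charge:
                continue
            if abs(heavy.rt - light.rt) > rt_tol:
                continue
            delta = (heavy.mz - light.mz) * light.charge  # difference in Da
            n = round(delta / label_shift)
            if 1 <= n <= max_labels and abs(delta - n * label_shift) <= mass_tol:
                doublets.append((light, heavy, n))
    return doublets


# The m/z values of the two example doublets shown earlier, plus one unpaired peak.
peaks = [
    Peak(600.79, 1, 10.00), Peak(603.81, 1, 10.05),
    Peak(627.25, 1, 12.00), Peak(630.28, 1, 12.10),
    Peak(700.00, 1, 12.00),
]
for light, heavy, n in find_doublets(peaks):
    print(light.mz, heavy.mz, n)
```

Multiplying the m/z difference by the charge before comparing is what makes a 3 m/z separation on a 2+ ion count as a 6 Da, two-label doublet.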
Calculation of Peak Ratio
  I. Ratio is calculated in each scan: good for 12C/13C pairs
  II. Ratio is calculated after smoothing peptide peaks at chromatographic level: good for H/D or 16O/18O pairs
  Workflow: SG smooth each peak, peak detection, doublet recognition, peptide regulation
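The two ratio strategies can be contrasted with a short sketch. Several things here are assumptions: "SG" is taken to mean Savitzky-Golay smoothing, a fixed 5-point quadratic window is used, the per-scan ratios are summarized by their median, and the ratio is reported as heavy over light. None of these choices is confirmed as what GISTool does.

```python
from statistics import median

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(trace):
    """Smooth a chromatographic trace; the two points at each edge are left unchanged."""
    smoothed = list(trace)
    for i in range(2, len(trace) - 2):
        smoothed[i] = sum(c * trace[i + k - 2] for k, c in enumerate(SG5))
    return smoothed


def per_scan_ratio(light, heavy):
    """Strategy I: heavy/light ratio in each scan, summarized by the median."""
    ratios = [h / l for l, h in zip(light, heavy) if l > 0 and h > 0]
    if not ratios:
        raise ValueError("no scan contains both light and heavy signal")
    return median(ratios)


def smoothed_area_ratio(light, heavy):
    """Strategy II: ratio of summed intensities after smoothing each trace."""
    light_area = sum(sg_smooth(light))
    if light_area <= 0:
        raise ValueError("light trace has no signal")
    return sum(sg_smooth(heavy)) / light_area


# Hypothetical co-eluting traces with a true heavy/light ratio of 2.
light = [0.0, 10.0, 40.0, 100.0, 40.0, 10.0, 0.0]
heavy = [2 * x for x in light]
print(per_scan_ratio(light, heavy))
print(smoothed_area_ratio(light, heavy))
```

The per-scan ratio relies on the light and heavy forms co-eluting, which fits 12C/13C pairs. Deuterated peptides can shift slightly in retention time, so comparing whole smoothed peaks is the more robust choice for H/D pairs.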
Ranking Doublets
  Rank: 1, 2, 3, 4, 5
  Categories: good doublet, doublet complex, singlet
Decharging to Rescue Mixed Doublets
  Peptide A: +2, light & heavy
  Peptide B: +2, light & heavy
  [Figure: m/z axis from 500 to 505]
  6 possible combinations
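The count of 6 comes from choosing 2 of the 4 peaks (two peptides, each with a light and a heavy form). Knowing the charge then restricts which pairs have an allowed spacing. The sketch below illustrates this with hypothetical m/z values, a hypothetical tolerance, and the 2H3-acetyl shift of 3.0188 Da per label; it is not the GISTool procedure.

```python
from itertools import combinations

LABEL_SHIFT = 3.0188  # Da per 2H3-acetyl label (heavy minus light)


def consistent_pairs(mzs, charge, max_labels=3, tol=0.02):
    """Enumerate all peak pairs and keep those spaced by a whole number of labels."""
    pairs = list(combinations(sorted(mzs), 2))
    allowed = [n * LABEL_SHIFT / charge for n in range(1, max_labels + 1)]
    kept = [(lo, hi) for lo, hi in pairs
            if any(abs((hi - lo) - spacing) <= tol for spacing in allowed)]
    return pairs, kept


# Hypothetical interleaved peaks of two 2+ peptides, each carrying two labels.
peaks = [500.00, 501.20, 503.02, 504.22]
pairs, kept = consistent_pairs(peaks, charge=2)
print(len(pairs))  # prints 6
print(kept)        # prints [(500.0, 503.02), (501.2, 504.22)]
```

Of the six candidate pairs, only the two with a spacing consistent with the charge and label shift survive, which is how the charge assignment helps untangle the mixed doublets.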
Overview of MS Proteomics Platform
  Sample preparation: proteins, digest, GIST label
  Separation: RPLC/CapLC (purified & separated)
  Ionization: +ESI
  Mass spectrometer: e.g. Ion Trap, Q-Tof
  MS-only: survey scan and ion chromatogram (stop here for quantitative profiling)
  MS/MS: select ion, fragmentation pattern
  Protein identification: SEQUEST correlation analysis against theoretical MS/MS spectra from a protein database
Typical GIST Doublet – Light:Heavy shows 3 m/z unit separation (6 Da difference) since 2+ ion
  [Figure: intensity vs. m/z, doublet peaks separated by 3 m/z units]
  State 1 (normal): 1H or 12C
  State 2 (disease): 2H or 13C
Are Singlets Ever Observed?
  GIST-BSA peptides: 46 doublets, 2 singlets
  Example singlet: PQVSTPTLVEVSR (Pro has no primary amine, also no Lys present)
  There is usually a scientific explanation for singlets
Effectiveness of GIST Technology
  Peak ratio distribution is rather tight
  Labeling efficiency is > 95%; no mixed populations of labeled and unlabeled peptides are observed
  13C labeled peptides show greater deviation from the mean than 2H labeled peptides
  Mean ratio values below the expected 1.0 and 3.0 are most likely due to experimental error
GIST Results for 8 Protein Mixture
  Experimental design: create an unknown mixture of 8 standard proteins in different concentrations and different ratios to test the entire GIST process (labeling, MS, GISTool, etc.)
  [Photo: Jiri creating secret mix]

  Protein              Actual Ratio   Exp. Ratio            # peptides ID’d
  BSA                  5:1            4.8:1                 17
  Transferrin          3:1            3.3:1                 10
  Glucose oxidase      1:3            1:2.1                 13
  Lysozyme             1:1            1.1:1                 4
  Carbonic Anhydrase   7:1            5.1:1                 4
  Lactoglobulin        10:1           only Light detected   7
  Myoglobin            1:1            1.3:1                 1
  Ovalbumin            1:5            N/A                   unidentified

  Ratios determined by GISTool from MS only data
  Proteins identified by MS/MS sequences
  Ratios were successfully determined for most proteins in mix
A Real Sample: Human Serum
  Experimental design
    Samples: 1 normal vs. 1 colon cancer
    Protein level fractionation
      SAX: 175, 225, 300, 1000 mM NaCl
      RP: 40%, 100% organic
      10 total fractions
    No peptide level fractionation other than RP-LC/MS
  A single protein fraction (175 mM, 40% ACN) from 1 normal and 1 diseased sample was digested separately with trypsin, GIST labeled, and mixed
Serum Fold Changes Using Deuterium and 13C GIST Reagents
  [Figure: two panels of log10(Ratio), 0.1 to 10, vs. m/z, 350 to 1600. Left: GIST Acetate, Normal (1H) vs. Disease (2H). Right: GIST Propionate, Normal (12C) vs. Disease (13C)]
  Only well-resolved doublets ranked 1-3 by GISTool are shown
  Only fold changes ≥ 2:1 were considered significant
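The two selection rules (rank 1-3 and at least a 2:1 change in either direction) can be expressed as a small filter. The record layout, field names, and example values below are hypothetical; they do not describe the GISTool output format.

```python
def fold_change(ratio):
    """Express a heavy/light ratio as a fold change >= 1, regardless of direction."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else 1 / ratio


def significant_doublets(doublets, min_fold=2.0, max_rank=3):
    """Keep well-ranked doublets whose fold change meets the threshold."""
    return [d for d in doublets
            if d["rank"] <= max_rank and fold_change(d["ratio"]) >= min_fold]


# Hypothetical doublet records.
doublets = [
    {"mz": 600.79, "ratio": 10.0, "rank": 1},  # kept: 10-fold up
    {"mz": 700.10, "ratio": 1.3, "rank": 1},   # dropped: below 2-fold
    {"mz": 810.40, "ratio": 0.4, "rank": 2},   # kept: 2.5-fold down
    {"mz": 900.20, "ratio": 5.0, "rank": 4},   # dropped: rank worse than 3
]
for d in significant_doublets(doublets):
    print(d["mz"], d["ratio"])
```

Taking the reciprocal of ratios below 1 treats up- and down-regulation symmetrically, matching the symmetric log scale of the plots.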
Largest Fold Change Identified in Human Serum
  LC/MS: 10:1 ratio, heavy (colon cancer) to light (normal)
  LC/MS/MS: identified as deoxyhemoglobin from MS/MS data
  Up-regulation of deoxyhemoglobin indicates inadequate oxygen levels in blood
  Diseases such as cancer usually result in deficient oxygen supplies to tissues
GIST Fold-Change Results from Normal vs. Diseased Serum
  Identifications from a single data-dependent LC/MS/MS experiment
Conclusions
  GIST is an efficient global peptide isotopic labeling technique that can be used for protein expression analysis and is compatible with many choices of chromatography
  GISTool is a very successful software algorithm for analyzing any isotope-labeled data generated by high-resolution MS (ICAT, etc., not just GIST)
  Since GIST can readily provide direct fold changes, experiments targeting PTM and/or post-translational processing can be designed
  Peptide fractionation will be necessary for complex samples since isotopic labeling strategies double the complexity of samples