Protein Sequencing Research Group: Results of the PSRG 2012 Study Terminal Sequencing of Standard Proteins in a Mixture Year 1 of the 2-year Study.


1 Protein Sequencing Research Group: Results of the PSRG 2012 Study Terminal Sequencing of Standard Proteins in a Mixture Year 1 of the 2-year Study

2 Current PSRG Members
- Henriette Remmer (Co-Chair), University of Michigan
- Jim Walters (Co-Chair), Sigma-Aldrich
- Robert English*, University of Texas Medical Branch
- Pegah Jalili*, Sigma-Aldrich
- Viswanatham Katta, Genentech, Inc
- Kwasi Mawuenyega, Washington University School of Medicine
- Detlev Suckau, Bruker Daltonics
- Bosong Xiang, Monsanto, Co.
- Jack Simpson (EB liaison), United States Pharmacopeia
* new members added in 2011

3 PSRG 2012/13 – Study Background and Design
Status of terminal sequencing:
- The field is in the midst of a technology transition from classical Edman sequencing to mass spectrometry (MS) based sequencing.
- Both techniques have distinct strengths and weaknesses, and both have a role in biochemical research.
- With this complementary role recognized, we attempt to push the capabilities of the various sequencing techniques, namely terminal sequencing of proteins in a mixture.
Concept of the 2012 study, terminal sequencing of proteins in a mixture:
- Sequencing proteins in a mixture requires separation of the proteins prior to analysis.
- Edman sequencing: SDS-PAGE and electroblotting prior to analysis; well established in most core facilities.
- MS-based sequencing: LC separation necessary prior to analysis; not well established in most core facilities.
=> PSRG designed a 2-year study
YEAR 1: Terminal sequencing and identification of three separated standard proteins
YEAR 2: Same three proteins distributed, this time in a mixture

4 PSRG 2012 Year 1: Study Objective To obtain N-terminal sequence information on three standard proteins supplied as separated samples.

5 2011 Study Design – The Samples
- Participants were asked to analyze the samples for terminal sequencing using any technology available.
- Participants obtained all three proteins with ID in sufficient amounts to sequence each protein utilizing all three technologies. Feasibility of analysis had been validated by PSRG members.
- Participants also filled out a survey; all responses were kept anonymous.

Protein Name | Amount Provided | N-terminally blocked? Fusion protein? | Comments
BSA | 1 mg | No | Reference protein/calibrant
Protein A | 3 x 100 pmol | Yes | Fusion protein with blocked N-terminus
Endostatin | 3 x 100 pmol | No | Contains two N-terminal variants

6 Participation and Survey results
- 25 laboratories from 12 countries requested samples for Edman sequencing, and most of the labs (23) also for MS sequencing.
- 14 of the 25 participating laboratories (56%) completed the survey.
- 7 of the 14 labs utilized Edman sequencing, 6 top-down MS and 6 bottom-up MS.
Out of 14 respondents:
- 9 labs analyzed the reference protein BSA; 8 correctly determined the N-terminus.
- 13 labs analyzed Protein A; 5 correctly determined the N-terminus.
- 14 labs analyzed Endostatin; 12 correctly determined the N-terminus, but only 7 identified the presence of the second N-terminus.
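The counts on this slide can be turned into success rates with a few lines of arithmetic. The sketch below is illustrative only: the dictionary name `SURVEY` and the helper `success_rate` are made up for this example, and the numbers are simply the counts quoted on the slide.

```python
# Counts transcribed from the survey slide: (labs that analyzed, labs correct).
SURVEY = {
    "BSA": (9, 8),
    "Protein A": (13, 5),
    "Endostatin": (14, 12),
}


def success_rate(analyzed: int, correct: int) -> float:
    """Percentage of analyzing labs that reported the correct N-terminus."""
    return 100.0 * correct / analyzed


if __name__ == "__main__":
    for protein, (analyzed, correct) in SURVEY.items():
        rate = success_rate(analyzed, correct)
        print(f"{protein}: {correct}/{analyzed} = {rate:.1f}%")
```

Read this way, the blocked fusion protein (Protein A) stands out as the hardest of the three samples.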

7 Survey Response Results

8 Purification and separation method before analysis

9 N-Terminal Techniques: Edman Degradation

10 Edman Workflows PSRG 2012 Samples
Sample handling:
- Used sample as provided (5)
- SDS-PAGE, blotting on PVDF (2)
- Blotting on PVDF (1)
Instruments:
- ABI Procise: 4 x 494 HT, 1 x 492 cLC, 2 x 494 cLC
- Shimadzu PPSQ-33A

11 Edman sequencing of Protein A (C10)
Protein A: fusion protein, N-terminus blocked
Workflow: 100 pmol; de-blocking (PGAP); Polybrene-precycled glass fiber filters; ABI Procise Biosystems Model 494HT
Sequence 1: M (Met) LRPVETP
C10: LRPVETP

12 Edman sequencing of Endostatin (A00)
Blotting on PVDF; Shimadzu PPSQ-33A; H2O with 0.1 % TFA
Probability 1: position 4 Proline to Arginine
Probability 2: position 7 Histidine to Glutamine
Initial Yield: 36.95 %
Repetitive Yield: 84.98 %
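The initial and repetitive yields reported here determine how quickly the Edman signal fades with cycle number. As a rough sketch, using the common approximation that the recovered amount at cycle n is loaded amount x initial yield x repetitive yield^(n-1): the function name `expected_pmol` and the 100 pmol load are assumptions for illustration, not values stated for this run.

```python
def expected_pmol(loaded_pmol: float, initial_yield: float,
                  repetitive_yield: float, cycle: int) -> float:
    """Approximate PTH-amino acid recovery at a given Edman cycle (1-based).

    Assumes a constant repetitive yield across cycles, which is a
    simplification of real sequencer behaviour.
    """
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return loaded_pmol * initial_yield * repetitive_yield ** (cycle - 1)


if __name__ == "__main__":
    # Yields from the slide; the 100 pmol load is a hypothetical value.
    for cycle in (1, 5, 10, 15):
        amount = expected_pmol(100.0, 0.3695, 0.8498, cycle)
        print(f"cycle {cycle:2d}: ~{amount:.1f} pmol")
```

With a repetitive yield near 85%, the signal falls steeply over the first 15 cycles, which is consistent with the comparatively short Edman reads reported for this sample.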

13 Edman sequencing of Endostatin (A00)
Sequence 1: DFQPVLHLVALNSPL
A00/Variant 1: DFQPVLHLVALNSPL
Sequence 2: HSHRDFQPVLHLVAL
A00/Variant 2: RQ
Sequence verification: with BLASTP
Information about the sequence: SwissProt output

14 Summary of N-terminal sequencing result
Sample Description | Lab ID | Amino acid sequence

BSA
Y20 | DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQXPFDEHVKLVN
C10 | DTHKSEIAHRFKDLGEEHFKGLVLIAFSQY
N32 | DTHKSEIAHRFKDLGEEHFKGLVLI
A00 | DTHKSEIAHRFKDLGEEHFKGLVLIAFSQY

Protein A
Y20 | FLRPVETPTREIKKLDGLAQHDEAQQNAFYQVLNMPN
Y20 | MFLRPVETPT
C10 | LRPVETPTREIKKLDGLAQHDEAQQNAFYQVL
N32 | XLRPVETPXREIKKL
A00 | MLRPVETPTREIKKLDGL
S10 | XLRPVETPTREIKKLDGLAQHDEAQQNA
V00 | FLRPVETPTREIKKLDGLAQHDEAQQNAFYQVLNMPN

Endostatin Seq. 1
Y20 | DFQPVLHLVALNSPLSGGMRGIRGADFQXFQQA
C10 | DFQPVLHLVALNSPLSGGMRGIRGADFQCFQQAR
E20 | DFQPVLHLVALNSPLSGGMRGIRGADFQCFQQARAVGLAGT
N32 | DFQPVLHLVALNSPLSGGMRGI
A00 | DFQPVLHLVALNSPL
S10 | DFQPVLHLVALNSPLSGGMRG

Endostatin Seq. 2
Y20 | HSHRDFQP
C10 | HSHRDFQPXLHXXALNXXXSGGM
E20 | HSHRDFQPVLHLVALNSPLSGGMRGIRGADFQC
N32 | HSHRDFQPVXHXVALNS

15 PSRG 2011 Edman Conclusions & Observations
- All labs returned N-terminal data that correlate well with the published protein sequences.
- Edman sequencing can produce data with and without prior separation (SDS-PAGE and chromatography).
- No C-terminal data was produced with Edman.
- If the protein is N-terminally blocked, the reaction will not proceed for most, but not all, modifications.
- The reagents for Edman sequencing are very expensive.
- Edman sequencing allows for direct determination of the protein's N-terminal sequence.

16 N-Terminal Techniques Overview: MS Techniques

17 Mass Spectrometry Methods Used
Top-Down Sequencing (no digests):
- ISD, T³: AB Sciex 4800 MALDI-TOF/TOF MS
- ISD, T³: Bruker Ultraflex MALDI-TOF/TOF MS
- ETD, CID: Bruker maXis 4G UHR-QTOF
Bottom-Up MS/MS (digests):
- MALDI-TOF/TOFs: AB/Bruker
- ESI-Orbitrap: Thermo
Only Top-Down N-term results were returned. Some participants used Bottom-Up MS as a validation step.

18 Top-Down Experimental
Sample separation: HPLC; direct infusion; as provided
Solvents: 0.1% TFA; MeOH/H2O/HOAc; 6M GndHCl; various organic/H2O/acid
Top-Down instrumentation: Bruker Ultraflex; Bruker UltrafleXtreme; Bruker Autoflex speed; AB Sciex 4800; Triversa Nanomate; Agilent 1200; Bruker maXis 4G
Fragmentation: ISD/T³; ISD; ETD; CID

19 Software used for MS Top-Down Analysis
- BioTools 3.2: sequence tags, automatic de-novo sequencing, trigger Mascot TD searching, result visualization, terminal assignments, TD report generation (Bruker)
- Mascot 2.3: TD and BU database searches (Matrix Science)
- BLAST/MS-BLAST: protein identification based on sequence tags (NIH, Harvard/EMBL)
- ISDetect: sequence tags, semi-automatic de-novo sequencing, result visualization (Genentech, Y Gan et al, in prep.)

20 The Top-Down MS Standard Analysis Strategies
- MW determination: check sample quality + final QC
- ETD/ISD: obtain internal sequence tags
- ID protein: e.g. Mascot search
- Extend sequence towards the N-terminus (and C-terminus alike)
- Compare with obtained protein sequences (incl. PTMs)
- T³-sequencing, i.e. MS/MS analysis of MALDI-ISD fragments
- Edman sequencing
Problems: unknown terminal modifications (Sample B), fusion proteins (Sample B), ragged ends (Sample C)
Example sequence shown: DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCP

21 BSA ISD Spectrum in DAN matrix (PSRG123): good calibrant for ISD spectra

22 Sample A: BSA, ISD + Edman (C10), following the basic strategy
BSA sequence accession number: AAI02743
- c-ions in the MALDI-ISD spectrum revealed the sequence from Arg10 to Tyr30.
- Edman sequencing provided Asp1 to Gly15.
- Data from the orthogonal methods were put together to obtain 30 residues of BSA sequence.
FINAL SEQUENCE OBTAINED FOR BSA:
1 10 20 30 40
DTHKSEIAH RFKDLGEEHF KGLVLIAFSQ YLQQCPFDEH VKLVNELTEF…
Legend: coverage by Edman; coverage by MALDI-ISD; coverage by both
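How the two orthogonal reads combine can be sketched as a per-residue coverage map. The ranges below (Edman: residues 1-15, ISD c-ions: residues 10-30) come from this slide; the function `coverage_labels` and the single-letter codes E/I/B are hypothetical names chosen for this illustration.

```python
# First 30 residues of mature BSA as reported on the slide.
BSA_N_TERM = "DTHKSEIAHRFKDLGEEHFKGLVLIAFSQY"


def coverage_labels(length, edman, isd):
    """Label each position: 'E' Edman only, 'I' ISD only, 'B' both, '-' neither.

    `edman` and `isd` are inclusive 1-based (start, end) residue ranges.
    """
    labels = []
    for pos in range(1, length + 1):
        in_edman = edman[0] <= pos <= edman[1]
        in_isd = isd[0] <= pos <= isd[1]
        if in_edman and in_isd:
            labels.append("B")
        elif in_edman:
            labels.append("E")
        elif in_isd:
            labels.append("I")
        else:
            labels.append("-")
    return "".join(labels)


if __name__ == "__main__":
    print(BSA_N_TERM)
    print(coverage_labels(len(BSA_N_TERM), edman=(1, 15), isd=(10, 30)))
```

The six-residue overlap (Arg10 to Gly15) is what lets the two reads be stitched together with some confidence rather than merely concatenated.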

23 Sample B: Endostatin (donated by Sigma)
Issues: ragged N-term, C-term loss of K
[Figure labels: C-term K excised; added]

24 Endostatin (L36): annotated ISD spectrum from on/off gradient. Interfering component.

25 Endostatin (L36): HPLC chromatogram, separation of two variants (F1, F2); ISD of F1, F2 not assigned
- The recovery from the endostatin sample might be lower than 100 pmol (comparison: 100 pmol myoglobin standard).
- LC separation detected the protein heterogeneity and removed polymeric contamination, but reduced the sample amount and readout length.

26 UHR-QTOF MS analysis of Endostatin: 2 Components (Z10)
[Figure: +MS spectrum, intensity (x10^5) vs. m/z 1200-2000; labeled peaks at m/z 1221.9913, 1297.3352, 1390.0011, 1496.8469, 1621.4171, 1768.8184, 1945.6003]
In contrast to MALDI-ISD, the QTOF-ETD analysis takes place after precursor ion selection.
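The labeled m/z values on this slide look like an ESI charge-state ladder, and the charge of a peak can be estimated from its neighbour. The sketch below assumes, without confirmation from the slide, that the two highest-m/z peaks (1768.8184 and 1945.6003) are adjacent protonation states of the same component; the function names are hypothetical. For adjacent states z+1 and z, z = (m_low - m_p) / (m_high - m_low), where m_p is the proton mass.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def charge_of_higher_peak(mz_low: float, mz_high: float) -> int:
    """Charge z of the higher-m/z peak, assuming mz_low carries charge z + 1."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    z = charge_of_higher_peak(1768.8184, 1945.6003)
    print(f"estimated charge: {z}+")
    print(f"estimated neutral mass: {neutral_mass(1945.6003, z):.2f} Da")
```

The resulting mass is in the 19.4 kDa range expected for endostatin. Because labeled peak tops of an unresolved isotope envelope sit closer to the average mass than to the monoisotopic mass, this estimate should not be compared directly with a monoisotopic value.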

27 ETD Analysis of Endostatin, First Precursor: Mascot Database Search Result Simplest Use of Top-Down Data: Mascot Search Z10

28 TDS Analysis of Endostatin, First Precursor: Deconvoluted and Annotated ETD Spectrum c 2 c 9 c 26 Z10

29 TDS Analysis of Endostatin, First Precursor: Mass Accuracy of Intact Protein (Z10)
Measured monoisotopic mass: 19433.8783
Theoretical monoisotopic mass: 19433.8151
Mass error: 3.2 ppm
[Figure: measured (black) and simulated (red) spectrum]
A precise MW determination allows confirmation of the proper N-terminus and the C-terminal loss of lysine.
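The quoted mass error follows directly from the two monoisotopic masses on this slide. A minimal sketch of the arithmetic, with `ppm_error` as an illustrative helper name:

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    error = ppm_error(measured=19433.8783, theoretical=19433.8151)
    print(f"mass error: {error:.2f} ppm")
```

The result agrees with the roughly 3.2 ppm reported on the slide. At this accuracy, the loss of a C-terminal lysine (about 128 Da) is unambiguous.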

30 Endostatin: TDS Sequence 1 (PSRG123)

31 Endostatin: TDS Sequence 2 (PSRG123)
If ISD spectral quality is good, both sequences can be read directly, and the N- and C-termini can be assigned from THE SAME SPECTRUM.

32 Rec. Protein A (donated by Repligen)
Issues: N-term methylation, fusion site after residue 18
Fusion of E. coli β-Glucuronidase with SPA_STAAU
C-term sequence does not match intact MW (nice challenge for Top-Down MS in the future)

33 ISD Spectrum Protein A (DAN) E20 manual sequence generation TR E I/LK/Q I/LD G K/Q A H D EA

34 Protein A Identification (E20)
The ISD spectrum for sample #2 (Protein A) was interpreted manually by sequential subtraction of ions. The resulting sequence, TRE[IL][KQ][KQ][IL]DG[IL]A[KQ], was BLAST-searched against the Dayhoff public database. Only two sequences matched. Homology searching of the N-terminal tag provided a) β-Glucuronidase, b) its N-terminally extended sequence, and c) a mass offset indicating N-terminal methylation.
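The bracketed tag on this slide already reads like a regular expression: I/L are isobaric and K/Q are nearly isobaric, so each bracket lists the residues ISD cannot tell apart. As a sketch of how such a tag could be matched locally (not the BLAST search the participant ran), the snippet below searches the Protein A N-terminal sequence reported by lab V00 in the summary table; the variable names are illustrative.

```python
import re

# Ambiguity tag from the slide, used directly as a regex pattern.
ISD_TAG = "TRE[IL][KQ][KQ][IL]DG[IL]A[KQ]"

# N-terminal Protein A read reported by lab V00 in the summary table.
CANDIDATE = "FLRPVETPTREIKKLDGLAQHDEAQQNAFYQVLNMPN"


def find_tag(tag: str, sequence: str):
    """Return (1-based start position, matched residues), or None if absent."""
    match = re.search(tag, sequence)
    if match is None:
        return None
    return match.start() + 1, match.group(0)


if __name__ == "__main__":
    print(find_tag(ISD_TAG, CANDIDATE))
```

The tag starts several residues in from the N-terminus, which fits the slide's point that ISD alone did not cover the first residues and that the terminus had to be inferred from the mass offset and confirmed separately.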

35 Protein A MS/MS E20 ISD c-ion m/z 1056.538 T³-sequence analysis of c 9 confirms N-term methylation

36 Protein A (L36): MS/MS of N-terminal tryptic fragment. Validation of the assigned N-term methylation and glucuronidase sequence by Bottom-Up LC-MALDI-TOF/TOF analysis.

37 Protein A (L36): annotated ISD spectrum. The N-terminal sequence is β-glucuronidase fused with Protein A. The N-terminal methionine is methylated. The N-terminal amino acids not confirmed by ISD were confirmed by MS/MS of the N-terminal tryptic fragment.

38 Results from MS Analyses Please look at poster ##?? For more details

39 Lessons to be Learned from this Year's Study
Mass Spec Lessons:
1. Top-Down with ETD or ISD provides reliable N-term sequences.
2. Top-Down CID was most easily misinterpreted.
3. Edman and Top-Down complement each other very well: Edman for the first ~10 residues, Top-Down for the inexpensive extension of calls (e.g. through the fusion site of Protein A).
4. Validation of the N-term by either T³-sequencing or Bottom-Up works as well.
5. Efficient use of Top-Down MS requires good software support.
6. Bottom-Up was great to confirm N-term results but not to generate them.
7. Use of protein HPLC resulted in shortened readouts.
8. Protein A: successful analysis of the fusion required high experience.
9. Endostatin: ragged N-termini were recognized by those who determined the intact molecular weight(s) or detected heterogeneity by HPLC or Edman.
10. Top-Down by ETD or ISD permitted the detection of the C-terminal removal of lysine; intact MW determination allowed validation of the finding.

40 Next year's ABRF-PSRG2013 study: what's going to happen?
Most likely, the same proteins will be provided again! But: provided as a stew in a single pot!
Task: Isolate/separate them from the mixture
Problem: SDS-PAGE works well for Edman, but it is difficult to extract intact proteins
Hints:
- Protein LC needs to be established to get to the next level!
- Always try to get intact MW information!
- Use high sample amounts, as you lose a lot during LC

41 The ABRF-PSRG Acknowledges the following Support
- Recombinant Protein A was obtained as a donation from Repligen (Waltham, MA)
- Endostatin was obtained as a donation from Sigma-Aldrich (St Louis, MO)
- Steve Smith (University of Texas Medical Branch) and Larry Dangott (Texas A&M University) performed Edman sequencing to provide reference data for this study.

42 End Following slides are bonus material

43 In-Source Decay (MALDI-ISD)
- "Pseudo-MS/MS" technique, no precursor selection
- ISD of the protein in the MALDI plume at a <nsec timescale (similar to ETD)
- Fragmentation due to radical transfer from matrix to analyte (Takayama, 2001)
- a, c ions: N-terminus; y, z+2 ions: C-terminus; simultaneous sequencing
- TOF/TOF allows for T³-sequencing: MS/MS analysis of ISD fragments

44 MALDI-ISD and T³-Sequencing Suckau & Resemann (2003) Anal Chem 75

45 ESI-ETD (Electron Transfer Dissociation)
CID: collision with an inert gas heats the protein internally and globally; it fragments in a statistical process through cleavage of weak bonds.
ETD: collision with an electron-donating gas perturbs the electronic structure locally, resulting in local bond cleavages. ETD fragments all bonds (except at Pro), for top-down MS/MS of intact proteins with precursor ion selection.

46 ETD Measurement Cycle on QTOF
[Figure labels: reaction cell, n-CI source, 10 kHz]
1. Precursor ion accumulation
2. Electron transfer reagent addition
3. ETD reaction
4. Fragment ion transfer and detection
Tsybin et al. (2011) Anal Chem 83:8919

47 I/LN SGGMRG N K/Q D F C E20 ISD Endostatin (DAN): initial manual interpretation

48 Database search for [IL]SGGMRGNR[KQ]DF[KQ]CF (E20)
Excerpt from COIA1_HUMAN; excerpt from COIA1_MOUSE
- Differences between human and mouse can be seen at the -2 position from the start of the ISD sequence (i.e. LNSPL in human and LNTPL in mouse).
- The sequence from the spectrum was found beginning at 1548.694, so we know there are a handful of residues preceding this sequence.

49 010212_B23_10pmol_Endostatin_MSMS_2kV_1364.65 (E20)
To confirm the N-terminus not covered in the ISD spectrum, MS/MS was performed on m/z 1364.6.
[Spectrum labels: b2, b3, b4, b5, b7, b8, b9, b10; y6, y7, y9; immonium ions P, H, I/L, K/Q]

50 Determination of Endostatin N-termini by Edman degradation (E20)
-Major sequence matches COIA1_HUMAN at position 1576.
-A second sequence was found from position 1572.
-Both sequences concur with the ISD findings.
Edman sequencing detected the ragged N-term; ISD confirmed and extended it. Largely manual analysis of ISD spectra made it difficult to extract full information.

51 2012/2013 PSRG: Timeline of the 2-year study ABRF 2011 ABRF 2012 Settled on the 3 standard proteins for distribution as separated proteins in year 1 of the study Year 1 (2012) Study announcement Samples sent to participants Extended deadline for returning data Data analysis Feb ‘11 Oct ‘11 Jan ‘12 Mar ‘12 May ‘12 Discussed ideas for 2012 study. Agreement upon a study design May ‘11 Aug ‘11 Sep ‘11 Feb ‘13 ABRF 2013 Distribution of proteins in mixture for year 2 of the study Data analysis Oct ‘12 Deadline for returning data Jun ‘12 Year 2 (2013) Study announcement

52 Comments
- un-reproducible recovery from the tube for Endostatin is a problem if one wants to optimize the setting or try to reproduce the data…
- Thanks! PSRG. It was fun.
- Unless I've missed something, the availability of the proteins in the public domain made this an easy project.
- Sample quality was very good!
- I thought the fusion Protein A solution was blocked? I obtained sequence matches to the protein B-Glucuronidase, either B-Glucuronidase is fused to Protein A and you were not successful blocking the protein or B-Glucuronidase is a contaminant…
- It was very costly study for an Edman lab, (reagents).

