PRG2007 Research Study Advanced Quantitative Proteomics http://www.abrf.org/prg ABRF PRG 2007
PRG Members
Arnold Falick (Chair) – UC Berkeley HHMI
William Lane (EB Liaison) – Harvard University
Kathryn Lilley (ad hoc) – University of Cambridge
Michael MacCoss – University of Washington
Brett Phinney – UC Davis Genome Center
Nicholas Sherman – University of Virginia
Susan Weintraub – Univ. Texas Health Science Center
Ewa Witkowska – UC San Francisco
Nathan Yates – Merck Research Laboratories
Past Research Studies
PRG2002: Identification of Proteins in a Simple Mixture
  Task: Identify components of a 5 protein mixture
PRG2003: Phosphorylation Site Determination
  Task: Identify 2 phosphopeptides and sites of phosphorylation
PRG2004: Differentiation of Protein Isoforms
  Task: Discrimination of 3 closely related proteins
PRG2005: Sequencing Unknown Peptides
  Task: De novo sequence analysis of 5 peptide mixture
PRG2006: Quantification of Proteins from a Simple Mixture
  Task: Relative Abundance of 8 Proteins Between 2 Different Samples
PRG2007 Study Objectives
- What methods are used in the community for assessing differences between complex mixtures?
- How well established are quantitative methodologies in the community?
- What is the accuracy of the quantitative data acquired in core facilities?
We wanted to build upon last year's study by providing samples that were more complicated, yet more realistic.
PRG2007 Sample Design
Sample A, Sample B, and Sample C each contained:
- 100 µg E. coli lysate
- 12 Total Protein Spikes (10 Non-E. coli proteins, 2 E. coli proteins)
Diagram labels: "Identical"; "Spikes at Different Levels and Ratios"
PRG2007 Study Tasks
- Identify the proteins whose amounts were altered between the samples
- Determine the relative amounts of the proteins between samples
Proteins in PRG2007 Sample
[Table of proteins not recovered; * marks E. coli proteins]
Protein Sequence Database
>gi|16131131|ref|NP_417708.1| putative membrane protein [Escherichia coli K12]
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
KGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMSREEIDQANNVLQTVLHQLAK
AQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGSTAVALVKQNSFYVLAYMEETKLEGVRPGYR
AEITPLGSNKVLKGTVDSVAAGVTNASSTRDDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTT
ATVVVTGKQDRDESQDSFFRKMAHRLREFG
Was converted to:
>PRG_seq_5 ABRF_PRG2007_Protein_5
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
The file contains:
1) 4,346 protein sequences
2) common contaminants (e.g. keratins, trypsin)
3) an equal number of decoy sequences
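The header conversion and decoy padding described on this slide could be sketched roughly as follows. This is a minimal illustration, not the PRG's actual tooling: the function names, the sequential numbering, the `DECOY_` prefix, and the use of reversed sequences as decoys are all assumptions (the slide states only that the file holds an equal number of decoy sequences, not how they were generated).

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def anonymize_with_decoys(text: str) -> List[Tuple[str, str]]:
    """Rename entries to PRG_seq_N and append one decoy per entry.

    Assumptions: entries are numbered in file order, decoys are the
    reversed target sequences, and the DECOY_ prefix is hypothetical.
    """
    targets = [
        (f"PRG_seq_{i} ABRF_PRG2007_Protein_{i}", seq)
        for i, (_, seq) in enumerate(parse_fasta(text), start=1)
    ]
    decoys = [(f"DECOY_{name}", seq[::-1]) for name, seq in targets]
    return targets + decoys


def to_fasta(records: List[Tuple[str, str]], width: int = 70) -> str:
    """Format records as FASTA with fixed-width sequence lines."""
    out = []
    for name, seq in records:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Replacing the descriptive headers with neutral identifiers keeps the protein names from revealing which entries were spiked, and the decoy entries give participants a way to estimate false identifications from the same search.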
Samples Analyzed by 2D DIGE
[Figure: gel images for Sample A, Sample B, and Sample C]
A pooled standard of all three samples was made and labelled with Cy5 (red). The samples were then labelled individually with Cy3 (green), and each gel was run with a single sample versus the pooled standard.
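Because every gel carries the same pooled standard, a spot's sample signal can be divided by the standard signal on its own gel before gels are compared. The sketch below illustrates only that arithmetic; the function names and spot volumes are hypothetical, and real DIGE software applies further steps (dye normalization, spot matching) that are not shown here.

```python
def standardized_abundance(sample_volume: float, standard_volume: float) -> float:
    """Cy3 sample spot volume relative to the Cy5 pooled standard on the same gel."""
    if standard_volume <= 0:
        raise ValueError("pooled-standard spot volume must be positive")
    return sample_volume / standard_volume


def cross_gel_ratio(gel_b: tuple, gel_a: tuple) -> float:
    """B/A ratio for one matched spot; each gel is (sample_volume, standard_volume)."""
    return standardized_abundance(*gel_b) / standardized_abundance(*gel_a)


# Hypothetical spot volumes (arbitrary units) for one matched spot.
gel_a = (1200.0, 2400.0)  # Sample A (Cy3), pooled standard (Cy5)
gel_b = (4500.0, 1800.0)  # Sample B (Cy3), pooled standard (Cy5)
ratio = cross_gel_ratio(gel_b, gel_a)
```

Dividing by the standard first cancels gel-to-gel differences in loading and running, since both channels on a gel experience them equally.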
Samples by µLC-MS (1 µg on column)
[Figure: base peak chromatograms of Sample A and Sample B]
Demographics of the Participants
87 labs requested samples
Total participants = 43 (49% return rate)
Quantitative data returned = 35
PRG2007 Abbreviations
DIGE   Differential In-Gel Electrophoresis
ICPL   Isotope Coded Protein Label
iTRAQ  isobaric Tags for Relative and Absolute Quantitation
ICAT   Isotope Coded Affinity Tag
18O    Stable Oxygen Isotope Label
SRM    Selected Reaction Monitoring
Methods Used (35 Participants Returned)
[Chart not recovered]
Techniques Applied
[Chart not recovered]
Results: True Positives vs False Positives
[Two plots not recovered]
Quantitative Accuracy: B/A Ratio by Protein
Protein               A (pmol)  B (pmol)  Anticipated B/A Mole Ratio
Ubiquitin             5         23        4.6
Myoglobin             0.5       5         10
Serum Albumin         5         3.3       0.67
Carbonic Anhydrase I  2.5       1.14      0.45
Glucose Oxidase       0.5       0.33      0.67
Hexokinase            0.5       0.16      0.31
Tryptophanase*        5         1.56      from 1 to 0.31
*E. coli protein
[Plots, one per protein, not recovered: B/A ratio, with results grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
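The anticipated mole ratios follow directly from the spiked amounts (B divided by A). The sketch below recomputes them and shows one possible way to score a reported ratio against the anticipated one on a log2 scale. The scoring function and the example reported value are illustrative assumptions, not the study's own evaluation method. Two caveats: computing from the listed amounts gives 0.66, 0.46, and 0.32 for serum albumin, carbonic anhydrase I, and hexokinase, slightly different from the slide values, presumably because the listed amounts are rounded; and tryptophanase is an E. coli protein, so the spike-only ratio is just the lower end of the slide's range, presumably because the lysate contributes an endogenous amount to both samples.

```python
import math

# Spike amounts in pmol as (Sample A, Sample B), taken from the slides.
SPIKES = {
    "Ubiquitin": (5.0, 23.0),
    "Myoglobin": (0.5, 5.0),
    "Serum Albumin": (5.0, 3.3),
    "Carbonic Anhydrase I": (2.5, 1.14),
    "Glucose Oxidase": (0.5, 0.33),
    "Hexokinase": (0.5, 0.16),
    "Tryptophanase": (5.0, 1.56),
}


def anticipated_ratio(protein: str) -> float:
    """Anticipated B/A mole ratio from the spiked amounts alone."""
    a, b = SPIKES[protein]
    return b / a


def log2_error(reported: float, expected: float) -> float:
    """Signed log2 deviation of a reported ratio from the expected ratio.

    0 means exact agreement; +1 means a two-fold overestimate.
    """
    if reported <= 0 or expected <= 0:
        raise ValueError("ratios must be positive")
    return math.log2(reported / expected)


for name in SPIKES:
    print(f"{name}: anticipated B/A = {anticipated_ratio(name):.2f}")

# Hypothetical reported value, not taken from the study results.
example = log2_error(reported=9.2, expected=anticipated_ratio("Ubiquitin"))
```

A log scale treats a two-fold overestimate and a two-fold underestimate symmetrically, which a plain difference of ratios does not.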
Biggest Challenges Reported – Summary
- Complexity of the proteolytic digest.
- Long calculation times at several analytical steps
- To find the resources: spent more than $1000 on [the study] and had one technician busy for more than a week and a scientist for 2-3 days
- Finding the time
- No automation software available - too much hands-on work.
- Sample solubilization
- The ABRF fasta database: several search algorithms had problems.
- Number of replicates possible, making it difficult to determine a reasonable error rate, making it difficult to determine whether a protein is actually differentially expressed
- The MS identification of low abundance differential spots
Selected Comments
- The study was very good for researchers new to the proteomics field.
- This was an excellent learning experience.
- This study highlighted my facility's capabilities (peptide fractionation and MS) and weaknesses (chemical labeling of proteins and peptides and quant. analysis).
- This year's study was a much more realistic sample that imitates real proteomic samples (without the dynamic range issue from serum/plasma samples).
- Very interesting study because it addresses a 'real world' issue which is the relative quantitation of a small number of proteins in a very complex mixture.
- We didn't have enough time...
- The protein amount of these samples is small and so it is difficult to have confident results.
Selected Comments -- Continued
- More sample, more time.
- We would have run these in at least triplicate as per our routine operation if we had had more sample and time.
- Make sure the solubilisation is as good as possible: I did not obtain any useful data from the samples, probably because I was not able to solubilise the sample completely.
- not fun!!!
- Overall peak intensity of the samples was not as high as the expected intensity for the amount of protein specified (100 µg) in the study.
- Liked it, because we could evaluate ourselves.
- For regular samples (500 µg on gel) I always am able to confidently assign most proteins. That was not so with the concentrations here.
Would you do this sort of study again?
[Chart of responses not recovered]
Other Responses:
- Yes, learned a lot, but need to watch resources
- Yes, but time issue
- Maybe
- Yes, but it was not fun
Conclusions
- Quantitative proteomics experiments are complex, and success depends on many factors
- A handful of participants reported excellent results, indicating that quantitative results are achievable
- Participants using similar techniques did not obtain similar performance, which suggests that expertise is a key factor
- Head-to-head comparisons of different approaches are not possible because of the high dependence on expertise
- Interest in this area is high and many labs appear to be developing these capabilities
Acknowledgements
Kevin Hakala (UTHSCSA)
Michelle Salemi (UC Davis)
Rich Eigenheer (UC Davis)
Matthew Russell (University of Cambridge)
Ekaterina Deyanova (Merck Research Laboratories)
A huge thanks to all the labs that participated in this year’s study!