Proteomics: A Challenge for Technology and Information Science CBCB Seminar, November 21, 2005 Tim Griffin Dept. Biochemistry, Molecular Biology and Biophysics.

1 Proteomics: A Challenge for Technology and Information Science CBCB Seminar, November 21, 2005 Tim Griffin Dept. Biochemistry, Molecular Biology and Biophysics tgriffin@umn.edu

2 What is proteomics? “Proteomics includes not only the identification and quantification of proteins, but also the determination of their localization, modifications, interactions, activities, and, ultimately, their function.” -Stan Fields in Science, 2001.

3 Genomics vs. Proteomics
Similarities:
- Large datasets; tools needed for annotation and interpretation of results
Differences:
- Genomics: generally mature technologies and data processing methods; questions asked usually involve quantitative changes in RNA transcripts (microarrays)
- Proteomics: still evolving; complexity of protein biochemical properties (expression changes, modifications, interactions, activities); many questions to ask and data to interpret; methods changing; different approaches (mass spec, arrays, etc.)

4 Genomics, Proteomics, and Systems Biology
[Diagram. Maturity scale: mature, prototype, emerging. Genomics: genomic DNA sequencing, mRNA arrays. Proteomics (descriptive and functional views of protein products): protein cataloguing, quantitative protein profiling, protein phosphorylation, protein dynamics, protein modifications, subcellular location, catalytic activity, protein interaction maps, 3D structure. Systems and computational biology: identify system components, measure and define system properties, interactions between components.]

5 “Shotgun” identification of proteins in mixtures by LC-MS/MS
Liquid chromatography coupled to tandem mass spectrometry (MS/MS)
- µLC separation (50-100 µm) of peptides
- Ionization: MALDI or electrospray
- Isolation, fragmentation, and mass analysis of peptide fragments
- Tandem mass spectrum (m/z): thousands in a matter of hours

6 Peptide sequence determination from MS/MS spectra
H2N-N-S-G-D-I-V-N-L-G-S-I-A-G-R-COOH
b ions (N-terminal fragments): b1 through b14, reading left to right
y ions (C-terminal fragments): y14 through y1, reading left to right
Collision-induced dissociation (CID) creates two prominent ion series: the y-series and the b-series.

7 H2N-NSGDIVNLGSIAGR-COOH
[MS/MS spectrum: relative abundance vs. m/z (200 to 1200), with peaks labeled by fragment sequence: R, GR, AGR, IAGR, SIAGR, GSIAGR, LGSIAGR, NLGSIAGR, VNLGSIAGR, IVNLGSIAGR, DIVNLGSIAGR, GDIVNLGSIAGR]
The peptide sequence identifies the protein YMR134W, a yeast protein involved in iron metabolism.
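The fragment ladder on slides 6 and 7 can be reproduced numerically. The following is a minimal sketch, assuming singly charged ions and standard monoisotopic residue masses; the function name `fragment_ions` is illustrative and not part of the talk.

```python
# Monoisotopic residue masses (Da) for the amino acids in the example peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "D": 115.02694,
    "R": 156.10111,
}
WATER = 18.01056   # mass of H2O, added to C-terminal (y) fragments
PROTON = 1.00728   # charge carrier for singly protonated ions


def fragment_ions(peptide):
    """Return (b_ions, y_ions); each entry is (label, fragment sequence, m/z at 1+)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(peptide)
    b_ions, y_ions = [], []
    for i in range(1, n + 1):
        # b ions contain the first i residues (N-terminal side).
        b_ions.append((f"b{i}", peptide[:i], sum(masses[:i]) + PROTON))
        # y ions contain the last i residues (C-terminal side) plus water.
        y_ions.append((f"y{i}", peptide[n - i:], sum(masses[n - i:]) + WATER + PROTON))
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("NSGDIVNLGSIAGR")
for label, fragment, mz in y_ions:
    print(f"{label:>4}  {fragment:<14}  {mz:9.4f}")
```

The y-ion fragments printed here (R, GR, AGR, ...) are the same sequences that label the peaks in the spectrum on slide 7; the mass difference between neighboring peaks corresponds to one residue, which is what makes the sequence readable from the spectrum.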

8 High-throughput protein identification by LC-MS/MS and automated sequence database searching
Raw MS/MS spectrum -> protein sequence and/or DNA sequence database search -> peptide sequence match -> protein identification
Direct identification of 1000+ proteins from complex mixtures
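The last step of this workflow, going from a matched peptide sequence to a protein, can be sketched as a lookup against an in silico digest of the database. This is a simplified illustration, assuming trypsin specificity (cleavage after K or R, but not before P) and exact sequence matching. The database entries and their sequences below are made up for the example and are not real protein records; real search engines score spectra against candidate peptides rather than looking up finished sequences.

```python
def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def build_peptide_index(database):
    """Map each tryptic peptide to the set of protein IDs that contain it."""
    index = {}
    for protein_id, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            index.setdefault(peptide, set()).add(protein_id)
    return index


# Hypothetical toy database: IDs and sequences are invented for illustration.
toy_database = {
    "toy_protein_A": "MKNSGDIVNLGSIAGRAAK",
    "toy_protein_B": "MSLGGRPAAKVVDR",
}
index = build_peptide_index(toy_database)
print(index.get("NSGDIVNLGSIAGR", set()))
```

A peptide that maps to exactly one protein identifies it unambiguously; peptides shared by several entries are one reason protein-level inference needs additional care.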

9 Dealing with the data
1. Data acquisition
- Experimental information, metadata capture
2. Peak analysis
- Sequence database searching
- Quantitative analysis
3. Knowledge annotation and interpretation
- Database mining
- Assignment of function, pathway, localization, etc.
- Output for database archiving, publication
Integrated workflow?

10 1. Data acquisition: capturing experimental information
Proteomics Experimental Data Repository (PEDRo): proposed schema
Similar to genomic needs, but the experimental information is somewhat different

11 2. Peak Analysis: protein identification
- ProFound
- Mascot
- PepSea
- MS-Fit
- MOWSE
- Peptident
- Multident
- Sequest
- PepFrag
- MS-Tag
Computational algorithms for searching MS/MS spectra against protein sequence databases, mRNA sequences, and DNA sequences need CPU horsepower (parallel computing)

12 2. Peak Analysis: data formats
Format 1, Format 2, Format 3; Output 1, Output 2, Output 3
- Lack of flexibility
- Slow to evolve
- Lack of incorporation of competing products, methods??

13 2. Peak Analysis: need general, flexible, in-house solutions
Format 1, Format 2, Format 3
- General tools for analysis of multiple data formats
- Reverse engineering of data formats

14 2. Peak Analysis: reverse engineering data formats http://sashimi.sourceforge.net/software_glossolalia.html

15 2. Peak analysis: quality control of protein matches
Unfiltered: 10^5+ matches (lots of noise and junk)
Filtering
Filtered: thousands of “true” matches
Statistical analysis of database results (tools are available)
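The slide does not say which statistical method is used for filtering. One widely used option is a target-decoy estimate of the false discovery rate (FDR), sketched below as an assumption rather than as the speaker's method: the spectra are also searched against a decoy (for example, reversed) database, and the score cutoff is set where the ratio of decoy to target matches stays under a chosen limit. The function name and the `(score, is_decoy)` input layout are illustrative.

```python
def filter_by_fdr(matches, max_fdr=0.01):
    """Keep target matches above the most permissive cutoff with estimated FDR <= max_fdr.

    matches: list of (score, is_decoy) tuples, where a higher score is better.
    The FDR at a given cutoff is estimated as decoys / targets above that cutoff.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked matches that pass the FDR limit
    for rank, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = rank
    # Report only target matches; decoys exist solely to estimate the error rate.
    return [m for m in ranked[:keep] if not m[1]]


# Hypothetical scores, invented for illustration.
example = [(9.0, False), (8.0, False), (7.0, True), (6.0, False)]
print(filter_by_fdr(example, max_fdr=0.0))
```

Raising `max_fdr` admits more matches at the cost of more expected false positives, which is the trade-off behind reducing a very large unfiltered list to a much smaller set of trusted matches.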

16 2. Peak Analysis: Quantitative analysis
Flexibility is key: need tools to handle different quantitative methods
- External chemical labeling
- Metabolic labeling (SILAC)
- Enzymatic incorporation (16O/18O)

17 2. Peak Analysis: Quantitative analysis
[Figure: paired peptide signals from Sample 1 and Sample 2]
Relative intensity = relative protein abundance
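The relationship "relative intensity = relative protein abundance" reduces to a ratio calculation. A minimal sketch follows, assuming each peptide of a protein yields one intensity per sample and that the protein-level ratio is summarized as the median of the peptide ratios. The median is a choice made for this example, not something stated in the talk, and the function name is illustrative.

```python
from statistics import median


def protein_ratio(peptide_pairs):
    """Estimate the Sample 2 / Sample 1 abundance ratio for one protein.

    peptide_pairs: list of (intensity_sample1, intensity_sample2), one per peptide.
    Pairs with a zero Sample 1 intensity are skipped to avoid division by zero.
    """
    ratios = [s2 / s1 for s1, s2 in peptide_pairs if s1 > 0]
    return median(ratios) if ratios else None


# Hypothetical intensities, invented for illustration.
pairs = [(100.0, 200.0), (50.0, 100.0), (10.0, 40.0)]
print(protein_ratio(pairs))
```

Using the median rather than the mean keeps a single outlying peptide ratio from dominating the protein-level estimate.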

18 Evolving methodologies: iTRAQ
Samples 1, 2, 3, 4: digest to peptides
iTRAQ label: +114, +115, +116, +117
Multidimensional separation
MS/MS spectrum (intensity vs. m/z):
- Diagnostic ions (114, 115, 116, 117) used for quantitative analysis
- Peptide fragments used for sequence identification
4-way multiplexing: simultaneous comparison of multiple states, replicates

19 Need for “changeable” tools
[Spectra labeled “old” and “new”: intensity of the reporter ion peaks for samples 1 to 4, observed at m/z 114.1005, 115.0963, 116.0972, and 117.1025]
Automated analysis tools?
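A "changeable" quantitation tool can keep the reporter channels in configuration rather than in code. The sketch below is an illustration under stated assumptions: the nominal channel positions (114.1 to 117.1) and the 0.05 m/z tolerance are choices made for this example, the function names are hypothetical, and the intensities are invented. The m/z values in the usage example are the ones shown on the slide.

```python
# Reporter channels as data: adding or changing a labeling scheme means editing
# this mapping, not the extraction logic. Values are nominal, assumed positions.
REPORTER_CHANNELS = {"114": 114.1, "115": 115.1, "116": 116.1, "117": 117.1}


def reporter_intensities(peaks, channels=REPORTER_CHANNELS, tol=0.05):
    """Return the strongest peak intensity within +/- tol of each channel.

    peaks: list of (mz, intensity) tuples from one MS/MS spectrum.
    Channels with no peak in the window are reported as 0.0.
    """
    result = {}
    for name, target in channels.items():
        in_window = [inten for mz, inten in peaks if abs(mz - target) <= tol]
        result[name] = max(in_window, default=0.0)
    return result


def relative_to(intensities, reference="114"):
    """Express each channel as a ratio to the reference channel."""
    ref = intensities[reference]
    return {name: (val / ref if ref > 0 else None) for name, val in intensities.items()}


# m/z values from the slide; intensities are hypothetical.
spectrum = [(114.1005, 1000.0), (115.0963, 2000.0), (116.0972, 500.0),
            (117.1025, 1500.0), (175.1190, 800.0)]
print(relative_to(reporter_intensities(spectrum)))
```

Passing a different `channels` mapping is enough to support another multiplexing scheme, which is the kind of flexibility the slide asks for.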

20 3. Knowledge annotation: making sense of lists of data

21 3. Knowledge annotation: mining proteomic/genomic databases

22 3. Knowledge annotation: needs
- Annotation: accession numbers and protein names
- Functional assignments (functional degeneracy?)
- Pathway assignments
- Subcellular localization
- Disease implications
- Comparison of different proteomic datasets (e.g., expression profiles compared to modification state profiles, other protein properties)
- Automated and streamlined??
- Publication and deposit in databases
- Visualization of complex phenomena, interpretation of biological relevance
- Modeling, integration with genomics data: computational and systems biology

