Published by Myron Walker. Modified over 9 years ago.
1
Goals in Proteomics
1. Identify and quantify proteins in complex mixtures/complexes
2. Identify global protein-protein interactions
3. Define protein localizations within cells
4. Measure and characterize post-translational modifications
5. Measure and characterize activity (e.g. substrate specificity)
2
Goals in Proteomics
1. Identify and quantify proteins in complex mixtures/complexes: MS and MS/MS
2. Identify global protein-protein interactions: MS and MS/MS, Y2H
3. Define protein localizations within cells: high-throughput microscopy, organelle pull-down
4. Measure and characterize post-translational modifications: MS techniques
5. Measure and characterize activity (e.g. substrate specificity): protein arrays
3
Basic overview of tandem mass spectrometry (MS/MS)
Coon et al. 2005
4
Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio)
[Figure: mass spec = ion source, mass analyzer, detector; output is an MS spectrum]
Basic principles:
1. Ionize (i.e. charge) peptide fragments
2. Separate ions by mass/charge (m/z) ratio
3. Detect ions of different m/z ratio
4. Compare to database of predicted m/z fragments for each genome
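The "predicted m/z" in step 4 can be sketched in a few lines. This is a minimal illustration, assuming positive-mode ions that carry one extra proton per charge and using standard monoisotopic residue masses; the function names `peptide_mass` and `mz` are made up for this example and do not come from the slides.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of one proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def mz(seq, charge):
    """m/z of the peptide carrying `charge` extra protons (assumed positive mode)."""
    return (peptide_mass(seq) + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print("GAK", z, round(mz("GAK", z), 4))
```

Note that the same peptide shows up at a different m/z for each charge state, which is why the instrument reports m/z rather than mass.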
5
Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio)
1. Ionization
Goal: ionize (i.e. charge) peptide fragments without destroying the molecule
http://www.colorado.edu/chemistry/chem5181/MS_ESI_Gilman_Mashburn.pdf
- Positive ionization (protonates amine groups): especially useful for trypsinized proteins (cleaved after R and K)
- Negative ionization (deprotonates carboxylic acids and alcohols)
6
Liquid chromatography + electrospray ionization
[Figure label: electric field]
* Commonly used with liquid solutions, more sensitive to contaminants, used for complex mixtures
7
Liquid chromatography + electrospray ionization
[Figure label: electric field]
* Commonly used with liquid solutions, more sensitive to contaminants, used for complex mixtures
MALDI
http://www.astbury.leeds.ac.uk/facil/MStut/mstutorial.htm
* Less sensitive to contaminants, more common for less complex mixtures
8
Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio)
2. Separation of ions based on m/z ratio (mass m versus charge z)
Multiple flavors of mass analyzers use different technology
* TOF (‘time of flight’): separates based on velocity (ions with lower m/z fly faster and arrive first)
* Triple quadrupole: separates ions using oscillating electric fields
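To see why velocity separates ions by m/z in a TOF analyzer, here is a small sketch under idealized assumptions: every ion is accelerated through the same voltage, gaining kinetic energy z·e·U, and then drifts down a field-free tube. The 20 kV voltage and 1 m tube length are hypothetical values picked for illustration, not instrument settings from the slides.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms


def flight_time(mass_da, charge, accel_volts=20000.0, tube_m=1.0):
    """Idealized drift time in seconds for an ion in a linear TOF tube.

    accel_volts and tube_m are illustrative assumptions.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * E_CHARGE * accel_volts  # z * e * U
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return tube_m / velocity


if __name__ == "__main__":
    for mass, z in [(500.0, 1), (1000.0, 1), (2000.0, 2)]:
        print(mass, z, f"{flight_time(mass, z) * 1e6:.2f} microseconds")
```

In this model the flight time scales with the square root of m/z, so ions with the same m/z (for example 1000 Da at charge 1 and 2000 Da at charge 2) arrive together.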
9
Multiple flavors of mass analyzers
Single MS (peptide fingerprinting):
- Identifies m/z of peptide only
- Peptide ID’d by comparison to database of predicted m/z of trypsinized proteins
Tandem MS/MS (peptide sequencing):
- Pulls each peptide from the first MS
- Breaks up peptide bond (in the collision cell)
- Identifies each fragment based on m/z
10
Multiple flavors of mass analyzers
… can be hooked together in multiple configurations.
[Figure: instrument configurations; panel g: Orbitrap]
11
Multiple flavors of mass analyzers
Single MS (peptide fingerprinting):
- Identifies m/z of peptide only
- Peptide ID’d by comparison to database of predicted m/z of trypsinized proteins
Tandem MS/MS (peptide sequencing):
- Pulls each peptide from the first MS
- Breaks up peptide bond (in the collision cell)
- Identifies each fragment based on m/z
Now multiple types of collision cells:
- CID: collision-induced dissociation
- ETD: electron transfer dissociation
- HCD: higher-energy collisional dissociation
12
Peptide can fragment along 3 possible bonds
… charge stays on either the ‘left’ (a, b, or c) or ‘right’ (x, y, or z) side of the cleavage
Fragmentation happens in a fairly defined way along the peptide backbone
Cleavage along the CO-NH bond is most common, generating ‘b’ and ‘y’ ions
13
MS spectrum (i.e. peptide ions)
Mann Nat Reviews MBC. 5:699-711
- Each peak is a different peptide, separated based on m/z
- A single peptide is selected by the instrument for the second MS
- Each peak is often surrounded by smaller peaks of similar m/z
- Sensitivity of instrument determines resolution
14
Second MS identifies y (or b) ions to read out amino-acid sequence
Mann Nat Reviews MBC. 5:699-711
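The b/y ladder that the second MS reads can be sketched as follows. This is a minimal example, assuming singly charged fragments, no modifications, and standard monoisotopic residue masses; `b_ions` and `y_ions` are illustrative names, not functions from any particular package.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def b_ions(seq):
    """Singly charged b ions: N-terminal fragments b1 .. b(n-1)."""
    ions, running = [], 0.0
    for aa in seq[:-1]:
        running += RESIDUE_MASS[aa]
        ions.append(running + PROTON)
    return ions


def y_ions(seq):
    """Singly charged y ions: C-terminal fragments y1 .. y(n-1)."""
    ions, running = [], WATER
    for aa in reversed(seq[1:]):
        running += RESIDUE_MASS[aa]
        ions.append(running + PROTON)
    return ions


if __name__ == "__main__":
    print("b:", [round(m, 4) for m in b_ions("GAK")])
    print("y:", [round(m, 4) for m in y_ions("GAK")])
```

The gap between consecutive y (or b) ions equals the mass of one residue, which is what lets the sequence be read off the ladder. Leucine and isoleucine have identical masses, so this readout alone cannot tell them apart.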
15
Trypsin often used to digest proteins (cleaves after Arg and Lys)
WHY?
Because of challenges distinguishing spectra, simplified mixtures are typically injected into the MS:
- either excised proteins
- purified complexes
- fractionated pools of complex mixtures
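The cleavage rule stated above (cut after Arg and Lys) is easy to express in code. This is a simplified sketch: it applies only the rule from the slide and ignores refinements such as missed cleavages; the function name and the example sequence are made up.

```python
def trypsin_digest(protein):
    """Split a protein sequence after every K (Lys) or R (Arg).

    Simplified rule taken from the slide; real digests also
    contain peptides with missed cleavage sites.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


if __name__ == "__main__":
    print(trypsin_digest("MKTAYRGGKLLE"))
```

Every peptide except possibly the last ends in K or R, both basic residues, which fits the earlier point that positive ionization works well for trypsinized proteins.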
16
2-dimensional gel separation (largely outdated)
The first dimension (separation by isoelectric focusing)
- gel with an immobilised pH gradient
- electric current causes charged proteins to move until they reach their isoelectric point (the pH at which the net charge is 0)
The second dimension (separation by mass)
- pH gel strip is loaded onto an SDS gel
- SDS denatures the protein (to make movement depend solely on mass, not shape) and masks its intrinsic charge with a uniform negative charge
Ahna Skop
17
2D SDS-PAGE gel
Ahna Skop
18
TAP-tag: Tandem Affinity Purification (for IP’ing individual proteins and proteins bound to them)
19
Ion exchange chromatography
Anion exchange: column is positively charged (can bind negatively charged proteins).
Cation exchange: column is negatively charged (can bind positively charged proteins).
Exploit the isoelectric point of a protein to separate it from other macromolecules.
Ahna Skop
20
Size exclusion chromatography
Porous beads made of different but controlled sizes.
Smaller proteins go in and out of the beads and will be retained longer in the resin.
Large proteins will only go into large beads and will be retained less.
Very large proteins will not go into any of the beads (exclusion limit).
Can be used as a preparative method or to determine the molecular weight of a protein in solution.
Ahna Skop
21
Affinity chromatography
A ligand with high affinity to the protein is attached to a matrix.
Protein of interest binds to the ligand and is retained by the resin.
Everything else flows through.
Can use excess of the soluble ligand to elute the protein.
Ahna Skop
22
How does each spectrum translate to amino acid sequence?
Mann Nat Reviews MBC. 5:699-711
23
1. De novo sequencing: very difficult and not widely used (but being developed) for large-scale datasets
2. Matching observed spectra to a database of theoretical spectra
24
Theoretical spectra:
- in silico digestion of a known protein database
- limited set of theoretical spectra based on enzyme, instrument sensitivity, others
- this reduces search space
- can miss some peptides
- comparisons based on several different scores (e.g. correlation between observed and theoretical profiles)
Mann Nat Reviews MBC. 5:699-711
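As a rough illustration of comparing an observed spectrum with theoretical ones, the sketch below scores each candidate by the fraction of its predicted fragment m/z values found in the observed peak list. This shared-peak score is a deliberately simple stand-in for the scores used by real search engines; the 0.5 m/z tolerance and all peak values are invented for the example.

```python
def shared_peak_score(observed_mz, theoretical_mz, tol=0.5):
    """Fraction of theoretical peaks with an observed peak within `tol` m/z."""
    if not theoretical_mz:
        return 0.0
    matched = sum(
        1 for t in theoretical_mz
        if any(abs(t - o) <= tol for o in observed_mz)
    )
    return matched / len(theoretical_mz)


def best_match(observed_mz, candidates, tol=0.5):
    """Return the candidate peptide whose theoretical peaks score highest."""
    return max(
        candidates,
        key=lambda name: shared_peak_score(observed_mz, candidates[name], tol),
    )


if __name__ == "__main__":
    observed = [100.0, 200.1, 300.0]            # made-up observed fragment m/z
    candidates = {                              # made-up theoretical spectra
        "PEPTIDE_A": [100.2, 200.0, 400.0],
        "PEPTIDE_B": [150.0, 250.0, 350.0],
    }
    print(best_match(observed, candidates))
```

The best-scoring candidate is only a proposal. Whether it is believable is a separate statistical question.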
25
How does each spectrum translate to amino acid sequence?
1. De novo sequencing: very difficult and not widely used (but being developed) for large-scale datasets
2. Matching observed spectra to a database of theoretical spectra
3. Matching observed spectra to a spectral database of previously seen spectra
26
Nesvizhskii (2010) J. Proteomics, 73:2092-2123.
- spectral matching is supposedly more accurate but …
- limited to the number of peptides whose spectra have been observed before
With either approach, observed spectra are processed to: group redundant spectra, remove bad spectra, recognize co-fragmentation, improve z estimates
Many good spectra will not match a known sequence due to: absence of a target in DB, PTM modifies spectrum, constrained DB search, incorrect m or z estimate.
27
Result: peptide-to-spectrum match (PSM)
A major problem in proteomics is bad PSM calls … therefore statistical measures are critical
Methods of estimating significance of PSMs:
p- (or E-) value: compare score S of the best PSM against the distribution of all S for all spectra to all theoretical peptides
FDR correction methods:
1. Benjamini-Hochberg (B&H) FDR
2. Estimate the null distribution of RANDOM PSMs:
- match all spectra to real (‘target’) DB and to fake (‘decoy’) DB
- often the decoy DB is the same peptides in the library but with reversed sequence
- one measure of FDR: 2*(# decoy hits) / (# decoy hits + # target hits)
3. Use #2 above to calculate posterior probabilities for EACH PSM
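The target-decoy estimate in method 2 can be sketched directly from the formula on the slide. The score lists and threshold below are invented for illustration, and `reversed_decoys` shows just one way of building a decoy database (plain sequence reversal).

```python
def reversed_decoys(peptides):
    """Build decoy sequences by reversing each target peptide."""
    return [p[::-1] for p in peptides]


def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """FDR estimate from the slide: 2 * decoys / (decoys + targets),
    counting only PSMs that score at or above `threshold`."""
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target + n_decoy == 0:
        return 0.0
    return 2.0 * n_decoy / (n_decoy + n_target)


if __name__ == "__main__":
    targets = [5.0, 4.0, 3.0, 2.0, 1.0]   # made-up best scores against target DB
    decoys = [3.0, 1.0, 0.5]              # made-up best scores against decoy DB
    for cutoff in (1.0, 2.0, 4.0):
        print(cutoff, target_decoy_fdr(targets, decoys, cutoff))
```

Raising the score threshold lowers the estimated FDR at the cost of accepting fewer PSMs, so in practice the threshold is chosen to hit a desired FDR.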
28
3. Use #2 above to calculate posterior probabilities for EACH PSM
- mixture model approach: take the distribution of ALL scores S
- this is a mixture of ‘correct’ PSMs and ‘incorrect’ PSMs
- but we don’t know which are correct or incorrect
- scores from decoy comparison are included, which can provide some idea of the distribution of ‘incorrect’ scores
- EM or Bayesian approaches can then estimate the proportion of correct vs. incorrect PSMs
… based on each PSM score, a posterior probability is calculated
FDR can be done at the level of PSM identification … but is often done at the level of protein identification
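A toy version of the mixture-model idea is sketched below. It assumes, purely for illustration, that both ‘correct’ and ‘incorrect’ scores are Gaussian and fits the two components with plain EM; it does not use decoy scores to pin down the ‘incorrect’ component, and the scores are invented. Published tools make different distributional choices.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def em_posteriors(scores, n_iter=50):
    """Fit a two-Gaussian mixture by EM.

    Returns (estimated proportion of correct PSMs,
             posterior probability of being correct for each score).
    """
    lo, hi = min(scores), max(scores)
    mu_bad, mu_good = lo, hi                    # start components at the extremes
    sd_bad = sd_good = max((hi - lo) / 4.0, 1e-3)
    pi_good = 0.5

    def e_step():
        post = []
        for s in scores:
            good = pi_good * normal_pdf(s, mu_good, sd_good)
            bad = (1.0 - pi_good) * normal_pdf(s, mu_bad, sd_bad)
            total = good + bad
            post.append(good / total if total > 0 else 0.5)
        return post

    for _ in range(n_iter):
        post = e_step()
        w_good = max(sum(post), 1e-9)
        w_bad = max(len(scores) - sum(post), 1e-9)
        # M-step: weighted means, standard deviations, and mixing proportion
        mu_good = sum(p * s for p, s in zip(post, scores)) / w_good
        mu_bad = sum((1 - p) * s for p, s in zip(post, scores)) / w_bad
        sd_good = max(math.sqrt(sum(p * (s - mu_good) ** 2 for p, s in zip(post, scores)) / w_good), 1e-3)
        sd_bad = max(math.sqrt(sum((1 - p) * (s - mu_bad) ** 2 for p, s in zip(post, scores)) / w_bad), 1e-3)
        pi_good = w_good / len(scores)
    return pi_good, e_step()


if __name__ == "__main__":
    scores = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.7, 1.3, 4.8, 5.0, 5.2, 4.9, 5.1]
    pi_good, post = em_posteriors(scores)
    print(round(pi_good, 3), [round(p, 3) for p in post])
```

Each PSM ends up with its own posterior probability, which is what allows filtering PSM by PSM instead of applying one global cutoff.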
29
Error in PSM identification can amplify FDR in protein identification
Often focus on proteins identified by at least 2 different PSMs (or proteins with single PSMs of very high posterior probability)
Some methods combine PSM FDR to get a protein FDR
Nesvizhskii (2010) J. Proteomics, 73:2092-2123.
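The protein-level rule of thumb above can be written as a small filter. In this sketch "2 different PSMs" is read as two distinct peptides, and the 0.99 cutoff for a single-peptide protein is a hypothetical choice; the protein and peptide names are made up.

```python
from collections import defaultdict


def confident_proteins(psms, min_peptides=2, single_hit_prob=0.99):
    """Keep proteins supported by >= min_peptides distinct peptides,
    or by one peptide whose best posterior is >= single_hit_prob.

    psms: iterable of (protein, peptide, posterior_probability).
    Both thresholds are illustrative assumptions.
    """
    best = defaultdict(dict)
    for protein, peptide, prob in psms:
        previous = best[protein].get(peptide, 0.0)
        best[protein][peptide] = max(previous, prob)

    kept = []
    for protein, peptides in best.items():
        if len(peptides) >= min_peptides or max(peptides.values()) >= single_hit_prob:
            kept.append(protein)
    return sorted(kept)


if __name__ == "__main__":
    psms = [
        ("P1", "AAK", 0.90), ("P1", "GGR", 0.80),   # two distinct peptides
        ("P2", "TTK", 0.995),                       # one very confident peptide
        ("P3", "SSR", 0.70), ("P3", "SSR", 0.75),   # same peptide seen twice
    ]
    print(confident_proteins(psms))
```

Here P3 is dropped because repeated observations of one moderate-confidence peptide do not count as two distinct peptides under this reading of the rule.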
30
Some practical guidelines for analyzing proteomics results
1. Know that abundant proteins are much easier to identify
2. # of peptides per protein is an important consideration
- proteins ID’d with >1 peptide are more reliable
- proteins ID’d with 1 peptide observed repeatedly are more reliable
- note that longer proteins are more likely to have false PSMs
3. Think carefully about the p-value/FDR and know how it was calculated
4. Know that proteomics is nowhere near saturating … many proteins will be missed