Previous Lecture: Regression and Correlation

Name: Previous Lecture: Regression and Correlation
Uploaded: 2017-10-12T06:54:12+00:00
Duration: PTM29S32
Channel: Osborne Barker

This Lecture: Proteomics Informatics
Course: Introduction to Biostatistics and Bioinformatics

Proteomics Informatics – Learning Objectives
Structure of mass spectrometry data
Protein identification
Protein quantitation

Protein Identification and Quantitation by Mass Spectrometry
Workflow: Samples -> Peptides -> Mass Spectrometry (intensity vs m/z) -> Identity and Quantity

Sample preparation for protein identification, characterization and quantitation
Workflow: Lysis -> Fractionation -> Digestion -> Mass spectrometry

Overview of Mass spectrometry
Ion Source -> Mass Analyzer -> Detector
Output: intensity vs mass/charge

Mass Spectrometry (MS)

Example data – MALDI-TOF Peptide intensity vs m/z

Peptide Fragmentation
Ion Source -> Mass Analyzer 1 -> Fragmentation -> Mass Analyzer 2 -> Detector
Fragment ion types: b and y

Liquid Chromatography (LC)-MS/MS
Ion Source -> Mass Analyzer 1 -> Fragmentation -> Mass Analyzer 2 -> Detector
[Figure: a series of intensity vs mass/charge spectra acquired over time]

Example data – ESI-LC-MS/MS
[Figure, MS: peptide intensity vs m/z vs time]
[Figure, MS/MS: fragment intensity vs m/z for the [M+2H]2+ precursor; axes: % relative abundance vs m/z (250 to 1000); labeled peaks: 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080]

Charge-State Distributions
[Figure: charge-state distributions, intensity vs mass/charge, for a peptide and a protein under MALDI and ESI; labeled charge states range from 1+ to 4+ for the peptide and from 1+ to 31+ for the protein]
M - molecular mass
n - number of charges
H - mass of a proton

Charge-State Example:
m/z = (M + nH) / n
M - molecular mass
n - number of charges
H - mass of a proton
Example: peptide of mass 898
carrying 1 H+ = (898 + 1) / 1 = 899 m/z
carrying 2 H+ = (898 + 2) / 2 = 450 m/z
carrying 3 H+ = (898 + 3) / 3 = 300.3 m/z
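The charge-state arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name `mz` and the proton mass constant are choices made for this illustration, and the slide rounds the proton mass to 1.

```python
PROTON_MASS = 1.00728  # mass of a proton in Da (the slide rounds this to 1)

def mz(molecular_mass, charge, proton_mass=PROTON_MASS):
    """Return m/z = (M + n*H) / n for a molecule carrying n protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (molecular_mass + charge * proton_mass) / charge

# Peptide of mass 898 carrying 1, 2 or 3 protons
for n in (1, 2, 3):
    print(f"{n}+ : {mz(898.0, n):.2f} m/z")
```

Passing `proton_mass=1.0` reproduces the rounded values used in the example.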

Isotope Distributions
Most abundant isotopes: 12C, 14N, 16O, 1H, 32S
Natural abundance of the heavier isotopes: 0.015% 2H; 1.11% 13C; 0.366% 15N; 0.038% 17O, 0.200% 18O; 0.75% 33S, 4.21% 34S, 0.02% 36S
[Figure: isotope clusters, intensity vs m/z, with peaks at +1 Da, +2 Da and +3 Da]
Only 12C and 13C: p = 0.0111
n is the number of C in the peptide
m is the number of 13C in the peptide
T_m is the relative intensity of the peptide with m 13C
$T_m = \binom{n}{m} p^m (1-p)^{n-m}$
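The binomial expression for the carbon-only isotope pattern can be evaluated directly. A minimal sketch, assuming only 12C and 13C contribute (as on the slide); the function name and the example of 50 carbon atoms are hypothetical.

```python
from math import comb

P_13C = 0.0111  # natural abundance of 13C, as given on the slide

def isotope_intensities(n_carbons, max_13c=None, p=P_13C):
    """Relative intensities T_m for m = 0..max_13c atoms of 13C.

    T_m = C(n, m) * p**m * (1 - p)**(n - m), considering carbon only.
    """
    if max_13c is None or max_13c > n_carbons:
        max_13c = n_carbons
    return [
        comb(n_carbons, m) * p**m * (1 - p) ** (n_carbons - m)
        for m in range(max_13c + 1)
    ]

# Hypothetical peptide with 50 carbon atoms: first four isotope peaks
for m, t in enumerate(isotope_intensities(50, max_13c=3)):
    print(f"{m} x 13C: {t:.3f}")
```

As the number of carbons grows, the monoisotopic peak (m = 0) shrinks relative to the +1 Da and +2 Da peaks.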

Isotope Clusters and Charge State
[Figure: isotope clusters, intensity vs m/z, at three charge states. Spacing between isotope peaks: 1 for 1+, 0.5 for 2+, 0.33 for 3+]

What is the Charge State?
Spacing between the isotopes is 0.5 Da
Spacing between the isotopes is 0.33 Da
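Because neighboring isotope peaks differ by about 1 Da in mass, their spacing on the m/z axis is about 1/n for charge n. A minimal sketch of that inversion, assuming an exact 1 Da isotope spacing (as on the slides); the function name is hypothetical.

```python
def charge_from_isotope_spacing(spacing_da):
    """Estimate the charge state from the m/z spacing between isotope peaks.

    Assumes neighboring isotopes differ by 1 Da, so spacing = 1 / charge.
    """
    if spacing_da <= 0:
        raise ValueError("spacing must be positive")
    return round(1.0 / spacing_da)

for spacing in (1.0, 0.5, 0.33):
    print(f"spacing {spacing} Da -> charge {charge_from_isotope_spacing(spacing)}+")
```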

Protein Identification by Mass Spectrometry
Workflow: Samples -> Peptides -> Mass Spectrometry (intensity vs m/z) -> Identity

Protein Identification - Exercise
1. Protein identification: NUP1 was genomically tagged with protein A, affinity purified under two conditions, and the resulting protein mixture was analyzed with liquid chromatography mass spectrometry (LC-MS). Search the resulting spectra (NUP1-less-stringent-wash.mgf, NUP1-more-stringent-wash.mgf) using X! Tandem. Change the taxon to “S. cerevisiae (budding yeast)” but otherwise keep the default parameter settings.
a. Look at the list of identified proteins and explain why they are found in this sample. More information is also available by selecting the “go”, “path”, “ppi”, “doms”, “string” tabs on top of the page.
b. Select the “mh” display on top right of the page, and zoom in to +/-100 ppm (the default setting for the mass accuracy that was used in the search). What precursor mass accuracy should we have used? Zoom in further and determine what precursor mass accuracy could have been used if the spectra were recalibrated (so that the error distribution is centered at zero).
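For part b, it may help to see how a ppm error is computed and how recentering the error distribution narrows the tolerance window that is needed. This is a minimal sketch: the function names and the error values are hypothetical, and the median shift is only a simple stand-in for a real recalibration. None of it is part of X! Tandem.

```python
from statistics import median

def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6

def recalibrate(errors_ppm):
    """Shift the errors so that their median is zero."""
    shift = median(errors_ppm)
    return [e - shift for e in errors_ppm]

def required_tolerance(errors_ppm):
    """Smallest symmetric window (in ppm) that contains every error."""
    return max(abs(e) for e in errors_ppm)

# Hypothetical precursor errors with a systematic offset of about +12 ppm
errors = [10.0, 12.0, 14.0, 9.0, 15.0]
print(required_tolerance(errors))               # window needed as measured
print(required_tolerance(recalibrate(errors)))  # window needed after recentering
```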

Identification – Tandem MS

Tandem MS – Sequence Confirmation
[Figure: MS/MS spectrum of the [M+2H]2+ precursor, % relative abundance vs m/z (250 to 1000), annotated step by step over several slides]
Fragment ladder (y ions counted from the K end, b ions from the S end):
y ion   Residue   b ion
147     K         1166
260     L         1020
389     E         907
504     D         778
633     E         663
762     E         534
875     L         405
1022    F         292
1080    G         145
        S         88
Highlighted mass differences: 113 (L) and 129 (E). The residues in rows 5 to 7 are not labeled in the transcript and follow from the differences between neighboring peaks (129, 129, 113).
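The b and y ladders can be recomputed from residue masses. A minimal sketch, assuming the peptide is SGFLEEDELK (the I/L positions cannot be distinguished by mass, so L is an assumption) and using standard monoisotopic residue masses; the function names are hypothetical. The computed values agree with the integer labels on the slide to within 1 Da.

```python
# Monoisotopic residue masses in Da (standard values)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    total = 0.0
    for m in masses[:-1]:            # prefixes of length 1 .. n-1
        total += m
        b_ions.append(total + PROTON)
    total = 0.0
    for m in reversed(masses[1:]):   # suffixes of length 1 .. n-1
        total += m
        y_ions.append(total + WATER + PROTON)
    return b_ions, y_ions

def precursor_mh(peptide):
    """Singly protonated peptide mass, M+H."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

b_ions, y_ions = fragment_ions("SGFLEEDELK")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:8.2f}   y{i} = {y:8.2f}")
print(f"M+H = {precursor_mh('SGFLEEDELK'):.2f}")
```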

Tandem MS – de novo Sequencing
[Figure: MS/MS spectrum of the [M+2H]2+ precursor, % relative abundance vs m/z (250 to 1000), with peaks at 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022 and 1080]
Approach: Amino acid masses + Mass Differences -> Sequences consistent with spectrum

Partial sequences consistent with the mass differences: …GF(I/L)EEDE(I/L)… or, read in the opposite direction, …(I/L)EDEE(I/L)FG…
Peptide M+H = 1166
Mass of 87 => S: SGF(I/L)EEDE(I/L)…
1166 – 1020 – 18 = 128 => K or Q: SGF(I/L)EEDE(I/L)(K/Q)
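Reading a sequence from mass differences amounts to a table lookup. A minimal sketch using nominal (integer) residue masses and the b-ion peaks from the slides; the function names are hypothetical, and real spectra need a mass tolerance and must handle missing or extra peaks.

```python
# Nominal (integer) residue masses in Da
NOMINAL_RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def residues_for_mass(delta):
    """Residues whose nominal mass equals delta, e.g. 113 -> 'I/L'."""
    matches = sorted(aa for aa, m in NOMINAL_RESIDUE_MASS.items() if m == delta)
    return "/".join(matches) if matches else "?"

def read_ladder(peaks):
    """Translate differences between neighboring peaks into residues."""
    ordered = sorted(peaks)
    return [residues_for_mass(hi - lo) for lo, hi in zip(ordered, ordered[1:])]

b_peaks = [88, 145, 292, 405, 534, 663, 778, 907, 1020]
print(read_ladder(b_peaks))                # partial sequence between b1 and b9
print(residues_for_mass(88 - 1))           # first residue: b1 minus a proton
print(residues_for_mass(1166 - 1020 - 18)) # last residue: M+H minus b9 minus water
```

The I/L and K/Q ambiguities appear because those pairs share the same nominal mass.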

Challenges in de novo sequencing
Neutral loss (-H2O, -NH3)
Modifications
Background peaks
Incomplete information

Tandem MS – Database Search
Experiment: Lysis -> Fractionation -> Digestion -> LC-MS -> MS/MS
Sequence DB: Pick Protein -> Pick Peptide -> All Fragment Masses
Compare, Score, Test Significance
Repeat for all proteins, all peptides and all MS/MS spectra
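The database-search loop can be illustrated with a toy example. This is a minimal sketch under several assumptions: the two protein sequences are invented, digestion allows no missed cleavages, the score is a plain count of matched fragment masses, and no significance test is performed. It is not the scoring used by X! Tandem.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728

def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

def mh(peptide):
    """Singly protonated peptide mass."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

def fragment_masses(peptide):
    """All singly charged b and y fragment masses."""
    frags = []
    for i in range(1, len(peptide)):
        frags.append(sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON)
        frags.append(sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON)
    return frags

def score(peptide, peaks, tol=0.6):
    """Number of theoretical fragments with an observed peak within tol."""
    return sum(any(abs(f - p) <= tol for p in peaks) for f in fragment_masses(peptide))

def search(database, precursor_mh, peaks, precursor_tol=1.0):
    """Score every tryptic peptide whose M+H matches the precursor."""
    hits = []
    for name, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            if abs(mh(peptide) - precursor_mh) <= precursor_tol:
                hits.append((score(peptide, peaks), name, peptide))
    return sorted(hits, reverse=True)

# Invented sequences; the peaks are the integer labels from the slides
database = {"toy_protein_1": "MAKSGFLEEDELKTTR", "toy_protein_2": "GASPVKLLEEDFGSK"}
peaks = [260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080]
print(search(database, 1166.0, peaks))
```

The wide tolerances are needed only because the slide labels are rounded to integers.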

Information Content in a Single Mass Measurement
[Figure: average number of matching peptides (1 to 10) vs tryptic peptide mass [Da]; panels: Human and S. cerevisiae]

Protein Identification and Quantitation by Mass Spectrometry
Workflow: Samples -> Peptides -> Mass Spectrometry (intensity vs m/z) -> Quantity

Protein Quantitation by Mass Spectrometry
Indices: Sample i, Protein j, Peptide k
Workflow: Lysis -> Fractionation -> Digestion -> LC-MS -> MS

Quantitation – Label-Free (MS)
Indices: Sample i, Protein j, Peptide k
Workflow: Lysis -> Fractionation -> Digestion -> LC-MS -> MS
Assumption: constant for all samples

Quantitation – Metabolic Labeling
Indices: Sample i, Protein j, Peptide k
Workflow: Light + Heavy -> Lysis -> Fractionation -> Digestion -> LC-MS -> MS (H and L peaks)
References: Oda et al. PNAS 96 (1999) 6591; Ong et al. MCP 1 (2002) 376

Quantitation – Labeled Synthetic Peptides
Assumption: All losses after mixing are identical for the heavy and light isotopes
Workflow steps: Lysis, Fractionation, Digestion, Synthetic Peptides (Heavy) mixed with Light, Enrichment with Peptide antibody, LC-MS, MS (H and L peaks)
References: Anderson, N.L., et al. Proteomics 3 (2004); Gerber et al. PNAS 100 (2003) 6940

Estimating peptide quantity
[Figure: a peak in an intensity vs m/z spectrum, annotated with three estimates: peak height, peak area and curve fitting]

What is the best way to estimate quantity?
Peak height
 - resistant to interference
 - poor statistics
Peak area
 - better statistics
 - more sensitive to interference
Curve fitting
 - better statistics
 - needs to know the peak shape
 - slow
Spectrum counting
 - resistant to interference
 - easy to implement
 - poor statistics for low-abundance proteins
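The first two estimates are easy to compute from a sampled peak. A minimal sketch, assuming the peak is already isolated and baseline-subtracted; the function names and the sample points are hypothetical, and the area uses the trapezoidal rule.

```python
def peak_height(intensities):
    """Quantity estimate from the single highest data point."""
    return max(intensities)

def peak_area(mz_values, intensities):
    """Quantity estimate from the area under the peak (trapezoidal rule)."""
    points = list(zip(mz_values, intensities))
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical sampled peak
mz_values = [500.0, 500.1, 500.2, 500.3, 500.4]
intensities = [0.0, 40.0, 100.0, 40.0, 0.0]
print(peak_height(intensities))
print(peak_area(mz_values, intensities))
```

The height depends on one data point, which is why its statistics are poor. The area pools all points across the peak, so it also picks up any interfering signal inside the integration window.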

Proteomics Informatics - Summary
Structure of mass spectrometry data
Protein identification
Protein quantitation

Next Lecture: Gene Expression

Protein Quantitation - Exercise
2. Protein quantitation: Two breast tumor xenografts (one basal and one luminal) were analyzed by LC-MS, and the spectral counts for the identified peptides in the different analyses are listed in two-sample-three-replicate-comparison.txt.
a. Compare replicate one of Sample 1 with replicate one of Sample 2 using proteomics_no_replicate.py. Which differences are significant?
b. Compare replicate one and two of Sample 1 using proteomics_one_replicate.py. Compare to the distribution in 2a. Which differences are significant in 2a?
c. Compare the three replicates of Sample 1 with the three replicates of Sample 2 using proteomics_three_replicates.py. Which differences are significant?
d. In cases when a protein is not observed in one sample, how many spectra do we need to observe in the other sample to say that there is a significant difference?
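One way to think about part d is a simple sign-test argument. This is a sketch under strong assumptions, not the method used by the course scripts: both runs are assumed to contain the same total number of spectra, so that under the null hypothesis each spectrum of a protein is equally likely to come from either sample, and multiple testing is ignored. The function names are hypothetical.

```python
def p_value_all_in_one_sample(n_spectra, two_sided=True):
    """P(all n spectra fall in one sample) if each is a fair coin flip."""
    p = 0.5 ** n_spectra
    return min(1.0, 2.0 * p) if two_sided else p

def min_spectra_for_significance(alpha=0.05, two_sided=True):
    """Smallest n for which 'n spectra vs 0 spectra' has p < alpha."""
    n = 1
    while p_value_all_in_one_sample(n, two_sided) >= alpha:
        n += 1
    return n

print(min_spectra_for_significance(alpha=0.05))
print(min_spectra_for_significance(alpha=0.05, two_sided=False))
```

Correcting for the number of proteins tested would raise the required count.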

Phosphorylation Exercise: an unmodified peptide
Theoretical fragment ions You could give that as a help to see what changes etc.

Spectrum of the phosphorylated peptide
You could give that as a help to see what changes etc.

Spectrum of the peptide phosphorylated at a different site
You could give that as a help to see what changes etc.
