Hanyang Univ. Introduction to Data Analyses for Mass Spectrometry-based Proteomics
Peptide Assignment: DEAR vs. READ. Can the two peptides be differentiated?
Data Analysis - Peptide Assignment
  [Figure: protein -> digestion -> peptides (DEAR, READ) -> mass spectrometry -> peptide fragmentation -> mass spectrometry.]
  - Mass spectrum (MS), m/z vs. intensity: DEAR and READ have the same mass (471), so they appear as a single peak.
  - Tandem mass spectrum (MS/MS), m/z vs. intensity, after peptide fragmentation: DEAR breaks into D|EAR, DE|AR, DEA|R, while READ breaks into R|EAD, RE|AD, REA|D.
The mass-spectrometry/proteomic experiment
  [Figure: the same workflow with the sequences unknown (????): protein -> digestion -> peptides -> MS (parent mass 471) -> peptide fragmentation -> MS/MS.]
  - Digestion: Trypsin, Pepsin, Lys-C
  - Mass spectrometry: Quadrupole, Time of flight, FTICR
  - Peptide fragmentation: CID, ECD, ETD
Peptide Fragmentation
  - A peptide runs from the N-terminal (amino-terminal) to the C-terminal (carboxy-terminal).
  - b-ion: labeled from N-terminal to C-terminal
  - y-ion: labeled from C-terminal to N-terminal
Calculating b-/y-ion mass
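The slide's figure is not preserved, so the following is a sketch of the rule that reproduces the ion values listed for DEAR later in the deck. It assumes nominal (integer) residue masses m(a_k), singly charged fragments, a proton mass of 1 and a water mass of 18. The symbols n, a_k and m are introduced here for illustration.

```latex
% Peptide a_1 a_2 ... a_n, residue masses m(a_k), singly protonated fragments
b_i = \sum_{k=1}^{i} m(a_k) + 1,
\qquad
y_i = \sum_{k=n-i+1}^{n} m(a_k) + 18 + 1,
\qquad i = 1, \dots, n-1
```

For DEAR this gives b1 = 115 + 1 = 116 and y1 = 156 + 19 = 175.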
Average vs. Monoisotopic mass
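The difference can be illustrated with a small calculation for the peptide DEAR. This is a sketch only: the residue and water masses below are standard reference values to a few decimals, not numbers taken from the slides, and the function name is made up for the example.

```python
# Residue masses in Da (standard reference values, not from the slides).
MONOISOTOPIC = {"A": 71.03711, "D": 115.02694, "E": 129.04259, "R": 156.10111}
AVERAGE = {"A": 71.0788, "D": 115.0886, "E": 129.1155, "R": 156.1875}
WATER_MONO, WATER_AVG = 18.01056, 18.01528


def peptide_mass(sequence, residue_masses, water):
    """Neutral peptide mass: sum of residue masses plus one water."""
    return sum(residue_masses[aa] for aa in sequence) + water


mono = peptide_mass("DEAR", MONOISOTOPIC, WATER_MONO)
avg = peptide_mass("DEAR", AVERAGE, WATER_AVG)
print(f"monoisotopic {mono:.4f}  average {avg:.4f}  difference {avg - mono:.4f}")
```

The monoisotopic mass uses only the lightest isotope of each element, so it is always the smaller of the two.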
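MS/MS Peptide Assignment: DEAR
  Amino acid masses: A = 71, D = 115, E = 129, R = 156
  Peptide: H-DEAR-OH, protonated (+H+); energy fragments it at bond 1 (D|EAR), 2 (DE|AR) or 3 (DEA|R)
  b-ions: b1 = 116, b2 = 245, b3 = 316
  y-ions: y1 = 175, y2 = 246, y3 = 375
  [Figure: MS/MS spectrum, m/z vs. intensity, with peaks b1, b2, b3, y1, y2, y3. The mass difference between neighboring peaks of one series is a residue mass, e.g. b2 - b1 = 129 = E.]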
MS/MS Peptide Assignment: READ
  Amino acid masses: A = 71, D = 115, E = 129, R = 156
  Peptide: H-READ-OH, protonated (+H+); energy fragments it at bond 1 (R|EAD), 2 (RE|AD) or 3 (REA|D)
  b-ions: b1 = 157, b2 = 286, b3 = 357
  y-ions: y1 = 134, y2 = 205, y3 = 334
  [Figure: MS/MS spectrum, m/z vs. intensity, with peaks b1, b2, b3, y1, y2, y3.]
DEAR vs. READ
  [Figure: the two MS/MS spectra, m/z vs. intensity.]
  DEAR: b1 = 116, b2 = 245, b3 = 316; y1 = 175, y2 = 246, y3 = 375
  READ: b1 = 157, b2 = 286, b3 = 357; y1 = 134, y2 = 205, y3 = 334
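The comparison can be reproduced with a short script. It is a minimal sketch that assumes the integer residue masses from the slides, singly charged fragments, a proton mass of 1 and a water mass of 18. The function names are illustrative.

```python
# Integer residue masses as given on the slides.
RESIDUE_MASS = {"A": 71, "D": 115, "E": 129, "R": 156}
PROTON, WATER = 1, 18  # assumed nominal masses


def b_ions(peptide):
    """b_i: first i residues plus a proton, for i = 1 .. n-1."""
    return [sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
            for i in range(1, len(peptide))]


def y_ions(peptide):
    """y_i: last i residues plus water plus a proton, for i = 1 .. n-1."""
    return [sum(RESIDUE_MASS[aa] for aa in peptide[-i:]) + WATER + PROTON
            for i in range(1, len(peptide))]


for peptide in ("DEAR", "READ"):
    print(peptide, "b:", b_ions(peptide), "y:", y_ions(peptide))

# Peaks the two theoretical spectra have in common.
shared = (set(b_ions("DEAR") + y_ions("DEAR"))
          & set(b_ions("READ") + y_ions("READ")))
print("shared peaks:", sorted(shared))
```

Under these assumptions the two peptides share no fragment mass, which is why MS/MS can separate them even though MS cannot.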
MS/MS simulation
  [Animation: the precursor DEAR is selected from the MS spectrum (which also contains READ). Each simulation step fragments one copy at one peptide bond and adds the resulting b- and y-ion to the MS/MS spectrum.]
  Simulation 1: D|EAR. Peaks so far: b1, y3
  Simulation 2: DEA|R. Peaks so far: b1, b3, y1, y3
  Simulation 3: D|EAR. Peaks so far: b1, b3, y1, y3
  Simulation 4: DE|AR. Peaks so far: b1, b2, b3, y1, y2, y3
  Simulation 5: DE|AR. Peaks so far: b1, b2, b3, y1, y2, y3
  Simulation 100: D|EAR, DE|AR and DEA|R have all occurred. Peaks: b1, b2, b3, y1, y2, y3
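The simulation can be sketched in a few lines. The assumptions here are not from the slides: each step breaks one peptide bond chosen uniformly at random, both fragments are always detected, and peak intensity is simply a count. The names `fragment` and `simulate` are illustrative.

```python
import random
from collections import Counter

RESIDUE_MASS = {"A": 71, "D": 115, "E": 129, "R": 156}


def fragment(peptide, bond):
    """Break the peptide after `bond` residues; return (b-ion, y-ion) m/z."""
    b = sum(RESIDUE_MASS[aa] for aa in peptide[:bond]) + 1   # + proton
    y = sum(RESIDUE_MASS[aa] for aa in peptide[bond:]) + 19  # + water + proton
    return b, y


def simulate(peptide, n_fragmentations, seed=0):
    """Accumulate peak counts over repeated random fragmentations."""
    rng = random.Random(seed)
    spectrum = Counter()
    for _ in range(n_fragmentations):
        bond = rng.randint(1, len(peptide) - 1)  # uniform choice (assumption)
        spectrum.update(fragment(peptide, bond))
    return spectrum


spectrum = simulate("DEAR", 100)
for mz in sorted(spectrum):
    print(f"{mz:4d} {'#' * spectrum[mz]}")
```

Each step adds exactly two counts, so after 100 steps the bars sum to 200 and only the six b-/y-ion masses of DEAR can appear.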
Peptide assignment
  [Figure: MS/MS spectrum, m/z vs. intensity, of an unknown peptide with peaks b1, b2, b3, y1, y2, y3.]
  Difficulties:
  - It is not known whether an ion is a b-ion or a y-ion
  - Some ions may be missing
  - Various ion types (neutral loss)
  - Amino acid modification
Database search - Peptide assignment using MS/MS
  Input: MS/MS spectrum (m/z vs. intensity), parent mass = 471
  Sequence database: raw genomic, transcript or EST, or protein sequence
  >Protein A
  MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNR
  YRDVSPFDHSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSG
  SWAAIYQDIRHEASDFHEASDFPCRVAKLPKNKDEARYMEKEFEQ
  IDKGAGVDADIRHEMEKEFEQIDKSGSWAAIYQDIRHE
  >Protein B
  MKVLILACLVALALAEGDRLNVPGEIVESLSSSEESITRINKKIE
  KFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPEGDVAPQNIPP
  LTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPF
  >Protein C...
Peptide assignment using MS/MS
  Candidate peptides (parent mass = 471): READ, DVGAE, DEAR, GAGVDA, EGDVA, …
  Comparison: the candidates are compared with the experimental MS/MS spectrum (m/z vs. intensity).
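One way to collect such candidates is to scan every substring of the database proteins and keep those whose mass matches the parent mass. The sketch below assumes nominal integer residue masses (standard values, only four of which appear on the slides) and follows the slides in treating the parent mass as the plain sum of residue masses. `candidate_peptides` and its `tolerance` parameter are illustrative names.

```python
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

PROTEIN_A = (
    "MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNR"
    "YRDVSPFDHSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSG"
    "SWAAIYQDIRHEASDFHEASDFPCRVAKLPKNKDEARYMEKEFEQ"
    "IDKGAGVDADIRHEMEKEFEQIDKSGSWAAIYQDIRHE"
)
PROTEIN_B = (
    "MKVLILACLVALALAEGDRLNVPGEIVESLSSSEESITRINKKIE"
    "KFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPEGDVAPQNIPP"
    "LTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPF"
)


def candidate_peptides(protein, parent_mass, tolerance=0):
    """Return all substrings whose residue-mass sum is within tolerance."""
    found = set()
    for start in range(len(protein)):
        mass = 0
        for end in range(start, len(protein)):
            mass += NOMINAL_MASS[protein[end]]
            if mass > parent_mass + tolerance:
                break  # masses only grow, so longer substrings cannot match
            if abs(mass - parent_mass) <= tolerance:
                found.add(protein[start:end + 1])
    return found


candidates = (candidate_peptides(PROTEIN_A, 471)
              | candidate_peptides(PROTEIN_B, 471))
print(sorted(candidates))
```

With these integer masses GAGVDA sums to 470, so it only shows up when the search allows a tolerance of 1. Real search engines also restrict candidates by enzyme cleavage rules, which this sketch ignores.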
Peptide assignment using MS/MS
  - Generate a theoretical MS/MS spectrum for each candidate peptide (READ, DVGAE, DEAR, GAGVDA, EGDVA, …).
  - Compare each theoretical spectrum with the experimental MS/MS spectrum (parent mass = 471) to obtain a match score.
  - Select the TOP one.
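The slides do not name a scoring function, so the sketch below uses the simplest one, the number of shared peaks, purely as a stand-in. The experimental peak list is hypothetical (the DEAR ions with y2 missing), and the integer masses for G and V are standard nominal values rather than slide content.

```python
RESIDUE_MASS = {"A": 71, "D": 115, "E": 129, "G": 57, "R": 156, "V": 99}


def theoretical_spectrum(peptide):
    """Set of singly charged b- and y-ion masses (nominal, integer)."""
    peaks = set()
    for i in range(1, len(peptide)):
        peaks.add(sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + 1)   # b-ion
        peaks.add(sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + 19)  # y-ion
    return peaks


def match_score(experimental, peptide):
    """Shared peak count (an assumed, deliberately simple score)."""
    return len(theoretical_spectrum(peptide) & set(experimental))


experimental = [116, 175, 245, 316, 375]  # hypothetical: DEAR without y2
candidates = ["READ", "DVGAE", "DEAR", "EGDVA"]

scores = {peptide: match_score(experimental, peptide) for peptide in candidates}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

DVGAE still collects two matches because it shares b1 and its largest y-ion with DEAR, which shows why a missing or ambiguous peak can make candidates harder to rank.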
Post-translational modification (PTM)
  - Addition of chemical groups produces a modified protein
  - Structural changes
  - Various cellular functions
  - Dynamic proteome
  [Figure: PROTEIN shown in several modified forms, e.g. with PO4 or CH2 groups attached.]
MS/MS of modified peptides
  [Figure: modified protein -> digestion -> MS/MS spectrum (m/z vs. intensity).]
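MS/MS spectrum of modified peptides: TVTAMDVVY vs. TVTAM(+Δ)DVVY
  MS/MS spectrum of peptide 'TVTAMDVVY'
    Prefix fragments: T, TV, TVT, TVTA, TVTAM, TVTAMD, TVTAMDV, TVTAMDVV
    Suffix fragments: VTAMDVVY, TAMDVVY, AMDVVY, MDVVY, DVVY, VVY, VY, Y
  MS/MS spectrum of peptide 'TVTAM(+Δ)DVVY' with a modification of +Δ mass
    Prefix fragments: T, TV, TVT, TVTA, TVTAM(+Δ), TVTAM(+Δ)D, TVTAM(+Δ)DV, TVTAM(+Δ)DVV
    SHIFT: every fragment that contains the modified M moves by Δ on the m/z axis; the other peaks stay in place.
  [Figure: both spectra, m/z vs. intensity, with residue labels between peaks and the shifted peaks marked Δ.]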
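The shift pattern can be checked numerically. This sketch assumes nominal integer residue masses (standard values) and picks Δ = 16 only as an example value. `ion_series` is an illustrative helper, not part of any tool named in the slides.

```python
RESIDUE_MASS = {"T": 101, "V": 99, "A": 71, "M": 131, "D": 115, "Y": 163}


def ion_series(peptide, mod_position=None, delta=0):
    """Return (b-ions, y-ions); optionally add delta to one residue."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    if mod_position is not None:
        masses[mod_position] += delta
    n = len(masses)
    b = [sum(masses[:i]) + 1 for i in range(1, n)]        # prefixes + proton
    y = [sum(masses[n - i:]) + 19 for i in range(1, n)]   # suffixes + H2O + proton
    return b, y


PEPTIDE = "TVTAMDVVY"
DELTA = 16  # hypothetical modification mass

b_plain, y_plain = ion_series(PEPTIDE)
b_mod, y_mod = ion_series(PEPTIDE, mod_position=PEPTIDE.index("M"), delta=DELTA)

# 1-based indices of the ions that moved by exactly DELTA.
shifted_b = [i + 1 for i, (p, m) in enumerate(zip(b_plain, b_mod)) if m - p == DELTA]
shifted_y = [i + 1 for i, (p, m) in enumerate(zip(y_plain, y_mod)) if m - p == DELTA]
print("shifted b-ions:", shifted_b)
print("shifted y-ions:", shifted_y)
```

Because M is the fifth of nine residues, b5 to b8 and y5 to y8 carry the shift while b1 to b4 and y1 to y4 do not. That boundary is what localizes the modification.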
Database search - modification analysis
  Input: MS/MS spectrum (m/z vs. intensity), parent mass = 471
  >Protein A
  MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPK
  NKNRNRYRDVSPFDHSRKREADDNDYINASLIKMEEAQR
  SYILTQQIDKSGSWAAIYQDIRHEASDFHEASDFPCRVA
  KLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQ
  IDKSGSWAAIYQDIRHE
  >Protein B …
  - Ordinary search: candidate peptides are DVGAE, READ, DEAR, GAGVDA, …
  - Modification analysis: every substring is a candidate peptide.
  - Result: explosion of the number of candidate peptides.
Complexity for analyzing modified peptides: considering one modification per peptide
  - MS/MS parent mass = 769; unmodified PTMPEPT has mass 753, so the mass difference is 769 - 753 = 16.
  - The modification of 16 may sit on any one of the N residues of PTMPEPT.
  - Complexity: O(N)
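Complexity for analyzing modified peptides: considering two modifications per peptide
  - Example: PTMPEPT with a total mass difference of 100, split between two modifications.
  - One modification mass d ranges over -200 ~ +200; the two modifications can occupy N(N-1) position pairs.
  - Complexity: O(dN^2)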
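The growth in the number of hypotheses can be made concrete for PTMPEPT. This is a counting sketch under stated assumptions: nominal integer residue masses (standard values), integer mass shifts only, and ordered position pairs N(N-1) as on the slide. It counts hypotheses and does not score them.

```python
NOMINAL_MASS = {"P": 97, "T": 101, "M": 131, "E": 129}

peptide = "PTMPEPT"
n = len(peptide)

# Unmodified mass and observed difference from the one-modification slide.
unmodified_mass = sum(NOMINAL_MASS[aa] for aa in peptide)
delta = 769 - unmodified_mass

# One modification: the shift is fixed by delta, only the site is unknown.
one_mod_hypotheses = n

# Two modifications: first shift d in -200..+200 (integers, an assumption),
# second shift is then delta_total - d, placed on ordered site pairs.
shifts = range(-200, 201)
two_mod_hypotheses = len(shifts) * n * (n - 1)

print(unmodified_mass, delta, one_mod_hypotheses, two_mod_hypotheses)
```

For a 7-residue peptide this is already 7 versus 16842 hypotheses per candidate peptide.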
Standard method for modification analysis: restrictive search
  - Input modifications are specified in advance: +1 on N, +3 on M, +97 on E, +102 on T.
  - For PTMPEPT with a mass difference of 100, only combinations of these input modifications are tested.
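A restrictive search can be sketched as enumerating placements of the listed modifications that add up to the observed mass difference. The sketch assumes that "N", "M", "E" and "T" denote residue letters and that each residue carries at most one modification. `restricted_candidates` is a hypothetical helper name.

```python
from itertools import combinations

# Input modifications from the slide: residue letter -> mass shift.
ALLOWED = {"N": 1, "M": 3, "E": 97, "T": 102}


def restricted_candidates(peptide, delta, max_mods=2):
    """List site combinations whose allowed shifts sum to delta."""
    sites = [(i, aa, ALLOWED[aa]) for i, aa in enumerate(peptide) if aa in ALLOWED]
    hits = []
    for k in range(1, max_mods + 1):
        for combo in combinations(sites, k):
            if sum(shift for _, _, shift in combo) == delta:
                hits.append(combo)
    return hits


hits = restricted_candidates("PTMPEPT", 100)
for combo in hits:
    print(" + ".join(f"{aa}{i + 1}({shift:+d})" for i, aa, shift in combo))
```

Only one placement explains a difference of 100 here (+3 on M and +97 on E), compared with the thousands of hypotheses a blind search would consider. The price is that any modification missing from the input list cannot be found.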
Spectral Library - Peptide assignment using MS/MS
  - Consensus spectrum
Peptide Validation
  Peptide assignment
  - Each MS/MS spectrum is interpreted independently
  - Results are hard to reconcile when different software was used
  Manual validation
  - Filtering by search scores and NTT (Number of Tryptic Termini)
  - Subjective judgment can enter
  - The error rate is unknown
  - What happens when the dataset grows?
  Statistical validation
  - PeptideProphet: gives the probability that each peptide assignment is correct, based on a probability model of the search scores
  - The false discovery rate is estimated from matches to decoy peptides
(Un)reliability of Manual Validation
  [Figure: search results judged by manual authenticators, split into correct validation, incorrect validation, and validation withheld.]
Peptide Validation: PeptideProphet
  [Figure: a set of MS/MS spectra (m/z vs. intensity), each with its peptide assignment, shown schematically as AAAA, CCCC, GGGG, KKKK, LLLL, TTTT, LLLL, QQQQ, IIII.]
Peptide Validation: Target/Decoy
  [Figure: one MS/MS spectrum matched to YILT (target) and one matched to DAER (decoy).]
  >Protein A (Target Sequence)
  MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPK
  NKNRNRYRDVSPFDHSRKREADDNDYINASLIKMEEAQR
  SYILTQQIDKSGSWAAIYQDIRHEASDFHEASDFPCRVA
  KLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQ
  IDKSGSWAAIYQDIRHE
  >Reversed Protein A (Decoy Sequence)
  EHRIDQYIAAWSGSKDIQEFEKEMEHRIDADVGAGKDIQ
  EFEKEMYRAEDKNKPLKAVRCPFDSAEHFDSAEHRIDQY
  IAAWGSGKDIQQTLIYSRQAEEMKILSANIYDNDDAERK
  RSHDFPSVDRYRNRNKNKPLKAVRCPFDEAGVDIDQYIA
  AWSGSKDIQEFEKEMEM
  T = 1000: number of matches to the target sequence (above score threshold)
  D = 20: number of matches to the decoy sequence (above score threshold)
  False Discovery Rate = ?
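The slide leaves the answer open. Two estimators are in common use, and which one applies depends on how the search was run and what is reported, so the sketch below simply computes both for T = 1000 and D = 20. The function names are illustrative.

```python
def fdr_decoy_over_target(targets, decoys):
    """Decoy hits estimate the number of false target hits: D / T."""
    return decoys / targets


def fdr_doubled_decoys(targets, decoys):
    """False hits are assumed equally likely in target and decoy: 2D / (T + D)."""
    return 2 * decoys / (targets + decoys)


T, D = 1000, 20
simple = fdr_decoy_over_target(T, D)
doubled = fdr_doubled_decoys(T, D)
print(f"D/T = {simple:.3f}, 2D/(T+D) = {doubled:.3f}")
```

The first reads the 20 decoy matches as an estimate of how many of the 1000 target matches are wrong. The second treats the combined list of 1020 matches as the reported set and assumes one false target match for each decoy match.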
Peptide Validation: Target/Decoy
  - Target/decoy is meaningful only for large datasets.
  - What makes a suitable decoy database? (amino acid composition, peptide sequence redundancy, precursor mass distribution)
    - Reversed sequences
    - Random sequences
    - Pseudo-reversed sequences
  - Separated or concatenated search?
    - Example: threshold score 30; the match to the target scores 50, the match to the decoy scores 40.
    - Separated: is this counted as a false positive?
    - Concatenated: is this counted as a false positive?
  - Calculating FDR
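The difference between the two search modes can be illustrated with the slide's numbers. The sketch assumes one best target score and one best decoy score per spectrum. In a separated search both are evaluated on their own, while in a concatenated search only the higher of the two competes against the threshold. Breaking ties in favor of the target is an arbitrary choice made here.

```python
def count_separated(spectra, threshold):
    """Target and decoy searches are thresholded independently."""
    targets = sum(1 for t, d in spectra if t >= threshold)
    decoys = sum(1 for t, d in spectra if d >= threshold)
    return targets, decoys


def count_concatenated(spectra, threshold):
    """Target and decoy compete; only the better match per spectrum counts."""
    targets = decoys = 0
    for t, d in spectra:
        if max(t, d) < threshold:
            continue
        if t >= d:  # tie goes to the target (assumption)
            targets += 1
        else:
            decoys += 1
    return targets, decoys


spectra = [(50, 40)]  # (target score, decoy score) from the slide
separated = count_separated(spectra, threshold=30)
concatenated = count_concatenated(spectra, threshold=30)
print("separated (T, D):", separated)
print("concatenated (T, D):", concatenated)
```

With these numbers the separated search records a decoy hit for the spectrum, while the concatenated search does not, so the same data lead to different FDR estimates.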
Semi-parametric PeptideProphet
  - Decoy search results => distribution for incorrect assignments
  - EM algorithm to estimate the distribution of correct assignments
  - NTT (number of tryptic termini)
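The probability that an assignment is correct can be written as a two-component mixture posterior. The notation below is assumed for illustration and does not come from the slide: s is the search score, π the fraction of correct assignments, f_+ and f_- the score densities for correct and incorrect assignments, and g_+ and g_- the corresponding NTT distributions, with score and NTT treated as conditionally independent.

```latex
P(+ \mid s, \mathrm{NTT}) =
\frac{\pi \, f_{+}(s) \, g_{+}(\mathrm{NTT})}
     {\pi \, f_{+}(s) \, g_{+}(\mathrm{NTT}) + (1 - \pi) \, f_{-}(s) \, g_{-}(\mathrm{NTT})}
```

In the semi-parametric variant, f_- is taken from the decoy search results, while π and f_+ are fitted by the EM algorithm.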
Protein Assignment
Protein Assignment
  >Protein A
  MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNRYRDVSPFD
  HSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSGSWAAIYQDIRHEASDF
  HEASDFPCRVAKLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQIDK
  SGSWAAIYQDIRHE
  Identified peptides:
  - VAKLPKNKNR: p = 0.96
  - YMEKEFEQIDK: p = 0.65
  - EADDNDYINASLIK: p = 0.83
  Probability that at least one of these assignments is correct: P = 1 - (1 - 0.83)(1 - 0.65)(1 - 0.96)
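The formula evaluates as follows. The sketch treats the three peptide assignments as independent, as the formula itself does. `protein_probability` is an illustrative name.

```python
from math import prod

# Peptide probabilities from the slide.
peptide_probabilities = {
    "VAKLPKNKNR": 0.96,
    "YMEKEFEQIDK": 0.65,
    "EADDNDYINASLIK": 0.83,
}


def protein_probability(probabilities):
    """P(at least one assignment correct) = 1 - product of (1 - p_i)."""
    return 1 - prod(1 - p for p in probabilities)


p_protein = protein_probability(peptide_probabilities.values())
print(f"P(Protein A) = {p_protein:.5f}")
```

The product of the three error probabilities is 0.17 x 0.35 x 0.04 = 0.00238, so the protein probability comes out at about 0.998, higher than any single peptide probability.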
Protein Assignment
  >Protein A
  MEMEKEFEQIDKSGSWAAIYQDIDVGAEDFPCRVAKLPKNKNRNRYRDVSPFD
  HSRKREADDNDYINASLIKMEEAQRSYILTQQIDKSGSWAAIYQDIRHEASDF
  HEASDFPCRVAKLPKNKDEARYMEKEFEQIDKGAGVDADIRHEMEKEFEQIDK
  SGSWAAIYQDIRHE
  The same peptide identified from three spectra:
  - EADDNDYINASLIK: p = 0.83
  - EADDNDYINASLIK: p = 0.62
  - EADDNDYINASLIK: p = 0.95
  Probability(Protein A) = ?
Protein Assignment - ProteinProphet
  - Probabilistic model with NSP (Number of Sibling Peptides) as a random variable
  - Degenerate peptides (alternative splicing, paralogs, database redundancies)
Protein Assignment - IDPicker
  Bipartite graph of peptides and proteins, processed in four steps:
  A. Initialize
  B. Collapse
  C. Separate
  D. Reduce
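The four steps can be sketched on a toy example. This is an interpretation, not IDPicker's actual code: Initialize is taken to mean building the protein-to-peptide map, Collapse merges proteins with identical peptide sets, Separate splits the graph into connected components, and Reduce picks a small protein set per component with a greedy set cover. Protein and peptide names are hypothetical.

```python
from collections import defaultdict


def collapse(protein_to_peptides):
    """Merge proteins that are supported by exactly the same peptides."""
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)
    return {"/".join(sorted(names)): set(peps) for peps, names in groups.items()}


def separate(groups):
    """Split protein groups into connected components via shared peptides."""
    remaining = dict(groups)
    components = []
    while remaining:
        name, peptides = remaining.popitem()
        component, covered = {name: peptides}, set(peptides)
        grew = True
        while grew:
            grew = False
            for other in list(remaining):
                if remaining[other] & covered:
                    component[other] = remaining.pop(other)
                    covered |= component[other]
                    grew = True
        components.append(component)
    return components


def reduce_component(component):
    """Greedy set cover: repeatedly take the group explaining most peptides."""
    uncovered = set().union(*component.values())
    chosen = []
    while uncovered:
        best = max(sorted(component), key=lambda g: len(component[g] & uncovered))
        chosen.append(best)
        uncovered -= component[best]
    return chosen


# Initialize: hypothetical identifications (protein -> peptides).
protein_to_peptides = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepA", "pepB", "pepC"},
    "P3": {"pepC", "pepD"},
    "P4": {"pepD"},
    "P5": {"pepE"},
}

groups = collapse(protein_to_peptides)
minimal_proteins = sorted(
    name for component in separate(groups) for name in reduce_component(component)
)
print(minimal_proteins)
```

P1 and P2 cannot be told apart and are reported as one group, and P4 is dropped because P3 already explains its only peptide.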
Peptide Quantitation: Labeled Quantitation
  - Use of a compound containing stable isotopes
  - Peptide assignment from MS/MS
  - Peptide quantitation from MS: single ion chromatogram
Peptide Quantitation: Label-free Quantitation
  - Matching peptide features
  - AMT (Accurate Mass & Time) approach: normalized elution time
  - Spectral counting: number of spectra identified for a given peptide
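Spectral counting reduces to tallying identifications per peptide. In the sketch below the list of identifications is hypothetical, with one entry per MS/MS spectrum that passed validation.

```python
from collections import Counter

# Hypothetical validated identifications, one peptide per spectrum.
identified = [
    "EADDNDYINASLIK",
    "VAKLPKNKNR",
    "EADDNDYINASLIK",
    "YMEKEFEQIDK",
    "EADDNDYINASLIK",
]

spectral_counts = Counter(identified)
for peptide, count in spectral_counts.most_common():
    print(f"{peptide:16s} {count}")
```

Comparing such counts between samples gives a rough measure of relative abundance without any isotope label.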
Pipeline: integrated tools for MS/MS proteomics
  - Input: spectrum data (protein database)
  - Peptide assignment: SEQUEST, PEAKS, MODi
  - Peptide validation: manual validation, PeptideProphet, Target/Decoy
  - Protein assignment & validation: ProteinProphet, IDPicker
  - Output: interpretation
  - Quantitation: ASAPRatio, MaxQuant