1
PEAKS: De Novo Sequencing Using MS/MS Spectra
Bin Ma, U. Western Ontario, Canada
Kaizhong Zhang, U. Western Ontario, Canada
Chengzhi Liang, Bioinformatics Solutions Inc., Canada
2
Outline
Background
– Tandem Mass Spectrometry
De novo sequencing
– Problem Definition and Algorithm
Software implementation
– PEAKS
Future work
3
Background
Humans have roughly 100,000 different proteins. Because of post-translational modifications, each protein can exist in many different versions.
Diseases are closely related to abnormal proteins or to abnormal protein expression levels.
Given a tissue, identifying the proteins in it (and their modified versions) is a fundamental problem for drug design.
4
Proteins and Peptides
A protein is a sequence of 20 different types of amino acids.
– A protein is a string over an alphabet of size 20.
A peptide is a substring of the protein.
The 20 amino acids have 19 distinct masses.
– I and L have the same mass and are difficult to distinguish by MS/MS.
– Regard them as the same letter.
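The "20 letters, 19 masses" point can be checked with a small sketch. The table below uses standard monoisotopic residue masses; the helper names `canonical` and `residue_mass_sum` are illustrative assumptions and do not come from the slides or from PEAKS.

```python
# Monoisotopic residue masses in daltons (standard textbook values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def canonical(peptide):
    """Replace I with L so that the two isobaric residues share one letter."""
    return peptide.replace("I", "L")


def residue_mass_sum(peptide):
    """Sum of residue masses of a peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in canonical(peptide))


# 20 amino acids, but I and L collapse to a single mass value.
distinct_masses = sorted(set(RESIDUE_MASS.values()))
print(len(RESIDUE_MASS), len(distinct_masses))
```

Note that Q and K differ by only about 0.036 Da, so an instrument with low mass accuracy may not separate them either; that is outside what the slide claims.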
5
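Tandem Mass Spectrometry
MS/MS is the only reliable way to identify proteins.
[Figure: sample preparation from tissue to fraction to gel to protein to peptide; the example protein …VITK | GTDIMNEMR | SMW… is split into peptides at the marked positions]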
6
[Figure: the peptide sequence LGSSEVEQVQLVVDGVK enters the tandem mass spectrometer, which produces an MS/MS spectrum; de novo sequencing (or a database) maps the spectrum back to LGSSEVEQVQLVVDGVK]
7
How Does a Peptide Fragment?
y-ions:
$m(y_1) = 19 + m(A_4)$
$m(y_2) = 19 + m(A_4) + m(A_3)$
$m(y_3) = 19 + m(A_4) + m(A_3) + m(A_2)$
b-ions:
$m(b_1) = 1 + m(A_1)$
$m(b_2) = 1 + m(A_1) + m(A_2)$
$m(b_3) = 1 + m(A_1) + m(A_2) + m(A_3)$
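The fragment masses can be reproduced with a short sketch. It assumes nominal (integer) residue masses and the offsets 1 and 19 used on the slide; the function names `b_ions` and `y_ions` are illustrative, not part of PEAKS.

```python
# Nominal (integer) residue masses; an assumption made to keep the example simple.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def b_ions(peptide):
    """Masses of b_1 .. b_(n-1): 1 plus the mass of each proper prefix."""
    masses, total = [], 1
    for aa in peptide[:-1]:
        total += NOMINAL_MASS[aa]
        masses.append(total)
    return masses


def y_ions(peptide):
    """Masses of y_1 .. y_(n-1): 19 plus the mass of each proper suffix."""
    masses, total = [], 19
    for aa in reversed(peptide[1:]):
        total += NOMINAL_MASS[aa]
        masses.append(total)
    return masses


print(b_ions("LGER"))  # [114, 171, 300]
print(y_ions("LGER"))  # [175, 304, 361]
```

For the four-residue peptide LGER, each b-ion and its complementary y-ion add up to the residue mass sum plus 20, for example 114 + 361 = 455 + 20.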
8
Matching Sequence with Spectrum
9
De Novo Sequencing
For any peptide $P = a_1 \dots a_n$, $m(P) = \sum_i m(a_i)$.
De novo sequencing: given a spectrum and a mass value $m$, compute a sequence $P$ such that $m(P) = m$ and the matching score $\mathrm{score}(P)$ is maximized.
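To make the objective concrete, here is a minimal sketch of one possible matching score. The slides do not define score(P) at this point, so the scoring rule (sum the values of all peaks hit by a b-ion or y-ion, counting each peak once), the nominal masses, and the names `peptide_mass` and `match_score` are all assumptions for illustration.

```python
# Nominal (integer) residue masses; I is folded into L as on the earlier slide.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "K": 128, "Q": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def peptide_mass(peptide):
    """m(P): the sum of residue masses."""
    return sum(NOMINAL_MASS[aa] for aa in peptide)


def match_score(peptide, peaks):
    """Sum of peak values at the b- and y-ion masses of the peptide.

    `peaks` maps an integer mass to a peak value. Ion masses are collected
    in a set so a peak matched by both a b-ion and a y-ion counts once.
    """
    residues = [NOMINAL_MASS[aa] for aa in peptide]
    ions = set()
    prefix = 0
    for r in residues[:-1]:
        prefix += r
        ions.add(1 + prefix)    # b-ion
    suffix = 0
    for r in reversed(residues[1:]):
        suffix += r
        ions.add(19 + suffix)   # y-ion
    return sum(peaks.get(x, 0.0) for x in ions)


peaks = {114: 2.0, 304: 1.5, 361: 1.0, 999: 5.0}
print(peptide_mass("LGER"), match_score("LGER", peaks))
print(peptide_mass("GLER"), match_score("GLER", peaks))
```

Both candidates have m(P) = 455, but LGER explains three of the peaks while GLER explains only one, so LGER is preferred under this score.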
10
A Simpler Case – Only Y-ions
11
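Y-ions Determined By a Suffix
[Figure: y-ions $y_1$, $y_2$, $y_3$ marked on the peptide, each offset by 19 from the mass of the corresponding suffix]
score(Q) can be defined for a suffix Q.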
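Because a y-ion depends only on a suffix, the y-ion-only case admits a simple one-dimensional dynamic program over suffix mass. The sketch below is an illustration under assumptions: integer nominal masses, a spectrum given as a dict from integer mass to peak value, and a hypothetical function name `denovo_y_only`. It is not the PEAKS implementation.

```python
# Nominal (integer) residue masses; I is folded into L.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "K": 128, "Q": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def denovo_y_only(peaks, total_mass):
    """Best-scoring peptide with residue mass `total_mass`, using y-ions only.

    best[v] is the highest score of any suffix Q with m(Q) = v, where a
    suffix of mass v earns the peak value found at mass 19 + v.
    """
    neg = float("-inf")
    best = [neg] * (total_mass + 1)
    choice = [None] * (total_mass + 1)  # leftmost residue of the best suffix
    best[0] = 0.0
    for v in range(1, total_mass + 1):
        for aa, mass in NOMINAL_MASS.items():
            if mass <= v and best[v - mass] > neg:
                cand = best[v - mass] + peaks.get(19 + v, 0.0)
                if cand > best[v]:
                    best[v] = cand
                    choice[v] = aa
    if best[total_mass] == neg:
        return None, neg
    # Backtrack: each step peels off the leftmost residue of the suffix,
    # so residues come out in N-terminal to C-terminal order.
    seq, v = [], total_mass
    while v > 0:
        seq.append(choice[v])
        v -= NOMINAL_MASS[choice[v]]
    return "".join(seq), best[total_mass]


peaks = {175: 1.0, 304: 1.0, 361: 1.0}  # the y-ions of LGER
sequence, score = denovo_y_only(peaks, 455)
print(sequence, score)
```

Several sequences can tie for the best score (for example, R and GV both have mass 156), so the returned sequence depends on tie-breaking; only the score is guaranteed. Once b-ions are counted as well, a peak may be matched by both a prefix and a suffix, which is why this simple recursion no longer suffices.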
12
Counting Both y and b ions
13
Strategies
Consider a prefix R and a suffix Q simultaneously.
Consider only those pairs (R,Q) that satisfy a nice property, which we call “chummy”.
Chummy pairs have two useful properties:
– The score of a chummy pair can be computed recursively from a smaller chummy pair.
– There is a series of chummy pairs that grows to the optimal solution.
14
Dynamic Programming
Combining Lemmas A and B, we can compute DP(u,v).
Suppose (R,Q) is the pair maximizing DP(u,v) under the condition m(R) + m(Q) + m(a) = m. Then RaQ is the optimal peptide.
15
PEAKS – The Software
16
Comparison of PEAKS and Lutefisk
[Figure: sequencing results of the two programs; red = correct]
17
Users
18
Implementation Particulars
More accurate scoring:
– sum of the logarithmic intensities
– many other ion types
– coexisting ions, e.g., $x_2$, $y_2$, $z_2$
Deconvolution
– converting multiply-charged peaks to singly-charged ones
Recalibration
– compress/stretch the spectrum to correct calibration error
Noise reduction
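Two of the items above can be sketched briefly. The deconvolution formula is the standard conversion of an m/z value at charge z to the equivalent singly-protonated m/z; it assumes the charge state of the peak is already known. The function names are illustrative and do not come from PEAKS.

```python
import math

PROTON_MASS = 1.007276  # daltons


def to_singly_charged(mz, charge):
    """m/z the same ion would show if it carried one proton instead of `charge`."""
    return mz * charge - (charge - 1) * PROTON_MASS


def log_intensity_score(matched_intensities):
    """Sum of the logarithms of the intensities of the matched peaks."""
    return sum(math.log(i) for i in matched_intensities)


# A doubly-charged peak at m/z 500.5 maps to about 999.9927 when singly charged.
print(to_singly_charged(500.5, 2))
print(log_intensity_score([120.0, 35.0, 980.0]))
```

Taking logarithms keeps a few very intense peaks from dominating the score.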
19
Acknowledgement
Bin Ma and Kaizhong Zhang were supported by NSERC.
Chengzhi Liang was supported by BSI.
Thanks to the development team at BSI for the software development.
21
Tandem Mass Spectrometer
[Figure: tandem mass spectrometer pipeline: mass analyzer, fragmentation of precursor ions (e.g. PAK+) into fragment ions (P+, PA+, K+, AK+), second mass analyzer, ion detector; the resulting spectrum goes to de novo sequencing]
22
Algorithm Sandwich
DP(0,0) = 0; DP(u,v) = -infinity for (u,v) != (0,0);
for u from 1 to m/2 do
    for v from u - max m(a) to u + max m(a) do
        for a in Σ do
            if u < v then [update rule missing in source]
            else [update rule missing in source]
find u, v, a s.t. u + v + m(a) = m and DP(u,v) is maximized;
backtracking;
24
Dynamic Programming
1. for u from 0 to m
2. backtracking
25
Dynamic Programming We hope DP(u,v) for u+v=m gives the optimal prefix and suffix. The optimal solution can be obtained by concatenation of the prefix and suffix.
26
Chummy Pairs
Two strings Ra and bQ are called a chummy pair iff either of the following two conditions is true:
(C1) [condition missing in source]
(C2) [condition missing in source]
Examples:
– (LGE, LVR): (C2)
– (LGE, VR): (C1)
– (LGE, R): (C1)
– (LG, VR) is not chummy
27
Chummy Pairs
Lemma A: Suppose Ra and bQ are a chummy pair, with u = m(Ra) and v = m(bQ).
– If (C1) is true, [formula missing in source]
– If (C2) is true, [formula missing in source]
28
Chummy Pairs
Lemma B: Let P be the optimal solution. Then there is a chummy pair (R,Q) and a letter a such that P = RaQ.
Also, there is a series of chummy pairs such that [formula missing in source]