PEAKS: De Novo Sequencing using MS/MS spectra. Bin Ma, U. Western Ontario, Canada; Kaizhong Zhang, U. Western Ontario, Canada; Chengzhi Liang, Bioinformatics Solutions Inc., Canada.


1 PEAKS: De Novo Sequencing using MS/MS spectra. Bin Ma, U. Western Ontario, Canada; Kaizhong Zhang, U. Western Ontario, Canada; Chengzhi Liang, Bioinformatics Solutions Inc., Canada

2 Outline Background –Tandem Mass Spectrometry De novo sequencing –Problem Definition and Algorithm. Software implementation – PEAKS Future work

3 Background Humans have roughly 100,000 different proteins. Because of post-translational modifications, each protein can exist in many different versions. Diseases are closely related to abnormal proteins or to abnormal protein expression levels. Given a tissue, identifying the proteins (and their modified versions) in it is a fundamental problem for drug design.

4 Proteins and Peptides A protein is a sequence of 20 different types of amino acids. –A protein is a string over an alphabet of size 20. A peptide is a substring of the protein. The 20 amino acids have 19 distinct masses. –I and L have the same mass and are difficult to distinguish by MS/MS. –Regard them as the same letter.
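The "19 distinct masses" statement can be checked with a small sketch. The table below uses standard monoisotopic residue masses (in Da); the names `MONO` and `canonical` are illustrative and not part of PEAKS.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def canonical(peptide: str) -> str:
    """Fold I into L, since the two residues have identical mass."""
    return peptide.replace("I", "L")

# 20 letters, but I and L collide, leaving 19 distinct masses.
distinct_masses = {round(m, 5) for m in MONO.values()}
print(len(MONO), len(distinct_masses))
print(canonical("VITK"))
```

Note that K and Q differ by only about 0.036 Da, so whether they can be told apart depends on instrument accuracy.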

5 Tandem Mass Spectrometry MS/MS is the only reliable way to identify proteins. [Figure: sample workflow tissue → fraction → gel → protein → peptide, with cut points marked in the protein …VITK | GTDIMNEMR | SMW…]

6 [Figure: peptide sequence LGSSEVEQVQLVVDGVK → tandem mass spectrometer → MS/MS spectrum → de novo sequencing (or database) → LGSSEVEQVQLVVDGVK]

7 How Does a Peptide Fragment? For a peptide A_1 A_2 A_3 A_4:
m(y_1) = 19 + m(A_4)
m(y_2) = 19 + m(A_4) + m(A_3)
m(y_3) = 19 + m(A_4) + m(A_3) + m(A_2)
m(b_1) = 1 + m(A_1)
m(b_2) = 1 + m(A_1) + m(A_2)
m(b_3) = 1 + m(A_1) + m(A_2) + m(A_3)
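As a worked instance of the formulas on this slide, the sketch below computes b- and y-ion masses for a short peptide. It assumes nominal (integer) residue masses and the offsets 1 and 19 from the slide; `NOMINAL` and `fragment_masses` are illustrative names.

```python
# Nominal (integer) residue masses; I is folded into L.
NOMINAL = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def fragment_masses(peptide):
    """Return (b, y): b[i-1] = m(b_i), y[i-1] = m(y_i) for i = 1..n-1."""
    masses = [NOMINAL[a] for a in peptide]
    n = len(masses)
    # b_i = 1 + mass of the first i residues (prefix)
    b = [1 + sum(masses[:i]) for i in range(1, n)]
    # y_i = 19 + mass of the last i residues (suffix)
    y = [19 + sum(masses[n - i:]) for i in range(1, n)]
    return b, y

b_ions, y_ions = fragment_masses("GASP")
print(b_ions)  # [58, 129, 216]
print(y_ions)  # [116, 203, 274]
```

Under these offsets, m(b_i) + m(y_{n-i}) equals the total residue mass plus 20, which is why a single peak can often be explained as either a b-ion or a y-ion.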

8 Matching Sequence with Spectrum

9 De Novo Sequencing For any peptide P = a_1…a_n, m(P) = Σ_i m(a_i). De Novo Sequencing: given a spectrum and a mass value m, compute a sequence P such that m(P) = m and the matching score score(P) is maximized.

10 A Simpler Case – Only Y-ions

11 Y-ions Determined By a Suffix [Figure: y-ions y_1, y_2, y_3 of a suffix, each offset by 19] score(Q) can be defined for a suffix Q.
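The y-ion-only case can be sketched as a one-dimensional dynamic program over the suffix mass. This is a minimal sketch under stated assumptions, not the PEAKS implementation: masses are nominal integers, `peaks` is a hypothetical dict mapping a peak mass to a reward, and the score of a suffix is the sum of rewards at its y-ion masses (suffix mass + 19).

```python
NOMINAL = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def denovo_y_only(peaks, total_mass):
    """Best-scoring sequence with residue mass total_mass, y-ions only."""
    neg_inf = float("-inf")
    best = [neg_inf] * (total_mass + 1)   # best[v]: top score of a suffix of mass v
    choice = [None] * (total_mass + 1)    # residue prepended to reach mass v
    best[0] = 0.0
    for v in range(1, total_mass + 1):
        for a, ma in NOMINAL.items():
            if ma <= v and best[v - ma] > neg_inf:
                # Prepending a creates a new y-ion at mass v + 19.
                cand = best[v - ma] + peaks.get(v + 19, 0.0)
                if cand > best[v]:
                    best[v], choice[v] = cand, a
    if best[total_mass] == neg_inf:
        return None, neg_inf
    # Backtrack: the residue chosen at the largest mass is the leftmost one.
    seq, v = [], total_mass
    while v > 0:
        seq.append(choice[v])
        v -= NOMINAL[choice[v]]
    return "".join(seq), best[total_mass]

# Toy spectrum containing the y-ions of GASP (116, 203, 274), reward 1 each.
sequence, score = denovo_y_only({116: 1, 203: 1, 274: 1}, 312)
print(sequence, score)  # GASP 3.0
```

With nominal masses K and Q are indistinguishable, so ties between them are broken arbitrarily by dictionary order.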

12 Counting Both y and b ions

13 Strategies Consider a prefix R and a suffix Q simultaneously, as a pair. Consider only those pairs (R,Q) that satisfy a nice property, which we call “chummy”. Chummy pairs allow: –The score of a chummy pair can be computed recursively from a smaller chummy pair. –There is a series of chummy pairs that grows to the optimal solution.

14 Dynamic Programming Combining Lemmas A and B, we can compute DP(u,v). Suppose (R,Q) is the pair maximizing DP(u,v) under the condition m(R)+m(Q)+a=m. Then RaQ is the optimal peptide.

15 PEAKS – The Software

16 Comparison of PEAKS and Lutefisk (Red = Correct)

17 Users

18 Implementation Particulars More accurate scoring: –sum of the logarithmic intensities –many other ion types –coexisting ions, e.g., x_2, y_2, z_2. Deconvolution: –converting multiply-charged peaks to singly-charged ones. Recalibration: –compress/stretch the spectrum to correct calibration error. Noise reduction.
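Two of the items above can be illustrated with a short sketch: deconvolution of a multiply-charged peak and a log-intensity score. It assumes the charge state z of each peak is already known (in practice it has to be inferred), and the function names are illustrative rather than taken from PEAKS.

```python
import math

PROTON = 1.007276  # proton mass in Da

def to_singly_charged(mz, z):
    """Convert a peak observed at m/z with charge z to its 1+ equivalent."""
    # (M + z*PROTON) / z = mz, so M + PROTON = mz*z - (z - 1)*PROTON
    return mz * z - (z - 1) * PROTON

def log_intensity_score(matched_intensities):
    """Sum of log intensities of the peaks matched by a candidate peptide."""
    return sum(math.log(i) for i in matched_intensities if i > 0)

print(to_singly_charged(250.503638, 2))
print(log_intensity_score([100.0, 10.0]))
```

Taking logarithms keeps a few very intense peaks from dominating the score, which is one plausible reason for that choice.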

19 Acknowledgement Bin Ma and Kaizhong Zhang were supported by NSERC. Chengzhi Liang was supported by BSI. Thanks to the development team at BSI for the software development.


21 Tandem Mass Spectrometer [Figure: peptides (MPSER, SG…, PAK) → mass analyzer → precursor ions (PAK+) → fragmentation → fragment ions (P+, AK+, PA+, K+) → mass analyzer → ion detector → de novo sequencing]

22 Algorithm Sandwich
DP(0,0) = 0; DP(u,v) = -infinity for (u,v) != (0,0);
for u from 1 to m/2 do
  for v from u-max(a) to u+max(a) do
    for a in Σ do
      if u<v then … else …
find u, v, a such that u+v+a=m and DP(u,v) is maximized;
backtracking;


24 Dynamic Programming 1. for u from 0 to m 2. backtracking

25 Dynamic Programming We hope DP(u,v) for u+v=m gives the optimal prefix and suffix. The optimal solution can be obtained by concatenation of the prefix and suffix.

26 Chummy Pairs Two strings Ra and bQ are called a chummy pair iff either of the following two conditions is true: (C1) (C2). Examples: (LGE, LVR) satisfies (C2); (LGE, VR) satisfies (C1); (LGE, R) satisfies (C1); (LG, VR) is not chummy.

27 Chummy pairs Lemma A – Suppose Ra and bQ are a chummy pair. u=m(Ra), v=m(bQ). If (C1) is true, If (C2) is true,

28 Chummy Pairs Lemma B – Let P be the optimal solution. Then there is a chummy pair (R,Q) and a letter a such that P=RaQ. Also, there is a chummy pair series such that

