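Sun-Yuan Hsieh (謝孫源), Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science, National Cheng Kung University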


1 A New Branch and Bound Method for the Protein Folding Problem in the HP Model
Sun-Yuan Hsieh (謝孫源), Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science, National Cheng Kung University. hsiehsy@mail.ncku.edu.tw 2019/4/16

2 Outline
Bioinformatics Background; Motivation & Purpose; Related Works; Preliminaries; New Branch and Bound; Experiment Result; Conclusion

3 Background
Amino acid: the fundamental unit of proteins. There are twenty different kinds of amino acids, e.g. valine (纈胺酸), which promotes brain function, improves muscle coordination, and stabilizes mood.
Protein: proteins are large organic compounds made of amino acids arranged in a linear chain.

4 Background(1/2) Protein structure

5 Background(2/2) Protein folding problem (PFP)
Every two years, the performance of current methods is assessed in the CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiment. Protein structure prediction is of high importance in medicine (for example, in drug design) and biotechnology.

6 Motivation & Purpose(1/2)
Experimental methods: NMR and X-ray crystallography (Protein Data Bank).
Computational models: homology modeling; ab initio.

7 Motivation & Purpose(2/2)
Comparison of structure prediction methods by cost and time: experimental methods, high; computational methods, low.
Our purpose is to design an efficient algorithm in the HP model that finds minimal-energy conformations (native structures).

8 Related Works(1/2) The HP (hydrophobic-polar) model was introduced by K. A. Dill (1985). P: hydrophilic (polar). H: hydrophobic.

9 Related Works(2/2) Other computing methods for the PFP in the HP model
Ant Colony Optimization [A. Shmygelska 05]; Branch and Bound [M. Chen 05]; Evolutionary Monte Carlo [F. Liang 01]; Genetic Algorithms [R. Unger 93]

10 Preliminaries(1/9) In general, the PFP can be reduced to the following three steps:
(1) Design a simple model with a desired level of accuracy. (Dill)
(2) Define an energy function that can effectively discriminate native states from nonnative states. (Dill)
(3) Design an efficient algorithm to find minimal-energy conformations easily.

11 Preliminaries(2/9)

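12 Preliminaries(3/9) Conformation process
The input sequence (1D) is folded onto the 2D HP lattice model. Binding edges join monomers that are consecutive in the sequence; contact edges join two H monomers that are lattice neighbors but not consecutive in the sequence. Ex: PPHHPPHHPPHPPHPP, cost (free energy) = -5.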
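
The cost described on this slide can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the function name `hp_cost` and the representation of a conformation as a list of `(x, y)` lattice points are assumptions made here.

```python
def hp_cost(sequence, coords):
    """Return -(number of contact edges) of a 2D lattice conformation.

    sequence: string over {'H', 'P'}.
    coords:   list of (x, y) lattice points, coords[i] holds sequence[i].
    """
    if len(sequence) != len(coords):
        raise ValueError("one lattice point is needed per monomer")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (ax, ay), (bx, by) in zip(coords, coords[1:]):
        if abs(ax - bx) + abs(ay - by) != 1:
            raise ValueError("consecutive monomers must be lattice neighbors")

    index_at = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every neighboring pair is seen once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            # A contact edge joins two H monomers that are not consecutive.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For instance, folding HPPH into a unit square, `hp_cost("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])`, puts the two H monomers next to each other and gives a cost of -1, while the same sequence laid out on a straight line has cost 0.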

13 Preliminaries(4/9)

14 Preliminaries(5/9) Proof.
Note that the least number of binding edges contributed by the h H monomers occurs when p1 = H and pm = H. Without loss of generality, assume that $p = H^{h_1} P^{+} H^{h_2} P^{+} \cdots P^{+} H^{h_k}$, where $P^{+}$ represents a substring composed of at least one P and $h_1 + h_2 + \cdots + h_k = h$. Then the number of binding edges contributed by the h H monomers equals $h_1 + (h_2+1) + (h_3+1) + \cdots + (h_{k-1}+1) + h_k = h + k - 2 \ge h + 2 - 2 = h$, since $k \ge 2$ whenever p contains at least one P. □
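
The count h + k - 2 in this proof can be checked numerically with a short Python sketch. The helper names `h_binding_edges` and `h_blocks` are made up for this illustration, and the check assumes, as the proof does, that the sequence starts and ends with H.

```python
import re


def h_binding_edges(sequence):
    """Count binding edges (consecutive pairs) that touch at least one H."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if "H" in (a, b))


def h_blocks(sequence):
    """Count maximal runs of consecutive H monomers (k in the proof)."""
    return len(re.findall(r"H+", sequence))


def matches_formula(sequence):
    """Check the count h + k - 2 for a sequence that starts and ends with H."""
    h = sequence.count("H")
    k = h_blocks(sequence)
    return h_binding_edges(sequence) == h + k - 2
```

For example, HHPHPPH has h = 4 and k = 3, and five of its six binding edges touch an H monomer, which agrees with 4 + 3 - 2 = 5.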

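15 Preliminaries(6/9) Proof. Note that every H monomer in p has exactly four lattice neighbors and contributes at most two contact edges (an endpoint contributes at most three). Summing over all h H monomers gives at most 2(h-2) + 2·3 = 2h+2 contact-edge endpoints, and each contact edge is counted at both of its H endpoints. Therefore, any conformation of p has at most h+1 contact edges. Since the cost of a conformation equals -(number of contact edges), -(h+1) is a lower bound on the cost.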

16 Preliminaries(7/9) A conformation with cost -(h+1), where s and t represent the two endpoints of the path.

17 Preliminaries(8/9) Proof.
Contact edges: at most 2(h'-1) + 3 = 2h'+1.
Cost: at least -(2h'+1).

18 Preliminaries(9/9)
h = the number of H monomers in p. Contact edges: at most h+1. Cost: at least -(h+1). (lemma 2)
h' = the number of H monomers among the remaining (not yet placed) monomers of p. Contact edges: at most 2h'+1. Cost: at least -(2h'+1). (theorem 1)
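
The two bounds summarized on this slide are simple enough to write down directly. The Python sketch below is illustrative only: the function names are invented here, and `k` is assumed to mean the number of monomers already placed, so that the remaining monomers are `sequence[k:]`.

```python
def cost_lower_bound(sequence):
    """Lemma 2: no conformation of the sequence costs less than -(h + 1)."""
    h = sequence.count("H")
    return -(h + 1)


def remaining_cost_lower_bound(sequence, k):
    """Theorem 1: the monomers after the first k add at least -(2h' + 1)."""
    h_rest = sequence[k:].count("H")
    return -(2 * h_rest + 1)
```

For the example sequence PPHHPPHHPPHPPHPP, which contains six H monomers, `cost_lower_bound` gives -7, consistent with the example cost of -5 shown earlier. After the first eight monomers are placed, two H monomers remain and `remaining_cost_lower_bound` gives -5.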

19 New Branch and Bound(1/9)
Use case of the NBB method. Input: protein sequence (1D). Process: folding to 2D/3D in the HP model. Output: time, faster; cost, minimum.

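20 New Branch and Bound(2/9)
Search tree on the 2D lattice HP model. From the starting direction, each step branches three ways: turn right, go forward, or turn left. [Figure: search tree with example conformations; energy function => cost = -1.] The number and concentration of H monomers are the main factors affecting the cost.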

21 New Branch and Bound(3/9)
Full branching: considers all conformations exhaustively; the time cost is high.
Bounding (pruning): may miss some of the best solutions; a threshold is used to establish the boundary for pruning.
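
To give a feel for why full branching is expensive, the following Python sketch counts how many self-avoiding conformations a chain of n monomers has on the 2D square lattice when the first bond is fixed and every later step turns left, goes forward, or turns right. The function name `count_conformations` is an assumption of this illustration, and mirror-image conformations are counted separately.

```python
def count_conformations(n):
    """Count self-avoiding 2D lattice placements of n >= 1 monomers.

    The first bond is fixed from (0, 0) to (1, 0); each later monomer is
    placed by turning left, going forward, or turning right.
    """
    if n < 2:
        return 1

    def extend(path, occupied, heading):
        if len(path) == n:
            return 1
        x, y = path[-1]
        dx, dy = heading
        total = 0
        # Left, forward, and right relative to the current heading.
        for ndx, ndy in ((-dy, dx), (dx, dy), (dy, -dx)):
            nxt = (x + ndx, y + ndy)
            if nxt in occupied:
                continue  # the chain may not cross itself
            occupied.add(nxt)
            path.append(nxt)
            total += extend(path, occupied, (ndx, ndy))
            path.pop()
            occupied.remove(nxt)
        return total

    start = [(0, 0), (1, 0)]
    return extend(start, set(start), (1, 0))
```

The count is 25 for five monomers and 71 for six, and it keeps growing exponentially with the chain length, which is what makes pruning attractive for the benchmark lengths mentioned later in the talk.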

22 New Branch and Bound(4/9)
Dynamic threshold; R-step checking

23 New Branch and Bound(5/9)
Dynamic threshold: use a potential score function to predict the future cost of the remaining monomers (theorem 1). α: compensation coefficient. A very low α means that the current path cannot reach a more optimistic cost, but it does not mean that the final cost is not a solution; we therefore use (2h'+1)*(1-α) to compensate for it.

24 New Branch and Bound(6/9)
An illustration of the potential score function. The path from the root to v corresponds to a partial conformation formed by p1p2…pk, where v is associated with ps(k). In (a), we branch out from v because ps(k) ≤ Cmin. In (b), we prune the branch at v because ps(k) > Cmin.
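
The branch-or-prune decision illustrated here can be sketched with a plain branch-and-bound search in Python. This is not the NBB method itself: the transcript does not give the formula for ps(k), so the sketch substitutes the optimistic estimate E_k - (2h'+1) taken from theorem 1, where E_k is the cost of the partial conformation. The function name `fold_branch_and_bound` and all variable names are assumptions of this illustration.

```python
def fold_branch_and_bound(sequence):
    """Return (cost, coords) of a minimum-cost 2D HP conformation."""
    n = len(sequence)
    line = [(i, 0) for i in range(n)]
    if n < 4:
        return 0, line  # too short to form any contact edge

    # h_rest[k] = number of H monomers in sequence[k:], i.e. h' after k placed.
    h_rest = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        h_rest[i] = h_rest[i + 1] + (sequence[i] == "H")

    best = {"cost": 0, "coords": line}  # the straight line has cost 0

    def search(path, index_at, heading, cost):
        k = len(path)
        if k == n:
            if cost < best["cost"]:
                best["cost"], best["coords"] = cost, list(path)
            return
        # Stand-in for ps(k): optimistic estimate based on theorem 1.
        if cost - (2 * h_rest[k] + 1) > best["cost"]:
            return  # prune: this branch cannot reach the best cost so far
        x, y = path[-1]
        dx, dy = heading
        for ndx, ndy in ((-dy, dx), (dx, dy), (dy, -dx)):  # left, forward, right
            px, py = x + ndx, y + ndy
            if (px, py) in index_at:
                continue
            gained = 0
            if sequence[k] == "H":
                for nb in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    j = index_at.get(nb)
                    # Skip the predecessor: that edge is a binding edge.
                    if j is not None and j != k - 1 and sequence[j] == "H":
                        gained += 1
            index_at[(px, py)] = k
            path.append((px, py))
            search(path, index_at, (ndx, ndy), cost - gained)
            path.pop()
            del index_at[(px, py)]

    search(line[:2], {line[0]: 0, line[1]: 1}, (1, 0), 0)
    return best["cost"], best["coords"]
```

Because the stand-in estimate never overstates what the remaining monomers can contribute, this sketch only discards branches that cannot match the best cost found so far, so it stays exact. The slides suggest that the NBB threshold is tuned more aggressively, trading some of that guarantee for speed.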

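25 New Branch and Bound(7/9)
R-step checking: if an H (1) vertex exists within the circle of radius r, then a cost may be produced within the next r steps of the folding process. [Figure: circle of radius r around s, with vertices labeled D and H.]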
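
One way to read the radius test on this slide is sketched below in Python. It is only an interpretation: the transcript does not say how the circle is measured or centered, so the sketch assumes Euclidean distance from the most recently placed monomer, and the function name `h_within_radius` is made up here.

```python
def h_within_radius(sequence, coords, r):
    """Report whether a placed H monomer lies within distance r of the chain end.

    coords[i] is the lattice point of sequence[i] for the monomers placed so
    far; the circle is centered on the last placed monomer (an assumption).
    """
    cx, cy = coords[-1]
    last = len(coords) - 1
    for i, (x, y) in enumerate(coords[:last]):
        # Compare squared distances to avoid floating-point square roots.
        if sequence[i] == "H" and (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
            return True
    return False
```

With HPPP laid out on a straight line, the only H monomer is three lattice units from the chain end, so the check fails for r = 2 and succeeds for r = 3.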

26 New Branch and Bound(8/9)
The concept of the NBB method. When ratio = 1: use threshold ps(k). When ratio = 0: use threshold -(Ek+h'+1) (lemma 1).

27 New Branch and Bound(9/9)
The algorithm of the NBB method

28 Experiment Result(1/6)
PC: CPU 3.4 GHz; memory 512 MB.
Parameters: r = 2 and ratio = 0.5 for most experiments.
Benchmark sequences: S1 - S10 (lengths from 20 to 100) ( eports/compbio/tortilla-hp-benchmarks.htm.)

29 Experiment Result(2/6)

30 Experiment Result(3/6)

31 Experiment Result(4/6) Cost comparison of various algorithms

32 Experiment Result(5/6)

33 Experiment Result(6/6)

34 Conclusion
Our method can obtain a near-optimal energy conformation by running once for each benchmark sequence. Unlike the BB method, it uses biochemical properties of proteins to construct the conformation. Future work: extend the method to the 3D HP model.

