1
DNA Computing: Solving Optimization Problems on a DNA Computer
Ka-Lok Ng
Dept. of Bioinformatics, Taichung Healthcare and Management University
2
Content
1. Why consider DNA computing?
2. Basic molecular biology & basic DNA operations
3. Molecular computation - solved problems
   3.1 Hamiltonian Path Problem
   3.2 Boolean formula
   3.3 Integer knapsack problem
4. Limitations and errors
5. Prospects
4
Why consider DNA computing?
Essences of computation:
1. Massive parallelism
   A test tube can contain 10^22 DNA strands, and each reaction takes place independently.
2. Number of operations per second
   A silicon-based computer is much better (~10^[exponent missing] operations/sec); DNA computing needs human intervention.
3. Extremely large associative memory
   Memory density ~ 1 bit/nm^3, far greater than video tape at ~ 1 bit/10^12 nm^3.
   Human synapses: ~ 10^14, each storing a few bits.
   Associative memory: retrieve a strand by matching a sub-sequence.
   Stored strands: 000000…, 110100…, 000111…
   Input seq.: *1*1…
   Result: retrieve and read the 2nd strand.
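The associative lookup on this slide can be imitated in software. The following is a minimal sketch, assuming that `*` stands for "any bit" and that only the leading bits shown on the slide matter; the function names `matches` and `retrieve` are illustrative and not part of the original presentation.

```python
def matches(pattern: str, strand: str) -> bool:
    """True if every non-'*' pattern position equals the strand at that position."""
    return len(strand) >= len(pattern) and all(
        p == "*" or p == s for p, s in zip(pattern, strand)
    )


def retrieve(memory, pattern):
    """Return the 1-based positions of all stored strands matching the pattern."""
    return [k for k, strand in enumerate(memory, start=1) if matches(pattern, strand)]


# Leading bits of the three stored strands shown on the slide (the rest is elided there).
memory = ["000000", "110100", "000111"]
print(retrieve(memory, "*1*1"))  # [2]
```

In a test tube the comparison against every stored strand happens at once through hybridization; the loop here only reproduces the result, not the parallelism.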
6
Basic Molecular Biology
The DNA double helix:
   5' A G T C C … 3'
   3' T C A G G … 5'
Purines (double-ring structure): adenine (A), guanine (G)
Pyrimidines (single-ring structure): thymine (T), cytosine (C)
Watson-Crick complements: (A,T) and (C,G), held together pairwise by hydrogen bonding
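The pairing rule can be written as a small lookup table. This is a minimal sketch; `complement` and `reverse_complement` are illustrative names, and strands are assumed to be plain strings over A, C, G, T written 5' to 3'.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base partner strand, read in the same left-to-right order (3' to 5')."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Partner strand rewritten in the conventional 5' to 3' direction."""
    return complement(strand)[::-1]


print(complement("AGTCC"))          # TCAGG, the lower strand of the slide's helix
print(reverse_complement("AGTCC"))  # GGACT
```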
7
Basic DNA Operations
1. DNA synthesiser: makes arbitrary DNA strands, time ~ hours
   Notation: ATGC = 5'-ATGC-3'; (ATGC)^C = 3'-TACG-5'
2. Hybridization (annealing): by hydrogen bonding, time ~ 30 sec.; it is a second-order kinetic reaction
3. Denaturation: heat until even the longest strands become unstable, dsDNA → ssDNA
4. Ligation: with x and y annealed side by side on x^C y^C, the ligase enzyme joins them into a single strand xy
8
Basic DNA Operations
5. Polymerase extension
   The polymerase enzyme attaches to the 3' end of the primer seq. and constructs the complement x^C of the longer seq. x.
   [Figure: primer annealed to the longer strand x; the polymerase enzyme extends the primer from its 3' end to build x^C]
9
Basic DNA Operations
6. Cut: Type II restriction enzymes (endonucleases) cut an ssDNA or dsDNA strand at a specific sub-seq., usually a seq. of 4 to 8 nucleotides.

   Nomenclature   Specific site       Expected freq. (one site per)   Ends after cutting
   EcoRI          5'-GAATTC-3'        4^6 = 4096 bases                5'-protruding
   HaeIII         5'-GGCC-3'          4^4 = 256 bases                 blunt
   PstI           5'-CTGCAG-3'        4^6 = 4096 bases                3'-protruding

   Recognition sites on both strands:
   HaeIII   GGCC / CCGG
   PstI     CTGCAG / GACGTC
   EcoRI    GAATTC / CTTAAG
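The "expected frequency" column follows from counting: in a random sequence with the four bases equally likely (an assumption made here for illustration), a given site of length k is expected once per 4^k positions. The sketch below computes that figure and locates the three sites in the example sequence from the slide; `expected_spacing` and `find_sites` are illustrative names.

```python
# Recognition sites taken from the table on the slide.
SITES = {"EcoRI": "GAATTC", "HaeIII": "GGCC", "PstI": "CTGCAG"}


def expected_spacing(site: str) -> int:
    """Expected distance between sites, assuming equally likely, independent bases."""
    return 4 ** len(site)


def find_sites(seq: str, site: str):
    """0-based start positions of every occurrence of the site in seq."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


seq = "GGCCCTGCAGGAATTC"  # upper strand of the slide's example
for name, site in SITES.items():
    print(name, expected_spacing(site), find_sites(seq, site))
```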
10
Basic DNA Operations
7. Merge: combine two or more test tubes of DNA solution
8. Separation by length: gel electrophoresis
   Agarose Gel Electrophoresis (AGE): long seq., 300 ~ 50,000 bp, t ~ 5 hours
   Polyacrylamide Gel Electrophoresis (PAGE): short seq., 1 ~ 1000 bp, t ~ 1 hour
   [Figure: gel in buffer between the - and + electrodes; the shortest DNA strands travel farthest]
11
High Level Manipulations
1. Polymerase Chain Reaction (PCR): amplification
   Developed by Kary Mullis (Nobel Prize in Chemistry, 1993)
   Prepare primers
   [Figure: template x y z / x^C y^C z^C with a primer annealed at the 3' end of each strand]
12
High Level Manipulations
1. Polymerase Chain Reaction (PCR)
   Step 1: polymerase extends the primers along each template strand
   Step 2: the resulting dsDNA is melted
   Repeat the above two processes → amplification
   [Figure: polymerase extending the primers on the strands x y z and x^C y^C z^C]
13
High Level Manipulations
2. Separation by sub-sequence: by magnetic bead (affinity purification)
   A primer s^C with an attached bead anneals to the short seq. s; a magnet then holds back the strands that contain s.
3. Append
   3.1 by polymerase
   3.2 by ligation
   [Figure: strand x extended to x y on the template x^C y^C]
14
High Level Manipulations
4. Mark: for separation or to operate selectively
   4.1 appending a tag seq. to the ssDNA
   4.2 methylation or (de)phosphorylation of the 5' end; carried out by specific enzymes, it can stop some restriction enzymes from cutting at the site
   4.3 forming a dsDNA through hybridization or the action of polymerase
5. Unmark: removes the mark on the strand
16
Molecular Computation - Solved problems: Hamiltonian Path Problem
3.1 Directed Hamiltonian Path Problem (DHPP)
Adleman, Science, 266, 1021-1024 (1994)
Problem: Given 7 cities joined by directed edges, is there a path that visits every city exactly once?
[Figure: directed graph on the cities 0 to 6]
A possible solution: 0 → 1 → 2 → 3 → 4 → 5 → 6
Algorithm:
1. For each vertex V and each edge E, create a 20-mer DNA strand. In a vertex strand x_i y_i, x_i is the "in" half and y_i the "out" half.
   V: x_0 y_0, x_1 y_1, …, x_6 y_6, and x_0^c, y_6^c
   E: y_0^c x_1^c, y_1^c x_2^c, …, y_5^c x_6^c, y_0^c x_3^c, y_0^c x_6^c, …
17
Molecular Computation - Solved problems: Hamiltonian Path Problem
Algorithm (DHPP)
2. Hybridization: the following kinds of DNA strands can form
   - does not begin with x_0, ends with y_6: (1 → 2 → 3 → 4 → 5 → 6)
   - begins with x_0, does not end with y_6: (0 → 1 → 2 → 3 → 4 → 5)
   - begins with x_0 and ends with y_6, but does not visit every city once: (0 → 3 → 2 → 3 → 4 → 5 → 6); these are regarded as noise
   - begins with x_0, ends with y_6, and visits every city once: (0 → 1 → 2 → 3 → 4 → 5 → 6)
18
Molecular Computation - Solved problems: Hamiltonian Path Problem
Algorithm (DHPP)
3. Separation: select those dsDNA that start with x_0 and end with y_6, and use PCR to amplify this type of DNA strand.
4. Separate out all dsDNA that go through exactly 7 vertices (140-mer) by PAGE, then amplify by PCR. For N < 150, the distance d travelled by a strand of N bases follows d = a - b ln N.
   [Figure: plot of d against N]
19
Molecular Computation - Solved problems: Hamiltonian Path Problem
Algorithm (DHPP)
5. Separate out all DNA strands that go through all 7 cities by affinity purification:
   melt the dsDNA strands from the previous step, then extract by affinity purification, one probe after another.
   [Figure: probes y_0^c x_1^c, …, y_5^c x_6^c, each with an attached bead]
6. Detect whether any DNA strands remain.
   Yes → the DHPP has a solution
   No → the DHPP has no solution
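Steps 2 to 6 can be mimicked in software to see what each filter removes. This is only a sketch: the slide's edge list is elided after y_0^c x_6^c, so the edge set below (the chain 0 → 1 → … → 6 plus 0 → 3, 0 → 6 and 3 → 2) is an assumption chosen to reproduce the example paths; random ligation is replaced by exhaustive enumeration, and the names `all_walks` and `adleman_filter` are illustrative.

```python
# Hypothetical edge set: the slide only lists some of the edges explicitly.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (0, 3), (0, 6), (3, 2)]


def all_walks(edges, n_vertices, max_vertices):
    """Step 2 stand-in: every walk with at most max_vertices vertices."""
    frontier = [(v,) for v in range(n_vertices)]
    walks = list(frontier)
    for _ in range(max_vertices - 1):
        frontier = [w + (b,) for w in frontier for (a, b) in edges if a == w[-1]]
        walks.extend(frontier)
    return walks


def adleman_filter(edges, n_vertices, start, end):
    tube = all_walks(edges, n_vertices, n_vertices)
    # Step 3: keep strands that begin at the start city and end at the end city.
    tube = [w for w in tube if w[0] == start and w[-1] == end]
    # Step 4: keep strands of the right length (n vertices, i.e. 20 * n bases).
    tube = [w for w in tube if len(w) == n_vertices]
    # Step 5: one affinity purification per city.
    for city in range(n_vertices):
        tube = [w for w in tube if city in w]
    # Step 6: anything left encodes a Hamiltonian path.
    return tube


print(adleman_filter(EDGES, 7, 0, 6))  # [(0, 1, 2, 3, 4, 5, 6)]
```

With this edge set the noise walk 0 → 3 → 2 → 3 → 4 → 5 → 6 survives steps 3 and 4 and is removed only in step 5, because it never visits city 1.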
20
Molecular Computation - Solved problems: Hamiltonian Path Problem

   Step                    Time
   Create DNA strands
   Hybridization           ~ 30 sec.
   PCR                     ~ 2 hrs.
   Gel electrophoresis     ~ 5 hrs. (AGE), ~ 1.2 hrs. (PAGE)
   Affinity purification   7 times ~ 1 hr.
   Detect                  ~ sec.
   Total                   ~ 7 days !!
22
Molecular Computation - Solved problems: Boolean formula
Boolean formula, B
In particular, consider the Conjunctive Normal Form (CNF)
   B = C_1 ∧ C_2 ∧ C_3 ∧ … ∧ C_m, where C_k = x_1 ∨ x_2 ∨ x_3' ∨ …
C is called a clause, x is called a literal, ∧ is the logical AND, ∨ is the logical OR, and x' is the negation of x.
   B = (x_1 ∨ x_2) ∧ (x_1 ∨ x_2 ∨ x_3') ∧ … ∧ C_m
Satisfiability problem (B = True):
Determine a set of values of the logical variables (x_1, x_2, x_3, …) such that B = True.
23
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean Formula)
Example: B = (x_1 ∨ x_2) ∧ (x_1' ∨ x_2')

   x_1   x_2   x_1 ∨ x_2   B   x_1'   x_2'   x_1' ∨ x_2'
   0     0     0           0   1      1      1
   0     1     1           1   1      0      1
   1     0     1           1   0      1      1
   1     1     1           0   0      0      0
24
Molecular Computation - Solved problems: Boolean formula
3.2 Boolean Formula
Lipton, Science, 268, 542 (1995)
Encode an n-bit binary number by a graph, G_n.
[Figure: graph G_n; from each a_i one edge leads to x_i and one to x_i', and both lead on to a_{i+1}, for i = 1, …, n]
Notation: x = 1 (True), x' = 0 (False); vertices a_i; edges E_{a_i x_i}, E_{a_i x_i'}, E_{x_i a_{i+1}}, E_{x_i' a_{i+1}}
The path a_1 x_1 a_2 x_2' a_3 encodes the binary number 10.
In general, the graph G_n represents {0,1}^n.
25
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean Formula)
1. Create DNA strands to encode the vertices and edges
   [Figure: graph G_n with vertices a_1, …, a_{n+1}, x_1, …, x_n and x_1', …, x_n']

   Vertex (3n+1 strands)
   a_1       5'-p_a1 q_a1-3'
   …
   a_{n+1}   5'-p_an+1 q_an+1-3'
   x_1       5'-p_x1 q_x1-3'
   x_1'      5'-p_x1' q_x1'-3'

   Edge (4n strands)
   E_a1x1    3'-q_a1^c p_x1^c-5'
   E_a1x1'   3'-q_a1^c p_x1'^c-5'
   E_x1a2    3'-q_x1^c p_a2^c-5'
   E_x1'a2   3'-q_x1'^c p_a2^c-5'
26
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean Formula)
2. Hybridization
   A path V-E-V-E-V…V denotes an n-bit binary number.
   Example: the path a_1 x_1 a_2 x_2' a_3 denotes the 2-bit binary number 10.
   Vertex strands (5' → 3'): p_a1 q_a1 | p_x1 q_x1 | p_a2 q_a2 | p_x2' q_x2' | p_a3 q_a3
   Edge strands bridging them: q_a1^c p_x1^c | q_x1^c p_a2^c | q_a2^c p_x2'^c | q_x2'^c p_a3^c
27
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean Formula)
3. Extraction
   Define E(t,i,a) as extracting from test tube t the strands whose ith position has the Boolean value a (a = 0 or 1).
   OR is done by using multiple tubes.
   AND is done by repeated extraction.
28
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean Formula)
3. Extraction

   Test tube   Operation                               Values present
   t_0         Create DNA strands                      00, 01, 10, 11
   t_1         E(t_0,1,1)                              10, 11
   t_1'        Remainder of t_0 after extracting t_1   00, 01
   t_2         E(t_1',2,1)                             01
   t_3         Merge, t_1 ∪ t_2                        10, 11, 01
   t_4         E(t_3,1,0) (need to remove 11)          01
   t_4'        Remainder of t_3 after extracting t_4   10, 11
   t_5         E(t_4',2,0)                             10
   t_6         Merge, t_4 ∪ t_5                        01, 10
29
Molecular Computation - Solved problems: Boolean formula
Algorithm (Boolean CNF Formula)
1. Create DNA strands to encode all n-bit binary numbers
2. Hybridization
3. Extraction
   Let t_k be the test tube whose strands satisfy C_1 ∧ C_2 ∧ C_3 ∧ … ∧ C_k, and let C_{k+1} = x_a ∨ x_{a+1} ∨ … ∨ x_m (where each x is either 0 or 1). For simplicity, consider C_{k+1} = x_a ∨ x_{a+1}.
   Either:
      E(t_k, a, 1) → T_1a (x_a = 1), leaving the remainder T_1a^R
      E(T_1a^R, a+1, 1) → T_{a+1} (x_{a+1} = 1)
      T_1a ∪ T_{a+1} satisfies C_{k+1}
   Or:
      E(t_k, a, 0) → T_0a (x_a = 0), leaving the remainder T_0a^R
      E(T_0a, a+1, 1) → T_{a+1} (x_{a+1} = 1)
      T_0a^R ∪ T_{a+1} satisfies C_{k+1}
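The extraction procedure can be traced in software on the two-variable example B = (x_1 ∨ x_2) ∧ (x_1' ∨ x_2'). This is a minimal sketch, assuming a tube is a list of bit strings and a clause is a list of (position, value) pairs; `extract` and `solve_cnf` are illustrative names, not Lipton's.

```python
from itertools import product


def extract(tube, i, a):
    """E(t, i, a): split a tube into (strands with bit i equal to a, remainder)."""
    hit = [s for s in tube if s[i - 1] == str(a)]
    rest = [s for s in tube if s[i - 1] != str(a)]
    return hit, rest


def solve_cnf(n, clauses):
    """Each clause is a list of (position, value) literals; (2, 0) stands for x_2'."""
    tube = ["".join(bits) for bits in product("01", repeat=n)]  # all n-bit strands
    for clause in clauses:            # AND: repeated extraction, clause after clause
        satisfied, remainder = [], tube
        for i, a in clause:           # OR: extract from the remainder, then merge
            hit, remainder = extract(remainder, i, a)
            satisfied += hit
        tube = satisfied
    return tube


# B = (x_1 OR x_2) AND (x_1' OR x_2')
print(solve_cnf(2, [[(1, 1), (2, 1)], [(1, 0), (2, 0)]]))  # ['01', '10']
```

The intermediate tubes match the worked table: after the first clause the tube holds 10, 11, 01, and after the second it holds 01, 10. The number of extractions grows with the number of literals, while the number of strands to prepare grows as 2^n.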
30
Molecular Computation - Solved problems: Integer knapsack problem
Integer knapsack problem
Given a set of integers a_i and an integer A, does there exist a subset S ⊆ {1, …, n} such that Σ_{i∈S} a_i = A?
1. To solve this problem, make use of the synthesis, annealing and merging operations.
2. Prepare a starter, S: one end of the strand is blunt and blocked by 5'-biotinylation, and the other end is sticky.
3. Use DNA double strands to encode the integers a_1, …, a_n, with length proportional to the magnitude and with both ends sticky.
   [Figure: biotin-blocked (B) starter S and the double strand for a_1, joined through the sticky ends x and x^C]
31
Molecular Computation - Solved problems: Integer knapsack problem
4. Generate all 2^n possible combinations by concatenation of the DNA strands.
5. The final DNA solution consists of 2^n different DNA double strands; the final answer is found by checking, with agarose gel electrophoresis, whether the solution contains strands with length equal to A.
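The brute-force idea of steps 4 and 5 amounts to forming every subset and reading off its total length. The sketch below does this in software; the function names and the sample integers are illustrative assumptions, and the length of the starter strand is ignored.

```python
from itertools import combinations


def subset_lengths(values):
    """Map each achievable total length to the index subsets that produce it."""
    lengths = {}
    for r in range(len(values) + 1):
        for subset in combinations(range(len(values)), r):
            total = sum(values[i] for i in subset)
            lengths.setdefault(total, []).append(subset)
    return lengths


def has_subset_with_sum(values, target):
    """Software analogue of looking for a gel band at length `target`."""
    return target in subset_lengths(values)


values = [3, 5, 9]                      # hypothetical integers a_1, a_2, a_3
print(has_subset_with_sum(values, 12))  # True  (3 + 9)
print(has_subset_with_sum(values, 7))   # False
```

All 2^n subsets are enumerated here one after another; in the test tube they form simultaneously, which is where the exponential cost moves from time into the amount of DNA.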
32
Molecular Computation - Solved problems: Integer knapsack problem
Limitation:
This brute-force algorithm has exponential time complexity, O(2^n). Encoding all possible solutions as DNA strands suffers from the exponential growth in the size of the solution space; for instance, a 70-city instance of the DHPP will fit in a milliliter of solution (10^20 DNA strands). Hence, researchers are considering parallel computation models.
Dynamic programming approach
1. Parallelism: the algorithm is parallel because the principle of optimality applies; hence, a DNA computer might be useful for solving large instances of problems.
2. For the integer knapsack problem, the worst-case time complexity is O(min(2^n, nA)) [Ref. 1].
33
Molecular Computation - Solved problems: Integer knapsack problem
Given integer weights w_1, w_2, …, w_n with corresponding integer profits p_1, p_2, …, p_n, and a capacity W_max, find a subset S ⊆ {1, 2, …, n} that satisfies Σ_{i∈S} w_i ≤ W_max and maximizes Σ_{i∈S} p_i.
Dynamic programming solution
Let f_i(x) be the optimal solution to the integer knapsack problem restricted to items 1, …, i with capacity x:
   f_i(x) = max { f_{i-1}(x), p_i + f_{i-1}(x - w_i) }
where x is the capacity remaining, f_0(x) = 0 for x ≥ 0, and f_i(x) = -∞ for x < 0.
Notice that f_i(x) is an ascending step function, i.e. there are points 0 = x_1 < x_2 < … < x_k with f_i(x_1) < f_i(x_2) < … < f_i(x_k) such that
   f_i(x) = -∞ for x < x_1;
   f_i(x) = f_i(x_j) for x_j ≤ x < x_{j+1};
   f_i(x) = f_i(x_k) for x ≥ x_k.
To solve this problem, we make use of the method suggested by Horowitz et al. [Ref. 2] to compute f_i(x_j) for 1 ≤ j ≤ k. Let the ordered set S^i = { (P,W) }, where P = f_i(x_j) and W = x_j, represent f_i(x), with S^0 = { (0,0) }, and let S_1^i = { (P,W) | (P - p_{i+1}, W - w_{i+1}) ∈ S^i }.
34
Molecular Computation - Solved problems: Integer knapsack problem
S^{i+1} can be computed from S^i by first computing S_1^i = { (P,W) | (P - p_{i+1}, W - w_{i+1}) ∈ S^i } and then merging, S^{i+1} = S^i ∪ S_1^i.
If a set contains (P_j,W_j) and (P_k,W_k) with P_j ≤ P_k and W_j ≥ W_k, then the pair (P_j,W_j) can be discarded; this condition is known as the dominance rule.
For example, consider the case n = 3, (w_1, w_2, w_3) = (2,3,4), (p_1, p_2, p_3) = (1,2,5) and W_max = 6. For this case, we have
   S^0 = {(0,0)};                      S_1^0 = {(1,2)}
   S^1 = {(0,0),(1,2)};                S_1^1 = {(2,3),(3,5)}
   S^2 = {(0,0),(1,2),(2,3),(3,5)};    S_1^2 = {(5,4),(6,6),(7,7),(8,9)}
   S^3 = {(0,0),(1,2),(2,3),(3,5),(5,4),(6,6),(7,7),(8,9)}
The pair (3,5) is discarded because of the dominance rule: (5,4) has more profit at less weight. The pairs (7,7) and (8,9) exceed W_max = 6, so the optimum is (6,6).
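The construction of the sets S^i with the dominance rule can be sketched in a few lines. This is an illustrative implementation written for this example, not the code of [Ref. 2]; the function name `knapsack_pairs` is an assumption, and pairs over capacity are filtered only at the end so that the intermediate sets match the ones listed above.

```python
def knapsack_pairs(weights, profits, w_max):
    """Build S^n as (profit, weight) pairs, applying the dominance rule at each merge."""
    s = [(0, 0)]                                                 # S^0
    for w_i, p_i in zip(weights, profits):
        shifted = [(p + p_i, w + w_i) for (p, w) in s]           # S_1^{i-1}
        # Sort by weight (ties: higher profit first), then keep a pair only if
        # it beats the profit of every lighter pair already kept.
        merged = sorted(set(s + shifted), key=lambda pw: (pw[1], -pw[0]))
        s = []
        for p, w in merged:
            if not s or p > s[-1][0]:
                s.append((p, w))
    feasible = [pw for pw in s if pw[1] <= w_max]
    return s, max(feasible)                                      # best (profit, weight)


pairs, best = knapsack_pairs([2, 3, 4], [1, 2, 5], 6)
print(pairs)  # [(0, 0), (1, 2), (2, 3), (5, 4), (6, 6), (7, 7), (8, 9)]
print(best)   # (6, 6)
```

The discarded pair (3,5) is exactly the one the dominance rule removes in the worked example, and the best feasible pair (6,6) corresponds to taking items 1 and 3.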
35
Molecular Computation - Solved problems: Integer knapsack problem
Implementation of dynamic programming
Consider the case n = 3, (w_1, w_2, w_3) = (2,3,4), (p_1, p_2, p_3) = (1,2,5) and W_max = 6.

   DNA operation               Test tubes, T_P and T_W
   (start)                     S^0 = {(0,0)}
   Copy                        S^0 = {(0,0)}
   Addition: (p,w) = (1,2)     S_1^0 = {(1,2)}
   Merge                       S^1 = S^0 ∪ S_1^0 = {(0,0), (1,2)}
   Copy                        S^1 = {(0,0), (1,2)}
   Addition: (p,w) = (2,3)     S_1^1 = {(2,3), (3,5)}
   Merge                       S^2 = S^1 ∪ S_1^1 = {(0,0), (1,2), (2,3), (3,5)}
   Copy                        S^2 = {(0,0), (1,2), (2,3), (3,5)}
   Addition: (p,w) = (5,4)     S_1^2 = {(5,4), (6,6), (7,7), (8,9)}
   Merge                       S^3 = S^2 ∪ S_1^2 = {(0,0), (1,2), (2,3), (3,5), (5,4), (6,6), (7,7), (8,9)}
36
Molecular Computation - Solved problems: Integer knapsack problem
Implementation of dynamic programming
Difficulties
1. It is not known how to communicate between DNA strands. This operation is required in order to match P_k and W_k.
2. It is not known how to compare numbers between DNA strands. This operation is required in order to test the dominance rule.
38
Limitations and Errors
1. DNA synthesis has ~ 90% efficiency.
2. Long strands of DNA decay quickly; about 10,000 bases is the maximum length that can be kept in vitro without significant breakage.
3. Extraction
   - a good path may be lost during extraction
   - a bad path may be taken as if it were a good one
4. Undesirable hybridization
5. A seq. s could anneal with a seq. that is only similar to s^c.
40
Prospects
1. There appears to be little theoretical difficulty in creating a functional DNA computer.
2. Success depends on finding killer applications uniquely suited to computation by DNA.
3. Improvements are needed in reducing errors and operation costs.