DNA Self-Assembly Robert Schweller Northwestern University Speaking of Science talk Buena Vista University February 28, 2005
Outline
- Importance of DNA Self-Assembly
  - Synthesis of Nanostructures
  - DNA Computing
- Tile Self-Assembly
- DNA Word Design
Smart Bricks
Wang Tiles [Figure: square tiles, each labeled TILE, whose edges carry the complementary DNA strands G C A T C G and C G T A G C.]
Super Small Circuits, Built Autonomously
Molecular-scale pattern for a RAM memory with demultiplexed addressing (Winfree, 2003)
DNA Computers [Figure: Input + Computer Program gives Output!, shown first for one input and then again for a second input with the same program.]
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) Tile Complexity Shape Verification Error Resistance DNA Word Design
Tile Model of Self-Assembly (Rothemund, Winfree STOC 2000)
Tile System:
- t: temperature, positive integer
- G: glue function
- T: tileset
- s: seed tile
How a tile system self assembles G(y,y) = 2 G(g,g) = 2 G(r, r) = 2 G(b,b) = 2 G(p,p) = 1 G(w,w) = 1 t = 2 T =
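A minimal sketch of the attachment rule may help here: a tile can join the assembly at an empty position when the glues it shares with its neighbors sum to at least the temperature t. The glue strengths and t = 2 are taken from the slide; the tile set `TILES`, the (north, east, south, west) encoding, and the function names are hypothetical, since the slide's tile set T is a figure that did not survive.

```python
# Glue strengths from the slide: equal labels bind, different labels do not.
GLUE = {"y": 2, "g": 2, "r": 2, "b": 2, "p": 1, "w": 1}
TEMPERATURE = 2

# Hypothetical tile set. Each tile lists glue labels as
# (north, east, south, west); None means "no glue on that side".
TILES = {
    "seed":   ("y", "g", None, None),
    "up":     (None, "p", "y", None),   # binds above the seed with strength 2
    "right":  ("w", None, None, "g"),   # binds right of the seed with strength 2
    "corner": (None, None, "w", "p"),   # needs two cooperating strength-1 bonds
}

# Side indices: 0 = north, 1 = east, 2 = south, 3 = west.
OPPOSITE = {0: 2, 1: 3, 2: 0, 3: 1}
OFFSETS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def bond_strength(grid, pos, tile):
    """Total strength of matching glues between `tile` at `pos` and its neighbors."""
    total = 0
    for side, (dx, dy) in OFFSETS.items():
        neighbor = grid.get((pos[0] + dx, pos[1] + dy))
        if neighbor is None:
            continue
        mine = tile[side]
        theirs = TILES[neighbor][OPPOSITE[side]]
        if mine is not None and mine == theirs:
            total += GLUE[mine]
    return total


def assemble(seed="seed", max_steps=100):
    """Grow from the seed, adding one tile per step until nothing can attach."""
    grid = {(0, 0): seed}
    for _ in range(max_steps):
        frontier = {
            (x + dx, y + dy) for (x, y) in grid for dx, dy in OFFSETS.values()
        } - set(grid)
        placement = None
        for pos in sorted(frontier):
            for name, tile in TILES.items():
                if name != seed and bond_strength(grid, pos, tile) >= TEMPERATURE:
                    placement = (pos, name)
                    break
            if placement:
                break
        if placement is None:
            break  # terminal assembly reached
        grid[placement[0]] = placement[1]
    return grid


if __name__ == "__main__":
    print(assemble())
```

In this toy set the "corner" tile cannot attach until both "up" and "right" are in place, because each of its bonds has strength 1 and t = 2.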
New Models
- Multiple Temperature Model: temperature may go up and down
- Flexible Glue Model: remove the restriction that G(x, y) = 0 for x != y
- Multiple Tile Model: tiles may cluster together before being added
- Unique Shape Model: unique shape vs. unique supertile
Focus
- Multiple Temperature Model: adjust temperature during assembly
- Flexible Glue Model: remove the restriction that G(x, y) = 0 for x != y
Goal: Reduce Tile Complexity
Our Tile Complexity Results Multiple temperature model: k x N rectangles: (our paper) beats standard model: (our paper) Flexible Glue: N x N squares: (our paper) (Adleman, Cheng, Goel, Huang STOC 2001) beats standard model:
Building k x N Rectangles [Figure: a rectangle of height k and width N.]
k-digit, base N^(1/k) counter: if N is the kth power of some integer, use that integer as the base, so the counter runs through exactly N values. Otherwise choose a base that is big enough and seed the counter to an appropriate value. Note that for k << N, N^(1/k) dominates.
Tile Complexity:
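The counting argument can be checked with a few lines of code. This is only a sketch of the arithmetic, not of the tile construction: `counter_columns` is a hypothetical helper that lists the values a k-digit counter passes through, one per column of the rectangle. For the 4 x 256 example in this talk, k = 4 and the base is 256^(1/4) = 4.

```python
def counter_columns(k, base):
    """Yield every value of a k-digit, base-`base` counter, most significant digit first."""
    digits = [0] * k
    while True:
        yield tuple(digits)
        # Increment with carry, starting from the least significant digit.
        i = k - 1
        while i >= 0 and digits[i] == base - 1:
            digits[i] = 0
            i -= 1
        if i < 0:
            return  # the counter rolled over: all values have been produced
        digits[i] += 1


if __name__ == "__main__":
    columns = list(counter_columns(4, 4))
    print(len(columns), columns[0], columns[-1])
```

Each yielded tuple corresponds to one column of height k, so a base-4 counter with 4 digits spans 4^4 = 256 columns.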
Build a 4 x 256 rectangle (t = 2) [Animation: starting from the seed tile S, seed-column tiles S1, S2, S3 attach above it and bottom-row tiles C0, C1, C2, C3 repeat to its right. Counter tiles carrying digits 1, 2, 3 and glues g, p, r (P, R at carries) then fill in column by column until the 4-digit base-4 counter reaches 3 3 3 3.]
2-temperature model [Figure: two frames. The first is labeled t = 4 and shows glue strengths 3, 1, 3, 3; the second is labeled with temperatures 4 and 6.]
2-temperature model
- Kolmogorov Complexity: (Rothemund, Winfree STOC 2000)
- Beats Standard Model: (our paper)
Assembly of N x N Squares [Figure: an N x N square divided into parts with side lengths N - k and k; two of the parts are labeled X and Y.]
Complexity: (Adleman, Cheng, Goel, Huang STOC 2001)
N x N Squares --- Flexible Glue Model
Kolmogorov lower bounds:
- Standard: (Rothemund, Winfree STOC 2000)
- Flexible:

Standard Glue Function
    a  b  c  d  e  f
a   1  -  -  -  -  -
b   -  0  -  -  -  -
c   -  -  3  -  -  -
d   -  -  -  2  -  -
e   -  -  -  -  2  -
f   -  -  -  -  -  1

Flexible Glue Function
    a  b  c  d  e  f
a   1  0  2  0  0  1
b   0  0  1  0  1  0
c   0  0  3  0  1  1
d   2  2  2  2  0  1
e   0  0  0  1  2  1
f   1  1  2  2  1  1
N x N Square --- Flexible Glue Model [Figure: a square of side N – log N grown from a seed row of length log N.]
Complexity:
All the complexity is coming from that damn seed row!
goal: seed binary counter to a given value [Figure: a row of 1s of width 2 log N.]
key idea: [Figure: a row of tiles with repeating column indices 1 2 3 4 5 and block labels 3, 4, 5; each tile is paired with one bit of the seed value, e.g. 0 0 1 1 0 1 1 0 0 1 1 1 0.]
N x N Square --- Flexible Glue Model: G(b4, p5) = 1, G(b4, w5) = 0 [Figure: a tile with glue b4 in column 5 of the row 1 2 3 4 5, with candidate tiles p5 and w5.]
N x N Square --- Flexible Glue Model
given B = 011011 110101 010111 …
encode B into glue function:

     p0 p1 p2 p3 p4 p5
b0    0  1  1  0  1  1
b1    1  1  0  1  0  1
b2    0  1  0  1  1  1
b3    0  0  1  0  1  0
b4    0  0  0  0  0  1
b5    1  1  1  1  1  0
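The encoding in the table can be sketched as a lookup: bit number i*m + j of B becomes the strength of the glue pair (b_i, p_j). The helper names `encode_bits` and `decode_bits` are hypothetical, and the last three 6-bit groups of B below are read off rows b3 to b5 of the table, since the slide truncates B after three groups.

```python
def encode_bits(bits, m):
    """Store an m*m-bit string row by row as glue strengths G(b_i, p_j)."""
    if len(bits) != m * m:
        raise ValueError("expected exactly m*m bits")
    return {
        (f"b{i}", f"p{j}"): int(bits[i * m + j])
        for i in range(m)
        for j in range(m)
    }


def decode_bits(glue, m):
    """Read the bit string back out of the glue function, row by row."""
    return "".join(
        str(glue[(f"b{i}", f"p{j}")]) for i in range(m) for j in range(m)
    )


# Rows b0..b5 of the table on the slide, concatenated.
B = "011011" "110101" "010111" "001010" "000001" "111110"
G = encode_bits(B, 6)

if __name__ == "__main__":
    print(G[("b4", "p5")], decode_bits(G, 6) == B)
```

With 6 b-glues and 6 p-glues the glue function holds 36 bits, which is the point of the flexible model: the number of encoded bits grows with the square of the number of glue types.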
N x N Square --- Flexible Glue Model
build block
Complexity:
[Figure: a block of binary rows. Every row repeats the same 24-bit string 0 1 0 1 1 0 0 0 1 1 0 0 1 0 0 0 1 1 0 1 1 1 0 0, followed by a 4-bit counter value that runs from 0 1 0 1 in the first row up to 1 1 1 0.]
[Figure: the 2 x log N block seeds a square with sides N – log N and log N; the remaining regions are labeled X and Y.]
Complexity:
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) Tile Complexity Shape Verification Error Resistance DNA Word Design
Shape Verification
Unique Shape Problem
Input: T, a tile system; S, a shape
Question: Does T uniquely assemble S?
- Standard: P, Flexible Glue: P (Adleman, Cheng, Goel, Huang, Kempe, Espanes, Rothemund, STOC 2002)
- Unique Shape: Co-NPC (our paper)
- Multiple Temperature: NP-hard (our paper)
- Multiple Tile: NP-hard (our paper)
3-SAT Problem Clause 1: Clause 2: Clause 3:
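As a reminder of what a 3-SAT instance asks, here is a minimal brute-force sketch. The formula in `CLAUSES` is hypothetical, because the clauses on the slide did not survive extraction; a positive integer i stands for x_i and a negative integer for its negation.

```python
from itertools import product

# Hypothetical 3-clause formula over x1, x2, x3 (not the one from the slide).
CLAUSES = [(1, -2, 3), (-1, 2, 3), (1, 2, -3)]


def satisfied(assignment, clauses):
    """True if every clause contains at least one literal made true by `assignment`."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def satisfying_assignments(clauses, num_vars):
    """Try all 2**num_vars assignments and keep the ones that satisfy the formula."""
    found = []
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfied(assignment, clauses):
            found.append(assignment)
    return found


if __name__ == "__main__":
    print(len(satisfying_assignments(CLAUSES, 3)))
```

The check per assignment is cheap; the difficulty lies in the exponential number of assignments.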
Unique-Shape Model [Animation: a tile assembly that checks a 3-SAT formula. Variable rows x1, x2, x3 and clause columns c1, c2, c3 are framed by * tiles. The assembly fills in truth values (1, x), passes clause labels through the rows, and marks satisfied clauses with ok. In the "Satisfied" case the top row reads T T T SAT; in the "Not Satisfied" case it contains F entries.] (LaBean and Lagoudakis, 1999)
Multiple Temperature Model [Animation: the same 3-SAT check, shown side by side for a "Satisfied" and a "Not Satisfied" formula. The satisfied assembly ends in T T T T SAT and the unsatisfied one in T T F F NO; in the final frame only the frame of * tiles and the variable rows x1, x2, x3 remain.]
Unique Shape Problem Results
- Standard: P, Flexible Glue: P (Adleman, Cheng, Goel, Huang, Kempe, Espanes, Rothemund, STOC 2002)
- Multiple Temperature: NP-hard (our paper)
- Unique Shape: Co-NPC (our paper)
- Multiple Tile: NP-hard (our paper)
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) Tile Complexity Shape Verification Error Resistance DNA Word Design
Further Research
- Error Resistance: Insufficient Bindings (t = 2)
- Multiple temperature model
  - raising and lowering temperature multiple times
  - monotonically increasing temperatures
- Time complexity versus tile complexity
  - multiple tile model and time complexity
Further Research: Error Resistance: Insufficient Bindings [Figure: the standard model compared with a fluctuating temperature, with tiles labeled a and b.]
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) DNA Word Design
DNA Word Design [Figure: tiles numbered 1 to 9; matching edges carry the complementary DNA words ACCT / TGGA and GCTA / CGAT.]
DNA Word Design [Figure: tiles numbered 1 to 9 with colored glues.]
green:  ACCT
red:    GAAA
yellow: GCTA
blue:   CGTA
purple: CTCG
white:  CATG
black:  ACGA
teal:   TTTA
-Must be sufficiently different
-Must have similar thermodynamic properties
-Must be short
Hamming Constraint (k)
n strings of length L = 14:
ACCTGAGAGAGCTC
GCGCAGCTGGCTCA
TTAGCAGACTGACA
GCTTCGTAGCATAG
ATAGCTGCATCGAT
TGCTAGCGTCAAGC
AGCATTATAGATAC
GCCCGTAGACTCGA
TCGAGTAGATCGAT
CGACGTAGGCTTTG
CTGATGATTAGGCG
TTCAGCTGCGGCTA
TCGATGCGTAGCTA
GAGTGCTGCTAGCT
AGCTAGTCACTCGA
TCGACTAGCTTCGA
TTAGCCGCGTAGCT
GACTAGTCGATCAG
TCGCGCTTATATAT
ATCGTAGTCTAGTC
TACGATCGCTAGTC
X = GCTTCGTAGCATAG
Y = TTAGCCGCGTAGCT
X and Y agree in 3 positions, so HAMM(X,Y) = 11 > k
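The Hamming distance on this slide can be reproduced in a few lines. The function name `hamming` is a placeholder; X and Y are the two words from the slide.

```python
def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, y))


X = "GCTTCGTAGCATAG"
Y = "TTAGCCGCGTAGCT"

if __name__ == "__main__":
    print(hamming(X, Y))  # the slide states HAMM(X,Y) = 11
```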
Free Energy Constraint
Pairwise free energies (row = first base, column = second base):

    A  C  G  T
A   2  1  5  3
C   7  2  6  9
G   1  1  3  1
T   8  7  4  2

[Figure: the n strings of length L = 14.]
X = AGCATTATAGATAC
FE(X) = 5+1+7+...
For all strings X and Y: |FE(X) – FE(Y)| < C
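The sum FE(X) = 5+1+7+... adds one table entry per adjacent pair of bases: AG contributes 5, GC contributes 1, CA contributes 7, and so on. A short sketch of that sum, using the table from the slide; the names `FE_TABLE`, `free_energy` and `similar_energy` are placeholders.

```python
# Pairwise free energies from the slide: FE_TABLE[first][second].
FE_TABLE = {
    "A": {"A": 2, "C": 1, "G": 5, "T": 3},
    "C": {"A": 7, "C": 2, "G": 6, "T": 9},
    "G": {"A": 1, "C": 1, "G": 3, "T": 1},
    "T": {"A": 8, "C": 7, "G": 4, "T": 2},
}


def free_energy(word):
    """Sum the pairwise energies over all adjacent base pairs of `word`."""
    return sum(FE_TABLE[a][b] for a, b in zip(word, word[1:]))


def similar_energy(x, y, c):
    """The constraint from the slide: |FE(X) - FE(Y)| < C."""
    return abs(free_energy(x) - free_energy(y)) < c


if __name__ == "__main__":
    print(free_energy("AGCATTATAGATAC"))
```

For the word X on the slide the thirteen terms are 5, 1, 7, 3, 2, 8, 3, 8, 5, 1, 3, 8, 1.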
DNA Word Design
Word Design Problem
Input: integers n and k
Output: n strings of length L such that for all strings X and Y:
1) HAMM(X,Y) > k
2) |FE(X) – FE(Y)| < C
Minimize L
DNA Word Design
Simple Lower Bound: L > log n and L > k, so L > ½(k + log n)
DNA Word Design Word Length: Run-Time:
DNA Word Design
Hamming Constraint k:
-Set L = 5*(k + log n)
-Generate all strings at random
Pr[FAILURE] =
[Figure: n all-random strings of length L = 5*(k + log n).]
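A minimal sketch of the randomized step, under two assumptions that are not on the slide: the logarithm is taken base 2 and rounded up, and a failed draw is simply repeated. The names `random_words` and `hamming` are placeholders.

```python
import math
import random
from itertools import combinations


def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))


def random_words(n, k, seed=0):
    """Draw n random DNA words of length 5*(k + log n) with pairwise distance > k."""
    rng = random.Random(seed)
    # Assumption: log base 2, rounded up to an integer.
    length = 5 * (k + math.ceil(math.log2(n)))
    while True:
        words = [
            "".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)
        ]
        # Verify the Hamming constraint; redraw in the unlikely case of failure.
        if all(hamming(x, y) > k for x, y in combinations(words, 2)):
            return words


if __name__ == "__main__":
    for word in random_words(n=4, k=2):
        print(word)
```

The verification pass is quadratic in n; it is included so the returned set is guaranteed to meet the constraint rather than only likely to.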
Free Energy Constraint: [Figure: all length L strings arranged from low FE to high FE; n of them, of length L = O(k+log n), are selected.]
Fact: Strings can be chosen to satisfy the Free Energy Constraint
For each string X: a < FE(X) < b
How do you get these strings?
Free Energy Constraint:
Given:
Find: a < FE < b
Problem: 4^L length L strings
Free Energy Constraint: Fixed Energy String Problem
Input: Length L, Energy E
Output: a string with:
1) length L
2) free energy E
Free Energy Constraint: Consider bases a, b in {A,C,G,T}
c_i = # of length L strings such that:
1) FE = i
2) First character is a
3) Last character is b
[Figure: a string of length L that starts with a and ends with b.]
What if we knew f^L_{a,b}, f^{L/2}_{a,b}, f^{L/4}_{a,b}, …, f^1_{a,b} for all a, b in {A,C,G,T}?
SOLUTION: in O(L log L) time complexity
[Figure: a string of length L from a to b, split into two halves of length L/2; the first half ends in c, the second starts with d, and the pair contributes FE_{c,d}.]
Recursive Property: [Figure: a string from a to b split at the middle pair c d, which contributes FE_{c,d}; each half has length L/2.]
T(L) = T(L/2) + L log L = O(L log L)
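One way to read the recursive property is as a halving rule on counting polynomials. The sketch below assumes that f^L_{a,b} is the polynomial whose x^i coefficient is c_i (the number of length L strings from a to b with FE = i), and that a length L string is two length L/2 halves joined by one extra pair (c, d). Both are assumptions, since the slide's formula is a figure. The pairwise table is the one from the earlier slide, L is restricted to powers of two, and the polynomial product is the naive quadratic one; swapping in an FFT-based product is what would give the L log L term in the recurrence.

```python
from itertools import product

BASES = "ACGT"
FE_TABLE = {
    "A": {"A": 2, "C": 1, "G": 5, "T": 3},
    "C": {"A": 7, "C": 2, "G": 6, "T": 9},
    "G": {"A": 1, "C": 1, "G": 3, "T": 1},
    "T": {"A": 8, "C": 7, "G": 4, "T": 2},
}


def poly_mul(p, q):
    """Naive product of two coefficient lists (index = exponent)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                out[i + j] += a * b
    return out


def poly_add(p, q):
    """Coefficient-wise sum of two coefficient lists."""
    n = max(len(p), len(q))
    return [
        (p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
        for i in range(n)
    ]


def count_polys(length):
    """Map (a, b) to the list of counts c_i; `length` must be a power of two."""
    if length == 1:
        # A single base has no adjacent pair, so its energy is 0.
        return {(a, b): [1 if a == b else 0] for a in BASES for b in BASES}
    half = count_polys(length // 2)
    full = {}
    for a, b in product(BASES, repeat=2):
        total = [0]
        for c, d in product(BASES, repeat=2):
            shift = [0] * FE_TABLE[c][d] + [1]  # the monomial x^FE(c,d)
            term = poly_mul(poly_mul(half[(a, c)], shift), half[(d, b)])
            total = poly_add(total, term)
        full[(a, b)] = total
    return full


if __name__ == "__main__":
    polys = count_polys(8)
    print(sum(sum(p) for p in polys.values()))  # total number of strings, 4**8
```

Only one recursive call is made per level because both halves use the same table of polynomials, which matches the single T(L/2) term in the recurrence.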
Summary for Word Design
Hamming Constraint (k):
-Randomly generate words of length L = O(k + log n)
Free Energy Constraint:
-Append new strings
Run-Time:
Word Length:
[Figure: n words of length L = O(k+log n).]
DNA Self-Assembly
- Importance of DNA Self-Assembly
- Tile Self-Assembly
- DNA Word Design
Questions?