Complexities for Generalized Models of Self-Assembly Gagan Aggarwal Stanford University Michael H. Goldwasser St. Louis University Ming-Yang Kao Northwestern University Robert T. Schweller Northwestern University Some results were obtained independently by Cheng, Espanes 2003
Outline Importance of DNA Self-Assembly –Synthesis of Nanostructures –DNA Computing Tile Self-Assembly DNA Word Design
TILE G C A T C G C G T A G C
Super Small Circuits, Built Autonomously
Molecular-scale pattern for a RAM memory with demultiplexed addressing ( Winfree, 2003)
DNA Computers Computer Program + Input Output! Program Input + Output!
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) –Tile Complexity –Shape Verification –Error Resistance DNA Word Design
Tile Model of Self-Assembly (Rothemund, Winfree STOC 2000) Tile System: t : temperature, positive integer G: glue function T: tileset s: seed tile
How a tile system self-assembles T = (tile set shown as a figure) G(y,y) = 2, G(g,g) = 2, G(r,r) = 2, G(b,b) = 2, G(p,p) = 1, G(w,w) = 1, t = 2
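The attachment rule behind this animation can be sketched in Python. This is a minimal sketch, assuming the usual rule of the tile model: a tile may attach at an empty position when the total strength of the glues it shares with its neighbors is at least the temperature t. The glue strengths are the ones listed on the slide. The `Tile` layout, the helper names, and the four example tiles are hypothetical, since the slide shows the tile set T only as a figure.

```python
from typing import Dict, NamedTuple, Optional, Tuple

# Glue strengths from the slide; in the standard model G(x, y) = 0 for x != y.
STRENGTH = {"y": 2, "g": 2, "r": 2, "b": 2, "p": 1, "w": 1}
TEMPERATURE = 2

Pos = Tuple[int, int]


class Tile(NamedTuple):
    """Hypothetical tile layout: one glue label (or None) per side."""
    north: Optional[str]
    east: Optional[str]
    south: Optional[str]
    west: Optional[str]


def glue(x: Optional[str], y: Optional[str]) -> int:
    """Standard glue function: only equal, non-empty labels bind."""
    if x is None or x != y:
        return 0
    return STRENGTH[x]


def binding_strength(assembly: Dict[Pos, Tile], pos: Pos, tile: Tile) -> int:
    """Sum the glue strengths between `tile` at `pos` and its neighbors."""
    x, y = pos
    sides = [
        ((x, y + 1), tile.north, "south"),
        ((x + 1, y), tile.east, "west"),
        ((x, y - 1), tile.south, "north"),
        ((x - 1, y), tile.west, "east"),
    ]
    total = 0
    for neighbor_pos, my_glue, their_side in sides:
        neighbor = assembly.get(neighbor_pos)
        if neighbor is not None:
            total += glue(my_glue, getattr(neighbor, their_side))
    return total


def can_attach(assembly: Dict[Pos, Tile], pos: Pos, tile: Tile,
               t: int = TEMPERATURE) -> bool:
    """A tile attaches at an empty position if it binds with strength >= t."""
    return pos not in assembly and binding_strength(assembly, pos, tile) >= t


# Hypothetical example tiles (not the slide's tile set).
seed = Tile(north="g", east="y", south=None, west=None)
right = Tile(north="p", east=None, south=None, west="y")
up = Tile(north=None, east="w", south="g", west=None)
corner = Tile(north=None, east=None, south="p", west="w")
```

In this example `right` and `up` each attach to the seed through a single strength-2 glue, while `corner` needs both of its strength-1 glues (p and w) to reach t = 2, so it can only attach once both neighbors are in place.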
New Models Multiple Temperature Model –temperature may go up and down Flexible Glue Model –Remove the restriction that G(x, y) = 0 for x!=y Multiple Tile Model –tiles may cluster together before being added Unique Shape Model –unique shape vs. unique supertile
Focus Multiple Temperature Model –Adjust temperature during assembly Flexible Glue Model –Remove the restriction that G(x, y) = 0 for x!=y Goal: Reduce Tile Complexity
Our Tile Complexity Results Multiple temperature model: k x N rectangles: beats standard model: Flexible Glue: N x N squares: beats standard model: ( Adleman, Cheng, Goel, Huang STOC 2001 ) (our paper)
Building k x N Rectangles k-digit, base-N^(1/k) counter (figure: rectangle of height k and width N) Tile Complexity: O(N^(1/k) + k)
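The arithmetic behind the counter can be checked with a short Python sketch. It assumes the base is the smallest integer b with b^k >= N, and it only models the counting, not the tiles themselves. The function names are hypothetical.

```python
from typing import List


def counter_base(n: int, k: int) -> int:
    """Smallest integer base b such that a k-digit counter reaches n values."""
    b = 1
    while b ** k < n:
        b += 1
    return b


def to_digits(value: int, base: int, k: int) -> List[int]:
    """Write `value` as exactly k digits in the given base, most significant first."""
    digits = []
    for _ in range(k):
        digits.append(value % base)
        value //= base
    return digits[::-1]


# The 4 x 256 example: each counter value corresponds to one column of the rectangle.
base = counter_base(256, 4)
columns = [to_digits(v, base, 4) for v in range(base ** 4)]
```

For the 4 x 256 rectangle the base works out to 4, so a 4-digit counter runs through 256 values, one per column.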
Build a 4 x 256 rectangle: t = 2 (figure: step-by-step assembly from seed tile S using tiles S1, S2, S3, C0, C1, C2, C3, P, and R, with glues g, p, and r)
2-temperature model t = 4, then t = 6
2-temperature model Kolmogorov Complexity (Rothemund, Winfree STOC 2000) Beats Standard Model (our paper)
Assembly of N x N Squares k N - k X Y k Complexity: ( Adleman, Cheng, Goel, Huang STOC 2001 )
N x N Squares --- Flexible Glue Model (figure: glue-strength tables over glues a, b, c, d, e, f for the Standard Glue Function and the Flexible Glue Function) Kolmogorov lower bounds for Standard and Flexible (Rothemund, Winfree STOC 2000)
N x N Square --- Flexible Glue Model log N N – log N seed row Complexity:
N x N Square --- Flexible Glue Model goal: - seed binary counter to a given value - 2 log N
N x N Square --- Flexible Glue Model key idea (figure: tiles b_4, p_5, and w_5): G(b_4, p_5) = 1, G(b_4, w_5) = 0
N x N Square --- Flexible Glue Model given B = …, encode B into glue function (figure: glue table with columns p0, p1, p2, p3, p4, p5 and rows labeled b; tiles b_4 and p_5)
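One way to picture "encode B into glue function" is the sketch below. It is an illustration under stated assumptions, not the construction from the talk: it assumes n glues b0 … b(n-1) and n glues p0 … p(n-1), and stores bit number i*n + j of B as the strength G(b_i, p_j). The row-major layout and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

GlueFunction = Dict[Tuple[str, str], int]


def encode_bits(bits: List[int], n: int) -> GlueFunction:
    """Store n*n bits as strengths G(b_i, p_j) of a flexible glue function."""
    if len(bits) != n * n:
        raise ValueError("expected exactly n*n bits")
    return {
        (f"b{i}", f"p{j}"): bits[i * n + j]
        for i in range(n)
        for j in range(n)
    }


def decode_bits(glue: GlueFunction, n: int) -> List[int]:
    """Read the bits back in the same (assumed) row-major order."""
    return [glue[(f"b{i}", f"p{j}")] for i in range(n) for j in range(n)]


# Example: 36 arbitrary bits stored with 6 b-glues and 6 p-glues.
B = [(i * 7 + 3) % 2 for i in range(36)]
G = encode_bits(B, 6)
labels = {label for pair in G for label in pair}
```

The point of the sketch is the count: 12 glue labels carry 36 bits here, whereas a standard glue function on the same labels has G(x, y) = 0 for x != y and so leaves only the 12 diagonal entries free.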
N x N Square --- Flexible Glue Model build block Complexity:
2 x log N block (figure: square with side labels log N and N – log N and regions X and Y) Complexity:
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) –Tile Complexity –Shape Verification –Error Resistance DNA Word Design
Shape Verification Unique Shape Problem Input: T, a tile system; S, a shape Question: Does T uniquely assemble S? Standard: P, Flexible Glue: P (Adleman, Cheng, Goel, Huang, Kempe, Espanes, Rothemund, STOC 2002) Unique Shape: Co-NPC (our paper) Multiple Temperature: NP-hard (our paper) Multiple Tile: Co-NPC (our paper)
Unique-Shape Model (figure: SAT-based assemblies over variables x1, x2, x3 and clauses c1, c2, c3; a Satisfied case ending in SAT and a Not Satisfied case) (LaBean and Lagoudakis, 1999)
Multiple Temperature Model (figure: assemblies over variables x1, x2, x3 and clauses c1, c2, c3; a Satisfied case ending in SAT and a Not Satisfied case ending in NO)
Unique Shape Problem Results Standard: P, Flexible Glue: P (Adleman, Cheng, Goel, Huang, Kempe, Espanes, Rothemund, STOC 2002) Multiple Temperature: NP-hard, Unique Shape: Co-NPC, Multiple Tile: NP-hard (our paper)
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) –Tile Complexity –Shape Verification –Error Resistance DNA Word Design
Further Research Error Resistance: Insufficient Bindings t = 2
Further Research Error Resistance: Insufficient Bindings (figure: temperature, Standard vs. Fluctuating; labels a, b)
Outline Importance of DNA Self-Assembly Tile Self-Assembly (Generalized Models) DNA Word Design
DNA Word Design (figure labels: ACCT, TGGA, GCTA, CGAT, 5)
green: ACCT red: GAAA yellow: GCTA blue: CGTA purple: CTCG white: CATG black: ACGA teal: TTTA –Must be sufficiently different –Must have similar thermodynamic properties –Must be short
Hamming Constraint (k) n strings of length L = 14: ACCTGAGAGAGCTC GCGCAGCTGGCTCA TTAGCAGACTGACA GCTTCGTAGCATAG ATAGCTGCATCGAT TGCTAGCGTCAAGC AGCATTATAGATAC GCCCGTAGACTCGA TCGAGTAGATCGAT CGACGTAGGCTTTG CTGATGATTAGGCG TTCAGCTGCGGCTA TCGATGCGTAGCTA GAGTGCTGCTAGCT AGCTAGTCACTCGA TCGACTAGCTTCGA TTAGCCGCGTAGCT GACTAGTCGATCAG TCGCGCTTATATAT ATCGTAGTCTAGTC TACGATCGCTAGTC Example: X = GCTTCGTAGCATAG, Y = TTAGCCGCGTAGCT, HAMM(X,Y) = 11 > k
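The Hamming check on this slide is easy to reproduce. A minimal Python sketch follows; the function names are hypothetical, and X and Y are the two strings from the slide.

```python
from itertools import combinations
from typing import Sequence


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, y))


def satisfies_hamming(words: Sequence[str], k: int) -> bool:
    """True if every pair of distinct words has Hamming distance > k."""
    return all(hamming(x, y) > k for x, y in combinations(words, 2))


X = "GCTTCGTAGCATAG"
Y = "TTAGCCGCGTAGCT"
print(hamming(X, Y))  # 11: the strings agree at only 3 of 14 positions
```

Checking a whole word set this way compares every pair, so it takes on the order of n^2 * L character comparisons.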
Free Energy Constraint n strings of length L = 14: ACCTGAGAGAGCTC GCGCAGCTGGCTCA TTAGCAGACTGACA GCTTCGTAGCATAG ATAGCTGCATCGAT TGCTAGCGTCAAGC AGCATTATAGATAC GCCCGTAGACTCGA TCGAGTAGATCGAT CGACGTAGGCTTTG CTGATGATTAGGCG TTCAGCTGCGGCTA TCGATGCGTAGCTA GAGTGCTGCTAGCT AGCTAGTCACTCGA TCGACTAGCTTCGA TTAGCCGCGTAGCT GACTAGTCGATCAG TCGCGCTTATATAT ATCGTAGTCTAGTC TACGATCGCTAGTC Pairwise free energies = (figure: 4 x 4 table with rows and columns A, C, G, T) X = AGCATTATAGATAC, FE(X) = (value shown in figure) For all strings X and Y: |FE(X) – FE(Y)| < C
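The free energy constraint can be sketched as follows. Two things here are assumptions rather than content from the slide: that FE(X) is the sum of the pairwise free energies over adjacent bases of X, and the table values themselves, which are hypothetical placeholders because the slide's table survives only as a figure.

```python
from itertools import product
from typing import Sequence

BASES = "ACGT"

# HYPOTHETICAL placeholder values, not real thermodynamic data.
PAIR_ENERGY = {
    a + b: 1 + BASES.index(a) + BASES.index(b)
    for a, b in product(BASES, repeat=2)
}


def free_energy(x: str) -> int:
    """Assumed definition: sum of pairwise energies over adjacent bases."""
    return sum(PAIR_ENERGY[x[i:i + 2]] for i in range(len(x) - 1))


def within_bound(words: Sequence[str], c: int) -> bool:
    """True if |FE(X) - FE(Y)| < c for all X, Y in `words`."""
    energies = [free_energy(w) for w in words]
    return max(energies) - min(energies) < c
```

Comparing the largest and smallest energies is enough, since the largest pairwise difference in a set is always between its maximum and its minimum.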
DNA Word Design Word Design Problem Input: integers n and k Output: n strings of length L such that for all strings X and Y: 1) HAMM(X,Y) > k 2) |FE(X) – FE(Y)| < C Minimize L
DNA Word Design Simple Lower Bound: L > log n and L > k, so L > ½(k + log n). Hamming Constraint: Set L = 5*(k + log n). Generate n strings of length L uniformly at random. –Satisfies the Hamming constraint with high probability.
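The random construction can be tried out directly. The sketch below is a minimal illustration, assuming the logarithm is base 2 and rounded up; the function names and the fixed random seed are hypothetical choices made so the run is repeatable.

```python
import math
import random
from itertools import combinations
from typing import List


def random_words(n: int, k: int, seed: int = 0) -> List[str]:
    """Generate n uniformly random DNA strings of length L = 5 * (k + log n)."""
    length = 5 * (k + math.ceil(math.log2(n)))  # assumption: log base 2
    rng = random.Random(seed)
    return [
        "".join(rng.choice("ACGT") for _ in range(length))
        for _ in range(n)
    ]


def min_pairwise_hamming(words: List[str]) -> int:
    """Smallest Hamming distance over all pairs of words."""
    return min(
        sum(a != b for a, b in zip(x, y))
        for x, y in combinations(words, 2)
    )


words = random_words(n=16, k=3)
```

With n = 16 and k = 3 the words have length 35. Two random positions differ with probability 3/4, so a typical pair differs in roughly 26 positions, far above k.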