

Adleman and computing on a surface

Course outline
1. Introduction
2. Theoretical background: biochemistry/molecular biology
3. Theoretical background: computer science
4. History of the field
5. Splicing systems
6. P systems
7. Hairpins
8. Detection techniques
9. Micro technology introduction
10. Microchips and fluidics
11. Self assembly
12. Regulatory networks
13. Molecular motors
14. DNA nanowires
15. Protein computers
16. DNA computing - summary
17. Presentation of essay and discussion

Who’s who?

Tom Head
Department of Mathematical Sciences, Binghamton University
Areas of interest: algebra; computing with biomolecules; formal representations of communication

Leonard Adleman
Department of Computer Science
Turing Award 2002
Areas of interest: Method for Obtaining Digital Signatures and Public-Key Cryptosystems; Distinguishing Prime Numbers From Composite Numbers; The First Case of Fermat's Last Theorem; Primality Testing and Two Dimensional Abelian Varieties Over Finite Fields; Molecular Computation of Solutions to Combinatorial Problems

Richard Lipton
Theoretical Computer Science, College of Computing, Georgia Tech
Areas of interest: algorithms and complexity theory; cryptography; DNA computing

Laura Landweber
Dept. of Ecology and Evolutionary Biology, Princeton University
Areas of interest: origins of genes, genomes; the genetic code; early pathways of RNA evolution; scrambled genes; RNA editing; gene scrambling; DNA computing

John Reif
Computer Science, Duke University
Areas of interest: DNA nanostructures; molecular computation; efficient algorithms; parallel computation; robotic motion planning; optical computing

Erik Winfree
Computer Science, Computation and Neural Systems, Caltech
MacArthur Fellow 2000
Areas of interest: DNA-based computers; computing by self-assembly; genetic regulatory networks; signal transduction cascades; ribosomal translation; DNA and RNA folding

Nadrian Seeman
Department of Chemistry, New York University
Areas of interest: DNA nanotechnology; macromolecular design and topology; biophysical chemistry of recombinational intermediates; DNA-based computation; crystallography

Robert Corn
Chemistry Department, University of Wisconsin
Areas of interest: surface plasmon resonance (SPR) to monitor biopolymer adsorption and the chemical modification of surfaces; characterization of molecular monolayers; electron transfer processes at liquid/liquid electrochemical interfaces; DNA computing algorithms at surfaces; multilayer polyelectrolyte films for ion transport applications

Hagiya Masami
Department of Computer Science, University of Tokyo
Areas of interest: automated deduction, formal verification and programming languages; bio-computing; hybrid systems...

Akira Suyama
Graduate School of Arts and Sciences, University of Tokyo
Areas of interest: SNPs; probe design; DNA chips; quantitative gene expression; hybrid systems...

John Rose
Department of Computer Science, University of Tokyo
Areas of interest: the DNA chip, especially tag-antitag systems; whiplash PCR, a simple autonomous DNA computer; equilibrium chemistry/statistical thermodynamic model

Gheorghe Păun
Institute of Mathematics of the Romanian Academy
Areas of interest: formal language theory (and applications); combinatorics on words; semiotics; operational research; DNA computing; membrane computing

Grzegorz Rozenberg
Institute of Advanced Computer Science, University of Leiden
Areas of interest: molecular computing; evolutionary algorithms; neural networks

Giancarlo Mauri
Dipartimento di Informatica, Sistemistica e Comunicazione (DISCo), Milano
Areas of interest: H systems; P systems; neural networks

Ehud Shapiro
Computer Science and Applied Mathematics, the Weizmann Institute
Areas of interest: DNA as input fuel; biological nanocomputer; Turing machine-like model

Byoung-Tak Zhang
School of Computer Science and Engineering, Seoul National University
Areas of interest: evolutionary intelligence; neural intelligence; molecular intelligence; computational learning theory

Danny van Noort
School of Computer Science and Engineering, Seoul National University
Areas of interest: microstructure design and fabrication; DNA hybridisation; instrumentation; fluorescent microscopy; affinity biosensors; protein chips; DNA computing; cell behaviour

NP complete problems

The theory of NP-completeness
- Tractable and intractable problems
- NP-complete problems

Classifying problems
- Classify problems as tractable or intractable.
- A problem is tractable if there exists at least one polynomially bounded algorithm that solves it.
- An algorithm is polynomially bounded if its worst-case growth rate can be bounded by a polynomial p(n) in the size n of the problem.

Intractable problems
- A problem is intractable if it is not tractable.
- No algorithm that solves the problem is polynomially bounded.
- Every such algorithm has a worst-case growth rate f(n) that cannot be bounded by a polynomial p(n) in the size n of the problem.
- For intractable problems the bounds are:

Hard practical problems
- There are many practical problems for which no one has yet found a polynomially bounded algorithm.
- Examples: traveling salesperson, 0/1 knapsack, graph coloring, bin packing, etc.
- Most design automation problems, such as testing and routing.
- Many network, database and graph problems.

The theory of NP-completeness
- The theory of NP-completeness makes it possible to show that these problems are at least as hard as NP-complete problems.
- The practical implication of knowing that a problem is NP-complete is that it is probably intractable (whether it is or not has not yet been proved).
- So any algorithm that solves it will probably be very slow for large inputs.

Decision problems
- A decision problem answers yes or no for a given input.
- Examples:
  - Given a graph G, is there a path from s to t of length at most k?
  - Does graph G contain a Hamiltonian cycle?
  - Given a graph G, is it bipartite?

Decision problem: Hamiltonian cycle
- A Hamiltonian cycle of a graph G is a cycle that includes each vertex of the graph exactly once.
- Problem: Given a graph G, does G have a Hamiltonian cycle?

The class P
- P is the class of decision problems that can be solved by a polynomially bounded algorithm.
- Is the following problem in P? Given a weighted graph G, is there a spanning tree of weight at most B?
- The decision versions of problems such as shortest distance and minimum spanning tree belong to P.

The class NP
- NP is the class of decision problems for which there is a polynomially bounded verification algorithm.
- It can be shown that:
  - all decision problems in P, and
  - decision problems such as traveling salesman, knapsack and bin packing
  are also in NP.
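A polynomially bounded verification algorithm can be illustrated with the Hamiltonian cycle problem. The sketch below is only an illustration under stated assumptions: the graph is undirected and stored as a dictionary of neighbour sets, and the certificate is a proposed ordering of the vertices. The function name and the sample graph are hypothetical and do not come from the slides.

```python
def is_hamiltonian_cycle(graph, certificate):
    """Check a proposed cycle in time polynomial in the size of the graph.

    graph: dict mapping each vertex to the set of its neighbours (assumed undirected).
    certificate: list of vertices in the order the cycle is claimed to visit them.
    """
    n = len(graph)
    if n < 3:
        return False  # a simple cycle needs at least three vertices
    # Every vertex must appear exactly once.
    if len(certificate) != n or set(certificate) != set(graph):
        return False
    # Consecutive vertices (wrapping around at the end) must be joined by an edge.
    return all(certificate[(i + 1) % n] in graph[certificate[i]] for i in range(n))


# Hypothetical example: a square a-b-c-d-a.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(is_hamiltonian_cycle(square, ["a", "b", "c", "d"]))  # valid cycle
print(is_hamiltonian_cycle(square, ["a", "c", "b", "d"]))  # a-c is not an edge
```

Checking a certificate takes a number of steps polynomial in the number of vertices, whereas no polynomially bounded method is known for finding one.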

The relation between P and NP
- P ⊆ NP
- If a problem is solvable in polynomial time, a polynomial-time verification algorithm can easily be designed that ignores the certificate and answers “yes” for all inputs with the answer “yes”.

The relation between P and NP
- It is not known whether P = NP.
- Problems in P can be solved “quickly”.
- Problems in NP can be verified “quickly”.
- Verifying a proposed solution appears to be easier than finding one.
- Most researchers believe that P and NP are not the same class.

NP-complete problems
- A problem A is NP-complete if
  1. it is in NP, and
  2. for every other problem A’ in NP, A’ ≤p A (A’ is polynomial-time reducible to A).
- A problem A is NP-hard if, for every other problem A’ in NP, A’ ≤p A.

Examples of NP-complete problems
- Cook’s theorem: Satisfiability is NP-complete.
- This was the first problem shown to be NP-complete.
- Other problems: the decision version of knapsack, the decision version of traveling salesman.

Satisfiability problem

The satisfiability problem
- First, Conjunctive Normal Form (CNF) will be defined.
- Then, the satisfiability problem will be defined.

Conjunctive normal form (CNF)
- A logical (Boolean) variable is a variable that may be assigned the value true or false (x, y, w and z are Boolean variables).
- A literal is a logical variable or the negation of a logical variable (x and ¬y are literals).
- A clause is a disjunction of literals ((w ∨ x ∨ y) and (¬x ∨ y) are clauses).

Conjunctive normal form (CNF)
- A logical (Boolean) expression is in Conjunctive Normal Form if it is a conjunction of clauses.
- The following expression is in conjunctive normal form:
  (w ∨ x ∨ y) ∧ (w ∨ ¬y ∨ z) ∧ (¬x ∨ y) ∧ (¬w ∨ ¬y)

The satisfiability problem
- Is there a truth assignment to the n variables of a logical expression in Conjunctive Normal Form that makes the value of the expression true?
- For the answer to be yes, all clauses must evaluate to true.
- Otherwise the answer is no.

The satisfiability problem
- x=F, y=F, w=T and z=T is a satisfying truth assignment for:
  (w ∨ x ∨ y) ∧ (w ∨ ¬y ∨ z) ∧ (¬x ∨ y) ∧ (¬w ∨ ¬y)
- Note that if y=F then ¬y=T.
- Each clause evaluates to true.
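The check that each clause evaluates to true can be sketched in a few lines. This is a minimal illustration, assuming the formula reads (w OR x OR y) AND (w OR NOT y OR z) AND (NOT x OR y) AND (NOT w OR NOT y); the data layout and the names `FORMULA` and `satisfies` are choices made for this sketch.

```python
# Each clause is a list of literals; a literal is (variable, wanted_value),
# so ("y", False) stands for NOT y.
FORMULA = [
    [("w", True), ("x", True), ("y", True)],
    [("w", True), ("y", False), ("z", True)],
    [("x", False), ("y", True)],
    [("w", False), ("y", False)],
]


def satisfies(formula, assignment):
    """Return True if every clause contains at least one literal made true."""
    return all(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in formula
    )


example = {"w": True, "x": False, "y": False, "z": True}
print(satisfies(FORMULA, example))
```

Evaluating one assignment is cheap; the difficulty of satisfiability lies in the 2^n assignments that may have to be tried.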

Adleman’s experiment

The 1994 experiment DNA computer

The 1994 experiment

The 1994 experiment
Basic idea: perform a molecular biology experiment to find the solution to a math problem.

Hamiltonian path
- Proposed by William Hamilton.
- Given a network of nodes and directed connections between them, is there a path through the network that begins with the start node and concludes with the end node, visiting each node exactly once (a “Hamiltonian path”)?
- Does a Hamiltonian path exist, or not?

Hamiltonian path does exist
[Figure: network of four cities (Detroit, Boston, Chicago, Atlanta) with a start city and an end city marked.]

Hamiltonian path does not exist
[Figure: the same four-city network (Detroit, Boston, Chicago, Atlanta) with the end city and start city marked differently.]

Solving the Hamiltonian problem
Generation-&-Test algorithm:
Step 1. Generate random paths on the network.
Step 2. Keep only those paths that begin with the start city and conclude with the end city.
Step 3. If there are N cities, keep only those paths of length N.
Step 4. Keep only those paths that enter all cities at least once.
Step 5. Any remaining paths are solutions (i.e., Hamiltonian paths).

The paths
[X] D -> B -> A
[X] B -> C -> D -> B -> A -> B
[X] A -> B -> C -> B
[X] C -> D -> B -> A
[X] A -> B -> A -> D
[O] A -> B -> C -> D
[X] A -> B -> A -> B -> C -> D
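The five steps of the Generation-&-Test algorithm can be sketched as a chain of filters. Two assumptions are made here: the edge set is inferred from the example paths listed above (the network figure itself is not preserved), and exhaustive enumeration of short walks stands in for random generation so that the result is reproducible. All names are hypothetical.

```python
CITIES = ["A", "B", "C", "D"]  # Atlanta, Boston, Chicago, Detroit
# Directed connections inferred from the example paths (an assumption).
EDGES = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B"),
         ("C", "D"), ("D", "B"), ("A", "D")}


def walks(max_len):
    """Yield every walk of 1..max_len cities that follows the connections."""
    frontier = [(city,) for city in CITIES]
    for _ in range(max_len):
        yield from frontier
        frontier = [w + (b,) for w in frontier for (a, b) in EDGES if a == w[-1]]


def hamiltonian_paths(start, end, max_len=6):
    n = len(CITIES)
    candidates = walks(max_len)                                             # step 1
    candidates = (w for w in candidates if w[0] == start and w[-1] == end)  # step 2
    candidates = (w for w in candidates if len(w) == n)                     # step 3
    candidates = (w for w in candidates if set(w) == set(CITIES))           # step 4
    return sorted(candidates)                                               # step 5


print(hamiltonian_paths("A", "D"))
```

With this edge set the only survivor is A -> B -> C -> D, the path marked [O] above. The filters are cheap; the cost lies in step 1, where the number of candidate paths grows exponentially with the number of cities.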

Solving the Hamiltonian problem

Combinatorial explosion
- The total number of paths grows exponentially as the network size increases, e.g., 10^6 paths for N = 10 cities, and far more for N = 20 and N = 100.
- The Generation-&-Test algorithm takes “forever”. Some sort of smart algorithm must be devised; none has been found so far (the problem is NP-hard).

Finding a solution with DNA
The key to solving the problem is using DNA to perform the five steps of the Generation-&-Test algorithm as a parallel search instead of a serial search.

Intermezzo: DNA polymerase
- Protein that produces a complementary DNA strand
- A -> T, T -> A, C -> G, G -> C
- Requires primer and starter
- Enables DNA to reproduce

Intermezzo: DNA polymerase
The bio-nanomachine:
- hops onto a DNA strand
- slides along
- reads each base
- writes its complement onto a new strand
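The base-pairing rule A -> T, T -> A, C -> G, G -> C that the polymerase applies while it reads each base can be written as a small lookup. This is a toy model of the copying rule only, not of the enzyme; the function names are chosen for this sketch.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Read each base and write its Watson-Crick partner."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Complement read in the opposite direction, since paired strands run antiparallel."""
    return complement(strand)[::-1]


print(complement("ATCG"))
print(reverse_complement("ATCG"))
```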

Experimental set-up
Ingredients and tools needed:
- DNA strands that encode city names and connections between them
- Polymerases, ligase, water, salt, other ingredients
- Polymerase chain reaction (PCR) set
- Gel electrophoresis tool (that filters out non-solution strands)

Gel electrophoresis

Solving a Hamiltonian path problem
[Figure: network of four cities (Detroit, Boston, Chicago, Atlanta) with a start city and an end city marked.]

City coding

City coding with DNA

Possible paths
[Figure: the four-city network with start city and end city, and the DNA strands Atlanta-Boston, Boston-Chicago and Chicago-Detroit alongside the strands Atlanta*, Boston*, Chicago* and Detroit*.]

Possible paths
[Figure: the four-city network with start city and end city, and the DNA strands Boston-Atlanta and Atlanta-Detroit alongside the strands Boston*, Atlanta* and Detroit*.]
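How connection strands and city strands could fit together can be sketched as follows. The sketch rests on assumptions that the slides do not spell out: the 8-letter city codes are illustrative, a connection strand is taken to be the second half of the origin code followed by the first half of the destination code, and a starred strand is taken to be the complement of a city code that holds two connection strands side by side. Strand direction is ignored for simplicity.

```python
# Illustrative city codes (an assumption, not taken from the slides).
CITY = {
    "Atlanta": "ACTTGCAG",
    "Boston": "TCGGACTG",
    "Chicago": "GGCTATGT",
    "Detroit": "CCGAGCAA",
}
_PAIR = str.maketrans("ACGT", "TGCA")


def complement(strand):
    return strand.translate(_PAIR)


def connection(origin, destination):
    """Second half of the origin code joined to the first half of the destination code."""
    half = len(CITY[origin]) // 2
    return CITY[origin][half:] + CITY[destination][:half]


def bridges(first, second, city):
    """True if the starred (complement) strand of `city` can hold `first` and `second` together."""
    half = len(CITY[city]) // 2
    junction = first[-half:] + second[:half]
    return complement(junction) == complement(CITY[city])


ab = connection("Atlanta", "Boston")
bc = connection("Boston", "Chicago")
cd = connection("Chicago", "Detroit")
print(ab, bc, cd)
print(bridges(ab, bc, "Boston"), bridges(ab, cd, "Boston"))
```

Under these assumptions only connections that share a city can be held together, so every chain that forms in the tube spells out a walk through the network.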

In pictures

The DNA experiment
1. In a test tube, mix the prepared DNA pieces together (they will randomly link with each other, forming all different paths).
2. Perform PCR with the ‘start’ and ‘end’ DNA pieces as primers (this creates millions of copies of the DNA strands with the right start and end).
3. Perform gel electrophoresis to identify only those pieces of the right length (e.g., N=4).

The DNA experiment
4. Use DNA ‘probe’ molecules to check whether the paths pass through all intermediate cities.
5. All DNA pieces that are left in the tube should be precisely those representing Hamiltonian paths.
- If the tube contains any DNA at all, conclude that a Hamiltonian path exists; otherwise conclude that it does not.
- When it does, the DNA sequence represents the specific path of the solution.

Summary and conclusion
Why does it work?
- Enormous parallelism, with DNA pieces working in parallel to find the solution simultaneously.
- Takes less than a week (vs. thousands of years for a supercomputer).
- Extraordinarily energy efficient relative to supercomputer energy use.
- Note this is a Universal Turing machine.

Experimental set-up
[Figure sequence: set-up with − and + poles and a capture layer (-R or G); in the final panel the capture layer is hot.]

DNA computing on a surface

DNA computing on surfaces

DNA computing on surfaces
Advantages over “solution phase” chemistry:
- Facile purification steps
- Reduced interference between strands
- Easily automated
Disadvantages:
- Loss of information density (2D)
- Lower surface hybridization efficiency
- Slower surface enzyme kinetics

DNA surface model: input
DNA strands representing the set {0,1}^n are synthesized and subsequently immobilized on a surface in a non-addressed fashion.

Encoding binary information
A strand is composed of words. Each word is a short DNA strand (16-mer) representing one or more bits.
[Figure labels: Word, Bit]

DNA word design problem
- Requirements of a “DNA code”:
  - Success in specific hybridization between a DNA code word and its Watson-Crick complement
  - Few false positive signals
- Virtually all designs enforce combinatorial constraints on the code words.
- Applications:
  - Information storage and retrieval for DNA computing
  - Molecular bar codes for chemical libraries

DNA word design problem
- Hamming: the distance between two code words should be large.
- Reverse complement: the distance between a word and the reverse complement of another word should be large.
- Also: frame shift, distinct sub-words, forbidden sub-words, …
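The first two constraints can be checked mechanically. The sketch below assumes that “large” means “at least some threshold d”, which the slides do not specify, and uses 4-letter words instead of 16-mers purely to keep the example short. The function names are hypothetical.

```python
_PAIR = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    return word.translate(_PAIR)[::-1]


def hamming(u, v):
    """Number of positions at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(u, v))


def acceptable(words, d):
    """Check the Hamming and reverse-complement constraints for every pair of distinct words."""
    for i, u in enumerate(words):
        for j, v in enumerate(words):
            if i == j:
                continue
            if hamming(u, v) < d:
                return False  # two code words are too similar
            if hamming(u, reverse_complement(v)) < d:
                return False  # u could hybridize with v instead of with its own complement
    return True


print(acceptable(["AAAA", "CCCC"], 3))
print(acceptable(["AAAA", "TTTT"], 3))
```

The second set fails because the reverse complement of TTTT is AAAA, so the two words would stick to each other even though their Hamming distance is maximal.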

Work on DNA code design
- Seeman (1990): de novo design of sequences for nucleic acid structural engineering
- Brenner (1997): sorting polynucleotides using DNA tags
- Shoemaker et al. (1996): analysis of yeast deletion mutants using a parallel molecular bar-coding strategy
- Many other examples in DNA computing

Word design example

DNA surface model: process
- MARK strands in which bit j = 0 (or 1): hybridize with Watson-Crick complements of the word containing bit j, followed by polymerization.
- DESTROY unmarked strands: exonuclease degradation.
- UNMARK strands: wash in distilled water.

DNA surface model: output
Detect remaining strands (if any) by detaching strands from the surface and amplifying them using PCR (polymerase chain reaction).

Computational power
- Theorem: Any CNFSAT formula of size m can be computed using O(m) mark, unmark and destroy operations.
- Theorem: Any circuit of size m can be computed using O(m) mark, unmark, destroy, and append operations.

The satisfiability problem
- Input: 16 strands.
- Process: [Figure: circuit of and, or and not gates over the inputs w, x, y and z]
  MARK if bit z = 1; MARK if bit w = 1; MARK if bit y = 0; DESTROY; UNMARK
  MARK if bit w = 0; MARK if bit y = 0; DESTROY; UNMARK
  …
- Output: exactly those strands that satisfy the circuit remain on the surface.

4-variable SAT demo
(w ∨ x ∨ y) ∧ (w ∨ ¬y ∨ z) ∧ (¬x ∨ y) ∧ (¬w ∨ ¬y)
{0000} {0001} {0010} {0011} {0100} {0101} {0110} {0111} {1000} {1001} {1010} {1011} {1100} {1101} {1110} {1111}

4-variable SAT demo
- [Figure: the logic of the DNA computation in each cycle, leading at the end to four types of DNA molecules remaining on the surface.]
- The identity of the molecules that correspond to the solutions was determined by PCR.
- Solution: S3, S7, S8, S9

4-variable SAT, the answers
S3: w=0, x=0, y=1, z=1
S7: w=0, x=1, y=1, z=1
S8: w=1, x=0, y=0, z=0
S9: w=1, x=0, y=0, z=1
y=1: (w ∨ x ∨ y)
z=1: (w ∨ ¬y ∨ z)
x=0 or y=1: (¬x ∨ y)
w=0: (¬w ∨ ¬y)
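The MARK, DESTROY and UNMARK cycle can be simulated in software to see why exactly these four strands survive. This is a sketch under assumptions: the formula is read as (w OR x OR y) AND (w OR NOT y OR z) AND (NOT x OR y) AND (NOT w OR NOT y), each strand is modelled as a tuple of bits in the order w, x, y, z, and strand Sk is taken to be the strand whose bits spell k in binary. The chemistry is reduced to set operations.

```python
from itertools import product

VARIABLES = "wxyz"
# One list per clause; ("y", 0) means "mark strands in which bit y = 0".
FORMULA = [
    [("w", 1), ("x", 1), ("y", 1)],
    [("w", 1), ("y", 0), ("z", 1)],
    [("x", 0), ("y", 1)],
    [("w", 0), ("y", 0)],
]


def run_surface_computation(formula):
    # The surface starts with all 2^n strands, none of them marked.
    surface = {bits: False for bits in product((0, 1), repeat=len(VARIABLES))}
    operations = 0
    for clause in formula:
        for var, value in clause:            # MARK, once per literal
            j = VARIABLES.index(var)
            for strand in surface:
                if strand[j] == value:
                    surface[strand] = True
            operations += 1
        surface = {s: m for s, m in surface.items() if m}  # DESTROY unmarked strands
        operations += 1
        surface = {s: False for s in surface}              # UNMARK the survivors
        operations += 1
    return sorted(surface), operations


survivors, operations = run_surface_computation(FORMULA)
labels = ["S%d" % int("".join(map(str, s)), 2) for s in survivors]
print(labels, operations)
```

Each literal costs one MARK and each clause one DESTROY and one UNMARK, so the number of operations grows linearly with the size of the formula.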

4-variable SAT demo
- Synthesize; Attach
- Mark
- Destroy
- Unmark
- Readout
- Cycle

Conclusions
- Solid-phase chemistry is a promising approach to DNA computing.
- DNA computing will require greatly improved DNA surface attachment chemistries and control of chemical and enzymatic processes.