DNA Implementation of a Royal Road Fitness Evaluation
Ji Yoon Park, Dept. of Biochem, Hanyang University
Elizabeth Goode, David Harlan Wood, and Junghuei Chen
Abstract
1. A model for DNA implementation of Royal Road evolutionary computation
   - For separation by fitness: 2-d DGGE, PAGE
2. Suggestion for possible use of the MutS and MutY mismatch-binding proteins, in combination with gel shift assays, for separation by fitness
The Royal Road
* A class of evolutionary computations
  - van Nimwegen et al.
    ▪ The population dynamics of various Royal Road fitness functions
    ▪ Only relatively few generations
    ▪ Do not support theoretical results on stasis
    ▪ Limited genetic variation
  - By implementing Royal Road problems using DNA,
    ◊ Use populations many orders of magnitude larger than the populations available using conventional computers
    ◊ Huge DNA storage capacity permits exploring populations with much greater genetic diversity
    ◊ Test precisely the theoretical predictions of van Nimwegen et al.
Focus on…
☞ Fitness-based separation of individuals for a Royal Road problem
* DNA model for simulating Royal Road computation
  - Potential of computing with very large populations
  - Separation: 2-d DGGE, PAGE
Evolutionary Algorithms
- Population (possibly random)
- Selection
  → fitness function
- Reproduction
  → reproduce the next generation of individuals according to some reproduction strategy, which may include mutation and crossover
ex) MaxOnes
- Begins with a random set of individual bitstrings of 0 and 1, each of length n.
- Fitness of an individual: the number of 1s it contains; a perfect individual is all 1s.
- For a given initial population size, the goal is to generate such perfect individuals.
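The population / selection / reproduction loop on this slide can be sketched in a few lines. This is a minimal illustration only, not the procedure used in the talk: the names `maxones_fitness` and `next_generation`, the mutation rate, the population size, and the choice of fitness-proportional sampling without crossover are all assumptions made for the example.

```python
import random

def maxones_fitness(bits: str) -> int:
    """MaxOnes fitness: the number of 1s in the bitstring."""
    return bits.count("1")

def next_generation(population, rng, mutation_rate=0.01):
    """One generation: fitness-proportional selection, then bit-flip mutation.

    The +1 in the weights is an assumption that keeps all-zero strings
    selectable; crossover is omitted to keep the sketch short.
    """
    weights = [maxones_fitness(ind) + 1 for ind in population]
    parents = rng.choices(population, weights=weights, k=len(population))
    children = []
    for parent in parents:
        child = "".join(
            ("1" if b == "0" else "0") if rng.random() < mutation_rate else b
            for b in parent
        )
        children.append(child)
    return children

rng = random.Random(0)   # fixed seed so runs are repeatable
n = 16                   # bitstring length (hypothetical value)
population = ["".join(rng.choice("01") for _ in range(n)) for _ in range(50)]
for _ in range(20):
    population = next_generation(population, rng)
best = max(population, key=maxones_fitness)
print(best, maxones_fitness(best))
```

The population size stays constant from one generation to the next; only the composition changes under selection and mutation.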
Royal Road Fitness Function
* Generalization of the MaxOnes fitness function
  - The population: strings which contain discrete blocks, which are subsequences of bits
  - Each block is evaluated for fitness
    ▶ Each block in a given individual bitstring which satisfies its predefined block fitness criterion contributes to the fitness rating of that individual
    ▶ Any deviation from the required specification fails to contribute to the total fitness for the bitstring
    ▶ The sum of the block contributions constitutes the total fitness for the bitstring
    ▶ Blocks are assigned fitness 1 if they are perfect, and fitness 0 otherwise.
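The block-wise scoring described above can be written as a short function. This is a sketch under stated assumptions: blocks are given as hypothetical `(start, length)` pairs, and the "predefined block fitness criterion" is taken to be "every bit in the block is 1", which is one common choice rather than something fixed by the slide.

```python
def royal_road_fitness(bits, blocks):
    """Sum of block contributions: 1 per perfect block, 0 otherwise.

    `blocks` is a list of (start, length) pairs (an assumed encoding).
    A block counts as perfect only if every one of its bits is '1'.
    """
    total = 0
    for start, length in blocks:
        block = bits[start:start + length]
        if len(block) == length and set(block) == {"1"}:
            total += 1
    return total

blocks = [(0, 4), (4, 4)]  # two adjacent blocks of length 4
for individual in ("11111111", "11110111", "01110111"):
    print(individual, royal_road_fitness(individual, blocks))
```

Unlike MaxOnes, a block with a single wrong bit contributes nothing, so `11110111` scores 1 even though seven of its eight bits are set.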
1. Examine the population dynamics in instances of the Royal Road problem
   ▶ The potential of generating previously unobtained information (> 10^12)
2. Feasibility of the necessary laboratory steps for DNA implementation
* The enormous storage capacity of DNA
  ▶ The potential gain in computing evolutionary algorithms using DNA rather than silicon is unprecedented
The Preliminary Example for Royal Road Fitness-Proportional Selection
* Let A = {C, T, G} be the working set of symbols
* The block alphabet is B = {C, T}
  ▶ The population of interest is a set of bitstrings of length 88 written over A, each containing 2 blocks written over B, each of length 6, at two fixed bit positions
  ▶ The population contains at most 2^12 individuals
  ▶ The individuals, once encoded in DNA, must be physically separable by fitness
  ▶ Fitness 1 for each perfect block containing all Ts
  ▶ A perfect individual contains only T in each of its blocks: fitness 2
  ▶ Selection over the entire population of one generation can be done in one day (possible to treat populations of size …)
Principle
◈ Fisher and Lerman (1983); Myers et al. (1987); Sheffield et al. (1989)
  ▶ When ds DNA migrates through increasing concentrations of urea and formamide, the complementary strands will dissociate in a domain-dependent fashion.
  ▶ The dissociation causes an abrupt decrease in the mobility of the fragment in polyacrylamide gels.
  ▶ The presence of a mutation may change the stability of its local domain and hence alter its pattern of migration.
Experimental
◈ When preparing DNA for analysis by DGGE,
  ▶ PCR is used to attach an ~40 bp G-C clamp to one end of the fragment.
    * Clamp: highly stable, denaturation-resistant domain
  ▶ The clamp allows mutations in lower-melting domains to be ascertained
  ▶ Heteroduplexes between a wild-type strand and a potential mutant strand will be destabilized by the single base-pair mismatch and will migrate more slowly than either homoduplex
  ▶ Heteroduplexes generated during PCR amplification of heterozygous genomic DNA can greatly assist in the detection of mutations
Strengths/Limitations
◈ Advantage:
  ▶ Used to analyze PCR-amplified, G-C clamped segments of DNA < 500 bp in length.
  ▶ Best suited to scanning multiple samples for mutations in the same DNA fragment
  ▶ A change from A/T to G/C usually increases the stability of the local domain
  ▶ A change from G/C to A/T usually has a destabilizing effect
◈ Disadvantage:
  ▶ The exact position and nature of the mutation must be confirmed by DNA sequencing
  ▶ Requires specialized equipment and a distinctly user-unfriendly computer program, which is needed to select sequences for oligonucleotide primers
Perpendicular 2-d DGGE
* Separation by fitness
  ▶ Denaturing gradient gel electrophoresis (DGGE)
    - ds DNA is moved through the gradient gel environment by electrophoresis
    - Partial dehybridization of ds DNA in a denaturing environment reduces the mobility of DNA
    - Different seqs have different melting points
      ▶ differences in the movement of those seqs, even if those seqs are the same length
    - Used to determine an optimal denaturing gradient for distinguishing candidates of different fitness
    - Separation is verified with PAGE
The Candidate Individuals
* Candidate
  ▶ ss DNA consisting of 88 bases each
  ▶ Each individual strand consists of 5 concatenated seqs of C, G and T
  ▶ All are concatenations of the following five seqs: Clamp1 - Block1 - Clamp2 - Block2 - Clamp3
  ▶ Clamps are distinct, but constant for all candidates; they have lengths 24, 26 and 26 and are G-C rich regions
  ▶ Blocks have length 6, and contain a mixture of C and T, varying among different candidates.
  ▶ The ‘perfect candidate’ has only T in Block1 and Block2
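The clamp-block layout can be checked by assembling a strand in code. The sketch below takes the three clamp sequences from the "Candidate perfect" strand listed in this deck; the function name `build_candidate` and the validation rules are illustrative assumptions, not part of the original work.

```python
# Clamp sequences as they appear in the deck's "Candidate perfect" strand.
CLAMP1 = "GGGCGGCCTCGCCTCCCCTGCTGG"    # 24 bases
CLAMP2 = "CCTTCTCCCTCTGTCGGGCTCGCGTT"  # 26 bases
CLAMP3 = "TTGTTGCTTCGTTTGTCCTTCCGTCC"  # 26 bases

BLOCK_LENGTH = 6

def build_candidate(block1: str, block2: str) -> str:
    """Concatenate Clamp1 - Block1 - Clamp2 - Block2 - Clamp3 (5' to 3')."""
    for block in (block1, block2):
        if len(block) != BLOCK_LENGTH or set(block) - set("CT"):
            raise ValueError(f"block must be {BLOCK_LENGTH} bases over C/T: {block!r}")
    return CLAMP1 + block1 + CLAMP2 + block2 + CLAMP3

perfect = build_candidate("TTTTTT", "TTTTTT")
candidate_2_1 = build_candidate("TTTTTT", "CTTTTT")

print(len(perfect))                     # total strand length
print(perfect[24:30], perfect[56:62])   # Block1 and Block2 (0-based slices)
```

With clamp lengths 24, 26 and 26 and two 6-base blocks, the strand length is 24 + 6 + 26 + 6 + 26 = 88, and the blocks fall at 0-based slices 24:30 and 56:62.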
The candidate strands can be divided
- Physically divide candidate strands into equivalence classes
* For separation
  ▶ Anneal the various ‘imperfect’ candidates to the target
    ◊ fitness = 0: at least one C in each of B1 and B2
    ◊ fitness = 1: one perfect block containing only T, and one imperfect block containing at least one C
    ◊ fitness = 2 (perfect candidate): only T in both B1 and B2
* Clamp: constant for all individuals
  ▶ only one seq associated with a perfect individual
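Because the clamps are constant, a candidate's fitness class depends only on its two blocks, so the size of each equivalence class can be counted directly. The following sketch enumerates every pair of 6-base blocks over {C, T}; the helper names are hypothetical.

```python
from collections import Counter
from itertools import product

def block_fitness(block: str) -> int:
    """1 if the block contains only T, otherwise 0."""
    return 1 if set(block) == {"T"} else 0

def candidate_fitness(block1: str, block2: str) -> int:
    """Total fitness is the sum of the two block contributions."""
    return block_fitness(block1) + block_fitness(block2)

# All 2^6 = 64 possible blocks over the block alphabet {C, T}.
all_blocks = ["".join(p) for p in product("CT", repeat=6)]

# Count candidates per fitness class over all 64 * 64 block pairs.
class_sizes = Counter(
    candidate_fitness(b1, b2) for b1 in all_blocks for b2 in all_blocks
)
for fitness in sorted(class_sizes):
    print(fitness, class_sizes[fitness])
# 0 3969
# 1 126
# 2 1
```

The three classes add up to 2^12 = 4096 candidates: 63 x 63 with both blocks imperfect, 2 x 63 with exactly one perfect block, and a single perfect individual.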
Candidate perfect (fitness = 1 + 1 = 2):
5’ - GGGCGGCCTCGCCTCCCCTGCTGG TTTTTT CCTTCTCCCTCTGTCGGGCTCGCGTT TTTTTT TTGTTGCTTCGTTTGTCCTTCCGTCC - 3’
Candidate 2.1 (fitness = 1):
5’ - GGGCGGCCTCGCCTCCCCTGCTGG TTTTTT CCTTCTCCCTCTGTCGGGCTCGCGTT CTTTTT TTGTTGCTTCGTTTGTCCTTCCGTCC - 3’
Candidate 2.6 (fitness = 1):
5’ - GGGCGGCCTCGCCTCCCCTGCTGG TTTTTT CCTTCTCCCTCTGTCGGGCTCGCGTT CCCCCC TTGTTGCTTCGTTTGTCCTTCCGTCC - 3’
Candidate : (fitness = 0)
5’ - GGGCGGCCTCGCCTCCCCTGCTGG CCCCCC CCTTCTCCCTCTGTCGGGCTCGCGTT CCCCCC TTGTTGCTTCGTTTGTCCTTCCGTCC - 3’
Target strand: exact complement of Candidate Perfect
5’ - CGACGGAAGGACAAACGAAGCAACAA AAAAAA AACGCGAGCCCGACAGAGGGAGAAGG AAAAAA CCAGCAGGGGAGGCGAGGCCGCCC - 3’
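Annealing a candidate to the target leaves a mismatched base pair wherever the candidate has a C inside a block, which is what the gel separation relies on. The sketch below locates those mismatches. As an assumption for the example, the target is computed as the reverse complement of the perfect candidate rather than copied from the slide, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

CLAMP1 = "GGGCGGCCTCGCCTCCCCTGCTGG"
CLAMP2 = "CCTTCTCCCTCTGTCGGGCTCGCGTT"
CLAMP3 = "TTGTTGCTTCGTTTGTCCTTCCGTCC"

def build_candidate(block1, block2):
    """Clamp1 - Block1 - Clamp2 - Block2 - Clamp3, written 5' to 3'."""
    return CLAMP1 + block1 + CLAMP2 + block2 + CLAMP3

def reverse_complement(strand):
    """Watson-Crick complement, read 5' to 3' on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def duplex_mismatches(candidate, target):
    """0-based candidate positions not Watson-Crick paired with the target.

    Strands anneal antiparallel, so candidate position i faces the
    target base at position len(target) - 1 - i.
    """
    return [
        i
        for i, (c, t) in enumerate(zip(candidate, reversed(target)))
        if COMPLEMENT[c] != t
    ]

perfect = build_candidate("TTTTTT", "TTTTTT")
target = reverse_complement(perfect)  # computed here, not copied from the slide

candidates = {
    "perfect": perfect,
    "2.1": build_candidate("TTTTTT", "CTTTTT"),
    "2.6": build_candidate("TTTTTT", "CCCCCC"),
    "fitness 0": build_candidate("CCCCCC", "CCCCCC"),
}
for name, strand in candidates.items():
    print(name, len(duplex_mismatches(strand, target)))
```

Under this model Candidate 2.1 carries one mismatch and Candidate 2.6 carries six, both confined to Block2, while the fitness-0 candidate has all twelve block positions mismatched.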
Separation by Fitness
¶ 2-d DGGE in combination with PAGE
  ◊ Separation of a subset of candidate strands by fitness class
  ◊ Different candidate strands annealed to the Target strand should run differently according to their fitness
  ◊ Candidates having blocks which perfectly anneal to the Target strand are predicted to run more quickly through a gel than candidate strands whose blocks contain mismatches
2-d DGGE
PAGE
Lane 1: 25 bp ladder
Lane 2: Candidate Perfect/Target
Lane 3: Candidate 2.1/Target
Lane 4: Candidate 2.6/Target
PAGE
Mobility Shift Assay
Conclusion
“Can 2-d DGGE and PAGE be used for separating candidates according to fitness in a Royal Road evolutionary computation?”
▶ Fitness-based separation
  - Clamp-block style encoding of individuals is useful for DNA implementation of a Royal Road problem
  - Verify a complete separation ability for the Royal Road fitness function
  - 2-d DGGE and PAGE separation: useful for implementing fitness separation for the Royal Road problem and for other evolutionary algorithms