DNA and circular splicing
Paola Bonizzoni, Clelia De Felice, Giancarlo Mauri, Rosalba Zizza
Dipartimento di Informatica Sistemistica e Comunicazioni, Univ. di Milano - Bicocca, ITALY; Dipartimento di Informatica e Applicazioni, Univ. di Salerno, ITALY
Outline: Circular splicing, definitions; State of the art; Our contributions; Works in progress
<<An important aspect of this year's meeting can be summed up as: SHOW ME THE EXPERIMENTAL RESULT!>> (T. Amenyo, Informal Report on 3rd Annual DIMACS Workshop on DNA Computing, 1997) We apologize... theoretical results
Before Adleman's experiment (1994)... Tom Head, 1987 (Bull. of Math. Biology): "Formal Language Theory and DNA: an analysis of the generative capacity of specific recombinant behaviors". SPLICING. Unconventional models of computation.
SPLICING: LINEAR, CIRCULAR
CIRCULAR SPLICING [Figure labels: restriction enzyme 1, restriction enzyme 2, ligase enzymes]
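Circular languages: definitions and examples
Conjugacy relation on $A^*$: for $w, w' \in A^*$, $w \sim w' \iff w = xy,\ w' = yx$ for some $x, y \in A^*$.
Example: abaa, baaa, aaab, aaba are conjugate.
$A^{\sim} = A^*/{\sim}$ = set of all circular words ${}^{\sim}w = [w]_{\sim}$, $w \in A^*$.
Circular language: $C \subseteq A^{\sim}$, a set of equivalence classes.
For $L \subseteq A^*$: $Cir(L) = \{ {}^{\sim}w \mid w \in L \}$ (circularization of $L$).
For $C \subseteq A^{\sim}$: $Lin(C) = \{ w \in A^* \mid {}^{\sim}w \in C \}$ (full linearization of $C$); any $L$ with $Cir(L) = C$ is a linearization of $C$.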
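The definitions above can be tried out on finite languages. The following is a minimal sketch, assuming a circular word is represented by its lexicographically least rotation; the function names (`rotations`, `canonical`, `cir`, `lin`) and that choice of representative are illustrative assumptions, not part of the slides.

```python
def rotations(w):
    """All conjugates yx of w = xy, i.e. the class [w]~ as a set of linear words."""
    if not w:
        return {""}
    return {w[i:] + w[:i] for i in range(len(w))}

def canonical(w):
    """Assumed representative of the circular word ~w: its least rotation."""
    return min(rotations(w))

def are_conjugate(w1, w2):
    """w1 ~ w2 iff w2 is a rotation of w1."""
    return w2 in rotations(w1)

def cir(L):
    """Circularization of a finite language L (one representative per class)."""
    return {canonical(w) for w in L}

def lin(C):
    """Full linearization of a finite circular language C."""
    return set().union(*(rotations(c) for c in C))

# The four words of the example form a single conjugacy class.
print(sorted(lin(cir({"abaa"}))))
```

This only handles finite sets of words; for regular languages one would work on automata instead.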
Definition: $FA^{\sim} = \{ C \subseteq A^{\sim} \mid \exists L \subseteq A^*,\ Cir(L) = C,\ L \in FA \}$, for $FA$ a family of languages in the Chomsky hierarchy.
Theorem [Head, Paun, Pixton]: $C \in Reg^{\sim} \iff Lin(C) \in Reg$.
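Paun's definition
Circular splicing systems: $SC_{PA} = (A, I, R)$, with $A$ a finite alphabet, $I \subseteq A^{\sim}$ the initial language, and $R \subseteq A^* | A^* \$ A^* | A^*$ the rules.
For $r = u_1 | u_2 \$ u_3 | u_4 \in R$ and ${}^{\sim}hu_1u_2,\ {}^{\sim}ku_3u_4 \in A^{\sim}$: $({}^{\sim}hu_1u_2,\ {}^{\sim}ku_3u_4) \vdash_r {}^{\sim}u_2hu_1u_4ku_3$ (the two circular words are cut into the linear words $u_2hu_1$ and $u_4ku_3$, which are then pasted and closed).
Definition: A circular splicing language $C(SC_{PA})$ (i.e. a circular language generated by a splicing system $SC_{PA}$) is the smallest circular language containing $I$ and closed under the application of the rules in $R$.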
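Other definitions of splicing systems ($A$ = finite alphabet, $I \subseteq A^{\sim}$ initial language)
Head's definition: $SC_H = (A, I, T)$, with $T \subseteq A^* \times A^* \times A^*$ the triples. For $(p, x, q), (u, x, v) \in T$ and ${}^{\sim}hpxq,\ {}^{\sim}kuxv \in A^{\sim}$: $({}^{\sim}hpxq,\ {}^{\sim}kuxv) \vdash {}^{\sim}hpxvkuxq$.
Pixton's definition: $SC_{PI} = (A, I, R)$, with $R \subseteq A^* \times A^* \times A^*$ the rules. For $(\alpha, \alpha'; \beta), (\alpha', \alpha; \beta') \in R$ and ${}^{\sim}h\alpha,\ {}^{\sim}h'\alpha' \in A^{\sim}$: $({}^{\sim}h\alpha,\ {}^{\sim}h'\alpha') \vdash {}^{\sim}h\beta h'\beta'$.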
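Problem: Characterize $FA^{\sim} \cap C(Fin, Fin)$, where $C(Fin, Fin)$ is the class of circular languages $C = C(SC_{PA})$ generated by $SC_{PA}$ with $I$ and $R$ both finite sets. Note that $C(Fin, Fin) \subseteq C(Reg^{\sim}, Fin)$.
Theorem [Paun96]: for $F \in \{Reg^{\sim}, CF^{\sim}, RE^{\sim}\}$ and $R$ finite + add. hyp. (symmetry, reflexivity, self-splicing): $C(F, Fin) \subseteq F$.
Theorem [Pixton95-96]: for $R$ finite + add. hyp. (symmetry, reflexivity): $C(Reg^{\sim}, Fin) \subseteq Reg^{\sim}$; for $F \in \{Reg^{\sim}, CF^{\sim}, RE^{\sim}\}$: $C(F, Reg) \subseteq F$.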
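Circular finite splicing languages and Chomsky hierarchy
$Reg^{\sim} \subset CF^{\sim} \subset CS^{\sim}$
Examples: ${}^{\sim}((aa)^*b)$; ${}^{\sim}(aa)^*$, generated with $I = \{{}^{\sim}aa\} \cup \{{}^{\sim}1\}$, $R = \{aa | 1 \$ 1 | aa\}$; ${}^{\sim}(a^n b^n)$, generated with $I = \{{}^{\sim}ab\} \cup \{{}^{\sim}1\}$, $R = \{a | b \$ b | a\}$.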
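The two generating systems on this slide can be checked experimentally up to a length bound. The sketch below assumes Paun's splicing step in the form $({}^{\sim}hu_1u_2, {}^{\sim}ku_3u_4) \vdash {}^{\sim}u_2hu_1u_4ku_3$, encodes a rule $u_1|u_2\$u_3|u_4$ as a 4-tuple of strings, and represents a circular word by its least rotation; the names `splice` and `generate` are hypothetical. Since a splicing step adds the lengths of its two inputs, truncating at `max_len` loses no word below the bound.

```python
from itertools import product

def rotations(w):
    return {w[i:] + w[:i] for i in range(len(w))} if w else {""}

def canonical(w):
    # Assumed representative of ~w: the least rotation.
    return min(rotations(w))

def splice(c1, c2, rule):
    """All circular words obtained from ~c1 and ~c2 with rule (u1, u2, u3, u4)."""
    u1, u2, u3, u4 = rule
    out = set()
    for w1 in rotations(c1):
        if not w1.endswith(u1 + u2):      # need w1 = h u1 u2
            continue
        h = w1[:len(w1) - len(u1 + u2)]
        for w2 in rotations(c2):
            if not w2.endswith(u3 + u4):  # need w2 = k u3 u4
                continue
            k = w2[:len(w2) - len(u3 + u4)]
            out.add(canonical(u2 + h + u1 + u4 + k + u3))
    return out

def generate(I, R, max_len):
    """Circular words of length <= max_len in the language generated by (I, R)."""
    lang = {canonical(w) for w in I}
    changed = True
    while changed:
        changed = False
        for c1, c2 in product(list(lang), repeat=2):
            for rule in R:
                for c in splice(c1, c2, rule):
                    if len(c) <= max_len and c not in lang:
                        lang.add(c)
                        changed = True
    return lang

# I = {~aa, ~1}, R = {aa|1$1|aa}: even powers of a.
print(sorted(generate({"aa", ""}, [("aa", "", "", "aa")], 8), key=len))
# I = {~ab, ~1}, R = {a|b$b|a}: circular words ~(a^n b^n).
print(sorted(generate({"ab", ""}, [("a", "b", "b", "a")], 8), key=len))
```

A bounded enumeration like this only supports the claim up to the chosen length; it is not a proof that the full languages coincide.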
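Our contributions
In $Reg^{\sim} \cap C(Fin, Fin)$: fingerprint closed star languages ($X^*$ with $X$ a regular group code; $Cir(X^*)$ with $X$ finite); cyclic languages; weak cyclic languages and other examples.
In $Reg^{\sim}$ but not in $C(Fin, Fin)$: ${}^{\sim}(a^*ba^*)^*$.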
Our contributions (continued)
Comparing the three definitions of splicing systems: the classes $C(SC_H)$, $C(SC_{PA})$, $C(SC_{PI})$; examples ${}^{\sim}(a^*ba^*)^*$, ${}^{\sim}((aa)^*b)$; = ... ?
Star languages
Definition: $L \subseteq A^*$ is a star language if $L$ is regular, closed under the conjugacy relation, and $L = X^*$ with $X$ regular.
Examples: $(b^*(ab^*a)^*)^* = X^*$ with $X = b \cup ab^*a$; $(a^*ba^*)^* = X^*$ with $X = a^*ba^*$.
Proposition: $SC_{PA} = (A, I, R)$, $I \subseteq Cir(X^*)$ $\Rightarrow$ $C(SC_{PA}) \subseteq Cir(X^*)$. Consistency easily follows!
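Closure under conjugacy, the one condition of the definition that is not about regularity, can be tested on short words. The sketch below checks the two example languages over the alphabet {a, b} up to a fixed length, using Python regular expressions for membership; the helper name `closed_under_conjugacy` and the length bound are assumptions made for illustration, and a bounded test is evidence rather than proof.

```python
import re
from itertools import product

def rotations(w):
    return {w[i:] + w[:i] for i in range(len(w))} if w else {""}

def closed_under_conjugacy(pattern, alphabet="ab", max_len=8):
    """True if every rotation of every member of length <= max_len is a member."""
    member = re.compile(pattern).fullmatch
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if member(w) and not all(member(r) for r in rotations(w)):
                return False
    return True

print(closed_under_conjugacy(r"(b|ab*a)*"))  # X = b U ab*a
print(closed_under_conjugacy(r"(a*ba*)*"))   # X = a*ba*
print(closed_under_conjugacy(r"(ab)*"))      # ab is a member, its rotation ba is not
```

The first language is the set of words with an even number of a's and the second is the set of words that are empty or contain a b, which is why rotations preserve membership in both.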
Fingerprint closed languages
Fingerprint of a cycle $c$: a power of the cycle, where the internal cycles are crossed a finite number of times, of the form $(x(y(zz)^j y)^i x)^n$ with bounded exponents $n$, $i$, $j$.
Definition: $L$ is fingerprint closed if, for any cycle $c$, $L$ contains the fingerprints of $c$.
[Figure: cycle $c$ on state $q_0$, with nested cycles labelled $x$, $y$, $z$]
Theorem: Fingerprint closed star languages $\subseteq C(Fin, Fin)$
Sketch: Take $SC_{PA} = (A, I, R)$ with $I = Cir(\{\text{successful paths containing fingerprints of cycles}\})$ and $R = \{ 1 | 1 \$ 1 | f \ :\ f \text{ fingerprint of cycle } c, \text{ for any cycle } c \}$.
Star languages that are fingerprint closed: $X^*$ with $X$ a regular group code (for example $X = b \cup ab^*a$); $Cir(X^*)$ with $X$ finite (for example $X = A^d$).
Star languages that are not fingerprint closed: $(a^*ba^*)^*$, regular but not generated!
Not Star Languages in C(Fin, Fin): new! Cyclic Languages
Definition: $Cyclic(z) = \{ {}^{\sim}(z^*p) \mid p \in Pref(Lin({}^{\sim}z)) \}$
Example: $z = abc \in A^*$
$Lin({}^{\sim}z) = Lin({}^{\sim}abc) = \{abc, bca, cab\}$
$Pref(Lin({}^{\sim}z)) = Pref(Lin({}^{\sim}abc)) = Pref(\{abc, bca, cab\}) = \{a, ab, b, bc, c, ca\}$
$Cyclic(abc) = {}^{\sim}(abc)^*a \cup {}^{\sim}(abc)^*ab \cup {}^{\sim}(abc)^*b \cup {}^{\sim}(abc)^*bc \cup {}^{\sim}(abc)^*c \cup {}^{\sim}(abc)^*ca$
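The sets in the example can be recomputed mechanically. The sketch below assumes, as the example suggests, that Pref collects the nonempty proper prefixes, and it enumerates only a finite sample of Cyclic(z) (powers of z up to `max_power`); the function names and the least-rotation representative are illustrative assumptions.

```python
def rotations(z):
    return {z[i:] + z[:i] for i in range(len(z))} if z else {""}

def canonical(w):
    # Assumed representative of ~w: the least rotation.
    return min(rotations(w))

def proper_prefixes(words):
    """Nonempty proper prefixes of the given words (assumed meaning of Pref)."""
    return {w[:i] for w in words for i in range(1, len(w))}

def cyclic_sample(z, max_power):
    """Circular words ~(z^i p) for 0 <= i <= max_power, p in Pref(Lin(~z))."""
    prefs = proper_prefixes(rotations(z))
    return {canonical(z * i + p) for i in range(max_power + 1) for p in prefs}

print(sorted(rotations("abc")))                   # Lin(~abc)
print(sorted(proper_prefixes(rotations("abc"))))  # Pref(Lin(~abc))
print(sorted(cyclic_sample("abc", 1), key=len))
```

Because circular words are printed through their least rotation, the class of ca appears as ac in the output.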
Theorem: For any $z$ with $|z| > 2$, $z$ an unbordered word (i.e. $z \notin uA^* \cap A^*u$ for every nonempty $u \neq z$), $Cyclic(z) \in C(Fin, Fin)$. The proof is quite technical...
Example (continued): $Cyclic(abc)$ is generated by $SC_{PA} = (A, I, R)$, where $I$ and $R$ are defined as follows, with $z = abc$:
$I = \{ {}^{\sim}((abc)^i p) \mid 0 \le i \le 3,\ p \in Pref(Lin({}^{\sim}abc)) \}$
$R = \{ zab | z \$ z | caz,\ \ zab | z \$ zb | cz,\ \ zca | z \$ z | bcz,\ \ za | z \$ z | bz,\ \ zb | z \$ z | cz,\ \ zc | z \$ z | az \}$
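The hypothesis on z is easy to test. The sketch below lists the borders of a word, i.e. the nonempty proper prefixes that are also suffixes, and calls a word unbordered when there are none; the helper names `borders` and `is_unbordered` are illustrative.

```python
def borders(z):
    """Nonempty proper prefixes u of z that are also suffixes (z in uA* and A*u)."""
    return [z[:k] for k in range(1, len(z)) if z[:k] == z[-k:]]

def is_unbordered(z):
    return not borders(z)

print(is_unbordered("abc"))   # abc has no border
print(borders("abca"))        # abca has the border a
```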
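Other circular regular splicing languages
$Cyclic(abc) = {}^{\sim}(abc)^*a \cup {}^{\sim}(abc)^*ab \cup {}^{\sim}(abc)^*b \cup {}^{\sim}(abc)^*bc \cup {}^{\sim}(abc)^*c \cup {}^{\sim}(abc)^*ca$
Weak cyclic languages: $Cyclic(abc) \cup {}^{\sim}(abc)^*ac$
$Cyclic(abca)$ .... bordered word...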
Works in progress
Characterize $Reg^{\sim} \cap C(Fin, Fin)$. Characterize $FA^{\sim} \cap C(Fin, Fin)$. $C(SC_{PI})$ = Star languages
Additional hypotheses, for $r = u_1 | u_2 \$ u_3 | u_4$ in $R$:
Reflexive: $r' = u_1 | u_2 \$ u_1 | u_2$ is also in $R$.
Symmetric: $\bar{r} = u_3 | u_4 \$ u_1 | u_2$ is also in $R$.
Self-splicing: from ${}^{\sim}xu_1u_2yu_3u_4$, with $r$, $\bar{r}$ as above, generates ${}^{\sim}u_4xu_1$, ${}^{\sim}u_2yu_3$.
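The additional hypotheses can be made concrete on a small example. The sketch below encodes a rule u1|u2$u3|u4 as a 4-tuple, closes a rule set under symmetry and reflexivity, and applies self-splicing to one circular word; the function names, the tuple encoding, and the least-rotation representative are assumptions made for illustration only.

```python
def rotations(w):
    return {w[i:] + w[:i] for i in range(len(w))} if w else {""}

def canonical(w):
    # Assumed representative of ~w: the least rotation.
    return min(rotations(w))

def close_rules(R):
    """Smallest symmetric and reflexive rule set containing R."""
    closed = set(R)
    closed |= {(u3, u4, u1, u2) for (u1, u2, u3, u4) in closed}  # symmetric
    closed |= {(u1, u2, u1, u2) for (u1, u2, u3, u4) in closed}  # reflexive
    return closed

def self_splice(c, rule):
    """Pairs (~u4 x u1, ~u2 y u3) obtained from ~c = ~(x u1 u2 y u3 u4)."""
    u1, u2, u3, u4 = rule
    site1, site2 = u1 + u2, u3 + u4
    results = set()
    for w in rotations(c):
        if not w.endswith(site2):
            continue
        body = w[:len(w) - len(site2)]          # body = x u1 u2 y
        for i in range(len(body) - len(site1) + 1):
            if body.startswith(site1, i):
                x, y = body[:i], body[i + len(site1):]
                results.add((canonical(u4 + x + u1), canonical(u2 + y + u3)))
    return results

print(sorted(close_rules({("a", "b", "b", "a")})))
print(self_splice("abba", ("a", "b", "b", "a")))
```

In the second call the circular word ~abba is read as x·ab·y·ba with x and y empty, so self-splicing yields the pair ~aa, ~bb.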
DNA6 auditorium Thanks!