Linear-time computation of local periods. Gregory Kucherov, INRIA/LORIA, Nancy, France; joint work with Roman Kolpakov.

1 Linear-time computation of local periods. Gregory Kucherov, INRIA/LORIA, Nancy, France. Joint work with Roman Kolpakov (Moscow) and Jean-Pierre Duval, Thierry Lecroq, Arnaud Lefebvre (Rouen). Haifa Stringology Workshop, April 3-8, 2005.

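2 Periodicities (repetitions) in strings
• period: p is a period of w = a_1 ... a_n if a_i = a_{i+p} for all 1 ≤ i ≤ n-p
• the (global) period: the minimal period
• periodicity = word whose period is at most half its length
• Examples: square, cube
• fractional periodicity: the exponent (length divided by period) need not be an integer
• periodicities = "runs" of squares
• a periodicity has a (cyclic) root and an exponent, e.g. 8/3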
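The definitions on this slide can be checked by brute force. The sketch below is only an illustration: the function names `periods`, `global_period` and `exponent` are hypothetical, and the word "abaabaab" is an assumed example chosen because its exponent is 8/3, not necessarily the word shown on the original slide.

```python
from fractions import Fraction


def periods(w: str) -> list[int]:
    """Return every p in 1..len(w) with w[i] == w[i + p] for all valid i."""
    n = len(w)
    return [p for p in range(1, n + 1)
            if all(w[i] == w[i + p] for i in range(n - p))]


def global_period(w: str) -> int:
    """Smallest period of a non-empty word."""
    return periods(w)[0]


def exponent(w: str) -> Fraction:
    """Length divided by the global period (2 for a square, 3 for a cube)."""
    return Fraction(len(w), global_period(w))


if __name__ == "__main__":
    print(periods("abaabaab"), exponent("abaabaab"))
```

This runs in quadratic time and is meant only to make the vocabulary concrete; the talk is about doing far better.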

3 Finding periodicities
CGCGGCAGTTTTGCCGACTGTTTGGGACTTGCTCGAACTTGCCTATGCCAAGCTGCCGACGATTC
CGCCCACCCTGTTGGAACGCGATTTTAATTTCCCGCCTTTTTCCGAACTCGAAGCCGAAGTCGCC
AAAATCGCCGATTATCAAACGCGTGCCGGAAAGGAATGCCGCCGTGCAGCCTGAAACCTCCGCCC
AATACCAGCACCGTTTCGCCCAAGCCATACGCGGGGGCGAAGCCGCAGACGGTCTGCCGCAAGAC
CGACTGAACGTCTATATCCGCCTGATACGCAACAATATCTACAGCTTTATCGACCGTTGTTATAC
CGAAACGCTGCAATACTTTGACCGCGAAGAATGGGGCCGTCTGAAAGAAGGTTTCGTCCGCGACG
CGTGCGCCCAAACGCCCTATTTTCAAGAAATCCCCGGCGAGTTCCTCCAATATTGCCAAAGCCTG
CCGCTTTTAGACGGCATTTTGGCACTGATGGATTTTGAATATACCCAATTGCTGGCAGAAGTTGC
TCAAATTCCGGATATTCCCGACATTCATTATTCAAATGACAGCAAATACACACCTTCCCCTGCGG
CCTTTATCCGGCAATATCGATATGATGTTACCGATGATTTGCATGAAGCGGAAACAGCCTTGTTA
ATATGGCGAAACGCCGAAGATGATGTGATGTACCAAACATTGGACGGCTTCGATATGATGCTGCT
AGAAATAATGGGGTTCTCCGCGCTTTCGTTTGACACCCTCGCCCAAACCCTTGTCGAATTTATGC
CTGAGGACGATAATTGGAAAAATATTTTGCTTGGGAAATGGTCAGGCTGGACTGAACAAAGGATT
ATCATCCCCTCCTTGTCCGCCATATCCGAAAATATGGAAGACAATTCCCCGGGCC

6 Some work has been done...
• see R. Kolpakov, G. Kucherov, "Periodic structures in words", chapter of the 3rd Lothaire volume, Applied Combinatorics on Words, Cambridge University Press, 2005
• different results are based on common simple techniques: extension functions and s-factorization

7 Rest of this talk
• Basics
  – extension functions
  – computing periodicities in O(n log n) time
  – s-factorization (Lempel-Ziv factorization)
  – computing periodicities in O(n) time
• Computing all local periods in O(n) time

9 Extension function: simplest definition
• all values can be computed in O(n) time [Main&Lorentz 84]
• a refined algorithm is presented in [Lothaire 05] (inspired by Manacher's linear-time algorithm for computing palindromes)
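The formula defining the extension function did not survive in this transcript. As an illustration only, the sketch below assumes the common prefix-extension form: for each position i, the length of the longest common prefix of w and w[i:]. It uses the standard Z-algorithm, which is one well-known linear-time way to get these values and is not necessarily the algorithm of [Main&Lorentz 84] or [Lothaire 05]. The name `prefix_extension` is hypothetical.

```python
def prefix_extension(w: str) -> list[int]:
    """ext[i] = length of the longest common prefix of w and w[i:].

    Standard Z-algorithm: [left, right) is the rightmost window known to
    match a prefix of w, so each character is compared O(1) times overall.
    """
    n = len(w)
    ext = [0] * n
    if n == 0:
        return ext
    ext[0] = n
    left = right = 0
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position.
            ext[i] = min(right - i, ext[i - left])
        # Extend the match by explicit comparisons.
        while i + ext[i] < n and w[ext[i]] == w[i + ext[i]]:
            ext[i] += 1
        if i + ext[i] > right:
            left, right = i, i + ext[i]
    return ext


if __name__ == "__main__":
    print(prefix_extension("aabaab"))
```

The suffix-oriented variant can be obtained by running the same routine on the reversed word.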

10 Extension function: variants

11 Using extension functions to compute periodicities
• Lemma: there exists a square of period p iff …

12 Using extension functions to compute periodicities
• Example: atacgaacgaacggtacgaacgacgaagaac
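The exact statement of the lemma is missing from this transcript, so the following is a sketch under assumptions rather than the slide's formulation. It assumes the classical setting of a word t split at a boundary m, and looks for squares of period p that start left of the boundary and whose second half lies right of it. With lp the longest common prefix of t[m:] and t[m+p:], and ls the longest common suffix of t[:m] and t[:m+p], such a square exists exactly when ls ≥ 1 and ls + lp ≥ p. All names are hypothetical, and lp and ls are computed naively here for clarity; in a linear-time algorithm they would come from extension-function tables.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


def right_centered_square_starts(t: str, m: int, p: int) -> range:
    """Start positions s of squares t[s:s+2p] with s < m <= s + p.

    The range is empty when no such square of period p exists.
    """
    v = t[m:]
    if p > len(v):
        return range(0)
    lp = common_prefix_len(v, v[p:])                      # extension to the right
    ls = common_prefix_len(t[:m][::-1], t[:m + p][::-1])  # extension to the left
    lo = max(m - ls, m - p)
    hi = min(m - 1, m + lp - p)
    return range(lo, hi + 1)


if __name__ == "__main__":
    # "aabaab" is the square of "aab"; it crosses the boundary m = 3.
    print(list(right_centered_square_starts("aabaab", 3, 3)))
```

Squares centered left of the boundary are handled symmetrically, for instance by applying the same function to the reversed word.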

14 Using extension functions to compute periodicities
This implies (using binary division) that
• one can compute a compact representation of all squares (maximal periodicities) in O(n log n) time
• one can compute all squares in O(n log n) time [Crochemore 81, Main&Lorentz 84]
• one can test square-freeness in O(n log n) time

15 s-factorization (Lempel-Ziv factorization)
w = f_1 f_2 ... f_k, where for each i:
  – if the letter a which immediately follows f_1 ... f_{i-1} does not occur in f_1 ... f_{i-1}, then f_i = a
  – otherwise f_i is the longest subword occurring at least twice in f_1 ... f_i
• the s-factorization (Lempel-Ziv factorization) can be computed in linear time using a suffix tree or a DAWG
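The definition above can be turned into a direct, deliberately naive routine. This is a sketch for illustration: it relies on repeated substring search and is far from linear time, whereas the slide's linear bound needs a suffix tree or DAWG. The function name `s_factorization` and the sample word are assumptions, since the slide's own example is lost.

```python
def s_factorization(w: str) -> list[str]:
    """Naive s-factorization of w, following the two cases of the definition."""
    factors = []
    i, n = 0, len(w)
    while i < n:
        length = 0
        # Grow the factor while w[i:i+length+1] also occurs at an earlier
        # start position (the earlier occurrence may overlap the new one).
        while i + length < n and w.find(w[i:i + length + 1], 0, i + length) != -1:
            length += 1
        # A letter not seen before forms a factor on its own.
        length = max(length, 1)
        factors.append(w[i:i + length])
        i += length
    return factors


if __name__ == "__main__":
    print(s_factorization("abaabab"))
```

Allowing overlapping occurrences is what lets a long run such as "aaaa" be covered by only two factors.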

18 Why s-factorization is useful here
• lemma of [Main 89]

19 Computing (a compact representation of) all squares in linear time
1. compute the s-factorization of w (in O(n))
2. for each factor f_i:
   A. compute all maximal periodicities ending inside f_i and crossing the border between f_{i-1} and f_i (in …)
   B. recover all maximal periodicities occurring inside f_i from a left copy of f_i (in …)
Important: the number of maximal periodicities is O(n), while the number of squares can be Ω(n log n)

20 Using extension functions + s-factorization to compute periodicities
This implies that
• one can compute a compact representation of all squares (maximal periodicities) in O(n) time [Kolpakov,Kucherov 99]
• one can compute all squares (but also cubes, ...) in … time
• one can test square-freeness in O(n) time [Crochemore 83, Main&Lorentz 85]

21 Local periods
• minimal (local) square at position i = minimal square centered at i
• local period at i (denoted …) = root length of the minimal square at i
• cases illustrated: internal square; right-external square; left- and right-external square
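A brute-force reading of these definitions may help. The sketch below assumes that position i denotes the gap between w[i-1] and w[i], and that a square is allowed to overhang either end of the word, in which case only the letters that actually exist are compared. The names `local_period` and `square_kind` are hypothetical, and the loop is quadratic per position, unlike the linear-time method of the talk.

```python
def local_period(w: str, i: int) -> int:
    """Root length of the minimal square centered at gap i (1 <= i <= len(w) - 1)."""
    n = len(w)
    if not 1 <= i <= n - 1:
        raise ValueError("i must be an internal position of w")
    for p in range(1, n + 1):
        # Compare only the pairs (j, j + p) that fall inside the word.
        if all(w[j] == w[j + p] for j in range(max(0, i - p), min(i, n - p))):
            return p
    raise AssertionError("unreachable: p = len(w) always matches")


def square_kind(w: str, i: int) -> str:
    """Classify the minimal square at i by which ends of w it overhangs."""
    p = local_period(w, i)
    left_external = i - p < 0
    right_external = i + p > len(w)
    if left_external and right_external:
        return "left- and right-external"
    if left_external:
        return "left-external"
    if right_external:
        return "right-external"
    return "internal"


if __name__ == "__main__":
    w = "abaab"
    print([(local_period(w, i), square_kind(w, i)) for i in range(1, len(w))])
```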

22 Critical Factorization Theorem
• for any position i, the local period at i is at most the global period of w
• Critical Factorization Theorem: for every word w, there exists a position i such that the local period at i equals the global period of w
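Both statements can be sanity-checked exhaustively on short words. The sketch below is an illustration under assumptions: the helper names are hypothetical, positions are gaps 1..len(w)-1, and local periods are computed by brute force rather than by the talk's linear-time algorithm.

```python
from itertools import product


def global_period(w: str) -> int:
    """Smallest p such that w[p:] equals w[:len(w) - p]."""
    n = len(w)
    return next(p for p in range(1, n + 1) if w[p:] == w[:n - p])


def local_period(w: str, i: int) -> int:
    """Smallest p such that w[j] == w[j + p] around gap i, where both letters exist."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[j] == w[j + p]
                       for j in range(max(0, i - p), min(i, n - p))))


def critical_positions(w: str) -> list[int]:
    """Gaps whose local period equals the global period."""
    g = global_period(w)
    return [i for i in range(1, len(w)) if local_period(w, i) == g]


if __name__ == "__main__":
    # Check every binary word of length 2..8.
    for n in range(2, 9):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            assert all(local_period(w, i) <= global_period(w)
                       for i in range(1, n))
            assert critical_positions(w)
    print(critical_positions("abaab"))
```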

23 Computing local periods (minimal squares)
• compute separately
  – internal minimal squares
  – left-external and right-external minimal squares
  – both left- and right-external minimal squares
• focus on internal minimal squares
• compute the s-factorization
• for each factor, compute the minimal squares ending in this factor

24 Minimal squares inside a factor

31 Minimal squares crossing factor border
• focus on squares crossing the left border of a factor
• focus on those of them centered inside …
• general idea: compute squares and pick the minimal ones
• be careful, the number of squares can be super-linear!
• compute maximal periodicities in increasing order of periods
• only a linear number of squares need to be tested for minimality!

39 Sketch of the proof
• assume we are looking at squares of period p
• consider the largest period for which squares have been found
• if …, then test all squares of period p (at most …)
• if …, then either …, or …
• at most … squares need to be tested

44 Computing (right-)external squares
• use extension functions!
• for each position i, find the minimal … such that …
• can be done in O(n) time

45 Conclusions
• All local periods can be computed in O(n) time
• note that the global period of w is the maximum of its local periods

