Published by Primrose Kennedy. Modified over 9 years ago.
1
Average Value of Sum of Exponents of Runs in Strings
Kazuhiko Kusano, Wataru Matsubara, Akira Ishino, Ayumi Shinohara
Graduate School of Information Sciences, Tohoku University, Japan
2
Background
3
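Run (Maximal Repetition)
A run is a substring w that has a period p with |w| ≧ 2p.
It cannot be extended to the left or to the right with the same period.
Each run is counted once, with its minimal period.
(Figure: an example string with its runs marked.)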
4
The Number & The Sum of Exponents
The number of runs and the sum of exponents (repetition counts) of runs are interesting problems.
(Figure: an example string; exponents shown: 2.5, 2, 3, 2, 2.2)
Number: 6
Sum of exponents: 14.2
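To make the two quantities concrete, here is a minimal brute-force sketch that lists the runs of a string and sums their exponents. It is not taken from the presentation: the names `is_primitive`, `runs` and `sum_of_exponents` are hypothetical, and the simple scan is chosen for readability, not speed.

```python
from fractions import Fraction


def is_primitive(s):
    """True if s is not a power u^k (k >= 2) of a shorter string u."""
    # A non-empty s is primitive iff it occurs in s+s only at offsets 0 and len(s).
    return len(s) > 0 and (s + s).find(s, 1) == len(s)


def runs(w):
    """List (start, end, period) of every run in w, with inclusive indices."""
    n = len(w)
    found = []
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            j = i
            # Extend while the period-p condition w[k] == w[k+p] keeps holding.
            while j + p < n and w[j] == w[j + p]:
                j += 1
            # w[i .. j+p-1] is a maximal substring with period p.
            # Keep it if it is long enough (>= 2p) and p is its minimal period,
            # which is the case exactly when its root w[i .. i+p-1] is primitive.
            if j - i >= p and is_primitive(w[i:i + p]):
                found.append((i, j + p - 1, p))
            i = j
    return found


def sum_of_exponents(w):
    """Sum of (run length / period) over all runs of w, as an exact fraction."""
    return sum(Fraction(end - start + 1, p) for start, end, p in runs(w))
```

For example, `runs("aabab")` gives the runs `aa` (period 1) and `abab` (period 2), and `sum_of_exponents("ababa")` is 5/2 because the whole string is a single run with period 2.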
5
Maximum
The maximum number of runs and the maximum value of the sum of exponents of runs are still unknown.

Number of runs
  Upper bounds: ≦ cn (Kolpakov and Kucherov, 1999); ≦ 5n (Rytter, 2006); ≦ 3.48n (Puglisi et al., 2007); ≦ 3.44n (Rytter, 2007); ≦ 1.048n (Crochemore and Ilie, 2008)
  Conjecture: = n ?
  Lower bounds: ≧ 0.927n (Franek et al., 2003); ≧ 0.945n (Matsubara et al., 2008)

Sum of exponents
  Upper bounds: ≦ cn (Kolpakov and Kucherov, 1999); ≦ 25n (Rytter, 2006); ≦ 2.9n (Crochemore and Ilie, 2007)
  Conjecture: = 2n ?
  Lower bounds: ≧ 1.854n (Franek et al., 2003); ≧ 1.889n (Matsubara et al., 2008)
6
Average
The average number of runs has been presented [Puglisi & Simpson, Australasian Journal of Combinatorics, to appear (2008)].
We show the average value of the sum of exponents of runs (our result).
σ: alphabet size
μ(d): Möbius function
8
The average value of the sum of exponents of runs in strings of length n is expressed as follows.
(Formulas on the slide: number of runs [Puglisi & Simpson, 2008]; sum of exponents)
σ: alphabet size
L(p): number of Lyndon words of length p
9
Detail
10
Runs in all strings of length n
Complicated!
11
d(w,p)
A string d(w,p) of length |w|-p is defined as follows.
(Figure: w, w>>2 (w shifted by 2) and d(w,2).)
w[i..j+p] is a run with period p if and only if d(w,p)[i..j] is a 0-segment (maximal block of 0's) of length l ≧ p.
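A minimal sketch of this correspondence follows. The defining formula is not legible on the slide, so the code assumes that symbols are the integers 0..σ-1 and that d(w,p)[i] = (w[i+p] - w[i]) mod σ; this choice is an assumption, picked to agree with the later statement that d(w,p) and w[0..p-1] determine w. The names `d` and `zero_segments` are illustrative only.

```python
def d(w, p, sigma):
    """Assumed definition: d(w,p)[i] = (w[i+p] - w[i]) mod sigma, length |w| - p."""
    return [(w[i + p] - w[i]) % sigma for i in range(len(w) - p)]


def zero_segments(s):
    """Maximal blocks of 0's in s, as inclusive (i, j) index pairs."""
    segments = []
    i = 0
    while i < len(s):
        if s[i] != 0:
            i += 1
            continue
        j = i
        while j + 1 < len(s) and s[j + 1] == 0:
            j += 1
        segments.append((i, j))
        i = j + 1
    return segments


w = [0, 1, 0, 1, 0, 2]           # a string over {0, 1, 2}
p = 2
dw = d(w, p, sigma=3)            # [0, 0, 0, 1]
for i, j in zero_segments(dw):
    length = j - i + 1
    if length >= p:
        # The 0-segment d(w,p)[i..j] marks the period-p repetition w[i..j+p].
        print(w[i:j + p + 1], "exponent", (length + p) / p)
```

Here the single 0-segment d(w,2)[0..2] has length l = 3, so it marks w[0..4] = 01010, a run of length l + p = 5 with exponent 5/2.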
12
Runs are classified according to their period.
(Figure: w with d(w,1), d(w,2), d(w,3).)
13
0-segments are classified according to their length.
(Figure: d(w,2) and d(w,3), with 0-segments of length l=2, l=3, l=4.)
14
c(n,p)
The number of 0-segments of length p in Σ^n
Example: σ = 2, n = 5, p = 2
17
c(n,p)
The number of 0-segments of length p in Σ^n
Instead of 0-segments, pairs of strings (the string to the left and the string to the right) that are separated by a 0-segment of length p are counted.
(σ-1)^2 choices for the two symbols adjacent to the 0-segment (each ≠ 0)
σ^(n-p-2) choices for the remaining symbols
(n-p+1) choices for the position of the 0-segment
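The counting argument can be checked on small cases. The sketch below compares a brute-force count with a closed form that treats the n-p+1 positions separately: n-p-1 interior positions, where the segment has two non-zero neighbours, and 2 boundary positions, where it has only one. This closed form is a derivation written for this illustration, not the formula shown on the slide; `c_brute` and `c` are hypothetical names.

```python
from itertools import product


def c_brute(n, p, sigma):
    """Count 0-segments of length exactly p over all strings in {0..sigma-1}^n."""
    total = 0
    for s in product(range(sigma), repeat=n):
        i = 0
        while i < n:
            if s[i] != 0:
                i += 1
                continue
            j = i
            while j < n and s[j] == 0:
                j += 1
            if j - i == p:           # maximal block s[i..j-1] has length p
                total += 1
            i = j
    return total


def c(n, p, sigma):
    """Closed form for the same count (derivation sketch, assumes p >= 1)."""
    if p > n:
        return 0
    if p == n:
        return 1                      # only the all-zero string
    # Segment touches the left or right end: one non-zero neighbour, rest free.
    boundary = 2 * (sigma - 1) * sigma ** (n - p - 1)
    # Segment strictly inside: two non-zero neighbours, rest free.
    interior = 0
    if n - p >= 2:
        interior = (n - p - 1) * (sigma - 1) ** 2 * sigma ** (n - p - 2)
    return boundary + interior
```

With σ = 2, n = 5, p = 2 both functions return 12: the 2 interior positions contribute 2 · 1 · 2 = 4 and the 2 boundary positions contribute 2 · 1 · 4 = 8.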
18
C(n,p)
0-segments of length l in d(w,p) correspond to runs of period p in w.
The length of the run is l+p and its exponent is (l+p)/p.
We denote by C(n,p) the sum of (l+p)/p over all 0-segments of length p or longer, as follows.
(Figure: w and d(w,p) with p=2, l=3.)
19
C(n,p)
Example: σ = 2, n = 5, p = 2
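The defining sum for C(n,p) is not legible in this transcript. One reading that is consistent with the text, and that is assumed in the sketch below, is C(n,p) = Σ_{l=p}^{n-p} (l+p)/p · c(n-p, l), since d(w,p) ranges over strings of length n-p when w has length n. Both this reading and the closed form used for `c` are assumptions made for illustration.

```python
from fractions import Fraction


def c(m, l, sigma):
    """Number of 0-segments of length exactly l over all strings of length m."""
    if l > m:
        return 0
    if l == m:
        return 1                      # only the all-zero string
    boundary = 2 * (sigma - 1) * sigma ** (m - l - 1)
    interior = 0
    if m - l >= 2:
        interior = (m - l - 1) * (sigma - 1) ** 2 * sigma ** (m - l - 2)
    return boundary + interior


def C(n, p, sigma):
    """Assumed reading: sum of (l+p)/p over 0-segments of length l >= p
    in all difference strings of length n - p."""
    return sum(Fraction(l + p, p) * c(n - p, l, sigma)
               for l in range(p, n - p + 1))
```

Under this reading, σ = 2, n = 5, p = 2 gives C(5,2) = (4/2) · c(3,2) + (5/2) · c(3,3) = 2 · 2 + 5/2 = 13/2.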
21
0-segments and runs
A 0-segment of length l ≧ p in d(w,p) corresponds to σ^p runs having period p in w, because d(w,p) and w[0..p-1] determine w[p..n-1].
(Figure: d(w,2) and the corresponding strings w.)
00000, 11111 and 22222 are runs not of period 2 but of period 1.
Among the roots, every string of length p appears exactly once.
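The claim that every string of length p appears once among the roots can be illustrated as follows. The sketch assumes the difference definition d(w,p)[i] = (w[i+p] - w[i]) mod σ, which is not legible in this transcript; `reconstruct` is a hypothetical helper name.

```python
from itertools import product


def reconstruct(prefix, dw, sigma):
    """Rebuild w from w[0..p-1] and d(w,p), assuming d[i] = (w[i+p] - w[i]) mod sigma."""
    w = list(prefix)
    for i, diff in enumerate(dw):
        # w[i+p] is determined by w[i] and d[i].
        w.append((w[i] + diff) % sigma)
    return w


sigma, p = 3, 2
dw = [1, 0, 0, 0]                 # 0-segment d[1..3] of length l = 3 >= p
roots = []
for prefix in product(range(sigma), repeat=p):
    w = reconstruct(prefix, dw, sigma)
    # The 0-segment d[1..3] marks the repetition w[1..5]; its root is w[1..2].
    roots.append(tuple(w[1:1 + p]))
```

Each of the 3^2 = 9 prefixes yields a different root, so `roots` contains every string of length 2 exactly once; the three constant roots (0,0), (1,1), (2,2) are the ones that belong to runs of period 1 and must be excluded.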
22
Counting a run once
To avoid counting a run more than once, a run that has a shorter period should be ignored.
A run has no shorter period ⇔ the root of the run is primitive.
The number of primitive strings of length p is pL(p).
L(p): number of Lyndon words of length p
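The identity "number of primitive strings of length p = pL(p)" can be checked numerically. The sketch below computes L(p) with the standard Möbius-function formula L(p) = (1/p) Σ_{d|p} μ(p/d) σ^d and compares p·L(p) with a brute-force count; the function names are illustrative and not from the presentation.

```python
from itertools import product


def mobius(n):
    """Möbius function by trial division."""
    result, q = 1, 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:
                return 0              # n has a square factor
            result = -result
        q += 1
    return -result if n > 1 else result


def lyndon_count(p, sigma):
    """L(p) = (1/p) * sum over divisors k of p of mu(p/k) * sigma^k."""
    total = sum(mobius(p // k) * sigma ** k
                for k in range(1, p + 1) if p % k == 0)
    return total // p


def is_primitive(s):
    """True if s equals none of its non-trivial rotations' alignments in s+s."""
    doubled = s + s
    return all(doubled[k:k + len(s)] != s for k in range(1, len(s)))


def primitive_count_brute(p, sigma):
    """Count primitive strings of length p by enumeration."""
    return sum(is_primitive(s) for s in product(range(sigma), repeat=p))
```

For the binary alphabet, `lyndon_count(4, 2)` is 3 (the Lyndon words 0001, 0011, 0111), and there are 4 · 3 = 12 primitive strings of length 4.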
26
Average value of sum of exponents
The sum of exponents of runs in Σ^n and the average value of the sum of exponents of runs in strings of length n are as follows.
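The formulas themselves are missing from this transcript. Combining the counting steps of the previous slides suggests the following reconstruction, which should be read as an assumption and not as the authors' formula: the total over Σ^n is Σ_{p=1}^{⌊n/2⌋} p·L(p)·C(n,p) with C(n,p) = Σ_{l=p}^{n-p} (l+p)/p · c(n-p, l), and e(n) is that total divided by σ^n. The sketch checks the reconstruction against direct enumeration for small n; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mobius(n):
    """Möbius function by trial division."""
    result, q = 1, 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:
                return 0              # n has a square factor
            result = -result
        q += 1
    return -result if n > 1 else result


def lyndon_count(p, sigma):
    """L(p): number of Lyndon words of length p over sigma symbols."""
    return sum(mobius(p // k) * sigma ** k
               for k in range(1, p + 1) if p % k == 0) // p


def c(m, l, sigma):
    """Number of 0-segments of length exactly l over all strings of length m."""
    if l > m:
        return 0
    if l == m:
        return 1
    boundary = 2 * (sigma - 1) * sigma ** (m - l - 1)
    interior = 0
    if m - l >= 2:
        interior = (m - l - 1) * (sigma - 1) ** 2 * sigma ** (m - l - 2)
    return boundary + interior


def total_exponents(n, sigma):
    """Assumed reconstruction: sum over p of p * L(p) * C(n, p)."""
    total = Fraction(0)
    for p in range(1, n // 2 + 1):
        C = sum(Fraction(l + p, p) * c(n - p, l, sigma)
                for l in range(p, n - p + 1))
        total += p * lyndon_count(p, sigma) * C
    return total


def average_exponents(n, sigma):
    """e(n): total over all sigma^n strings divided by their number."""
    return total_exponents(n, sigma) / sigma ** n


def total_exponents_brute(n, sigma):
    """Same total by enumerating every string and every run (tiny n only)."""
    def primitive(s):
        return all((s + s)[k:k + len(s)] != s for k in range(1, len(s)))

    total = Fraction(0)
    for w in product(range(sigma), repeat=n):
        for p in range(1, n // 2 + 1):
            i = 0
            while i + p < n:
                if w[i] != w[i + p]:
                    i += 1
                    continue
                j = i
                while j + p < n and w[j] == w[j + p]:
                    j += 1
                # Maximal period-p block w[i..j+p-1]; count it if it is a run.
                if j - i >= p and primitive(w[i:i + p]):
                    total += Fraction(j - i + p, p)
                i = j
    return total
```

For binary strings of length 3 the total is 14 (000 and 111 contribute 3 each, and 001, 011, 100, 110 contribute 2 each), so `average_exponents(3, 2)` is 14/8 = 7/4.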
27
Limit of e(n)
The average value e(n) grows almost linearly as n increases.
28
Limit of e(n)
The limit of e(n)/n and the actual values are as follows.

σ    e(n)/n
2    1.131
3    0.738
4    0.545
5    0.430
6    0.355

μ(d): Möbius function
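The limit formula is not legible in this transcript. As a cross-check of the tabulated values, the sketch below evaluates a series obtained by keeping only the terms of the counting argument that grow linearly in n: lim e(n)/n = Σ_{p≥1} L(p) · (2p(σ-1) + 1) / σ^(2p+1). This series is a derivation made for this illustration under the assumptions stated earlier, not the expression from the slide; `limit_ratio` and its `terms` parameter are hypothetical.

```python
def mobius(n):
    """Möbius function by trial division."""
    result, q = 1, 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:
                return 0              # n has a square factor
            result = -result
        q += 1
    return -result if n > 1 else result


def lyndon_count(p, sigma):
    """L(p): number of Lyndon words of length p over sigma symbols."""
    return sum(mobius(p // k) * sigma ** k
               for k in range(1, p + 1) if p % k == 0) // p


def limit_ratio(sigma, terms=60):
    """Partial sum of L(p) * (2p(sigma-1) + 1) / sigma^(2p+1) for p = 1..terms."""
    return sum(lyndon_count(p, sigma) * (2 * p * (sigma - 1) + 1)
               / sigma ** (2 * p + 1)
               for p in range(1, terms + 1))


for sigma in range(2, 7):
    print(sigma, round(limit_ratio(sigma), 3))
```

The terms decay roughly like σ^(-p), so 60 terms are far more than needed at this precision; the partial sums stay within 0.001 of the values in the table for σ = 2 to 6.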
29
Summary
30
Summary
The number of 0-segments of length p in Σ^n
The sum of (l+p)/p over all 0-segments of length p or longer
The average value of the sum of exponents of runs in strings of length n
Thank you for your attention
32
Period
NG