1
The Galil-Giancarlo algorithm
Advisor: Prof. R. C. T. Lee
Speaker: S. Y. Tang
Source paper: Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437.
2
The Galil-Giancarlo algorithm solves the string matching problem.
String matching problem:
Input: a text string T of length n and a pattern string P of length m.
Output: all occurrences of P in T.
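To make the input/output specification concrete, here is a minimal brute-force reference in Python. It is not the Galil-Giancarlo algorithm, only a direct restatement of the problem; the function name is illustrative, and the sample text and pattern are the ones used in the worked example later in the slides.

```python
def find_all_occurrences(text: str, pattern: str) -> list[int]:
    """Return every 0-based start index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    # Try every alignment and compare the window with the pattern directly.
    return [j for j in range(n - m + 1) if text[j:j + m] == pattern]


print(find_all_occurrences("TAGAAGGACGGAGGAACGGGGGACGG", "GGGACGG"))  # [19]
```

This takes O(nm) time in the worst case, which is the cost the linear-time algorithms in this presentation are designed to avoid.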
3
The Galil-Giancarlo algorithm (GG algorithm for short) improves the worst case of the Colussi algorithm. It has two phases: preprocessing and searching. The preprocessing phase is the same as in the Colussi algorithm. The difference lies in the searching phase, where the GG algorithm adds five cases that determine how to jump.
4
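Cases 1-3 (the window is shifted in each case). In the examples, l is the number of copies of the pattern's first character at the start of the pattern, and k is the number of consecutive copies of that character found at the start of the text window.

Case 1: l > k.
  Text:    GGAGGGGGAA
  Pattern: GGGGGAA      (l = 5, k = 2; shift)

Case 2: l = k and p[l+1] ≠ t[j+k].
  Text:    GGGCCGGTAG
  Pattern: GGGACGG      (l = 3, k = 3; shift)

Case 3: l < k and p[l+1] ≠ t[j+k].
  Text:    GGGGGCCGGA
  Pattern: GGAACGG      (l = 2, k = 5; shift)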
5
Cases 4-5, both shown with the pattern GGGACGG (l = 3).

Case 4: l = k and p[l+1] = t[j+k]. No shift is needed.
  Text:    GGGACGGAC
  Pattern: GGGACGG      (k = 3)

Case 5: l < k and p[l+1] = t[j+k].
  Text:    GGGGGACGG
  Pattern: GGGACGG      (k = 5; shift)
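The five cases can be sketched as a small classifier. This is only an illustration under stated assumptions: the case numbering (1: l > k; 2: l = k with mismatch; 3: l < k with mismatch; 4: l = k with match; 5: l < k with match) is read off the slide examples, the shift amounts (k + 1 for Cases 1-3, 0 for Case 4, k - l for Case 5) are inferred from the worked example rather than stated on the slides, and the function names are hypothetical. Indices are 0-based here, so the slides' p[l+1] corresponds to `pattern[l]`.

```python
def leading_run(s: str, ch: str) -> int:
    """Length of the run of ch at the start of s."""
    n = 0
    while n < len(s) and s[n] == ch:
        n += 1
    return n


def gg_case(pattern: str, window: str) -> tuple[int, int, int, int]:
    """Return (case, l, k, shift) for a window that starts with a run of pattern[0]."""
    c = pattern[0]
    l = leading_run(pattern, c)
    if l == len(pattern):
        raise ValueError("pattern must not be a power of a single character")
    k = leading_run(window, c)
    # Does the character after the run in the text equal the one after the run in the pattern?
    next_matches = k < len(window) and window[k] == pattern[l]
    if l > k:
        return 1, l, k, k + 1          # too few copies of c: move past the run (assumed shift)
    if not next_matches:
        return (2 if l == k else 3), l, k, k + 1
    if l == k:
        return 4, l, k, 0              # already aligned: no shift
    return 5, l, k, k - l              # align the end of the pattern's run with the text's run


for p, t in [("GGGGGAA", "GGAGGGGGAA"), ("GGGACGG", "GGGCCGGTAG"),
             ("GGAACGG", "GGGGGCCGGA"), ("GGGACGG", "GGGACGGAC"),
             ("GGGACGG", "GGGGGACGG")]:
    print(p, t, gg_case(p, t))
```

On the slide examples this yields Cases 1 through 5 in order, with l and k equal to the values shown on the slides.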
6
Example (1/7)

T = TAGAAGGACGGAGGAACGGGGGACGG (positions 0-25), P = GGGACGG.

i         0  1  2  3  4  5  6
X[i]      G  G  G  A  C  G  G
Kmp[i]    -  -  -  2  0  -  -
Kmin[i]   -  -  -  1  4  -  -
Rmin[i]   5  5  5  -  -  6  7
Shift[i]  5  5  5  1  4  6  7

We first compare the noholes using phase 1 of the Colussi algorithm and shift using Shift[i]. With P aligned at T[0..6], a mismatch occurs at pattern position 4, so the window shifts by Shift[4] = 4.
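The Kmin row and the order in which pattern positions are compared can be reproduced by brute force. The sketch below is not the O(m) preprocessing of the Colussi algorithm. It assumes the usual definition of Kmin, which the slides do not state: the smallest k > 0 with p[0..i-k-1] = p[k..i-1] and p[i-k] ≠ p[i], with positions that have no such k being holes. It also assumes the Colussi comparison order of noholes left to right followed by holes right to left. Function names are illustrative.

```python
def kmin_table(p: str) -> list[int]:
    """Brute-force Kmin; 0 marks a hole (shown as '-' on the slides)."""
    m = len(p)
    kmin = [0] * m
    for i in range(m):
        for k in range(1, i + 1):
            # Prefix of length i-k must reappear at offset k, and the next characters must differ.
            if p[:i - k] == p[k:i] and p[i - k] != p[i]:
                kmin[i] = k
                break
    return kmin


def comparison_order(p: str) -> list[int]:
    """Pattern positions in comparison order: noholes ascending, then holes descending."""
    kmin = kmin_table(p)
    noholes = [i for i, k in enumerate(kmin) if k != 0]
    holes = [i for i, k in enumerate(kmin) if k == 0]
    return noholes + holes[::-1]


print(kmin_table("GGGACGG"))        # [0, 0, 0, 1, 4, 0, 0]
print(comparison_order("GGGACGG"))  # [3, 4, 6, 5, 2, 1, 0]
```

For GGGACGG the noholes are positions 3 and 4, which is why the first comparisons of each attempt in the example fall on the A and the C.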
7
Example (2/7)

With P aligned at T[4..10], both noholes (pattern positions 3 and 4) match.
8
Example (3/7)

After all noholes are matched, we compare the holes using phase 2 of the Colussi algorithm and shift using Shift[i]. With P still aligned at T[4..10], a mismatch occurs at pattern position 0, so the window shifts by Shift[0] = 5.
9
Example (4/7)

With P aligned at T[9..15], we shift using Case 1 of the GG algorithm, because the condition l > k holds (l = 3, k = 2). The window moves to T[12..18].
10
Example (5/7)

After applying the GG case, we return to the Colussi algorithm with P aligned at T[12..18]. All noholes match (comparisons 1 and 2, at pattern positions 3 and 4). The holes are then compared (comparisons 3, 4 and 5, at pattern positions 6, 5 and 2), and a mismatch occurs at pattern position 2, so the window shifts by Shift[2] = 5.
11
Example (6/7)

With P aligned at T[17..23], we shift using Case 5 of the GG algorithm, because the conditions for using the GG algorithm hold and l < k (l = 3; k = 5, counting the G's at T[17..21]). The window moves to T[19..25].
12
Example (7/7)

After applying the GG case, we return to the Colussi algorithm with P aligned at T[19..25]. The remaining comparisons (in the order 1, 3, 2 across pattern positions 4, 5 and 6) all succeed, giving an exact match.
13
The cases under which the GG algorithm is not used.

Case 1: The pattern has only one period. The entire window is skipped, so there is no way to know whether there is a prefix in the window equal to a prefix of the pattern.

Example: T = GCAGCGGGAC, P = GGAGC. After the mismatch, the window is shifted.

i         0  1  2  3  4
X[i]      G  G  A  G  C
Kmp[i]    -  -  1  -  1
Kmin[i]   -  -  1  -  3
Rmin[i]   5  5  -  4  -
Shift[i]  5  5  1  4  3
14
Case 2: A prefix of the pattern is already known to be equal to a prefix of the window.

First example: T = GGACGGAACGCA, P = GGAGGGA. After the mismatch, the window is shifted.

i         0  1  2  3  4  5  6
X[i]      G  G  A  G  G  G  A
Kmp[i]    -  -  1  -  -  2  1
Kmin[i]   -  -  1  -  -  3  5
Rmin[i]   4  4  -  4  4  -  -
Shift[i]  4  4  1  4  4  3  5

Second example: T = GCAGGAGCAGCA, P = GGAGGAG. After the mismatch, the window is shifted.

i         0  1  2  3  4  5  6
X[i]      G  G  A  G  G  A  G
Kmp[i]    -  -  1  -  -  1  -
Kmin[i]   -  -  1  -  -  4  -
Rmin[i]   3  3  -  6  6  -  7
Shift[i]  3  3  1  6  6  4  7
15
Time complexity
Preprocessing phase: O(m) time and space.
Searching phase: O(n) time.
The algorithm performs at most (4/3)n text character comparisons in the worst case.
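To put the constant in perspective, the small sketch below evaluates the (4/3)n bound next to the (3/2)n figure that the conclusion quotes for the Colussi algorithm. The function name and the dictionary keys are illustrative; the numbers are upper bounds, not measured comparison counts.

```python
from fractions import Fraction


def worst_case_bounds(n: int) -> dict[str, Fraction]:
    """Worst-case bounds on text character comparisons for a text of length n."""
    return {
        "galil_giancarlo": Fraction(4, 3) * n,  # (4/3)n, from the time-complexity slide
        "colussi": Fraction(3, 2) * n,          # (3/2)n, from the conclusion slide
    }


bounds = worst_case_bounds(26)  # 26 is the length of the example text
print(float(bounds["galil_giancarlo"]), float(bounds["colussi"]))
```

For the 26-character example text this gives roughly 34.7 against 39 comparisons.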
16
Conclusion
The Galil-Giancarlo algorithm is very similar to the Colussi algorithm. The Colussi algorithm performs very badly if the pattern starts and ends with a sequence of repetitions of the same symbol. For such patterns the Colussi algorithm shifts by a single position, and (3/2)n comparisons are actually performed. Galil and Giancarlo devised a way to avoid these single-position shifts.
17
References
1. [B92] Breslauer, D., Efficient String Algorithmics, Ph.D. Thesis, Report CU-024-92, Computer Science Department, Columbia University, New York, NY, 1992.
2. [GG92] Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437.