Download presentation
Presentation is loading. Please wait.
1
Princeton University COS 226 Algorithms and Data Structures Spring 2003 http://www.Princeton.EDU/~cs226 Knuth-Morris-Pratt Reference: Chapter 19, Algorithms in C by R. Sedgewick. Addison Wesley, 1990.
2
2 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. aabaaa Search Pattern 34 aa 56 a 01 aa 2 b accept state
3
3 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. aabaaa Search Pattern 34 aa 56 a 01 aa 2 b b b b b b a accept state
4
4 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
5
5 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
6
6 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
7
7 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
8
8 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
9
9 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
10
10 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
11
11 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
12
12 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
13
13 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state
14
14 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aabaaa Search Text baaab accept state
15
15 FSA Representation FSA used in KMP has special property. n Upon character match, go forward one state. n Only need to keep track of where to go upon character mismatch. – go to state next[j] if character mismatches in state j b0 a1 012345 0 2 3 2 0 4 0 5 3 6 aabaaa Search Pattern next002003 34 aa 56 a 01 aa 2 b b b b b b a accept state
16
16 KMP Algorithm Two key differences from brute force. n Text pointer i never backs up. Need to precompute next[] table. int kmpsearch(char p[], char t[], int next[]) { int i, j = 0; int M = strlen(p); // pattern length int N = strlen(t); // text length for (i = 0; i < N; i++) { if (t[i] == p[j]) j++; // char match else j = next[j]; // char mismatch if (j == M) return i – M + 1; // found } return N; // not found } KMP String Search
17
17 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. State 6. p[0..5] = aabaaa – assume you know state for p[1..5] = abaaa X = 2 – if next char is b (match): go forward6 + 1 = 7 – if next char is a (mismatch): go to state for abaaaa X + 'a' = 2 – update X to state for p[1..6] = abaaab X + 'b' = 3 34 aa 56 a 01 aa 2 b b b b b b a
18
18 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. 34 aa 56 a 01 aa 2 b b b b b b a 7 a b
19
19 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. State 7. p[0..6] = aabaaab – assume you know state for p[1..6] = abaaab X = 3 – if next char is b (match): go forward7 + 1 = 8 – next char is a (mismatch): go to state for abaaaba X + 'a' = 4 – update X to state for p[1..7] = abaaabb X + 'b' = 0 34 aa 56 a 01 aa 2 b b b b b b a 7 a b
20
20 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. 34 aa 56 a 01 aa 2 b b 7 8 b b b a b b a b a
21
21 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Crucial insight. n To compute transitions for state n of FSA, suffices to have: – FSA for states 0 to n-1 – state X that FSA ends up in with input p[1..n-1] To compute state X' that FSA ends up in with input p[1..n], it suffices to have: – FSA for states 0 to n-1 – state X that FSA ends up in with input p[1..n-1]
22
22 FSA Construction for KMP 01 a b b a X aabaaabb Search Pattern pattern[1..j]j
23
23 FSA Construction for KMP 01 a b b0 a1 0 aabaaabb Search Pattern X 00 pattern[1..j]jnext 0
24
24 FSA Construction for KMP 01 aa 2 b b b0 a1 01 0 2 aabaaabb Search Pattern X 0 a11 0 pattern[1..j]jnext 0 0
25
25 FSA Construction for KMP 301 aa 2 b b b a b0 a1 012 0 2 3 2 aabaaabb Search Pattern X 0 ab0 a1 2 1 0 pattern[1..j]jnext 0 2 0
26
26 FSA Construction for KMP 34 a 01 aa 2 b b b b a b0 a1 0123 0 2 3 2 0 4 aabaaabb Search Pattern X 0 ab0 aba1 a1 2 1 3 0 pattern[1..j]jnext 0 2 0 0
27
27 FSA Construction for KMP 34 aa 5 01 aa 2 b b b b b a b0 a1 01234 0 2 3 2 0 4 0 5 aabaaabb Search Pattern abaa X 2 0 ab0 aba1 a1 2 1 4 3 0 pattern[1..j]jnext 0 0 2 0 0
28
28 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b b b b b a b0 a1 012345 0 2 3 2 0 4 0 5 3 6 aabaaabb Search Pattern abaa X 2 abaaa2 0 ab0 aba1 a1 2 1 4 3 5 0 pattern[1..j]jnext 0 3 0 2 0 0
29
29 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b 7 b b b b a b a b0 a1 0123456 0 2 3 2 0 4 0 5 3 6 7 2 aabaaabb Search Pattern abaa X 2 abaaa2 abaaab3 0 ab0 aba1 a1 2 1 4 3 5 0 6 pattern[1..j]jnext 0 3 2 0 2 0 0
30
30 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b 7 8 b b b a b b a b a b0 a1 01234567 0 2 3 2 0 4 0 5 3 6 7 2 8 4 abaa X 2 abaaa2 abaaab3 0 ab0 aba1 a1 2 1 4 3 5 0 6 abaaabb07 aabaaabb Search Pattern pattern[1..j]jnext 0 3 2 0 2 0 0 4
31
31 FSA Construction for KMP Code for FSA construction in KMP algorithm. void kmpinit(char p[], int next[]) { int j, X = 0, M = strlen(p); next[0] = 0; for (j = 1; j < M; j++) { if (p[X] == p[j]) { // char match next[j] = next[X]; X = X + 1; } else { // char mismatch next[j] = X + 1; X = next[X]; } FSA Construction for KMP
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.