Presentation is loading. Please wait.

Presentation is loading. Please wait.

Princeton University COS 226 Algorithms and Data Structures Spring 2003 Knuth-Morris-Pratt Reference: Chapter 19, Algorithms.

Similar presentations


Presentation on theme: "Princeton University COS 226 Algorithms and Data Structures Spring 2003 Knuth-Morris-Pratt Reference: Chapter 19, Algorithms."— Presentation transcript:

1 Princeton University COS 226 Algorithms and Data Structures Spring 2003 http://www.Princeton.EDU/~cs226 Knuth-Morris-Pratt Reference: Chapter 19, Algorithms in C by R. Sedgewick. Addison Wesley, 1990.

2 2 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. aabaaa Search Pattern 34 aa 56 a 01 aa 2 b accept state

3 3 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. aabaaa Search Pattern 34 aa 56 a 01 aa 2 b b b b b b a accept state

4 4 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

5 5 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

6 6 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

7 7 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

8 8 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

9 9 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

10 10 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

11 11 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

12 12 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

13 13 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aaabaa Search Text baaab accept state

14 14 Knuth-Morris-Pratt KMP algorithm. n Use knowledge of how search pattern repeats itself. n Build FSA from pattern. n Run FSA on text. 34 aa 56 a 01 aa 2 b b b b b b a aabaaa Search Pattern aabaaa Search Text baaab accept state

15 15 FSA Representation FSA used in KMP has special property. n Upon character match, go forward one state. n Only need to keep track of where to go upon character mismatch. – go to state next[j] if character mismatches in state j b0 a1 012345 0 2 3 2 0 4 0 5 3 6 aabaaa Search Pattern next002003 34 aa 56 a 01 aa 2 b b b b b b a accept state

16 16 KMP Algorithm Two key differences from brute force. n Text pointer i never backs up. Need to precompute next[] table. int kmpsearch(char p[], char t[], int next[]) { int i, j = 0; int M = strlen(p); // pattern length int N = strlen(t); // text length for (i = 0; i < N; i++) { if (t[i] == p[j]) j++; // char match else j = next[j]; // char mismatch if (j == M) return i – M + 1; // found } return N; // not found } KMP String Search

17 17 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. State 6. p[0..5] = aabaaa – assume you know state for p[1..5] = abaaa X = 2 – if next char is b (match): go forward6 + 1 = 7 – if next char is a (mismatch): go to state for abaaaa X + 'a' = 2 – update X to state for p[1..6] = abaaab X + 'b' = 3 34 aa 56 a 01 aa 2 b b b b b b a

18 18 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. 34 aa 56 a 01 aa 2 b b b b b b a 7 a b

19 19 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. State 7. p[0..6] = aabaaab – assume you know state for p[1..6] = abaaab X = 3 – if next char is b (match): go forward7 + 1 = 8 – next char is a (mismatch): go to state for abaaaba X + 'a' = 4 – update X to state for p[1..7] = abaaabb X + 'b' = 0 34 aa 56 a 01 aa 2 b b b b b b a 7 a b

20 20 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Example. Building FSA for aabaaabb. 34 aa 56 a 01 aa 2 b b 7 8 b b b a b b a b a

21 21 FSA Construction for KMP FSA construction for KMP. n FSA builds itself! Crucial insight. n To compute transitions for state n of FSA, suffices to have: – FSA for states 0 to n-1 – state X that FSA ends up in with input p[1..n-1] To compute state X' that FSA ends up in with input p[1..n], it suffices to have: – FSA for states 0 to n-1 – state X that FSA ends up in with input p[1..n-1]

22 22 FSA Construction for KMP 01 a b b a X aabaaabb Search Pattern pattern[1..j]j

23 23 FSA Construction for KMP 01 a b b0 a1 0 aabaaabb Search Pattern X 00 pattern[1..j]jnext 0

24 24 FSA Construction for KMP 01 aa 2 b b b0 a1 01 0 2 aabaaabb Search Pattern X 0 a11 0 pattern[1..j]jnext 0 0

25 25 FSA Construction for KMP 301 aa 2 b b b a b0 a1 012 0 2 3 2 aabaaabb Search Pattern X 0 ab0 a1 2 1 0 pattern[1..j]jnext 0 2 0

26 26 FSA Construction for KMP 34 a 01 aa 2 b b b b a b0 a1 0123 0 2 3 2 0 4 aabaaabb Search Pattern X 0 ab0 aba1 a1 2 1 3 0 pattern[1..j]jnext 0 2 0 0

27 27 FSA Construction for KMP 34 aa 5 01 aa 2 b b b b b a b0 a1 01234 0 2 3 2 0 4 0 5 aabaaabb Search Pattern abaa X 2 0 ab0 aba1 a1 2 1 4 3 0 pattern[1..j]jnext 0 0 2 0 0

28 28 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b b b b b a b0 a1 012345 0 2 3 2 0 4 0 5 3 6 aabaaabb Search Pattern abaa X 2 abaaa2 0 ab0 aba1 a1 2 1 4 3 5 0 pattern[1..j]jnext 0 3 0 2 0 0

29 29 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b 7 b b b b a b a b0 a1 0123456 0 2 3 2 0 4 0 5 3 6 7 2 aabaaabb Search Pattern abaa X 2 abaaa2 abaaab3 0 ab0 aba1 a1 2 1 4 3 5 0 6 pattern[1..j]jnext 0 3 2 0 2 0 0

30 30 FSA Construction for KMP 34 aa 56 a 01 aa 2 b b 7 8 b b b a b b a b a b0 a1 01234567 0 2 3 2 0 4 0 5 3 6 7 2 8 4 abaa X 2 abaaa2 abaaab3 0 ab0 aba1 a1 2 1 4 3 5 0 6 abaaabb07 aabaaabb Search Pattern pattern[1..j]jnext 0 3 2 0 2 0 0 4

31 31 FSA Construction for KMP Code for FSA construction in KMP algorithm. void kmpinit(char p[], int next[]) { int j, X = 0, M = strlen(p); next[0] = 0; for (j = 1; j < M; j++) { if (p[X] == p[j]) { // char match next[j] = next[X]; X = X + 1; } else { // char mismatch next[j] = X + 1; X = next[X]; } FSA Construction for KMP


Download ppt "Princeton University COS 226 Algorithms and Data Structures Spring 2003 Knuth-Morris-Pratt Reference: Chapter 19, Algorithms."

Similar presentations


Ads by Google