1
Knuth-Morris-Pratt algorithm
Presented by Sathyasathish
2
Agenda
Problem/issue
Conventional solution (compare / shift by one) and its Θ
KMP solution and its Θ
3
Pattern Matching Problem/issue
Find an occurrence of a pattern (string) ‘P’ in a string ‘S’, and report the position in ‘S’ where the match occurs.
Source:
4
Conventional Solution
Compare each character of P with the corresponding character of S: if they match, continue with the next character; otherwise shift P one position.
[Figure: string S (a b c a b) and pattern p]
Source:
5
Comparison
Step 2: compare p[2] with S[2]
[Figure: pattern p aligned under string S (a b c a b)]
Source:
6
Comparison
Step 3: compare p[3] with S[3]
[Figure: pattern p aligned under string S]
Mismatch occurs here.
“Since a mismatch is detected, shift ‘P’ one position to the right and perform steps analogous to those from step 1 to step 3. At the position where the mismatch is detected, shift ‘P’ one position to the right and repeat the matching procedure.”
Source:
7
Conventional match program
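// T is the text, P is the pattern.
static void conventionalMatch(char[] T, char[] P) {
    int comparisons = 0; // total character comparisons
    int attempts = 0;    // alignments of P tried
    for (int i = 0; i + P.length <= T.length; i++) {
        attempts++;
        int j = 0;
        while (j < P.length) {
            comparisons++;
            if (T[i + j] != P[j]) break; // mismatch: shift P one position
            j++;
        }
        if (j == P.length) System.out.println("found a match at " + (i + 1));
    }
    System.out.println("Program character comparisons : " + comparisons
            + "\nNumber of attempts : " + attempts);
}
Source: migrated from C to Java by Sathya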
8
Θ of Conventional
Outer loop runs n times (n = length of string ‘S’)
Inner loop runs up to m times (m = length of pattern ‘P’)
Code: for (n) { for (m); }
Θ(mn) in the worst case
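The worst-case bound can be checked by counting comparisons on an input that forces the inner loop to run to the end at every alignment. The following is a minimal sketch, not the deck's program: the class name `NaiveCount`, the method `countComparisons`, and the test input are assumptions made for illustration.

```java
// Hypothetical helper (not from the slides): counts the character
// comparisons made by the conventional shift-by-one matcher.
class NaiveCount {
    static long countComparisons(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        long comparisons = 0;
        for (int i = 0; i + m <= n; i++) {      // outer loop: one pass per alignment
            for (int j = 0; j < m; j++) {       // inner loop: at most m comparisons
                comparisons++;
                if (text.charAt(i + j) != pattern.charAt(j)) {
                    break;                      // mismatch: shift the pattern by one
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Assumed worst-case style input: every alignment matches m-1
        // characters before failing on the last one.
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < 20; k++) {
            sb.append('a');                     // n = 20
        }
        String pattern = "aaab";                // m = 4
        // (n - m + 1) * m = 17 * 4 = 68 comparisons
        System.out.println(countComparisons(sb.toString(), pattern));
    }
}
```

With n = 20 and m = 4 there are n - m + 1 = 17 alignments, each costing m = 4 comparisons, so the count grows as the product of the two lengths.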
9
KMP
Potential areas where the conventional algorithm can be improved are as follows:
It never keeps track of characters it has already seen in the string. When there is a partial match followed by a mismatch, it compares all of those characters again.
KMP uses what it learned from the partial match in the string, together with overlaps within the pattern, during comparison. We will see how much efficiency this delivers.
10
Example
T: b a n a n a n o b a n o
P: n a n o
i=0: X
i=1: X
i=2: n a n X
i=3: X
i=4: n a n o
i=5: X
i=6: n X
i=7: X
i=8: X
i=9: X
i=10: n X
i=11: X
After investing a lot of work making comparisons in the inner loop of the code, a lot is known about what is in the text: after a partial match of j characters starting at position i, you know what is in positions T[i]...T[i+j-1]. KMP uses this learning.
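A trace of this kind can be generated mechanically. The sketch below is an illustration only: the class `NaiveTrace` and its output format are assumptions, and the pattern "nano" is taken from the matched row of the example. It tries every start position i and records the matched characters followed by X on a mismatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer (not from the slides) for the conventional matcher.
class NaiveTrace {
    static List<String> trace(String text, String pattern) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            StringBuilder line = new StringBuilder("i=" + i + ":");
            int j = 0;
            // Extend the partial match as far as it goes.
            while (j < pattern.length() && i + j < text.length()
                    && text.charAt(i + j) == pattern.charAt(j)) {
                line.append(' ').append(pattern.charAt(j));
                j++;
            }
            // X marks a failed attempt (mismatch, or the text ran out).
            if (j < pattern.length()) {
                line.append(" X");
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : trace("banananobano", "nano")) {
            System.out.println(line);
        }
    }
}
```

Each row restarts the comparison from the first pattern character, which is exactly the repeated work KMP avoids.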
11
KMP Solution
Issue with the conventional algorithm:
i=2: n a n
i=3: n a n o (invalid or wasted shift)
KMP first optimization step, skipping outer-loop iterations:
i=2: n a n X
i=4: n a n o (valid or learnt shift)
KMP second optimization step, skipping inner-loop iterations:
i=2: n a n X
12
Comparison KMP
13
KMP Algorithm
It differs from the conventional algorithm when there is a partial match followed by a mismatch. How it differs we will see in a while!
First we have to understand what a proper prefix and a proper suffix are.
Example: S = “nano”
Proper prefixes: n, na, nan (but not nano itself)
Proper suffixes: o, no, ano (but not nano itself)
Why do we need to know this?
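The two definitions can be made concrete with a few lines of Java. This is a small sketch under assumptions: the class `ProperAffixes` and its method names are invented for illustration, and it lists only the non-empty proper prefixes and suffixes, as the example does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the slides): lists the non-empty proper
// prefixes and suffixes of a string, shortest first.
class ProperAffixes {
    static List<String> properPrefixes(String s) {
        List<String> out = new ArrayList<>();
        for (int len = 1; len < s.length(); len++) {
            out.add(s.substring(0, len));            // first len characters
        }
        return out;
    }

    static List<String> properSuffixes(String s) {
        List<String> out = new ArrayList<>();
        for (int len = 1; len < s.length(); len++) {
            out.add(s.substring(s.length() - len));  // last len characters
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(properPrefixes("nano")); // [n, na, nan]
        System.out.println(properSuffixes("nano")); // [o, no, ano]
    }
}
```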
14
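Suffix Prefix
Take:
String:  abcdabfxxxxx
Pattern: abcdabe
Start the next comparison from:
String:  abcdabfxxxxx
Pattern:     abcdabe
(The matched part abcdab ends with ab, which is also the start of the pattern, so the pattern is shifted until its prefix ab lies under that suffix.)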
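The size of that shift follows from the longest proper prefix of the matched part that is also its suffix. The sketch below computes it by brute force; the class `BorderShift` and the method `longestBorder` are hypothetical names, and this is not the table-based preprocessing KMP actually uses, only a way to check the example.

```java
// Hypothetical helper (not from the slides): brute-force search for the
// longest proper prefix of s that is also a suffix of s.
class BorderShift {
    static int longestBorder(String s) {
        for (int len = s.length() - 1; len > 0; len--) {
            // Does the suffix of length len also start the string?
            if (s.startsWith(s.substring(s.length() - len))) {
                return len;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String matched = "abcdab";               // matched before 'e' met 'f'
        int border = longestBorder(matched);     // "ab" has length 2
        int shift = matched.length() - border;   // 6 - 2 = 4
        System.out.println("border=" + border + " shift=" + shift);
    }
}
```

For the earlier example, the matched part "nan" has border "n" of length 1, giving a shift of 3 - 1 = 2, which is the jump from i=2 to i=4.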
15
How KMP achieves this
First it preprocesses the pattern, independently of the string to be compared, and identifies substrings that occur as both a proper prefix and a proper suffix. Such a substring is called a border or window.
When there is a mismatch, it tries again with the next-largest window.
Example: ABAMABA
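The preprocessing step can be sketched as the usual border (failure) table computation. The indexing convention here is an assumption and may differ from the table shown on the slides: entry i holds the length of the longest border of the first i+1 pattern characters. The class and method names are also hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch (not from the slides) of KMP preprocessing.
class BorderTable {
    // border[i] = length of the longest proper prefix of p[0..i]
    // that is also a suffix of p[0..i].
    static int[] build(String p) {
        int m = p.length();
        int[] border = new int[m];
        int k = 0;                               // current border length
        for (int i = 1; i < m; i++) {
            // On a mismatch, fall back to the next-largest border.
            while (k > 0 && p.charAt(i) != p.charAt(k)) {
                k = border[k - 1];
            }
            if (p.charAt(i) == p.charAt(k)) {
                k++;                             // border grows by one
            }
            border[i] = k;
        }
        return border;
    }

    public static void main(String[] args) {
        // Prints [0, 0, 1, 0, 1, 2, 3]
        System.out.println(Arrays.toString(build("ABAMABA")));
    }
}
```

The last entry, 3, corresponds to the border ABA of ABAMABA. The `while` loop is the "try the next-largest window" step.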
16
Preprocessing
17
Preprocessing & window width table
18
String and Pattern matching
19
Θ of KMP
The table can be computed in Θ(m) time.
The searching phase can be performed in O(m+n) time.
The Knuth-Morris-Pratt algorithm performs at most 2n-1 text character comparisons during the searching phase.
Since m < n, the overall running time is Θ(n).
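A compact search routine with a comparison counter makes these bounds easy to observe. This is a sketch under assumptions, not the presenter's code: the class `KmpSearch`, the `Result` holder, and the 0-based match positions are choices made here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not from the slides) of the KMP searching phase.
class KmpSearch {
    static class Result {
        final List<Integer> positions = new ArrayList<>(); // 0-based match starts
        long comparisons;                                  // text character comparisons
    }

    // Border table: entry i is the longest border length of p[0..i].
    static int[] buildTable(String p) {
        int[] border = new int[p.length()];
        int k = 0;
        for (int i = 1; i < p.length(); i++) {
            while (k > 0 && p.charAt(i) != p.charAt(k)) {
                k = border[k - 1];
            }
            if (p.charAt(i) == p.charAt(k)) {
                k++;
            }
            border[i] = k;
        }
        return border;
    }

    static Result search(String text, String p) {
        Result r = new Result();
        int n = text.length();
        int m = p.length();
        if (m == 0) {
            return r;                       // nothing to search for
        }
        int[] border = buildTable(p);
        int i = 0;                          // position in the text, never moves back
        int k = 0;                          // pattern characters matched so far
        while (i < n) {
            r.comparisons++;
            if (text.charAt(i) == p.charAt(k)) {
                i++;
                k++;
                if (k == m) {               // full match ending just before i
                    r.positions.add(i - m);
                    k = border[m - 1];      // keep the overlap for the next match
                }
            } else if (k > 0) {
                k = border[k - 1];          // shift the pattern, keep i in place
            } else {
                i++;                        // no partial match: advance the text
            }
        }
        return r;
    }

    public static void main(String[] args) {
        Result r = search("banananobano", "nano");
        System.out.println("matches at " + r.positions
                + ", comparisons = " + r.comparisons);
    }
}
```

Because `i` never decreases and every fallback strictly reduces `k`, the loop does a number of iterations linear in n, which is where the linear bound comes from.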
20
Thank you
Questions?