1 Approximate Boyer-Moore String Matching Source : SIAM Journal on Computing, Vol. 22, No. 2, 1993, pp.243-260 J. Tarhio and E. Ukkonen Advisor: Prof.


1 Approximate Boyer-Moore String Matching. Source: SIAM Journal on Computing, Vol. 22, No. 2, 1993, pp. 243-260. J. Tarhio and E. Ukkonen. Advisor: Prof. R. C. T. Lee. Speaker: Kuei-hao Chen

2 The k mismatches problem. The k differences problem.

3 Definition of the k mismatches problem: Given a pattern string P of length m and a text string T of length n, we would like to find all approximate occurrences of P in T with at most k mismatches. For example, with k=1: [figure]
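The definition above can be checked directly with a brute-force scan. The following is a minimal sketch for reference only, not the algorithm of the paper; the function name k_mismatch_naive and the 0-based start positions are assumptions made for this illustration.

```python
def k_mismatch_naive(text: str, pattern: str, k: int) -> list[int]:
    """Return the 0-based start positions where pattern occurs in text
    with at most k mismatching positions (brute force, O(mn) time)."""
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        window = text[start:start + m]
        # Count the positions where the window and the pattern differ.
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= k:
            hits.append(start)
    return hits


# Text and pattern taken from Example 1 of these slides.
print(k_mismatch_naive("TTAACGTAATGCAGCTA", "AGCT", 1))  # prints [12]
```

Start position 12 (0-based) corresponds to the window t_13 … t_16 in the 1-based notation of the slides.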

4 Consider the following situation, where the pattern P is aligned with a window W of T and (k+1) mismatches have already been found: [figure]

5 Since there are already (k+1) mismatches, we must move the pattern. The following is obvious: P must be moved far enough that there are at most k mismatches between a suffix S of W and the substring S' of P aligned with it.

6 Our trick is as follows: consider the (k+1)-suffix of W. There are two cases:

7 Case 1: Some character of this (k+1)-suffix also occurs in P, as shown below. Move the pattern so that these characters align. Note that in this situation there are at most k mismatches between the (k+1)-suffix and its corresponding substring of P.

8 Case 2: No such character exists. Move the pattern so that the k-prefix of P aligns with the k-suffix of W, as shown below. In this situation, again, there are at most k mismatches between the k-suffix of W and the k-prefix of P.

9 The generalization of the BM algorithm to the k mismatches problem is very natural: for k=0 the generalized algorithm is exact string matching. Recall that the k mismatches problem asks for all occurrences of P in T such that T and P have different characters in at most k positions of P.

10 We simply scan the pattern from right to left until we have found k+1 mismatches (unsuccessful search) or the beginning of the pattern is reached (successful search).

11 Preprocessing phase for approximate matching: the D_k table. For each end position i = m, m−1, …, m−k and each character a of the alphabet, the value D_k[i, a] is the distance from position i to the rightmost occurrence of a in the pattern to the left of position i. Example: let k=1, m=8, P: GCAGAGAG. [Table: D_1[i=8, a] for GCAGAGAG and D_1[i=7, a] for the prefix GCAGAGA, with columns a ∈ {A, C, G, *}]

12 Algorithm for preprocessing phase. P = p_1 p_2 … p_m, T = t_1 t_2 … t_n.
Preprocessing:
  For each a ∈ Σ Do
    For j = m downto m−k Do
    Begin
      d_k[j, a] ← m;
      Find the occurrence of character a that is closest to p_j on its left. If one is found, compute the distance between its position and j and store it in d_k[j, a].
    End
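The preprocessing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: d_k[i][a] is taken to be the smallest shift s ≥ 1 with p_{i−s} = a, and m if no such s exists; the function name build_dk, the explicit alphabet parameter, and the nested-dict layout are hypothetical choices. The simple double loop runs in O(km) time rather than the O(m + kc) of the paper.

```python
def build_dk(pattern: str, k: int, alphabet: str) -> dict[int, dict[str, int]]:
    """Build d_k[i][a] for the end positions i = m, m-1, ..., m-k (1-based).

    Assumed definition: d_k[i][a] is the smallest s >= 1 such that
    p_{i-s} == a, or m when a does not occur to the left of position i.
    """
    m = len(pattern)
    if not 0 <= k < m:
        raise ValueError("need 0 <= k < len(pattern)")
    table = {}
    for i in range(m, m - k - 1, -1):
        row = {a: m for a in alphabet}      # default shift is m
        for s in range(1, i):               # walk leftwards from position i-1
            a = pattern[i - s - 1]          # p_{i-s} in 0-based indexing
            if row.get(a) == m:             # keep only the nearest occurrence
                row[a] = s
        table[i] = row
    return table


d1 = build_dk("AGCT", 1, "ACGT")
print(d1[4])  # {'A': 3, 'C': 1, 'G': 2, 'T': 4}
print(d1[3])  # {'A': 2, 'C': 4, 'G': 1, 'T': 4}
```

With this definition, the pattern GCAGAGAG gives d_1[8][G] = 2 and d_1[7][A] = 2, which agrees with the shift from j = 13 to j = 15 in Example 2.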

13 Algorithm for searching phase. P = p_1 p_2 … p_m, T = t_1 t_2 … t_n.
Searching:
  j = m;
  While j ≤ n + k Do
  Begin
    h = j; i = m; mismatch = 0;
    While i > 0 and mismatch ≤ k Do
    Begin
      d = min(d_k[i, t_h], d_k[i−1, t_{h−1}]);
      If t_h ≠ p_i Then mismatch = mismatch + 1;
      i = i − 1; h = h − 1
    End of while;
    If mismatch ≤ k Then report match at position j;
    j = j + d
  End of while
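A runnable sketch of the searching phase for general k is given below. It departs from the slide in three ways, all of which are assumptions of this illustration: the shift is the minimum of d_k over the last k+1 window positions, capped at m−k; the outer loop stops at j ≤ n so that every window lies inside the text; and the table is rebuilt inline so the block is self-contained. Reported positions are 1-based end positions j, as on the slides.

```python
def k_mismatch_bmh(text: str, pattern: str, k: int) -> list[int]:
    """Return the 1-based end positions j of windows t_{j-m+1}..t_j that
    match pattern with at most k mismatches, using d_k-based shifts."""
    n, m = len(text), len(pattern)
    if not 0 <= k < m:
        raise ValueError("need 0 <= k < len(pattern)")

    # d[i][a]: smallest s >= 1 with p_{i-s} == a; a missing key means m.
    d = {}
    for i in range(m - k, m + 1):
        row = {}
        for s in range(1, i):
            row.setdefault(pattern[i - s - 1], s)
        d[i] = row

    hits = []
    j = m
    while j <= n:
        h, i, mismatches = j, m, 0
        shift = m - k                       # largest shift ever taken
        while i > 0 and mismatches <= k:
            if i >= m - k:                  # only the last k+1 positions vote
                shift = min(shift, d[i].get(text[h - 1], m))
            if text[h - 1] != pattern[i - 1]:
                mismatches += 1
            i -= 1
            h -= 1
        if mismatches <= k:
            hits.append(j)
        j += shift
    return hits


print(k_mismatch_bmh("TTAACGTAATGCAGCTA", "AGCT", 1))             # prints [16]
print(k_mismatch_bmh("GCATCGCAGAGAGTATGCAGAGCG", "GCAGAGAG", 1))  # prints [13, 24]
```

The shift is safe because any smaller shift would leave each of the last k+1 text characters of the window facing a different pattern character, which already gives k+1 mismatches.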

14 Complete example for approximate string matching. Example 1: let k=1, m=4, n=17. T: TTAACGTAATGCAGCTA, P: AGCT. [Table: D_1[i=4, a] and D_1[i=3, a] for a ∈ Σ = {A, C, G, T}]

15-19 Example 1 (1/6) to (5/6): [figures: successive alignments of P against T, with the D_1 table]

20 Example 1 (6/6): [figure: final alignment of P against T] j = 16 + 3 = 19, so we jump out of the while loop.

21 Example 2: let k=1, m=8, n=24. T: GCATCGCAGAGAGTATGCAGAGCG, P: GCAGAGAG. [Table: D_1[i=8, a] and D_1[i=7, a] for a ∈ {A, C, G, *}]

22-24 Example 2 (1/14), (3/14), (4/14): [figures: successive alignments of P against T, with the D_1 table]

25 Example 2 (5/14): [figure] Report a match at position j = 13; then j = 13 + 2 = 15.

26-31 Example 2 (6/14) to (9/14), (11/14), (13/14): [figures: successive alignments of P against T, with the D_1 table]

32 Example 2 (14/14): [figure] Report a match at position j = 24; then j = 24 + 2 = 26, so we jump out of the while loop.

33 Time complexity: the preprocessing phase takes O(m + kc) time and O(kc) space, where c is the size of the alphabet. The searching phase takes O(mn) time.

34 Definition of the k differences problem: Given a pattern string P of length m and a text string T of length n, we would like to find all approximate occurrences of P in T with edit distance not larger than k.

35 The basic approach to solving the problem is to find the edit distance for T(1, i) and P for every i [Ukk85b]: let Edit be an (m+1) by (n+1) table such that Edit(i, j) is the minimum edit distance between p_1 p_2 … p_j and any substring of T ending at t_i.

36 In this basic approach, the table Edit must be evaluated completely, column by column, in O(mn) time.
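The column-by-column evaluation of Edit can be sketched as below. This is a minimal illustration of the standard dynamic programming approach under stated assumptions: the boundary values Edit(i, 0) = 0 and Edit(0, j) = j are assumed (they are what allows an occurrence to start anywhere in T), and the function name k_differences_dp is hypothetical. Only two columns are kept in memory.

```python
def k_differences_dp(text: str, pattern: str, k: int) -> list[int]:
    """Return the 1-based positions i such that some substring of text
    ending at t_i has edit distance at most k to pattern."""
    n, m = len(text), len(pattern)
    prev = list(range(m + 1))          # column i = 0: Edit(0, j) = j
    hits = []
    for i in range(1, n + 1):
        cur = [0] * (m + 1)            # Edit(i, 0) = 0: free start in the text
        for j in range(1, m + 1):
            cost = 0 if pattern[j - 1] == text[i - 1] else 1
            cur[j] = min(prev[j] + 1,          # arc from Edit(i-1, j)
                         cur[j - 1] + 1,       # arc from Edit(i, j-1)
                         prev[j - 1] + cost)   # diagonal arc, cost 0 or 1
        if cur[m] <= k:
            hits.append(i)
        prev = cur
    return hits


hits = k_differences_dp("GCATCGCAGAGAGTATGCAGAGCG", "GCAGAGAG", 1)
print(hits)
```

Each text character costs O(m) work, which gives the O(mn) bound stated above; the filtering technique in the following slides aims to avoid evaluating most of these columns.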

37 If we can identify every position i for which Edit(T(1, i), P) cannot be smaller than k, we may skip those positions. This paper is based upon Rule 7 proposed by Professor Lee.

38 Rule 7: If k characters in string A do not appear in string B, then Distance(A, B) is not smaller than k.
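Rule 7 can be illustrated numerically. The sketch below assumes that the k characters are counted with multiplicity (every position of A whose character is absent from B counts once) and that Distance is the unit-cost edit distance; both function names are hypothetical.

```python
def rule7_lower_bound(a: str, b: str) -> int:
    """Count the positions of a whose character occurs nowhere in b.
    By Rule 7 this count is a lower bound on the edit distance."""
    symbols = set(b)
    return sum(1 for ch in a if ch not in symbols)


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete ca
                           cur[j - 1] + 1,            # insert cb
                           prev[j - 1] + (ca != cb))) # match or substitute
        prev = cur
    return prev[-1]


a, b = "AXBYC", "ABC"
print(rule7_lower_bound(a, b), edit_distance(a, b))  # prints 2 2
```

Each character of A that is missing from B must be deleted or substituted by its own edit operation, which is why the count can never exceed the distance.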

39 In the scanning phase, we first define some terms. A diagonal h of Edit, for h = −m, …, n, consists of all entries Edit(i, j) such that i − j = h. For every Edit(i, j), there is a minimizing arc from Edit(i−1, j) to Edit(i, j) if Edit(i, j) = Edit(i−1, j) + 1; from Edit(i, j−1) to Edit(i, j) if Edit(i, j) = Edit(i, j−1) + 1; and from Edit(i−1, j−1) to Edit(i, j) if Edit(i, j) = Edit(i−1, j−1) where p_j = t_i, or if Edit(i, j) = Edit(i−1, j−1) + 1 where p_j ≠ t_i. The costs of the arcs are 1, 1, 0 and 1, respectively.

40 A minimizing path is any path that consists of minimizing arcs and leads from an entry Edit(i, 0) on the first row of Edit to an entry Edit(h, m) on the last row of Edit. A minimizing path is successful if it leads to an entry Edit(h, m) ≤ k.

41 Lemma 1: The entries on a successful minimizing path M are contained in k+1 successive diagonals of Edit. Proof: The path moves to a new diagonal only through an insertion or a deletion. If the path touched more than k+1 diagonals, it would contain more than k insertions or deletions, so its cost would exceed k. Thus a successful minimizing path cannot touch more than k+1 diagonals.

42 Example: T: ABCABBA, P: CBABAC. Alignment: S: C-AB-- against P: CBABAC, so EDIT(P, S) = 3. There are k+1 = 3+1 = 4 successive diagonals because there are three deletions. [figure: Edit table for T and P with the successive diagonals marked]

43 Example: T: BCABDAB, P: CBADB, k = 3. Alignment: S: C-ABDAB against P: CBA-D-B, so EDIT(P, S) = 3. There are 1+2 = 3 < k+1 = 3+1 = 4 successive diagonals because there are one deletion and two insertions. [figure: Edit table for T and P with the successive diagonals marked]

44 By Lemma 1, for each diagonal d, any successful minimizing path starting at the top of this diagonal stays within a band of 1+k+k = 2k+1 diagonals.

45 Result: T: ABCABBA, P: CBABAC, k = 3. Alignment: S: C-AB-- against P: CBABAC, so EDIT(P, S) = 3. The successful minimizing path lies entirely within the band of width 7 of Edit. [figure: Edit table with the band of successive diagonals marked]

46 The part of the pattern that falls within this band of width k around a pattern symbol is called its k-environment. For each j = 1, …, m, let the k-environment of the pattern symbol p_j be the string C_j = p_{j−k} … p_{j+k}, where p_a = ε for a < 1 and a > m.

47 The longest vertical path in any minimizing path has length not greater than 2k+1. We only have to determine whether t_i appears in the k-environment of p_j.

48 Given T = ATGCGAGAGAT, P = GCAGAGAGATG, and k = 2, we select the three characters t_5, t_8 and t_11. The 2-environment checked against t_5 is C_5 = p_3 p_4 p_5 p_6 p_7 = AGAGA. The 2-environment checked against t_8 is C_8 = p_6 p_7 p_8 p_9 p_10 = GAGAT. The 2-environment checked against t_11 is C_11 = p_9 p_10 p_11 = ATG.
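The k-environments in this example can be reproduced with a short helper. This is a minimal sketch; the function name k_environment and the 1-based index argument are assumptions chosen to match the notation of the slides.

```python
def k_environment(pattern: str, j: int, k: int) -> str:
    """Return C_j = p_{j-k} ... p_{j+k} for a 1-based index j.
    Positions outside 1..m stand for the empty string and are dropped."""
    m = len(pattern)
    if not 1 <= j <= m:
        raise ValueError("j must satisfy 1 <= j <= len(pattern)")
    lo = max(1, j - k)          # first pattern position inside the pattern
    hi = min(m, j + k)          # last pattern position inside the pattern
    return pattern[lo - 1:hi]   # convert to a 0-based, end-exclusive slice


P = "GCAGAGAGATG"
print(k_environment(P, 5, 2))   # prints AGAGA
print(k_environment(P, 8, 2))   # prints GAGAT
print(k_environment(P, 11, 2))  # prints ATG
```

Near either end of the pattern the environment is shorter than 2k+1 symbols, as C_11 shows.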

49 We now obtain a stronger version of Rule 7. Lemma 2: Let a successful minimizing path M go through some entry on a diagonal h of Edit. Then for at most k indices j, 1 ≤ j ≤ m, the character t_{h+j} does not occur in the k-environment C_j. A formal proof can be found in the paper. In the following, we give some intuition for it.

50 In this case (k=1), although there are two mismatches, by deleting the character a, which mismatches x, we may achieve a perfect match. Thus the edit distance between T and P may still be 1. [figure]

51 In this case (k=1), it can be seen that deleting one character in P will not result in a perfect match. Thus, the edit distance between T and P must be larger than 1. [figure]

52 The shift table is based on table D_k. We determine the first diagonal after h, say h+d, where at least one of the characters t_{h+m}, t_{h+m−1}, …, t_{h+m−k} matches the corresponding character of P. Finally, the maximum of k+1 and d is the length of the shift.

53 In the algorithm, as soon as a possible occurrence of P in T is found, the dynamic programming approach is used immediately to compute the alignment.

54 Algorithm. Input: P = p_1 p_2 … p_m, T = t_1 t_2 … t_n, and k. Output: all occurrences of P in T.
  Initially, the start position h in T is 0 and i = h + m;
  While i ≤ n + k do
  begin
    j = m; bad = 0;
    While i > k and bad ≤ k do
    begin
      If t_i does not occur in C_j then bad = bad + 1;
      j = j − 1; i = i − 1
    end;
    If bad ≤ k then
      W is the sequence from t_{h−k} to t_{h+m}. Use dynamic programming to align W with P and output the alignment result.
    Compute the shift d = min(D_k[i, t_r], D_k[i−1, t_{r−1}]);
    h = h + max(k+1, d)
  end;
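The inner loop of the algorithm above, which counts the characters that fall outside their k-environment, can be sketched in isolation. This is an illustration of the filtering test only, without the shift computation or the dynamic programming step; the function name count_bad and the parameter end (the 1-based text position aligned with p_m) are hypothetical, and the scan covers every pattern position, as the worked example in these slides does.

```python
def count_bad(text: str, pattern: str, k: int, end: int) -> int:
    """Scan the alignment in which p_m lies under t_end (1-based) from
    right to left and count text characters that do not occur in the
    k-environment C_j of the pattern symbol above them.
    The scan stops as soon as the count exceeds k."""
    m = len(pattern)
    if not m <= end <= len(text):
        raise ValueError("alignment must lie inside the text")
    bad = 0
    i = end                                          # current text position
    for j in range(m, 0, -1):                        # current pattern position
        env = pattern[max(0, j - 1 - k): j + k]      # C_j = p_{j-k} .. p_{j+k}
        if text[i - 1] not in env:
            bad += 1
            if bad > k:                              # no occurrence on this diagonal
                break
        i -= 1
    return bad


T = "GCATCGCAGAGAGTATGCAGAGCG"
P = "GCAGAGAG"
print(count_bad(T, P, 1, 8))   # prints 2: t_7 = C and t_5 = C are bad
print(count_bad(T, P, 1, 13))  # prints 0: candidate, verify with dynamic programming
```

An alignment is handed to the dynamic programming step only when the returned count is at most k; otherwise the pattern is shifted.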

55 Complete example for approximate string matching. For example, let k=1, m=8, n=24. T: GCATCGCAGAGAGTATGCAGAGCG, P: GCAGAGAG. [Table: D_1[i=8, a] and D_1[i=7, a] for a ∈ {A, C, G, *}]

56 Example (1/15), k=1. [figure: P aligned with t_1 … t_8] t_8 = A appears in P(7,8); t_7 = C does not appear in P(6,8); t_6 = G appears in P(5,7); t_5 = C does not appear in P(4,6). bad > k: shifting is needed now.

57 Example (2/15), k=1. [figure: P aligned with t_2 … t_9] t_9 = G appears in P(7,8); t_8 = A appears in P(6,8); t_7 = C does not appear in P(5,7); t_6 = G appears in P(4,6); t_5 = C does not appear in P(3,5). bad > k: shifting is needed now.

58 Example (3/15), k=1. [figure: P aligned with t_4 … t_11] t_11 = G appears in P(7,8); t_10 = A appears in P(6,8); t_9 = G appears in P(5,7); t_8 = A appears in P(4,6); t_7 = C does not appear in P(3,5); t_6 = G appears in P(2,4); t_5 = C appears in P(1,3); t_4 = T does not appear in P(1,2). bad > k: shifting is needed now.

59 Example (4/15), k=1. W = CGCAGAGAGT, P = GCAGAGAG. [figure: dynamic programming table for W and P] Output: GCAGAGAG aligned with GCAGAGAG.

60 Example (5/15). W = CGCAGAGAGT, P = GCAGAGAG. [figure: dynamic programming table for W and P] Output: GCAGAGA- aligned with GCAGAGAG.

61 Example (6/15). W = CGCAGAGAGT, P = GCAGAGAG. [figure: dynamic programming table for W and P] Output: -CAGAGAG aligned with GCAGAGAG.

62 Example (7/15). W = CGCAGAGAGT, P = GCAGAGAG. [figure: dynamic programming table for W and P] Output: CGCAGAGAG aligned with -GCAGAGAG.

63 Example (8/15). W = CGCAGAGAGT, P = GCAGAGAG. [figure: dynamic programming table for W and P] Output: GCAGAGAGT aligned with GCAGAGAG-.

64 Example (9/15), k=1. [figure: P aligned with t_8 … t_15] t_15 = A appears in P(7,8); t_14 = T does not appear in P(6,8); t_13 = G appears in P(5,7); t_12 = A appears in P(4,6); t_11 = G appears in P(3,5); t_10 = A appears in P(2,4); t_9 = G appears in P(1,3); t_8 = A does not appear in P(1,2). bad > k: shifting is needed now.

65 Example (10/15), k=1. [figure: P aligned with t_9 … t_16] t_16 = T does not appear in P(7,8); t_15 = A appears in P(6,8); t_14 = T does not appear in P(5,7). bad > k: shifting is needed now.

66 Example (11/15), k=1. [figure: P aligned with t_11 … t_18] t_18 = C does not appear in P(7,8); t_17 = G appears in P(6,8); t_16 = T does not appear in P(5,7). bad > k: shifting is needed now.

67 Example (12/15), k=1. [figure: P aligned with t_12 … t_19] t_19 = A appears in P(7,8); t_18 = C does not appear in P(6,8); t_17 = G appears in P(5,7); t_16 = T does not appear in P(4,6). bad > k: shifting is needed now.

68 Example (13/15), k=1. [figure: P aligned with t_13 … t_20] t_20 = G appears in P(7,8); t_19 = A appears in P(6,8); t_18 = C does not appear in P(5,7); t_17 = G appears in P(4,6); t_16 = T does not appear in P(3,5). bad > k: shifting is needed now.

69 Example (14/15), k=1. [figure: P aligned with t_15 … t_22] t_22 = G appears in P(7,8); t_21 = A appears in P(6,8); t_20 = G appears in P(5,7); t_19 = A appears in P(4,6); t_18 = C does not appear in P(3,5); t_17 = G appears in P(2,4); t_16 = T does not appear in P(1,3). bad > k: shifting is needed now.

70 Example (15/15), k=1. W = TGCAGAGCG, P = GCAGAGAG. [figure: dynamic programming table for W and P] Result: GCAGAGCG aligned with GCAGAGAG. Jump out of the while loop.

71 Time complexity: the preprocessing phase and the searching phase take O(mn/k) time and O(|Σ|n) space.

72 References
[Bae89a] R. Baeza-Yates, Efficient Text Searching. Ph.D. Thesis, Report CS-89-17, University of Waterloo, Computer Science Department.
[Bae89b] R. Baeza-Yates, String searching algorithms revisited. In: Proceedings of the Workshop on Algorithms and Data Structures (ed. F. Dehne et al.), Lecture Notes in Computer Science 382, Springer-Verlag, Berlin, 1989, pp. 75–96.
[BoM77] R. Boyer and S. Moore, A fast string searching algorithm. Communications of the ACM, Vol. 20, 1977, pp. 762–772.
[ChL90] W. Chang and E. Lawler, Approximate string matching in sublinear expected time. In: Proceedings of the 31st IEEE Annual Symposium on Foundations of Computer Science, 1990, pp. 116–124.
[Fel65] W. Feller, An Introduction to Probability Theory and Its Applications. Vol. I. John Wiley & Sons, 1965.

73 References
[Fel66] W. Feller, An Introduction to Probability Theory and Its Applications. Vol. II. John Wiley & Sons.
[GaG86] Z. Galil and R. Giancarlo, Improved string matching with k mismatches. SIGACT News, Vol. 17, 1986, pp. 52–54.
[GaG88] Z. Galil and R. Giancarlo, Data structures and algorithms for approximate string matching. Journal of Complexity, Vol. 4, 1988, pp. 33–72.
[GaP89] Z. Galil and K. Park, An improved algorithm for approximate string matching. In: Proceedings of the 16th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science 372, Springer-Verlag, Berlin, 1989, pp. 394–404.
[GrL89] R. Grossi and F. Luccio, Simple and efficient string matching with k mismatches. Information Processing Letters, Vol. 33, 1989, pp. 113–120.
[Hor80] N. Horspool, Practical fast searching in strings. Software Practice & Experience, Vol. 10, 1980, pp. 501–506.
[JTU90] P. Jokinen, J. Tarhio and E. Ukkonen, A comparison of approximate string matching algorithms. In preparation.
[Kos88] S. R. Kosaraju, Efficient string matching. Extended abstract. Johns Hopkins University, 1988.

74 References
[KMP77] D. Knuth, J. Morris and V. Pratt, Fast pattern matching in strings. SIAM Journal on Computing, Vol. 6, 1977, pp. 323–350.
[LaV88] G. Landau and U. Vishkin, Fast string matching with k differences. Journal of Computer and System Sciences, Vol. 37, 1988, pp. 63–78.
[LaV89] G. Landau and U. Vishkin, Fast parallel and serial approximate string matching. Journal of Algorithms, Vol. 10, 1989, pp. 157–169.
[Sel80] P. Sellers, The theory and computation of evolutionary distances: Pattern recognition. Journal of Algorithms, Vol. 1, 1980, pp. 359–372.
[Ukk85a] E. Ukkonen, Algorithms for approximate string matching. Information and Control, Vol. 64, 1985, pp. 100–118.
[Ukk85b] E. Ukkonen, Finding approximate patterns in strings. Journal of Algorithms, Vol. 6, 1985, pp. 132–137.
[UkW90] E. Ukkonen and D. Wood, Fast approximate string matching with suffix automata. Report A , Department of Computer Science, University of Helsinki.
[WaF75] R. Wagner and M. Fischer, The string-to-string correction problem. Journal of the ACM, Vol. 21, 1975, pp. 168–173.

75 THANK YOU