The Rabin-Karp Algorithm String Matching Jonathan M. Elchison 19 November 2004 CS-3410 Algorithms Dr. Shomper.


1 The Rabin-Karp Algorithm String Matching Jonathan M. Elchison 19 November 2004 CS-3410 Algorithms Dr. Shomper

2 Background: String matching. Naïve method: let n ≡ size of the input string and m ≡ size of the pattern to be matched. The running time is O( (n-m+1)m ), which is Θ( n² ) if m = floor( n/2 ). We can do better.
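As a point of comparison, the naïve method can be sketched as follows. This is a minimal illustration, not part of the original slides; the function name naive_match is a hypothetical choice, and shifts are reported 0-based.

```python
def naive_match(text, pattern):
    """Return every 0-based shift s at which pattern occurs in text.

    Each of the n - m + 1 shifts is compared character by character,
    which is where the O((n - m + 1) * m) bound comes from.
    """
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        # The slice comparison costs up to m character comparisons.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts


if __name__ == "__main__":
    print(naive_match("abracadabra", "abra"))
```

Every shift pays for a full comparison here, regardless of how unlikely a match is; the hashing scheme on the following slides is meant to avoid that cost for most shifts.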

3 How it works: Consider a hashing scheme. Each symbol in the alphabet Σ can be represented by an ordinal value in { 0, 1, 2, ..., d-1 }, where |Σ| = d. These are “radix-d digits”.

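4 How it works: Hash the pattern P into a numeric value. For illustration, let a string be represented by the sum of its digits; the full algorithm instead weights each digit by a power of d and evaluates the result with Horner’s rule (§ 30.1). Example: { A, B, C, ..., Z } → { 0, 1, 2, ..., 25 }. BAN → 1 + 0 + 13 = 14. CARD → 2 + 0 + 17 + 3 = 22.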
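The two ways of turning a string into a number can be sketched side by side. This is an illustrative sketch only: the helper names to_digits, digit_sum, and horner_value are hypothetical, and the mapping assumes uppercase A to Z with A = 0.

```python
def to_digits(s):
    """Map uppercase letters to radix-26 digits: A -> 0, B -> 1, ..., Z -> 25."""
    return [ord(ch) - ord("A") for ch in s]


def digit_sum(s):
    """Simplified hash used in the slide examples: the sum of the digits."""
    return sum(to_digits(s))


def horner_value(s, d=26):
    """Positional radix-d value of s, evaluated with Horner's rule."""
    value = 0
    for x in to_digits(s):
        value = d * value + x
    return value


if __name__ == "__main__":
    print(digit_sum("BAN"), digit_sum("CARD"))        # 14 22
    print(horner_value("BAN"), horner_value("CARD"))  # 689 35597
```

The digit sum ignores symbol order, so anagrams always share a value; the positional value does not have that weakness, but it grows by a factor of d with each added symbol.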

5 Upper limits. Problem: for long patterns, or for large alphabets, the number representing a given string may be too large to be practical. Solution: use the MOD operation; when taken MOD q, values will be < q. Example: BAN = 1 + 0 + 13 = 14 and 14 mod 13 = 1, so BAN → 1. CARD = 2 + 0 + 17 + 3 = 22 and 22 mod 13 = 9, so CARD → 9.
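A small sketch of the MOD step, under stated assumptions: the helper names are hypothetical, letters are mapped with A = 0, and the modulus q = 101 used for the positional hash is an arbitrary small prime chosen for illustration. The first function reproduces the slide's digit-sum values; the second reduces after every Horner step so that intermediate values never grow with the pattern length.

```python
def to_digits(s):
    """Map uppercase letters to radix-26 digits: A -> 0, ..., Z -> 25."""
    return [ord(ch) - ord("A") for ch in s]


def digit_sum_mod(s, q=13):
    """Slide example: sum of digits, reduced mod q."""
    return sum(to_digits(s)) % q


def horner_mod(s, d=26, q=101):
    """Positional radix-d value of s mod q, reducing after each step.

    Every intermediate value is at most d * (q - 1) + (d - 1), which is
    less than d * q, no matter how long s is.
    """
    value = 0
    for x in to_digits(s):
        value = (d * value + x) % q
    return value


if __name__ == "__main__":
    print(digit_sum_mod("BAN"), digit_sum_mod("CARD"))  # 1 9
    print(horner_mod("BAN"), horner_mod("CARD"))        # 83 45
```

Python integers do not overflow, so the reduction is not strictly needed here; in a language with fixed-width integers it is what keeps the arithmetic inside one machine word.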

6 Searching

7 Spurious Hits. Question: does a hash value match mean that the patterns match? Answer: no; these are called “spurious hits”. Possible cases: (1) The MOD operation interfered with uniqueness of hash values: 14 mod 13 = 1 and 27 mod 13 = 1. The MOD value q is usually chosen as a prime such that d·q just fits within one computer word (10q when the digits are decimal). (2) Information is lost in generalization (addition): BAN → 1 + 0 + 13 = 14 and CAM → 2 + 0 + 12 = 14.
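Both kinds of collision can be reproduced with the slide's simplified digit-sum hash. The sketch below is illustrative only; classify and its helpers are hypothetical names, and the string "BBZ" is an example picked here because its digits sum to 27.

```python
def to_digits(s):
    """Map uppercase letters to digits: A -> 0, ..., Z -> 25."""
    return [ord(ch) - ord("A") for ch in s]


def digit_sum_hash(s, q=13):
    """Simplified slide hash: sum of digits, reduced mod q."""
    return sum(to_digits(s)) % q


def classify(window, pattern, q=13):
    """Label a text window as 'no hit', 'match', or 'spurious hit'."""
    if digit_sum_hash(window, q) != digit_sum_hash(pattern, q):
        return "no hit"
    # Equal hashes are only a hint; the strings must still be compared.
    return "match" if window == pattern else "spurious hit"


if __name__ == "__main__":
    print(classify("BAN", "BAN"))  # match
    print(classify("CAM", "BAN"))  # spurious hit: both sums are 14
    print(classify("BBZ", "BAN"))  # spurious hit: 27 mod 13 == 14 mod 13
    print(classify("ZZZ", "BAN"))  # no hit: 75 mod 13 == 10
```

This is why the matcher always follows a hash hit with an explicit comparison of the window against the pattern.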

8 Code
RABIN-KARP-MATCHER( T, P, d, q )
    n ← length[ T ]
    m ← length[ P ]
    h ← d^(m-1) mod q
    p ← 0
    t_0 ← 0
    for i ← 1 to m                                ► Preprocessing
        do p ← ( d*p + P[ i ] ) mod q
           t_0 ← ( d*t_0 + T[ i ] ) mod q
    for s ← 0 to n – m                            ► Matching
        do if p = t_s
              then if P[ 1..m ] = T[ s+1..s+m ]
                      then print “Pattern occurs with shift” s
           if s < n – m
              then t_(s+1) ← ( d * ( t_s – T[ s+1 ] * h ) + T[ s+m+1 ] ) mod q
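The pseudocode above translates fairly directly into Python. The following is a sketch under assumptions rather than a definitive implementation: the function name rabin_karp, the defaults d = 256 and q = 101, and the use of ord() as the digit value of a character are all choices made here for illustration. Indices are 0-based, and matching shifts are returned in a list instead of being printed.

```python
def rabin_karp(text, pattern, d=256, q=101):
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    h = pow(d, m - 1, q)  # d^(m-1) mod q, weight of the leading digit
    p = 0                 # hash of the pattern
    t = 0                 # hash of the current text window

    # Preprocessing: Horner's rule, reduced mod q at every step.
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q

    shifts = []
    for s in range(n - m + 1):
        # A hash hit may be spurious, so confirm with a real comparison.
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # Rolling update: drop text[s], shift left, append text[s + m].
            # Python's % returns a non-negative result for positive q.
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return shifts


if __name__ == "__main__":
    # Decimal text with d = 10 and q = 13; the window "67399" hashes like
    # the pattern but is rejected by the string comparison.
    print(rabin_karp("2359023141526739921", "31415", d=10, q=13))  # [6]
```

The rolling update costs constant time per shift, so the m-character comparison is paid only on hash hits.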

9 Performance. Preprocessing (determining the initial hashes of the pattern and the first text window): Θ( m ). Worst case running time: Θ( (n-m+1)m ), no better than the naïve method. Expected case: if we assume the number of hits is constant compared to n, we expect O( n ), because only hash “hits” are compared against the pattern, not all shifts.
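The worst case is easy to provoke: when the text and pattern consist of one repeated symbol, every shift is a hit and every hit costs a full comparison. The sketch below counts hash hits for a given input; count_hash_hits is a hypothetical helper, d = 256 and q = 101 are arbitrary illustrative parameters, and window hashes are recomputed from scratch for clarity instead of being rolled.

```python
def count_hash_hits(text, pattern, d=256, q=101):
    """Count the shifts whose window hash equals the pattern hash.

    Each such shift triggers an m-character comparison in the matcher,
    so this count drives the running time beyond the O(n) hashing work.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0

    def window_hash(s):
        value = 0
        for ch in s:
            value = (d * value + ord(ch)) % q
        return value

    p = window_hash(pattern)
    return sum(1 for s in range(n - m + 1)
               if window_hash(text[s:s + m]) == p)


if __name__ == "__main__":
    # All n - m + 1 = 16 shifts are hits, so the matcher does 16 * 5
    # character comparisons on top of the hashing.
    print(count_hash_hits("a" * 20, "a" * 5))  # 16
```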

10 Demonstration: http://www-igm.univ-mlv.fr/~lecroq/string/node5.html

11 Sources: Cormen, Thomas H., et al. Introduction to Algorithms. 2nd ed. Cambridge: MIT Press, 2001. Karp-Rabin algorithm. 15 Jan 1997. Shomper, Keith. “Rabin-Karp Animation.” E-mail to Jonathan Elchison. 12 Nov 2004.

