Published by Mildred Perry. Modified over 9 years ago.
1 String Matching with Bit-Parallel Suffix Automata
2 Suffix Automata
Based on a Directed Acyclic Word Graph (DAWG), which makes it easy to compare equivalent suffix strings.
A nondeterministic suffix automaton is turned into a deterministic suffix automaton by subset construction.
3 Suffix Automata Search
Also called Backward DAWG Matching (BDM).
For a factor x of the pattern p, endpos(x) is the set of all pattern positions where an occurrence of x ends. Example: for the pattern baabbaa, endpos(aa) = {3, 7}.
The text is scanned left to right with a window whose size equals the pattern length.
When matching a factor fails, the window is shifted. A shift is safe if the pattern has no equivalent suffix.
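The endpos definition on this slide can be checked with a few lines of Python. This is a minimal brute-force sketch, not how a suffix automaton computes it; the function name `endpos` and the 1-based position convention are assumptions chosen to match the slide's example.

```python
def endpos(pattern: str, x: str) -> set[int]:
    """Return the 1-based positions in `pattern` where an occurrence of `x` ends."""
    k = len(x)
    return {
        i + k                                  # end position, counted from 1
        for i in range(len(pattern) - k + 1)   # every possible start (0-based)
        if pattern[i:i + k] == x
    }


# The slide's example: pattern baabbaa, factor aa.
print(endpos("baabbaa", "aa"))
# Factors with the same endpos set are equivalent and share one automaton state.
print(endpos("baabbaa", "aa") == endpos("baabbaa", "baa"))
```

In this pattern the factors aa and baa end at the same positions, so a deterministic suffix automaton would represent them with a single state.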
4 BDM Algorithm
Annotated steps: build the automaton; reached the final state.
5 Suffix Automata Search Example
1. Build the deterministic suffix automaton of the reversed pattern.
2. Use endpos(x) to find a factor.
3. On failing to find a factor, do a safe shift.
6 Suffix Automata Search Example
Figure (repeated on slides 6 to 16): deterministic suffix automaton of p^r, with states 0 to 7 and transitions on a and b.
1. T = [abbaba a]bbaab. a is a factor of p^r and the reverse of a prefix of p. last = 6.
7 Suffix Automata Search Example
2. T = [abbab aa]bbaab. aa is a factor of p^r and the reverse of a prefix of p. last = 5.
8 Suffix Automata Search Example
3. T = [abba baa]bbaab. aab is a factor of p^r.
9 Suffix Automata Search Example
4. T = [abb abaa]bbaab. The next a is not recognized, so the window is shifted to last. The search resumes at T = abbab[aabbaab], with last = 7.
10 Suffix Automata Search Example
5. T = abbab[aabbaa b]. b is a factor of p^r.
11 Suffix Automata Search Example
6. T = abbab[aabba ab]. ba is a factor of p^r.
12 Suffix Automata Search Example
7. T = abbab[aabb aab]. baa is a factor of p^r and the reverse of a prefix of p. last = 4.
13 Suffix Automata Search Example
8. T = abbab[aab baab]. baab is a factor of p^r.
14 Suffix Automata Search Example
9. T = abbab[aa bbaab]. baabb is a factor of p^r.
15 Suffix Automata Search Example
10. T = abbab[a abbaab]. baabba is a factor of p^r.
16 Suffix Automata Search Example
11. T = abbab[aabbaab]. The word aabbaab is recognized and an occurrence is reported.
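The walkthrough above can be reproduced with a short Python sketch. It is only an illustration of the window and shift logic: instead of a suffix automaton it uses a plain set of all factors of p, which is an assumption made for brevity and is far less efficient. It also compares forward substrings against factors of p, which is equivalent to feeding the reversed characters to an automaton for p^r. The name `bdm_search_naive` is hypothetical.

```python
def bdm_search_naive(pattern: str, text: str) -> tuple[list[int], list[int]]:
    """Backward factor search; returns (0-based match positions, shifts taken)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], []
    # Stand-in for the suffix automaton: every factor (substring) of the pattern.
    factors = {pattern[i:j] for i in range(m) for j in range(i + 1, m + 1)}
    hits, shifts = [], []
    pos = 0
    while pos <= n - m:
        j = m          # number of window characters not yet read
        last = m       # shift to apply when this window is abandoned
        while j > 0:
            read = text[pos + j - 1:pos + m]   # window suffix read so far
            if read not in factors:
                break                          # not a factor: stop reading
            j -= 1
            if pattern.startswith(read):       # a prefix of p was recognized
                if j > 0:
                    last = j                   # remember the safe shift
                else:
                    hits.append(pos)           # the whole window equals p
        shifts.append(last)
        pos += last
    return hits, shifts


hits, shifts = bdm_search_naive("aabbaab", "abbabaabbaab")
print(hits, shifts)
```

On the example text the sketch takes a first shift of 5 (the value of last after step 2) and then reports the occurrence in the second window, where last was updated to 4 at step 7.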
17 BNDM Algorithm
Backward Nondeterministic DAWG Matching (BNDM).
Handles character classes, multiple patterns, and matching with errors.
Uses bit parallelism, combining Shift-Or and BDM.
20% to 25% faster than BDM and 10% to 40% faster than BM (Boyer-Moore).
Update function (formula shown as an image on the slide).
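A minimal Python sketch of the basic BNDM search may help make the bit-parallel simulation concrete. It follows the commonly published form of the algorithm (AND the state with the mask of the character read, test the highest bit, then shift left); the function name `bndm_search` and the 0-based reporting are assumptions, and because Python integers are unbounded the state is masked to m bits explicitly.

```python
def bndm_search(pattern: str, text: str) -> list[int]:
    """Return the 0-based start positions of `pattern` in `text` using BNDM."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    full = (1 << m) - 1        # m ones: every factor position is still active
    high = 1 << (m - 1)        # highest bit: a prefix of the pattern was read
    # B[c] has bit (m - 1 - j) set when pattern[j] == c.
    B: dict[str, int] = {}
    for j, c in enumerate(pattern):
        B[c] = B.get(c, 0) | (1 << (m - 1 - j))

    hits = []
    pos = 0
    while pos <= n - m:
        j, last, D = m, m, full
        while D != 0:
            D &= B.get(text[pos + j - 1], 0)   # read the window right to left
            j -= 1
            if D & high:
                if j > 0:
                    last = j                   # prefix found: remember the shift
                else:
                    hits.append(pos)           # whole window matched
                    break
            D = (D << 1) & full                # keep the state within m bits
        pos += last
    return hits


print(bndm_search("aabbaab", "abbabaabbaab"))
```

On the text from the earlier example the sketch shifts by 5 after the first window and reports the occurrence in the second one, matching the BDM walkthrough. The pattern length is limited by the machine word in a C implementation; Python hides that limit.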
18 BNDM Algorithm
19 BNDM Example
20 BNDM Example
21 BNDM Further Improvement
Handling long patterns: partition the pattern p into subpatterns p_i. Build an array of D and B and process each part with the basic algorithm. If p_i is found, then process p_{i+1}, and so on.
Handling classes: only the B table is modified. The ith bit is set for all characters belonging to the ith position of the pattern.
Multiple patterns: two methods. Either interleave the patterns and shift r bits for each D update, or just concatenate them and shift 1 bit, with the modified update D = (D << 1) & (1^(m-1) 0)^r, where r is the number of patterns.
Approximate matching: use Wu's method.
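The class extension can be sketched by changing only how the B table is built, as the slide states. In this hedged Python sketch each pattern position is given as a string of allowed characters; that input format and the name `bndm_search_classes` are assumptions for illustration, and the search loop is the same basic BNDM loop.

```python
def bndm_search_classes(pattern_classes: list[str], text: str) -> list[int]:
    """BNDM where position j of the pattern accepts any char in pattern_classes[j]."""
    m, n = len(pattern_classes), len(text)
    if m == 0 or m > n:
        return []
    full = (1 << m) - 1
    high = 1 << (m - 1)
    # Only change versus plain BNDM: set the bit of position j for every
    # character that belongs to the class at position j.
    B: dict[str, int] = {}
    for j, cls in enumerate(pattern_classes):
        for c in set(cls):
            B[c] = B.get(c, 0) | (1 << (m - 1 - j))

    hits = []
    pos = 0
    while pos <= n - m:
        j, last, D = m, m, full
        while D != 0:
            D &= B.get(text[pos + j - 1], 0)   # scan the window backward
            j -= 1
            if D & high:
                if j > 0:
                    last = j                   # a pattern prefix ends here
                else:
                    hits.append(pos)           # full match
                    break
            D = (D << 1) & full
        pos += last
    return hits


# Matches "a", then "a" or "b", then "b".
print(bndm_search_classes(["a", "ab", "b"], "aabxabb"))
```

With single-character classes the function behaves like the plain search, which is a quick way to sanity-check it.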
22 Performance Comparison
Times are in hundredths of a second per megabyte.
23 Reference
Gonzalo Navarro and Mathieu Raffinot. A Bit-parallel Approach to Suffix Automata: Fast Extended String Matching. In M. Farach (editor), Proc. CPM'98, LNCS 1448, pages 14-33, 1998.
Gonzalo Navarro and Mathieu Raffinot. Fast and Flexible String Matching by Combining Bit-parallelism and Suffix Automata. 1998.
24 Reverse Pattern?