Exact string matching: one pattern (text on-line) 24/02/15
Experimental efficiency (Navarro & Raffinot)
BNDM: Backward Nondeterministic Dawg Matching
BOM: Backward Oracle Matching
[Figure: map of the fastest algorithm (Horspool, BOM or BNDM); axes: |Σ| (2 to 64) and pattern length (2 to 256, with w marked)]
Multiple string matching
[Figure: four panels, for 5, 10, 100 and 1000 strings, showing the fastest algorithm among Wu-Manber, SBOM and Ad AC (Ad AC does not appear in the 5-string panel); axes: lmin (5 to 45) and |Σ| (2, 4, 8)]
Construct the trie of GTATGTA, GTAT, TAATA, GTGTA
As you have seen this morning ...
[Figure: the trie, built step by step by inserting one pattern after another]
What is the cost?
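The construction can be sketched in a few lines of Python. This is only a minimal sketch, assuming the trie is stored as nested dictionaries; the names `build_trie` and `count_nodes` and the use of the key `None` as an end-of-pattern mark are illustrative choices, not taken from the slides. Each symbol of each pattern is handled once, so with constant-time dictionary lookups the cost is proportional to the total length of the patterns.

```python
def build_trie(words):
    """Return a trie as nested dicts; the key None marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for symbol in word:
            # Reuse the child if it exists, otherwise create it.
            node = node.setdefault(symbol, {})
        node[None] = word
    return root


def count_nodes(node):
    """Count the nodes of the trie, root included."""
    return 1 + sum(count_nodes(child)
                   for key, child in node.items() if key is not None)


trie = build_trie(["GTATGTA", "GTAT", "TAATA", "GTGTA"])
print(sorted(trie))        # symbols leaving the root
print(count_nodes(trie))   # number of nodes, root included
```

GTAT adds no node because it is a prefix of GTATGTA; only its end mark is new.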
Set Horspool algorithm
How is the comparison made? By suffixes, using the trie of all reversed patterns.
[Figure: the window over the text, aligned with the patterns; a symbol a is marked in the text]
What is the next position of the window? We shift until a is aligned with the first a in the trie at a distance of at most lmin; otherwise we shift by lmin.
Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG
1. Construct the trie of GTATGTA, GTAT, TAATA and GTGTA
2. Determine lmin = 4
3. Determine the shift table: A 1, C 4 (lmin), G 2, T 1
4. Find the patterns
[Figure: trie of the reversed patterns]
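Steps 2 and 3 can be sketched as follows. This is a hedged sketch rather than the lecturer's code: the function name `set_horspool_shift` is hypothetical, and it assumes the shift of a symbol is its smallest distance (at least 1) from the end of any pattern, with lmin as the default, which is the reading that reproduces the table on the slide.

```python
def set_horspool_shift(patterns, alphabet):
    """Return (lmin, shift table) for the Set Horspool algorithm."""
    lmin = min(len(p) for p in patterns)
    shift = {c: lmin for c in alphabet}   # default: shift by lmin
    for p in patterns:
        reversed_p = p[::-1]
        # Depth 0 is the last symbol of the pattern and gives no shift.
        for depth in range(1, lmin):
            c = reversed_p[depth]
            shift[c] = min(shift.get(c, lmin), depth)
    return lmin, shift


lmin, shift = set_horspool_shift(["ATGTATG", "TATG", "ATAAT", "ATGTG"], "ACGT")
print(lmin, shift)   # lmin is 4; shifts are A 1, C 4, G 2, T 1
```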
Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG
Shift table: A 1, C 4 (lmin), G 2, T 1
text: ACATGCTATGTGACA…
[Figure: the window sliding over the text, step by step]
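The whole search can be sketched as below, under stated assumptions: the trie of reversed patterns is a nested dictionary with the key `None` marking the end of a pattern, the window is shifted according to its last text symbol, and the function name `set_horspool_search` is illustrative only. The text is the visible part of the slide's example, without the elided continuation.

```python
def set_horspool_search(text, patterns):
    """Return sorted (start, pattern) pairs for all occurrences in text."""
    lmin = min(len(p) for p in patterns)

    # Trie of the reversed patterns; the key None marks the end of a pattern.
    root = {}
    for p in patterns:
        node = root
        for c in reversed(p):
            node = node.setdefault(c, {})
        node[None] = p

    # Shift table: smallest distance (>= 1) of each symbol from a pattern end.
    shift = {}
    for p in patterns:
        reversed_p = p[::-1]
        for depth in range(1, lmin):
            c = reversed_p[depth]
            shift[c] = min(shift.get(c, lmin), depth)

    matches = []
    end = lmin - 1                      # last position of the window
    while end < len(text):
        node, i = root, end
        # Read the text backwards from the end of the window.
        while i >= 0 and text[i] in node:
            node = node[text[i]]
            if None in node:
                matches.append((i, node[None]))
            i -= 1
        end += shift.get(text[end], lmin)
    return sorted(matches)


found = set_horspool_search("ACATGCTATGTGACA",
                            ["ATGTATG", "TATG", "ATAAT", "ATGTG"])
print(found)   # [(6, 'TATG'), (7, 'ATGTG')]
```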
Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG
Shift table: A 1, C 4 (lmin), G 2, T 1
text: ACATGCTATGTGACA…
Is the expected length of the shifts related to the number of patterns?
The more patterns we search for, the shorter the shifts!
Set Horspool algorithm / Wu-Manber algorithm
How can the length of the shifts be increased? By reading blocks of symbols instead of only one!
Given ATGTATG, TATG, ATAAT, ATGTG
1 symbol: A 1, C 4 (lmin), G 2, T 1
2 symbols: AA 1, AC 3 (lmin-L+1, with L the block length), AG 3, AT 1, CA 3, CC 3, CG 3, …
Blocks with a shift shorter than 3: AA 1, AT 1, GT 1, TA 2, TG 2
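The block table can be computed as in the sketch below. It assumes, as the slide's values suggest, that blocks are measured from the end of each pattern, that a block ending exactly at the pattern end gives no shift, and that every other block gets the default lmin-L+1. The name `block_shift_table` and its parameters are hypothetical.

```python
def block_shift_table(patterns, block_len):
    """Return (default shift, table of the blocks with a shorter shift)."""
    lmin = min(len(p) for p in patterns)
    default = lmin - block_len + 1
    shift = {}
    for p in patterns:
        # distance = number of symbols between the block and the pattern end
        for distance in range(1, default):
            end = len(p) - distance
            block = p[end - block_len:end]
            shift[block] = min(shift.get(block, default), distance)
    return default, shift


default, table = block_shift_table(["ATGTATG", "TATG", "ATAAT", "ATGTG"], 2)
print(default)                  # 3
print(sorted(table.items()))    # AA 1, AT 1, GT 1, TA 2, TG 2
```

With `block_len=1` the same function gives the shifts A 1, G 2, T 1 and the default 4 of the one-symbol table.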
Wu-Manber algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG
Shift table: AA 1, AT 1, GT 1, TA 2, TG 2
text: ACATGCTATGTGACATAATA
[Figure: the window sliding over the text, step by step]
But given k patterns, how many symbols should we take? log_|Σ|(2*lmin*k)
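A small sketch of the search with blocks, under loud assumptions: it uses the slide's table (shifts of at least 1, measured from the pattern ends) and verifies every window by direct comparison, whereas the original Wu-Manber algorithm selects candidate patterns through hash tables, which are omitted here. The function names are illustrative. The helper `suggested_block_len` only evaluates the formula above and leaves any rounding to the reader.

```python
import math


def suggested_block_len(alphabet_size, lmin, k):
    """Evaluate log_|Σ|(2*lmin*k), the block length suggested above."""
    return math.log(2 * lmin * k, alphabet_size)


def block_search(text, patterns, block_len=2):
    """Return sorted (start, pattern) pairs, shifting by blocks of symbols."""
    lmin = min(len(p) for p in patterns)
    default = lmin - block_len + 1
    shift = {}
    for p in patterns:
        for distance in range(1, default):
            end = len(p) - distance
            block = p[end - block_len:end]
            shift[block] = min(shift.get(block, default), distance)

    matches = []
    end = lmin - 1                      # last position of the window
    while end < len(text):
        # Simplified verification: compare every pattern ending here.
        for p in patterns:
            start = end - len(p) + 1
            if start >= 0 and text.startswith(p, start):
                matches.append((start, p))
        block = text[end - block_len + 1:end + 1]
        end += shift.get(block, default)
    return sorted(matches)


found = block_search("ACATGCTATGTGACATAATA",
                     ["ATGTATG", "TATG", "ATAAT", "ATGTG"])
print(found)   # [(6, 'TATG'), (7, 'ATGTG'), (14, 'ATAAT')]
```

For the example (|Σ| = 4, lmin = 4, k = 4) the formula gives log_4(32) = 2.5, so blocks of 2 symbols are a plausible choice.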
BOM algorithm (Backward Oracle Matching)
How is the comparison made? With an automaton, the Factor Oracle: check whether the suffix of the window is a factor of any pattern.
[Figure: the window over the text, aligned with the pattern]
What is the next position of the window? The position determined by the last character of the text with a transition in the automaton.
Factor Oracle of k strings
How can we build the Factor Oracle of GTATGTA, GTAA, TAATA and GTGTA?
[Figure: the resulting automaton, with the labels 1,4, 2 and 3 on its states]
Factor Oracle of k strings
Given the Factor Oracle of GTATGTA …
[Figure: the oracle of GTATGTA, built state by state, with label 1]
… we insert GTAA
[Figure: the oracle after inserting GTAA, with the new label 2]
Given the Factor Oracle of GTATGTA and GTAA, we insert TAATA
[Figure: the oracle after inserting TAATA, with the new label 3]
Given the Factor Oracle of GTATGTA, GTAA and TAATA, we insert GTGTA
[Figure: the oracle after inserting GTGTA; label 1 becomes 1,4]
This is the Factor Oracle automaton of GTATGTA, GTAA, TAATA and GTGTA
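For experimenting, here is a hedged sketch of a factor oracle for several strings. Note that it swaps in a different procedure from the one drawn on the slides: instead of inserting the strings one after another, it first builds their trie and then adds the extra transitions in breadth-first order with supply links, which is the construction usually given for SBOM. The resulting automaton therefore need not coincide state for state with the figure. The names `build_factor_oracle` and `accepts` are illustrative.

```python
from collections import deque


def build_factor_oracle(words):
    """Return the transitions of a factor oracle: trans[state][symbol]."""
    # 1. Trie of the words; state 0 is the initial state.
    trans = [{}]
    for w in words:
        q = 0
        for c in w:
            if c not in trans[q]:
                trans.append({})
                trans[q][c] = len(trans) - 1
            q = trans[q][c]

    # 2. Breadth-first pass over the trie edges, adding external transitions.
    children = [dict(t) for t in trans]     # trie edges only
    supply = [None] * len(trans)            # supply link of each state
    queue = deque([0])
    while queue:
        parent = queue.popleft()
        for c, q in children[parent].items():
            down = supply[parent]
            while down is not None and c not in trans[down]:
                trans[down][c] = q          # external transition
                down = supply[down]
            supply[q] = trans[down][c] if down is not None else 0
            queue.append(q)
    return trans


def accepts(trans, word):
    """True if word labels a path from the initial state."""
    q = 0
    for c in word:
        if c not in trans[q]:
            return False
        q = trans[q][c]
    return True


oracle = build_factor_oracle(["GTATGTA", "GTAA", "TAATA", "GTGTA"])
print(accepts(oracle, "ATGT"), accepts(oracle, "GG"))   # True False
```

The oracle accepts every factor of the strings; it may also accept some words that are not factors, which is why the search algorithms built on it verify candidate occurrences.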
SBOM algorithm
How is the comparison made? With an automaton: the Factor Oracle of the reversed patterns, cut to length lmin. Check whether the suffix of the window is a factor of any pattern.
[Figure: the window over the text, aligned with the patterns]
What is the next position of the window? The position determined by the last character of the text with a transition in the automaton.
SBOM algorithm: example
We search for the patterns ATGTATG, TAATG, TAATAAT and AATGTG …
… then we build the Factor Oracle automaton of GTATG, GTAAT, TAATA and GTGTA, of length lmin=5
[Figure: the automaton, with the labels 1, 4, 2 and 3 on its states]
SBOM algorithm: example
Search for ATGTATG, TAATG, TAATAAT and AATGTG
text: ACATGCTAGCTATAATAATGTATG
[Figure: the automaton and the window sliding over the text, step by step]
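The example can be replayed with the sketch below. Its assumptions are stated plainly: following the slide, the oracle is built on the reversed last lmin symbols of each pattern (GTATG, GTAAT, TAATA, GTGTA here); the oracle uses the trie-based breadth-first construction, which may differ from the drawn automaton; and a window that is read completely is verified by comparing every pattern directly. The names `build_oracle` and `sbom_search` are illustrative.

```python
from collections import deque


def build_oracle(words):
    """Factor oracle of a set of words (trie plus external transitions)."""
    trans = [{}]
    for w in words:
        q = 0
        for c in w:
            if c not in trans[q]:
                trans.append({})
                trans[q][c] = len(trans) - 1
            q = trans[q][c]
    children = [dict(t) for t in trans]     # trie edges only
    supply = [None] * len(trans)
    queue = deque([0])
    while queue:
        parent = queue.popleft()
        for c, q in children[parent].items():
            down = supply[parent]
            while down is not None and c not in trans[down]:
                trans[down][c] = q
                down = supply[down]
            supply[q] = trans[down][c] if down is not None else 0
            queue.append(q)
    return trans


def sbom_search(text, patterns):
    """Return sorted (start, pattern) pairs for all occurrences in text."""
    lmin = min(len(p) for p in patterns)
    oracle = build_oracle([p[-lmin:][::-1] for p in patterns])
    matches = []
    start = 0                               # first position of the window
    while start + lmin <= len(text):
        end = start + lmin - 1
        state, i = 0, end
        # Read the window backwards while the oracle has a transition.
        while i >= start and text[i] in oracle[state]:
            state = oracle[state][text[i]]
            i -= 1
        if i < start:
            # Whole window read: verify the patterns ending at `end`.
            for p in patterns:
                begin = end - len(p) + 1
                if begin >= 0 and text.startswith(p, begin):
                    matches.append((begin, p))
            start += 1
        else:
            # text[i..end] is not a factor: the window can start after i.
            start = i + 1
    return sorted(matches)


found = sbom_search("ACATGCTAGCTATAATAATGTATG",
                    ["ATGTATG", "TAATG", "TAATAAT", "AATGTG"])
print(found)   # [(12, 'TAATAAT'), (15, 'TAATG'), (17, 'ATGTATG')]
```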