Exact string matching: one pattern (text on-line)

1 Exact string matching: one pattern (text on-line)
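24/02/15 Experimental efficiency (Navarro & Raffinot). BNDM: Backward Nondeterministic Dawg Matching. BOM: Backward Oracle Matching. [Figure: experimental efficiency map; axes: alphabet size |Σ| (ticks 64, 32, 16, 8, 4, 2) and pattern length (with a mark at w); regions labelled Horspool, BOM and BNDM]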

2 Multiple string matching
[Figure: four experimental efficiency maps, for 5, 10, 100 and 1000 strings; axes: alphabet size |Σ| (ticks 8, 4, 2) and lmin; regions labelled Wu-Manber and SBOM, plus Ad AC in the maps for 10, 100 and 1000 strings]

3 Construct the trie of GTATGTA, GTAT, TAATA, GTGTA
Construct the trie of GTATGTA, GTAT, TAATA, GTGTA. As you have seen this morning ....

4 Construct the trie of GTATGTA, GTAT, TAATA, GTGTA
[Figure: partial trie; visible edge labels: G, A, T]

6 Construct the trie of GTATGTA, GTAT, TAATA, GTGTA
[Figure: partial trie with 13 labelled edges]

7 Construct the trie of GTATGTA, GTAT, TAATA, GTGTA
[Figure: complete trie with 15 labelled edges] What is the cost?
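
The construction on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the node layout (a list of dicts) and the name `build_trie` are assumptions made for the example. Each symbol of each string is handled once, so with constant-time child lookup the cost is proportional to the total length of the strings.

```python
def build_trie(words):
    """Build a trie as a list of nodes (hypothetical layout).

    Each node is a dict with 'children' (symbol -> node index)
    and 'terminal' (indices of the words that end at this node).
    """
    nodes = [{"children": {}, "terminal": set()}]  # node 0 is the root
    for idx, word in enumerate(words):
        cur = 0
        for ch in word:
            nxt = nodes[cur]["children"].get(ch)
            if nxt is None:
                # No edge for this symbol yet: create a new node.
                nodes.append({"children": {}, "terminal": set()})
                nxt = len(nodes) - 1
                nodes[cur]["children"][ch] = nxt
            cur = nxt
        nodes[cur]["terminal"].add(idx)
    return nodes


trie = build_trie(["GTATGTA", "GTAT", "TAATA", "GTGTA"])
print(len(trie) - 1)  # number of edges: 15
```

GTAT adds no node because it is a prefix of GTATGTA, and GTGTA shares the prefix GT, which is why 15 edges are enough for 21 symbols.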

8 Set Horspool algorithm
How is the comparison made? By suffixes, using the trie of all reversed patterns. [Figure: text window aligned with the patterns] What is the next position of the window? Let a be the text character under the last position of the window: we shift until a is aligned with the nearest a in the trie, and never by more than lmin; if there is no such a within lmin symbols, we shift by lmin.

9 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. [Figure: trie of the reversed patterns] 1. Construct the trie of GTATGTA, GTAT, TAATA and GTGTA. 2. Determine lmin.

10 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. [Figure: trie of the reversed patterns] 1. Construct the trie of GTATGTA, GTAT, TAATA and GTGTA. 2. Determine lmin = 4. 3. Determine the shift table: A 1, C 4 (lmin), G and T still empty.

11 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. [Figure: trie of the reversed patterns] 1. Construct the trie of GTATGTA, GTAT, TAATA and GTGTA. 2. Determine lmin = 4. 3. Determine the shift table: A 1, C 4 (lmin), G 2, T still empty.

12 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. [Figure: trie of the reversed patterns] 1. Construct the trie of GTATGTA, GTAT, TAATA and GTGTA. 2. Determine lmin = 4. 3. Determine the shift table: A 1, C 4 (lmin), G 2, T 1. 4. Find the patterns.
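
Step 3 can be reproduced with a short Python sketch. It assumes the usual Horspool rule extended to a set: for each symbol, take the smallest distance from one of its occurrences (the last position of each pattern excluded) to the end of that pattern, and never more than lmin. The function name and the explicit `alphabet` argument are illustrative choices.

```python
def horspool_set_shifts(patterns, alphabet):
    """Shift table for a set of patterns (a sketch of step 3)."""
    lmin = min(len(p) for p in patterns)
    # Symbols that occur in no pattern keep the default shift lmin.
    shift = {c: lmin for c in alphabet}
    for p in patterns:
        m = len(p)
        # The last symbol of the pattern is skipped on purpose:
        # aligning it with itself would give a shift of 0.
        for i, c in enumerate(p[:-1]):
            shift[c] = min(shift.get(c, lmin), m - 1 - i)
    return lmin, shift


patterns = ["ATGTATG", "TATG", "ATAAT", "ATGTG"]
print(horspool_set_shifts(patterns, "ACGT"))
# (4, {'A': 1, 'C': 4, 'G': 2, 'T': 1})
```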

13 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. Shift table: A 1, C 4 (lmin), G 2, T 1. text: ACATGCTATGTGACA… [Figure: trie of the reversed patterns and the search window sliding over the text]
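
The search over this text can be traced with the following Python sketch. It is an assumption-laden illustration rather than the lecture's implementation: the trie of reversed patterns is stored as nested dicts, the key `None` marks the end of a pattern, and the text is taken without the trailing "…".

```python
def set_horspool(patterns, text):
    """Return (start, pattern) pairs for every occurrence in text."""
    lmin = min(len(p) for p in patterns)

    # Shift table: distance from a symbol to the end of a pattern,
    # last position excluded, capped at lmin.
    shift = {}
    for p in patterns:
        for i, c in enumerate(p[:-1]):
            shift[c] = min(shift.get(c, lmin), len(p) - 1 - i)

    # Trie of the reversed patterns as nested dicts.
    root = {}
    for p in patterns:
        node = root
        for c in reversed(p):
            node = node.setdefault(c, {})
        node[None] = p  # a pattern ends here

    matches = []
    pos = lmin - 1  # index of the last symbol of the window
    while pos < len(text):
        node, j = root, pos
        # Read the text backwards for as long as the trie allows.
        while j >= 0 and text[j] in node:
            node = node[text[j]]
            j -= 1
            if None in node:
                matches.append((j + 1, node[None]))
        pos += shift.get(text[pos], lmin)
    return matches


patterns = ["ATGTATG", "TATG", "ATAAT", "ATGTG"]
print(set_horspool(patterns, "ACATGCTATGTGACA"))
# [(6, 'TATG'), (7, 'ATGTG')]
```

On this text the window ends at positions 3, 4, 6, 7, 8, 9, 11 and 13 (0-based), so most shifts are of length 1 or 2.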

20 Set Horspool algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. Shift table: A 1, C 4 (lmin), G 2, T 1. text: ACATGCTATGTGACA… The more patterns we search for, the shorter the shifts! Is the expected length of the shifts related to the number of patterns?

21 Set Horspool algorithm Wu-Manber algorithm
How can the length of the shifts be increased? By reading blocks of symbols instead of only one! Given ATGTATG, TATG, ATAAT, ATGTG. 1 symbol: A 1, C 4 (lmin), G 2, T 1. 2 symbols: AA 1, AC (lmin - L + 1, where L is the block length), AG, AT, CA, CC, CG, …

22 Set Horspool algorithm Wu-Manber algorithm
[Animation frame: further entries of the 2-symbol table are filled in]

23 Set Horspool algorithm Wu-Manber algorithm
[Animation frame: further entries of the 2-symbol table are filled in]

24 Set Horspool algorithm Wu-Manber algorithm
[Animation frame: further entries of the 2-symbol table are filled in]

25 Set Horspool algorithm Wu-Manber algorithm
How can the length of the shifts be increased? By reading blocks of symbols instead of only one! Given ATGTATG, TATG, ATAAT, ATGTG. 1 symbol: A 1, C 4 (lmin), G 2, T 1. 2 symbols: AA 1, AC 3 (lmin - L + 1), AG 3, AT 1, CA 3, CC 3, CG 3, … Keeping only the blocks whose shift is below the default of 3: AA 1, AT 1, GT 1, TA 2, TG 2.
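
The 2-symbol table above can be recomputed with a small Python sketch. It follows the rule the slide appears to use (a block version of the Horspool shift, computed over the whole patterns with the final block excluded); this is a reading of the slide, not the published Wu-Manber implementation, which also relies on hash tables. The function name `block_shifts` is illustrative.

```python
def block_shifts(patterns, block_len):
    """Shift table over blocks of block_len symbols (slide-style sketch).

    Returns the default shift and a dict holding only the blocks
    that occur in some pattern before its final block.
    """
    lmin = min(len(p) for p in patterns)
    default = lmin - block_len + 1  # shift for blocks found in no pattern
    table = {}
    for p in patterns:
        m = len(p)
        # Every block except the one that ends the pattern.
        for start in range(m - block_len):
            block = p[start:start + block_len]
            dist = m - (start + block_len)  # symbols after the block
            table[block] = min(table.get(block, default), dist)
    return default, table


patterns = ["ATGTATG", "TATG", "ATAAT", "ATGTG"]
default, table = block_shifts(patterns, 2)
print(default, sorted(table.items()))
# 3 [('AA', 1), ('AT', 1), ('GT', 1), ('TA', 2), ('TG', 2)]
```

With `block_len=1` the same function gives back the single-symbol table A 1, G 2, T 1 with default 4.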

26 Wu-Manber algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. 2-symbol shift table: AA 1, AT 1, GT 1, TA 2, TG 2. text: ACATGCTATGTGACATAATA [Figure: trie of the reversed patterns and the search window sliding over the text]

29 Wu-Manber algorithm
Search for ATGTATG, TATG, ATAAT, ATGTG. 2-symbol shift table: AA 1, AT 1, GT 1, TA 2, TG 2. text: ACATGCTATGTGACATAATA … But given k patterns, how many symbols should we take? $\log_{|\Sigma|}(2 \cdot lmin \cdot k)$
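
As a quick numerical check of this formula, the sketch below evaluates it for the running example (4 patterns, lmin = 4, DNA alphabet of 4 symbols). The function name is hypothetical, and how to round the real-valued result to a whole number of symbols is left open here, since the slide does not say.

```python
import math


def suggested_block_length(alphabet_size, lmin, k):
    """Evaluate log_{|Sigma|}(2 * lmin * k) as a real number."""
    return math.log(2 * lmin * k, alphabet_size)


# Running example: |Sigma| = 4, lmin = 4, k = 4 patterns.
print(suggested_block_length(4, 4, 4))  # about 2.5
```

So blocks of 2 or 3 symbols are in the suggested range for this example, which fits the 2-symbol table used above.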

30 Multiple string matching
[Figure: four experimental efficiency maps, for 5, 10, 100 and 1000 strings; axes: alphabet size |Σ| (ticks 8, 4, 2) and lmin; regions labelled Wu-Manber and SBOM, plus Ad AC in the maps for 10, 100 and 1000 strings]

31 BOM algorithm (Backward Oracle Matching)
What is the next position of the window? How is the comparison made? [Figure: text window aligned with the pattern] Automaton: Factor Oracle. Check whether the suffix is a factor of the pattern. The next position is determined by the last text character that has a transition in the automaton.
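
For a single pattern, the idea on this slide can be sketched in Python as follows. The sketch assumes the standard on-line construction of the factor oracle and applies it to the reversed pattern; the names `build_oracle` and `bom_search` and the example text are made up for the illustration.

```python
def build_oracle(word):
    """Factor oracle of word: list of dicts, state -> {symbol: state}."""
    trans = [dict() for _ in range(len(word) + 1)]
    supply = [-1] * (len(word) + 1)  # supply links, -1 for the root
    for i, c in enumerate(word):
        trans[i][c] = i + 1
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1  # external transition
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans


def bom_search(pattern, text):
    """Start positions of pattern in text (pattern must be non-empty)."""
    m, n = len(pattern), len(text)
    trans = build_oracle(pattern[::-1])
    matches = []
    pos = 0
    while pos <= n - m:
        state, j = 0, m
        # Read the window backwards while the oracle has a transition.
        while j > 0 and state is not None:
            state = trans[state].get(text[pos + j - 1])
            j -= 1
        # The explicit comparison keeps the sketch obviously correct.
        if state is not None and text[pos:pos + m] == pattern:
            matches.append(pos)
        # Move the window just past the symbol that had no transition.
        pos += j + 1
    return matches


print(bom_search("ATGTATG", "ACATGTATGTATGCA"))  # [2, 6]
```

When the backward read fails, the suffix read so far is not a factor of the pattern, so no occurrence can contain it and the window can safely start right after the failing symbol.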

32 Factor Oracle of k strings
How can we build the Factor Oracle of GTATGTA, GTAA, TAATA and GTGTA? [Figure: factor oracle of the four strings; terminal states marked 1,4, 2 and 3]

33 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: oracle under construction; edge labels so far: G, T]

34 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: oracle under construction; edge labels so far: G, T, A, T]

35 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: oracle under construction; edge labels so far: G, T, A, T, T, A]

36 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: oracle under construction; edge labels so far: G, T, A, T, G, T, A]

37 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: oracle under construction; edge labels so far: G, T, A, T, G, T, T, G, A]

38 Factor Oracle of k strings
Given the Factor Oracle of GTATGTA [Figure: complete oracle; spine G, T, A, T, G, T, A, additional transitions labelled T, G and A, terminal state marked 1] … we insert GTAA
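
The oracle of the single string GTATGTA can be rebuilt with the standard on-line construction, sketched below in Python. The function names and the dict-per-state representation are assumptions made for the example; the multi-string insertion shown on the next slides is not covered by this sketch.

```python
def factor_oracle(word):
    """Factor oracle of one string, built left to right.

    Returns a list of dicts: trans[state][symbol] -> target state.
    State i is reached after reading the first i symbols of word.
    """
    trans = [dict() for _ in range(len(word) + 1)]
    supply = [-1] * (len(word) + 1)  # supply links, -1 for the root
    for i, c in enumerate(word):
        trans[i][c] = i + 1  # spine transition
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1  # external transition
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans


def accepts(trans, word):
    """True if the oracle can read word from the initial state."""
    state = 0
    for c in word:
        state = trans[state].get(c)
        if state is None:
            return False
    return True


oracle = factor_oracle("GTATGTA")
external = [(s, c, t) for s, row in enumerate(oracle)
            for c, t in row.items() if t != s + 1]
print(external)  # [(0, 'T', 2), (0, 'A', 3), (2, 'G', 5)]
```

The three external transitions carry the labels T, A and G, matching the extra edges in the figure. The oracle reads every factor of GTATGTA; it may also read a few strings that are not factors, which is the price paid for its small size.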

39 Factor Oracle of k strings
… inserting GTAA [Figure: oracle with the new branch; terminal states marked 1 and 2]

40 Factor Oracle of k strings
Given the Automata Factor Oracle (AFO) of GTATGTA and GTAA [Figure: oracle with terminal states marked 1 and 2] … we insert TAATA

41 Factor Oracle of k strings
… inserting TAATA [Figure: oracle with the new branch; terminal states marked 1, 2 and 3]

42 Factor Oracle of k strings
Given the AFO of GTATGTA, GTAA and TAATA [Figure: oracle with terminal states marked 1, 2 and 3] … we insert GTGTA

43 Factor Oracle of k strings
… inserting GTGTA [Figure: oracle with terminal states marked 1, 2 and 3]

44 Factor Oracle of k strings
[Figure: final oracle; terminal states marked 1,4, 2 and 3] This is the Automata Factor Oracle of GTATGTA, GTAA, TAATA and GTGTA

45 SBOM algorithm
What is the next position of the window? How is the comparison made? [Figure: text window aligned with the patterns] Automaton: Factor Oracle (reversed patterns of length lmin). Check whether the suffix is a factor of any pattern. The next position is determined by the last text character that has a transition in the automaton.

46 SBOM algorithm: example
We search for the patterns ATGTATG, TAATG, TAATAAT and AATGTG … then we build the Automata Factor Oracle of GTATG, GTAAT, TAATA and GTGTA, the reversed patterns cut to length lmin = 5. [Figure: factor oracle with terminal states marked 1, 4, 2 and 3]

47 SBOM algorithm: example
Search for ATGTATG, TAATG, TAATAAT and AATGTG. text: ACATGCTAGCTATAATAATGTATG [Figure: factor oracle with terminal states marked 1, 4, 2 and 3, and the search window sliding over the text]

54 SBOM algorithm: example
Search for ATGTATG, TAATG, TAATAAT and AATGTG. text: ACATGCTAGCTATAATAATGT… [Figure: factor oracle with terminal states marked 1, 4, 2 and 3, and the search window sliding over the text]

55 Multiple string matching
[Figure: four experimental efficiency maps, for 5, 10, 100 and 1000 strings; axes: alphabet size |Σ| (ticks 8, 4, 2) and lmin; regions labelled Wu-Manber and SBOM, plus Ad AC in the maps for 10, 100 and 1000 strings]

