Download presentation
Presentation is loading. Please wait.
Published byCharlotte Thomas Modified over 9 years ago
1
Design & Analysis of Algorithms COMP 482 / ELEC 420 John Greiner
2
String Matching Pattern: CCATT Text: ACTGCCATTCCTTAGGGCCATGTG Brute force & variant Automata-like strategies Suffix trees 2 To do: [CLRS] 32 Supplements #5
3
Some Applications Text & programming languages –Spell-checking –Linguistic analysis –Tokenization –Virus scanning –Spam filtering –Database querying DNA sequence analysis Music identification & analysis 3
4
Brute Force Exact Match Pattern: CCATT Text: ACTGCCATTCCTTAGGGCCATGTG Algorithm? Example of worst case? Running time? 4
5
Rabin-Karp, 1981 5
6
Intuition for Better Algorithms Pattern: CCATT Text: CCAGCCATTCCTTAGGGCCATGTG After failing match at 1 st position, what should we do? 6
7
Quick Overview of Two Algorithms 7
8
Finite Automata Matching Example Pattern: CCATT Text: CCAGCCATTCCTTAGGGCCATGTG 8 CCACCAT CCATT CCC CC A C T C T C C
9
Regular Expressions 9
10
Regular Expression to Finite Automaton 10
11
Regular Expression to Finite Automaton 11
12
12
13
Eliminating Non-Determinism Each state is a set of the old states. 13 0 1 a a b 0 1 0,1 a ba b a b a,ba,b
14
Can Minimize 14 0 1 a a b 0 0,1 a b 1 a b a b a,ba,b
15
Finite Automaton Approach Summary Preprocessing expensive for REs. Matching is flexible and still linear. 15
16
Suffix Tree (Trie) Text: CCAGAT How many leaves, nodes, edges? Total space? How to search for a pattern? Running time? 16 TA TGATAGAT C CAGAT 5 4 321 6 GAT Each edge labeled with two indices.
17
Require Suffixes to End at Leaves Text: CCAGAT Reason: Simplicity. Distinguish nodes & leaves. Example text that would break that? 17 TA TGATAGAT C CAGAT 5 4 321 6 GAT
18
Forcing Suffixes to End at Leaves Text: CCAGAC$ 18 $A C$GAC$AGAC$ C CAGAC$ 5 4 321 7 GAC$ 6 $
19
How Would You Create the Tree? Text: CCAGAC$ 19 $A C$GAC$ C 5 4 321 7 6 $AGAC$CAGAC$
20
Suffix Tree Construction Example Text: nananabanana$ 20 nananabanana$
21
Suffix Tree Construction Example Text: nananabanana$ 21 nananabanana$ananabanana$
22
Suffix Tree Construction Example Text: nananabanana$ Match nanabanana$, nananabanana$ : 4 comparisons 22 nana banana$nabanana$ ananabanana$
23
Suffix Tree Construction Example Text: nananabanana$ Match anabanana$, ananabanana$ : 3 redundant comps Next … Match nabanana$, nanabanana$ : 2 redundant comps Match abanana$, anabanana$ : 1 redundant comp 23 nana banana$nabanana$ banana$ ana
24
First Algorithm Improvement: No Redundant Matching 24
25
Suffix Tree Construction Example Text: nananabanana$ Match nanabanana$, nananabanana$ : 4 comparisons 25 nana banana$nabanana$ ananabanana$
26
Suffix Tree Construction Example Text: nananabanana$ No matching. Just form new node and adjust edge string indices. 26 nana banana$nabanana$ banana$ ana
27
Suffix Tree Construction Example Text: nananabanana$ No matching. Just form new node and adjust edge string indices. 27 na banana$nabanana$ banana$ ana na banana$
28
Suffix Tree Construction Example Text: nananabanana$ No matching. Just form new node and adjust edge string indices. 28 na banana$nabanana$ banana$ na banana$ a
29
Suffix Tree Construction Example Text: nananabanana$ 29 na banana$nabanana$ banana$ na banana$ a
30
Suffix Tree Construction Example Text: nananabanana$ anana is there, but it takes work to match & follow links. 30 na banana$nabanana$banana$ na banana$ a $ na
31
Suffix Tree Construction Example Text: nananabanana$ nana is there, but it takes work to match & follow links. 31 na banana$nabanana$banana$ na banana$ a $ na$
32
Second Algorithm Improvement: Suffix Links 32 banana$a na banana$$na banana$$nabanana$banana$$na banana$$ $
33
Suffix Tree Construction Example Text: nananabanana$ Have to follow links for match on first time. Need to create suffix link for new node. 33 na banana$nabanana$banana$ na banana$ a $ na
34
Suffix Tree Construction Example Text: nananabanana$ Need to create suffix link for new node. Use parent’s suffix link to get close. 34 na banana$nabanana$banana$ na banana$ a $ na
35
Suffix Tree Construction Example Text: nananabanana$ 35 na banana$nabanana$banana$ na banana$ a $ na
36
Suffix Tree Construction Example Text: nananabanana$ Already found node. No searching or matching. 36 na banana$nabanana$banana$ na banana$ a $ na$
37
Suffix Tree Construction Example Text: nananabanana$ Follow suffix link. 37 na banana$nabanana$banana$ na banana$ a $ na$$
38
Suffix Tree Construction Example Text: nananabanana$ Follow suffix link. 38 na banana$nabanana$banana$ na banana$ a $ na$$ $
39
Suffix Tree Construction Example Text: nananabanana$ Follow suffix link. 39 na banana$nabanana$banana$ na banana$ a $ na$$ $$
40
Suffix Tree Construction Example 40 na banana$nabanana$banana$ na banana$ a $ na$$ $$ $
41
Almost Correct Analysis of Construction 41
42
Creating & Following Suffix Links 42
43
Correct Analysis of Construction 43
44
Some Applications of Suffix Trees Search for fixed patterns Search for regular expressions Find longest common substrings, longest repeated substrings Find most commonly repeated substrings Find maximal palindromes Find Lempel-Ziv decomposition (for text compression) As used in Bioinformatics Data compression Data clustering 44
45
Supplementary Resources Exact String Matching Algorithms >30 algorithms, with animations Course on string matching (Biosequencing) Wikipedia: Suffix treesSuffix trees Tutorial on Suffix treesSuffix trees Tutorial on Suffix trees with applet & codeSuffix trees Notes on string matching and building suffix treesstring matching building suffix trees Suffix tree slides adapted from those by Guy Blelloch, CMU 15-853. 45
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.