Published by Molly Small. Modified over 9 years ago.
2
OUTLINE
– Suffix trees
– Suffix arrays
3
Suffix trees
Indexing techniques are used to locate the highest-scoring alignments. One method of indexing uses the suffix tree. A suffix of a string is a substring that runs to the end of the string.
4
Suffix trees
Problems:
– Given a pattern P (a substring), find all occurrences of P in a text S.
– Given two strings, find their longest common substring.
5
Suffix trees
Problems in bioinformatics:
– Multiple genome alignment
– Identification of sequence repeats
6
Suffix trees
Suffix tree:
– For example, S = abdfrg (length 6).
– S has 6 suffixes: g, rg, frg, dfrg, bdfrg, abdfrg.
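The suffix list above can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `suffixes` is an assumption and does not come from the slides.

```python
def suffixes(s):
    """Return all suffixes of s, shortest first, in the order used on the slide."""
    # s[i:] is the suffix starting at 0-based index i; walk i from the end to the start.
    return [s[i:] for i in range(len(s) - 1, -1, -1)]


print(suffixes("abdfrg"))
```

A string of length n always has exactly n non-empty suffixes, one per starting position.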
7
Suffix trees
– All suffixes of a string can be stored in a suffix tree, and this tree can be built in O(n) time (n: length of the string).
– Whether a pattern of length m occurs in the string can then be decided in O(m) time.
8
Suffix trees
Suffix tree:
– S = S[1…n] is a string of length n.
– A suffix tree is a tree with n leaves.
– The n leaves represent the n suffixes of the string.
– Example: ababc$
9
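Suffix trees
If one suffix is a prefix of another suffix, we cannot construct a tree in which every suffix ends at a leaf.
– Example: xabxa. The suffix xa is a prefix of xabxa, and the suffix a is a prefix of abxa, so xa and a do not end at leaf nodes.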
10
Suffix trees
To solve this problem, append a special character that occurs nowhere else in the string (for example $) to the end of the string.
– Example: xabxa$
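A small check makes the effect of the terminator concrete. The sketch below lists every suffix that is a proper prefix of another suffix; the helper name `suffixes_that_are_prefixes` is hypothetical, and the check is a brute-force illustration rather than part of any construction algorithm.

```python
def suffixes_that_are_prefixes(s):
    """Return the suffixes of s that are a proper prefix of some other suffix."""
    sufs = [s[i:] for i in range(len(s))]
    # A suffix is problematic when a different suffix starts with it,
    # because it then ends in the middle of the tree, not at a leaf.
    return [a for a in sufs if any(b != a and b.startswith(a) for b in sufs)]


print(suffixes_that_are_prefixes("xabxa"))
print(suffixes_that_are_prefixes("xabxa$"))
```

For xabxa the result contains xa and a; once $ is appended the list is empty, because every suffix now ends with a character that appears nowhere else.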
11
Suffix trees
How to construct a suffix tree:
– Assume we have a string S[1…n].
– Start with the longest suffix, S[1…n], which is the whole string.
– For example, consider vbacxad$.
12
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[2…n], which is bacxad$.
13
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[3…n], which is acxad$.
14
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[4…n], which is cxad$.
15
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[5…n], which is xad$.
16
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[6…n], which is ad$.
– Its first character, a, matches the first character of the existing leaf edge acxad$; the two labels differ after that a.
– So split the edge after a: the new internal node keeps cxad$ as one child and gets a new leaf for d$.
19
Suffix trees
How to construct a suffix tree (cont.):
– Insert the next suffix, S[7…n], which is d$.
20
Suffix trees
How to construct a suffix tree (cont.):
– Insert the last suffix, S[8…n], which is $.
21
Suffix trees Suffix tree of vbacxad$:
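The insertion steps can be sketched in Python as follows. This is a naive construction that inserts one suffix at a time and splits an edge where the labels diverge, so it takes quadratic time in the worst case and is not a linear-time algorithm. The names `Node`, `insert` and `build_suffix_tree` are assumptions made for this illustration, and the text is assumed to end with a terminator that occurs nowhere else.

```python
class Node:
    def __init__(self):
        self.children = {}  # first character of edge label -> (edge label, child Node)
        self.start = None   # 1-based start position of the suffix (leaves only)


def insert(root, suffix, start):
    """Insert one suffix by walking from the root and splitting an edge if needed."""
    node = root
    while True:
        first = suffix[0]
        if first not in node.children:
            # No edge starts with this character: hang a new leaf here.
            leaf = Node()
            leaf.start = start
            node.children[first] = (suffix, leaf)
            return
        label, child = node.children[first]
        k = 0  # length of the common prefix of the edge label and the suffix
        while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
            k += 1
        if k < len(label):
            # Mismatch inside the edge: split it after k characters.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[first] = (label[:k], mid)
            child = mid
        # Continue below the (possibly new) node with the rest of the suffix.
        node, suffix = child, suffix[k:]


def build_suffix_tree(text):
    if text[-1] in text[:-1]:
        raise ValueError("text must end with a unique terminator such as $")
    root = Node()
    for i in range(len(text)):
        insert(root, text[i:], i + 1)
    return root


root = build_suffix_tree("vbacxad$")
for first in sorted(root.children):
    label, child = root.children[first]
    print(label, "leaf", child.start if not child.children else "internal")
```

For vbacxad$ the only split happens when ad$ is inserted: the edge acxad$ becomes an edge a leading to an internal node with the children cxad$ and d$.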
22
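Suffix trees
Pattern matching using suffix trees:
– Try to match the pattern along a path, starting from the root. There are three possible outcomes:
– The pattern does not match, so it does not occur in the string.
– The match ends at a node u of the tree.
– The match ends inside an edge.
– In the last two cases, the leaves below the point where the match ends give the starting positions of all occurrences.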
23
Suffix trees
Example (consider vbacxad$):
– Suffixes:
1. vbacxad$
2. bacxad$
3. acxad$
4. cxad$
5. xad$
6. ad$
7. d$
8. $
24
Suffix trees
Example (consider vbacxad$, with suffixes 1 to 8 numbered as above):
– Search for the patterns cxa, a and xdb.
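The three searches can be tried with the sketch below. To stay self-contained it builds the tree by recursively grouping suffixes on their first character, which is a shortcut chosen for this example and not the insertion-based construction from the slides. The tree is a nested dict (edge label to subtree, or to a 1-based start position for a leaf); the names `build`, `leaves` and `find` are assumptions, and the pattern is assumed to be non-empty.

```python
from os.path import commonprefix  # character-wise longest common prefix of strings


def build(suffixes):
    """suffixes: list of (remaining text, 1-based start). Returns a nested dict."""
    groups = {}
    for rest, start in suffixes:
        groups.setdefault(rest[0], []).append((rest, start))
    tree = {}
    for group in groups.values():
        if len(group) == 1:
            rest, start = group[0]
            tree[rest] = start  # leaf edge carries the whole remaining suffix
        else:
            label = commonprefix([rest for rest, _ in group])
            tree[label] = build([(rest[len(label):], start) for rest, start in group])
    return tree


def leaves(subtree):
    """Collect the suffix start positions stored below a node or leaf."""
    if isinstance(subtree, int):
        return [subtree]
    return [s for child in subtree.values() for s in leaves(child)]


def find(tree, pattern):
    """Return (outcome, sorted 1-based start positions of all occurrences)."""
    node = tree
    while True:
        edge = next((lab for lab in node if lab[0] == pattern[0]), None)
        if edge is None:
            return "no match", []
        child = node[edge]
        k = len(commonprefix([edge, pattern]))
        if k == len(pattern):
            where = "ends at a node" if k == len(edge) else "ends inside an edge"
            return where, sorted(leaves(child))
        if k < len(edge) or isinstance(child, int):
            return "no match", []  # mismatch on the edge, or pattern runs past a leaf
        node, pattern = child, pattern[k:]


text = "vbacxad$"
tree = build([(text[i:], i + 1) for i in range(len(text))])
for p in ("cxa", "a", "xdb"):
    print(p, find(tree, p))
```

In this sketch cxa ends inside the edge cxad$ and occurs at position 4, a ends at the internal node below the edge a and occurs at positions 3 and 6, and xdb fails after matching x.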
25
Suffix arrays Consider the string: The suffix array:
26
Suffix arrays
Search for the pattern is in mississippi$:
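A suffix array stores the starting positions of all suffixes in lexicographic order, so a pattern can be located by binary search. The sketch below builds the array by plainly sorting the suffixes, which is simple but not an efficient construction, and then searches for is. The function names `suffix_array` and `search` are assumptions, positions are 1-based, and the ordering assumes $ sorts before the letters (as it does in ASCII).

```python
def suffix_array(text):
    """1-based start positions of the suffixes of text, in lexicographic order."""
    return sorted(range(1, len(text) + 1), key=lambda i: text[i - 1:])


def search(text, sa, pattern):
    """Return the sorted 1-based positions where pattern occurs, via binary search."""
    m = len(pattern)

    def prefix(rank):
        # First m characters of the suffix at this rank in the suffix array.
        start = sa[rank] - 1
        return text[start:start + m]

    # Lower bound: first rank whose prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # Upper bound: first rank whose prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


text = "mississippi$"
sa = suffix_array(text)
print(sa)
print(search(text, sa, "is"))
```

All suffixes that start with the pattern sit next to each other in the array, which is why two binary searches are enough to find the whole block of occurrences.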
27
References
– M. Zvelebil, J. O. Baum, “Understanding Bioinformatics”, Garland Science, 2008.
– Andreas D. Baxevanis, B.F. Francis Ouellette, “Bioinformatics: A practical guide to the analysis of genes and proteins”, Wiley, 2001.