Penka Borukova Student at Telerik Academy
1. Boyer-Moore String Search Algorithm
2. The bad character rule
3. The good suffix rule
4. The algorithm itself
Boyer-Moore finds a pattern in a text. It is useful for very large texts, where space and time are expensive.
Some definitions:
S[i…n] – suffix of S
S[1…i] – prefix of S
P – pattern
T – text
Boyer-Moore uses information gained by preprocessing P to skip as many alignments as possible. The strings are matched from the end of P toward its beginning. The comparisons continue until either a mismatch occurs or the beginning of P is reached (which means there is a match).
When a mismatch is found, the shift is chosen using two rules: the bad character rule and the good suffix rule.
The idea of the bad character rule is to shift P by more than one character when possible. R(x) is the position of the right-most occurrence of character x in P; R(x) = 0 if x does not occur in the pattern. When a mismatch is found, shift P so that the right-most occurrence of the mismatched text character in P lines up underneath it.
Pattern -> ABBABAA
With positions counted from 1: R('A') = 7; R('B') = 5; R(x) = 0 for every other character x, where x >= 0 && x < 256 && x != 'A' && x != 'B'.
Example text: ABCABCBBCAABBABAA
The bad character rule focuses on single characters. It works well in practice with large alphabets like the English alphabet and less well with small alphabets like DNA. Space required: O(|Σ|), one entry per character in the alphabet Σ.
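The table R(x) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, positions are counted from 1 as in the definition of R(x), and the shift formula max(1, i - R(x)) is the usual form of the rule rather than something stated on the slides.

```python
def bad_character_table(pattern):
    """Map each character of pattern to its right-most 1-based position."""
    table = {}
    for position, char in enumerate(pattern, start=1):
        table[char] = position  # later occurrences overwrite earlier ones
    return table


def bad_character_shift(table, mismatched_char, i):
    """Shift proposed when pattern position i (1-based) mismatches mismatched_char.

    Characters that do not occur in the pattern have R(x) = 0.
    The shift is never smaller than 1 (assumption: standard form of the rule).
    """
    return max(1, i - table.get(mismatched_char, 0))


table = bad_character_table("ABBABAA")
print(table)                               # {'A': 7, 'B': 5}
print(bad_character_shift(table, "B", 7))  # 2
print(bad_character_shift(table, "C", 7))  # 7
```

A dictionary is used here instead of a 256-entry array, so the space is proportional to the number of distinct characters in the pattern rather than to the whole alphabet.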
Extended bad character rule: a 2D table indexed first by the index of the character c in the alphabet and second by the index i in the pattern. A lookup returns the occurrence of c in P with the next-highest index j < i, or -1 if there is no such occurrence. The proposed shift is then i - j. Space required: O(n|Σ|).
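A possible Python sketch of this 2D table follows. It assumes 0-based indexes (which is what the -1 sentinel suggests), takes the alphabet as an explicit argument, and uses hypothetical function names.

```python
def extended_bad_character_table(pattern, alphabet):
    """table[c][i] = largest j < i with pattern[j] == c, or -1 if none.

    Indexes are 0-based. Assumes every character of pattern is in alphabet.
    """
    table = {c: [] for c in alphabet}
    last_seen = {c: -1 for c in alphabet}
    for i, char in enumerate(pattern):
        for c in alphabet:
            table[c].append(last_seen[c])  # occurrences strictly left of i
        last_seen[char] = i
    return table


def extended_shift(table, mismatched_char, i):
    """Proposed shift i - j for a mismatch at pattern index i (0-based)."""
    row = table.get(mismatched_char)
    j = row[i] if row is not None else -1  # unknown characters never occur
    return i - j


table = extended_bad_character_table("ABBABAA", "AB")
print(table["A"])                     # [-1, 0, 0, 0, 3, 3, 5]
print(table["B"])                     # [-1, -1, 1, 2, 2, 4, 4]
print(extended_shift(table, "B", 6))  # 2
```

Because j is always smaller than i, the proposed shift is always at least 1, which the simple rule cannot guarantee on its own.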
The good suffix rule focuses on substrings. L’(i): for each i, L’(i) is the largest position less than n such that the substring P[i,…,n] matches a suffix of P[1,…,L’(i)], with the additional requirement that the character preceding that suffix is not equal to the character P[i-1]. If there is no such position, L’(i) = 0.
Example: ABCABCABCAAABCAACA
l’(i): the length of the largest suffix of P[i,…,n] that is also a prefix of P. If none exists, then l’(i) = 0.
Example: ABCABCABCAACACA
Example (table with rows: indexes of P, pattern, l’(i), L’(i)). Pattern: CCACBCCBACC
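The two definitions above can be transcribed directly into Python. The sketch below is deliberately unoptimised (it compares substrings by brute force instead of using a linear-time preprocessing step), the function name is hypothetical, and it assumes that a matching copy starting at position 1 of P, which has no preceding character, satisfies the "preceding character differs" requirement.

```python
def good_suffix_tables(pattern):
    """Return (big_l, small_l) with big_l[i] = L'(i) and small_l[i] = l'(i).

    Both lists are indexed with 1-based i; index 0 is unused padding.
    """
    n = len(pattern)
    big_l = [0] * (n + 2)
    small_l = [0] * (n + 2)

    for i in range(2, n + 1):
        suffix = pattern[i - 1:]               # P[i..n]
        k = len(suffix)
        for j in range(n - 1, k - 1, -1):      # candidate positions j < n, largest first
            start = j - k                      # 0-based start of P[j-k+1..j]
            preceded_differently = start == 0 or pattern[start - 1] != pattern[i - 2]
            if pattern[start:j] == suffix and preceded_differently:
                big_l[i] = j
                break

    for i in range(1, n + 1):
        for length in range(n - i + 1, 0, -1): # longest candidate first
            if pattern[:length] == pattern[n - length:]:
                small_l[i] = length
                break

    return big_l, small_l


big_l, small_l = good_suffix_tables("CABDABDAB")
print(big_l[8])    # 3
print(small_l[2])  # 0
```

For P = CABDABDAB, the suffix P[8..9] = AB also ends at position 6, but there it is preceded by D, the same character as P[7], so that copy is rejected and L’(8) = 3.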
1. Precompute L’(i) and l’(i) for each position i in P.
2. Precompute R(x) or R(x, i) for each character x in the alphabet Σ.
3. Align P to T.
4. Compare right to left.
5. On a mismatch, shift by the maximum of the shifts proposed by the (extended) bad character rule and the good suffix rule, then return to comparing.
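Put together, the steps above might look like the following Python sketch. It is not a reference implementation: the function name is hypothetical, the good suffix tables are built by brute force for clarity, the simple (not extended) bad character rule is used, and the shifts n - L’(i), n - l’(i) and n - l’(2) after a full match are the standard textbook choices rather than something the slides spell out.

```python
def boyer_moore_search(text, pattern):
    """Return the 0-based start index of every occurrence of pattern in text."""
    n, m = len(pattern), len(text)
    if n == 0 or n > m:
        return []

    # Bad character rule: right-most 1-based position of each character.
    r = {char: pos for pos, char in enumerate(pattern, start=1)}

    # Good suffix rule: big_l[i] = L'(i), small_l[i] = l'(i), 1-based i.
    big_l = [0] * (n + 2)
    small_l = [0] * (n + 2)
    for i in range(2, n + 1):
        suffix = pattern[i - 1:]
        k = len(suffix)
        for j in range(n - 1, k - 1, -1):
            start = j - k
            if pattern[start:j] == suffix and (
                start == 0 or pattern[start - 1] != pattern[i - 2]
            ):
                big_l[i] = j
                break
        for length in range(k, 0, -1):
            if pattern[:length] == pattern[n - length:]:
                small_l[i] = length
                break

    matches = []
    align = 0  # 0-based text index under the first pattern character
    while align <= m - n:
        i = n  # 1-based pattern position, compared right to left
        while i >= 1 and pattern[i - 1] == text[align + i - 1]:
            i -= 1
        if i == 0:
            matches.append(align)
            shift = n - small_l[2]          # keep the longest border aligned
        else:
            bad_char = i - r.get(text[align + i - 1], 0)
            if i == n:
                good_suffix = 1             # nothing matched yet
            elif big_l[i + 1] > 0:
                good_suffix = n - big_l[i + 1]
            else:
                good_suffix = n - small_l[i + 1]
            shift = max(1, bad_char, good_suffix)
        align += shift
    return matches


print(boyer_moore_search("ABCABCBBCAABBABAA", "ABBABAA"))  # [10]
```

On this input the first mismatch (text character B against the last A of the pattern) shifts by 2, and the next one (text character C, which does not occur in the pattern) skips a full 7 positions.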
Running time, where n is the length of P and m is the length of T:
O(m + n) - if the pattern does not appear in the text.
O(nm) - when the pattern occurs in the text. This is the worst case.