1
HW 1 solution comments
Superstring question from last week
–(Patchrawat’s solution)
–aa(ba)^n
–(ba)^n ba
–a(ba)^n bb
Comparison
–2n+6 versus 4n+5, a ratio that is asymptotically 2
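The two lengths can be checked mechanically. The sketch below assumes, since the slide does not say so, that 2n+6 comes from merging the three strings in the order listed and 4n+5 from merging the first and third strings first (one of the pairs with the largest overlap) and appending the second afterwards. The helper names `overlap`, `merge_in_order`, and `example_strings` are illustrative only.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge_in_order(strings):
    """Build a superstring by overlapping each string onto the running result."""
    out = strings[0]
    for s in strings[1:]:
        out += s[overlap(out, s):]
    return out


def example_strings(n):
    """The three strings from the slide: aa(ba)^n, (ba)^n ba, a(ba)^n bb."""
    return "aa" + "ba" * n, "ba" * n + "ba", "a" + "ba" * n + "bb"


if __name__ == "__main__":
    for n in (1, 5, 50):
        s1, s2, s3 = example_strings(n)
        short = merge_in_order([s1, s2, s3])  # assumed order for 2n+6
        long_ = merge_in_order([s1, s3, s2])  # assumed order for 4n+5
        print(n, len(short), len(long_), len(long_) / len(short))
```

As n grows, the printed ratio (4n+5)/(2n+6) approaches 2.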
2
Problem 1: Tandem Arrays
1: Two comments about overlap examples
–Definition: more than one occurrence of pattern β
–Overlap: β = aaa
  String: aaaaaaaaaaa
  Array1: aaaaaaaaa
  Array2:  aaaaaaaaa
  Array3:   aaaaaaaaa
3
Tandem Arrays continued
Computing efficiently
–Z algorithm on β$S
  Now we have an array of Z-values
  Occurrences of β are marked by values of n, where n = |β|
–Previous example Z-values: 3 3 3 3 3 3 3 3 3 2 1
–Process right to left
  When I find a value of at least n, check the entry n positions to the left
  If it has value n, add my value to it
  If not, and my value is > n, output my location and my value divided by n
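The right-to-left pass above can be sketched as follows. This is one possible reading of the slide, not the official solution. It assumes the separator `$` occurs in neither string and reports 0-based start positions. The names `z_array` and `tandem_arrays` are illustrative.

```python
def z_array(s):
    """Standard linear-time Z algorithm; z[k] is the longest prefix match at k."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    l = r = 0  # current Z-box is s[l:r]
    for k in range(1, n):
        if k < r:
            z[k] = min(r - k, z[k - l])
        while k + z[k] < n and s[z[k]] == s[k + z[k]]:
            z[k] += 1
        if k + z[k] > r:
            l, r = k, k + z[k]
    return z


def tandem_arrays(base, text, sep="$"):
    """Return (start, copies) for each maximal tandem array of base in text."""
    n = len(base)
    # Keep only the Z-values that belong to positions of text.
    counts = z_array(base + sep + text)[n + 1:]
    found = []
    for i in range(len(counts) - 1, -1, -1):  # process right to left
        if counts[i] < n:
            continue
        if i - n >= 0 and counts[i - n] == n:
            counts[i - n] += counts[i]  # extend the occurrence n to the left
        elif counts[i] > n:
            found.append((i, counts[i] // n))  # more than one copy starts here
    return found[::-1]


if __name__ == "__main__":
    print(tandem_arrays("aaa", "a" * 11))  # the overlap example from the slides
```

On the overlap example this reports three arrays of three copies each, starting at offsets 0, 1, and 2.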
4
Problems 2 and 3
2: everyone did well
3: Most did fine, but I wanted a more precise answer in some cases
5
Problem 3
Case Z_{k'} > |β| needs no comparisons
(α = P[l..r] is the current Z-box, k' = k - l + 1, and β = P[k..r])
–P(r+1) != P(|α|+1), or else the current Z-box would be larger
–P(|α|+1) = P(|β|+1) since Z_{k'} > |β|
–therefore, P(r+1) != P(|β|+1) and Z_k = |β|
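To see the case split in action, here is a minimal sketch of the Z algorithm with the three cases written out separately and a counter for character comparisons. It uses 0-based indices and an inclusive Z-box `s[l..r]`. The function name `z_with_cases` and the counter are illustrative additions, not part of the assignment.

```python
def z_with_cases(s):
    """Return (Z-values, number of character comparisons) for string s."""
    n = len(s)
    z = [0] * n
    comparisons = 0
    l = r = 0  # no Z-box yet; r = 0 forces position 1 to be scanned explicitly

    def scan(k, q):
        """Compare s[q], s[q+1], ... against s[q-k], ... until a mismatch."""
        nonlocal comparisons
        while q < n:
            comparisons += 1
            if s[q] != s[q - k]:
                break
            q += 1
        return q

    for k in range(1, n):
        if k > r:  # outside every Z-box: compare from scratch
            q = scan(k, k)
            z[k] = q - k
            if z[k] > 0:
                l, r = k, q - 1
        else:
            k_prime = k - l
            beta = r - k + 1  # length of s[k..r]
            if z[k_prime] < beta:
                z[k] = z[k_prime]  # copy, no comparisons
            elif z[k_prime] > beta:
                z[k] = beta  # the Problem 3 case, no comparisons
            else:
                q = scan(k, r + 1)  # only this case needs new comparisons
                z[k] = q - k
                l, r = k, q - 1
    return z, comparisons


if __name__ == "__main__":
    print(z_with_cases("aabcaabxaaz"))
```

Every successful comparison moves `r` to the right and every position causes at most one mismatch, so the counter stays within 2|s|.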
6
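Problem 5
Complaint: many answers had 3n character assignments and essentially read the characters 3n times total
Better answer: FSA approach
–Each state records the last two characters and the current frame, e.g. (GG, 1) and (GA, 2)
–In state (GG, 1), on input A: output GGA to frame 1 and move to state (GA, 2)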
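A minimal sketch of the FSA idea, assuming the task is to emit the codons of all three reading frames while reading each character once. The state is the pair (last two characters, frame), as in the (GG, 1) to (GA, 2) transition. The function name `codons_by_frame` and the choice to emit raw codons rather than translated amino acids are assumptions made for illustration.

```python
def codons_by_frame(dna):
    """Read dna once and return {frame: [codons]} for frames 1, 2 and 3."""
    frames = {1: [], 2: [], 3: []}
    last_two = ""  # first half of the state: the two most recent characters
    frame = 1      # second half of the state: frame of the next codon
    for ch in dna:
        if len(last_two) == 2:
            frames[frame].append(last_two + ch)  # e.g. (GG, 1) + A emits GGA
            last_two = last_two[1] + ch          # e.g. the state becomes GA
            frame = frame % 3 + 1                # e.g. the frame becomes 2
        else:
            last_two += ch  # still reading the first two characters
    return frames


if __name__ == "__main__":
    print(codons_by_frame("GGATCCA"))
```

Each input character is examined exactly once, and the only per-character work is one state update and one output.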
7
Problems 6 and 8
Most submitted programs were ok
–I tried to write comments somewhere on your assignments if there were any bugs
In the future, provide
–README file or makefile
–clear input instructions
  let me input test cases so I can try simple values
8
Problems 9 and 10
9: Please submit using handin so that I can more easily use it to test any programs
–Should be fairly comprehensive
10: A few people wrote some comments down and maybe an example
–Empirical means experimental
  Design a set of tests with inputs of some type
  Characterize your input set
  Give me summarized statistical data on how the various algorithms did
9
Problem 7
Key idea
–While one shift with just the bad character rule may be worse than one shift with the max of the bad character and good suffix rules, future shifts may pay off
–A couple of people had correct solutions where bad character alone was better, but I would like you to push it a little to see how much better it can be
Example
–Text: a(a^(n-1) x)^k
–Pattern: b a^(n-1)
–n+k comparisons (bad character rule alone) versus kn comparisons (both rules)
10
Bad character example
n=4, k=4
aaaaxaaaxaaaxaaax
baaa
11
Same example with both rules
n=4, k=4
aaaaxaaaxaaaxaaax
baaa
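The comparison counts for this example can be reproduced with a small Boyer-Moore sketch. It makes several assumptions: the simple bad character rule (rightmost occurrence anywhere in the pattern), a strong good suffix shift found by brute force rather than by linear-time preprocessing, and a good suffix shift of 1 when the very first comparison mismatches. The function names are illustrative.

```python
def good_suffix_shift(pattern, j):
    """Smallest shift consistent with the matched suffix pattern[j+1:]."""
    m = len(pattern)
    if j == m - 1:  # nothing matched yet: assumed convention, shift by 1
        return 1
    for s in range(1, m + 1):
        suffix_ok = all(pattern[i - s] == pattern[i]
                        for i in range(j + 1, m) if i - s >= 0)
        # Strong rule: the character moved under position j must differ.
        strong_ok = j - s < 0 or pattern[j - s] != pattern[j]
        if suffix_ok and strong_ok:
            return s
    return m


def bm_comparisons(pattern, text, use_good_suffix):
    """Return (character comparisons, match positions) for one search."""
    m, n = len(pattern), len(text)
    rightmost = {c: i for i, c in enumerate(pattern)}
    comparisons, matches, start = 0, [], 0
    while start + m <= n:
        j = m - 1
        while j >= 0:  # compare right to left
            comparisons += 1
            if pattern[j] != text[start + j]:
                break
            j -= 1
        if j < 0:  # full match: shift by the smallest period of the pattern
            matches.append(start)
            shift = next(s for s in range(1, m + 1)
                         if pattern[s:] == pattern[:m - s])
        else:
            shift = max(1, j - rightmost.get(text[start + j], -1))
            if use_good_suffix:
                shift = max(shift, good_suffix_shift(pattern, j))
        start += shift
    return comparisons, matches


if __name__ == "__main__":
    text, pattern = "aaaaxaaaxaaaxaaax", "baaa"  # n=4, k=4
    print(bm_comparisons(pattern, text, use_good_suffix=False)[0])
    print(bm_comparisons(pattern, text, use_good_suffix=True)[0])
```

With n=4 and k=4, the bad character rule alone needs n+k = 8 comparisons, while taking the max with the good suffix rule needs kn = 16.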
12
Problem 4
Hard problem
All answers had mistakes or were very vague about how to update the mapping as we changed the starting point of our Z-box
Consider the following example
13
Example
Parameters: a, b, c, d, e
Tokens: X
P = aXXabXXbaX
T = ecXXcdXXdeXXedX
Z values for P
–1 2 3 4 5 6 7 8 9 0
–a X X a b X X b a X
–- 0 0 1 6 0 0 1 2 0
(copy to board)
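The Z values for P can be double-checked with a naive quadratic sketch that keeps an explicit two-way mapping between parameters. It is meant only as a reference for small examples, not as the linear-time solution the problem asks for. It assumes the parameter and token alphabets are disjoint, and the function names are illustrative.

```python
def pmatch_prefix_len(a, b, parameters):
    """Length of the longest prefix of a that parameterized-matches b's prefix."""
    forward, backward = {}, {}  # a's parameters to b's, and the reverse
    length = 0
    for x, y in zip(a, b):
        if x in parameters and y in parameters:
            # The renaming must stay one-to-one in both directions.
            if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
                break
        elif x != y:  # tokens must match exactly, and never match a parameter
            break
        length += 1
    return length


def parameterized_z(p, parameters):
    """Naive parameterized Z-values for positions 2..|p| (1-based)."""
    return [pmatch_prefix_len(p, p[k:], parameters) for k in range(1, len(p))]


if __name__ == "__main__":
    print(parameterized_z("aXXabXXbaX", set("abcde")))
```

For P = aXXabXXbaX this prints the values 0 0 1 6 0 0 1 2 0 listed on the slide.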
14
Example continued
ecXXcdXXdeXXedX
 aXXabXXbaX
08
–a maps to c
–b maps to d
Next Z values: 0 0 1
15
Example continued
ecXXcdXXdeXXedX
 aXXabXXbaX
08001
     aXXabXXbaX
–From P, we have 6 for the next entry, which extends beyond the Z-box window of 4
–By problem 3, this would be just 4, but the right answer is 10
–Now the mapping is a to d, b to e, and we need to do this WITHOUT going backwards and rechecking previously checked positions. How?
16
Offset Array
Offset array for P
–aXXabXXbaX
–1234567890
–0003000350
Offset array for T
–ecXXcdXXdeXXedX
–123456789012345
–000030003900350
Matching
–Match if each offset is either 0 or points to the left of the current Z-box
–Else match if both offsets are identical
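The offset arrays and the matching rule can be sketched as below. An offset is the distance back to the previous occurrence of the same parameter, or 0 for tokens and first occurrences. The comparison treats an offset that reaches back before the start of the current match window as 0, which is how the rule above is read here. The function names are illustrative, and the window is taken to start where the match starts.

```python
def offset_array(s, parameters):
    """Distance to the previous occurrence of each parameter, else 0."""
    last_seen = {}
    offsets = []
    for i, ch in enumerate(s):
        if ch in parameters:
            offsets.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            offsets.append(0)  # tokens always get offset 0
    return offsets


def pmatch_length(p, t, start, parameters):
    """Length of the parameterized match of p against t[start:], via offsets."""
    off_p = offset_array(p, parameters)
    off_t = offset_array(t, parameters)
    length = 0
    while length < len(p) and start + length < len(t):
        a, b = p[length], t[start + length]
        if a in parameters and b in parameters:
            oa = off_p[length]
            ob = off_t[start + length]
            if ob > length:  # points to the left of the window: treat as 0
                ob = 0
            if oa != ob:
                break
        elif a != b:  # tokens must be identical
            break
        length += 1
    return length


if __name__ == "__main__":
    params = set("abcde")
    print(offset_array("aXXabXXbaX", params))
    print(offset_array("ecXXcdXXdeXXedX", params))
    print(pmatch_length("aXXabXXbaX", "ecXXcdXXdeXXedX", 5, params))
```

Starting at position 6 of T (index 5), the offset 9 for e reaches outside the window and is treated as 0, so it matches the offset 0 for b. The full match length is 10.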
17
Example with offsets
ecXXcdXXdeXXedX
 aXXabXXbaX
08001
     aXXabXXbaX
–offset for e is 9, which is outside the Z-box
–offset for b is 0
–offset for d is 5
–offset for a is 5