HW 1 solution comments


1 HW 1 solution comments
Superstring question from last week
– (Patchrawat’s solution)
– aa(ba)^n
– (ba)^n ba
– a(ba)^n bb
Comparison
– 2n+6 versus 4n+5, a ratio that is asymptotically 2

2 Problem 1: Tandem Arrays
1: Two comments about overlap examples
– Definition: more than one occurrence of pattern
– Overlap: pattern = aaa
String: aaaaaaaaaaa
Array1: aaaaaaaaa
Array2:  aaaaaaaaa
Array3:   aaaaaaaaa

3 Tandem Arrays continued
Computing efficiently
– Z algorithm on pattern$S, where n is the pattern length
  Now we have an array of Z-values
  Occurrences of the pattern are marked by values of n
– Previous example Z-values
  3 3 3 3 3 3 3 3 3 2 1
– Process right to left
  When I find a value of at least n, check the entry n to the left
  If it has value n, add my value to it
  If not, and my value is > n, output my location and my value divided by n
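The right-to-left pass above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not a reference solution: the function names `z_array` and `tandem_arrays` are illustrative, positions are reported 1-based, and "$" is assumed not to occur in the pattern or in S.

```python
def z_array(s):
    """Standard Z algorithm: z[i] = length of the longest prefix of s starting at i."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    l = r = 0  # current Z-box is s[l:r]
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z


def tandem_arrays(pattern, text):
    """Return (1-based start, number of copies) for each maximal tandem array.

    Assumes "$" occurs in neither pattern nor text (hypothetical separator).
    """
    n = len(pattern)
    # Z-values for the positions of text only; each is at most n because of "$".
    values = z_array(pattern + "$" + text)[n + 1:]
    found = []
    for i in range(len(text) - 1, -1, -1):      # process right to left
        if values[i] >= n:
            left = i - n                        # entry n to the left
            if left >= 0 and values[left] == n:
                values[left] += values[i]       # extend the run starting at left
            elif values[i] > n:                 # more than one copy: report it
                found.append((i + 1, values[i] // n))
    return sorted(found)


print(tandem_arrays("aaa", "aaaaaaaaaaa"))
```

On the overlap example (pattern aaa, a string of eleven a's) this reports three arrays of 3 copies each, starting at positions 1, 2, and 3.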

4 Problems 2 and 3
2: Everyone did well
3: Most did fine, but I wanted a more precise answer in some cases

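5 Problem 3
Case Z_{k'} > |β| needs no comparisons
(with α = P[l..r] the current Z-box, β = P[k..r], and k' = k − l + 1)
– P(r+1) != P(|α|+1), or else the current Z-box would be larger
– P(|α|+1) = P(|β|+1) since Z_{k'} > |β| (note that k' + |β| = |α| + 1)
– Therefore, P(r+1) != P(|β|+1) and Z_k = |β|
[Figure: positions k, k', and r marked on the string]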

6 Problem 5
Complaint: many answers had 3n character assignments and essentially read the characters 3n times in total
Better answer: FSA approach
[FSA diagram; surviving labels: GG,1GA,2 A, output GGA to frame 1]

7 Problems 6 and 8
Most submitted programs were OK
– I tried to write comments somewhere on your assignments if there were any bugs
In the future, provide
– README file or makefile
– Clear input instructions
  Let me input test cases so I can try simple values

8 Problems 9 and 10
9: Please submit using handin so that I can more easily use it to test your programs
– Should be fairly comprehensive
10: A few people wrote some comments down and maybe an example
– Empirical means experimental
  Design a set of tests with inputs of some type
  Characterize your input set
  Give me summarized statistical data on how the various algorithms did

9 Problem 7
Key idea
– While one shift with just the bad character rule may be worse than one shift with the max of the bad character and good suffix rules, future shifts may pay off
– A couple of people had correct solutions where bad character alone was better, but I would like you to push it a little to see how much better it can be
Example
– Text: a(a^(n-1) x)^k
– Pattern: b a^(n-1)
– n+k comparisons (bad character alone) versus kn comparisons (both rules)

10 Bad character example
n=4, k=4
aaaaxaaaxaaaxaaax
baaa

11 Same example with both rules
n=4, k=4
aaaaxaaaxaaaxaaax
baaa
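The comparison counts for this example can be checked with a small simulation. The sketch below is illustrative only and rests on several assumptions that are not in the slides: the simple bad character rule (rightmost occurrence of the text character anywhere in the pattern), a brute-force version of the strong good suffix shift, a shift of 1 from the good suffix rule when nothing has matched, and a shift of 1 after a full match. The names `good_suffix_shift` and `count_comparisons` are made up for this sketch.

```python
def good_suffix_shift(p, j):
    """Brute-force strong good suffix shift after a mismatch at index j (0-based)."""
    n = len(p)
    if j == n - 1:
        return 1  # nothing matched yet; assumed convention: shift by one
    for s in range(1, n):
        # The matched suffix p[j+1:] must agree with the pattern shifted by s...
        suffix_ok = all(p[q - s] == p[q] for q in range(j + 1, n) if q - s >= 0)
        # ...and the character brought under the mismatch must differ (strong rule).
        differs = j - s < 0 or p[j - s] != p[j]
        if suffix_ok and differs:
            return s
    return n


def count_comparisons(p, t, use_good_suffix):
    """Count character comparisons made by a right-to-left scan with the chosen rules."""
    n = len(p)
    rightmost = {c: i for i, c in enumerate(p)}  # rightmost index of each character
    comparisons = 0
    s = 0  # current alignment: p[0] sits under t[s]
    while s + n <= len(t):
        j = n - 1
        while j >= 0:
            comparisons += 1
            if p[j] != t[s + j]:
                break
            j -= 1
        if j < 0:          # full match; shift by one (simplifying assumption)
            s += 1
            continue
        shift = max(1, j - rightmost.get(t[s + j], -1))   # bad character rule
        if use_good_suffix:
            shift = max(shift, good_suffix_shift(p, j))
        s += shift
    return comparisons


n, k = 4, 4
text = "a" + ("a" * (n - 1) + "x") * k   # aaaaxaaaxaaaxaaax
pattern = "b" + "a" * (n - 1)            # baaa
print(count_comparisons(pattern, text, use_good_suffix=False))  # n + k
print(count_comparisons(pattern, text, use_good_suffix=True))   # k * n
```

With bad character alone, the first alignment costs n comparisons and shifts by 1, after which every alignment ends in x and fails on the first comparison. With both rules, the good suffix shift of n keeps landing the pattern where a^(n-1) matches again before the mismatch.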

12 Problem 4
Hard problem
All answers had mistakes or were very vague about how to update the mapping as we changed the starting point of our Z-box
Consider the following example

13 Example
Parameters: a, b, c, d, e
Tokens: X
P = aXXabXXbaX
T = ecXXcdXXdeXXedX
Z values for P
– 1 2 3 4 5 6 7 8 9 0
– a X X a b X X b a X
– - 0 0 1 6 0 0 1 2 0
(copy to board)

14 Example continued
ecXXcdXXdeXXedX
 aXXabXXbaX
08
– a maps to c
– b maps to d
001

15 Example continued
ecXXcdXXdeXXedX
 aXXabXXbaX
08001
     aXXabXXbaX
– From P, we have 6 for the next entry, which extends beyond the Z-box window of 4
– By problem 3, this would be just 4, but the right answer is 10
– Now the mapping is a to d, b to e, and we need to do this WITHOUT going backwards and rechecking previously checked positions. How?

16 Offset Array
Offset array for P
– aXXabXXbaX
– 1234567890
– 0003000350
Offset array for T
– ecXXcdXXdeXXedX
– 123456789012345
– 000030003900350
Matching
– Match if both offsets are (0 or to left of current Z-box)
– Else match if both offsets are identical

17 Example with offsets
ecXXcdXXdeXXedX
 aXXabXXbaX
08001
     aXXabXXbaX
– Offset for e is 9, which is outside the Z-box
– Offset for b is 0
– Offset for d is 5
– Offset for a is 5
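The offset arrays and the matching rule can be tried out with the sketch below. It is a naive check that rescans characters, so it only illustrates the rule and is not the linear-time solution the problem asks for. The names `offset_array` and `p_match_length` are hypothetical, indices in the code are 0-based, and an offset that points to the left of the window being compared is treated as 0.

```python
def offset_array(s, tokens):
    """Distance back to the previous occurrence of each parameter; 0 for tokens and first uses."""
    last = {}
    offsets = []
    for i, ch in enumerate(s):
        if ch in tokens or ch not in last:
            offsets.append(0)
        else:
            offsets.append(i - last[ch])
        if ch not in tokens:
            last[ch] = i
    return offsets


def p_match_length(p, t, start, tokens):
    """Length of the longest prefix of p that parameterized-matches t[start:] (naive scan)."""
    off_p = offset_array(p, tokens)
    off_t = offset_array(t, tokens)
    length = 0
    while length < len(p) and start + length < len(t):
        cp, ct = p[length], t[start + length]
        if (cp in tokens) != (ct in tokens):
            break                      # a token never matches a parameter
        if cp in tokens:
            if cp != ct:
                break                  # tokens must match exactly
        else:
            # An offset reaching back past the start of the window counts as 0.
            op = off_p[length] if off_p[length] <= length else 0
            ot = off_t[start + length] if off_t[start + length] <= length else 0
            if op != ot:
                break
        length += 1
    return length


P = "aXXabXXbaX"
T = "ecXXcdXXdeXXedX"
TOKENS = {"X"}
print(offset_array(P, TOKENS))
print(offset_array(T, TOKENS))
print(p_match_length(P, T, 1, TOKENS))  # P against T starting at position 2
print(p_match_length(P, T, 5, TOKENS))  # P against T starting at position 6
```

Starting at position 6 of T, the e at position 10 has offset 9, which reaches back before the window and so counts as 0, matching the offset 0 of b in P. That is why the match runs to length 10 instead of stopping at 4.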

