
HW 1 solution comments


bian

Presentation Transcript


  1. HW 1 solution comments • Superstring question from last week (Patchrawat’s solution) • Strings: aa(ba)^n, (ba)^n ba, a(ba)^n bb • Comparison of superstring lengths: 2n+6 versus 4n+5, a ratio that approaches 2 as n grows
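The slide does not say how each length arises. As a sanity check, the following Python sketch builds the three strings and merges them in two different orders. The helper names (`overlap`, `merge`, `family`) and the choice of merge orders are illustrative assumptions, not taken from the homework solution. Merging the strings in the order listed gives length 2n+6; merging the first and third strings before the second gives 4n+5.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a, b):
    """Append b to a, sharing the longest suffix/prefix overlap."""
    return a + b[overlap(a, b):]


def family(n):
    """The three strings from the slide for a given n."""
    return "aa" + "ba" * n, "ba" * n + "ba", "a" + "ba" * n + "bb"


def short_superstring(n):
    s1, s2, s3 = family(n)
    return merge(merge(s1, s2), s3)   # order: first, second, third


def long_superstring(n):
    s1, s2, s3 = family(n)
    return merge(merge(s1, s3), s2)   # order: first, third, second


if __name__ == "__main__":
    n = 3
    print(len(short_superstring(n)), len(long_superstring(n)))  # 12 17
```

Both results contain all three strings, so both are valid superstrings; only the merge order differs.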

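  2. Problem 1: Tandem Arrays • Two comments about overlap examples • Definition: a tandem array is more than one consecutive occurrence of the pattern b • Overlap: tandem arrays may overlap one another • b = aaa • String: aaaaaaaaaaa • Array1: aaaaaaaaa (positions 1-9) • Array2: aaaaaaaaa (positions 2-10) • Array3: aaaaaaaaa (positions 3-11)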

  3. Tandem Arrays continued • Computing efficiently • Run the Z algorithm on b$S • Now we have an array of Z-values • Occurrences of b are marked by values equal to n, where n = |b| • Z-values for the previous example • 3 3 3 3 3 3 3 3 3 2 1 • Process right to left • When I find a value of at least n, check the entry n positions to the left • If it has value n, add my value to it • If not, and my value is > n, output my location and my value divided by n
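The right-to-left pass above can be sketched in Python as follows. This is a minimal sketch, not the reference solution: the function names `z_array` and `tandem_arrays` are made up for illustration, and the code assumes the separator `$` occurs in neither b nor S. Locations are reported 1-based, as on the slides.

```python
def z_array(s):
    """z[i] = length of the longest substring starting at i that matches a prefix of s."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0              # current Z-box is s[left:right]
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def tandem_arrays(b, s, sep="$"):
    """Return (1-based start, number of copies) for each maximal tandem array of b in s."""
    n = len(b)
    z = z_array(b + sep + s)[n + 1:]              # Z-values for the S part only
    run = [v if v == n else 0 for v in z]         # keep only full occurrences of b
    found = []
    for i in range(len(run) - 1, -1, -1):         # process right to left
        if run[i] < n:
            continue
        if i - n >= 0 and run[i - n] == n:
            run[i - n] += run[i]                  # extend the occurrence n to the left
        elif run[i] > n:
            found.append((i + 1, run[i] // n))    # at least two consecutive copies
    return sorted(found)


if __name__ == "__main__":
    print(tandem_arrays("aaa", "a" * 11))   # [(1, 3), (2, 3), (3, 3)]
```

For b = aaa and the 11-character string of a's, the sketch reports three overlapping arrays of three copies each, matching Array1 through Array3 in the overlap example.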

  4. Problems 2 and 3 • 2: everyone did well • 3: Most did fine, but I wanted a more precise answer in some cases

  5. Problem 3 • (Figure: Z-box diagram with labels a, a, k’, k, r) • Case Z_k’ > |b| needs no comparisons • P(r+1) != P(|a|+1), or else the current Z-box would be larger • P(|a|+1) = P(|b|+1) since Z_k’ > |b| • Therefore P(r+1) != P(|b|+1) and Z_k = |b|

  6. Problem 5 • (Figure: FSA fragment with labels “GG,1”, “GA,2”, and “A, output GGA to frame 1”) • Complaint: many answers had 3n character assignments and essentially read the characters 3n times in total • Better answer: FSA approach

  7. Problems 6 and 8 • Most submitted programs were ok • I tried to write comments somewhere on your assignments if there were any bugs • In the future, provide • README file or makefile • clear input instructions • let me input test cases so I can try simple values

  8. Problems 9 and 10 • 9: Please submit using handin so that I can more easily use it to test any programs • Should be fairly comprehensive • 10: A few people only wrote down some comments and maybe an example • Empirical means experimental • Design a set of tests with inputs of some type • Characterize your input set • Give me summarized statistical data on how the various algorithms did

  9. Problem 7 • Key idea • While one shift with just the bad character rule may be worse than one shift with the max of the bad character and good suffix rules, future shifts may pay off • A couple of people had correct solutions where bad character alone was better, but I would like you to push it a little to see how much better it can be • Example • Text: a(a^(n-1)x)^k • Pattern: ba^(n-1) • n+k comparisons (bad character alone) versus kn comparisons (both rules)

  10. Bad character example • n=4, k=4 • aaaaxaaaxaaaxaaax • baaa • baaa • baaa • baaa • baaa

  11. Same example with both rules • n=4, k=4 • aaaaxaaaxaaaxaaax • baaa • baaa • baaa • baaa • baaa
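The comparison counts for this example can be checked with a short Python sketch. It rests on assumptions the slides do not state: the simple bad character rule (rightmost occurrence of the text character in the whole pattern, with a minimum shift of 1) and the strong good suffix rule computed by the standard border-based preprocessing. The function names are illustrative.

```python
def good_suffix_table(p):
    """shift[j + 1] is the good suffix shift after a mismatch at pattern index j;
    shift[0] is the shift after a full match (strong good suffix rule)."""
    m = len(p)
    shift = [0] * (m + 1)
    border = [0] * (m + 1)
    i, j = m, m + 1
    border[i] = j
    while i > 0:
        while j <= m and p[i - 1] != p[j - 1]:
            if shift[j] == 0:
                shift[j] = j - i
            j = border[j]
        i -= 1
        j -= 1
        border[i] = j
    j = border[0]
    for i in range(m + 1):
        if shift[i] == 0:
            shift[i] = j
        if i == j:
            j = border[j]
    return shift


def count_comparisons(text, p, use_good_suffix):
    """Count character comparisons made by a right-to-left scan of each alignment."""
    m = len(p)
    last = {ch: i for i, ch in enumerate(p)}      # rightmost position of each character
    gs = good_suffix_table(p)
    s = comparisons = 0
    while s + m <= len(text):
        j = m - 1
        while j >= 0:
            comparisons += 1
            if p[j] != text[s + j]:
                break
            j -= 1
        if j < 0:                                  # full match
            s += gs[0] if use_good_suffix else 1
        else:
            bc = max(1, j - last.get(text[s + j], -1))
            s += max(bc, gs[j + 1]) if use_good_suffix else bc
    return comparisons


if __name__ == "__main__":
    n, k = 4, 4
    text = "a" + ("a" * (n - 1) + "x") * k         # aaaaxaaaxaaaxaaax
    pattern = "b" + "a" * (n - 1)                  # baaa
    print(count_comparisons(text, pattern, False))  # 8  = n + k
    print(count_comparisons(text, pattern, True))   # 16 = k * n
```

With the bad character rule alone, the first alignment costs n comparisons and every later alignment fails immediately on x. With both rules, the good suffix shift of n keeps landing the pattern where its last n-1 characters match, so each alignment costs n comparisons.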

  12. Problem 4 • Hard problem • All answers had mistakes or were very vague about how to update the mapping as we changed the starting point of our z-box • Consider the following example

  13. Example • Parameters: a, b, c, d, e • Tokens: X • P = aXXabXXbaX • T = ecXXcdXXdeXXedX • Z values for P • 1 2 3 4 5 6 7 8 9 0 • a X X a b X X b a X • - 0 0 1 6 0 0 1 2 0 • copy to board

  14. Example continued • T: ecXXcdXXdeXXedX • P: aXXabXXbaX • Z-values so far: 0 8 • a maps to c • b maps to d • Next Z-values: 0 0 1

  15. Example continued • T: ecXXcdXXdeXXedX • P: aXXabXXbaX • Z-values so far: 0 8 0 0 1 • P realigned: aXXabXXbaX • From P, we have 6 for the next entry, which extends beyond the Z-box window of 4 • By problem 3, this would be just 4, but the right answer is 10 • Now the mapping is a to d, b to e, and we need to do this WITHOUT going backwards and rechecking previously checked positions. How?

  16. Offset Array • Offset array for P • aXXabXXbaX • 1234567890 • 0003000350 • Offset array for T • ecXXcdXXdeXXedX • 123456789012345 • 000030003900350 • Matching • Match if each offset is either 0 or points to the left of the current Z-box • Else match if both offsets are identical

  17. Example with offsets • T: ecXXcdXXdeXXedX • P: aXXabXXbaX • Z-values so far: 0 8 0 0 1 • P realigned: aXXabXXbaX • Offset for e is 9, which is outside the Z-box • Offset for b is 0 • Offset for d is 5 • Offset for a is 5
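The offset arrays and the matching rule can be checked with a small Python sketch. It is a naive character-by-character comparison, not the linear-time Z algorithm the problem asks for, and it assumes that lowercase letters are parameters and everything else is a token. The names `offsets` and `pmatch_length` are hypothetical, and start positions in the code are 0-based.

```python
def offsets(s, is_param=str.islower):
    """For each parameter, the distance back to its previous occurrence (0 if none).
    Tokens always get 0."""
    last_seen = {}
    out = []
    for i, ch in enumerate(s):
        if is_param(ch):
            out.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            out.append(0)
    return out


def pmatch_length(p, t, start, is_param=str.islower):
    """Length of the longest prefix of p that parameter-matches t beginning at start."""
    op, ot = offsets(p, is_param), offsets(t, is_param)
    k = 0
    while k < len(p) and start + k < len(t):
        a, b = p[k], t[start + k]
        if is_param(a) != is_param(b):
            break                                  # parameter against token
        if not is_param(a):
            if a != b:
                break                              # tokens must be identical
        else:
            # An offset pointing to the left of the window counts as 0.
            ea = op[k] if op[k] <= k else 0
            eb = ot[start + k] if ot[start + k] <= k else 0
            if ea != eb:
                break
        k += 1
    return k


if __name__ == "__main__":
    P, T = "aXXabXXbaX", "ecXXcdXXdeXXedX"
    print("".join(map(str, offsets(P))))   # 0003000350
    print("".join(map(str, offsets(T))))   # 000030003900350
    print(pmatch_length(P, T, 1))          # 8  (T position 2)
    print(pmatch_length(P, T, 5))          # 10 (T position 6)
```

At T position 6 the offset 9 for e points outside the window and is treated as 0, which pairs it with b; the offsets of 5 for d and a agree, so the match runs the full length of 10. Under this rule a lone parameter always matches a lone parameter, so the sketch gives 1 rather than 0 at T position 1; the leading 0 on the slide may follow a different convention.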
