
Estimating the longest increasing sequence in polylogarithmic time






Presentation Transcript


    1. Estimating the longest increasing sequence in polylogarithmic time. Michael Saks (Rutgers University) and C. Seshadhri (Sandia National Labs).

    2. The problem. Given an array f: [n] → N, find the (length of the) Longest Increasing Subsequence (LIS). A textbook dynamic programming problem: [CLRS 01] Chapter 15.4 (Longest Common Subsequence), starred Problem 15.4-6. [Schensted, Fredman]: an O(n log n) algorithm. Example array from the slide: 4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3, 4, 10.
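
    For reference, a minimal sketch of the classical O(n log n) patience-sorting idea behind the [Schensted, Fredman] bound (the helper name and the use of the slide's example array are my additions):

        import bisect

        def lis_length(f):
            # tails[k] = smallest possible last value of a strictly increasing
            # subsequence of length k+1 found so far
            tails = []
            for v in f:
                k = bisect.bisect_left(tails, v)   # first position with tails[k] >= v
                if k == len(tails):
                    tails.append(v)                # v extends the longest subsequence
                else:
                    tails[k] = v                   # v improves an existing tail
            return len(tails)

        lis_length([4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3, 4, 10])   # -> 6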

    3. A partial list of references: [Schensted 61] [Fredman 75] [Apostolico Guerra 87] [Altschul et al 90] [Ramanan 97] [Goldreich Goldwasser Ron 97] [Baik Deift Johansson 99] [Delcher et al 99] [Dodis et al 99] [Aldous Diaconis 99] [Ergun et al 99] [Bespamyatnikh Segal 00] [Fischer 01] [Liben-Nowell Vee Zhu 03] [Zhang 03] [Ailon et al 03] [Parnas Ron Rubinfeld 03] [Gal Gopalan 07] [Gopalan et al 07] [Sun Woodruff 07] [Ergun Jowhari 08]

    4. Massive data sets. Array f is extremely large. The algorithm should run in time at most polylog(n); in particular, it reads only polylog(n) locations. How well can we approximate |LIS|?

    5. Massive data sets. Array f is extremely large, so we don't want to read all of it. What can we say about the LIS length if we see very little? (|LIS| = LIS length.) Read only poly(log n) positions. The algorithm must obviously be randomized.

    6. Uniform sampling in action. Choose a uniform random sample of polylog(n) size. Here |LIS| = n/2, but the random sample is (almost) always monotonically increasing. There are similar examples where |LIS| = o(n) but the sample is always increasing.
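
    One standard construction illustrating this (my own sketch, not necessarily the slide's figure): swap adjacent pairs, so |LIS| = n/2, yet a small uniform sample almost never catches both elements of a swapped pair and therefore looks sorted:

        import random

        def swapped_pairs(n):
            # 2, 1, 4, 3, 6, 5, ... : |LIS| = n/2, since each swapped
            # adjacent pair contributes at most one element
            f = list(range(1, n + 1))
            for i in range(0, n - 1, 2):
                f[i], f[i + 1] = f[i + 1], f[i]
            return f

        n = 10**6
        f = swapped_pairs(n)
        sample = sorted(random.sample(range(n), 200))   # polylog-size sample
        values = [f[i] for i in sample]
        print(values == sorted(values))   # almost always True: the sample looks increasing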

    7. Our result. For any (constant) δ > 0, the algorithm gives an additive δn approximation to |LIS|. The running time is C(δ)(log n)^c, where C(δ) = 2^(O(1/δ)).

    8. Our result. For any (constant) δ > 0, the algorithm outputs an interval of width δn that almost surely contains |LIS|, i.e., an additive δn approximation. The running time is C(δ)(log n)^c, where C(δ) = 2^(O(1/δ)). Previously this was known only for δ >= ½ [Ailon Chazelle Comandur Liu 03] [Parnas Ron Rubinfeld 03].

    9. Plan for talk. Sketch the additive n/2 approximation of [PRR] and [ACCL]; obstacles to improvement; the two main algorithmic ideas: finding good splitters and boosting approximation quality; sketch of the algorithm.

    10. Prelims: the array in space. Index x maps to the point P(x) = (x, f(x)).

    11. Prelims: the array in space. The input array is viewed as a set of points in the plane; an increasing sequence is a set of points going up and to the right.

    12. The use of randomness. Find the fraction of green points. Randomized, in constant time [Chernoff-Hoeffding]: estimate to within a γ fraction with probability 1 - α using O(log(1/α)/γ²) samples. Used in many places in the algorithm.
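
    A minimal sketch of this sampling primitive (function names are mine):

        import math, random

        def estimate_fraction(is_green, n, gamma, alpha):
            # Hoeffding: s >= ln(2/alpha) / (2 gamma^2) samples suffice to estimate
            # the true fraction to within +/- gamma, except with probability alpha
            s = math.ceil(math.log(2 / alpha) / (2 * gamma ** 2))
            hits = sum(is_green(random.randrange(n)) for _ in range(s))
            return hits / s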

    13. Main algorithmic component. Classification algorithm: takes as input a single index and outputs good or bad.

    14. The algorithm Classify. Classifies an index as good or bad. Good indices form an increasing sequence; at most (n - |LIS| + δn) indices are bad. Classification of a single index runs very fast.

    15. Classification → Estimation. Given the classification algorithm: choose a random sample of indices and run Classify on each; output the fraction of indices that are classified as good.
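
    A hedged sketch of this reduction, reusing estimate_fraction from the slide 12 sketch (the classify signature is my placeholder):

        def estimate_lis(f, classify, gamma, alpha):
            # classify(f, x) -> "good" or "bad". Good indices form an increasing
            # sequence and at most n - |LIS| + delta*n indices are bad, so
            # n * (fraction good) estimates |LIS| to within additive delta*n
            # (plus the gamma*n sampling error)
            n = len(f)
            frac = estimate_fraction(lambda x: classify(f, x) == "good", n, gamma, alpha)
            return frac * n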

    16. Ensuring that Good is increasing. To ensure that the good indices form an increasing sequence, we must have: for any violation P(x), P(z), at least one of x, z is classified bad.

    17. The violation counting trick [Ergun Kannan Kumar Rubinfeld Viswanathan 99]. If (x, z) is a violation, every index y in [x, z] is in violation with x or with z: either f(y) < f(x), so (x, y) is a violation, or f(y) >= f(x) > f(z), so (y, z) is a violation. So at least half the points in [x, z] are in violation with x, or at least half are in violation with z.

    18. The generic algorithm.

    19. The generic algorithm.

    20. The algorithm of [PRR], [ACCL]. Study samples in "neighborhoods" of P(x). If more than a (½ - ε) fraction of any neighborhood is in violation with x, classify x as bad; else good. Neighborhoods have lengths (1 + ε)^k, so there are O(log n) neighborhoods.
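
    A rough sketch of a [PRR]/[ACCL]-style classifier (parameters, sample sizes, and the power-of-2 scales are my simplifications of the (1 + ε)^k neighborhoods):

        import random

        def violates(f, x, y):
            # (x, y) is a violation if the earlier index holds the larger value
            return f[min(x, y)] > f[max(x, y)]

        def classify_neighborhood(f, x, eps=0.05, samples=200):
            # For each scale, sample indices within distance w of x; if more
            # than a (1/2 - eps) fraction of any neighborhood is in violation
            # with x, declare x bad.
            n = len(f)
            w = 2
            while w < 2 * n:                       # O(log n) neighborhood scales
                ys = [random.randint(max(0, x - w), min(n - 1, x + w))
                      for _ in range(samples)]
                ys = [y for y in ys if y != x]
                if ys and sum(violates(f, x, y) for y in ys) / len(ys) > 0.5 - eps:
                    return "bad"
                w *= 2
            return "good"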

    21. Can we improve the analysis to show approximation error below n/2? No: in this example the LIS has size n/2, but Classify will declare every index to be bad.

    22. Structure of the sample used in Classify(x). Neighborhood sample: the density of the sample decreases exponentially with distance from x. Can we use neighborhood samples more effectively to classify x?

    23. Limitation of neighborhood samples. One can construct two input arrays A and B, and an interval J of n/6 indices, such that: for array A, Classify(x) must return bad for almost all x in J, or else the approximation has error at least n/6; for each x in J, the neighborhood samples of x in A look the same as the neighborhood samples of x in B, so for B, Classify(x) also returns bad for almost all x in J; but in array B, excluding almost all x in J results in approximation error at least n/6.

    24. Back to the drawing board.

    25. A dynamic program. Splitter: a point (i, j) that is consistent with the LIS. Find the LIS in each blue region and piece them together.

    26. A dynamic program. But we don't know the right splitter, so try all n possible splitters and choose the one that gives the largest sum of LISs: max over S of (|LIS-below-S| + |LIS-above-S|).

    27. The dynamic program. The LIS in all small boxes gives the LIS for bigger boxes; essentially a special case of Savitch's algorithm. The exact version is not so efficient. Is this approach relevant for a fast approximation algorithm?
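
    To make the recursion concrete, an exact but deliberately inefficient rendering of the splitter DP as I read slides 25-27 (the representation and helper names are mine):

        def lis_splitter(points):
            # points: (index, value) pairs sorted by index.
            # A splitter at the middle index with value cutoff v splits the box
            # into a lower-left and an upper-right subbox; any LIS is consistent
            # with some such cutoff, so the max over v is exact.
            if len(points) <= 1:
                return len(points)
            mid = points[len(points) // 2][0]
            best = 0
            for v in sorted({fx for _, fx in points}) + [float("inf")]:
                below = [(x, fx) for x, fx in points if x < mid and fx < v]
                above = [(x, fx) for x, fx in points if x >= mid and fx >= v]
                best = max(best, lis_splitter(below) + lis_splitter(above))
            return best

        lis_splitter(list(enumerate([4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3, 4, 10])))  # -> 6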

    28. Classification via splitter finding. Search for a splitter: guess the splitter (somehow!) and recurse on the subbox containing P(x).

    29. How do we guess a splitter?

    30. Sufficient: an approximate splitter. The number of LIS points lost (violations with the splitter) is < µn.

    31. An approximate splitter.

    32. An approximate splitter. We lose a µ-fraction of the LIS points at each level of recursion, so the total loss is µn log n. Set µ = 1/(100 log n); then the total loss is n/100 (1% of the points).

    33. How do we find approximate splitters?

    34. µ-conservative splitters. A µ-conservative splitter is (trivially) an approximate splitter, and µ-conservative splitters are easily found by random sampling.

    35. Getting a conservative splitter. We can sample O(log n) different candidates and check all of them. You might still miss a conservative splitter... and what if no conservative splitter exists?
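
    A hedged sketch of this candidate-sampling search (the middle-third window, sample sizes, and trial counts are my placeholders; violates is from the slide 20 sketch):

        import math, random

        def violation_fraction(f, s, samples=400):
            # estimate the fraction of indices in violation with index s
            n = len(f)
            ys = [y for y in (random.randrange(n) for _ in range(samples)) if y != s]
            return sum(violates(f, s, y) for y in ys) / max(1, len(ys))

        def find_conservative_splitter(f, mu):
            # Sample O(log n) candidate splitters near the middle and return one
            # whose estimated violation fraction is below mu; if every candidate
            # fails, report None (suggesting |LIS| < (1 - mu) n, up to sampling error).
            n = len(f)
            for _ in range(max(1, int(math.log2(n)))):
                s = random.randrange(n // 3, 2 * n // 3 + 1)   # candidate near n/2
                if violation_fraction(f, s) < mu:
                    return s
            return None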

    36. No conservative splitters. Every point is in violation with at least µn points, so there is no conservative splitter.

    37. If there is no conservative splitter... then we know that |LIS| < (1 - µ)n. This leads to the next idea: boosting approximations.

    38. Boosting approximations. Given: an additive δn-approximation algorithm. Want: a δ′n-approximation algorithm (δ′ < δ).

    39. Boosting approximations. Take the sum of the two outputs as the total LIS estimate: |LIS| = |LIS1| + |LIS2| and Est = Est1 + Est2, with |Est1 - LIS1| < δn1 and |Est2 - LIS2| < δn2. So |Est - LIS| < δ(n1 + n2); since n1 + n2 < (1 - µ)n, we get |Est - LIS| < δ(1 - µ)n.

    40. Boosting approximations. Assume we know the true splitter S, but S is not a conservative splitter. Then n1 + n2 < (1 - µ)n, |LIS| = |LIS1| + |LIS2|, |Est1 - LIS1| < δn1, and |Est2 - LIS2| < δn2. So |Est - LIS| < δ(1 - µ)n: reduced relative error!

    41. 41 We don’t know the best splitter Assume no conservative splitter Try O(log n) random splitters; one is “close enough” to best For each S, Est(S) = Est1(S) + Est2(S) Est = max (Est(S):S) is a d(1-µ) approximation to LIS Only polylog(n) calls to d - approximation algorithm

    42. The hoped-for dichotomy.

    43. The DP revisited. Try a splitter S. If S is not µ-conservative, then at least µn points are excluded by S. Add the LIS estimates in each box to get the overall LIS estimate, and take the max over a sample of S's. This gives a δ(1 - µ)-approximation.

    44. A generalization. Suppose every "chain" has at most (1 - µ)n points.

    45. A generalization. Suppose every "chain" has at most (1 - µ)n points. Find the chain with the largest sum of estimates; we get a δ(1 - µ)-approximation. But there are more than poly(n) chains!

    46. Use dynamic programming. Run the δ-approximation on all poly(log n) such boxes, then use a dynamic program to find the chain with the largest sum of estimates. This is a longest-path problem in a DAG, solvable in poly(log n) time.
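
    A generic sketch of that step (the graph encoding of boxes and their compatibility is my placeholder):

        def max_weight_chain(weights, edges):
            # weights: box -> LIS estimate; edges: (u, v) pairs meaning box v can
            # follow box u in a chain. Longest path in a DAG by dynamic
            # programming in topological (Kahn) order.
            succ = {u: [] for u in weights}
            indeg = {u: 0 for u in weights}
            for u, v in edges:
                succ[u].append(v)
                indeg[v] += 1
            order = [u for u in weights if indeg[u] == 0]
            best = dict(weights)              # best chain weight ending at each box
            for u in order:                   # 'order' grows while we iterate
                for v in succ[u]:
                    best[v] = max(best[v], best[u] + weights[v])
                    indeg[v] -= 1
                    if indeg[v] == 0:
                        order.append(v)
            return max(best.values(), default=0)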

    47. Suppose every "chain" has at most (1 - µ)s points. Then in poly(log n) time, with poly(log n) calls to the δ-approximation, we get a δ(1 - µ)-approximation.

    48. µ-improved splitters. With µ-conservative splitters we can only afford µ = O(1/log n). A more sensitive condition: for every interval I around n/2, there are no more than µ|I| + O(n/log n) violations with S.

    49. µ-improved splitters. The recurrence for Classify works provided we have µ-improved splitters with µ at most some small constant (we needed µ = O(1/log n) before).

    50. The dichotomy, again. If we are unable to find a µ-improved splitter in this box, build a grid in the box. Dichotomy Lemma: if there is no µ-improved splitter, then no chain has more than (1 - µ)n + n/log n points. We can then use boosting to find a δ(1 - µ)-approximation for the LIS in the box.

    51. Algorithm Classify in one slide. We get a δ(1 - µ)-approximation; the overall running time becomes (log n)^(1/δ).

    52. The even better version. Don't solve this dynamic program exactly! Use our sublinear algorithm to approximately solve it in (log log n) time, then apply it recursively. This sounds like a horrendous mess, but... the recursion doesn't actually appear in the implementation: it is accomplished implicitly by dynamically adjusting various parameters in the basic algorithm. The running time is C(δ)(log n)^c.

    53. Final remarks. We get C(δ)(log n)^c time, where C(δ) is (at least) K^(1/δ). Can we get (log n)/δ time? Recent related work (Saks-Seshadhri): a multiplicative (1 + δ) approximation to n - |LIS| in the streaming model with space O(log(n)/δ) (best previous: a factor 2 approximation [Ergun Jowhari 08]). Other dynamic programs?
