
On the Range Maximum-Sum Segment Query Problem

On the Range Maximum-Sum Segment Query Problem. Kuan-Yu Chen and Kun-Mao Chao, Department of Computer Science and Information Engineering, National Taiwan University, Taiwan. The Maximum-Sum Segment: also called the maximum-sum interval or the maximum-scoring region.


Presentation Transcript


  1. On the Range Maximum-Sum Segment Query Problem • Kuan-Yu Chen and Kun-Mao Chao • Department of Computer Science and Information Engineering, National Taiwan University, Taiwan

  2. The Maximum-Sum Segment • Also called the maximum-sum interval or the maximum-scoring region. • Given a sequence of numbers, the maximum-sum segment is the contiguous subsequence with the greatest total sum. • Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> has greatest total sum 8. • Zero prefix-/suffix-sums are possible: here <1, 3, -4, 2, 3, -4, 7> and <2, 3, -4, 7> both sum to 8.
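The definition on this slide can be checked with a short linear scan. The sketch below uses Kadane's algorithm, a standard technique that is not named in the slides; the function name `max_sum_segment` and the 0-based indexing are assumptions made for illustration.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end), 0-based inclusive, for a non-empty list.

    Kadane's scan: a standard O(n) method, used here only to illustrate
    the definition (it is not the query structure from the talk).
    """
    best_sum, best_start, best_end = a[0], 0, 0
    cur_sum, cur_start = a[0], 0
    for k in range(1, len(a)):
        if cur_sum < 0:
            # A negative running sum can only hurt: restart at position k.
            cur_sum, cur_start = a[k], k
        else:
            # A zero or positive running sum is kept (zero-sum prefixes allowed).
            cur_sum += a[k]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, k
    return best_sum, best_start, best_end


seq = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
total, start, end = max_sum_segment(seq)
print(total, seq[start:end + 1])
```

Because the scan keeps a running sum of exactly zero, it reports the longer of the two optimal segments for the slide's example.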

  3. A Relevant Problem - RMQ • Range Minima (Maxima) Query Problem (also called Discrete Range Searching). • Given a sequence of numbers, we wish to preprocess it so that the minimum (maximum) value within any querying interval can be retrieved efficiently. • Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> [Figure labels: Minimum, Maximum]
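To make the RMQ interface concrete, here is a minimal sketch using a sparse table. This is a swapped-in, simpler technique: it needs O(n log n) preprocessing rather than the O(n) preprocessing assumed later in the talk, but it does answer each query in O(1). The names `build_sparse_table` and `range_min_index` are hypothetical.

```python
def build_sparse_table(values):
    """table[k][i] = index of the minimum of values[i : i + 2**k]."""
    n = len(values)
    table = [list(range(n))]          # blocks of length 1
    length = 1
    while 2 * length <= n:
        prev = table[-1]
        level = []
        for i in range(n - 2 * length + 1):
            left, right = prev[i], prev[i + length]
            level.append(left if values[left] <= values[right] else right)
        table.append(level)
        length *= 2
    return table


def range_min_index(values, table, lo, hi):
    """Index of the minimum of values[lo..hi] (0-based, inclusive)."""
    k = (hi - lo + 1).bit_length() - 1   # largest power of two fitting the range
    left = table[k][lo]
    right = table[k][hi - (1 << k) + 1]  # second block, may overlap the first
    return left if values[left] <= values[right] else right


values = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
table = build_sparse_table(values)
print(range_min_index(values, table, 2, 6))
```

A range-maximum variant follows by flipping the two comparisons.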

  4. Range Maximum-Sum Segment Query Problem • Definition: The input is a sequence <a1, a2, …, an> of real numbers, which is to be preprocessed. • A query consists of two intervals S and E. • The goal is to return the maximum-sum segment whose starting index lies in S and whose end index lies in E.
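As a reference point for the definition, a brute-force query can simply try every admissible pair of endpoints. This is only a sketch for checking answers, not the algorithm of the talk. The function name `rmsq_brute_force` is hypothetical, and the regions S = [4, 6], E = [8, 10] in the usage line are assumed for illustration, since the regions drawn on the slides are not recoverable from the transcript.

```python
def rmsq_brute_force(a, S, E):
    """Return (sum, i, j) for the best segment a[i..j] with i in S, j in E.

    Indices are 1-based and inclusive, as on the slides.
    S and E are (low, high) pairs. Runs in O(|S| * |E|) per query.
    """
    n = len(a)
    prefix = [0] * (n + 1)
    for k in range(1, n + 1):
        prefix[k] = prefix[k - 1] + a[k - 1]

    best = None
    for i in range(S[0], S[1] + 1):
        # A segment must end no earlier than it starts.
        for j in range(max(i, E[0]), E[1] + 1):
            total = prefix[j] - prefix[i - 1]
            if best is None or total > best[0]:
                best = (total, i, j)
    return best


seq = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_brute_force(seq, (4, 6), (8, 10)))
```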

  5. A Nonoverlapping Example • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 6 [Figure labels: Starting region, End region]

  6. An Overlapping Example • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 8 [Figure labels: Starting region, End region]

  7. Our Results • We propose an algorithm that runs in O(n) preprocessing time and O(1) query time under the unit-cost RAM model. • We show that the RMSQ techniques yield alternative O(n)-time algorithms for the following problems: • The maximum-sum segment with length constraints • All maximal-sum segments

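  8. Strategy • Reduce the RMSQ problem to the RMQ problem. • Theorem. If there is an <f(n), g(n)>-time solution for the RMQ problem (f(n) preprocessing time, g(n) query time), then there is an <f(n)+O(n), g(n)+O(1)>-time solution for the RMSQ problem. [Figure: RMSQ reduced to RMQ, with O(n) additional preprocessing and O(1) additional query time]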

  9. Cumulative Sum / Prefix Sum • prefix-sum(i) = a1 + a2 + … + ai

  10. Computing sum(i, j) in O(1) Time • prefix-sum(i) = a1 + a2 + … + ai, with prefix-sum(0) = 0. • All n prefix sums are computable in O(n) time. • sum(i, j) = prefix-sum(j) - prefix-sum(i-1) [Figure labels: i, j, prefix-sum(j), prefix-sum(i-1)]
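The prefix-sum identity translates directly into code. In this sketch the helper names `build_prefix_sums` and `segment_sum` are assumptions, and the array carries an extra entry prefix[0] = 0 so that the slides' 1-based indices can be used unchanged.

```python
from itertools import accumulate


def build_prefix_sums(a):
    """prefix[i] = a1 + ... + ai (1-based), with prefix[0] = 0. O(n) time."""
    return [0] + list(accumulate(a))


def segment_sum(prefix, i, j):
    """sum(i, j) = prefix-sum(j) - prefix-sum(i-1), in O(1) time."""
    return prefix[j] - prefix[i - 1]


seq = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
prefix = build_prefix_sums(seq)
print(prefix)
print(segment_sum(prefix, 5, 9))
```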

  11. Case 1: Nonoverlapping • sum(i, j) = prefix-sum(j) - prefix-sum(i-1): to maximize sum(i, j), maximize prefix-sum(j) and minimize prefix-sum(i-1). • Prefix-sum sequence of the input 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 [Figure labels: Range Minima Query; Find the highest point here; Find the lowest point here]
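When S and E do not overlap, the two optimizations are independent, which the following sketch makes explicit. Plain `min`/`max` scans stand in for the two range queries here; a real solution would answer them with an RMQ structure. The function name `rmsq_nonoverlapping` and the regions in the usage line are assumptions.

```python
def rmsq_nonoverlapping(a, S, E):
    """Best segment a[i..j] with i in S and j in E, for S entirely before E.

    1-based inclusive indices; S and E are (low, high) pairs.
    The linear scans below are placeholders for O(1) range min/max queries.
    """
    s_lo, s_hi = S
    e_lo, e_hi = E
    if s_hi > e_lo:
        raise ValueError("this sketch only handles nonoverlapping S and E")

    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)

    # Lowest point: minimize prefix-sum(i-1) over starting indices i in S.
    i = min(range(s_lo, s_hi + 1), key=lambda k: prefix[k - 1])
    # Highest point: maximize prefix-sum(j) over end indices j in E.
    j = max(range(e_lo, e_hi + 1), key=lambda k: prefix[k])
    return prefix[j] - prefix[i - 1], i, j


seq = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_nonoverlapping(seq, (4, 6), (8, 10)))
```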

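  12. Case 2: Overlapping • Some problems may occur: the highest point found may lie before the lowest point, so the two do not delimit a valid segment. • Prefix-sum sequence of the input 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 [Figure labels: Negative sum!; Find the highest point here; Find the lowest point here]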

  13. Case 2: Overlapping • Divide into 3 possible cases. • Prefix-sum sequence of the input 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 [Figure labels: Range Minima Query (preprocessing time = f(n), query time = g(n)), shown for two of the cases, each with "Find the lowest point here" and "Find the highest point here"; the remaining case is marked "What should we do?"]

  14. Dealing with the Special Case: Single Range Query • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 6 • Challenge: Can this special case be reduced to the RMQ problem?

  15. Reduction Procedure • Step 1. Find a partner for each index. • Step 2. Record the sum of each pair in an array. • Step 3. Retrieve the maximum-sum pair by applying the RMQ techniques.

  16. Our First Attempt (1) • Step 1: For each index i, define the lowest point preceding i as its partner. [Figure: prefix-sum sequence; labels: i, Lowest point, Find a partner within this region]

  17. Our First Attempt (2) • Step 2: Record sum(partner(i), i) in an array. [Figure labels: i, Lowest point, sum(partner(i), i)]

  18. Our First Attempt (3) • Step 3: Apply the RMQ techniques to the array. • By querying the interval on this sequence, the maximum-sum pair can be retrieved. [Figure labels: i, Lowest point, sum(partner(i), i), Querying this interval]

  19. Bump into Difficulties • What if a partner lies beyond the querying interval? • We might have to update every pair! [Figure labels: i, partner(i), sum(partner(i), i), Needs to be updated]

  20. A Better Partner • Left_bound(i): the nearest point to the left of i that is at least as high as i. • New partner(i): the lowest point between Left_bound(i) and i. [Figure: prefix-sum sequence; labels: i, Left_bound(i), New partner(i)]
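One plausible reading of this slide can be written out directly from the two definitions. The sketch below is quadratic and purely illustrative; it makes several assumptions that the transcript does not confirm: heights are prefix sums with prefix[0] = 0, Left_bound(i) defaults to 0 when no earlier point is at least as high, the lowest point is searched among prefix indices Left_bound(i) through i-1, ties go to the leftmost point, and partner(i) is reported as the starting index of the resulting segment.

```python
def compute_partners(a):
    """Return (prefix, left_bound, partner), all 1-based (entry 0 unused or 0).

    Illustrative O(n^2) version of the 'better partner' idea; the exact
    boundary conventions are assumptions, not taken from the slides.
    """
    n = len(a)
    prefix = [0] * (n + 1)
    for k in range(1, n + 1):
        prefix[k] = prefix[k - 1] + a[k - 1]

    left_bound = [0] * (n + 1)
    partner = [0] * (n + 1)
    for i in range(1, n + 1):
        # Walk left to the nearest point at least as high as point i.
        k = i - 1
        while k > 0 and prefix[k] < prefix[i]:
            k -= 1
        left_bound[i] = k
        # Lowest point in prefix[k .. i-1]; the segment starts right after it.
        lowest = min(range(k, i), key=lambda t: prefix[t])
        partner[i] = lowest + 1
    return prefix, left_bound, partner


seq = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
prefix, left_bound, partner = compute_partners(seq)
i = 9
print(left_bound[i], partner[i], prefix[i] - prefix[partner[i] - 1])
```

Under these assumptions, index 9 of the running example gets partner 3, so the recorded pair (3, 9) has sum 8.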

  21. Why Is It Better? (1) • It remains the best choice. • It saves many update steps. • It turns out that at most one point needs to be updated.

  22. Why Is It Better? (2) -- Remains the Best [Figure labels: i; Left_bound(i) (find the nearest higher point); partner(i) (find the lowest point); Impossible region]

  23. Why Is It Better? (3) -- Minimal-Maximal Property • Height(partner(i)) < Height(j) < Height(i), for all partner(i) < j < i. • Within this range, no point is higher than i and no point is lower than partner(i). [Figure labels: Next higher point, Maximal point (i), Minimal point (partner(i))]

  24. Why Is It Better? (4) -- Save Some Updates [Figure: prefix-sum sequence with a querying interval; labels: i, partner(i), Next higher point, No one higher than i, Cannot be the right end of the maximum-sum segment]

  25. Why Is It Better? (5) -- Nesting Property • For two indices i < j, it cannot be the case that partner(i) < partner(j) ≤ i < j. [Figure labels: i and j (maximal points), partner(i) and partner(j) (minimal points)]

  26. Why Is It Better? (6) -- An Example • No overlapping is allowed: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Nesting property: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3

  27. When a Query Comes -- Case 1: No Exceeding • Retrieve the maximum pair (partner(i), i). • The maximum pair lies in the querying interval. • We are done. Output (partner(i), i). [Figure labels: Querying interval, i, partner(i)]

  28. When a Query Comes -- Case 2: Exceeding • Retrieve the maximum pair: (partner(i), i) is the maximum pair, but it goes beyond the querying interval. • Update partner(i). • Retrieve the maximum pair again, giving (partner(j), j). • Compare (new_partner(i), i) and (partner(j), j). [Figure labels: Querying interval, i, j, Maximal, Minimal, partner(i), partner(j), Cannot be the right end of the maximum-sum segment, Nesting property]

  29. Time Complexity • RMSQ can be reduced to the RMQ problem in O(n) time. • Since there is an <O(n), O(1)>-time solution for the RMQ problem under the unit-cost RAM model, there is an <O(n), O(1)>-time solution for the RMSQ problem. • Conversely, RMQ can be reduced to the RMSQ problem in O(n) time. (Range Maxima Query: between every two adjacent elements, insert a negative number whose absolute value is larger than both of them.) [Figure: RMQ reduced to RMSQ, with O(n) preprocessing and O(1) query overhead]
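The reverse reduction can be tried out on small inputs. In the sketch below, `augment` inserts the separators described on the slide (the choice of max(|x|, |y|) + 1 as the separator magnitude is an assumption; any larger value works), and a brute-force routine stands in for a real RMSQ structure. All function names are hypothetical.

```python
def augment(values):
    """Insert, between adjacent elements, a negative number that outweighs both."""
    out = [values[0]]
    for left, right in zip(values, values[1:]):
        out.append(-(max(abs(left), abs(right)) + 1))
        out.append(right)
    return out


def rmsq_single_range(b, lo, hi):
    """Brute-force stand-in for an RMSQ query with S = E = [lo, hi] (0-based)."""
    best = None
    for i in range(lo, hi + 1):
        total = 0
        for j in range(i, hi + 1):
            total += b[j]
            if best is None or total > best[0]:
                best = (total, i, j)
    return best


def range_max_via_rmsq(values, lo, hi):
    """Maximum of values[lo..hi], answered by one RMSQ query on the augmented array."""
    b = augment(values)
    # Original element k sits at position 2k in the augmented array.
    total, i, j = rmsq_single_range(b, 2 * lo, 2 * hi)
    return total


values = [3, -1, 4, 1, -5, 9, 2, 6]
print(range_max_via_rmsq(values, 1, 4))
```

Any segment that spans more than one original element also absorbs a separator that outweighs its neighbours, so the best segment is always a single original element, namely the range maximum.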

  30. Use RMSQ Techniques to Solve Two Relevant Problems • 1. Finding the maximum-sum segment with length constraints in O(n) time (Y.-L. Lin, T. Jiang, K.-M. Chao, 2002; T.-H. Fan et al., 2003). • 2. Finding all maximal scoring subsequences in O(n) time (W. L. Ruzzo & M. Tompa, 1999).

  31. Problem 1: The Maximum-Sum Segment with Length Constraints • Lin, Jiang, and Chao [JCSS 2002] and Fan et al. [CIAA 2003] gave O(n)-time algorithms for this problem. • The segment must have length at least L and at most U. [Figure labels: L, U]

  32. Problem 1: Finding the Maximum-Sum Segment with Length Constraints • Length at least L, at most U. • For each index i, issue an RMSQ query for the maximum-sum segment whose starting point lies in [i-U+1, i-L+1] and whose end point is i. • Runs in O(n) time since each query costs O(1) time. [Figure labels: i, L, U, RMSQ query]
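The per-index query on this slide can be sketched without a full RMSQ structure. The version below swaps in a different, well-known technique: a monotonic deque that maintains the minimum prefix sum over the sliding window of allowed starting points. It follows the same outline (one constant-amortized-time lookup per end index) but is not the RMSQ-based method of the talk. The function name is hypothetical, and 1 ≤ L ≤ U is assumed.

```python
from collections import deque


def max_sum_segment_with_length(a, L, U):
    """Best (sum, i, j), 1-based inclusive, with L <= j - i + 1 <= U.

    Assumes 1 <= L <= U and len(a) >= L. Runs in O(n) total time.
    """
    n = len(a)
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)

    best = None
    window = deque()  # prefix indices i-1, kept with increasing prefix values
    for j in range(L, n + 1):
        k = j - L  # newest admissible value of i-1 for end index j
        while window and prefix[window[-1]] >= prefix[k]:
            window.pop()
        window.append(k)
        # Drop starting points that would make the segment longer than U.
        while window[0] < j - U:
            window.popleft()
        total = prefix[j] - prefix[window[0]]
        if best is None or total > best[0]:
            best = (total, window[0] + 1, j)
    return best


seq = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(max_sum_segment_with_length(seq, 2, 3))
```

For end index j, the window holds the values i-1 with i in [j-U+1, j-L+1], which is exactly the starting range given on the slide.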

  33. Problem 2: All Maximal-Sum Segments • Ruzzo and Tompa [ISMB 1999] gave an O(n)-time algorithm for this problem. • Recursive definition. [Figure labels: L(S), S, R(S)]

  34. Problem 2: Finding All Maximal Scoring Subsequences • Recursive calls. • Runs in O(n) time since each query costs O(1) time. [Figure: input sequence; labels: L(S), S, R(S), RMSQ query]
