
On the Range Maximum-Sum Segment Query Problem

Kuan-Yu Chen and Kun-Mao Chao, Department of Computer Science and Information Engineering, National Taiwan University, Taiwan, 2004/12.


Presentation Transcript


  1. On the Range Maximum-Sum Segment Query Problem • Kuan-Yu Chen and Kun-Mao Chao • Department of Computer Science and Information Engineering, National Taiwan University, Taiwan • 2004/12

  2. Outline • Motivation • Problems arising from bioinformatics applications • Definition of our research problem (RMSQ) • Our main idea • Finding a partner for each index • Reducing the problem to the Range Minima Query problem (RMQ) • Conclusions and applications • Solving three relevant problems in O(n) time

  3. Applications to biomolecular sequence analysis • Locating conserved regions or GC-rich regions • Assign a real number (called a score) to each residue • Look for the maximum-sum or maximum-average segments • Possibly with length constraints or a lower bound on the average

  4. What is a Maximum-Sum Segment? • Also called a maximum-sum interval or maximum scoring region • Given a sequence of numbers, the maximum-sum segment is the contiguous subsequence with the greatest total sum. • Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> has maximum total sum 8 • A segment with a zero-sum prefix or suffix is not allowed (so <2, 3, -4, 7> qualifies, whereas <1, 3, -4, 2, 3, -4, 7> starts with the zero-sum prefix <1, 3, -4>)
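The definition on slide 4 can be checked with a short linear scan. The sketch below is not taken from the slides: it is the standard running-sum scan (often called Kadane's algorithm), and the function name `max_sum_segment` and the 0-based index convention are assumptions made for illustration. Restarting whenever the running sum is zero or negative is what keeps a zero-sum prefix out of the reported segment.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) with 0-based inclusive indices.

    Assumes a is non-empty. Illustrative helper, not from the slides.
    """
    best_sum, best_start, best_end = a[0], 0, 0
    run_sum, run_start = 0, 0
    for idx, x in enumerate(a):
        if run_sum <= 0:
            # A non-positive running sum cannot help: restart here.
            # Restarting on zero also avoids a zero-sum prefix.
            run_sum, run_start = x, idx
        else:
            run_sum += x
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, idx
    return best_sum, best_start, best_end


SLIDE_4 = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
best_sum, start, end = max_sum_segment(SLIDE_4)
print(best_sum, SLIDE_4[start:end + 1])
```

On the slide's sequence this reports total sum 8 for the segment <2, 3, -4, 7>.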

  5. Finding the maximum-sum segment with length constraints • Lin, Jiang, and Chao [JCSS 2002] and Fan et al. [CIAA 2003] independently gave O(n)-time algorithms for this problem. • The segment length must be at least L and at most U

  6. Finding all maximal-sum segments • Ruzzo and Tompa [ISMB 1999] gave an O(n)-time algorithm for this problem. • Recursive calls (figure labels: L, R, S)

  7. Finding the longest segment with average constraints • Wang and Xu [Bioinformatics 2003] gave a linear-time algorithm for this problem

  8. Our results • We propose an algorithm for the RMSQ problem with O(n) preprocessing time and O(1) query time • We use the RMSQ techniques we developed to solve the three problems mentioned above in O(n) time

  9. Problem Definition • Range Maximum-Sum Segment Query problem • The input is a sequence <a1, a2, ..., an> of real numbers, which is to be preprocessed. • A query consists of two intervals [i, j] and [k, l]. The goal is to return the maximum-sum segment whose starting index lies in [i, j] and whose ending index lies in [k, l].
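To make the query semantics concrete, here is a brute-force reference that simply tries every admissible (start, end) pair. It is only a specification-style sketch, far from the O(1) query time the talk is about. The name `rmsq_bruteforce`, the 1-based inclusive indices, and the two example queries are assumptions chosen for illustration; the transcript does not preserve the regions used on the slides.

```python
def rmsq_bruteforce(a, i, j, k, l):
    """Best (total, start, end) with start in [i, j] and end in [k, l].

    Indices are 1-based and inclusive; only pairs with start <= end count.
    Returns None if no such pair exists. Illustrative helper only.
    """
    best = None
    for start in range(i, j + 1):
        total = 0
        for end in range(start, l + 1):
            total += a[end - 1]          # total == a[start] + ... + a[end]
            if end >= k and (best is None or total > best[0]):
                best = (total, start, end)
    return best


SEQ = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_bruteforce(SEQ, 4, 6, 8, 10))   # hypothetical nonoverlapping query
print(rmsq_bruteforce(SEQ, 2, 9, 6, 12))   # hypothetical overlapping query
```

With these hypothetical regions, the first query returns the segment from index 5 to 9 with total 6, and the second returns the segment from index 3 to 9 with total 8.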

  10. A Nonoverlapping Example • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 6 • (Figure: the starting region and the end region are marked on the sequence)

  11. An Overlapping Example • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 8 • (Figure: the starting region and the end region are marked on the sequence)

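  12. Main Idea • Reduce to the RMQ problem • Theorem. If there is a <f(n), g(n)>-time solution for the RMQ problem (f(n) preprocessing time, g(n) query time), then there is a <f(n)+O(n), g(n)+O(1)>-time solution for the RMSQ problem. • (Figure: RMSQ reduces to RMQ with O(n) extra preprocessing and O(1) extra query time)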

  13. A relevant problem - RMQ • Range Minima Query Problem (also called Discrete Range Searching)
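For readers who have not met RMQ before, the sketch below shows one well-known way to answer range-minimum queries in O(1) time: a sparse table. This is a swapped-in technique, not the structure the talk relies on: a sparse table needs O(n log n) preprocessing, whereas the O(n) bound in the talk requires a linear-preprocessing RMQ structure. The class name `SparseTableRMQ` and its 0-based interface are assumptions for illustration.

```python
class SparseTableRMQ:
    """Range-minimum queries: O(n log n) build, O(1) per query (0-based)."""

    def __init__(self, values):
        self.values = list(values)
        n = len(self.values)
        # table[level][p] = index of the minimum of values[p : p + 2**level]
        self.table = [list(range(n))]
        span = 1
        while 2 * span <= n:
            prev = self.table[-1]
            row = []
            for p in range(n - 2 * span + 1):
                left, right = prev[p], prev[p + span]
                row.append(left if self.values[left] <= self.values[right] else right)
            self.table.append(row)
            span *= 2

    def argmin(self, lo, hi):
        """Index of the (leftmost) minimum of values[lo..hi], inclusive."""
        level = (hi - lo + 1).bit_length() - 1
        left = self.table[level][lo]
        right = self.table[level][hi - (1 << level) + 1]
        return left if self.values[left] <= self.values[right] else right
```

A range-maximum structure follows by building the same table on the negated values.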

  14. Cumulative sum

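  15. Case 1: Nonoverlapping • sum(i, j) = prefix-sum(j) - prefix-sum(i-1), so maximizing sum(i, j) means maximizing prefix-sum(j) and minimizing prefix-sum(i-1) • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Find a lowest point in the starting region and a highest point in the end region • Both searches can be reduced to the RMQ problem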
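The nonoverlapping case can be sketched directly from the identity sum(i, j) = prefix-sum(j) - prefix-sum(i-1). In the code below, plain `min`/`max` scans stand in for the O(1) range-minimum and range-maximum queries, so this version is not constant time per query. The function names, the 1-based inclusive indices, and the example regions are assumptions for illustration.

```python
from itertools import accumulate


def prefix_sums(a):
    """prefix[t] = a[1] + ... + a[t] (1-based), with prefix[0] = 0."""
    return [0] + list(accumulate(a))


def rmsq_nonoverlapping(prefix, i, j, k, l):
    """Query with start in [i, j] and end in [k, l], assuming j <= k.

    Returns (total, start, end), 1-based inclusive.
    """
    # Lowest prefix-sum(start - 1) over the starting region ...
    low = min(range(i - 1, j), key=prefix.__getitem__)
    # ... and highest prefix-sum(end) over the end region.
    high = max(range(k, l + 1), key=prefix.__getitem__)
    return prefix[high] - prefix[low], low + 1, high


SEQ = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_nonoverlapping(prefix_sums(SEQ), 4, 6, 8, 10))
```

For the hypothetical regions [4, 6] and [8, 10], the lowest point is prefix-sum(4) = 1 and the highest is prefix-sum(9) = 7, giving the segment from index 5 to 9 with total 6.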

  16. Case 2: Overlapping • A problem occurs in the overlapping case • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • The highest point found may lie before the lowest point found, so the two do not delimit a valid segment (the stretch between them has a negative sum)

  17. Case 2: Overlapping • Divide into 3 possible cases • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • (Figure labels: "find a lowest point here" and "find a highest point here", each shown twice)
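The transcript does not preserve which three cases the slide draws, so the sketch below shows one natural split, stated as an assumption: for i <= k <= j <= l, either the start lies in [i, k-1] and the end in [k, l], or the start lies in [k, j] and the end in [j+1, l], or both endpoints lie in the shared range [k, j]. The first two are nonoverlapping queries; the third is a single-range query, answered here by a linear scan rather than in O(1). All names are hypothetical.

```python
from itertools import accumulate


def rmsq_overlapping(a, i, j, k, l):
    """Overlapping query, assuming i <= k <= j <= l (1-based, inclusive)."""
    prefix = [0] + list(accumulate(a))

    def best_pair(s_lo, s_hi, e_lo, e_hi):
        # Disjoint regions: lowest point before the start, highest at the end.
        if s_lo > s_hi or e_lo > e_hi:
            return None
        low = min(range(s_lo - 1, s_hi), key=prefix.__getitem__)
        high = max(range(e_lo, e_hi + 1), key=prefix.__getitem__)
        return prefix[high] - prefix[low], low + 1, high

    def best_within(lo, hi):
        # Single-range query by a scan that tracks the lowest point so far.
        best, low = None, lo - 1
        for end in range(lo, hi + 1):
            if prefix[end - 1] < prefix[low]:
                low = end - 1
            cand = (prefix[end] - prefix[low], low + 1, end)
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    candidates = [
        best_pair(i, k - 1, k, l),   # start left of the shared range
        best_pair(k, j, j + 1, l),   # end right of the shared range
        best_within(k, j),           # both inside the shared range
    ]
    return max((c for c in candidates if c is not None), key=lambda c: c[0])


SEQ = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_overlapping(SEQ, 2, 9, 6, 12))
```

Every admissible (start, end) pair falls into exactly one of the three cases, which is why taking the best of the three candidates is enough.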

  18. A special case of RMSQ: single range query • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • Total sum = 6 • Challenge: Can this special case be reduced to the RMQ problem?

  19. Idea • Step 1. Find a partner for each index. • Step 2. Record the sum of each pair in an array. • Step 3. Reduce to the RMQ problem: retrieve the maximum-sum pair within the query interval.

  20. Our First Attempt (1) • Step 1: For each index i, define the lowest point preceding i as its partner, partner(i)

  21. Our First Attempt (2) • Step 2: Record sum(i, partner(i)) in an array

  22. Our First Attempt (3) • Step 3: Apply the RMQ techniques to the array of sum(i, partner(i)) values to retrieve the maximum-sum pair

  23. Faults • What if the partner of i goes beyond the query interval? • In the worst case, sum(i, partner(i)) needs to be updated
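The first attempt and its fault can be reproduced in a few lines. The sketch below reads "the lowest point preceding i" as the position with the lowest prefix sum before i, which is an interpretation of the slide rather than something the transcript spells out. The function name and the dictionary-based 1-based layout are assumptions for illustration.

```python
from itertools import accumulate


def first_attempt_partners(a):
    """For each end index (1-based), pair it with the start that follows
    the lowest prefix-sum point seen so far, and record the pair's sum.
    """
    prefix = [0] + list(accumulate(a))
    partner, pair_sum = {}, {}
    low = 0                                  # position of lowest prefix sum so far
    for end in range(1, len(a) + 1):
        if prefix[end - 1] < prefix[low]:
            low = end - 1
        partner[end] = low + 1               # the segment runs partner[end]..end
        pair_sum[end] = prefix[end] - prefix[low]
    return partner, pair_sum


SEQ = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
partner, pair_sum = first_attempt_partners(SEQ)

# Hypothetical single-range query [5, 9]: pick the largest recorded pair sum.
best_end = max(range(5, 10), key=pair_sum.__getitem__)
print(best_end, partner[best_end], pair_sum[best_end])
```

For the hypothetical query interval [5, 9], the largest recorded sum is 8 at end index 9, but its partner is index 3, which lies outside the interval. The best segment that stays inside [5, 9] runs from index 7 to 9 and sums to 7, so the recorded value would have to be updated, which is the fault the slide points out.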

  24. A Better Partner

  25. Nesting Property • Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 • 9, -10, 4, -2, 6, -5, 4, -3, 8, -11, 8, -3, 9, -5, 3 • Apply RMQ techniques • Update can be done in O(1) time

  26. Use RMSQ techniques to solve the other two relevant problems • 1. Finding the maximum-sum segment with length constraints in O(n) time. - Y.-L. Lin, T. Jiang, K.-M. Chao, 2002 - T.-H. Fan, S. Lee, H.-I. Lu, T.-S. Tsou, 2003 • 2. Finding all maximal scoring subsequences in O(n) time. - W. L. Ruzzo & M. Tompa, 1999

  27. Maximum-Sum Segment with length constraints • Length at least L, at most U • Runs in O(n) time since each query costs O(1) time
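One plausible way to read this slide, stated as an assumption, is to issue one query per starting index: the start is fixed and the end ranges over the positions that give a length between L and U. With an O(1) range query each step is constant time, giving O(n) overall. The sketch below uses a plain `max` scan as a stand-in for that query, so it runs in O(n·U) rather than O(n). The function name is hypothetical.

```python
from itertools import accumulate


def max_sum_with_length_bounds(a, L, U):
    """Best (total, start, end), 1-based inclusive, with L <= length <= U.

    Assumes 1 <= L <= U and L <= len(a). Illustrative sketch only.
    """
    n = len(a)
    prefix = [0] + list(accumulate(a))
    best = None
    for start in range(1, n - L + 2):
        end_lo, end_hi = start + L - 1, min(n, start + U - 1)
        # Stand-in for an O(1) range-maximum query on prefix[end_lo..end_hi].
        end = max(range(end_lo, end_hi + 1), key=prefix.__getitem__)
        cand = (prefix[end] - prefix[start - 1], start, end)
        if best is None or cand[0] > best[0]:
            best = cand
    return best


SEQ = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(max_sum_with_length_bounds(SEQ, 2, 3))
```

With L = 2 and U = 3 on the running example, the best segment is <8, -3, 4> at indices 11 to 13, with total 9.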

  28. All Maximal Scoring Subsequences • Recursive calls (figure labels: L, R, S) • Runs in O(n) time since each query costs O(1) time

  29. The End • Thank You.
