The Optimal-Location Query

Donghui Zhang, Northeastern University. Coauthors: Yang Du, Tian Xia. Motivation: “What is the optimal location in the Boston area to build a new McDonald’s store?” Optimality: maximize the number of customers who think the new store is closer to them.


Presentation Transcript


  1. The Optimal-Location Query • Donghui Zhang, Northeastern University • Coauthors: Yang Du, Tian Xia

  2. Motivation • “What is the optimal location in the Boston area to build a new McDonald’s store?” • Optimality: maximize the number of customers who think the new store is closer to them than any existing store.

  3. Formal Definition • We consider the L1 distance: d(p1, p2) = |x1 - x2| + |y1 - y2|. • Given a set S of sites, a set O of weighted objects, and a query range Q, • find a location l ∈ Q that maximizes Σ o.weight, summed over the objects o ∈ O such that ∀ s ∈ S, d(o, l) ≤ d(o, s).

  5. Example [Figure: query range Q; objects o1:4, o2:3, o3:5, o4:6 (label:weight); sites s1, s2.]

  6. Example [Figure: the same scene with a candidate location l1 and distance labels.] The “influence” of l1 is 5 + 6 = 11.

  7. Example [Figure: the same scene with candidate locations l1 and l2 and distance labels.] The “influence” of l1 is 5 + 6 = 11. The influence of l2 is 5.

  8. Content • Problem Definition • Straightforward Solution • Problem Transformation • The R-tree-based Solution • The OL-tree • The VOL-tree • Performance

  9. Using the RNN Algorithm… [Figure: the example scene with candidate location l1.] The RNNs (reverse nearest neighbors) of l1 are o3 and o4.

  10. Straightforward Solution • Compute the influence for every location in Q. • Problematic: infinite number of candidates! [Figure: the example scene.]

  11. Content • Problem Definition • Straightforward Solution • Problem Transformation • The R-tree-based Solution • The OL-tree • The VOL-tree • Performance

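  12. nn_buffer of an Object • The figure shows the nn_buffer of O4. • A new site built at any location within an object's nn_buffer would be closer to that object than its current nearest site. • Under the L1 distance, the nn_buffer is a diamond. [Figure: objects O1:4, O2:3, O3:5, O4:6 and sites S1, S2.]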
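As a rough illustration of the nn_buffer idea above, the sketch below represents a buffer as a center and an L1 radius, where the radius is the distance from the object to its nearest existing site. The function names and the sample sites are hypothetical.

```python
def l1(p, q):
    """L1 (Manhattan) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def nn_buffer(obj, sites):
    """Return (center, radius) of the diamond around obj that reaches its nearest site."""
    return obj, min(l1(obj, s) for s in sites)


def in_nn_buffer(loc, obj, sites):
    """True if a site built at loc would be at least as close to obj as any existing site."""
    center, radius = nn_buffer(obj, sites)
    return l1(center, loc) <= radius


sites = [(1, 0), (3, 5)]                   # hypothetical existing sites
print(nn_buffer((7, 3), sites))            # center and L1 radius of the diamond
print(in_nn_buffer((7, 1), (7, 3), sites))
```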

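  13. Problem Transformation • Find a location with maximum overlap among the objects' nn_buffers. • Any location in the region of maximum overlap (marked in the figure) is an optimal location! [Figure: query range Q; objects O1:4, O2:3, O3:5, O4:6 with their nn_buffers; sites S1, S2.]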

  14. The Rotated Coordinates • Rotate the coordinate system by 45°. • All nn_buffers become axis-parallel squares. • Focus on the rotated coordinates. [Figure: original axes X, Y and rotated axes X', Y'; a point o has coordinates (x, y) and (x', y').]
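The rotation step can be sketched as follows. This assumes the axes are rotated counter-clockwise by 45°, so that x' = (x + y)/√2 and y' = (y − x)/√2; a diamond of L1 radius r then becomes a square with half side length r/√2. The function names are hypothetical.

```python
import math


def rotate45(p):
    """Coordinates of p in axes rotated 45 degrees counter-clockwise."""
    x, y = p
    return ((x + y) / math.sqrt(2), (y - x) / math.sqrt(2))


def buffer_to_square(center, radius):
    """Map an L1 diamond to (x_lo, y_lo, x_hi, y_hi) in the rotated coordinates."""
    cx, cy = rotate45(center)
    half = radius / math.sqrt(2)   # half side length of the resulting square
    return (cx - half, cy - half, cx + half, cy + half)


print(buffer_to_square((0, 0), 2))
```

The identity behind this is |dx| + |dy| = max(|dx + dy|, |dx − dy|), which turns the L1 ball into a max-norm ball in the rotated coordinates.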

  15. Content • Problem Definition • Straightforward Solution • Problem Transformation • The R-tree-based Solution • The OL-tree • The VOL-tree • Performance

  16. The R-tree-based Solution • Store the objects in an R-tree. • Retrieve the objects whose nn_buffers intersect Q. • Plane sweep to find a region which has maximum overlap.

  17. Two Contributions • Object retrieval: • store point objects, • but retrieve nn_buffers in increasing order of lower X. • Plane sweep: • straightforwardly: O(n²); • our method: O(n log n).

  18. Best-first Retrieval • Keep a heap of index entries + objects. • Sorted in increasing order of the nn_buffer's lower X. • While the heap is not empty, pop an entry. • If an object is popped, send it to the plane sweep. • If an index entry is popped, push its children (those intersecting Q).
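A minimal sketch of the best-first loop described above. It assumes, purely for illustration, that every index entry stores the bounding box of the nn_buffer squares beneath it, so the lower X of that box is a valid heap key; the slides do not say how the key of an index entry is obtained. The `Entry` class and the sample tree are hypothetical.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Optional


def intersects(a, b):
    """True if closed rectangles (x_lo, y_lo, x_hi, y_hi) share a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Entry:
    rect: tuple                       # object: its nn_buffer square; index entry: box of squares below
    weight: float = 0.0
    children: Optional[list] = None   # None marks an object


def best_first(root, Q):
    """Yield (square, weight) of objects whose nn_buffer intersects Q, by increasing lower X."""
    tie = count()                     # tie-breaker so the heap never compares Entry objects
    heap = [(root.rect[0], next(tie), root)] if intersects(root.rect, Q) else []
    while heap:
        _, _, entry = heapq.heappop(heap)
        if entry.children is None:    # an object: hand it to the plane sweep
            yield entry.rect, entry.weight
        else:                         # an index entry: expand children that intersect Q
            for child in entry.children:
                if intersects(child.rect, Q):
                    heapq.heappush(heap, (child.rect[0], next(tie), child))


a, b = Entry((5, 0, 7, 2), 4), Entry((1, 1, 3, 3), 3)
c, d = Entry((2, 0, 4, 2), 5), Entry((20, 20, 22, 22), 6)
root = Entry((1, 0, 22, 22), children=[Entry((1, 0, 7, 3), children=[a, b]),
                                       Entry((2, 0, 22, 22), children=[c, d])])
print([w for _, w in best_first(root, (0, 0, 10, 10))])
```

Because an index entry's key is never larger than the key of anything below it, an object popped from the heap cannot be preceded by an object that is still hidden inside an unexpanded entry.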

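  19. Naïve Plane Sweep [Figure: squares O1:4, O2:3, O3:5, O4:6 in the X-Y plane, with Y-axis marks at 2, 5, 8, 9, 12.] Sweep status: Y boundaries -∞, 2, 5, 8, 9, 12, +∞, with overlap values 0, 5, 12, 7, 3, 0 for the six intervals between consecutive boundaries.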

  20. Not Efficient! O(n²) • Suppose the next insertion adds 2 to the Y-range [2, 11]. • Before: boundaries -∞, 2, 5, 8, 9, 12, +∞ with values 0, 5, 12, 7, 3, 0. • After: boundaries -∞, 2, 5, 8, 9, 11, 12, +∞ with values 0, 7, 14, 9, 5, 3, 0.
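The naive sweep status can be sketched as a flat, sorted list of Y-intervals, which makes the O(n) cost of each insertion visible. The class below is a hypothetical illustration; the three initial Y-ranges are one assignment that reproduces the values on the slide, not data stated in the talk.

```python
import bisect


class NaiveIntervals:
    """Flat list of Y-intervals, each with one overlap value (the naive sweep status)."""

    def __init__(self):
        self.bounds = []     # sorted finite boundaries
        self.values = [0]    # values[i] belongs to the interval ending at bounds[i]

    def _split(self, y):
        """Make y a boundary (if it is not one yet) and return its index in bounds."""
        i = bisect.bisect_left(self.bounds, y)
        if i == len(self.bounds) or self.bounds[i] != y:
            self.bounds.insert(i, y)
            self.values.insert(i + 1, self.values[i])   # both halves keep the old value
        return i

    def add(self, lo, hi, w):
        """Add w to every interval between lo and hi (lo < hi): O(n) work per call."""
        i, j = self._split(lo), self._split(hi)
        for k in range(i + 1, j + 1):
            self.values[k] += w


status = NaiveIntervals()
for lo, hi, w in [(2, 8, 5), (5, 9, 4), (5, 12, 3)]:   # assumed Y-ranges and weights
    status.add(lo, hi, w)
print(status.bounds, status.values)
status.add(2, 11, 2)                                   # the insertion from the slide
print(status.bounds, status.values)
```

Each `add` may touch every interval, so n insertions cost O(n²) in the worst case, which is the inefficiency the slide points out.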

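  21. The aSB-tree • Extended from the SB-tree [YW01]: • keeps max overlap information at index entries; • handles a query range Q. [Figure: leaf level with boundaries -∞, 2, 5, 8, 9, 12, +∞ and values 0, 5, 12, 7, 3, 0; index level with boundaries -∞, 5, 9, +∞ and values 0, 0, 0.]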

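  22. The aSB-tree • Suppose the next insertion adds 2 to the Y-range [2, 11]. [Figure: the +2 arrives at the index level (boundaries -∞, 5, 9, +∞; values 0, 0, 0); the leaf level still has boundaries -∞, 2, 5, 8, 9, 12, +∞ and values 0, 5, 12, 7, 3, 0.]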

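  23. The aSB-tree • Suppose the next insertion adds 2 to the Y-range [2, 11]. [Figure: the index values become 0, 2, 0 (boundaries -∞, 5, 9, +∞); two +2 updates continue down to the leaf level, which still has boundaries -∞, 2, 5, 8, 9, 12, +∞ and values 0, 5, 12, 7, 3, 0.]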

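  24. The aSB-tree • Suppose the next insertion adds 2 to the Y-range [2, 11]. [Figure: final state. Index level: boundaries -∞, 5, 9, +∞ with values 0, 2, 0. Leaf level: boundaries -∞, 2, 5, 8, 9, 11, 12, +∞ with values 0, 7, 12, 7, 5, 3, 0.]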
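The aSB-tree itself is not specified in enough detail here to reproduce. As a stand-in, the sketch below uses an ordinary in-memory segment tree with range add and global max, which shares the two ideas shown on the slides: an update that fully covers a node stops at that node, and every node keeps the max overlap beneath it. With it, the plane sweep runs in O(n log n). All names are hypothetical, squares are treated as half-open so touching edges do not count as overlap, and the query range Q is ignored.

```python
class MaxAddTree:
    """Segment tree over m slots supporting range add and global max."""

    def __init__(self, m):
        self.m = m
        self.best = [0] * (4 * m)   # max within the node's span, own pending add included
        self.lazy = [0] * (4 * m)   # add that applies to the whole span, kept at this node

    def update(self, lo, hi, w, node=1, l=0, r=None):
        """Add w to slots lo .. hi-1 in O(log m)."""
        if r is None:
            r = self.m
        if hi <= l or r <= lo:       # no overlap with this node's span
            return
        if lo <= l and r <= hi:      # span fully covered: record the add here and stop
            self.best[node] += w
            self.lazy[node] += w
            return
        mid = (l + r) // 2
        self.update(lo, hi, w, 2 * node, l, mid)
        self.update(lo, hi, w, 2 * node + 1, mid, r)
        self.best[node] = self.lazy[node] + max(self.best[2 * node], self.best[2 * node + 1])

    def top(self):
        """Current maximum over all slots."""
        return self.best[1]


def max_overlap(squares):
    """Max total weight at a point; squares are (x_lo, y_lo, x_hi, y_hi, weight)."""
    ys = sorted({y for s in squares for y in (s[1], s[3])})
    if len(ys) < 2:
        return 0
    slot = {y: i for i, y in enumerate(ys)}
    events = []
    for x_lo, y_lo, x_hi, y_hi, w in squares:
        events.append((x_lo, 1, slot[y_lo], slot[y_hi], w))    # left edge: add weight
        events.append((x_hi, 0, slot[y_lo], slot[y_hi], -w))   # right edge: removed first on ties
    events.sort()
    tree, best = MaxAddTree(len(ys) - 1), 0
    for _, _, a, b, w in events:
        tree.update(a, b, w)
        best = max(best, tree.top())
    return best


print(max_overlap([(0, 2, 10, 8, 5), (1, 5, 10, 9, 4), (2, 5, 10, 12, 3), (3, 2, 10, 11, 2)]))
```

The sample squares reuse the Y-ranges and weights assumed for the sweep status above; their X-ranges are invented so that all four overlap.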

  25. Content • Problem Definition • Straightforward Solution • Problem Transformation • The R-tree-based Solution • The OL-tree • The VOL-tree • Performance

  26. The OL-tree • Idea: partition the space, and keep the max overlapped region for each partition! • Like a k-d-B-tree. • Stores nn_buffers. • An nn_buffer may have multiple copies. [Figure: an nn_buffer overlapping partitions 1, 2, 3, 4.] • Partition 1: add to fullcover. • Partitions 2, 3, 4: recursively insert.

  27. Stored Information • An index entry has, besides range: • fullcover: total weight of the nn_buffers fully covering the whole area; • localmax: the max overlap among the nn_buffers inserted into the sub-tree; • maxrange: the region where localmax occurs. • A leaf entry has: • a rectangle and its weight.

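  28. [Figure: OL-tree example. Each entry is labeled (·, fullcover, localmax); the first component is shown graphically in the original. root (·, 0, 9) has children r1 (·, 0, 4), r2 (·, 1, 4), r3 (·, 2, 7); r3 has children r31 (·, 4, 3), r32 (·, 2, 3), r33 (·, 1, 2); other sub-trees omitted.]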

  29. [Figure: the same OL-tree, annotated at r3 (·, 2, 7).] • maxrange: where localmax occurred. • fullcover: 2 nn_buffers fully cover r3. • localmax: among those inserted, the max overlap is 7.

  30. Query Processing • Start with root, insert index entries into heap. • Sorting key: upper bound of real max overlap in the sub-tree. • localmax + Σ fullcovers of ancestor entries. • Accurate if Q intersects with maxrange.

  31. [Figure: the same OL-tree, highlighting r33 (·, 1, 2).] Real max overlap = 0 + 2 + 1 + localmax = 5, i.e. the fullcovers of root, r3, and r33 plus r33's localmax of 2.

  32. Query Processing • Start with root, insert index entries into heap. • Sorting key: upper bound of real max overlap in the sub-tree. • localmax + Σ fullcovers of ancestor entries. • Accurate if Q intersects with maxrange. • Keep a running value: max overlap M. • Pruning 1: Q intersects with maxrange. • Pruning 2: upper bound of max overlap < M.

  33. [Figure: the same OL-tree with a query range Q.] • r2 is pruned since Q intersects r2.maxrange. M = 0 + 1 + 4 = 5. • r1 is pruned since the upper bound of overlap = 4 < M.

  34. [Figure: the same OL-tree.] Sometimes, we need to examine a leaf node. Plane sweep it!
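The query procedure of the last few slides can be sketched as a best-first search. The fullcover and localmax values below are the ones shown in the example tree; all rectangles, the `Entry` class, and the brute-force `leaf_max` helper (standing in for the leaf plane sweep) are hypothetical. The upper bound of an entry is taken as its localmax plus the fullcovers along the path from the root down to and including the entry, which matches the sums 0 + 2 + 1 + localmax and 0 + 1 + 4 on the slides.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


def intersects(a, b):
    """True if closed rectangles (x_lo, y_lo, x_hi, y_hi) share a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Entry:
    rect: tuple
    fullcover: float
    localmax: float
    maxrange: tuple
    children: list = field(default_factory=list)
    rects: list = field(default_factory=list)   # leaf data: (x_lo, y_lo, x_hi, y_hi, weight)


def leaf_max(e, Q):
    """Brute-force stand-in for the leaf plane sweep: best overlap at a point of Q within e.rect."""
    box = (max(Q[0], e.rect[0]), max(Q[1], e.rect[1]), min(Q[2], e.rect[2]), min(Q[3], e.rect[3]))
    xs = [box[0]] + [r[0] for r in e.rects if box[0] <= r[0] <= box[2]]
    ys = [box[1]] + [r[1] for r in e.rects if box[1] <= r[1] <= box[3]]
    return max(sum(r[4] for r in e.rects if r[0] <= x <= r[2] and r[1] <= y <= r[3])
               for x in xs for y in ys)


def query(root, Q):
    """Max overlap at any location in Q (Q is assumed to intersect root.rect)."""
    M, tie = 0, count()
    heap = [(-(root.fullcover + root.localmax), next(tie), root.fullcover, root)]
    while heap:
        neg_ub, _, cover, e = heapq.heappop(heap)
        ub = -neg_ub
        if ub <= M:                       # Pruning 2: no remaining entry can beat M
            break
        if intersects(Q, e.maxrange):     # Pruning 1: the upper bound is exact
            M = ub
        elif e.children:
            for c in e.children:
                if intersects(Q, c.rect):
                    path = cover + c.fullcover
                    heapq.heappush(heap, (-(path + c.localmax), next(tie), path, c))
        else:                             # a leaf: examine its rectangles directly
            M = max(M, cover + leaf_max(e, Q))
    return M


# r3's children (r31, r32, r33) are omitted; the queries below never reach r3.
r1 = Entry((0, 0, 4, 4), 0, 4, (1, 1, 2, 2), rects=[(1, 1, 2, 2, 4)])
r2 = Entry((4, 0, 8, 4), 1, 4, (5, 1, 6, 2), rects=[(5, 1, 6, 2, 4)])
r3 = Entry((8, 0, 12, 4), 2, 7, (9, 1, 10, 2))
root = Entry((0, 0, 12, 4), 0, 9, (9, 1, 10, 2), children=[r1, r2, r3])
print(query(root, (3, 0, 7, 3)))
```

With this Q, r2 is resolved first because Q intersects r2.maxrange (M = 0 + 1 + 4), and r1 is then discarded because its upper bound of 4 is below M, mirroring the pruning steps in the example.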

  35. OL-tree → VOL-tree • The OL-tree is not practical: • worst-case space complexity O(n²); • complex re-organization. • How to improve? • Only keep a few top levels of the OL-tree. ==> Virtual OL-tree!

  36. VOL-tree

  37. Example • If Q is here (a region marked in the figure), perform a range search on the R-tree.

  38. Comparison with R-tree Approach • The R-tree approach examines all nn_buffers intersecting with Q. • By using a small, in-memory VOL-tree, the new approach can prune the search space.

  39. Challenge • With dynamic updates, keeping localmax and maxrange up to date is expensive. • To insert an nn_buffer here (a location marked in the figure), recompute!

  40. Solution • Index entry: (range, fullcover, maxrange, localmax), with localmax replaced by lowermax and uppermax. • lowermax ≤ localmax ≤ uppermax

  41. Solution • Index entry: (range, fullcover, maxrange, localmax), with localmax replaced by lowermax and uppermax. • lowermax ≤ localmax ≤ uppermax • Any location in maxrange has overlap = lowermax. • At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.

  42. Update • Case 1: the new nn_buffer does not intersect with maxrange: increase uppermax. • Case 2: the new nn_buffer intersects with maxrange: increase uppermax and lowermax.
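A minimal sketch of the two update cases. The slides do not say by how much the bounds grow or what happens to maxrange in Case 2; this sketch assumes both bounds grow by the new nn_buffer's weight and that maxrange shrinks to its intersection with the new nn_buffer, so that every location in maxrange still has overlap equal to lowermax. Buffers that fully cover the entry's range (which go to fullcover) are not handled. All names are hypothetical.

```python
from dataclasses import dataclass


def intersection(a, b):
    """Intersection of closed rectangles (x_lo, y_lo, x_hi, y_hi), or None if empty."""
    r = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return r if r[0] <= r[2] and r[1] <= r[3] else None


@dataclass
class VEntry:
    maxrange: tuple
    lowermax: float
    uppermax: float


def insert_buffer(e, square, w):
    """Update the bounds of e for a new nn_buffer `square` with weight w."""
    e.uppermax += w                       # both cases: overlap may grow by w somewhere
    hit = intersection(square, e.maxrange)
    if hit is not None:                   # Case 2: the buffer intersects maxrange
        e.lowermax += w
        e.maxrange = hit                  # assumption: keep only the part that gained w


e = VEntry(maxrange=(0, 0, 4, 4), lowermax=3, uppermax=5)
insert_buffer(e, (5, 5, 6, 6), 2)         # Case 1
insert_buffer(e, (2, 2, 8, 8), 1)         # Case 2
print(e)
```

Both cases cost O(1) per entry, in contrast to recomputing localmax and maxrange exactly after every insertion.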

  43. Query • Similar to the OL-tree. • To compute the upper bound of max overlap, use uppermax. • When Q intersects maxrange, the entry may or may not be pruned.

  44. Content • Problem Definition • Straightforward Solution • Problem Transformation • The R-tree-based Solution • The OL-tree • The VOL-tree • Performance

  45. Setup • Digital Chart from the R-tree Portal. • O: 24,493 populated places. • S: 9,203 cultural landmarks. • Pagesize: 1KB. Buffersize: 256 pages. • Object R-tree: 753 pages. • Pentium IV Dell PC, 3.2GHz. • Java. • Measure total I/O of 100 random queries.

  46. Size of the VOL-tree

  47. Small Query Area

  48. Large Query Area

  49. Varying Buffer Size

  50. Effect of Update
