Optimal Planar Point Enclosure Indexing

Lars Arge, Vasilis Samoladas and Ke Yi. Department of Computer Science, Duke University; Technical University of Crete.

Presentation Transcript


  1. Optimal Planar Point Enclosure Indexing
     Lars Arge, Vasilis Samoladas and Ke Yi
     Department of Computer Science, Duke University
     Technical University of Crete

  2. Two Dual Problems
                         Range searching   Point enclosure
     Internal memory     √                 √
     External memory     √                 ?
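The duality between the two problems can be made concrete with a brute-force sketch. The function names and the closed-rectangle convention below are illustrative assumptions, not taken from the slides: range searching stores points and queries with a rectangle, while point enclosure stores rectangles and queries with a point.

```python
def in_rect(p, rect):
    """True if point p = (x, y) lies in the closed rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1


def range_search(points, query_rect):
    """Range searching: the data are points, the query is a rectangle."""
    return [p for p in points if in_rect(p, query_rect)]


def point_enclosure(rects, query_point):
    """Point enclosure: the data are rectangles, the query is a point."""
    return [r for r in rects if in_rect(query_point, r)]
```

Both functions evaluate the same containment predicate; only the roles of data and query are swapped, which is the sense in which the problems are dual.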

  3. Outline
     • Previous results in internal memory
     • Computation models in external memory
     • Previous results in external memory
     • Our lower bound result
     • Matching upper bound
     • Conclusions

  4. Previous Results: Internal Memory
     • Computation model: pointer machine
     • Range searching (T is the output size)
         • O(N) space, O(N^ε + T) time [BM 80]
         • O(N log N / log log N) space, O(log N + T) time [Chazelle 88]
         • Tight for O(log^c N + T)-query structures [Chazelle 90]
         • Can do better on a RAM
         • Other tradeoffs …
     • Point enclosure [Chazelle 86]
         • Θ(N) space, Θ(log N + T) time
         • Optimal in both space and time

  5. External Memory: Models
     • External pointer machine
         • Natural generalization of the internal pointer machine
         • Each node contains B data objects
         • Out-degree 2 → B
     • Bounding-volume hierarchy (non-replicating index structure)
         • Tree structure
         • Each object is stored only once
     • Indexability model [HKP 97]
     [Figure: I/O model diagram with labels D, Block I/O, M, P]

  6. External Memory: Models
     • Indexability model
         • No “structure” at all!
         • Only models layout of data
         • Each block contains B data objects
         • Can “magically” find the smallest set Π of blocks whose union contains all results
         • Cost is defined to be |Π|
     [Figure: models and example results. Indexability model: 1D range searching; external pointer machine: all other known results; bounding-volume hierarchy: R-trees, kd-trees]

  7. Previous Results: External Memory
     • Range searching (n = N/B)
         • Similar to internal memory: tradeoff between space and time
     • O(log_B n + T/B) query time
         • O(n log n / log log_B n) space [ASV 99]
         • Tight in the external pointer machine [SR 95]
         • Improved to the indexability model [ASV 99]
     • O(n) space
         • O(√n + T/B) time [kdB-tree, GI 99, KS 99]
         • Tight in bounding-volume hierarchies
         • Can do O(n^ε + T/B) with constant redundancy
         • Tight in the indexability model [ASV 99]

  8. Previous Results: External Memory
     • Point enclosure
         • Ω( ) for bounding-volume hierarchies [ABGHH 01]
         • Easy to get an O(n)-space, O(log_2 n + T/B)-query structure
     • (space, query) pairs: (n, log_2 n + T/B), (nB^ε, log_B n + T/B), (n, log_B n + T/B)?

  9. Indexability Model in Detail
     • N data objects laid out in disk blocks, possibly with redundancy
     • Each block holds at most B objects
     • Cost of a query q: minimum # blocks needed to retrieve all answers
         • Can find those blocks without cost
     • Redundancy r and access overhead A
         • r: average # copies of an object in the index
             • Size is rn blocks
         • A: worst-case ratio of the query cost to the ideal cost
             • Any query q can be covered by A·⌈|q|/B⌉ blocks (A ≤ B)
     • Lower bound expressed as a tradeoff between r and A
         • 2D range searching [ASV 99]
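The two parameters of the model can be computed by brute force for a toy layout. This is a minimal sketch under stated assumptions: the function names, the six-object layout, and the three queries are hypothetical, and the exhaustive search over block subsets stands in for the model's "free" knowledge of the best cover.

```python
from itertools import combinations
from math import ceil


def redundancy(blocks, num_objects):
    """r: average number of stored copies per object."""
    return sum(len(b) for b in blocks) / num_objects


def min_cover(blocks, query):
    """Smallest number of blocks whose union contains the query (exhaustive)."""
    for k in range(1, len(blocks) + 1):
        for combo in combinations(blocks, k):
            if query <= set().union(*combo):
                return k
    raise ValueError("query is not covered by the layout")


def access_overhead(blocks, queries, B):
    """A: worst-case ratio of the cover size to the ideal cost ceil(|q|/B)."""
    return max(min_cover(blocks, q) / ceil(len(q) / B) for q in queries)


# Hypothetical layout: N = 6 objects, block size B = 2, one extra block {1, 2}.
B = 2
blocks = [{0, 1}, {2, 3}, {4, 5}, {1, 2}]
queries = [{0, 1}, {1, 2}, {0, 3}]

r = redundancy(blocks, num_objects=6)
A = access_overhead(blocks, queries, B)
```

In this layout the extra block lets the query {1, 2} be answered with a single block, at the price of redundancy above 1, while {0, 3} still needs two blocks for an ideal cost of one.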

  10. Previous Results in Indexability Model
      • Set queries [HKP 97]
          • A set S of N objects; queries can be any subset of S
          • For any r ≤ n/B, A = B
          • Trivial
      • Range searching
          • [HKP 97]
          • [SP 98]
          • Only tight for the special case when points form a grid
          • [ASV 99]

  11. Redundancy Theorem [SP 98] (asymptotic version)
      For N data objects, if there exist m queries q_1, …, q_m such that for any 1 ≤ i, j ≤ m, i ≠ j, |q_i| ≥ B and |q_i ∩ q_j| ≤ B/A^2, then a lower bound on the redundancy r follows.
      • Combinatorial in nature
      • Used successfully to obtain the range searching lower bound

  12. Point Enclosure Lower Bound Construction (1)
      • Set of queries: the Fibonacci lattice (a low-discrepancy point set)
          • m points in an m×m grid
          • Only property used: any rectangle with area αm contains between ⌊α/c_1⌋ and ⌈α/c_2⌉ points, for constants c_1 and c_2
      • Set of objects
          • Tiling rectangles of size α t^i × m/t^i
          • t = (m/α)^(1/B), i = 1, …, B
          • m = αN/B
          • Θ(B·m^2/(αm)) = Θ(N) rectangles are constructed
          • |q_i| ≥ B is satisfied
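The construction can be tried on a tiny instance. The following is a sketch under assumptions that are not in the slides: the parameters m = 8, α = 1, t = 2, B = 3 are chosen only so that every rectangle has integer sides, rectangles are treated as half-open, and the lattice uses the standard definition (i, i·F_{k−1} mod F_k) for m = F_k.

```python
def fibonacci_lattice(k):
    """Return (m, points) with m = F_k and points (i, i * F_{k-1} mod F_k)."""
    a, b = 1, 1  # F_1, F_2
    for _ in range(k - 2):
        a, b = b, a + b  # after the loop: a = F_{k-1}, b = F_k
    return b, [(i, (i * a) % b) for i in range(b)]


def tiling_layers(m, alpha, t, B):
    """Layer i tiles the m x m grid with (alpha * t^i) x (m / t^i) rectangles."""
    rects = []
    for i in range(1, B + 1):
        w, h = alpha * t**i, m // t**i
        for x0 in range(0, m, w):
            for y0 in range(0, m, h):
                rects.append((x0, y0, x0 + w, y0 + h))  # half-open [x0,x1) x [y0,y1)
    return rects


def enclosing(rects, p):
    """Indices of the rectangles that contain point p (the answer to query p)."""
    return {j for j, (x0, y0, x1, y1) in enumerate(rects)
            if x0 <= p[0] < x1 and y0 <= p[1] < y1}


m, points = fibonacci_lattice(6)           # m = F_6 = 8
rects = tiling_layers(m, alpha=1, t=2, B=3)
answers = [enclosing(rects, p) for p in points]
```

With these parameters the sketch builds B·m/α = 24 rectangles, and every lattice point is enclosed by exactly B = 3 of them, one per layer, so |q_i| ≥ B holds.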

  13. Point Enclosure Lower Bound Construction (2)
      • Any A that satisfies |q_i ∩ q_j| ≤ B/A^2 becomes a lower bound
          • Make A as large as possible
      • For a rectangle to cover q_1 and q_2 (at horizontal distance x and vertical distance y), we must have α t^i ≥ x and m/t^i ≥ y, or x/α ≤ t^i ≤ m/y
      • q_1 and q_2 are two points from the Fibonacci lattice, so xy ≥ c_2 m
      • This bounds the # of such rectangles, i.e. |q_1 ∩ q_2|
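The counting step can be written out explicitly. The derivation below is a reconstruction from the two constraints on this slide and the fact that each layer is a tiling (so at most one rectangle per layer can contain both points); the final form of the bound is an assumption and is not taken from the slide.

```latex
% Layers i whose rectangles can contain both q_1 and q_2:
\frac{x}{\alpha} \;\le\; t^{i} \;\le\; \frac{m}{y}
\quad\Longrightarrow\quad
\#\{i\} \;\le\; \log_t\!\left(\frac{m/y}{x/\alpha}\right) + 1
        \;=\; \log_t\!\left(\frac{\alpha m}{xy}\right) + 1 .

% Using xy \ge c_2 m for two points of the Fibonacci lattice:
|q_1 \cap q_2| \;\le\; \log_t\!\left(\frac{\alpha}{c_2}\right) + 1 .
```

Requiring this quantity to be at most B/A^2 is then what ties the access overhead A to the parameters α and t of the construction.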

  14. Point Enclosure Lower Bound Construction (3)
      • Disproves the earlier (n, log_B n + T/B) conjecture
      • Still a square root factor away
      • What’s wrong? The construction technique, or the model itself?

  15. Refine the Indexability Model
      • Query cost O(log_B n + |q|/B): search cost log_B n, retrieval cost |q|/B
      • Observation: retrieval cost is relatively high for small queries
      • Refine: add an additive factor!
          • Old: any query q is covered by A·⌈|q|/B⌉ blocks
          • New: any query q is covered by A_0 + A_1·⌈|q|/B⌉ blocks
      • Modify the Redundancy Theorem accordingly
          • The two conditions |q_i| ≥ B, |q_i ∩ q_j| ≤ B/A^2 become |q_i| ≥ B·A_0, |q_i ∩ q_j| ≤ B/A_1^2

  16. The Refined Redundancy Theorem
      For N data objects, if there exist m queries q_1, …, q_m such that for any 1 ≤ i, j ≤ m, i ≠ j, |q_i| ≥ B·A_0 and |q_i ∩ q_j| ≤ B/(2A_1)^2, then a lower bound on the redundancy r follows.
      Proof sketch: each query can be covered by 2A_1·⌈|q_i|/B⌉ blocks, so the original Redundancy Theorem applies with A = 2A_1.
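The covering step of the proof sketch can be spelled out as a short inequality chain. It assumes the refined cost bound of A_0 + A_1⌈|q|/B⌉ blocks per query and A_1 ≥ 1; both are assumptions stated here for the sketch rather than text from the slide.

```latex
% |q_i| \ge B A_0 gives A_0 \le |q_i|/B \le \lceil |q_i|/B \rceil, hence
A_0 + A_1 \left\lceil \frac{|q_i|}{B} \right\rceil
\;\le\; \left\lceil \frac{|q_i|}{B} \right\rceil
      + A_1 \left\lceil \frac{|q_i|}{B} \right\rceil
\;\le\; 2 A_1 \left\lceil \frac{|q_i|}{B} \right\rceil .
```

So in the old, purely multiplicative model these queries have access overhead at most 2A_1, which is why the intersection condition carries the factor (2A_1)^2.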

  17. Fix the Construction
                                       Old construction   New construction
      Query size |q|                   B                  B·A_0
      Layers of tiling rectangles      B                  B·A_0
      Size of Fibonacci lattice m      αN/B               αN/(B·A_0)
      Total # rectangles               N                  N

  18. Range Searching vs. Point Enclosure
      • Range searching
          • Original model
          • New model
      • Point enclosure
      • Dual bounds in external memory!

  19. Matching Upper Bounds (1)
      • In the external pointer machine model
      • Only interested in the case A_1 = O(1)
      • Goal: for any r ≤ B, design an index with redundancy r that answers a query in O(log_r n + T/B) I/Os
      • Building block: one-sided segment intersection queries
          • Given N horizontal segments
          • Report all segments directly above a query point
      • Persistent B-tree (modified)
          • O(n) space, O(log_B n + T/B) query
          • Search on the x-coordinate of the query point
          • Retrieve the segments

  20. Matching Upper Bounds (2)
      • Divide the plane into r horizontal slabs
      • Associate two one-sided segment intersection structures with each slab
          • One for all top sides of rectangles that cross its bottom boundary
          • One for all bottom sides of rectangles that cross its top boundary and all bottom sides of rectangles that completely span the slab
      • Recursively handle rectangles that fall completely within a slab, resulting in a tree with fanout r
      • Any rectangle is stored at most r times: redundancy is r
      • Query: follow the tree top-down, asking two one-sided queries at each level. O(log_r n · log_B N + T/B) I/Os → O(log_r n + T/B) by fractional cascading
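The slab decomposition can be sketched in memory. This is an illustration under loud assumptions, not the authors' structure: the class name `SlabNode`, the `leaf_size` parameter, and the choice of cut positions are hypothetical, and each one-sided segment intersection structure is replaced by a plain list scanned linearly (no persistent B-tree, no fractional cascading), so it shows the decomposition and its correctness, not the I/O bounds.

```python
from bisect import bisect_right


class SlabNode:
    """Rectangles are closed tuples (x0, y0, x1, y1)."""

    def __init__(self, rects, r, leaf_size=4):
        ys = sorted({y for (_, y0, _, y1) in rects for y in (y0, y1)})
        self.is_leaf = len(rects) <= leaf_size or len(ys) < 2
        if self.is_leaf:
            self.rects = list(rects)
            return
        # Up to r - 1 cuts, all above the smallest y, give at most r slabs.
        step = max(1, len(ys) // r)
        self.cuts = ys[step::step][: r - 1]
        k = len(self.cuts) + 1
        self.tops = [[] for _ in range(k)]     # top sides, rect crosses slab bottom
        self.bottoms = [[] for _ in range(k)]  # bottom sides, rect crosses slab top
        inside = [[] for _ in range(k)]
        for rect in rects:
            lo, hi = self.slab(rect[1]), self.slab(rect[3])
            if lo == hi:                       # falls completely within one slab
                inside[lo].append(rect)
                continue
            self.tops[hi].append(rect)
            for j in range(lo, hi):            # slab holding the bottom + spanned slabs
                self.bottoms[j].append(rect)
        self.children = [SlabNode(c, r, leaf_size) if c else None for c in inside]

    def slab(self, y):
        return bisect_right(self.cuts, y)

    def query(self, p):
        px, py = p
        if self.is_leaf:
            return [q for q in self.rects
                    if q[0] <= px <= q[2] and q[1] <= py <= q[3]]
        j = self.slab(py)
        # One-sided query 1: top sides at or above p.
        out = [q for q in self.tops[j] if q[0] <= px <= q[2] and py <= q[3]]
        # One-sided query 2: bottom sides at or below p.
        out += [q for q in self.bottoms[j] if q[0] <= px <= q[2] and q[1] <= py]
        if self.children[j] is not None:
            out += self.children[j].query(p)
        return out
```

A rectangle that crosses slab boundaries is stored once in every slab it intersects at a single node, so it appears at most r times, matching the redundancy argument on the slide; `SlabNode(rects, r=4).query((x, y))` returns the enclosing rectangles.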

  21. Conclusions
      • A tight lower bound on the tradeoff between the redundancy and access overhead of any index for 2D point enclosure queries, given in the new indexability model
      • A matching upper bound in the external pointer machine
      The End
