
I/O-Efficient Structures for Orthogonal Range Max and Stabbing Max Queries

I/O-Efficient Structures for Orthogonal Range Max and Stabbing Max Queries. Second Year Project Presentation. Ke Yi. Advisor: Lars Arge. Committee: Pankaj K. Agarwal and Jun Yang.


Presentation Transcript


  1. I/O-Efficient Structures for Orthogonal Range Max and Stabbing Max Queries • Second Year Project Presentation • Ke Yi • Advisor: Lars Arge • Committee: Pankaj K. Agarwal and Jun Yang

  2. Problem Definition: Range Max Queries • Range-aggregate queries: range-count, range-sum, range-max • $N$ points in $\mathbb{R}^d$ • Each point $p$ is associated with a weight $w(p)$ • Query rectangle $Q$ • Compute $\max\{w(p) \mid p \in Q\}$ • Static and dynamic
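As a reference point for the definition above, here is a minimal brute-force sketch of a range max query. It is not one of the structures from the talk: it scans every point, and the name `range_max`, the tuple layout, and the choice of a closed query rectangle are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed layout: each point is (coordinates, weight).
Point = Tuple[Sequence[float], float]


def range_max(points: Sequence[Point],
              lo: Sequence[float],
              hi: Sequence[float]) -> Optional[float]:
    """Return max{w(p) | p in Q} for the closed box Q = [lo, hi], or None if Q is empty."""
    best = None
    for coords, weight in points:
        # p lies in Q when every coordinate is inside the matching side of the box.
        inside = all(l <= c <= h for l, c, h in zip(lo, coords, hi))
        if inside and (best is None or weight > best):
            best = weight
    return best


points = [((1, 1), 5.0), ((2, 3), 9.0), ((4, 2), 7.0)]
print(range_max(points, (0, 0), (2, 3)))  # 9.0
```

A scan like this touches all N points on every query, which is the cost the indexed structures in the later slides are designed to avoid.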

  3. Problem Definition: Stabbing Max Queries • $N$ hyper-rectangles in $\mathbb{R}^d$ • Each rectangle $\gamma$ is associated with a weight $w(\gamma)$ • Query point $q$ • Compute $\max\{w(\gamma) \mid q \in \gamma\}$
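The stabbing variant swaps the roles of points and rectangles. The sketch below is again a brute-force baseline rather than the structure from the talk; `stabbing_max`, the `(lo, hi, weight)` layout, and closed rectangles are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple

# Assumed layout: each hyper-rectangle is (lower corner, upper corner, weight).
Rect = Tuple[Sequence[float], Sequence[float], float]


def stabbing_max(rects: Sequence[Rect], q: Sequence[float]) -> Optional[float]:
    """Return the maximum weight over all rectangles containing q, or None if none does."""
    best = None
    for lo, hi, weight in rects:
        # q stabs the rectangle when it lies inside it in every dimension.
        stabbed = all(l <= x <= h for l, x, h in zip(lo, q, hi))
        if stabbed and (best is None or weight > best):
            best = weight
    return best


rects = [((0, 0), (4, 4), 3.0), ((2, 2), (6, 6), 8.0), ((5, 0), (9, 3), 6.0)]
print(stabbing_max(rects, (3, 3)))  # 8.0
```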

  4. Model • I/O Model • $N$: elements in structure • $B$: elements per block • $M$: elements in main memory • $n = N/B$ • Assumptions • $M > B^2$ • Each word holds $\log_2 N$ bits • Any coordinate or weight can be stored in one word • [Figure: disk D, block I/O, main memory M, processor P]
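To get a feel for the model parameters, the following sketch plugs in sample values and compares a base-B logarithm with a base-2 logarithm of the block count. The concrete values of N, B, and M are made up for illustration, and the helper names `blocks` and `log_base` are not from the talk.

```python
import math


def blocks(N: int, B: int) -> int:
    """Number of blocks n needed to hold N elements at B elements per block (rounded up)."""
    return -(-N // B)


def log_base(x: float, b: float) -> float:
    """Logarithm of x to base b."""
    return math.log(x) / math.log(b)


# Hypothetical parameters, chosen only so that the assumption M > B^2 holds.
N, B, M = 10**9, 10**3, 10**7
n = blocks(N, B)
assert M > B**2

# A B-tree style search costs about log_B n I/Os; a binary structure costs about log_2 n.
print(n, round(log_base(n, B), 3), round(math.log2(n), 3))
```

With these sample values the base-B logarithm is about 2 while the base-2 logarithm is about 20, which is why the bounds in the following slides are stated in terms of logarithms to base B.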

  5. Related Work & Our Results: Range Queries • 1D range queries are easy: B-tree • $O(n)$ space, $O(\log_B n)$ query & update • 2D range queries: • Poly-logarithmic query: CRB-tree [AAG03] • $O(n \log_B n)$ space, $O(\log_B^2 n)$ query • Linear space: kdB-tree, cross-tree, O-tree • [bound not preserved in transcript] query, $O(\log_B n)$ update • Our results: [not preserved in transcript]
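The remark that 1D range max is easy with a B-tree can be illustrated with a static, in-memory stand-in: a tree of fan-out B in which every internal entry stores the maximum weight of its subtree. This is a sketch under assumptions, not the structure from the talk: it is array-based, supports no updates, and the names `build_levels` and `range_max_1d` are hypothetical.

```python
import bisect
from typing import List, Optional, Sequence


def build_levels(weights: Sequence[float], B: int) -> List[List[float]]:
    """levels[0] holds the weights in x-order; each higher level stores the max of B entries below."""
    levels = [list(weights)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([max(prev[i:i + B]) for i in range(0, len(prev), B)])
    return levels


def range_max_1d(xs: Sequence[float], levels: List[List[float]], B: int,
                 x_lo: float, x_hi: float) -> Optional[float]:
    """Max weight among points with x_lo <= x <= x_hi; xs must be sorted."""
    l = bisect.bisect_left(xs, x_lo)   # first index inside the range
    r = bisect.bisect_right(xs, x_hi)  # one past the last index inside the range
    best = None
    lvl = 0
    while l < r:
        # Consume unaligned entries at both ends: fewer than B per side per level.
        while l < r and l % B:
            v = levels[lvl][l]
            best = v if best is None else max(best, v)
            l += 1
        while l < r and r % B:
            r -= 1
            v = levels[lvl][r]
            best = v if best is None else max(best, v)
        # What remains is aligned, so it is covered by whole entries one level up.
        l //= B
        r //= B
        lvl += 1
    return best


xs = [1, 3, 4, 7, 9, 12, 15, 20, 22]
weights = [5, 2, 8, 1, 9, 3, 7, 4, 6]
levels = build_levels(weights, 3)
print(range_max_1d(xs, levels, 3, 3, 15))  # 9
```

Each level contributes fewer than 2B entries, so in the I/O model a query would read a constant number of blocks per level, in line with the logarithmic (base B) query bound quoted on the slide.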

  6. Related Work & Our Results: Stabbing Queries • 1D stabbing queries • SB-tree [YW01] • $O(n)$ space, $O(\log_B n)$ query & insert • Does not allow deletions! • 2D stabbing queries • No structures with worst-case guarantee • Our results: [not preserved in transcript]

  7. 2D Range Max Queries • The external version of Chazelle’s structure [C88] • Linear space • Static: $O(\log^{1+\varepsilon} N)$ query • Dynamic: $O(\log^3 N \log\log N)$ query & update • Overall structure • A normal B-tree $\Phi$ on the y-coordinates of all the points • A fan-out base B-tree $T$ on the x-coordinates • $P_v$: all points stored in the subtree of $v$ • Each internal node $v$ stores two secondary structures $C_v$ and $M_v$, which store information about $P_v$ in a compressed manner • $C_v$ and $M_v$ of size $O(|P_v| / \log_B n)$ → linear size in total • Weights of points stored explicitly at the leaves

  8. 2D Range Max Queries • $C_v$ borrowed from the CRB-tree • Compute the ranks of the points one level down in $O(1)$ I/Os • Identify the weight of a point explicitly in $O(\log_B n)$ I/Os • $M_v$ computes the maximum weight in a multislab in $O(\log_B n)$ I/Os • Answering a query: • Use $\Phi$ to compute the ranks in the root of $T$ • Use $M_v$ to compute the maximum at each level • For a total of $O(\log_B^2 n)$ I/Os • [Figure: node $v$ with children $v_1, \dots, v_6$]
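The per-level step of the query can be pictured as splitting the x-range of the query at a node into one multislab of fully covered child slabs plus at most two partially covered slabs that are handled further down the tree. The sketch below shows only that splitting step; the helper `split_query`, the half-open slab convention, and the sample boundaries are assumptions, and the rank bookkeeping for the y-range is omitted.

```python
import bisect
from typing import List, Optional, Sequence, Tuple


def split_query(boundaries: Sequence[float], x1: float, x2: float
                ) -> Tuple[Optional[Tuple[int, int]], List[int]]:
    """Split [x1, x2] at a node whose child slab i is [boundaries[i], boundaries[i+1]).

    Returns (multislab, partial): `multislab` is the inclusive index range of child
    slabs lying strictly between the two end slabs (None if there are none), and
    `partial` lists the child slabs that contain an endpoint of the query.
    """
    last = len(boundaries) - 2  # index of the rightmost child slab

    def slab_of(x: float) -> int:
        # Clamp so that endpoints on or beyond the outer boundaries map to an end slab.
        return min(max(bisect.bisect_right(boundaries, x) - 1, 0), last)

    i1, i2 = slab_of(x1), slab_of(x2)
    if i1 == i2:
        return None, [i1]           # the whole query falls into a single child
    multislab = (i1 + 1, i2 - 1) if i1 + 1 <= i2 - 1 else None
    return multislab, [i1, i2]


boundaries = [0, 10, 20, 30, 40]
print(split_query(boundaries, 5, 35))   # ((1, 2), [0, 3])
print(split_query(boundaries, 12, 18))  # (None, [1])
```

In this picture the multislab part is what the secondary structure at the node answers directly, while the partial slabs are where the search continues one level down.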

  9. 2D Range Max Queries: $M_v$ • Divide $P_v$ into chunks of size $B \log_B N$ • Divide each chunk into minichunks of size $B$ • Three-level structure • $M_v = (\Psi_1, \Psi_2, \Psi_3)$ • Each of size $O(|P_v| / \log_B n)$

  10. 2D Range Max Queries: $M_v$ • Basic idea: encode the range max information in a compressed manner, and identify the maximum point using $C_v$ once its rank is found • $\Psi_3[l]$: for each minichunk, stores a (slab index, weight rank) pair for each point inside the minichunk • Find the rank of the maximum-weight point in $O(1)$ I/Os • Identify it in $O(\log_B N)$ I/Os • $\Psi_2[k]$: for each chunk, encode a Cartesian tree on the $O(\log_B N)$ minichunks for each of the $O(B)$ multislabs • Find the minichunk containing the maximum-weight point in $O(1)$ I/Os • Use $\Psi_3$ to find the exact point in $O(\log_B N)$ I/Os • $\Psi_1$: a fan-out B-tree on the $O(|P_v| / (B \log_B n))$ chunks • Find the maximum-weight point in $O(\log_B N)$ I/Os
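The Cartesian tree used in the middle level can be illustrated in plain memory. The sketch below builds a max-Cartesian tree over a sequence of values (think of one value per minichunk) and answers "which position holds the maximum in positions i..j" by walking down from the root. It does not attempt the compact bit-level encoding the slide refers to, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_cartesian_tree(values: Sequence[float]) -> Tuple[int, List[int], List[int]]:
    """Build a max-Cartesian tree; returns (root, left, right) with -1 for a missing child."""
    n = len(values)
    left = [-1] * n
    right = [-1] * n
    stack: List[int] = []  # indices on the rightmost path, values non-increasing
    for i in range(n):
        last = -1
        # Everything smaller than values[i] on the rightmost path becomes i's left subtree.
        while stack and values[stack[-1]] < values[i]:
            last = stack.pop()
        left[i] = last
        if stack:
            right[stack[-1]] = i
        stack.append(i)
    root = stack[0] if stack else -1
    return root, left, right


def argmax_in_range(root: int, left: List[int], right: List[int], i: int, j: int) -> int:
    """Index of a maximum value among positions i..j (inclusive), or -1 if the tree is empty."""
    node = root
    while node != -1:
        if node < i:
            node = right[node]   # the whole range lies to the right of this node
        elif node > j:
            node = left[node]    # the whole range lies to the left of this node
        else:
            return node          # first node inside the range is the range maximum
    return -1


values = [3, 8, 2, 9, 4, 7]
root, left, right = build_cartesian_tree(values)
print(argmax_in_range(root, left, right, 0, 2))  # 1
print(argmax_in_range(root, left, right, 4, 5))  # 5
```

The useful property is that the answer depends only on the shape of the tree, not on the stored values, which is what makes a compact encoding of the shape sufficient for locating the maximum.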

  11. 2D Range Max Queries • Static structures • $O(n)$ size, $O(\log_B^2 N)$ query, $O(n \log_B N)$ construction • $O(n)$ size, $O(\log_B^{1+\varepsilon} N)$ query, $O(N \log_B N)$ construction • Dynamization: • Throw away $\Psi_2$ and expand $\Psi_3$ • $O(n \log_B \log_B N)$ size • $O(\log_B^3 N)$ query, worst case • $O(\log_B^2 N \cdot \log_{M/B} \log_B N)$ insert, amortized • $O(\log_B^2 N)$ delete, amortized • Extending to $d$ dimensions • Standard technique • Pay an extra $O(\log_B^{d-2} N)$ factor on all these bounds

  12. 1D Stabbing Max Queries • Modify the external interval tree [AV96] to support max • Fan-out base B-tree on x-coordinates • Interval stored in the highest node $v$ where it contains a slab boundary • In one left (right) slab structure and the multislab structure • Answering a query • Search down the tree and visit $O(\log_B N)$ nodes • Compute the maximum weight in the left (right) slab structure and the multislab structure

  13. 1D Stabbing Max Queries • Slab structures are implemented using B-trees • Query and update: $O(\log_B N)$ I/Os • Multislab structure: fan-out B-tree • At each internal node, we store the maximum weight for each of the slabs and for each of the children • Query: $O(1)$ I/Os (only look at the root) • Update: $O(\log_B N)$ I/Os • Rebalancing the base tree: $O(\log_B N)$ I/Os • Weight-balanced B-trees • Overall cost: size $O(n)$, query $O(\log_B^2 N)$, update $O(\log_B N)$
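The role of the multislab structure at a single node can be mimicked in memory as follows. Each interval is represented only by the range of child slabs it spans completely, and a stabbing query for a slab looks at the current maximum of every multislab covering that slab. This is a simplified stand-in, not the B-tree from the slide: the class `MultislabMax` and its methods are hypothetical, and sorted Python lists replace the on-disk tree.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class MultislabMax:
    """Per-node summary: for each multislab (first, last), the weights of intervals spanning it."""

    def __init__(self) -> None:
        self._weights: Dict[Tuple[int, int], List[float]] = {}

    def insert(self, first: int, last: int, w: float) -> None:
        # Keep each weight list sorted so the maximum is always the last element.
        bisect.insort(self._weights.setdefault((first, last), []), w)

    def delete(self, first: int, last: int, w: float) -> None:
        ws = self._weights.get((first, last), [])
        i = bisect.bisect_left(ws, w)
        if i == len(ws) or ws[i] != w:
            raise KeyError((first, last, w))
        ws.pop(i)
        if not ws:
            del self._weights[(first, last)]

    def stab(self, slab: int) -> Optional[float]:
        """Max weight over all stored multislabs that cover the given slab index."""
        best = None
        for (first, last), ws in self._weights.items():
            if first <= slab <= last and (best is None or ws[-1] > best):
                best = ws[-1]
        return best


m = MultislabMax()
m.insert(1, 3, 5.0)
m.insert(2, 2, 9.0)
m.insert(0, 1, 4.0)
print(m.stab(2))  # 9.0
m.delete(2, 2, 9.0)
print(m.stab(2))  # 5.0
```

Because all weights are retained rather than just a running maximum, removing the current maximum simply exposes the next one, which is the property a max structure needs in order to support deletions.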

  14. 1D Stabbing Max Queries • Space-time tradeoff: • $O(n \log_B^{\varepsilon} N)$ size • $O(n \log_B^{2-\varepsilon} N)$ query • Can handle general semigroup queries • A semigroup $(S, +)$ • Each weight $w(\gamma) \in S$ • Want to compute $\sum_{q \in \gamma} w(\gamma)$ • Ideas can also be used to improve the internal memory algorithm • Linear size, $O(\log^2 N / \log\log N)$ query and update
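The semigroup generalization only changes how the weights of the stabbed intervals are combined. A brute-force sketch, with the hypothetical function `stabbing_semigroup` and closed 1D intervals as assumptions, makes that explicit by taking the operation as a parameter:

```python
import operator
from functools import reduce
from typing import Callable, Optional, Sequence, Tuple, TypeVar

W = TypeVar("W")


def stabbing_semigroup(intervals: Sequence[Tuple[float, float, W]],
                       q: float,
                       op: Callable[[W, W], W]) -> Optional[W]:
    """Combine, with the associative operation `op`, the weights of all intervals containing q."""
    hits = [w for lo, hi, w in intervals if lo <= q <= hi]
    # A semigroup need not have an identity element, so an empty result is reported as None.
    return reduce(op, hits) if hits else None


intervals = [(0, 5, 3), (2, 8, 7), (6, 9, 4)]
print(stabbing_semigroup(intervals, 4, max))           # 7
print(stabbing_semigroup(intervals, 4, operator.add))  # 10
```

Passing `max` recovers the stabbing max query, while `operator.add` gives a stabbing sum. A semigroup has no inverses, so a structure for the general case cannot rely on subtracting a deleted weight back out.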

  15. 2D Stabbing Max Queries • Extend our 1D stabbing query structure • Use our 2D range query structure as a building block • Extending to $d$ dimensions • Standard technique • Pay an extra $O(\log_B^{d-2} N)$ factor on all these bounds

  16. Conclusions and Open Problems • In this project, we developed I/O-efficient • linear space structures with poly-logarithmic query cost for the static 2D range max queries • near linear space structures with poly-logarithmic query & update cost for the dynamic 2D range max queries • linear space structures with poly-logarithmic query cost for the dynamic 1D stabbing max queries • near linear space structures with poly-logarithmic query & update cost for the dynamic 2D stabbing max queries • Open problems • Linear size dynamic structures for the 2D range & stabbing max queries? • General semigroup queries?

  17. THE END Thank you!
