
A Self-adjusting Data Structure for Multi-dimensional Point Sets

Eunhui Park & David M. Mount, University of Maryland, Sep. 2012


Presentation Transcript


  1. A Self-adjusting Data Structure for Multi-dimensional Point Sets. Eunhui Park & David M. Mount, University of Maryland, Sep. 2012

  2. Motivation
     • Sleator & Tarjan introduced the splay tree almost 30 years ago.
     • Self-adjusts to the access distribution
     • Supports insertion and deletion in O(log n) amortized time
     • Efficient access:
       - Balance property: m accesses in O((m+n) log n) time
       - Scanning property [Elmasry 2004]: access all items in O(n) time
       - Working set property: … on temporal locality
       - Static optimality property: efficient access based on frequency
       - Static & dynamic finger [Cole, 2000] properties: … on spatial locality
     Is there a multi-dimensional generalization?

  3. Background
     • Compressed Quadtree
       - Hierarchical partition of space
       - O(n) space
       - O(log n) access time if augmented:
         Topology tree [Frederickson 1985, Har-Peled 2005]
         Skip quadtree [Eppstein, Goodrich, Sun 2005]
         Quadtreap [Mount, Park 2010], based on the treap [Seidel, Aragon 1996]
     • Efficient approximate proximity queries
       - Approximate nearest neighbor search
       - Approximate range search

  4. Objective
     • Like quadtrees:
       - A versatile geometric partition tree
       - Supports efficient approximate proximity queries
     • Like splay trees:
       - Adjusts to the access distribution
       - Supports insertion/deletion in O(log n) amortized time
       - Supports splay tree access properties: balance, static optimality, working set, static finger
     Quadtree + Splay tree = Splay Quadtree

  5. Overview
     • BD-tree
       - BD-tree
       - Rotation
     • Splaying operation
       - Basic splaying
       - Splaying
     • Efficiency
       - Insertion/deletion
       - Search and access efficiency

  6. BD-tree
     Box Decomposition tree (BD-tree): a geometric data structure based on a hierarchical decomposition of space into d-dimensional axis-aligned rectangles.
     • Each node is associated with a region of space called a cell.
     • Each cell is defined by an outer box and an optional inner box.
     • Partition operations: split and shrink.
     • Internal nodes: split nodes and shrink nodes.
     • Each leaf has a single point or a single inner box.
     [Figure: boxes, cells, and leaves of a BD-tree]
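The cell definition above (an outer box minus an optional inner box) can be sketched as a small data type. This is a minimal illustration, not the authors' implementation: the names `Box` and `Cell` and the half-open interval convention are assumptions made here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (hypothetical helper)."""
    lo: Point
    hi: Point

    def contains(self, p: Sequence[float]) -> bool:
        # Assumption: half-open intervals [lo, hi), so neighboring boxes share no point.
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))


@dataclass(frozen=True)
class Cell:
    """A cell is the outer box with the optional inner box removed."""
    outer: Box
    inner: Optional[Box] = None

    def contains(self, p: Sequence[float]) -> bool:
        if not self.outer.contains(p):
            return False
        # A point inside the inner box belongs to a different cell.
        return self.inner is None or not self.inner.contains(p)


cell = Cell(Box((0.0, 0.0), (4.0, 4.0)), Box((1.0, 1.0), (2.0, 2.0)))
```

For example, `cell.contains((3.0, 3.0))` holds, while `(1.5, 1.5)` falls in the inner box and is therefore excluded from the cell.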

  7. BD-tree: Partitioning Operations
     • Split: partitions a cell by an axis-orthogonal hyperplane that bisects the cell's longest side.
     • Shrink: partitions a cell by a shrinking box, which lies within the cell.
     [Figure: split divides cell C into left and right children (D and E); shrink divides cell C into an inner child F and an outer child C\F]
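The two partitioning operations can be sketched as follows. This is an illustrative sketch under stated assumptions: boxes are `(lower corner, upper corner)` tuples, cells are `(outer box, inner box or None)` pairs, ties for the longest side go to the lowest axis, and the function names are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]            # (lower corner, upper corner)
Cell = Tuple[Box, Optional[Box]]     # (outer box, optional inner box)


def split(box: Box) -> Tuple[Box, Box]:
    """Bisect the box along its longest side; returns (low half, high half)."""
    lo, hi = box
    # Index of the longest side; max() keeps the first axis on ties (an assumption).
    axis = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[axis] + hi[axis]) / 2.0
    low_hi = hi[:axis] + (mid,) + hi[axis + 1:]
    high_lo = lo[:axis] + (mid,) + lo[axis + 1:]
    return (lo, low_hi), (high_lo, hi)


def shrink(outer: Box, inner: Box) -> Tuple[Cell, Cell]:
    """Partition a box by a shrinking box; returns (inner cell, outer cell)."""
    (olo, ohi), (ilo, ihi) = outer, inner
    # The shrinking box must lie within the cell being partitioned.
    if not all(a <= b and c <= d for a, b, c, d in zip(olo, ilo, ihi, ohi)):
        raise ValueError("inner box must lie within the outer box")
    # Inner child: the shrinking box itself. Outer child: outer box minus inner box.
    return (inner, None), (outer, inner)
```

Calling `split(((0, 0), (4, 2)))` bisects along the x-axis, the longer side, and `shrink` returns the pair written F and C\F on the slide.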

  8. BD-tree: Promotion
     • By construction, nodes are generated in shrink-split pairs. We merge each pair into a single ternary node, called a pseudo-node.
     • The tree can be restructured through a local operation, called promotion.
     [Figure: a shrink node (outer, inner) and a split node (left, right) merged into a pseudo-node (left, outer, right); promotion illustrated on nodes x and y with subtrees A to E]

  9. Splay Quadtree
     • Given an internal node x, splay(x) uses promotions to move x to the root of the tree.
     • This makes future accesses to x more efficient.
     [Figure: a tree before and after splay(x)]

  10. Basic Splaying
      • As in Sleator & Tarjan, splaying is based on primitive operations:
        - Zig-zag
        - Zig-zig
      [Figure: zig-zag and zig-zig applied to nodes x, y, z with subtrees A to G]
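For readers unfamiliar with the primitives named above, here is a sketch of zig, zig-zig and zig-zag in an ordinary one-dimensional binary search tree. It is the classical Sleator & Tarjan setting written in a simple recursive style, not the promotion-based splaying of the splay quadtree; the `Node` class and function names are assumptions made for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def rotate_right(y):
    """Lift y's left child above y."""
    x = y.left
    y.left, x.right = x.right, y
    return x


def rotate_left(x):
    """Lift x's right child above x."""
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Move the node holding key (or the last node reached) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig: rotate grandparent first
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag: rotate parent first
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        # Final single rotation (zig) if there is still a left child to lift.
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig, mirrored
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag, mirrored
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)
```

On a left-leaning chain 3, 2, 1, calling `splay(root, 1)` performs a zig-zig and returns the node with key 1 as the new root.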

  11. The Problem of Right Promotion
      • Inner-left convention:
        - If an internal node's cell has an inner box, it resides in its left child.
        - If necessary, left and right children are relabeled to satisfy this.
        - This guarantees that each cell has constant complexity.
      • Right promotion may violate this convention.
      [Figure: right promotion on nodes x and y with subtrees A to E. If the cell involved already has an inner box u, then after the promotion y's cell has two inner boxes, u and v.]

  12. Splaying in 3 Phases
      • Promotions must be carefully structured to avoid this problem.
      • 3-phased approach (3 passes from bottom to top)
      • As in Sleator & Tarjan, amortized efficiency is established by a potential-based analysis.
      [Figure: three-pass splaying example on nodes a to g, with child labels L, O, R]

  13. Insertion and deletion
      • Insert(q):
        - locate the leaf x containing q
        - add q as a new leaf
        - splay(x)
      • Insertion can be performed in O(log n) amortized time.
      • Deletion can be performed in O(log n) amortized time.
      [Figure: leaf x before and after inserting q]
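The first two steps of Insert(q), locating the leaf and adding the new point, can be sketched on a plain two-dimensional point quadtree. This is a simplified stand-in, assumed for illustration only: it uses no compression, no shrink nodes, and it omits the final splay(x) step. The class `QNode` and the functions `locate` and `insert` are hypothetical names.

```python
class QNode:
    """Node of a plain 2-D point quadtree over the square cell [lo, hi)."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.point = None       # point stored here, if this is a nonempty leaf
        self.children = None    # four children once the node is internal

    def child_for(self, p):
        mx = (self.lo[0] + self.hi[0]) / 2
        my = (self.lo[1] + self.hi[1]) / 2
        # Quadrant index: bit 0 is the x half, bit 1 is the y half.
        return self.children[(p[0] >= mx) + 2 * (p[1] >= my)]

    def subdivide(self):
        (x0, y0), (x1, y1) = self.lo, self.hi
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [
            QNode((x0, y0), (mx, my)), QNode((mx, y0), (x1, my)),
            QNode((x0, my), (mx, y1)), QNode((mx, my), (x1, y1)),
        ]


def locate(root, q):
    """Step 1: descend to the leaf whose cell contains q."""
    node = root
    while node.children is not None:
        node = node.child_for(q)
    return node


def insert(root, q):
    """Step 2: add q as a new leaf, subdividing while the leaf is occupied."""
    leaf = locate(root, q)
    while leaf.point is not None:
        if leaf.point == q:
            return leaf                      # already present
        old, leaf.point = leaf.point, None
        leaf.subdivide()
        locate(leaf, old).point = old        # push the old point down one level
        leaf = locate(leaf, q)
    leaf.point = q
    return leaf                              # a splay quadtree would now splay here
```

In the splay quadtree, the returned leaf is where the restructuring begins; that step depends on the promotion machinery from the earlier slides and is deliberately left out here.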

  14. Analogous to Splay Trees
      • Balance Theorem: Total access for q_1, q_2, …, q_m takes O((m+n) log n) time.
      • Working Set Theorem: For each access q_j, let t_j be the number of different queries since the last access of q_j, or since the beginning if this is q_j's first access. Total m access queries take O(…) time.
      • Static Optimality Theorem: Given a quadtree subdivision Z, where each cell z ∈ Z has an access probability p_z, the entropy of Z is defined as H(Z) = Σ_{z ∈ Z} p_z log(1/p_z). Total m access queries take O(…) time.
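The slide presents these results as analogues of the classical splay tree theorems. As background only, the one-dimensional bounds from Sleator & Tarjan's analysis are listed below; they are not taken from this slide. Here n is the number of items, m the number of accesses, q(i) the number of times item i is accessed (assuming each item is accessed at least once), t_j the working-set count defined above, and f a fixed finger item, with items identified by their rank.

```latex
\begin{align*}
\text{Balance:} \quad
  & O\bigl((m+n)\log n + m\bigr) \\
\text{Static optimality:} \quad
  & O\Bigl(m + \sum_{i=1}^{n} q(i)\,\log\frac{m}{q(i)}\Bigr) \\
\text{Working set:} \quad
  & O\Bigl(n\log n + m + \sum_{j=1}^{m} \log(t_j + 1)\Bigr) \\
\text{Static finger:} \quad
  & O\Bigl(n\log n + m + \sum_{j=1}^{m} \log\bigl(|i_j - f| + 1\bigr)\Bigr)
\end{align*}
```

Writing p_i = q(i)/m, the static optimality sum equals m times the entropy of the access distribution, so that bound can also be read as O(m(1 + H)).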

  15. Static Finger Theorem
      • 1-dim (Sleator & Tarjan 83): for a fixed finger f, total access for i_1, i_2, …, i_m takes O(n log n + m + Σ_j log(|i_j − f| + 1)) time.
      • d-dim: for a single point …

  16. Static Finger Theorem
      • d-dim: but most geometric queries involve regions, not points.

  17. Static Finger Theorem
      • d-dim: … queries

  18. Static Finger Theorem
      • d-dim: for technical reasons, need to expand …

  19. Static Finger Theorem
      • d-dim: consider an expanded ball …
        - Define the working set to be the set of points within distance … from …
        - Total access for approximate range queries: … (1/ε)^(d-1) …
        - ANN queries
        - Box queries
      [Figure: the set of points in the expanded ball]

  20. Conclusions
      • Splay Quadtree:
        - Self-adjusting geometric data structure
        - Supports insertion/deletion in O(log n) amortized time
        - Supports efficient approximate proximity queries
      • Open problems:
        - Other properties of standard splay trees?
          Dynamic finger theorem
          Scanning theorem
        - Better notions of distance (or, more generally, locality) in a geometric setting?

  21. References

  22. Thank you!
