A Self-adjusting Data Structure for Multi-dimensional Point Sets

Eunhui Park & David M. Mount

University of Maryland

Sep. 2012

- Sleator & Tarjan introduced the splay tree almost 30 years ago.
- Self-adjusts to access distribution
- Supports insertion and deletion in O(log n) amortized time
- Efficient access:
- Balance property – m accesses in O((m+n) log n) time
- Scanning property [Elmasry 2004] – access all items in O(n) time
- Working set property – … on temporal locality
- Static optimality property – Efficient access based on frequency
- Static & dynamic finger [Cole, 2000] properties – … on spatial locality

Is there a multi-dimensional generalization?

- Compressed Quadtree
- Hierarchical partition of space
- O(n) space
- O(log n) access time if augmented:
- Topology tree [Frederickson 1985, Har-Peled 2005]
- Skip quadtree [Eppstein, Goodrich, Sun 2005]
- Quadtreap [Mount, Park 2010] based on treap [Seidel, Aragon 1996]

- Efficient approximate proximity queries
- Approximate nearest neighbor search
- Approximate range search
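
To make "approximate nearest neighbor" concrete, here is a minimal brute-force sketch. It assumes the usual definition: a candidate point is an ε-approximate nearest neighbor of q if its distance to q is at most (1 + ε) times the distance to the true nearest neighbor. The function names are illustrative and do not come from the presentation.

```python
import math

def nearest(points, q):
    """Exact nearest neighbor of q by brute force."""
    return min(points, key=lambda p: math.dist(p, q))

def is_eps_ann(points, q, candidate, eps):
    """True if candidate lies within (1 + eps) times the true nearest distance."""
    best = math.dist(nearest(points, q), q)
    return math.dist(candidate, q) <= (1.0 + eps) * best

points = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.3)]
q = (2.0, 2.0)
exact = nearest(points, q)                              # (3.0, 0.0)
accepted = is_eps_ann(points, q, (0.0, 3.3), eps=0.1)   # slightly farther, still accepted
rejected = is_eps_ann(points, q, (0.0, 0.0), eps=0.1)   # too far for eps = 0.1
```

A tree-based search can stop early once every unexplored cell is too far away to beat the current candidate by more than the (1 + ε) factor, which is where the efficiency gain over exact search comes from.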

- Like quadtrees:
- A versatile geometric partition tree
- Supports efficient approximate proximity queries

- Like splay trees:
- Adjusts to access distribution
- Supports insertion/deletion in O(log n) amortized time
- Supports splay tree access properties: balance, static optimality, working set, static finger

Quadtree + Splay tree → Splay Quadtree

- BD-tree
- BD-tree
- Rotation

- Splaying operation
- Basic splaying
- Splaying

- Efficiency
- Insertion/deletion
- Search and access efficiency

- Each node is associated with a region of space called a cell.
- Each cell is defined by an outer box and an optional inner box.
- Partition operations: split and shrink.
- Internal nodes: split nodes and shrink nodes.
- Each leaf has a single point or a single inner box.

Box Decomposition tree (BD-tree):

A geometric data structure based on a hierarchical decomposition of space into d-dimensional axis-aligned rectangles

- Split
Partitions a cell by an axis-orthogonal hyperplane that bisects the cell’s longest side.

- Shrink
Partitions a cell by a shrinking box, which lies within the cell.
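
The two partition operations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Box` and `Cell` classes and the function names are hypothetical, and the rule that an existing inner box must lie inside the shrinking box (so that every cell keeps at most one inner box) is an assumption made here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Box:
    lo: Tuple[float, ...]   # lower corner
    hi: Tuple[float, ...]   # upper corner

    def contains(self, other: "Box") -> bool:
        return all(a <= c and d <= b
                   for a, b, c, d in zip(self.lo, self.hi, other.lo, other.hi))

@dataclass(frozen=True)
class Cell:
    outer: Box
    inner: Optional[Box] = None   # optional hole; the cell is outer minus inner

def split(cell: Cell) -> Tuple[Cell, Cell]:
    """Bisect the longest side of the outer box; return (left, right)."""
    lo, hi = cell.outer.lo, cell.outer.hi
    axis = max(range(len(lo)), key=lambda i: hi[i] - lo[i])   # longest side
    mid = (lo[axis] + hi[axis]) / 2.0
    left = Box(lo, hi[:axis] + (mid,) + hi[axis + 1:])
    right = Box(lo[:axis] + (mid,) + lo[axis + 1:], hi)
    hole = cell.inner
    if hole is None:
        return Cell(left), Cell(right)
    if left.contains(hole):
        return Cell(left, hole), Cell(right)
    if right.contains(hole):
        return Cell(left), Cell(right, hole)
    raise ValueError("splitting plane cuts through the inner box")

def shrink(cell: Cell, box: Box) -> Tuple[Cell, Cell]:
    """Partition by a shrinking box; return (inner_cell, outer_cell)."""
    if not cell.outer.contains(box):
        raise ValueError("shrinking box must lie within the cell")
    if cell.inner is not None and not box.contains(cell.inner):
        raise ValueError("assumed: an existing inner box lies inside the shrinking box")
    return Cell(box, cell.inner), Cell(cell.outer, box)

c = Cell(Box((0, 0), (4, 2)))
d, e = split(c)                          # cut at x = 2, the midpoint of the longer side
f = Box((1, 0.5), (2, 1.5))
inner_cell, outer_cell = shrink(c, f)    # F and C\F
```

In this sketch the outer cell returned by `shrink` records `box` as its inner box, which matches the description of a cell as an outer box with an optional inner box.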

[Figure: split of a cell C into cells D and E (left and right children); shrink of a cell C by an inner box F into F (inner) and C\F (outer).]

- By construction, nodes are generated in shrink-split pairs. We merge each pair into a single ternary node, called a pseudo-node.
- Tree can be restructured through a local operation, called promotion.
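
The merging of a shrink-split pair into a pseudo-node can be sketched as a small data-structure transformation. This sketch assumes that the split node is the inner child of the shrink node, so that the pseudo-node's three children are left, right, and outer. All class and function names are hypothetical, and promotion itself is not modeled.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    label: str

@dataclass
class SplitNode:
    left: "Node"
    right: "Node"

@dataclass
class ShrinkNode:
    inner: SplitNode      # assumed: the inner child is the paired split node
    outer: "Node"

@dataclass
class PseudoNode:
    left: "Node"
    right: "Node"
    outer: "Node"

Node = Union[Leaf, SplitNode, ShrinkNode, PseudoNode]

def merge_pair(shrink: ShrinkNode) -> PseudoNode:
    """Collapse a shrink node and its split child into one ternary node."""
    split = shrink.inner
    return PseudoNode(left=split.left, right=split.right, outer=shrink.outer)

pair = ShrinkNode(inner=SplitNode(Leaf("A"), Leaf("B")), outer=Leaf("C"))
pseudo = merge_pair(pair)   # children: A (left), B (right), C (outer)
```

Working with ternary nodes removes the internal shrink-to-split edge, so a local restructuring step only has to reason about one kind of internal node.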

[Figure: a shrink node (children: outer, inner) and a split node (children: left, right) merged into a pseudo-node (children: left, right, outer); a promotion involving nodes x and y and subtrees A to E.]

- Given an internal node, x, splay(x) uses promotions to transform x to the root of the tree
- This makes future accesses to x more efficient

[Figure: splay(x) restructures a tree with nodes b to g so that x becomes the root.]

- As in Sleator & Tarjan, splaying is based on primitive operations:
- Zig-zag
- Zig-zig
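
For reference, the classical binary-tree versions of these steps (Sleator & Tarjan) can be sketched as below. This is the standard one-dimensional splay operation, not the pseudo-node promotion scheme of the splay quadtree. `rotate_up` plays the role that promotion plays in the BD-tree, and all names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate x above its parent, preserving the in-order sequence."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root using zig, zig-zig, and zig-zag steps."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                       # zig: parent is the root
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                       # zig-zig: rotate parent first
            rotate_up(x)
        else:
            rotate_up(x)                       # zig-zag: rotate x twice
            rotate_up(x)
    return x

def build_left_path(keys):
    """Degenerate tree: keys[0] is the root, each later key is a left child."""
    nodes = [Node(k) for k in keys]
    for parent, child in zip(nodes, nodes[1:]):
        parent.left, child.parent = child, parent
    return nodes

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

nodes = build_left_path([5, 4, 3, 2, 1])
root = splay(nodes[-1])
print(root.key, inorder(root))   # prints: 1 [1, 2, 3, 4, 5]
```

The zig-zig case rotates the parent before the node itself. That ordering is what distinguishes splaying from repeated single rotations and is what the amortized analysis relies on.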

[Figure: zig-zag and zig-zig steps on nodes x, y, and z with subtrees A to G.]

- Inner-left convention:
- If an internal node’s cell has an inner box, it resides in its left child
- If necessary, left and right children are relabeled to satisfy this

- This guarantees that each cell has constant complexity
- Right promotion may violate this convention

[Figure: a right promotion involving nodes x and y and subtrees A to E. Annotations: if the marked cell has an inner box, u, then after the promotion y’s cell has two inner boxes, u and v.]

- Promotions must be carefully structured to avoid this problem
- 3-phased approach (3 passes from bottom to top)
- As in Sleator & Tarjan, amortized efficiency is established by a potential-based analysis.

[Figure: example of the three-phase splaying on nodes a to g, with child links labeled L, R, and O.]

- Insert(q): locate the leaf x containing q, add q as a new leaf, then splay(x)

- Insertion can be performed in O(log n) amortized time.
- Deletion can be performed in O(log n) amortized time.

- Balance Theorem:
Total access for q1, q2, …, qm takes O((m+n) log n) time.

- Working Set Theorem:
For each access qj, let tj be the number of different queries since the last access of qj, or since the beginning if this is qj’s first access. Total m access queries take O().

- Static Optimality Theorem:
Given a quadtree subdivision Z, where each cell z ∈ Z has an access probability pz, the entropy of Z is defined as

H(Z) = Σ_{z ∈ Z} pz log(1/pz).

Total m access queries take O().
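
The quantities that these theorems are stated in terms of can be computed directly from an access sequence. The sketch below is illustrative only: it treats each distinct query as its own cell, which is an assumption, and the function names are hypothetical. For comparison, the classical one-dimensional splay tree bounds of Sleator & Tarjan are O(n log n + m + Σ log(tj + 1)) for the working set theorem and O(m + Σ q(i) log(m/q(i))) for static optimality, where q(i) is the access frequency of item i. Whether the splay quadtree bounds take exactly these forms is not stated above.

```python
import math
from collections import Counter

def working_set_numbers(accesses):
    """t_j: number of distinct items accessed since the previous access of
    accesses[j], or since the beginning if this is its first access."""
    last_seen = {}
    result = []
    for j, q in enumerate(accesses):
        start = last_seen.get(q, -1) + 1          # index just after the previous access
        result.append(len(set(accesses[start:j])))
        last_seen[q] = j
    return result

def empirical_entropy(accesses):
    """Entropy (base 2) of the empirical access distribution."""
    m = len(accesses)
    return sum((c / m) * math.log2(m / c) for c in Counter(accesses).values())

seq = ["a", "b", "c", "a", "a", "b"]
t = working_set_numbers(seq)                       # [0, 1, 2, 2, 0, 2]
working_set_cost = sum(math.log2(tj + 1) for tj in t)
h = empirical_entropy(["a", "a", "b", "c"])        # 1.5 bits
```

Sequences with strong temporal locality give small tj values, and skewed access distributions give low entropy. These are the two situations in which a self-adjusting structure is expected to beat the O(log n) per-access balance bound.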

- 1-dim (Sleator & Tarjan 83)
Total access for i1, i2, …, im takes O(m).

- d-dim
- For a single point ,
- But most geometric queries involve regions, not points
- queries
- For technical reasons, need to expand
- Consider an expanded ball
- Let
- Define the working set to be the set of points within distance from
- Total access for approx. range queries :
(1/ε)^(d-1)
- ANN queries
- Box queries

: set of points in expanded ball
- Splay Quadtree:
- Self-adjusting geometric data structure
- Supports insertion/deletion in O(log n) amortized time
- Supports efficient approximate proximity queries

- Open problems:
- Other properties of standard splay trees?
- Dynamic finger theorem
- Scanning theorem

- Better notions of distance (or generally locality) in a geometric setting?
