
R-Trees - PowerPoint PPT Presentation



PowerPoint Slideshow about 'R-Trees' - miette

Presentation Transcript

• Extension of B+-trees.

• Collection of d-dimensional rectangles.

• A point in d-dimensions is a trivial rectangle.

• Non-rectangular data may be represented by minimum bounding rectangles (MBRs).
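A minimal sketch of the rectangle and MBR ideas above, assuming axis-aligned rectangles stored as a lower and an upper corner. The names `Rect`, `point`, and `mbr` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Rect:
    lo: Tuple[float, ...]  # lower corner, one coordinate per dimension
    hi: Tuple[float, ...]  # upper corner, same length as lo

    @staticmethod
    def point(*coords: float) -> "Rect":
        # A point is a trivial rectangle: both corners coincide.
        return Rect(tuple(coords), tuple(coords))

    def area(self) -> float:
        # Product of side lengths (area in 2-D, volume in d dimensions).
        result = 1.0
        for low, high in zip(self.lo, self.hi):
            result *= high - low
        return result


def mbr(rects: Iterable[Rect]) -> Rect:
    # Minimum bounding rectangle: per-dimension min of lows and max of highs.
    rects = list(rects)
    dims = range(len(rects[0].lo))
    lo = tuple(min(r.lo[i] for r in rects) for i in dims)
    hi = tuple(max(r.hi[i] for r in rects) for i in dims)
    return Rect(lo, hi)
```

Non-rectangular data such as a polygon could be bounded the same way, for example with `mbr(Rect.point(*p) for p in vertices)`, where `vertices` is a hypothetical list of coordinate tuples.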

• Insert

• Delete

• Find all rectangles that intersect a query rectangle.

• Good for large rectangle collections stored on disk.

• Data nodes (leaves) contain rectangles.

• Index nodes (non-leaves) contain MBRs for data in subtrees.

• MBR for rectangles or MBRs in a non-root node is stored in parent node.

• R-tree of order M.

• Each node other than the root has between m and M rectangles/MBRs, where m <= ceil(M/2).

• Typically, m = ceil(M/2); assume this henceforth.

• Root has between 2 and M rectangles/MBRs.

• Each index node has as many MBRs as children.

• All data nodes are at the same level.
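The structural rules above can be sketched as a small checker. This is an illustration under assumptions: 2-D boxes stored as `(xmin, ymin, xmax, ymax)` tuples, and a hypothetical `Node` class in which an index node keeps one MBR per child. The root rule follows the slide as written (between 2 and M entries).

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), 2-D only


@dataclass
class Node:
    is_leaf: bool
    boxes: List[Box] = field(default_factory=list)        # rectangles (leaf) or child MBRs (index)
    children: List["Node"] = field(default_factory=list)  # empty for leaves


def union(boxes: List[Box]) -> Box:
    # MBR of a list of boxes.
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def check(node: Node, M: int, is_root: bool = True) -> int:
    """Return the subtree height; raise AssertionError if a rule is violated."""
    m = math.ceil(M / 2)
    low = 2 if is_root else m
    assert low <= len(node.boxes) <= M, "node size out of range"
    if node.is_leaf:
        return 1
    assert len(node.boxes) == len(node.children), "one MBR per child"
    heights = set()
    for box, child in zip(node.boxes, node.children):
        assert box == union(child.boxes), "stored MBR does not match child"
        heights.add(check(child, M, is_root=False))
    assert len(heights) == 1, "data nodes are not all at the same level"
    return heights.pop() + 1
```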

• R-tree of order 4.

• Each node may have up to 4 rectangles/MBRs.

• Possible partitioning of our example data into 12 leaves.


Example

• Possible R-tree of order 4 with 12 leaves.

Leaves are data nodes that contain 4 input rectangles each.

a-p are MBRs


Example

• Possible corresponding grouping.


• Report all rectangles that intersect a given rectangle.

• Start at root and find all MBRs that overlap query.

• Search corresponding subtrees recursively.
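The recursive search described above might look like the following sketch. It assumes 2-D boxes as `(xmin, ymin, xmax, ymax)` tuples, closed rectangles (touching counts as intersecting), and a hypothetical `Node` class; none of these names come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    is_leaf: bool
    boxes: List[Box]                                      # rectangles (leaf) or child MBRs (index)
    children: List["Node"] = field(default_factory=list)


def intersects(a: Box, b: Box) -> bool:
    # Two closed rectangles overlap if they overlap in every dimension.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node: Node, query: Box) -> List[Box]:
    if node.is_leaf:
        # Data node: report the stored rectangles that intersect the query.
        return [box for box in node.boxes if intersects(box, query)]
    hits: List[Box] = []
    for box, child in zip(node.boxes, node.children):
        # Descend only into subtrees whose MBR overlaps the query.
        if intersects(box, query):
            hits.extend(search(child, query))
    return hits
```

Because MBRs of sibling subtrees may overlap, a query can descend into more than one subtree at each level.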


Query


• Search m.


• Similar to insertion into B+-tree but may insert into any leaf; leaf splits in case capacity exceeded.

• Which leaf to insert into?

• How to split a node?


Insert—Leaf Selection

• Follow a path from root to leaf.

• At each node move into subtree whose MBR area increases least with addition of new rectangle.

• In the example, the candidates are m, n, o, and p: compare how much each MBR's area grows when the new rectangle is inserted into it.
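Leaf selection by least area increase can be sketched as below. The `Node` class and the 2-D `(xmin, ymin, xmax, ymax)` box layout are assumptions, and so is the tie-break (smaller current area wins), which the slides do not specify.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    is_leaf: bool
    boxes: List[Box]                                      # rectangles (leaf) or child MBRs (index)
    children: List["Node"] = field(default_factory=list)


def area(b: Box) -> float:
    return (b[2] - b[0]) * (b[3] - b[1])


def union(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr: Box, rect: Box) -> float:
    # How much the MBR's area grows if rect is added to it.
    return area(union(mbr, rect)) - area(mbr)


def choose_leaf(root: Node, rect: Box) -> Node:
    node = root
    while not node.is_leaf:
        # Move into the subtree whose MBR area increases least.
        # Assumed tie-break: prefer the MBR with the smaller current area.
        best = min(range(len(node.boxes)),
                   key=lambda i: (enlargement(node.boxes[i], rect), area(node.boxes[i])))
        node = node.children[best]
    return node
```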

Insert—Split A Node

• Split set of M+1 rectangles/MBRs into 2 sets A and B.

• A and B each have at least m rectangles/MBRs.

• Sum of areas of MBRs of A and B is minimum.

• Example: M = 8, m = 4, so M+1 = 9 rectangles are split into two sets of at least 4 each.

Insert—Split A Node

• Exhaustive search for best A and B.

• Compute area(MBR(A)) + area(MBR(B)) for each possible A.

• Note—for each A, the B is unique.

• Select partition that minimizes this sum.

• When |A| = m = ceil(M/2), number of choices for A is (M+1)! / (m!(M+1-m)!).

• Impractical for large M.
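To make the cost of exhaustive search concrete, here is a small sketch that counts the choices for A and brute-forces the best partition. It assumes 2-D boxes as `(xmin, ymin, xmax, ymax)` tuples; `count_choices` and `exhaustive_split` are illustrative names. Unlike the count on the slide, the search also tries sizes of A larger than m, as long as B keeps at least m entries.

```python
import math
from itertools import combinations
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(b: Box) -> float:
    return (b[2] - b[0]) * (b[3] - b[1])


def mbr(boxes: List[Box]) -> Box:
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def count_choices(M: int, m: int) -> int:
    # Number of ways to choose A with |A| = m out of M+1 rectangles.
    return math.comb(M + 1, m)


def exhaustive_split(boxes: List[Box], m: int):
    """Try every A with m <= |A| <= len(boxes) - m; return (cost, A, B)."""
    n = len(boxes)
    best = None
    for size in range(m, n - m + 1):
        for chosen in combinations(range(n), size):
            a = [boxes[i] for i in chosen]
            b = [boxes[i] for i in range(n) if i not in chosen]  # B is fixed by A
            cost = area(mbr(a)) + area(mbr(b))
            if best is None or cost < best[0]:
                best = (cost, a, b)
    return best
```

For M = 8 and m = 4, `count_choices(8, 4)` gives 126, which is still manageable; the count grows very quickly as M increases.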

• Grow A and B using a clustering strategy.

• Start with a seed rectangle a for A and b for B.

• Grow A and B one rectangle at a time.

• Stop when the M+1 rectangles have been partitioned into A and B.

Insert—Split A Node

• Let S be the set of M+1 rectangles to be partitioned.

• Find a and b in S that maximize

area(MBR(a,b)) – area(a) – area(b)
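The seed criterion above picks the pair that would waste the most area if placed in the same set. A minimal sketch, assuming 2-D boxes as `(xmin, ymin, xmax, ymax)` tuples; `pick_seeds` is an illustrative name. It examines every pair, so its cost is quadratic in the number of rectangles.

```python
from itertools import combinations
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(b: Box) -> float:
    return (b[2] - b[0]) * (b[3] - b[1])


def union(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def pick_seeds(boxes: List[Box]) -> Tuple[int, int]:
    """Return indices (i, j) maximizing area(MBR(a,b)) - area(a) - area(b)."""
    def waste(pair: Tuple[int, int]) -> float:
        a, b = boxes[pair[0]], boxes[pair[1]]
        return area(union(a, b)) - area(a) - area(b)

    return max(combinations(range(len(boxes)), 2), key=waste)
```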


Insert—Split A Node

• Find an unassigned rectangle c that maximizes

|(area(MBR(A,c)) – area(MBR(A))) – (area(MBR(B,c)) – area(MBR(B)))|


Insert—Split A Node

• Assign c to partition whose area increases least.

Insert—Split A Node

• Continue assigning in this way until all remaining rectangles must necessarily be assigned to one of the two partitions for that partition to have m rectangles.
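The growth phase can be sketched as follows, given two seeds that have already been chosen. The assumptions are 2-D boxes as `(xmin, ymin, xmax, ymax)` tuples, the illustrative name `distribute`, and a tie-break that sends a rectangle to A when both partitions would grow equally (the slides do not say how ties are resolved).

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(b: Box) -> float:
    return (b[2] - b[0]) * (b[3] - b[1])


def union(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def mbr(boxes: List[Box]) -> Box:
    result = boxes[0]
    for box in boxes[1:]:
        result = union(result, box)
    return result


def distribute(boxes: List[Box], seed_a: int, seed_b: int, m: int):
    group_a, group_b = [boxes[seed_a]], [boxes[seed_b]]
    rest = [box for i, box in enumerate(boxes) if i not in (seed_a, seed_b)]
    while rest:
        # Stop early if one partition needs every remaining rectangle to reach m.
        if len(group_a) + len(rest) <= m:
            group_a += rest
            break
        if len(group_b) + len(rest) <= m:
            group_b += rest
            break
        mbr_a, mbr_b = mbr(group_a), mbr(group_b)

        def growth(box: Box) -> Tuple[float, float]:
            # Area increase of each partition's MBR if box were added to it.
            return (area(union(mbr_a, box)) - area(mbr_a),
                    area(union(mbr_b, box)) - area(mbr_b))

        # Pick the rectangle c with the strongest preference for one partition.
        c = max(rest, key=lambda box: abs(growth(box)[0] - growth(box)[1]))
        rest.remove(c)
        grow_a, grow_b = growth(c)
        # Assign c to the partition whose area increases least (ties go to A).
        (group_a if grow_a <= grow_b else group_b).append(c)
    return group_a, group_b
```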

Insert—Split A Node

• Linear Method—seed selection.

• Choose a and b to have maximum normalized separation.

• Separation in x-dimension: find the rectangles with max x-separation, then divide by x-width to normalize.

• Separation in y-dimension: find the rectangles with max y-separation, then divide by y-width to normalize.

• Example: M = 8, m = 4.
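One possible reading of linear seed selection is sketched below. It rests on assumptions the slides do not state: along each dimension, the separation is measured between the rectangle with the highest low side and the rectangle with the lowest high side, and the width used for normalization is the extent of the whole set along that dimension. Boxes are 2-D `(xmin, ymin, xmax, ymax)` tuples, and `linear_pick_seeds` is an illustrative name.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def linear_pick_seeds(boxes: List[Box]) -> Tuple[int, int]:
    """Return indices of the two seeds with maximum normalized separation."""
    best = None
    for lo, hi in ((0, 2), (1, 3)):  # (low, high) coordinate positions for x, then y
        idx = range(len(boxes))
        highest_low = max(idx, key=lambda i: boxes[i][lo])  # rectangle starting furthest along
        lowest_high = min(idx, key=lambda i: boxes[i][hi])  # rectangle ending earliest
        width = max(b[hi] for b in boxes) - min(b[lo] for b in boxes)
        # Assumed handling of degenerate cases: skip this dimension.
        if width == 0 or highest_low == lowest_high:
            continue
        separation = (boxes[highest_low][lo] - boxes[lowest_high][hi]) / width
        if best is None or separation > best[0]:
            best = (separation, lowest_high, highest_low)
    if best is None:
        return 0, 1  # assumed fallback when no dimension separates the set
    return best[1], best[2]
```

Each dimension needs only one pass over the rectangles, which is where the linear cost comes from.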

• Linear Method—assign remainder.

• Assign remaining rectangles in random order.

• Rectangle is assigned to partition whose MBR area increases least.

• Stop when all remaining rectangles must be assigned to one of the partitions so that the partition has its minimum required m rectangles.


Delete

• If leaf doesn’t become deficient, simply readjust MBRs in path from root.

• If leaf becomes deficient, borrow from nearest sibling (if possible) and readjust MBRs.

• Otherwise, combine with sibling as in B+-tree.

• Could instead do a more global reorganization to get better R-tree.

• R*-tree

   • Leaf selection and node overflows in insertion handled differently.

• Hilbert R-tree

• R+-tree

   • Index nodes have non-overlapping rectangles.

   • A data object may be represented in several data nodes.

   • No upper bound on size of a data node.

   • No bounds (lower/upper) on degree of an index node.

• Cell tree

   • Combines BSP and R+-tree concepts.

   • Index nodes have non-overlapping convex polyhedrons.

   • No lower/upper bound on size of a data node.

   • Lower bound (but not upper) on degree of an index node.