
R-Trees

R-trees: a 2-dimensional indexing structure and the 2-dimensional version of the B-tree. R-trees can store a set of polygons (regions of a subdivision).

kamin

Presentation Transcript


  1. R-Trees 2-dimensional indexing structure

  2. R-trees • 2-dimensional version of the B-tree • Figure: B-tree of maximum degree 8; degree between 3 and 8 • Internal nodes with k children have k - 1 split values

  3. R-trees • Can store: • a set of polygons (regions of a subdivision) • a set of polygonal lines (or boundaries) • a set of points • a mix of the above • Stored objects may overlap

  4. R-trees • Originally by Guttman, 1984 • Dozens of variations and optimizations since • Suitable for windowing, point location and intersection queries • Heuristic structure, no order bounds (O(..)) • Tree with higher degree: suitable for background storage (short search paths); one node per disk block

  5. Definition R-tree • Every internal node contains entries (rectangle, pointer to child node) • All leaves contain entries (rectangle, pointer to object) in database or file • Rectangles are minimal bounding rectangles (MBR) • The root has ≥ 2 and ≤ M entries • All other nodes have at least m and at most M entries • All leaves have the same depth • m > 1 and M > 2m (e.g. m = 200; M = 1000)
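A minimal Python sketch of the definition above. The names `Entry`, `Node`, `mbr`, and `check` are illustrative and do not come from the slides. Allowing a root that is itself a leaf to hold fewer than two entries is an assumption; the slide only states the general root bound.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(rects):
    """Minimal bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


@dataclass
class Entry:
    rect: Rect
    child: Optional["Node"] = None  # set in internal nodes
    obj: object = None              # set in leaves (stand-in for a database pointer)


@dataclass
class Node:
    is_leaf: bool
    entries: List[Entry] = field(default_factory=list)


def check(node, m, M, is_root=True):
    """Return the height of `node`; raise AssertionError if an invariant fails."""
    n = len(node.entries)
    if is_root:
        # Assumption: a root that is a leaf may hold fewer than 2 entries.
        assert n <= M and (node.is_leaf or n >= 2)
    else:
        assert m <= n <= M
    if node.is_leaf:
        return 0
    heights = set()
    for e in node.entries:
        # Each stored rectangle must be the MBR of the child's entries.
        assert e.rect == mbr([c.rect for c in e.child.entries])
        heights.add(check(e.child, m, M, is_root=False))
    assert len(heights) == 1  # all leaves at the same depth
    return heights.pop() + 1
```

Calling `check(root, m, M)` on a hand-built tree is a cheap way to test insertion and deletion code against the definition.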

  6. Object descriptions

  7. Grouping of objects • Windowing query: the fewer rectangles intersected, the fewer subtrees to descend into

  8. Grouping of objects • Objects close together in same leaves → small rectangles → queries descend in only few subtrees • Group the child nodes under a parent node such that small rectangles arise

  9. Heuristics for fast queries • Small area of rectangles • Small perimeter of rectangles • Little overlap among rectangles • Well-filled nodes (tree less deep → fewer disk accesses on each search path)
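The first three heuristics are simple to compute. The sketch below assumes axis-aligned rectangles stored as `(xmin, ymin, xmax, ymax)` tuples; the function names are illustrative.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def perimeter(r):
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))


def overlap(a, b):
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def total_overlap(rects):
    """Sum of pairwise intersection areas among the rectangles of one node."""
    return sum(overlap(a, b) for a, b in combinations(rects, 2))
```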

  10. Example R-tree

  11. Object descriptions

  12. point containment query

  13. point containment query

  14. Searching in an R-tree • Q is the query object (point, window, object) • For each rectangle R in the current node, if Q and R intersect: • at an internal node, search recursively in the subtree under the pointer at R • at a leaf, get the object corresponding to R and test it for intersection with Q
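The search procedure can be sketched as follows. Nodes are represented here as plain dictionaries with a `"leaf"` flag and a list of `(rectangle, target)` entries, and the query Q is itself a rectangle (a point is a degenerate rectangle). The `exact_test` hook is a hypothetical stand-in for the test of the real object geometry against Q; by default the MBR is treated as the object.

```python
def intersects(a, b):
    """True if rectangles a and b (xmin, ymin, xmax, ymax) share at least a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, q, exact_test=lambda obj, q: True):
    """Return all objects whose MBR intersects q and that pass exact_test."""
    results = []
    for rect, target in node["entries"]:
        if not intersects(rect, q):
            continue  # this subtree or object cannot contribute
        if node["leaf"]:
            if exact_test(target, q):
                results.append(target)
        else:
            results.extend(search(target, q, exact_test))
    return results
```

Because entry rectangles may overlap, a query can descend into several subtrees of the same node, which is why no worst-case bound follows from the height of the tree alone.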

  15. Inserting in an R-tree • Determine the minimal bounding rectangle (MBR) R of the new object • When not yet at a leaf (Choose Subtree): • determine the entry rectangle whose area increment after insertion of R is smallest • enlarge this rectangle if necessary and insert R into its subtree • At a leaf: • if there is space, insert R; otherwise Split Node
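A sketch of the Choose Subtree step, assuming rectangles are `(xmin, ymin, xmax, ymax)` tuples. Breaking ties by the smaller current area is an assumption borrowed from Guttman's description; the slide does not specify a tie-break.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """MBR of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(rect, new):
    """Area increment needed for rect to also cover new."""
    return area(union(rect, new)) - area(rect)


def choose_subtree(entry_rects, new):
    """Index of the entry whose rectangle grows least when new is added."""
    return min(range(len(entry_rects)),
               key=lambda i: (enlargement(entry_rects[i], new), area(entry_rects[i])))
```

After descending, the chosen entry's rectangle is replaced by `union(rect, new)` so that it stays a bounding rectangle of its subtree.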

  16. Split Node • Divide the M+1 rectangles into two groups, each with at least m and at most M rectangles • Make a node for each group, with the rectangles and corresponding subtrees as entries • Hang the two new nodes under the parent node in the place of the overfull node; determine the new MBRs (if the root was overfull, make a new root with two child nodes) • If the parent has M+1 children, repeat Split Node with this parent

  17. Split Node, example New MBRs

  18. Strategies for Split Node, I • Determine R1 and R2 with largest MBR: the seeds for sets S1 and S2 • While |S1|, |S2| < M - m and not all rectangles distributed: • Take a not yet distributed rectangle Rj and add it to the set whose MBR increases least • Linear R-tree of Guttman, 1984
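A sketch of strategy I as stated on the slide. For clarity the seed search scans all pairs, which is not linear time, so this illustrates the distribution rule and is not a faithful reproduction of Guttman's linear seed selection. Assigning all leftover rectangles to the set that has not reached the bound is an assumption about what happens once the loop stops.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def mbr(rects):
    out = rects[0]
    for r in rects[1:]:
        out = union(out, r)
    return out


def split_largest_mbr(rects, m, M):
    """Split M + 1 rectangles into two groups using the slide's strategy I."""
    assert len(rects) == M + 1
    # Seeds: the pair whose common MBR is largest.
    i, j = max(combinations(range(len(rects)), 2),
               key=lambda p: area(union(rects[p[0]], rects[p[1]])))
    s1, s2 = [rects[i]], [rects[j]]
    rest = [r for k, r in enumerate(rects) if k not in (i, j)]
    while rest and len(s1) < M - m and len(s2) < M - m:
        r = rest.pop(0)  # take the next undistributed rectangle
        g1 = area(union(mbr(s1), r)) - area(mbr(s1))
        g2 = area(union(mbr(s2), r)) - area(mbr(s2))
        (s1 if g1 <= g2 else s2).append(r)
    # Assumption: whatever is left goes to the set that is not yet at the bound.
    (s2 if len(s1) >= M - m else s1).extend(rest)
    return s1, s2
```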

  19. Example Split Node I

  20. Strategies for Split Node, II • Determine R1 and R2 with largest area(MBR(R1 ∪ R2)) - area(R1) - area(R2): the seeds for sets S1 and S2 • While |S1|, |S2| < M - m and not all distributed: • Determine for every not yet distributed rectangle Rj: d1 = area increment of MBR(S1 ∪ Rj) (* w.r.t. MBR(S1) *), d2 = area increment of MBR(S2 ∪ Rj) (* w.r.t. MBR(S2) *) • Choose the Rj with maximal |d1 - d2|; add it to the set with smallest area increment • Quadratic R-tree of Guttman, 1984
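A sketch of the quadratic strategy under the same representation, with rectangles as `(xmin, ymin, xmax, ymax)` tuples. The loop bound follows the slide; sending all leftover rectangles to the set that has not reached the bound is an assumption.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def mbr(rects):
    out = rects[0]
    for r in rects[1:]:
        out = union(out, r)
    return out


def increments(r, b1, b2):
    """Area increments d1, d2 when r is added to the sets with MBRs b1, b2."""
    return area(union(b1, r)) - area(b1), area(union(b2, r)) - area(b2)


def quadratic_split(rects, m, M):
    assert len(rects) == M + 1
    # Seeds: the pair that wastes the most area when put in one rectangle.
    i, j = max(combinations(range(len(rects)), 2),
               key=lambda p: area(union(rects[p[0]], rects[p[1]]))
               - area(rects[p[0]]) - area(rects[p[1]]))
    s1, s2 = [rects[i]], [rects[j]]
    rest = [r for k, r in enumerate(rects) if k not in (i, j)]
    while rest and len(s1) < M - m and len(s2) < M - m:
        b1, b2 = mbr(s1), mbr(s2)
        # Pick the rectangle with the strongest preference for one set.
        def preference(r):
            d1, d2 = increments(r, b1, b2)
            return abs(d1 - d2)
        r = max(rest, key=preference)
        rest.remove(r)
        d1, d2 = increments(r, b1, b2)
        (s1 if d1 <= d2 else s2).append(r)
    # Assumption: leftovers go to the set that is not yet at the bound.
    (s2 if len(s1) >= M - m else s1).extend(rest)
    return s1, s2
```

Each round recomputes the increments for every remaining rectangle, which is where the quadratic cost comes from.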

  21. Example Split Node, II

  22. Strategies for Split Node, III • Determine R1 and R2 with largest area(MBR(R1 ∪ R2)) - area(R1) - area(R2): the seeds for sets S1 and S2 (* same as quadratic R-tree *) • Determine the axis with largest normalized separation of R1 and R2 (x-separation / x-range of MBR(R1 ∪ R2), or y-separation / y-range of MBR(R1 ∪ R2)) • Sort rectangles according to that axis (lower left corner) and split evenly in subsets of size (M+1) / 2 • Greene's split, 1989
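A sketch of this split, with rectangles as `(xmin, ymin, xmax, ymax)` tuples. Two details are assumptions: the separation along an axis is measured as the gap between the two seeds (negative if they overlap on that axis), and when M + 1 is odd the first group receives the extra rectangle.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def greene_split(rects):
    """Return (axis, group1, group2); axis 0 is x and axis 1 is y."""
    # Seeds as in the quadratic strategy: the most wasteful pair.
    i, j = max(combinations(range(len(rects)), 2),
               key=lambda p: area(union(rects[p[0]], rects[p[1]]))
               - area(rects[p[0]]) - area(rects[p[1]]))
    r1, r2 = rects[i], rects[j]
    box = union(r1, r2)

    def normalized_separation(axis):
        lo, hi = axis, axis + 2
        gap = max(r1[lo], r2[lo]) - min(r1[hi], r2[hi])
        extent = box[hi] - box[lo]
        return gap / extent if extent > 0 else 0.0

    axis = max((0, 1), key=normalized_separation)
    # Sort by the lower coordinate on the chosen axis and cut in the middle.
    ordered = sorted(rects, key=lambda r: r[axis])
    half = (len(rects) + 1) // 2
    return axis, ordered[:half], ordered[half:]
```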

  23. Example Split Node, III • Y-axis has largest normalized separation

  24. Deletion from an R-tree • Find the leaf (node) and delete the object; determine the new (possibly smaller) MBR • If the node is too empty (< m entries): • delete the node recursively at its parent • insert all entries of the deleted node into the R-tree • Note: insertion of entries/subtrees always occurs at the level they came from

  25. Insert as rectangle on middle level

  26. Insert in a leaf object

  27. R*-trees • Experimentally determined measures for choices at insertion (Choose Subtree, Split Node) • Experimentally determined algorithms for: • Choose Subtree • Split Node

  28. R*-trees; Choose Subtree • R1, …, Rp are the entry rectangles • At nodes directly above leaves: choose the entry (rectangle) with smallest overlap-increase • At higher nodes: choose the entry (rectangle) with smallest area-increase (same as before)
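The two criteria can disagree, as the sketch below illustrates. Rectangles are `(xmin, ymin, xmax, ymax)` tuples, and the overlap-increase of an entry is taken to be the change in its total intersection area with the other entries of the same node. The tie-breaks (area-increase, then area) are assumptions not stated on the slide.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def overlap(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def overlap_increase(rects, k, new):
    """Growth in overlap between entry k and its siblings if k absorbs new."""
    grown = union(rects[k], new)
    others = [r for i, r in enumerate(rects) if i != k]
    return (sum(overlap(grown, r) for r in others)
            - sum(overlap(rects[k], r) for r in others))


def rstar_choose_subtree(rects, new, children_are_leaves):
    def area_increase(i):
        return area(union(rects[i], new)) - area(rects[i])

    if children_are_leaves:
        key = lambda i: (overlap_increase(rects, i, new), area_increase(i), area(rects[i]))
    else:
        key = lambda i: (area_increase(i), area(rects[i]))
    return min(range(len(rects)), key=key)
```

With entries `(0, 0, 4, 4)` and `(5, 0, 9, 1)` and a new rectangle `(5, 3, 6, 4)`, the area criterion selects the first entry (increase 8 versus 12), while the overlap criterion selects the second, because growing the first entry would make it overlap its sibling.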

  29. R*-trees; Split Node • Determine split axis: • For both the x- and the y-axis: • sort the rectangles by smallest and largest coordinate • determine the M - 2m + 2 allowed distributions into two groups • determine for each: the perimeter of the two MBRs • add the M - 2m + 2 perimeter lengths • Choose the axis with smallest sum of perimeters • (Figure labels: m, m, M - 2m + 1)

  30. R*-trees; Split Node Determine split index (given the split axis): • Choose the distribution, among the M - 2m + 2, with the smallest area of intersection of the MBRs
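Both phases of the R*-tree split fit in a short sketch. Rectangles are `(xmin, ymin, xmax, ymax)` tuples and `rects` holds the M + 1 entries of the overfull node. Summing the perimeters over both sort orders of an axis and breaking overlap ties by the smaller total area are assumptions; the slides leave both points open.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def perimeter(r):
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))


def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def overlap(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def rstar_split(rects, m):
    """Return (axis, group1, group2); axis 0 is x and axis 1 is y."""
    n = len(rects)  # n = M + 1

    def distributions(axis):
        # Sort by lower, then by upper coordinate; each sort order yields
        # n - 2m + 1 = M - 2m + 2 cuts with at least m entries per group.
        for key in (axis, axis + 2):
            ordered = sorted(rects, key=lambda r: r[key])
            for k in range(m, n - m + 1):
                yield ordered[:k], ordered[k:]

    def perimeter_sum(axis):
        return sum(perimeter(mbr(g1)) + perimeter(mbr(g2))
                   for g1, g2 in distributions(axis))

    axis = min((0, 1), key=perimeter_sum)
    g1, g2 = min(distributions(axis),
                 key=lambda d: (overlap(mbr(d[0]), mbr(d[1])),
                                area(mbr(d[0])) + area(mbr(d[1]))))
    return axis, g1, g2
```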

  31. Nearest neighbor queries • An R-tree can be used for nearest neighbor queries • The idea is to perform a depth-first search (DFS), maintain the closest object found so far, and use its distance for pruning • (Figure labels: queried, closest object so far, pruned)
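A sketch of the pruned depth-first search for a query point. Nodes are dictionaries with a `"leaf"` flag and `(rectangle, target)` entries. Two simplifications are assumptions: the distance to an object is taken to be the distance to its MBR (exact only when the objects are points or rectangles), and entries are visited in order of increasing distance, which is one common ordering and not something the slide prescribes.

```python
import math


def mindist(p, r):
    """Smallest Euclidean distance from point p to rectangle r."""
    dx = max(r[0] - p[0], 0, p[0] - r[2])
    dy = max(r[1] - p[1], 0, p[1] - r[3])
    return math.hypot(dx, dy)


def nearest(node, p, best=(math.inf, None)):
    """Return (distance, object) of the nearest object found below node."""
    for rect, target in sorted(node["entries"], key=lambda e: mindist(p, e[0])):
        d = mindist(p, rect)
        if d >= best[0]:
            break  # entries are sorted, so all remaining ones are pruned too
        if node["leaf"]:
            best = (d, target)
        else:
            best = nearest(target, p, best)
    return best
```

A subtree is skipped as soon as its rectangle lies farther from the query point than the closest object found so far, since no object inside it can be closer.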


  33. Forced reinsert • Build R-tree by repeated insertion: the first inserted rectangles are possibly badly placed • Experiment: • make an R-tree by inserting 20,000 rectangles • again, but afterwards delete the first 10,000 inserted rectangles and insert them again • Search time improvement of 20-50%

  34. Summary R-trees • Versatile 2-dimensional search tree (referred to as: indexing structure, or spatial index) • Some variant used in most GIS • Well-suited for windowing, point location, intersection, and nearest neighbor queries • Heuristic structure, no order bounds ( O(..) ) • Dynamic; insertions and deletions supported • Tree with higher degree: well-suited for background storage (short search paths)
