1 / 14

B+-trees

B+-trees. Model of Computation. CPU. Memory. Data stored on disk(s) Minimum transfer unit: a page = b bytes or B records (or block) N records -> N/B = n pages I/O complexity: in number of pages. Disk. I/O complexity.

enye
Download Presentation

B+-trees

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. B+-trees

  2. Model of Computation CPU Memory • Data stored on disk(s) • Minimum transfer unit: a page = b bytes or B records (or block) • N records -> N/B = n pages • I/O complexity: in number of pages Disk

  3. I/O complexity • An ideal index has space O(N/B), update overhead O(1) or O(logB(N/B)) and search complexity O(a/B) or O(logB(N/B) + a/B) where a is the number of records in the answer • But, sometimes CPU performance is also important… minimize cache misses -> don’t waste CPU cycles

  4. B+-tree • Records must be ordered over an attribute, SSN, Name, etc. • Queries: exact match and range queries over the indexed attribute: “find the name of the student with ID=087-34-7892” or “find all students with gpa between 3.00 and 3.5”

  5. B+-tree:properties • Insert/delete at log F (N/B) cost; keep tree height-balanced.(F = fanout) • Minimum 50% occupancy (except for root). Each node contains d <= m <= 2d entries/pointers. • Two types of nodes: index nodes and data nodes; each node is 1 page (disk based method)

  6. Example 100 Root 120 150 180 30 3 5 11 120 130 180 200 100 101 110 150 156 179 30 35

  7. Index node 57 81 95 to keys to keys to keys to keys < 57 57£ k<81 81£k<95 95£

  8. Data node Struct { Key real; Pointr *long; } entry; Node entry[B]; From non-leaf node to next leaf in sequence 57 81 95 To record with key 57 To record with key 81 To record with key 85

  9. Insertion • Find correct leaf L. • Put data entry onto L. • If L has enough space, done! • Else, must splitL (into L and a new node L2) • Redistribute entries evenly, copy upmiddle key. • Insert index entry pointing to L2 into parent of L. • This can happen recursively • To split index node, redistribute entries evenly, but push upmiddle key. (Contrast with leaf splits.) • Splits “grow” tree; root split increases height. • Tree growth: gets wider or one level taller at top.

  10. Deletion • Start at root, find leaf L where entry belongs. • Remove the entry. • If L is at least half-full, done! • If L has only d-1 entries, • Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). • If re-distribution fails, mergeL and sibling. • If merge occurred, must delete entry (pointing to L or sibling) from parent of L. • Merge could propagate to root, decreasing height.

  11. Characteristics • Optimal method for 1-d range queries: Space: O(N/B), Updates: O(logB(N/B)), Query:O(logB(N/B) + a/B) • Space utilization: 67% for random input • Original B-tree: index nodes store also pointers to records

  12. Example 100 Range[32, 160] Root 120 150 180 30 3 5 11 120 130 180 200 100 101 110 150 156 179 30 35

  13. Other issues • Internal node architecture [Lomet01]: • Reduce the overhead of tree traversal. • Prefix compression: In index nodes store only the prefix that differentiate consecutive sub-trees. Fanout is increased. • Cache sensitive B+-tree • Place keys in a way that reduces the cache faults during the binary search in each node. • Eliminate pointers so a cache line contains more keys for comparison.

  14. References [Lomet01]:David B. Lomet: The Evolution of Effective B-tree: Page Organization and Techniques: A Personal Account. SIGMOD Record 30(3): 64-69 (2001) http://www.acm.org/sigmod/record/issues/0109/a1-lomet.pdf

More Related