Tree-Structured Indexes

Tree-Structured Indexes Chapter 10

Introduction • Indexes is access structures which are used to speed up the retrieval of records in respond to a query. • Index fields are used to construct index. • Any fields are of the file can be used to create an index and multiple indexes on different fields can be constructed on the same file. • To find a record or records in the file based on certain selection on an indexing field, one has to initially access the index which points to one or more blocks in the file where the required records are located. • The most types of indexes are based on ordered files (single level indexes) such as ISAM (Indexed Sequential Access Methods) and tree structures (Multilevel indexes) such as B or B+ tree are commonly used in DBMS to implement dynamically changing multilevel indexes. • Tree-structured indexing techniques support both range searches and equality searches. • ISAM: static structure that is effective when the file is not frequently updated;B+ tree: dynamic, adjusts gracefully under inserts and deletes.

Range Searches • The idea behind an ordered index access structure is similar to that behind the index used in a textbook, which lists important terms at the end of the book in alphabetical order along with a list of page numbers where the term appears in the book. • ``Find all students with gpa > 3.0’’ • If data is in sorted file, do binary search to find first such student, then scan to find others. • Cost of binary search can be quite high. • Simple idea: create a second file with one record per page of the format [Key, Pointer] called index entries. • Create an `index’ file. Index File kN k2 k1 Data File Page N Page 3 Page 1 Page 2 • Can do binary search on (smaller) index file!

Overflow page index entry ISAM P K P K P P K m 0 1 2 1 m 2 • Each index page contains one pointer more than the number of keys. Each key serves as a separator for the contents of the pages pointed to by the pointers to its left and right. • Index file may still be quite large. But we can apply the idea repeatedly! Non-leaf Pages Leaf Pages Primary pages • Leaf pages contain data entries.

Overflow page ISAM • Each tree node is a disk page. • The data entries are in the leaf pages of the tree and additional overflow pages chained to some leaf page. • When the file is created, all leaf pages are allocated sequentially and sorted on the search key value. • The non leaf pages are then allocated • If there are several inserts to the file subsequently, so that more entries are inserted into a leaf than will fit onto a single page, additional pages are needed because the index structure is static. These additional pages are allocated from an overflow area. Non-leaf Pages Leaf Pages Primary pages

Data Pages Comments on ISAM Index Pages Overflow pages • File creation: Leaf (data) pages allocated sequentially, sorted by search key; then index pages allocated, then space for overflow pages. • Index entries: <search key value, page id>; they `direct’ search for data entries, which are in leaf pages. • Search: Start at root and determine which subtree by use key comparisons to go to leaf. • Insert: Find leaf data entry belongs to, and put it there. • Delete: Find and remove from leaf; if empty overflow page, de-allocate. • Static tree structure: inserts/deletes affect only leaf pages.

Root 40 20 33 51 63 46* 55* 10* 15* 20* 27* 33* 37* 40* 51* 97* 63* Example ISAM Tree • All searches begins at the root, for example to locate a record with a key value 27, we start at the root and follow the left pointer since 27<40. • We then follow the middle pointer, since 20<=27<33 • We assume that each node can hold 2 entries; no need for `next-leaf-page’ pointers.

After Inserting 23*, 48*, 41*, 42* ... • The entry 23* belongs in the second data page, which already contains 20* and 27* and has no more space. We deal with situation by adding overflow page and putting 23* in the overflow page. • Inserting 48*, 41* and 42* leads to an overflow chain of two pages Root 40 Index Pages 20 33 51 63 Primary Leaf 46* 55* 10* 15* 20* 27* 33* 37* 40* 51* 97* 63* Pages 41* 48* 23* Overflow Pages 42*

... Then Deleting 42*, 51*, 97* • The deletion of an entry k* is handled by simply removing the entry. • If the entry is on an overflow page and the overflow page become empty, the page can be removed. • If the entry is on a primary page and deletion makes the primary empty, leave the empty primary page as it is. It serves as a placeholder for future insertion. • We do not move records from the overflow pages to the primary page when deletions on the primary page create space. • Thus, the number of primary leaf pages is fixed at file creation time. Root 40 20 33 51 63 46* 55* 10* 15* 20* 27* 33* 37* 40* 63* 41* 48* 23* • Note that 51* appears in index levels, but not in leaf!

Index Entries (Direct search) Data Entries ("Sequence set") B+ Tree: Most Widely Used Index • A static structure such as the ISAM index suffer from the problem that long overflow chains can develop as the file grows, which leading to a poor performance. • B+ tree is dynamic search structures that adjust gracefully to inserts and deletes. • Internal nodes direct the search and the leaf nodes contain the data entries. • To retrieve all leaf page we have to link them using page pointer. • Minimum 50% occupancy for each node (except for root). Each node contains d <= m <= 2d entries. The parameter d is called the order of the tree, which measure the capacity of a tree node. The root node is the only exception on this requirement, it is required 1<=m<=2d • Supports equality and range-searches efficiently.

Example B+ Tree • Search begins at root, and key comparisons direct it to a leaf (as in ISAM). • Search for 5*, 15*, all data entries >= 24* ... • The tree is of order d=2, means each node contains between 2 and 4 entries. • To search for 5* we follow the left most child pointer since 5<13. To search for entries 14 or 15 we follow the second pointer. To find 24*, we follow the fourth child pointer since 24<=24<=30. Root 30 13 17 24 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 14* 16* 29* 33* 34* • Based on the search for 15*, we know it is not in the tree!

Inserting a Data Entry into a B+ Tree • Find correct leaf L. • Put data entry onto L. • If L has enough space, done! • Else, must splitL (into L and a new node L2) • Redistribute entries evenly, copy upmiddle key. • Insert index entry pointing to L2 into parent of L. • This can happen recursively • To split index node, redistribute entries evenly, but push upmiddle key. (Contrast with leaf splits.) • Splits “grow” tree; root split increases height. • Tree growth: gets wider or one level taller at top.

Entry to be inserted in parent node. (Note that 17 is pushed up and only 17 this with a leaf split.) 5 13 24 30 Inserting 8* into Example B+ Tree • It belongs in the left most leaf, which is already full, which cause split the leaf page. Insert an entry consisting of the pair [5, pointer] into the parent node. • Note how the key 5, which discriminate between the split leaf page and its newly created sibling, is copied up. • Since the parent is also full, another split occurs and the middle key [17, pointer] is pushed up the tree, in contrast to copied up in the leaf split. • Note difference between copy-upand push-up; be sure you understand the reasons for this. appears once in the index. Contrast 30 13 17 24 Entry to be inserted in parent node. (Note that 5 is s copied up and 5 continues to appear in the leaf.) 3* 5* 3* 5* 2* 7* 2* 7* 8*

Example B+ Tree After Inserting 8* Root 17 24 5 13 30 39* 2* 3* 5* 7* 8* 19* 20* 22* 24* 27* 38* 29* 33* 34* 14* 16* • Notice that root was split, leading to increase in height. • In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.

Example B+ Tree After Inserting 8* using redistribution Redistribute entries of node N with a sibling before splitting the node. The sibling of a node N is a node that is immediately to the left or right of N. The entry belongs in the left most leaf, which is full. The only sibling of this leaf node contains only two entries and can thus accommodate more entries. Note how the entry in the parent node that points to the second leaf has a new key value; we copy up the new low key value on the second leaf. In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice. Root 30 8 17 24 16* 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 8* 14* 29* 33* 34*

Deleting a Data Entry from a B+ Tree • Start at root, find leaf L where entry belongs. • Remove the entry. • If L is at least half-full, done! • If L has only d-1 entries, • Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). • If re-distribution fails, mergeL and sibling. • If merge occurred, must delete entry (pointing to L or sibling) from parent of L. • Merge could propagate to root, decreasing height.

Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ... • If the deletion cause the node to go below the occupancy threshold, we must either redistribute entries from an adjacent sibling or merge the node with a sibling to maintain minimum occupancy. • If entries are redistributed between two nodes, their parent node must be updated to reflect this; the key value in the index entry pointing to the second node must be changed to be the lowest search key in the second node. • If two nodes are merged, their parent must be updated to reflect this by deleting the index entry for the second node; this index entry is pointed to by the pointer variable oldchildentry when the delete call returns to the parent node.

Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ... Root 17 • Deleting 19* is easy. Remove it from leaf page on which it appears and the leaf still contains two entries. • Deleting 20* is done with re-distribution, because the leaf will remain with only one entry after deletion. The only sibling of the leaf node that contains 20* has three entries, therefore we deal with this situation by redistribution. • We move entry 24* to the leaf page that contain 22* and copy up the new splitting key (27, which is the new low key value of the leaf from which we borrow 24*) into the parent. 24 5 13 30 39* 2* 3* 5* 7* 8* 19* 20* 22* 24* 27* 38* 29* 33* 34* 14* 16* 17 27 5 13 30 39* 2* 3* 5* 7* 8* 22* 24* 27* 29* 38* 33* 34* 14* 16*

... And Then Deleting 24* • The affected leaf contains only one entry (22*) after deletion, and the only sibling contains just two entries (27* and 29*), therefore we can not redistribute. These two nodes must be merged. • While merging, we`toss’ of index entry (27). • Deleting the entry 27 has created a non-leaf-level page with just one entry, which is below the minimum of d=2. To fix this problem, we must either redistribute or merge. The only sibling of this node contains just two entries (5, 13), so the redistribution is not possible, we must therefore merge. • To merge the two nodes, we also need to pull down the index entry in the parent that currently discriminates between these nodes. This index entry has key value 17 and now we have a total of four entries. • Note that pulling down the splitting key 17 means that it will no longer appears in the parent node following the merge. 30 39* 22* 27* 38* 29* 33* 34* Root 5 13 17 30 3* 39* 2* 5* 7* 8* 22* 38* 27* 33* 34* 14* 16* 29*

2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16* Example of Non-leaf Re-distribution • Tree is shown below during deletion of 24*. (What could be a possible initial tree?) • In contrast to previous example, can re-distribute entry from left child of root to right child. Root 22 30 17 20 5 13

After Re-distribution • Intuitively, entries are re-distributed by `pushingthrough’ the splitting entry in the parent node. • It suffices to re-distribute index entry with key 20; we’ve re-distributed 17 as well for illustration. Root 17 22 30 5 13 20 2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16*

Prefix Key Compression • Important to increase fan-out. (Why?) • Key values in index entries only `direct traffic’; can often compress them. • E.g., If we have adjacent index entries with search key values Dannon Yogurt, David Smith and Devarakonda Murthy, we can abbreviate DavidSmith to Dav. (The other keys can be compressed too ...) • Is this correct? Not quite! What if there is a data entry Davey Jones? (Can only compress David Smith to Davi) • In general, while compressing, must leave each index entry greater than every key value (in any subtree) to its left. • Insert/delete must be suitably modified.

3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35* Bulk Loading of a B+ Tree • If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. • Bulk Loadingcan be done much more efficiently. • Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page. Root Sorted pages of data entries; not yet in B+ tree

Bulk Loading (Contd.) Root 10 20 • Index entries for leaf pages always entered into right-most index page just above leaf level. When this fills up, it splits. (Split may go up right-most path to the root.) • Much faster than repeated inserts, especially when one considers locking! Data entry pages 6 12 23 35 not yet in B+ tree 3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35* Root 20 10 Data entry pages 35 not yet in B+ tree 6 23 12 38 3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35*

Summary of Bulk Loading • Option 1: multiple inserts. • Slow. • Does not give sequential storage of leaves. • Option 2:Bulk Loading • Has advantages for concurrency control. • Fewer I/Os during build. • Leaves will be stored sequentially (and linked, of course). • Can control “fill factor” on pages.

A Note on `Order’ • Order (d) concept replaced by physical space criterion in practice (`at least half-full’). • Index pages can typically hold many more entries than leaf pages. • Variable sized records and search keys mean differnt nodes will contain different numbers of entries. • Even with fixed length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3)).

Summary • Tree-structured indexes are ideal for range-searches, also good for equality searches. • ISAM is a static structure. • Only leaf pages modified; overflow pages needed. • Overflow chains can degrade performance unless size of data set and data distribution stay constant. • B+ tree is a dynamic structure. • Inserts/deletes leave tree height-balanced; log F N cost. • High fanout (F) means depth rarely more than 3 or 4. • Almost always better than maintaining a sorted file.

Summary (Contd.) • Typically, 67% occupancy on average. • Usually preferable to ISAM, modulolockingconsiderations; adjusts to growth gracefully. • If data entries are data records, splits can change rids! • Key compression increases fanout, reduces height. • Bulk loading can be much faster than repeated inserts for creating a B+ tree on a large data set. • Most widely used index in database management systems because of its versatility. One of the most optimized components of a DBMS.

Tree-Structured Indexes