Database Indexing: ISAM and B+-Trees

Kεφάλαιο 8 ISAM και B-Δέντρα Φυσικός Σχεδιασμός για Βάσεις Δεδομένων

Memory Management Transaction Programs . DBMS Application SET-ORIENTED . Tuple Management Associative Access Database Access Methods TUPLE- ORIENTED Logging and Recovery Record Management Main Buffer Management Buffer Manager Manages Online External BLOCK-ORIENTED File Management Manages File Manager Nearline External Archive Manager Manages

Index Structures • A (single-level) index is an auxiliary file structure that makes it more efficient to search for a record in the file • The index is usually specified on one field of the file • The index is often called access path on the field • An index file usually occupies much less space than the actual data file (the entries are fewer and smaller) • A pointer to the block where a record is stored is found with binary search on the index • FORM of a record in the index file: key value pointer to record’s block

Indexes • Single-level Indexes can be distinguished in: • Primary Index: defined on a file which is ordered on the key field. It includes one entryfor each file block. The entry has the key field value for the first record in the block. (also called, sparse index) • Clustering Index: defined on a file which is ordered on a non-key field. It includes one index entry for each distinct value of the field. The entry points to the first data block that contain records with that field value. • Secondary Index: defined on a file which is unordered. Contains one entry per each record in the data file (also called, dense index)

Multi-Level Indexes • Index files are simply files, therefore they can also be indexed. • We thus end-up with a hierarchy of index structures(first level, second level, etc.) • Every level of the index is an ordered file, therefore, deletion and insertion require much work (maintain the index) • A multi-level index forms a search treeand it is assumed that the top (first) level index fits in one disk block

Indexed Sequential Access Method (ISAM) • ISAM is amulti-level index structure (tree)for directaccess to tuples in a relation sorted on some attribute • Each node of the tree is a page (block)on the disk • The nodes of the tree hold <key value, pointer > pairs, sorted on key value. The internal nodes point to lower nodes in the tree (a lower level index), while the leaf nodes point to pages in the relation. A pointer points to a subtree with key values greater or equal the corresponding key value and less than the key value for the next pointer 9 18 55 60 78 90

Gary 10 Mike 15 LIsa 20 Shirley 25 Bob 30 Robin 35 Ivan 40 Bill 45 Scott 43 Ron 50 Jill 55 Keith 60 Dan 65 ISAM Example • ISAM on Salary 10 20 30 10 40 40 50 60 OVERFLOW

Root 40 20 33 51 63 46* 55* 10* 15* 20* 27* 33* 37* 40* 51* 97* 63* Example ISAM Tree • Each node can hold 2 entries

After Inserting 23*, 48*, 41*, 42* ... Root 40 Index Pages 20 33 51 63 Primary Leaf 46* 55* 10* 15* 20* 27* 33* 37* 40* 51* 97* 63* Pages 41* 48* 23* Overflow Pages 42*

... Then Deleting 42*, 51*, 97* Root 40 20 33 51 63 46* 55* 10* 15* 20* 27* 33* 37* 40* 63* 41* 48* 23* • Note that 51* appears in index levels, but not in leaf!

ISAM Performance • PERFORMANCE. Suppose that we have D data pages and k pointers in each node (assume that D is a power of k, say D = kL ) LOOKUP COST: Sequential Scan: D Binary Search on Relation:log2 D + 1 Binary Search (single index):(log2 (D/k) + 1) + 1 Traversing the ISAM tree: logk D + 1 = L + 1

ISAM Summary • ADVANTAGES: • It provides a sorted (mostly) directory for a file (relation) • It is very good for exact queries (e.g., Salary = 40000) • ISAM facilitates the execution of range queries (e.g., Salary between 35000 and 60000) • DISADVANTAGES • It is a static structure and may easily get unbalanced • If there are many updates (volatile data), then the relation does not remain “mostly” sorted • The index does consume some space which may be valuable

B+ - Trees • A B+-treeis a multilevel tree-structured directory to index a sorted file • Leaf nodes contain tuples in sortedorder, the other internal nodes have a special form • Nodes correspond toa block (page) • Each node is keptbetween half- and completely-full • Insertions in nodes that are not full are very efficient; if a node is full, a split occurs • Deletions are very efficientif the node does not becomeless than half-full • The tree structure remains balanced at all times

B+ - Trees -- Summary • Form of internal nodes: • Variations of B+-trees: • B-trees : They are like B+-trees, but the internal nodes also contain pointers to data. They can get deeper and are difficult to implement, but occasionally, they are faster. • B*-trees : They are like B+-trees, but keep each node 2/3 full (at least). Shorter trees, faster, but worse for updates K1 < K1 < ... Kn-1 P1 K1 P2 K2 ... ... Pn-1 Kn-1 Pn P1 points to a node containing key values n, n< K1 P2 points to a node containing key values n, K1 < = n < K1

Index Entries (Direct search) Data Entries ("Sequence set") B+ Tree: The Most Widely Used Index • Insert/delete at log F N cost; keep tree height-balanced.(F = fanout, N = # leaf pages) • Minimum 50% occupancy (except for root). Each node contains d <= m <= 2d entries. The parameter d is called the order of the tree. • Supports equality and range-searches efficiently.

Example B+ Tree • Search begins at root, and key comparisons direct it to a leaf (as in ISAM). • Search for 5*, 15*, all data entries >= 24* ... Root 30 13 17 24 39* 3* 5* 19* 20* 22* 24* 27* 38* 2* 7* 14* 16* 29* 33* 34* • Based on the search for 15*, we know it is not in the tree!

B+ Trees in Practice • Typical order: 100. Typical fill-factor: 67%. • average fanout = 133 • Typical capacities: • Height 4: 1334 = 312,900,700 records • Height 3: 1333 = 2,352,637 records • Can often hold top levels in buffer pool: • Level 1 = 1 page = 8 Kbytes • Level 2 = 133 pages = 1 Mbyte • Level 3 = 17,689 pages = 133 MBytes

Inserting a Data Entry into a B+ Tree • Find correct leaf L. • Put data entry onto L. • If L has enough space, done! • Else, must splitL (into L and a new node L2) • Redistribute entries evenly, copy upmiddle key. • Insert index entry pointing to L2 into parent of L. • This can happen recursively • To split index node, redistribute entries evenly, but push upmiddle key. (Contrast with leaf splits.) • Splits “grow” tree; root split increases height. • Tree growth: getswider or one level taller at top.

Entry to be inserted in parent node. (Note that 17 is pushed up and only 17 this with a leaf split.) 5 13 24 30 Inserting 8* into Example B+ Tree Entry to be inserted in parent node. (Note that 5 is s copied up and • Observe how minimum occupancy is guaranteed in both leaf and index pg splits. Note difference between copy-upand push-up;be sure you understand the reasons for this. 5 continues to appear in the leaf.) 3* 5* 2* 7* 8* appears once in the index. Contrast

Example B+ Tree After Inserting 8* Root 17 24 5 13 30 39* 2* 3* 5* 7* 8* 19* 20* 22* 24* 27* 38* 29* 33* 34* 14* 16* • Notice that root was split, leading to increase in height. • In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.

Deleting a Data Entry from a B+ Tree • Start at root, find leaf L where entry belongs. • Remove the entry. • If L is at least half-full, done! • If L has only d-1 entries, • Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). • If re-distribution fails, mergeL and sibling. • If merge occurred, must delete entry (pointing to L or sibling) from parent of L. • Merge could propagate to root, decreasing height.

Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ... • Deleting 19* is easy. • Deleting 20* is done with re-distribution. Notice how middle key is copied up. Root 17 27 5 13 30 39* 2* 3* 5* 7* 8* 22* 24* 27* 29* 38* 33* 34* 14* 16*

... And Then Deleting 24* • Must merge. • Observe `toss’of index entry (on right), and `pull down’of index entry (below). 30 39* 22* 27* 38* 29* 33* 34* Root 5 13 17 30 3* 39* 2* 5* 7* 8* 22* 38* 27* 33* 34* 14* 16* 29*

2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16* Example of Non-leaf Re-distribution • Tree is shown below during deletion of 24*. (What could be a possible initial tree?) • In contrast to previous example, can re-distribute entry from left child of root to right child. Root 22 30 17 20 5 13

After Re-distribution • Intuitively, entries are re-distributed by `pushing through’the splitting entry in the parent node. • It suffices to re-distribute index entry with key 20; we’ve re-distributed 17 as well for illustration. Root 17 22 30 5 13 20 2* 3* 5* 7* 8* 39* 17* 18* 38* 20* 21* 22* 27* 29* 33* 34* 14* 16*

Prefix Key Compression • Important to increase fan-out. (Why?) • Key values in index entries only `direct traffic’; can often compress them. • E.g., If we have adjacent index entries with search key values Dannon Yogurt, David Smith and Devarakonda Murthy, we can abbreviate DavidSmith to Dav. (The other keys can be compressed too ...) • Is this correct? Not quite! What if there is a data entry Davey Jones? (Can only compress David Smith to Davi) • In general, while compressing, must leave each index entry greater than every key value (in any subtree) to its left. • Insert/delete must be suitably modified.

3* 6* 9* 10* 11* 12* 13* 23* 31* 36* 38* 41* 44* 4* 20* 22* 35* Bulk Loading of a B+ Tree • If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. • Bulk Loadingcan be done much more efficiently. • Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page. Root Sorted pages of data entries; not yet in B+ tree

Summary of Bulk Loading • Option 1: multiple inserts. • Slow. • Does not give sequential storage of leaves. • Option 2:Bulk Loading • Has advantages for concurrency control. • Fewer I/Os during build. • Leaves will be stored sequentially (and linked, of course). • Can control “fill factor” on pages.

A Note on `Order’ • Order (d) concept replaced by physical space criterion in practice (`at least half-full’). • Index pages can typically hold many more entries than leaf pages. • Variable sized records and search keys mean differnt nodes will contain different numbers of entries. • Even with fixed length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3)).

Summary • Tree-structured indexes are ideal for range-searches, also good for equality searches. • ISAM is a static structure. • Only leaf pages modified; overflow pages needed. • Overflow chains can degrade performance unless size of data set and data distribution stay constant. • B+ tree is a dynamic structure. • Inserts/deletes leave tree height-balanced; log F N cost. • High fanout (F) means depth rarely more than 3 or 4. • Almost always better than maintaining a sorted file.

Summary (Contd.) • Typically, 67% occupancy on average. • Usually preferable to ISAM, modulolockingconsiderations; adjusts to growth gracefully. • If data entries are data records, splits can change rids! • Key compression increases fanout, reduces height. • Bulk loading can be much faster than repeated inserts for creating a B+ tree on a large data set. • Most widely used index in database management systems because of its versatility. One of the most optimized components of a DBMS.

Overview of Physical Design • After ER design, schema refinement, and the definition of views, we have the conceptual and externalschemas for our database. • The next step is to choose indexes, make clustering decisions, and to refine the conceptual and external schemas (if necessary) to meet performance goals. • We must begin by understanding the workload: • The most important queries and how often they arise. • The most important updates and how often they arise. • The desired performance for these queries and updates.

Understanding the Workload • For each query in the workload: • Which relations does it access? • Which attributes are retrieved? • Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? • For each update in the workload: • Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? • The type of update (INSERT/DELETE/UPDATE), and the attributes that are affected.

Decisions to Make • What indexes should we create? • Which relations should have indexes? What field(s) should be the search key? Should we build several indexes? • For each index, what kind of an index should it be? • Clustered? Hash/tree? Dynamic/static? Dense/sparse? • Should we make changes to the conceptual schema? • Consider alternative normalized schemas? (Remember, there are many choices in decomposing into BCNF, etc.) • Should we ``undo’’ some decomposition steps and settle for a lower normal form? (Denormalization.) • Horizontal partitioning, replication, views ...

Choice of Indexes • One approach: consider the most important queries in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it. • Before creating an index, must also consider the impact on updates in the workload! • Trade-off: indexes can make queries go faster, updates slower. Require disk space, too.

Issues to Consider in Index Selection • Attributes mentioned in a WHERE clause are candidates for index search keys. • Exact match condition suggests hash index. • Range query suggests tree index. • Clustering is especially useful for range queries, although it can help on equality queries as well in the presence of duplicates. • Try to choose indexes that benefit as many queries as possible. Since only one index can be clustered per relation, choose it based on important queries that would benefit the most from clustering.

Issues in Index Selection (Contd.) • Multi-attribute search keys should be considered when a WHERE clause contains several conditions. • If range selections are involved, order of attributes should be carefully chosen to match the range ordering. • Such indexes can sometimes enable index-only strategies for important queries. • For index-only strategies, clustering is not important! • When considering a join condition: • Hash index on inner is very good for Index Nested Loops. • Should be clustered if join column is not key for inner, and inner tuples need to be retrieved. • Clustered B+ tree on join column(s) good for Sort-Merge.

Example • Hash index on D.dname supports ‘Toy’ selection. • Given this, index on D.dno is not needed. • Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple. • What if WHERE included: `` ... AND E.age=25’’ ? • Could retrieve Emp tuples using index on E.age, then join with Dept tuples satisfying dname selection. Comparable to strategy that used E.dno index. • So, if E.age index is already created, this query provides much less motivation for adding an E.dno index. SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname=‘Toy’ AND E.dno=D.dno

SELECT E.dno FROM Emp E WHERE E.age>40 Examples of Clustering B+ tree index on E.age can be used to get qualifying tuples. • How selective is the condition? • Is the index clustered? Consider the GROUP BY query. • If many tuples have E.age > 10, using E.age index and sorting the retrieved tuples may be costly. • Clustered E.dno index may be better! Equality queries and duplicates: • Clustering on E.hobby helps! SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>10 GROUP BY E.dno SELECT E.dno FROM Emp E WHERE E.hobby=Stamps

Multi-Attribute Index Keys • To retrieve Emp records with age=30 ANDsal=4000, an index on <age,sal> would be better than an index on age or an index on sal. • Such indexes also called composite or concatenated indexes. • Choice of index key orthogonal to clustering etc. • If condition is: 20<age<30 AND 3000<sal<5000: • Clustered tree index on <age,sal> or <sal,age> is best. • If condition is: age=30 AND 3000<sal<5000: • Clustered <age,sal> index much better than <sal,age> index! • Composite indexes are larger, updated more often.

Summary • Database design consists of several tasks: requirements analysis, conceptual design, schema refinement, physical design and tuning. • In general, have to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task. • Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good design. • What are the important queries and updates? What attributes/relations are involved?

Summary (Contd.) • Indexes must be chosen to speed up important queries (and perhaps some updates!). • Index maintenance overhead on updates to key fields. • Choose indexes that can help many queries, if possible. • Build indexes to support index-only strategies. • Clustering is an important decision; only one index on a given relation can be clustered! • Order of fields in composite index key can be important. • Static indexes may have to be periodically re-built. • Statistics have to be periodically updated.

Relation Implementations • (a) STORE EACH RELATION IN A FILE • For small relations, a heap may suffice • For larger relations, ISAM, B-tree, or Hashing • Allow for secondary indexes on user-specified fields Example Commands: modify R to isam on A1 index on R is S(A1) • (b) STORE EACH RELATION AS IN DBTG • Store related tuples from different relations together • Use multilist structures

Database Indexing: ISAM and B+-Trees

Database Indexing: ISAM and B+-Trees

Presentation Transcript