Chapter 12: Indexing and Hashing. Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL Multiple-Key Access. Basic Concepts.
Indexing techniques evaluated on basis of:
B+-tree indices are an alternative to indexed-sequential files.
A B+-tree is a rooted tree satisfying the following properties:
K1 < K2 < K3 < . . .< Kn–1
Properties of a leaf node:
B+-tree for account file (n = 3)
B+-tree for account file (n - 5)
Result of splitting node containing Brighton and Downtown on inserting Clearview
B+-Tree before and after insertion of “Clearview”
Before and after deleting “Downtown”
Deletion of “Perryridge” from result of previous example
Before and after deletion of “Perryridge” from earlier example
Example of B+-tree File Organization
B-tree (above) and B+-tree (below) on same data
Hash file organization of account file, using branch-name as key (See figure in next slide.)
Hash file organization of account file, using branch-name as key (see previous slide for details).
In this structure, i2 = i3 = i, whereas i1 = i – 1 (see next slide for details)
1. Compute h(Kj) = X
2. Use the first i high order bits of X as a displacement into bucket address table, and follow the pointer to appropriate bucket
To split a bucket j when inserting record with search-key value Kj:
Initial Hash structure, bucket size = 2
Hash structure after insertion of Mianus record
Hash structure after insertion of three Perryridge records
create index <index-name> or <relation-name> <attribute-list>)
E.g.: create index b-index on branch(branch-name)
drop index <index-name>
where branch-name = “Perryridge” and balance - 1000
1. Use index on branch-name to find accounts with balances of $1000; test branch-name = “Perryridge”.
2. Use indexon balance to find accounts with balances of $1000; test branch-name = “Perryridge”.
3. Use branch-name index to find pointers to all records pertaining to the Perryridge branch. Similarly use index on balance. Take intersection of both sets of pointers obtained.
Suppose we have an index on combined search-key
100110 OR 110011 = 110111 NOT 100110 = 011001
reducing the number of
pointer traversals vs. instructions (T-trees)
memory accesses vs. instructions (cache-conscious indexes)
(A1, A2, . . . , An) for n attribute search-key
search-key value hash value(Main, Harrison) 101 111 (Main, Brooklyn) 101 001 (Park, Palo Alto) 010 010 (Spring, Brooklyn) 001 001 (Alma, Palo Alto) 110 010