1 / 13

Indexing Techniques

Indexing Techniques. Storage Technology: Topic 4. Indexes. An index on a collection of records speeds up selections on the search key fields . Any subset of the fields of a record can be the search key for an index on the collection. An index is a collection of index entries .

brinly
Download Presentation

Indexing Techniques

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Indexing Techniques Storage Technology: Topic 4

  2. Indexes • An index on a collection of records speeds up selections on the search key fields. • Any subset of the fields of a record can be the search key for an index on the collection. • An index is a collection of index entries. • Retrieve all entries k* with key value k • Retrieve all entries k* between two key values • Retrieve entries in search key order

  3. Alternatives for Data Entry k* in Index • Three alternatives: • Data record with search key value k • issue : how much data repetition? • Issue: is this simply a fancy file format? • <k, rid of data record with search key value k> • <k, list of rids of data records with search key k> • Our focus: alternative 2. • Examples of indexing techniques: B+ trees, hash-based structures

  4. Index Classification: Clustering • Clustered vs. unclustered: If order of data records is the same as, or ``close to’’, order of data entries, then called clustered index. • At most one independent clustered index. • Cost of retrieving data through index varies greatly based on whether index is clustered or not! Why? • Usually, clustering desired for sorted access.

  5. Clustered vs. Unclustered Index Title: es_f52.fig Creator: /s/transfig-3.1.1/exe/fig2dev Version 3.1 Patchlevel 1 CreationDate: Wed Oct 11 19:11:29 1995

  6. Sparse Clustering • Dense vs. Sparse: If there is at least one data entry per search key value (in some data record), then dense. • Every sparse index is clustered! Title: l3_f1.fig Creator: /s/transfig-3.1.1/exe/fig2dev Version 3.1 Patchlevel 1 CreationDate: Wed Sep 6 17:49:58 1995

  7. Primary/Secondary Indexes • Definition 1: Primary == Clustered • Definition 2: Primary == search key contains primary key of the relation • We will use Definition 2

  8. Tree-Structured Indexing • Tree-structured indexing techniques support both range searches and equality searches • ``Find all students with gpa > 3.0’’ • If data is in sorted file, use binary search. • Simple idea: Create an `index’ file. • Can do binary search on (smaller) index file!

  9. ISAM • Index file may still be quite large. But we can apply the idea repeatedly! • Leaf pages contain data entries.

  10. Comments on ISAM • File creation: Leaf pages allocated sequentially, sorted by search key; then index pages allocated, then space for overflow pages. • Index entries: <search key value, page id>; they `direct’ search for data entries, which are in leaf pages. • Search: Start at root; use key comparisons to go to leaf. Cost log F N ; F = # entries/index pg, N = # leaf pgs • Insert: Find leaf data entry belongs to, and put it there. • Delete: Find and remove from leaf; if empty overflow page, de-allocate. • Static tree structure: inserts/deletes affect only leaf pages.

  11. Example ISAM Tree • Each node can hold 2 entries; no need for `next-leaf-page’ pointers. (Why?)

  12. After Inserting 23*, 48*, 41*, 42* ...

  13. ... Then Deleting 42*, 51*, 97* • Note that 51* appears in index levels, but not in leaf!

More Related