
Index Structures 13.2 – Secondary Index



Presentation Transcript


  1. Index Structures 13.2 – Secondary Index • Aditya Govindaraju - 218

  2. Secondary indexes. Figure: a file ordered on its sequence field; the secondary-key values shown (30, 20, 80, 100, 90, 50, 70, 40, 10, 60) are not in sorted order.

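  3. Secondary indexes • Sparse index. Figure: a sparse index (values shown include 30, 20, 80, 100, 90, ...) over the file ordered on its sequence field. This does not make sense! The keys within a data block are not adjacent in key order, so one index entry per block cannot locate the remaining keys.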

  4. Secondary indexes • Dense index. Figure: a dense lowest-level index holding every key in sorted order (10, 20, 30, 40, 50, 60, 70, ...), topped by a sparse high level (10, 50, 90, ...).

  5. With secondary indexes: • Lowest level is dense • Other levels are sparse. Also: pointers are record pointers (not block pointers; not computed).
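
The layering described on slides 4 and 5 can be sketched in Python. This is a minimal illustration, not the deck's own implementation: the grouping of keys into two-record data blocks, the index block size of 4, and all function names are assumptions; only the key values come from the slides.

```python
from bisect import bisect_right

# Assumed layout: data blocks ordered on the sequence field, so the
# secondary-key values (from the slides) appear in no useful order.
blocks = [[30, 50], [20, 70], [80, 40], [100, 10], [90, 60]]
INDEX_BLOCK_SIZE = 4  # assumed number of entries per index block


def build_dense_level(blocks):
    """One (key, record pointer) entry per record, sorted by key."""
    entries = [(key, (b, s))
               for b, block in enumerate(blocks)
               for s, key in enumerate(block)]
    entries.sort()
    return [entries[i:i + INDEX_BLOCK_SIZE]
            for i in range(0, len(entries), INDEX_BLOCK_SIZE)]


def build_sparse_level(dense_blocks):
    """One entry per dense index block: the first key in that block."""
    return [index_block[0][0] for index_block in dense_blocks]


def lookup(key, sparse, dense_blocks):
    """Return the record pointer (block number, slot) for key, or None."""
    i = bisect_right(sparse, key) - 1  # last sparse entry <= key
    if i < 0:
        return None
    for k, pointer in dense_blocks[i]:
        if k == key:
            return pointer
    return None


dense_blocks = build_dense_level(blocks)
sparse = build_sparse_level(dense_blocks)
pointer = lookup(70, sparse, dense_blocks)
```

The record pointer names an individual record (block and slot) rather than just a block, which is what lets the dense level work even though the file is not sorted on this key.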

  6. Duplicate values & secondary indexes. Figure: a file whose secondary-key values repeat (values shown: 20, 20, 10, 10, 30, 10, 40, 40, 40, 40).

  7. Duplicate values & secondary indexes: one option... Figure: a dense index with one entry per record, so each key value is repeated once per duplicate (10, 10, 10, 20, 20, 30, 40, 40, 40, 40, ...). • Problem: excess overhead! • disk space • search time

  8. Duplicate values & secondary indexes: another idea (suggested in class): chain records with the same key? Figure: the index holds one entry per key value (10, 20, 30, 40, ...), and records with equal keys are linked together. • Problems: • Need to add fields to records • Need to follow the chain to find all the records
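
A rough Python sketch of the chaining idea shows where both problems come from. The `Record` class, the `next_same_key` field, and the function names are hypothetical; the key values are the ones listed for the duplicate-values file on slide 6.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    key: int
    # Extra field the chaining idea requires: position of the next
    # record with the same key, or None at the end of the chain.
    next_same_key: Optional[int] = None


def build_chained_index(keys):
    """Index maps each distinct key to the first record of its chain."""
    records = [Record(k) for k in keys]
    index = {}
    last_seen = {}
    for pos, rec in enumerate(records):
        if rec.key in last_seen:
            records[last_seen[rec.key]].next_same_key = pos
        else:
            index[rec.key] = pos
        last_seen[rec.key] = pos
    return records, index


def find_all(key, records, index):
    """Follow the chain; every hop touches a record."""
    positions = []
    pos = index.get(key)
    while pos is not None:
        positions.append(pos)
        pos = records[pos].next_same_key
    return positions


keys = [20, 20, 10, 10, 30, 10, 40, 40, 40, 40]
records, index = build_chained_index(keys)
matches_10 = find_all(10, records, index)
```

The index stays small (one entry per distinct key), but the set of matching records is only known after reading each record on the chain.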

  9. Why the “bucket” idea is useful (a bucket holds the record pointers for one key value). Records: EMP (name, dept, floor, ...). Indexes: • Name: primary • Dept: secondary • Floor: secondary

  10. Figure: Dept. index (Toy bucket) and Floor index (2nd bucket), both pointing into EMP. Query: Get employees in (Toy Dept) ∧ (2nd floor). → Intersect the Toy bucket and the 2nd-floor bucket to get the set of matching EMPs.
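
The bucket intersection can be sketched as follows. The employee rows and all names here are made-up sample data; each bucket is modeled as a Python set of record positions.

```python
from collections import defaultdict

# Hypothetical EMP records: (name, dept, floor).
emp = [
    ("Ann", "Toy", 2),
    ("Bob", "Toy", 1),
    ("Cho", "Shoe", 2),
    ("Dee", "Toy", 2),
]

# Secondary indexes: key value -> bucket of record pointers
# (here, positions in the emp list).
dept_index = defaultdict(set)
floor_index = defaultdict(set)
for pos, (_name, dept, floor) in enumerate(emp):
    dept_index[dept].add(pos)
    floor_index[floor].add(pos)

# Query: (Toy Dept) AND (2nd floor). Only the buckets are intersected;
# EMP records are fetched afterwards, and only the matching ones.
matching = dept_index["Toy"] & floor_index[2]
names = sorted(emp[pos][0] for pos in matching)
```

The point of the sketch is that the intersection works purely on pointers, so records that satisfy only one of the two conditions are never read.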

  11. Inverted lists. This idea is used in text information retrieval. Figure: inverted lists for the terms “cat” and “dog” point into the documents: “...the cat is fat...”, “...was raining cats and dogs...”, “...Fido the dog...”.
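
A minimal Python sketch of building inverted lists over the three document snippets from the slide. The document ids d1 to d3, the `normalize` helper, and its crude plural stripping (a stand-in for real stemming) are assumptions made for illustration.

```python
from collections import defaultdict

# Only the visible snippet text from the slide; ids are assumed.
documents = {
    "d1": "the cat is fat",
    "d2": "was raining cats and dogs",
    "d3": "Fido the dog",
}


def normalize(word):
    """Lowercase and strip a trailing 's' from longer words (crude)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s"):
        word = word[:-1]
    return word


def build_inverted_lists(documents):
    """Map each term to the list of documents that contain it."""
    lists = defaultdict(list)
    for doc_id, text in documents.items():
        for term in sorted({normalize(w) for w in text.split()}):
            lists[term].append(doc_id)
    return lists


inverted = build_inverted_lists(documents)
cat_docs = inverted["cat"]
dog_docs = inverted["dog"]
```

Each inverted list plays the role of a bucket: the term is the key value, and the list holds pointers to the documents in which it occurs.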

  12. Common technique: more info in inverted list. Each entry carries a type and a position in addition to the location (documents d1, d2, d3). Figure: cat → (Title, 5), (Author, 10), (Abstract, 57); dog → (Title, 100), (Title, 12).
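
One way to picture the richer entries is the sketch below. The (type, position) pairs are the ones on the slide, but which document each entry points to is not recoverable from the transcript, so the `doc_id` assignments are assumptions, as are the `Posting` class and the helper function.

```python
from typing import NamedTuple


class Posting(NamedTuple):
    doc_id: str    # location (assumed assignment to d1..d3)
    kind: str      # "type" on the slide: Title, Author, Abstract
    position: int  # position of the term within that part


inverted = {
    "cat": [
        Posting("d1", "Title", 5),
        Posting("d1", "Author", 10),
        Posting("d2", "Abstract", 57),
    ],
    "dog": [
        Posting("d2", "Title", 100),
        Posting("d3", "Title", 12),
    ],
}


def docs_with_term_in(term, kind):
    """Documents where term occurs in the given part, e.g. the title."""
    return sorted({p.doc_id for p in inverted.get(term, [])
                   if p.kind == kind})


dog_titles = docs_with_term_in("dog", "Title")
```

Keeping the type in the list lets a query such as "dog in the title" be answered from the index alone, without opening the documents.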

  13. Posting: an entry in an inverted list. Represents an occurrence of a term in an article. Size of a list (in postings): • 1 for rare words or misspellings • 10^6 for common words. Size of a posting: 10-15 bits (compressed)
