1 / 11

Database indices

Database indices. Database Systems manage very large amounts of data. Examples: student database for NWU Social Security database To facilitate queries, we create indices.

reina
Download Presentation

Database indices

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database indices • Database Systems manage very large amounts of data. • Examples: • student database for NWU • Social Security database • To facilitate queries, we create indices. • An index is any data structure that takes as input a property (e.g. a value for a specific field), called the search key, and quickly finds all records with that property. • A database may have several indices (based on different keys) • Examples: • An index to search for students by id and another index to search by name.

  2. Database indices • The actual records (and the index) typically do not fit in memory. • Secondary (or tertiary) storage must be used. • Disk operations are very time consuming, so we would like to limit them. • Since CPU operations are much faster (one disk access equals several million CPU instructions), we would be willing to do any sort of preprocessing that may reduce disk I/O.

  3. Database indices • Idea: • Expand the BST idea to create a multi-way search tree: • Instead of a long, thin tree with at most 2 children per node, create a short, wide tree with many children per node. • Each node will then need to have children - 1keys. • Try to maintain the tree balanced and as full as possible. • Finding the correct branch to follow requires several comparisons (CPU operations) and leads to few disk accesses.

  4. B-trees • The most common data structure used for database indices is the B-tree. • A B-tree of order m is an m-way tree where • All leaves are on the same level • All internal nodes except the root have k-1 keys and k children wherem/2 k m • The root is either a leaf or has between 2 and m children.

  5. B-trees B-tree of order 3 keys greater than 19 and less than 51 keys smaller than 19 keys greater than 51 19 51 9 32 46 60 2 6 11 14 21 30 33 35 48 57 80

  6. B-trees: Insert Generalization of the search method for a BST Find the appropriate leaf  If  has room insert key else // overflow! split  See next slide for details on the split operation.

  7. B-trees: Insert How to handle an overflow at a leaf. Pick the middle key  of the leaf. Split the leaf in two, each part containing half of the elements. If the leaf has a parent p insert to the parentmake p the parent of the two pieces check whether the parent overflows if yes, repeat the splitting process for the parent. else // the leaf was the root create a new root and insert  to itmake it the parent of the two pieces

  8. B-trees: Delete Find the element to be deleted. If it is not a leaf, replace it with its immediate successor and delete the successor instead. else delete the element. Check for underflow (too few children) guaranteed to be in a leaf See next slide for more details.

  9. B-trees: Delete How to handle an underflow at a node n. Check whether any of the siblings can afford to lose children. If yes,transfer a child move a child from the sibling to n move a key from n's parent to n move a key from the sibling to the parent. else merge n with a sibling move a key from the parent to the merged node check the parent for underflow. The sibling loses a child so it must lose a key and n gains a child so it must gain a key. The parent loses a child so it must lose a key.

  10. B-trees • B-trees grow when a new root is created as a result of an insert operation. • B-trees shrink when the root has only two children and they merge as a result of a delete operation. This will cause the root's only key to move down and the root to become empty.

  11. B*-trees • Variant of the B-tree. • Each node must be at least 2/3 full • Overflow is handled mainly by redistributing keys among siblings. If all siblings are full, then 2 nodes are split into 3.

More Related