Balanced Search Trees

Balanced Search Trees. 15-211 Fundamental Data Structures and Algorithms. Margaret Reid-Miller 3 February 2005. Plan. Today 2-3-4 trees Red-Black trees Reading: For today: Chapters 13.3-4 Reminder: HW1 due tonight!!! HW2 will be available soon.

Presentation Transcript


  1. Balanced Search Trees 15-211 Fundamental Data Structures and Algorithms Margaret Reid-Miller 3 February 2005

  2. Plan • Today • 2-3-4 trees • Red-Black trees • Reading: • For today: Chapters 13.3-4 • Reminder: HW1 due tonight!!! HW2 will be available soon

  3. AVL-tree Review

  4. AVL-Trees • What is the key restriction on a binary search tree that keeps an AVL tree balanced? [Figure: example trees labeled "OK" and "not OK"]

  5. AVL-Trees • Height balanced: • For each node, the heights of the left and right subtrees differ by at most 1, a representation invariant. • What is the mechanism to rebalance an AVL tree that an insert has put out of balance?

  6. The single rotation • Rotate the deepest out-of-balance node. “Pulls” the child up one level. [Figure: subtrees X, Y, Z before and after the rotation]

  7. The double rotation • First rotate around the child node, then around the parent node. [Figure: subtrees X, Y1, Y2, Z before and after the first rotation]

  8. Double rotation cont’d • Result is to “pull” the grandchild node up two levels. [Figure: subtrees X, Y1, Y2, Z before and after the second rotation]
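The single and double rotations on slides 6-8 can be sketched in Java as below. This is a minimal sketch, not the course's code: the `AvlNode` and `AvlRotations` classes, their field names, and the convention that a leaf has height 1 are all assumptions made for illustration.

```java
// Hypothetical minimal AVL node; names are illustrative, not from the slides.
class AvlNode {
    int key;
    int height = 1;          // height of the subtree rooted here (a leaf has height 1)
    AvlNode left, right;

    AvlNode(int key) { this.key = key; }
}

class AvlRotations {
    static int height(AvlNode n) { return n == null ? 0 : n.height; }

    // Recompute a node's height from its children.
    static void update(AvlNode n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Single rotation: pulls the left child up one level; returns the new subtree root.
    static AvlNode rotateRight(AvlNode z) {
        AvlNode x = z.left;
        z.left = x.right;    // x's right subtree moves under z
        x.right = z;
        update(z);           // z is now lower, so update it first
        update(x);
        return x;
    }

    // Mirror image: pulls the right child up one level.
    static AvlNode rotateLeft(AvlNode z) {
        AvlNode x = z.right;
        z.right = x.left;
        x.left = z;
        update(z);
        update(x);
        return x;
    }

    // Double rotation (left-right case): rotate around the child, then around
    // the parent. The grandchild ends up two levels higher.
    static AvlNode rotateLeftRight(AvlNode z) {
        z.left = rotateLeft(z.left);
        return rotateRight(z);
    }
}
```

Each rotation returns the new root of the subtree, so a caller would reattach it with something like `parent.left = AvlRotations.rotateRight(parent.left)`.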

  9. AVL Tree Summary • Each node maintains a lazy deletion flag and the height of its subtree. • The height of an AVL tree is at most 45% greater than the minimum. • Requires at most one single or double rotation to regain balance after an insert. • Thus, guarantees O(log N) time for search and insert.

  10. 2-3-4 Trees

  11. Balanced 2-3-4 Trees • Maintain height balance in all subtrees. Depth property. • But allow nodes in the tree to expand to accommodate inserts. • In particular, nodes can have 2, 3 or 4 children. Node-size property. • E.g., a 4-node has 3 keys that split the key range into 4 intervals.

  12. 2-3-4 tree search • Search is similar to a binary search. • E.g., search for B [Figure: example 2-3-4 tree with keys G, M, Q, A, C, H, R, S, W]

  13. 2-3-4 tree search • Search is similar to a binary search. • E.g., search for B [Figure: the same example tree, next step of the search]
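The search on slides 12-13 can be sketched as follows. This is only an illustration under assumed names: the `Node234` class and its array layout (up to 3 sorted keys, up to 4 child links) are hypothetical and do not come from the slides.

```java
// Hypothetical 2-3-4 node layout: up to 3 sorted keys and up to 4 child links.
class Node234 {
    int nKeys;                          // 1, 2 or 3
    char[] keys = new char[3];
    Node234[] child = new Node234[4];   // all null in a leaf

    Node234(char... ks) {
        nKeys = ks.length;
        System.arraycopy(ks, 0, keys, 0, ks.length);
    }

    boolean isLeaf() { return child[0] == null; }

    // Returns true if key is stored in the subtree rooted at this node.
    boolean contains(char key) {
        int i = 0;
        while (i < nKeys && key > keys[i]) i++;        // find the interval for key
        if (i < nKeys && key == keys[i]) return true;  // found in this node
        return !isLeaf() && child[i].contains(key);    // otherwise follow link i
    }
}
```

The only difference from a binary search tree is the inner scan over up to three keys to pick one of up to four links.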

  14. 2-3-4 Tree Insert • To insert, first search for a leaf node in which to put the key. • E.g., insert U [Figure: the example tree before and after inserting U]

  15. 2-3-4 Tree Insert • May need to split a node. • E.g., insert T [Figure: the example tree before and after inserting T, with a node split]

  16. 2-3-4 Tree Insert

      /* Either returns an empty node or a new root */
      public Node BUinsert(int key) {
          if (isEmptyNode())
              return new Node(key);

          /* Search for the leaf to put the key into */
          Node child = findChild(key);    // down which link?
          Node upNode = child.BUinsert(key);

          /* upNode is empty, the key at a leaf node, or
           * the result of a 4-node split that needs to be
           * propagated up. */
          if (upNode.isEmptyNode())
              return upNode;
          else
              return addToNode(upNode);   // split?
      }

  17. Cascading splits • When inserting a key into a 4-node, the 4-node splits and a key moves up to the parent node. • This new key may in turn cause the parent to split, moving a key up to the grandparent, and so on up to the root. • When would this happen? • Is there a way to avoid these cascading splits?

  18. Bottom-up 2-3-4 trees • This BUinsert is called a bottom-up version of insert, since splits occur as we go back up the tree after the recursive calls. • Work occurs before and after the recursive calls.

  19. Preemptive Split • Every time we find a 4-node while traveling down a search path, we split the 4-node. • Note: Two 2-nodes have the same number of children as one 4-node. • Changes are local to the split node (no cascading). • Guaranteed to find a 2-node or 3-node at the leaf. • Splitting a root node creates a new root.
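A minimal sketch of insertion with preemptive splits is given below. The `TopDown234` class, its array-based node layout, and the choice to ignore duplicate keys are assumptions for illustration, not the course's implementation. Because no work remains after descending, the descent is written as a plain loop.

```java
// Hypothetical top-down 2-3-4 tree of int keys; duplicates are ignored.
class TopDown234 {
    static class Node {
        int n;                          // number of keys in use: 1..3
        int[] keys = new int[3];
        Node[] child = new Node[4];     // all null in a leaf
        boolean isLeaf() { return child[0] == null; }
        boolean isFull() { return n == 3; }   // a 4-node
    }

    Node root;

    // Split the 4-node parent.child[i]; its middle key moves up into parent.
    // The parent is never a 4-node here, so the change stays local.
    private static void splitChild(Node parent, int i) {
        Node full = parent.child[i];
        Node right = new Node();
        right.n = 1;
        right.keys[0] = full.keys[2];
        right.child[0] = full.child[2];
        right.child[1] = full.child[3];
        int mid = full.keys[1];
        full.n = 1;                     // 'full' keeps key 0 and links 0..1
        full.child[2] = null;
        full.child[3] = null;
        for (int j = parent.n; j > i; j--) {   // make room at position i
            parent.keys[j] = parent.keys[j - 1];
            parent.child[j + 1] = parent.child[j];
        }
        parent.keys[i] = mid;
        parent.child[i + 1] = right;
        parent.n++;
    }

    void insert(int key) {
        if (root == null) {
            root = new Node();
            root.keys[0] = key;
            root.n = 1;
            return;
        }
        if (root.isFull()) {            // splitting the root creates a new root
            Node newRoot = new Node();
            newRoot.child[0] = root;
            splitChild(newRoot, 0);
            root = newRoot;
        }
        Node cur = root;                // invariant: cur is never a 4-node
        while (true) {
            int i = 0;
            while (i < cur.n && key > cur.keys[i]) i++;
            if (i < cur.n && key == cur.keys[i]) return;   // already present
            if (cur.isLeaf()) {         // room is guaranteed at the leaf
                for (int j = cur.n; j > i; j--) cur.keys[j] = cur.keys[j - 1];
                cur.keys[i] = key;
                cur.n++;
                return;
            }
            if (cur.child[i].isFull()) {
                splitChild(cur, i);     // split before descending
                if (key == cur.keys[i]) return;
                if (key > cur.keys[i]) i++;
            }
            cur = cur.child[i];
        }
    }

    // In-order traversal, included only to check the result.
    static void inorder(Node x, java.util.List<Integer> out) {
        if (x == null) return;
        for (int i = 0; i < x.n; i++) {
            inorder(x.child[i], out);
            out.add(x.keys[i]);
        }
        inorder(x.child[x.n], out);
    }
}
```

Inserting the keys 1 through 10 in order and then calling `TopDown234.inorder(tree.root, list)` should yield the keys in sorted order.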

  20. 2-3-4 Tree Height • What is the height of the tree? At most log2 N + 1 • Why? The maximum depth is when every node is a 2-node. Since every leaf has the same depth, the tree is complete and has depth log2 N + 1.
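The height bound can be checked with a short calculation. It assumes that N counts keys and that the height h counts levels of nodes; the slide does not state either convention explicitly.

```latex
% A 2-3-4 tree with h levels holds the fewest keys when every node is a 2-node,
% i.e. when it is a complete binary tree with one key per node.
N \;\ge\; \sum_{i=0}^{h-1} 2^{i} \;=\; 2^{h} - 1
\quad\Longrightarrow\quad
h \;\le\; \log_2 (N + 1) \;\le\; \log_2 N + 1 \qquad (N \ge 1).
```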

  21. Number of splits • How many splits does an insertion require? At most log2 N + 1 splits. • Seems to require fewer than one split on average when the tree is built from a random permutation. Trees tend to have few 4-nodes.

  22. Top-down 2-3-4 trees • The second method is called top-down, as splits occur on the way down the tree. • All the work occurs before the recursive calls and no work occurs after the recursive calls. • Called tail-recursion, which is much more efficient. • Can AVL trees be made tail recursive?

  23. 2-3-4 trees • Advantages: • Guaranteed O(log N) time for search and insert. • Issues: • Awkward to maintain three types of nodes. • Need to modify the standard search on binary trees. • Splits need to move links between nodes. • Code has many cases to handle.

  24. Red Black Trees

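  25. Red-Black trees • A red-black tree is a binary tree representation of a 2-3-4 tree using red and black nodes. [Figure: 2-3-4 nodes with keys B, D, F, G, H, I and their equivalent red-black subtrees, including two alternative forms joined by "OR"]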

  26. Red-black tree properties • A red-black tree is a binary search tree where: • Every node is colored either red or black. • Note: Every 2-3-4 node corresponds to one black node. • The root node is black. • Red nodes always have black parents (and black children). • Every path from the root to a leaf has the same number of black nodes.
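The color properties listed on slide 26 can be checked mechanically. The sketch below assumes a hypothetical `RbNode` class and treats empty links as the leaves at which paths end; it checks only the color rules, not the binary-search-tree ordering.

```java
// Hypothetical red-black node; field names are illustrative.
class RbNode {
    int key;
    boolean red;
    RbNode left, right;

    RbNode(int key, boolean red, RbNode left, RbNode right) {
        this.key = key;
        this.red = red;
        this.left = left;
        this.right = right;
    }
}

class RbCheck {
    static boolean isRed(RbNode n) { return n != null && n.red; }

    // Returns the number of black nodes on every path from n down to an empty
    // link, or -1 if a red node has a red child or two paths disagree.
    static int blackHeight(RbNode n) {
        if (n == null) return 0;
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    // The root must be black and the color rules must hold everywhere below it.
    static boolean isRedBlack(RbNode root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }
}
```

For example, a black root with two red children passes the check, while a black root whose red child itself has a red child does not.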

  27. Red-black tree height • What is the height of a red-black tree? • It is at most 2 log N + 2 since it can be at most twice as high as its corresponding 2-3-4 tree, which has height at most log N + 1. [Figure: example red-black tree with keys 3, 5, 6, 7, 9]

  28. Red-black Tree Search • Search is the same as for binary search trees. • Color is irrelevant. • Search guaranteed to take O(log N) time. • Search typically occurs more frequently than insert.

  29. Red-black Tree Insert • Simple 4-node test (2 red children?) • Few splits as most 4-nodes tend to be near the leaves. • Some 4-node splits require only changing the color of three nodes. • Rotations needed only when a 4-node has a 3-node parent.
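The 4-node test and the color-only split from slide 29 can be sketched as below. The `RbSplit` class and its nested `Node` type are hypothetical names. The sketch covers only the recoloring case; the rotations needed when the recolored node ends up with a red parent are not shown.

```java
class RbSplit {
    // Hypothetical red-black node; field names are illustrative.
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    static boolean isRed(Node n) { return n != null && n.red; }

    // A black node with two red children represents a 4-node.
    static boolean isFourNode(Node n) {
        return n != null && !n.red && isRed(n.left) && isRed(n.right);
    }

    // Split a 4-node by changing three colors: both children become black
    // (two 2-nodes) and n becomes red (its key joins the parent's node).
    // If n is the root, the caller should color it black again afterwards.
    static void colorFlip(Node n) {
        n.red = true;
        n.left.red = false;
        n.right.red = false;
    }
}
```

A top-down insert would call `isFourNode` on each node along the search path and apply `colorFlip` whenever it returns true.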

  30. Red-black Tree Summary • Advantages: • Guaranteed O(log N) time for search and insert. • Little overhead for balancing. • Trees are nearly optimal. • Top-down implementation can be made tail-recursive, so very efficient.

  31. B-Trees

  32. B-trees • A generalization of 2-3-4 trees. • Used for very large dictionaries where the data are maintained on disks. • Since disk lookups are very SLOW, want to read as few disk pages as possible. Want really shallow depth trees!

  33. B-trees Key Idea • Make the nodes in the trees have a huge number of links, k-way. • Typically choose k so that a node fills a disk page. • As with 2-3-4 trees, not all the nodes have k links. Some may have as few as k/2 links. • When a node overflows, split the node.

  34. B-trees • Takes O(log_{k/2} N) probes for search and insert. • Typically about 2-3 probes (disk accesses). • E.g., for N < 125 million and k = 1000, the height of the tree is less than 3. • As all searches go through the root node, the root node is usually kept in memory. • Many variants exist. • Common in many large database systems.
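The arithmetic behind the 125 million example can be reproduced with a small helper. This is a sketch under stated assumptions: the `BTreeDepth` class and `worstCaseLevels` method are hypothetical names, every node is assumed to have only k/2 links (the worst case), and the result counts levels of nodes, which may differ by one from how the slide counts height.

```java
class BTreeDepth {
    // Smallest number of levels h with (k/2)^h >= n, i.e. ceil(log_{k/2} n),
    // assuming the worst case in which every node has only k/2 links.
    static int worstCaseLevels(long n, int k) {
        long fanout = k / 2;
        if (fanout < 2) throw new IllegalArgumentException("k must be at least 4");
        int levels = 0;
        long reach = 1;                       // keys reachable with 'levels' levels
        while (reach < n) {
            levels++;
            if (reach > n / fanout) break;    // next level already exceeds n; avoids overflow
            reach *= fanout;
        }
        return levels;
    }
}
```

With k = 1000 the worst-case fanout is 500, and 500 cubed is 125 million, so `BTreeDepth.worstCaseLevels(125_000_000L, 1000)` returns 3, in line with the slide's figure of about 2-3 probes.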

  35. Conclusion • AVL trees have the disadvantage that insert is not tail recursive. • 2-3-4 trees are not practical, but are a good way to think about other approaches. • Red-black trees are very efficient and have guaranteed O(log N) insert and search. • B-trees have very shallow depth to minimize the number of disk reads needed for huge databases.
