## Balanced Search Trees


### Balanced Search Trees

15-211 Fundamental Data Structures and Algorithms

Margaret Reid-Miller

3 February 2005

Plan

- Today
- 2-3-4 trees
- Red-Black trees
- Reading:
- For today: Chapters 13.3-4
- Reminder: HW1 due tonight!!!

HW2 will be available soon

AVL-Trees

- Height balanced:
- For each node, the heights of the left and right subtrees differ by at most 1 (a representational invariant).
- What is the mechanism to rebalance an AVL tree that an insert has put out of balance?

AVL Tree Summary

- Each node maintains a lazy deletion flag and the height of its subtree.
- The height of an AVL tree is at most 45% greater than the minimum.
- Requires at most one single or double rotation to regain balance after an insert.
- Thus, guarantees O(log N) time for search and insert.
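
As a rough illustration of the summary above, the following sketch shows one common way to restore the height invariant after an insert using a single or double rotation. The class and method names (`AvlNode`, `AvlSketch`, `rotateLeft`, `rotateRight`) are assumptions made for this example and do not come from the lecture; the lazy deletion flag is left out for brevity.

```java
// Hypothetical minimal AVL node; names are illustrative only.
class AvlNode {
    int key;
    int height = 1;               // height of the subtree rooted at this node
    AvlNode left, right;
    AvlNode(int key) { this.key = key; }
}

class AvlSketch {
    static int height(AvlNode n) { return n == null ? 0 : n.height; }

    static void update(AvlNode n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Single right rotation: the left child becomes the subtree root.
    static AvlNode rotateRight(AvlNode y) {
        AvlNode x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    // Single left rotation: the right child becomes the subtree root.
    static AvlNode rotateLeft(AvlNode x) {
        AvlNode y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    // Recursive insert; rebalancing happens after the recursive call returns.
    static AvlNode insert(AvlNode n, int key) {
        if (n == null) return new AvlNode(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n;                                  // duplicate key: ignore
        update(n);
        int balance = height(n.left) - height(n.right);
        if (balance > 1) {                              // left subtree too tall
            if (key > n.left.key) n.left = rotateLeft(n.left);   // double rotation
            return rotateRight(n);
        }
        if (balance < -1) {                             // right subtree too tall
            if (key < n.right.key) n.right = rotateRight(n.right); // double rotation
            return rotateLeft(n);
        }
        return n;
    }
}
```

Inserting the keys 1 through 7 in increasing order with this sketch yields a perfectly balanced tree with 4 at the root. Note that the height update and balance check run after the recursive call, which is why this style of insert is not tail recursive.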

Balanced 2-3-4 Trees

- Maintain height balance in all subtrees. Depth property.
- But allow nodes in the tree to expand to accommodate inserts.
- In particular, nodes can have 2, 3 or 4 children. Node-size property.
- E.g., a 4-node has 3 keys, which split the key range into 4 intervals.

[Figure: example 2-3-4 tree; surviving node labels: A C, H, O, S U W.]
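
To make the node-size property concrete, here is a minimal sketch of how a 2-3-4 node might be laid out and how its keys select one of up to four child intervals. The class name `Node234` and its fields are assumptions for illustration, not the representation used in the course.

```java
// Hypothetical 2-3-4 node layout; field and method names are assumptions.
class Node234 {
    int numKeys;                              // 1, 2, or 3 keys
    int[] keys = new int[3];                  // kept sorted in increasing order
    Node234[] children = new Node234[4];      // all null in a leaf

    boolean isLeaf() { return children[0] == null; }

    // Index of the child interval that would hold key:
    // the number of keys in this node that are smaller than key.
    int childIndex(int key) {
        int i = 0;
        while (i < numKeys && key > keys[i]) i++;
        return i;
    }

    // Search: at each node either find the key or follow the matching link.
    boolean contains(int key) {
        Node234 n = this;
        while (n != null) {
            int i = n.childIndex(key);
            if (i < n.numKeys && n.keys[i] == key) return true;
            n = n.children[i];
        }
        return false;
    }
}
```

For a 4-node holding the keys 10, 20, and 30, `childIndex` maps 5, 15, 25, and 35 to links 0, 1, 2, and 3 respectively.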

2-3-4 Tree Insert

- To insert, first search for a leaf node in which to put the key.
- E.g., insert U

[Figure: 2-3-4 tree for the insert example; surviving node labels: G M Q, A C, H, R, S W.]

2-3-4 Tree Insert

    /* Either returns an empty node or a new root */
    public Node BUinsert(int key) {
        if (isEmptyNode()) return new Node(key);

        /* Search for the leaf to put the key into */
        Node child = findChild(key); // down which link?
        Node upNode = child.BUinsert(key);

        /* upNode is empty, the key at a leaf node, or
         * the result of a 4-node split that needs to be
         * propagated up. */
        if (upNode.isEmptyNode()) return upNode;
        else return addToNode(upNode); // may split this node in turn
    }

Cascading splits

- When inserting a key into a 4-node, the 4-node splits and a key moves up to the parent node.
- This new key may in turn cause the parent to split, moving a key up to the grandparent, and so on up to the root.
- When would this happen?
- Is there a way to avoid these cascading splits?

Bottom-up 2-3-4 trees

- This BUinsert is called a bottom-up version of insert, since splits occur as we go back up the tree after the recursive calls.
- Work occurs before and after the recursive calls.

Preemptive Split

- Every time we find a 4-node while traveling down a search path, we split the 4-node.
- Note: Two 2-nodes have the same number of children as one 4-node.
- Changes are local to the split node (no cascading).
- Guaranteed to find a 2-node or 3-node at the leaf.
- Splitting a root node creates a new root.
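
The preemptive-split idea can be sketched as a loop that splits every 4-node it meets on the way down, so the parent always has room for the key that moves up. This is an illustrative sketch under assumed names (`TopDown234`, `splitChild`, the `splits` counter), not the course's implementation.

```java
// A sketch of top-down 2-3-4 insertion; all names here are hypothetical.
class TopDown234 {
    static class Node {
        int n;                               // number of keys: 1..3
        int[] keys = new int[3];
        Node[] kids = new Node[4];           // all null in a leaf
        boolean isLeaf() { return kids[0] == null; }
        boolean isFull() { return n == 3; }  // a 4-node
    }

    Node root;
    int splits;                              // counts 4-node splits, for illustration

    // Split the full child parent.kids[i]; its middle key moves up into parent.
    // The parent is never full here because full nodes are split on the way down.
    private void splitChild(Node parent, int i) {
        Node full = parent.kids[i];
        Node right = new Node();
        right.n = 1;
        right.keys[0] = full.keys[2];
        right.kids[0] = full.kids[2];
        right.kids[1] = full.kids[3];
        full.kids[2] = null;
        full.kids[3] = null;
        int mid = full.keys[1];
        full.n = 1;                          // the old node keeps only its smallest key
        for (int j = parent.n; j > i; j--) { // open slot i in the parent
            parent.keys[j] = parent.keys[j - 1];
            parent.kids[j + 1] = parent.kids[j];
        }
        parent.keys[i] = mid;
        parent.kids[i + 1] = right;
        parent.n++;
        splits++;
    }

    void insert(int key) {
        if (root == null) {
            root = new Node();
            root.keys[0] = key;
            root.n = 1;
            return;
        }
        if (root.isFull()) {                 // splitting the root creates a new root
            Node newRoot = new Node();
            newRoot.kids[0] = root;
            root = newRoot;
            splitChild(newRoot, 0);
        }
        Node cur = root;
        while (true) {                       // a loop: no work remains after descending
            int i = 0;
            while (i < cur.n && key > cur.keys[i]) i++;
            if (i < cur.n && cur.keys[i] == key) return;   // duplicate key: ignore
            if (cur.isLeaf()) {              // guaranteed to be a 2-node or 3-node
                for (int j = cur.n; j > i; j--) cur.keys[j] = cur.keys[j - 1];
                cur.keys[i] = key;
                cur.n++;
                return;
            }
            if (cur.kids[i].isFull()) {
                splitChild(cur, i);
                if (key == cur.keys[i]) return;            // key just moved up
                if (key > cur.keys[i]) i++;
            }
            cur = cur.kids[i];
        }
    }

    boolean contains(int key) {
        Node cur = root;
        while (cur != null) {
            int i = 0;
            while (i < cur.n && key > cur.keys[i]) i++;
            if (i < cur.n && cur.keys[i] == key) return true;
            cur = cur.kids[i];
        }
        return false;
    }

    // Number of levels, measured along the leftmost path.
    int height() {
        int h = 0;
        for (Node c = root; c != null; c = c.kids[0]) h++;
        return h;
    }
}
```

Because each split touches only the split node and its parent, the insert needs no second pass back up the tree, which is what allows it to be written as a plain loop.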

2-3-4 Tree Height

- What is the height of the tree?

At most log2 N + 1

- Why?

The maximum depth occurs when every node is a 2-node. Since every leaf has the same depth, the tree is then a complete binary tree, which has depth at most log2 N + 1.
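
The bound can be checked with a short counting argument. Assuming h denotes the number of levels and N the number of keys (notation chosen here for illustration), the fewest keys occur when every node is a 2-node and the most when every node is a 4-node:

```latex
% h = number of levels, N = number of keys (N >= 1)
2^{h} - 1 \;\le\; N \;\le\; 4^{h} - 1
\quad\Longrightarrow\quad
\log_4 (N + 1) \;\le\; h \;\le\; \log_2 (N + 1) \;\le\; \log_2 N + 1
```

The last step uses N + 1 ≤ 2N for N ≥ 1.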

Number of splits

- How many splits does an insertion require?

At most log2 N + 1 splits.

- Seems to require fewer than one split on average when the tree is built from a random permutation. Trees tend to have few 4-nodes.

Top-down 2-3-4 trees

- The second method is called top-down as splits occur on the way down the tree.
- All the work occurs before the recursive calls and no work occurs after the recursive calls.
- This is called tail recursion; it can be turned into a simple loop, which is much more efficient.
- Can AVL trees be made tail recursive?

2-3-4 trees

- Advantages:
- Guaranteed O(log N) time for search and insert.
- Issues:
- Awkward to maintain three types of nodes.
- Need to modify the standard search on binary trees.
- Splits need to move links between nodes.
- Code has many cases to handle.

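Red-Black trees

- A red-black tree is a binary tree representation of a 2-3-4 tree using red and black nodes.

[Figure: 2-3-4 nodes and their red-black representations; surviving node labels: B F H, D I, G. A 3-node such as D I is shown in two alternative forms, joined by "OR", with either I or D on top.]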

Red-black tree properties

A Red-Black tree is a binary search tree where

- Every node is colored either red or black.
- Note: Every 2-3-4 node corresponds to one black node.
- The root node is black.
- Red nodes always have black parents (and black children).
- Every path from the root to a leaf has same number of black nodes.
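
The properties above can be turned into a small checker. The sketch below is illustrative only: the `RbCheck` class and its `Node` type are assumptions, null links are treated as the leaves, and the binary-search-tree ordering of the keys is not verified.

```java
// Hypothetical checker for the red-black coloring properties.
class RbCheck {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the number of black nodes on every path from n down to a null
    // link, or -1 if a red node has a red parent or the path counts differ.
    static int blackHeight(Node n, boolean parentRed) {
        if (n == null) return 0;
        if (n.red && parentRed) return -1;       // red node with a red parent
        int l = blackHeight(n.left, n.red);
        int r = blackHeight(n.right, n.red);
        if (l < 0 || r < 0 || l != r) return -1; // unequal black counts
        return l + (n.red ? 0 : 1);
    }

    static boolean isRedBlack(Node root) {
        boolean rootIsBlack = (root == null) || !root.red;
        return rootIsBlack && blackHeight(root, false) >= 0;
    }
}
```

For example, a black node 6 with red children 3 and 9 (a single 4-node in 2-3-4 terms) passes the check, while the same shape with a red root does not.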

Red-black tree height

- What is the height of a red-black tree?
- It is at most 2 log2 N + 2, since it can be at most twice as high as its corresponding 2-3-4 tree, which has height at most log2 N + 1.

Red-black Tree Search

- Search is the same as for binary search trees.
- Color is irrelevant.
- Search guaranteed to take O(log N) time.
- Search typically occurs more frequently than insert.
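
Since color plays no role in search, the lookup is the ordinary binary-search-tree loop. A minimal sketch, with the `RbSearch` class and `Node` type assumed for illustration:

```java
// Hypothetical red-black node; the color field is never read during search.
class RbSearch {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    // Standard iterative BST search.
    static boolean contains(Node root, int key) {
        Node cur = root;
        while (cur != null) {
            if (key == cur.key) return true;
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        return false;
    }
}
```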

Red-black Tree Insert

- Simple 4-node test (2 red children?)
- Few splits as most 4-nodes tend to be near the leaves.
- Some 4-node splits require only changing the color of three nodes.
- Rotations needed only when a 4-node has a 3-node parent.
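
The 4-node test and the color-change split can be sketched in a few lines. The names below (`ColorFlip`, `isFourNode`, `split`) are hypothetical, and the sketch deliberately omits the rotation that is needed when the split leaves a red node under a red parent.

```java
// Hypothetical sketch of the 4-node test and the color-flip split.
class ColorFlip {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    static boolean isRed(Node n) { return n != null && n.red; }

    // A black node with two red children represents a 4-node.
    static boolean isFourNode(Node n) {
        return n != null && !n.red && isRed(n.left) && isRed(n.right);
    }

    // Splitting a 4-node by recoloring three nodes: the two children become
    // black (two 2-nodes) and the middle key "moves up" by turning red.
    // The root stays black, which corresponds to creating a new root.
    static void split(Node n, boolean isRoot) {
        n.left.red = false;
        n.right.red = false;
        n.red = !isRoot;
    }
}
```

After `split`, a caller would still need to check whether the recolored node now has a red parent and rotate if so; that case corresponds to a 4-node whose parent is a 3-node.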

Red-black Tree Summary

- Advantages:
- Guaranteed O(log N) time for search and insert.
- Little overhead for balancing.
- Trees are nearly optimal.
- Top-down implementation can be made tail-recursive, so very efficient.

B-trees

- A generalization of 2-3-4 trees.
- Used for very large dictionaries where the data are maintained on disks.
- Since disk lookups are very SLOW, want to read as few disk pages as possible.

Want really shallow depth trees!

B-trees Key Idea

- Make the nodes in the trees have a huge number of links, k-way.
- Typically choose k so that a node fills a disk page.
- As with 2-3-4 trees, not all the nodes have k links. Some may have as few as k/2 links.
- When a node overflows, split the node.

B-trees

- Takes O(log_{k/2} N) probes for search and insert.
- Typically about 2-3 probes (disk accesses)
- E.g., for N < 125 million and k = 1000, every node has at least k/2 = 500 links, and 500^3 = 125 million, so log_{k/2} N, and hence the height of the tree, is less than 3.
- As all searches go through the root node, usually keep the root node in memory.
- Many variants
- Common in many large database systems.
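
The probe estimate can be reproduced with simple arithmetic. The helper below is a hypothetical illustration (the name `maxProbes` is not from the lecture); it assumes every node has at least k/2 links and counts the smallest number of levels h for which (k/2)^h reaches N.

```java
// Hypothetical helper estimating the worst-case number of B-tree levels.
class BTreeHeight {
    // Smallest h such that (k/2)^h >= n, assuming each node has >= k/2 links.
    static int maxProbes(long n, int k) {
        if (k < 4) throw new IllegalArgumentException("k must be at least 4");
        long half = k / 2;
        long reach = 1;      // number of keys distinguishable with h levels
        int h = 0;
        while (reach < n) {
            reach *= half;
            h++;
        }
        return h;
    }
}
```

With k = 1000 and N = 125 million this gives 3, since 500^3 = 125,000,000; a 2-3-4 tree (k = 4) would need 27 levels for the same N under this lower bound on fan-out.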

Conclusion

- AVL trees have the disadvantage that insert is not tail recursive.
- 2-3-4 trees are not practical, but are a good way to think about other approaches.
- Red-black trees are very efficient and have guaranteed O(log N) insert and search.
- B-trees have very shallow depth to minimize the number of disk reads needed for huge databases.
