Section 7: BSTs and Tries



Presentation Transcript


  1. Section 7: BSTs and Tries CS 225: Data Structures & Software Principles

  2. Binary Search Trees • A Binary Search Tree is a binary tree with the following properties: • Values associated with nodes have a linear order (i.e. we can define "less-than" on node values) • Every node's value is greater than any value in its left sub-tree and less than any value in its right sub-tree • Abbreviated as BST • [Figure: example BST with node values 6, 10, 4, 7, 12, 1, 5]

  3. Binary Search Trees • Tree height (counted in edges, so a single-node tree has height 0) • The height of a complete binary tree with n nodes is exactly ⌊log₂ n⌋ • The maximum height of a binary tree with n nodes is n-1 • The minimum height of a binary tree with n nodes is ⌊log₂ n⌋ • Why do we care?
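
The two height bounds on this slide can be checked with a few lines of C++. This is only a sketch: the function names `minHeight` and `maxHeight` are made up for illustration, and height is assumed to count edges (so one node has height 0).

```cpp
#include <cassert>
#include <cstddef>

// Minimum possible height of a binary tree with n nodes: floor(log2(n)).
// Height counts edges here (an assumption), so a single node has height 0.
int minHeight(std::size_t n) {
    assert(n >= 1);   // the empty tree is deliberately left out of this sketch
    int h = 0;
    while (n > 1) {   // each halving accounts for one more level
        n /= 2;
        ++h;
    }
    return h;
}

// Maximum possible height: every node has exactly one child, so the tree
// degenerates into a chain of n nodes with n-1 edges.
int maxHeight(std::size_t n) {
    assert(n >= 1);
    return static_cast<int>(n) - 1;
}
```

For the seven-node example tree, `minHeight(7)` is 2 and `maxHeight(7)` is 6, which is the gap between the best and worst search trees.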

  4. Binary Search Trees • Searching in a BST requires looking at one node per tree level (in the worst case) • Worst-case search time • for all possible search trees with n nodes: O(n) • for the best search tree with n nodes: O(log n) vs. …

  5. BST Implementation • We can build BSTs from last week's BinaryTree code – no implementation tricks are needed. (A BST is just a normal binary tree, used in a special way.) • Functions we might like: • Find (search for an item) • Insert (add a new item) • Remove (delete an item)

  6. Basic BST Operations: Find • Basic algorithm: • If we're searching in an empty tree, the value we're looking for isn't here. • Is the value we're looking for at the root? • If so, we're done – we found it • Otherwise, compare the value at the root to the one we're looking for, to figure out which subtree it should be in

  7. Basic BST Operations: Find
     template <typename Etype>
     bool BinarySearchTree<Etype>::find(Etype const & searchElem,
           typename BinarySearchTree<Etype>::TreeNode const * treePtr) const
     {
        // what goes here?
     }

  8. Basic BST Operations: Find
     template <typename Etype>
     bool BinarySearchTree<Etype>::find(Etype const & searchElem,
           typename BinarySearchTree<Etype>::TreeNode const * treePtr) const
     {
        if (treePtr == NULL)
           return false;
        else if (searchElem == treePtr->element)
           return true;
        else if (searchElem < treePtr->element)
           return find(searchElem, treePtr->left);
        else   // searchElem > treePtr->element
           return find(searchElem, treePtr->right);
     }

  9. Basic BST Operations: Find • Recursion incurs extra overhead; let's make this iterative.
     template <typename Etype>
     bool BinarySearchTree<Etype>::find(Etype const & searchElem,
           typename BinarySearchTree<Etype>::TreeNode const * treePtr) const
     {
        // what now?
     }

  10. Basic BST Operations: Find • Recursion incurs extra overhead; let's make this iterative.
      template <typename Etype>
      bool BinarySearchTree<Etype>::find(Etype const & searchElem,
            typename BinarySearchTree<Etype>::TreeNode const * treePtr) const
      {
         // treePtr is a local copy of the pointer, so it can walk down the tree
         while (treePtr != NULL) {
            if (searchElem == treePtr->element)
               return true;
            else if (searchElem < treePtr->element)
               treePtr = treePtr->left;
            else
               treePtr = treePtr->right;
         }
         return false;   // fell off the tree: the value is not present
      }

  11. Basic BST Operations: Insert • Insert • Must ensure that tree remains a binary search tree after insertion • Determine where the element would have been if it were actually in the BST; insert there • What does this mean implementation-wise? • Compare Insert() vs Find()
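
One way to read "determine where the element would have been; insert there" is sketched below. It is not the course's `BinarySearchTree` code: `TreeNode` is a simplified stand-in that stores `int` values, and `bstInsert` and `bstFind` are illustrative names. The point is that Insert walks the same path as Find and attaches a new node at the NULL link where Find would have given up.

```cpp
#include <cassert>

// Simplified stand-in for the course's TreeNode (assumption: int elements).
struct TreeNode {
    int element;
    TreeNode* left;
    TreeNode* right;
    explicit TreeNode(int e) : element(e), left(nullptr), right(nullptr) {}
};

// Insert follows the path Find would take. The pointer is passed by
// reference so that, on reaching an empty link, the new node can be attached
// in place. Returns false (changing nothing) if the value is already present.
bool bstInsert(int newElem, TreeNode*& treePtr) {
    if (treePtr == nullptr) {
        treePtr = new TreeNode(newElem);   // this is where Find would have failed
        return true;
    }
    if (newElem == treePtr->element) return false;
    if (newElem < treePtr->element) return bstInsert(newElem, treePtr->left);
    return bstInsert(newElem, treePtr->right);
}

// Iterative search, same logic as the slides' find, on the simplified node.
bool bstFind(int searchElem, TreeNode const* treePtr) {
    while (treePtr != nullptr) {
        if (searchElem == treePtr->element) return true;
        treePtr = (searchElem < treePtr->element) ? treePtr->left : treePtr->right;
    }
    return false;
}
```

The only structural difference from Find is the reference parameter `TreeNode*&`, which lets the base case modify the parent's link instead of just reporting failure.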

  12. Basic BST Operations: Remove • Remove • Not as easy • Start by finding the node we want to remove • Next, there are three cases to consider: • The node is a leaf • The node has one child • The node has two children

  13. Terminology for Remove • Consider root node (6) • In-order successor: smallest (left-most) element in right subtree (7 for this tree) • In-order predecessor: greatest (right-most) element in left subtree (5 for this tree) • [Figure: the example BST with node values 6, 10, 4, 7, 12, 1, 5]

  14. Basic BST Operations: Remove • If the node we're removing has… • No children: • Just delete it! • One child: • Attach the node's parent to its child • Then delete it! • Two children: • Find the in-order successor (IOS) • Swap the values between the node and its IOS • Remove the "old" value from the right subtree (how do we know this removal will be "easy"?)
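
The three cases above can be sketched as follows. This is a hedged illustration rather than the course's implementation: `TreeNode` is a simplified `int` node and `bstRemove` is a made-up name. Instead of swapping values, the sketch copies the successor's value up and then removes that value from the right subtree, which has the same effect.

```cpp
#include <cassert>

// Simplified stand-in for the course's TreeNode (assumption: int elements).
struct TreeNode {
    int element;
    TreeNode* left;
    TreeNode* right;
    TreeNode(int e, TreeNode* l = nullptr, TreeNode* r = nullptr)
        : element(e), left(l), right(r) {}
};

// Removes value from the subtree rooted at treePtr; returns false if absent.
bool bstRemove(int value, TreeNode*& treePtr) {
    if (treePtr == nullptr) return false;   // value is not in the tree
    if (value < treePtr->element) return bstRemove(value, treePtr->left);
    if (value > treePtr->element) return bstRemove(value, treePtr->right);

    if (treePtr->left != nullptr && treePtr->right != nullptr) {
        // Two children: find the in-order successor (left-most node of the
        // right subtree), copy its value up, then remove it from below.
        TreeNode* ios = treePtr->right;
        while (ios->left != nullptr) ios = ios->left;
        treePtr->element = ios->element;
        // The successor has no left child, so this call hits an easy case.
        return bstRemove(ios->element, treePtr->right);
    }

    // Zero or one child: hand the (possibly NULL) child to the parent's
    // link, then free the node.
    TreeNode* doomed = treePtr;
    treePtr = (treePtr->left != nullptr) ? treePtr->left : treePtr->right;
    delete doomed;
    return true;
}
```

Passing the pointer by reference means "attach the node's parent to its child" is a single assignment to `treePtr`; no separate parent pointer is needed.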

  15. Intermission • Midterm 1 is graded! • Average: ~65 (out of 90); standard deviation: ~17 • To request a regrade, write up a list of the problems you think we should look at, and why we should look at them. Give it to a TA. • If you want (or might want) a regrade, don't take your exam home today (leave it with me).

  16. Tries • Data structure optimized for lookups on a key that can be decomposed into characters • Represented using a tree of arrays • For a character set of size k, the corresponding Trie structure is a (k+1)-ary tree

  17. Tries • The i-th character (starting at 0) of the key corresponds to a node at depth i • Need a mapping from each character to an array index • The extra cell in the array represents the “null character” • Represents the end of a word • Points to a leaf • Ideally, no need to store the key in a leaf, since it is completely determined by the path followed • Info stored at the leaf • Spend only constant time at each level
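
The character-to-index mapping might look like the sketch below. The later slides call a function named `ascIndex` without showing its body, so everything here is an assumption: keys are lowercase `a` to `z`, those letters map to cells 0 to 25, and the end-of-word (null) character takes the extra cell 26.

```cpp
#include <cassert>

// Assumed layout of one trie node's array: 26 letter cells plus one extra
// cell for the null character that marks the end of a word.
const int ALPHABET_SIZE = 26;
const int END_OF_WORD_INDEX = ALPHABET_SIZE;   // the "extra" cell

// Hypothetical body for ascIndex; the course's real version is not shown.
int ascIndex(char c) {
    if (c == '\0') return END_OF_WORD_INDEX;
    assert(c >= 'a' && c <= 'z');   // this sketch only handles lowercase keys
    return c - 'a';
}
```

Because the mapping is plain arithmetic, choosing the child to follow costs constant time per level, which is what makes the whole lookup proportional to the key length.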

  18. Trie Example • Words in Trie: raft, star, start, stir • [Figure: trie diagram; each node is an array with cells a, b, c, …, r, s, …, z; node levels are numbered 0 through 5, with leaves for raft, star, stir, and start]

  19. Tries • Running time of Find operation: O(L) where L is the length of the string we are looking for • Advantage: NOT dependent on the number of strings we have in the Trie structure • Disadvantage: memory waste • A 27-cell array (one cell per character) is needed at every node for strings • Space: (k+1) * #nodes * sizeof(pointer) • Although, could be better for a large number of short strings
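
To put a number on the space formula, here is a small sketch that evaluates it. The name `trieArrayBytes` is invented for this example, and the result counts only the child-pointer arrays, not keys or stored info.

```cpp
#include <cassert>
#include <cstddef>

// Bytes used by the child-pointer arrays alone, following the slide's
// formula (k+1) * #nodes * sizeof(pointer). Illustrative helper only.
std::size_t trieArrayBytes(std::size_t k, std::size_t nodeCount) {
    assert(k > 0);   // a trie needs at least one character in its alphabet
    return (k + 1) * nodeCount * sizeof(void*);
}
```

On a machine with 8-byte pointers and k = 26, each node's array alone takes 27 * 8 = 216 bytes, even if only one cell is in use.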

  20. Jason’s Code: TrieNode Data
      struct TrieNode {
         int nodeLevel;               // level of the node
         bool isLeaf;                 // is this a leaf?
         Array<TrieNode*> subtries;   // child pointers, used by array nodes
         String key;                  // string key in leaf nodes
         Etype storedInfo;            // associated info in leaf nodes
      };

  21. Code Review: Trie Search
      template <class Etype>
      pair<bool, Etype> Trie<Etype>::find(String const & searchKey,
            typename Trie<Etype>::TrieNode const * nodePtr)
      {
         if (nodePtr == NULL)
            return pair<bool, Etype>(false, Etype());
         else if (nodePtr->isLeaf == true) {   // found a leaf
            if (searchKey == nodePtr->key)
               return pair<bool, Etype>(true, nodePtr->storedInfo);
            else
               return pair<bool, Etype>(false, Etype());
         }
         else {   // not a leaf
            // choose the child by the key character at this node's level
            int index = ascIndex(searchKey[nodePtr->nodeLevel]);
            return find(searchKey, (nodePtr->subtries)[index]);
         }
      }

  22. Code Review: Trie Insert
      template <class Etype>
      void Trie<Etype>::insert(String insKey, Etype insInfo,
            typename Trie<Etype>::TrieNode * & nodePtr, int prevLevel)
      {
         if (nodePtr == NULL) {   // NULL case
            if (prevLevel == insKey.length()) {   // make leaf node
               nodePtr = new TrieNode(insKey, insInfo);
               nodePtr->nodeLevel = prevLevel+1;
            }
            else {   // make internal node
               nodePtr = new TrieNode();
               nodePtr->nodeLevel = prevLevel+1;
               insert(insKey, insInfo,
                      (nodePtr->subtries)[ascIndex(insKey[nodePtr->nodeLevel])],
                      nodePtr->nodeLevel);
            }
         } // more…

  23. …Trie Insert
         else if (nodePtr->isLeaf == true) {   // leaf case
            cout << "This key already exists in the trie!" << endl;
            return;
         }
         else   // nodePtr->isLeaf == false, array node case
            insert(insKey, insInfo,
                   (nodePtr->subtries)[ascIndex(insKey[nodePtr->nodeLevel])],
                   nodePtr->nodeLevel);
      }

  24. Trie Remove • What's the basic idea? • Search for the key • If it's not there, give up • Otherwise, remember the leaf that corresponds to it • Delete the leaf • Figure out if this makes any of the internal nodes "empty"; if so, delete them too • When we actually implement this, we'll execute multiple steps at the same time…
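
The remove steps above can be combined into one recursive pass, as in the sketch below. It is a simplified, self-contained trie rather than the course's `Trie<Etype>`: keys are assumed to be lowercase `a` to `z`, leaves store nothing, and the names `cellFor`, `trieInsert`, `trieFind`, and `trieRemove` are all invented for this example. The search happens on the way down; the leaf and any internal nodes that become empty are deleted on the way back up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const int CELLS = 27;   // assumed layout: 'a'..'z' plus one end-of-word cell

struct TrieNode {
    bool isLeaf;
    TrieNode* subtries[CELLS];
    explicit TrieNode(bool leaf) : isLeaf(leaf) {
        for (int i = 0; i < CELLS; ++i) subtries[i] = nullptr;
    }
};

// Cell to follow at a given level; positions past the last character of the
// key map to the end-of-word cell.
int cellFor(const std::string& key, std::size_t level) {
    return level < key.size() ? key[level] - 'a' : CELLS - 1;
}

void trieInsert(const std::string& key, TrieNode*& node, std::size_t level = 0) {
    if (level > key.size()) {   // just followed the end-of-word cell
        if (node == nullptr) node = new TrieNode(true);
        return;
    }
    if (node == nullptr) node = new TrieNode(false);
    trieInsert(key, node->subtries[cellFor(key, level)], level + 1);
}

bool trieFind(const std::string& key, const TrieNode* node, std::size_t level = 0) {
    if (node == nullptr) return false;
    if (level > key.size()) return node->isLeaf;
    return trieFind(key, node->subtries[cellFor(key, level)], level + 1);
}

// Returns true if the key was present and has been removed.
bool trieRemove(const std::string& key, TrieNode*& node, std::size_t level = 0) {
    if (node == nullptr) return false;   // key is not there: give up
    if (level > key.size()) {            // reached the key's leaf
        delete node;
        node = nullptr;
        return true;
    }
    if (!trieRemove(key, node->subtries[cellFor(key, level)], level + 1))
        return false;
    // Back in an internal node: keep it if any child is still in use.
    for (int i = 0; i < CELLS; ++i)
        if (node->subtries[i] != nullptr) return true;
    delete node;      // empty internal node, delete it too
    node = nullptr;
    return true;
}
```

With the slide's example words, removing "start" leaves "star" intact because the node reached after "star" still has its end-of-word cell in use; removing every word eventually sets the root pointer back to NULL.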
