
CSC401 – Analysis of Algorithms Chapter 2 Basic Data Structures

CSC401 – Analysis of Algorithms, Chapter 2: Basic Data Structures. Objectives: introduce basic data structures, including stacks and queues; vectors, lists, and sequences; trees; priority queues and heaps; dictionaries and hash tables.

estellel

Presentation Transcript


  1. CSC401 – Analysis of Algorithms, Chapter 2: Basic Data Structures • Objectives: • Introduce basic data structures, including • Stacks and Queues • Vectors, Lists, and Sequences • Trees • Priority Queues and Heaps • Dictionaries and Hash Tables • Analyze the performance of operations on basic data structures

  2. Abstract Data Types (ADTs) • An abstract data type (ADT) is an abstraction of a data structure • An ADT specifies: • Data stored • Operations on the data • Error conditions associated with operations • Example: ADT modeling a simple stock trading system • The data stored are buy/sell orders • The operations supported are • order buy(stock, shares, price) • order sell(stock, shares, price) • void cancel(order) • Error conditions: • Buy/sell a nonexistent stock • Cancel a nonexistent order

  3. The Stack ADT • The Stack ADT stores arbitrary objects • Insertions and deletions follow the last-in first-out (LIFO) scheme • Think of a spring-loaded plate dispenser • Main stack operations: • push(object): inserts an element • object pop(): removes and returns the last inserted element • Auxiliary stack operations: • object top(): returns the last inserted element without removing it • integer size(): returns the number of elements stored • boolean isEmpty(): indicates whether no elements are stored • Exceptions: • Attempting to execute an operation of an ADT may sometimes cause an error condition, called an exception • Exceptions are said to be "thrown" by an operation that cannot be executed • In the Stack ADT, operations pop and top cannot be performed if the stack is empty • Attempting to execute pop or top on an empty stack throws an EmptyStackException

  4. Applications of Stacks • Direct applications • Page-visited history in a Web browser • Undo sequence in a text editor • Chain of method calls in the Java Virtual Machine • Indirect applications • Auxiliary data structure for algorithms • Component of other data structures • The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack • When a method is called, the JVM pushes on the stack a frame containing • Local variables and return value • Program counter, keeping track of the statement being executed • When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack • Example code: main() { int i = 5; foo(i); } foo(int j) { int k; k = j+1; bar(k); } bar(int m) { … } • Stack while bar is executing, from top to bottom: bar (PC = 1, m = 6), foo (PC = 3, j = 5, k = 6), main (PC = 2, i = 5)

  5. Array-based Stack • A simple way of implementing the Stack ADT uses an array S • We add elements from left to right • A variable t keeps track of the index of the top element • The array storing the stack elements may become full • A push operation will then throw a FullStackException • This is a limitation of the array-based implementation, not intrinsic to the Stack ADT • Limitations • The fixed maximum size • Trying to push a new element into a full stack causes an implementation-specific exception • Performance • Let n be the number of elements in the stack • The space used is O(n) • Each operation runs in time O(1)
     Algorithm size(): return t + 1
     Algorithm pop(): if isEmpty() then throw EmptyStackException, else t ← t - 1; return S[t + 1]
     Algorithm push(o): if t = S.length - 1 then throw FullStackException, else t ← t + 1; S[t] ← o

  6. Stack Interface & ArrayStack in Java
     public interface Stack {
       public int size();
       public boolean isEmpty();
       public Object top() throws EmptyStackException;
       public void push(Object o);
       public Object pop() throws EmptyStackException;
     }
     public class ArrayStack implements Stack {
       private Object S[];
       private int top = -1;   // index of the top element; -1 when the stack is empty
       public ArrayStack(int capacity) {
         S = new Object[capacity];
       }
       public Object pop() throws EmptyStackException {
         if (isEmpty())
           throw new EmptyStackException("Empty stack: cannot pop");
         Object temp = S[top];
         S[top] = null;        // clear the slot so the object can be garbage collected
         top = top - 1;
         return temp;
       }
       // size(), isEmpty(), top() and push() are not shown on the slide
     }
     • Other Implementations of Stack • Extendable array-based stack • Linked list-based stack

  7. The Queue ADT • The Queue ADT stores arbitrary objects • Insertions and deletions follow the first-in first-out scheme • Insertions are at the rear and removals at the front • Main queue operations: • enqueue(object): inserts an element at the end of the queue • object dequeue(): removes and returns the element at the front • Auxiliary queue operations: • object front(): returns the element at the front without removing it • integer size(): returns the number of elements stored • boolean isEmpty(): indicates whether no elements are stored • Exceptions • Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException • Direct applications • Waiting lists, bureaucracy • Access to shared resources (e.g., printer) • Multiprogramming • Indirect applications • Auxiliary data structure for algorithms • Component of other data structures

  8. Array-based Queue • Use an array Q of size N in a circular fashion • Two variables keep track of the front and rear • f: index of the front element • r: index immediately past the rear element • Array location r is kept empty • [Figure: array Q in the normal configuration (f to the left of r) and in the wrapped-around configuration (r to the left of f)]

  9. Array-based Queue Operations • We use the modulo operator (remainder of division) • Operation enqueue throws an exception if the array is full • This exception is implementation-dependent • Operation dequeue throws an exception if the queue is empty • This exception is specified in the queue ADT
     Algorithm size(): return (N - f + r) mod N
     Algorithm isEmpty(): return (f = r)
     Algorithm enqueue(o): if size() = N - 1 then throw FullQueueException, else Q[r] ← o; r ← (r + 1) mod N
     Algorithm dequeue(): if isEmpty() then throw EmptyQueueException, else o ← Q[f]; f ← (f + 1) mod N; return o
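The circular-array pseudocode on this slide can be sketched in Java roughly as follows. This is a minimal illustration, not the course's implementation: the class name ArrayQueueSketch is hypothetical, and the standard IllegalStateException stands in for the slide's FullQueueException and EmptyQueueException.

```java
// Illustrative circular-array queue; names and exception types are assumptions.
class ArrayQueueSketch {
    private final Object[] q;   // the array Q of size N
    private int f = 0;          // index of the front element
    private int r = 0;          // index immediately past the rear element

    ArrayQueueSketch(int capacity) {
        q = new Object[capacity];
    }

    int size() {
        return (q.length - f + r) % q.length;
    }

    boolean isEmpty() {
        return f == r;
    }

    void enqueue(Object o) {
        // One cell is always left empty so that f == r can only mean "empty".
        if (size() == q.length - 1) {
            throw new IllegalStateException("Queue is full");
        }
        q[r] = o;
        r = (r + 1) % q.length;
    }

    Object dequeue() {
        if (isEmpty()) {
            throw new IllegalStateException("Queue is empty");
        }
        Object o = q[f];
        q[f] = null;            // drop the reference held by the array
        f = (f + 1) % q.length;
        return o;
    }
}
```

With N = 4 the queue holds at most three elements. After three enqueues, one dequeue and one more enqueue, r has wrapped around to index 0 while f is 1, which is the wrapped-around configuration.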

  10. Queue Interface in Java • Java interface corresponding to our Queue ADT • Requires the definition of class EmptyQueueException • No corresponding built-in Java class
      public interface Queue {
        public int size();
        public boolean isEmpty();
        public Object front() throws EmptyQueueException;
        public void enqueue(Object o);
        public Object dequeue() throws EmptyQueueException;
      }
      • Other Implementations of Queue • Extendable array-based queue: the enqueue operation has amortized running time • O(n) with the incremental strategy • O(1) with the doubling strategy • Linked list-based queue

  11. The Vector ADT • The Vector ADT extends the notion of array by storing a sequence of arbitrary objects • An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it) • An exception is thrown if an incorrect rank is specified (e.g., a negative rank) • Main vector operations: • object elemAtRank(integer r): returns the element at rank r without removing it • object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element • insertAtRank(integer r, object o): inserts a new element o to have rank r • object removeAtRank(integer r): removes and returns the element at rank r • Additional operations size() and isEmpty() • Direct applications • Sorted collection of objects (elementary database) • Indirect applications • Auxiliary data structure for algorithms • Component of other data structures

  12. Array-based Vector • Use an array V of size N • A variable n keeps track of the size of the vector (number of elements stored) • Operation elemAtRank(r) is implemented in O(1) time by returning V[r] • In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n - r elements V[r], …, V[n - 1] • In the worst case (r = 0), this takes O(n) time • [Figure: array V before the shift, after shifting V[r], …, V[n - 1] forward, and after storing o at rank r]

  13. Array-based Vector • In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n - r - 1 elements V[r + 1], …, V[n - 1] • In the worst case (r = 0), this takes O(n) time • Performance • In the array-based implementation of a Vector • The space used by the data structure is O(n) • size, isEmpty, elemAtRank and replaceAtRank run in O(1) time • insertAtRank and removeAtRank run in O(n) time • If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time • In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one (extendable array) • [Figure: array V before removal, with the hole at rank r, and after shifting V[r + 1], …, V[n - 1] backward]
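The shifting steps described for insertAtRank and removeAtRank might look like this in Java. It is a sketch under stated assumptions: the class name ArrayVectorSketch is hypothetical, IndexOutOfBoundsException stands in for whatever exception the course uses for an invalid rank, and the full-array case uses the doubling form of the extendable array.

```java
import java.util.Arrays;

// Illustrative array-based vector; names and exception type are assumptions.
class ArrayVectorSketch {
    private Object[] v;   // the array V of size N
    private int n = 0;    // number of elements stored

    ArrayVectorSketch(int capacity) {
        v = new Object[Math.max(1, capacity)];
    }

    int size() { return n; }

    boolean isEmpty() { return n == 0; }

    Object elemAtRank(int r) {
        checkRank(r, n);
        return v[r];
    }

    void insertAtRank(int r, Object o) {
        checkRank(r, n + 1);                      // r == n appends at the end
        if (n == v.length) {
            v = Arrays.copyOf(v, 2 * v.length);   // extendable array: double the size
        }
        for (int i = n - 1; i >= r; i--) {
            v[i + 1] = v[i];                      // shift V[r..n-1] forward
        }
        v[r] = o;
        n++;
    }

    Object removeAtRank(int r) {
        checkRank(r, n);
        Object old = v[r];
        for (int i = r; i < n - 1; i++) {
            v[i] = v[i + 1];                      // shift V[r+1..n-1] backward
        }
        v[n - 1] = null;
        n--;
        return old;
    }

    private static void checkRank(int r, int limit) {
        if (r < 0 || r >= limit) {
            throw new IndexOutOfBoundsException("Invalid rank: " + r);
        }
    }
}
```

Inserting or removing at rank 0 moves every stored element, which is the O(n) worst case; at rank n (insert) or n - 1 (remove) the loops do no work.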

  14. Singly Linked List • A singly linked list is a concrete data structure consisting of a sequence of nodes • Each node stores • element • link to the next node • Stack with singly linked list • The top element is stored at the first node of the list • The space used is O(n) and each operation of the Stack ADT takes O(1) time • Queue with singly linked list • The front element is stored at the first node • The rear element is stored at the last node • The space used is O(n) and each operation of the Queue ADT takes O(1) time • [Figure: a node with its elem and next fields, and a list of four nodes A, B, C, D ending in a null link]
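As one way to picture the queue variant described here, the sketch below keeps references to both the first and the last node so that enqueue and dequeue each touch a constant number of links. The class name LinkedQueueSketch and the use of NoSuchElementException in place of EmptyQueueException are assumptions made for illustration.

```java
import java.util.NoSuchElementException;

// Illustrative queue on a singly linked list; names are assumptions.
class LinkedQueueSketch<E> {
    private static final class Node<E> {
        final E elem;
        Node<E> next;
        Node(E elem) { this.elem = elem; }
    }

    private Node<E> head;   // first node: holds the front element
    private Node<E> tail;   // last node: holds the rear element
    private int size = 0;

    int size() { return size; }

    boolean isEmpty() { return size == 0; }

    void enqueue(E e) {
        Node<E> node = new Node<>(e);
        if (tail == null) {
            head = node;        // the queue was empty
        } else {
            tail.next = node;   // link after the current rear
        }
        tail = node;
        size++;
    }

    E front() {
        if (head == null) throw new NoSuchElementException("Queue is empty");
        return head.elem;
    }

    E dequeue() {
        if (head == null) throw new NoSuchElementException("Queue is empty");
        E e = head.elem;
        head = head.next;
        if (head == null) tail = null;   // the queue became empty
        size--;
        return e;
    }
}
```

Removing at the rear would need the predecessor of the last node, which a singly linked list cannot reach in O(1); that is why the front sits at the first node and insertions happen at the last.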

  15. Position ADT & List ADT • The Position ADT • models the notion of place within a data structure where a single object is stored • gives a unified view of diverse ways of storing data, such as • a cell of an array • a node of a linked list • Just one method: • object element(): returns the element stored at the position • The List ADT • models a sequence of positions storing arbitrary objects • establishes a before/after relation between positions • Generic methods: size(), isEmpty() • Query methods: isFirst(p), isLast(p) • Accessor methods: first(), last(), before(p), after(p) • Update methods: • replaceElement(p, o), swapElements(p, q) • insertBefore(p, o), insertAfter(p, o) • insertFirst(o), insertLast(o) • remove(p)

  16. Doubly Linked List • A doubly linked list provides a natural implementation of the List ADT • Nodes implement Position and store: • element • link to the previous node • link to the next node • Special trailer and header nodes • [Figure: a node with its prev, elem and next fields, and a list whose nodes/positions hold the elements between the header and trailer sentinels]

  17. Doubly Linked List Operations • We visualize insertAfter(p, X), which returns position q • We visualize remove(p), where p = last() • Performance • The space used by a doubly linked list with n elements is O(n) • The space used by each position of the list is O(1) • All the operations of the List ADT run in O(1) time • Operation element() of the Position ADT runs in O(1) time • [Figure: insertAfter(p, X) turns the list A, B, C into A, B, X, C, with q the position of X; remove(p) with p = last() turns A, B, C, D into A, B, C]
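The two operations visualized on this slide come down to a handful of link updates. The following is a minimal sketch with header and trailer sentinels; the class name DListSketch and its nested Node type are hypothetical, and validity checks on positions are left out for brevity.

```java
// Illustrative doubly linked list with sentinels; names are assumptions.
class DListSketch<E> {
    static final class Node<E> {
        E elem;
        Node<E> prev;
        Node<E> next;
        Node(E elem, Node<E> prev, Node<E> next) {
            this.elem = elem;
            this.prev = prev;
            this.next = next;
        }
        E element() { return elem; }   // the single Position ADT method
    }

    private final Node<E> header = new Node<>(null, null, null);
    private final Node<E> trailer = new Node<>(null, header, null);
    private int size = 0;

    DListSketch() {
        header.next = trailer;
    }

    int size() { return size; }

    Node<E> first() { return header.next; }   // equals trailer when the list is empty

    Node<E> last() { return trailer.prev; }   // equals header when the list is empty

    // Inserts e right after position p and returns the new position q.
    Node<E> insertAfter(Node<E> p, E e) {
        Node<E> q = new Node<>(e, p, p.next);
        p.next.prev = q;
        p.next = q;
        size++;
        return q;
    }

    Node<E> insertLast(E e) {
        return insertAfter(trailer.prev, e);
    }

    // Unlinks position p and returns its element.
    E remove(Node<E> p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = null;
        p.next = null;
        size--;
        return p.elem;
    }
}
```

Because the sentinels guarantee that every real node has both neighbors, neither operation needs a special case for the ends of the list.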

  18. Sequence ADT • The Sequence ADT is the union of the Vector and List ADTs • Elements accessed by • Rank or Position • Generic methods: • size(), isEmpty() • Vector-based methods: • elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r) • List-based methods: • first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p) • Bridge methods: • atRank(r), rankOf(p) • Direct applications: • Generic replacement for stack, queue, vector, or list • small database • Indirect applications: • Building block of more complex data structures • The Sequence ADT is a basic, general-purpose, data structure for storing an ordered collection of elements

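  19. Array-based Implementation • We use a circular array S storing positions • A position object stores: • Element • Rank • Indices f and l keep track of first and last positions • [Figure: circular array S with cells 0, 1, 2, 3 pointing to position objects that hold the elements; f and l mark the first and last positions]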

  20. Sequence Implementations (worst-case running times; 1 stands for O(1), n for O(n))
      Operation                       Array   List
      size, isEmpty                   1       1
      atRank, rankOf, elemAtRank      1       n
      first, last, before, after      1       1
      replaceElement, swapElements    1       1
      replaceAtRank                   1       n
      insertAtRank, removeAtRank      n       n
      insertFirst, insertLast         1       1
      insertAfter, insertBefore       n       1
      remove                          n       1

  21. Design Patterns • Adaptor • Position • Composition • Iterator • Comparator • Locator

  22. Design Pattern: Iterators • An iterator abstracts the process of scanning through a collection of elements • Methods of the ObjectIterator ADT: • object object() • boolean hasNext() • object nextObject() • reset() • Extends the concept of Position by adding a traversal capability • Implementation with an array or singly linked list • An iterator is typically associated with another data structure • We can augment the Stack, Queue, Vector, List and Sequence ADTs with method: • ObjectIterator elements() • Two notions of iterator: • snapshot: freezes the contents of the data structure at a given time • dynamic: follows changes to the data structure
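To make the snapshot notion concrete, here is a small Java sketch of an array-backed iterator. The class name SnapshotIteratorSketch is hypothetical, a java.util.List stands in for the underlying data structure, and the slide's object() method is omitted because its exact semantics are not spelled out here.

```java
import java.util.List;
import java.util.NoSuchElementException;

// Illustrative snapshot iterator; names are assumptions.
class SnapshotIteratorSketch {
    private final Object[] snapshot;   // contents frozen at construction time
    private int next = 0;              // index of the element returned by the next call

    SnapshotIteratorSketch(List<?> source) {
        snapshot = source.toArray();   // copy, so later changes to source are not seen
    }

    boolean hasNext() {
        return next < snapshot.length;
    }

    Object nextObject() {
        if (!hasNext()) {
            throw new NoSuchElementException("No more elements");
        }
        return snapshot[next++];
    }

    void reset() {
        next = 0;
    }
}
```

A dynamic iterator would instead keep a reference to the live structure and read from it on every call, so elements added after the iterator was created could still be visited.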

  23. The Tree Structure • In computer science, a tree is an abstract model of a hierarchical structure • A tree consists of nodes with a parent-child relation • Applications: • Organization charts • File systems • Programming environments • [Figure: organization chart rooted at Computers"R"Us, with nodes Sales, Manufacturing, R&D, US, International, Laptops, Desktops, Europe, Asia, Canada]

  24. Tree Terminology • Root: node without parent (A) • Internal node: node with at least one child (A, B, C, F) • External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D) • Ancestors of a node: parent, grandparent, great-grandparent, etc. • Descendant of a node: child, grandchild, great-grandchild, etc. • Depth of a node: number of ancestors • Height of a tree: maximum depth of any node (3) • Subtree: tree consisting of a node and its descendants • [Figure: tree with nodes A through K rooted at A, with one subtree outlined]

  25. Tree ADT • We use positions to abstract nodes • Generic methods: • integer size() • boolean isEmpty() • objectIterator elements() • positionIterator positions() • Accessor methods: • position root() • position parent(p) • positionIterator children(p) • Query methods: • boolean isInternal(p) • boolean isExternal(p) • boolean isRoot(p) • Update methods: • swapElements(p, q) • object replaceElement(p, o) • Additional update methods may be defined by data structures implementing the Tree ADT

  29. Depth and Height • Depth: the depth of v is the number of ancestors of v, excluding v itself • the depth of the root is 0 • the depth of a node v other than the root is one plus the depth of its parent • time efficiency is O(1 + d), where d is the depth of v • Height: the height of the subtree rooted at v is the maximum depth, measured from v, of its external nodes • the height of an external node is 0 • the height of an internal node v is one plus the maximum height of its children • time efficiency is O(n), where n is the number of nodes in the subtree
      Algorithm depth(T, v)
        if T.isRoot(v) then return 0
        else return 1 + depth(T, T.parent(v))
      Algorithm height(T, v)
        if T.isExternal(v) then return 0
        else
          h ← 0
          for each w ∈ T.children(v) do
            h ← max(h, height(T, w))
          return 1 + h
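The two recursive definitions translate almost line by line into Java. The sketch below assumes a bare-bones linked node type, TreeNodeSketch, invented for this example; it is not the Tree ADT interface used in the course.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative general tree node; the class and its methods are assumptions.
class TreeNodeSketch {
    final String elem;
    TreeNodeSketch parent;   // null for the root
    final List<TreeNodeSketch> children = new ArrayList<>();

    TreeNodeSketch(String elem) {
        this.elem = elem;
    }

    // Creates a child holding elem, links it to this node, and returns it.
    TreeNodeSketch addChild(String elem) {
        TreeNodeSketch child = new TreeNodeSketch(elem);
        child.parent = this;
        children.add(child);
        return child;
    }

    boolean isRoot() { return parent == null; }

    boolean isExternal() { return children.isEmpty(); }

    // Number of ancestors of v: follows parent links, O(1 + depth) time.
    static int depth(TreeNodeSketch v) {
        return v.isRoot() ? 0 : 1 + depth(v.parent);
    }

    // Height of the subtree rooted at v: visits every node of the subtree once.
    static int height(TreeNodeSketch v) {
        if (v.isExternal()) {
            return 0;
        }
        int h = 0;
        for (TreeNodeSketch w : v.children) {
            h = Math.max(h, height(w));
        }
        return 1 + h;
    }
}
```

For a root with a single chain of three descendants below it, depth of the deepest node and height of the root are both 3.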

  30. Preorder Traversal • A traversal visits the nodes of a tree in a systematic manner • In a preorder traversal, a node is visited before its descendants • The running time is O(n) • Application: print a structured document
      Algorithm preOrder(v)
        visit(v)
        for each child w of v
          preOrder(w)
      [Figure: document outline numbered in preorder: 1 Make Money Fast!, 2 1. Motivations, 3 1.1 Greed, 4 1.2 Avidity, 5 2. Methods, 6 2.1 StockFraud, 7 2.2 PonziScheme, 8 2.3 BankRobbery, 9 References]

  31. Postorder Traversal • In a postorder traversal, a node is visited after its descendants • The running time is O(n) • Application: compute space used by files in a directory and its subdirectories
      Algorithm postOrder(v)
        for each child w of v
          postOrder(w)
        visit(v)
      [Figure: directory tree numbered in postorder: 1 h1c.doc (3K), 2 h1nc.doc (2K), 3 homeworks/, 4 DDR.java (10K), 5 Stocks.java (25K), 6 Robot.java (20K), 7 programs/, 8 todo.txt (1K), 9 cs16/]
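Both traversals can be tried on the directory example from this slide. The sketch below is illustrative only: the FileNodeSketch class is hypothetical, and directories are assumed to occupy no space of their own, so the total counts file sizes only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative file-system tree node; the class is an assumption.
class FileNodeSketch {
    final String name;
    final int sizeKb;   // size of the entry itself; 0 is assumed for directories
    final List<FileNodeSketch> children = new ArrayList<>();

    FileNodeSketch(String name, int sizeKb, FileNodeSketch... kids) {
        this.name = name;
        this.sizeKb = sizeKb;
        for (FileNodeSketch k : kids) {
            children.add(k);
        }
    }

    // Preorder: record v first, then traverse its children left to right.
    static void preOrder(FileNodeSketch v, List<String> out) {
        out.add(v.name);
        for (FileNodeSketch w : v.children) {
            preOrder(w, out);
        }
    }

    // Postorder: total the children first, then "visit" v by adding its own size.
    static int diskUsage(FileNodeSketch v) {
        int total = 0;
        for (FileNodeSketch w : v.children) {
            total += diskUsage(w);
        }
        return total + v.sizeKb;
    }
}
```

On the cs16/ tree with the file sizes from the figure, diskUsage on the root adds 3 + 2 + 10 + 25 + 20 + 1 and returns 61, while preOrder lists cs16/ first and todo.txt last.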

  32. Binary Tree • A binary tree is a tree with the following properties: • Each internal node has two children • The children of a node are an ordered pair • We call the children of an internal node left child and right child • Alternative recursive definition: a binary tree is either • a tree consisting of a single node, or • a tree whose root has an ordered pair of children, each of which is a binary tree • Applications: • arithmetic expressions • decision processes • searching • [Figure: binary tree with nodes A through I rooted at A]

  33. Binary Tree Examples • Arithmetic expression binary tree • internal nodes: operators • external nodes: operands • Example: arithmetic expression tree for the expression (2 × (a - 1) + (3 × b)) • Decision tree • internal nodes: questions with yes/no answer • external nodes: decisions • Example: dining decision • [Figure: expression tree with root +, whose children are × (over 2 and a - 1) and × (over 3 and b); decision tree with root question "Want a fast meal?", internal questions "How about coffee?" and "On expense account?", and decisions Starbucks, Spike's, Al Forno, Café Paragon]

  34. Properties of Binary Trees • Notation • n: number of nodes • e: number of external nodes • i: number of internal nodes • h: height • Properties: • e = i + 1 • n = 2e - 1 • h ≤ i • h ≤ (n - 1)/2 • h + 1 ≤ e ≤ 2^h • h ≥ log2 e • h ≥ log2 (n + 1) - 1

  35. BinaryTree ADT • The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT • Additional methods: • position leftChild(p) • position rightChild(p) • position sibling(p) • Update methods may be defined by data structures implementing the BinaryTree ADT

  36. Inorder Traversal • In an inorder traversal a node is visited after its left subtree and before its right subtree • Time efficiency is O(n) • Application: draw a binary tree • x(v) = inorder rank of v • y(v) = depth of v
      Algorithm inOrder(v)
        if isInternal(v)
          inOrder(leftChild(v))
        visit(v)
        if isInternal(v)
          inOrder(rightChild(v))
      [Figure: binary tree whose nodes are labeled with their inorder ranks 1 through 9, with 6 at the root]

  37. Print Arithmetic Expressions • Specialization of an inorder traversal • print operand or operator when visiting node • print "(" before traversing left subtree • print ")" after traversing right subtree • Example output: ((2 × (a - 1)) + (3 × b))
      Algorithm printExpression(v)
        if isInternal(v)
          print("(")
          printExpression(leftChild(v))
        print(v.element())
        if isInternal(v)
          printExpression(rightChild(v))
          print(")")
      [Figure: expression tree with root +, whose children are × (over 2 and a - 1) and × (over 3 and b)]

  38. Evaluate Arithmetic Expressions • Specialization of a postorder traversal • recursive method returning the value of a subtree • when visiting an internal node, combine the values of the subtrees
      Algorithm evalExpr(v)
        if isExternal(v)
          return v.element()
        else
          x ← evalExpr(leftChild(v))
          y ← evalExpr(rightChild(v))
          ◊ ← operator stored at v
          return x ◊ y
      [Figure: expression tree with root +, whose children are × (over 2 and 5 - 1) and × (over 3 and 2)]
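The printing and evaluation algorithms for expression trees can be sketched together in Java. The node class ExprNodeSketch is hypothetical: operators and integer operands are both stored as strings, and only the four basic operators are handled.

```java
// Illustrative expression tree node; the class and its encoding are assumptions.
class ExprNodeSketch {
    final String elem;            // operator symbol, or an integer literal at a leaf
    final ExprNodeSketch left;
    final ExprNodeSketch right;

    ExprNodeSketch(String operand) {
        this(operand, null, null);
    }

    ExprNodeSketch(String operator, ExprNodeSketch left, ExprNodeSketch right) {
        this.elem = operator;
        this.left = left;
        this.right = right;
    }

    boolean isInternal() {
        return left != null;
    }

    // Inorder specialization: parenthesize every internal node.
    static String print(ExprNodeSketch v) {
        if (!v.isInternal()) {
            return v.elem;
        }
        return "(" + print(v.left) + " " + v.elem + " " + print(v.right) + ")";
    }

    // Postorder specialization: evaluate both subtrees, then combine at v.
    static int eval(ExprNodeSketch v) {
        if (!v.isInternal()) {
            return Integer.parseInt(v.elem);
        }
        int x = eval(v.left);
        int y = eval(v.right);
        switch (v.elem) {
            case "+": return x + y;
            case "-": return x - y;
            case "*": return x * y;
            case "/": return x / y;
            default: throw new IllegalArgumentException("Unknown operator: " + v.elem);
        }
    }
}
```

For the tree of (2 × (5 - 1)) + (3 × 2), written with "*" for multiplication, print yields ((2 * (5 - 1)) + (3 * 2)) and eval yields 14.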

  39. Euler Tour Traversal • Generic traversal of a binary tree • Includes as special cases the preorder, postorder and inorder traversals • Walk around the tree and visit each node three times: • on the left (preorder) • from below (inorder) • on the right (postorder) • [Figure: Euler tour around the expression tree for (2 × (5 - 1)) + (3 × 2), with the left (L), below (B) and right (R) visits marked on one node]

  40. Template Method Pattern • Generic algorithm that can be specialized by redefining certain steps • Implemented by means of an abstract Java class • Visit methods that can be redefined by subclasses • Template method eulerTour • Recursively called on the left and right children • A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour
      public abstract class EulerTour {
        protected BinaryTree tree;
        protected void visitExternal(Position p, Result r) { }
        protected void visitLeft(Position p, Result r) { }
        protected void visitBelow(Position p, Result r) { }
        protected void visitRight(Position p, Result r) { }
        protected Object eulerTour(Position p) {
          Result r = new Result();
          if (tree.isExternal(p)) { visitExternal(p, r); }
          else {
            visitLeft(p, r);
            r.leftResult = eulerTour(tree.leftChild(p));
            visitBelow(p, r);
            r.rightResult = eulerTour(tree.rightChild(p));
            visitRight(p, r);
          }
          return r.finalResult;   // returned for external and internal nodes alike
        }
        …

  41. Specializations of EulerTour • We show how to specialize class EulerTour to evaluate an arithmetic expression • Assumptions • External nodes store Integer objects • Internal nodes store Operator objects supporting method operation(Integer, Integer)
      public class EvaluateExpression extends EulerTour {
        protected void visitExternal(Position p, Result r) {
          r.finalResult = (Integer) p.element();
        }
        protected void visitRight(Position p, Result r) {
          Operator op = (Operator) p.element();
          r.finalResult = op.operation((Integer) r.leftResult, (Integer) r.rightResult);
        }
        …
      }
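For readers who want to run the pattern without the course's BinaryTree, Position and Operator types, here is a self-contained sketch with a second specialization that prints a fully parenthesized expression. Everything in it (EulerTourSketch, its nested Node and Result classes, PrintExpressionSketch) is a stand-in invented for this example.

```java
// Illustrative, self-contained version of the template method; all names are assumptions.
abstract class EulerTourSketch {
    static final class Node {
        final Object elem;
        final Node left;
        final Node right;
        Node(Object elem) { this(elem, null, null); }
        Node(Object elem, Node left, Node right) {
            this.elem = elem;
            this.left = left;
            this.right = right;
        }
        boolean isExternal() { return left == null; }
    }

    static final class Result {
        Object leftResult;
        Object rightResult;
        Object finalResult;
    }

    // Hooks: subclasses override only the visits they care about.
    protected void visitExternal(Node p, Result r) { }
    protected void visitLeft(Node p, Result r) { }
    protected void visitBelow(Node p, Result r) { }
    protected void visitRight(Node p, Result r) { }

    // Template method: fixes the order of the visits and the recursion.
    protected Object eulerTour(Node p) {
        Result r = new Result();
        if (p.isExternal()) {
            visitExternal(p, r);
        } else {
            visitLeft(p, r);
            r.leftResult = eulerTour(p.left);
            visitBelow(p, r);
            r.rightResult = eulerTour(p.right);
            visitRight(p, r);
        }
        return r.finalResult;
    }
}

// Prints "(" on the left visit, the operator from below, and ")" on the right visit.
class PrintExpressionSketch extends EulerTourSketch {
    private final StringBuilder out = new StringBuilder();

    @Override protected void visitExternal(Node p, Result r) { out.append(p.elem); }
    @Override protected void visitLeft(Node p, Result r) { out.append("("); }
    @Override protected void visitBelow(Node p, Result r) { out.append(" ").append(p.elem).append(" "); }
    @Override protected void visitRight(Node p, Result r) { out.append(")"); }

    String print(Node root) {
        out.setLength(0);
        eulerTour(root);
        return out.toString();
    }
}
```

The subclass never touches the recursion; it only fills in the four hooks, which is the point of the pattern.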

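  42. Data Structure for Trees • A node is represented by an object storing • Element • Parent node • Sequence of children nodes • Node objects implement the Position ADT • [Figure: tree with nodes A through F rooted at B, and its linked representation, in which each node object references its element, its parent and a sequence of its children]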

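  43. Data Structure for Binary Trees • A node is represented by an object storing • Element • Parent node • Left child node • Right child node • Node objects implement the Position ADT • [Figure: binary tree with nodes A through E rooted at B, and its linked representation, in which each node object references its element, parent, left child and right child]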

  44. Vector-Based Binary Tree • Level numbering of nodes of T: p(v) • if v is the root of T, p(v) = 1 • if v is the left child of u, p(v) = 2p(u) • if v is the right child of u, p(v) = 2p(u) + 1 • Vector S stores the nodes of T, with each node v at rank p(v); the root is therefore at rank 1, the second position of S • Properties: Let n be the number of nodes of T, N be the size of the vector S, and pM be the maximum value of p(v) over all the nodes of T • N = pM + 1 • In the worst case, N = 2^((n+1)/2)
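The level-numbering rules reduce navigation to index arithmetic. A small sketch follows, with a hypothetical class name and a plain array standing in for the vector S; rank 0 is left unused and a null cell means "no node at this level number".

```java
// Illustrative index arithmetic for level numbering; names are assumptions.
class LevelNumberingSketch {
    static int leftChild(int p) { return 2 * p; }

    static int rightChild(int p) { return 2 * p + 1; }

    static int parent(int p) { return p / 2; }   // integer division; meaningful for p > 1

    // Sibling of a non-root node: flip between 2k and 2k + 1.
    static int sibling(int p) { return (p % 2 == 0) ? p + 1 : p - 1; }

    // True when level number p is inside the array and holds a node.
    static boolean hasNode(Object[] s, int p) {
        return p >= 1 && p < s.length && s[p] != null;
    }
}
```

As an example, take S = [null, A, B, C, null, null, D, E]: A is the root, B and C are its children, and D and E are the children of C at ranks 6 and 7. Here n = 5 and pM = 7, so N = 8, which matches 2^((n+1)/2) for this lopsided shape.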

  45. Java Implementation • Tree interface • BinaryTree interface extending Tree • Classes implementing Tree and BinaryTree and providing • Constructors • Update methods • Print methods • Examples of updates for binary trees • expandExternal(v) • removeAboveExternal(w) • [Figure: expandExternal(v) turns the external node v into an internal node with two new external children; removeAboveExternal(w) removes the external node w together with its parent]

  46. Trees in JDSL • JDSL is the Library of Data Structures in Java • Tree interfaces in JDSL • InspectableBinaryTree • InspectableTree • BinaryTree • Tree • Inspectable versions of the interfaces do not have update methods • Tree classes in JDSL • NodeBinaryTree • NodeTree • JDSL was developed at Brown's Center for Geometric Computing • See the JDSL documentation and tutorials at http://jdsl.org • [Figure: inheritance diagram relating InspectableTree, Tree, InspectableBinaryTree and BinaryTree]

  47. Priority Queue ADT • A priority queue stores a collection of items • An item is a pair (key, element) • Main methods of the Priority Queue ADT • insertItem(k, o): inserts an item with key k and element o • removeMin(): removes the item with smallest key and returns its element • Additional methods • minKey(): returns, but does not remove, the smallest key of an item • minElement(): returns, but does not remove, the element of an item with smallest key • size(), isEmpty() • Applications: • Standby flyers • Auctions • Stock market

  48. Total Order Relation • Keys in a priority queue can be arbitrary objects on which an order is defined • Two distinct items in a priority queue can have the same key • Mathematical concept of total order relation ≤ • Reflexive property: x ≤ x • Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y • Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

  49. Comparator ADT • A comparator encapsulates the action of comparing two objects according to a given total order relation • A generic priority queue uses an auxiliary comparator • The comparator is external to the keys being compared • When the priority queue needs to compare two keys, it uses its comparator • Methods of the Comparator ADT, all with Boolean return type • isLessThan(x, y) • isLessThanOrEqualTo(x, y) • isEqualTo(x, y) • isGreaterThan(x, y) • isGreaterThanOrEqualTo(x, y) • isComparable(x)

  50. Sorting with a Priority Queue • We can use a priority queue to sort a set of comparable elements • Insert the elements one by one with a series of insertItem(e, e) operations • Remove the elements in sorted order with a series of removeMin() operations • The running time of this sorting method depends on the priority queue implementation
      Algorithm PQ-Sort(S, C)
        Input: sequence S, comparator C for the elements of S
        Output: sequence S sorted in increasing order according to C
        P ← priority queue with comparator C
        while ¬S.isEmpty()
          e ← S.remove(S.first())
          P.insertItem(e, e)
        while ¬P.isEmpty()
          e ← P.removeMin()
          S.insertLast(e)
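PQ-Sort can be tried out with the standard library, using java.util.PriorityQueue as the priority queue and java.util.Comparator as the comparator. This is a sketch under those substitutions, not the course's own classes; the name PQSortSketch is hypothetical, and elements are removed from the end of the list rather than the front, which does not affect the result.

```java
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative PQ-Sort on top of java.util classes; the class name is an assumption.
class PQSortSketch {
    // Sorts s in place in increasing order according to c.
    static <E> void pqSort(List<E> s, Comparator<? super E> c) {
        PriorityQueue<E> p = new PriorityQueue<>(c);

        // Phase 1: move every element of s into the priority queue.
        while (!s.isEmpty()) {
            p.add(s.remove(s.size() - 1));   // removing the last element is cheap for ArrayList
        }

        // Phase 2: repeatedly remove the minimum and append it to s.
        while (!p.isEmpty()) {
            s.add(p.poll());
        }
    }
}
```

Because java.util.PriorityQueue is heap-based, each add and poll takes O(log n) time and the whole sort runs in O(n log n); with a different priority queue implementation the same two loops would give a different running time.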
