**Data Structure & Algorithm** Lecture 5 Heap Sort & Binary Tree JJCAO

**Recitation** • Bubble Sort : O(n^2) • Insertion Sort : O(n^2) • Selection Sort: O(n^2) • Merge Sort: O(nlgn)

**Importance of Sorting** Why don’t CS profs ever stop talking about sorting? • Computers spend more time sorting than anything else, historically 25% on mainframes. • Sorting is the best studied problem in computer science, with a variety of different algorithms known. • Most of the interesting ideas we will encounter in the course can be taught in the context of sorting, such as divide-and-conquer, randomized algorithms, and lower bounds. You should have seen most of the algorithms - we will concentrate on the analysis.

**Efficiency of Sorting** • Sorting is important because once a set of items is sorted, many other problems become easy. • Large-scale data processing would be impossible if sorting took Ω(n^2) time.

**Heap Sort** • Running time: roughly n log n (like Merge Sort, unlike Insertion Sort) • In place (like Insertion Sort, unlike Merge Sort) • Uses a heap

**Binary Tree** Depth of a node: number of edges on the path to the root. Leaf: a node whose subtrees are empty.

**Implementing Binary Trees** Relationships (left). Finding the minimum (center) & maximum (right) elements

**Complete Binary Trees** • Height of a node: number of edges on the longest path to a leaf • Height of a tree = height of its root • Lemma: A complete binary tree of height h has 2^(h+1) - 1 nodes. Proof: By induction on h. h = 0: a single leaf, 2^1 - 1 = 1 node. h > 0: the tree consists of two complete trees of height h-1 plus the root. Total: (2^h - 1) + (2^h - 1) + 1 = 2^(h+1) - 1.
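The induction in the lemma can be checked numerically. The sketch below counts nodes using the same recursion as the proof (two subtrees of height h-1 plus the root) and compares the count with 2^(h+1) - 1; the function name `count_nodes` is only illustrative.

```python
def count_nodes(h: int) -> int:
    """Count the nodes of a complete binary tree of height h,
    following the recursion used in the proof."""
    if h == 0:
        return 1  # base case: a single leaf
    # two complete subtrees of height h - 1, plus the root
    return 2 * count_nodes(h - 1) + 1


# Compare the recursive count with the closed form for small heights.
for h in range(6):
    assert count_nodes(h) == 2 ** (h + 1) - 1
```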

**Complete Binary Trees** • A binary tree is complete if every internal node has exactly two children and all leaves are at the same depth:

**Almost Complete Binary Trees** An almost complete binary tree is a complete tree possibly missing some nodes on the right side of the bottom level:

**Almost Complete Binary Trees** X X √

**(Binary) Heaps - ADT** • An almost complete binary tree • Each node contains a key • Keys satisfy the heap property: each node’s key >= its children’s keys

**Max heap** Min heap

**Implementing Heaps by Arrays**
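A minimal sketch of the array layout, assuming the standard level-by-level storage (the slide's figure is not available here). With the 1-based indexing used in the pseudocode of these slides, the root is A[1], the children of node i are 2i and 2i+1, and its parent is floor(i/2). The Python sketch uses 0-based lists instead, so the formulas shift by one; the helper names are illustrative.

```python
def parent(i: int) -> int:
    return (i - 1) // 2


def left(i: int) -> int:
    return 2 * i + 1


def right(i: int) -> int:
    return 2 * i + 2


def is_max_heap(a: list) -> bool:
    """Check the heap property: every node's key >= its children's keys."""
    n = len(a)
    for i in range(n):
        for child in (left(i), right(i)):
            if child < n and a[i] < a[child]:
                return False
    return True


print(is_max_heap([16, 14, 10, 8, 7, 9, 3]))   # True
print(is_max_heap([16, 14, 10, 8, 15, 9, 3]))  # False: 15 is larger than its parent 14
```

Because the tree is almost complete, the array has no gaps, which is why no pointers are needed.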

**Heapify Example** Heapify(A,i) – fix Heap properties given a violation at position i
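One possible way to write Heapify, as a sketch: it assumes a max-heap stored in a 0-based Python list, and it assumes the subtrees below position i already satisfy the heap property, so only the key at i may be out of place. The extra `heap_size` parameter is an assumption of this sketch; it stands in for heap-size[A].

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return  # heap property holds at i
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue fixing the subtree we swapped into


a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]  # violation at index 1 (key 4)
heapify(a, 1, len(a))
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

Each iteration moves one level down the tree, which is where the cost proportional to the node's height comes from.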

**Heapify** Heapify on a node of height h takes roughly d·h steps. The height of the tree is log n, so Heapify on the root node takes about d·log n steps.

**Build-Heap** (After Build-Heap, A[1] stores the max element) • We make about n/2 calls to Heapify • Each call to Heapify costs at most d·log n • Total: at most d·(n/2)·log n • But we can do better and show a cost of c·n, i.e. a total running time linear in n.
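A sketch of Build-Heap under the same assumptions as before (max-heap, 0-based Python list). It calls Heapify on every non-leaf position, from the last one up to the root, which gives the roughly n/2 calls counted above. `heapify` is repeated so the block runs on its own.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap(a: list) -> None:
    """Rearrange a in place into a max-heap."""
    n = len(a)
    # Positions n//2 .. n-1 are leaves, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)


a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_heap(a)
print(a[0])  # 16, the maximum (A[1] in the slides' 1-based indexing)
```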

**Build-Heap - Running Time** N=n
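A sketch of the usual argument for the linear bound, since the slide's own derivation is not reproduced here. It assumes two standard facts that are not stated in these slides: an n-element heap has at most ⌈n/2^(h+1)⌉ nodes of height h, and ⌈x⌉ <= 2x whenever x >= 1/2. Together with the cost d·h for Heapify on a node of height h, this gives:

```latex
\begin{aligned}
T(n) &\le \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil d\,h \\
     &\le d\,n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \\
     &= 2\,d\,n
\end{aligned}
```

So under these assumptions the constant c in the bound c·n can be taken as 2d: most nodes sit near the bottom of the tree, where Heapify is cheap.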

**Heap-Sort** Build-Heap costs O(n); the main loop runs O(n) times; each Heapify call inside the loop costs O(lg n). Running time: at most d·n·lg n for some d > 0.
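The slide's pseudocode did not survive extraction, so the following is a sketch of the standard in-place heapsort rather than a reconstruction of it. It assumes a max-heap in a 0-based Python list; the comments mark where the O(n), O(n) and O(lg n) costs arise.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(a: list) -> None:
    """Sort a in place in ascending order."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # Build-Heap: O(n) in total
        heapify(a, i, n)
    for end in range(n - 1, 0, -1):      # O(n) iterations
        a[0], a[end] = a[end], a[0]      # move the current max to its final slot
        heapify(a, 0, end)               # restore the heap on a[0:end]: O(lg n)


data = [5, 2, 9, 1, 5, 6]
heap_sort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```

No second array is allocated: the sorted suffix grows at the end of the same list while the heap shrinks at the front.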

**Several Sort Algorithms** http://www.sorting-algorithms.com

**Heapsort** • Heapsort is an excellent algorithm, but a good implementation of quicksort usually beats it in practice. • Nevertheless, the heap data structure itself has many uses: • Priority queue (most popular)

**Review - What is a Heap?** • an almost complete tree-like structure • usually based on an array • fast access to the largest (or smallest) data item.

**Priority Queue ADT** Priority Queue: a set of elements S, each with a key. Operations: • insert(S,x) - insert element x into S, i.e. S <- S ∪ {x} • max(S) - return the element of S with the largest key • extract-max(S) - remove and return the element of S with the largest key

**Implementing PQs by Heaps**

    Heap-Maximum(A)
        if heap-size[A] >= 1
            return A[1]

Running time: constant

**Heap Extract-Max**

    Heap-Extract-Max(A)
        if heap-size[A] < 1
            error "heap underflow"
        max <- A[1]
        A[1] <- A[heap-size[A]]
        heap-size[A] <- heap-size[A] - 1
        Heapify(A, 1)
        return max

Running time: d·lg n + c <= d'·lg n for a suitable constant d', when heap-size[A] = n
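A Python sketch of the same steps, assuming a max-heap in a 0-based list whose length plays the role of heap-size[A]. Python's `list.pop()` does the "shrink by one" step; `heapify` is included so the block runs on its own.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_extract_max(a: list):
    """Remove and return the largest key of the max-heap a."""
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]
    last = a.pop()  # remove the last slot, shrinking the heap by one
    if a:           # if the heap is not empty, the old last key becomes the root
        a[0] = last
        heapify(a, 0, len(a))
    return maximum


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(heap_extract_max(heap))  # 16
print(heap[0])                 # 14, the new maximum
```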

**Heap Insert**

**Heap-Insert**

    Heap-Insert(A, key)
        heap-size[A] <- heap-size[A] + 1
        i <- heap-size[A]
        while i > 1 and A[parent(i)] < key
            A[i] <- A[parent(i)]
            i <- parent(i)
        A[i] <- key

Running time: d·lg n when heap-size[A] = n
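The same procedure as a Python sketch, assuming a max-heap in a 0-based list. With 0-based indices the root is position 0 and the parent of i is (i - 1) // 2, so the loop stops when i reaches 0.

```python
def heap_insert(a: list, key) -> None:
    """Insert key into the max-heap a."""
    a.append(key)  # grow the heap by one slot
    i = len(a) - 1
    # Shift smaller ancestors down until key's position is found.
    while i > 0 and a[(i - 1) // 2] < key:
        a[i] = a[(i - 1) // 2]
        i = (i - 1) // 2
    a[i] = key


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_insert(heap, 15)
print(heap)  # [16, 15, 10, 8, 14, 9, 3, 2, 4, 1, 7]
```

The key travels along a single leaf-to-root path, so the number of moves is bounded by the height of the tree.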

**Priority Queue Sorting**

    PQ-Sort(A)
        S <- ∅
        for i <- 1 to n                     // O(n) iterations
            Heap-Insert(S, A[i])            // O(lg n), more precisely O(lg(S.size))
        for i <- n downto 1                 // O(n) iterations
            SortedA[i] <- Extract-Max(S)    // O(lg n)

Running time: at most d·n·lg n for some d > 0
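A self-contained sketch of PQ-Sort in Python, assuming a max-heap in a 0-based list. The helper names mirror the slides but the implementations are illustrative. Unlike heapsort, this version is not in place: it builds a separate heap S and a separate output list.

```python
def heap_insert(heap: list, key) -> None:
    """Insert key into the max-heap."""
    heap.append(key)
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < key:
        heap[i] = heap[(i - 1) // 2]
        i = (i - 1) // 2
    heap[i] = key


def heap_extract_max(heap: list):
    """Remove and return the largest key of the max-heap."""
    if not heap:
        raise IndexError("heap underflow")
    maximum = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:  # sift the new root down
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < n and heap[l] > heap[largest]:
                largest = l
            if r < n and heap[r] > heap[largest]:
                largest = r
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return maximum


def pq_sort(a: list) -> list:
    """Return a new list with the elements of a in ascending order."""
    s = []
    for x in a:                          # n inserts, O(lg n) each
        heap_insert(s, x)
    out = [None] * len(a)
    for i in range(len(a) - 1, -1, -1):  # fill from the right: largest first
        out[i] = heap_extract_max(s)     # O(lg n) each
    return out


print(pq_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```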

**Comparison of Special-Purpose Structures** A max-priority queue can be used to schedule jobs on a shared computer. The queue keeps track of the jobs to be performed and their relative priorities. When a job is finished or interrupted, the scheduler selects the highest-priority job from among those pending by calling EXTRACT-MAX. The scheduler can add a new job to the queue at any time by calling INSERT.
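A small sketch of such a scheduler, built on Python's standard `heapq` module rather than the heap routines from these slides. `heapq` is a min-heap, so priorities are stored negated to get max-priority behaviour; that trick, the class name `JobScheduler`, and the job names are all assumptions of this example.

```python
import heapq
import itertools


class JobScheduler:
    """Max-priority queue of jobs on top of heapq (a min-heap)."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier jobs first

    def insert(self, name: str, priority: int) -> None:
        # Negate the priority so the largest priority is popped first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def extract_max(self) -> str:
        if not self._heap:
            raise IndexError("no pending jobs")
        _, _, name = heapq.heappop(self._heap)
        return name


sched = JobScheduler()
sched.insert("backup", 1)
sched.insert("compile", 5)
sched.insert("email", 3)
print(sched.extract_max())  # compile
sched.insert("interrupt", 9)  # a new job can arrive at any time
print(sched.extract_max())  # interrupt
print(sched.extract_max())  # email
```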

**Homework 4** • Hw04-GuiQtScribble • Deadline: 22:00, Oct. ?, 2011