CS 253: Algorithms

CS 253: Algorithms Chapter 8 Sorting in Linear Time Credit: Dr. George Bebis

How Fast Can We Sort?

How Fast Can We Sort? • Insertion sort: O(n2) • Bubble Sort, Selection Sort: (n2) • Merge sort: (nlgn) • Quicksort: (nlgn) - average • What is common to all these algorithms? • They all sort by making comparisons between the input elements

Comparison Sorts • Comparison sorts use comparisons between elements to gain information about an input sequence a1, a2, …, an • Perform tests: ai < aj, ai≤aj, ai = aj, ai≥aj, orai>aj to determine the relative order of aiandaj • For simplicity, assume that all the elements are distinct

Lower-Bound for Sorting Theorem: To sort n elements, comparison sorts must make (nlgn)comparisons in the worst case.

node leaf: Decision Tree Model • Represents the comparisons made by a sorting algorithm on an input of a given size. • Models all possible execution traces • Control, data movement, other operations are ignored • Count only the comparisons

Worst-case number of comparisons? • Worst-case number of comparisons depends on: • the length of the longest path from the root to a leaf (i.e., the height of the decision tree)

4 1 3 2 16 9 10 Lemma Any binary tree of height hhas at most 2h leaves Proof:by induction on h Basis:h = 0 tree has one node, which is a leaf # of Leaves = 1 ≤ 20 (TRUE) Inductive step:assume true for h-1 (i.e. #Leaves ≤ 2h-1) • Extend the height of the tree with one more level • Each leaf becomes parent to two new leaves No. of leaves at level h = 2  (no. of leaves at level h-1) ≤ 2  2h-1 ≤ 2h h-1 h

What is the least number of leaves in a Decision Tree Model? • Allpermutations on n elements must appear as one of the leaves in the decision tree: n! permutations • At least n! leaves

Lower Bound for Comparison Sorts Theorem:Any comparison sort algorithm requires (nlgn)comparisons in the worst case. Proof: How many leaves does the tree have? • At least n! (each of the n!permutations must appear as a leaf) • There are at most 2hleaves (by the previous Lemma)  n! ≤ 2h  h ≥ lg(n!) = (nlgn) (see next slide) h leaves Exercise 8.1-1: What is the smallest possible depth of a leaf in a decision tree for a comparison sort?

lg(n!) = (nlgn) • n! ≤ nn  lg(n!) ≤ nlgn lg(n!) = O(nlgn) • n! ≥ 2n  lg(n!) ≥ nlg2=n  lg(n!) = Ω(n)  n ≤lg(n!) ≤ nlgn • We need a tighter lower bound! • UseStirling’s approximation (3.18):

Counting Sort • Assumptions: • Sort n integers which are in the range [0 ... r] • r is in the order of n, that is, r=O(n) • Idea: • For each element x, find the number of elements ≤ x • Place x into its correct position in the output array ( ) output array

Step 1 Find the number of times A[i] appears in A Allocate C[1..r] (histogram) For 1 ≤ i ≤ n, ++C[A[i]} (i.e., frequencies/histogram)

Step 2 Find the number of elements ≤ A[i] (i.e. cumulative sums)

Algorithm • Start from the last element of A • Place A[i] at its correct place in the output array • Decrease C[A[i]] by one

A 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 C C C C C C 2 1 1 2 2 1 2 2 0 2 2 2 4 4 4 2 3 4 3 5 6 6 5 7 7 7 7 0 7 7 8 8 8 1 8 8 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 2 0 0 0 5 3 0 2 2 3 3 3 3 3 3 0 3 3 B B B B Example (cumulative sums) (frequencies)

A 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 C C C 0 0 0 2 2 2 3 3 3 4 4 5 7 7 7 8 7 8 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 2 0 0 0 0 0 0 0 0 5 2 3 2 2 0 2 2 3 2 3 3 3 3 3 3 3 3 3 0 3 3 5 5 3 B B B B Example (cont.)

j 1 n Alg.: COUNTING-SORT(A, B, n, r) • for i ← 0to r • do C[ i ] ← 0 • for j ← 1to n • do C[A[ j ]] ← C[A[ j ]] + 1 % C[i] contains the number of elements =i ; frequencies • for i ← 1to r • do C[ i ] ← C[ i ] + C[i -1] % C[i] contains the number of elements ≤i ; cumulative sum • for j ← ndownto1 • do B[C[A[ j ]]] ← A[ j ] % B[.] contains sorted array • C[A[ j ]] ← C[A[ j ]] – 1 A 0 r C 1 n B

Analysis of Counting Sort Alg.: COUNTING-SORT(A, B, n, k) • for i ← 0to r • do C[ i ] ← 0 • for j ← 1to n • do C[A[ j ]] ← C[A[ j ]] + 1 • for i ← 1to r • do C[ i ] ← C[ i ] + C[i -1] • for j ← ndownto1 • do B[C[A[ j ]]] ← A[ j ] • C[A[ j ]] ← C[A[ j ]] – 1 (r) (n) (r) (n) Overall time: (n + r)

Analysis of Counting Sort • Overall time: (n + r) • In practice we use COUNTING sort when r = O(n)  running time is (n) • Counting sort is stable • Counting sort is notin place sort

Radix Sort • Represents keys as d-digit numbers in some base-k e.g. key = x1x2...xdwhere 0 ≤ xi ≤ k-1 • Example: key=15 key10 = 15, d=2, k=10 where 0 ≤ xi ≤ 9 key2 = 1111, d=4, k=2 where 0 ≤ xi ≤ 1

Radix Sort • 326 • 453 • 608 • 835 • 751 • 435 • 704 • 690 • Assumptions: d=Θ(1) and k =O(n) • Sorting looks at one column at a time • For a d digit number, sort the least significant digit first • Continue sorting on the next least significant digit, until all digits have been sorted • Requires only d passes through the list

Radix Sort Alg.: RADIX-SORT(A, d) for i ← 1 to d do use a stable sort to sort array A on digit i • 1 is the lowest order digit, d is the highest-order digit How do things go wrong if an unstable sorting alg. is used?

Analysis of Radix Sort • Given n numbers of d digits each, where each digit may take up to k possible values, RADIX-SORT correctly sorts the numbers in (d(n+k)) • One pass of sorting per digit takes (n+k) assuming that we use counting sort • There are d passes (for each digit)  (d(n+k)) Since d=Θ(1) and k =O(n) Therefore, Radix Sort runs in (n) time

Conclusions • In the worst case, any comparison sort will take at least nlgnto sort an array of n numbers • We can achieve a O(n) running time for sorting if we can make certain assumptions on the input data: • Counting sort: each of the n input elements is an integer in the range [0 ... r]and r=O(n) • Radix sort:the elements in the input are integers represented with d digits in base-k, where d=Θ(1) and k =O(n)

Problem You are given 5 distinct numbers to sort. Describe an algorithm which sorts them using at most 6comparisons, or argue that no such algorithm exists. Solution: Total # of leaves in the comparison tree = 5! If the height of the tree is h, then (total # of leaves ≤ 2h)  2h ≥ 5! h ≥ log2(5!) ≥ log2120 h > 6  There is at least one input permutation which will require at least 7 comparisons to sort. Therefore, no such algorithm exists.

CS 253: Algorithms

CS 253: Algorithms

Presentation Transcript

An Overview of Pitch Detection Algorithms

Greedy Algorithms And Genome Rearrangements

4 Greedy Algorithms

MAP Estimation Algorithms in

Matrix Multiplication and Graph Algorithms

BackTracking Algorithms

Genetic Algorithms

Text Retrieval Algorithms

Combinatorial Algorithms

Divide-and-conquer algorithms

Parallel Algorithms

UMass Lowell Computer Science 91.404 Analysis of Algorithms Prof. Karen Daniels Spring, 2001

Cache Based Iterative Algorithms

Multiple Sequence Alignment (MSA)

Data Structures and Algorithms

Algorithms and Data Structures (CSC112)

Combinatorial Algorithms

Genetic Algorithms

461191 Discrete Mathematics Lecture 3: Algorithms, The Integers, and Matrices

Router Design

Data Structures and Algorithms

Semi-feasible Algorithms Hem-Ogi Chapter 3