 Download Download Presentation CS 253: Algorithms

# CS 253: Algorithms

Download Presentation ## CS 253: Algorithms

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. CS 253: Algorithms Chapter 8 Sorting in Linear Time Credit: Dr. George Bebis

2. How Fast Can We Sort?

3. How Fast Can We Sort? • Insertion sort: O(n2) • Bubble Sort, Selection Sort: (n2) • Merge sort: (nlgn) • Quicksort: (nlgn) - average • What is common to all these algorithms? • They all sort by making comparisons between the input elements

4. Comparison Sorts • Comparison sorts use comparisons between elements to gain information about an input sequence a1, a2, …, an • Perform tests: ai < aj, ai≤aj, ai = aj, ai≥aj, orai>aj to determine the relative order of aiandaj • For simplicity, assume that all the elements are distinct

5. Lower-Bound for Sorting Theorem: To sort n elements, comparison sorts must make (nlgn)comparisons in the worst case.

6. node leaf: Decision Tree Model • Represents the comparisons made by a sorting algorithm on an input of a given size. • Models all possible execution traces • Control, data movement, other operations are ignored • Count only the comparisons

7. Worst-case number of comparisons? • Worst-case number of comparisons depends on: • the length of the longest path from the root to a leaf (i.e., the height of the decision tree)

8. 4 1 3 2 16 9 10 Lemma Any binary tree of height hhas at most 2h leaves Proof:by induction on h Basis:h = 0 tree has one node, which is a leaf # of Leaves = 1 ≤ 20 (TRUE) Inductive step:assume true for h-1 (i.e. #Leaves ≤ 2h-1) • Extend the height of the tree with one more level • Each leaf becomes parent to two new leaves No. of leaves at level h = 2  (no. of leaves at level h-1) ≤ 2  2h-1 ≤ 2h h-1 h

9. What is the least number of leaves in a Decision Tree Model? • Allpermutations on n elements must appear as one of the leaves in the decision tree: n! permutations • At least n! leaves

10. Lower Bound for Comparison Sorts Theorem:Any comparison sort algorithm requires (nlgn)comparisons in the worst case. Proof: How many leaves does the tree have? • At least n! (each of the n!permutations must appear as a leaf) • There are at most 2hleaves (by the previous Lemma)  n! ≤ 2h  h ≥ lg(n!) = (nlgn) (see next slide) h leaves Exercise 8.1-1: What is the smallest possible depth of a leaf in a decision tree for a comparison sort?

11. lg(n!) = (nlgn) • n! ≤ nn  lg(n!) ≤ nlgn lg(n!) = O(nlgn) • n! ≥ 2n  lg(n!) ≥ nlg2=n  lg(n!) = Ω(n)  n ≤lg(n!) ≤ nlgn • We need a tighter lower bound! • UseStirling’s approximation (3.18):

12. Counting Sort • Assumptions: • Sort n integers which are in the range [0 ... r] • r is in the order of n, that is, r=O(n) • Idea: • For each element x, find the number of elements ≤ x • Place x into its correct position in the output array ( ) output array

13. Step 1 Find the number of times A[i] appears in A Allocate C[1..r] (histogram) For 1 ≤ i ≤ n, ++C[A[i]} (i.e., frequencies/histogram)

14. Step 2 Find the number of elements ≤ A[i] (i.e. cumulative sums)

15. Algorithm • Start from the last element of A • Place A[i] at its correct place in the output array • Decrease C[A[i]] by one

16. A 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 C C C C C C 2 1 1 2 2 1 2 2 0 2 2 2 4 4 4 2 3 4 3 5 6 6 5 7 7 7 7 0 7 7 8 8 8 1 8 8 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 2 0 0 0 5 3 0 2 2 3 3 3 3 3 3 0 3 3 B B B B Example (cumulative sums) (frequencies)

17. A 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 C C C 0 0 0 2 2 2 3 3 3 4 4 5 7 7 7 8 7 8 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 2 0 0 0 0 0 0 0 0 5 2 3 2 2 0 2 2 3 2 3 3 3 3 3 3 3 3 3 0 3 3 5 5 3 B B B B Example (cont.)

18. j 1 n Alg.: COUNTING-SORT(A, B, n, r) • for i ← 0to r • do C[ i ] ← 0 • for j ← 1to n • do C[A[ j ]] ← C[A[ j ]] + 1 % C[i] contains the number of elements =i ; frequencies • for i ← 1to r • do C[ i ] ← C[ i ] + C[i -1] % C[i] contains the number of elements ≤i ; cumulative sum • for j ← ndownto1 • do B[C[A[ j ]]] ← A[ j ] % B[.] contains sorted array • C[A[ j ]] ← C[A[ j ]] – 1 A 0 r C 1 n B

19. Analysis of Counting Sort Alg.: COUNTING-SORT(A, B, n, k) • for i ← 0to r • do C[ i ] ← 0 • for j ← 1to n • do C[A[ j ]] ← C[A[ j ]] + 1 • for i ← 1to r • do C[ i ] ← C[ i ] + C[i -1] • for j ← ndownto1 • do B[C[A[ j ]]] ← A[ j ] • C[A[ j ]] ← C[A[ j ]] – 1 (r) (n) (r) (n) Overall time: (n + r)

20. Analysis of Counting Sort • Overall time: (n + r) • In practice we use COUNTING sort when r = O(n)  running time is (n) • Counting sort is stable • Counting sort is notin place sort

21. Radix Sort • Represents keys as d-digit numbers in some base-k e.g. key = x1x2...xdwhere 0 ≤ xi ≤ k-1 • Example: key=15 key10 = 15, d=2, k=10 where 0 ≤ xi ≤ 9 key2 = 1111, d=4, k=2 where 0 ≤ xi ≤ 1

22. Radix Sort • 326 • 453 • 608 • 835 • 751 • 435 • 704 • 690 • Assumptions: d=Θ(1) and k =O(n) • Sorting looks at one column at a time • For a d digit number, sort the least significant digit first • Continue sorting on the next least significant digit, until all digits have been sorted • Requires only d passes through the list

23. Radix Sort Alg.: RADIX-SORT(A, d) for i ← 1 to d do use a stable sort to sort array A on digit i • 1 is the lowest order digit, d is the highest-order digit How do things go wrong if an unstable sorting alg. is used?

24. Analysis of Radix Sort • Given n numbers of d digits each, where each digit may take up to k possible values, RADIX-SORT correctly sorts the numbers in (d(n+k)) • One pass of sorting per digit takes (n+k) assuming that we use counting sort • There are d passes (for each digit)  (d(n+k)) Since d=Θ(1) and k =O(n) Therefore, Radix Sort runs in (n) time

25. Conclusions • In the worst case, any comparison sort will take at least nlgnto sort an array of n numbers • We can achieve a O(n) running time for sorting if we can make certain assumptions on the input data: • Counting sort: each of the n input elements is an integer in the range [0 ... r]and r=O(n) • Radix sort:the elements in the input are integers represented with d digits in base-k, where d=Θ(1) and k =O(n)

26. Problem You are given 5 distinct numbers to sort. Describe an algorithm which sorts them using at most 6comparisons, or argue that no such algorithm exists. Solution: Total # of leaves in the comparison tree = 5! If the height of the tree is h, then (total # of leaves ≤ 2h)  2h ≥ 5! h ≥ log2(5!) ≥ log2120 h > 6  There is at least one input permutation which will require at least 7 comparisons to sort. Therefore, no such algorithm exists.