
Divide & Conquer Algorithms



  1. Divide & Conquer Algorithms Part 2

  2. QuickSort • Worst-case time: Θ(n²) • Expected time: Θ(n lg n) • Constants in the expected time are small • Sorts in place

  3. QuickSort (cont) • DIVIDE – Partition A[p..r] into two subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is ≤ A[q] ≤ each element of A[q+1..r] • CONQUER – Sort the two subarrays by recursive calls to QuickSort • COMBINE – Since the subarrays are sorted in place, no work is needed to combine them: the entire array A[p..r] is sorted

  4. QuickSort (cont) To sort the entire array: QuickSort( A, 1, length(A) )
  QuickSort( A, p, r )
    if p < r
      q ← Partition( A, p, r )
      QuickSort( A, p, q-1 )
      QuickSort( A, q+1, r )

  5. QuickSort (cont)
  Partition( A, p, r )
    x ← A[ r ]
    i ← p - 1
    for j ← p to r-1
      if A[ j ] ≤ x
        i ← i + 1
        Exchange( A[ i ], A[ j ] )
    Exchange( A[ i+1 ], A[ r ] )
    return i+1
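The pseudocode above maps almost line-for-line onto Python; a minimal sketch (0-indexed, so the top-level call sorts A[0..len(A)-1] rather than A[1..length(A)]; function names are ours):

```python
def partition(A, p, r):
    """Partition A[p..r] around the pivot A[r]; return the pivot's final index."""
    x = A[r]                          # pivot: the last element of the subarray
    i = p - 1                         # boundary of the "<= pivot" region
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]   # place the pivot between the two regions
    return i + 1

def quicksort(A, p=0, r=None):
    """Sort A[p..r] in place (r defaults to the last index)."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)        # elements <= pivot
        quicksort(A, q + 1, r)        # elements >= pivot

A = [2, 8, 7, 1, 3, 5, 6, 4]          # the CLRS running example
quicksort(A)
print(A)  # → [1, 2, 3, 4, 5, 6, 7, 8]
```

On the example array the first call to partition picks pivot 4 and returns index 3, matching the "return i+1, which is 4" step of the 1-indexed slides.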

  6. QuickSort (cont)

  7. QuickSort (cont) Return i+1, which is 4 in this example

  8. Performance of QuickSort • Partition's running time is Θ(n) • The running time of QuickSort depends on how balanced the partitions are • If balanced, QuickSort is asymptotically as fast as MergeSort • If unbalanced, it is asymptotically as slow as Insertion Sort

  9. Performance of QuickSort (cont) • Worst-case Partitioning • Partitions are always of size n-1 and 0 • Occurs when the array is already sorted • Recurrence for the running time: T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n)

  10. Performance of QuickSort (cont) (Recursion tree for the worst case: the levels cost cn, c(n-1), c(n-2), …, c, an arithmetic series.) Total: Θ(n²)
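The quadratic count can be checked empirically by counting the line-4 comparisons on already-sorted input; a sketch using the slides' Partition with an added counter (the function name is ours):

```python
import random

def count_comparisons(A):
    """Run QuickSort on a copy of A and return the number of A[j] <= x tests."""
    count = 0
    def partition(A, p, r):
        nonlocal count
        x, i = A[r], p - 1
        for j in range(p, r):
            count += 1                # one comparison per for-loop iteration
            if A[j] <= x:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[r] = A[r], A[i + 1]
        return i + 1
    def sort(A, p, r):
        if p < r:
            q = partition(A, p, r)
            sort(A, p, q - 1)
            sort(A, q + 1, r)
    A = list(A)
    sort(A, 0, len(A) - 1)
    return count

n = 200
print(count_comparisons(range(n)))    # already sorted: n(n-1)/2 = 19900
random.seed(1)
print(count_comparisons(random.sample(range(n), n)))  # random order: far fewer, about 2n ln n on average
```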

  11. Performance of QuickSort (cont) • Best-case Partitioning • Partitions are always of size ⌊n/2⌋ and ⌈n/2⌉-1 • The recurrence for the running time: T(n) = 2T(n/2) + Θ(n) • Case 2 of the Master Method gives T(n) = Θ(n lg n)

  12. Performance of QuickSort (cont) • Balanced Partitioning • The average case is closer to the best case than to the worst case • Any split of constant proportionality (say 99-to-1) still gives a running time of Θ(n lg n) • The recurrence becomes T(n) = T(99n/100) + T(n/100) + cn • Because it yields a recursion tree of depth Θ(lg n), where the cost at each level is O(n) • See page 151 (new book) for the picture, or the next slide

  13. Performance of QuickSort (cont) (Recursion tree for the 99-to-1 split: every level costs at most cn; the shallow branch reaches T(1) at depth log₁₀₀ n, the deep branch at depth log₁₀₀/₉₉ n.) Total: Θ(n lg n)

  14. Performance of QuickSort (cont) • Intuition for the average case • The behavior depends on the relative ordering of the values • Not the values themselves • We will assume (for now) that all permutations are equally likely • Some splits will be balanced, and some will be unbalanced

  15. Performance of QuickSort (cont) • In a recursion tree for an average-case, the “good” and “bad” splits are distributed randomly throughout the tree • For our example, suppose • Bad splits and good splits alternate • Good splits are best-case splits • Bad splits are worst-case splits • Boundary case (subarray size 0) has cost of 1

  16. n n (n) (n) 0 n - 1 (n-1)/2 (n-1)/2 (n-1)/2-1 (n-1)/2 Performance of QuickSort (cont) • The (n-1) cost of the bad split can be absorbed into the (n) cost of the good split, and the resulting split is good • Thus the running time is (nlgn), but with a slightly larger constant

  17. Randomized QuickSort (1/30/2019) • How do we increase the chance that all permutations are equally likely? • Random Sampling • Don’t always use last element in subarray • Swap it with a randomly chosen element from the subarray • Pivot now is equally likely to be any of the r – p + 1 elements • We can now expect the split to be reasonably well-balanced on average

  18. Randomized QuickSort (cont)
  Randomized-Partition( A, p, r )
    i ← Random( p, r )
    Exchange( A[ r ], A[ i ] )
    return Partition( A, p, r )
  Note that Partition( ) is the same as before

  19. Randomized QuickSort (cont)
  Randomized-QuickSort( A, p, r )
    if p < r
      q ← Randomized-Partition( A, p, r )
      Randomized-QuickSort( A, p, q-1 )
      Randomized-QuickSort( A, q+1, r )
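A runnable sketch of the randomized variant, assuming Python's random.randint plays the role of Random(p, r) (both are inclusive of their endpoints); function names are ours:

```python
import random

def partition(A, p, r):
    """Standard Partition around pivot A[r], as on the earlier slide."""
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_partition(A, p, r):
    i = random.randint(p, r)          # Random(p, r): inclusive on both ends
    A[r], A[i] = A[i], A[r]           # move the randomly chosen pivot to position r
    return partition(A, p, r)         # Partition itself is unchanged

def randomized_quicksort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

A = [5, 2, 9, 1, 7, 3]
randomized_quicksort(A)
print(A)  # → [1, 2, 3, 5, 7, 9]
```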

  20. Analysis of QuickSort • A more rigorous analysis • Begin with the worst case • We intuited that the worst-case running time is Θ(n²) • Use the substitution method to show this is true

  21. Analysis of QuickSort (cont) • Guess: T(n) ≤ cn² for some constant c > 0 • Substitute into the recurrence T(n) = max over 0 ≤ q ≤ n-1 of ( T(q) + T(n-q-1) ) + Θ(n), giving T(n) ≤ c · max( q² + (n-q-1)² ) + Θ(n) • q² + (n-q-1)² is maximized at the endpoints of the range, so it is at most • (n-1)² = n² - 2n + 1

  22. Analysis of QuickSort (cont) • Problem 7.4-1 has you show that this recurrence also satisfies T(n) = Ω(n²) • Thus the worst-case running time of QuickSort is Θ(n²)

  23. Analysis of QuickSort (cont) • We will show that the upper bound on the expected running time is O(n lg n) • We've already shown that the best-case running time is Ω(n lg n) • Combined, these give an expected running time of Θ(n lg n)

  24. Analysis of QuickSort (cont) • Expected Running Time • The work done is dominated by Partition • Each time a pivot is selected, that element is never included in subsequent calls to QuickSort • And the pivot ends up in its correct place in the array • Therefore, at most n calls to Partition are made • Each call to Partition involves O(1) work plus the work done in the for loop • By counting the total number of times line 4 is executed, we can bound the time spent in the for loop • Line 4: if A[ j ] ≤ x

  25. Analysis of QuickSort (cont) • Lemma 7.1 • Let X be the number of comparisons performed in line 4 of Partition over the entire execution of QuickSort on an n-element array. Then the running time of QuickSort is O(n + X) • Proof: • There are at most n calls to Partition, each of which does O(1) work and then executes the for loop (which includes line 4) some number of times • Since the for loop executes line 4 exactly once per iteration, X counts the total number of for-loop iterations as well as the number of comparisons • Therefore T(n) = O(n · 1 + X) = O(n + X)
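Lemma 7.1's accounting (at most n Partition calls, plus one unit of work per line-4 comparison) can be observed directly with an instrumented version; a sketch, with counter and function names of our choosing:

```python
import random

def quicksort_instrumented(A):
    """Sort a copy of A; return (number of Partition calls, comparison count X)."""
    calls = X = 0
    def partition(p, r):
        nonlocal calls, X
        calls += 1
        x, i = B[r], p - 1
        for j in range(p, r):
            X += 1                    # line 4 executes exactly once per iteration
            if B[j] <= x:
                i += 1
                B[i], B[j] = B[j], B[i]
        B[i + 1], B[r] = B[r], B[i + 1]
        return i + 1
    def sort(p, r):
        if p < r:
            q = partition(p, r)
            sort(p, q - 1)
            sort(q + 1, r)
    B = list(A)
    sort(0, len(B) - 1)
    return calls, X

n = 1000
calls, X = quicksort_instrumented(random.sample(range(n), n))
print(calls <= n, X)    # at most n Partition calls; total work is O(n + X)
```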

  26. Analysis of QuickSort (cont) • We need to compute X • We do this by computing an overall bound on the total number of comparisons • NOT by computing the number of comparisons at each call to Partition • Definitions: • z1, z2, …, zn are the elements in the array • zi is the ith smallest element • Zij = { zi, zi+1, …, zj } is the set of elements between zi and zj, inclusive

  27. Analysis of QuickSort (cont) • When does the algorithm compare zi and zj? • Note: each pair of elements is compared at most once. Why? Elements are compared only with the pivot, and each pivot appears in at most one call to Partition • Our analysis uses indicator random variables

  28. Indicator Random Variables • They provide a convenient method for converting between probabilities and expectations • These are random variables that take on only the values 0 or 1, so they "indicate" whether or not something has happened • Given a sample space S and an event A, the indicator random variable I{A} is defined as: • I{A} = 1 if A occurs, 0 if A does not occur

  29. Indicator Random Variables (cont) • A simple example • Determine the number of heads when flipping a coin • Sample space S = { H, T } • Simple random variable Y • These are random variables whose range contains only a finite number of elements • In this case, Y takes on the values H and T, each with probability ½ • XH is the indicator associated with the event Y = H • XH = I{ Y = H } = 1 if Y = H, 0 if Y = T

  30. Indicator Random Variables (cont) • The expected number of heads in one flip is the expected value of our indicator variable XH: • E[ XH ] = 1 · Pr{ Y = H } + 0 · Pr{ Y = T } = 1 · ½ + 0 · ½ = ½ • Thus the expected number of heads in one flip is ½

  31. Indicator Random Variables (cont) • Lemma 5.1 • Given a sample space S and an event A in S, let XA = I{A}. Then E[ XA ] = Pr{ A } • See the proof on page 95 • To compute the number of heads in n coin flips: • Method 1 – compute the probability of getting 0 heads, 1 head, 2 heads, etc.

  32. Indicator Random Variables (cont) • Method 2 • Let Xi be the indicator random variable associated with the event "the ith flip is heads" • Let Yi be the random variable denoting the outcome of the ith flip • Xi = I{ Yi = H } • Let X be the random variable denoting the total number of heads in the n coin flips: • X = X1 + X2 + … + Xn

  33. Indicator Random Variables (cont) • Take the expectation of both sides: • E[ X ] = E[ Σi=1..n Xi ] = Σi=1..n E[ Xi ] = Σi=1..n Pr{ Yi = H } = Σi=1..n ½ = n/2 • The second step uses linearity of expectation; the third uses Lemma 5.1
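The n/2 result agrees with a direct simulation of the n coin flips; a small sketch (all names ours):

```python
import random

def expected_heads(n, trials=10000, seed=0):
    """Estimate E[X], where X = X1 + ... + Xn and Xi = I{flip i is heads}."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # X for one experiment: sum the n 0/1 indicator variables
        total += sum(1 if rng.random() < 0.5 else 0 for _ in range(n))
    return total / trials

print(expected_heads(10))   # close to n/2 = 5.0
```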

  34. (Back to) Analysis of QuickSort • We will use indicator random variables • Xij = I{ zi is compared to zj } • It indicates whether the comparison took place at any time during the execution of QuickSort • Since each pair is compared at most once: • X = Σi=1..n-1 Σj=i+1..n Xij

  35. Analysis of QuickSort (cont) • Take the expectation of both sides: • E[ X ] = Σi=1..n-1 Σj=i+1..n E[ Xij ] = Σi=1..n-1 Σj=i+1..n Pr{ zi is compared to zj } • (by linearity of expectation and Lemma 5.1)

  36. Analysis of QuickSort (cont) • We still need to compute Pr{ zi is compared to zj } • Start by thinking about when two items are NOT compared • Once a pivot x is chosen with zi < x < zj, zi and zj will never be compared • If zi is the first pivot chosen from Zij, zi will be compared to every other element in Zij • Similarly for zj • Thus, zi and zj are compared iff the first pivot chosen from Zij is either zi or zj

  37. Analysis of QuickSort (cont) • What is the probability that this event occurs? • Before a pivot has been chosen from Zij, all j - i + 1 elements of Zij are in the same partition • Each element of Zij is equally likely to be chosen as the first pivot • So for any particular element, the probability is 1 / ( j - i + 1 )

  38. Analysis of QuickSort (cont) • Thus we have: • Pr{ zi is compared to zj } = Pr{ zi or zj is the first pivot from Zij } = 1/( j - i + 1 ) + 1/( j - i + 1 ) = 2/( j - i + 1 ) • Because the two events are mutually exclusive

  39. Analysis of QuickSort (cont) • Combining the two boxed equations: • E[ X ] = Σi=1..n-1 Σj=i+1..n 2/( j - i + 1 ) • Change of variables: k = j - i (note the changes in the summation variables): • E[ X ] = Σi=1..n-1 Σk=1..n-i 2/( k + 1 ) < Σi=1..n-1 Σk=1..n 2/k • Bound on the Harmonic Series: Σk=1..n 1/k = ln n + O(1) • Therefore E[ X ] < Σi=1..n-1 O(lg n) = O(n lg n)
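The double sum collapses under the k = j - i substitution and can be evaluated exactly; a sketch comparing E[X] with 2n ln n (the function name is ours):

```python
from math import log

def expected_comparisons(n):
    """E[X] = sum over pairs i < j of 2/(j - i + 1).

    With the substitution k = j - i there are exactly n - k pairs for each
    gap k = 1 .. n-1, so the double sum collapses to a single sum.
    """
    return sum((n - k) * 2.0 / (k + 1) for k in range(1, n))

for n in (10, 100, 1000):
    # exact expected comparison count vs the 2n ln n upper-bound scale
    print(n, round(expected_comparisons(n), 1), round(2 * n * log(n), 1))
```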

  40. Analysis of QuickSort (cont) • Thus, using Randomized-Partition, the expected running time of QuickSort is O(n lg n)
