
# Sorting

##### Presentation Transcript

1. Sorting • We have already seen two efficient ways to sort:

2. A kind of “insertion” sort • Insert the elements into a red-black tree one by one • Traverse the tree in in-order and collect the keys • Takes O(nlog(n)) time

3. Heapsort (Williams, Floyd, 1964) • Put the elements in an array • Make the array into a heap • Repeatedly do a deletemin and put the deleted element at the last position of the array

4. Put the elements in the heap • Q = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]

5. Make the elements into a heap • Q = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]

6. Make the elements into a heap • Heapify-down(Q,4) • Q = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]

7. Heapify-down(Q,4) • Q = [79, 65, 26, 24, 7, 15, 29, 23, 33, 40, 19]

8. Heapify-down(Q,3) • Q = [79, 65, 26, 24, 7, 15, 29, 23, 33, 40, 19]

9. Heapify-down(Q,3) • Q = [79, 65, 26, 23, 7, 15, 29, 24, 33, 40, 19]

10. Heapify-down(Q,2) • Q = [79, 65, 26, 23, 7, 15, 29, 24, 33, 40, 19]

11. Heapify-down(Q,2) • Q = [79, 65, 15, 23, 7, 26, 29, 24, 33, 40, 19]

12. Heapify-down(Q,1) • Q = [79, 65, 15, 23, 7, 26, 29, 24, 33, 40, 19]

13. Heapify-down(Q,1) • Q = [79, 7, 15, 23, 65, 26, 29, 24, 33, 40, 19]

14. Heapify-down(Q,1) • Q = [79, 7, 15, 23, 19, 26, 29, 24, 33, 40, 65]

15. Heapify-down(Q,0) • Q = [79, 7, 15, 23, 19, 26, 29, 24, 33, 40, 65]

16. Heapify-down(Q,0) • Q = [7, 79, 15, 23, 19, 26, 29, 24, 33, 40, 65]

17. Heapify-down(Q,0) • Q = [7, 19, 15, 23, 79, 26, 29, 24, 33, 40, 65]

18. Heapify-down(Q,0) • Q = [7, 19, 15, 23, 40, 26, 29, 24, 33, 79, 65]
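
The Heapify-down calls traced on slides 6 to 18 can be sketched in Python as follows. This is a minimal sketch, assuming a 0-indexed array and a min-heap; the names `heapify_down` and `build_heap` are illustrative and are not taken from the slides.

```python
def heapify_down(q, i, n=None):
    """Sift q[i] down until the min-heap property holds within q[:n]."""
    if n is None:
        n = len(q)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and q[left] < q[smallest]:
            smallest = left
        if right < n and q[right] < q[smallest]:
            smallest = right
        if smallest == i:
            return
        q[i], q[smallest] = q[smallest], q[i]
        i = smallest


def build_heap(q):
    """Make q a min-heap in place, starting from the last node that has a child."""
    for i in range(len(q) // 2 - 1, -1, -1):
        heapify_down(q, i)


q = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]
build_heap(q)
print(q)  # [7, 19, 15, 23, 40, 26, 29, 24, 33, 79, 65]
```

For the 11-element example the loop starts at index 4 and ends at index 0, which matches the order of the Heapify-down calls on the slides.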

19. Summary • We can build the heap in linear time (we already did this analysis) • To sort, we still have to deletemin the elements one by one, which takes O(nlog(n)) time
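
Putting the two phases together gives a possible in-place heapsort. The sketch below assumes a min-heap, as in the slides, so moving each deleted minimum to the last position of the shrinking heap leaves the array in decreasing order; the function names are illustrative.

```python
def heapify_down(q, i, n):
    """Sift q[i] down until the min-heap property holds within q[:n]."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and q[left] < q[smallest]:
            smallest = left
        if right < n and q[right] < q[smallest]:
            smallest = right
        if smallest == i:
            return
        q[i], q[smallest] = q[smallest], q[i]
        i = smallest


def heapsort_descending(a):
    """Sort a in place into decreasing order using a min-heap."""
    n = len(a)
    # Phase 1: build the heap in linear time.
    for i in range(n // 2 - 1, -1, -1):
        heapify_down(a, i, n)
    # Phase 2: n - 1 deletemins, each O(log n).
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]  # deleted minimum goes to the last heap slot
        heapify_down(a, 0, end)      # restore the heap on the remaining prefix
    return a


print(heapsort_descending([79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]))
```

To get increasing order instead, one could use a max-heap with deletemax, or simply reverse the result.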

20. Quicksort (Hoare 1961)

21. Quicksort • Input: an array A[p..r]
    Quicksort(A, p, r)
        if p < r then
            q = Partition(A, p, r)   // q is the position of the pivot element
            Quicksort(A, p, q-1)
            Quicksort(A, q+1, r)

22. Partition trace (figure; markers p, r, i, j move across the array) • 2 8 7 1 3 5 6 4 • 2 8 7 1 3 5 6 4 • 2 8 7 1 3 5 6 4 • 2 8 7 1 3 5 6 4 • 2 1 7 8 3 5 6 4

23. Partition trace, continued • 2 1 7 8 3 5 6 4 • 2 1 3 8 7 5 6 4 • 2 1 3 8 7 5 6 4 • 2 1 3 8 7 5 6 4 • 2 1 3 4 7 5 6 8

24. Partition(A, p, r) • Example input: 2 8 7 1 3 5 6 4, with p at the first element and r at the last
    Partition(A, p, r)
        x ← A[r]
        i ← p-1
        for j ← p to r-1 do
            if A[j] ≤ x then
                i ← i+1
                exchange A[i] ↔ A[j]
        exchange A[i+1] ↔ A[r]
        return i+1
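
The Quicksort and Partition pseudocode translates almost line for line into Python. This is a sketch assuming 0-indexed lists and inclusive bounds p and r; the default arguments of `quicksort` are a convenience added here and are not part of the slides.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):          # j runs from p to r-1
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)     # q is the position of the pivot element
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


a = [2, 8, 7, 1, 3, 5, 6, 4]
print(partition(a, 0, len(a) - 1), a)  # 3 [2, 1, 3, 4, 7, 5, 6, 8]
```

The printed array is the final state of the trace on slides 22 and 23, with the pivot 4 at index 3.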

25. Analysis • Running time is proportional to the number of comparisons • Each pair is compared at most once ⇒ $O(n^2)$ • In fact, for each n there is an input of size n on which quicksort takes $cn^2$ comparisons ⇒ $\Omega(n^2)$

26. But • Assume that the split is even in each iteration

27. T(n) = 2T(n/2) + bn • How do we solve recurrences like this? (Read Chapter 4.)

28. Recurrence tree • Root: bn • Children: T(n/2), T(n/2)

29. Recurrence tree • Root: bn • Level 1: bn/2, bn/2 • Level 2: T(n/4), T(n/4), T(n/4), T(n/4)

30. Recurrence tree • The tree has depth logn • In every level we do bn comparisons • So the total number of comparisons is O(nlogn)

31. Observations • We can’t guarantee good splits • But intuitively on random inputs we will get good splits

32. Randomized quicksort • Use randomized-partition rather than partition
    Randomized-partition(A, p, r)
        i ← random(p, r)
        exchange A[r] ↔ A[i]
        return Partition(A, p, r)
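
A possible Python version of randomized quicksort is sketched below. It assumes that random(p, r) on the slide returns a uniformly random index between p and r inclusive, which is what `random.randint` does; `partition` is repeated so that the block is self-contained.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Move a uniformly random element of a[p..r] to position r, then partition."""
    i = random.randint(p, r)       # both endpoints are included
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)


a = list(range(20, 0, -1))         # reverse-sorted input
randomized_quicksort(a)
print(a == list(range(1, 21)))     # True
```

The output is always sorted; only the number of comparisons varies from run to run.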

33. On the same input we may get a different running time in each run! • For one particular input, look at the average of all these running times

34. Expected # of comparisons • Let X be the # of comparisons • This is a random variable • We want to know E(X)

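35. Expected # of comparisons • Let $z_1, z_2, \ldots, z_n$ be the elements in sorted order • Let $X_{ij} = 1$ if $z_i$ is compared to $z_j$ and 0 otherwise • So, $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$

36. $E(X) = E\left(\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij})$ by linearity of expectation

37. $E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij})$ by linearity of expectation $= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$, since each $X_{ij}$ is a 0/1 indicator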

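38. Consider $Z_{ij} \equiv \{z_i, z_{i+1}, \ldots, z_j\}$ • Claim: $z_i$ and $z_j$ are compared $\iff$ either $z_i$ or $z_j$ is the first pivot chosen in $Z_{ij}$ • Proof: 3 cases: • $z_i$ is the first pivot chosen from $\{z_i, \ldots, z_j\}$: compared on this partition, and never again. • $z_j$ is the first pivot chosen from $\{z_i, \ldots, z_j\}$: the same. • Some $z_k$ with $i < k < j$ is the first pivot chosen from $\{z_i, \ldots, z_k, \ldots, z_j\}$: not compared on this partition. The partition separates them, so no future partition uses both.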

39. $\Pr\{z_i \text{ is compared to } z_j\}$ • $= \Pr\{z_i \text{ or } z_j \text{ is first pivot chosen from } Z_{ij}\}$ (just explained) • $= \Pr\{z_i \text{ is first pivot chosen from } Z_{ij}\} + \Pr\{z_j \text{ is first pivot chosen from } Z_{ij}\}$ (mutually exclusive possibilities) • $= 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1)$

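40. $E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$ • Simplify with a change of variable, $k = j-i+1$: $E(X) = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}$ • Simplify and overestimate, by adding terms: $E(X) \le \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = O(n \log n)$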
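
As a sanity check on this bound, the sum of the pair probabilities 2/(j-i+1) from slide 39 over all pairs i < j can be evaluated exactly and compared with the overestimate. The sketch below assumes the overestimate has the form 2(n-1)H_n, where H_n is the n-th harmonic number; both function names are illustrative.

```python
from fractions import Fraction


def expected_comparisons(n):
    """Exact value of the sum over all pairs i < j of 2 / (j - i + 1)."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(1, n)
        for j in range(i + 1, n + 1)
    )


def harmonic_upper_bound(n):
    """Overestimate: (n - 1) copies of the full sum of 2/k for k = 1..n."""
    h_n = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n - 1) * h_n


for n in (2, 3, 10, 100):
    exact = expected_comparisons(n)
    bound = harmonic_upper_bound(n)
    print(n, float(exact), float(bound), exact <= bound)
```

For n = 3 the exact value is 1 + 1 + 2/3 = 8/3. Since H_n grows like ln n, the bound is O(n log n).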

41. Lower bound for sorting in the comparison model

42. A lower bound • Comparison model: We assume that the only operations from which we deduce order among keys are comparisons • Then we prove that we need Ω(nlogn) comparisons in the worst case

43. Model the algorithm as a decision tree (figure: decision tree over the elements 1, 2, 3)

44. Important Observations • Every algorithm can be represented as a (binary) tree like this • Each path corresponds to a run on some input • The worst case # of comparisons corresponds to the longest path

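45. The lower bound • Let d be the length of the longest path • $n! \le \#\text{leaves} \le 2^d$ • $\log_2(n!) \le d$ • Since $n! \ge (n/2)^{n/2}$, we get $d \ge \log_2(n!) \ge (n/2)\log_2(n/2) = \Omega(n \log n)$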
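
The inequality log2(n!) ≤ d can be evaluated for small n to see how many comparisons any comparison-based sort needs in the worst case. A minimal sketch follows; `comparison_lower_bound` is an illustrative name, and the ceiling is taken exactly with integer arithmetic instead of floating-point logarithms.

```python
import math


def comparison_lower_bound(n):
    """Smallest integer d with 2**d >= n!, i.e. the ceiling of log2(n!)."""
    # For an integer m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()


for n in (1, 2, 3, 4, 5, 10):
    print(n, comparison_lower_bound(n))
```

For n = 3 there are 3! = 6 possible orderings, so at least 3 comparisons are needed on some input.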

46. Lower Bound for Sorting • Any sorting algorithm based on comparisons between elements requires Ω(n log n) comparisons in the worst case.

47. Beating the lower bound • We can beat the lower bound if we can deduce order relations between keys by means other than comparisons • Examples: • Counting sort • Radix sort
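
Counting sort illustrates the idea: it never compares two keys, and instead uses each key as an array index. The sketch below assumes the keys are integers in the range 0 to k-1; that assumption and the function name are not from the slides.

```python
def counting_sort(keys, k):
    """Sort integers in range(k) in O(n + k) time without comparing keys."""
    counts = [0] * k
    for key in keys:
        counts[key] += 1                 # the key itself is used as an index
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)      # emit each value as often as it occurred
    return out


print(counting_sort([3, 1, 0, 3, 2], 4))  # [0, 1, 2, 3, 3]
```

This simple variant sorts bare keys only. Sorting records by key, as radix sort requires, would need the stable prefix-sum variant.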

48. Linear time sorting • Or assume something about the input: random, “almost sorted”

49. Sorting an almost sorted input • Suppose we know that the input is “almost” sorted • Let I be the number of “inversions” in the input: the number of pairs $a_i, a_j$ such that $i < j$ and $a_i > a_j$

50. Example • 1, 4, 5, 8, 3: I = 3 • 8, 7, 5, 3, 1: I = 10
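
The two example values of I can be checked with a direct count over all pairs. The sketch also includes insertion sort as one illustration of why few inversions help: each element shift removes exactly one inversion, so it runs in O(n + I) time. Insertion sort is an assumption added here; the slides shown do not name the algorithm they have in mind.

```python
def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j], by brute force."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def insertion_sort_shifts(a):
    """Return a sorted copy of a and the number of element shifts performed."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]              # each shift removes one inversion
            j -= 1
            shifts += 1
        a[j + 1] = key
    return a, shifts


print(count_inversions([1, 4, 5, 8, 3]))       # 3
print(count_inversions([8, 7, 5, 3, 1]))       # 10
print(insertion_sort_shifts([1, 4, 5, 8, 3]))  # ([1, 3, 4, 5, 8], 3)
```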