New discoveries during the exploration of sorting: How I got my thesis topic

By Spencer Morgan. Reference: Algorithms: Sequential, Parallel, and Distributed by K. Berman & J. Paul, Thomson Course Technology, 2005.

  1. New discoveries during the exploration of sorting: How I got my thesis topic. By Spencer Morgan. Reference: Algorithms: Sequential, Parallel, and Distributed by K. Berman & J. Paul, Thomson Course Technology, 2005

  2. Quicksort • Partitions around an element and recursively sorts the partitioned sets • Efficient in the average case on large data sets • Generally considered less efficient than insertion sort on small data sets • Some implementations of Quicksort switch to insertion sort when the partition size is small
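
The switch to insertion sort on small partitions can be sketched as follows. This is a minimal illustration, not the implementation used in the presentation; the names `hybrid_quicksort` and `insertion_sort`, the middle-element pivot, and the `cutoff` value of 8 are all assumptions chosen for the example.

```python
def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place (inclusive bounds)."""
    for i in range(lo + 1, hi + 1):
        item = a[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= lo and a[j] > item:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = item


def hybrid_quicksort(a, lo=0, hi=None, cutoff=8):
    """Quicksort that hands small partitions to insertion sort.

    `cutoff` is a hypothetical tuning parameter, not a value from the slides.
    """
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= cutoff:
        insertion_sort(a, lo, hi)
        return
    pivot = a[(lo + hi) // 2]  # assumption: middle element as pivot
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    hybrid_quicksort(a, lo, j, cutoff)
    hybrid_quicksort(a, i, hi, cutoff)
```

Calling `hybrid_quicksort(data)` sorts `data` in place; a good `cutoff` depends on the machine and the data, so it is best found by measurement.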

  3. Trying to improve the secondary sort: Use a binary search to find where each element will end up, reducing the number of comparisons. BPInsert
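
A minimal sketch of insertion sort with a binary search for the insertion point is shown below. It is not the BPInsert code itself; the function name `binary_insertion_sort` is an assumption, and the search is delegated to Python's standard `bisect` module.

```python
import bisect


def binary_insertion_sort(a):
    """Insertion sort that locates each insertion point by binary search."""
    for i in range(1, len(a)):
        item = a[i]
        # Search only the already-sorted prefix a[0:i]; bisect_right keeps
        # equal elements in their original order (stable).
        pos = bisect.bisect_right(a, item, 0, i)
        # Shift a[pos..i-1] one slot to the right, then drop the item in.
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = item
```

The binary search cuts comparisons to roughly log2(i) per element, but the shifting step still moves up to i elements, so the number of assignments is unchanged.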

  4. Move the search to the center • On average, an element's location in the list is in the middle. • Keeping the sorted area in the center lessens the amount of movement (assignments). • But elements near the center move back and forth, often temporarily occupying their final location. SMInsert

  5. Treesort: Keep track of where elements need to go to avoid unnecessary moves back and forth. The result is a version of treesort where: • elements are added to a tree, • a map of where each element will end up is created, and • the map is processed, putting each element where it will end up
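
The first two steps, building the tree and deriving the map, might look like the sketch below. It assumes a plain unbalanced binary search tree stored in index arrays; the name `destination_map` and the choice to send equal elements to the right (which makes the map stable) are assumptions, not details from the slides.

```python
def destination_map(a):
    """Return dest, where dest[i] is the sorted position of a[i].

    Only comparisons happen here; no element of `a` is moved.
    """
    n = len(a)
    left = [-1] * n   # left[i]  = index of the left child of element i
    right = [-1] * n  # right[i] = index of the right child of element i
    for i in range(1, n):
        node = 0  # element 0 is the root
        while True:
            if a[i] < a[node]:
                if left[node] == -1:
                    left[node] = i
                    break
                node = left[node]
            else:  # equal elements go right, preserving input order
                if right[node] == -1:
                    right[node] = i
                    break
                node = right[node]
    # Iterative in-order traversal assigns each element its final rank.
    dest = [0] * n
    stack = []
    node = 0 if n else -1
    rank = 0
    while stack or node != -1:
        while node != -1:
            stack.append(node)
            node = left[node]
        node = stack.pop()
        dest[node] = rank
        rank += 1
        node = right[node]
    return dest
```

For the input `[2, 1, 3, 3, 2]` the map is `[1, 0, 3, 4, 2]`: the first 2 belongs at position 1, the 1 at position 0, and so on.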

  6. A major improvement Separating the assignments from the comparisons allows the two processes to be analyzed and improved independently.

  7. Lower bound for worst-case complexity of comparisons: The lower bound for the worst-case number of comparisons of any comparison-based sorting algorithm is log₂(n!), which is Ω(n log n). Proposition 3.5.4, page 99 of Algorithms by Berman & Paul.
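
To get a feel for the bound, the sketch below evaluates it for a given n. Because a comparison count is a whole number, the value is rounded up to ⌈log₂(n!)⌉; the function name `min_comparisons` is an assumption for illustration.

```python
import math


def min_comparisons(n):
    """Smallest integer >= log2(n!), computed exactly with integers."""
    # For any integer m >= 1, (m - 1).bit_length() equals ceil(log2(m)).
    return (math.factorial(n) - 1).bit_length()
```

For example, `min_comparisons(5)` is 7, since 5! = 120 lies between 2⁶ and 2⁷.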

  8. Lower bound for worst-case complexity of assignments • The lower bound for the worst-case number of assignments of any in-place sorting algorithm is Integer(3n/2), that is, 3n/2 rounded down. • The worst case has all elements out of place • To help conceptualize this, let’s consider different cases

  9. 2 Elements: If there are two elements, three assignments are required: one to temp, one direct, and one from temp. [Figure: the three assignments for two elements]

  10. 3 Elements: With three elements, four assignments are required: one to temp, two direct, and one from temp. [Figure: the four assignments for three elements]

  11. Define Circuit • I define a circuit as two or more out-of-place elements that can be put into place with only one assignment to and one assignment from a temporary location • If there are n elements in a circuit, the optimal number of assignments to put them in place is n+1 • The 2- and 3-element cases are each 1 circuit

  12. Lower bound for assignments • The lower bound for assignments is the number of elements out of place + the number of circuits.
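
The rule can be illustrated with a sketch that moves elements circuit by circuit and counts element assignments as it goes. It assumes a destination map `dest` (where `dest[i]` is the final position of `a[i]`) is already available; the function name `apply_destination_map` and the helper arrays are assumptions, and writes to the helper arrays are bookkeeping that is not counted.

```python
def apply_destination_map(a, dest):
    """Put a into its final order using one temp per circuit.

    Returns (assignments, circuits), counting element assignments only.
    """
    n = len(a)
    src = [0] * n  # src[p] = index of the element that belongs at position p
    for i, d in enumerate(dest):
        src[d] = i
    done = [False] * n
    assignments = 0
    circuits = 0
    for start in range(n):
        if done[start] or src[start] == start:
            done[start] = True  # already placed, or never out of place
            continue
        circuits += 1
        temp = a[start]          # one assignment to temp
        assignments += 1
        j = start
        while src[j] != start:
            a[j] = a[src[j]]     # direct assignment into the vacated slot
            assignments += 1
            done[j] = True
            j = src[j]
        a[j] = temp              # one assignment from temp
        assignments += 1
        done[j] = True
    return assignments, circuits
```

With `a = [2, 1, 3, 3, 2]` and `dest = [1, 0, 3, 4, 2]`, five elements are out of place in two circuits, so the sketch performs 5 + 2 = 7 assignments.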

  13. More Elements: With 4 elements out of place, the worst case is 2 circuits of 2 elements each (requiring 6 assignments). The worst case for assignments is when: • all elements are out of place, and • the number of circuits is maximized (because each circuit requires an extra assignment).

  14. Lower bound for worst-case complexity of assignments • The maximum number of circuits for any data set is Integer(n/2). • The lower bound for assignments in the worst-case is n + Integer(n/2) or Integer(3n/2). • This is significantly less than the lower bound of comparisons (n log n).
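
For small n the Integer(3n/2) figure can be checked by brute force. The sketch below tries every permutation of n distinct elements and takes the largest value of (elements out of place + circuits); the names `assignments_needed` and `worst_case` are assumptions. The check starts at n = 2, since a single element is never out of place.

```python
from itertools import permutations


def assignments_needed(dest):
    """Elements out of place plus number of circuits for one permutation."""
    n = len(dest)
    seen = [False] * n
    total = 0
    for i in range(n):
        if seen[i] or dest[i] == i:
            continue
        length = 0
        j = i
        while not seen[j]:  # walk the circuit containing i
            seen[j] = True
            j = dest[j]
            length += 1
        total += length + 1  # n elements in a circuit cost n + 1
    return total


def worst_case(n):
    """Largest assignment count over all permutations of n elements."""
    return max(assignments_needed(p) for p in permutations(range(n)))
```

The exhaustive search grows as n!, so it is only practical for very small n.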

  15. Equal elements: Equal elements can allow more efficient assignments if the sort is not stable, that is, if equal elements do not have to keep the same position relative to each other. Input: 2₁ 1 3₁ 3₂ 2₂. Stable result: 1 2₁ 2₂ 3₁ 3₂. The stable ordering has 2 circuits, (2₁, 1) and (3₁, 3₂, 2₂), which require 7 assignments to put in place. Spencer-Stable w/5

  16. Improvements: If some elements are already in a valid location (3₂ in our example), there is no need to move them, so we can leave them where they are. Input: 2₁ 1 3₁ 3₂ 2₂. Unstable result: 1 2₁ 2₂ 3₂ 3₁. This unstable ordering has 2 circuits, (2₁, 1) and (3₁, 2₂), which require 6 assignments to put in place. Spencer-Unstable w/5

  17. Optimal Assignments: Connecting circuits among equal elements reduces assignments. Input: 2₁ 1 3₁ 3₂ 2₂. Result: 1 2₂ 2₁ 3₂ 3₁. This unstable ordering has 1 circuit, (2₁, 1, 3₁, 2₂), which requires 5 assignments. Spencer-Optimal w/5
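
The five-element example is small enough to verify exhaustively. The sketch below tries every way of placing the elements that yields sorted keys and keeps the cheapest one. It is a brute-force check, not the method used to find the ordering in the presentation, and the names `assignments_for` and `best_arrangement` are assumptions.

```python
from itertools import permutations


def assignments_for(dest):
    """Elements out of place plus number of circuits."""
    seen = [False] * len(dest)
    total = 0
    for i in range(len(dest)):
        if seen[i] or dest[i] == i:
            continue
        length = 0
        j = i
        while not seen[j]:
            seen[j] = True
            j = dest[j]
            length += 1
        total += length + 1
    return total


def best_arrangement(keys):
    """Return (cost, dest) for the cheapest placement that sorts keys."""
    n = len(keys)
    target = sorted(keys)
    best = None
    for dest in permutations(range(n)):
        # Keep only placements that leave the keys in sorted order.
        if all(keys[i] == target[dest[i]] for i in range(n)):
            cost = assignments_for(dest)
            if best is None or cost < best[0]:
                best = (cost, dest)
    return best
```

For `keys = [2, 1, 3, 3, 2]` the four valid placements cost 7, 6, 6, and 5 assignments, and the cheapest is the single four-element circuit described above.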

  18. Possible Problems • Optimal assignments have been attained • But treesort has worst-case comparisons of order n²

  19. Alter the order of entry: Add the center element, then recursively add the left and right elements (the tree will be balanced for ordered data sets). Worst-case complexity is still n² comparisons, but the chances of encountering such a data set in practice are reduced. Spencer not sequential
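
One possible reading of this entry order is sketched below: visit the center index first, then recurse into the left half and the right half. The generator name `center_first_order` and the depth-first order of recursion are assumptions; the slides do not say which half is processed first or whether the halves are interleaved.

```python
def center_first_order(lo, hi):
    """Yield the indices lo..hi (inclusive), center first, then each half."""
    if lo > hi:
        return
    mid = (lo + hi) // 2
    yield mid
    yield from center_first_order(lo, mid - 1)
    yield from center_first_order(mid + 1, hi)
```

For seven elements, `list(center_first_order(0, 6))` gives `[3, 1, 0, 2, 5, 4, 6]`. Inserting already-sorted data in this order produces a balanced tree instead of a degenerate chain.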

  20. Splay Tree: A splay tree can make comparisons more efficient (even with partially ordered data) by doing rotations to move the most recently accessed element to the root. But this can result in n² comparisons and rotations. Splay Killer

  21. Splay 1: If only the first element is rotated, some patterns can still be identified, but with half the compares in our extreme case. This is still poor performance, and it has benefits over a regular tree only in isolated cases. Splay1

  22. Splay/2 • only rotates every other node • allows greater restructuring of the splay tree than Splay1 but only half as much as Splay Splay/2

  23. Is the tree necessary? • Since using the tree was what originally allowed me to dissociate the assignments from the comparisons, I have focused on using trees up to this point • But since the comparisons are now isolated from the assignments, that constraint is not necessary

  24. Mergesort • Use a non-in-place (linked list) version of mergesort as the comparison step • But this version of mergesort is stable and does not provide the information about equal elements needed to achieve optimal assignments (or comparisons) BPMerge & Spencer-BPMerge

  25. Improved Mergesort • Use a version of mergesort that keeps track of equal elements • If all elements are unique, it will have the same number of comparisons • If there are equal elements, the number of comparisons can drop to as few as n − 1. SMerge
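
The slides do not describe how SMerge tracks equal elements, so the sketch below is only one way the idea could work, under two stated assumptions: a three-way comparison counts as a single comparison, and elements found equal are collected into a group that is compared only once from then on. The name `grouped_mergesort` is hypothetical.

```python
def grouped_mergesort(items):
    """Mergesort over groups of equal elements.

    Returns (groups, comparisons), where each group is a list of equal
    elements and the groups are in ascending order.
    """
    comparisons = 0

    def merge(left, right):
        nonlocal comparisons
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            comparisons += 1  # one three-way comparison of representatives
            x, y = left[i][0], right[j][0]
            if x < y:
                out.append(left[i])
                i += 1
            elif y < x:
                out.append(right[j])
                j += 1
            else:
                # Equal keys: fuse the two groups with a single comparison.
                out.append(left[i] + right[j])
                i += 1
                j += 1
        out.extend(left[i:])
        out.extend(right[j:])
        return out

    def sort(lo, hi):  # half-open range [lo, hi)
        if hi - lo <= 1:
            return [[x] for x in items[lo:hi]]
        mid = (lo + hi) // 2
        return merge(sort(lo, mid), sort(mid, hi))

    return sort(0, len(items)), comparisons
```

With n identical elements every merge needs exactly one comparison, and there are n − 1 merges, which matches the n − 1 figure. With all-unique elements the equal branch is never taken, so the count is the same as for an ordinary mergesort.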
