Sorting

Sorting • In the sorting problem we are given a collection C of n elements that can be compared according to a total order relation • The task is to rearrange the elements of C in increasing order (or non-decreasing order if there are ties) Complexity of Algorithms

Priority Queue • A priority queue (PQ) is a container of elements, each having an associated key • Keys determine the "priority" used to pick elements to be removed • Fundamental PQ methods • insertItem(k, e): insert element e with key k into the PQ • removeMin(): remove the element with the minimum key • minElement(): return the element with the minimum key • minKey(): return the key of the minimum element

PQ-Sorting • In the first phase we put the elements of C into an initially empty priority queue P by means of a series of n insertItem operations • In the second phase we extract the elements from P in non-decreasing order by means of a series of n removeMin operations

PQ-Sorting (pseudo code)
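The two phases of PQ-sorting can be sketched as follows. This is a minimal illustration, not the pseudo code from the original slide: it assumes Python's standard `heapq` module as the priority queue, and the function name `pq_sort` is hypothetical.

```python
import heapq


def pq_sort(items):
    """Sort by inserting everything into a priority queue, then removing minima.

    Assumption: heapq (a binary heap on a list) stands in for the abstract PQ.
    """
    pq = []
    for x in items:                      # phase 1: n insertItem operations
        heapq.heappush(pq, x)
    out = []
    while pq:                            # phase 2: n removeMin operations
        out.append(heapq.heappop(pq))
    return out
```

With a heap-based queue both phases cost O(log n) per operation; a different PQ realisation would change the running time but not the structure of the algorithm.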

Heap Data Structure • A heap is a realisation of a PQ that is efficient for both insertions and removals • A heap performs both insertions and removals in logarithmic time • In a heap the elements and their keys are stored in an (almost complete) binary tree

Heap-Order Property • In a heap T, for every node v other than the root, the key stored at v is greater than or equal to the key stored at its parent

PQ/Heap Implementation • heap: a complete binary tree T containing elements with keys that satisfy the heap-order property; implemented using a vector representation • last: a reference to the last used node of T • comp: a comparator that defines the total order relation on keys and keeps the minimum element at the root of T

Up-Heap Bubbling (insertion)

Down-Heap Bubbling (removal)
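Up-heap and down-heap bubbling can be sketched on a vector-based min-heap as below. This is an illustrative sketch under stated assumptions: the class name `MinHeap` and the method names are hypothetical, items are stored as (key, element) pairs, and only keys are compared.

```python
class MinHeap:
    """Array-based min-heap; the children of index i are 2i+1 and 2i+2."""

    def __init__(self):
        self._a = []

    def __len__(self):
        return len(self._a)

    def insert_item(self, key, elem):
        a = self._a
        a.append((key, elem))            # place the new item at the last position
        i = len(a) - 1
        while i > 0:                     # up-heap bubbling
            parent = (i - 1) // 2
            if a[parent][0] <= a[i][0]:
                break                    # heap-order property restored
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def min_key(self):
        return self._a[0][0]

    def remove_min(self):
        a = self._a
        top = a[0]                       # raises IndexError on an empty heap
        last = a.pop()
        if a:
            a[0] = last                  # move the last item to the root
            i, n = 0, len(a)
            while True:                  # down-heap bubbling
                left, right, smallest = 2 * i + 1, 2 * i + 2, i
                if left < n and a[left][0] < a[smallest][0]:
                    smallest = left
                if right < n and a[right][0] < a[smallest][0]:
                    smallest = right
                if smallest == i:
                    break
                a[i], a[smallest] = a[smallest], a[i]
                i = smallest
        return top
```

Each loop moves along a single root-to-leaf path, so both operations take time proportional to the height of the tree, which is O(log n).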

Heap Performance

Heap-Sorting • Thm: The heap-sort algorithm sorts a sequence S of n comparable elements in O(n log n) time, where • bottom-up construction of a heap with n items takes O(n) time, and • extraction of n elements (in increasing order) from the heap takes O(n log n) time
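The two stages named in the theorem can be sketched as follows. This is a minimal sketch, assuming a min-heap stored in a Python list; the names `heap_sort` and `_down_heap` are hypothetical, and the output is collected in a separate list rather than sorted in place.

```python
def _down_heap(a, i, n):
    """Restore heap order below index i, considering only a[0:n]."""
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def heap_sort(seq):
    a = list(seq)
    n = len(a)
    # Stage 1: bottom-up construction, O(n) in total.
    for i in range(n // 2 - 1, -1, -1):
        _down_heap(a, i, n)
    # Stage 2: n extractions of the minimum, O(log n) each.
    out = []
    while n > 0:
        out.append(a[0])
        n -= 1
        a[0] = a[n]                      # move the last item to the root
        _down_heap(a, 0, n)
    return out
```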

Divide-and-Conquer • Divide: if the input size is small, solve the problem directly; otherwise divide the input data into two or more disjoint subsets • Recur: recursively solve the sub-problems associated with the subsets • Conquer: take the solutions to the sub-problems and merge them into a solution to the original problem

Merge-Sorting • Divide: if the input sequence S has 0 or 1 elements, return S; otherwise split S into two sequences S1 and S2, each containing about half of the elements of S • Recur: recursively sort the sequences S1 and S2 • Conquer: put the elements back into S by merging the sorted sequences S1 and S2 into a single sorted sequence
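The divide, recur and conquer steps of merge-sort can be sketched as below. The function names `merge` and `merge_sort` are illustrative, and this version returns a new list instead of writing back into S.

```python
def merge(s1, s2):
    """Merge two sorted lists into one sorted list in O(len(s1) + len(s2)) time."""
    out = []
    i = j = 0
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:               # <= keeps equal elements in input order
            out.append(s1[i])
            i += 1
        else:
            out.append(s2[j])
            j += 1
    out.extend(s1[i:])                   # at most one of these is nonempty
    out.extend(s2[j:])
    return out


def merge_sort(s):
    if len(s) <= 1:                      # divide: trivial input, return it
        return list(s)
    mid = len(s) // 2
    s1 = merge_sort(s[:mid])             # recur on each half
    s2 = merge_sort(s[mid:])
    return merge(s1, s2)                 # conquer: merge the sorted halves
```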

Merge-Sorting (example)

Merge-Sorting (analysis) • Thm: merging two sorted sequences S1 and S2 takes O(n1 + n2) time, where n1 is the size of S1 and n2 is the size of S2 (see comp108 notes) • Thm: merge-sort runs in O(n log n) time in the worst (and average) case

Merge-Sort (recurrence eq.) • The worst-case running time $t(n)$ of merge-sort can be expressed by the recurrence equation $t(n) = 2t(n/2) + cn$ for $n \ge 2$, with $t(n)$ constant for $n < 2$ • Assuming that n is a power of 2 we get: • $t(n) = 2(2t(n/2^2) + cn/2) + cn = 2^2 t(n/2^2) + 2cn = \dots = 2^i t(n/2^i) + icn$ • Setting $i = \log_2 n$ gives the closed form $t(n) = n\,t(1) + cn \log_2 n = O(n \log n)$

Quick-Sort • Divide: if |S| > 1, select a pivot x in S and create three sequences L, E and G such that • L stores the elements of S less than x • E stores the elements of S equal to x • G stores the elements of S greater than x • Recur: recursively sort the sequences L and G • Conquer: put the sorted elements from L, then E, and finally G back into S
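The L, E, G scheme can be sketched as follows. The slide does not fix a pivot rule; this sketch assumes a uniformly random pivot, and the name `quick_sort` and the `rng` parameter are hypothetical. It returns a new list rather than writing back into S.

```python
import random


def quick_sort(s, rng=random):
    s = list(s)
    if len(s) <= 1:
        return s
    x = rng.choice(s)                    # assumption: pivot chosen at random
    L = [e for e in s if e < x]          # elements less than the pivot
    E = [e for e in s if e == x]         # elements equal to the pivot
    G = [e for e in s if e > x]          # elements greater than the pivot
    # Recur on L and G only; E is already in its final place.
    return quick_sort(L, rng) + E + quick_sort(G, rng)
```

Because E always contains at least the pivot, both recursive calls work on strictly smaller inputs, so the recursion terminates even when S has many duplicates.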

Quick-Sort Tree

Quick-Sort (example)

Quick-Sort (worst case) • Let $s_i$ be the sum of the input sizes of the nodes at depth i in a quick-sort tree T • $s_i \le n - i$ (with $s_i = n - i$ when every pivot leaves only one nonempty sequence, either L or G) • The worst-case complexity is bounded by $O\left(\sum_{i=0}^{n-1} s_i\right)$, where $\sum_{i=0}^{n-1} s_i \le \sum_{i=0}^{n-1} (n - i) = n(n+1)/2$ • which is $O(n^2)$

Quick-Sort (randomised algorithm) • Thm: the expected running time of randomised quick-sort (the pivot is chosen at random) is O(n log n) • Proof: • Fact: the expected number of times that a fair coin must be flipped until it shows heads k times is 2k • We say that a randomly chosen pivot is right if neither of the groups L and G is larger than ¾|S| • The probability of choosing a right pivot is ½ • Any path in the quick-sort tree can contain at most $\log_{4/3} n$ nodes with right pivots • Hence the expected length of each path is at most $2\log_{4/3} n = O(\log n)$

Lower Bound (comparison-based model) • In the comparison-based model the input elements can be compared only with each other, and the result of each comparison $x_i \le x_j$ is always yes or no • Thm: the running time of any comparison-based sorting algorithm is $\Omega(n \log n)$ in the worst case • Proof: • Sorting n elements can be identified with recognising a particular permutation of n elements • There are $n! = n \cdot (n-1) \cdots 2 \cdot 1$ permutations of n elements • Each comparison splits a group of permutations into two groups (one that satisfies the inequality and one that does not) • To ensure that the size of each group of permutations is brought down to one we need at least $\log_2(n!) \ge \log_2 (n/2)^{n/2} = (n/2) \log_2 (n/2) = \Omega(n \log n)$ comparisons
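The bound can be checked numerically for small n. The sketch below is only an illustration of the inequality used in the proof; the function name `min_comparisons` is hypothetical.

```python
import math


def min_comparisons(n):
    """Decision-tree lower bound: ceil(log2(n!)) comparisons in the worst case."""
    return math.ceil(math.log2(math.factorial(n)))


def half_power_bound(n):
    """The weaker bound (n/2) * log2(n/2) used to show Omega(n log n)."""
    return (n / 2) * math.log2(n / 2)


# Compare the two quantities for a few input sizes.
for n in (4, 8, 16, 32):
    print(n, min_comparisons(n), round(half_power_bound(n), 1))
```

For instance, sorting 3 elements needs at least 3 comparisons in the worst case, since 2 comparisons can distinguish at most 4 of the 6 permutations.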

Sorting in Linear Time (bucket-sort) • Bucket-sort is not based on comparisons; it uses keys as indices into a bucket array B, where keys are integers in the range [0, …, N−1] • Initially all items from the input sequence S are moved to the appropriate buckets, i.e., an item with key k is moved to bucket B[k] • Then we move all items back into S according to their order of appearance in consecutive buckets B[0], B[1], …, B[N−1]
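The two passes of bucket-sort can be sketched as below. This sketch assumes the items are (key, element) pairs with integer keys in [0, N−1]; the name `bucket_sort` and the range check are illustrative additions.

```python
def bucket_sort(items, N):
    """Sort (key, element) pairs whose integer keys lie in [0, N-1]."""
    buckets = [[] for _ in range(N)]     # bucket array B[0..N-1]
    for key, elem in items:              # pass 1: move each item to B[key]
        if not 0 <= key < N:
            raise ValueError("key out of range: %r" % (key,))
        buckets[key].append((key, elem))
    out = []
    for b in buckets:                    # pass 2: read buckets in index order
        out.extend(b)
    return out
```

No keys are compared with each other, and the work is one pass over the n items plus one pass over the N buckets. Appending to the end of each bucket keeps items with equal keys in their original order.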

Selection • In the selection problem we are interested in identifying a single element in terms of its rank relative to an ordering of the entire set • Examples include identifying the minimum and the maximum elements, but we may also be interested in identifying the median or, in general, the kth smallest element • The selection problem can be solved in O(n log n) time with the help of an efficient sorting algorithm • However, the selection problem can be solved in O(n) time using the more refined prune-and-search (decrease-and-conquer) method

Prune-and-Search • In the prune-and-search method we solve a given problem by pruning away a fraction of the input objects and recursively solving the smaller problem • When the problem is reduced to constant size it is solved by some brute-force method • The solution to the original problem is completed by returning from all the recursive calls

Randomised Quick-Select • Prune: pick an element x from S at random and use it as a pivot to subdivide S into three groups L, E and G, where • L stores the elements of S less than x • E stores the elements of S equal to x • G stores the elements of S greater than x • Search: based on the value of k, we determine which of these sets to recur on
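The prune and search steps can be sketched as follows. This sketch assumes k is a 1-based rank (k = 1 is the minimum); the name `quick_select` and the `rng` parameter are hypothetical, and the recursion is written as a loop.

```python
import random


def quick_select(s, k, rng=random):
    """Return the k-th smallest element of s, with k counted from 1."""
    s = list(s)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and len(s)")
    while True:
        x = rng.choice(s)                # prune: random pivot
        L = [e for e in s if e < x]
        E = [e for e in s if e == x]
        G = [e for e in s if e > x]
        if k <= len(L):                  # search: the answer lies in L
            s = L
        elif k <= len(L) + len(E):       # the answer is the pivot itself
            return x
        else:                            # the answer lies in G; adjust the rank
            k -= len(L) + len(E)
            s = G
```

Unlike quick-sort, only one of the two groups is examined further, which is why the expected work shrinks geometrically.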

Selection - complexity • Thm: the expected running time of randomised quick-select on a sequence of size n is O(n) • Thm: there exists a deterministic algorithm for the selection problem that runs in O(n) time in the worst case
