
Medians and Order Statistics


susan

Presentation Transcript


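  1. Medians and Order Statistics • i-th order statistic: the i-th smallest element. • Median of n elements: • n odd: the ((n+1)/2)-th order statistic. • n even: the (n/2)-th (lower median) or the (n/2+1)-th (upper median). • Assume distinct numbers. • Input: array A of n numbers, and i with 1 ≤ i ≤ n. • Output: the element x of A that is larger than exactly i-1 other elements of A.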
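The definition on this slide can be pinned down with a small sorting-based baseline. This is only a sketch: the function names `order_statistic_by_sorting` and `median_ranks` are made up for illustration, and sorting is used here simply because it makes the definition obvious, not because the slides prescribe it.

```python
def order_statistic_by_sorting(values, i):
    """Return the i-th smallest element (1-based rank) of distinct values."""
    if not 1 <= i <= len(values):
        raise ValueError("i must satisfy 1 <= i <= n")
    # After sorting, the element of rank i sits at 0-based index i - 1.
    return sorted(values)[i - 1]


def median_ranks(n):
    """Return the 1-based rank(s) of the median among n elements."""
    if n % 2 == 1:
        return [(n + 1) // 2]          # single median for odd n
    return [n // 2, n // 2 + 1]        # lower and upper median for even n
```

For example, `order_statistic_by_sorting([7, 2, 9, 4, 1], 3)` picks the element that is larger than exactly two others.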

  2. Solutions • O(n log n) time based on … • O(n) time average. • O(n) time worst case.

  3. Minimum and Maximum • How many comparisons to find the minimum? • n-1 suffice: examine each element and keep track of the smallest one. • n-1 are also needed by any comparison-based method: • Each element must be compared. • Each must lose at least once (except the winner). • What about simultaneous min and max?

  4. Min & Max • Can do with 2n-2 comparisons (find min and max separately). • Can do better: • Form pairs of elements. • Compare the elements in each pair. • For pair (ai, ai+1), assume ai < ai+1; then • Compare (min, ai) and (ai+1, max). • 3 comparisons for each pair, so about 3n/2 in total.
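The pairing idea can be sketched as follows. The function name `min_max` and the explicit comparison counter are assumptions added for illustration; the counter is there only to make the "3 comparisons per pair" claim observable.

```python
def min_max(values):
    """Return (minimum, maximum, comparisons), processing elements in pairs."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    comparisons = 0
    if n % 2 == 1:
        # Odd n: the first element starts as both min and max, no comparison.
        lo = hi = values[0]
        start = 1
    else:
        # Even n: one comparison initialises min and max from the first pair.
        comparisons += 1
        if values[0] < values[1]:
            lo, hi = values[0], values[1]
        else:
            lo, hi = values[1], values[0]
        start = 2
    for k in range(start, n, 2):
        a, b = values[k], values[k + 1]
        comparisons += 1               # compare the two elements of the pair
        if b < a:
            a, b = b, a
        comparisons += 1               # smaller of the pair against current min
        if a < lo:
            lo = a
        comparisons += 1               # larger of the pair against current max
        if b > hi:
            hi = b
    return lo, hi, comparisons
```

With this bookkeeping the count is 3(n-1)/2 for odd n and 3n/2 - 2 for even n, compared with 2n-2 for two separate scans.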

  5. Average Time Median Selection • Divide-and-Conquer (prune-and-search). • Randomized: behavior determined by output of random number generator. • Based on QuickSort: • Partition input array recursively, but • Work only on one side!

  6. Randomized Selection
     QuickSort(A,p,r)
       If p < r then
         q = partition(A,p,r)
         QuickSort(A,p,q)
         QuickSort(A,q+1,r)
     First call: QuickSort(A,1,n).
     After partition(A,p,r): A[i] < A[q], i < q; A[q] < A[j], q < j.
     RandSelect(A,p,r,i)
       If p == r then return A[p]
       q = RandPartition(A,p,r)
       k = q-p+1   /* size of A[p..q] */
       If i ≤ k then return RandSelect(A,p,q,i)
       Else return RandSelect(A,q+1,r,i-k)
     First call: RandSelect(A,1,n,i).
     Returns the i-th smallest element in A[p..r].
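A runnable rendering of RandSelect might look like the sketch below. It assumes a Hoare-style partition around a randomly chosen pivot, which matches the A[p..q] / A[q+1..r] split used in the pseudocode; the slides defer the details of RandPartition to the textbook, so `rand_partition` here is an assumption. Indices are 0-based, while the rank i stays 1-based.

```python
import random


def rand_partition(a, p, r):
    """Hoare-style partition of a[p..r] around a random pivot.

    Returns q with p <= q < r such that every element of a[p..q]
    is <= every element of a[q+1..r].
    """
    k = random.randint(p, r)           # pivot position, inclusive bounds
    a[p], a[k] = a[k], a[p]
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:                # scan from the right for an element <= x
            j -= 1
        i += 1
        while a[i] < x:                # scan from the left for an element >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def rand_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = rand_partition(a, p, r)
    k = q - p + 1                      # size of the low side a[p..q]
    if i <= k:
        return rand_select(a, p, q, i)
    return rand_select(a, q + 1, r, i - k)
```

A first call on a list `data` would be `rand_select(list(data), 0, len(data) - 1, i)`; the copy keeps the caller's list unchanged. The recursion is kept to mirror the pseudocode, so very long inputs with unlucky pivots could hit Python's recursion limit.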

  7. Selection (cont.) • RandPartition (see 8.3, 8.4 textbook) gives a partition whose low side has: • 1 element with probability 2/n • j elements with probability 1/n, for j = 2,3,…,n-1. • Assume the i-th element always falls on the larger side: T(n) ≤ (T(max(1,n-1)) + Σ_{k=1..n-1} T(max(k,n-k)))/n + O(n) ≤ (T(n-1) + 2 Σ_{k=⌈n/2⌉..n-1} T(k))/n + O(n) = (2/n) Σ_{k=⌈n/2⌉..n-1} T(k) + O(n), since T(n-1) = O(n²) and hence T(n-1)/n = O(n). • Then T(n) = O(n) (proof by substitution).
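The substitution step is only named on the slide. One way it can go is sketched below, where the constant a stands for the hidden constant in the O(n) term and c is the constant to be determined; both symbols are introduced here for illustration.

```latex
% Inductive hypothesis: T(k) \le ck for all k < n.
\begin{align*}
T(n) &\le \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} ck + an \\
     &\le \frac{2c}{n}\left(\frac{n(n-1)}{2}
          - \frac{1}{2}\cdot\frac{n}{2}\left(\frac{n}{2}-1\right)\right) + an \\
     &=   \frac{2c}{n}\left(\frac{3n^2}{8} - \frac{n}{4}\right) + an \\
     &\le \frac{3cn}{4} + an \\
     &\le cn \qquad \text{whenever } c \ge 4a .
\end{align*}
```

The second line uses that the sum over k from ⌈n/2⌉ to n-1 equals the full sum from 1 to n-1 minus the sum from 1 to ⌈n/2⌉-1, and that the latter is at least (n/2)(n/2-1)/2.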

  8. Worst Case Linear Time Selection • O(n) worst-case algorithm. • Works in a similar way: recursively partition the input array. • Idea: guarantee a good split. • E.g., if QuickSort gets a 9-to-1 split at every recursion level, T(n) = T(9n/10) + T(n/10) + O(n). • Then T(n) = O(n log n). • Use deterministic partitioning: • Compute the element to partition around.

  9. Steps to find i-th smallest element: Algorithm Select • Step 1: Divide the elements into ⌊n/5⌋ groups of 5 elements, plus at most one group with the remaining (n mod 5) elements. • Step 2: Find the median of each group: • Insertion sort: O(1) time per group (at most 5 elements). • Take the middle element (the larger one if there are two medians). • Step 3: Use Select recursively to find the median x of the medians.

  10. Algorithm Select (cont.) • Step 4: Partition the input array around the median-of-medians x. Let k be the number of elements on the low side, n-k on the high side. • a1,a2,…,ak | ak+1,ak+2,…,an • ai < aj, for 1 ≤ i ≤ k, k+1 ≤ j ≤ n. • Step 5: Use Select recursively to: • Find the i-th smallest element on the low side, if i ≤ k. • Find the (i-k)-th smallest on the high side, if i > k.
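The five steps of Select can be sketched in Python as below. Several details are assumptions made for this sketch rather than anything the slides specify: the base case sorts inputs of at most 5 elements, the pivot x is kept on the low side, the lower median of the medians is used in step 3, and list comprehensions replace in-place partitioning. As on the slides, the input is assumed to hold distinct numbers.

```python
def select(values, i):
    """Return the i-th smallest (1-based) of distinct values via median of medians."""
    a = list(values)
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= n")
    if len(a) <= 5:
        return sorted(a)[i - 1]        # small input: sort directly
    # Step 1: groups of 5, plus possibly one smaller last group.
    groups = [a[s:s + 5] for s in range(0, len(a), 5)]
    # Step 2: median of each group (the larger one if the group has two medians).
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Step 3: median x of the medians, found recursively (lower median here).
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x; x itself is placed on the low side.
    low = [v for v in a if v <= x]
    high = [v for v in a if v > x]
    k = len(low)
    # Step 5: recurse on the side that contains the i-th smallest element.
    if i <= k:
        return select(low, i)
    return select(high, i - k)
```

Because there are at least two groups whenever the recursion is reached, some group median exceeds x, so `high` is never empty and both recursive calls are on strictly smaller lists. With repeated values that guarantee can fail, which is why the distinctness assumption matters here.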

  11. Analysis • Find a lower bound on the number of elements greater than x. • At least half of the medians found in Step 2 are greater than x. Then, • At least half of the groups contribute 3 elements that are greater than x, except: • the last group (if it has fewer than 5 elements); • x's own group. • Discard those two groups: • Number of elements greater than x is ≥ 3((1/2)⌈n/5⌉ - 2) ≥ 3n/10 - 6. • Similarly, the number of elements smaller than x is ≥ 3n/10 - 6. • Then, in the worst case, Select is called recursively in Step 5 on at most 7n/10 + 6 elements (upper bound).

  12. Analysis (cont.) • Steps 1, 2 and 4: O(n) time. • Step 3: T(⌈n/5⌉). • Step 5: at most T(7n/10+6). • 7n/10+6 < n for n > 20. • T(n) ≤ T(⌈n/5⌉) + T(7n/10+6) + O(n), for n > n1. • Use substitution to solve: • Assume T(n) ≤ cn, for n > n1; find n1 and c.

  13. Analysis (cont.) • T(n) ≤ c⌈n/5⌉ + c(7n/10+6) + O(n) ≤ cn/5 + c + 7cn/10 + 6c + O(n) = 9cn/10 + 7c + O(n). • Want T(n) ≤ cn: • Writing the O(n) term as c1n, where c1 is its constant, 9cn/10 + 7c + c1n = cn - (c(n/10-7) - c1n). • So pick c such that c(n/10-7) ≥ c1n (n1 = 80).
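The slide leaves the actual value of c open. One admissible choice, worked out under the slide's n1 = 80 and with c1 denoting the constant from the O(n) term, is sketched here.

```latex
% Solve c(n/10 - 7) \ge c_1 n for c, valid when n > 70:
\[
  c\left(\frac{n}{10} - 7\right) \ge c_1 n
  \quad\Longleftrightarrow\quad
  c \ge 10\,c_1\,\frac{n}{n-70}.
\]
% n/(n-70) is decreasing for n > 70 and equals 8 at n = 80, hence
\[
  \frac{n}{n-70} \le 8 \quad\text{for } n \ge 80,
  \qquad\text{so } c \ge 80\,c_1 \text{ suffices.}
\]
```

A larger threshold n1 would allow a smaller constant, since n/(n-70) approaches 1 as n grows.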

  14. Questions • Why not groups of 7 elements? • Why not groups of 3 elements? • T(n)=O(?)
