This presentation discusses a randomized selection algorithm that determines the C-th order statistic of n keys in n + C + o(n) comparisons with high probability. It first reviews the deterministic BFPRT algorithm (groups of five items, median of the group medians, recursive selection), then describes the randomized algorithm, which extends the algorithm of Floyd and Rivest, and establishes lower and upper bounds on the sizes of the subsets produced by random sampling.
Spring 2004 Randomized Algorithms presentation • B90902036 許 平 • B90902050 郭煜楓 • B90902057 陳文祥
Randomized Selection in n + C + o(n) comparisons • Alexandros V. Gerbessiotis, CS Department, New Jersey Institute of Technology, Newark • Constantinos J. Siniolakis, The American College of Greece, 6 Gravias St., Aghia Paraskevi, Athens, 15342, Greece. Randomized Algorithm
Part 1 Introduction
Introduction • With high probability, for any constant ρ > 1, the algorithm requires n + C + o(n) comparisons to determine the C-th order statistic of n keys • The comparison count is reduced by extending the algorithm of Floyd and Rivest and analyzing its resulting performance more carefully
Selection in O(n): BFPRT algorithm • M. Blum • R. W. Floyd • V. R. Pratt • R. L. Rivest • R. E. Tarjan • 1973 • T(n) < c·n
BFPRT algorithm • 1. Divide the items into floor(n/5) groups of 5 items each. The last group can be smaller. • 2. Find the median of each group (using sorting). • 3. Use SELECT to recursively find the median of the floor(n/5) group medians.
BFPRT algorithm • 4. Partition the input using this median of medians as the pivot. • 5. Suppose the low side of the partition has s elements and the high side has n - s elements. • 6. If k ≤ s, recursively call SELECT(k) on the low side; otherwise, recursively call SELECT(k - s) on the high side.
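Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration, not the presenters' code: the function name `select` and the 1-indexed rank `k` are choices made here, and the pivot is handled as its own case (rather than being counted in the low side) so that duplicate keys also work.

```python
def select(items, k):
    """Return the k-th smallest element (1-indexed) of items."""
    items = list(items)
    if len(items) <= 5:
        # Small inputs are solved directly by sorting.
        return sorted(items)[k - 1]

    # Steps 1-2: groups of 5 (the last may be smaller), median of each group.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 3: recursively find the median of the group medians.
    pivot = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around the median of medians.
    low = [x for x in items if x < pivot]
    high = [x for x in items if x > pivot]
    equal = len(items) - len(low) - len(high)  # copies of the pivot

    # Steps 5-6: recurse into the side that contains rank k.
    if k <= len(low):
        return select(low, k)
    if k <= len(low) + equal:
        return pivot
    return select(high, k - len(low) - equal)
```

Both recursive calls are on strictly smaller inputs because the pivot itself is never passed down, so the recursion terminates.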
BFPRT algorithm • Analysis: • Let x be the pivot (the median of medians) • At least half of the group medians are ≥ x • So at least half of the floor(n/5) groups contribute at least 3 items to the high side • The number of items ≥ x is therefore at least 3n/10 - 6
BFPRT algorithm • The recursive call to SELECT is on a set of size at most 7n/10 + 6. • Step 3 makes a recursive call costing T(n/5), and • Step 6 makes a recursive call costing T(7n/10 + 6).
BFPRT algorithm • Inductively verify that T(n) ≤ cn for some constant c • T(n) ≤ c(n/5) + c(7n/10 + 6) + O(n) ≤ 9cn/10 + 6c + O(n) ≤ cn • In the above, choose c so that c(n/10 - 6) dominates the O(n) term for all sufficiently large n.
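The inductive step can be spelled out as follows, writing the O(n) term as an for a constant a > 0 (the name a is introduced here only for illustration).

```latex
\begin{align*}
T(n) &\le c\cdot\frac{n}{5} + c\left(\frac{7n}{10} + 6\right) + an \\
     &= \frac{9cn}{10} + 6c + an \\
     &= cn - \left( c\left(\frac{n}{10} - 6\right) - an \right) \\
     &\le cn \qquad \text{whenever } c\left(\frac{n}{10} - 6\right) \ge an .
\end{align*}
```

For n ≥ 120 we have n/10 - 6 ≥ n/20, so any c ≥ 20a satisfies the condition; smaller n are covered by the base case of the induction.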
Part 1.5 The scenario of our random selection algorithm
Step 1. Random Sampling • 1. Let Y be a randomly chosen subset of X of size 2λ - 1 < n/2 • 2. Sort Y; because the sample is small, this takes little time. • |X| = n, |Y| = 2λ - 1
Step 2. Left guard and right guard • Assume: 2cλ > μ/2 for λ ≥ 1, μ ≥ 1 • L = y_(2cλ - μ/2) • R = y_(2cλ + μ/2) • X0 = { x | x ∈ X and x < L } • X1 = { x | x ∈ X and L < x < R } • X2 = { x | x ∈ X and x > R }
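Steps 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions: keys are distinct, the two guard ranks are rounded to integers and clamped to the sample (the slides do not say how non-integer ranks are handled), and the name `sample_and_partition` is made up for this example.

```python
import random

def sample_and_partition(X, C, lam, mu, rng=random):
    """Sample 2*lam - 1 keys, pick the two guards, and split X around them."""
    n = len(X)
    c = C / n
    Y = sorted(rng.sample(X, 2 * lam - 1))  # Step 1: sorted random sample

    # Step 2: guard ranks in the sorted sample (1-indexed).
    # Rounding and clamping are assumptions made for this sketch.
    lo = max(1, round(2 * c * lam - mu / 2))
    hi = min(len(Y), round(2 * c * lam + mu / 2))
    L, R = Y[lo - 1], Y[hi - 1]

    X0 = [x for x in X if x < L]
    X1 = [x for x in X if L < x < R]
    X2 = [x for x in X if x > R]
    return L, R, X0, X1, X2
```

With distinct keys and L ≠ R, the three sets together contain every key except the two guards themselves.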
Part 2 Claim
Sampling result (figure: the input X, the sample Y, and the resulting subsets X0, X1, X2)
Probabilistic formulation • |X0| = n0, |X2| = n2 • P0(j): the probability that n0 = j • P2(j): the probability that n2 = j
Claim 1
That is … • P0(j): left guard = j • P2(j): right guard = N - j - 1 (figure: X0, left guard, X1, right guard, X2)
That is … |X| = n, |Y| = 2λ - 1
That is … • L = y_(2cλ - μ/2) • |X0| = j • The 2cλ - μ/2 - 1 elements of Y that precede L are selected from X0
That is … =
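The formula on these slides did not survive extraction. The counting argument they describe (choose the sample elements below L from the j keys smaller than L, then L itself, then the rest of the sample from the N - j - 1 larger keys) suggests the standard order-statistic form P0(j) = C(j, i - 1) · C(N - j - 1, s - i) / C(N, s), with s = 2λ - 1 and i = 2cλ - μ/2. The Python sketch below computes that reconstructed form and compares it with brute-force enumeration on a small instance; the names `p0`, `p0_brute`, `s`, and `i` are introduced here and are not from the slides.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def p0(j, N, s, i):
    """Probability that exactly j of N distinct keys lie below the i-th
    smallest element of a uniformly random s-subset (reconstructed form)."""
    return Fraction(comb(j, i - 1) * comb(N - j - 1, s - i), comb(N, s))

def p0_brute(j, N, s, i):
    """Same probability, by enumerating every s-subset of the keys 0..N-1."""
    hits = total = 0
    for subset in combinations(range(N), s):  # tuples come out sorted
        total += 1
        # Key value j has exactly j smaller keys among 0..N-1.
        hits += subset[i - 1] == j
    return Fraction(hits, total)
```

For example, with N = 8, s = 5, and i = 2, `p0(j, 8, 5, 2)` and `p0_brute(j, 8, 5, 2)` can be compared for every j from 0 to 7.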
Our goal • By establishing lower and upper bounds on the sizes of X0 and X2, we guarantee that the required statistic is located in set X1. • We prove that the probability of having such a set of size less than cn - f(n), for some function f(n) of n, is negligible
Proposition 1 • Let C ≤ N/2, c = C/N, λ ≥ 1, μ ≥ 6, 2λ - 1 < N/2, n ≥ N, 2cλ > μ/2, ρ > 1 • If f(N) = (Nμ)/(2λ) and μ = 2(3ρ · 2cλ · log n)^0.5 • Prove P( |X0| > cN + f(N) - 1 ) ≥ 1 - n^(1 - ρ)
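To get a feel for the magnitudes involved, the following Python sketch evaluates μ and f(N) for concrete inputs and checks the side conditions listed in the proposition. It assumes that log is the natural logarithm (the slide does not state the base) and that n = N when n is not given; the function name `proposition1_parameters` is hypothetical.

```python
import math

def proposition1_parameters(N, C, lam, rho, n=None):
    """Evaluate mu and f(N) from Proposition 1 and check its side conditions."""
    n = N if n is None else n          # assumption: n defaults to N
    c = C / N
    # mu = 2 * (3*rho * 2*c*lam * log n)^0.5, natural log assumed
    mu = 2 * math.sqrt(3 * rho * 2 * c * lam * math.log(n))
    f = N * mu / (2 * lam)
    conditions = {
        "C <= N/2": C <= N / 2,
        "lam >= 1": lam >= 1,
        "mu >= 6": mu >= 6,
        "2*lam - 1 < N/2": 2 * lam - 1 < N / 2,
        "n >= N": n >= N,
        "2*c*lam > mu/2": 2 * c * lam > mu / 2,
        "rho > 1": rho > 1,
    }
    return c, mu, f, conditions
```

Calling `proposition1_parameters(10**6, 5 * 10**5, 5000, 2.0)` gives one parameter choice for which every listed condition holds.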
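Proposition 1 • Def
Proposition 1 • Proof • It remains to prove …
Proposition 1 • First, we know that … • Next, we use a special method to compute an upper bound on the expression above
Proposition 1 • Observe the following equation … • From it we can see that … can be viewed as the probability that, when 2λ - 1 objects are drawn from N objects, the (2cλ - μ/2)-th of them is at most B (… Note 1)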
Note 1 • Concrete Mathematics, p. 169, (5.26) • l = N - 1, k = j, q = 0 • m = 2(1 - c)λ + μ/2 - 1 • n = 2cλ - μ/2 - 1
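Identity (5.26) in Concrete Mathematics states that the sum over 0 ≤ k ≤ l of C(l - k, m) · C(q + k, n) equals C(l + q + 1, m + n + 1), for integers l, m ≥ 0 and n ≥ q ≥ 0. With the substitution on this slide, m + n + 1 = 2λ - 1, so the right-hand side becomes C(N, 2λ - 1). The Python sketch below checks the identity numerically; the helper names `lhs` and `rhs` are made up for this example.

```python
from math import comb

def lhs(l, m, q, n):
    """Left-hand side of (5.26): sum of C(l-k, m) * C(q+k, n) over 0 <= k <= l."""
    return sum(comb(l - k, m) * comb(q + k, n) for k in range(l + 1))

def rhs(l, m, q, n):
    """Right-hand side of (5.26): C(l+q+1, m+n+1)."""
    return comb(l + q + 1, m + n + 1)
```

For instance, a sample of size 5 from N = 12 keys with the guard at sample rank 2 corresponds to l = 11, q = 0, m = 3, n = 1, and both sides should then equal C(12, 5).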
Proposition 1 • Let q = (2λ - 1)/N: the probability that each number is chosen when 2λ - 1 numbers are selected from the N numbers • Define … • P.S. • Take 2cλ - μ/2 - 1 numbers from the first j numbers • Take the (j+1)-th number, which must then be the (2cλ - μ/2)-th element of the sorted sample Y • Take 2(1 - c)λ + μ/2 - 1 numbers from the last N - j - 1 numbers
Lemma 1 • Let …
Lemma 1, proof • B. Bollobás. Random Graphs. Academic Press, New York, 1984 • d = N, c = 2λ - 1, a = q = (2λ - 1)/N, b = 1 - a
Lemma 2 • We use the generating-function method to prove the expression above • Treat the term above as the coefficient of x^(2cλ - μ/2 - 1) in the polynomial ((1 - q) + qx)^j
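The generating-function view can be checked numerically. The Python sketch below assumes that the term in question is the binomial term C(j, k) · q^k · (1 - q)^(j - k) with k = 2cλ - μ/2 - 1 (the formula itself is missing from the slide), expands ((1 - q) + qx)^j by repeated multiplication, and compares the coefficient of x^k with that term. The names `poly_power_coeffs` and `binomial_term` are hypothetical.

```python
from fractions import Fraction
from math import comb

def poly_power_coeffs(q, j):
    """Coefficients of ((1 - q) + q*x)**j, lowest degree first."""
    coeffs = [Fraction(1)]
    for _ in range(j):
        nxt = [Fraction(0)] * (len(coeffs) + 1)
        for d, a in enumerate(coeffs):
            nxt[d] += a * (1 - q)   # multiply by the constant term (1 - q)
            nxt[d + 1] += a * q     # multiply by q*x, raising the degree by one
        coeffs = nxt
    return coeffs

def binomial_term(q, j, k):
    """C(j, k) * q^k * (1 - q)^(j - k), the assumed form of the term."""
    return comb(j, k) * q**k * (1 - q) ** (j - k)
```

Using `Fraction` for q keeps the comparison exact, so the coefficient and the binomial term can be tested for equality rather than approximate agreement.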
Lemma 2, proof • The expression we want is then the coefficient of x^(2cλ - μ/2 - 1) in …
Lemma 2, proof • D. Angluin and L. G. Valiant. Fast probabilistic algorithms for Hamiltonian circuits and matchings. Journal of Computer and System Sciences, 18:155-193, 1979 • Set 2cλ - μ/2 = (1 + β)(B + 1)q • … Is β < 1? See Lemma 3 for a hint
Lemma 3 • Let f(N) = Nμ/(2λ) • Because c = C/N > 0, … can be omitted
Proposition 1 • (… Lemma 1) • (… Lemma 2) • (… Lemma 1) • (… Lemma 4) • (the last two factors can be dropped because both are less than 1) • Because 2λ - 1 < N/2, the product of the first two factors is < n
Lemma 3, review • Let f(N) = Nμ/(2λ) • Because c = C/N > 0, … can be omitted
Lemma 4
Lemma 4, proof • It remains to prove …
Claim 2 • If … then … • Proof: • By an inequality similar to that of Proposition 1, with f(n) = 0 and B = cn ⇒ …
Claim 3 • If … then … • Proof: • Symmetric to Claim 2
Claim 4 • If … then … • Proof: • By an inequality similar to that of Proposition 1
Claim 5 • If … then … • Proof: • By an inequality similar to that of Proposition 1 • Replace c with (1 - c) • The same lower bound holds for μ: …
Part 3 Algorithm
SELECT, lines 1 to 11:
begin SELECT (X, C)
let c = C/n, λ = …, such that …
let …
select randomly a sample … denote by …
if ( … ) then let …
else let …
let …
for i = 1 to n do
if ( … ) then
SELECT, lines 12 to 20:
let …
else if ( … ) then let …
else let …
if ( … ) then return SELECT_DET( … );
else reexecute procedure SELECT;
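Most expressions in the pseudocode were lost, so the Python sketch below shows only the overall control flow it suggests: sample, choose two guards, partition, finish deterministically if the target rank falls between the guards, and otherwise re-execute. Everything specific is an assumption made for illustration: the sample size (about n^(2/3)), the value of μ (taken from Proposition 1 with natural log), the rounding of guard ranks, and the use of sorting as a stand-in for SELECT_DET. It makes no attempt to achieve the n + C + o(n) comparison count.

```python
import math
import random

def randomized_select(X, C, rng=random, rho=2.0):
    """Return the C-th smallest (1-indexed) of the distinct keys in X."""
    n = len(X)
    if n < 64:
        return sorted(X)[C - 1]               # tiny inputs: just sort

    c = C / n
    lam = max(2, int(n ** (2 / 3)) // 2)      # hypothetical sample-size choice
    mu = 2 * math.sqrt(3 * rho * 2 * c * lam * math.log(n))

    while True:
        Y = sorted(rng.sample(X, 2 * lam - 1))
        lo = round(2 * c * lam - mu / 2)      # rank of the left guard in Y
        hi = round(2 * c * lam + mu / 2)      # rank of the right guard in Y
        L = Y[lo - 1] if lo >= 1 else None    # None means "no guard on this side"
        R = Y[hi - 1] if hi <= len(Y) else None

        n0 = sum(1 for x in X if x < L) if L is not None else 0
        middle = [x for x in X
                  if (L is None or x >= L) and (R is None or x <= R)]

        r = C - n0                            # target rank inside the middle set
        if 1 <= r <= len(middle):
            return sorted(middle)[r - 1]      # stand-in for SELECT_DET
        # Otherwise the guards missed the target: re-execute with a new sample.
```

Unlike X1 on the earlier slides, the middle set here includes the two guards themselves, which keeps the rank arithmetic simple.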
Step 1: Sampling • Original input: n numbers, X • A sample of 2λ+1 numbers, S
SELECT, lines 5 to 8: if ( … ) let … else let … let …
Partition S using m and M (figure labels: S, m, C, M, Rl, Rr)