Randomized QS. Ch5, CLRS Dr. M. Sakalli, Marmara University Picture 2006, RPI
Randomized Algorithm
• Randomize the algorithm so that it works well with high probability on all inputs.
• In this case, randomize the order in which candidates arrive.
• Universal hash functions: randomize the choice of the hash function to use.
Probabilistic Analysis and Randomized Algorithms
• Hiring problem
• n candidates are interviewed for a position.
• One interview each day; if the candidate is better qualified than the current employee, hire the new one and fire the current one.
• Best case: meet the most qualified candidate at the 1st interview.
• Procedure:
    for k = 1 to n
        pay $c_i$ interview cost
        if candidate(k) is better than the current best
            best = k    // sack the current one and hire candidate k
            pay $c_h$ hiring (and sacking) cost
• Worst case: candidates arrive in increasing order of quality, so every one is hired. The cost is $n c_i + n c_h$; since $c_h > c_i$, this is $O(n c_h)$.
• Best case, the least cost: $n c_i + c_h$ (n interviews, one hire). Prove that the hiring problem is $\Omega(n)$.
• Suppose the applicants arrive in a random order of qualification, with each of the $n!$ permutations of the ranks 1 through n equally likely.
• This is a uniform random permutation.
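The hiring procedure above can be sketched in C++ as follows. This is a minimal illustration, not the slides' own code: the names `HiringResult` and `hire_assistant` are hypothetical, and candidate qualities are assumed to be distinct positive scores given in interview order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: number of hires and total cost paid.
struct HiringResult {
    int hires;
    double cost;
};

// quality[k] is the score of the k-th candidate interviewed (assumed distinct).
// ci = cost of one interview, ch = cost of one hire (including the sacking).
HiringResult hire_assistant(const std::vector<int>& quality, double ci, double ch) {
    assert(ci >= 0.0 && ch >= 0.0);
    int hires = 0;
    bool have_employee = false;
    int best = 0;
    for (int q : quality) {
        // Every candidate is interviewed; only a strictly better one is hired.
        if (!have_employee || q > best) {
            best = q;
            have_employee = true;
            ++hires;
        }
    }
    double cost = static_cast<double>(quality.size()) * ci + hires * ch;
    return {hires, cost};
}
```

With candidates in increasing order of quality every one is hired (the worst case); with the best candidate first there is exactly one hire (the best case).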
Probabilistic Analysis and Randomized Algorithms
• An algorithm is randomized if its behavior is determined not only by its input but also by values produced by a random-number generator.
• rand(a, b) returns an integer between a and b, boundaries inclusive, with equally likely (equally probable) outputs $X = \{x_i\}$ for $i = 1, \dots, n$.
• Expected value: $E[X] = \sum_{i=1}^{n} x_i \Pr\{X = x_i\}$, that is, each value weighted by its probability.
• In the hiring case, define an indicator random variable $X_i = I\{\text{candidate } i \text{ is hired}\}$: 1 if the event occurs and 0 otherwise (for a coin, 1 for H and 0 for T).
Three fair coins are tossed (H and T): every head earns $3 and every tail loses $2. Expected value of the earnings:
• HHH = $9, probability 1/8
• HHT = $4, probability 3/8
• HTT = -$1, probability 3/8
• TTT = -$6, probability 1/8
• E[earnings] = 9/8 + 12/8 - 3/8 - 6/8 = 12/8 = 1.5
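The expected value above can be checked by enumerating all equally likely outcomes. The sketch below assumes fair coins and uses a hypothetical helper `expected_earnings`; each bit of `mask` stands for one coin (1 = head, 0 = tail).

```cpp
#include <cassert>

// Average earnings over all 2^coins equally likely outcomes of fair coins.
// head_gain is earned per head, tail_loss is lost per tail.
double expected_earnings(int coins, double head_gain, double tail_loss) {
    assert(coins >= 0 && coins < 20);  // keep the enumeration small
    const int outcomes = 1 << coins;
    double total = 0.0;
    for (int mask = 0; mask < outcomes; ++mask) {
        double earnings = 0.0;
        for (int c = 0; c < coins; ++c) {
            earnings += ((mask >> c) & 1) ? head_gain : -tail_loss;
        }
        total += earnings;
    }
    return total / outcomes;
}
```

Calling `expected_earnings(3, 3.0, 2.0)` reproduces the 1.5 computed on the slide.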
Probabilistic Analysis and Randomized Algorithms
• Lemma 1. Given a sample space S and an event e in S, let $X_e = I\{e\}$ be the indicator of the occurrence of e. Then $E[X_e] = \Pr\{e\}$.
• Proof: From the definition of expected value, $E[X_e] = 1 \cdot \Pr\{e\} + 0 \cdot \Pr\{\bar{e}\} = \Pr\{e\}$, where $\bar{e} = S - e$.
• In the binary case with a uniform distribution (a fair coin), $E[X_e] = \Pr\{e\} = 1/2$.
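• $X = \sum_{i=1}^{n} X_i$, where $X_i$ indicates that candidate i is hired.
• The expected value of the indicator is the probability that the candidate is hired: $E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\}$.
• Candidate i is hired if better than all of the previous i-1 candidates.
• In a uniform random order, each of the first i candidates is equally likely to be the best so far, so this probability is 1/i.
• Then $E[X_i] = \Pr\{X_i = 1\} = 1/i$.
• Expected value of hiring (the average number hired out of n candidates arriving in random rank order): $E[X]$. With a uniform distribution, candidate i is hired with probability 1/i.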
• $E[X] = E\left[\sum_{i=1}^{n} X_i\right]$, where each $X_i \in \{0, 1\}$ is an indicator random variable.
• $E[X] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} 1/i$, by linearity of expectation.
• $\sum_{i=1}^{n} 1/i = 1 + 1/2 + 1/3 + \dots = H_n$ is the harmonic number (the series diverges as n grows). Lower bound: $\sum_{i=1}^{n} 1/i \ge \int_{1}^{n+1} dx/x = \ln(n+1)$.
• Upper bound: $\sum_{i=2}^{n} 1/i = 1/2 + 1/3 + \dots \le \int_{1}^{n} dx/x = \ln n$.
• Hence $\sum_{i=1}^{n} 1/i \le \ln n + 1$.
• $E[X] = \ln n + O(1)$.
• This is the expected number of hirings; its upper bound is $\ln n + 1$.
• Lemma 5.2: When the candidates are presented in random order, the expected cost of hiring is $O(c_h \ln n)$. The proof follows from Lemma 5.1.
• How to randomize? Some procedures for generating permutations do not produce a uniform random permutation. Would it matter?
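The claim that the expected number of hires equals the harmonic number $H_n$ can be checked exactly for small n by averaging over all n! arrival orders. The following is a brute-force sketch; the helper names `harmonic` and `average_hires` are illustrative, and n is capped because the enumeration grows factorially.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// H_n = 1 + 1/2 + ... + 1/n.
double harmonic(int n) {
    double h = 0.0;
    for (int i = 1; i <= n; ++i) h += 1.0 / i;
    return h;
}

// Exact average number of hires over all n! permutations of ranks 1..n.
double average_hires(int n) {
    assert(n >= 1 && n <= 9);  // n! permutations: keep n small
    std::vector<int> rank(n);
    std::iota(rank.begin(), rank.end(), 1);  // sorted start: 1, 2, ..., n
    long long hires = 0;
    long long perms = 0;
    do {
        int best = 0;
        for (int r : rank) {
            if (r > best) {  // better than everyone seen so far: hire
                best = r;
                ++hires;
            }
        }
        ++perms;
    } while (std::next_permutation(rank.begin(), rank.end()));
    return static_cast<double>(hires) / perms;
}
```

For any n in range, `average_hires(n)` and `harmonic(n)` should agree up to floating-point rounding.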
In the case of a fair coin, Pr{heads} = 1/2, so for n tosses $E[X] = E\left[\sum_{j=1}^{n} X_j\right] = \sum_{j=1}^{n} 1/2 = n/2$.

Biased Coin
• Suppose you want to output 0 and 1, each with probability 1/2.
• You have a coin that outputs 1 with probability p and 0 with probability 1-p, for some unknown 0 < p < 1.
• Can you use this coin to output 0 and 1 fairly?
• What is the expected running time to produce the fair output, as a function of p?

• In what follows, k is the number of initial candidates who are interviewed and rejected before any hire is considered.
• Let $S_i$ be the event that we successfully hire the best-qualified candidate AND this candidate is the ith one interviewed.
• Let M(j) = the candidate in 1 through j with the highest score.
• What needs to happen for $S_i$ to occur?
• The best candidate is in position i: $B_i$
• No candidate in positions k+1 through i-1 is hired: $O_i$
• These two events are independent, so we can multiply their probabilities to get $\Pr\{S_i\}$.
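Characterizing the running time of a randomized algorithm.
• $E[T(n)] = \sum_{j} t_j \Pr\{T(n) = t_j\}$

    // RandGen and partition() are assumed to be defined elsewhere.
    // partition() is assumed to return a boundary bnd with srt <= bnd < end.
    void qksrt(vector<int>& dt, int srt, int end, RandGen& rand) {
        if (srt < end) {
            // Pick the pivot position uniformly at random in [srt, end].
            int bnd = partition(dt, srt, end, rand.RandInt(srt, end));
            qksrt(dt, srt, bnd, rand);
            qksrt(dt, bnd + 1, end, rand);
        }
    }

    void qksrt(vector<int>& data) {
        RandGen rand;
        qksrt(data, 0, static_cast<int>(data.size()) - 1, rand);
    }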
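The slide's code relies on a `RandGen` class and a `partition` routine that are not shown. A self-contained sketch of the same idea is given below, under two assumptions that are not from the slides: `std::mt19937` stands in for `RandGen`, and a Hoare-style partition is used because it matches the recursion on `[srt, bnd]` and `[bnd + 1, end]`.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hoare-style partition around a[pivot_index]. Returns j with lo <= j < hi
// such that every element of a[lo..j] is <= every element of a[j+1..hi].
int hoare_partition(std::vector<int>& a, int lo, int hi, int pivot_index) {
    assert(lo <= pivot_index && pivot_index <= hi);
    std::swap(a[lo], a[pivot_index]);  // move the chosen pivot to the front
    const int pivot = a[lo];
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { ++i; } while (a[i] < pivot);
        do { --j; } while (a[j] > pivot);
        if (i >= j) return j;
        std::swap(a[i], a[j]);
    }
}

void randomized_quicksort(std::vector<int>& a, int lo, int hi, std::mt19937& rng) {
    if (lo < hi) {
        // Pivot position chosen uniformly at random in [lo, hi].
        std::uniform_int_distribution<int> pick(lo, hi);
        int bnd = hoare_partition(a, lo, hi, pick(rng));
        randomized_quicksort(a, lo, bnd, rng);
        randomized_quicksort(a, bnd + 1, hi, rng);
    }
}

void randomized_quicksort(std::vector<int>& a, std::mt19937& rng) {
    randomized_quicksort(a, 0, static_cast<int>(a.size()) - 1, rng);
}
```

Passing the generator by reference keeps one random stream across all recursive calls, as the slide's version does with `rand`.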
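We are equally likely to get each possible split, since we choose the pivot at random.
• Express the expected running time as:
• $T(0) = 0$
• $T(n) = \sum_{k=0}^{n-1} \frac{1}{n}\left[T(k) + T(n-k-1) + n\right]$
• $T(n) = n + \frac{1}{n}\sum_{k=0}^{n-1} \left[T(k) + T(n-k-1)\right]$
• Bad split: the pivot position k falls in the 1st or 4th quarter of the subarray i..j, that is, $i \le k \le i + \frac{1}{4}(j-i+1)$ or $j - \frac{1}{4}(j-i+1) \le k \le j$.
• Good split: $i + \frac{1}{4}(j-i+1) \le k \le j - \frac{1}{4}(j-i+1)$.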
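The recurrence for the expected running time can be evaluated numerically to see how it grows. The sketch below tabulates T(n) from T(0) = 0 using T(n-k-1) for the second subproblem; the function name `expected_cost_table` is illustrative. It uses the symmetry of the sum, so that the two halves of each term contribute the same total.

```cpp
#include <cassert>
#include <vector>

// Tabulates T(0) = 0, T(n) = n + (1/n) * sum_{k=0}^{n-1} [T(k) + T(n-k-1)].
// As k runs over 0..n-1, so does n-k-1, hence the sum equals 2 * sum T(k).
std::vector<double> expected_cost_table(int max_n) {
    assert(max_n >= 0);
    std::vector<double> T(max_n + 1, 0.0);
    double prefix = 0.0;  // running sum T(0) + ... + T(n-1)
    for (int n = 1; n <= max_n; ++n) {
        prefix += T[n - 1];
        T[n] = n + 2.0 * prefix / n;
    }
    return T;
}
```

The first few values are T(1) = 1 and T(2) = 3, and the table grows roughly like 2n ln n, which is consistent with an O(n log n) bound.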
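Mixing good and bad splits, which occur with equal likelihood.
• $T(n) = n + \frac{1}{n}\sum_{k=0}^{n-1} \left[T(k) + T(n-k-1)\right]$
• By symmetry of the sum, $T(n) = n + \frac{2}{n}\sum_{k=n/2}^{n-1} \left[T(k) + T(n-k-1)\right] = n + \frac{2}{n}\left\{\sum_{k=n/2}^{3n/4} \left[T(k) + T(n-k-1)\right] + \sum_{k=3n/4}^{n-1} \left[T(k) + T(n-k-1)\right]\right\}$
• Bounding each term by the most unbalanced split in its range: $T(n) \le n + \frac{2}{n}\left\{\sum_{k=n/2}^{3n/4} \left[T(3n/4) + T(n/4)\right] + \sum_{k=3n/4}^{n-1} \left[T(n-1) + T(0)\right]\right\}$
• $= n + \frac{2}{n} \cdot \frac{n}{4}\left\{\left[T(3n/4) + T(n/4)\right] + T(n-1)\right\}$
• $= n + \frac{1}{2}\left\{\left[T(3n/4) + T(n/4)\right] + T(n-1)\right\}$
• Prove that $T(n) \le c\,n \log n$ for all n, where T(n) satisfies the inequality obtained above. The proof is by induction: base case n = 0, then the inductive step for general n.
• Substituting the inductive hypothesis: $T(n) \le n + \frac{1}{2}\left[c\,\frac{3n}{4}\log\frac{3n}{4} + c\,\frac{n}{4}\log\frac{n}{4}\right] + \frac{1}{2}c(n-1)\log(n-1)$
• The probability of many bad splits is very small. With high probability the list is divided into fractional pieces, which is enough balance to get an asymptotic n log n running time.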
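The inductive step can be completed as follows. This is a worked sketch under two assumptions not stated on the slide: logarithms are base 2, and $(n-1)\log(n-1) \le n\log n$ is used for $n \ge 2$.

```latex
\begin{align*}
T(n) &\le n + \tfrac{1}{2}\left[c\,\tfrac{3n}{4}\log\tfrac{3n}{4}
        + c\,\tfrac{n}{4}\log\tfrac{n}{4}\right]
        + \tfrac{1}{2}\,c\,(n-1)\log(n-1) \\
     &\le n + \tfrac{c}{2}\left[n\log n
        - \left(\tfrac{3}{4}\log\tfrac{4}{3} + \tfrac{1}{4}\log 4\right) n\right]
        + \tfrac{c}{2}\,n\log n \\
     &= c\,n\log n + n - \tfrac{c}{2}\,\alpha\,n,
        \qquad \alpha = \tfrac{3}{4}\log_2\tfrac{4}{3} + \tfrac{1}{2} \approx 0.811 \\
     &\le c\,n\log n \qquad \text{whenever } c \ge 2/\alpha \approx 2.47 .
\end{align*}
```

So any constant $c \ge 3$ closes the induction under these assumptions.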
MIT notes
• $L(n) = 2U(n/2) + \Theta(n)$ (lucky)
• $U(n) = L(n-1) + \Theta(n)$ (unlucky)
• Substituting the second into the first: $L(n) = 2L(n/2 - 1) + 2\Theta(n/2) + \Theta(n)$
• $L(n) = 2L(n/2 - 1) + \Theta(n)$
• $L(n) = \Theta(n \log n)$
• There is more in those notes.
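Computing S
• $\Pr\{B_i\} = 1/n$
• $\Pr\{O_i\} = k/(i-1)$
• $\Pr\{S_i\} = k/(n(i-1))$
• $\Pr\{S\} = \sum_{i>k} \Pr\{S_i\} = \frac{k}{n}\sum_{i=k+1}^{n} \frac{1}{i-1}$ is the probability of success.
• This equals $\frac{k}{n}(H_{n-1} - H_{k-1})$, which is roughly $\frac{k}{n}(\ln n - \ln k)$.
• Maximized when k = n/e.
• Leads to a probability of success of 1/e.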
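The success probability above can be evaluated directly to locate the best cutoff k for a given n. The sketch below assumes the strategy of rejecting the first k candidates and then hiring the first one who beats everyone seen so far; `success_probability` and `best_cutoff` are illustrative names. The case k = 0 is handled separately, since the formula divides by i-1.

```cpp
#include <cassert>

// Pr{S} = (k/n) * sum_{i=k+1}^{n} 1/(i-1) for a cutoff of k rejected candidates.
double success_probability(int n, int k) {
    assert(n >= 1 && k >= 0 && k < n);
    if (k == 0) return 1.0 / n;  // no rejection phase: the first candidate is hired
    double sum = 0.0;
    for (int i = k + 1; i <= n; ++i) sum += 1.0 / (i - 1);
    return static_cast<double>(k) / n * sum;
}

// Cutoff k in [0, n-1] that maximizes the success probability (first maximizer).
int best_cutoff(int n) {
    int best_k = 0;
    double best_p = success_probability(n, 0);
    for (int k = 1; k < n; ++k) {
        double p = success_probability(n, k);
        if (p > best_p) {
            best_p = p;
            best_k = k;
        }
    }
    return best_k;
}
```

For moderately large n the returned cutoff should be close to n/e and the corresponding probability close to 1/e, in line with the slide.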