**Study Group: Randomized Algorithms** Jun 7, 2003 Jun 14, 2003

**Randomized Algorithms** • A randomized algorithm is an algorithm that has access to a source of independent, unbiased random bits and may use these bits to influence its computation. [Figure: an Algorithm box that takes Input and Random bits and produces Output.]

**Monte Carlo and Las Vegas** There are two kinds of randomized algorithms: • Monte Carlo: A Monte Carlo algorithm runs for a fixed number of steps for each input and produces an answer that is correct with a bounded probability • Las Vegas: A Las Vegas algorithm always produces the correct answer, but its runtime for each input is a random variable whose expectation is bounded.

**Question** • Is the max-cut algorithm that we discussed previously a Monte Carlo or Las Vegas algorithm? • We will see two other examples today.

**Randomized Quick Sort** • In traditional Quick Sort, we always pick the first element as the pivot for partitioning. • The worst-case runtime is $O(n^2)$, while the expected runtime, averaged over the set of all inputs, is $O(n \log n)$. • Therefore, some inputs always have a long runtime, e.g., a reverse-sorted list.

**Randomized Quick Sort** • In randomized Quick Sort, we pick the pivot for partitioning uniformly at random. • The expected runtime on any input is $O(n \log n)$.
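A minimal Python sketch of the random-pivot idea follows. The function name `randomized_quicksort` and the `rng` parameter are illustrative assumptions, and the three-way split into smaller, equal, and larger elements is one simple way to handle duplicates rather than the in-place partitioning a production sort would use.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, choosing every pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    # The only randomized step: pick the pivot position uniformly at random.
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return (randomized_quicksort(smaller, rng)
            + equal
            + randomized_quicksort(larger, rng))
```

The output is always correctly sorted; only the running time depends on the random choices, so a reverse-sorted input is no longer a guaranteed bad case.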

**Analysis of Randomized QS** • Let $s_{(i)}$ be the $i$th smallest element in the input list $S$. • $X_{ij}$ is a random variable such that $X_{ij} = 1$ if $s_{(i)}$ is compared with $s_{(j)}$; $X_{ij} = 0$ otherwise. • The expected runtime $t$ of randomized QS, measured in comparisons, is: $t = E\left[\sum_{i=1}^{n}\sum_{j>i} X_{ij}\right] = \sum_{i=1}^{n}\sum_{j>i} E[X_{ij}]$ • $E[X_{ij}]$ is the expected value of $X_{ij}$ over the set of all random choices of the pivots, which is equal to the probability $p_{ij}$ that $s_{(i)}$ will be compared with $s_{(j)}$.

**Analysis of Randomized QS** • We can represent the whole sorting process by a binary tree $T$: [Figure: example binary tree T whose nodes are the pivots 5, 2, 7, 4, 1, each labeled with the order in which it was chosen as pivot.] • Notice that $s_{(i)}$ will be compared with $s_{(j)}$, where $i<j$, if and only if $s_{(i)}$ or $s_{(j)}$ is the first one among the set $\{s_{(i)}, s_{(i+1)}, \ldots, s_{(j)}\}$ to be selected as the pivot. • Note that $p_{ij} = 2/(j-i+1)$. Why?

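**Analysis of Randomized QS** • Therefore, the expected runtime $t$: $t = \sum_{i=1}^{n}\sum_{j>i} \frac{2}{j-i+1} \le \sum_{i=1}^{n}\sum_{k=1}^{n} \frac{2}{k} = 2nH_n = O(n \log n)$ • Note that $H_n = \sum_{k=1}^{n} \frac{1}{k} = \ln n + \Theta(1)$. • Randomized QS is a Las Vegas algorithm.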
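As a sanity check on the analysis, the sketch below sums $p_{ij} = 2/(j-i+1)$ over all pairs $i<j$ with exact rational arithmetic and compares the result with $2nH_n$, where $H_n$ is the $n$th harmonic number. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def expected_comparisons(n):
    """Exact value of the sum over all pairs i < j of 2 / (j - i + 1)."""
    return sum(Fraction(2, j - i + 1)
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))


def harmonic(n):
    """Exact n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 500):
        exact = expected_comparisons(n)
        bound = 2 * n * harmonic(n)
        print(n, float(exact), float(bound))
```

For every $n$ the exact sum stays below $2nH_n$, and grouping the pairs by their distance $j-i$ shows that it equals $2(n+1)H_n - 4n$.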

**Randomized Min-cut** • Given an undirected, connected multigraph $G(V,E)$, we want to find a cut $(V_1,V_2)$ such that the number of edges between $V_1$ and $V_2$ is minimum. • This problem can be solved optimally by applying a max-flow min-cut algorithm $O(n^2)$ times, once for each pair of source and destination.

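**Randomized Min-cut** • In randomized Min-cut, we repeat the following until there are only 2 vertices left: • Pick an edge $e(u,v)$ uniformly at random. Merge $u$ and $v$ into a single vertex and remove all the edges between $u$ and $v$; edges to other vertices are kept, including parallel ones. [Figure: example graph on vertices x, y, z, u, v before and after merging u and v into the single vertex "u,v".] • We report the cut between the 2 remaining vertices as the min-cut.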
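The contraction procedure can be sketched in Python as follows. The edge-list representation, the vertex labels `0..n-1`, and the names `contract_once` and `min_cut` are assumptions made for this illustration; the input graph is assumed to be connected, and the simple relabeling loop favors clarity over speed.

```python
import random


def contract_once(n, edges, rng=random):
    """One run of random contraction on a connected multigraph.

    n: number of vertices, labeled 0..n-1.
    edges: list of (u, v) pairs; parallel edges are allowed.
    Returns the number of edges crossing the cut that the run ends with.
    """
    label = list(range(n))                      # label[v] = merged vertex holding v
    edges = [(a, b) for (a, b) in edges if a != b]
    remaining = n
    while remaining > 2:
        u, v = edges[rng.randrange(len(edges))]  # uniformly random surviving edge
        keep, gone = label[u], label[v]
        # Merge the two endpoints into one vertex.
        label = [keep if lab == gone else lab for lab in label]
        # Remove every edge that now joins the merged vertex to itself.
        edges = [(a, b) for (a, b) in edges if label[a] != label[b]]
        remaining -= 1
    return len(edges)


def min_cut(n, edges, trials, rng=random):
    """Smallest cut size seen over several independent contraction runs."""
    return min(contract_once(n, edges, rng) for _ in range(trials))
```

A single run may return a cut that is larger than the minimum, which is why the result is only meaningful after taking the best of several independent runs.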

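**Analysis of Randomized Min-cut** • Let $k$ be the size of the min-cut of the given graph $G(V,E)$, where $|V|=n$, and fix one min-cut with $k$ edges. • Then $|E| \ge kn/2$, since every vertex has degree at least $k$. • The probability $q_1$ of picking one of those $k$ edges in the first merging step is at most $k/(kn/2) = 2/n$. • The probability $p_1$ of not picking any of those $k$ edges in the first merging step is at least $1-2/n$. • Repeat the same argument for each of the $n-2$ merging steps: when $i$ vertices are left, the probability of picking one of those $k$ edges is at most $2/i$. • The probability $p$ of not picking any of those $k$ edges in all the merging steps is at least $(1-2/n)(1-2/(n-1))(1-2/(n-2))\cdots(1-2/3)$.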

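**Analysis of Randomized Min-cut** • Therefore, the probability of finding the min-cut: $p \ge \prod_{i=3}^{n}\left(1-\frac{2}{i}\right) = \frac{2}{n(n-1)} \ge \frac{2}{n^2}$ • If we repeat the whole procedure $n^2/2$ times, the probability of not finding the min-cut is at most $\left(1-\frac{2}{n^2}\right)^{n^2/2} < \frac{1}{e}$ • Randomized Min-cut is a Monte Carlo algorithm.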
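The product $(1-2/n)(1-2/(n-1))\cdots(1-2/3)$ telescopes to $2/(n(n-1))$, and $n^2/2$ independent runs push the failure probability below $1/e$. The sketch below checks both facts numerically; the function names are hypothetical.

```python
import math
from fractions import Fraction


def survival_lower_bound(n):
    """Exact value of (1 - 2/n)(1 - 2/(n-1)) ... (1 - 2/3)."""
    p = Fraction(1)
    for i in range(3, n + 1):
        p *= 1 - Fraction(2, i)
    return p


def failure_after_repeats(n):
    """Bound (1 - 2/n^2)^(n^2/2) on missing the min-cut in n^2/2 runs."""
    return (1 - 2 / n**2) ** (n**2 / 2)


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, survival_lower_bound(n), failure_after_repeats(n), 1 / math.e)
```

Each factor $1-2/i$ equals $(i-2)/i$, so all numerators and denominators cancel except $1 \cdot 2$ on top and $(n-1) \cdot n$ at the bottom.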

**Question** • What will happen if we apply a similar approach to find the max-cut instead? Will it be better or worse than the previous method of random assignment?

**Complexity Classes** • There are some interesting complexity classes involving randomized algorithms: • Randomized Polynomial time (RP) • Zero-error Probabilistic Polynomial time (ZPP) • Probabilistic Polynomial time (PP) • Bounded-error Probabilistic Polynomial time (BPP)

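**RP** • Definition: The class RP consists of all languages $L$ that have a randomized algorithm $A$ running in worst-case polynomial time such that for any input $x$ in $\Sigma^*$: • $x \in L \Rightarrow \Pr[A \text{ accepts } x] \ge \frac{1}{2}$ • $x \notin L \Rightarrow \Pr[A \text{ accepts } x] = 0$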

**RP** • Independent repetitions of the algorithm can be used to make the probability of error exponentially small. • Notice that the success probability can be changed to an inverse polynomial function of the input size without affecting the definition of RP. Why?
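The repetition idea for one-sided error can be sketched as below: accept if any of $k$ independent runs accepts. Since an RP algorithm never accepts an input outside $L$, this keeps that side error-free, and if each run accepts an input in $L$ with probability at least $1/2$, the chance that all $k$ runs reject is at most $2^{-k}$. The toy language (even integers) and the helper names are purely hypothetical stand-ins for a real RP algorithm.

```python
import random


def amplify_one_sided(run_once, x, k):
    """Accept x if at least one of k independent runs accepts."""
    return any(run_once(x) for _ in range(k))


def make_toy_rp(rng):
    """Hypothetical one-sided-error algorithm for the toy language of even integers."""
    def run_once(x):
        # Never accepts an odd x; accepts an even x with probability 1/2.
        return x % 2 == 0 and rng.random() < 0.5
    return run_once


if __name__ == "__main__":
    run = make_toy_rp(random.Random())
    print(amplify_one_sided(run, 8, 20), amplify_one_sided(run, 7, 20))
```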

**ZPP** • Definition: The class ZPP is the class of languages which have Las Vegas algorithms running in expected polynomial time. • ZPP = RP ∩ co-RP. Why? • (Note that a language $L$ is in co-X, where X is a complexity class, if and only if its complement $\Sigma^* - L$ is in X.)
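One direction of ZPP = RP ∩ co-RP can be illustrated with a small sketch: given a one-sided-error algorithm for $L$ and another for its complement, alternate them until one accepts. An acceptance is never wrong, so the answer is always correct, and if each round succeeds with probability at least $1/2$ the expected number of rounds is at most 2. The toy language and all names below are hypothetical.

```python
import random


def las_vegas_decide(accepts_L, accepts_coL, x):
    """Alternate the two one-sided tests until one of them accepts."""
    while True:
        if accepts_L(x):      # an acceptance here proves x is in L
            return True
        if accepts_coL(x):    # an acceptance here proves x is not in L
            return False


def make_toy_pair(rng):
    """Hypothetical one-sided tests for 'x is even' and for its complement."""
    def accepts_even(x):
        return x % 2 == 0 and rng.random() < 0.5

    def accepts_odd(x):
        return x % 2 == 1 and rng.random() < 0.5

    return accepts_even, accepts_odd
```

The loop has no fixed bound on its number of rounds, which is exactly the Las Vegas trade-off: the output is always correct and only the running time is random.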

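**PP** • Definition: The class PP consists of all languages $L$ that have a randomized algorithm $A$ running in worst-case polynomial time such that for any input $x$ in $\Sigma^*$: • $x \in L \Rightarrow \Pr[A \text{ accepts } x] > \frac{1}{2}$ • $x \notin L \Rightarrow \Pr[A \text{ accepts } x] < \frac{1}{2}$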

**PP** • To reduce the error probability, we can repeat the algorithm several times on the same input and output the answer that occurs in the majority of those trials. • However, the definition of PP is quite weak, since we have no bound on how far from ½ the probabilities are. A small (e.g., polynomial) number of repetitions may not be enough to obtain a significantly smaller error probability.

**Question** • Consider a randomized algorithm with 2-sided error as in the definition of PP. Show that a polynomial number of independent repetitions of this algorithm need not suffice to reduce the error probability to ¼. (Hint: Consider the case where the error probability is $\frac{1}{2} - \frac{1}{2^n}$.)

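**BPP** • Definition: The class BPP consists of all languages $L$ that have a randomized algorithm $A$ running in worst-case polynomial time such that for any input $x$ in $\Sigma^*$: • $x \in L \Rightarrow \Pr[A \text{ accepts } x] \ge \frac{3}{4}$ • $x \notin L \Rightarrow \Pr[A \text{ accepts } x] \le \frac{1}{4}$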

**BPP** • For this class of algorithms, the error probability can be reduced to $1/2^n$ with only a polynomial number of iterations. • In fact, the probability bounds ¾ and ¼ can be changed to $\frac{1}{2} + \frac{1}{p(n)}$ and $\frac{1}{2} - \frac{1}{p(n)}$ respectively, where $p(n)$ is a polynomial function of the input size $n$, without affecting the definition of BPP. Why?
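Error reduction for two-sided error works by majority vote over independent runs. The sketch below assumes a hypothetical toy algorithm that answers correctly with probability ¾ on every input; the names `majority_vote` and `make_toy_bpp` are illustrative only.

```python
import random


def majority_vote(run_once, x, k):
    """Run a two-sided-error algorithm k times and return the majority answer."""
    accepts = sum(1 for _ in range(k) if run_once(x))
    return 2 * accepts > k


def make_toy_bpp(rng):
    """Hypothetical algorithm for 'x is even' that is correct with probability 3/4."""
    def run_once(x):
        truth = x % 2 == 0
        # With probability 1/4 the answer is flipped.
        return truth if rng.random() < 0.75 else not truth
    return run_once


if __name__ == "__main__":
    run = make_toy_bpp(random.Random())
    print(majority_vote(run, 8, 301), majority_vote(run, 7, 301))
```

Using an odd number of runs avoids ties. Because each run is correct with probability bounded away from ½, a Chernoff-type bound makes the probability that the majority is wrong shrink exponentially in $k$.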