1 / 22

A threshold of ln(n) for approximating set cover

A threshold of ln(n) for approximating set cover. By Uriel Feige Lecturer: Ariel Procaccia. Set Cover. We are given an (unweighted!) collection F of subsets of U={1,…,n}. We must find as few as possible subsets from F such that their union covers U.

braima
Download Presentation

A threshold of ln(n) for approximating set cover

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A threshold of ln(n) for approximating set cover By Uriel Feige Lecturer: Ariel Procaccia

  2. Set Cover • We are given an (unweighted!) collection F of subsets of U={1,…,n}. • We must find as few as possible subsets from F such that their union covers U. • Hypergraph vertex cover (lecture of 10/3) is a special case of set cover.

  3. Approximation algorithms • A greedy algorithm: • Recall algorithms 1… • At each stage, add to the cover the subset which maximizes the number of new elements. • This algorithm gives an approximation of ln(n)-lnln(n)+O(1). • An approximation of ln(n) can also be achieved using a linear programming relaxation.

  4. A bad example for the greedy algorithm U={(x,y): 1 ≤ x,y ≤ n} S1 = {(x,y): x ≤ n/2} S2 = {(x,y): x ≥ n/2} T1= {(x,y): 1 ≤ y ≤ n/2} T2= {(x,y): n/2 ≤ y ≤ 3n/4} … 32 16 8 8 Ratio: logn/2

  5. Overview (1) • We wish to prove Theorem 8: If a poly time algorithm can approximate set cover within (1-ε)ln(n), then NP is a subset of DTIME(nO(loglog(n))). • The strategy: a reduction from a k-prover proof system for an NP-complete problem. Familiar with IP?

  6. Overview (2) • Parts of the lecture: • MAX 3SAT-5. • The k-prover proof system. • Partition systems. • The reduction to set cover. • Thundering applause. • May the force be with us. Similar to the lecture of 17/3.

  7. MAX 3SAT-5 • Input: A CNF formula with n variables and 5n/3 clauses, in which every clause contains exactly three literals, every variable appears in exactly 5 clauses, and a variable does not appear in a clause more than once. • Theorem 1: For some >0, it is NP-hard to distinguish between 3CNF-5 formulas where OPT=1 or OPT≤(1-). Reduction from MAX-3SAT-B

  8. Two prover proof system for 3SAT-5 • Presented as to provide intuition and help in the analysis of the k-prover system. • One round, two prover system for 3SAT-5. • Protocol: • Vselects an index of a clause, sends it to the first prover, selects a random var in the clause, and sends it to the second prover. • First prover returns 3 bits, second returns 1 bit. • V accepts if the following conditions hold: • Clause check: the assignment sent by the first prover satisfies the clause. • Consistency check: the assignment sent by the second prover is identical to the assignment for the same variable sent by the first prover. Assgn. To clause and var

  9. Proposition 2: If OPT=1-ε, then under the optimal strategy of the provers, V accepts with probability (1-ε/3). • Proof: • The strategy of the second prover defines an assignment A to the variables. • If V selects a clause that is not satisfied by A (prob. ε), the first prover must set one of the variables differently from A. • The consistency check fails with prob. ≥ 1/3.■ There is a strategy with acceptance prob ε/3

  10. Parallel repetition • We would like to lower the error. • Modified proof system: V sends to the first prover l clauses, from each chooses a var, and sends these l vars to the second prover. • Theorem 3 (Ran Raz): If a one round two prover system is repeated l times independently in parallel, then the error is 2-cl, where c>0 is a constant that depends only on the original proof system. • The error of the modified two prover system is ≤ 2-cl, for some universal constant c. C for subtle

  11. The k-prover proof system P1: 0011 P2: 0101 P3: 1100 • Binary code with k codewords, each of length l and weight l/2, with Hamming distance at least l/3. • Each prover is associated with a codeword. • The Protocol: • The verifier selects l clauses C1,…,Cl, then selects a var from each clause: the distinguished variables x1,…,xl. • Prover Pi receives Cj for those coordinates in its codeword that are 1, and xj for the coordinates that are 0, and replies with 2l bits. • The answer of the prover induces an assignment to the distinguished variables. • Acceptance predicate: • Weak: at least one pair of provers is consistent. • Strong: every pair of provers is consistent. c1 c2 v3 v4 100,010,1,0 V

  12. Lemma 4: If OPT=1 then the provers have a strategy that causes V to always strongly accept. If OPT≤(1-ε), then the verifier weakly accepts with probability at most k22-cl, where c>0 is a constant that depends only on ε. • Proof: • If OPT=1, the provers can base their answers on a satisfying assignment. • Assume OPT=(1- ε), and that V weakly accepts with prob. ≥ δ. Then with respect to Pi and Pj, V accepts with probability δ/k2. • There are ≥ l/6 coordinates on which Pi receives a clause, and Pj a var in this clause. Other coordinates: for free. •  The provers have a strategy that succeeds with prob. ≥ δ/k2 on l/6 parallel repetitions of the original proof system. • δ/k2<2-cl. ■

  13. Partition systems • A partition system B(m,L,k,d) has the following properties. • There exists a ground set B of m distinct points. • There is a collection of L distinct partitions p1,…,pL. • For 1≤i≤L, partition pi is a collection of k disjoint subsets of B whose union is B. • Any cover of the m points by subsets that appear in pairwise different partitions requires at least d subsets. • Lemma 6: For every c≥0 and m sufficiently large, there is a partition system B(m,L,k,d) whose parameters satisfy the following inequalities: • L ≈ (log(m))c. • K can be chosen arbitrarily as long as k < ln(m)/3ln(ln(m)). • d = (1-f(k))k∙ln(m), where f(k)→∞ as k→∞. • Proof: a randomized construction usually works. k = 4 Looks familiar? L=4

  14. |U|? The reduction m points, k subsets r=1 L=2l • R=(5n)l : num of r. • r↔Br(m,L,k,d): m=nΘ(l), L=2l, d=(1-f(k))k∙ln(m). • Partition↔dist. vars, subset↔prover. B(r,j,i) = i’th subset, partition j, partition system r. • Subsets are S(q,a,i): all r s.t. (q,i)@r, extract from a an assignment ar to dist. vars. S(q,a,i) is the union of all subsets B(r,ar,i), for all r s.t. (q,i)@r. • Q=nl/2(5n/3)l/2 (questions to Pi). r=2 00 01 10 11 S(q,a,1) B(2,4,3) r=R

  15. ar=01 • Lemma 5: • Completeness: OPT=1  there is a set cover of size kQ. • Soundness: OPT=(1-ε)  more than (1-2f(k))kQln(m) subsets are required. • Proof of Completeness: • IfOPT=1, the provers answer consistently with the satisfying assignment. • For any r, consider S(q1,a1,r),…,S(qk,ak,r) s.t. (qi,i)@r, and ai is the appropriate answer. • Br(m,L,k,d) is covered by these k sets. • Similar for every r. The number of subsets is kQ. ■

  16. Proof of soundness: the saga begins • Assume OPT≤1-ε, and that there exists C that covers U, such that |C|=(1-δ)kQln(m), δ=2f(k). • q to prover Pi↔ weight w(q,i)=number of answers a s.t. S(q,a,i) is in C. • Σq,iw(q,i) = |C|. • r↔w(r)= Σ(q,i)@rw(q,i). • This weight = number of subsets that participate in covering Br(m,L,k,d). • Call r good if w(r)<(1-δ/2)k∙ln(m).

  17. The good, the bad, and the random • Proposition 6: The fraction of good r is at least δ/2. • Proof: • Assume otherwise, then: • On the other hand, • Hence |C|>(1- δ)kQln(m) – contradiction. ■ r is good if: w(r)= Σ(q,i)@rw(q,i)<(1-δ/2)k∙ln(m)

  18. Fix coin tosses for provers • Proposition 7: C covers U, |C|=(1-δ)kQln(m)  for some strategy for the k provers, V weakly accepts with prob. ≥ 2δ/(k∙ln(m))2. • This proves soundness, since 2δ/(k∙ln(m))2>k22-cl, for l=Θ(loglogn). • Proof of proposition 7: • Randomized strategy: on q to Pi, select a from the set of a s.t. S(q,a,i) is in C. • For a fixed r: sets B(r,p,i) in the cover of Br(m,L,k,d) ↔ sets S(q,a,i) in C. • For a good r: C used two subsets from p in the cover of Br(m,L,k,d). • Denote B(r,p,i) and B(r,p,j) ↔ S(qi,ai,i) and S(qj,aj,j). • Denote: a is in Ar,i iff S(qi,a,i) is in C. • r is good  |Ar,i|+|Ar,j|<k∙ln(m). • The prob. that the provers answer ai and aj≥ 4/(k∙ln(m))2. Answers are consistent with p  V weakly accepts. • The prob. that V chooses a good r ≥ δ/2. ■

  19. Theorem 8: If there is some ε>0 such that a polynomial time algorithm can approximate set cover within (1-ε)ln(n), then NP is a subset of DTIME(nO(loglog(n))). • Proof: • Assume there is a poly time algorithm A that approximates set cover within (1-ε)ln(m). • Reduce GAP-3SAT-5 to set cover as described, with k s.t. f(k)<ε/4, and m=(5n)2l/ε. • m,R,Q=nO(loglogn)  time to perform the reduction is nO(loglogn). • ln(m)>(1-ε/2)ln(N), where N=mR. • By lemma 5, for a YES instance all points can be covered by kQ subsets, and for a NO instance all points cannot be covered by (1-2f(k))kQln(m). • The ratio is (1-2f(k))ln(m)>(1-ε)lnN. ■

  20. Closing remarks (1) • Max k-cover: • Given U and F as before, select k subsets that such that their union has maximum cardinality. • The obvious greedy algorithm approximates max k-cover within a ratio of at least 1-1/e ≈ 0.632. • Theorem 8 can be directly used to prove: if max k-cover can be approximated in a polynomial time within a ration of (1 - 1/e + ε) for some ε>0, then NP is a subset of DTIME(nO(loglogn)).

  21. Closing remarks (2) • Refinements: • In our hardness of approximation result for set cover, ε is a constant. We may strengthen our assumption so as to improve our result. • ZTIME=class of languages that have a probabilistic algorithm that runs in expected time t (with zero error). • If for some η>0, NP is not a subset of ZTIME(2n^η), then for some constant c’>0 there is no polynomial time algorithm that approximates set cover within ln(n) –c’(lnln(n))2.

  22. Questions and cool animations ?

More Related