Pseudorandomness for Approximate Counting and Sampling

Ronen Shaltiel (University of Haifa) and Chris Umans (Caltech)

What is this talk about? Main technical result: • We define and construct “pseudorandom objects” associated with: • Approximate counting of accepting instances of a given circuit. • Random sampling of accepting instances of a given circuit. • But in fact it all relates to derandomization and it’s a long story: • Once upon a time there was an evil magician called Merlin and a handsome prince called Arthur. One day as Arthur was tossing coins he came about a beautiful NP statement…

This talk is about derandomization • Derandomization of procedures that use both randomness and nondeterminism. • Arthur-Merlin games (by derandomization we mean AM=NP). • Approximate counting of accepting instances. • random sampling of accepting instances • Goal: Get rid of randomness (we don’t expect to get rid of nondeterminism). • Under what assumptions? • We derandomize some randomized procedures using assumptions that seem weaker than those we are “supposed to use”.

Approximate counting and sampling of accepting instances • Two computational tasks used frequently in complexity theory: • Approximate counting: given a circuit C on n bits, output an approximation of |C^{-1}(1)|. • Random sampling: given a circuit C on n bits, output a random x in C^{-1}(1). • Both are solvable using randomness and nondeterminism [Sto,JVV,BGP]. • What do we mean by derandomizing a sampling procedure? [Figure: the objects of interest (the inputs C recognizes) drawn as a subset of {0,1}^n.]

Derandomization: Hardness versus Randomness Initiated by [BM,Yao]. Assumption: hard functions exist. Conclusion: Derandomization. A lot of works: [BM82,Y82,HILL,NW88,BFNW93, I95,IW97,IW98,KvM99,STV99,ISW99,MV99, ISW00,SU01,U02,TV02,KI03,GST03]

Pseudo-Random Generators • Use a short “seed” of very few truly random bits to generate a long string of pseudo-random bits. • Pseudo-randomness: no efficient algorithm can distinguish truly random bits from pseudo-random bits. • Nisan-Wigderson setting: the test A can’t run the PRG (i.e., for tests that run in time n^3 the PRG is allowed to run in time n^5). [Figure: an algorithm A taking its input and many truly random bits, next to the same A taking its input and the many pseudo-random bits that the PRG produces from a few truly random seed bits.]
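
The way a PRG is used for derandomization can be sketched as follows: run the randomized algorithm on the generator's output for every seed and take a majority vote. This is a minimal sketch, not the construction from the talk. The names `toy_prg` and `derandomize` are hypothetical, and `toy_prg` is only a placeholder bit-stretcher with no pseudorandomness guarantee.

```python
from itertools import product

def toy_prg(seed, out_len):
    """Hypothetical stand-in generator: it stretches the seed but is NOT pseudorandom."""
    bits = list(seed)
    while len(bits) < out_len:
        # Arbitrary illustrative rule: XOR the last bit with the bit len(seed) back.
        bits.append(bits[-1] ^ bits[-len(seed)])
    return tuple(bits[:out_len])

def derandomize(algorithm, x, seed_len, out_len, prg=toy_prg):
    """Run algorithm(x, r) with r = prg(seed) for all 2^seed_len seeds; return the majority answer."""
    votes = sum(bool(algorithm(x, prg(seed, out_len)))
                for seed in product((0, 1), repeat=seed_len))
    return 2 * votes > 2 ** seed_len
```

With a seed of length O(log n) the loop over all seeds has polynomially many iterations, which is why a short seed is the quantity that matters.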

Hardness versus Randomness Assumption: hard functions exist. Exists pseudo-random generator Conclusion: Derandomization.

The meta-argument • Assume (for contradiction) that there is an A that is not fooled by the PRG. • The proof takes a distinguishing A and uses it to construct a circuit/algorithm for the supposedly hard function: a contradiction. • The hardness assumption is against procedures at least as complex as A. • Meta-Argument: We can’t derandomize the probabilistic version of a complexity class C without a lower bound against C. [Figure: hard function → PRG → derandomization; a test A that distinguishes the PRG output is turned into an algorithm for the hard function.]

A brief survey: Achieving the meta-argument • Meta-Argument: We can’t derandomize the probabilistic version of a complexity class C without a lower bound against C. • Actually, we usually require a lower bound of size 2^{Ω(n)} against the nonuniform version of C [KvM99]. • Assumption: There is a function in E=DTIME(2^{O(n)}) that cannot be computed by size 2^{Ω(n)} circuits of a certain type.

Different types of nondeterminism • P^NP: poly-time with access to a SAT oracle (adaptive SAT circuit). • P^NP_||: poly-time with nonadaptive access to a SAT oracle (nonadaptive SAT circuit). [Figure: the classes P, NP ∩ coNP, NP, coNP, P^NP_|| and P^NP, with diagrams of an adaptive and a nonadaptive SAT circuit built from ordinary circuits and SAT gates.]

Our results

Beating the Meta-argument • Tasks: Arthur-Merlin games, counting, sampling, S_2^P. • Previous results: each can be derandomized using the respective hardness assumption. • Our results: all can be derandomized using only hardness for nondeterministic circuits (the same assumption as the one for AM). • These results beat the meta-argument! • It is known that S_2^P contains P^NP. • We’ve “derandomized” S_2^P using a lower bound for a weaker circuit class than we were supposed to!

A little bit more formally… • Theorem: Assume that there is a problem in E=DTIME(2^{O(n)}) that cannot be computed by size 2^{Ω(n)} (SV-)nondeterministic circuits. Then: • AM=NP (known result [MV99,SU01], new proof). • Approximate counting and “sampling” can be done in P^NP_||. • S_2^P=P^NP. • BPP_path=P^NP_||. • The learning algorithm of Bshouty et al. can be derandomized. • More… • Remarks: • E can sometimes be replaced by stronger classes: NE ∩ coNE, E^NP_||, E^NP.

Main technical result • Theorem (boosting hardness): if E requires size 2^{Ω(n)} nondeterministic circuits then E requires size 2^{Ω(n)} P^NP_||-circuits. • Contrapositive (downward collapse): if E has P^NP_||-circuits of size s(n) then E has nondeterministic circuits of size s(n)^{O(1)}. • (E can be replaced by PSPACE, P^{#P}, E^NP, E^NP_||, NEXP ∩ coNEXP.)

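Quick survey on assumptions implying AM = NP • Assumptions about P^NP_||-circuits: ∃L worst-case hard for P^NP_||-circuits; ∃L average-case hard for P^NP_||-circuits; ∃ PRG for P^NP_||-circuits. • Assumptions about nondeterministic circuits: ∃L worst-case hard for non-det. circuits; ∃L average-case hard for non-det. circuits; ∃ PRG for non-det. circuits; ∃ HSG for co-non-det. circuits. • Conclusion: AM = NP. • All assumptions are equivalent. [Diagram: arrows between these boxes labelled KvM, KvM, AK, SU, MV and “this paper”; the arrow endpoints are not recoverable from the transcript.]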

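Strong PRGs from weak assumptions • Boxes: ∃L worst-case hard for P^NP_||-circuits; ∃ PRG for P^NP_||-circuits; ∃L worst-case hard for non-det. circuits; ∃ PRG for (co-)non-det. circuits; ∃ HSG for co-non-det. circuits; AM = NP. • “Boosting hardness” (this paper) leads from hardness for non-det. circuits to hardness for P^NP_||-circuits. • Result: a PRG for stronger circuits than “supposed to”. [Diagram: arrows between the boxes labelled KvM, SU, MV and “this paper”; the arrow endpoints are not recoverable from the transcript.]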

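The current picture of nondeterministic hardness • Boxes: ∃L worst-case hard for adap. P^NP-circuits; ∃ PRG for adap. P^NP-circuits; ∃L worst-case hard for P^NP_||-circuits; ∃ PRG for P^NP_||-circuits; ∃L worst-case hard for non-det. circuits; ∃ PRG for (co-)non-det. circuits; ∃ HSG for co-non-det. circuits; AM = NP. • “Boosting hardness” (this paper) leads from hardness for non-det. circuits to hardness for P^NP_||-circuits. • Open problem: the adaptive P^NP-circuit level. [Diagram: arrows between the boxes labelled KvM, SU, MV, “this paper” and “open problem”; the arrow endpoints are not recoverable from the transcript.]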

Proof of main result

Outline of proof • Assumption: there is a small P^NP_||-circuit C for a complete f in E (for simplicity assume that it makes only one SAT query). • We have to use that f is complete for E. • Goal: show that f has a small nondeterministic circuit C’. • Note: in general we can’t replace a small P^NP_||-circuit with a small nondeterministic circuit (this implies, e.g., coNP ⊆ NP/poly). • Naïve attempt at simulating a SAT query in a nondeterministic circuit: • Guess whether the query is answered “yes” or “no”. • If the query is answered “yes”: guess a satisfying assignment and verify. • If the query is answered “no”: ????????? [Figure: the circuit C, drawn as an ordinary circuit feeding a SAT gate that feeds another ordinary circuit.]

W.l.o.g. a function in E is a low degree multivariate polynomial • Theorem (low degree extension) [BF]: There is a function family f_n : F_q^n → F_q, for q = n^{O(1)}, that is complete for E.

Simulating C by a randomized nondeterministic circuit C’ • f is low degree. • On input x: pass a random low degree curve through x. • Field size polynomial => the curve has poly many points x_1,..,x_q. • Suppose we construct a nondeterministic circuit C’ that computes f(x_1),..,f(x_q) with at most an ε fraction of errors. • Then we can compute f(x)! • Because f restricted to the curve is a low degree univariate polynomial: use Reed-Solomon decoding. [Figure: the circuit C (ordinary circuit, SAT gate, ordinary circuit) applied to the points x_1,..,x_q of a curve through x in F_q^n.]
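
The curve step can be illustrated in the error-free case. The sketch below passes a random degree-r curve through x, evaluates f only at points of the curve other than x, and recovers f(x) by Lagrange interpolation. It assumes all values on the curve are correct, so it uses plain interpolation rather than the Reed-Solomon decoding mentioned on the slide. The prime `P`, the example polynomial `f`, and all function names are hypothetical choices for illustration.

```python
import random

P = 101  # small prime (illustrative); all arithmetic is in the field F_P

def f(v):
    """Hypothetical example polynomial of total degree 2 in three variables."""
    return (v[0] * v[1] + 3 * v[2] * v[2] + 7) % P

def random_curve(x, r, rng):
    """Degree-r curve c with c(0) = x: c(t) = x + a_1 t + ... + a_r t^r."""
    coeffs = [[rng.randrange(P) for _ in x] for _ in range(r)]
    def c(t):
        return tuple(
            (xi + sum(a[i] * pow(t, j + 1, P) for j, a in enumerate(coeffs))) % P
            for i, xi in enumerate(x))
    return c

def interpolate_at_zero(ts, ys):
    """Value at 0 of the unique polynomial of degree < len(ts) through the points (ts, ys)."""
    total = 0
    for i, (ti, yi) in enumerate(zip(ts, ys)):
        num, den = 1, 1
        for j, tj in enumerate(ts):
            if i != j:
                num = (num * -tj) % P
                den = (den * (ti - tj)) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse of den
    return total

def recover(x, deg_f=2, r=2, seed=0):
    """Recover f(x) from values of f on a random curve through x (never querying x itself)."""
    c = random_curve(x, r, random.Random(seed))
    ts = list(range(1, deg_f * r + 2))   # deg(f composed with c) + 1 nonzero parameters
    ys = [f(c(t)) for t in ts]
    return interpolate_at_zero(ts, ys)
```

For example, `recover((5, 17, 42))` agrees with `f((5, 17, 42))`. Tolerating an ε fraction of wrong values would require evaluating more points and replacing `interpolate_at_zero` with a Reed-Solomon decoder.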

Using nonuniformity (following [FF91,SU01,BT03,..]) • f is low degree. • On input x: pass a random low degree curve through x. • Let p = the fraction of y’s in F^d s.t. the SAT query on y is answered “yes”. • Hardwire p into the circuit C’. • Points on a random curve are k-wise independent for k=poly. • ∀x, with high probability (over the curve), the fraction of x_i’s on the curve s.t. the SAT query on x_i is answered “yes” is approximately p. [Figure: the set of all points y in F^d whose SAT query is answered “yes”, and the circuit C applied to the points x_1,..,x_q of a curve through x.]

Simulating C on all x_i’s on the curve with only few errors • The fraction of x_i’s on the curve s.t. the SAT query on x_i is answered “yes” is approximately p (by choosing a large enough poly degree for the curve). • There exists a fixed choice of random bits that is good for all x’s. • Goal: simulate C(x_1),..,C(x_q) with at most an ε-fraction of errors. • For every x_i we simulate C up to the SAT query. • Guess a p-fraction of the x_i’s on the curve and witnesses showing that all their queries are answered “yes”. • Assume the queries of the other points on the curve are answered “no”. • < 2ε fraction of errors. [Figure: the circuit C applied to the points x_1,..,x_q of a curve through x in F_q^n.]

Applications

Story so far… • Tasks: Arthur-Merlin games, counting, sampling, S_2^P. • Goal: derandomize using only hardness for nondeterministic circuits. • We’ve seen: we can boost hardness from nondeterministic circuits to nonadaptive SAT circuits. • This gives: a new proof for AM=NP. • “Implies”: derandomizing counting and sampling. • What does it mean to derandomize sampling?

A pseudorandom object for sampling accepting instances • Standard sampling: sample a random x in {0,1}^n. • Discrepancy set: output x_1,..,x_poly(n) in {0,1}^n such that no circuit of size (say) n^2 can distinguish a random x_i from a random x. • Sampling accepting instances: given a circuit C on n bits, sample a random x in C^{-1}(1). • Conditional discrepancy set: given a circuit C on n bits, output x_1,..,x_poly(n) in C^{-1}(1) such that no circuit of size (say) n^2 can distinguish a random x_i from a random accepting x. [Figure: a point x in {0,1}^n, and a point x inside the subset C^{-1}(1) of {0,1}^n.]

More applications • Tasks: Arthur-Merlin games, counting, sampling, S_2^P. • Goal: derandomize using only hardness for nondeterministic circuits. • We’ve seen: a new proof for AM=NP. • We’ve seen: we can boost hardness from nondeterministic circuits to nonadaptive SAT circuits. • “Implies”: derandomizing counting and sampling. • Under the same hardness assumption, S_2^P=P^NP.

Derandomizing S_2^P • S_2^P ⊆ ZPP^NP [Cai]. • Cai’s proof gives that every S_2^P language has an algorithm that runs in P^NP and uses conditional discrepancy sets. • Theorem: if E^NP requires exponential size nondeterministic circuits, then S_2^P = P^NP.

Conclusions • Conditional discrepancy set generators are a “pseudorandom object” for sampling accepting instances. • An (SV-)nondeterministic hardness assumption is sufficient for: • AM = NP (and all assumptions are equivalent). • Placing approximate counting in P^NP_||. • Placing sampling in P^NP_||. • Placing S_2^P in P^NP. • Use given assumptions in stronger ways!

Open questions • Strengthen the downward collapse to the adaptive case? Current result: “If E ⊆ P^NP_||/poly then E ⊆ NP/poly”. Open problem: “If E ⊆ P^NP/poly then E ⊆ NP/poly”. • Uniform version? Open problem: “If E ⊆ P^NP_|| then E ⊆ AM”. Our techniques give E ⊆ AM/log. Improvement by [KF05]: E ⊆ NP/log. • More examples of beating the meta-argument. Can it be done for weaker classes?

That’s it… Thank You!

Tool: low degree extension • Every language L ∈ E has a low-degree extension L̂ ∈ E. • Start with f:{0,1}^n → {0,1}. • Take H ⊆ F_q (e.g. H={0,1}) and think of f as f:H^d → F_q. • Identify f with a low-degree polynomial p:H^d → F_q and extend it to f̂:F_q^d → F_q. • f̂ has low total degree (≤ hd). • f̂ can be computed in E and is a robust version of f. [Figure: H^d drawn inside F_q^d.]

A pseudorandom generator for sampling • Approximate counting: given a circuit C, output an approximation of |C^{-1}(1)|. • Namely: a number r s.t. (1-ε)|C^{-1}(1)| ≤ r ≤ (1+ε)|C^{-1}(1)|. • Theorem: in P^NP_|| if E requires exponential size (SV-)nondeterministic circuits. [Figure: the objects of interest (the inputs C recognizes) drawn as a subset of {0,1}^n.]

Derandomizing approximate counting • Approximate counting: given a circuit C, output an approximation of |C^{-1}(1)|. • Namely: a number r s.t. (1-ε)|C^{-1}(1)| ≤ r ≤ (1+ε)|C^{-1}(1)|. • Theorem: in P^NP_|| if E requires exponential size (SV-)nondeterministic circuits. [Figure: the objects of interest (the inputs C recognizes) drawn as a subset of {0,1}^n.]

Approximate counting and sampling • Approximate counting: given a circuit C, output an approximation r of |C^{-1}(1)| with multiplicative error: (1-ε)|C^{-1}(1)| ≤ r ≤ (1+ε)|C^{-1}(1)|. • Note: PRGs for deterministic circuits give only additive error: |C^{-1}(1)| - ε ≤ r ≤ |C^{-1}(1)| + ε. • Theorem: in P^NP_|| if E requires exponential size (SV-)nondeterministic circuits. [Figure: the objects of interest (the inputs C recognizes) drawn as a subset of {0,1}^n.]

Proof sketch • Start from a weak assumption (hardness for (SV-)nondeterministic circuits). • Use the boosting theorem to obtain a PRG against P^NP_||-circuits. • The algorithm for counting works in “BPP^NP_||”. • Replace random bits with pseudorandom bits (careful: counting is not a decision problem).

Probabilistic procedure for approximate counting [S,JVV,BGP] • Try a random hash function h into 1, 2, 3, … bits. • NP query: is there a y that has too many preimages? • With high probability, when 2^k ≈ |C^{-1}(1)| no y has too many preimages. Output 2^k. [Figure: hashing the accepting inputs in {0,1}^n into {0,1}^1, …, {0,1}^k.]
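
The hashing procedure can be sketched on small inputs, with brute-force enumeration standing in for the NP query. The hash family (random affine maps over GF(2)), the `threshold` parameter that defines “too many preimages”, and the function names are assumptions made for this illustration and are not taken from the talk.

```python
import random
from collections import Counter
from itertools import product

def random_affine_hash(n, k, rng):
    """Random map h(x) = Ax + b over GF(2), from n bits to k bits."""
    A = [[rng.randrange(2) for _ in range(n)] for _ in range(k)]
    b = [rng.randrange(2) for _ in range(k)]
    def h(x):
        return tuple((sum(a * xi for a, xi in zip(row, x)) + bi) % 2
                     for row, bi in zip(A, b))
    return h

def approx_count(C, n, threshold, rng):
    """Return 2^k for the first k whose random hash leaves no bucket above `threshold`."""
    # Brute force replaces the NP oracle; feasible only for tiny n.
    accepting = [x for x in product((0, 1), repeat=n) if C(x)]
    for k in range(1, n + 1):
        h = random_affine_hash(n, k, rng)
        loads = Counter(h(x) for x in accepting)
        # "NP query": does some y have more than `threshold` preimages?
        if all(load <= threshold for load in loads.values()):
            return 2 ** k
    return 2 ** n
```

Whatever the random choices, the returned estimate r satisfies |C^{-1}(1)| ≤ threshold · r, because every bucket holds at most `threshold` accepting inputs. The matching lower bound on |C^{-1}(1)| holds only with high probability over the hash function.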

Derandomized procedure for approximate counting • Try hash functions h into 1, 2, 3, … bits that are the outputs of a PRG fooling P^NP_||-circuits. • NP query: “is there a y that has too many preimages?” • When 2^k ≈ |C^{-1}(1)|, no y has too many preimages with high probability over all hash functions. • Therefore many hash functions that are outputs of the PRG will pass the NP test. Output 2^k. [Figure: hashing into {0,1}^1, …, {0,1}^k.]

Pseudorandom Sampling • Discrepancy set generator: given s, output T ⊆ {0,1}^n s.t. for all circuits D of size s: |Pr_x[D(x) = 1] - Pr_{t∈T}[D(t) = 1]| ≤ ε. • Conditional discrepancy set generator: given C and s, output T ⊆ {0,1}^n s.t. for all circuits D of size s: |Pr_x[D(x)=1 | C(x)=1] - Pr_{t∈T}[D(t)=1 | C(t)=1]| ≤ ε. [Figure: the objects of interest (the inputs C recognizes) drawn as a subset of {0,1}^n.]
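
The conditional discrepancy in the definition above can be computed exactly for tiny n by brute force. The sketch below does this for a single test D; the function name and the representation of C and D as Python predicates on bit tuples are assumptions for illustration. The definition itself quantifies over all circuits D of size s.

```python
from fractions import Fraction
from itertools import product

def conditional_discrepancy(C, D, T, n):
    """|Pr_x[D(x)=1 | C(x)=1] - Pr_{t in T}[D(t)=1 | C(t)=1]| for one test D."""
    accepted = [x for x in product((0, 1), repeat=n) if C(x)]
    t_accepted = [t for t in T if C(t)]
    if not accepted or not t_accepted:
        raise ValueError("conditioning on an empty set")
    p_true = Fraction(sum(1 for x in accepted if D(x)), len(accepted))
    p_set = Fraction(sum(1 for t in t_accepted if D(t)), len(t_accepted))
    return abs(p_true - p_set)
```

A set T is a conditional discrepancy set for C when this quantity is at most ε for every admissible test D.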

Sampling • Conditional discrepancy set generator: given C and s, output T ⊆ {0,1}^n s.t. for all circuits D of size s: |Pr_x[D(x)=1 | C(x)=1] - Pr_{t∈T}[D(t)=1 | C(t)=1]| ≤ ε. • Theorem: in P^NP_|| if E requires exponential size SV-nondeterministic circuits.

Proof sketch • Start from a weak assumption (hardness for SV-nondeterministic circuits). • Use the boosting theorem to obtain a PRG against P^NP_||-circuits. • The algorithm for sampling works in “BPP^NP”. • Observe that adaptive NP queries are used mainly to find NP witnesses (given an NP statement, find a witness). • Replace with non-adaptive witness finding [BCGL90] to get “BPP^NP_||”. • Replace random bits with pseudorandom bits.

Random sampling • Pick a random y and use the NP oracle to enumerate S_y = {x : C(x) = 1 and h(x) = y} (note: |S_y| ≤ n^2). • Pick a random i in {1,2,…, n^2}. • Output the i-th item in the list, or “fail” if there is no i-th item (requires adaptive queries). [Figure: hashing the accepting inputs in {0,1}^n into {0,1}^1, …, {0,1}^k, with 2^k ≈ |C^{-1}(1)|.]
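
One round of this sampling procedure can be sketched as follows, with brute-force enumeration standing in for the NP oracle. The names `prefix_hash` and `sample_once` are hypothetical, and `prefix_hash` is a deliberately simple toy hash, not the hash function the procedure would obtain. The parameter `bound` plays the role of n^2.

```python
import random
from itertools import product

def prefix_hash(k):
    """Hypothetical toy hash: keep the first k bits of the input."""
    return lambda x: tuple(x[:k])

def sample_once(C, n, h, k, bound, rng):
    """One attempt: return an accepting x, or None for "fail"."""
    y = tuple(rng.randrange(2) for _ in range(k))           # random bucket
    # Brute force replaces the NP oracle that enumerates S_y.
    S_y = [x for x in product((0, 1), repeat=n) if C(x) and h(x) == y]
    if len(S_y) > bound:
        raise ValueError("bucket larger than the assumed bound")
    i = rng.randrange(1, bound + 1)                          # random index in {1..bound}
    return S_y[i - 1] if i <= len(S_y) else None
```

When every bucket has at most `bound` elements, each accepting x is returned with probability exactly 2^{-k}/bound, so conditioned on not failing the output is uniform over C^{-1}(1).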

Pseudorandom Sampling • As before, using nonadaptive NP queries, we can obtain a hash function h:{0,1}^n → {0,1}^k such that 2^k ≈ |C^{-1}(1)| and no y has > n^2 preimages. • Idea: use the NP oracle to enumerate S_y = {x : C(x) = 1 and h(x) = y} for those y that are the outputs of a PRG fooling P^NP_||-circuits. • Note: it seems that we require fooling P^NP circuits! [Figure: hashing into {0,1}^1, …, {0,1}^k.]

Sampling • Idea: use the NP oracle to enumerate S_y = {x : C(x) = 1 and h(x) = y} for those y that are the outputs of a PRG fooling NP||-circuits. • Two issues: • We need to convert a small circuit that catches this conditional discrepancy set into a small NP||-circuit that catches the PRG. • The enumeration step seems to require adaptive use of the NP oracle.

Non-adaptive witness finding • We can deal with both issues using non-adaptive NP witness finding. • Usual (adaptive) technique: given φ(x_1, x_2, …, x_n): • 2 queries: is φ(c_1, x_2, …, x_n) satisfiable for c_1=0,1? • If satisfiable for c_1=0, then 2 queries: is φ(0, c_2, …, x_n) satisfiable for c_2=0,1? Else 2 queries: is φ(1, c_2, …, x_n) satisfiable for c_2=0,1? • Etc… • At most 2n adaptive queries in total.
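
The adaptive search described above can be sketched with a brute-force stand-in for the SAT oracle. Formulas are represented as Python predicates on bit tuples; this representation and the names `sat_oracle` and `find_witness_adaptive` are assumptions for illustration. The sketch asks about c=0 first and about c=1 only when needed, so it uses at most 2n queries.

```python
from itertools import product

def sat_oracle(phi, n, fixed):
    """Brute-force stand-in for a SAT oracle: is phi satisfiable with `fixed` variables set?"""
    free = [i for i in range(n) if i not in fixed]
    for vals in product((0, 1), repeat=len(free)):
        x = [0] * n
        for i, v in fixed.items():
            x[i] = v
        for i, v in zip(free, vals):
            x[i] = v
        if phi(tuple(x)):
            return True
    return False

def find_witness_adaptive(phi, n):
    """Fix variables one at a time; each query depends on the earlier answers."""
    fixed, queries = {}, 0
    for i in range(n):
        for c in (0, 1):
            queries += 1
            if sat_oracle(phi, n, {**fixed, i: c}):
                fixed[i] = c
                break
        else:
            return None, queries   # neither value works: phi is unsatisfiable
    return tuple(fixed[i] for i in range(n)), queries
```

The dependence of each query on the previous answers is exactly what makes this search adaptive.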

Non-adaptive witness finding • Usual technique if there is a unique satisfying assignment: given φ(x_1, x_2, …, x_n): • Is φ(c_1, x_2, …, x_n) satisfiable for c_1=0,1? • Is φ(x_1, c_2, …, x_n) satisfiable for c_2=0,1? • … • Is φ(x_1, x_2, …, c_n) satisfiable for c_n=0,1? • Assemble the answers into a single satisfying assignment. • 2n non-adaptive queries in total.
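
For the unique-assignment case, all 2n queries can be written down before any answer is seen. The sketch below uses a brute-force stand-in for the SAT oracle and represents formulas as Python predicates on bit tuples; both, along with the function names, are assumptions for illustration.

```python
from itertools import product

def sat_oracle(phi, n, fixed):
    """Brute-force stand-in for a SAT oracle: is phi satisfiable with `fixed` variables set?"""
    for x in product((0, 1), repeat=n):
        if all(x[i] == v for i, v in fixed.items()) and phi(x):
            return True
    return False

def find_unique_witness_nonadaptive(phi, n):
    """Recover the satisfying assignment of phi when it is unique, else return None."""
    queries = [(i, c) for i in range(n) for c in (0, 1)]      # fixed in advance
    answers = {(i, c): sat_oracle(phi, n, {i: c}) for (i, c) in queries}
    bits = []
    for i in range(n):
        if answers[(i, 0)] == answers[(i, 1)]:
            return None        # both "no" (unsatisfiable) or both "yes" (not unique)
        bits.append(0 if answers[(i, 0)] else 1)
    x = tuple(bits)
    return x if phi(x) else None
```

If φ has more than one satisfying assignment, some variable takes both values across them, both of its queries are answered “yes”, and the sketch returns `None`. This is why the technique needs uniqueness.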

Non-adaptive witness finding • Valiant-Vazirani: a randomized procedure. • Given φ(x_1, x_2, …, x_n), produce φ_1, φ_2, …, φ_n. • With high probability this is a “good” set: at least one φ_i has a unique satisfying assignment. • Key observation (in KvM): there is a small circuit that, given φ_1, φ_2, …, φ_n, uses non-adaptive NP queries to decide if its input is a “good” set. • The output of a PRG fooling NP||-circuits includes a “good” set. • Use the non-adaptive procedure from the previous slide in parallel on all formulas in the output of the PRG.
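
An isolation step in the style of Valiant-Vazirani can be sketched by adding random parity constraints to φ. This is a minimal sketch under several assumptions: formulas are Python predicates on bit tuples, the constraints are random affine parities, the k-th formula uses the first k constraints, and the list also includes φ itself as the k=0 entry. Brute-force counting stands in for the non-adaptive NP queries that would decide whether a set is “good”. All names are hypothetical.

```python
import random
from itertools import product

def vv_formulas(phi, n, rng):
    """Return [phi_0, ..., phi_n], where phi_k is phi plus the first k random parity constraints."""
    constraints = [([rng.randrange(2) for _ in range(n)], rng.randrange(2))
                   for _ in range(n)]
    def restricted(k):
        chosen = constraints[:k]
        def phi_k(x):
            return bool(phi(x)) and all(
                sum(a * xi for a, xi in zip(row, x)) % 2 == b for row, b in chosen)
        return phi_k
    return [restricted(k) for k in range(n + 1)]

def count_sat(phi, n):
    """Number of satisfying assignments, by brute force."""
    return sum(1 for x in product((0, 1), repeat=n) if phi(x))

def is_good(formulas, n):
    """A set is "good" if at least one formula has exactly one satisfying assignment."""
    return any(count_sat(f, n) == 1 for f in formulas)
```

Each added parity constraint removes about half of the remaining assignments in expectation, so for the k near log2 of the number of solutions there is a constant probability that exactly one survives.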

Putting it all together: a “pseudorandom object for sampling” • Conditional discrepancy set generator: given C and s, output T ⊆ {0,1}^n s.t. for all circuits D of size s: |Pr_x[D(x)=1 | C(x)=1] - Pr_{t∈T}[D(t)=1 | C(t)=1]| ≤ ε. • Theorem: in P^NP_|| if E requires exponential size SV-nondeterministic circuits.

Applications • S_2^P = those languages L expressible as: x ∈ L ⇒ ∃y ∀z R(x, y, z) = 1, and x ∉ L ⇒ ∃z ∀y R(x, y, z) = 0. • Given x, form the matrix whose cell (y, z) is R(x, y, z), with rows indexed by y and columns indexed by z. • x ∈ L: some row is all 1s. • x ∉ L: some column is all 0s.
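
The matrix view of S_2^P can be made concrete on toy instances by building the whole matrix and looking for an all-ones row or an all-zeros column. The function names and the finite index sets `ys` and `zs` are assumptions for illustration. In the real setting the matrix is exponentially large and cannot be written down, which is where sampling comes in.

```python
def s2_matrix(R, x, ys, zs):
    """Matrix with cell (y, z) = R(x, y, z); rows indexed by y, columns by z."""
    return [[R(x, y, z) for z in zs] for y in ys]

def decide_from_matrix(M):
    """True if some row is all 1s, False if some column is all 0s."""
    if any(all(v == 1 for v in row) for row in M):
        return True
    if any(all(row[j] == 0 for row in M) for j in range(len(M[0]))):
        return False
    raise ValueError("neither case holds: R does not satisfy the S2 promise here")
```

The two cases cannot hold together, since an all-ones row and an all-zeros column would disagree on the cell where they meet.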

Applications • Background: BPP ⊆ S_2^P. • Known: P^NP ⊆ S_2^P. • S_2^P ⊆ ZPP^NP (Cai). • Theorem: if E^NP requires exponential size SV-nondeterministic circuits, then S_2^P = P^NP. • Proof idea: Cai’s argument can be viewed as a non-randomized reduction to sampling. • Note: this is the strongest example we have of breaking the barrier. • Moral: make better use of assumptions.
