Communication Complexity in Property Testing Methodology | Oded Goldreich

A Property Testing Double-Feature of Short Talks
Oded Goldreich, Weizmann Institute of Science
Talk at Technion, June 2013

On the communication complexity methodology for proving lower bounds on the query complexity of property testing
Oded Goldreich, Weizmann Institute of Science

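Before Blais, Brody, and Matulef (2011)

(In order to derive a lower bound on testing the property $\Pi$, reduce a two-party communication problem $\Psi$ to $\Pi$.)

[Figure: the two models side by side. Communication Complexity: parties A and B holding x and y. Property Testing: tester T with access to z.]

The models seem incompatible: (1) there is no natural partition of the input in PT, and (2) there is no notion of distance in CC.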

The Methodology of Blais, Brody, and Matulef

In order to derive a lower bound on testing the property $\Pi$, reduce a two-party communication problem $\Psi$ to $\Pi$. That is, present a mapping $F$ of pairs of inputs $(x,y)\in\{0,1\}^{n+n}$ for the CC-problem to $\ell(n)$-bit long inputs for testing $\Pi$ such that $(x,y)\in\Psi$ implies $F(x,y)\in\Pi$ and $(x,y)\notin\Psi$ implies that $F(x,y)$ is far from $\Pi$.

[Figure: pairs $(x,y)$ with $\Psi=1$ are mapped by $F$ into $\Pi$; pairs with $\Psi=0$ are mapped to strings far from $\Pi$.]

Let $f_i(x,y)$ be the $i$-th bit of $F(x,y)$, and suppose that $B$ is an upper bound on the (deterministic) communication complexity of each $f_i$ and that $C$ is a lower bound on the randomized communication complexity of $\Psi$. Then, testing $\Pi$ requires at least $C/B$ queries.

In [BBM], $\ell(n)=n$ and each $f_i$ is a function of $x_i$ and $y_i$ only. This restriction complicates the use of the methodology.

Soundness of the Methodology

THM: Let $F:\{0,1\}^{n+n}\to\{0,1\}^{\ell(n)}$ be such that $(x,y)\in\Psi$ implies $F(x,y)\in\Pi$ and $(x,y)\notin\Psi$ implies that $F(x,y)$ is far from $\Pi$. Let $f_i(x,y)$ be the $i$-th bit of $F(x,y)$. Then, $\mathrm{RCC}(\Psi) \le \max_i\{\mathrm{DCC}(f_i)\} \cdot \mathrm{PT}(\Pi)$. (Extends to CC promise problems.)

RCC = randomized CC (with error, say, 1/3), using shared randomness.
DCC = deterministic CC (or randomized with error 1/6n).
PT = query complexity of testing (w.r.t. distance as in "far").

Proof: Each of the two parties invokes a local copy of the tester using the shared randomness. Each query (i.e., $i$) made by the tester is answered by invoking the corresponding CC protocol (for $f_i$). Note that the two local executions are kept identical. The error probability of this protocol equals that of the tester, and its communication is at most $\mathrm{PT}(\Pi)$ times $\max_i\{\mathrm{DCC}(f_i)\}$. ■
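The emulation in the proof can be sketched in a few lines of Python. This is only an illustrative sketch: the names `emulate_tester`, `xor_bit_protocol` and `all_zeros_tester` are hypothetical, the toy map is $F(x,y)=x\oplus y$ (so each $f_i$ costs 2 bits), and a single loop stands in for the two identical local executions that share the same random seed.

```python
import random


def emulate_tester(tester, bit_protocol, x, y, length, seed):
    """Run a tester on F(x, y) without ever writing F(x, y) down.

    Both parties would run this same code with the same seed (the shared
    randomness), so their local executions stay identical. Returns the
    tester's verdict and the total number of bits communicated.
    """
    shared = random.Random(seed)
    cost = 0

    def oracle(i):
        # Answer query i by invoking the CC protocol for f_i.
        nonlocal cost
        value, bits = bit_protocol(i, x, y)
        cost += bits
        return value

    verdict = tester(oracle, length, shared)
    return verdict, cost


def xor_bit_protocol(i, x, y):
    # Toy f_i(x, y) = x_i XOR y_i: the parties exchange x[i] and y[i] (2 bits).
    return x[i] ^ y[i], 2


def all_zeros_tester(oracle, length, rng, queries=10):
    # Toy tester (an assumption of this sketch): accept iff all sampled bits are 0.
    return all(oracle(rng.randrange(length)) == 0 for _ in range(queries))
```

With these toy choices the communication is at most twice the number of queries, which mirrors the bound $\mathrm{RCC} \le \max_i\{\mathrm{DCC}(f_i)\}\cdot\mathrm{PT}$ in this special case.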

Applying the Methodology

THM: Let $F:\{0,1\}^{n+n}\to\{0,1\}^{\ell(n)}$ be such that $(x,y)\in\Psi$ implies $F(x,y)\in\Pi$ and $(x,y)\notin\Psi$ implies that $F(x,y)$ is far from $\Pi$. Let $f_i(x,y)$ be the $i$-th bit of $F(x,y)$. Then, $\mathrm{RCC}(\Psi) \le \max_i\{\mathrm{DCC}(f_i)\} \cdot \mathrm{PT}(\Pi)$; i.e., $\mathrm{PT}(\Pi) \ge \mathrm{RCC}(\Psi)/\max_i\{\mathrm{DCC}(f_i)\}$.

THM: Let $C:\{0,1\}^n\to\{0,1\}^{\ell(n)}$ be a linear code of constant relative distance, and $k:\mathbb{N}\to\mathbb{N}$. Then, the query complexity of the set $\{C(x) : x\in\{0,1\}^n \;\&\; \mathrm{wt}(x)=k\}$ is $\Omega(k)$.

PF: Reduce from $k$-DISJ$_n$ (disjointness for $k/2$-subsets), using $F(x,y)=C(x+y)=C(x)+C(y)$. Note that each bit in $F(x,y)$ has DCC = 2 (by exchanging the corresponding bits of $C(x)$ and $C(y)$).

COR: Testing $k$-linearity has query complexity $\Omega(k)$. [C = Hadamard]

Note: Typically, the $i$-th bit of $F(x,y)$ depends on a linear number of bits in $x$ and in $y$. An alternative proof that uses the original BBM formulation needs to maneuver around this difficulty.
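The reduction $F(x,y)=C(x+y)=C(x)+C(y)$ can be checked on a tiny instance. The following is a minimal sketch for the Hadamard code with $n=4$ and $k=2$; the helper names (`hadamard`, `xor`, `reduction`, `relative_distance`) are hypothetical and chosen only for this illustration.

```python
from itertools import product


def hadamard(x):
    """Hadamard codeword of x: the inner product <x, a> mod 2 for every a in {0,1}^n."""
    return [sum(xi & ai for xi, ai in zip(x, a)) % 2
            for a in product((0, 1), repeat=len(x))]


def xor(u, v):
    """Bitwise sum over GF(2)."""
    return [a ^ b for a, b in zip(u, v)]


def reduction(x, y):
    """The map F(x, y) = C(x + y)."""
    return hadamard(xor(x, y))


def relative_distance(u, v):
    """Fraction of positions on which u and v disagree."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


# Disjoint 1-subsets of [4] (k = 2): x + y has weight k, so F(x, y) is in the set.
x, y = [1, 0, 0, 0], [0, 1, 0, 0]
cx, cy = hadamard(x), hadamard(y)   # each party encodes its own input locally
same_codeword = reduction(x, y) == xor(cx, cy)

# Intersecting subsets (here x = y): x + y has weight 0, so F(x, y) = C(0) is a
# different codeword, hence far from every codeword of a weight-k message.
bad = reduction(x, x)
targets = [hadamard(list(z)) for z in product((0, 1), repeat=4) if sum(z) == 2]
gap = min(relative_distance(bad, t) for t in targets)
```

Since every bit of `reduction(x, y)` is the XOR of one bit of `cx` and one bit of `cy`, answering a single query costs the two parties 2 bits, as noted in the proof.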

Applying the Restricted Methodology

THM: Let $F:\{0,1\}^{n+n}\to\{0,1\}^{\ell(n)}$ be such that $(x,y)\in\Psi$ implies $F(x,y)\in\Pi$ and $(x,y)\notin\Psi$ implies that $F(x,y)$ is far from $\Pi$. Let $f_i(x,y)$ be the $i$-th bit of $F(x,y)$. Then, $\mathrm{PT}(\Pi) \ge \mathrm{RCC}(\Psi)/\max_i\{\mathrm{DCC}(f_i)\}$.
Restriction: $f_i(x,y)=\mathrm{fnc}(i,x_i,y_i)$.

THM: Let $C:\{0,1\}^n\to\{0,1\}^{\ell(n)}$ be a linear code of constant relative distance, and $k:\mathbb{N}\to\mathbb{N}$. Then, the query complexity of the set $\{C(x) : x\in\{0,1\}^n \;\&\; \mathrm{wt}(x)=k\}$ is $\Omega(k)$.

An alternative proof via the restricted methodology introduces an auxiliary CC problem ("C-encoded $k$-DISJ") $\Psi'$ that consists of pairs $(C(x),C(y))$ such that $(x,y)\in k$-DISJ$_n$, reduces (in the CC world) $k$-DISJ to $\Psi'$, and then applies the restricted method to $\Psi'$.

The general methodology frees the prover/user from this type of acrobatics. Interestingly, this is only a matter of convenience; that is, it does not add power (i.e., "anything provable via general is essentially provable by restricted").

Emulating the Restricted Methodology

THM: Let $F:\{0,1\}^{n+n}\to\{0,1\}^{\ell(n)}$ be such that $(x,y)\in\Psi$ implies $F(x,y)\in\Pi$ and $(x,y)\notin\Psi$ implies that $F(x,y)$ is far from $\Pi$. Let $f_i(x,y)$ be the $i$-th bit of $F(x,y)$. Then, $\mathrm{PT}(\Pi) \ge \mathrm{RCC}(\Psi)/\max_i\{\mathrm{DCC}(f_i)\}$.
Restriction: $f_i(x,y)=\mathrm{fnc}(i,x_i,y_i)$.

THM (imprecise sketch): Suppose that $\Psi$, $\Pi$ and $F$ satisfy the conditions of the general methodology with $B=\max_i\{\mathrm{DCC}(f_i)\}$. Then, there exist $\Psi'$, $\Pi'$ and $F'$ that satisfy the conditions of the restricted methodology while $\mathrm{RCC}(\Psi')\ge\mathrm{RCC}(\Psi)$ and $\mathrm{PT}(\Pi)=\Omega(\mathrm{PT}(\Pi')/B)$.

Still, the general methodology frees the prover/user from this type of acrobatics.

On Multiple Input Problems in Property Testing
Oded Goldreich, Weizmann Institute of Science

Three types of multiple input problems

For any fixed property $\Pi$ and proximity parameter $\epsilon$:

Direct m-Sum Problem: Given a sequence of m inputs, output a sequence of m outputs that each satisfy the testing requirements; that is, for every i, if the i-th input is in $\Pi$ then the i-th output is 1 w.p. ≥ 2/3, whereas if the i-th input is $\epsilon$-far from $\Pi$ then the i-th output is 0 w.p. ≥ 2/3.

Direct m-Product Problem: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if some input is $\epsilon$-far from $\Pi$.

m-Concatenation Problem: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if the average distance of the inputs from $\Pi$ is at least $\epsilon$.

The results at a glance: For DS and DP the query complexity is m times the query complexity of $\Pi$; for CP it is about the same as for $\Pi$.

The main results

For any $\Pi$ and $\epsilon$, w.r.t. error probability at most 1/3:
THM 1: $m\text{-DS}(\Pi) = \Theta(m\cdot\mathrm{PT}(\Pi))$.
THM 2: $m\text{-DP}(\Pi) = \Theta(m\cdot\mathrm{PT}(\Pi))$.
THM 3: Typically(*), $m\text{-CP}(\Pi) = \tilde{O}(\mathrm{PT}(\Pi))$.
*) “Typically” = if $\mathrm{PT}(\Pi)$ increases at least linearly with $1/\epsilon$.

m-DS: Given a sequence of m inputs, output a sequence of m outputs such that, for every i, if the i-th input is in $\Pi$ then the i-th output is 1 w.p. ≥ 2/3, whereas if the i-th input is $\epsilon$-far from $\Pi$ then the i-th output is 0 w.p. ≥ 2/3.
m-DP: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if some input is $\epsilon$-far from $\Pi$.
m-CP: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if the average distance of the inputs from $\Pi$ is at least $\epsilon$.

THM 1:m-DS() = (m∙PT()). (m-DS = given a sequence of m inputs, output a sequence of m outputs such that, for every i, if the ith input is in  the ith output is 1 w.p.≥2/3, whereas if the input is -far from  then the output is 1 w.p. ≥ 2/3.) Re the lower bound: In the model of query complexity, it is easy to decouple the execution of the multiple-instance procedure into a sequence of single-instance executions, and the only issue at hand is the possibly uneven and adaptive allocation of resources among the executions. We need to consider the allocation of resources w.r.t some distribution on instances; which one? The one provided by the MiniMax Principle! The real contents of the MMP is not that the worst-case performance of each randomized algorithm is bounded by the average-case performance (of all deter’ algorithms) w.r.t some fixed input distribution, but rather that this bound is tight! Comments re the proof of THM1

THM 2:m-DP() = (m∙PT()). (m-DP = given a sequence of m inputs, output 1 w.p. ≥2/3 if all inputs are in , and 0 w.p.≥2/3 if some input is -far from .) Re the upper bound: A straightforward reduction of DP to DS will require error reduction (and so we would lose a (log m) factor). LEM:m-DP can be reduced to O(j) instances of 2-(j-1)m-DS, for j=1,…,log m. Idea: Proceed in iterations, initializing I (the set of “far” suspects) to [m]. Comments re the proof of THM2 In iteration j, run DS on the instances with index in I, with error parameter exp(-j), and reset I to be the set of indices with output 0. If|I|>m/2j, then halt with output 0. If I is empty, halt with output 1. Re the lower bound: Via an adaptation of the proof of THM1.

Illustration for the proof of LEM

LEM: m-DP can be reduced to $O(j)$ instances of $(2^{-(j-1)}m)$-DS, for $j=1,\dots,\log m$.

Idea: Proceed in iterations, initializing $I$ (the set of “far” suspects) to $[m]$. In iteration $j$, run DS on the instances with index in $I$, with error parameter $\exp(-j)$, and reset $I$ to be the set of indices with output 0. If $|I|>m/2^j$, then halt with output 0. If $I$ is empty, halt with output 1.

[Figure: sample 0/1 outputs of DS across iterations, for two cases. Case: all inputs in $\Pi$. Case: there exists an input far from $\Pi$ (marked *).]

Comments re the proof of THM3

THM 3: Typically(*), $m\text{-CP}(\Pi) = \tilde{O}(\mathrm{PT}(\Pi))$.
(m-CP = given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if the average distance of the inputs from $\Pi$ is at least $\epsilon$.)
*) “Typically” = if $\mathrm{PT}(\Pi)$ increases at least linearly with $1/\epsilon$.

Re the upper bound: A straightforward algorithm would sample $O(1/\epsilon)$ instances and run the $\epsilon$-tester for $\Pi$ on each of them. Complexity $O(\mathrm{PT}/\epsilon)$.

One can do better using Levin’s economical work investment strategy. Let $\ell = \log(2/\epsilon)$. For $j=1,\dots,\ell$, take a sample of $O(\ell/(2^j\epsilon))$ instances and invoke a $2^{-j}$-tester on each.

Suppose $E_s[q(s)] > \epsilon$, for $q:[N]\to[0,1]$. (Invested work is proportional to $1/q(s)$, unknown a priori.) Then, there exists $j\in[\ell]$ such that $\Pr_s[q(s)>2^{-j}] > 2^j\epsilon/4\ell$.
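The existence claim used by Levin's strategy (if $E_s[q(s)] > \epsilon$ for $q:[N]\to[0,1]$, then some $j\in[\ell]$ has $\Pr_s[q(s)>2^{-j}] > 2^j\epsilon/4\ell$) follows from a short averaging argument. The derivation below is a sketch that assumes the logarithm is base 2, so that $2^{-\ell}=\epsilon/2$.

```latex
% Assume, towards a contradiction, that for every j in [\ell]:
%   \Pr_s[q(s) > 2^{-j}] \le 2^j \epsilon / (4\ell).
% Split the range of q into (0, 2^{-\ell}] and the dyadic intervals
% (2^{-j}, 2^{-(j-1)}] for j = 1, ..., \ell.
\begin{align*}
E_s[q(s)]
  &\le 2^{-\ell}
     + \sum_{j=1}^{\ell} \Pr_s\!\left[2^{-j} < q(s) \le 2^{-(j-1)}\right] \cdot 2^{-(j-1)} \\
  &\le \frac{\epsilon}{2}
     + \sum_{j=1}^{\ell} \frac{2^{j}\epsilon}{4\ell} \cdot 2^{-(j-1)} \\
  &= \frac{\epsilon}{2} + \ell \cdot \frac{\epsilon}{2\ell}
   \;=\; \epsilon ,
\end{align*}
% which contradicts E_s[q(s)] > \epsilon.
```

Under this reading, a sample of $O(\ell/(2^j\epsilon))$ instances hits an instance with $q(s)>2^{-j}$ with constant probability for the good $j$, which is why the sample size shrinks as the per-instance work grows.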

Additional results and comments

Non-adaptive and/or one-sided error testers: The only deviation from the general case is for the one-sided error version of DP. Its complexity is $\Theta(m\cdot\mathrm{PT}(\Pi)+\mathrm{PT}^{\mathrm{ose}}(\Pi))$. (OSE is the adaptive version.)
(m-DP = given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in $\Pi$, and 0 w.p. ≥ 2/3 if some input is $\epsilon$-far from $\Pi$.)

Re the upper bound: We adapt the procedure presented in the proof of the efficient reduction of DP to DS (cf. Lemma for THM2). Recall that this procedure proceeds in iterations, halting with output 1 if $I$ (the set of “far” suspects) becomes empty and outputting 0 if $I$ is ever too big. We modify the procedure such that in the latter case it selects a random $i$ in $I$, invokes the one-sided error tester on the $i$-th instance, and decides accordingly. In contrast, in the invocations of the reduction procedure, we use the two-sided error tester.

End

The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/2pt13.ppt
The “CC Methodology” paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_ccpt.html
The “Multiple Input” paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_mi-pt.html

Property Testing: an illustration

[Figure: a picture captioned “Gothic cathedral?”]

Property Testing: informal definition

A relaxation of a decision problem: For a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., O is far from any other object having P).

Focus: sub-linear time algorithms, performing the task by inspecting the object at few locations.

Objects viewed as functions. Inspecting = querying the function/oracle.

Property Testing: the standard (one-sided error) def’n

• A property $P = \bigcup_n P_n$, where $P_n$ is a set of functions with domain $D_n$.
• The tester gets explicit input $n$ and $\epsilon$,
• and oracle access to a function $f$ with domain $D_n$.
• If $f\in P_n$ then $\Pr[T^f(n,\epsilon) \text{ accepts}] = 1$ (or > 2/3).
• If $f$ is $\epsilon$-far from $P_n$ then $\Pr[T^f(n,\epsilon) \text{ rejects}] > 2/3$. (Distance is defined as fraction of disagreements.)

Focus: query complexity, $q(n,\epsilon) \ll |D_n|$.
Special focus: $q(n,\epsilon)=q(\epsilon)$, independent of $n$.
Terminology: $\epsilon$ is called the proximity parameter.
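As a concrete instance of this definition, here is a minimal sketch of a one-sided error tester for a toy property chosen for illustration only (it does not appear in the talk): "f is identically 0" on the domain `range(n)`. The name `zero_function_tester` and the constant 2 in the query count are assumptions of this sketch.

```python
import math
import random


def zero_function_tester(f, n, eps, rng):
    """One-sided error tester for the toy property 'f(i) = 0 for all i in range(n)'.

    Makes ceil(2 / eps) queries, independent of n. If f is identically 0 the
    tester always accepts. If f is eps-far from the zero function (more than an
    eps fraction of points map to a nonzero value), all samples miss such a
    point with probability at most (1 - eps)^(2/eps) <= e^(-2) < 1/3.
    """
    queries = math.ceil(2 / eps)
    for _ in range(queries):
        if f(rng.randrange(n)) != 0:
            return False   # reject: found a witness that f is not identically 0
    return True            # accept: no violation seen
```

For example, `zero_function_tester(lambda i: 0, 10**6, 0.1, random.Random(1))` accepts after 20 queries, regardless of the domain size.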

