Property Testing: Sublinear-Time Approximate Decisions Oded Goldreich Weizmann Institute of Science Talk at CTW, July 2013.


1 Property Testing: Sublinear-Time Approximate Decisions Oded Goldreich Weizmann Institute of Science Talk at CTW, July 2013

2 My Aim: Promote Research in Property Testing I think this area is still under-explored. This holds in particular wrt testing graph properties (in the various models).

3 Property Testing (super-fast approximate decision): an illustration. [Figure: “Gothic cathedral?”] Deciding by inspecting few locations in the object. One motivation: real objects are far apart. Other motivations: approximation per se, or a preliminary step.

4 Property Testing: informal definition. A relaxation of a decision problem: for a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., O is far from any other object having P). Focus: sub-linear time algorithms, i.e., performing the task by inspecting the object at few locations. Objects are viewed as functions; inspecting = querying the function/oracle.

5 Property Testing: the standard (one-sided error) def’n. A property P = ∪_n P_n, where P_n is a set of functions with domain D_n. The tester T gets explicit inputs n and ε, and oracle access to a function f with domain D_n. If f ∈ P_n then Prob[T^f(n,ε) accepts] = 1 (or > 2/3 in the two-sided error version). If f is ε-far from P_n then Prob[T^f(n,ε) rejects] > 2/3. (Distance is defined as the fraction of disagreements.) Focus: query complexity, q(n,ε) ≪ |D_n|. Special focus: q(n,ε) = q(ε), independent of n. Terminology: ε is called the proximity parameter.
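The distance measure in this definition can be made concrete on a tiny domain. The following is a minimal sketch, assuming functions are stored as value tables (tuples) and the property is given as an explicit list of tables; the names `relative_distance` and `distance_to_property` are illustrative and do not come from the talk. It runs in time linear in the domain, so it only illustrates the definition and is not a tester.

```python
def relative_distance(f, g):
    """Fraction of domain points on which the value tables f and g disagree."""
    assert len(f) == len(g)
    return sum(a != b for a, b in zip(f, g)) / len(f)


def distance_to_property(f, prop):
    """Minimum relative distance from f to any member of prop (brute force)."""
    return min(relative_distance(f, g) for g in prop)


# Hypothetical toy property: the constant functions on a 4-point domain.
constants = [(0, 0, 0, 0), (1, 1, 1, 1)]
f = (0, 1, 0, 0)
print(distance_to_property(f, constants))  # 0.25, so f is eps-far only for eps <= 0.25
```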

6 Example 1: Testing Linearity. Given (access to) a function f: G → H, determine whether it is linear* (or far from any linear function). The BLR Tester (repeated O(1/ε) times when given proximity parameter ε): 1. Select uniformly and independently x,y ∈ G. 2. Accept if and only if f(x)+f(y)=f(x+y). *) G and H are groups, and f is linear (or a group homomorphism) if f(x+y)=f(x)+f(y) holds for every x,y ∈ G, where the 1st “+” is of G and the 2nd is of H.
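The two steps of the BLR tester can be sketched for the special case G = (Z_2)^n and H = Z_2, where group addition is bitwise XOR. This is only an illustration under stated assumptions: elements of G are encoded as n-bit integers, the repetition count `4 / eps` uses an arbitrary constant standing in for O(1/ε), and the helper `parity_of` is a hypothetical name.

```python
import random


def blr_test(f, n, eps, rng=random):
    """Repeat the BLR check O(1/eps) times for f: {0,1}^n -> {0,1}.

    Elements of (Z_2)^n are encoded as n-bit integers, so addition is XOR.
    The constant 4 in the repetition count is illustrative only.
    """
    repetitions = max(1, int(4 / eps))
    for _ in range(repetitions):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False  # found a violated linearity constraint
    return True


def parity_of(mask):
    """Linear function x -> <mask, x> mod 2 (hypothetical helper)."""
    return lambda x: bin(x & mask).count("1") % 2


print(blr_test(parity_of(0b1011), n=4, eps=0.1))  # True: linear functions always pass
print(blr_test(lambda x: 1, n=4, eps=0.1))        # False: 1 + 1 != 1 for every pair
```

A linear function is accepted with probability 1, matching the one-sided error of the tester; the constant-1 function violates the check on every pair, so it is rejected in the first iteration.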

7 Linearity Testing cont’ed (recall f: G → H). 1. Select uniformly and independently x,y ∈ G. 2. Accept if and only if f(x)+f(y)=f(x+y). Analysis*: Clearly, if f is linear, then the test accepts w.p. 1. Suppose that f is δ-far from being linear (i.e., it disagrees with each linear function on at least a δ_lin(f) ≥ δ fraction of the domain). Let h be a linear function closest to f. Then, Prob[Test rejects f] = Prob_{x,y}[f(x)+f(y) ≠ f(x+y)] ≥ 3·Prob_{x,y}[f(x)≠h(x) & f(y)=h(y) & f(x+y)=h(x+y)] ≥ 3·Prob_{x,y}[f(x)≠h(x)]·(1 − 2·Prob_{x,y}[f(y)≠h(y) | f(x)≠h(x)]) = 3δ_lin(f)·(1 − 2δ_lin(f)). So, assuming that the rejection probability increases with the distance, we are done. But does this natural assumption hold? *) The analysis refers to a single iteration of the test.
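The single-iteration bound 3δ_lin(f)·(1 − 2δ_lin(f)) can be checked exhaustively on a small example. The sketch below assumes G = (Z_2)^n and H = Z_2, with f given as a value table indexed by n-bit integers; the function names are illustrative. It computes the exact distance to linearity and the exact rejection probability by brute force.

```python
from fractions import Fraction


def parity(a, x):
    """Value of the linear function x -> <a, x> mod 2 over (Z_2)^n."""
    return bin(a & x).count("1") % 2


def dist_to_linear(f, n):
    """Exact distance of the table f from the closest linear function."""
    size = 1 << n
    return min(
        Fraction(sum(f[x] != parity(a, x) for x in range(size)), size)
        for a in range(size)
    )


def rejection_prob(f, n):
    """Exact probability that one BLR iteration rejects f."""
    size = 1 << n
    bad = sum((f[x] ^ f[y]) != f[x ^ y] for x in range(size) for y in range(size))
    return Fraction(bad, size * size)


# f(x1, x2) = x1 AND x2 on (Z_2)^2, indexed by the integer 2*x1 + x2.
f_and = [0, 0, 0, 1]
delta = dist_to_linear(f_and, 2)
rej = rejection_prob(f_and, 2)
print(delta, rej, 3 * delta * (1 - 2 * delta))  # 1/4 3/8 3/8
```

For this function the bound holds with equality: the distance is 1/4 and exactly 6 of the 16 pairs (x,y) are rejected.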

8 Linearity Testing cont’ed (2) (recall f: G → H). 1. Select uniformly and independently x,y ∈ G. 2. Accept if and only if f(x)+f(y)=f(x+y). Does the rejection probability increase with the distance? Surprisingly, the answer is no, at least for G = (Z_2)^n and H = Z_2! Notes: The maximum distance is ½. The lower bound 3δ−6δ² is tight in [0,5/16]. Additional lower bounds are δ and, for δ ≥ 5/16, 45/128. Indeed, strange… BTW, an alternative simple proof gives the lower bound min(1/6, δ/2). [Plot: rejection probability versus distance δ, showing the curves 3δ−6δ² and δ, with marked values ¼, 5/16, ⅜ and 45/128.]

9 Example 2: Testing Bipartiteness in the “Dense Graphs Model”. Note: The representation affects both the type of queries and the distance measure. A graph G=([N],E) is represented by a function g: [N]×[N] → {0,1} (i.e., g(u,v)=1 iff (u,v) is an edge in G). This representation determines: 1. The type of queries: adjacency queries. 2. The distance measure: #differences/N².

10 Testing Bipartiteness in the Dense Graphs Model. The GGR Tester (input graph G, adjacency queries, prox. par. ε): 1. Select uniformly a subset of Õ(1/ε²) vertices in G. 2. Accept if and only if the subgraph induced by this set is bipartite. Analysis: Clearly, if G is bipartite, then the test accepts w.p. 1. Suppose that G=([N],E) is ε-far from being bipartite (i.e., εN² edges must be omitted from G to make it bipartite). Partition the sample into two (non-equal) parts: a subset U of size Õ(1/ε) and a subset S of size Õ(1/ε²). Consider all 2-way partitions of U: for each partition (U_1,U_2), consider the partition induced on all (graph) vertices such that all neighbors of U_i are on the side opposite to it. To be continued.

11 Testing Bipartiteness in the Dense Graphs Model, cont. Analysis: Suppose that G=([N],E) is ε-far from being bipartite. Partition the sample into two (non-equal) parts: a subset U of size Õ(1/ε) and a subset S of size Õ(1/ε²). Consider all 2-way partitions of U: for each partition (U_1,U_2), consider the partition induced on all graph vertices such that all neighbors of U_i are on the side opposite to it. Vertices that neighbor both U_i’s “witness” the badness of (U_1,U_2). W.h.p., almost all high-degree vertices have neighbors in U (i.e., “high degree” = degree at least […], and “almost all” = all but at most […]). There are many violating edges between vertices assigned to the same side (i.e., “many” = at least […] edges) [since U “dominates” almost all edges]. A vertex pair selected at random hits such an edge w.p. at least […]. Thus, each potential partition is “rejected” (i.e., we find a violating edge wrt it) with probability at least 1 − (1 − […])^(|S|/2) = 1 − exp(−|U|), which, by a union bound over all 2^|U| partitions of U, implies that w.p. at least 2/3 the subgraph induced by U ∪ S is not bipartite.
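The GGR tester analyzed above (sample vertices, then check whether the induced subgraph is bipartite) can be sketched as follows. This is an illustration, not the authors' implementation: the adjacency oracle `adj(u, v)` is a hypothetical callable, and the sample size `4 / eps**2` replaces the Õ(1/ε²) bound with an arbitrary constant and no logarithmic factor.

```python
import math
import random
from collections import deque


def induced_is_bipartite(vertices, adj):
    """2-color the subgraph induced by `vertices` using adjacency queries."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v == u or not adj(u, v):
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle inside the sample
    return True


def ggr_bipartite_test(adj, N, eps, rng=random):
    """Accept iff a uniform sample of about 1/eps^2 vertices induces a bipartite graph."""
    k = min(N, math.ceil(4 / eps**2))  # illustrative constant
    sample = rng.sample(range(N), k)
    return induced_is_bipartite(sample, adj)


# Complete bipartite graph (even vs odd vertices) and complete graph on 40 vertices.
print(ggr_bipartite_test(lambda u, v: u % 2 != v % 2, 40, 0.5))  # True
print(ggr_bipartite_test(lambda u, v: u != v, 40, 0.5))          # False
```

The number of adjacency queries is quadratic in the sample size and independent of N once N exceeds the sample size. A bipartite graph is always accepted, since every induced subgraph of a bipartite graph is bipartite.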

12 Example 3: Testing Triangle-Freeness in the “Dense Graphs Model”. Task: Given proximity parameter ε and (adjacency) query access to G, determine whether G is triangle-free or εN² edges must be omitted to eliminate all triangles. (“Clearly”) The following tester will do: Select a sample of M(ε) vertex triples and accept if and only if none of these triples induces a triangle in G. (We query the relevant pairs...) How large should M(ε) be? Please guess; to be continued...

13 Testing Triangle-Freeness in the “Dense Graphs Model”, cont. Task: Given proximity parameter ε and (adjacency) query access to G, determine whether G is triangle-free or ε-far from being triangle-free. The candidate tester: Select a sample of M(ε) vertex triples and accept if and only if none of these triples induces a triangle in G. How large should M(ε) be? Guess #1: M(ε) = O(1/ε³). Wrong! Guess #2: M(ε) = poly(1/ε). Wrong! Well, I don’t know the answer… Still, it is known that it is at least super-polynomial in 1/ε and at most a tower of poly(1/ε) many exponents.
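The candidate tester is easy to write down even though the right sample size is not. In the sketch below the number of triples is left as an explicit parameter `num_triples`, precisely because the slide states that M(ε) is not known; the adjacency oracle `adj(u, v)` is a hypothetical callable.

```python
import random


def triangle_freeness_test(adj, N, num_triples, rng=random):
    """Accept iff none of `num_triples` random vertex triples induces a triangle.

    `num_triples` plays the role of M(eps); its correct value is left open here.
    Each triple costs three adjacency queries.
    """
    for _ in range(num_triples):
        u, v, w = rng.sample(range(N), 3)
        if adj(u, v) and adj(v, w) and adj(u, w):
            return False
    return True


print(triangle_freeness_test(lambda u, v: u % 2 != v % 2, 30, 100))  # True: bipartite graphs have no triangles
print(triangle_freeness_test(lambda u, v: u != v, 30, 1))            # False: every triple of a clique is a triangle
```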

14 In General: Testing Graph Properties in the Dense Model. q adaptive queries ⇒ O(q²) non-adaptive queries [GT]. Nested classes (diagram): properties testable in F(ε) queries (characterization by [AFNS, BCLSSV]); testable in poly(1/ε) queries; testable in Õ(1/ε) queries; testable in Õ(1/ε) non-adaptive queries. Examples shown: Triangle-freeness [A]; Bipartiteness [GGR, BT]; CC, BCC [GR08].

15 Testing Graph Properties in the Dense Model: The “lowest” complexity level. BL(H) = the set of graphs obtained by a (not necessarily balanced) blow-up of the graph H. BL(H) is testable in Õ(1/ε) non-adaptive queries, for any fixed graph H [AG]. The special case of H being a clique was done in [GR08]. [Figure: a blow-up of a 5-cycle.]

16 Example 4: Testing Bipartiteness in the “Bounded-Degree Graphs Model”. Note: The representation affects both the type of queries and the distance measure. A graph G=([N],E) of maximal degree d is represented by a function g: [N]×[d] → [N] ∪ {0} (i.e., g(u,i)=v iff v is the i-th neighbor of u in G). This representation determines: 1. The type of queries: incidence queries. 2. The distance measure: #differences/dN.

17 Testing Bipartiteness in the Bounded-Degree Graphs Model. The GR Tester (input graph G, incidence queries, prox. par. ε): 1. Select uniformly O(1/ε) start vertices. 2. For each start vertex s, take m = Õ(N^(1/2)/poly(ε)) random walks, each of length l = poly((log N)/ε). 3. Accept if and only if the subgraph explored is bipartite. Analysis: Clearly, if G is bipartite, then the test accepts w.p. 1. Suppose that G=([N],E) is ε-far from being bipartite (i.e., εdN edges must be omitted from G to make it bipartite). Simplifying assumption (unjustified!): G is an expander. Let p_v(σ) = the probability that a lazy random l-walk starting at vertex s reaches v such that the induced path has length of parity σ. Consider a 2-partition placing v according to p_v(0) vs p_v(1). To be continued. Lower bound: Ω(N^(1/2)) queries. (In contrast to the “dense” graphs model.)

18 Testing Bipartiteness in the Bounded-Degree Model (cont.). The GR Tester (input graph G, incidence queries, prox. par. ε): 1. Select uniformly O(1/ε) start vertices. 2. For each start vertex s, take m = Õ(N^(1/2)/poly(ε)) random walks, each of length l = poly((log N)/ε). 3. Accept if and only if the subgraph explored is bipartite. Analysis: Simplifying assumption (unjustified!): G is an expander. Let p_v(σ) = the probability that a lazy random l-walk starting at vertex s reaches v such that the induced path has length of parity σ. (Lazy random walk = in each step, the walk stays in place w.p. ½.) Consider a 2-partition placing v according to p_v(0) vs p_v(1). If ∑_σ ∑_{(u,v)∈E} p_u(σ)·p_v(σ) < ε/N, then this partition has at most εdN violating edges; otherwise (i.e., for a larger sum), the tester rejects with probability at least 2/3. (Each of these two claims requires a proof.) Thus, if G is not ε-close to bipartite, the tester rejects w.p. > 2/3.
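The random-walk tester can be sketched under several assumptions. The incidence oracle `neighbor(u, i)` is a hypothetical callable returning the i-th neighbor of u (0-indexed) or `None`. The parameters `num_starts`, `num_walks` and `walk_len` are passed explicitly rather than derived from N and ε. Instead of checking bipartiteness of the whole explored subgraph, this simplified variant rejects only when some vertex is reached from the same start vertex by walk prefixes of both parities, which certifies an odd cycle.

```python
import random


def finds_odd_cycle(neighbor, d, s, num_walks, walk_len, rng):
    """Lazy random walks from s; True if a vertex is reached with both parities."""
    parities = {s: {0}}
    for _ in range(num_walks):
        u, parity = s, 0
        for _ in range(walk_len):
            i = rng.randrange(2 * d)  # i >= d means "stay in place" (lazy step)
            v = neighbor(u, i) if i < d else None
            if v is not None:
                u, parity = v, parity ^ 1  # a real move flips the path parity
            seen = parities.setdefault(u, set())
            seen.add(parity)
            if len(seen) == 2:
                return True  # even and odd paths from s to u: odd cycle
    return False


def bipartite_test_bounded_degree(neighbor, N, d, num_starts, num_walks, walk_len, rng=random):
    """Accept iff no start vertex exposes an odd cycle via its random walks."""
    for _ in range(num_starts):
        s = rng.randrange(N)
        if finds_odd_cycle(neighbor, d, s, num_walks, walk_len, rng):
            return False
    return True


# An 8-cycle (bipartite) given by incidence queries with d = 2.
even_cycle = lambda u, i: (u + 1) % 8 if i == 0 else (u - 1) % 8
print(bipartite_test_bounded_degree(even_cycle, 8, 2, num_starts=5, num_walks=30, walk_len=20))  # True
```

In a bipartite graph every path from s to a given vertex has the same parity, so this variant never rejects a bipartite graph, in line with the one-sided error claimed on the slide.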

19 Example 5: Testing Cycle-freeness in the “Bounded-Degree Graphs Model”. The GR Tester (input graph G, incidence queries, prox. par. ε): 1. Select uniformly s = O(1/ε²) start vertices. 2. For each start vertex, explore (*) until visiting O(1/ε) vertices. 3. If a cycle was found in any of these explorations, reject. 4. Otherwise, let n denote the number of start vertices that reside in “large” components (i.e., CCs that were not fully explored) and let m be half the sum of their degrees. Accept iff |n−m| < εs/2. *) The exploration is rather arbitrary. Observation: A graph is cycle-free iff the number of edges in it equals the number of vertices minus the number of connected components. We shall approximate both. Aux. Obs.: The number of large CCs is negligible, hence the number of small CCs approximates the total number of CCs.
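The observation behind the tester (cycle-free iff #edges = #vertices − #connected components) can be checked exactly on a small explicit graph. The sketch below reads the whole graph, so it is not a sublinear tester; it only illustrates the identity that the tester approximates by sampling. It assumes a simple graph given as a list of distinct undirected edges, and the function name is illustrative.

```python
def is_cycle_free(num_vertices, edges):
    """Check |E| == |V| - #CC using union-find to count connected components."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return len(edges) == num_vertices - components


print(is_cycle_free(4, [(0, 1), (1, 2)]))          # True: 2 edges, 4 vertices, 2 components
print(is_cycle_free(3, [(0, 1), (1, 2), (0, 2)]))  # False: 3 edges, 3 vertices, 1 component
```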

20 Testing Cycle-freeness in the “Bounded-Degree Graphs Model”, cont. The GR Tester (input graph G, incidence queries, prox. par. ε): 1. Select uniformly s = O(1/ε²) start vertices. 2. For each start vertex, explore until visiting O(1/ε) vertices. 3. If a cycle was found in any of these explorations, reject. 4. Otherwise, let n denote the number of start vertices that reside in “large” components (i.e., CCs that were not fully explored) and let m be half the sum of their degrees. Accept iff |n−m| < εs/2. The tester approximates the number of edges and the number of connected components. Hence, it has two-sided error. For constant ε, this tester makes O(1) queries! THM: Cycle-freeness has no one-sided error tester of o(N^(1/2)) query complexity, but does have a one-sided error tester of Õ(N^(1/2)) queries. N.B.: The tester does not try to find cycles. In contrast, a one-sided error tester may only reject when seeing cycles.

21 In General: Testing Graph Properties in the Bounded-Degree Model. Questions (wrt constant proximity parameter): Testability in sub-linear query complexity. Testability in constant query complexity. One-sided vs two-sided error probability. E.g., cycle-freeness has a constant-query tester of two-sided error, no one-sided error tester of o(N^(1/2)) query complexity, but does have a one-sided error tester of Õ(N^(1/2)) queries.

22 End The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/pt-intro.ppt A survey on testing graph properties is available at http://www.wisdom.weizmann.ac.il/~oded/p_tgp.html Other surveys are available at http://www.wisdom.weizmann.ac.il/~oded/surveys.html

23 Example 6: Testing Connectivity in the “Bounded-Degree Graphs Model”. The GR Tester (input graph G, incidence queries, prox. par. ε): 1. Select uniformly O(1/ε) start vertices. 2. For each start vertex, explore (*) until visiting Õ(1/ε) vertices. 3. Accept if and only if no small connected component is seen. *) The exploration is rather arbitrary. In a more efficient tester, Steps (1) & (2) are replaced by selecting, for each i=1,…,log(1/ε), 2^i start vertices and exploring from each of these vertices until visiting O(2^(−i)/ε) vertices. Observation: A graph is far from being connected if and only if it has many (small) connected components.
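The basic (not the more efficient) version of this tester can be sketched as follows. The assumptions are: the incidence oracle `neighbor(u, i)` is a hypothetical callable returning the i-th neighbor of u (0-indexed) or `None`; the exploration is a BFS, one of many valid choices since the slide calls it arbitrary; and both the number of start vertices and the exploration budget use the illustrative value `4 / eps` without logarithmic factors.

```python
import math
import random
from collections import deque


def component_smaller_than(neighbor, d, s, budget):
    """True iff BFS from s exhausts its component before visiting `budget` vertices."""
    visited = {s}
    if len(visited) >= budget:
        return False
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for i in range(d):
            v = neighbor(u, i)
            if v is not None and v not in visited:
                visited.add(v)
                if len(visited) >= budget:
                    return False  # component is "large": stop exploring
                queue.append(v)
    return True


def connectivity_test(neighbor, N, d, eps, rng=random):
    """Accept iff no sampled start vertex lies in a small connected component."""
    num_starts = math.ceil(4 / eps)      # illustrative constant
    budget = min(N, math.ceil(4 / eps))  # capped at N so a connected graph is never "small"
    for _ in range(num_starts):
        s = rng.randrange(N)
        if component_smaller_than(neighbor, d, s, budget):
            return False
    return True


cycle20 = lambda u, i: (u + 1) % 20 if i == 0 else (u - 1) % 20
print(connectivity_test(cycle20, 20, 2, 0.5))            # True: the 20-cycle is connected
print(connectivity_test(lambda u, i: None, 20, 2, 0.5))  # False: every vertex is isolated
```

A connected graph is always accepted, because each exploration either reaches the budget or covers all N vertices.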

