Property Testing: Sublinear-Time Approximate Decisions Oded Goldreich Weizmann Institute of Science Talk at CTW, July 2013.

My Aim: Promote Research in Property Testing
I think this area is still under-explored. This holds in particular with respect to testing graph properties (in the various models).

Property Testing (super-fast approximate decision): an illustration
[Illustration: is this a Gothic cathedral? Deciding by inspecting few locations in the object.]
One motivation: real objects are far apart. Other motivations: approximation per se, or a preliminary step.

Property Testing: informal definition
A relaxation of a decision problem: for a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., O is far from any other object having P).
Focus: sub-linear time algorithms, i.e., performing the task by inspecting the object at few locations.
Objects are viewed as functions; inspecting = querying the function/oracle.

Property Testing: the standard (one-sided error) definition
A property P = ∪_n P_n, where P_n is a set of functions with domain D_n.
The tester T gets explicit inputs n and ε, and oracle access to a function f with domain D_n.
If f ∈ P_n, then Prob[T^f(n,ε) accepts] = 1 (or > 2/3).
If f is ε-far from P_n, then Prob[T^f(n,ε) rejects] > 2/3.
(Distance is defined as the fraction of disagreements.)
Focus: query complexity q(n,ε) ≪ |D_n|. Special focus: q(n,ε) = q(ε), independent of n.
Terminology: ε is called the proximity parameter.
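The distance notion in this definition can be made concrete with a small brute-force sketch. The helper names (`distance`, `distance_to_property`) and the toy property of constant functions are illustrative assumptions, not part of the talk; a real tester never enumerates the domain like this.

```python
def distance(f, g, domain):
    """Fraction of domain points on which f and g disagree."""
    domain = list(domain)
    return sum(f(x) != g(x) for x in domain) / len(domain)

def distance_to_property(f, members, domain):
    """Distance from f to the closest member of the property (brute force)."""
    return min(distance(f, g, domain) for g in members)

# Toy property (hypothetical example): the two constant functions on {0,...,7}.
domain = range(8)
constants = [lambda x: 0, lambda x: 1]

# f differs from the all-zero function at a single point.
f = lambda x: 1 if x == 3 else 0

print(distance_to_property(f, constants, domain))  # 0.125
```

Here f is ε-far from the property exactly when the printed value exceeds ε, so f is 0.1-far but not 0.2-far from being constant.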

Example 1: Testing Linearity
Given (access to) a function f: G → H, determine whether it is linear* (or far from any linear function).
The BLR Tester (repeated O(1/ε) times when given proximity parameter ε):
1. Select x, y ∈ G uniformly and independently.
2. Accept if and only if f(x)+f(y) = f(x+y).
*) G and H are groups, and f is linear (i.e., a group homomorphism) if f(x+y) = f(x)+f(y) holds for every x, y ∈ G, where the first "+" is that of G and the second that of H.
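The two steps above can be sketched for the special case G = (Z_2)^n and H = Z_2. This is only a sketch under stated assumptions: group elements are encoded as n-bit integers (so addition is XOR), and the constant 4 in the repetition count is an arbitrary stand-in for the O(1/ε) on the slide.

```python
import random

def blr_test(f, n, eps, rng=random):
    """BLR linearity tester for f: (Z_2)^n -> Z_2, with one-sided error.

    Elements of (Z_2)^n are n-bit integers; group addition is XOR.
    The repetition count 4/eps is an illustrative choice of the O(1/eps).
    """
    reps = max(1, int(4 / eps))
    for _ in range(reps):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if (f(x) ^ f(y)) != f(x ^ y):
            return False  # a violated triple was found: reject
    return True  # every sampled triple was consistent: accept

def parity_of(mask):
    """The linear function x -> <mask, x> mod 2."""
    return lambda x: bin(x & mask).count("1") % 2

print(blr_test(parity_of(0b1011), 4, 0.1))  # True: linear functions are always accepted
print(blr_test(lambda x: 1, 4, 0.1))        # False: the constant-1 function fails every triple
```

The constant-1 function is a convenient sanity check because f(x)+f(y) = 0 while f(x+y) = 1 for every pair, so a single iteration already rejects.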

Linearity Testing, cont. (recall f: G → H)
1. Select x, y ∈ G uniformly and independently.
2. Accept if and only if f(x)+f(y) = f(x+y).
Analysis*: Clearly, if f is linear, then the test accepts with probability 1. Suppose that f is δ-far from being linear (i.e., it disagrees with each linear function on at least a δ_lin(f) ≥ δ fraction of the domain). Let h be a linear function closest to f. Then
Prob[Test rejects f] = Prob_{x,y}[f(x)+f(y) ≠ f(x+y)]
  ≥ 3 · Prob_{x,y}[f(x)≠h(x) & f(y)=h(y) & f(x+y)=h(x+y)]
  ≥ 3 · Prob_{x,y}[f(x)≠h(x)] · (1 − 2 · Prob_{x,y}[f(y)≠h(y) | f(x)≠h(x)])
  = 3δ_lin(f) · (1 − 2δ_lin(f)).
So, assuming that the rejection probability increases with the distance, we are done. But does this natural assumption hold?
*) The analysis refers to a single iteration of the test.

Linearity Testing, cont. (2) (recall f: G → H)
1. Select x, y ∈ G uniformly and independently.
2. Accept if and only if f(x)+f(y) = f(x+y).
Does the rejection probability increase with the distance? Surprisingly, the answer is no, at least for G = (Z_2)^n and H = Z_2!
Notes: The maximum distance is ½. The lower bound 3δ − 6δ² is tight in [0, 5/16]. Additional lower bounds are δ and 45/128 for δ ≥ 5/16. Indeed, strange… BTW, an alternative simple proof gives min(1/6, δ/2).
[Plot: rejection probability versus distance δ, showing the lower bounds 3δ − 6δ², δ and 45/128; marked values ⅜, ¼, 45/128 and 5/16.]
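For very small n, the relation between distance and rejection probability can be checked exhaustively. The sketch below enumerates all 256 functions f: (Z_2)^3 → Z_2, computes δ_lin(f) and the exact single-iteration rejection probability, and confirms the lower bound 3δ − 6δ². The function names and the truth-table encoding are assumptions made for this illustration.

```python
def rejection_prob(table, n):
    """Exact single-iteration BLR rejection probability of a truth table."""
    size = 1 << n
    bad = sum((table[x] ^ table[y]) != table[x ^ y]
              for x in range(size) for y in range(size))
    return bad / size**2

def dist_to_linear(table, n):
    """Distance from the truth table to the closest linear function x -> <mask, x>."""
    size = 1 << n
    return min(
        sum(table[x] != bin(x & mask).count("1") % 2 for x in range(size))
        for mask in range(size)
    ) / size

def check_bound(n):
    """Return the truth tables violating rej >= 3*delta - 6*delta**2 (expected: none)."""
    size = 1 << n
    violations = []
    for code in range(1 << size):
        table = [(code >> x) & 1 for x in range(size)]
        delta = dist_to_linear(table, n)
        rej = rejection_prob(table, n)
        if rej < 3 * delta - 6 * delta**2 - 1e-12:  # tolerance for float rounding
            violations.append(table)
    return violations

print(check_bound(3))  # []
```

The enumeration is doubly exponential in n, so this only serves as a sanity check for n up to about 4.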

Example 2: Testing Bipartiteness in the “Dense Graphs Model”
Note: The representation affects both the type of queries and the distance measure.
A graph G=([N],E) is represented by a function g: [N]×[N] → {0,1} (i.e., g(u,v)=1 iff (u,v) is an edge in G). This representation determines:
1. The type of queries: adjacency queries.
2. The distance measure: #differences/N².

Testing Bipartiteness in the Dense Graphs Model
The GGR Tester (input graph G, adjacency queries, proximity parameter ε):
1. Select uniformly a subset of Õ(1/ε²) vertices in G.
2. Accept if and only if the subgraph induced by this set is bipartite.
Analysis: Clearly, if G is bipartite, then the test accepts with probability 1. Suppose that G=([N],E) is ε-far from being bipartite (i.e., εN² edges must be omitted from G to make it bipartite). Partition the sample into two (non-equal) parts: a subset of size Õ(1/ε), denoted U, and a subset of size Õ(1/ε²), denoted S. Consider all 2-way partitions of U: for each partition (U_1,U_2), consider the partition induced on all (graph) vertices such that all neighbors of U_i are on the side opposite to it. (To be continued.)

Testing Bipartiteness in the Dense Graphs Model, cont.
Analysis: Suppose that G=([N],E) is ε-far from being bipartite. Partition the sample into two (non-equal) parts: a subset of size Õ(1/ε), denoted U, and a subset of size Õ(1/ε²), denoted S. Consider all 2-way partitions of U: for each partition (U_1,U_2), consider the partition induced on all graph vertices such that all neighbors of U_i are on the side opposite to it. Vertices that neighbor both U_i's “witness” the badness of (U_1,U_2).
W.h.p., almost all high-degree vertices have neighbors in U (here “high degree” means degree at least a constant fraction of εN, and “almost all” means all but at most a constant fraction of εN vertices).
There are many violating edges between vertices assigned to the same side (“many” = Ω(εN²) edges), since U “dominates” almost all edges.
A vertex pair selected at random hits such an edge with probability Ω(ε).
Thus, each potential partition is “rejected” (i.e., we find a violating edge with respect to it) with probability at least 1 − (1 − Ω(ε))^{|S|/2} ≥ 1 − exp(−|U|). A union bound over the 2^{|U|} partitions of U then implies that, with probability at least 2/3, the subgraph induced by U ∪ S is not bipartite.
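The tester itself (as opposed to its analysis) is short. The following is a minimal sketch, assuming the graph is given through an adjacency-query function `adj(u, v)` on vertices 0..n-1; the sample size `8/eps**2` is an arbitrary stand-in for the Õ(1/ε²) on the slide, and the function names are hypothetical.

```python
import random
from collections import deque

def is_bipartite(vertices, adj):
    """2-colour the subgraph induced by `vertices` by BFS, using adjacency queries."""
    colour = {}
    for root in vertices:
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v != u and adj(u, v):
                    if v not in colour:
                        colour[v] = 1 - colour[u]
                        queue.append(v)
                    elif colour[v] == colour[u]:
                        return False  # odd cycle inside the sample
    return True

def ggr_bipartite_test(n, adj, eps, rng=random):
    """Accept iff the subgraph induced by a uniform vertex sample is bipartite."""
    k = min(n, int(8 / eps**2) + 1)  # illustrative stand-in for O~(1/eps^2)
    sample = rng.sample(range(n), k)
    return is_bipartite(sample, adj)

# A complete bipartite graph (even vs. odd vertices) is always accepted.
print(ggr_bipartite_test(200, lambda u, v: u % 2 != v % 2, 0.5))  # True
# A complete graph is rejected as soon as the sample has three vertices.
print(ggr_bipartite_test(200, lambda u, v: u != v, 0.5))          # False
```

The number of adjacency queries is quadratic in the sample size, and none of them depends on N, which matches the "independent of n" focus of the definition.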

Example 3: Testing Triangle-Freeness in the “Dense Graphs Model”
Task: Given a proximity parameter ε and (adjacency) query access to G, determine whether G is triangle-free or εN² edges must be omitted to eliminate all triangles.
(“Clearly”) The following tester will do: select a sample of M(ε) vertex triples and accept if and only if none of these triples induces a triangle in G. (We query the relevant pairs...)
How large should M(ε) be? Please guess; to be continued...

Testing Triangle-Freeness in the “Dense Graphs Model”, cont.
Task: Given a proximity parameter ε and (adjacency) query access to G, determine whether G is triangle-free or ε-far from being triangle-free.
The candidate tester: select a sample of M(ε) vertex triples and accept if and only if none of these triples induces a triangle in G.
How large should M(ε) be?
Guess #1: M(ε) = O(1/ε³). Wrong!
Guess #2: M(ε) = poly(1/ε). Wrong!
Well, I don’t know the answer… Still, it is known that it is at least super-polynomial in 1/ε and at most a tower of poly(1/ε) many exponents.
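The candidate tester can be sketched directly. Since the slide stresses that the right value of M(ε) is not known, the sketch takes the number of triples as an explicit parameter `num_triples` rather than deriving it from ε; the function name and the adjacency-query interface `adj(u, v)` are assumptions for illustration.

```python
import random

def triangle_freeness_test(n, adj, num_triples, rng=random):
    """Accept iff none of `num_triples` random vertex triples induces a triangle.

    `num_triples` plays the role of M(eps); it is supplied by the caller.
    """
    for _ in range(num_triples):
        u, v, w = rng.sample(range(n), 3)  # three distinct vertices
        if adj(u, v) and adj(v, w) and adj(u, w):
            return False  # a triangle was found: reject
    return True

# Bipartite graphs are triangle-free, so they are always accepted.
print(triangle_freeness_test(100, lambda u, v: u % 2 != v % 2, 50))  # True
# In a complete graph every triple is a triangle.
print(triangle_freeness_test(100, lambda u, v: u != v, 50))          # False
```

Each triple costs three adjacency queries, so the query complexity is 3·M(ε), and the tester has one-sided error by construction.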

In General: Testing Graph Properties in the Dense Model
q adaptive queries → O(q²) non-adaptive queries [GT].
Nested complexity levels (from largest class to smallest):
Properties testable in F(ε) queries (characterization by [AFNS, BCLSSV]): Triangle-freeness [A].
Properties testable in poly(1/ε) queries: Bipartite [GGR, BT].
Properties testable in Õ(1/ε) queries: CC, BCC [GR08].
Properties testable in Õ(1/ε) non-adaptive queries.

Testing Graph Properties in the Dense Model: The “lowest” complexity level
BL(H) = the set of graphs obtained by a (not necessarily balanced) blow-up of the graph H.
Testable in Õ(1/ε) non-adaptive queries: BL(H) for any fixed graph H [AG]. The special case of H being a clique was done in [GR08].
[Figure: a blow-up of a 5-cycle.]

Example 4: Testing Bipartiteness in the “Bounded-Degree Graphs Model”
Note: The representation affects both the type of queries and the distance measure.
A graph G=([N],E) of maximal degree d is represented by a function g: [N]×[d] → [N] ∪ {0} (i.e., g(u,i)=v iff v is the i-th neighbor of u in G). This representation determines:
1. The type of queries: incidence queries.
2. The distance measure: #differences/dN.

Testing Bipartiteness in the Bounded-Degree Graphs Model
The GR Tester (input graph G, incidence queries, proximity parameter ε):
1. Select uniformly O(1/ε) start vertices.
2. For each start vertex s, take m = Õ(N^{1/2}/poly(ε)) random walks, each of length l = poly((log N)/ε).
3. Accept if and only if the subgraph explored is bipartite.
Lower bound: Ω(N^{1/2}) queries (in contrast to the “dense” graphs model).
Analysis: Clearly, if G is bipartite, then the test accepts with probability 1. Suppose that G=([N],E) is ε-far from being bipartite (i.e., εdN edges must be omitted from G to make it bipartite).
Simplifying assumption (unjustified!): G is an expander.
Let p_v(σ) = the probability that a lazy random l-walk starting at vertex s reaches v such that the induced path has length of parity σ. Consider a 2-partition placing v according to p_v(0) vs p_v(1). (To be continued.)

Testing Bipartiteness in the Bounded-Degree Model (cont.)
The GR Tester (input graph G, incidence queries, proximity parameter ε):
1. Select uniformly O(1/ε) start vertices.
2. For each start vertex s, take m = Õ(N^{1/2}/poly(ε)) random walks, each of length l = poly((log N)/ε).
3. Accept if and only if the subgraph explored is bipartite.
Analysis: Simplifying assumption (unjustified!): G is an expander.
Let p_v(σ) = the probability that a lazy random l-walk starting at vertex s reaches v such that the induced path has length of parity σ. (Lazy random walk = in each step, the walk stays in place with probability ½.) Consider a 2-partition placing v according to p_v(0) vs p_v(1).
If ∑_σ ∑_{(u,v)∈E} p_u(σ)·p_v(σ) < ε/N, then this partition has at most εdN violating edges; otherwise (i.e., for a larger sum), the tester rejects with probability at least 2/3. (Each of these two claims requires a proof.)
Thus, if G is not ε-close to bipartite, the tester rejects with probability > 2/3.
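The three steps of the tester can be sketched as follows. This is a rough illustration under several assumptions that are not from the talk: the graph is a list of adjacency lists (`neighbors[u][i]` answers the incidence query for the i-th neighbor of u), the lazy walk moves along each incident edge with probability 1/(2d) and otherwise stays put, and the concrete counts (`4/eps` starts, `sqrt(n)/eps` walks, `(log2(n+1)/eps)**2` steps) are arbitrary stand-ins for the O(·), Õ(·) and poly(·) expressions.

```python
import math
import random
from collections import defaultdict, deque

def explored_is_bipartite(edges):
    """Check by BFS 2-colouring whether the graph formed by `edges` is bipartite."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    colour = {}
    for root in nbrs:
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False
    return True

def gr_bipartite_test(neighbors, d, eps, rng=random):
    """Random-walk bipartiteness tester sketch for graphs of maximum degree d."""
    n = len(neighbors)
    starts = max(1, int(4 / eps))                     # stand-in for O(1/eps)
    walks = int(math.sqrt(n) / eps) + 1               # stand-in for O~(sqrt(N)/poly(eps))
    length = int((math.log2(n + 1) / eps) ** 2) + 1   # stand-in for poly(log N / eps)
    edges = set()
    for _ in range(starts):
        s = rng.randrange(n)
        for _ in range(walks):
            u = s
            for _ in range(length):
                i = rng.randrange(2 * d)      # lazy step: stay with probability >= 1/2
                if i < len(neighbors[u]):
                    v = neighbors[u][i]       # incidence query
                    edges.add((u, v))
                    u = v
    return explored_is_bipartite(edges)

# An 8-cycle is bipartite, so the explored subgraph is always bipartite.
cycle8 = [[(i - 1) % 8, (i + 1) % 8] for i in range(8)]
print(gr_bipartite_test(cycle8, 2, 0.5))  # True
```

Because the tester only rejects when the explored subgraph contains an odd cycle, bipartite inputs are accepted with probability 1, as on the slide.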

Example 5: Testing Cycle-freeness in the “Bounded-Degree Graphs Model”
The GR Tester (input graph G, incidence queries, proximity parameter ε):
1. Select uniformly s = O(1/ε²) start vertices.
2. For each start vertex, explore (*) till visiting O(1/ε) vertices.
3. If a cycle was found in any of these explorations, reject.
4. Otherwise, let n denote the number of start vertices that reside in “large” components (i.e., connected components that were not fully explored), and let m be half the sum of their degrees. Accept iff |n−m| < εs/2.
*) The exploration is rather arbitrary.
Observation: A graph is cycle-free iff the number of edges in it equals the number of vertices minus the number of connected components. We shall approximate both.
Auxiliary observation: The number of large connected components is negligible, hence the number of small connected components approximates the total number of connected components.

Testing Cycle-freeness in the “Bounded-Degree Graphs Model”, cont.
The GR Tester (input graph G, incidence queries, proximity parameter ε):
1. Select uniformly s = O(1/ε²) start vertices.
2. For each start vertex, explore till visiting O(1/ε) vertices.
3. If a cycle was found in any of these explorations, reject.
4. Otherwise, let n denote the number of start vertices that reside in “large” components (i.e., connected components that were not fully explored), and let m be half the sum of their degrees. Accept iff |n−m| < εs/2.
For constant ε, this tester makes O(1) queries!
The tester approximates the number of edges and the number of connected components. Hence, it has two-sided error.
N.B.: The tester does not try to find cycles. In contrast, a one-sided error tester may only reject when seeing cycles.
THM: Cycle-freeness has no one-sided error tester of o(N^{1/2}) query complexity, but does have a one-sided error tester of Õ(N^{1/2}) queries.
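The four steps can be sketched as below. The sketch assumes a simple graph given as adjacency lists, uses BFS as the (arbitrary) exploration, and picks `4/eps**2` start vertices and an exploration limit of `4/eps` as illustrative constants; these choices and the function names are not from the talk.

```python
import random
from collections import deque

def explore(neighbors, start, limit):
    """BFS from `start`; return 'cycle', 'large' (limit hit) or 'small' (fully explored)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v == parent[u]:
                continue              # the tree edge we arrived by
            if v in parent:
                return "cycle"        # non-tree edge between visited vertices
            if len(parent) >= limit:
                return "large"        # component not fully explored
            parent[v] = u
            queue.append(v)
    return "small"

def cycle_freeness_test(neighbors, eps, rng=random):
    """Two-sided error cycle-freeness tester sketch (simple graphs assumed)."""
    s = int(4 / eps**2) + 1           # stand-in for O(1/eps^2) start vertices
    limit = int(4 / eps) + 1          # stand-in for O(1/eps) visited vertices
    n_large, m_large = 0, 0.0
    for _ in range(s):
        v = rng.randrange(len(neighbors))
        outcome = explore(neighbors, v, limit)
        if outcome == "cycle":
            return False
        if outcome == "large":
            n_large += 1
            m_large += len(neighbors[v]) / 2  # half the degree of the start vertex
    return abs(n_large - m_large) < eps * s / 2

# 100 disjoint edges form a forest with only small components.
matching = [[i ^ 1] for i in range(200)]
print(cycle_freeness_test(matching, 0.5))   # True
# 50 disjoint triangles: every exploration finds a cycle.
triangles = [[3 * (i // 3) + (i + 1) % 3, 3 * (i // 3) + (i + 2) % 3] for i in range(150)]
print(cycle_freeness_test(triangles, 0.5))  # False
```

The number of queries depends only on ε (and the degree bound), in line with the O(1) query claim for constant ε.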

In General: Testing Graph Properties in the Bounded-Degree Model
Questions (with respect to a constant proximity parameter):
Testability in sub-linear query complexity.
Testability in constant query complexity.
One-sided vs two-sided error probability.
E.g., cycle-freeness has a constant-query tester of two-sided error and no one-sided error tester of o(N^{1/2}) query complexity, but it does have a one-sided error tester of Õ(N^{1/2}) queries.

End The slides of this talk are available at A survey on testing graph properties is available at Other surveys are available at

Example 6: Testing Connectivity in the “Bounded-Degree Graphs Model”
The GR Tester (input graph G, incidence queries, proximity parameter ε):
1. Select uniformly O(1/ε) start vertices.
2. For each start vertex, explore (*) till visiting Õ(1/ε) vertices.
3. Accept if and only if no small connected component is seen.
*) The exploration is rather arbitrary.
In a more efficient tester, Steps (1) and (2) are replaced by selecting, for each i=1,…,log(1/ε), 2^i start vertices and exploring from each of these vertices till visiting O(2^{-i}/ε) vertices.
Observation: A graph is far from being connected if and only if it has many (small) connected components.
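The basic (non-optimized) version of this tester can be sketched as follows, assuming adjacency lists and BFS as the exploration. The counts `4/eps` for both the number of start vertices and the exploration limit are illustrative stand-ins for the O(1/ε) and Õ(1/ε) on the slide, and the function name is hypothetical.

```python
import random
from collections import deque

def connectivity_test(neighbors, eps, rng=random):
    """Accept iff no sampled start vertex lies in a small connected component."""
    n = len(neighbors)
    starts = int(4 / eps) + 1   # stand-in for O(1/eps) start vertices
    limit = int(4 / eps) + 1    # stand-in for O~(1/eps) visited vertices
    for _ in range(starts):
        s = rng.randrange(n)
        seen = {s}
        queue = deque([s])
        while queue and len(seen) < limit:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        # The component was exhausted below the limit and is not the whole graph.
        if not queue and len(seen) < limit and len(seen) < n:
            return False
    return True

# A 50-cycle is connected: every exploration reaches the limit.
cycle50 = [[(i - 1) % 50, (i + 1) % 50] for i in range(50)]
print(connectivity_test(cycle50, 0.5))   # True
# 100 disjoint edges: every start vertex lies in a component of size 2.
matching = [[i ^ 1] for i in range(200)]
print(connectivity_test(matching, 0.5))  # False
```

A connected graph never contains a small component other than the whole graph, so this sketch has one-sided error.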