Lower Bounds for Property Testing. Luca Trevisan, U.C. Berkeley.



Sub-linear Time Algorithms
Want to design algorithms that run in less than linear time:
- cannot read the entire input
- must be probabilistic and approximate
For optimization problems:
- compute a numerical approximation of the optimum cost (and an implicit representation of an approximate solution?)
For decision problems:
- what is approximation?

Graph Property Testing [GGR]
Testing a property P with accuracy ε:
- Given a graph G that has property P, accept with probability > 3/4.
- Given a graph G that is ε-far from property P, accept with probability < 1/4.
ε-far = must change an ε-fraction of the representation of G to get property P.
Intuition: the input (not the output) is approximate.

Different Representations
G is represented as an adjacency matrix:
- ε-far = must add/remove εn^2 edges
G has max degree d and is represented using adjacency lists:
- ε-far = must add/remove εdn edges
(Some extra subtleties in the bounded-degree case.)

Purpose of This Talk
Discuss algorithms and lower bounds for:
- sub-linear time property testing for some basic graph properties
- sub-linear time approximation algorithms for some basic optimization problems
(We'll mostly discuss lower bounds.)

Motivations
Large data sets:
- web, Wal-Mart, Amazon, phone calls, ...
- linear time can still be infeasible
Fine print: most research on property testing focuses on problems having no connection to applications with large data sets.
Goal for theory research:
- develop general algorithmic techniques (like dynamic programming, local search, ... for P)
- develop general techniques for impossibility results (like NP-completeness)

Property Testing and Approximation in Adjacency Matrix Representation

Bipartiteness Algorithm [GGR,AK]
Testing bipartiteness of a given graph G:
- Pick (1/ε) polylog(1/ε) vertices and check whether they induce a bipartite graph; if so accept, otherwise reject.
- If G is bipartite, the algorithm accepts with probability 1.
- If G is ε-far from bipartite, then whp the algorithm discovers an odd cycle (non-trivial to prove).
Running time: O((1/ε^2) polylog(1/ε))

Lower Bounds [BT]
- Ω(1/ε^1.5) for adaptive algorithms
- Ω(1/ε^2) for non-adaptive algorithms
The bounds apply to the ‘query complexity’ of the algorithm (and therefore also to its running time).

Proof for the One-Sided Error Case
Pick a random graph with edge probability 3ε:
- whp it is ε-far from bipartite
Consider the view of a (possibly adaptive) algorithm that makes q ‘queries’ and finds an odd cycle whp:
- it sees Θ(εq) edges and O(ε^2 q^2) pairs of connected vertices
- a cycle can be discovered only by querying two vertices in the same connected component
- it takes Ω(1/ε) such attempts
- hence q = Ω(1/ε^1.5)
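The counting behind the adaptive bound can be spelled out as below. This is a sketch that ignores constants, writing ε for the accuracy parameter and q for the number of queries; "useful query" is a label introduced here for a query between two vertices already known to be connected.

```latex
% Only a query between two vertices in the same component can close a cycle.
\#\{\text{useful queries}\} \;\le\; \#\{\text{connected pairs}\} \;=\; O(\varepsilon^{2} q^{2}),
\qquad
\Pr[\text{a given useful query hits an edge}] \;=\; 3\varepsilon .

% Finding a cycle whp needs \Omega(1/\varepsilon) useful queries, so
\varepsilon^{2} q^{2} \;=\; \Omega(1/\varepsilon)
\quad\Longrightarrow\quad
q \;=\; \Omega\!\left(\varepsilon^{-1.5}\right).
```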

One-Sided Error, Non-Adaptive
Pick a random graph with edge probability 3ε.
Consider the view of a non-adaptive algorithm that makes q ‘queries’. This is the same as:
- start with a q-edge graph
- independently delete each edge with probability 1 - 3ε
If q = o(1/ε^2) then the view is a forest with probability 1 - o(1).
- Proof: there are at most O(q^(t/2)) cycles of length t
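A first-moment calculation shows how the cycle count gives the forest claim. It is a sketch, assuming each queried edge survives with probability 3ε and treating the constant hidden in the O(q^(t/2)) cycle count as uniform in t, which the slide does not state.

```latex
\mathbb{E}\bigl[\#\text{cycles in the view}\bigr]
\;\le\; \sum_{t \ge 3} O\!\left(q^{t/2}\right)(3\varepsilon)^{t}
\;=\; \sum_{t \ge 3} O\!\left(\bigl(3\varepsilon\sqrt{q}\bigr)^{t}\right)
\;=\; o(1)
\quad\text{when } q = o\!\left(1/\varepsilon^{2}\right).
```

By Markov's inequality the view then contains no cycle, i.e. it is a forest, with probability 1 - o(1).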

Two-Sided Error
Two distributions:
- Gfar: random graph with edge probability 3ε
- Gbip: first pick a random partition, then each edge crossing the partition exists with probability 6ε
The distributions are indistinguishable by:
- non-adaptive algorithms of query complexity o(1/ε^2)
- adaptive algorithms of query complexity o(1/ε^1.5)
Both bounds are tight for these distributions.
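The two distributions are easy to sample, which makes the matching edge densities visible: a pair crosses a random partition with probability 1/2, so 6ε on crossing pairs gives the same marginal 3ε per pair. The sketch below is illustrative only; the function names are chosen here and the edge-set representation is an assumption.

```python
import random

def sample_gfar(n, eps, rng):
    """Random graph: every pair is an edge independently with probability 3*eps."""
    p = 3 * eps
    return {(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p}

def sample_gbip(n, eps, rng):
    """Random bipartite graph: random partition, crossing pairs kept w.p. 6*eps."""
    side = [rng.randrange(2) for _ in range(n)]
    p = 6 * eps
    edges = {(u, v) for u in range(n) for v in range(u + 1, n)
             if side[u] != side[v] and rng.random() < p}
    return edges, side

# Usage: both samples have about 3*eps*n*(n-1)/2 edges on average.
rng = random.Random(0)
far_edges = sample_gfar(200, 0.05, rng)
bip_edges, side = sample_gbip(200, 0.05, rng)
```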

Generality/Lessons
Possible lesson: try a random graph as a distribution of ‘hard’ instances that are far from having the property.
This does not work for the “triangle freeness” property, whose complexity is possibly the most interesting open question in the adjacency matrix model.

Triangle-free Graphs
Want to distinguish triangle-free graphs from graphs where one needs to remove εn^2 edges to break all triangles.
- Solvable in time super-exponential in 1/ε
- Polynomial in 1/ε is impossible [Alon]
- Is 2^poly(1/ε) possible?
Simplest special case of a more general (and important) question.

Sublinear Time Approximation
- Max CUT and other graph problems can be approximated within (1+ε) in graphs with at least εn^2 edges in time 2^poly(1/ε) [GGR]
- Max 3SAT can be approximated within (1+ε) in instances with at least εn^3 clauses in time 2^poly(1/ε), with similar results for other satisfiability problems [AFKK]
Lower bounds?

Property Testing and Approximation in Adjacency List Representation

Bipartiteness [GR]
Testing bipartiteness. Repeat polylog n times:
- start at a random vertex and run sqrt(n) random walks of length polylog n; if two of them combine to form an odd cycle, reject
Accept if no repetition rejects.
Analysis:
- in a graph where a constant fraction of the edges must be removed to make it bipartite, the algorithm finds an odd cycle

Matching Lower Bound [GR]
Define two distributions of graphs:
- Gfar: a random Hamiltonian circuit plus a random matching (whp 1/100-far from bipartite)
- Gbip: a random Hamiltonian circuit plus a random matching, conditioned on making the graph bipartite
Gfar and Gbip are indistinguishable to algorithms of query complexity o(sqrt(n)).
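Both distributions can be sampled directly. For even n the Hamiltonian circuit has a unique bipartition (alternate positions along the circuit), so conditioning the matching on bipartiteness amounts to matching even positions to odd positions. The sketch below assumes n is even and allows a matching edge to duplicate a circuit edge; the function names are chosen here.

```python
import random

def random_hamiltonian_cycle(n, rng):
    """Return a random vertex order and the n edges of the circuit it defines."""
    order = list(range(n))
    rng.shuffle(order)
    edges = [(order[i], order[(i + 1) % n]) for i in range(n)]
    return order, edges

def sample_sparse_gfar(n, rng):
    """Circuit plus a uniformly random perfect matching (n even)."""
    _, cycle = random_hamiltonian_cycle(n, rng)
    verts = list(range(n))
    rng.shuffle(verts)
    matching = [(verts[i], verts[i + 1]) for i in range(0, n, 2)]
    return cycle + matching

def sample_sparse_gbip(n, rng):
    """Circuit plus a random matching between even and odd circuit positions."""
    order, cycle = random_hamiltonian_cycle(n, rng)
    even, odd = order[0::2], order[1::2]
    rng.shuffle(odd)
    matching = list(zip(even, odd))
    return cycle + matching, set(even)
```

Every vertex has degree 3 in both samples, so degrees alone reveal nothing to a tester.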

Approximation Algorithms
Minimum spanning tree:
- given a connected weighted graph of degree d with weights in the range {1,...,w}, can approximate the MST weight within (1+ε) in time about O(dw/ε^2) [Chazelle, Rubinfeld, T]
Max SAT:
- given a CNF where every variable occurs at most d times, can approximate the Max SAT optimum within 0.618, presumably also 2/3, in O(d) time [hopefully will get 3/4 - ε]
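One known route to MST-weight estimates of this kind goes through an identity: for a connected graph with integer weights in {1,...,w}, the MST weight equals n - w plus the sum, over i = 1,...,w-1, of the number of connected components of the subgraph that keeps only edges of weight at most i. The slide does not spell this out, so treat the link to the cited algorithm as an assumption. The sketch below evaluates the identity exactly with union-find; it is not sublinear and does not attempt the sampling-based estimation of the component counts.

```python
def count_components(n, edges):
    """Number of connected components of a graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    comps = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
    return comps

def mst_weight_via_components(n, weighted_edges, w):
    """MST weight of a connected graph with integer weights in {1..w}."""
    total = n - w
    for i in range(1, w):
        light = [(u, v) for u, v, wt in weighted_edges if wt <= i]
        total += count_components(n, light)
    return total

# Usage: a 4-vertex graph whose MST uses weights 1, 1 and 3.
edges = [(0, 1, 1), (0, 2, 1), (1, 2, 2), (2, 3, 3), (0, 3, 3)]
weight = mst_weight_via_components(4, edges, 3)
```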

Testing 3-Colorability
NP-hard in the adjacency list representation, but only for small enough ε:
- can find a 3-coloring good for 80% of the edges in a 3-colorable graph using SDP
- NP-hard to find a 3-coloring good for a 98% (?) fraction of the edges
Gives a non-tight, conditional lower bound for query complexity.

Other Problems
The query complexity of the following problems is ‘equivalent’ to the query complexity of testing 3-colorability:
- testing satisfiability of a 3SAT instance (every variable occurs in O(1) clauses, “adjacency list” representation)
- approximating Max Cut, vertex cover, independent set, ..., in bounded-degree graphs
- approximating Max SAT, Max 2SAT, ...
Lower bound of sqrt(n) for all problems:
- reduction from bipartiteness

Tight Lower Bound [BOT]
For one-sided error algorithms:
- Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are (1/3 - ε)-far
- the lower bound applies to testing problems that are solvable in polynomial time
For two-sided error algorithms:
- for some ε, Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are ε-far

Using Reductions...
Unconditionally, algorithms running in time o(n) cannot:
- approximate Max 3SAT better than 7/8
- approximate Max Cut in bounded-degree graphs better than 16/17
- ...
Håstad '97 proved that the above problems are NP-hard.

The 3-Coloring Lower Bound
Consider first one-sided error algorithms.
It is enough to find a graph G that is (1/3 - ε)-far from 3-colorable, but such that every subgraph of size < δn is 3-colorable:
- (for every ε there is a δ such that ...)
Then an algorithm of query complexity < δn either accepts G (which is wrong) or rejects some 3-colorable graph (which means the algorithm does not have one-sided error).

The Graph
- Pick a graph of degree O(1/ε^2) at random (take that many random matchings).
- Then it is (1/3 - ε)-far whp.
- But, for some δ, whp, every subgraph induced by k < δn vertices contains < 1.5k edges.
- In a minimal non-3-colorable graph, every vertex has degree at least 3, so such a graph on k vertices has at least 1.5k edges.
- Hence every subgraph induced by < δn vertices is 3-colorable [Erdos].
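The sparsity argument is constructive. If every induced subgraph on k vertices has fewer than 1.5k edges, its average degree is below 3, so it always contains a vertex of degree at most 2; peeling such vertices one by one and coloring them back in reverse order never needs more than 3 colors. The sketch below illustrates this with a hypothetical helper name and a dict-of-sets graph representation chosen here.

```python
def three_color_by_peeling(neighbors):
    """3-color a graph in which repeated removal of degree <= 2 vertices empties it.

    neighbors maps each vertex to the set of its neighbors.
    Returns a dict vertex -> color in {0, 1, 2}, or None if peeling gets stuck.
    """
    remaining = {v: set(ns) for v, ns in neighbors.items()}
    order = []
    while remaining:
        v = next((u for u, ns in remaining.items() if len(ns) <= 2), None)
        if v is None:
            return None      # a subgraph with min degree >= 3: argument fails
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)

    color = {}
    for v in reversed(order):
        # At most 2 neighbors of v are already colored, so a color is free.
        used = {color[u] for u in neighbors[v] if u in color}
        color[v] = min(c for c in range(3) if c not in used)
    return color
```

A `None` result only means the peeling argument does not apply; it does not certify that the graph is not 3-colorable.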

Derandomization
For constants d, ε, δ, and for every sufficiently large n, we can explicitly construct a graph:
- on n vertices,
- with max degree d,
- ε-far from 3-colorable,
- such that every subset of δn vertices induces a 3-colorable subgraph.

Two-Sided Error Algorithms
Need to define two distributions of graphs Gcol and Gfar such that:
- graphs in Gcol are (almost) always 3-colorable
- graphs in Gfar are (almost) always far from 3-colorable
- to an algorithm of bounded query complexity, Gcol and Gfar look (almost) the same

Main Step
Define two distributions Dsat and Dfar of instances of E3LIN-2 (systems over GF(2) with 3 variables per equation):
- systems in Dsat are always satisfiable
- systems in Dfar are (almost) always (1/2 - ε)-far from satisfiable
- to an algorithm of bounded query complexity, Dsat and Dfar look the same
We get Gcol and Gfar using a reduction from approximate E3LIN-2 to approximate 3-coloring.

E3LIN-2
X1 + X3 + X10 = 0 mod 2
X2 + X3 + X4 = 1 mod 2
X1 + X2 + X9 = 0 mod 2
...
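For concreteness, an E3LIN-2 instance can be stored as a list of (variable triple, right-hand side) pairs and an assignment scored by the fraction of equations it satisfies. The representation and the function name below are assumptions made for illustration; the three equations are the ones shown on the slide.

```python
def satisfied_fraction(equations, assignment):
    """equations: list of ((i, j, k), b) meaning x_i + x_j + x_k = b (mod 2)."""
    ok = sum((assignment[i] ^ assignment[j] ^ assignment[k]) == b
             for (i, j, k), b in equations)
    return ok / len(equations)

# The three equations from the slide, over variables X1..X10.
equations = [((1, 3, 10), 0), ((2, 3, 4), 1), ((1, 2, 9), 0)]
all_zero = {i: 0 for i in range(1, 11)}
with_x4 = dict(all_zero)
with_x4[4] = 1   # setting X4 = 1 fixes the second equation
```

A uniformly random assignment satisfies each equation with probability 1/2, which is why the interesting regime for "far from satisfiable" sits just below 1/2.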

Main Building Block
We show that for every c there is a δ such that there exists a left-hand side with:
- n variables, cn equations, 3 variables per equation, every variable occurring in 3c equations
- every δn equations linearly independent
Pick the left-hand side at random:
- repeat 3c times: pick at random a set of n/3 disjoint triples of variables
Explicit construction?
- need strong unique-neighbor expanders
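The random construction of the left-hand side can be sketched directly from the recipe above: each of the 3c rounds partitions the variables into n/3 disjoint triples. The function name is chosen here, variables are indexed from 0, and n is assumed to be divisible by 3. The sketch does not check the linear-independence property, which holds only with high probability.

```python
import random

def random_left_hand_side(n, c, rng):
    """Return c*n variable triples; every variable appears in exactly 3*c of them."""
    assert n % 3 == 0, "this sketch assumes n is a multiple of 3"
    triples = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        # Consecutive blocks of a random permutation are disjoint random triples.
        triples.extend(tuple(perm[i:i + 3]) for i in range(0, n, 3))
    return triples

# Usage: 9 variables, c = 2, hence 18 equations.
lhs = random_left_hand_side(9, 2, random.Random(0))
```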

Distributions
The left-hand side is always as before.
In Dsat, we pick a random assignment to the variables and set the right-hand side consistently:
- always satisfiable
In Dfar, we pick the right-hand side uniformly at random:
- with high probability, (1/2 - O(1/sqrt(c)))-far
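Given a fixed list of variable triples, the two distributions differ only in how the right-hand sides are drawn. The sketch below assumes the triples are tuples of 0-based variable indices; the function names are illustrative.

```python
import random

def sample_dsat(triples, n, rng):
    """Right-hand sides consistent with a hidden random assignment x."""
    x = [rng.randrange(2) for _ in range(n)]
    equations = [(t, x[t[0]] ^ x[t[1]] ^ x[t[2]]) for t in triples]
    return equations, x

def sample_dfar(triples, rng):
    """Right-hand sides chosen uniformly and independently at random."""
    return [(t, rng.randrange(2)) for t in triples]

# Usage with a tiny hand-written left-hand side on 4 variables.
triples = [(0, 1, 2), (1, 2, 3)]
sat_eqs, hidden = sample_dsat(triples, 4, random.Random(0))
far_eqs = sample_dfar(triples, random.Random(0))
```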

Indistinguishability
The two distributions differ only in the right-hand side:
- in Dfar it is uniformly distributed
- in Dsat it is δn-wise independent (linear independence implies statistical independence)
They look the same to an algorithm that sees fewer than δn equations.
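The step from linear independence to statistical independence is a short counting argument. In the sketch below, A_S denotes the k x n matrix over GF(2) formed by the left-hand sides of any k seen equations, with k below the independence threshold; the notation is introduced here and is not from the slides.

```latex
% x is uniform in GF(2)^n (the hidden assignment in Dsat), b is any vector in GF(2)^k.
\Pr_{x}\bigl[A_S\,x = b\bigr]
\;=\; \frac{\bigl|\{x \in \mathrm{GF}(2)^n : A_S\,x = b\}\bigr|}{2^{n}}
\;=\; \frac{2^{\,n-k}}{2^{n}}
\;=\; 2^{-k}.
% The rows of A_S are linearly independent, so A_S has rank k and every
% solution set is a coset of a kernel of dimension n-k.
```

So the k right-hand sides seen in Dsat are uniform and independent, exactly as in Dfar.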

Conclusion of the Argument
- No algorithm of “query complexity” o(n) can distinguish satisfiable instances of E3LIN-2 from instances that are (1/2 - ε)-far from satisfiable.
- For some ε, no algorithm of query complexity o(n) can distinguish 3-colorable graphs from graphs that are ε-far from 3-colorable.
- No algorithm of query complexity o(n) can approximate Max 3SAT better than 7/8.
- ...

Generality/Lessons
- Reductions are useful and extend results to several problems.
- In the adjacency matrix (dense graph) setting: several general algorithms, few and ad hoc lower bounds.
- In the adjacency list (sparse graph) setting: vice versa.

Open Questions
Show that distinguishing 3-colorable graphs from (1/3 - ε)-far graphs requires query complexity Ω(n):
- we can only prove it for one-sided error
Show that approximating Max SAT better than 3/4 and Max CUT better than 1/2 requires query complexity Ω(n):
- we only know Ω(sqrt(n)) [implicit in GR]
- would “explain” why we need SDP