Testing Metric Properties Michal Parnas and Dana Ron.


Property Testing (Informal Definition): For a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., far from every object that has P). The task should be performed by querying the object in as few places as possible.

Property Testing - Background: Initially defined by Rubinfeld and Sudan in the context of program testing (of algebraic functions). Goldreich, Goldwasser, and Ron initiated the study of testing properties of (undirected) graphs. A growing body of work deals with properties of functions, graphs, strings, sets of points, and more. Many algorithms have complexity that is sub-linear in (or even independent of) the size of the object.

Motivation: Computational, to design testing algorithms that are (much) more efficient than exact decision algorithms for the property. Combinatorial, to gain new understanding of the tested property.

Testing Metric Properties: P is a metric property; M is an n x n rational-valued matrix; ε is a distance/approximation parameter. M is said to be ε-far from property P if more than an ε fraction of its n² entries must be modified so that M obtains P. Otherwise M is said to be ε-close. The testing algorithm can query M on entries M[i,j]. If M has property P, the algorithm should accept; if M is ε-far from property P, it should reject with probability at least 2/3.
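The ε-close side of this definition can be illustrated with a minimal Python sketch: a matrix that has P and differs from M on at most an ε fraction of the n² entries certifies that M is ε-close. The function names `fraction_differing` and `certifies_eps_close` are hypothetical, and the candidate matrix is assumed to be supplied by the caller; finding a good candidate is the hard part and is not attempted here.

```python
def fraction_differing(M, M_other):
    """Fraction of the n^2 entries on which two n x n matrices differ."""
    n = len(M)
    differing = sum(
        1 for i in range(n) for j in range(n) if M[i][j] != M_other[i][j]
    )
    return differing / (n * n)


def certifies_eps_close(M, M_with_P, eps):
    """True if M_with_P (assumed to have property P) shows M is eps-close.

    Whether M_with_P really has P is NOT checked here; that is an assumption.
    """
    return fraction_differing(M, M_with_P) <= eps
```

A single candidate can only certify closeness. Showing that M is ε-far would require ruling out every matrix that has P.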

Tree Metrics and Ultrametrics: An n x n matrix M is a tree metric (additive metric) if there exists a tree T with positive weights on its edges such that: there exists a mapping φ from [n] into the nodes of T; for every i,j ∈ [n]={1,…,n}, the distance in T between φ(i) and φ(j) equals M[i,j]; and all nodes to which no i ∈ [n] is mapped have degree greater than 2. If, in addition, T is rooted, φ maps only to leaves of T, and the distance from every leaf to the root is the same, then M is an ultrametric.
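The slides define ultrametrics through trees. A standard equivalent characterization, not stated in the slides and used here as an assumption, is the three-point condition: M is symmetric with zero diagonal, and in every triple the two largest of M[i,j], M[i,k], M[j,k] are equal. The following Python sketch checks that condition on a set of points; the helper names are hypothetical.

```python
from itertools import combinations


def is_ultrametric(M, points=None):
    """Check symmetry, zero diagonal and the three-point condition on `points`."""
    pts = sorted(set(range(len(M)) if points is None else points))
    if any(M[i][i] != 0 for i in pts):
        return False
    if any(M[i][j] != M[j][i] for i, j in combinations(pts, 2)):
        return False
    for i, j, k in combinations(pts, 3):
        a, b, c = sorted((M[i][j], M[i][k], M[j][k]))
        if b != c:  # the two largest distances in the triple must coincide
            return False
    return True


def matrix_from_pairs(n, pairs):
    """Build a symmetric n x n matrix from 1-indexed {(i, j): value} entries."""
    M = [[0] * n for _ in range(n)]
    for (i, j), value in pairs.items():
        M[i - 1][j - 1] = M[j - 1][i - 1] = value
    return M


# Five-point example using distances that appear later in the slides.
M = matrix_from_pairs(5, {
    (1, 2): 8, (1, 3): 10, (1, 4): 10, (1, 5): 10,
    (2, 3): 10, (2, 4): 10, (2, 5): 10,
    (3, 4): 2, (3, 5): 6, (4, 5): 6,
})
print(is_ultrametric(M))  # True
```

This exhaustive check reads all entries among the given points, so it is only a reference for small inputs, not a sublinear tester.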

[Figure: example of a tree metric, with M[1,2]=8; M[1,3]=12; M[1,4]=10; M[1,5]=15; ...]

[Figure: example of an ultrametric, with M[1,2]=M[1,3]=M[2,3]=8; M[1,4]=M[1,5]=M[1,6]=12; M[4,5]=M[4,6]=6; M[5,6]=2; ...]

Our Results: Our algorithms all work by taking a uniformly selected sample S ⊆ [n] and querying M[i,j] for i,j ∈ S. The size of the sample is always poly(1/ε) and independent of n. Specifically: Can test ultrametrics with |S| = O(log(1/ε)/ε³). Can test general tree metrics with |S| = O(log(1/ε)/ε^c) for a fixed constant c. Can extend the result for ultrametrics to approximate ultrametrics. Can test d-dimensional Euclidean metrics with |S| = O(d log d/ε).

Our Results (continued): The testing algorithms can be used to solve relaxed versions of the corresponding search problems in time linear in n (and polynomial in 1/ε). That is, one can construct a tree that agrees with M on all but at most an ε-fraction of the entries. (Note that the running time is sub-linear in the size of the matrix M.)

Constructing an Ultrametric Tree: Suppose M is an ultrametric. We can construct an ultrametric tree that agrees with M on a given subset {1,…,s} in the following manner. Initialization: position points 1 and 2 at equal distance M[1,2]/2 from the root node. Iterations: for each point j = 3,…,s, add point j to the current tree by adding a new branch that emanates from j's unique point of departure from the tree. This point is determined by the point in the tree that is closest to j.

Example: M[1,2]=8; M[1,3]=M[1,4]=M[1,5]=10; M[2,3]=M[2,4]=M[2,5]=10; M[3,4]=2; M[3,5]=6; M[4,5]=6.

Consistency of points with tree: For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U (if such a tree exists, it is unique). Def: Say that j ∈ [n] \ U is consistent with T_U if adding j to T_U as described in the construction procedure results in a tree that agrees with M on U+j. Denote the set of points consistent with T_U by Γ_U.

The “Scaffold Partition”: For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U. We refer to this tree as the scaffold. Def: Let P_U be the following partition of Γ_U, induced by T_U: points i and j are in the same class iff they have the same point of departure from T_U.
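The partition can be sketched in Python without building the tree explicitly. The slides do not say how to represent a point of departure; as an assumption of this sketch, a point j that is consistent with T_U is identified by the pair (distance to its closest point in U, set of closest points in U), since in an ultrametric tree that pair fixes the height of the departure point and the leaves below it. The function names are hypothetical, and consistency of the input points is assumed rather than checked.

```python
from collections import defaultdict


def departure_signature(M, U, j):
    """Stand-in for j's point of departure from the scaffold on U.

    Assumes j is consistent with the scaffold; this is not verified.
    """
    d_min = min(M[j][u] for u in U)
    closest = frozenset(u for u in U if M[j][u] == d_min)
    return d_min, closest


def scaffold_partition(M, U, points):
    """Group `points` into classes that share a departure signature."""
    classes = defaultdict(list)
    for j in points:
        classes[departure_signature(M, U, j)].append(j)
    return list(classes.values())


# 0-indexed version of the five-point example from the slides.
M = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]
print(scaffold_partition(M, [0, 1], [2, 3, 4]))   # [[2, 3, 4]]
print(scaffold_partition(M, [0, 1, 2], [3, 4]))   # [[3], [4]]
```

In the example, adding point 2 to the scaffold separates points 3 and 4, which previously shared a class.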

[Figure: the scaffold partition, with classes C1, C2, C3, C4.]

Violating Pairs: If M is an ultrametric, then for every subset U and for every two points i,j that belong to different classes in P_U, the value of M[i,j] is exactly determined by the corresponding (different) departure points in T_U. Def: Say that i,j ∈ Γ_U that belong to different classes in P_U are a violating pair w.r.t. T_U if the distance between them according to the scaffold T_U differs from M[i,j].
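A minimal Python sketch of this check follows. It relies on a formula that is derived here and not given in the slides: in an ultrametric tree the distance between two leaves is twice the height of their lowest common ancestor, so for consistent points i and j in different classes the scaffold distance should equal the maximum of i's distance to its closest scaffold point, j's distance to its closest scaffold point, and the distance between those two scaffold points. The function names are hypothetical, and the sketch assumes without checking that both points are consistent and lie in different classes.

```python
def closest_in_scaffold(M, U, j):
    """Return (distance, point) for the point of U closest to j."""
    u_star = min(U, key=lambda u: M[j][u])
    return M[j][u_star], u_star


def scaffold_distance(M, U, i, j):
    """Distance between i and j implied by the scaffold on U.

    Assumes i and j are consistent with the scaffold and depart from it
    at different points; neither assumption is verified here.
    """
    d_i, u_i = closest_in_scaffold(M, U, i)
    d_j, u_j = closest_in_scaffold(M, U, j)
    return max(d_i, d_j, M[u_i][u_j])


def is_violating_pair(M, U, i, j):
    """True if M[i][j] disagrees with the distance implied by the scaffold."""
    return M[i][j] != scaffold_distance(M, U, i, j)


# 0-indexed version of the five-point example from the slides.
M = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]
print(scaffold_distance(M, [0, 1, 2], 3, 4))  # 6
print(is_violating_pair(M, [0, 1, 2], 3, 4))  # False
```

Changing M[3][4] and M[4][3] to any value other than 6 would make the pair violating with respect to this scaffold.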

[Figure: classes C1, C2, C3, C4 of the scaffold partition, with points i and j in different classes. If M is an ultrametric, we must have M[i,j]=8.]

Two types of “witnesses”: Suppose we have a scaffold tree T_U that agrees with M on U. (If no such tree can be constructed, clearly M is not an ultrametric.) It follows that: if we obtain a point j that is inconsistent with T_U, then we have a witness that M is not an ultrametric; if we obtain a pair of points i,j that are violating w.r.t. T_U, then we have a witness that M is not an ultrametric.

Testing Algorithm for Ultrametrics:
1. Uniformly select s = O(log(1/ε)/ε³) points from [n]. Denote the set by U.
2. Construct a tree T_U that agrees with M on U. If this fails, reject.
3. Uniformly select m = O(1/ε) pairs of points from [n].
4. If any of these 2m points is inconsistent with T_U, or any of the m pairs is violating w.r.t. T_U, then reject.
5. If no step causes rejection, then accept.

Analysis of Algorithm: If M is an ultrametric, the algorithm always accepts (there are no inconsistent points and no violating pairs). From now on assume that M is ε-far from being an ultrametric. We will show that the algorithm rejects w.h.p. Specifically, either a tree T_U that agrees with M cannot be constructed, or there are many inconsistent points w.r.t. T_U, or there are many violating pairs w.r.t. T_U.

Special Case (for M ε-far from ultrametric): Suppose T_U agrees with M, and all but at most (ε/3)n² pairs of points in Γ_U belong to different classes in P_U (are separated). (In particular, this is the case if all classes have size O(εn).) Claim: Either there are > (ε/3)n inconsistent points w.r.t. T_U or there are > (ε/3)n² violating pairs w.r.t. T_U. Subject to the claim, if M is ε-far from being an ultrametric, then it is rejected w.h.p., as required.

Proof of Claim for special case: Assume, contrary to the claim, that there are ≤ (ε/3)n inconsistent points and ≤ (ε/3)n² violating pairs. We will show that there exists an ultrametric tree T that agrees with M on all but at most εn² entries, in contradiction to the assumption on M. The tree T builds on the scaffold T_U: for every class C in P_U, create a star-shaped sub-tree with leaf set C that is rooted at the point of departure of C from T_U. Inconsistent points are added arbitrarily. By the premise of the claim and the (counter) assumptions, the number of disagreements is ≤ (ε/3)n·n + (ε/3)n² + (ε/3)n² = εn², where the three terms account for inconsistent points, violating pairs, and unseparated pairs, respectively.

General Case: By the special case, we gain from separating points into different classes. Def: Say that a point k ∉ U is an effective separator w.r.t. T_U if adding k to U causes at least (εn/12)² pairs of points to be separated into different classes.

[Figure: a separator k splitting class C1 into sub-classes C1,1 and C1,2, alongside classes C2, C3, C4.]

General Case (continued): In the analysis, view the sample U as being selected in phases. In each phase, if there exist many effective separators, then one is selected w.h.p. After a sufficient number of phases, either we have the special case (few non-separated pairs), or we have U such that there are few effective separators w.r.t. T_U. In the latter case one can show that for every class C in P_U there exists a tree T_C such that for almost all pairs i,j ∈ C, M[i,j] = T_C(i,j). (The tree is star-shaped/broom-shaped.)

General Case (continued): Claim: Either there are > (ε/4)n inconsistent points w.r.t. T_U or there are > (ε/4)n² violating pairs w.r.t. T_U. Subject to the claim, if M is ε-far from being an ultrametric, then it is rejected w.h.p., as required. The proof of the claim is similar to that in the special case: assume there are few inconsistent points and few violating pairs, and show that there exists a tree close to M (contradicting M being ε-far from ultrametric).

Solving Relaxed Version of Search Problem: The analysis implies that the testing algorithm can be used to solve a relaxed version of the corresponding search problem. That is, if M is an ultrametric then, w.h.p., one can construct a tree that agrees with M on all but at most an ε-fraction of the entries, in time linear in n and polynomial in 1/ε:
1. Construct the scaffold T_U on a uniformly selected sample U.
2. Partition all points in [n] \ U into the classes of P_U according to their distances to points in U.
3. For each class C, construct a star/broom-shaped tree T_C.

Testing Approximate Ultrametrics: Def: For a given approximation parameter δ, we say that a matrix M is a δ-approximate ultrametric if there exists an ultrametric M' such that for every i,j ∈ [n], |M[i,j]-M'[i,j]| ≤ δ. We describe an algorithm such that, for every ε and δ, if M is a δ-approximate ultrametric then the algorithm accepts M, and if M is ε-far from being a cδ-approximate ultrametric then the algorithm rejects M w.h.p. (c is a fixed constant).
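The entrywise condition in this definition is easy to state in code. The following Python sketch, with the hypothetical name `is_delta_approximation`, checks whether a given candidate M' is within δ of M on every entry. It assumes the caller already knows that M' is an ultrametric; it does not search for such an M' and is not the testing algorithm mentioned on the slide.

```python
def is_delta_approximation(M, M_prime, delta):
    """True if |M[i][j] - M_prime[i][j]| <= delta for every entry.

    M_prime is assumed (not checked) to be an ultrametric.
    """
    n = len(M)
    return all(
        abs(M[i][j] - M_prime[i][j]) <= delta
        for i in range(n)
        for j in range(n)
    )


# A three-point ultrametric and a perturbed copy of it.
M_prime = [[0, 4, 6], [4, 0, 6], [6, 6, 0]]
M = [[0, 5, 6], [5, 0, 7], [6, 7, 0]]
print(is_delta_approximation(M, M_prime, 1))    # True
print(is_delta_approximation(M, M_prime, 0.5))  # False
```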

Conclusions and Further Research: Presented an algorithm for testing whether a matrix is an ultrametric or far from being an ultrametric. The analysis implies a fast solution for the relaxed search problem. Mentioned similar results for approximate ultrametrics, general tree metrics, and Euclidean metrics. We suspect that the results can be improved in terms of the dependence on 1/ε. We conjecture that the result for general tree metrics can be extended to an approximate variant. Open question: testing other natural metric properties.