On Approximating the Number of Relevant Variables in a Function


Dana Ron and Gilad Tsur, Tel-Aviv University

What We'd Like to Do
We're given oracle access to a function f. Consider f: {0,1}^n → {0,1}, although any discrete domain and range will be OK. We'd like to know how many variables influence f, that is, to compute
|{ i : ∃ x_1, ..., x_n such that f(x_1, ..., x_i, ..., x_n) ≠ f(x_1, ..., ¬x_i, ..., x_n) }|.
We'd like to perform o(2^n) queries, and would prefer as few as possible.
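To make the definition concrete, here is a minimal brute-force sketch that counts relevant variables exactly by querying f on all 2^n inputs. This is the baseline the talk wants to beat, not a method from the talk; the name `relevant_variables` and the 0-based variable indices are assumptions made for illustration.

```python
from itertools import product

def relevant_variables(f, n):
    """Return the sorted 0-based indices i for which some input x
    satisfies f(x) != f(x with bit i flipped)."""
    relevant = set()
    for x in product((0, 1), repeat=n):  # all 2^n inputs
        fx = f(x)
        for i in range(n):
            if i in relevant:
                continue  # already witnessed, no need to re-check
            y = x[:i] + (1 - x[i],) + x[i + 1:]  # flip bit i
            if f(y) != fx:
                relevant.add(i)
    return sorted(relevant)

# f depends only on x_0 and x_2 out of 4 variables.
print(relevant_variables(lambda x: x[0] ^ x[2], 4))
```

The loop visits every one of the 2^n inputs, so it is only usable for very small n.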

But That Can't Be Done
Consider the constant 0 function contrasted with a function g such that g(x) = 1 for a single, unknown x and g(y) = 0 for every other y. The first has no relevant variables and the second depends on all n, yet telling them apart requires finding x. This shows we can't even give a good approximation of the number of relevant variables in general. So we consider relaxations:
Multiplicative approximation in the property-testing setting.
Multiplicative approximation for particular families of functions.

What Others Have Done: Testing Juntas
Testing juntas: accept functions with at most k relevant variables ("k-juntas"), and reject those ε-far from every k-junta.
Fischer, Kindler, Ron, Safra and Samorodnitsky give an algorithm using Õ(k^2/ε) queries, improved by Blais to O(k/ε + k log(k)) queries.
Chockler and Gutfreund give an Ω(k) lower bound, improved by Blais to min(Ω(k/ε), 2^k/k) (up to polylog factors).

What We Do: Upper Bounds
Distinguish between linear functions with k relevant variables and those with more than (1+γ)k, using O(log(1/γ)/γ^2) queries.
Do the same for degree-d polynomials, using O(2^d log(1/γ)/γ^2) queries.
Distinguish between k-juntas and functions ε-far from every (1+γ)k-junta, using O(k log(1/γ)/(εγ^2)) queries.
Techniques: we use tools developed in previous junta-testing papers, and some properties of variable influence.

What We Do: Lower Bounds
k-juntas vs. functions ε-far from every (1+γ)k-junta (ε, γ constant): Ω(k/log(k)).
Degree-d polynomial k-juntas vs. functions ε-far from degree-d polynomial (1+γ)k-juntas: Ω(2^d/d).
A weaker lower bound for monotone functions.
Techniques: we use a reduction from the Distinct Elements problem.
Recently, Blais, Brody and Matulef showed that distinguishing between k-juntas and (k+t)-juntas requires Ω(min(k^2/t^2, k) − log(k)) queries.

Linear Functions
Imagine we want to know if a linear function f has fewer than k or more than 2k relevant variables. We can select a subset of variables S, adding each variable to S independently with probability 1/(2k). Testing whether S contains a relevant variable is easy as f is linear, and takes a constant number of queries.
Distinguishing between linear functions with k variables and those with k+t requires Ω(min(k^2/t^2, k) − log(k)) queries [Blais, Brody and Matulef].
Example queries:
f(0000111001111010011101)
f(0000111001011010011101)
f(0000111001100101011101)
f(0000111001110011011101)
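The subset test described on this slide can be sketched as follows, assuming f is a parity (linear over GF(2)). The helper names `make_parity`, `sample_subset` and `subset_hits_relevant`, the 0-based indices, and the default of 20 trials are illustrative assumptions, not taken from the talk. The idea is that re-randomizing the bits in S changes the value of a linear f with probability exactly 1/2 when S contains a relevant variable, and never otherwise.

```python
import random

def make_parity(relevant):
    """f(x) = XOR of x[i] over i in `relevant` (a linear function)."""
    rel = sorted(relevant)
    return lambda x: sum(x[i] for i in rel) % 2

def sample_subset(n, k, rng):
    """Include each of the n variables independently with probability 1/(2k)."""
    return [i for i in range(n) if rng.random() < 1.0 / (2 * k)]

def subset_hits_relevant(f, n, subset, rng, trials=20):
    """Return True if some trial shows f changing when only bits in
    `subset` are flipped. For linear f, each trial succeeds with
    probability 1/2 if `subset` contains a relevant variable, and with
    probability 0 if it does not."""
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = list(x)
        for i in subset:
            y[i] ^= rng.randint(0, 1)  # flip each bit of S with prob. 1/2
        if f(x) != f(y):
            return True
    return False

rng = random.Random(0)
f = make_parity({3, 7})
print(subset_hits_relevant(f, 10, [1, 2], rng))  # S misses both relevant variables
print(subset_hits_relevant(f, 10, sample_subset(10, 2, rng), rng))
```

Each trial costs two queries, so a fixed number of trials keeps the per-subset cost constant, as the slide states.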

Linear Functions and More
Extends to deciding whether a linear function has k relevant variables, or more than (1+γ)k, using O(log(1/γ)/γ^2) queries.
Can also extend to polynomials of degree d using O(2^d log(1/γ)/γ^2) queries. [Recall the 2^d factor is required.]
We basically use the fact that each relevant variable has significant influence.
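A back-of-the-envelope calculation suggests why sampling subsets separates the two cases. If each variable is placed in S independently with probability 1/(2k), a function with r relevant variables has at least one of them in S with probability 1 − (1 − 1/(2k))^r. The sketch below evaluates this; the function name `hit_probability` is hypothetical, and this is an illustration of the sampling idea rather than the analysis from the paper.

```python
def hit_probability(r, k):
    """Probability that a random subset (inclusion probability 1/(2k))
    contains at least one of r relevant variables."""
    return 1.0 - (1.0 - 1.0 / (2 * k)) ** r

k = 100
for gamma in (1.0, 0.5, 0.1):
    r_far = int((1 + gamma) * k)
    gap = hit_probability(r_far, k) - hit_probability(k, k)
    print(f"gamma={gamma}: k vars -> {hit_probability(k, k):.3f}, "
          f"(1+gamma)k vars -> {hit_probability(r_far, k):.3f}, gap={gap:.3f}")
```

For large k the two probabilities approach 1 − e^(−1/2) and 1 − e^(−(1+γ)/2), so the gap shrinks roughly linearly in γ. Telling two coin biases that differ by about γ apart takes on the order of 1/γ^2 samples, which is in line with the γ^2 in the denominators of the stated bounds.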

General Functions
Given a function f:
accept if f is influenced by at most k variables;
reject if f is ε-far from every function influenced by at most (1+γ)k variables.
We do this using O(k log(1/γ)/(εγ^2)) queries.
This is, again, done by taking a subset of the variables and checking if they influence the function. This is more likely to happen if we're far from (1+γ)k-juntas. The difference from linear functions is that when influence is divided among many variables we must still capture sufficient influence in the selected subset.
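For a general function, "checking if a subset influences the function" can be sketched by sampling: draw a random x, re-randomize the coordinates in S to get y, and see how often f(x) ≠ f(y). The name `estimate_set_influence` and the exact normalization are assumptions (some papers define set influence with an extra factor of 2); this is a minimal sketch, not the tester from the talk.

```python
import random

def estimate_set_influence(f, n, subset, samples, rng):
    """Estimate Pr[f(x) != f(y)], where x is uniform in {0,1}^n and y
    agrees with x outside `subset` and is uniform on `subset`."""
    changed = 0
    for _ in range(samples):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = list(x)
        for i in subset:
            y[i] = rng.randint(0, 1)  # re-randomize coordinate i
        if f(x) != f(y):
            changed += 1
    return changed / samples

rng = random.Random(0)
f = lambda x: x[0] & x[1]  # depends only on variables 0 and 1
print(estimate_set_influence(f, 6, [2, 3], 2000, rng))  # no relevant variable in S
print(estimate_set_influence(f, 6, [0], 2000, rng))     # close to 1/4
```

Unlike the linear case, the probability of seeing a change is not a fixed 1/2: for x_0 AND x_1 with S = {x_0} it is 1/4, since x_1 must be 1 and x_0 must actually change. This is the issue the slide raises about capturing enough influence in the chosen subset.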

Lower Bound: The Distinct Elements Problem
We use a reduction from the Distinct Elements problem: given random access to a string, approximate the number of different elements in it (think of each element as a color). This is similar to approximating the support of a distribution (under certain conditions).
Approximating the support of a distribution: Valiant and Valiant, improving on Raskhodnikova, Ron, Shpilka and Smith, show that t/log(t) queries are required to distinguish a length-t string with t/2 distinct elements from a length-t string with t/16 distinct elements.

Lower Bound: Reduction example

Lower Bound: The Reduction I
We'll describe the reduction for k = Θ(n). (Recall that we'll give a k/log(k) lower bound.)
We'll reduce strings with m distinct elements to functions in a family F^n_m. Each function in F^n_m depends on log(n) + m of the variables. The first log(n) variables index one of the m variables, and that determines the value of f.
For Ψ: {0,1}^log(n) → [0, n − log(n)] we have f(x_1, ..., x_n) = x_{log(n) + Ψ(x_1 ... x_log(n))}.

Lower Bound: The Reduction II
For a string s of length n, we have colors s[i] ∈ [1, ..., n − log(n)]. The string s maps to a function f. The first log(n) bits of f's input map to locations in the string. The rest map, sequentially, to colors.
In our reduction we map colors sequentially to input bits (e.g., color 1 to x_{log(n)+1}, color 2 to x_{log(n)+2}, ...).
Consider the string s = 12251312. Its length is 8, and the colors are, say, in the range 1...5. The variables of the function will be x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8.
Example: f(00100111) = 0: 001 is location 2, color 2, bit 5 is 0.
Example: f(01000111) = 0: 010 is location 3, color 2, bit 5 is 0.
Example: f(11011100) = 1: 110 is location 7, color 1, bit 4 is 1.
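The mapping from strings to functions can be sketched directly and checked against the three examples on this slide. The name `string_to_function`, the choice to pass inputs as bit strings, and the assumption that n is a power of two are illustrative choices, not from the talk.

```python
def string_to_function(s):
    """Map a list of colors s (length n, a power of two, colors in
    1..n-log2(n)) to f: {0,1}^n -> {0,1}. The first log2(n) input bits
    pick a location in s, and f outputs the input bit assigned to the
    color stored at that location."""
    n = len(s)
    b = n.bit_length() - 1  # log2(n)
    assert 1 << b == n, "n is assumed to be a power of two"
    assert all(1 <= c <= n - b for c in s)

    def f(x):
        assert len(x) == n
        location = int(x[:b], 2)   # 0-based; the slide counts from 1
        color = s[location]
        return int(x[b + color - 1])  # variable x_{log(n)+color}, 1-based

    return f

f = string_to_function([1, 2, 2, 5, 1, 3, 1, 2])  # the string 12251312
print(f("00100111"), f("01000111"), f("11011100"))  # 0 0 1, as on the slide
```

A query to f needs only one lookup into s, which is what lets a query lower bound for the string problem carry over to the function problem.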

Lower Bound: The Reduction III
Strings with n/16 colors will be mapped to functions with fewer than n/8 relevant variables. Strings with n/2 colors will be mapped to functions that are far from n/4-juntas. [It can be shown that f in F^n_{t/2} is ε-far from all t/4-juntas for a constant ε.]
As it takes us Ω(k/log(k)) queries to distinguish the strings, the same holds for the functions.

Summary
k-juntas vs. functions ε-far from every (1+γ)k-junta: O(k log(1/γ)/(εγ^2)) queries.
Degree-d polynomials with k relevant variables vs. those with more than (1+γ)k: O(2^d log(1/γ)/γ^2) queries.
Lower bound of Ω(k/log(k)) queries for k-juntas vs. functions ε-far from every (1+γ)k-junta (for constant ε and γ).
Lower bound of Ω(2^d/d) queries for degree-d polynomials with k relevant variables, where d < log(k), vs. functions ε-far from every such polynomial with more than (1+γ)k relevant variables.