The Complexity of Linear Dependence Problems in Vector Spaces David Woodruff IBM Almaden Joint work with Arnab Bhattacharyya, Piotr Indyk, and Ning Xie.

Presentation transcript:

The Complexity of Linear Dependence Problems in Vector Spaces David Woodruff IBM Almaden Joint work with Arnab Bhattacharyya, Piotr Indyk, and Ning Xie from MIT

The 3-SUM Problem
Given a set S containing r real numbers, are there a, b, c ∈ S with a + b + c = 0?
- Solvable in O(r^2) time (a common interview question)
- Conjectured to require Ω(r^2) time
- Useful for hardness results in P: many problems are 3-SUM-hard
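The O(r^2) bound mentioned on this slide can be reached with the standard sort-and-two-pointers approach. The following is a minimal sketch, not taken from the talk; the function name `has_three_sum` is hypothetical, and it assumes a, b, c must come from three distinct positions of the input and that the inputs are exact (integers or fractions) so that equality with 0 is meaningful.

```python
def has_three_sum(numbers):
    """Return True if three entries at distinct positions sum to 0.

    Sorting costs O(r log r); the two-pointer scan is O(r) per
    choice of the smallest element, so O(r^2) overall.
    """
    s = sorted(numbers)
    r = len(s)
    for i in range(r - 2):
        lo, hi = i + 1, r - 1
        while lo < hi:
            total = s[i] + s[lo] + s[hi]
            if total == 0:
                return True
            if total < 0:
                lo += 1   # sum too small: move to a larger middle element
            else:
                hi -= 1   # sum too large: move to a smaller top element
    return False
```

For example, `has_three_sum([-5, 1, 4, 10])` finds -5 + 1 + 4 = 0.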

Generalizations
We study generalizations of this problem:
- Replace 3 summands with k summands
- Replace the real field R with a finite field
- Replace sum of field elements with sum of vectors
- Replace sum with a fixed linear combination
- Replace sum with any linear combination
- Require vectors be minimally linearly dependent
- Replace target 0 with an arbitrary vector
- and so on…

Applications
Maximum Likelihood Decoding
- Given x_1, …, x_r in F_q^n and z in F_q^n, do there exist x_{i_1}, …, x_{i_k} that contain z in their span?
- The x_i are the columns of a parity-check matrix
- z is the syndrome
- There is a codeword corrupted in at most k positions with syndrome z iff the k-span contains z
Weight Distribution Problem
- Let A be an n x r matrix over F_2
- Define the code C = {x | Ax = 0}
- C has a codeword of weight k iff k columns of A sum to 0
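The Weight Distribution equivalence can be checked by brute force on a small code. The sketch below is illustrative only: it assumes each column of A is packed into a Python int (bit i is row i), and the helper names `codeword_weights` and `zero_sum_sizes` are hypothetical. It uses the parity-check matrix of the [7,4] Hamming code, whose columns are the nonzero 3-bit vectors.

```python
from itertools import combinations

def syndrome(columns, x_bits):
    """Compute A x over F_2, where bit j of x_bits selects column j."""
    acc = 0
    for j, col in enumerate(columns):
        if x_bits >> j & 1:
            acc ^= col
    return acc

def codeword_weights(columns):
    """Weights of all x with A x = 0 (exponential in r; tiny inputs only)."""
    r = len(columns)
    return {bin(x).count("1") for x in range(1 << r)
            if syndrome(columns, x) == 0}

def zero_sum_sizes(columns):
    """All k such that some k distinct columns XOR to the zero vector."""
    r = len(columns)
    sizes = set()
    for k in range(r + 1):
        for idx in combinations(range(r), k):
            acc = 0
            for i in idx:
                acc ^= columns[i]
            if acc == 0:
                sizes.add(k)
                break
    return sizes

# Columns of the [7,4] Hamming parity-check matrix, as 3-bit integers.
HAMMING_COLUMNS = [1, 2, 3, 4, 5, 6, 7]
```

Both functions return the same set of sizes on `HAMMING_COLUMNS`, which is the slide's "iff" seen from its two sides.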

Formal Definitions
In this talk, we focus on two problems:
- (k,r)-LinDependence: given r elements x_1, …, x_r in F_2^n and z in F_2^n, do there exist x_{i_1}, …, x_{i_k} whose span contains z?
- (k,r)-ZeroSum: given r elements x_1, …, x_r in F_2^n, do there exist x_{i_1}, …, x_{i_k} with x_{i_1} + x_{i_2} + … + x_{i_k} = 0?
We allow k and r to be functions of n.
The first problem is at least as hard as the second.
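Both definitions translate directly into exhaustive search, which is the trivial r^k-type algorithm. The following is a sketch under stated assumptions: vectors in F_2^n are packed into Python ints so that addition is XOR, the k chosen indices are distinct, and the function names are hypothetical rather than from the talk.

```python
from itertools import combinations

def xor_all(values):
    """Sum a collection of F_2^n vectors (ints used as bit vectors)."""
    acc = 0
    for v in values:
        acc ^= v
    return acc

def zero_sum(vectors, k):
    """Return k distinct indices whose vectors sum to 0, or None."""
    for idx in combinations(range(len(vectors)), k):
        if xor_all(vectors[i] for i in idx) == 0:
            return idx
    return None

def lin_dependence(vectors, z, k):
    """Return k distinct indices whose vectors span z, or None.

    Over F_2 the span of the chosen vectors is the set of all
    subset sums, so every one of the 2^k subsets is tried.
    """
    for idx in combinations(range(len(vectors)), k):
        chosen = [vectors[i] for i in idx]
        for mask in range(1 << k):
            subset = (c for j, c in enumerate(chosen) if mask >> j & 1)
            if xor_all(subset) == z:
                return idx
    return None
```

For instance, `zero_sum([1, 2, 3, 4], 3)` returns `(0, 1, 2)` because 1 + 2 + 3 = 0 over F_2.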

Results
Assume 3-SAT cannot be solved in time less than 2^{cn} for a constant c > 0.
Then (k,r)-ZeroSum requires min(r^k, 2^n) time, up to polynomial factors
- So (k,r)-LinDependence requires min(r^k, 2^n) time
- Other variants also require this time
Have a matching upper bound:
- r^k is trivial. Can get roughly r^{k/2}
- Can get 2^n with the FFT
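The "roughly r^{k/2}" upper bound is usually obtained by meet-in-the-middle: tabulate the sums of all half-size subsets, then look for two disjoint halves with equal sums. The slide does not spell out the method, so the sketch below is an assumption about what is meant; `zero_sum_mitm` is a hypothetical name, vectors are ints with XOR as addition, and the disjointness check can add overhead when many subsets share a sum.

```python
from itertools import combinations
from collections import defaultdict

def zero_sum_mitm(vectors, k):
    """Find k distinct indices whose vectors XOR to 0, or return None.

    Enumerates about r^(k/2) half-subsets instead of r^k full subsets.
    """
    half = k // 2
    rest = k - half
    r = len(vectors)

    def xor_of(idx):
        acc = 0
        for i in idx:
            acc ^= vectors[i]
        return acc

    # Table: XOR value -> list of index tuples of size `half` achieving it.
    table = defaultdict(list)
    for idx in combinations(range(r), half):
        table[xor_of(idx)].append(idx)

    # Two halves with equal XOR cancel; they must not share an index.
    for idx in combinations(range(r), rest):
        for other in table.get(xor_of(idx), ()):
            if set(idx).isdisjoint(other):
                return tuple(sorted(idx + other))
    return None
```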

Implications
- (k,r)-LinDependence reduces to Maximum Likelihood Decoding, so a min(r^k, 2^n) lower bound holds
- (k,r)-ZeroSum reduces to the Weight Distribution Problem, so a min(r^k, 2^n) lower bound holds
- Results improve the previous best r^{k^{1/4}} lower bounds for these coding theory problems [Downey, Fellows, Vardy, Whittle]
- The bounds hold for r and k functions of n

Our starting point
[PW] showed an r^{Ω(k)} bound for k-SUM over R, assuming 3-SAT on n variables requires 2^{cn} time:
- Start with a 3-SAT formula F with n variables and m clauses
- [CIP] yields formulas φ_1, φ_2, …, φ_s with s = 2^{εn}. Each φ_i has n variables and O(n) clauses. This ensures the bit complexity of the resulting numbers is small
- Each φ_i is replaced with a 1-in-3-SAT formula ψ_i, giving ψ_1, ψ_2, …, ψ_s
- Each ψ_i is converted to a k-SUM instance on a set of r = 2^{Θ(n/k)} real numbers
- If k-SUM can be solved in time r^{o(k)}, then 3-SAT can be solved in time r^{o(k)} · 2^{εn}

Reducing a 1-in-3-SAT formula ψ_i on n variables and O(n) clauses to k-SUM on r = 2^{Θ(n/k)} real numbers
[Figure: groups G_1, …, G_i, …, G_k; group G_i contains the numbers v_{i,1}, v_{i,2}, v_{i,3}, …, v_{i,2^{n/k}}]
- Partition variables into k groups G_1, …, G_k of n/k variables
- In each group G_i, create a real number v_{i,j} for each possible assignment to its n/k variables
- v_{i,j} in base-k representation: k group indicator digits followed by O(n) clause digits
- The i-th indicator digit is 1 iff v ∈ G_i
- The j-th clause digit is 1 iff A(v) sets exactly 1 literal of the j-th clause to 1
- All other digits are 0
- ψ_i is true iff there are k real numbers that sum to 1^{k+O(n)} (the number whose k + O(n) digits are all 1)

Can we do the same for F_2?
[Figure: groups G_1, …, G_i, …, G_k; group G_i contains the vectors v_{i,1}, v_{i,2}, v_{i,3}, …, v_{i,2^{n/k}}]
- Partition variables into k groups G_1, …, G_k of n/k variables, as before
- In each group G_i, create a vector v_{i,j} for each possible assignment to its n/k variables
- v_{i,j} has k + O(n) coordinates: k group coordinates followed by O(n) clause coordinates
- The i-th indicator coordinate is 1 iff v ∈ G_i
- The j-th clause coordinate is 1 iff A(v) sets exactly 1 literal of the j-th clause to 1
- All other coordinates are 0
- Problem: a sum of k vectors over F_2 can equal 1^{k+O(n)}, but this just means an odd number of literals in each clause are true, and Odd-SAT is easy
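Odd-SAT is easy because "an odd number of literals in each clause are true" is a system of linear equations over F_2, which Gaussian elimination solves in polynomial time. The sketch below illustrates that observation and is not taken from the talk; the function name `solve_odd_sat` and the DIMACS-style clause encoding (positive int = variable, negative int = negated variable) are assumptions.

```python
def solve_odd_sat(num_vars, clauses):
    """Return an assignment making an odd number of literals true in
    every clause, or None if no such assignment exists.

    A negated literal equals 1 + x over F_2, so each clause becomes
    XOR(variables) = 1 + (number of negations) mod 2.
    """
    pivots = {}  # pivot bit -> (row mask, right-hand side), fully reduced
    for clause in clauses:
        mask, rhs = 0, 1
        for lit in clause:
            mask ^= 1 << (abs(lit) - 1)
            if lit < 0:
                rhs ^= 1
        # Eliminate existing pivot variables from the new equation.
        for bit, (pmask, prhs) in pivots.items():
            if mask >> bit & 1:
                mask ^= pmask
                rhs ^= prhs
        if mask == 0:
            if rhs:
                return None  # reduced to 0 = 1: inconsistent
            continue
        bit = mask.bit_length() - 1
        # Keep earlier rows reduced with respect to the new pivot.
        for b, (pmask, prhs) in list(pivots.items()):
            if pmask >> bit & 1:
                pivots[b] = (pmask ^ mask, prhs ^ rhs)
        pivots[bit] = (mask, rhs)
    # Free variables are set to False, so each pivot variable equals its rhs.
    assignment = [False] * num_vars
    for bit, (_, prhs) in pivots.items():
        assignment[bit] = bool(prhs)
    return assignment
```

This is why the direct F_2 analogue of the k-SUM reduction cannot yield hardness: it encodes only parity, and parity constraints are tractable.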

Our Modifications
- Start with a 3-SAT formula F with n variables and m clauses
- [CIP] yields formulas φ_1, φ_2, …, φ_s with s = 2^{εn}. Each φ_i has n variables and O(n) clauses. Before, this was used for bit complexity. Now it determines the number of dimensions
- Each φ_i is replaced with a NAE-SAT formula ψ_i, giving ψ_1, ψ_2, …, ψ_s. A NAE-SAT formula ψ_i is 1 if, for each clause, at least one but not all literals are true
- Each ψ_i is converted to a (k,r)-ZeroSum instance on a set of r = 2^{Θ(n/k)} vectors
- If (k,r)-ZeroSum can be solved in time r^{o(k)}, then 3-SAT can be solved in time r^{o(k)} · 2^{εn}
- With 1-in-3-SAT over R, variables in different groups independently update the clause digit. We need interaction between groups

Interacting Variables
We can replace duplicates of a variable with distinct variables and introduce equality constraints
- This preserves NAE-SAT and ≤ 3 literals per clause
- Each variable occurs in a constant number of clauses
For each clause (a ∨ b ∨ c), we introduce pairvars
- 1 variable is [a, b], 1 variable is [b, c], and 1 variable is [c, a]
Partition the original n variables into k groups G_i of n/k variables
For a pairvar [a,b]:
- If original variables a and b occur in the same group G_i, place [a,b] in G_i
- Else, if a ∈ G_i and b ∈ G_j, place [a,b] in G_{min(i, j)}
G_i still has O(n/k) variables

New Reduction
In each group G_i, create a vector v_{i,j} for each assignment to its n/k variables as well as to the variables in G_i's pairvars
- v_{i,j} has k + O(n) coordinates: k group coordinates, O(n) clause coordinates, and O(n) consistency coordinates
- The i-th group coordinate is 1, the others are 0
- Clause coordinates are more complicated: they depend on the variables and pairvars assigned to the group
- Consistency coordinates allow assignments to the same variable from different groups to be patched together
- There is 1 pair of consistency coordinates for each pairvar (a,b)

Clause Coordinates
Clause coordinates are set so that, for a consistent assignment (i.e., group and consistency coordinates are ok), the coordinate of a clause with literals a, b, c is
  v(a) + v(b) + v(c) - v(a) · v(b) - v(b) · v(c) - v(a) · v(c)
where v(.) denotes the value assigned.
Case analysis:
- The clause coordinate equals 1 only if exactly 1 or 2 literals are true
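The case analysis can be verified mechanically by enumerating all eight assignments to (a, b, c). This is a small check written for illustration, not code from the talk; the function name `clause_value` is hypothetical, and the expression is evaluated over the integers (reducing mod 2 gives the same 0/1 values).

```python
from itertools import product

def clause_value(a, b, c):
    """Evaluate v(a) + v(b) + v(c) - v(a)v(b) - v(b)v(c) - v(a)v(c)."""
    return a + b + c - a * b - b * c - a * c

def check_nae_property():
    """Return True if the expression is 1 exactly when 1 or 2 literals are true."""
    for a, b, c in product((0, 1), repeat=3):
        not_all_equal = (a + b + c) in (1, 2)
        if clause_value(a, b, c) != int(not_all_equal):
            return False
    return True
```

With 0 or 3 true literals the expression is 0 - 0 = 0 or 3 - 3 = 0; with 1 true literal it is 1 - 0 = 1; with 2 it is 2 - 1 = 1. That matches the not-all-equal condition.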

Upper Bounds
Consider functions f: Z_2^n → {0,1}
- The Fourier transform F: Z_2^n → R is F(x) = 2^{-n} · Σ_y f(y) · (-1)^{<x,y>}
- The Fast Fourier Transform computes F from f in O(n · 2^n) time
- Let f be the indicator function of the input set of r vectors. Then Σ_{v_1 + v_2 + … + v_k = 0} f(v_1) · f(v_2) ⋯ f(v_k) is what we want
- This is just 2^n times the 0^n-Fourier coefficient of f^k
- So we can get O(n · 2^n) time instead of the trivial 2^{nk}
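The sum on this slide can be computed with a fast Walsh-Hadamard transform, which is the Fourier transform over Z_2^n. The sketch below is illustrative and makes several assumptions: it uses the unnormalized transform Σ_y f(y) · (-1)^{<x,y>} and divides by 2^n once at the end, vectors are ints in range(2^n), and the function names are hypothetical. Note that the sum counts ordered k-tuples and allows a vector to be repeated, so it is not identical to asking for k distinct vectors.

```python
def walsh_hadamard(values):
    """Unnormalized Walsh-Hadamard transform of a length-2^n list.

    Uses O(n * 2^n) additions via the standard butterfly.
    """
    a = list(values)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a

def count_zero_sum_tuples(vectors, n, k):
    """Count ordered k-tuples (repeats allowed) from the input set
    whose sum over F_2^n is the zero vector."""
    f = [0] * (1 << n)
    for v in set(vectors):
        f[v] = 1                      # indicator function of the input set
    spectrum = walsh_hadamard(f)
    # Convolution theorem: k-fold convolution at 0 = 2^-n * sum_x fhat(x)^k.
    return sum(c ** k for c in spectrum) // (1 << n)
```

For the set {1, 2, 3} in F_2^2 and k = 3, the only solutions are the 6 orderings of (1, 2, 3), and the transform gives (27 - 1 - 1 - 1) / 4 = 6.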

Conclusion
Assuming 3-SAT cannot be solved in time less than 2^{cn} for a constant c > 0:
- (k,r)-LinDependence and (k,r)-ZeroSum require min(r^k, 2^n) time (up to polynomial factors)
- Same bound holds for many similar problems
- Almost matching upper bounds
- New way to prove hardness in coding theory
- Optimal hardness of basic problems in coding theory