Expander codes and pseudorandom subspaces of R^N
James R. Lee, University of Washington
[joint with Venkatesan Guruswami (Washington) and Alexander Razborov (IAS/Steklov)]

random sections of the cross polytope
Classical high-dimensional geometry [Kashin '77, Figiel-Lindenstrauss-Milman '77]: For a random subspace X ⊆ R^N with dim(X) = N/2 (e.g. choose X = span{v_1, …, v_{N/2}} where the v_i are i.i.d. on the unit sphere), every x ∈ X satisfies ||x||_1 ≥ c·√N·||x||_2 for a universal constant c > 0. In other words, every x ∈ X has its L2 mass very "spread" out: no small set of coordinates carries most of the mass. This holds not only for each v_i, but for every linear combination of them.
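A quick numerical illustration of this phenomenon (not from the talk; the parameters are arbitrary): sample an orthonormal basis of a random subspace X ⊆ R^N with dim(X) = N/2 and check that random vectors in it have L1 norm within a constant factor of the maximal possible √N times the L2 norm. This only tests random vectors of X, so it is a sanity check rather than a verification of the "every x ∈ X" statement.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024
# Orthonormal basis of a random (N/2)-dimensional subspace X of R^N.
basis, _ = np.linalg.qr(rng.standard_normal((N, N // 2)))
# 100 random test vectors in X.
xs = basis @ rng.standard_normal((N // 2, 100))
ratios = np.linalg.norm(xs, ord=1, axis=0) / (np.sqrt(N) * np.linalg.norm(xs, ord=2, axis=0))
print(ratios.min(), ratios.max())  # concentrated near sqrt(2/pi) ~ 0.8, far above the worst case 1/sqrt(N)
```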

an existential crisis
Geometric functional analysts face a dilemma we know well: almost every subspace satisfies this property, but we can't pinpoint even one. This is a prominent example of the (now ubiquitous) use of the probabilistic method in asymptotic convex geometry.

Why do analysts / CSists care about explicit high-dimensional constructions? [Szarek, ICM '06; Milman, GAFA '01; Johnson-Schechtman, Handbook '01] asked: can we find an explicit subspace on which the L1 and L2 norms are equivalent? Related questions about explicit, high-dimensional constructions arose (concurrently) in CS:
- explicit embeddings of L2 into L1 for nearest-neighbor search (Indyk)
- explicit compressed sensing matrices M : R^N → R^n for n ≪ N (DeVore)
- explicit Johnson-Lindenstrauss (dimension reduction) transforms (Ailon-Chazelle)

distortion
For a subspace X ⊆ R^N, we define the distortion of X by Δ(X) = max over nonzero x ∈ X of √N·||x||_2 / ||x||_1. By Cauchy-Schwarz, we always have N^{1/2} ≥ Δ(X) ≥ 1.

Random construction: a random X ⊆ R^N satisfies:
- dim(X) = Ω_ε(N) and Δ(X) ≤ 1 + ε  [Figiel-Lindenstrauss-Milman '77]
- dim(X) = (1-ε)N and Δ(X) = O_ε(1)  [Kashin '77]

Example (Hadamard): let X = ker(first N/2 rows of the Hadamard matrix); then Δ(X) ≈ N^{1/4}.
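To spell out the two bounds (with the distortion written as the worst-case ratio above): for any x ∈ R^N, Cauchy-Schwarz gives ||x||_1 ≤ √N·||x||_2, while comparing coordinates gives ||x||_2 ≤ ||x||_1, so

\[
1 \;\le\; \Delta(X) \;=\; \max_{0 \ne x \in X} \frac{\sqrt{N}\,\|x\|_2}{\|x\|_1} \;\le\; \sqrt{N}.
\]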

applications
Different applications need different trade-offs between distortion and dimension:
- Nearest-neighbor search (view as an embedding of L2 into L1): 1 + ε distortion, small blowup in dimension. (Milman believes impossible.)
- Compressive sensing, coding in characteristic zero, geometric functional analysis: O(1) distortion, Ω(N) dimension.

Compressive sensing: want a map A : R^N → R^n with n ≪ N such that any r-sparse signal x ∈ R^N (vector with at most r non-zero entries) can be uniquely and efficiently recovered from Ax. Relation to distortion [Kashin-Temlyakov]: can uniquely and efficiently recover any r-sparse signal for r ≤ N/Δ(ker(A))^2. (Even tolerates additional "noise" in the "non-sparse" parts of the signal.)

sensing and distortion
Want a map A : R^N → R^n such that any r-sparse signal x ∈ R^N (vector with at most r non-zero entries) can be uniquely and efficiently recovered from Ax.

Want to solve (P0): given compressed signal y, minimize ||x||_0 subject to Ax = y. This is a highly non-convex optimization problem, NP-hard for general A.

Basis Pursuit (P1): given compressed signal y, minimize ||x||_1 subject to Ax = y. Can use linear programming!

[KT07]: If y = Av and v has at most N/[2Δ(ker(A))]^2 non-zero coordinates, then (P0) and (P1) give the same answer. Let's prove this. [Lots of work has been done here: Donoho et al.; Candes-Tao-Romberg; etc.]
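A minimal sketch of Basis Pursuit as a linear program (the standard reduction, not code from the talk): write x = u - v with u, v ≥ 0, so that ||x||_1 = Σ(u_i + v_i) at the optimum, and feed the resulting LP to an off-the-shelf solver.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve (P1): minimize ||x||_1 subject to A x = y, via the LP
    min sum(u) + sum(v)  s.t.  A(u - v) = y,  u >= 0,  v >= 0."""
    n, N = A.shape
    c = np.ones(2 * N)
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=(0, None), method="highs")
    return res.x[:N] - res.x[N:]

# Usage sketch: if v is r-sparse with r <= N / (2 * Delta(ker A))**2,
# then basis_pursuit(A, A @ v) recovers v, per the [KT07] statement above.
```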

sensing and distortion
[KT07]: If y = Av and v has at most N/[2Δ(ker(A))]^2 non-zero coordinates, then (P0) and (P1) give the same answer.

For x ∈ R^N and S ⊆ [N], let x_S be x restricted to the coordinates in S. The key point is that if x ∈ ker(A) and |S| ≤ N/[2Δ(ker(A))]^2, then x cannot have more than half of its L1 mass on S.
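A sketch of the standard argument (the slide's displayed inequality is an image; what follows is the usual chain, with Δ = Δ(ker(A)) and S = supp(v), so |S| ≤ N/(2Δ)^2): for any x ∈ ker(A), Cauchy-Schwarz and the definition of distortion give

\[
\|x_S\|_1 \;\le\; \sqrt{|S|}\,\|x_S\|_2 \;\le\; \sqrt{|S|}\,\|x\|_2 \;\le\; \frac{\sqrt{|S|}\,\Delta}{\sqrt{N}}\,\|x\|_1 \;\le\; \tfrac12\,\|x\|_1,
\]

so a kernel vector has at most half of its L1 mass on S. If x' is any solution of Ax' = Av with ||x'||_1 ≤ ||v||_1, then x = x' - v lies in ker(A), and the triangle inequality (using supp(v) ⊆ S) forces ||x_{S^c}||_1 ≤ ||x_S||_1, i.e. at least half of x's L1 mass sits on S. Combined with the display, all inequalities must be tight, which forces x = 0; hence (P1) returns v.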

previous results: explicit
Sub-linear dimension:
- Rudin '60 (and later LLR '94) achieve dim(X) ≈ N^{1/2} and Δ(X) ≤ 3 (X = span of 4-wise independent vectors).
- Indyk '00 achieves dim(X) ≈ exp((log N)^{1/2}) and Δ(X) = 1 + o(1).
- Indyk '07 achieves dim(X) ≈ N/2^{(log log N)^2} and Δ(X) = 1 + o(1).

Our result: We construct an explicit subspace X ⊆ R^N with dim(X) = (1 - o(1))·N and distortion quasi-polylogarithmic in N. In our constructions, X = ker(explicit sign matrix).

previous results: derandomization
Partial derandomization: let A_{k,N} be a random k × N sign matrix (entries ±1 i.i.d.). Kashin's technique shows that, almost surely, ker(A_{k,N}) has low distortion (and dim(ker(A_{k,N})) ≥ N - k).
- Can reduce to O(N log^2 N) random bits [Indyk '00]
- Can reduce to O(N log N) random bits [Artstein-Milman '06]
- Can reduce to O(N) random bits [Lovett-Sodin '07]

Our result [Guruswami-L.-Wigderson]:
- With N^{o(1)} random bits, we get Δ(X) ≤ polylog(N).
- With N^δ random bits for any δ > 0, we get Δ(X) = O_δ(1).

the expander code construction
G = ([N], [n], E) is a bipartite graph, d-right-regular, and L ⊆ R^d is a subspace. Define
X(G,L) = { x ∈ R^N : x_{Γ(j)} ∈ L for every j ∈ [n] },
where x_S ∈ R^{|S|} is x restricted to the coordinates in S ⊆ [N] and Γ(j) is the neighborhood of j. (In the picture, the left side carries the N coordinates x_1, x_2, …, x_N, and the right side has n ≪ N constraint vertices, each of degree d.)

This resembles the construction of Gallager and Tanner (L is the "inner" code). Following Tanner and Sipser-Spielman, we will show that if L is "good" and G is an "expander" then X(G,L) is even better (in some parameters).
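A small sketch of the construction as a kernel computation (illustrative only; here L is assumed to be given as ker(B) for some r × d matrix B, and neighborhoods[j] is assumed to list Γ(j)): X(G,L) is the kernel of the matrix that applies B to the coordinates Γ(j) for every right vertex j.

```python
import numpy as np

def tanner_constraints(B, neighborhoods, N):
    """Return a matrix M with X(G, L) = ker(M), where L = ker(B) in R^d and
    neighborhoods[j] lists the d left coordinates Gamma(j) of right vertex j."""
    r, d = B.shape
    blocks = []
    for nbrs in neighborhoods:
        assert len(nbrs) == d           # G is d-right-regular
        row = np.zeros((r, N))
        row[:, nbrs] = B                # enforce B @ x[Gamma(j)] = 0, i.e. x_{Gamma(j)} in L
        blocks.append(row)
    return np.vstack(blocks)

# Usage sketch: an orthonormal basis of X(G, L) from the SVD of M.
# M = tanner_constraints(B, neighborhoods, N)
# _, s, Vt = np.linalg.svd(M)
# X_basis = Vt[(s > 1e-10).sum():]     # rows spanning ker(M)
```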

some quantitative matters
Say that a subspace L ⊆ R^d is (t, ε)-spread if every x ∈ L satisfies ||x_{S^c}||_2 ≥ ε·||x||_2 for every S ⊆ [d] with |S| ≤ t (no t coordinates carry almost all of the L2 mass).

If L is (Ω(d), ε)-spread, then Δ(L) = O_ε(1). Conversely, if L has Δ(L) = O(1), then L is (Ω(d), Ω(1))-spread.

For a bipartite graph G = ([N],[n],E), the expansion profile of G is the function Λ_G(q) = min{ |Γ(S)| : S ⊆ [N], |S| ≤ q }. (This is expansion from left to right.)
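One way to see the first implication (a sketch with the constants left loose, using the spread definition above): suppose L is (t, ε)-spread with t = δd, take x ∈ L with ||x||_2 = 1, and let S be the t coordinates of largest magnitude. Every coordinate outside S then has magnitude at most 1/√t, so

\[
\|x\|_1 \;\ge\; \|x_{S^c}\|_1 \;\ge\; \frac{\|x_{S^c}\|_2^2}{\max_{i \notin S}|x_i|} \;\ge\; \varepsilon^2 \sqrt{t} \;=\; \varepsilon^2\sqrt{\delta}\,\sqrt{d}\,\|x\|_2,
\]

which gives Δ(L) ≤ 1/(ε²√δ) = O(1).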

spread-boosting theorem
Setup: G = ([N], [n], E) is a bipartite graph, d-right-regular and with left degree ≤ D; L ⊆ R^d is a (t, ε)-spread subspace.

Conclusion: If X(G,L) is (T, δ)-spread, then X(G,L) is (roughly) (t·T, ε·δ)-spread, up to losses depending on D and the expansion profile of G.

How to apply: assume D = O(1) and Λ_G(q) = Ω(q) for all q ∈ [N] (impossible to achieve). Then
X(G,L) is (½, 1)-spread ⇒ (t, ε)-spread ⇒ (t², ε²)-spread ⇒ … ⇒ (Ω(N), ε^{log_t(N)})-spread ⇒ Δ(X(G,L)) ≤ (1/ε)^{log_t(N)}.

spread-boosting theorem
Same setup and conclusion as above. Proof idea: a set S of coordinates should "leak" L2 mass outside of S (since L is spreading and G is an expander), unless most of the mass in S is concentrated on a small subset B ⊆ S, which is impossible by assumption.

when L is random
Let H be a (non-bipartite) d-regular graph with second eigenvalue O(d^{1/2}) (explicit constructions exist by Margulis, Lubotzky-Phillips-Sarnak). Let G be the edge-vertex incidence graph of H: the left vertices are the edges of H, the right vertices are the nodes of H, and an edge is connected to its endpoints. [Alon-Chung] gives a lower bound on the expansion profile of such a G.

A random subspace L ⊆ R^d is (Ω(d), Ω(1))-spread. Letting d = N^{1/4}, each application of the spread-boosting theorem turns (T, δ)-spread into spread on substantially larger sets; it takes O(log log N) steps to reach Ω(N)-sized sets ⇒ poly(log N) distortion.
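A small sketch of the edge-vertex incidence construction (networkx's random regular graph stands in for an explicit Ramanujan graph; all parameters here are illustrative):

```python
import networkx as nx

# d-regular graph H on n nodes; a Ramanujan graph would be used in the actual construction.
H = nx.random_regular_graph(d=8, n=1000, seed=0)

# Edge-vertex incidence graph G: left vertices are the edges of H (N = n*d/2 of them),
# right vertices are the nodes of H, and each edge is joined to its two endpoints.
# So the left degree is D = 2 and the right degree is d.
edges = list(H.edges())
N, n = len(edges), H.number_of_nodes()
Gamma_left = {i: list(e) for i, e in enumerate(edges)}                       # endpoints of edge i
Gamma_right = {v: [i for i, e in enumerate(edges) if v in e] for v in H}     # Gamma(j), each of size d
```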

explicit construction: ingredients for L
Let A be any k × d matrix whose columns a_1, …, a_d ∈ R^k are unit vectors such that for every i ≠ j, |⟨a_i, a_j⟩| ≤ μ. Kerdock codes (aka Mutually Unbiased Bases) [Kerdock '72, Cameron-Seidel '73] provide such matrices.

Spectral Lemma: ker(A) is well-spread, quantitatively in terms of μ. Kerdock + Spectral Lemma gives (Ω(d^{1/2}), Ω(1))-spread subspaces of dimension (1-ε)d for every ε > 0.
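One plausible reading of how such a spectral lemma is proved (a sketch under the extra assumption, true for mutually unbiased bases, that the d columns split into m = d/k orthonormal bases of R^k; the constants are not from the slide): for any S with |S| ≤ 1/(2μ), Gershgorin's theorem applied to the Gram matrix of {a_i}_{i∈S} gives ||A x_S||_2 ≥ ||x_S||_2/√2, while A Aᵀ = m·I gives ||A x_{S^c}||_2 ≤ √m·||x_{S^c}||_2. For x ∈ ker(A) these two contributions must cancel, so

\[
\|x_{S^c}\|_2 \;\ge\; \frac{1}{\sqrt{2m}}\,\|x_S\|_2 \qquad\Longrightarrow\qquad \|x_{S^c}\|_2 \;\ge\; \frac{\|x\|_2}{\sqrt{2m+1}},
\]

i.e. ker(A) is (Ω(1/μ), Ω(√(k/d)))-spread; with Kerdock/MUB parameters μ ≈ 1/√k and k = εd this is (Ω(d^{1/2}), Ω_ε(1))-spread, matching the statement above.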

boosting L with sum-product expanders
Kerdock + Spectral Lemma gives (Ω(d^{1/2}), Ω(1))-spread subspaces of dimension (1-ε)d for every ε > 0.

Problem: If G = Ramanujan construction and L = Kerdock, the spread-boosting theorem gives nothing (Ramanujan loses d^{1/2} and Kerdock gains only d^{1/2}).

Solution: Produce L' = X(G,L) where L = Kerdock and G = sum-product expander.

Sum-product theorems [Bourgain-Katz-Tao, …]: for A ⊆ F_p with |A| ≤ p^{0.99}, we have max(|A+A|, |A·A|) ≥ |A|^{1+δ} for some absolute constant δ > 0.

Using [Barak-Impagliazzo-Wigderson / BKSSW] and the spread-boosting theorem, L' is (d^{1/2+c}, Ω(1))-spread for some c > 0.

Now we can plug L' into G = Ramanujan and get non-trivial boosting. (Almost done…)

some open questions
- Improve the current bounds: a first attempt would be O(1) distortion with sub-linear randomness.
- Stronger pseudorandom properties: Restricted Isometry Property [T. Tao's blog]. Improve the dependence on the co-dimension (important for compressed sensing): if dim(X) ≥ (1-η)N, we get distortion dependence (1/η)^{O(log log N)}.
- Breaking the diameter bound: show that the kernel of a random {0,1} matrix with only 100 ones per row has small distortion, or prove that sparse matrices cannot work. Could hope for: find an explicit collection of unit vectors v_1, v_2, …, v_N ∈ R^n with N ≫ n so that every small enough sub-collection is "nearly orthogonal."

some open questions
- Refuting random subspaces with high distortion: give efficiently computable certificates for Δ(X) being small, or for the Restricted Isometry Property, which exist almost surely for random X ⊆ R^N.
- Linear-time expander decoding? Are there recovery schemes that run faster than Basis Pursuit?