Fast Spectral Algorithms from Sum-of-Squares Proofs: Tensor Decomposition and Planted Sparse Vectors
Sam Hopkins (Cornell), Tselil Schramm (UC Berkeley), Jonathan Shi (Cornell), David Steurer (Cornell)
Competing Themes in Algorithms
O(n) versus O(n⁸): both are polynomial time, so both count as "efficient algorithms."
BUT stronger convex programs give better (poly-time) algorithms, which aren't really efficient.
"Poly = efficient" only for natural problems.
Algorithms, Hierarchies, and Running Time
Hard problem → SDP relaxation (n×n) → add variables & constraints → huge, accurate SDP relaxations (n²×n², n³×n³, n⁴×n⁴, …, 2ⁿ×2ⁿ).
Sum-of-Squares: degree d = an n^O(d)-variable semidefinite program (SDP).
Algorithms, Hierarchies, and Running Time
Better approximation ratios and noise tolerance than linear programs and (basic) semidefinite programs. New algorithms for:
- Scheduling [Levey-Rothvoss]
- Independent sets in bounded-degree graphs [Bansal, Chlamtac]
- Independent sets in hypergraphs [Chlamtac, Chlamtac-Singh]
- Planted problems [Barak-Kelner-Steurer, Barak-Moitra, Hopkins-Shi-Steurer, Ge-Ma, Raghavendra-Rao-Schramm, Ma-Shi-Steurer]
- Unique games [Barak-Raghavendra-Steurer, Barak-Brandao-Harrow-Kelner-Steurer-Zhou]
Algorithms, Hierarchies, and Running Time
Big convex programs: e.g. O(n¹⁰) or O(n^(log n)) variables. Are these algorithms "purely theoretical," or can their running times be improved?
This work: fast spectral algorithms with matching guarantees for planted problems, using eigenvectors of matrix polynomials. (These spectral algorithms look quite different from the usual ones.)
Results
(1) Planted Sparse Vector
(2) Random Tensor Decomposition
(3) Tensor Principal Component Analysis
Results
(1) Planted Sparse Vector: There is a nearly-linear-time algorithm to recover a constant-sparsity v₀ ∈ ℝⁿ planted in a √n/(log n)^O(1)-dimensional random subspace. (Matches the guarantees of degree-4 SoS up to log factors [BKS]. SoS, the previous champion, has to solve a large SDP, much larger than the input size.)
(2) Random Tensor Decomposition: There is an Õ(n^((1+ω)/3))-time algorithm to recover a rank-one factor of a random dimension-d 3-tensor T = Σ_{i≤m} aᵢ^⊗3 whenever m ≪ d^(4/3). (SoS achieves m ≪ d^(3/2), but in large polynomial time [GM, MSS].)
Results
(1) Planted Sparse Vector: match SoS guarantees, nearly-linear time.
(2) Random Tensor Decomposition: almost match SoS guarantees, Õ(n^1.2) time.
(3) Tensor Principal Component Analysis: match SoS guarantees, linear time.
(All time guarantees are in the input size.)
Planted Sparse Vector
Given: a basis for an (almost) random d-dimensional subspace of ℝⁿ, containing a vector v₀ with ≤ n/100 nonzeros. Find v₀.
What dimensions d permit efficient algorithms?
Result: a spectral algorithm matching SoS's d ≲ √n.
The Speedup Recipe
SoS algorithm (large SDP) → spectral algorithm with big matrices → fast spectral algorithm.
The first step uses dual certificates from the SoS SDP (a variant of primal-dual); the second compresses to small matrices.
Different from matrix multiplicative weights [Arora-Kale] and from simpler spectral algorithms; not local rounding [Guruswami-Sinop].
The Speedup Recipe
The intermediate spectral algorithm uses matrices whose dimensions match the SDP's. For the planted sparse vector problem, this is a d²×d² matrix (x⊗x)(x⊗x)⊤ + E: the signal (x⊗x)(x⊗x)⊤, where x ≈ v₀ written in the given basis, plus noise E.
This matrix is too big. But (x⊗x)(x⊗x)⊤ carries redundant information with tensor structure, so we can hope to compress d²×d² down to d×d, replacing (x⊗x)(x⊗x)⊤ by xx⊤.
Hope: the compression preserves the signal-to-noise ratio of (x⊗x)(x⊗x)⊤ + E.
Partial Trace
In d = 2 dimensions, (x⊗x)(x⊗x)⊤ is the 2×2 block matrix
[ x₁²·xx⊤   x₁x₂·xx⊤ ]
[ x₂x₁·xx⊤  x₂²·xx⊤ ]
and summing the diagonal blocks gives x₁²·xx⊤ + x₂²·xx⊤ = ‖x‖²·xx⊤ = xx⊤ for unit x.
So PartialTrace((x⊗x)(x⊗x)⊤ + E) = xx⊤ + ?? What happens to the noise?
What if the noise is as random as possible: E = (1/d)·(i.i.d. ±1 matrix) of size d²×d², so that ‖E‖ ≈ 1? The partial trace sums the d diagonal d×d blocks of E. Each entry of the sum is (1/d)·(a sum of d i.i.d. ±1's) ≈ ±√d/d = ±1/√d, and a d×d matrix with i.i.d. ±1/√d entries has spectral norm ≈ √d·(1/√d) ≈ 1.
Conclusion: the signal-to-noise ratio is preserved!
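To see the heuristic in action, here is a small numpy sketch (our illustration, not part of the talk; the dimension and noise scale are arbitrary). It plants (x⊗x)(x⊗x)⊤ in a d²×d² matrix, adds i.i.d. ±1 noise scaled so that ‖E‖ ≪ 1, and checks that the partial trace, which sums the d diagonal d×d blocks, keeps the noise at the same level while turning the signal into xx⊤.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 30

# Signal: (x tensor x)(x tensor x)^T for a random unit vector x.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
signal = np.kron(np.outer(x, x), np.outer(x, x))       # d^2 x d^2, spectral norm 1

# Noise: i.i.d. +/-1 entries, scaled so that ||E|| << 1 as on the slide.
E = 0.1 * rng.choice([-1.0, 1.0], size=(d**2, d**2)) / d

def partial_trace(A, d):
    """Sum the d diagonal d x d blocks of a d^2 x d^2 matrix A."""
    return np.einsum('iaib->ab', A.reshape(d, d, d, d))

M = signal + E
print(np.allclose(partial_trace(signal, d), np.outer(x, x)))  # True: signal -> x x^T
# Noise norm before and after compression: comparable, both << 1.
print(np.linalg.norm(E, 2), np.linalg.norm(partial_trace(E, d), 2))

# The top eigenvector of the compressed d x d matrix still recovers x.
_, vecs = np.linalg.eigh(partial_trace(M, d))
print(abs(vecs[:, -1] @ x))                            # close to 1
```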
The Speedup Recipe
SoS algorithm (large SDP) → spectral algorithm with big matrices, d²×d²: (x⊗x)(x⊗x)⊤ + E → fast spectral algorithm, d×d: xx⊤ + E′.
Two things to ensure: (1) E behaves like an i.i.d. ±1 matrix; (2) we avoid explicitly computing the large matrix.
Resulting Algorithms are Simple and Spectral
Given: a basis for an (almost) random d-dimensional subspace of ℝⁿ, containing v₀ with ≤ n/100 nonzeros. Write the basis as an n×d matrix U = (u₁, …, u_d) with rows a₁, …, aₙ ∈ ℝ^d.
Compute the top eigenvector y of Σᵢ ‖aᵢ‖²·aᵢaᵢ⊤. Output Σᵢ yᵢuᵢ.
Σᵢ ‖aᵢ‖²·aᵢaᵢ⊤ is a degree-4 matrix polynomial in the input variables: it captures the power of SoS without the high dimensions.
Conclusions
By exploiting tensor structure in dual certificates and randomness in inputs, impractical SoS algorithms can become practical spectral algorithms.
Thanks for coming!
The Resulting Algorithms are Simple and Spectral
Example: Planted Sparse Vector. Input: a basis for V_planted = span(v₀, v₁, …, v_d), where v₀ ∈ ℝⁿ is sparse and v₁, …, v_d ∈ ℝⁿ are random. Goal: find v₀.
Our algorithm. Input: a subspace basis U = (u₁, …, u_d), with rows a₁, …, aₙ.
Compute the top eigenvector y of Σᵢ ‖aᵢ‖² aᵢaᵢ⊤. Output Σᵢ yᵢuᵢ.
Again, Σᵢ ‖aᵢ‖² aᵢaᵢ⊤ is a degree-4 matrix polynomial in the input variables: it captures the power of SoS without the high dimensions.
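As a sanity check, here is a self-contained numpy sketch of this algorithm (our illustration; n, d, and the planted vector's distribution are arbitrary choices consistent with the n/100 sparsity and d ≲ √n regime from the earlier slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4000, 40                       # d well below sqrt(n)
k = n // 100                          # sparsity of the planted vector

# Build V_planted = span(v0, v1, ...) with a sparse v0 and random directions.
v0 = np.zeros(n)
v0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
v0 /= np.linalg.norm(v0)
V = np.column_stack([v0, rng.standard_normal((n, d - 1))])

# The algorithm sees only a scrambled orthonormal basis U of the subspace.
U, _ = np.linalg.qr(V @ rng.standard_normal((d, d)))

rows = U                              # rows a_1, ..., a_n in R^d
M = (rows * (rows**2).sum(axis=1)[:, None]).T @ rows  # sum_i ||a_i||^2 a_i a_i^T

_, Y = np.linalg.eigh(M)              # eigenvectors, ascending eigenvalues
y = Y[:, -1]                          # top eigenvector
v_hat = U @ y                         # output sum_i y_i u_i
print(abs(v_hat @ v0))                # correlation with v0; should be near 1
```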
Contrast to Previous Speedup Approaches
(Matrix) Multiplicative Weights [Arora-Kale]: cannot go faster than matrix-vector multiplication for the matrices in the underlying SDP.
Fast solvers for local rounding [Guruswami-Sinop]: achieve running time 2^O(d)·poly(n) when the rounding algorithm is "local". Many SoS rounding algorithms are not local, and we want near-linear time.
Our algorithms must run in time sublinear in the dimension of the convex relaxation, unlike previous primal-dual or MMW approaches.
The Speedup Recipe
(1) Understand the spectrum of the SoS dual certificate (avoid solving the SDP). The result: a spectral algorithm using high-dimensional matrices. A typical matrix is x^⊗4 + E, reshaped to d²×d², i.e. (x⊗x)(x⊗x)⊤ + E, for some unit x and ‖E‖ ≪ 1.
(2) Reduce dimensions via tensor structure in the dual certificate. The dual certificate matrix is high-dimensional, but its top eigenvector has tensor structure; so instead use PartialTrace(x^⊗4 + E) = xx⊤ + E′.
PartialTrace: d²×d² matrices → d×d matrices. In 2 dimensions, (x^⊗2)(x^⊗2)⊤ has blocks xᵢxⱼ·xx⊤, and the partial trace sums the diagonal blocks: x₁²·xx⊤ + x₂²·xx⊤ = ‖x‖²·xx⊤ = xx⊤.
If E is randomish, ‖E′‖ = ‖PartialTrace(E)‖ ≈ ‖E‖. Heuristic: if E has i.i.d. entries, E = (1/d)·(±1 matrix) of size d²×d², then PartialTrace(E) = (1/d)·(±√d matrix) of size d×d, whose norm is also O(1).
Remaining step: compute PartialTrace(x^⊗4 + E) without ever forming x^⊗4 + E.
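Both reduction steps can be checked in a few lines. In the planted sparse vector application, a natural explicit form of the certificate is M(V) = Σⱼ (aⱼ⊗aⱼ)(aⱼ⊗aⱼ)⊤, which realizes the quadratic form ⟨x^⊗2, M(V)x^⊗2⟩ = Σⱼ⟨aⱼ, x⟩⁴ = ‖Σᵢ xᵢuᵢ‖₄⁴ from the dual-certificate slide later in the deck (the exact M(V) in the paper may differ; this form is our reconstruction). Its partial trace collapses to Σⱼ ‖aⱼ‖² aⱼaⱼ⊤, computable in O(nd²) time from the rows alone, with no d²×d² matrix in sight. A sketch (ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 10
A = rng.standard_normal((n, d)) / np.sqrt(n)           # rows a_1, ..., a_n

# Explicit certificate M(V) = sum_j (a_j tensor a_j)(a_j tensor a_j)^T.
kron_rows = np.einsum('ji,jk->jik', A, A).reshape(n, d * d)
M = kron_rows.T @ kron_rows                             # d^2 x d^2

# Partial trace: sum the d diagonal d x d blocks.
pt = np.einsum('iaib->ab', M.reshape(d, d, d, d))

# The same d x d matrix, straight from the rows, never forming M(V).
direct = (A * (A**2).sum(axis=1)[:, None]).T @ A        # sum_j ||a_j||^2 a_j a_j^T

print(np.allclose(pt, direct))                          # True
```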
Can We Use Previous Approaches to Speeding Up Relaxation-Based Algorithms?
Goal: take an algorithm which uses an n^O(d)-size SDP and run it in nearly-linear time.
(Matrix) Multiplicative Weights [Arora-Kale]: solve an SDP over m×m matrices in Õ(m) time. We need e.g. n²×n² matrices, and our algorithm must run in time sublinear in the dimension of the convex relaxation.
Fast solvers for local rounding [Guruswami-Sinop]: achieve running time 2^O(d)·poly(n) when the rounding algorithm is "local". Many SoS rounding algorithms are not local, and we want near-linear time.
A Benchmark Problem: Planted Sparse Vector
Input: a basis for V_planted = span(v₀, v₁, …, v_d), where v₀ ∈ ℝⁿ has n/100 nonzeros and v₁, …, v_d ∈ ℝⁿ are random. The basis we see consists of random linear combinations u₁, …, u_d of these vectors; a typical unit vector in the subspace has entries ≈ ±1/√n, while v₀ has entries ≈ ±10/√n.
Goal (recovery): find v₀. Goal (distinguishing): distinguish V_planted from a random subspace V_random.
The problem gets harder as d = d(n) grows. Question: what dimension d can be handled by efficient algorithms (poly-time in the input size nd)? [Spielman-Wang-Wright, Demanet-Hand, Barak-Kelner-Steurer, Qu-Sun-Wright, H-Schramm-Shi-Steurer]
Related to compressed sensing, dictionary learning, sparse PCA, shortest codeword, and small-set expansion. A simple problem where the sum-of-squares (SoS) hierarchy beats LPs, (small) SDPs, and local search.
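The contrast in entry sizes from the slide's picture is easy to reproduce numerically (our illustration): a unit vector with n/100 nonzeros has entries of size about 10/√n, while a typical random unit vector is flat, with entries of size about 1/√n.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10000
k = n // 100

v0 = np.zeros(n)
v0[rng.choice(n, size=k, replace=False)] = 1 / np.sqrt(k)  # flat k-sparse unit vector
g = rng.standard_normal(n)
g /= np.linalg.norm(g)                                     # typical random unit vector

print(np.abs(v0).max())   # 1/sqrt(k) = 10/sqrt(n) = 0.1
print(np.abs(g).max())    # about sqrt(2 log n)/sqrt(n) ~ 0.043: much flatter
```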
Previous Work (recovery version)
Authors | Subspace dimension | Technique
[Spielman-Wang-Wright, Demanet-Hand] | O(1), unless greater sparsity | Linear programming
Folklore | | Semidefinite programming
[Qu-Sun-Wright] | n^(1/4)/(log n)^O(1) | Alternating minimization
[Barak-Brandao-Harrow-Kelner-Steurer-Zhou, Barak-Kelner-Steurer] | √n | SoS hierarchy
All require polynomial loss in sparsity or subspace dimension (or both), compared with SoS.
Sum-of-Squares (and SoS-inspired) Algorithms
From now on: d = √n/(log n)^O(1).
Running time | Distinguishing | Recovery
poly(n, d) | [Barak-Brandao-Harrow-Kelner-Steurer-Zhou] | [Barak-Kelner-Steurer]
Õ(nd²) | [Barak-Brandao-Harrow-Kelner-Steurer-Zhou] (implicit) | This work
Õ(nd), i.e. nearly-linear | | This work
(The poly(n, d) row hits a running-time barrier coming from the dimension of the convex program.)
[Barak et al]'s Distinguishing Algorithm
Observation: max_{v ∈ V_random} ‖v‖₄⁴/‖v‖₂⁴ ≪ ‖v₀‖₄⁴/‖v₀‖₂⁴ for sparse v₀.
[Barak et al]: degree-4 SoS certifies this, i.e. SoS₄(max_{v ∈ V_random} ‖v‖₄⁴/‖v‖₂⁴) ≪ ‖v₀‖₄⁴/‖v₀‖₂⁴, via an SDP over d²×d² matrices.
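A quick numerical illustration of the observation (ours; it compares the planted vector against typical vectors from a random subspace rather than the exact maximum, which is what SoS certifies): the ℓ₄/ℓ₂ ratio cleanly separates sparse from random.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, k = 10000, 50, 100

v0 = np.zeros(n)
v0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
v0 /= np.linalg.norm(v0)
print((np.linalg.norm(v0, 4) / np.linalg.norm(v0, 2)) ** 4)   # ~ 3/k = 0.03

Uq, _ = np.linalg.qr(rng.standard_normal((n, d)))             # random subspace
for _ in range(3):
    v = Uq @ rng.standard_normal(d)
    v /= np.linalg.norm(v)
    print((np.linalg.norm(v, 4) / np.linalg.norm(v, 2)) ** 4) # ~ 3/n = 0.0003
```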
Bound SoS₄ Using a Dual Certificate
SoS₄(max_{v ∈ V_random} ‖v‖₄⁴/‖v‖₂⁴) ≤ ‖M(V_random)‖ ≪ ‖v₀‖₄⁴/‖v₀‖₂⁴.
Dual: a matrix M(V) ∈ ℝ^(d²×d²) such that ⟨x^⊗2, M(V) x^⊗2⟩ = ‖Σᵢ xᵢuᵢ‖₄⁴ (recall that span(u₁, …, u_d) = V).
The polynomial ‖Σᵢ xᵢuᵢ‖₄⁴ is what makes SoS the champion: high dimension buys access to a high-degree polynomial.
Structure of the Dual Certificate M(V)
Computing directly with M(V):
(1) M(V) is explicit: matrix-vector multiplication in time O(nd²).
(2) ‖M(V_random)‖ ≪ ‖v₀‖₄⁴/‖v₀‖₂⁴ ≈ ‖M(V_planted)‖.
Lemma: With high probability, M(V_planted) = (y⊗y)(y⊗y)⊤ + E, where y is a unit vector with v₀ = Σᵢ yᵢuᵢ and ‖E‖ ≪ 1. (E is a random matrix depending on the randomness in the subspace V_planted.)
(Breaking) The Dimension Barrier
Recall: the polynomial ‖Σᵢ xᵢuᵢ‖₄⁴ makes SoS the champion; M(V) is d²×d², and the high dimension buys access to a high-degree polynomial.
High degree but lower dimension:
M(V_planted) = (y⊗y)(y⊗y)⊤ + E is d²×d².
PartialTrace(M(V_planted)) = yy⊤ + E′ is d×d.
PartialTrace: d²×d² matrices → d×d matrices, with PartialTrace((y⊗y)(y⊗y)⊤) = yy⊤ and PartialTrace(E) = E′.
(Breaking) The Dimension Barrier
M(V_planted) = (y⊗y)(y⊗y)⊤ + E is d²×d²; PartialTrace(M(V_planted)) = yy⊤ + E′ is d×d.
Can we compute the top eigenvector in time Õ(nd)? Yes: M(V) is a nice function of V, and PartialTrace is linear.
Does ‖E‖ ≪ 1 imply ‖E′‖ = ‖PartialTrace(E)‖ ≪ 1? Yes, if E is random enough. Related fact: a uniformly random A ∈ {±1}^(n×n) has ‖A‖ ≈ |Tr A| ≈ √n (up to constants).
Recovering v₀ in Õ(nd) Time
Algorithm. Input: subspace basis U = (u₁, …, u_d); let a₁, …, aₙ be its rows. Compute the top eigenvector y of Σᵢ ‖aᵢ‖²·aᵢaᵢ⊤. Output Σᵢ yᵢuᵢ.
The Recipe:
(1) SoS algorithms (often) come with dual certificate constructions.
(2) Explicitly compute the spectrum of the dual certificate.
(3) Compress to lower dimensions, using randomness to avoid losing information.
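A minimal sketch of the nearly-linear-time version (our illustration; the function name, fixed iteration count, and lack of a convergence test are our choices): power iteration applies Σᵢ ‖aᵢ‖²·aᵢaᵢ⊤ to a vector in O(nd) time via two multiplications by U, so the d×d matrix is never formed.

```python
import numpy as np

def recover_sparse_vector(U, iters=200, seed=0):
    """Power iteration for the top eigenvector of sum_i ||a_i||^2 a_i a_i^T,
    where a_1, ..., a_n are the rows of the n x d basis matrix U.
    Each step is O(nd): (sum_i ||a_i||^2 a_i a_i^T) y = U^T (rho * (U y))."""
    rng = np.random.default_rng(seed)
    rho = (U**2).sum(axis=1)            # ||a_i||^2, computed once in O(nd)
    y = rng.standard_normal(U.shape[1])
    for _ in range(iters):
        y = U.T @ (rho * (U @ y))       # multiply by the matrix without forming it
        y /= np.linalg.norm(y)
    return U @ y                        # sum_i y_i u_i, the candidate for v_0
```

Since the matrix is positive semidefinite, plain power iteration converges to its top eigenvector; in practice one might instead hand this matrix-vector product to an off-the-shelf iterative eigensolver (e.g. scipy.sparse.linalg.eigsh with a LinearOperator).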