Fast Spectral Algorithms from Sum-of-Squares Proofs: Tensor Decomposition and Planted Sparse Vectors
Sam Hopkins (Cornell), Tselil Schramm (UC Berkeley), Jonathan Shi (Cornell), David Steurer (Cornell)
Competing Themes in Algorithms
O(n⁸) versus O(n). "Polynomial time = efficient algorithms" holds for natural problems, BUT stronger convex programs give better (poly-time) algorithms which aren't really efficient.
Algorithms, Hierarchies, and Running Time
Hard problem → SDP relaxation on n×n matrices → (add variables & constraints) → n²×n² → n³×n³ → n⁴×n⁴ → ⋯ → 2ⁿ×2ⁿ: a huge, accurate SDP relaxation.
Sum-of-Squares: degree d = a semidefinite program in n^O(d) variables.
Algorithms, Hierarchies, and Running Time
Better approximation ratios and noise tolerance than linear programs and plain semidefinite programs. New algorithms for:
Scheduling [Levey-Rothvoss]
Independent sets in bounded-degree graphs [Bansal, Chlamtac]
Independent sets in hypergraphs [Chlamtac, Chlamtac-Singh]
Planted problems [Barak-Kelner-Steurer, Barak-Moitra, Hopkins-Shi-Steurer, Ge-Ma, Raghavendra-Rao-Schramm, Ma-Shi-Steurer]
Unique games [Barak-Raghavendra-Steurer, Barak-Brandao-Harrow-Kelner-Steurer-Zhou]
Algorithms, Hierarchies, and Running Time
Big convex programs: e.g. O(n¹⁰) or O(n^log n) variables. Are these algorithms "purely theoretical", or can their running times be improved? (Any real speedup must run in time sublinear in the dimension of the convex relaxation, unlike previous primal-dual or matrix-multiplicative-weights approaches.)
This work: fast spectral algorithms, using eigenvectors of matrix polynomials, with matching guarantees for planted problems.
Results
(1) Planted Sparse Vector
(2) Random Tensor Decomposition
(3) Tensor Principal Component Analysis
Results
(1) Planted Sparse Vector: there is a nearly-linear-time algorithm to recover a constant-sparsity v₀ ∈ ℝⁿ planted in a (√n / (log n)^O(1))-dimensional random subspace. (Matches the guarantees of degree-4 SoS up to log factors [BKS]. SoS, the previous champion, has to solve a large SDP, much larger than the input size.)
(2) Random Tensor Decomposition: there is an Õ(n^((1+ω)/3))-time algorithm to recover a rank-one factor of a random dimension-d 3-tensor T = Σ_{i≤m} aᵢ^⊗3 whenever m ≪ d^(4/3). (SoS achieves m ≪ d^(3/2), but in large polynomial time [GM, MSS].)
Results
(1) Planted Sparse Vector: match SoS guarantees, in nearly-linear time.
(2) Random Tensor Decomposition: almost match SoS guarantees, in Õ(n^1.2) time.
(3) Tensor Principal Component Analysis: match SoS guarantees, in linear time.
(All running times are measured in the input size.)
Planted Sparse Vector
Given: a basis for an (almost) random d-dimensional subspace of ℝⁿ, containing a planted vector v₀ with at most a constant fraction (say n/100) of nonzero entries. Find v₀. [Figure: dense basis vectors x, y, z ∈ ℝⁿ alongside the k-sparse planted vector v₀.]
What dimensions d permit efficient algorithms? Result: a spectral algorithm matching SoS's d ≲ √n.
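To make the setup concrete, here is a minimal sketch (assuming Python with numpy; the dimensions, the n/100 sparsity, and the name planted_instance are illustrative choices, not from the paper) of generating such an instance:

```python
import numpy as np

def planted_instance(n=4000, d=40, seed=0):
    """Return an orthonormal basis U for a subspace hiding a sparse v0."""
    rng = np.random.default_rng(seed)
    # Sparse planted vector: n/100 nonzero +-1 entries, normalized to unit norm.
    k = n // 100
    v0 = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    v0[support] = rng.choice([-1.0, 1.0], size=k)
    v0 /= np.linalg.norm(v0)
    # d - 1 random dense directions spanning the rest of the subspace.
    V = np.column_stack([v0] + [rng.standard_normal(n) for _ in range(d - 1)])
    Q, _ = np.linalg.qr(V)
    # Mix with a random rotation so v0 is not simply one of the basis vectors.
    R = np.linalg.qr(rng.standard_normal((d, d)))[0]
    return Q @ R, v0
```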
The Speedup Recipe
SoS algorithm (large SDP) → spectral algorithm with big matrices → fast spectral algorithm.
The first arrow uses dual certificates from the SoS SDP (a variant of primal-dual); the second is compression to small matrices.
Different from: matrix multiplicative weights [Arora-Kale]; simpler spectral algorithms. Not local rounding [Guruswami-Sinop].
The Speedup Recipe
In the middle step, matrix dimensions ≈ SDP dimensions. For planted sparse vector (an n×d basis, with x ≈ v₀ in basis coordinates), the typical d²×d² matrix is (x⊗x)(x⊗x)ᵀ + E: signal plus noise. This matrix is too big.
The Speedup Recipe
Shrink from d²×d² to d×d: the matrix (x⊗x)(x⊗x)ᵀ contains redundant information with tensor structure, and compresses to xxᵀ. Hope: preserve the signal-to-noise ratio of (x⊗x)(x⊗x)ᵀ + E.
Partial Trace
PartialTrace: (x⊗x)(x⊗x)ᵀ ↦ xxᵀ. In d = 2 dimensions, the d²×d² matrix (x⊗x)(x⊗x)ᵀ consists of the d×d blocks x₁²·xxᵀ, x₁x₂·xxᵀ, x₂x₁·xxᵀ, x₂²·xxᵀ; summing the diagonal blocks gives x₁²·xxᵀ + x₂²·xxᵀ = ‖x‖²·xxᵀ = xxᵀ for unit x.
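A minimal sketch (assuming Python with numpy; partial_trace is our name for the map) of this block-sum operation, together with a check of the identity above:

```python
import numpy as np

def partial_trace(M, d):
    """Sum the d diagonal d x d blocks of a d^2 x d^2 matrix."""
    B = M.reshape(d, d, d, d)        # B[i, a, j, b] = block (i, j), entry (a, b)
    return np.einsum('iaib->ab', B)  # sum the blocks with i == j

# Check: PartialTrace((x ⊗ x)(x ⊗ x)^T) = ||x||^2 * x x^T  (= x x^T for unit x).
d = 2
x = np.random.randn(d)
M = np.outer(np.kron(x, x), np.kron(x, x))
assert np.allclose(partial_trace(M, d), np.dot(x, x) * np.outer(x, x))
```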
Partial Trace
What happens to the noise: (x⊗x)(x⊗x)ᵀ + E ↦ xxᵀ + ?? Suppose the noise is as random as possible, with i.i.d. ±1 entries. Recall that an m×m matrix with i.i.d. ±1 entries has spectral norm ≈ √m.
Take E = (1/d)·(a d²×d² matrix with i.i.d. ±1 entries), so that ‖E‖ ≈ (1/d)·√(d²) ≈ 1, comparable to the signal. Each entry of PartialTrace(E) is (1/d)·(a sum of d i.i.d. ±1's) ≈ ±√d/d, so PartialTrace(E) is a d×d random matrix with ‖PartialTrace(E)‖ ≈ (1/d)·√d·√d ≈ 1.
Conclusion: the signal-to-noise ratio is preserved!
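This heuristic is easy to test numerically; a small sketch (assuming numpy; d = 40 is an arbitrary illustrative choice):

```python
import numpy as np

d = 40
rng = np.random.default_rng(1)
# Noise with i.i.d. +-1 entries, scaled by 1/d so that ||E|| = O(1).
E = rng.choice([-1.0, 1.0], size=(d * d, d * d)) / d
# Partial trace: sum the d diagonal d x d blocks.
E_small = np.einsum('iaib->ab', E.reshape(d, d, d, d))

print(np.linalg.norm(E, 2))        # about 2: (1/d) * O(sqrt(d^2))
print(np.linalg.norm(E_small, 2))  # also O(1): entries are ~ +-sqrt(d)/d
```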
The Speedup Recipe
SoS algorithm (large SDP) → spectral algorithm with big d²×d² matrices, (x⊗x)(x⊗x)ᵀ + E → fast spectral algorithm with d×d matrices, xxᵀ + E′.
Two requirements: ensure E is like a ±1 i.i.d. matrix, and avoid explicitly computing the large matrix.
Resulting Algorithms are Simple and Spectral
Given: a basis for an (almost) random d-dimensional subspace of ℝⁿ containing the sparse v₀. Let a₁, …, aₙ ∈ ℝ^d be the rows of the n×d basis matrix. Compute the top eigenvector y of Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ, and output the corresponding vector Σᵢ yᵢuᵢ in the subspace.
Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ is a degree-4 matrix polynomial in the input variables: it captures the power of SoS without the high dimensions. (A sketch in code follows.)
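A minimal sketch of those two steps (assuming numpy; recover_sparse_vector is our name, and this transcribes the slide directly, without any extra preprocessing the full analysis might use):

```python
import numpy as np

def recover_sparse_vector(U):
    """U: n x d basis matrix with rows a_1, ..., a_n. Returns a candidate for v0."""
    norms2 = np.sum(U * U, axis=1)    # ||a_i||^2 for each row
    M = (U * norms2[:, None]).T @ U   # sum_i ||a_i||^2 a_i a_i^T, a d x d matrix
    _, eigvecs = np.linalg.eigh(M)
    y = eigvecs[:, -1]                # top eigenvector
    return U @ y                      # sum_i y_i u_i, up to sign and scale
```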
Conclusions
By exploiting tensor structure in dual certificates and randomness in inputs, impractical SoS algorithms can become practical spectral algorithms. Thanks for coming!
The Resulting Algorithms are Simple and Spectral
Example: Planted Sparse Vector. Input: a basis for V_planted = span(v₀, v₁, …, v_d), where v₀ ∈ ℝⁿ is sparse and v₁, …, v_d ∈ ℝⁿ are random. Goal: find v₀.
Our algorithm: Input: subspace basis U = (u₁, …, u_d), viewed as an n×d matrix with columns u₁, …, u_d and rows a₁, …, aₙ. Compute the top eigenvector y of Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ. Output Σᵢ yᵢuᵢ.
Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ is a degree-4 matrix polynomial in the input variables: it captures the power of SoS without the high dimensions.
Contrast to Previous Speedup Approaches
(Matrix) Multiplicative Weights [Arora-Kale]: cannot go faster than matrix-vector multiplication for the matrices in the underlying SDP.
Fast solvers for local rounding [Guruswami-Sinop]: achieve running time 2^O(d)·poly(n) when the rounding algorithm is "local". Many SoS rounding algorithms are not local, and we want near-linear time.
Our algorithms must run in time sublinear in the dimension of the convex relaxation, unlike previous primal-dual or MMW approaches.
The Speedup Recipe
(1) Understand the spectrum of the SoS dual certificate (avoid the SDP). The result: a spectral algorithm using high-dimensional matrices; the typical matrix is x^⊗4 + E for some unit x and ‖E‖ ≪ 1.
(2) Reduce dimensions via tensor structure in the dual certificate. The dual certificate matrix is high-dimensional, but its top eigenvector has tensor structure; so instead use PartialTrace(x^⊗4 + E) = xxᵀ + E′.
PartialTrace: d²×d² matrices → d×d matrices. In 2 dimensions, (x^⊗2)(x^⊗2)ᵀ has blocks x₁²·xxᵀ, x₁x₂·xxᵀ, x₂x₁·xxᵀ, x₂²·xxᵀ, and summing the diagonal blocks gives x₁²·xxᵀ + x₂²·xxᵀ = ‖x‖₂²·xxᵀ = xxᵀ.
If E is randomish, ‖E′‖ = ‖PartialTrace(E)‖ ≈ ‖E‖. Heuristic: if E = (1/d)·(d²×d², i.i.d. ±1 entries), then PartialTrace(E) = (1/d)·(d×d, entries ≈ ±√d), and both norms are O(1).
Remaining step: compute PartialTrace(x^⊗4 + E) without forming x^⊗4 + E. (A sketch follows.)
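For certificates built from rank-one tensor terms this last step is concrete. A sketch (assuming numpy, and assuming a certificate of the form Σⱼ (aⱼ⊗aⱼ)(aⱼ⊗aⱼ)ᵀ, one natural form consistent with these slides) showing that the partial trace collapses to Σⱼ ‖aⱼ‖²·aⱼaⱼᵀ, so the d²×d² matrix never needs to exist:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 30
A = rng.standard_normal((n, d))    # rows a_1, ..., a_n

# Direct route: build the d^2 x d^2 matrix, then partially trace it.
big = sum(np.outer(np.kron(a, a), np.kron(a, a)) for a in A)
direct = np.einsum('iaib->ab', big.reshape(d, d, d, d))

# Implicit route: sum_j ||a_j||^2 a_j a_j^T, using only d x d memory.
implicit = (A * np.sum(A * A, axis=1)[:, None]).T @ A

assert np.allclose(direct, implicit)
```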
Can We Use Previous Approaches to Speeding Up Relaxation-Based Algorithms?
Goal: take an algorithm which uses an n^O(d)-size SDP and run it in nearly-linear time. The algorithm will have to run in time sublinear in the dimension of the convex relaxation, unlike previous primal-dual or MMW methods:
(Matrix) Multiplicative Weights [Arora-Kale]: solves SDPs over m×m matrices in Õ(m) time, but we need e.g. n²×n² matrices.
Fast solvers for local rounding [Guruswami-Sinop]: achieve running time 2^O(d)·poly(n) when the rounding algorithm is "local". Many SoS rounding algorithms are not local, and we want near-linear time.
A Benchmark Problem: Planted Sparse Vector
Input: a basis for V_planted = span(v₀, v₁, …, v_d), where v₀ ∈ ℝⁿ has n/100 nonzeros and v₁, …, v_d ∈ ℝⁿ are random. The algorithm sees random linear combinations u₀, …, u_d, whose entries look like ±1/√n, whereas v₀ has entries ≈ ±10/√n on its support.
Goal (recovery): find v₀. Goal (distinguishing): distinguish V_planted from a random subspace V_random.
The problem gets harder as d = d(n) grows. Question: what dimension d can be handled by efficient algorithms, i.e. poly-time in the input size nd? [Spielman-Wang-Wright, Demanet-Hand, Barak-Kelner-Steurer, Qu-Sun-Wright, H-Schramm-Shi-Steurer]
Related to compressed sensing, dictionary learning, sparse PCA, shortest codeword, and small-set expansion. A simple problem where the sum-of-squares (SoS) hierarchy beats LPs, (small) SDPs, and local search.
Previous Work (recovery version)
[Spielman-Wang-Wright, Demanet-Hand]: subspace dimension O(1) (unless greater sparsity); linear programming.
Folklore: semidefinite programming.
[Qu-Sun-Wright]: subspace dimension n^(1/4), up to (log n)^O(1) factors; alternating minimization.
[Barak-Brandao-Harrow-Kelner-Steurer-Zhou, Barak-Kelner-Steurer]: subspace dimension √n; SoS hierarchy.
All require polynomial loss in sparsity or subspace dimension, or both, compared with SoS.
Sum-of-Squares (and SoS-inspired) Algorithms
From now on: d = √n / (log n)^O(1).
Running times for distinguishing and recovery:
poly(n, d): distinguishing [Barak-Brandao-Harrow-Kelner-Steurer-Zhou]; recovery [Barak-Kelner-Steurer]. (Running-time barrier from the dimension of the convex program.)
Õ(n d²): distinguishing [Barak-Brandao-Harrow-Kelner-Steurer-Zhou] (implicit).
Õ(n d), i.e. nearly-linear: distinguishing and recovery, this work.
[Barak et al]'s Distinguishing Algorithm
Observation: max over v ∈ V_random of ‖v‖₄/‖v‖₂ is much smaller than ‖v₀‖₄/‖v₀‖₂ for sparse v₀.
[Barak et al]: the degree-4 relaxation SoS₄ certifies this separation, via an SDP over d²×d² matrices.
Bound SoS₄ Using a Dual Certificate
SoS₄(max over v ∈ V_random of ‖v‖₄⁴) ≤ ‖M(V_random)‖ ≪ ‖v₀‖₄⁴.
Dual: a matrix M(V) ∈ ℝ^(d²×d²) so that ⟨x^⊗2, M(V)·x^⊗2⟩ = ‖Σᵢ xᵢuᵢ‖₄⁴ (remember that span(u₀, …, u_d) = V).
The polynomial ‖Σᵢ xᵢuᵢ‖₄⁴ is what makes SoS the champion: high dimension buys access to a high-degree polynomial.
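One concrete matrix with this quadratic-form property (a sketch assuming numpy; the paper's actual M(V) may include correction terms, but this form realizes the stated identity) is M(V) = Σⱼ (aⱼ⊗aⱼ)(aⱼ⊗aⱼ)ᵀ, summing over the rows aⱼ of the basis matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 4
U = rng.standard_normal((n, d))    # columns u_1, ..., u_d; rows a_1, ..., a_n

M_V = sum(np.outer(np.kron(a, a), np.kron(a, a)) for a in U)  # d^2 x d^2

# Check the identity <x ⊗ x, M(V) (x ⊗ x)> = ||sum_i x_i u_i||_4^4.
x = rng.standard_normal(d)
xx = np.kron(x, x)
assert np.isclose(xx @ M_V @ xx, np.sum((U @ x) ** 4))
```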
Structure of the Dual Certificate M(V)
Computing directly with M(V): (1) M(V) is explicit, with matrix-vector multiplication in time O(nd²). (2) ‖M(V_random)‖ ≪ ‖v₀‖₄⁴ ≈ ‖M(V_planted)‖.
Lemma: with high probability, M(V_planted) = (y⊗y)(y⊗y)ᵀ + E, where y is a unit vector with v₀ = Σᵢ yᵢuᵢ and ‖E‖ ≪ 1. (E is a random matrix depending on the randomness in the subspace V_planted.)
(Breaking) The Dimension Barrier
Recall: the polynomial ‖Σᵢ xᵢuᵢ‖₄⁴ makes SoS the champion; the d²×d² matrix M(V) buys access to this high-degree polynomial.
High degree, but lower dimension: M(V_planted) = (y⊗y)(y⊗y)ᵀ + E is d²×d², while PartialTrace(M(V_planted)) = yyᵀ + E′ is d×d, where PartialTrace: d²×d² matrices → d×d matrices satisfies PartialTrace((y⊗y)(y⊗y)ᵀ) = yyᵀ and PartialTrace(E) = E′.
(Breaking) The Dimension Barrier
High degree, but lower dimension: M(V_planted) = (y⊗y)(y⊗y)ᵀ + E is d²×d²; PartialTrace(M(V_planted)) = yyᵀ + E′ is d×d.
Can we compute the top eigenvector in time Õ(nd)? Yes: M(V) is a nice function of V, and PartialTrace is linear.
Does ‖E‖ ≪ 1 imply ‖E′‖ = ‖PartialTrace(E)‖ ≪ 1? Yes, if E is random enough. (Related: a uniformly random A ∈ {−1,1}^(n×n) has ‖A‖ ≈ |Tr A| ≈ √n.)
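A sketch of why Õ(nd) is plausible (assuming numpy; top_eigvec and the iteration count are our illustrative choices): the d×d matrix Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ never has to be formed, because its matrix-vector product factors through the n×d basis matrix:

```python
import numpy as np

def top_eigvec(A, iters=200, seed=0):
    """Power iteration for sum_i ||a_i||^2 a_i a_i^T with O(nd) work per step."""
    rng = np.random.default_rng(seed)
    w = np.sum(A * A, axis=1)    # ||a_i||^2, computed once
    v = rng.standard_normal(A.shape[1])
    for _ in range(iters):
        v = A.T @ (w * (A @ v))  # equals (sum_i w_i a_i a_i^T) v, in O(nd) time
        v /= np.linalg.norm(v)
    return v
```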
Recovering v₀ in Õ(nd) Time
Algorithm. Input: subspace basis U = (u₁, …, u_d); let a₁, …, aₙ be the generators (rows of U). Compute the top eigenvector y of Σᵢ ‖aᵢ‖²·aᵢaᵢᵀ. Output Σᵢ yᵢuᵢ.
The Recipe: (1) SoS algorithms (often) come with dual certificate constructions. (2) Explicitly compute the spectrum of the dual certificate. (3) Compress to lower dimensions, using randomness to avoid losing information.