1
Matrix Martingales in Randomized Numerical Linear Algebra
Speaker: Rasmus Kyng, Harvard University. Sep 27, 2018
2
Concentration of Scalar Random Variables
The $X_i$ are independent with $\mathbb{E}[X_i] = 0$. Is $X = \sum_i X_i \approx 0$ with high probability?
3
Concentration of Scalar Random Variables
Bernstein’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$; the $X_i$ are independent, $\mathbb{E}[X_i] = 0$, $|X_i| \le r$, and $\sum_i \mathbb{E}[X_i^2] \le \sigma^2$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) \le \tau$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(1/\tau)$.
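To make the tail bound concrete, here is a minimal numerical sanity check (mine, not from the talk): sum bounded, independent, zero-mean variables and compare the empirical tail with the Bernstein bound. The uniform increments and all constants are illustrative choices.

```python
# Empirically compare P[|X| > eps] with 2*exp(-(eps^2/2)/(r*eps + sigma^2))
# for X a sum of n independent, zero-mean variables bounded by r.
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps, r = 200, 100_000, 0.5, 0.02
X_i = rng.uniform(-r, r, size=(trials, n))      # independent, mean zero, |X_i| <= r
X = X_i.sum(axis=1)

sigma2 = n * r**2 / 3                           # exact variance of the sum
empirical = np.mean(np.abs(X) > eps)
bernstein = 2 * np.exp(-(eps**2 / 2) / (r * eps + sigma2))
print(f"empirical tail {empirical:.2e} vs Bernstein bound {bernstein:.2e}")
```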
4
Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. Instead of requiring the $X_i$ to be independent with $\mathbb{E}[X_i] = 0$, require only $\mathbb{E}[X_i \mid X_1, \ldots, X_{i-1}] = 0$.
5
Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$; the martingale condition: $\mathbb{E}[X_i \mid \text{previous steps}] = 0$.
6
Concentration of Scalar Martingales
Freedman’s Inequality vs. Bernstein’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. Bernstein’s conditions “$X_i$ independent, $\mathbb{E}[X_i] = 0$” and “$\sum_i \mathbb{E}[X_i^2] \le \sigma^2$” are replaced by “$\mathbb{E}[X_i \mid \text{previous steps}] = 0$” and “$\mathbb{P}[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2] \le \delta$”; the bound $|X_i| \le r$ stays, and the conclusion becomes $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
7
Concentration of Scalar Martingales
Freedman’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$, with $\mathbb{E}[X_i \mid \text{previous steps}] = 0$, $|X_i| \le r$, and $\mathbb{P}[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2] \le \delta$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
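A scalar toy example (my own, not from the talk) of this setting: the step size depends on the past, the sign is a fresh fair coin, so each increment has conditional mean zero and the predictable quadratic variation is the running sum of squared step sizes. All parameters are arbitrary; here the variation is bounded deterministically, so $\delta = 0$.

```python
# Adaptive-step random walk: E[X_i | past] = 0, predictable quadratic
# variation = sum of s_i^2. Compare the empirical tail with Freedman's bound.
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps, r = 400, 100_000, 0.5, 0.01
X = np.zeros(trials)
quad_var = np.zeros(trials)
for _ in range(n):
    s = np.where(np.abs(X) < 0.25, r, r / 2)    # step size chosen from the past
    X += s * rng.choice([-1.0, 1.0], size=trials)
    quad_var += s ** 2

sigma2 = quad_var.max()                         # deterministic bound, so delta = 0
empirical = np.mean(np.abs(X) > eps)
freedman = 2 * np.exp(-(eps ** 2 / 2) / (r * eps + sigma2))
print(f"empirical tail {empirical:.2e} vs Freedman bound {freedman:.2e}")
```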
8
Concentration of Matrix Random Variables
Matrix Bernstein’s Inequality (Tropp ’11). Random $\mathbf{X} = \sum_i \mathbf{X}_i$ with $\mathbf{X}_i \in \mathbb{R}^{d \times d}$ symmetric; the $\mathbf{X}_i$ are independent, $\mathbb{E}[\mathbf{X}_i] = \mathbf{0}$, $\|\mathbf{X}_i\| \le r$, and $\|\sum_i \mathbb{E}[\mathbf{X}_i^2]\| \le \sigma^2$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right)$.
9
Concentration of Matrix Martingales
Matrix Freedman’s Inequality (Tropp ’11). Random $\mathbf{X} = \sum_i \mathbf{X}_i$ with $\mathbf{X}_i \in \mathbb{R}^{d \times d}$ symmetric; $\mathbb{E}[\mathbf{X}_i \mid \text{previous steps}] = \mathbf{0}$, $\|\mathbf{X}_i\| \le r$, and $\mathbb{P}[\|\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta \le \tau + \delta$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(d/\tau)$.
10
Concentration of Matrix Martingales
Matrix Freedman’s Inequality (Tropp ’11): the condition $\mathbb{P}[\|\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$ is a bound on the predictable quadratic variation $\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]$.
11
Laplacian Matrices. Graph $G = (V, E)$, edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$.
The Laplacian $L = \sum_{e \in E} L_e$ is an $n \times n$ matrix. (Figure: a small example graph on vertices $a$, $b$, $c$.)
13
Laplacian Matrices. Graph $G = (V, E)$, edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$.
$\mathbf{A}$: weighted adjacency matrix of the graph, $\mathbf{A}_{ij} = w_{ij}$. $\mathbf{D}$: diagonal matrix of weighted degrees, $\mathbf{D}_{ii} = \sum_j w_{ij}$. $\mathbf{L} = \mathbf{D} - \mathbf{A}$.
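For concreteness, a minimal sketch (mine, with made-up example weights) of building $\mathbf{L} = \mathbf{D} - \mathbf{A}$ from a weighted edge list:

```python
# Build L = D - A for an undirected weighted graph given as (i, j, w) triples.
import numpy as np

def laplacian(n, edges):
    """edges: list of (i, j, w) with 0-based vertices and weight w > 0."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, j] += w
        A[j, i] += w
    D = np.diag(A.sum(axis=1))          # weighted degrees on the diagonal
    return D - A

L = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])
print(L)                                 # rows sum to zero; off-diagonals are -w_ij
```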
14
Laplacian Matrices. Symmetric matrix $L$ with all off-diagonals non-positive and $L_{ii} = \sum_{j \ne i} |L_{ij}|$.
15
Laplacian of a Graph. (Figure: a small example graph with edge weights 1, 1, 2 and its Laplacian.)
16
Approximation between graphs
PSD Order: $\mathbf{A} \preceq \mathbf{B}$ iff for all $\mathbf{x}$, $\mathbf{x}^\top \mathbf{A} \mathbf{x} \le \mathbf{x}^\top \mathbf{B} \mathbf{x}$. Matrix Approximation: $\mathbf{A} \approx_\varepsilon \mathbf{B}$ iff $\mathbf{A} \preceq (1+\varepsilon)\mathbf{B}$ and $\mathbf{B} \preceq (1+\varepsilon)\mathbf{A}$. Graphs: $G \approx_\varepsilon H$ iff $\mathbf{L}_G \approx_\varepsilon \mathbf{L}_H$. Implies multiplicative approximation to cuts.
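A small sketch (mine) of checking $\mathbf{A} \approx_\varepsilon \mathbf{B}$ numerically, assuming dense symmetric PSD matrices with the same null space and an exact pseudo-inverse; the tolerance and the use of `eigvals` are simplifications for illustration.

```python
# A ≈_eps B iff all nonzero generalized eigenvalues of (A, B) lie in
# [1/(1+eps), 1+eps], i.e. A ⪯ (1+eps)B and B ⪯ (1+eps)A.
import numpy as np

def approx_eq(A, B, eps):
    vals = np.linalg.eigvals(np.linalg.pinv(B) @ A).real
    vals = vals[vals > 1e-12]            # ignore the shared null space
    return vals.max() <= 1 + eps and vals.min() >= 1 / (1 + eps)

A = np.diag([2.0, 1.0, 1.0])
print(approx_eq(A, 1.05 * A, eps=0.1), approx_eq(A, 2.0 * A, eps=0.1))  # True False
```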
17
Sparsification of Laplacians
Every graph $G^{(0)}$ on $n$ vertices, for any $\varepsilon \in (0,1)$, has a sparsifier $G^{(1)} \approx_\varepsilon G^{(0)}$ with $O(n\varepsilon^{-2})$ edges [BSS’09, LS’17]. Sampling edges independently by leverage scores w.h.p. gives a sparsifier with $O(n\varepsilon^{-2}\log n)$ edges [Spielman-Srivastava’08]. Many applications!
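A rough sketch (mine) of the independent leverage-score sampling just described. The dense pseudo-inverse, the oversampling constant `c`, and the inline Laplacian builder are simplifying assumptions; the actual algorithm of [Spielman-Srivastava’08] estimates effective resistances instead of computing them exactly.

```python
# Keep each edge independently with p_e ~ leverage score w_e * R_eff(e),
# reweighting kept edges by 1/p_e so the Laplacian is preserved in expectation.
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

def sparsify(n, edges, eps, c=4.0, seed=0):
    Lp = np.linalg.pinv(lap(n, edges))
    rng = np.random.default_rng(seed)
    kept = []
    for i, j, w in edges:
        chi = np.zeros(n); chi[i] = 1.0; chi[j] = -1.0
        lev = w * chi @ Lp @ chi                     # leverage score of edge (i, j)
        p = min(1.0, c * np.log(n) * lev / eps**2)
        if rng.random() < p:
            kept.append((i, j, w / p))               # reweight to stay unbiased
    return kept
```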
18
Graph Semi-Streaming Sparsification
$m$ edges of a graph arriving as a stream $(a_1, b_1, w_1), (a_2, b_2, w_2), \ldots$ GOAL: maintain an $\varepsilon$-sparsifier, minimize working space (and time). [Ahn-Guha ’09]: $n\varepsilon^{-2}\log n \log\frac{m}{n}$ space, a $n\varepsilon^{-2}\log n \log\frac{m}{n}$-edge cut $\varepsilon$-sparsifier; used scalar martingales.
19
Independent Edge Samples
Sampling rule: $\mathbf{L}_e^{(1)} = \frac{1}{p_e}\mathbf{L}_e^{(0)}$ with probability $p_e$, and $\mathbf{0}$ otherwise. Expectation: $\mathbb{E}[\mathbf{L}_e^{(1)}] = \mathbf{L}_e^{(0)}$. Zero-mean: $\mathbf{X}_e^{(1)} = \mathbf{L}_e^{(1)} - \mathbf{L}_e^{(0)}$, so $\mathbf{X}^{(1)} = \mathbf{L}^{(1)} - \mathbf{L}^{(0)} = \sum_e \mathbf{X}_e^{(1)}$.
20
Matrix Martingale? Zero mean? So what if we just sparsify again? It’s a martingale! Every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$ and continue.
22
Repeated Edge Sampling
Sampling rule at round $i$: $\mathbf{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\mathbf{L}_e^{(i-1)}$ with probability $p_e^{(i-1)}$, and $\mathbf{0}$ otherwise. Expectation: $\mathbb{E}[\mathbf{L}_e^{(i)}] = \mathbf{L}_e^{(i-1)}$. Zero-mean: $\mathbf{X}_e^{(i)} = \mathbf{L}_e^{(i)} - \mathbf{L}_e^{(i-1)}$, so $\mathbf{X}^{(i)} = \mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} = \sum_e \mathbf{X}_e^{(i)}$.
24
Repeated Edge Sampling
As before, $\mathbf{X}^{(i)} = \mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} = \sum_e \mathbf{X}_e^{(i)}$ with $\mathbf{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\mathbf{L}_e^{(i-1)}$ w.p. $p_e^{(i-1)}$ and $\mathbf{0}$ otherwise; now it is also a martingale: $\mathbb{E}[\mathbf{X}^{(i)} \mid \text{prev. steps}] = \mathbf{0}$.
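A toy check (mine, not from the talk) that repeated subsampling really is a matrix martingale: conditioned on the current edge set, the next round keeps each edge with a fixed probability $p$ and rescales it by $1/p$, so the conditional expectation of the next Laplacian is the current one. The complete graph, the fixed $p$, and the inline Laplacian builder are arbitrary choices for the demo.

```python
# Average many "next rounds" from one fixed current state and compare to L^(i-1).
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

rng = np.random.default_rng(1)
n, p = 6, 0.5
round0 = [(i, j, 1.0) for i in range(n) for j in range(i + 1, n)]   # K_6
one_round = lambda es: [(i, j, w / p) for i, j, w in es if rng.random() < p]

current = one_round(round0)                    # a fixed "previous steps" state
avg = np.mean([lap(n, one_round(current)) for _ in range(20000)], axis=0)
print(np.abs(avg - lap(n, current)).max())     # close to 0: conditional mean is L^(i-1)
```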
25
Repeated Edge Sampling
It works, and it couldn’t be simpler [K.PPS’17]: $n\varepsilon^{-2}\log n$ space, an $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shave a log off many algorithms that do repeated matrix sampling…
26
Approximation? $\mathbf{L}_H \approx_{0.5} \mathbf{L}_G$ means $\mathbf{L}_H \preceq 1.5\,\mathbf{L}_G$ and $\mathbf{L}_G \preceq 1.5\,\mathbf{L}_H$. Is $\|\mathbf{L}_H - \mathbf{L}_G\| \le 0.5$ the right condition? The relevant quantity is the relative one, e.g. $\|\mathbf{L}_G^{-1/2}(\mathbf{L}_H - \mathbf{L}_G)\mathbf{L}_G^{-1/2}\| \le 0.1$.
27
Isotropic position: define $\overline{\mathbf{Z}} := \mathbf{L}_G^{-1/2}\,\mathbf{Z}\,\mathbf{L}_G^{-1/2}$. The goal is now $\|\overline{\mathbf{L}}_H - \overline{\mathbf{L}}_G\| \le 0.1$, i.e. $\|\overline{\mathbf{L}}_H - \mathbf{I}\| \le 0.1$.
28
Concentration of Matrix Martingales
Matrix Freedman’s Inequality. Random $\mathbf{X} = \sum_i \sum_e \mathbf{X}_e^{(i)}$ with $\mathbf{X}_e^{(i)} \in \mathbb{R}^{d \times d}$ symmetric; $\mathbb{E}[\mathbf{X}_e^{(i)} \mid \text{previous steps}] = \mathbf{0}$, $\|\mathbf{X}_e^{(i)}\| \le r$, and $\mathbb{P}[\|\sum_i \sum_e \mathbb{E}[(\mathbf{X}_e^{(i)})^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
30
Norm Control. Can there be many edges with large norm? Each $\overline{\mathbf{L}}_e^{(0)}$ is rank-1 PSD, so its norm equals its trace: $\sum_e \|\overline{\mathbf{L}}_e^{(0)}\| = \sum_e \operatorname{Tr}(\overline{\mathbf{L}}_e^{(0)}) = \operatorname{Tr}(\sum_e \overline{\mathbf{L}}_e^{(0)}) = \operatorname{Tr}(\overline{\mathbf{L}}^{(0)}) = \operatorname{Tr}(\mathbf{I}) = n-1$. Can afford $r = \frac{1}{\log n}$, $p_e = \|\overline{\mathbf{L}}_e^{(0)}\|\log n$, so $\sum_e p_e = n\log n$. Sampling rule: $\mathbf{L}_e^{(1)} = \frac{1}{p_e}\mathbf{L}_e^{(0)}$ with probability $p_e$, and $\mathbf{0}$ otherwise.
31
Norm Control: need $\|\mathbf{X}_e^{(i)}\| \le r$. If we lose track of the original graph, we can’t estimate the norm. Stopping time argument; bad regions.
32
Variance Control. $\mathbb{E}[(\mathbf{X}_e^{(1)})^2] \preceq r\,\overline{\mathbf{L}}_e^{(0)}$, so $\sum_e \mathbb{E}[(\mathbf{X}_e^{(1)})^2] \preceq \sum_e r\,\overline{\mathbf{L}}_e^{(0)} = r\,\mathbf{I}$. But then edges get resampled: roughly a geometric sequence for the variance.
33
Resparsification: every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$ and continue. [K.PPS’17]: $n\varepsilon^{-2}\log n$ space, an $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shave a log off many algorithms that do repeated matrix sampling…
34
Laplacian Linear Equations
[ST04]: solving Laplacian linear equations in $\widetilde{O}(m)$ time. ⋮ [KS16]: a simple algorithm.
35
Solving a Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Gaussian Elimination: find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top \mathfrak{U} = \mathbf{L}$. Then $\mathbf{x} = \mathfrak{U}^{-1}\mathfrak{U}^{-\top}\mathbf{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{U}^{-\top}$.
36
Solving a Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Approximate Gaussian Elimination: find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top \mathfrak{U} \approx_{0.5} \mathbf{L}$, with $\mathfrak{U}$ sparse. $O(\log\frac{1}{\varepsilon})$ iterations to get an $\varepsilon$-approximate solution $\tilde{\mathbf{x}}$.
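A sketch (mine) of how such a factorization gets used: preconditioned iterative refinement, $\mathbf{x} \leftarrow \mathbf{x} + \mathfrak{U}^{-1}\mathfrak{U}^{-\top}(\mathbf{b} - \mathbf{L}\mathbf{x})$. For simplicity the toy system below is a nonsingular SDD matrix standing in for a Laplacian, and the "approximate" factor `U` is a deliberately perturbed exact Cholesky factor rather than the talk's sparse construction.

```python
import numpy as np

def refine(L, U, b, iters=40):
    # x <- x + U^{-1} U^{-T} (b - L x): apply the factorization as a preconditioner.
    x = np.zeros_like(b)
    for _ in range(iters):
        r = b - L @ x
        x = x + np.linalg.solve(U, np.linalg.solve(U.T, r))
    return x

n = 50
A = np.diag(np.full(n, 2.1)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
U = np.linalg.cholesky(A + 0.1 * np.eye(n)).T   # a deliberately inexact upper factor
b = np.random.default_rng(2).normal(size=n)
print(np.linalg.norm(A @ refine(A, U, b) - b))  # small residual after a few dozen steps
```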
37
Approximate Gaussian Elimination
Theorem [KS]. When $\mathbf{L}$ is a Laplacian matrix with $m$ non-zeros, we can find in $O(m\log^3 n)$ time an upper triangular matrix $\mathfrak{U}$ with $O(m\log^3 n)$ non-zeros, s.t. w.h.p. $\mathfrak{U}^\top \mathfrak{U} \approx_{0.5} \mathbf{L}$.
38
Additive View of Gaussian Elimination
Find $\mathfrak{U}$, an upper triangular matrix, s.t. $\mathfrak{U}^\top \mathfrak{U} = \mathbf{M}$. (The matrix $\mathbf{M}$ is shown on the slide.)
39
Additive View of Gaussian Elimination
Find the rank-1 matrix that agrees with 𝑴 on the first row and column.
40
Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the first variable.
41
Additive View of Gaussian Elimination
The remaining matrix is PSD.
42
Additive View of Gaussian Elimination
Find rank-1 matrix that agrees with our matrix on the next row and column.
43
Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the second variable.
44
Additive View of Gaussian Elimination
Repeat until all of $\mathbf{M}$ is written as a sum of rank-1 terms.
46
Additive View of Gaussian Elimination
Repeat until all of $\mathbf{M}$ is written as a sum of rank-1 terms: $\mathbf{M} = \mathfrak{U}^\top \mathfrak{U}$.
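A minimal sketch (mine) of this additive view: repeatedly find the rank-1 matrix that agrees with the remainder on its leading row and column, subtract it, and stack the corresponding vectors as the rows of an upper triangular $\mathfrak{U}$. The tolerance and the example matrix are arbitrary.

```python
# Outer-product (additive) Cholesky: U^T U = M for a PSD matrix M.
import numpy as np

def additive_cholesky(M):
    M = M.astype(float).copy()
    n = M.shape[0]
    U = np.zeros_like(M)
    for k in range(n):
        pivot = M[k, k]
        if pivot <= 1e-12:            # nothing left to eliminate in this column
            continue
        u = M[k, :] / np.sqrt(pivot)  # rank-1 term u u^T agrees with row/column k
        U[k, :] = u
        M -= np.outer(u, u)           # eliminate variable k; the remainder stays PSD
    return U

M = np.array([[4., 2., 0.], [2., 3., 1.], [0., 1., 2.]])
U = additive_cholesky(M)
print(np.allclose(U.T @ U, M))        # True
```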
47
Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians? The remaining matrix is always Laplacian (a new Laplacian!).
50
Why is Gaussian Elimination Slow?
Solving $\mathbf{L}\mathbf{x} = \mathbf{b}$ by Gaussian Elimination can take $\Omega(n^3)$ time. The main issue is fill.
52
Why is Gaussian Elimination Slow?
The fill: eliminating a vertex $v$ creates a clique on the neighbors of $v$ (a new Laplacian).
53
Why is Gaussian Elimination Slow?
But Laplacian cliques can be sparsified!
54
Gaussian Elimination Pick a vertex 𝑣 to eliminate
Add the clique created by eliminating 𝑣 Repeat until done
55
Approximate Gaussian Elimination
Pick a vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done
56
Approximate Gaussian Elimination
Pick a random vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done
57
Approximate Gaussian Elimination
Pick a random vertex $v$ to eliminate. Sample the clique created by eliminating $v$. Repeat until done. Resembles randomized Incomplete Cholesky.
58
How to Sample a Clique. For each edge $(1, v)$: pick an edge $(1, u)$ with probability $\sim w_u$, and insert the edge $(v, u)$ with weight $\frac{w_u w_v}{w_u + w_v}$. (Figure: a star around vertex 1 with neighbors 2, 3, 4, 5; the original slides animate the sampling.)
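A sketch of my literal reading of the rule on this slide, applied to the star of edges around the eliminated vertex. The normalization of the picking probabilities and the handling of the case $u = v$ are my assumptions, not details stated on the slide.

```python
# Sample the clique formed when vertex 1 (with star edges (1, v) of weight w_v)
# is eliminated: for each edge (1, v), pick a partner edge (1, u) with
# probability proportional to w_u and add (v, u) with weight w_u*w_v/(w_u+w_v).
import numpy as np

def sample_clique(star, rng):
    """star: dict v -> w_v of edges incident to the eliminated vertex."""
    verts = list(star)
    w = np.array([star[v] for v in verts], dtype=float)
    probs = w / w.sum()
    new_edges = []
    for v in verts:
        u = verts[rng.choice(len(verts), p=probs)]
        if u != v:                                    # skip self-pairings (assumption)
            new_edges.append((v, u, star[u] * star[v] / (star[u] + star[v])))
    return new_edges

print(sample_clique({2: 1.0, 3: 2.0, 4: 1.0, 5: 3.0}, np.random.default_rng(3)))
```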
67
Approximating Matrices by Sampling
Goal: $\mathfrak{U}^\top \mathfrak{U} \approx \mathbf{L}$. Approach: ensure $\mathbb{E}[\mathfrak{U}^\top \mathfrak{U}] = \mathbf{L}$ and show $\mathfrak{U}^\top \mathfrak{U}$ is concentrated around its expectation. Gives $\mathfrak{U}^\top \mathfrak{U} \approx \mathbf{L}$ w. high probability.
68
Approximating Matrices in Expectation
Consider eliminating the first variable: $\mathbf{L}^{(0)} = \mathbf{c}\mathbf{c}^\top + (\text{remaining graph} + \text{clique})$, i.e. the original Laplacian splits into a rank-1 term plus the Laplacian of the remaining graph with the new clique.
69
Approximating Matrices in Expectation
Consider eliminating the first variable with a sparsified clique: $\mathbb{E}[\mathbf{L}^{(1)}] = \mathbf{c}\mathbf{c}^\top + \mathbb{E}[\text{remaining graph} + \text{sparsified clique}]$ (rank-1 term plus the remaining graph with the sparsified clique).
72
Approximating Matrices in Expectation
Suppose the sparsified clique agrees with the true clique in expectation: $\mathbb{E}[\text{sparsified clique}] = \text{clique}$.
73
Approximating Matrices in Expectation
Then $\mathbb{E}[\mathbf{L}^{(1)}] = \mathbf{c}\mathbf{c}^\top + (\text{remaining graph} + \text{clique}) = \mathbf{L}^{(0)}$.
74
Approximating Matrices in Expectation
Let $\mathbf{L}^{(i)}$ be our approximation after $i$ eliminations. If we ensure at each step $\mathbb{E}[\text{sparsified clique}] = \text{clique}$, then $\mathbb{E}[\mathbf{L}^{(i)}] = \mathbf{L}^{(i-1)}$, i.e. $\mathbb{E}[\mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} \mid \text{previous steps}] = \mathbf{0}$.
75
Predictable Quadratic Variation
Predictable quadratic variation: $\mathbf{W} = \sum_i \sum_e \mathbb{E}[(\mathbf{X}_e^{(i)})^2 \mid \text{prev. steps}]$. Want to show $\mathbb{P}[\|\mathbf{W}\| > \sigma^2] \le \delta$. Promise: $\mathbb{E}[\mathbf{X}_e^2] \preceq r\,\mathbf{L}_e$.
76
Sample Variance. The expected Laplacian of the clique created by eliminating a random vertex $v$ is dominated by the expected Laplacian of the edges incident to $v$: $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbf{L}_e] \preceq \mathbb{E}_v[\sum_{e \ni v} \mathbf{L}_e]$.
77
Sample Variance. $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbf{L}_e] \preceq \mathbb{E}_v[\sum_{e \ni v} \mathbf{L}_e] = \frac{2 \cdot (\text{current Laplacian})}{\#\text{vertices}} \preceq \frac{4\,\mathbf{L}}{\#\text{vertices}} = \frac{4\,\mathbf{I}}{\#\text{vertices}}$ in isotropic position. WARNING: only true w.h.p.
78
Sample Variance. Recall $\mathbb{E}[\mathbf{X}_e^2] \preceq r\,\mathbf{L}_e$. Putting it together, the variance in one round of elimination is $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbb{E}[\mathbf{X}_e^2]] \preceq r \cdot \frac{4\,\mathbf{I}}{\#\text{vertices}}$.
79
Sample Variance. Variance per round of elimination $\preceq r \cdot \frac{4\,\mathbf{I}}{\#\text{vertices}}$; summed over the rounds of elimination, the total variance is $\preceq 4\,r\log n \cdot \mathbf{I}$.
80
Directed Laplacian of a Graph
(Figure: a small directed example graph with edge weights 1, 1, 2 and its directed Laplacian.)
81
(Weighted) Eulerian Graph
Weighted in-degree = weighted out-degree at every vertex. (Figure: an example with edge weights 1, 2, 1.)
82
Eulerian Laplacian of a Graph
(Figure: the Eulerian Laplacian of a small example graph with unit edge weights.)
83
Solving an Eul. Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Gaussian Elimination: find lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} = \mathbf{L}$. Then $\mathbf{x} = \mathfrak{U}^{-1}\mathfrak{L}^{-1}\mathbf{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{L}^{-1}$.
84
Solving an Eul. Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Approximate Gaussian Elimination [CKKPPRS FOCS’18]: find lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$, with $\mathfrak{L}, \mathfrak{U}$ sparse. $O(\log\frac{1}{\varepsilon})$ iterations to get an $\varepsilon$-approximate solution $\tilde{\mathbf{x}}$.
85
Gaussian Elimination on Eulerian Laplacians
(Figure: $\mathbf{L}$ for a small Eulerian example around a vertex $v$; the original slides animate eliminating $v$, which introduces fractional edge weights such as 1/3 and 2/3.)
89
Gaussian Elimination on Eulerian Laplacians
Sparsify: correct in expectation, output always Eulerian. (Figure: the eliminated graph $\approx$ a sparser Eulerian graph.)
90
Gaussian Elimination on Eulerian Laplacians
While edges remain: pick a minimum weight edge, pair it with a random neighboring edge, and reduce the mass on both by $w_{\min}$. (The original slides animate this on the small example around $v$.)
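A heavily hedged sketch of my literal reading of these three bullets, applied to the edges incident to the eliminated vertex $v$: pair the minimum-weight edge with a random edge on the opposite side, route $w_{\min}$ of mass through $v$ as a new edge, and decrement both. The weight-proportional choice of partner and the weight given to the new edge are my assumptions; the paper's actual procedure may differ.

```python
# Eliminate v from an Eulerian graph: in_edges maps a -> weight of (a, v),
# out_edges maps b -> weight of (v, b). Eulerian-ness (in-mass = out-mass at v)
# makes the loop terminate, and the through-edges preserve degrees.
import numpy as np

def eliminate_vertex(in_edges, out_edges, rng):
    new_edges = []
    while in_edges and out_edges:
        candidates = [('in', a) for a in in_edges] + [('out', b) for b in out_edges]
        side, x = min(candidates,
                      key=lambda t: in_edges[t[1]] if t[0] == 'in' else out_edges[t[1]])
        other = out_edges if side == 'in' else in_edges
        keys = list(other)
        wts = np.array([other[k] for k in keys], dtype=float)
        y = keys[rng.choice(len(keys), p=wts / wts.sum())]   # random partner (assumption)
        w_min = (in_edges if side == 'in' else out_edges)[x]
        a, b = (x, y) if side == 'in' else (y, x)
        new_edges.append((a, b, w_min))                      # new edge a -> b through v
        for d, k in ((in_edges, a), (out_edges, b)):
            d[k] -= w_min
            if d[k] <= 1e-12:
                del d[k]
    return new_edges

print(eliminate_vertex({1: 1.0, 2: 2.0}, {3: 1.0, 4: 2.0}, np.random.default_rng(4)))
```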
91
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 2 𝑣 1 1 1 ≈
92
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 2 𝑣 1 1 1 1 ≈
93
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈
94
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈
95
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈ 1
96
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 ≈ 1
97
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 𝑣 1 1 ≈ 1
98
Gaussian Elimination on Eulerian Laplacians
Our sparsification: correct in expectation, output always Eulerian. Average over $\log^3(n)$ runs to reduce variance.
99
Approximating Matrices by Sampling
Goal: $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$. Approach: $\mathbb{E}[\mathfrak{L}\mathfrak{U}] = \mathbf{L}$; show $\mathfrak{L}\mathfrak{U}$ is concentrated around its expectation; gives $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$ w. high probability (?). But what does $\approx$ mean? $\mathbf{L}$ is not PSD?!? Come to my talk and I’ll tell you!
102
Difficulties with Approximation
(Figure: a directed example graph with unit-weight edges and one edge of weight 6.) Variance manageable.
103
Difficulties with Approximation
(Figure: a directed example graph with all unit-weight edges.) Variance problematic!
104
Difficulties with Approximation
Come to my talk and hear the rest of the story! [Cohen-Kelner-Kyng-Peebles-Peng-Rao-Sidford FOCS’18]
105
Spanning Trees of a Graph
106
Spanning Trees of a (Multi)-Graph
Pick a random tree?
107
Random Spanning Trees. Graph $G$, tree distribution. Marginal probability of an edge: $p_e = \|\overline{\mathbf{L}}_e\|$. Similar to sparsification?!
108
Random Spanning Trees. Graph $G$, tree distribution, marginal probability of an edge $p_e = \|\overline{\mathbf{L}}_e\|$. With $\mathbf{L}_G = \sum_{e \in G} \mathbf{L}_e$, consider $\mathbf{L}_T = \sum_{e \in T} \frac{1}{p_e}\mathbf{L}_e$; then $\mathbb{E}[\mathbf{L}_T] = \mathbf{L}_G$.
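A brute-force check (mine, on a made-up 4-vertex graph) that the marginal of each edge under the weighted random spanning tree distribution equals its leverage score $w_e\,\chi_e^\top \mathbf{L}^{+}\chi_e$, which is what makes $\mathbf{L}_T$ unbiased. Exhaustive enumeration and the inline Laplacian builder keep the block self-contained.

```python
# Enumerate all spanning trees, weight each by the product of its edge weights,
# and compare edge marginals with leverage scores.
import itertools
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

n = 4
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (0, 3, 1.0), (0, 2, 1.0)]
Lp = np.linalg.pinv(lap(n, edges))

def is_tree(subset):
    return np.linalg.matrix_rank(lap(n, list(subset))) == n - 1

trees = [t for t in itertools.combinations(edges, n - 1) if is_tree(t)]
weights = np.array([np.prod([w for _, _, w in t]) for t in trees])
probs = weights / weights.sum()

for i, j, w in edges:
    chi = np.zeros(n); chi[i] = 1.0; chi[j] = -1.0
    lev = w * chi @ Lp @ chi
    marginal = sum(p for t, p in zip(trees, probs) if (i, j, w) in t)
    print(f"edge ({i},{j}): marginal {marginal:.4f}  leverage {lev:.4f}")
```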
109
Random Spanning Trees. Concentration? Is $\mathbf{L}_T \approx_{0.5} \mathbf{L}_G$? NO. Is $\mathbf{L}_T \preceq \log n \cdot \mathbf{L}_G$? YES, w.h.p. And $\frac{1}{t}\sum_{i=1}^{t} \mathbf{L}_{T_i} \approx_\varepsilon \mathbf{L}_G$ for $t = \varepsilon^{-2}\log^2 n$ [Kyng-Song FOCS’18]. Come see my talk!
110
Sometimes You Get a Martingale for Free
Random variables $Z_1, \ldots, Z_k$. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is “stable” under small changes to $Z_1, \ldots, Z_k$: is $f(Z_1, \ldots, Z_k) \approx \mathbb{E}[f(Z_1, \ldots, Z_k)]$?
111
Doob Martingales
Random variables $Z_1, \ldots, Z_k$, NOT independent. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is “stable” under small changes. Pick an outcome $z_1, \ldots, z_k$ and reveal it one coordinate at a time: $\mathbb{E}[f(Z_1,\ldots,Z_k)] \to \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1] \to \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1, Z_2 = z_2] \to \cdots \to f(z_1,\ldots,z_k)$.
112
Doob Martingales. Since $\mathbb{E}[f(Z_1,\ldots,Z_k)] = \mathbb{E}_{z_1}\big[\mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1]\big]$, define $Y_0 = \mathbb{E}[f(Z_1,\ldots,Z_k)]$, $Y_1 = \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1]$, $Y_2 = \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1, Z_2 = z_2]$, and so on. Then $\mathbb{E}[Y_{i+1} - Y_i \mid \text{prev. steps}] = 0$: a martingale, despite $Z_1,\ldots,Z_k$ being NOT independent.
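A toy illustration (mine, not from the talk) of a Doob martingale with dependent coordinates: $Z$ is a uniformly random arrangement of a small multiset, $f$ is an arbitrary function of the whole outcome, and $Y_i = \mathbb{E}[f \mid Z_1, \ldots, Z_i]$ is computed by brute-force enumeration; averaging the next value of $Y$ over the conditional distribution recovers the current one.

```python
# Verify the martingale property of Y_i = E[f(Z) | Z_1..Z_i] by enumeration.
import itertools
import numpy as np

pool = (0, 0, 1, 1, 1)
outcomes = sorted(set(itertools.permutations(pool)))      # uniform over distinct orders
f = lambda z: max(len(list(g)) for _, g in itertools.groupby(z))  # longest run length

def Y(prefix):
    consistent = [z for z in outcomes if z[:len(prefix)] == prefix]
    return np.mean([f(z) for z in consistent])

prefix = (1,)
nexts = [z[len(prefix)] for z in outcomes if z[:len(prefix)] == prefix]
avg_next = np.mean([Y(prefix + (v,)) for v in nexts])     # E[Y_{i+1} | prefix]
print(Y(prefix), avg_next)                                # equal: increments have mean 0
```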
113
Concentration of Random Spanning Trees
Random variables $Z_1, \ldots, Z_{n-1}$: the positions of the $n-1$ edges of a random spanning tree. $f(Z_1, \ldots, Z_{n-1}) = \mathbf{L}_T$. Similar to [Peres-Pemantle ’14].
114
Conditioning Changes Distributions?
(Figure: a graph and its spanning tree distribution over all trees.)
115
Conditioning Changes Distributions?
(Figure: the same graph; conditioning on edges chosen so far changes the tree distribution from the full one to a conditional one.)
116
Summary Matrix martingales: a natural fit for algorithmic analysis
Understanding the Predictable Quadratic Variation is key. Some results using matrix martingales:
Cohen, Musco, Pachocki ’16 – online row sampling
Kyng, Sachdeva ’17 – approximate Gaussian elimination
Kyng, Peng, Pachocki, Sachdeva ’17 – semi-streaming graph sparsification
Cohen, Kelner, Kyng, Peebles, Peng, Rao, Sidford ’18 – solving directed Laplacian equations
Kyng, Song ’18 – matrix Chernoff bound for negatively dependent random variables
117
Thanks!
118
This is really the end
119
Older slides