1
Matrix Martingales in Randomized Numerical Linear Algebra
Speaker: Rasmus Kyng, Harvard University. Sep 27, 2018
2
Concentration of Scalar Random Variables
The $X_i$ are independent with $\mathbb{E}[X_i] = 0$. Is $X = \sum_i X_i \approx 0$ with high probability?
3
Concentration of Scalar Random Variables
Bernstein’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$; the $X_i$ are independent, $\mathbb{E}[X_i] = 0$, $|X_i| \le r$, and $\sum_i \mathbb{E}[X_i^2] \le \sigma^2$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) \le \tau$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(1/\tau)$.
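To make the tail bound concrete, here is a minimal numerical sanity check (mine, not from the talk): sum bounded, independent, zero-mean variables and compare the empirical tail with the Bernstein bound. The uniform increments and all constants are illustrative choices.

```python
# Empirically compare P[|X| > eps] with 2*exp(-(eps^2/2)/(r*eps + sigma^2))
# for X a sum of n independent, zero-mean variables bounded by r.
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps, r = 200, 100_000, 0.5, 0.02
X_i = rng.uniform(-r, r, size=(trials, n))      # independent, mean zero, |X_i| <= r
X = X_i.sum(axis=1)

sigma2 = n * r**2 / 3                           # exact variance of the sum
empirical = np.mean(np.abs(X) > eps)
bernstein = 2 * np.exp(-(eps**2 / 2) / (r * eps + sigma2))
print(f"empirical tail {empirical:.2e} vs Bernstein bound {bernstein:.2e}")
```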
4
Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. Instead of requiring the $X_i$ to be independent with $\mathbb{E}[X_i] = 0$, require only $\mathbb{E}[X_i \mid X_1, \ldots, X_{i-1}] = 0$.
5
Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$; the martingale condition: $\mathbb{E}[X_i \mid \text{previous steps}] = 0$.
6
Concentration of Scalar Martingales
Freedman’s Inequality vs. Bernstein’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. Bernstein’s conditions “$X_i$ independent, $\mathbb{E}[X_i] = 0$” and “$\sum_i \mathbb{E}[X_i^2] \le \sigma^2$” are replaced by “$\mathbb{E}[X_i \mid \text{previous steps}] = 0$” and “$\mathbb{P}[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2] \le \delta$”; the bound $|X_i| \le r$ stays, and the conclusion becomes $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
7
Concentration of Scalar Martingales
Freedman’s Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$, with $\mathbb{E}[X_i \mid \text{previous steps}] = 0$, $|X_i| \le r$, and $\mathbb{P}[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2] \le \delta$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
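A scalar toy example (my own, not from the talk) of this setting: the step size depends on the past, the sign is a fresh fair coin, so each increment has conditional mean zero and the predictable quadratic variation is the running sum of squared step sizes. All parameters are arbitrary; here the variation is bounded deterministically, so $\delta = 0$.

```python
# Adaptive-step random walk: E[X_i | past] = 0, predictable quadratic
# variation = sum of s_i^2. Compare the empirical tail with Freedman's bound.
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps, r = 400, 100_000, 0.5, 0.01
X = np.zeros(trials)
quad_var = np.zeros(trials)
for _ in range(n):
    s = np.where(np.abs(X) < 0.25, r, r / 2)    # step size chosen from the past
    X += s * rng.choice([-1.0, 1.0], size=trials)
    quad_var += s ** 2

sigma2 = quad_var.max()                         # deterministic bound, so delta = 0
empirical = np.mean(np.abs(X) > eps)
freedman = 2 * np.exp(-(eps ** 2 / 2) / (r * eps + sigma2))
print(f"empirical tail {empirical:.2e} vs Freedman bound {freedman:.2e}")
```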
8
Concentration of Matrix Random Variables
Matrix Bernstein’s Inequality (Tropp ’11). Random $\mathbf{X} = \sum_i \mathbf{X}_i$ with $\mathbf{X}_i \in \mathbb{R}^{d \times d}$ symmetric; the $\mathbf{X}_i$ are independent, $\mathbb{E}[\mathbf{X}_i] = \mathbf{0}$, $\|\mathbf{X}_i\| \le r$, and $\|\sum_i \mathbb{E}[\mathbf{X}_i^2]\| \le \sigma^2$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right)$.
9
Concentration of Matrix Martingales
Matrix Freedman’s Inequality (Tropp ’11). Random $\mathbf{X} = \sum_i \mathbf{X}_i$ with $\mathbf{X}_i \in \mathbb{R}^{d \times d}$ symmetric; $\mathbb{E}[\mathbf{X}_i \mid \text{previous steps}] = \mathbf{0}$, $\|\mathbf{X}_i\| \le r$, and $\mathbb{P}[\|\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta \le \tau + \delta$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(d/\tau)$.
10
Concentration of Matrix Martingales
Matrix Freedman’s Inequality (Tropp ’11): the condition $\mathbb{P}[\|\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$ is a bound on the predictable quadratic variation $\sum_i \mathbb{E}[\mathbf{X}_i^2 \mid \text{prev. steps}]$.
11
Laplacian Matrices. Graph $G = (V, E)$, edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$.
The Laplacian $L = \sum_{e \in E} L_e$ is an $n \times n$ matrix. (Figure: a small example graph on vertices $a$, $b$, $c$.)
13
Laplacian Matrices. Graph $G = (V, E)$, edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$.
$\mathbf{A}$: weighted adjacency matrix of the graph, $\mathbf{A}_{ij} = w_{ij}$. $\mathbf{D}$: diagonal matrix of weighted degrees, $\mathbf{D}_{ii} = \sum_j w_{ij}$. $\mathbf{L} = \mathbf{D} - \mathbf{A}$.
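For concreteness, a minimal sketch (mine, with made-up example weights) of building $\mathbf{L} = \mathbf{D} - \mathbf{A}$ from a weighted edge list:

```python
# Build L = D - A for an undirected weighted graph given as (i, j, w) triples.
import numpy as np

def laplacian(n, edges):
    """edges: list of (i, j, w) with 0-based vertices and weight w > 0."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, j] += w
        A[j, i] += w
    D = np.diag(A.sum(axis=1))          # weighted degrees on the diagonal
    return D - A

L = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])
print(L)                                 # rows sum to zero; off-diagonals are -w_ij
```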
14
Laplacian Matrices. Symmetric matrix $L$ with all off-diagonals non-positive and $L_{ii} = \sum_{j \ne i} |L_{ij}|$.
15
Laplacian of a Graph. (Figure: a small example graph with edge weights 1, 1, 2 and its Laplacian.)
16
Approximation between graphs
PSD Order: $\mathbf{A} \preceq \mathbf{B}$ iff for all $\mathbf{x}$, $\mathbf{x}^\top \mathbf{A} \mathbf{x} \le \mathbf{x}^\top \mathbf{B} \mathbf{x}$. Matrix Approximation: $\mathbf{A} \approx_\varepsilon \mathbf{B}$ iff $\mathbf{A} \preceq (1+\varepsilon)\mathbf{B}$ and $\mathbf{B} \preceq (1+\varepsilon)\mathbf{A}$. Graphs: $G \approx_\varepsilon H$ iff $\mathbf{L}_G \approx_\varepsilon \mathbf{L}_H$. Implies multiplicative approximation to cuts.
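A small sketch (mine) of checking $\mathbf{A} \approx_\varepsilon \mathbf{B}$ numerically, assuming dense symmetric PSD matrices with the same null space and an exact pseudo-inverse; the tolerance and the use of `eigvals` are simplifications for illustration.

```python
# A ≈_eps B iff all nonzero generalized eigenvalues of (A, B) lie in
# [1/(1+eps), 1+eps], i.e. A ⪯ (1+eps)B and B ⪯ (1+eps)A.
import numpy as np

def approx_eq(A, B, eps):
    vals = np.linalg.eigvals(np.linalg.pinv(B) @ A).real
    vals = vals[vals > 1e-12]            # ignore the shared null space
    return vals.max() <= 1 + eps and vals.min() >= 1 / (1 + eps)

A = np.diag([2.0, 1.0, 1.0])
print(approx_eq(A, 1.05 * A, eps=0.1), approx_eq(A, 2.0 * A, eps=0.1))  # True False
```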
17
Sparsification of Laplacians
Every graph $G^{(0)}$ on $n$ vertices, for any $\varepsilon \in (0,1)$, has a sparsifier $G^{(1)} \approx_\varepsilon G^{(0)}$ with $O(n\varepsilon^{-2})$ edges [BSS’09, LS’17]. Sampling edges independently by leverage scores w.h.p. gives a sparsifier with $O(n\varepsilon^{-2}\log n)$ edges [Spielman-Srivastava’08]. Many applications!
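A rough sketch (mine) of the independent leverage-score sampling just described. The dense pseudo-inverse, the oversampling constant `c`, and the inline Laplacian builder are simplifying assumptions; the actual algorithm of [Spielman-Srivastava’08] estimates effective resistances instead of computing them exactly.

```python
# Keep each edge independently with p_e ~ leverage score w_e * R_eff(e),
# reweighting kept edges by 1/p_e so the Laplacian is preserved in expectation.
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

def sparsify(n, edges, eps, c=4.0, seed=0):
    Lp = np.linalg.pinv(lap(n, edges))
    rng = np.random.default_rng(seed)
    kept = []
    for i, j, w in edges:
        chi = np.zeros(n); chi[i] = 1.0; chi[j] = -1.0
        lev = w * chi @ Lp @ chi                     # leverage score of edge (i, j)
        p = min(1.0, c * np.log(n) * lev / eps**2)
        if rng.random() < p:
            kept.append((i, j, w / p))               # reweight to stay unbiased
    return kept
```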
18
Graph Semi-Streaming Sparsification
$m$ edges of a graph arriving as a stream $(a_1, b_1, w_1), (a_2, b_2, w_2), \ldots$ GOAL: maintain an $\varepsilon$-sparsifier, minimize working space (and time). [Ahn-Guha ’09]: $n\varepsilon^{-2}\log n \log\frac{m}{n}$ space, a $n\varepsilon^{-2}\log n \log\frac{m}{n}$-edge cut $\varepsilon$-sparsifier; used scalar martingales.
19
Independent Edge Samples
Sampling rule: $\mathbf{L}_e^{(1)} = \frac{1}{p_e}\mathbf{L}_e^{(0)}$ with probability $p_e$, and $\mathbf{0}$ otherwise. Expectation: $\mathbb{E}[\mathbf{L}_e^{(1)}] = \mathbf{L}_e^{(0)}$. Zero-mean: $\mathbf{X}_e^{(1)} = \mathbf{L}_e^{(1)} - \mathbf{L}_e^{(0)}$, so $\mathbf{X}^{(1)} = \mathbf{L}^{(1)} - \mathbf{L}^{(0)} = \sum_e \mathbf{X}_e^{(1)}$.
20
Matrix Martingale? Zero mean? So what if we just sparsify again? It’s a martingale! Every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$ and continue.
22
Repeated Edge Sampling
Sampling rule at round $i$: $\mathbf{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\mathbf{L}_e^{(i-1)}$ with probability $p_e^{(i-1)}$, and $\mathbf{0}$ otherwise. Expectation: $\mathbb{E}[\mathbf{L}_e^{(i)}] = \mathbf{L}_e^{(i-1)}$. Zero-mean: $\mathbf{X}_e^{(i)} = \mathbf{L}_e^{(i)} - \mathbf{L}_e^{(i-1)}$, so $\mathbf{X}^{(i)} = \mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} = \sum_e \mathbf{X}_e^{(i)}$.
24
Repeated Edge Sampling
As before, $\mathbf{X}^{(i)} = \mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} = \sum_e \mathbf{X}_e^{(i)}$ with $\mathbf{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\mathbf{L}_e^{(i-1)}$ w.p. $p_e^{(i-1)}$ and $\mathbf{0}$ otherwise; now it is also a martingale: $\mathbb{E}[\mathbf{X}^{(i)} \mid \text{prev. steps}] = \mathbf{0}$.
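A toy check (mine, not from the talk) that repeated subsampling really is a matrix martingale: conditioned on the current edge set, the next round keeps each edge with a fixed probability $p$ and rescales it by $1/p$, so the conditional expectation of the next Laplacian is the current one. The complete graph, the fixed $p$, and the inline Laplacian builder are arbitrary choices for the demo.

```python
# Average many "next rounds" from one fixed current state and compare to L^(i-1).
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

rng = np.random.default_rng(1)
n, p = 6, 0.5
round0 = [(i, j, 1.0) for i in range(n) for j in range(i + 1, n)]   # K_6
one_round = lambda es: [(i, j, w / p) for i, j, w in es if rng.random() < p]

current = one_round(round0)                    # a fixed "previous steps" state
avg = np.mean([lap(n, one_round(current)) for _ in range(20000)], axis=0)
print(np.abs(avg - lap(n, current)).max())     # close to 0: conditional mean is L^(i-1)
```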
25
Repeated Edge Sampling
It works, and it couldn’t be simpler [K.PPS’17]: $n\varepsilon^{-2}\log n$ space, an $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shave a log off many algorithms that do repeated matrix sampling…
26
Approximation? $\mathbf{L}_H \approx_{0.5} \mathbf{L}_G$ means $\mathbf{L}_H \preceq 1.5\,\mathbf{L}_G$ and $\mathbf{L}_G \preceq 1.5\,\mathbf{L}_H$. Is $\|\mathbf{L}_H - \mathbf{L}_G\| \le 0.5$ the right condition? The relevant quantity is the relative one, e.g. $\|\mathbf{L}_G^{-1/2}(\mathbf{L}_H - \mathbf{L}_G)\mathbf{L}_G^{-1/2}\| \le 0.1$.
27
Isotropic position: define $\overline{\mathbf{Z}} := \mathbf{L}_G^{-1/2}\,\mathbf{Z}\,\mathbf{L}_G^{-1/2}$. The goal is now $\|\overline{\mathbf{L}}_H - \overline{\mathbf{L}}_G\| \le 0.1$, i.e. $\|\overline{\mathbf{L}}_H - \mathbf{I}\| \le 0.1$.
28
Concentration of Matrix Martingales
Matrix Freedman’s Inequality. Random $\mathbf{X} = \sum_i \sum_e \mathbf{X}_e^{(i)}$ with $\mathbf{X}_e^{(i)} \in \mathbb{R}^{d \times d}$ symmetric; $\mathbb{E}[\mathbf{X}_e^{(i)} \mid \text{previous steps}] = \mathbf{0}$, $\|\mathbf{X}_e^{(i)}\| \le r$, and $\mathbb{P}[\|\sum_i \sum_e \mathbb{E}[(\mathbf{X}_e^{(i)})^2 \mid \text{prev. steps}]\| > \sigma^2] \le \delta$. This gives $\mathbb{P}[\|\mathbf{X}\| > \varepsilon] \le d \cdot 2\exp\left(-\frac{\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
30
Norm Control. Can there be many edges with large norm? Each $\overline{\mathbf{L}}_e^{(0)}$ is rank-1 PSD, so its norm equals its trace: $\sum_e \|\overline{\mathbf{L}}_e^{(0)}\| = \sum_e \operatorname{Tr}(\overline{\mathbf{L}}_e^{(0)}) = \operatorname{Tr}(\sum_e \overline{\mathbf{L}}_e^{(0)}) = \operatorname{Tr}(\overline{\mathbf{L}}^{(0)}) = \operatorname{Tr}(\mathbf{I}) = n-1$. Can afford $r = \frac{1}{\log n}$, $p_e = \|\overline{\mathbf{L}}_e^{(0)}\|\log n$, so $\sum_e p_e = n\log n$. Sampling rule: $\mathbf{L}_e^{(1)} = \frac{1}{p_e}\mathbf{L}_e^{(0)}$ with probability $p_e$, and $\mathbf{0}$ otherwise.
31
Norm Control: need $\|\mathbf{X}_e^{(i)}\| \le r$. If we lose track of the original graph, we can’t estimate the norm. Stopping time argument; bad regions.
32
Variance Control. $\mathbb{E}[(\mathbf{X}_e^{(1)})^2] \preceq r\,\overline{\mathbf{L}}_e^{(0)}$, so $\sum_e \mathbb{E}[(\mathbf{X}_e^{(1)})^2] \preceq \sum_e r\,\overline{\mathbf{L}}_e^{(0)} = r\,\mathbf{I}$. But then edges get resampled: roughly a geometric sequence for the variance.
33
Resparsification: every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$ and continue. [K.PPS’17]: $n\varepsilon^{-2}\log n$ space, an $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shave a log off many algorithms that do repeated matrix sampling…
34
Laplacian Linear Equations
[ST04]: solving Laplacian linear equations in $\widetilde{O}(m)$ time. ⋮ [KS16]: a simple algorithm.
35
Solving a Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Gaussian Elimination: find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top \mathfrak{U} = \mathbf{L}$. Then $\mathbf{x} = \mathfrak{U}^{-1}\mathfrak{U}^{-\top}\mathbf{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{U}^{-\top}$.
36
Solving a Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Approximate Gaussian Elimination: find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top \mathfrak{U} \approx_{0.5} \mathbf{L}$, with $\mathfrak{U}$ sparse. $O(\log\frac{1}{\varepsilon})$ iterations to get an $\varepsilon$-approximate solution $\tilde{\mathbf{x}}$.
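A sketch (mine) of how such a factorization gets used: preconditioned iterative refinement, $\mathbf{x} \leftarrow \mathbf{x} + \mathfrak{U}^{-1}\mathfrak{U}^{-\top}(\mathbf{b} - \mathbf{L}\mathbf{x})$. For simplicity the toy system below is a nonsingular SDD matrix standing in for a Laplacian, and the "approximate" factor `U` is a deliberately perturbed exact Cholesky factor rather than the talk's sparse construction.

```python
import numpy as np

def refine(L, U, b, iters=40):
    # x <- x + U^{-1} U^{-T} (b - L x): apply the factorization as a preconditioner.
    x = np.zeros_like(b)
    for _ in range(iters):
        r = b - L @ x
        x = x + np.linalg.solve(U, np.linalg.solve(U.T, r))
    return x

n = 50
A = np.diag(np.full(n, 2.1)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
U = np.linalg.cholesky(A + 0.1 * np.eye(n)).T   # a deliberately inexact upper factor
b = np.random.default_rng(2).normal(size=n)
print(np.linalg.norm(A @ refine(A, U, b) - b))  # small residual after a few dozen steps
```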
37
Approximate Gaussian Elimination
Theorem [KS]. When $\mathbf{L}$ is a Laplacian matrix with $m$ non-zeros, we can find in $O(m\log^3 n)$ time an upper triangular matrix $\mathfrak{U}$ with $O(m\log^3 n)$ non-zeros, s.t. w.h.p. $\mathfrak{U}^\top \mathfrak{U} \approx_{0.5} \mathbf{L}$.
38
Additive View of Gaussian Elimination
Find $\mathfrak{U}$, an upper triangular matrix, s.t. $\mathfrak{U}^\top \mathfrak{U} = \mathbf{M}$. (The matrix $\mathbf{M}$ is shown on the slide.)
39
Additive View of Gaussian Elimination
Find the rank-1 matrix that agrees with 𝑴 on the first row and column.
40
Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the first variable.
41
Additive View of Gaussian Elimination
The remaining matrix is PSD.
42
Additive View of Gaussian Elimination
Find rank-1 matrix that agrees with our matrix on the next row and column.
43
Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the second variable.
44
Additive View of Gaussian Elimination
Repeat until all of $\mathbf{M}$ is written as a sum of rank-1 terms.
46
Additive View of Gaussian Elimination
Repeat until all of $\mathbf{M}$ is written as a sum of rank-1 terms: $\mathbf{M} = \mathfrak{U}^\top \mathfrak{U}$.
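A minimal sketch (mine) of this additive view: repeatedly find the rank-1 matrix that agrees with the remainder on its leading row and column, subtract it, and stack the corresponding vectors as the rows of an upper triangular $\mathfrak{U}$. The tolerance and the example matrix are arbitrary.

```python
# Outer-product (additive) Cholesky: U^T U = M for a PSD matrix M.
import numpy as np

def additive_cholesky(M):
    M = M.astype(float).copy()
    n = M.shape[0]
    U = np.zeros_like(M)
    for k in range(n):
        pivot = M[k, k]
        if pivot <= 1e-12:            # nothing left to eliminate in this column
            continue
        u = M[k, :] / np.sqrt(pivot)  # rank-1 term u u^T agrees with row/column k
        U[k, :] = u
        M -= np.outer(u, u)           # eliminate variable k; the remainder stays PSD
    return U

M = np.array([[4., 2., 0.], [2., 3., 1.], [0., 1., 2.]])
U = additive_cholesky(M)
print(np.allclose(U.T @ U, M))        # True
```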
47
Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians? The remaining matrix is always Laplacian (a new Laplacian!).
50
Why is Gaussian Elimination Slow?
Solving $\mathbf{L}\mathbf{x} = \mathbf{b}$ by Gaussian Elimination can take $\Omega(n^3)$ time. The main issue is fill.
52
Why is Gaussian Elimination Slow?
The fill: eliminating a vertex $v$ creates a clique on the neighbors of $v$ (a new Laplacian).
53
Why is Gaussian Elimination Slow?
But Laplacian cliques can be sparsified!
54
Gaussian Elimination Pick a vertex 𝑣 to eliminate
Add the clique created by eliminating 𝑣 Repeat until done
55
Approximate Gaussian Elimination
Pick a vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done
56
Approximate Gaussian Elimination
Pick a random vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done
57
Approximate Gaussian Elimination
Pick a random vertex $v$ to eliminate. Sample the clique created by eliminating $v$. Repeat until done. Resembles randomized Incomplete Cholesky.
58
How to Sample a Clique. For each edge $(1, v)$: pick an edge $(1, u)$ with probability $\sim w_u$, and insert the edge $(v, u)$ with weight $\frac{w_u w_v}{w_u + w_v}$. (Figure: a star around vertex 1 with neighbors 2, 3, 4, 5; the original slides animate the sampling.)
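A sketch of my literal reading of the rule on this slide, applied to the star of edges around the eliminated vertex. The normalization of the picking probabilities and the handling of the case $u = v$ are my assumptions, not details stated on the slide.

```python
# Sample the clique formed when vertex 1 (with star edges (1, v) of weight w_v)
# is eliminated: for each edge (1, v), pick a partner edge (1, u) with
# probability proportional to w_u and add (v, u) with weight w_u*w_v/(w_u+w_v).
import numpy as np

def sample_clique(star, rng):
    """star: dict v -> w_v of edges incident to the eliminated vertex."""
    verts = list(star)
    w = np.array([star[v] for v in verts], dtype=float)
    probs = w / w.sum()
    new_edges = []
    for v in verts:
        u = verts[rng.choice(len(verts), p=probs)]
        if u != v:                                    # skip self-pairings (assumption)
            new_edges.append((v, u, star[u] * star[v] / (star[u] + star[v])))
    return new_edges

print(sample_clique({2: 1.0, 3: 2.0, 4: 1.0, 5: 3.0}, np.random.default_rng(3)))
```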
67
Approximating Matrices by Sampling
Goal: $\mathfrak{U}^\top \mathfrak{U} \approx \mathbf{L}$. Approach: ensure $\mathbb{E}[\mathfrak{U}^\top \mathfrak{U}] = \mathbf{L}$ and show $\mathfrak{U}^\top \mathfrak{U}$ is concentrated around its expectation. Gives $\mathfrak{U}^\top \mathfrak{U} \approx \mathbf{L}$ w. high probability.
68
Approximating Matrices in Expectation
Consider eliminating the first variable: $\mathbf{L}^{(0)} = \mathbf{c}\mathbf{c}^\top + (\text{remaining graph} + \text{clique})$, i.e. the original Laplacian splits into a rank-1 term plus the Laplacian of the remaining graph with the new clique.
69
Approximating Matrices in Expectation
Consider eliminating the first variable with a sparsified clique: $\mathbb{E}[\mathbf{L}^{(1)}] = \mathbf{c}\mathbf{c}^\top + \mathbb{E}[\text{remaining graph} + \text{sparsified clique}]$ (rank-1 term plus the remaining graph with the sparsified clique).
72
Approximating Matrices in Expectation
Suppose the sparsified clique agrees with the true clique in expectation: $\mathbb{E}[\text{sparsified clique}] = \text{clique}$.
73
Approximating Matrices in Expectation
Then $\mathbb{E}[\mathbf{L}^{(1)}] = \mathbf{c}\mathbf{c}^\top + (\text{remaining graph} + \text{clique}) = \mathbf{L}^{(0)}$.
74
Approximating Matrices in Expectation
Let $\mathbf{L}^{(i)}$ be our approximation after $i$ eliminations. If we ensure at each step $\mathbb{E}[\text{sparsified clique}] = \text{clique}$, then $\mathbb{E}[\mathbf{L}^{(i)}] = \mathbf{L}^{(i-1)}$, i.e. $\mathbb{E}[\mathbf{L}^{(i)} - \mathbf{L}^{(i-1)} \mid \text{previous steps}] = \mathbf{0}$.
75
Predictable Quadratic Variation
Predictable quadratic variation: $\mathbf{W} = \sum_i \sum_e \mathbb{E}[(\mathbf{X}_e^{(i)})^2 \mid \text{prev. steps}]$. Want to show $\mathbb{P}[\|\mathbf{W}\| > \sigma^2] \le \delta$. Promise: $\mathbb{E}[\mathbf{X}_e^2] \preceq r\,\mathbf{L}_e$.
76
Sample Variance. The expected Laplacian of the clique created by eliminating a random vertex $v$ is dominated by the expected Laplacian of the edges incident to $v$: $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbf{L}_e] \preceq \mathbb{E}_v[\sum_{e \ni v} \mathbf{L}_e]$.
77
Sample Variance. $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbf{L}_e] \preceq \mathbb{E}_v[\sum_{e \ni v} \mathbf{L}_e] = \frac{2 \cdot (\text{current Laplacian})}{\#\text{vertices}} \preceq \frac{4\,\mathbf{L}}{\#\text{vertices}} = \frac{4\,\mathbf{I}}{\#\text{vertices}}$ in isotropic position. WARNING: only true w.h.p.
78
Sample Variance. Recall $\mathbb{E}[\mathbf{X}_e^2] \preceq r\,\mathbf{L}_e$. Putting it together, the variance in one round of elimination is $\mathbb{E}_v[\sum_{e \in \text{clique}(v)} \mathbb{E}[\mathbf{X}_e^2]] \preceq r \cdot \frac{4\,\mathbf{I}}{\#\text{vertices}}$.
79
Sample Variance. Variance per round of elimination $\preceq r \cdot \frac{4\,\mathbf{I}}{\#\text{vertices}}$; summed over the rounds of elimination, the total variance is $\preceq 4\,r\log n \cdot \mathbf{I}$.
80
Directed Laplacian of a Graph
(Figure: a small directed example graph with edge weights 1, 1, 2 and its directed Laplacian.)
81
(Weighted) Eulerian Graph
Weighted in-degree = weighted out-degree at every vertex. (Figure: an example with edge weights 1, 2, 1.)
82
Eulerian Laplacian of a Graph
(Figure: the Eulerian Laplacian of a small example graph with unit edge weights.)
83
Solving an Eul. Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Gaussian Elimination: find lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} = \mathbf{L}$. Then $\mathbf{x} = \mathfrak{U}^{-1}\mathfrak{L}^{-1}\mathbf{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{L}^{-1}$.
84
Solving an Eul. Laplacian Linear Equation
$\mathbf{L}\mathbf{x} = \mathbf{b}$. Approximate Gaussian Elimination [CKKPPRS FOCS’18]: find lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$, with $\mathfrak{L}, \mathfrak{U}$ sparse. $O(\log\frac{1}{\varepsilon})$ iterations to get an $\varepsilon$-approximate solution $\tilde{\mathbf{x}}$.
85
Gaussian Elimination on Eulerian Laplacians
(Figure: $\mathbf{L}$ for a small Eulerian example around a vertex $v$; the original slides animate eliminating $v$, which introduces fractional edge weights such as 1/3 and 2/3.)
89
Gaussian Elimination on Eulerian Laplacians
Sparsify: correct in expectation, output always Eulerian. (Figure: the eliminated graph $\approx$ a sparser Eulerian graph.)
90
Gaussian Elimination on Eulerian Laplacians
While edges remain: pick a minimum weight edge, pair it with a random neighboring edge, and reduce the mass on both by $w_{\min}$. (The original slides animate this on the small example around $v$.)
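A heavily hedged sketch of my literal reading of these three bullets, applied to the edges incident to the eliminated vertex $v$: pair the minimum-weight edge with a random edge on the opposite side, route $w_{\min}$ of mass through $v$ as a new edge, and decrement both. The weight-proportional choice of partner and the weight given to the new edge are my assumptions; the paper's actual procedure may differ.

```python
# Eliminate v from an Eulerian graph: in_edges maps a -> weight of (a, v),
# out_edges maps b -> weight of (v, b). Eulerian-ness (in-mass = out-mass at v)
# makes the loop terminate, and the through-edges preserve degrees.
import numpy as np

def eliminate_vertex(in_edges, out_edges, rng):
    new_edges = []
    while in_edges and out_edges:
        candidates = [('in', a) for a in in_edges] + [('out', b) for b in out_edges]
        side, x = min(candidates,
                      key=lambda t: in_edges[t[1]] if t[0] == 'in' else out_edges[t[1]])
        other = out_edges if side == 'in' else in_edges
        keys = list(other)
        wts = np.array([other[k] for k in keys], dtype=float)
        y = keys[rng.choice(len(keys), p=wts / wts.sum())]   # random partner (assumption)
        w_min = (in_edges if side == 'in' else out_edges)[x]
        a, b = (x, y) if side == 'in' else (y, x)
        new_edges.append((a, b, w_min))                      # new edge a -> b through v
        for d, k in ((in_edges, a), (out_edges, b)):
            d[k] -= w_min
            if d[k] <= 1e-12:
                del d[k]
    return new_edges

print(eliminate_vertex({1: 1.0, 2: 2.0}, {3: 1.0, 4: 2.0}, np.random.default_rng(4)))
```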
91
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 2 𝑣 1 1 1 ≈
92
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 2 𝑣 1 1 1 1 ≈
93
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈
94
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈
95
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 1 ≈ 1
96
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 1 𝑣 1 1 1 ≈ 1
97
Gaussian Elimination on Eulerian Laplacians
𝑳= While edges remain: Pick a minimum weight edge Pair with random neighbor Reduce mass on both by 𝑤 min 𝑣 1 1 ≈ 1
98
Gaussian Elimination on Eulerian Laplacians
Our sparsification: correct in expectation, output always Eulerian. Average over $\log^3(n)$ runs to reduce variance.
99
Approximating Matrices by Sampling
Goal: $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$. Approach: $\mathbb{E}[\mathfrak{L}\mathfrak{U}] = \mathbf{L}$; show $\mathfrak{L}\mathfrak{U}$ is concentrated around its expectation; gives $\mathfrak{L}\mathfrak{U} \approx \mathbf{L}$ w. high probability (?). But what does $\approx$ mean? $\mathbf{L}$ is not PSD?!? Come to my talk and I’ll tell you!
102
Difficulties with Approximation
(Figure: a directed example graph with unit-weight edges and one edge of weight 6.) Variance manageable.
103
Difficulties with Approximation
(Figure: a directed example graph with all unit-weight edges.) Variance problematic!
104
Difficulties with Approximation
Come to my talk and hear the rest of the story! [Cohen-Kelner-Kyng-Peebles-Peng-Rao-Sidford FOCS’18]
105
Spanning Trees of a Graph
106
Spanning Trees of a (Multi)-Graph
Pick a random tree?
107
Random Spanning Trees. Graph $G$, tree distribution. Marginal probability of an edge: $p_e = \|\overline{\mathbf{L}}_e\|$. Similar to sparsification?!
108
Random Spanning Trees. Graph $G$, tree distribution, marginal probability of an edge $p_e = \|\overline{\mathbf{L}}_e\|$. With $\mathbf{L}_G = \sum_{e \in G} \mathbf{L}_e$, consider $\mathbf{L}_T = \sum_{e \in T} \frac{1}{p_e}\mathbf{L}_e$; then $\mathbb{E}[\mathbf{L}_T] = \mathbf{L}_G$.
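A brute-force check (mine, on a made-up 4-vertex graph) that the marginal of each edge under the weighted random spanning tree distribution equals its leverage score $w_e\,\chi_e^\top \mathbf{L}^{+}\chi_e$, which is what makes $\mathbf{L}_T$ unbiased. Exhaustive enumeration and the inline Laplacian builder keep the block self-contained.

```python
# Enumerate all spanning trees, weight each by the product of its edge weights,
# and compare edge marginals with leverage scores.
import itertools
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

n = 4
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (0, 3, 1.0), (0, 2, 1.0)]
Lp = np.linalg.pinv(lap(n, edges))

def is_tree(subset):
    return np.linalg.matrix_rank(lap(n, list(subset))) == n - 1

trees = [t for t in itertools.combinations(edges, n - 1) if is_tree(t)]
weights = np.array([np.prod([w for _, _, w in t]) for t in trees])
probs = weights / weights.sum()

for i, j, w in edges:
    chi = np.zeros(n); chi[i] = 1.0; chi[j] = -1.0
    lev = w * chi @ Lp @ chi
    marginal = sum(p for t, p in zip(trees, probs) if (i, j, w) in t)
    print(f"edge ({i},{j}): marginal {marginal:.4f}  leverage {lev:.4f}")
```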
109
Random Spanning Trees. Concentration? Is $\mathbf{L}_T \approx_{0.5} \mathbf{L}_G$? NO. Is $\mathbf{L}_T \preceq \log n \cdot \mathbf{L}_G$? YES, w.h.p. And $\frac{1}{t}\sum_{i=1}^{t} \mathbf{L}_{T_i} \approx_\varepsilon \mathbf{L}_G$ for $t = \varepsilon^{-2}\log^2 n$ [Kyng-Song FOCS’18]. Come see my talk!
110
Sometimes You Get a Martingale for Free
Random variables $Z_1, \ldots, Z_k$. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is “stable” under small changes to $Z_1, \ldots, Z_k$: is $f(Z_1, \ldots, Z_k) \approx \mathbb{E}[f(Z_1, \ldots, Z_k)]$?
111
Doob Martingales
Random variables $Z_1, \ldots, Z_k$, NOT independent. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is “stable” under small changes. Pick an outcome $z_1, \ldots, z_k$ and reveal it one coordinate at a time: $\mathbb{E}[f(Z_1,\ldots,Z_k)] \to \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1] \to \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1, Z_2 = z_2] \to \cdots \to f(z_1,\ldots,z_k)$.
112
Doob Martingales. Since $\mathbb{E}[f(Z_1,\ldots,Z_k)] = \mathbb{E}_{z_1}\big[\mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1]\big]$, define $Y_0 = \mathbb{E}[f(Z_1,\ldots,Z_k)]$, $Y_1 = \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1]$, $Y_2 = \mathbb{E}[f(Z_1,\ldots,Z_k) \mid Z_1 = z_1, Z_2 = z_2]$, and so on. Then $\mathbb{E}[Y_{i+1} - Y_i \mid \text{prev. steps}] = 0$: a martingale, despite $Z_1,\ldots,Z_k$ being NOT independent.
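A toy illustration (mine, not from the talk) of a Doob martingale with dependent coordinates: $Z$ is a uniformly random arrangement of a small multiset, $f$ is an arbitrary function of the whole outcome, and $Y_i = \mathbb{E}[f \mid Z_1, \ldots, Z_i]$ is computed by brute-force enumeration; averaging the next value of $Y$ over the conditional distribution recovers the current one.

```python
# Verify the martingale property of Y_i = E[f(Z) | Z_1..Z_i] by enumeration.
import itertools
import numpy as np

pool = (0, 0, 1, 1, 1)
outcomes = sorted(set(itertools.permutations(pool)))      # uniform over distinct orders
f = lambda z: max(len(list(g)) for _, g in itertools.groupby(z))  # longest run length

def Y(prefix):
    consistent = [z for z in outcomes if z[:len(prefix)] == prefix]
    return np.mean([f(z) for z in consistent])

prefix = (1,)
nexts = [z[len(prefix)] for z in outcomes if z[:len(prefix)] == prefix]
avg_next = np.mean([Y(prefix + (v,)) for v in nexts])     # E[Y_{i+1} | prefix]
print(Y(prefix), avg_next)                                # equal: increments have mean 0
```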
113
Concentration of Random Spanning Trees
Random variables $Z_1, \ldots, Z_{n-1}$: the positions of the $n-1$ edges of a random spanning tree. $f(Z_1, \ldots, Z_{n-1}) = \mathbf{L}_T$. Similar to [Peres-Pemantle ’14].
114
Conditioning Changes Distributions?
(Figure: a graph and its spanning tree distribution over all trees.)
115
Conditioning Changes Distributions?
(Figure: the same graph; conditioning on edges chosen so far changes the tree distribution from the full one to a conditional one.)
116
Summary Matrix martingales: a natural fit for algorithmic analysis
Understanding the Predictable Quadratic Variation is key. Some results using matrix martingales:
Cohen, Musco, Pachocki ’16 – online row sampling
Kyng, Sachdeva ’17 – approximate Gaussian elimination
Kyng, Peng, Pachocki, Sachdeva ’17 – semi-streaming graph sparsification
Cohen, Kelner, Kyng, Peebles, Peng, Rao, Sidford ’18 – solving directed Laplacian equations
Kyng, Song ’18 – matrix Chernoff bound for negatively dependent random variables
117
Thanks!
118
This is really the end
119
Older slides