Matrix Martingales in Randomized Numerical Linear Algebra


1 Matrix Martingales in Randomized Numerical Linear Algebra
Speaker: Rasmus Kyng, Harvard University. Sep 27, 2018.

2 Concentration of Scalar Random Variables
The $X_i$ are independent with $\mathbb{E}[X_i] = 0$. Is $X = \sum_i X_i \approx 0$ with high probability?

3 Concentration of Scalar Random Variables
Bernstein's Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. The $X_i$ are independent, $\mathbb{E}[X_i] = 0$, $|X_i| \le r$, and $\sum_i \mathbb{E}[X_i^2] \le \sigma^2$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) \le \tau$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(1/\tau)$.
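To see the numbers, here is a small Monte Carlo check (an illustrative sketch added for this transcript, not from the talk): independent, bounded, zero-mean summands, with the empirical tail compared against the Bernstein bound. The uniform distribution and the constants are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
k, trials, eps, r = 200, 20000, 0.3, 0.01

# X_i uniform on [-r, r]: independent, zero mean, |X_i| <= r, E[X_i^2] = r^2 / 3
X = rng.uniform(-r, r, size=(trials, k))
sigma2 = k * r**2 / 3                      # sum_i E[X_i^2]

S = X.sum(axis=1)
empirical = np.mean(np.abs(S) > eps)
bernstein = 2 * np.exp(-(eps**2 / 2) / (r * eps + sigma2))
print(f"empirical tail {empirical:.5f}  <=  Bernstein bound {bernstein:.5f}")
```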

4 Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. Instead of requiring the $X_i$ to be independent with $\mathbb{E}[X_i] = 0$, we only require $\mathbb{E}[X_i \mid X_1, \ldots, X_{i-1}] = 0$.

5 Concentration of Scalar Martingales
Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$, with $\mathbb{E}[X_i \mid \text{previous steps}] = 0$.

6 Concentration of Scalar Martingales
Freedman's Inequality (replacing Bernstein's). Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$. The independence and $\mathbb{E}[X_i] = 0$ conditions are replaced by $\mathbb{E}[X_i \mid \text{previous steps}] = 0$, and the variance condition $\sum_i \mathbb{E}[X_i^2] \le \sigma^2$ is replaced by $\mathbb{P}\big[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2\big] \le \delta$. With $|X_i| \le r$, this gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.

7 Concentration of Scalar Martingales
Freedman's Inequality. Random $X = \sum_i X_i$, $X_i \in \mathbb{R}$, with $\mathbb{E}[X_i \mid \text{previous steps}] = 0$, $|X_i| \le r$, and $\mathbb{P}\big[\sum_i \mathbb{E}[X_i^2 \mid \text{previous steps}] > \sigma^2\big] \le \delta$. This gives $\mathbb{P}[|X| > \varepsilon] \le 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.
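A minimal martingale simulation (again a sketch added for this transcript): each increment is a fair coin flip whose scale depends on the history, so the increments are not independent, yet each has conditional mean zero. The empirical tail is compared against Freedman's bound, with $\sigma^2$ chosen as a high quantile of the observed predictable quadratic variation and $\delta$ the probability of exceeding it.

```python
import numpy as np

rng = np.random.default_rng(1)
k, trials, eps, r = 400, 2000, 0.4, 0.01

tail_count = 0
qvars = np.empty(trials)
for t in range(trials):
    s, qv = 0.0, 0.0
    for _ in range(k):
        # step size depends on the history (shrink when |s| is large), so the
        # increments are not independent, but E[X_i | past] = 0 always holds
        scale = r if abs(s) < 0.1 else r / 2
        x = scale * rng.choice([-1.0, 1.0])
        qv += scale ** 2          # E[X_i^2 | past] for a +/- scale coin flip
        s += x
    tail_count += abs(s) > eps
    qvars[t] = qv

sigma2 = np.quantile(qvars, 0.99)                 # holds except with prob. delta
delta = float(np.mean(qvars > sigma2))
bound = 2 * np.exp(-(eps ** 2 / 2) / (r * eps + sigma2)) + delta
print(f"empirical tail {tail_count / trials:.4f}  vs  Freedman bound {bound:.4f}")
```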

8 Concentration of Matrix Random Variables
Matrix Bernstein's Inequality (Tropp '11). Random $\boldsymbol{X} = \sum_i \boldsymbol{X}_i$, $\boldsymbol{X}_i \in \mathbb{R}^{d \times d}$ symmetric. The $\boldsymbol{X}_i$ are independent, $\mathbb{E}[\boldsymbol{X}_i] = \boldsymbol{0}$, $\|\boldsymbol{X}_i\| \le r$, and $\|\sum_i \mathbb{E}[\boldsymbol{X}_i^2]\| \le \sigma^2$. This gives $\mathbb{P}[\|\boldsymbol{X}\| > \varepsilon] \le d \cdot 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right)$.
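A matrix-valued analogue of the check above (again a sketch, not from the talk): sums of independent, zero-mean symmetric matrices built from random unit vectors, with the spectral-norm deviation compared against the matrix Bernstein bound.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, trials, eps = 20, 2000, 200, 0.03

# X_i = (g g^T - I/d) / k for a uniformly random unit vector g: symmetric, E[X_i] = 0
deviations = []
for _ in range(trials):
    G = rng.normal(size=(k, d))
    G /= np.linalg.norm(G, axis=1, keepdims=True)
    S = (G.T @ G) / k - np.eye(d) / d          # S = sum_i X_i
    deviations.append(np.linalg.norm(S, 2))

r = 1.0 / k               # ||X_i|| <= (1 - 1/d)/k <= 1/k
sigma2 = 1.0 / (d * k)    # ||sum_i E[X_i^2]|| = (1/d - 1/d^2)/k <= 1/(d k)
bound = d * 2 * np.exp(-(eps**2 / 2) / (r * eps + sigma2))
print(f"largest deviation {max(deviations):.4f}, bound on P[||X|| > {eps}]: {bound:.5f}")
```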

9 Concentration of Matrix Martingales
Matrix Freedman's Inequality (Tropp '11). Random $\boldsymbol{X} = \sum_i \boldsymbol{X}_i$, $\boldsymbol{X}_i \in \mathbb{R}^{d \times d}$ symmetric, with $\mathbb{E}[\boldsymbol{X}_i \mid \text{previous steps}] = \boldsymbol{0}$, $\|\boldsymbol{X}_i\| \le r$, and $\mathbb{P}\big[\|\sum_i \mathbb{E}[\boldsymbol{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2\big] \le \delta$. This gives $\mathbb{P}[\|\boldsymbol{X}\| > \varepsilon] \le d \cdot 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta \le \tau + \delta$, e.g. if $\varepsilon = 0.5$ and $r, \sigma^2 = 0.1/\log(d/\tau)$.

10 Concentration of Matrix Martingales
Matrix Freedman's Inequality (Tropp '11). The condition $\mathbb{P}\big[\|\sum_i \mathbb{E}[\boldsymbol{X}_i^2 \mid \text{prev. steps}]\| > \sigma^2\big] \le \delta$ controls the predictable quadratic variation $\sum_i \mathbb{E}[\boldsymbol{X}_i^2 \mid \text{prev. steps}]$.

11 Laplacian Matrices
Graph $G = (V, E)$ with edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$. The Laplacian is $\boldsymbol{L} = \sum_{e \in E} \boldsymbol{L}_e$, an $n \times n$ matrix.

13 Laplacian Matrices
Graph $G = (V, E)$, edge weights $w : E \to \mathbb{R}_+$, $n = |V|$, $m = |E|$. $\boldsymbol{A}$: weighted adjacency matrix, $\boldsymbol{A}_{ij} = w_{ij}$. $\boldsymbol{D}$: diagonal matrix of weighted degrees, $\boldsymbol{D}_{ii} = \sum_j w_{ij}$. Then $\boldsymbol{L} = \boldsymbol{D} - \boldsymbol{A}$.
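The construction $\boldsymbol{L} = \boldsymbol{D} - \boldsymbol{A}$ can be written directly from a weighted edge list; a short sketch (the example graph is arbitrary):

```python
import numpy as np

def laplacian(n, edges):
    """Build L = D - A from a list of weighted edges (u, v, w) on n vertices."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w
        L[v, v] += w          # D: weighted degrees on the diagonal
        L[u, v] -= w
        L[v, u] -= w          # -A: negated weights off the diagonal
    return L

# example: a weighted triangle plus a pendant vertex
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 1.0), (2, 3, 0.5)]
L = laplacian(4, edges)
print(L)
print("row sums:", L.sum(axis=1))   # all zero for a Laplacian
```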

14 Laplacian Matrices
A Laplacian is a symmetric matrix $\boldsymbol{L}$ whose off-diagonal entries are non-positive and whose diagonal satisfies $\boldsymbol{L}_{ii} = -\sum_{j \ne i} \boldsymbol{L}_{ij}$ (every row sums to zero).

15 Laplacian of a Graph
(Figure: an example graph with edge weights 1, 1, 2 and its Laplacian matrix.)

16 Approximation between graphs
PSD order: $\boldsymbol{A} \preceq \boldsymbol{B}$ iff for all $\boldsymbol{x}$, $\boldsymbol{x}^\top \boldsymbol{A} \boldsymbol{x} \le \boldsymbol{x}^\top \boldsymbol{B} \boldsymbol{x}$. Matrix approximation: $\boldsymbol{A} \approx_\varepsilon \boldsymbol{B}$ iff $\boldsymbol{A} \preceq (1+\varepsilon)\boldsymbol{B}$ and $\boldsymbol{B} \preceq (1+\varepsilon)\boldsymbol{A}$. Graphs: $G \approx_\varepsilon H$ iff $\boldsymbol{L}_G \approx_\varepsilon \boldsymbol{L}_H$. This implies a multiplicative approximation to all cuts.
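For small dense matrices, the approximation quality can be checked numerically through the eigenvalues of one Laplacian in the isotropic position of the other; a sketch, assuming both graphs are connected so that the only shared null direction is the all-ones vector:

```python
import numpy as np

def approx_eps(LA, LB):
    """Smallest eps with LA <= (1+eps) LB and LB <= (1+eps) LA, measured on the
    complement of the all-ones nullspace (both graphs assumed connected)."""
    w, V = np.linalg.eigh(LB)
    nz = w > 1e-9 * w.max()
    B_isqrt = V[:, nz] @ np.diag(w[nz] ** -0.5) @ V[:, nz].T   # pinv-sqrt of LB
    evals = np.linalg.eigvalsh(B_isqrt @ LA @ B_isqrt)
    evals = evals[evals > 1e-9]           # drop the shared nullspace
    lo, hi = evals.min(), evals.max()
    return max(hi - 1.0, 1.0 / lo - 1.0)
```

For instance, `approx_eps(L, L)` returns roughly 0, and for a sparsifier $H$ of $G$ it should return roughly the target $\varepsilon$.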

17 Sparsification of Laplacians
Every graph $G^{(0)}$ on $n$ vertices, for any $\varepsilon \in (0,1)$, has a sparsifier $G^{(1)} \approx_\varepsilon G^{(0)}$ with $O(n\varepsilon^{-2})$ edges [BSS'09, LS'17]. Sampling edges independently by leverage scores w.h.p. gives a sparsifier with $O(n\varepsilon^{-2}\log n)$ edges [Spielman-Srivastava'08]. Many applications!
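A sketch of leverage-score sampling in the style of [Spielman-Srivastava'08], with leverage scores computed exactly via a dense pseudoinverse (fine only for small graphs); the oversampling constant `C` is an arbitrary illustrative choice, not the paper's:

```python
import numpy as np

def sparsify(n, edges, eps, C=10, rng=np.random.default_rng()):
    """Keep each edge (u, v, w) with prob. ~ its leverage score, rescaled by 1/p."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w; L[u, v] -= w; L[v, u] -= w
    Lpinv = np.linalg.pinv(L)
    kept = []
    for u, v, w in edges:
        b = np.zeros(n); b[u], b[v] = 1.0, -1.0
        lev = w * b @ Lpinv @ b                  # leverage score of the edge
        p = min(1.0, C * lev * np.log(n) / eps**2)
        if rng.random() < p:
            kept.append((u, v, w / p))           # rescale so E[L_H] = L_G
    return kept
```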

18 Graph Semi-Streaming Sparsification
The $m$ edges of a graph arrive as a stream $(a_1, b_1, w_1), (a_2, b_2, w_2), \ldots$ GOAL: maintain an $\varepsilon$-sparsifier while minimizing working space (and time). [Ahn-Guha '09]: $n\varepsilon^{-2}\log n \log\frac{m}{n}$ space, $n\varepsilon^{-2}\log n \log\frac{m}{n}$-edge cut $\varepsilon$-sparsifier. Used scalar martingales.

19 Independent Edge Samples
Sample each edge independently: $\boldsymbol{L}_e^{(1)} = \frac{1}{p_e}\boldsymbol{L}_e^{(0)}$ with probability $p_e$, and $\boldsymbol{0}$ otherwise. Expectation: $\mathbb{E}[\boldsymbol{L}_e^{(1)}] = \boldsymbol{L}_e^{(0)}$. Zero-mean: $\boldsymbol{X}_e^{(1)} = \boldsymbol{L}_e^{(1)} - \boldsymbol{L}_e^{(0)}$, and $\boldsymbol{X}^{(1)} = \boldsymbol{L}^{(1)} - \boldsymbol{L}^{(0)} = \sum_e \boldsymbol{X}_e^{(1)}$.

20 Matrix Martingale? Zero mean? So what if we just sparsify again? It's a martingale! Every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$, and continue.
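The loop structure of that scheme, as a sketch; the thresholds are the ones on the slide, and the claim that reusing the same $\varepsilon$ in every round is safe is exactly what the matrix-martingale analysis of [K.PPS'17] establishes, not something the code itself guarantees. `sparsify` is any routine like the one sketched earlier.

```python
import numpy as np

def streaming_sparsifier(n, edge_stream, eps, sparsify):
    """Maintain an eps-sparsifier of the edges seen so far.

    edge_stream yields (u, v, w); sparsify(n, edges, eps) returns a
    reweighted subset whose Laplacian approximates that of its input.
    """
    high = 20 * int(n * np.log(n) / eps**2)
    kept = []
    for edge in edge_stream:
        kept.append(edge)
        if len(kept) > high:
            # resparsify with the same eps each round; the matrix-martingale
            # analysis in the talk is what shows the errors do not pile up
            kept = sparsify(n, kept, eps)
    return kept
```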

22 Repeated Edge Sampling
Resample: $\boldsymbol{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\boldsymbol{L}_e^{(i-1)}$ with probability $p_e^{(i-1)}$, and $\boldsymbol{0}$ otherwise. Expectation: $\mathbb{E}[\boldsymbol{L}_e^{(i)}] = \boldsymbol{L}_e^{(i-1)}$. Zero-mean: $\boldsymbol{X}_e^{(i)} = \boldsymbol{L}_e^{(i)} - \boldsymbol{L}_e^{(i-1)}$, and $\boldsymbol{X}^{(i)} = \boldsymbol{L}^{(i)} - \boldsymbol{L}^{(i-1)} = \sum_e \boldsymbol{X}_e^{(i)}$.

24 Repeated Edge Sampling
Resample: $\boldsymbol{L}_e^{(i)} = \frac{1}{p_e^{(i-1)}}\boldsymbol{L}_e^{(i-1)}$ with probability $p_e^{(i-1)}$, and $\boldsymbol{0}$ otherwise. Expectation: $\mathbb{E}[\boldsymbol{L}_e^{(i)}] = \boldsymbol{L}_e^{(i-1)}$. Zero-mean: $\boldsymbol{X}_e^{(i)} = \boldsymbol{L}_e^{(i)} - \boldsymbol{L}_e^{(i-1)}$, $\boldsymbol{X}^{(i)} = \boldsymbol{L}^{(i)} - \boldsymbol{L}^{(i-1)} = \sum_e \boldsymbol{X}_e^{(i)}$. Martingale: $\mathbb{E}[\boldsymbol{X}^{(i)} \mid \text{prev. steps}] = \boldsymbol{0}$.

25 Repeated Edge Sampling
It works, and it couldn't be simpler. [K.PPS'17]: $n\varepsilon^{-2}\log n$ space, $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shaves a log off many algorithms that do repeated matrix sampling.

26 Approximation?
$\boldsymbol{L}_H \approx_{0.5} \boldsymbol{L}_G$ means $\boldsymbol{L}_H \preceq 1.5\,\boldsymbol{L}_G$ and $\boldsymbol{L}_G \preceq 1.5\,\boldsymbol{L}_H$. Is $\|\boldsymbol{L}_H - \boldsymbol{L}_G\| \le 0.5$ the right condition? Better: $\|\boldsymbol{L}_G^{-1/2}(\boldsymbol{L}_H - \boldsymbol{L}_G)\boldsymbol{L}_G^{-1/2}\| \le 0.1$.

27 Isotropic Position
Define the isotropic position of a matrix $\boldsymbol{Z}$ as $\boldsymbol{L}_G^{-1/2}\,\boldsymbol{Z}\,\boldsymbol{L}_G^{-1/2}$. In isotropic position the goal $\|\boldsymbol{L}_H - \boldsymbol{L}_G\| \le 0.1$ becomes $\|\boldsymbol{L}_H - \boldsymbol{I}\| \le 0.1$.

28 Concentration of Matrix Martingales
Matrix Freedman's Inequality. Random $\boldsymbol{X} = \sum_i \sum_e \boldsymbol{X}_e^{(i)}$, $\boldsymbol{X}_e^{(i)} \in \mathbb{R}^{d \times d}$ symmetric, with $\mathbb{E}[\boldsymbol{X}_e^{(i)} \mid \text{previous steps}] = \boldsymbol{0}$, $\|\boldsymbol{X}_e^{(i)}\| \le r$, and $\mathbb{P}\big[\|\sum_i \sum_e \mathbb{E}[(\boldsymbol{X}_e^{(i)})^2 \mid \text{prev. steps}]\| > \sigma^2\big] \le \delta$. This gives $\mathbb{P}[\|\boldsymbol{X}\| > \varepsilon] \le d \cdot 2\exp\!\left(\frac{-\varepsilon^2/2}{r\varepsilon + \sigma^2}\right) + \delta$.

30 Norm Control
Can there be many edges with large norm? In isotropic position, $\sum_e \|\boldsymbol{L}_e^{(0)}\| = \sum_e \operatorname{Tr}\boldsymbol{L}_e^{(0)} = \operatorname{Tr}\sum_e \boldsymbol{L}_e^{(0)} = \operatorname{Tr}\boldsymbol{L}^{(0)} = \operatorname{Tr}\boldsymbol{I} = n - 1$ (each $\boldsymbol{L}_e^{(0)}$ is rank 1, so its norm equals its trace). We can afford $r = \frac{1}{\log n}$, so take $p_e = \|\boldsymbol{L}_e^{(0)}\|\log n$, giving $\sum_e p_e = n\log n$. Sample $\boldsymbol{L}_e^{(1)} = \frac{1}{p_e}\boldsymbol{L}_e^{(0)}$ with probability $p_e$, and $\boldsymbol{0}$ otherwise.
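The trace identity above (the isotropic-position edge norms, i.e. leverage scores, sum to $n - 1$) is easy to verify numerically; a small sketch on a random connected graph:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
# random connected weighted graph: a path plus some random extra edges
edges = [(i, i + 1, 1.0) for i in range(n - 1)]
edges += [(rng.integers(n), rng.integers(n), rng.random()) for _ in range(60)]
edges = [(u, v, w) for u, v, w in edges if u != v]

L = np.zeros((n, n))
for u, v, w in edges:
    L[u, u] += w; L[v, v] += w; L[u, v] -= w; L[v, u] -= w

Lpinv = np.linalg.pinv(L)
total = 0.0
for u, v, w in edges:
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    total += w * b @ Lpinv @ b        # ||L_e|| in isotropic position
print(total, "vs n - 1 =", n - 1)     # they agree for a connected graph
```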

31 Norm Control
We need $\|\boldsymbol{X}_e^{(i)}\| \le r$. If we lose track of the original graph, we can't estimate the norm. Fix: a stopping time argument to handle the bad regions.

32 Variance Control
$\mathbb{E}[(\boldsymbol{X}_e^{(1)})^2] \preceq r\,\boldsymbol{L}_e^{(0)}$, so $\sum_e \mathbb{E}[(\boldsymbol{X}_e^{(1)})^2] \preceq \sum_e r\,\boldsymbol{L}_e^{(0)} = r\boldsymbol{I}$. But then edges get resampled; roughly, the variance forms a geometric sequence.

33 Resparsification
Every time we accumulate $20\,n\varepsilon^{-2}\log n$ edges, sparsify down to $10\,n\varepsilon^{-2}\log n$, and continue. [K.PPS'17]: $n\varepsilon^{-2}\log n$ space, $n\varepsilon^{-2}\log n$-edge $\varepsilon$-sparsifier. Shaves a log off many algorithms that do repeated matrix sampling.

34 Laplacian Linear Equations
[ST04]: solving Laplacian linear equations in $\tilde{O}(m)$ time. ⋮ [KS16]: a simple algorithm.

35 Solving a Laplacian Linear Equation
$\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$. Gaussian elimination: find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top\mathfrak{U} = \boldsymbol{L}$. Then $\boldsymbol{x} = \mathfrak{U}^{-1}\mathfrak{U}^{-\top}\boldsymbol{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{U}^{-\top}$.
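A concrete exact version, as a sketch: ground one vertex so the reduced Laplacian is positive definite, factor it with SciPy's Cholesky routine (which returns the upper-triangular $\mathfrak{U}$ with $\mathfrak{U}^\top\mathfrak{U}$ equal to the reduced matrix), and apply two triangular solves.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def solve_grounded_laplacian(L, b):
    """Solve L x = b (b orthogonal to the all-ones vector, graph connected)
    by Cholesky on the reduced system with vertex 0 grounded."""
    Lr, br = L[1:, 1:], b[1:]                 # ground vertex 0: x[0] = 0
    U = cholesky(Lr, lower=False)             # upper triangular, U^T U = Lr
    y = solve_triangular(U, br, trans='T', lower=False)    # solve U^T y = br
    xr = solve_triangular(U, y, lower=False)                # solve U xr = y
    return np.concatenate(([0.0], xr))
```

Any solution can be shifted by a multiple of the all-ones vector; grounding vertex 0 just fixes that degree of freedom.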

36 Solving a Laplacian Linear Equation
$\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$. Approximate Gaussian elimination: find a sparse upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top\mathfrak{U} \approx_{0.5} \boldsymbol{L}$. Then $O(\log\frac{1}{\varepsilon})$ iterations give an $\varepsilon$-approximate solution $\tilde{\boldsymbol{x}}$.

37 Approximate Gaussian Elimination
Theorem [KS]. When $\boldsymbol{L}$ is a Laplacian matrix with $m$ non-zeros, we can find in $O(m\log^3 n)$ time an upper triangular matrix $\mathfrak{U}$ with $O(m\log^3 n)$ non-zeros, s.t. w.h.p. $\mathfrak{U}^\top\mathfrak{U} \approx_{0.5} \boldsymbol{L}$.

38 Additive View of Gaussian Elimination
Find an upper triangular matrix $\mathfrak{U}$ s.t. $\mathfrak{U}^\top\mathfrak{U} = \boldsymbol{M}$.

39 Additive View of Gaussian Elimination
Find the rank-1 matrix that agrees with 𝑴 on the first row and column.

40 Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the first variable.

41 Additive View of Gaussian Elimination
The remaining matrix is PSD.

42 Additive View of Gaussian Elimination
Find rank-1 matrix that agrees with our matrix on the next row and column.

43 Additive View of Gaussian Elimination
Subtract the rank 1 matrix. We have eliminated the second variable.

44 Additive View of Gaussian Elimination
Repeat until all parts are written as rank-1 terms.

46 Additive View of Gaussian Elimination
Repeat until all parts are written as rank-1 terms: $\boldsymbol{M} = \mathfrak{U}^\top\mathfrak{U}$.
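The additive view translates directly into code: repeatedly peel off the rank-1 term that agrees with the current matrix on its leading row and column. The sketch below takes $\boldsymbol{M}$ positive definite for simplicity; for a Laplacian one would stop one step early or skip zero pivots.

```python
import numpy as np

def additive_cholesky(M):
    """Write M = U^T U by repeatedly subtracting rank-1 terms (M pos. def.)."""
    S = M.astype(float).copy()
    n = M.shape[0]
    U = np.zeros((n, n))
    for i in range(n):
        c = S[i, :] / np.sqrt(S[i, i])   # rank-1 term c c^T agrees with S
        U[i, :] = c                      # on row and column i
        S = S - np.outer(c, c)           # eliminate variable i; S stays PSD
    return U

M = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
U = additive_cholesky(M)
print(np.allclose(U.T @ U, M))           # True
```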

47 Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians? The remaining matrix is always a Laplacian.

49 Additive View of Gaussian Elimination
What is special about Gaussian Elimination on Laplacians? The remaining matrix is always a Laplacian: a new Laplacian!

50 Why is Gaussian Elimination Slow?
Solving $\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$ by Gaussian Elimination can take $\Omega(n^3)$ time. The main issue is fill.

52 Why is Gaussian Elimination Slow?
Solving $\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$ by Gaussian Elimination can take $\Omega(n^3)$ time. The main issue is fill: eliminating $v$ creates a clique on the neighbors of $v$ (the result is a new Laplacian).

53 Why is Gaussian Elimination Slow?
Solving $\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$ by Gaussian Elimination can take $\Omega(n^3)$ time. The main issue is fill, but Laplacian cliques can be sparsified!

54 Gaussian Elimination Pick a vertex 𝑣 to eliminate
Add the clique created by eliminating 𝑣 Repeat until done

55 Approximate Gaussian Elimination
Pick a vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done

56 Approximate Gaussian Elimination
Pick a random vertex 𝑣 to eliminate Add the clique created by eliminating 𝑣 Repeat until done

57 Approximate Gaussian Elimination
Pick a random vertex 𝑣 to eliminate Sample the clique created by eliminating 𝑣 Repeat until done Resembles randomized Incomplete Cholesky

58 How to Sample a Clique
For each edge $(1, v)$: pick an edge $(1, u)$ with probability $\sim w_u$, and insert the edge $(v, u)$ with weight $\frac{w_u w_v}{w_u + w_v}$.
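A sketch of this sampling rule, assuming the eliminated vertex's incident edges are given as neighbor/weight pairs; the example call uses made-up numbers. (The full [KS16] procedure has more bookkeeping around multi-edges and weights; this only illustrates the rule stated on the slide.)

```python
import numpy as np

def sample_clique(neighbors, weights, rng=np.random.default_rng()):
    """For each edge (1, v), pick a partner edge (1, u) with probability ~ w_u
    and add the edge (v, u) with weight w_u * w_v / (w_u + w_v)."""
    w = np.asarray(weights, dtype=float)
    probs = w / w.sum()
    new_edges = []
    for v, wv in zip(neighbors, w):
        u = rng.choice(len(neighbors), p=probs)
        if neighbors[u] != v:                        # skip self-loops
            wu = w[u]
            new_edges.append((v, neighbors[u], wu * wv / (wu + wv)))
    return new_edges

print(sample_clique([2, 3, 4, 5], [5.0, 4.0, 2.0, 3.0]))
```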

67 Approximating Matrices by Sampling
Goal: $\mathfrak{U}^\top\mathfrak{U} \approx \boldsymbol{L}$. Approach: $\mathbb{E}[\mathfrak{U}^\top\mathfrak{U}] = \boldsymbol{L}$; show $\mathfrak{U}^\top\mathfrak{U}$ is concentrated around its expectation. Gives $\mathfrak{U}^\top\mathfrak{U} \approx \boldsymbol{L}$ with high probability.

68 Approximating Matrices in Expectation
Consider eliminating the first variable: $\boldsymbol{L}^{(0)} = \boldsymbol{c}\,\boldsymbol{c}^\top + (\text{remaining graph} + \text{clique})$, i.e. the original Laplacian splits into a rank-1 term plus the remaining graph with the new clique.

69 Approximating Matrices in Expectation
Consider eliminating the first variable, now with a sparsified clique: $\mathbb{E}[\boldsymbol{L}^{(1)}] = \boldsymbol{c}\,\boldsymbol{c}^\top + \mathbb{E}[\text{remaining graph} + \text{sparsified clique}]$, a rank-1 term plus the remaining graph with the sparsified clique.

72 Approximating Matrices in Expectation
Suppose the sparsified clique equals the clique in expectation: $\mathbb{E}[\text{sparsified clique}] = \text{clique}$.

73 Approximating Matrices in Expectation
Then $\mathbb{E}[\boldsymbol{L}^{(1)}] = \boldsymbol{c}\,\boldsymbol{c}^\top + \mathbb{E}[\text{remaining graph} + \text{sparsified clique}] = \boldsymbol{L}^{(0)}$.

74 Approximating Matrices in Expectation
Let $\boldsymbol{L}^{(i)}$ be our approximation after $i$ eliminations. If we ensure at each step that $\mathbb{E}[\text{sparsified clique}] = \text{clique}$, then $\mathbb{E}[\boldsymbol{L}^{(i)}] = \boldsymbol{L}^{(i-1)}$, i.e. $\mathbb{E}[\boldsymbol{L}^{(i)} - \boldsymbol{L}^{(i-1)} \mid \text{previous steps}] = \boldsymbol{0}$.

75 Predictable Quadratic Variation
Predictable quadratic variation: $\boldsymbol{W} = \sum_i \sum_e \mathbb{E}[(\boldsymbol{X}_e^{(i)})^2 \mid \text{prev. steps}]$. Want to show $\mathbb{P}[\|\boldsymbol{W}\| > \sigma^2] \le \delta$. Promise: $\mathbb{E}[\boldsymbol{X}_e^2] \preceq r\,\boldsymbol{L}_e$.

76 Sample Variance
For a uniformly random vertex $v$: $\mathbb{E}_v \sum_{e \in \text{elim. clique of } v} \boldsymbol{L}_e \preceq \mathbb{E}_v \sum_{e \ni v} \boldsymbol{L}_e$.

77 Sample Variance
$\mathbb{E}_v \sum_{e \in \text{elim. clique of } v} \boldsymbol{L}_e \preceq \mathbb{E}_v \sum_{e \ni v} \boldsymbol{L}_e = \frac{2\,(\text{current Laplacian})}{\#\text{vertices}} \preceq \frac{4\,\boldsymbol{L}}{\#\text{vertices}} = \frac{4\,\boldsymbol{I}}{\#\text{vertices}}$. WARNING: the last two steps are only true w.h.p.

78 Sample Variance
Recall $\mathbb{E}[\boldsymbol{X}_e^2] \preceq r\,\boldsymbol{L}_e$. Putting it together, the variance in one round of elimination is $\mathbb{E}_v \sum_{e \in \text{elim. clique of } v} \mathbb{E}[\boldsymbol{X}_e^2] \preceq r\,\mathbb{E}_v \sum_{e \in \text{elim. clique of } v} \boldsymbol{L}_e \preceq \frac{r \cdot 4\,\boldsymbol{I}}{\#\text{vertices}}$.

79 Sample Variance
Variance in one round of elimination: $\preceq \frac{r \cdot 4\,\boldsymbol{I}}{\#\text{vertices}}$. Summing over all rounds of elimination (the number of remaining vertices drops by one each round, giving a harmonic sum), the total variance is $\preceq 4\,r\log n \cdot \boldsymbol{I}$.

80 Directed Laplacian of a Graph
(Figure: an example directed graph with edge weights 1, 1, 2 and its directed Laplacian.)

81 (Weighted) Eulerian Graph
Weighted in-degree = weighted out-degree at every vertex. (Figure: an example with edge weights 1, 2, 1.)

82 Eulerian Laplacian of a Graph
(Figure: the Laplacian of a small Eulerian graph with unit edge weights.)

83 Solving an Eul. Laplacian Linear Equation
$\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$. Gaussian elimination: find lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} = \boldsymbol{L}$. Then $\boldsymbol{x} = \mathfrak{U}^{-1}\mathfrak{L}^{-1}\boldsymbol{b}$. Easy to apply $\mathfrak{U}^{-1}$ and $\mathfrak{L}^{-1}$.

84 Solving an Eul. Laplacian Linear Equation
$\boldsymbol{L}\boldsymbol{x} = \boldsymbol{b}$. Approximate Gaussian elimination [CKKPPRS FOCS'18]: find sparse lower and upper triangular matrices $\mathfrak{L}, \mathfrak{U}$ s.t. $\mathfrak{L}\mathfrak{U} \approx \boldsymbol{L}$. Then $O(\log\frac{1}{\varepsilon})$ iterations give an $\varepsilon$-approximate solution $\tilde{\boldsymbol{x}}$.

85 Gaussian Elimination on Eulerian Laplacians
(Figure: $\boldsymbol{L}$ of a small Eulerian example; vertex $v$ is to be eliminated; edge weight labels 1, 2, 1, 1, 1.)

89 Gaussian Elimination on Eulerian Laplacians
Sparsify: correct in expectation, output always Eulerian.

90 Gaussian Elimination on Eulerian Laplacians
While edges remain: pick a minimum weight edge, pair it with a random neighbor, and reduce the mass on both by $w_{\min}$. (Figure: eliminating vertex $v$ in the small Eulerian example.)
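A rough sketch of that pairing loop for eliminating a single vertex $v$, given its in-edges and out-edges (with equal total mass). The slide does not say how the random neighbor is chosen; drawing it with probability proportional to remaining weight is an assumption made here.

```python
import numpy as np

def eliminate_eulerian_vertex(in_edges, out_edges, rng=np.random.default_rng()):
    """in_edges: [(u, w), ...] for edges u -> v; out_edges: [(x, w), ...] for v -> x.
    Returns directed edges u -> x replacing two-step paths through v."""
    ins = [[u, float(w)] for u, w in in_edges]      # mutable copies of remaining mass
    outs = [[x, float(w)] for x, w in out_edges]
    new_edges = []
    while ins and outs:
        i = min(range(len(ins)), key=lambda k: ins[k][1])
        o = min(range(len(outs)), key=lambda k: outs[k][1])
        if ins[i][1] <= outs[o][1]:
            # minimum edge is incoming: pair it with a random outgoing edge
            probs = np.array([w for _, w in outs]); probs /= probs.sum()
            o = rng.choice(len(outs), p=probs)
        else:
            # minimum edge is outgoing: pair it with a random incoming edge
            probs = np.array([w for _, w in ins]); probs /= probs.sum()
            i = rng.choice(len(ins), p=probs)
        wmin = min(ins[i][1], outs[o][1])
        new_edges.append((ins[i][0], outs[o][0], wmin))   # route wmin mass through v
        ins[i][1] -= wmin
        outs[o][1] -= wmin
        ins = [e for e in ins if e[1] > 1e-12]
        outs = [e for e in outs if e[1] > 1e-12]
    return new_edges

print(eliminate_eulerian_vertex([(1, 1.0), (2, 2.0)], [(3, 2.0), (4, 1.0)]))
```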

98 Gaussian Elimination on Eulerian Laplacians
Our sparsification is correct in expectation and the output is always Eulerian. Average over $\log^3 n$ runs to reduce variance.

99 Approximating Matrices by Sampling
Goal: $\mathfrak{L}\mathfrak{U} \approx \boldsymbol{L}$. Approach: $\mathbb{E}[\mathfrak{L}\mathfrak{U}] = \boldsymbol{L}$; show $\mathfrak{L}\mathfrak{U}$ is concentrated around its expectation, giving $\mathfrak{L}\mathfrak{U} \approx \boldsymbol{L}$ with high probability (?). But what does $\approx$ mean? $\boldsymbol{L}$ is not PSD! Come to my talk and I'll tell you!

100 Difficulties with Approximation
(Figure: an example directed graph; all edges have weight 1 except one of weight 6.)

101 Difficulties with Approximation
(Figure: an example directed graph with all edges of weight 1.)

102 Difficulties with Approximation
(The example with one edge of weight 6.) Variance manageable.

103 Difficulties with Approximation
(The example with all unit-weight edges.) Variance problematic!

104 Difficulties with Approximation
Come to my talk and hear the rest of the story! [Cohen-Kelner-Kyng-Peebles-Peng-Rao-Sidford FOCS'18]

105 Spanning Trees of a Graph

106 Spanning Trees of a (Multi)-Graph
Pick a random tree?

107 Random Spanning Trees
Graph $G$; distribution over spanning trees. Marginal probability of edge $e$: $p_e = \|\boldsymbol{L}_e\|$ (in isotropic position). Similar to sparsification?!

108 Random Spanning Trees
Graph $G$; distribution over spanning trees. Marginal probability of edge $e$: $p_e = \|\boldsymbol{L}_e\|$. With $\boldsymbol{L}_G = \sum_{e \in G} \boldsymbol{L}_e$, consider $\boldsymbol{L}_T = \sum_{e \in T} \frac{1}{p_e}\boldsymbol{L}_e$. Then $\mathbb{E}[\boldsymbol{L}_T] = \boldsymbol{L}_G$.
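To make the estimator concrete for the unweighted, uniform case (a sketch added for this transcript): sample a uniform spanning tree with the Aldous-Broder random walk, take $p_e$ to be the edge's leverage score computed via the pseudoinverse, and average $\boldsymbol{L}_T = \sum_{e \in T} \frac{1}{p_e}\boldsymbol{L}_e$ over independent trees; the average approaches $\boldsymbol{L}_G$.

```python
import numpy as np

def aldous_broder_tree(adj, rng):
    """Uniform spanning tree of an unweighted connected graph (Aldous-Broder):
    the first-entrance edge into each new vertex is a tree edge."""
    n = len(adj)
    visited, tree, u = {0}, [], 0
    while len(visited) < n:
        v = int(rng.choice(adj[u]))
        if v not in visited:
            visited.add(v)
            tree.append((u, v))
        u = v
    return tree

def edge_laplacian(n, u, v):
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    return np.outer(b, b)

rng = np.random.default_rng(4)
n = 8
adj = [[v for v in range(n) if v != u] for u in range(n)]       # complete graph
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
L = sum(edge_laplacian(n, u, v) for u, v in edges)
Lpinv = np.linalg.pinv(L)
lev = {e: edge_laplacian(n, *e).ravel() @ Lpinv.ravel() for e in edges}   # p_e

t = 2000
avg = np.zeros((n, n))
for _ in range(t):
    T = aldous_broder_tree(adj, rng)
    avg += sum(edge_laplacian(n, u, v) / lev[tuple(sorted((u, v)))] for u, v in T) / t
print("||avg L_T - L_G|| =", np.linalg.norm(avg - L, 2))    # small: E[L_T] = L_G
```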

109 Random Spanning Trees
Concentration? $\boldsymbol{L}_T \approx_{0.5} \boldsymbol{L}_G$? NO. $\boldsymbol{L}_T \preceq \log n \cdot \boldsymbol{L}_G$, and $\frac{1}{t}\sum_{i=1}^t \boldsymbol{L}_{T_i} \approx_\varepsilon \boldsymbol{L}_G$ with $t = \varepsilon^{-2}\log^2 n$? YES, w.h.p. [Kyng-Song FOCS'18]. Come see my talk!

110 Sometimes You Get a Martingale for Free
Random variables $Z_1, \ldots, Z_k$. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is ``stable'' under small changes to $Z_1, \ldots, Z_k$: is $f(Z_1, \ldots, Z_k) \approx \mathbb{E}[f(Z_1, \ldots, Z_k)]$?

111 Doob Martingales
Random variables $Z_1, \ldots, Z_k$, NOT independent. Prove concentration for $f(Z_1, \ldots, Z_k)$ where $f$ is ``stable'' under small changes to $Z_1, \ldots, Z_k$. Pick an outcome $z_1, \ldots, z_k$ and reveal it one variable at a time: $\mathbb{E}[f(Z_1, \ldots, Z_k)] \to \mathbb{E}[f(Z_1, \ldots, Z_k) \mid Z_1 = z_1] \to \mathbb{E}[f(Z_1, \ldots, Z_k) \mid Z_1 = z_1, Z_2 = z_2] \to \cdots \to f(z_1, \ldots, z_k)$.

112 Doob Martingales
$\mathbb{E}[f(Z_1, \ldots, Z_k)] = \mathbb{E}_{z_1}\big[\mathbb{E}[f(Z_1, \ldots, Z_k) \mid Z_1 = z_1]\big]$. Define $Y_0 = \mathbb{E}[f(Z_1, \ldots, Z_k)]$, $Y_1 = \mathbb{E}[f(Z_1, \ldots, Z_k) \mid Z_1 = z_1]$, $Y_2 = \mathbb{E}[f(Z_1, \ldots, Z_k) \mid Z_1 = z_1, Z_2 = z_2]$, and so on. Then $\mathbb{E}[Y_{i+1} - Y_i \mid \text{prev. steps}] = 0$: a martingale, despite $Z_1, \ldots, Z_k$ NOT being independent.
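A tiny concrete Doob martingale (an illustrative example added here, not from the talk): draw $Z_1, \ldots, Z_k$ without replacement from a fixed pool, so they are NOT independent, take $f = \sum_i Z_i$, and reveal the outcome one coordinate at a time. The conditional expectations $Y_i$ have an exact formula, and the increments average to zero.

```python
import numpy as np

rng = np.random.default_rng(5)
pool = np.arange(1.0, 21.0)      # values 1..20
k = 5                            # draw k of them WITHOUT replacement (dependent!)

def doob_path(rng):
    """Y_i = E[ sum of the k draws | first i draws ], computed exactly."""
    z = rng.choice(pool, size=k, replace=False)
    Y = []
    remaining = list(pool)
    revealed_sum = 0.0
    for i in range(k + 1):
        # expected total = revealed part + (k - i) * mean of the unrevealed pool
        Y.append(revealed_sum + (k - i) * np.mean(remaining))
        if i < k:
            revealed_sum += z[i]
            remaining.remove(z[i])
    return np.array(Y)           # Y[0] = E[f], Y[k] = f(z)

paths = np.array([doob_path(rng) for _ in range(20000)])
print("mean increments:", np.round(np.diff(paths, axis=1).mean(axis=0), 3))  # ~0
print("Y_0 =", paths[0, 0], "(= k * mean(pool))")
```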

113 Concentration of Random Spanning Trees
Random variables $Z_1, \ldots, Z_{n-1}$: the positions of the $n-1$ edges of a random spanning tree. Take $f(Z_1, \ldots, Z_{n-1}) = \boldsymbol{L}_T$. Similar to [Peres-Pemantle '14].

114 Conditioning Changes Distributions?
(Figure: a graph and its spanning tree distribution over all trees.)

115 Conditioning Changes Distributions?
(Figure: a graph and its spanning tree distribution, all trees vs. the conditional distribution.)

116 Summary Matrix martingales: a natural fit for algorithmic analysis
Understanding the predictable quadratic variation is key. Some results using matrix martingales:
Cohen, Musco, Pachocki ’16 – online row sampling
Kyng, Sachdeva ’17 – approximate Gaussian elimination
Kyng, Peng, Pachocki, Sachdeva ’17 – semi-streaming graph sparsification
Cohen, Kelner, Kyng, Peebles, Peng, Rao, Sidford ’18 – solving directed Laplacian equations
Kyng, Song ’18 – matrix Chernoff bound for negatively dependent random variables

117 Thanks!

118 This is really the end

119 Older slides

