Generating Random Spanning Trees via Fast Matrix Multiplication
  Keyulu Xu, University of British Columbia
  Joint work with Nick Harvey
What are random spanning trees?
  Goal: Output a uniformly random spanning tree
  More precisely:
    – $T(G)$ = set of all spanning trees of graph $G$
    – Output a spanning tree $T'$ such that $T'$ equals each tree in $T(G)$ with probability $1/|T(G)|$
  Note: $|T(G)|$ can be exponentially large
Why random spanning trees?
  Fundamental probabilistic object [K’1847]
  Connections to trending topics in theoretical computer science: spectral graph theory, electrical network flows, and graph structures
  Many applications in theoretical computer science, computer networks, statistical physics
  Recreation! (Generating random maze puzzles)
Why random spanning trees?
  Applications in theoretical computer science:
    – Estimating the coefficients of the reliability polynomial [CDM’88]
    – Generating expander graphs [GRV’09]
    – Traveling salesman problem [AGMO’10] [OSS’11]
    – Submodular optimization [CVZ’10]
How to generate a random spanning tree (RST)?
  Laplacian-based algorithms [CMN’96]
  Matrix Tree theorem [Kirchhoff 1847]
  $\Pr[e \in \mathrm{RST}] = R_{\mathrm{eff}}(e)$
  For each edge $e$:
    – Compute $R_{\mathrm{eff}}(e)$ and add $e$ to $T$ with probability $R_{\mathrm{eff}}(e)$
    – Update $G$ by contracting $e$ if $e$ is in $T$, and deleting $e$ otherwise
  Running time $O(n^\omega)$
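The edge-by-edge procedure on this slide can be sketched naively as follows. This is only an illustration under stated assumptions, not the $O(n^\omega)$ algorithm of [CMN’96]: it recomputes an exact matrix inverse for every edge, and the names `inverse`, `effective_resistance` and `sample_spanning_tree_naive` are hypothetical. Contraction is modelled by merging vertex labels, and deletion by simply moving past the edge.

```python
import random
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # find a nonzero pivot
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def effective_resistance(k, edges, a, b):
    """R_eff(a, b) in a connected unit-weight multigraph on vertices 0..k-1."""
    shift = Fraction(1, k)
    M = [[shift] * k for _ in range(k)]      # L + J/k is invertible when connected
    for x, y in edges:
        M[x][x] += 1; M[y][y] += 1
        M[x][y] -= 1; M[y][x] -= 1
    N = inverse(M)                           # equals L^+ + J/k; the shift cancels below
    return N[a][a] - 2 * N[a][b] + N[b][b]

def sample_spanning_tree_naive(n, edges, rng):
    label = list(range(n))                   # label[v]: supernode containing v
    tree = []
    for i, (u, v) in enumerate(edges):
        a, b = label[u], label[v]
        if a == b:
            continue                         # e became a self-loop after contractions
        nodes = sorted(set(label))
        index = {s: j for j, s in enumerate(nodes)}
        # Current graph: undecided edges (including e) between distinct supernodes.
        current = [(index[label[x]], index[label[y]])
                   for x, y in edges[i:] if label[x] != label[y]]
        r = effective_resistance(len(nodes), current, index[a], index[b])
        if rng.random() < r:                 # bridges have r == 1 and are always kept
            tree.append((u, v))
            label = [a if s == b else s for s in label]   # contract e
        # otherwise e is deleted: it no longer appears in edges[i+1:]
    return tree

k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(sample_spanning_tree_naive(4, k4, random.Random(0)))
```

Exact rational arithmetic keeps the sketch free of rounding issues on small graphs; a practical implementation would use floating-point linear algebra instead.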
How to generate RST?
  Random walks [Broder’89, Aldous’90]
    – Start a random walk at some vertex $s$
    – Whenever visiting a new vertex $v$, add to $T$ the edge through which we reached $v$
  Running time = O(cover time) = $O(mn)$
  Can get O(mean hitting time), but still $O(mn)$ in the worst case [Wilson’96]
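The random-walk procedure described on this slide can be sketched as below. The function name `aldous_broder` and the adjacency-list input format (a dict from vertex to list of neighbours) are assumptions made for this illustration; the graph is assumed to be connected, otherwise the loop does not terminate.

```python
import random

def aldous_broder(adj, start, rng):
    """Walk randomly from `start`; keep the edge used to first enter each vertex."""
    visited = {start}
    tree = []
    cur = start
    while len(visited) < len(adj):
        nxt = rng.choice(adj[cur])        # uniform random neighbour
        if nxt not in visited:            # first visit: record the entering edge
            visited.add(nxt)
            tree.append((cur, nxt))
        cur = nxt
    return tree

# 5-cycle as a small connected example
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(aldous_broder(cycle, 0, random.Random(1)))
```

The number of loop iterations is the time the walk needs to cover the graph, which is where the O(cover time) bound on the slide comes from.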
How to generate RST?
  Approximate algorithms [KM’09, MST’15]
    – Nearly-linear time approximate Laplacian system solvers
    – Accelerate the random walk algorithms by identifying regions where the random walk will be slow
  Most efficient on sparse graphs
Our work
  A new Laplacian-based algorithm for generating random spanning trees with running time $O(n^\omega)$
  Advantages:
    – Conceptually simpler and cleaner
    – No need to introduce directed graphs
    – Uses fast matrix multiplication as a black box, thus avoiding the intricate details of LU-decomposition
    – Uses a simple recursion framework that could be applied to many graph algorithms
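Main idea
  $\Pr[e \in \mathrm{RST}] = R_{\mathrm{eff}}(e)$
  For $e = \{u, v\}$: $R_{\mathrm{eff}}(e) = (\chi_u - \chi_v)^T L^+ (\chi_u - \chi_v)$
  $\chi_u$ is the characteristic vector of vertex $u$
  $L = D - A$ (degree matrix minus adjacency matrix) is the Laplacian matrix of the graph $G$, and $L^+$ is its pseudo-inverse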
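As a small worked check of the effective-resistance formula, the sketch below computes $L^+$ exactly for a triangle and evaluates $R_{\mathrm{eff}}$ for one edge. The helper names are illustrative assumptions, not from the talk, and the pseudo-inverse is obtained through the identity $L^+ = (L + J/n)^{-1} - J/n$, which holds for connected graphs ($J$ is the all-ones matrix).

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # find a nonzero pivot
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def laplacian(n, edges):
    """L = D - A for an unweighted graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    return L

def laplacian_pinv(L):
    """Pseudo-inverse of a connected graph's Laplacian via the J/n shift."""
    n = len(L)
    shift = Fraction(1, n)
    inv = inverse([[x + shift for x in row] for row in L])
    return [[x - shift for x in row] for row in inv]

def effective_resistance(Lp, u, v):
    # (chi_u - chi_v)^T L^+ (chi_u - chi_v) expands to four entries of L^+.
    return Lp[u][u] - Lp[u][v] - Lp[v][u] + Lp[v][v]

triangle = [(0, 1), (1, 2), (0, 2)]
Lp = laplacian_pinv(laplacian(3, triangle))
print(effective_resistance(Lp, 0, 1))  # 2/3: each edge lies in 2 of the 3 spanning trees
```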
Main idea
  $\Pr[e \in \mathrm{RST}] = R_{\mathrm{eff}}(e)$
  For $e = \{u, v\}$: $R_{\mathrm{eff}}(e) = (\chi_u - \chi_v)^T L^+ (\chi_u - \chi_v)$
  For each edge $e$:
    – Compute $R_{\mathrm{eff}}(e)$ and add $e$ to $T$ with probability $R_{\mathrm{eff}}(e)$
    – Update $L^+$ by contracting $e$ if $e$ is in $T$, and deleting $e$ otherwise
  Faster way to update $L^+$?
    – Lazy updates
    – Visit the edges in a specific order
    – Recursion!
Main idea
  For $e = \{u, v\}$: $R_{\mathrm{eff}}(e) = (\chi_u - \chi_v)^T L^+ (\chi_u - \chi_v)$
  Observation: $R_{\mathrm{eff}}(e)$ depends on only 4 entries of $L^+$, which enables lazy updates!
  Need a formula for updating $L^+$ whenever we contract or delete edges:
    – Sherman-Morrison-Woodbury formula for inverse updates
    – Sherman-Morrison-like formula for the Laplacian pseudo-inverse
Pseudo-inverse update formulas
  Theorem 1 (Update formula for deletion): Let $G = (V, E)$ be a connected, undirected graph and $D \subseteq E$, and let $L_D$ denote the Laplacian of the subgraph $(V, D)$. If $G \setminus D$ contains at least one spanning tree, then
    $(L - L_D)^+ = L^+ - L^+ (L_D L^+ - I)^{-1} L_D L^+$
  Update running time is $O(|V|^\omega)$. Can we do better?
  No need to update/clean up the entire matrix
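The deletion formula of Theorem 1 can be checked on a small example. The sketch below is only a sanity check under stated assumptions: it uses exact rational arithmetic, hypothetical helper names, and the identity $L^+ = (L + J/n)^{-1} - J/n$ (valid for connected graphs) to obtain the pseudo-inverses it compares.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # find a nonzero pivot
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def laplacian(n, edges):
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    return L

def laplacian_pinv(L):
    n = len(L)
    shift = Fraction(1, n)
    inv = inverse([[x + shift for x in row] for row in L])
    return [[x - shift for x in row] for row in inv]

def deletion_update(Lp, LD):
    """Return L^+ - L^+ (L_D L^+ - I)^{-1} L_D L^+."""
    n = len(Lp)
    A = matmul(LD, Lp)
    A_minus_I = [[A[i][j] - (i == j) for j in range(n)] for i in range(n)]
    corr = matmul(matmul(Lp, inverse(A_minus_I)), A)
    return [[Lp[i][j] - corr[i][j] for j in range(n)] for i in range(n)]

# K4 with D = {(0,1), (2,3)} deleted; the remaining 4-cycle is still connected.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
D = [(0, 1), (2, 3)]
rest = [e for e in k4 if e not in D]
updated = deletion_update(laplacian_pinv(laplacian(4, k4)), laplacian(4, D))
print(updated == laplacian_pinv(laplacian(4, rest)))
```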
Pseudo-inverse update formulas
  Corollary 1 (Improved update formula): Let $G = (V, E)$ be a connected, undirected graph and $S \subseteq V$. Define $E[S]$ as the set of edges whose vertices are both in $S$. Suppose $D \subseteq E[S]$ and $G \setminus D$ contains at least one spanning tree. Then
    $(L - L_D)^+_{S,S} = L^+_{S,S} - L^+_{S,S} \left( (L_D)_{S,S} L^+_{S,S} - I \right)^{-1} (L_D)_{S,S} L^+_{S,S}$
  Runtime of each update is $O(|S|^\omega)$!
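The point of Corollary 1 is that the $S,S$ block of the updated pseudo-inverse is computable from the $S,S$ blocks alone. A minimal check on a 5-vertex example, with hypothetical helper names and the same exact-arithmetic assumptions as a textbook Gauss-Jordan inverse:

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # find a nonzero pivot
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def laplacian(n, edges):
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    return L

def laplacian_pinv(L):
    n = len(L)
    shift = Fraction(1, n)                     # valid for connected graphs
    inv = inverse([[x + shift for x in row] for row in L])
    return [[x - shift for x in row] for row in inv]

def block(M, S):
    """Submatrix M[S, S]."""
    return [[M[i][j] for j in S] for i in S]

def block_update(N_SS, LD_SS):
    """N_SS - N_SS (LD_SS N_SS - I)^{-1} LD_SS N_SS, using |S| x |S| matrices only."""
    k = len(N_SS)
    A = matmul(LD_SS, N_SS)
    A_minus_I = [[A[i][j] - (i == j) for j in range(k)] for i in range(k)]
    corr = matmul(matmul(N_SS, inverse(A_minus_I)), A)
    return [[N_SS[i][j] - corr[i][j] for j in range(k)] for i in range(k)]

edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
S = [0, 1, 2]
D = [(0, 1)]                                   # D lies inside E[S]
rest = [e for e in edges if e not in D]
small = block_update(block(laplacian_pinv(laplacian(5, edges)), S),
                     block(laplacian(5, D), S))
print(small == block(laplacian_pinv(laplacian(5, rest)), S))
```

Only 3×3 matrices are multiplied and inverted in `block_update`, even though the graph has 5 vertices.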
Pseudo-inverse update formulas
  Goal: contract edges while avoiding the cumbersome updates to the Laplacian that result from decreasing the number of vertices
  Our approach: increase the weight of the edge to a large value $k$. This is equivalent to contracting the edge in the limit as $k \to \infty$.
  Can obtain a similar update formula for contraction!
  Note: The object of interest to us is $L^+$, and $L^+$ does have a finite limit as $k \to \infty$.
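The weight-to-infinity view of contraction can be illustrated on a triangle: giving edge {0,1} weight $k$ and letting $k$ grow drives the effective resistance between vertices 0 and 2 toward its value in the contracted graph (two parallel unit edges). This is a sketch with hypothetical helper names; it does not implement the contraction update formula from the talk.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # find a nonzero pivot
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def effective_resistance(n, weighted_edges, a, b):
    """R_eff(a, b) in a connected weighted graph; weights act as conductances."""
    shift = Fraction(1, n)
    M = [[shift] * n for _ in range(n)]       # L + J/n, invertible when connected
    for u, v, w in weighted_edges:
        M[u][u] += w; M[v][v] += w
        M[u][v] -= w; M[v][u] -= w
    N = inverse(M)                            # the J/n shift cancels in the difference
    return N[a][a] - 2 * N[a][b] + N[b][b]

def reff_with_heavy_edge(k):
    # Triangle 0-1-2 where edge {0,1} has weight k.
    return effective_resistance(3, [(0, 1, k), (1, 2, 1), (0, 2, 1)], 0, 2)

# Contracted graph: vertices {0,1} merged into 0, vertex 2 renamed 1.
contracted = effective_resistance(2, [(0, 1, 1), (0, 1, 1)], 0, 1)

for k in (1, 10, 1000, 10**6):
    print(k, float(reff_with_heavy_edge(k)), "limit:", float(contracted))
```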
Recursive algorithm
  Our pseudo-inverse update formulas indicate that if we have only made sampling decisions for edges in a submatrix, then we can update the submatrix of $L^+$ at a small cost, using values from that submatrix.
  Recursively sample edges in submatrices!
  Apply update formulas to clean up the matrix
Recursive Decomposition of Graph
  Define:
    – $E[S] = \{ \{u,v\} : u, v \in S \text{ and } \{u,v\} \in E \}$ (“within”)
    – $E[S_1, S_2] = \{ \{u,v\} : u \in S_1,\ v \in S_2, \text{ and } \{u,v\} \in E \}$ (“crossing”)
  Claim: Let $S = S_1 \cup S_2$. Then $E[S] = E[S_1] \cup E[S_2] \cup E[S_1, S_2]$
  [Figure: $S$ split into $S_1$ and $S_2$, showing $E[S_1]$, $E[S_2]$, and the crossing edges $E[S_1, S_2]$]
Recursive Decomposition of Graph
  Define:
    – $E[S] = \{ \{u,v\} : u, v \in S \text{ and } \{u,v\} \in E \}$ (“within”)
    – $E[R, S] = \{ \{u,v\} : u \in R,\ v \in S, \text{ and } \{u,v\} \in E \}$ (“crossing”)
  Claim: Let $R = R_1 \cup R_2$ and $S = S_1 \cup S_2$. Then $E[R, S] = E[R_1, S_1] \cup E[R_1, S_2] \cup E[R_2, S_1] \cup E[R_2, S_2]$
  [Figure: vertex sets $R$ and $S$]
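Both decomposition claims can be checked directly on a small graph. In this sketch edges are stored as two-element frozensets, the vertex sets being split are assumed to be disjoint, and the function names `within` and `crossing` are chosen here for illustration.

```python
from itertools import combinations

def within(E, S):
    """E[S]: edges with both endpoints in S."""
    return {e for e in E if e <= S}

def crossing(E, R, S):
    """E[R, S] for disjoint R and S: one endpoint in R, the other in S."""
    return {e for e in E if len(e & R) == 1 and len(e & S) == 1}

# Complete graph on 6 vertices as a test case
E = {frozenset(p) for p in combinations(range(6), 2)}

S1, S2 = frozenset({0, 1, 2}), frozenset({3, 4, 5})
S = S1 | S2
claim_within = within(E, S) == within(E, S1) | within(E, S2) | crossing(E, S1, S2)

R1, R2 = frozenset({0}), frozenset({1, 2})      # R = S1 split in two
T1, T2 = frozenset({3, 4}), frozenset({5})      # S = S2 split in two
parts = [crossing(E, Ri, Tj) for Ri in (R1, R2) for Tj in (T1, T2)]
claim_crossing = crossing(E, R1 | R2, T1 | T2) == set().union(*parts)

print(claim_within, claim_crossing)
```

Because the pieces are pairwise disjoint, every edge is handled by exactly one recursive call, which is what lets the algorithm make one sampling decision per edge.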
Recursively sample edges
  GeneratingRST(G = (V, E)):
    Construct L and N = L^+
    SampleWithin(V)
  SampleWithin(S):
    If |S| = 1 then Return
    Partition S = S_1 ∪ S_2
    For i ∈ {1,2}:
      SampleWithin(S_i)
      Update N_{S,S}
    SampleCrossing(S_1, S_2)
  SampleCrossing(R, S):
    If |R| = |S| = 1:
      Try to sample R-S edge; Return
    Partition R = R_1 ∪ R_2 and S = S_1 ∪ S_2
    For i ∈ {1,2} and j ∈ {1,2}:
      SampleCrossing(R_i, S_j)
      Update N_{R,S}
  [Figure: V split into S_1 and S_2; E[S_1] is handled by SampleWithin(S_1), E[S_2] by SampleWithin(S_2), and E[S_1, S_2] by SampleCrossing(S_1, S_2)]
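The order in which this recursion reaches vertex pairs can be traced with a small sketch. It deliberately omits the matrix $N$ and all updates, so it only illustrates the traversal, not the sampling itself. The `visit` callback, the list-halving partition, and the handling of sets of unequal size (only sets with more than one vertex are split) are assumptions made for this illustration.

```python
def halves(X):
    """Split a list in two, or keep it whole if it has a single element."""
    return [X] if len(X) == 1 else [X[:len(X) // 2], X[len(X) // 2:]]

def sample_within(S, visit):
    if len(S) == 1:
        return
    S1, S2 = halves(S)
    for Si in (S1, S2):
        sample_within(Si, visit)
        # The slides update N[S, S] here; omitted in this sketch.
    sample_crossing(S1, S2, visit)

def sample_crossing(R, S, visit):
    if len(R) == 1 and len(S) == 1:
        visit(R[0], S[0])          # stands in for "try to sample the R-S edge"
        return
    for Ri in halves(R):
        for Sj in halves(S):
            sample_crossing(Ri, Sj, visit)
            # The slides update N[R, S] here; omitted in this sketch.

order = []
sample_within([0, 1, 2, 3], lambda u, v: order.append((u, v)))
print(order)  # [(0, 1), (2, 3), (0, 2), (0, 3), (1, 2), (1, 3)]
```

Every unordered pair of vertices is reached exactly once, so each potential edge gets exactly one sampling decision.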
Runtime Analysis
  By the Sherman-Morrison-Woodbury formula, each update on vertex sets of size $n$ takes $O(n^\omega)$ time
  SampleCrossing(R, S) has runtime $g(n)$, where $n = |R| = |S|$:
    $g(n) = 4 \cdot g(n/2) + O(n^\omega) \;\Rightarrow\; g(n) = O(n^\omega)$ (when $\omega > 2$)
  SampleWithin(S) has runtime $f(n)$, where $n = |S|$:
    $f(n) = 2 \cdot f(n/2) + g(n) + O(n^\omega) \;\Rightarrow\; f(n) = O(n^\omega)$
  Total runtime of the algorithm is $O(n^\omega)$
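The two recurrences can be evaluated numerically to see that both stay within a constant factor of $n^\omega$. The sketch below fixes $\omega = \log_2 7$ (Strassen's exponent) purely as an illustrative assumption, takes every hidden constant to be 1, and uses the base cases $g(1) = 1$ and $f(1) = 0$, which are also assumptions.

```python
import math

OMEGA = math.log2(7)   # illustrative choice of matrix multiplication exponent

def g(n):
    """Cost model for SampleCrossing on |R| = |S| = n (n a power of two)."""
    return 1.0 if n == 1 else 4 * g(n // 2) + n ** OMEGA

def f(n):
    """Cost model for SampleWithin on |S| = n (n a power of two)."""
    return 0.0 if n == 1 else 2 * f(n // 2) + g(n) + n ** OMEGA

for k in range(1, 11):
    n = 2 ** k
    print(n, round(g(n) / n ** OMEGA, 4), round(f(n) / n ** OMEGA, 4))
```

With these choices the ratio $g(n)/n^\omega$ is a partial sum of a geometric series with ratio $4/2^\omega = 4/7$, so it stays below $7/3$; the same argument with ratio $2/7$ bounds $f(n)/n^\omega$ by $14/3$.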
Conclusion
  Generating random spanning trees is a fundamental and important problem in TCS
  We provided a simpler algorithm that also achieves $O(n^\omega)$ running time
  Open question: Can the generation of random spanning trees for dense and sparse graphs be bridged?