Bucket Renormalization for Approximate Inference Sungsoo Ahn1 Joint work with Michael Chertkov2, Adrian Weller3 and Jinwoo Shin1 1Korea Advanced Institute of Science and Technology (KAIST) 2Los Alamos National Laboratory (LANL) 3University of Cambridge June 7th, 2018
Goal: approximate inference in GMs
A graphical model (GM) is a family of distributions, factorized according to a graph, e.g., the Ising model [Ising, 1925] for distributions of atomic spins.
This talk is about undirected GMs with discrete variables.
Protein structure (A) being modeled by a graphical model (B) [Kamisetty et al., 2008]
Goal: approximate inference in GMs
A graphical model (GM) is a family of distributions, factorized according to a graph.
A joint distribution over n binary variables requires O(2^n) space to specify; the GM factorization allows it to be stored in space exponential only in the individual factor sizes.
The partition function Z = Σ_x Π_α f_α(x_α) is essential for inference & normalization.
However, Z is NP-hard to compute, so we need approximations, e.g., MCMC, variational inference and approximate variable elimination.
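As a toy illustration of why exact computation of Z blows up, a minimal brute-force sketch for a tiny Ising chain (the interaction strength J, the graph, and all names are illustrative assumptions, not the talk's code):

```python
import itertools
import math

# Toy Ising chain on 3 binary (+1/-1) spins; J is an assumed interaction strength.
J = 0.5
edges = [(0, 1), (1, 2)]

def weight(x):
    # Product of pairwise factors exp(J * x_i * x_j), one per edge.
    return math.exp(sum(J * x[i] * x[j] for i, j in edges))

# Brute-force partition function: sum over all 2^3 spin configurations.
# This enumeration is exponential in the number of variables, which is
# exactly why approximate inference methods are needed.
Z = sum(weight(x) for x in itertools.product([-1, 1], repeat=3))
```

For a free-boundary chain this reproduces the closed form Z = 2^n cosh(J)^(n-1), which gives a quick sanity check of the enumeration.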
Approximate variable elimination
Sequentially sums out variables (approximately) one by one, e.g., mini bucket elimination for upper-bounding Z.
- Terminates in a fixed number of iterations.
- Compared to other families, much faster but less accurate.
Bucket Renormalization
A new approximate variable elimination scheme with superior performance:
- a variant of mini bucket elimination, but without the bounding property;
- can also be seen as a low-rank approximation of GMs.
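The mini bucket upper bound mentioned above comes from splitting a bucket and bounding one piece of the sum of products by its max. A toy sketch with random illustrative factors (not the talk's implementation):

```python
import numpy as np

# Toy demonstration of the mini-bucket bound: for nonnegative factors f, g
# sharing a variable x, we have sum_x f(x)*g(x) <= (sum_x f(x)) * (max_x g(x)),
# so eliminating x with a sum/max split can only overestimate, upper-bounding Z.
rng = np.random.default_rng(1)
f = rng.random(2)  # factor f(x) for a binary variable x
g = rng.random(2)  # factor g(x)

exact = float(np.sum(f * g))          # exact elimination of x
upper = float(np.sum(f) * np.max(g))  # mini-bucket split: sum out f, max out g
```

The gap between `exact` and `upper` is the price of splitting, which bucket renormalization trades for a compensation step instead of a bound.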
Summary for rest of the talk Variable (bucket) elimination Mini bucket renormalization (MBR) Global bucket renormalization (GBR)
Variable (bucket) elimination for exact Z
For each variable in the GM:
- Collect the adjacent factors, i.e., the bucket.
- Generate a new factor by marginalizing the bucket over the variable.
Requires computation & memory exponential in the bucket size.
Key idea: replacing the exact marginalization with an approximation.
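The elimination loop above can be sketched on a chain GM, where every bucket stays small and Z comes out exactly (numpy-based; the constant J and all names are illustrative assumptions):

```python
import numpy as np

# Chain of n binary (+1/-1) spins with pairwise factors exp(J * s_i * s_j).
J, n = 0.5, 4
s = np.array([-1.0, 1.0])
pair = np.exp(J * np.outer(s, s))  # 2x2 pairwise factor table

# Eliminate variables left to right. The "bucket" of the current variable
# holds the incoming message and one pairwise factor; summing the variable
# out produces a new factor (message) over its right neighbor.
message = np.ones(2)
for _ in range(n - 1):
    message = pair.T @ message  # sum_x pair[x, y] * message[x]
Z = float(message.sum())
```

On a chain the bucket never exceeds two variables, so this is exact and cheap; on dense graphs the generated factors grow, which is what mini buckets address.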
Mini bucket renormalization
Idea 1. Split variables, then add compensating factors.
- The number of splits is decided by the available resources.
- Choosing a good compensating factor is important.
Idea 2. Compare with the optimal compensation.
Algorithm description
Given a variable to marginalize:
- Split the variable into replicas and generate mini buckets.
- Add compensating factors for each of the split variables.
- Generate new factors by summing out each mini bucket.
Mini bucket renormalization
Idea 2. Compare with the optimal compensation: choose compensating factors that minimize the L2-difference from the optimal compensation.
The resulting optimization is equivalent to rank-1 truncated SVD.
Connection to rank-1 truncated SVD
Eventually, we are minimizing the error of a rank-1 projection: among all rank-1 matrices M', find the one minimizing the Frobenius norm of M - M'. By the Eckart-Young theorem, the minimizer is the top singular component of M.
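A minimal numpy sketch of this fact: the closest rank-1 matrix in Frobenius (L2) norm is the top singular component. Here M stands in for a reshaped mini-bucket factor table; everything is illustrative:

```python
import numpy as np

# Best rank-1 approximation via truncated SVD (Eckart-Young).
rng = np.random.default_rng(0)
M = rng.random((4, 4))  # stand-in for a factor over (split variable) x (rest)

U, sing, Vt = np.linalg.svd(M)
M1 = sing[0] * np.outer(U[:, 0], Vt[0])  # top singular component

# The residual error equals the norm of the discarded singular values.
err = float(np.linalg.norm(M - M1))
```

In MBR, the two vectors of this rank-1 factorization play the role of the pair of compensating factors attached to the split variable.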
Algorithm description
Given compensating factors to choose:
- Sum out the variable over each mini bucket.
- Compare with the optimal compensation, solving the rank-1 truncated SVD.
Illustration of mini bucket renormalization: MBR with elimination order 1, 2, 3, 4, 5 under a given memory budget (step-by-step figure in the slides).
Why MBR is called a renormalization
Splitting & compensation without variable elimination results in a renormalized GM, which can be interpreted as a tractable approximation to the original GM.
Global bucket renormalization (GBR)
Recall from mini bucket renormalization: compensating factors are chosen to minimize a local L2-difference.
GBR aims to find a better choice of compensation at the cost of additional computation.
Global bucket renormalization (GBR)
Idea: increase the scope of the comparison, minimizing the L2-difference globally rather than per mini bucket.
However, the exact global comparison is as hard as computing the partition function itself.
As a heuristic, we perform the comparison in the renormalized GM.
Experiments
We measure the log-Z approximation ratio of our algorithms:
- mini bucket renormalization (MBR)
- global bucket renormalization (GBR)
and compare with 4 existing algorithms:
- mini bucket elimination (MBE)
- weighted mini bucket elimination (WMBE)
- belief propagation (BP)
- mean field approximation (MF)
Ising GM experiments
Comparison over a varying interaction parameter (or temperature).
Complete graph with 15 variables: GBR > MBR > MF > WMBE ≈ MBE > BP.
Grid graph with 15x15 variables: GBR > MBR > BP > MF > WMBE > MBE.
UAI 2014 competition experiments
Numbers in brackets denote the number of cases where each algorithm dominates the others.
Promedus and Linkage datasets: GBR ≈ MBR > BP > WMBE > MBE.
Conclusion
We proposed bucket renormalization, based on splitting & compensation.
Inspired by tensor network renormalization (TNR) in statistical physics and by tensor decomposition algorithms.
arXiv version available at: https://arxiv.org/abs/1803.05104
Thank you for listening!
Ising GM experiments
Comparison over varying orders of the available memory budget (ibound).
Complete graph with 15 variables; grid graph with 15x15 variables.