Inference for Learning: Belief Propagation
So far: exact methods for submodular energies; approximations for non-submodular energies; move-making (N_Variables >> N_Labels).
Motivating Application. Figure: image and desired output. Parts: head, torso, uleg1, lleg1, uleg2, lleg2, uleg3, lleg3, uleg4, lleg4. Only 10 variables! Thousands of labels! Millions of pairwise potentials!
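Belief Propagation (Pearl, 1988). Recall: potentials $\theta_{a;i}$ and $\theta_{ab;ij}$; labeling $f : V \to L$; energy $E(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$; MAP estimation $f^* = \arg\min_f E(f;\theta)$. Belief propagation is an algorithm for solving MAP estimation. It is exact for tree-structured models.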
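As a point of reference for the energy and MAP definitions above, here is a minimal sketch that evaluates $E(f;\theta)$ and finds the minimizer by exhaustive search. The function names, the dictionary layout for pairwise potentials and the numeric values are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def energy(f, unary, pairwise):
    """E(f; theta) = sum_a theta_{a;f(a)} + sum_{(a,b)} theta_{ab;f(a)f(b)}."""
    total = sum(unary[a][f[a]] for a in range(len(f)))
    for (a, b), table in pairwise.items():
        total += table[f[a]][f[b]]
    return total

def brute_force_map(unary, pairwise):
    """Enumerate every labeling; only feasible for very small models."""
    label_ranges = [range(len(u)) for u in unary]
    return min(product(*label_ranges), key=lambda f: energy(f, unary, pairwise))

# Hypothetical potentials for two variables with two labels each.
unary = [[5, 2], [2, 4]]
pairwise = {(0, 1): [[0, 3], [3, 0]]}

f_star = brute_force_map(unary, pairwise)
e_star = energy(f_star, unary, pairwise)
```

Exhaustive search costs a number of evaluations exponential in the number of variables, which is the cost that message passing avoids on trees.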
Belief Propagation. Message $M_{ab;i}$: $V_a$'s opinion on $V_b$ taking label $i$. $V_b$ gathers information from $V_a$ and computes the belief $B_{b;i}$.
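Two Variables. $M_{ab;i} = \min_j \left(\theta_{a;j} + \theta_{ab;ji}\right)$. For label 0: $M_{ab;0} = \min\{\theta_{a;0} + \theta_{ab;00},\ \theta_{a;1} + \theta_{ab;10}\}$.
Two Variables. For label 1: $M_{ab;1} = \min\{\theta_{a;0} + \theta_{ab;01},\ \theta_{a;1} + \theta_{ab;11}\}$. In the example the minimum is attained at $f(a) = 1$.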
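The message update $M_{ab;i} = \min_j (\theta_{a;j} + \theta_{ab;ji})$ can be sketched as follows. The function name and the numeric potentials are assumptions, chosen so that both minima are attained at $f(a) = 1$ as in the slide annotation. The sketch also records the minimizing label of $V_a$ for each entry.

```python
def message(theta_a, theta_ab):
    """Return (M_ab, argmins) with M_ab[i] = min_j theta_a[j] + theta_ab[j][i]."""
    values, argmins = [], []
    for i in range(len(theta_ab[0])):          # label i of the receiver V_b
        costs = [theta_a[j] + theta_ab[j][i]   # label j of the sender V_a
                 for j in range(len(theta_a))]
        best_j = min(range(len(costs)), key=costs.__getitem__)
        values.append(costs[best_j])
        argmins.append(best_j)
    return values, argmins

# Hypothetical potentials (not recoverable from the slide text).
theta_a = [5, 2]
theta_ab = [[0, 1], [1, 0]]   # theta_ab[j][i]: V_a takes j, V_b takes i

M_ab, best_a = message(theta_a, theta_ab)
```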
Two Variables. $B_{b;i} = \theta_{b;i} + \sum_a M_{ab;i}$. Compare $B_{b;0} = \theta_{b;0} + M_{ab;0}$ with $B_{b;1} = \theta_{b;1} + M_{ab;1}$ and set $f^*(b) = \arg\min_i B_{b;i}$. The label of $V_a$ is then the one that attained the minimum in $M_{ab;f^*(b)}$ (here $f(a) = 1$).
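A minimal sketch of the belief computation $B_{b;i} = \theta_{b;i} + \sum_a M_{ab;i}$ and the resulting label choice. The helper name and all numbers are assumptions for illustration.

```python
def belief(theta_b, incoming):
    """B_b[i] = theta_b[i] + sum over incoming messages M[i]."""
    return [theta_b[i] + sum(m[i] for m in incoming)
            for i in range(len(theta_b))]

# Hypothetical values: one incoming message from V_a.
theta_b = [2, 4]
M_ab = [3, 2]

B_b = belief(theta_b, [M_ab])
f_b = min(range(len(B_b)), key=B_b.__getitem__)   # argmin_i B_b[i]
```

With two variables the smallest belief equals the minimum energy of the whole model, because $V_b$ has gathered everything $V_a$ has to say.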
Three Variables. Chain $V_a$, $V_b$, $V_c$ with labels $l_0$ and $l_1$. Pass the message from "a" to "b" as before (both entries attain their minimum at $f(a) = 1$), then pass the message from "b" to "c".
Three Variables. $M_{bc;i} = \min_j \left(\theta_{b;j} + \theta_{bc;ji} + \sum_{n \neq c} M_{nb;j}\right)$, where $n$ ranges over the neighbors of $V_b$ other than $V_c$. For label 0: $\theta_{b;0} + \theta_{bc;00} + M_{ab;0} = 6$ and $\theta_{b;1} + \theta_{bc;10} + M_{ab;1} = 8$, so $M_{bc;0} = 6$, attained at $f(b) = 0$.
Three Variables. For label 1: $\theta_{b;0} + \theta_{bc;01} + M_{ab;0} = 8$ and $\theta_{b;1} + \theta_{bc;11} + M_{ab;1} = 6$, so $M_{bc;1} = 6$, attained at $f(b) = 1$.
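The update for $M_{bc;i}$ differs from the two-variable case only in that the sender first adds the messages it has received from its other neighbors. The sketch below uses hypothetical potentials, chosen solely so that the four sums come out as 6, 8, 8 and 6 as on the slides. The individual values of $\theta_b$, $\theta_{bc}$ and $M_{ab}$ are assumptions.

```python
def message_with_incoming(theta_b, theta_bc, incoming):
    """M_bc[i] = min_j theta_b[j] + theta_bc[j][i] + sum of incoming M_nb[j]."""
    # Aggregate the sender's unary potential with messages from n != c.
    agg = [theta_b[j] + sum(m[j] for m in incoming)
           for j in range(len(theta_b))]
    values, argmins = [], []
    for i in range(len(theta_bc[0])):
        costs = [agg[j] + theta_bc[j][i] for j in range(len(agg))]
        best_j = min(range(len(costs)), key=costs.__getitem__)
        values.append(costs[best_j])
        argmins.append(best_j)
    return values, argmins

# Hypothetical values reproducing the sums 6, 8 (label 0) and 8, 6 (label 1).
M_ab = [3, 2]
theta_b = [2, 3]
theta_bc = [[1, 3], [3, 1]]   # theta_bc[j][i]: V_b takes j, V_c takes i

M_bc, best_b = message_with_incoming(theta_b, theta_bc, [M_ab])
```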
Three Variables. $B_{c;i} = \theta_{c;i} + \sum_b M_{bc;i}$. Compare $B_{c;0} = \theta_{c;0} + M_{bc;0}$ with $B_{c;1} = \theta_{c;1} + M_{bc;1}$ and set $f^*(c) = \arg\min_i B_{c;i}$. The labels of $V_b$ and $V_a$ are read back from the stored minimizers: $f(b) = 0$ for $M_{bc;0}$, $f(b) = 1$ for $M_{bc;1}$, and $f(a) = 1$ for both entries of $M_{ab}$.
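Putting the forward messages, the final belief and the read-back of minimizers together gives min-sum inference on a chain. The following is a sketch under assumed potentials (none of the numbers are from the slides), and `chain_map` is a hypothetical helper name.

```python
def chain_map(unary, pairwise):
    """Min-sum on a chain; pairwise[k][j][i] couples variable k (label j)
    with variable k + 1 (label i). Returns (labels, minimum energy)."""
    msg = [0] * len(unary[0])      # no message arrives at the first variable
    back = []                      # back[k][i]: best label of k given k+1 = i
    for k in range(len(unary) - 1):
        agg = [unary[k][j] + msg[j] for j in range(len(unary[k]))]
        new_msg, arg = [], []
        for i in range(len(unary[k + 1])):
            costs = [agg[j] + pairwise[k][j][i] for j in range(len(agg))]
            best_j = min(range(len(costs)), key=costs.__getitem__)
            new_msg.append(costs[best_j])
            arg.append(best_j)
        msg = new_msg
        back.append(arg)
    beliefs = [unary[-1][i] + msg[i] for i in range(len(unary[-1]))]
    last = min(range(len(beliefs)), key=beliefs.__getitem__)
    labels = [last]
    for arg in reversed(back):     # read back the stored minimizers
        labels.append(arg[labels[-1]])
    labels.reverse()
    return labels, beliefs[last]

# Hypothetical potentials for the chain a - b - c.
unary = [[5, 2], [2, 3], [1, 4]]
pairwise = [[[0, 1], [1, 0]], [[1, 3], [3, 1]]]

labels, value = chain_map(unary, pairwise)
```

The cost is linear in the number of variables and quadratic in the number of labels per edge.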
Tree-structured Models. Message passing over the parts head, torso, uleg1, lleg1, uleg2, lleg2, uleg3, lleg3, uleg4, lleg4.
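Message passing on a tree can be sketched as one sweep from the leaves to a root, followed by a sweep back that reads off the labels. The tree layout below (torso as root, head and upper legs attached to the torso, each lower leg attached to its upper leg) is an assumption, since the slides list the parts but not the edges. The potentials are random placeholders.

```python
import random
from itertools import product

def tree_map(unary, parent, pairwise):
    """Min-sum on a rooted tree. Assumes node 0 is the root and parent[v] < v.
    pairwise[v][j][i]: cost of node v taking label j and its parent label i."""
    n = len(unary)
    agg = [list(u) for u in unary]        # unary plus messages from children
    back = [None] * n
    for v in range(n - 1, 0, -1):         # children are processed before parents
        p, arg = parent[v], []
        for i in range(len(unary[p])):
            costs = [agg[v][j] + pairwise[v][j][i] for j in range(len(agg[v]))]
            best_j = min(range(len(costs)), key=costs.__getitem__)
            agg[p][i] += costs[best_j]    # message from v to its parent
            arg.append(best_j)
        back[v] = arg
    labels = [0] * n
    labels[0] = min(range(len(agg[0])), key=agg[0].__getitem__)
    for v in range(1, n):                 # parents are labeled before children
        labels[v] = back[v][labels[parent[v]]]
    return labels, agg[0][labels[0]]

def tree_energy(f, unary, parent, pairwise):
    total = sum(unary[v][f[v]] for v in range(len(f)))
    return total + sum(pairwise[v][f[v]][f[parent[v]]] for v in range(1, len(f)))

# Assumed tree structure and random placeholder potentials.
parts = ["torso", "head", "uleg1", "uleg2", "uleg3", "uleg4",
         "lleg1", "lleg2", "lleg3", "lleg4"]
parent = [None, 0, 0, 0, 0, 0, 2, 3, 4, 5]
rng = random.Random(0)
n_labels = 3
unary = [[rng.randint(0, 9) for _ in range(n_labels)] for _ in parts]
pairwise = [None] + [[[rng.randint(0, 9) for _ in range(n_labels)]
                      for _ in range(n_labels)] for _ in parts[1:]]

labels, value = tree_map(unary, parent, pairwise)
brute = min(tree_energy(f, unary, parent, pairwise)
            for f in product(range(n_labels), repeat=len(parts)))
```

The brute-force check is only there to confirm the result on this small instance. With thousands of labels per part it would be infeasible, while the message-passing cost stays proportional to the number of edges times the squared number of labels.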
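Loopy Graphs. Variables $V_a$, $V_b$, $V_c$, $V_d$. Messages travel around loops, so the same information is counted more than once (overcounting).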
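On a graph with loops the same updates can still be iterated, but the result is only an approximation. The sketch below runs a fixed number of synchronous min-sum iterations on an assumed 4-cycle with hypothetical potentials. Subtracting the minimum from each message is an added normalization to keep values bounded, not something stated on the slides.

```python
from itertools import product

def pair_cost(edges, a, j, b, i):
    """Cost of f(a) = j and f(b) = i, whichever way the edge is stored."""
    if (a, b) in edges:
        return edges[(a, b)][j][i]
    return edges[(b, a)][i][j]

def energy(f, unary, edges):
    return (sum(unary[a][f[a]] for a in range(len(f)))
            + sum(t[f[a]][f[b]] for (a, b), t in edges.items()))

def loopy_min_sum(unary, edges, n_iters=20):
    nbrs = {v: [] for v in range(len(unary))}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    msgs = {(a, b): [0] * len(unary[b]) for a in nbrs for b in nbrs[a]}
    for _ in range(n_iters):
        new = {}
        for (a, b) in msgs:
            # Sender a aggregates messages from all neighbors except b.
            agg = [unary[a][j] + sum(msgs[(k, a)][j] for k in nbrs[a] if k != b)
                   for j in range(len(unary[a]))]
            m = [min(agg[j] + pair_cost(edges, a, j, b, i)
                     for j in range(len(agg)))
                 for i in range(len(unary[b]))]
            low = min(m)
            new[(a, b)] = [v - low for v in m]   # normalization (assumption)
        msgs = new
    beliefs = [[unary[b][i] + sum(msgs[(a, b)][i] for a in nbrs[b])
                for i in range(len(unary[b]))] for b in range(len(unary))]
    labels = [min(range(len(B)), key=B.__getitem__) for B in beliefs]
    return labels, beliefs

# Hypothetical 4-cycle a - b - c - d - a with two labels per variable.
unary = [[1, 3], [2, 2], [4, 1], [0, 2]]
attract = [[0, 2], [2, 0]]
edges = {(0, 1): attract, (1, 2): attract, (2, 3): attract, (0, 3): attract}

labels, beliefs = loopy_min_sum(unary, edges)
best = min(energy(f, unary, edges) for f in product(range(2), repeat=4))
```

The labeling returned this way is never better than the exact minimum `best`, and on loopy graphs there is no guarantee that it matches it or that the messages settle.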
Summary of BP. Exact for trees; approximate MAP for general cases; convergence is not guaranteed. $M_{bc;i} = \min_j \left(\theta_{b;j} + \theta_{bc;ji} + \sum_{n \neq c} M_{nb;j}\right)$; $B_{c;i} = \theta_{c;i} + \sum_b M_{bc;i}$.
Inference for Learning: Linear Programming Relaxation
Linear Integer Programming. $\min_x g_0^\top x$ (linear function) s.t. $g_i^\top x \le 0$, $h_i^\top x = 0$ (linear constraints), where $x$ is a vector of integers, for example $x \in \{0,1\}^N$. Hard to solve!
Linear Programming. $\min_x g_0^\top x$ (linear function) s.t. $g_i^\top x \le 0$, $h_i^\top x = 0$ (linear constraints), where $x$ is a vector of reals, for example $x \in [0,1]^N$ (relaxation). Easy to solve!
Roadmap. (1) Express MAP as an integer program. (2) Relax to a linear program and solve. (3) Round the fractional solution to integers.
Integer Programming Formulation. Variables $V_1$, $V_2$ with labels '0' and '1'. Unary cost vector $u = [5\ 2;\ 2\ 4]^\top$, where 5 is the cost of $V_1 = 0$ and 2 is the cost of $V_1 = 1$. Label vector $x = [0\ 1;\ 1\ 0]^\top$, where the first entry 0 means $V_1 \neq 0$ and the second entry 1 means $V_1 = 1$. Sum of unary costs $= \sum_i u_i x_i$.
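As a quick check of the unary term, the sum $\sum_i u_i x_i$ can be evaluated for the cost vector and label vector given above. The flattened ordering (V1 = 0, V1 = 1, V2 = 0, V2 = 1) and the variable names are assumptions about how the slide's vectors are laid out.

```python
# Entries ordered as [V1=0, V1=1, V2=0, V2=1].
u = [5, 2, 2, 4]   # unary cost vector from the slide
x = [0, 1, 1, 0]   # label vector from the slide: V1 takes label 1, V2 takes label 0

unary_cost = sum(ui * xi for ui, xi in zip(u, x))   # 2 (V1 = 1) + 2 (V2 = 0)
```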
Integer Programming Formulation. Pairwise cost matrix $P$ for $V_1$, $V_2$ with labels '0' and '1'. Annotated entries: 0, cost of $V_1 = 0$ and $V_1 = \ldots$; cost of $V_1 = 0$ and $V_2 = 0$; 3, cost of $V_1 = 0$ and $V_2 = \ldots$
Integer Programming Formulation. Pairwise cost matrix $P$. Sum of pairwise costs $= \sum_{i<j} P_{ij} x_i x_j = \sum_{i<j} P_{ij} X_{ij}$, where $X = x x^\top$.
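Integer Programming Formulation. Constraints. Uniqueness constraint: $\sum_{i \in V_a} x_i = 1$ for every variable $V_a$. Integer constraints: $x_i \in \{0,1\}$, $X = x x^\top$.
Integer Programming Formulation. $x^* = \arg\min \sum_i u_i x_i + \sum_{i<j} P_{ij} X_{ij}$ s.t. $\sum_{i \in V_a} x_i = 1$, $x_i \in \{0,1\}$, $X = x x^\top$.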
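For a model this small the integer program can be solved by enumerating every binary vector and keeping those that satisfy the uniqueness constraint. The sketch below does that. The unary costs follow the slide, while the pairwise matrix `P` is a hypothetical stand-in because its entries are not recoverable from the text.

```python
from itertools import product

# Entries ordered as [V1=0, V1=1, V2=0, V2=1].
u = [5, 2, 2, 4]
groups = [(0, 1), (2, 3)]      # indicator indices belonging to V1 and V2

# Hypothetical upper-triangular pairwise costs P[i][j] for i < j.
P = [[0, 0, 0, 3],
     [0, 0, 3, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

def objective(x):
    """sum_i u_i x_i + sum_{i<j} P_ij X_ij with X = x x^T."""
    n = len(x)
    X = [[x[i] * x[j] for j in range(n)] for i in range(n)]
    unary_term = sum(u[i] * x[i] for i in range(n))
    pair_term = sum(P[i][j] * X[i][j] for i in range(n) for j in range(i + 1, n))
    return unary_term + pair_term

def is_feasible(x):
    """Uniqueness: exactly one indicator is on for each variable."""
    return all(sum(x[i] for i in g) == 1 for g in groups)

feasible = [x for x in product((0, 1), repeat=len(u)) if is_feasible(x)]
x_star = min(feasible, key=objective)
value = objective(x_star)
```

Enumeration visits $2^N$ vectors, which is why the integer program is hard to solve in general.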
Roadmap. (1) Express MAP as an integer program. (2) Relax to a linear program and solve. (3) Round the fractional solution to integers.
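Integer Programming Formulation. $x^* = \arg\min \sum_i u_i x_i + \sum_{i<j} P_{ij} X_{ij}$ s.t. $\sum_{i \in V_a} x_i = 1$, $x_i \in \{0,1\}$, $X = x x^\top$. The linear objective and the uniqueness constraint are convex; $x_i \in \{0,1\}$ and $X = x x^\top$ are non-convex.
Integer Programming Formulation. Relax $x_i \in \{0,1\}$ to $x_i \in [0,1]$, which is convex. The constraint $X = x x^\top$ is still non-convex.
Integer Programming Formulation. Replace $X = x x^\top$ by $X_{ij} \in [0,1]$ and $\sum_{j \in V_b} X_{ij} = x_i$. Every constraint is now convex.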
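Linear Programming Formulation (Schlesinger, 1976; Chekuri et al., 2001; Wainwright et al., 2001). $x^* = \arg\min \sum_i u_i x_i + \sum_{i<j} P_{ij} X_{ij}$ s.t. $\sum_{i \in V_a} x_i = 1$, $x_i \in [0,1]$, $X_{ij} \in [0,1]$, $\sum_{j \in V_b} X_{ij} = x_i$. The objective and all constraints are convex.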
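To see what the relaxation admits, the sketch below checks a hand-constructed fractional point against the constraints above and compares its objective with the best integer labeling. It does not solve the LP. The three-variable cycle, the per-edge layout of $X$ and the potentials (cost 1 when neighbors agree, 0 otherwise) are assumptions chosen because the gap is easy to verify by hand.

```python
from itertools import product

edges = [(0, 1), (1, 2), (0, 2)]
n_vars, n_labels = 3, 2
unary = [[0.0, 0.0] for _ in range(n_vars)]
pair = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical: agreeing neighbors cost 1

def lp_objective(x, X):
    total = sum(unary[a][i] * x[a][i]
                for a in range(n_vars) for i in range(n_labels))
    total += sum(pair[i][j] * X[e][i][j]
                 for e in edges for i in range(n_labels) for j in range(n_labels))
    return total

def is_feasible(x, X, tol=1e-9):
    for a in range(n_vars):
        if abs(sum(x[a]) - 1.0) > tol:                   # uniqueness
            return False
        if any(v < -tol or v > 1.0 + tol for v in x[a]):
            return False
    for (a, b) in edges:
        T = X[(a, b)]
        if any(v < -tol or v > 1.0 + tol for row in T for v in row):
            return False
        for i in range(n_labels):                        # sum_j X_ij = x_i
            if abs(sum(T[i]) - x[a][i]) > tol:
                return False
        for j in range(n_labels):                        # same for the other endpoint
            if abs(sum(T[i][j] for i in range(n_labels)) - x[b][j]) > tol:
                return False
    return True

def lift(f):
    """Turn an integer labeling into (x, X) with X = x x^T on each edge."""
    x = [[1.0 if f[a] == i else 0.0 for i in range(n_labels)]
         for a in range(n_vars)]
    X = {(a, b): [[x[a][i] * x[b][j] for j in range(n_labels)]
                  for i in range(n_labels)] for (a, b) in edges}
    return x, X

best_integer = min(lp_objective(*lift(f))
                   for f in product(range(n_labels), repeat=n_vars))

# Fractional point: every variable is half label 0, half label 1,
# and every edge puts all its mass on disagreeing label pairs.
x_frac = [[0.5, 0.5] for _ in range(n_vars)]
X_frac = {e: [[0.0, 0.5], [0.5, 0.0]] for e in edges}
frac_value = lp_objective(x_frac, X_frac)
```

Here every integer labeling pays for at least one agreeing edge, while the fractional point is feasible and pays nothing. Since all costs are non-negative, it is an LP optimum, so the relaxation gives a lower bound that is strictly below the integer optimum on this loopy example.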
Roadmap. (1) Express MAP as an integer program. (2) Relax to a linear program and solve. (3) Round the fractional solution to integers.
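The last roadmap step, rounding, can be illustrated with the simplest possible scheme: give each variable the label with the largest fractional value. This is only one illustrative choice and an assumption on my part. It is not claimed to be the rounding scheme behind any particular approximation bound, and the fractional values are made up.

```python
def round_labels(x):
    """Pick, for each variable, the label with the largest fractional value.
    Ties go to the lowest label index."""
    return [max(range(len(xa)), key=xa.__getitem__) for xa in x]

# Hypothetical fractional LP solution: one row of label weights per variable.
x_frac = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]

labels = round_labels(x_frac)
```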
Properties. The LP relaxation dominates many convex relaxations (Kumar, Kolmogorov and Torr, 2007). Best known multiplicative bounds: 2 for Potts (uniform) energies; $2 + \sqrt{2}$ for truncated linear energies; $O(\log n)$ for metric labeling. These bounds are matched by move-making (Kumar and Torr, 2008; Kumar and Koller, UAI 2009).
Algorithms. Tree-reweighted message passing (TRW); max-product linear programming (MPLP); dual decomposition (Komodakis and Paragios, ICCV 2007).