MAP Estimation Algorithms in Computer Vision

Pushmeet Kohli

E(x): an energy E: {0,1}^n → ℝ over binary labelings of the image D, with 0 → fg, 1 → bg and n = number of pixels. [Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]

Unary cost c_i: dark pixels get negative cost, bright pixels positive. E(x) = ∑ c_i x_i (pixel colour term). E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels. [Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]

With only the unary cost, E(x) = ∑ c_i x_i (pixel colour term) and the segmentation is x* = arg min E(x). [Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]

Adding a discontinuity cost d_ij: E(x) = ∑ c_i x_i + ∑ d_ij |x_i − x_j| (pixel colour term + smoothness prior). [Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]

For binary variables the smoothness prior can be written with products instead of absolute differences, since |x_i − x_j| = x_i(1 − x_j) + x_j(1 − x_i): E(x) = ∑ c_i x_i + ∑ [d_ij x_i(1 − x_j) + d_ij x_j(1 − x_i)]. [Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]

Pixel colour term + smoothness prior: E(x) = ∑ c_i x_i + ∑ [d_ij x_i(1 − x_j) + d_ij x_j(1 − x_i)]. The segmentation is x* = arg min E(x) (the slides show this computed both from an old solution and from scratch).

In general, E(x) = ∑_i f_i(x_i) + ∑_ij g_ij(x_i, x_j) + ∑_c h_c(x_c), i.e. unary + pairwise + higher order terms, where each x_i takes a value from a label set L = {l_1, l_2, …, l_k}. How do we minimize E(x)?
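Concretely, here is a minimal sketch (illustrative values and names, not the slides' code) of evaluating such an energy with unary and pairwise Potts terms on a tiny 4-connected grid:

#include <cstdio>
#include <vector>

int main() {
    const int W = 3, H = 2;                  // tiny 3x2 "image", 2 labels
    // unary[p][l] = f_p(l); values chosen arbitrarily for the demo
    double unary[W * H][2] = {{1, 3}, {2, 2}, {4, 0}, {0, 4}, {2, 2}, {3, 1}};
    const double lambda = 1.0;               // Potts pairwise weight

    std::vector<int> x = {0, 0, 1, 0, 1, 1}; // a candidate labeling

    double E = 0;
    for (int p = 0; p < W * H; ++p) E += unary[p][x[p]];
    for (int r = 0; r < H; ++r)
        for (int c = 0; c < W; ++c) {
            int p = r * W + c;
            if (c + 1 < W) E += lambda * (x[p] != x[p + 1]); // right neighbour
            if (r + 1 < H) E += lambda * (x[p] != x[p + W]); // bottom neighbour
        }
    std::printf("E(x) = %.1f\n", E);
}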

Space of problems (n = number of variables): segmentation energies, CSP, MAXCUT; in general these are NP-hard. Two routes to tractability:
Structural tractability: e.g. tree-structured problems.
Language/form tractability: constraints on the terms g_ij(x_i, x_j) of the energy. Submodular functions can be minimized in polynomial time: pairwise in O(n^3), general submodular in O(n^6).

A pseudo-boolean function f: {0,1}^n → ℝ is submodular if f(A) + f(B) ≥ f(A ∨ B) + f(A ∧ B) for all A, B ∈ {0,1}^n, where ∨ is element-wise OR and ∧ element-wise AND. Example: n = 2, A = [1,0], B = [0,1]: f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0]). Property: a sum of submodular functions is submodular. Hence the binary image segmentation energy E(x) = ∑ c_i x_i + ∑ d_ij |x_i − x_j| is submodular.
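For a pairwise term this reduces to θ_ij(0,1) + θ_ij(1,0) ≥ θ_ij(0,0) + θ_ij(1,1). A minimal sketch (illustrative, not the slides' code) checking it for the smoothness term above:

#include <cstdio>

// th(a,b) = d * |a - b|, the smoothness term used in the slides
double theta(double d, int a, int b) { return d * (a != b); }

int main() {
    double d_ij = 2.5;  // any non-negative discontinuity cost
    double lhs = theta(d_ij, 0, 1) + theta(d_ij, 1, 0);
    double rhs = theta(d_ij, 0, 0) + theta(d_ij, 1, 1);
    std::printf("%.1f >= %.1f -> %s\n", lhs, rhs,
                lhs >= rhs ? "submodular" : "not submodular");
}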

Submodular functions are discrete analogues of concave functions [Lovász '83] and are widely applied in operations research. Applications in machine learning include: MAP inference in Markov random fields; clustering [Narasimhan, Jojic & Bilmes, NIPS 2005]; structure learning [Narasimhan & Bilmes, NIPS 2006]; maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003].

Polynomial-time algorithms: the ellipsoid algorithm [Grötschel, Lovász & Schrijver '81]; the first strongly polynomial algorithms [Iwata et al. '00] [A. Schrijver '00]; current best O(n^5 Q + n^6), where Q is the function evaluation time [Orlin '07]. Symmetric functions, E(x) = E(1 − x), can be minimized in O(n^3). Pairwise submodular functions E(x) = ∑_i f_i(x_i) + ∑_ij g_ij(x_i, x_j) can be transformed to st-mincut/max-flow [Hammer, 1965], with very low empirical running time, ~O(n).

A flow network is a graph (V, E, C): vertices V = {v_1, v_2, …, v_n} plus the terminals source s and sink t; edges E = {(v_1, v_2), …}; edge costs (capacities) C = {c_(1,2), …}. (Figure: small example network with source, sink, v_1, v_2.)

What is an st-cut? An st-cut (S, T) divides the nodes between source and sink. Its cost is the sum of the costs of all edges going from S to T: 15 for the first cut shown, 8 for the second. What is the st-mincut? The st-cut with minimum cost, here 8.

Construct a graph such that: 1. any st-cut corresponds to an assignment of x, and 2. the cost of the cut equals the energy of that assignment, E(x). The st-mincut then yields the minimum-energy solution. [Hammer, 1965] [Kolmogorov and Zabih, 2002]

E(x) = ∑_i θ_i(x_i) + ∑_ij θ_ij(x_i, x_j) with θ_ij(0,1) + θ_ij(1,0) ≥ θ_ij(0,0) + θ_ij(1,1) for all ij is equivalent (transformable) to E(x) = ∑_i c_i x_i + ∑_ij c_ij x_i(1 − x_j) with c_ij ≥ 0.
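A minimal sketch of this transformation (the standard reparameterization; variable names are mine, not the slides') for a single submodular pairwise table θ with entries A = θ(0,0), B = θ(0,1), C = θ(1,0), D = θ(1,1):

#include <cstdio>

int main() {
    // Any table with B + C >= A + D (i.e. submodular)
    double A = 0, B = 2, C = 3, D = 1;
    double w   = B + C - A - D;          // coefficient of x_i(1-x_j), >= 0: one graph edge
    double u_i = D - B;                  // added to the unary term of x_i (terminal edge)
    double u_j = B - A;                  // added to the unary term of x_j (terminal edge)
    double k   = A;                      // constant offset
    // Sanity check: th(a,b) = k + u_i*a + u_j*b + w*a*(1-b) reproduces all entries
    for (int a = 0; a <= 1; ++a)
        for (int b = 0; b <= 1; ++b)
            std::printf("th(%d,%d) = %.1f\n", a, b,
                        k + u_i * a + u_j * b + w * a * (1 - b));
}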

Example construction, with Source (0), Sink (1) and nodes a_1, a_2. The energy
E(a_1, a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2
is built term by term: 2a_1 and 5ā_1 become the terminal edges of a_1; 9a_2 and 4ā_2 the terminal edges of a_2; the pairwise terms 2a_1ā_2 and ā_1a_2 become the two directed edges between a_1 and a_2.
The cut for a_1 = 1, a_2 = 1 has cost 2 + 9 = 11 = E(1,1).
The cut for a_1 = 1, a_2 = 0 has cost 2 + 4 + 2 = 8 = E(1,0), which is the st-mincut.

To find the st-mincut, solve the dual maximum-flow problem: compute the maximum flow between source and sink subject to
Edges: flow ≤ capacity
Nodes: flow in = flow out
(assuming non-negative capacities).
Min-cut/max-flow theorem: in every network, the maximum flow equals the cost of the st-mincut.

Augmenting-path based algorithms:
1. Find a path from source to sink with positive capacity.
2. Push the maximum possible flow through this path.
3. Repeat until no path can be found.
On the example network, successive augmentations push 2, then 4, then 2 units (Flow = 0 → 2 → 6 → 8), after which no augmenting path remains.
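A minimal, self-contained sketch of this procedure (not the slides' code): augmenting paths chosen by BFS, i.e. Edmonds–Karp, run on the graph of the worked example above. The node indexing and adjacency-matrix representation are choices made for brevity.

#include <algorithm>
#include <cstdio>
#include <queue>
#include <utility>
#include <vector>

const int N = 4;                        // source = 0, a1 = 1, a2 = 2, sink = 3
int cap[N][N];                          // residual capacities

// Breadth-first search for an augmenting path; returns its bottleneck capacity.
int bfs(int s, int t, std::vector<int>& parent) {
    std::fill(parent.begin(), parent.end(), -1);
    parent[s] = s;
    std::queue<std::pair<int, int>> q;  // (node, bottleneck so far)
    q.push({s, 1 << 30});
    while (!q.empty()) {
        auto [u, f] = q.front(); q.pop();
        for (int v = 0; v < N; ++v)
            if (parent[v] == -1 && cap[u][v] > 0) {
                parent[v] = u;
                int nf = std::min(f, cap[u][v]);
                if (v == t) return nf;
                q.push({v, nf});
            }
    }
    return 0;                           // no augmenting path left
}

int main() {
    // Capacities from E = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
    cap[0][1] = 2; cap[0][2] = 9;       // source -> a1, source -> a2
    cap[1][3] = 5; cap[2][3] = 4;       // a1 -> sink, a2 -> sink
    cap[1][2] = 1; cap[2][1] = 2;       // pairwise edges between a1 and a2
    std::vector<int> parent(N);
    int flow = 0, f;
    while ((f = bfs(0, 3, parent)) > 0) {
        flow += f;                      // augment along the found path
        for (int v = 3; v != 0; v = parent[v]) {
            cap[parent[v]][v] -= f;     // consume forward capacity
            cap[v][parent[v]] += f;     // create residual (reverse) capacity
        }
    }
    std::printf("maxflow = %d\n", flow); // prints 8, the st-mincut cost
}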

The same augmentations can be read as reparameterizations of E(a_1, a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2:
Pushing 2 units through a_1's terminal edges: 2a_1 + 5ā_1 = 2(a_1 + ā_1) + 3ā_1 = 2 + 3ā_1.
Pushing 4 units through a_2's terminal edges: 9a_2 + 4ā_2 = 4(a_2 + ā_2) + 5a_2 = 4 + 5a_2.
This gives E(a_1, a_2) = 6 + 3ā_1 + 5a_2 + 2a_1ā_2 + ā_1a_2.
Pushing 2 units along the remaining path: 3ā_1 + 5a_2 + 2a_1ā_2 = 2(ā_1 + a_2 + a_1ā_2) + ā_1 + 3a_2 = 2(1 + ā_1a_2) + ā_1 + 3a_2, using the identity ā_1 + a_2 + a_1ā_2 = 1 + ā_1a_2 (check all four assignments).
Result: E(a_1, a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2, and no more augmenting paths are possible.
The total flow (8) is a lower bound on the energy of the optimal solution, and the residual graph has only positive coefficients. Here the bound is tight, so inference of the optimal solution becomes trivial: a_1 = 1, a_2 = 0, with E(1,0) = 8 = st-mincut cost.
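A quick numerical check (illustrative, not from the slides) that the reparameterizations above preserve the energy for every assignment:

#include <cstdio>

double E0(int a1, int a2) {       // original energy
    return 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2;
}
double E2(int a1, int a2) {       // after pushing all 8 units of flow
    return 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2;
}

int main() {
    for (int a1 = 0; a1 <= 1; ++a1)
        for (int a2 = 0; a2 <= 1; ++a2)   // both forms agree on all four assignments
            std::printf("E(%d,%d): %.0f == %.0f\n", a1, a2, E0(a1, a2), E2(a1, a2));
}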

Complexity of augmenting-path and push-relabel algorithms (comparison table not reproduced; n: # nodes, m: # edges, U: maximum edge weight; all algorithms assume non-negative edge weights). [Slide credit: Andrew Goldberg]

Ford–Fulkerson: choose any augmenting path. Some choices are good (they saturate terminal edges directly); others are bad (they cross the small middle edge and cancel little capacity per augmentation). With bad choices, the example requires 2000 augmentations! Worst-case complexity: O(m × total_flow), a pseudo-polynomial bound that depends on the flow value (n: # nodes, m: # edges).

Dinic: choose the shortest augmenting path. Worst-case complexity: O(m n²) (n: # nodes, m: # edges).

Specialized algorithms exist for vision problems: grid graphs with low connectivity (m ~ O(n)). The dual search tree augmenting-path algorithm [Boykov and Kolmogorov PAMI 2004] finds approximate shortest augmenting paths efficiently; it has high worst-case time complexity but empirically outperforms other algorithms on vision problems. Efficient code is available on the web.

Back to segmentation: E(x) = ∑ c_i x_i + ∑ d_ij |x_i − x_j|, with E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels. How do we minimize E(x) to obtain the global minimum x* = arg min E(x)?

Graph-cut segmentation pseudocode. The graph has Source (0) and Sink (1); each pixel's node gets terminal edges weighted fgCost(p) and bgCost(p), and adjacent pixels are linked by edges weighted cost(p,q):

Graph *g;
for all pixels p
    /* Add a node to the graph */
    nodeID(p) = g->add_node();
    /* Set cost of terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
end
for all adjacent pixels p,q
    add_weights(nodeID(p), nodeID(q), cost(p,q));
end
g->compute_maxflow();
label_p = g->is_connected_to_source(nodeID(p)); // the label of pixel p (0 or 1)

In the two-node example this yields a_1 = bg and a_2 = fg.
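For reference, a hedged sketch of how this pseudocode maps onto the publicly available Boykov–Kolmogorov maxflow library (maxflow-v3.x). The header name, template parameters, and the fg/bg-to-terminal convention are assumptions based on that library's documented interface, not code from the slides:

#include <cstdio>
#include "graph.h"   // from the BK maxflow distribution (assumed header name)

typedef Graph<int, int, int> GraphType;

int main() {
    const int numPixels = 2;                      // the two-node worked example
    int fgCost[] = {2, 9}, bgCost[] = {5, 4};     // unary costs from the example
    GraphType g(numPixels /*node estimate*/, 1 /*edge estimate*/);

    g.add_node(numPixels);                        // one node per pixel
    for (int p = 0; p < numPixels; ++p)
        g.add_tweights(p, fgCost[p], bgCost[p]);  // terminal (source/sink) edges
    g.add_edge(0, 1, 1, 2);                       // pairwise edge: cap, reverse cap

    int flow = g.maxflow();                       // st-mincut cost; 8 here
    std::printf("flow = %d\n", flow);
    for (int p = 0; p < numPixels; ++p)
        std::printf("pixel %d -> %s\n", p,
                    g.what_segment(p) == GraphType::SOURCE ? "label 0 (fg)"
                                                           : "label 1 (bg)");
}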

Extensions:
- Mixed (real-integer) problems
- Multi-label problems: ordered labels (e.g. stereo depth labels) and unordered labels (e.g. object segmentation: 'car', 'road', 'person')
- Higher order energy functions
First, mixed (real-integer) problems.

x – binary image segmentation (x_i ∈ {0,1}); ω – a non-local parameter living in some large set Ω, e.g. a template's position, scale and orientation.
E(x, ω) = C(ω) + ∑_i θ_i(ω, x_i) + ∑_ij θ_ij(ω, x_i, x_j), i.e. constant + unary + pairwise terms, with the pairwise potentials ≥ 0. We have seen several of these in the intro...

x – binary image segmentation (x_i ∈ {0,1}); ω – non-local parameter in Ω.
E(x, ω) = C(ω) + ∑_i θ_i(ω, x_i) + ∑_ij θ_ij(ω, x_i, x_j), pairwise potentials ≥ 0.
We seek {x*, ω*} = arg min_{x,ω} E(x, ω). For fixed ω this is a standard "graph cut" energy. [Kohli et al, 06, 08] [Lempitsky et al, 08]

Local method: gradient descent over ω, ω* = arg min_ω min_x E(x, ω); the inner minimization over x is submodular. [Kohli et al, 06, 08]

Because successive energies E(x, ω_1) and E(x, ω_2) are similar, dynamic graph cuts can reuse the previous flow computation, giving a large speedup. [Kohli et al, 06, 08]

Global method: Branch and Mincut [Lempitsky et al, 08]. Produces the globally optimal solution, but exhaustively explores all ω in Ω in the worst case (e.g. 30,000,000 shapes).

Next, multi-label problems: ordered labels (e.g. stereo depth labels) and unordered labels (e.g. object segmentation: 'car', 'road', 'person').

Two approaches for min_y E(y) = ∑_i f_i(y_i) + ∑_ij g_ij(y_i, y_j), y_i ∈ L = {l_1, l_2, …, l_k}: exact transformation to a quadratic pseudo-boolean function (QPBF), and move-making algorithms. [Roy and Cox '98] [Ishikawa '03] [Schlesinger & Flach '06] [Ramalingam, Alahari, Kohli, and Torr '08]

So what is the problem? We must encode the multi-label problem E_m(y_1, y_2, …, y_n), y_i ∈ L = {l_1, l_2, …, l_k}, as a binary-label problem E_b(x_1, x_2, …, x_m), x_i ∈ {0,1}, such that, with Y and X the sets of feasible solutions: 1. there is a one-one encoding function T: X → Y, and 2. arg min E_m(y) = T(arg min E_b(x)).

Popular encoding scheme [Roy and Cox '98, Ishikawa '03, Schlesinger & Flach '06]: # nodes = n·k, # edges = m·k².

With this encoding, Ishikawa's result: E(y) = ∑_i θ_i(y_i) + ∑_ij θ_ij(y_i, y_j), y_i ∈ L = {l_1, l_2, …, l_k}, is exactly minimizable when θ_ij(y_i, y_j) = g(|y_i − y_j|) for a convex function g.

Schlesinger & Flach '06 generalize this: exact minimization is possible whenever θ_ij(l_i+1, l_j) + θ_ij(l_i, l_j+1) ≥ θ_ij(l_i, l_j) + θ_ij(l_i+1, l_j+1) for all adjacent label pairs (multi-label submodularity).
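A minimal sketch (illustrative helper names, not the slides' code) of checking this condition for a given pairwise potential. The convex potential |a − b|² passes; replacing it with a truncated cost such as min(|a − b|, 3) fails, anticipating the non-robustness limitation below:

#include <cstdio>

const int K = 5;                                  // number of labels
double th(int a, int b) { double d = a - b; return d * d; } // convex g(|a-b|)

int main() {
    bool ok = true;
    for (int i = 0; i + 1 < K; ++i)
        for (int j = 0; j + 1 < K; ++j)
            ok = ok && th(i + 1, j) + th(i, j + 1) >= th(i, j) + th(i + 1, j + 1);
    std::printf("multi-label submodular: %s\n", ok ? "yes" : "no");
}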

(Figure: input image and MAP solution, compared with the scanline algorithm [Roy and Cox, 98].)

Limitations. Applicability: cannot handle truncated costs, i.e. the discontinuity-preserving potentials θ_ij(y_i, y_j) = g(|y_i − y_j|) with g truncated [Blake & Zisserman '83, '87] (non-robust). Computational cost: very high, since the problem size is |variables| × |labels|; gray-level denoising of a 1 Mpixel image needs ~2.5 × 10^8 graph nodes.

T(a, b) = complexity of maxflow with a nodes and b edges.

Method                          | Unary potentials | Pairwise potentials  | Complexity
Ishikawa transformation [03]    | arbitrary        | convex and symmetric | T(nk, mk²)
Schlesinger transformation [06] | arbitrary        | submodular           | T(nk, mk²)
Hochbaum [01]                   | linear           | convex and symmetric | T(n, m) + n log k
Hochbaum [01]                   | convex           | convex and symmetric | O(mn log n log nk)

Other "less known" algorithms exist.

Move-making algorithms for min_y E(y) = ∑_i f_i(y_i) + ∑_ij g_ij(y_i, y_j), y_i ∈ L = {l_1, l_2, …, l_k}. [Boykov, Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]

(Figure: the energy plotted over the solution space, showing a current solution x_c, its search neighbourhood, and the optimal move t within that neighbourhood.)
Key property of the move space: a bigger move space yields better solutions, but makes finding the optimal move harder.

Minimizing pairwise functions [Boykov, Veksler and Zabih, PAMI 2001]: perform a series of locally optimal moves, each of which reduces the energy; the optimal move is found by minimizing a submodular binary function. Space of solutions x: L^n; move space t: 2^n (n = number of variables, L = number of labels). Kohli et al. '07, '08, '09 extend this to minimize higher order functions.

Minimize over the move variables t: the new solution is x = t x_1 + (1 − t) x_2 (element-wise), combining the current solution x_1 with a second solution x_2. The move energy is E_m(t) = E(t x_1 + (1 − t) x_2); for certain x_1 and x_2 it is a submodular QPBF. [Boykov, Veksler and Zabih 2001]

Swap move: variables labeled α, β can swap their labels [Boykov, Veksler and Zabih 2001]. (Example: swapping the Sky and House labels in a Sky/House/Tree/Ground segmentation.)

The swap-move energy is submodular if the unary potentials are arbitrary and the pairwise potentials are a semi-metric: θ_ij(l_a, l_b) ≥ 0, with θ_ij(l_a, l_b) = 0 iff a = b. Examples: Potts model, truncated convex. [Boykov, Veksler and Zabih 2001]

Expansion move: variables either take the label α or retain their current label [Boykov, Veksler and Zabih 2001]. (Example: initialize with Tree, then expand Ground, House and Sky in turn.)

The expansion-move energy is submodular if the unary potentials are arbitrary and the pairwise potentials are a metric: a semi-metric that also satisfies the triangle inequality, θ_ij(l_a, l_b) + θ_ij(l_b, l_c) ≥ θ_ij(l_a, l_c). Examples: Potts model, truncated linear; the truncated quadratic cannot be solved this way. [Boykov, Veksler and Zabih 2001]
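A minimal, self-contained sketch of the expansion loop (illustrative only; in practice each binary move is solved with st-mincut, but to stay self-contained this sketch enumerates the 2^n move vectors for a tiny Potts chain, which is exact for small n):

#include <cstdio>
#include <vector>

const int N = 4, K = 3;
double unary[N][K] = {{0,2,2},{2,0,2},{2,2,0},{2,0,2}};
const double lambda = 1.0;                     // Potts weight (a metric)

double energy(const std::vector<int>& x) {
    double E = 0;
    for (int i = 0; i < N; ++i) E += unary[i][x[i]];
    for (int i = 0; i + 1 < N; ++i) E += lambda * (x[i] != x[i + 1]);
    return E;
}

int main() {
    std::vector<int> x(N, 0);                  // initial labeling
    bool improved = true;
    while (improved) {                         // sweep until no move helps
        improved = false;
        for (int alpha = 0; alpha < K; ++alpha) {   // one expansion per label
            std::vector<int> best = x;
            for (int t = 0; t < (1 << N); ++t) {    // all binary move vectors
                std::vector<int> y = x;
                for (int i = 0; i < N; ++i)
                    if (t >> i & 1) y[i] = alpha;   // variable i takes alpha
                if (energy(y) < energy(best)) best = y;
            }
            if (energy(best) < energy(x)) { x = best; improved = true; }
        }
    }
    std::printf("E = %.1f, labels:", energy(x));
    for (int l : x) std::printf(" %d", l);
    std::printf("\n");
}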

Expansion and swap can be derived as a primal-dual scheme [Komodakis et al 05, 07]: one also obtains a solution of the dual problem, which is a lower bound on the energy of the optimal solution. This yields a (weak) guarantee E(x) ≤ 2 (d_max / d_min) E(x*), where d_min and d_max are the smallest non-zero and largest values of the pairwise potential θ_ij(l_i, l_j) = g(|l_i − l_j|).

Move type | First solution | Second solution | Guarantee
Expansion | old solution   | all α           | metric
Fusion    | any solution   | any solution    |

Minimize over the move variables t: the new solution is x = t x_1 + (1 − t) x_2, built from a first and a second solution. Fusion move functions can be non-submodular!

In the fusion move x = t x_1 + (1 − t) x_2, the proposals x_1, x_2 can be continuous. Optical flow example: fuse the solutions of two different methods into a final solution. [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]

Range moves [Kumar and Torr, 2008] [Veksler, 2008]: the move variables can themselves be multi-label, x = (t == 1) x_1 + (t == 2) x_2 + … + (t == k) x_k, and the optimal move is found using the Ishikawa transform. Useful for minimizing energies with truncated convex pairwise potentials θ_ij(y_i, y_j) = min(|y_i − y_j|², T).

(Figure: image, noisy image, and denoising results comparing range moves with the expansion move.) [Veksler, 2008]