Michael Bleyer LVA Stereo Vision


Graph-Cuts
Michael Bleyer, LVA Stereo Vision

What happened last time? (1)
We have defined an energy function to measure the quality of a disparity map D:
$$E(D) = \sum_{p} m(p, d_p) + \sum_{(p,q) \in N} s(d_p, d_q)$$
where
m(p, d_p) computes the color dissimilarity for matching pixel p at disparity d_p,
N denotes all pairs of spatial neighbors in 4-connectivity,
s() is the smoothness function. We use the Potts model:
$$s(d_p, d_q) = \begin{cases} 0 & \text{if } d_p = d_q \\ P & \text{otherwise} \end{cases}$$
This energy function is important for many computer vision problems.
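To make the energy concrete, the following is a minimal sketch of how E(D) could be evaluated for a given disparity map. The struct `StereoProblem`, its field names and the memory layout of the data costs are assumptions made for illustration; the slides do not prescribe them. The sketch also assumes that every pair of neighbors is counted once.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for a small stereo problem (names are illustrative).
struct StereoProblem {
    int width, height, numLabels;
    std::vector<double> dataCost;  // m(p, d), stored at (y * width + x) * numLabels + d
    double pottsPenalty;           // P
};

// Potts smoothness term s(dp, dq): 0 for equal labels, P otherwise.
double potts(int dp, int dq, double P) { return dp == dq ? 0.0 : P; }

// E(D) = sum_p m(p, d_p) + sum over 4-connected neighbor pairs of s(d_p, d_q).
double energy(const StereoProblem& pr, const std::vector<int>& D) {
    assert(static_cast<int>(D.size()) == pr.width * pr.height);
    double e = 0.0;
    for (int y = 0; y < pr.height; ++y) {
        for (int x = 0; x < pr.width; ++x) {
            const int p = y * pr.width + x;
            e += pr.dataCost[p * pr.numLabels + D[p]];
            // Visit only the right and lower neighbor so each pair is counted once.
            if (x + 1 < pr.width)  e += potts(D[p], D[p + 1], pr.pottsPenalty);
            if (y + 1 < pr.height) e += potts(D[p], D[p + pr.width], pr.pottsPenalty);
        }
    }
    return e;
}
```

For a 2x1 image with data costs m(p,0) = 1, m(p,1) = 5, m(q,0) = 4, m(q,1) = 2 and P = 3, the labeling (0, 1) pays both cheap data costs plus one Potts penalty, while (0, 0) pays a higher data cost at q but no penalty.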

What happened last time? (2)
Smoothness interactions define a graph known as the 4-connected grid.
Computing the energy optimum on the 4-connected grid is an NP-hard problem.
We have learned about dynamic programming:
  Computes the exact energy optimum
  Requires the graph to be a tree => We had to remove smoothness interactions
[Figure: 4-connected grid.]

What is Going to Happen Today? Just one point on the agenda: Graph-Cuts

What is Graph-Cuts?
Powerful optimization method.
Finds strong local minima of our NP-hard energy function.
Graph-cuts have been around in computer vision for quite some time (e.g. [Roy,ICCV98]).
I will speak about modern graph-cuts, i.e. move-making algorithms.
Speaker notes: Graph-cuts is nothing more than a powerful optimization method. It can find very good local minima of our NP-hard energy function. "Very good" means here that the local minima are not far from the global optimum. Keep in mind that our energy is NP-hard, so it is unlikely that an algorithm exists that finds the true (global) optimum of the energy in reasonable time. Graph-cuts are not exactly a new topic in computer vision; they were first used for stereo in 1998. The methods I present today, however, differ significantly from these early approaches and are therefore often called modern graph-cuts. They are so-called move-making algorithms. Let us now look at how move-making algorithms work.

Move Making Algorithms
We are given a labeled image as input. (In our case, the image is labeled with disparity values, i.e. label α can for example mean a disparity of 10 pixels.)
We want to modify the assignment of pixels to labels to obtain a better solution, i.e. one of lower energy.
An operation that changes labels is called a move.
We will learn about 3 types of moves:
  αβ-swap
  α-expansion
  fusion move
[Figure: a current labeling with regions α, β and γ is transformed by a move into a new labeling (preferably of lower energy than the current labeling).]
Speaker notes: Let us assume that we have a labeled image. To keep things as general as possible, I speak of labels. In our case these labels represent disparities; for example, the label α can represent a disparity value of 10 pixels. We can of course give the labels a different meaning as well: they can represent optical flow vectors or brightness values, for instance. So we have a labeled image, and we now want to change it so that we obtain a newly labeled image that is better than the old one, i.e. has lower energy. We call an operation that changes our labels a move. In today's session we will get to know 3 different kinds of moves: αβ-swaps, α-expansions and fusion moves. I start on the next slide with αβ-swaps.

αβ-Swap [Boykov,PAMI01]
Select two labels: α and β.
A pixel that is assigned to α in the current labeling can either switch its label to β or keep its old label α in the new labeling.
Analogously, a pixel that is currently assigned to β can either switch its label to α or keep its old label β in the new labeling.
Simply put: Some pixels that had the label α are now assigned to β. Some pixels that had the label β are now assigned to α.
[Figure: current labeling with regions α, β and γ, and one possible labeling after αβ-swap.]
Speaker notes: So what does an αβ-swap do? We randomly select two labels and call them α and β. We now look at all pixels that carry the label α or the label β. In a swap, every pixel that carries the label α can change its label to β, but it does not have to; its label can just as well stay α (example in the figure). Conversely, we also look at all pixels that carry the label β in the current solution. In the new solution these pixels can keep the label β, but they can just as well take the label α (example in the figure). Put differently, a swap does nothing other than assign the label β to some pixels that carried the label α, and the label α to some other pixels that carried the label β. Pixels are exchanged (swapped) between α and β.

α-Expansion [Boykov,PAMI01]
Select one label: α.
Any pixel can either switch its label to α or keep its old label.
More global than αβ-swap: All pixels can change their labels simultaneously.
In experiments, α-expansion moves typically outperform αβ-swaps.
We will therefore concentrate on α-expansions.
[Figure: current labeling with regions α, β and γ, and one possible labeling after α-expansion.]
Speaker notes: I now introduce a second move, the so-called α-expansion move. Here we randomly select a single label and call it α. The simple rule of an α-expansion move is that every pixel can change its label to α. Just as well, every pixel can keep its original label (example in the figure). An α-expansion is more global than an αβ-swap. While an αβ-swap may only change those pixels that carry either label α or label β, an α-expansion allows any pixel to change its label. This higher degree of globality is also the reason why α-expansions typically work better than αβ-swaps in experiments. In the following I will therefore not go further into αβ-swaps and will stay with α-expansions.

The Key Problem
There is an extremely large number of possible α-expansions.
The key challenge is to find the "best" α-expansion, i.e. the one that leads to the largest decrease of our energy.
Good news: For our energy function, we can solve this problem in an exact and fast way via solving a min-cut problem in a graph.
[Figure: the current labeling (E = 1000) and four candidate moves α-exp 1 to α-exp 4 with energies 2000, 1000, 750 and 500. The candidate with E = 500 is marked "We should take this one".]
Speaker notes: On the previous slide I showed one possible α-expansion move. The set of all possible α-expansion moves is extremely large (examples on the slide). In our optimization procedure we do not want to pick an arbitrary α-expansion move, but one that is as good as possible. What do I mean by as good as possible? In this example our current solution has an energy of 1000. … The key problem is to find the α-expansion that has lower energy than all other possible ones. This is the central problem. Note that the optimal α-expansion must always have the same or lower energy than the current solution: if every label change leads to a worse solution with higher energy, then the current solution itself is the best α-expansion. The good news is that this problem can be solved exactly and quickly by computing a minimum cut (min-cut) in a special graph. This is where the name graph-cuts comes from. We will deal with this problem shortly.

Iterative Algorithm – α-Expansion
Let us for now assume that we know how to compute the optimal α-expansion. We can incorporate the α-expansion as follows.
Iterative Algorithm:
  Start with an arbitrary labeling f.
  Loop (e.g. 3 times)
    For each allowed label α:
      Find f* = argmin E(f') among f' within one α-expansion of f
      f := f*
Comment: Note that we compute the optimal α-expansion. Therefore, the energy will either decrease after an α-expansion or stay the same (not changing the labeling at all is a feasible α-expansion). The algorithm will in any case converge to a (strong) local energy optimum.
Speaker notes: Let us assume for now that we already know how to compute the optimal α-expansion. We can then use the following iterative algorithm. … There is a second loop in which we run through our label set. Suppose we are dealing with a stereo matching algorithm. In the first pass, α will therefore correspond to disparity 1. We now compute the best α-expansion move starting from our current solution; in our example we perform an α-expansion on disparity 1. I have denoted the best α-expansion by f*. The best α-expansion now becomes the current solution f, and I continue the loop with disparity 2.
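The loop above can be sketched in a few lines of code. In this sketch the optimal α-expansion is found by brute force over all binary vectors x, which is only a stand-in for the graph-cut step explained on the following slides and is feasible for a handful of pixels only. The toy problem (a 1D chain instead of a 4-connected grid) and all names (`Chain`, `bestExpansion`, `alphaExpansion`) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy problem: a 1D chain of pixels with a Potts smoothness term.
struct Chain {
    int numLabels;
    std::vector<std::vector<double>> m;  // m[p][d]: data cost of label d at pixel p
    double P;                            // Potts penalty
};

double energy(const Chain& c, const std::vector<int>& f) {
    double e = 0.0;
    for (std::size_t p = 0; p < f.size(); ++p) {
        e += c.m[p][f[p]];
        if (p + 1 < f.size() && f[p] != f[p + 1]) e += c.P;
    }
    return e;
}

// Stand-in for the graph-cut step: enumerate every binary vector x and keep the
// labeling of lowest energy. Exponential in the number of pixels.
std::vector<int> bestExpansion(const Chain& c, const std::vector<int>& f, int alpha) {
    const std::size_t n = f.size();
    assert(n < 20);             // brute force is only meant for tiny examples
    std::vector<int> best = f;  // x = 0 everywhere: keeping f is a feasible expansion
    for (unsigned long x = 0; x < (1UL << n); ++x) {
        std::vector<int> g = f;
        for (std::size_t p = 0; p < n; ++p)
            if ((x >> p) & 1UL) g[p] = alpha;  // x_p = 1: pixel p takes label alpha
        if (energy(c, g) < energy(c, best)) best = g;
    }
    return best;
}

// Outer loops of the iterative algorithm: sweeps over the label set.
std::vector<int> alphaExpansion(const Chain& c, std::vector<int> f, int sweeps = 3) {
    for (int s = 0; s < sweeps; ++s)
        for (int alpha = 0; alpha < c.numLabels; ++alpha)
            f = bestExpansion(c, f, alpha);  // f := f*
    return f;
}
```

Because keeping the current labeling is always one of the candidates, the energy returned by `bestExpansion` can never exceed the energy of its input, which mirrors the comment on the slide.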

Iterative Algorithm – Example Video (α-expansions for stereo matching)

Computing the Optimal α-Expansion
There are 3 things you have to do to find the optimal α-expansion via graph-cuts:
  1. Write your energy as a pseudo-boolean function
  2. Construct a graph that represents your boolean function
  3. Compute the minimum cut in this graph
These steps are discussed in the following.

Writing the Energy as a Pseudo-Boolean Function (1)
We associate a boolean variable xp with each pixel p where:
  xp = 0 means that pixel p keeps its old label after α-expansion
  xp = 1 means that pixel p takes label α after α-expansion
For example, given the current labeling, a particular setting of x leads to one specific label configuration after α-expansion.
We can represent all possible α-expansions by the boolean variables x.
[Figure: a current labeling with labels β and γ, a binary vector x, and the resulting label configuration after α-expansion.]

Writing the Energy as a Pseudo-Boolean Function (2)
Let us assume we have two pixels p and q. Both pixels are assigned to label β in the current labeling.
Recall our energy function:
$$E(D) = \sum_{p} m(p, d_p) + \sum_{(p,q) \in N} s(d_p, d_q)$$
The first sum is the dissimilarity function, the second is the Potts model (impose penalty P if p and q have different labels).
We can write our energy as a function of binary variables xp and xq:
$$E(x_p, x_q) = E_p(x_p) + E_q(x_q) + E_{p,q}(x_p, x_q)$$
We call E_p and E_q unary terms, since they depend on one variable. We call E_{p,q} a pairwise term, since it depends on two variables.
where:
$$E_p(0) = m(p,\beta), \quad E_p(1) = m(p,\alpha), \quad E_q(0) = m(q,\beta), \quad E_q(1) = m(q,\alpha)$$
$$E_{p,q}(0,0) = 0 \;(\beta\beta), \quad E_{p,q}(0,1) = P \;(\beta\alpha), \quad E_{p,q}(1,0) = P \;(\alpha\beta), \quad E_{p,q}(1,1) = 0 \;(\alpha\alpha)$$
We have to find the settings of binary variables xp and xq that minimize the energy. This comes next.

The Min-Cut Problem
We have two dedicated nodes, the source and the sink.
We partition the graph into two sets S and T, where the source belongs to S and the sink belongs to T.
The cut consists of all edges that lead from S to T.
The costs of a cut are the sum of the weights of these edges.
The minimum cut is the cut of minimum costs among all possible cuts.
[Figure: graph with source, sink and two nodes p and q; edge weights 2, 9, 2, 1, 5 and 4. One example cut has costs 5 + 2 + 9 = 16; the minimum cut has costs 2 + 1 + 4 = 7.]
Example taken from Pushmeet Kohli's ICCV09 tutorial.
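The two cut costs on the slide can be reproduced with a small brute-force sketch. The figure itself is not part of the transcript, so the assignment of the six weights to directed edges below is an assumption; it is inferred from the two sums 5 + 2 + 9 = 16 and 2 + 1 + 4 = 7. Brute force is used only because the graph has two non-terminal nodes; it is not how min-cut is computed in practice.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct Edge { int from, to, weight; };

// Node ids: 0 = source, 1 = p, 2 = q, 3 = sink.
// Assumed edge directions, inferred from the cut costs quoted on the slide.
std::vector<Edge> exampleGraph() {
    std::vector<Edge> g = {{0, 1, 2}, {0, 2, 9},   // source -> p, source -> q
                           {1, 2, 2}, {2, 1, 1},   // p -> q, q -> p
                           {1, 3, 5}, {2, 3, 4}};  // p -> sink, q -> sink
    return g;
}

// Costs of the cut given by inS (inS[v] is true if node v belongs to S):
// the sum of the weights of all edges that lead from S to T.
int cutCost(const std::vector<Edge>& edges, const std::vector<bool>& inS) {
    int cost = 0;
    for (const Edge& e : edges)
        if (inS[e.from] && !inS[e.to]) cost += e.weight;
    return cost;
}

// Try all four ways of placing p and q; source stays in S, sink stays in T.
int minCutCost(const std::vector<Edge>& edges) {
    int best = std::numeric_limits<int>::max();
    for (int pInS = 0; pInS <= 1; ++pInS) {
        for (int qInS = 0; qInS <= 1; ++qInS) {
            std::vector<bool> inS = {true, pInS == 1, qInS == 1, false};
            const int c = cutCost(edges, inS);
            if (c < best) best = c;
        }
    }
    return best;
}
```

Under these assumptions, placing p in S and q in T gives the cut of costs 16, and placing q in S and p in T gives the cut of costs 7, which is also the smallest of the four possible cuts.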

The Min-Cut Problem
The min-cut problem has been extensively studied in graph theory.
There exists a variety of algorithms that
  can find the exact solution
  are computationally very fast.
Side notes:
  The min-cut problem and the max-flow problem are dual problems: => Solving min-cut also gives the solution for max-flow and vice versa. Max-flow and min-cut are therefore often used synonymously.
  If you are interested in algorithms for computing min-cut/max-flow: Read [Boykov,PAMI04]
Nice, but does this help us to optimize our pseudo-boolean function?

Optimization of our Pseudo-Boolean Function
We insert a node for each pixel.
If a node p is a member of S after the cut, then xp = 0.
If p is a member of T, then xp = 1.
We adjust the edges so that the costs of the cut are equal to the energy of our binary variables x.
The minimum cut therefore also represents the minimum of our energy.
[Figure: source, nodes p and q, and sink; p lies in S (=> xp = 0) and q lies in T (=> xq = 1). The costs of this cut have to be equal to the energy of xp = 0 and xq = 1.]
How can we do this for our example?

Optimization of our Pseudo-Boolean Function
Our unary terms:
$$E_p(0) = m(p,\beta), \quad E_p(1) = m(p,\alpha), \quad E_q(0) = m(q,\beta), \quad E_q(1) = m(q,\alpha)$$
Our pairwise term:
$$E_{p,q}(0,0) = 0, \quad E_{p,q}(0,1) = P, \quad E_{p,q}(1,0) = P, \quad E_{p,q}(1,1) = 0$$
[Figure: graph with edges source -> p of weight m(p,α), source -> q of weight m(q,α), p -> sink of weight m(p,β), q -> sink of weight m(q,β), and edges p -> q and q -> p of weight P each.]
Let us check whether this graph really represents our energy.

Optimization of our Pseudo-Boolean Function
We check all four cuts of the graph (p and q in S or T) against the energy:
=> xp = 0, xq = 0
  Energy: E(0,0) = Ep(0)+Eq(0)+Ep,q(0,0) = m(p,β)+m(q,β)+0
  Cut Costs: C = m(p,β)+m(q,β)
=> xp = 0, xq = 1
  Energy: E(0,1) = Ep(0)+Eq(1)+Ep,q(0,1) = m(p,β)+m(q,α)+P
  Cut Costs: C = m(p,β)+m(q,α)+P
=> xp = 1, xq = 0
  Energy: E(1,0) = Ep(1)+Eq(0)+Ep,q(1,0) = m(p,α)+m(q,β)+P
  Cut Costs: C = m(p,α)+m(q,β)+P
=> xp = 1, xq = 1
  Energy: E(1,1) = Ep(1)+Eq(1)+Ep,q(1,1) = m(p,α)+m(q,α)+0
  Cut Costs: C = m(p,α)+m(q,α)
We have shown that the graph represents our energy.
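The same check can be written as a short program. This is a sketch with hypothetical names (`TwoPixelProblem`, `graphRepresentsEnergy`); the numeric values used to try it out are arbitrary and not taken from the slides. The edge directions follow from the cut costs listed above: a node in S cuts its edge to the sink, a node in T cuts its edge from the source.

```cpp
#include <cassert>

// Dissimilarity values and Potts penalty for the two-pixel example.
struct TwoPixelProblem {
    double mpAlpha, mpBeta;  // m(p,alpha), m(p,beta)
    double mqAlpha, mqBeta;  // m(q,alpha), m(q,beta)
    double P;                // Potts penalty
};

// E(xp,xq) = Ep(xp) + Eq(xq) + Ep,q(xp,xq); x = 0 keeps beta, x = 1 takes alpha.
double energy(const TwoPixelProblem& t, int xp, int xq) {
    const double ep = xp ? t.mpAlpha : t.mpBeta;
    const double eq = xq ? t.mqAlpha : t.mqBeta;
    const double epq = (xp != xq) ? t.P : 0.0;
    return ep + eq + epq;
}

// Costs of the cut in the graph; a node is in S if x = 0 and in T if x = 1.
double cutCost(const TwoPixelProblem& t, int xp, int xq) {
    double c = 0.0;
    c += xp ? t.mpAlpha : t.mpBeta;  // source -> p is cut if p in T, p -> sink if p in S
    c += xq ? t.mqAlpha : t.mqBeta;  // same for q
    if (xp == 0 && xq == 1) c += t.P;  // edge p -> q leads from S to T
    if (xp == 1 && xq == 0) c += t.P;  // edge q -> p leads from S to T
    return c;
}

// True if the cut costs equal the energy for all four settings of (xp, xq).
bool graphRepresentsEnergy(const TwoPixelProblem& t) {
    for (int xp = 0; xp <= 1; ++xp)
        for (int xq = 0; xq <= 1; ++xq)
            if (cutCost(t, xp, xq) != energy(t, xp, xq)) return false;
    return true;
}
```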

What Energy Function Can be Optimized via Graph-Cuts?
Not every boolean energy function can be represented by a graph!
The pairwise terms have to fulfill the following constraint [Kolmogorov,PAMI04]:
$$E_{p,q}(0,0) + E_{p,q}(1,1) \le E_{p,q}(0,1) + E_{p,q}(1,0)$$
In our example, this has been the case:
$$0 + 0 \le P + P$$
If there is at least one pairwise term in the boolean energy function that violates this constraint, the energy is said to be non-submodular. Otherwise, it is called submodular.
Optimizing non-submodular energies is an NP-hard problem. => Computing the optimal α-expansion becomes very difficult (but not impossible).
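The example only covers two pixels that both carry label β. The following short derivation, which is not on the slides but follows from the definitions of the α-expansion and the Potts model, sketches why the constraint holds for arbitrary current labels f_p and f_q, assuming P ≥ 0.

```latex
% Pairwise term of an \alpha-expansion for current labels f_p, f_q:
\begin{align*}
E_{p,q}(0,0) &= s(f_p, f_q), & E_{p,q}(0,1) &= s(f_p, \alpha), \\
E_{p,q}(1,0) &= s(\alpha, f_q), & E_{p,q}(1,1) &= s(\alpha, \alpha) = 0 .
\end{align*}
% The submodularity constraint therefore reads
\[
s(f_p, f_q) \;\le\; s(f_p, \alpha) + s(\alpha, f_q).
\]
% Case f_p = f_q: the left-hand side is 0 and the right-hand side is non-negative.
% Case f_p \ne f_q: the left-hand side is P. At most one of f_p, f_q can equal
% \alpha, so at least one term on the right-hand side is P, and the inequality holds.
```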

Max-Flow/Min-Cut Library
“I would like to use graph-cuts, but I do not want to mess around with graphs.”
Good news: You don’t have to. It is sufficient to define your energy as a pseudo-boolean function.
You can then download the Max-Flow/Min-Cut library from http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/software.html
The library will:
  Construct the graph that represents your boolean function
  Compute the min-cut
  Provide you the optimal labeling
See example on next slide.

Example Code for the Max-Flow/Min-Cut Library
(Code as shown on the slide; Ep(0), Ep,q(0,1) etc. stand for the numeric values of the energy terms.)

    // Set up graph and add 2 nodes
    Graph *g = new Graph();
    int p = g->AddNode();
    int q = g->AddNode();

    // Define boolean energy
    g->AddUnaryTerm(p, Ep(0), Ep(1));
    g->AddUnaryTerm(q, Eq(0), Eq(1));
    g->AddPairwiseTerm(p, q, Ep,q(0,0), Ep,q(0,1), Ep,q(1,0), Ep,q(1,1));

    // Construct graph that represents the energy
    // Compute min-cut
    g->Solve();

    // Write optimal labels
    printf("optimal label p %d\n", g->GetLabel(p));
    printf("optimal label q %d\n", g->GetLabel(q));

The Fusion Move [Lempitsky,ICCV07]
Two proposals are fused to obtain a new solution of lower energy.
Fusion Move:
  Let fp denote pixel p's label in proposal 1.
  Let gp denote p's label in proposal 2.
  After fusion, p is assigned to either fp or gp.
α-expansion is a special case of a fusion move where the second proposal contains only a single label.
[Figure: proposal 1, proposal 2, and one possible labeling after fusion of proposals 1 and 2.]

Iterative Algorithm – Fusion Moves
  Start with an arbitrary labeling f.
  For each proposal g:
    Find f* = argmin E(f') among f' being one possible fusion of f and g.
    f := f*

Iterative Algorithm – Example Video (Fusion moves for stereo matching)

Why Fusion Moves? (1)
Parallelization:
Parallel implementations of min-cut algorithms are very difficult to accomplish.
We can do the following parallel implementation:
  CPU1 computes α-expansions for disparities 0-8
  CPU2 computes α-expansions for disparities 9-16
  The results of both CPUs are then fused
[Figure: the results of both CPUs are combined by a fusion move.]

Why Fusion Moves? (2)
You have two algorithms that have different failure modes.
Optical flow example:
  Horn-Schunck: works well in untextured regions, fails at flow borders
  Lucas-Kanade: fails in untextured regions, works well at flow borders
We run both algorithms and fuse their results.
The fusion move will pick the Horn-Schunck result for untextured regions and the Lucas-Kanade result at flow borders => much better result.
Can even work in real-time.
[Figure: first frame, ground truth optical flow, result of the Horn-Schunck algorithm, result of the Lucas-Kanade algorithm, and the fusion of both algorithms.]

Why Fusion Moves? (3)
α-expansions will become intractable if there is a very large or infinite label set. For example:
  Large resolution stereo: You might need to test > 1000 disparity labels.
  Optical flow: The space of all possible discrete flow vectors is very large (2 dimensions).
  Assigning pixels to continuous disparity values: The set of all continuous disparities is of infinite size.
  Assigning pixels to 3D surfaces: There is an infinite number of 3D surfaces. You will hear more about surface stereo in a different session.
Fusion moves can handle all of these cases! This is probably the most important argument.

Writing the Boolean Fusion Move Energy
Let us assume we have two pixels p and q. Our 2 proposals have the following labeling:
Proposal 1: p = α, q = β
Proposal 2: p = γ, q = α
This time xp has a different meaning:
xp = 0, if p takes the label of proposal 1
xp = 1, if p takes the label of proposal 2
As before, we write our energy as a function of the binary variables xp and xq, where the four possible assignments select the following labels:
xp = 0, xq = 0: p = α, q = β
xp = 0, xq = 1: p = α, q = α
xp = 1, xq = 0: p = γ, q = β
xp = 1, xq = 1: p = γ, q = α
Can you see a problem here?
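The pairwise part of the Boolean fusion energy for this two-pixel example can be tabulated with a few lines of Python. This is only a sketch: the Potts smoothness term (cost 1 if the two labels differ, 0 otherwise) is an assumption made for illustration, and the names `potts` and `fusion_pairwise_table` are hypothetical helpers, not part of any library.

```python
def potts(label_a, label_b, penalty=1.0):
    """Assumed smoothness term: 0 for equal labels, `penalty` otherwise."""
    return 0.0 if label_a == label_b else penalty


def fusion_pairwise_table(proposal_1, proposal_2, p, q, smooth=potts):
    """Return {(x_p, x_q): cost}; x = 0 takes proposal 1, x = 1 takes proposal 2."""
    proposals = (proposal_1, proposal_2)
    return {
        (xp, xq): smooth(proposals[xp][p], proposals[xq][q])
        for xp in (0, 1)
        for xq in (0, 1)
    }


# Labels of the two pixels in each proposal, as in the example above.
proposal_1 = {"p": "alpha", "q": "beta"}
proposal_2 = {"p": "gamma", "q": "alpha"}

table = fusion_pairwise_table(proposal_1, proposal_2, "p", "q")
for assignment, cost in sorted(table.items()):
    print(assignment, cost)
```

Only the assignment xp = 0, xq = 1 gives both pixels the same label, so under the assumed Potts term it is the only entry with zero cost.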

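The Fusion Energy Can Be Non-Submodular
Remember the condition for sub-modularity, which must hold for each pairwise term: E(0,0) + E(1,1) ≤ E(0,1) + E(1,0)
Our example energy is non-submodular for typical smoothness terms: the assignment xp = 0, xq = 1 gives both pixels the label α and is therefore cheap, while xp = 0, xq = 0 and xp = 1, xq = 1 each give the two pixels different labels.
Finding the optimal fusion move is, in general, an NP-hard problem. That is actually the reason why fusion moves had not been used before 2007.
Good news: Nowadays there exist powerful graph-cut-based optimization algorithms that can handle non-submodular energies. In particular, I mean Quadratic Pseudo-Boolean Optimization (QPBO).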
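The sub-modularity test for a single pairwise term is easy to check numerically. The sketch below assumes a Potts smoothness term (cost 1 for different labels, 0 for equal labels), which is an illustrative choice and not necessarily the term used in the lecture. `is_submodular` is a hypothetical helper name. For comparison, the second table is an α-expansion move for current labels (β, γ), where xp = 1 means "switch to α".

```python
def is_submodular(table):
    """Check E(0,0) + E(1,1) <= E(0,1) + E(1,0) for one pairwise term."""
    return table[(0, 0)] + table[(1, 1)] <= table[(0, 1)] + table[(1, 0)]


# Fusion example: proposal 1 = (alpha, beta), proposal 2 = (gamma, alpha).
# Under the assumed Potts term only (0, 1) -> (alpha, alpha) costs nothing.
fusion_table = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 1.0}

# Alpha-expansion from current labels (beta, gamma): x = 1 switches to alpha,
# so only (1, 1) -> (alpha, alpha) costs nothing.
expansion_table = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}

print("fusion move submodular:   ", is_submodular(fusion_table))
print("expansion move submodular:", is_submodular(expansion_table))
```

In an expansion move the "same label" case always sits at (1,1), on the left-hand side of the inequality. In a fusion move it can land on (0,1) or (1,0), which is what breaks the condition here.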

Quadratic Pseudo-Boolean Optimization (QPBO) [Kolmogorov,PAMI07]
QPBO can only compute a part of the globally optimal solution. This means:
Instead of a complete labeling such as xp = 0, xq = 1, xr = 0,
QPBO will in general provide an incomplete labeling such as xp = 0, xq = ø, xr = ø, where ø means "unknown".
Those pixels whose label ≠ ø would also have this label in the "complete" globally optimal solution.
[Figure: Proposal 1, Proposal 2, and the fused result (computed via QPBOI). Pixels labeled as unknown by QPBO are shown in black.]

What to do with pixels labeled as unknown?
Autarky property of QPBO: If you assign all unknown pixels to label 0, the energy is guaranteed to be lower than or equal to that of the labeling <0,0,0,…,0>.
In the case of a fusion move, this means that assigning unknown pixels to the labels of proposal 1 will lead to an energy lower than or equal to that of proposal 1.
Assigning unknown pixels to label 0 is known as QPBOF.
You can do more [Rother,CVPR07]:
- QPBOI (I stands for Improve): Tries to improve the QPBOF solution.
- QPBOP (P stands for Probe): Tries to find more pixels of the globally optimal solution.
You can download QPBO: http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/software.html
The download also includes QPBOI and QPBOP. The interface is almost identical to that of the MaxFlow library.
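The QPBOF step described above amounts to a simple post-processing of the partial labeling. The following sketch does not call the QPBO library. It assumes the partial result is already available as a Python list in which `None` stands for the unknown label ø. The function names and the example disparity values are hypothetical.

```python
UNKNOWN = None  # stands in for the "unknown" label of a partial QPBO result


def complete_with_proposal_1(partial):
    """QPBOF-style completion: every unknown binary variable is set to 0."""
    return [0 if x is UNKNOWN else x for x in partial]


def apply_fusion(x, proposal_1, proposal_2):
    """Build the fused labeling: x = 0 keeps proposal 1, x = 1 takes proposal 2."""
    return [b if xi == 1 else a for xi, a, b in zip(x, proposal_1, proposal_2)]


# Hypothetical partial result for four pixels and two disparity proposals.
partial = [0, UNKNOWN, 1, UNKNOWN]
proposal_1 = [10, 11, 12, 13]
proposal_2 = [20, 21, 22, 23]

x = complete_with_proposal_1(partial)
fused = apply_fusion(x, proposal_1, proposal_2)
print(x, fused)
```

Only the pixel that QPBO labeled 1 takes its value from proposal 2. All unknown pixels fall back to proposal 1, which is the choice covered by the autarky property.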

Corrected Iterative Algorithm
I have cheated in the definition of the iterative algorithm.
Iterative Algorithm:
Start with an arbitrary labeling f.
For each proposal g:
    Find f* = argmin E(f') among f' being one possible fusion of f and g.
    f := f*
In the general case, we cannot really compute the globally optimal fusion move (NP-hard problem). We just find a "good" one.
The energy of f* is guaranteed to be lower than or equal to that of f (autarky property of QPBO).
The iterative algorithm will therefore converge to a local energy minimum.
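The iterative scheme can be sketched on a toy problem. To keep the example self-contained, the fusion step below is an exhaustive search over all binary choices, which is a stand-in for QPBO and is only feasible for a handful of pixels. The energy (per-pixel data costs plus a Potts term on a 1-D chain), the data costs, and all function names are assumptions made for illustration.

```python
from itertools import product


def energy(labels, data_cost, smooth_weight=1):
    """Unary data cost plus an assumed Potts term on a 1-D chain of pixels."""
    unary = sum(data_cost[i][label] for i, label in enumerate(labels))
    pairwise = sum(smooth_weight for a, b in zip(labels, labels[1:]) if a != b)
    return unary + pairwise


def fuse_brute_force(f, g, data_cost):
    """Stand-in for QPBO: try every fusion of f and g, never accept a worse one."""
    best, best_energy = list(f), energy(f, data_cost)
    for x in product((0, 1), repeat=len(f)):
        candidate = [gi if xi else fi for xi, fi, gi in zip(x, f, g)]
        candidate_energy = energy(candidate, data_cost)
        if candidate_energy < best_energy:
            best, best_energy = candidate, candidate_energy
    return best


def iterate_fusion(f, proposals, data_cost, fuse=fuse_brute_force):
    """Fuse the current labeling with each proposal in turn; track the energy."""
    energies = [energy(f, data_cost)]
    for g in proposals:
        f = fuse(f, g, data_cost)
        energies.append(energy(f, data_cost))
    return f, energies


# Four pixels, labels 0..2; data_cost[i][label] is a made-up matching cost.
data_cost = [
    {0: 0, 1: 2, 2: 2},
    {0: 0, 1: 2, 2: 2},
    {0: 2, 1: 2, 2: 0},
    {0: 2, 1: 2, 2: 0},
]
start = [1, 1, 1, 1]
proposals = [[0, 0, 0, 0], [2, 2, 2, 2]]

result, energies = iterate_fusion(start, proposals, data_cost)
print(result, energies)
```

Because the current labeling f is itself one of the candidate fusions, the recorded energies can never increase from one iteration to the next. With constant proposals, as used here, each fusion step is an α-expansion.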

Summary
Move making algorithms
α-expansions:
- Iterative algorithm
- Computing the optimal α-expansion
- Sub-modularity condition
Fusion moves:
- Handle large label spaces
- Computing a "good" fusion move
- QPBO

References
[Boykov,PAMI01] Y. Boykov, O. Veksler, R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI, vol. 23, no. 11, pp. 1222-1239, 2001.
[Boykov,PAMI04] Y. Boykov, V. Kolmogorov, An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, PAMI, vol. 26, no. 9, pp. 1124-1137, 2004.
[Kolmogorov,PAMI07] V. Kolmogorov, C. Rother, Minimizing Nonsubmodular Functions with Graph Cuts - A Review, PAMI, vol. 29, no. 7, pp. 1274-1279, 2007.
[Lempitsky,ICCV07] V. Lempitsky, C. Rother, A. Blake, LogCut - Efficient Graph Cut Optimization for Markov Random Fields, ICCV 2007.
[Rother,CVPR07] C. Rother, V. Kolmogorov, V. Lempitsky, M. Szummer, Optimizing Binary MRFs Via Extended Roof Duality, CVPR 2007.
[Roy,ICCV98] S. Roy, I. Cox, A Maximum-Flow Formulation of the N-Camera Stereo Correspondence Problem, ICCV 1998.