Stereo Computation using Iterative Graph-Cuts Vision Course - Weizmann Institute
The “Binocular” Stereo problem “Right” Camera “Left” Camera Two views of the same scene from a slightly different point of view
The stereo problem Both images are very similar (like the images that you see with your two eyes) • Most of the pixels in the left image are present in the right image (except for a few occlusions)
Rectification Image Reprojection reproject image planes onto common plane parallel to baseline
The stereo problem After rectification: all correspondences are along the same horizontal scan lines (pixels in one image simply shift horizontally in the other image)
The relation between depth and disparities Origin at midpoint between camera centers Axes parallel to those of the two (rectified) cameras Depth and Disparity are inversely proportional
The stereo problem The horizontal shifts between the images are sometimes called: “disparities” The Disparities are related to depth: Closer objects have larger disparities
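The inverse relation between depth and disparity can be sketched in a few lines. This is a minimal illustration assuming the standard rectified pinhole model, Z = f * B / d; the function name and the numeric values (focal length, baseline) are hypothetical and do not come from the slides.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair (assumed pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: focal length 800 px, baseline 0.1 m.
near = depth_from_disparity(40.0, 800.0, 0.1)  # large disparity
far = depth_from_disparity(10.0, 800.0, 0.1)   # small disparity
```

The point with the larger disparity comes out closer to the cameras, as the slide states.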
The stereo problem: compute the disparity map between two images
Traditional Approaches • Matching small windows around each pixel • Each window is matched independently Modern approaches • Finding coherent correspondences for all pixels - “Graph cuts” - “Belief Propagation”
Window-Based Approach Compute a cost for each location Location with the lowest cost wins
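A minimal sketch of the window-based approach, assuming rectified grayscale images stored as lists of rows and a sum-of-absolute-differences cost (the slides do not name the cost function). The function names and the toy images are hypothetical.

```python
def window_cost(left, right, x, y, d, half):
    """Sum of absolute differences between the window around (x, y) in the
    left image and the window around (x - d, y) in the right image."""
    cost = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            cost += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return cost


def best_disparity(left, right, x, y, max_disp, half=1):
    """Winner-take-all: return the disparity with the lowest window cost."""
    # Only keep disparities whose window stays inside the right image.
    candidates = [d for d in range(max_disp + 1) if x - d - half >= 0]
    return min(candidates, key=lambda d: window_cost(left, right, x, y, d, half))


# Toy pair: the left image is the right image shifted by 2 pixels.
width, height, shift = 10, 3, 2
right = [[x * x + y for x in range(width)] for y in range(height)]
left = [[right[y][x - shift] if x >= shift else 0 for x in range(width)]
        for y in range(height)]

d = best_disparity(left, right, x=5, y=1, max_disp=3)
```

Each pixel is matched independently here, which is exactly why the result is noisy in low-texture areas.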
General Problem : Ambiguity Left Right scanline
Window-Based Approach Small Window Large Window noisy in low texture areas blurred boundaries
Results with best window size (still not good enough) Window-based matching (best window size) Ground truth
Graph Cuts Graph cuts Ground truth
Maximum flow problem A graph with two terminals: the “source” S and the “sink” T. Each edge is a “pipe”, and the edge weights give the pipe’s capacity. Max flow problem: find the largest flow F of “water” that can be sent from the “source” to the “sink” along the pipes.
Minimum cut problem A graph with two terminals: the “source” S and the “sink” T. Edge weights now represent cutting “costs”. Min cut problem: find the cheapest way to cut the edges (a cut C) so that the “source” is completely separated from the “sink”.
Max flow/Min cut theorem A graph with two terminals: the “source” S and the “sink” T. Maximum flow saturates the edges along the minimum cut (Ford and Fulkerson, 1962). Problem reduction! Ford and Fulkerson gave the first polynomial time algorithm for a globally optimal solution.
The basic Ford-Fulkerson algorithm
for each edge (u, v) in E do
    f[u, v] ← 0
    f[v, u] ← 0
while there exists a path P from s to t in the residual network Gf do
    cf(P) ← min{cf(u, v) : (u, v) is on P}
    for each edge (u, v) in P do
        f[u, v] ← f[u, v] + cf(P)
        f[v, u] ← −f[u, v]
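The augmenting-path idea can be sketched as runnable code. This version picks augmenting paths with breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson, swapped in here because it is easy to write down), and also returns the source side of a minimum cut. The function name, graph encoding and example graph are hypothetical.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """capacity: dict {u: {v: c}}. Returns (max flow value, source-side nodes)."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {s: {}, t: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    def reachable_from_source():
        """BFS over edges with positive residual capacity; returns parent map."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parent)
        # Bottleneck capacity cf(P) along the path found.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck along the path and update the residual graph.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


graph = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
flow, source_side = max_flow_min_cut(graph, "s", "t")
cut_cost = sum(c for u in source_side for v, c in graph.get(u, {}).items()
               if v not in source_side)
```

On this small graph the cost of the cut equals the value of the flow, as the max flow/min cut theorem predicts.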
Min-Cut: Important Rule No proper subset of a minimal cut can also be a cut. (Figure: an example of a cut that is not minimal.)
Energy Minimization Using Iterative Graph cuts Fast Approximate Energy Minimization via Graph Cuts, Yuri Boykov, Olga Veksler and Ramin Zabih, PAMI 2001. More papers, code: http://www.cs.cornell.edu/~rdz/graphcuts.html
To do better we need a better model of images We can make reasonable assumptions about the surfaces in the world Usually assume that the surfaces are smooth Can pose the problem of finding the corresponding points as an energy (or cost) minimization:
$E(f) = E_{smooth}(f) + E_{data}(f)$
f - assignment (a disparity for every pixel)
$E_{smooth}$ - neighboring pixels have similar disparities
$E_{data}$ - how well the pixels match up for different disparities
Written in terms of pixels, the energy is:
$E(f) = \sum_{\{p,q\} \in N} V_{p,q}(f_p, f_q) + \sum_{p} D_p(f_p)$
f - assignment; p, q - pixels; N - the set of pairs of neighboring pixels
The data term $D_p$ is calculated for each pixel. The smoothness term $V_{p,q}$ is calculated on pairs of neighboring pixels.
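A minimal sketch of evaluating such an energy for a given assignment, assuming a 1D scanline of four pixels, a table of data costs and a Potts smoothness penalty. All names and numbers are hypothetical.

```python
def energy(labels, data_cost, neighbors, smooth_cost):
    """Data term summed over pixels plus smoothness term over neighbor pairs."""
    data = sum(data_cost[p][labels[p]] for p in range(len(labels)))
    smooth = sum(smooth_cost(labels[p], labels[q]) for p, q in neighbors)
    return data + smooth


def potts(a, b, penalty=2):
    """Constant penalty whenever two neighboring labels differ."""
    return penalty if a != b else 0


# data_cost[p][l]: cost of giving pixel p the disparity label l (3 labels).
data_cost = [[0, 5, 5], [0, 5, 5], [4, 1, 5], [5, 0, 5]]
neighbors = [(0, 1), (1, 2), (2, 3)]  # adjacent pixels on the scanline

e_good = energy([0, 0, 1, 1], data_cost, neighbors, potts)
e_worse = energy([0, 0, 0, 1], data_cost, neighbors, potts)
```

Both assignments pay for one label change, but the first one agrees better with the data and so has the lower energy.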
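Example for Smoothness terms
Quadratic: $V(\alpha, \beta) = (\alpha - \beta)^2$
L1: $V(\alpha, \beta) = |\alpha - \beta|$
Truncated L1: $V(\alpha, \beta) = \min(|\alpha - \beta|, K)$
Potts model: $V(\alpha, \beta) = K \cdot [\alpha \neq \beta]$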
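The four smoothness terms can be written as small functions. These use the commonly used definitions of each term; the constants `limit` and `penalty` are hypothetical parameters chosen for illustration.

```python
def quadratic(a, b):
    """Quadratic: grows with the square of the label difference."""
    return (a - b) ** 2


def l1(a, b):
    """L1: grows linearly with the label difference."""
    return abs(a - b)


def truncated_l1(a, b, limit=2):
    """Truncated L1: linear, but capped so large jumps are not over-penalised."""
    return min(abs(a - b), limit)


def potts(a, b, penalty=1):
    """Potts: the same penalty for any two different labels."""
    return penalty if a != b else 0


jump = {name: fn(1, 4) for name, fn in
        [("quadratic", quadratic), ("l1", l1),
         ("truncated_l1", truncated_l1), ("potts", potts)]}
```

For a jump from label 1 to label 4, the quadratic term penalises most heavily, while the truncated L1 and Potts terms stay bounded, which is what lets them preserve depth discontinuities.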
Constructing a Graph to Solve the Stereo Problem
Constructing a Graph to Solve the Stereo Problem The labels of each pixel are the possible disparity values
Relation between the Energy and the Graph labeling problem (Figure: two neighboring pixels p and q with their labels, {fp=10} and {fq=2}; the data term and the smoothness term are marked on the graph.)
Relation between the Energy and the Graph labeling problem (Figure: the data term Dp(10) and the smoothness term V(p,q)(1, 10) marked on the graph for the neighboring pixels p and q.)
Iterative graph-cuts Use an iterative scheme to find a “good” local optimum of the energy function. In each iteration: convert the original multi-label problem to a binary one, and solve it by finding a minimal graph-cut (max-flow). The most popular scheme is the expansion move. α-expansion: set the label of each pixel to be either α or the current label.
Types of Moves A Single Pixel Move Problem: A lot of local minima
Types of Moves Expansion Move Any pixel can change its label to alpha
Types of Moves Expansion Move Claim (without proof): The difference between the optimal solution and the solution from the iterative expansion moves is bounded
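Energy Minimization Algorithm
1. Start with an arbitrary labeling f
2. Set success = 0
3. For each label α:
   3.1 Find $\hat{f} = \arg\min E(f')$ among the labelings f' within one α-expansion of f
   3.2 If $E(\hat{f}) < E(f)$, set $f = \hat{f}$ and success = 1
4. If success = 1 goto (2)
5. Return f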
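The loop structure of the algorithm can be sketched on a toy problem. Note the swap: the binary sub-problem (which pixels switch to α) is solved here by exhaustive search instead of a min-cut, which is only feasible for a handful of pixels. All names, costs and the 5-pixel scanline are hypothetical.

```python
from itertools import product


def energy(labels, data_cost, neighbors, smooth):
    data = sum(data_cost[p][labels[p]] for p in range(len(labels)))
    return data + sum(smooth(labels[p], labels[q]) for p, q in neighbors)


def best_expansion(labels, alpha, data_cost, neighbors, smooth):
    """Best labeling within one alpha-expansion of `labels`.
    Brute force over all subsets of pixels; a real solver uses a min-cut."""
    best, best_e = list(labels), energy(labels, data_cost, neighbors, smooth)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else l for s, l in zip(switch, labels)]
        e = energy(cand, data_cost, neighbors, smooth)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e


def alpha_expansion(labels, num_labels, data_cost, neighbors, smooth):
    labels = list(labels)
    current = energy(labels, data_cost, neighbors, smooth)
    success = True
    while success:                       # repeat sweeps until nothing improves
        success = False
        for alpha in range(num_labels):  # one expansion move per label
            cand, e = best_expansion(labels, alpha, data_cost, neighbors, smooth)
            if e < current:
                labels, current, success = cand, e, True
    return labels, current


observed = [0, 0, 2, 2, 2]
data_cost = [[3 * abs(o - l) for l in range(3)] for o in observed]
neighbors = [(p, p + 1) for p in range(len(observed) - 1)]
potts = lambda a, b: 1 if a != b else 0

result, final_energy = alpha_expansion([1] * 5, 3, data_cost, neighbors, potts)
```

Starting from the all-ones labeling, the 0-expansion fixes the first two pixels and the 2-expansion fixes the remaining three.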
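Conditions on the Smoothness for using expansion moves:
$V(\alpha, \beta) = 0 \Leftrightarrow \alpha = \beta$
$V(\alpha, \beta) = V(\beta, \alpha) \ge 0$
$V(\alpha, \beta) \le V(\alpha, \gamma) + V(\gamma, \beta)$
In other words: V should be a metric. Note: the Quadratic smoothness is not a metric (it violates the triangle inequality).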
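A small check of the metric requirement, assuming the usual three metric axioms (zero only for equal labels, symmetry, triangle inequality) and a label set of four integer disparities. The function names and constants are hypothetical.

```python
from itertools import product


def is_metric(V, labels):
    """Check the metric axioms for V on a finite label set."""
    for a, b in product(labels, repeat=2):
        if (V(a, b) == 0) != (a == b):          # zero exactly for equal labels
            return False
        if V(a, b) != V(b, a) or V(a, b) < 0:   # symmetric and non-negative
            return False
    for a, b, c in product(labels, repeat=3):
        if V(a, b) > V(a, c) + V(c, b):         # triangle inequality
            return False
    return True


labels = range(4)
checks = {
    "quadratic": is_metric(lambda a, b: (a - b) ** 2, labels),
    "l1": is_metric(lambda a, b: abs(a - b), labels),
    "truncated_l1": is_metric(lambda a, b: min(abs(a - b), 2), labels),
    "potts": is_metric(lambda a, b: 1 if a != b else 0, labels),
}
```

The quadratic term fails because, for example, V(0, 2) = 4 exceeds V(0, 1) + V(1, 2) = 2.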
For each pair of neighboring pixels p, q such that $f_p \neq f_q$ we add a ‘dummy’ vertex between them (together with the respective edges as shown in the table).
The Relation between the cut and the Energy Given a cut C, we define a labeling $f^C$ by:
$f^C_p = \alpha$ if the cut C separates p and $\alpha$
$f^C_p = f_p$ if the cut C separates p and $\bar{\alpha}$
The cost of a cut C is $|C| = E(f^C)$ (plus a constant)
The Relation between the cut and the Energy The case
The Relation between the cut and the Energy The case
The image segmentation problem: given an image, partition it into several regions containing pixels with similar intensities / colors. (Figure: an image and a labeling.)
Supervised Image Segmentation We assume that for each segment, user scribbles are given: pixels that are known to be inside this segment. (Figure: examples with two segments.)
Again, we can describe the problem using a graph, with a smoothness term and a data term.
{fp=5} means: p belongs to the segment ’5’.
V(p,q)(1, 10): the penalty for assigning p and q to different segments (1 and 10 respectively).
D(p)(5): the penalty for assigning p to segment 5.
The Data & Smoothness terms
Smoothness term: the penalty for assigning p and q to different segments should be high if the colors of pixels p and q are similar (and low if they differ).
Data term: we have only one constraint, that the user scribbles are assigned the correct label. The term is defined separately for pixels inside user scribbles and for the rest of the pixels.
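One possible way to realise these two terms, as a sketch. The Gaussian form of the smoothness penalty, the value of `sigma`, the large constant standing in for an "infinite" penalty, and the zero cost for unscribbled pixels are all assumptions made for illustration; the slide's own formulas are not reproduced here.

```python
import math

LARGE = 10 ** 6  # assumed stand-in for an "infinite" penalty


def smoothness_penalty(color_p, color_q, sigma=10.0):
    """Penalty for putting neighbors p and q in different segments:
    close to 1 for similar colors, close to 0 across a strong edge."""
    return math.exp(-((color_p - color_q) ** 2) / (2 * sigma ** 2))


def data_penalty(pixel, label, scribbles):
    """scribbles: dict {pixel: segment label marked by the user}.
    Scribbled pixels must keep their label; other pixels are unconstrained."""
    if pixel in scribbles and scribbles[pixel] != label:
        return LARGE
    return 0


scribbles = {(0, 0): 1, (5, 5): 2}  # hypothetical user input
similar = smoothness_penalty(100, 105)
different = smoothness_penalty(100, 200)
```

With these definitions, cutting between two similar pixels is expensive, so segment boundaries are pushed toward strong color edges, while the scribbles anchor each segment.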
Graph Cuts can be used for problems other than Stereo (segmentation, noise removal, image stitching, etc.).