Lecture 19: Solving the Correspondence Problem with Graph Cuts CAP 5415 Fall 2006
Announcements. PS4: only need to search over a limited number of disparities. Schedule changes: project proposals due next Monday; class canceled on Wednesday; will have class on November 22.
Stereo: once we have rectified images, the hard problem is finding the corresponding points. Then we can triangulate. This is known as the correspondence problem. When people say “stereo algorithm”, they usually mean “correspondence algorithm”.
A very simple algorithm To find the point that matches this point
A very simple algorithm Look at a square window around that point and compare it to windows in the other image Could measure sum of squared differences (SSD) or correlation
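The window comparison can be sketched as follows. This is a minimal illustration only: the function names, the toy images, and the convention that a pixel at column x in the left image appears at column x - d in the right image are assumptions made for the example, not part of the lecture.

```python
def ssd(left, right, row, col_left, col_right, half):
    """Sum of squared differences between two (2*half+1)-square windows."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[row + dr][col_left + dc] - right[row + dr][col_right + dc]
            total += diff * diff
    return total


def best_disparity(left, right, row, col, half, max_disp):
    """Return the disparity in [0, max_disp] whose right-image window matches best."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if col - d - half < 0:  # the window would fall off the image
            break
        cost = ssd(left, right, row, col, col - d, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Toy rectified pair (hypothetical data): every feature in the left image
# appears 2 columns further left in the right image.
left = [[(7 * r + 13 * c * c) % 31 for c in range(12)] for r in range(5)]
right = [[left[r][min(c + 2, 11)] for c in range(12)] for r in range(5)]

found = best_disparity(left, right, row=2, col=6, half=1, max_disp=4)
print(found)
```

Each pixel is matched independently of its neighbors here; nothing ties the disparity at one pixel to the disparities around it.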
What's wrong with this algorithm? This set of correspondences doesn't bother it. Areas of the image without texture will be a problem for this algorithm.
To do better we need a better model of images We can make reasonable assumptions about the surfaces in the world Usually assume that the surfaces are smooth Can pose the problem of finding the corresponding points as an energy (or cost) minimization The data term measures how well the local windows match up for different disparities
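One way to write such an energy down for a single scanline is sketched below. The names `energy`, `data_cost`, `smooth_cost`, and `lam` are assumptions for illustration, and the data costs are made-up numbers standing in for window-matching costs.

```python
def energy(labels, data_cost, smooth_cost, lam=1.0):
    """Total cost of a 1-D disparity labeling: data term plus weighted smoothness term."""
    data = sum(data_cost[p][d] for p, d in enumerate(labels))
    smooth = sum(smooth_cost(labels[p], labels[p + 1]) for p in range(len(labels) - 1))
    return data + lam * smooth


def absdiff(a, b):
    """Smoothness penalty: absolute difference between neighboring disparities."""
    return abs(a - b)


# data_cost[p][d]: hypothetical matching cost of giving pixel p disparity d.
data_cost = [[0, 5], [0, 5], [3, 2], [5, 0], [5, 0]]

smooth_labeling = [0, 0, 1, 1, 1]
noisy_labeling = [0, 1, 0, 1, 0]
print(energy(smooth_labeling, data_cost, absdiff, lam=2.0))
print(energy(noisy_labeling, data_cost, absdiff, lam=2.0))
```

The labeling that agrees with the data and changes disparity only once gets the lower energy.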
The smoothness cost: smoothness is usually implemented by penalizing differences in the disparity. Different penalty functions lead to different assumptions about the surfaces. (Figure: penalty functions x^2 and log(1+x^2); panels labeled first derivative and second derivative.)
Today: assuming that there is a discrete set of possible disparities, how do we optimize this?
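A small numerical comparison of the two penalties, as a sketch. The two disparity profiles are made-up examples, and applying the penalty to differences between neighboring disparities is the assumption used here.

```python
import math


def quadratic(x):
    return x * x


def robust(x):
    return math.log(1 + x * x)


def smoothness(profile, penalty):
    """Sum the penalty over differences between neighboring disparities."""
    return sum(penalty(b - a) for a, b in zip(profile, profile[1:]))


# Two hypothetical 1-D disparity profiles that both go from 0 to 8.
ramp = [0, 2, 4, 6, 8]  # gradual slope
step = [0, 0, 8, 8, 8]  # one sharp depth edge

for name, penalty in [("x^2", quadratic), ("log(1+x^2)", robust)]:
    print(name, smoothness(ramp, penalty), smoothness(step, penalty))
```

The quadratic penalty charges the sharp edge more than the ramp, so it favors surfaces that are smooth everywhere. The log penalty charges the single large jump less than four small ones, so it tolerates depth discontinuities.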
Today I'll be talking about the Graph Cuts algorithm Very popular Used heavily in vision and graphics Paper: “Fast Approximate Energy Minimization Notation for the rest of the lecture
Simple Way to Optimize - ICM Choose a pixel Fix the disparities at the rest of the pixels Set the pixel to the optimal disparity Iterate
ICM Advantages: energy is guaranteed never to increase; easy to code. Disadvantages: convergence. Can you think of the big one?
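A minimal sketch of ICM on a 1-D scanline. It assumes a Potts smoothness term (a fixed penalty `lam` whenever two neighbors disagree); the function names and the toy costs are illustrative assumptions.

```python
def local_cost(labels, p, d, data_cost, lam):
    """Energy terms that involve pixel p if it takes disparity d."""
    cost = data_cost[p][d]
    if p > 0:
        cost += lam * (d != labels[p - 1])
    if p + 1 < len(labels):
        cost += lam * (d != labels[p + 1])
    return cost


def icm(labels, data_cost, lam, max_sweeps=100):
    """Repeatedly set each pixel to its best disparity with all other pixels fixed."""
    labels = list(labels)
    num_disp = len(data_cost[0])
    for _ in range(max_sweeps):
        changed = False
        for p in range(len(labels)):
            best = min(range(num_disp),
                       key=lambda d: local_cost(labels, p, d, data_cost, lam))
            current = local_cost(labels, p, labels[p], data_cost, lam)
            if local_cost(labels, p, best, data_cost, lam) < current:
                labels[p] = best
                changed = True
        if not changed:  # a full sweep with no change: converged
            break
    return labels


# Hypothetical costs: disparity 1 is slightly better at every pixel.
data_cost = [[1, 0]] * 4
weak = icm([0, 0, 0, 0], data_cost, lam=0.25)
strong = icm([0, 0, 0, 0], data_cost, lam=2)
print(weak, strong)
```

With the weak smoothness weight every pixel moves to disparity 1. With the strong weight no single pixel can move without paying more in smoothness than it saves in data cost, so the labeling stays at all zeros even though all ones has lower total energy.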
Making Bigger Moves The problem with ICM is that you can only modify one pixel at a time Get stuck in local minima easily We need a way of moving multiple pixels at once Boykov, Veksler and Zabih introduced two types of moves: alpha-beta swap alpha-expansion
Types of Moves ICM – One pixel moves (From BKZ-PAMI 01)
alpha-beta swap Fix all nodes that aren't labeled alpha or beta With remaining nodes, find optimal swap Some alpha nodes change to beta Vice Versa Some stay the same (From BKZ-PAMI 01)
alpha-expansion Any node can change to alpha (From BKZ-PAMI 01)
Basic Algorithm. ICM: compute one-pixel moves until convergence. Graph Cuts: compute swaps (or expansions) until convergence. Since the optimal swap is computed each time, the energy never increases.
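The outer loop for swap moves can be sketched as below. To keep the example short, the optimal swap is found by brute-force enumeration, which is exponential and only usable on toy problems; it stands in for the min-cut computation and is not how graph cuts finds the swap. All names and costs are assumptions for illustration.

```python
from itertools import combinations, product


def energy(labels, data_cost, V):
    """Data term plus pairwise smoothness term on a 1-D scanline."""
    data = sum(data_cost[p][l] for p, l in enumerate(labels))
    return data + sum(V(a, b) for a, b in zip(labels, labels[1:]))


def best_swap_bruteforce(labels, alpha, beta, data_cost, V):
    """Try every alpha/beta reassignment of the pixels currently labeled alpha or beta."""
    active = [p for p, l in enumerate(labels) if l in (alpha, beta)]
    best = list(labels)
    for choice in product((alpha, beta), repeat=len(active)):
        trial = list(labels)
        for p, l in zip(active, choice):
            trial[p] = l
        if energy(trial, data_cost, V) < energy(best, data_cost, V):
            best = trial
    return best


def swap_moves(labels, num_labels, data_cost, V):
    """Cycle over label pairs, applying the best swap, until nothing improves."""
    labels = list(labels)
    improved = True
    while improved:
        improved = False
        for alpha, beta in combinations(range(num_labels), 2):
            candidate = best_swap_bruteforce(labels, alpha, beta, data_cost, V)
            if energy(candidate, data_cost, V) < energy(labels, data_cost, V):
                labels, improved = candidate, True
    return labels


def potts(a, b):
    return 2 * (a != b)


data_cost = [[1, 0, 5], [1, 0, 5], [5, 5, 0], [1, 0, 5]]  # hypothetical costs
result = swap_moves([0, 0, 0, 0], 3, data_cost, potts)
print(result, energy(result, data_cost, potts))
```

A candidate is accepted only when it strictly lowers the energy, so the loop terminates.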
Important Question: how do we find the swaps? Use min-cut from graphs. Slightly different from the min-cut problem that we discussed in the context of normalized cuts. (Figure label: Terminal Node)
Minimum Cut: a cut is a set of edges whose removal separates the terminal nodes. If a cost is assigned to each edge, the cost of the cut is the sum of the costs of the edges in the cut. A minimum cut can be found in polynomial time (Ford and Fulkerson). (Figure label: Terminal Node)
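A small sketch of computing a minimum cut between two terminals through max-flow. It uses the Edmonds-Karp variant of the Ford and Fulkerson method (shortest augmenting paths found by breadth-first search). The graph, node names, and function name are made up for the example.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Return (cost of the minimum cut, set of nodes left on the source side)."""
    # Residual capacities, with reverse edges added at capacity 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent, queue = {source: None}, deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: the reachable nodes form the source side.
            return flow, set(parent)
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push


# Hypothetical graph with terminals "s" and "t".
capacity = {
    "s": {"a": 4, "b": 1},
    "a": {"b": 2, "t": 1},
    "b": {"a": 2, "t": 5},
}
cost, source_side = min_cut(capacity, "s", "t")
print(cost, sorted(source_side))
```

The cut edges are the ones leaving the source side: s to b, a to b, and a to t, with costs 1 + 2 + 1 = 4.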
Important Rule: no proper subset of the cut can also be a cut. (Figure: this is not a minimum cut. Label: Terminal Node)
Using the minimum-cut to find a swap: consider a 1-D image. (Figure label: Terminal Node) (From BKZ-PAMI 01)
Can guarantee that every node will have at least one t-link (From BKZ-PAMI 01)
How to see that (From BKZ-PAMI 01)
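The swap computation for a 1-D image can be sketched as below. The edge weights follow the construction in the Boykov, Veksler and Zabih paper as best it can be restated here: each active pixel gets a t-link to each terminal carrying its data cost plus the smoothness cost against any fixed neighbors, and neighboring active pixels share an n-link of weight V(alpha, beta). The function names, the Edmonds-Karp min-cut routine, and the costs are assumptions for illustration.

```python
from collections import deque


def source_side_of_min_cut(cap, s, t):
    """Edmonds-Karp max-flow; returns the nodes still reachable from s afterwards."""
    res = {u: dict(nb) for u, nb in cap.items()}
    for u, nb in cap.items():
        for v in nb:
            res.setdefault(v, {}).setdefault(u, 0)
    while True:
        parent, queue = {s: None}, deque([s])
        while queue:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push


def optimal_swap(labels, alpha, beta, data_cost, V):
    """Best alpha-beta swap on a 1-D labeling; pixels with other labels stay fixed."""
    n = len(labels)
    active = [p for p in range(n) if labels[p] in (alpha, beta)]
    cap = {"alpha": {}, "beta": {}}
    for p in active:
        cap[p] = {}
        t_alpha, t_beta = data_cost[p][alpha], data_cost[p][beta]
        for q in (p - 1, p + 1):
            if 0 <= q < n:
                if labels[q] in (alpha, beta):
                    cap[p][q] = V(alpha, beta)      # n-link between active neighbors
                else:
                    t_alpha += V(alpha, labels[q])  # fixed neighbor folded into t-links
                    t_beta += V(beta, labels[q])
        cap["alpha"][p] = t_alpha  # cutting this t-link assigns alpha to p
        cap[p]["beta"] = t_beta    # cutting this t-link assigns beta to p
    source_side = source_side_of_min_cut(cap, "alpha", "beta")
    new_labels = list(labels)
    for p in active:
        # A pixel still attached to the alpha terminal had its beta link cut.
        new_labels[p] = beta if p in source_side else alpha
    return new_labels


def potts(a, b):
    return 2 * (a != b)


data_cost = [[0, 4, 9], [3, 0, 9], [9, 9, 0], [4, 0, 9], [0, 4, 9]]  # hypothetical
swapped = optimal_swap([0, 0, 2, 1, 1], 0, 1, data_cost, potts)
print(swapped)
```

Every active pixel ends up with exactly one of its two t-links in the cut, and that link determines its new label. The pixel labeled 2 is not part of the graph and keeps its label.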
Actually Computing the Min-Cut Could use code to solve general min-cut problems Boykov and Kolmogorov have released an algorithm optimized for vision problems
Expansion Move: the energy of the solution found with expansion moves is within a known constant factor of the energy of the optimal solution (a factor of 2 for the Potts model).
What functions can be minimized using graph cuts? V is called a metric if it obeys all 3 properties: can use expansion moves. V is called a semimetric if it obeys only the last two: can only use swap moves. Subject of recent research.
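A quick way to check which case a given smoothness function falls into, as a sketch. It assumes the three properties are the ones from the Boykov, Veksler and Zabih paper: V(a, b) = 0 exactly when a = b; V(a, b) = V(b, a) >= 0; and the triangle inequality V(a, b) <= V(a, c) + V(c, b), which is the one a semimetric need not satisfy. The function name and the example penalties are illustrative.

```python
from itertools import product


def classify(V, labels):
    """Classify V as 'metric', 'semimetric', or 'neither' over a finite label set."""
    identity = all((V(a, b) == 0) == (a == b)
                   for a, b in product(labels, repeat=2))
    symmetric = all(V(a, b) == V(b, a) >= 0
                    for a, b in product(labels, repeat=2))
    triangle = all(V(a, b) <= V(a, c) + V(c, b)
                   for a, b, c in product(labels, repeat=3))
    if identity and symmetric:
        return "metric" if triangle else "semimetric"
    return "neither"


def potts(a, b):
    return int(a != b)


def truncated_linear(a, b):
    return min(abs(a - b), 2)


def truncated_quadratic(a, b):
    return min((a - b) ** 2, 4)


disparities = range(5)
for V in (potts, truncated_linear, truncated_quadratic):
    print(V.__name__, classify(V, disparities))
```

The truncated quadratic fails the triangle inequality because V(0, 2) = 4 while V(0, 1) + V(1, 2) = 2, so under the rule above it can be used with swap moves but not with expansion moves.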
Graph-Cuts has been shown to perform very well
From “A Comparative Study of Energy Minimization Methods for Markov Random Fields”
Used for much more than just stereo