The plan for today Camera matrix Part A) Notation, preprocessing, and basic concepts. Part B) 4 Stereo Algorithms Slides are courtesy of Prof. Ronen Basri
Stereo Vision Objective: 3D reconstruction Input: 2 (or more) images taken with calibrated cameras Output: 3D structure of scene Steps: Rectification Matching Depth estimation
Rectification Image reprojection: reproject the image planes onto a common plane parallel to the baseline. Note that only the focal point of each camera really matters (Seitz)
Rectification Any stereo pair can be rectified by rotating and scaling the two image planes (= a homography per image). We will assume the images have been rectified, so that: the image planes of the cameras are parallel; the focal points are at the same height; the focal lengths are the same. Then the epipolar lines fall along the horizontal scan lines of the images
Cyclopean Coordinates Origin at midpoint between camera centers Axes parallel to those of the two (rectified) cameras
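Disparity The difference $d = x_l - x_r = fb/Z$ between the positions of a point in the left and right images is called “disparity” (f: focal length, b: baseline, Z: depth). d is inversely related to Z: greater sensitivity to nearby points. d is directly related to b: depth estimates degrade when the baseline is small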
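The inverse relation between disparity and depth can be checked numerically. The sketch below assumes the standard rectified pinhole relation Z = f·b/d; the function name and the numbers (f = 800 px, b = 0.5 m) are illustrative assumptions, not values from the slides.

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth Z = f * b / d for a rectified pair (assumed pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("a point at finite depth has positive disparity")
    return focal_px * baseline / disparity_px

# Illustrative numbers (assumptions): f = 800 px, b = 0.5 m.
for d in (40, 20, 4, 2):
    print(d, depth_from_disparity(d, 800.0, 0.5))
```

With these numbers, halving the disparity from 40 to 20 px moves the depth from 10 m to 20 m, while halving it from 4 to 2 px moves it from 100 m to 200 m: the same relative change in disparity costs far more depth accuracy for distant points.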
Main Step: Correspondence Search What to match? Objects? More identifiable, but difficult to compute Pixels? Easier to handle, but maybe ambiguous Edges? Collections of pixels (regions)?
Random Dot Stereogram Using random dot pairs Julesz showed that recognition is not needed for stereo
Random Dot in Motion
Finding Matches
1D Search More efficient; fewer false matches (plot: SSD error vs. disparity)
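A 1D search along a rectified scanline can be sketched as follows. This is a minimal illustration, assuming a sum-of-squared-differences (SSD) window cost and the convention that a left-image pixel at x appears at x - d in the right image; the function names and the toy rows are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_disparity(left_row, right_row, x, half_win, max_disp):
    """Return (disparity, ssd) minimizing SSD for the window centered at x in left_row."""
    ref = left_row[x - half_win : x + half_win + 1]
    best = None
    for d in range(max_disp + 1):
        xr = x - d  # candidate position in the right scanline
        if xr - half_win < 0:
            break  # window would leave the image
        cand = right_row[xr - half_win : xr + half_win + 1]
        cost = ssd(ref, cand)
        if best is None or cost < best[1]:
            best = (d, cost)
    return best

# Toy scanlines: the left row is the right row shifted by 2 pixels.
right_row = [10, 10, 50, 90, 20, 10, 10, 10, 10, 10]
left_row = [10, 10, 10, 10, 50, 90, 20, 10, 10, 10]
print(best_disparity(left_row, right_row, x=5, half_win=1, max_disp=3))  # (2, 0)
```

Plotting the cost against d for a fixed x gives the SSD-error-versus-disparity curve; the estimate is the location of its minimum.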
Finding Matches Under what conditions can pixels be matched? Ignoring specularities, we can assume that matching pixels have the same brightness (constant brightness assumption). Still, changes in gain and sensitivity may change the values of pixels
Possible solutions: use larger windows; constrain the search; use another metric, e.g., normalized correlation
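To see why normalized correlation helps with gain and sensitivity changes, here is a small sketch of zero-mean normalized cross-correlation between two windows. The function name and the test window are illustrative assumptions; treating a flat window as score 0 is a design choice of this sketch.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length windows, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # flat window: correlation undefined, treated as no evidence
    return sum(x * y for x, y in zip(da, db)) / denom

window = [50, 90, 20]
brighter = [2 * v + 15 for v in window]  # same pattern under a gain and offset change
print(ncc(window, brighter))
```

The score stays at (numerically) 1 under the gain and offset change, whereas an SSD cost between the same two windows would be large.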
Window Size Small window (W = 3): an accurate match is more likely. Large window (W = 20): fewer false positives
Constraining the Search Restrict search to epipolar lines (1D search) Enforce ordering Problem: not always true Enforce smoothness Problem: discontinuities at object boundaries
Matching objects vs. pixels (figure: left and right scanlines)
Ordering
Summary basic ideas Restrict search to epipolar lines (1D search) Use larger elements (larger windows, edges, regions) Problem: large elements may be distorted Enforce ordering Problem: not always true Other similarity measures (e.g., Normalized correlation) Enforce smoothness Problem: discontinuities at object boundaries
Comparison of Stereo Algorithms Scene Ground truth D. Scharstein and R. Szeliski. "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms," International Journal of Computer Vision, 47 (2002), pp. 7-42.
Scharstein and Szeliski
Results with window correlation Window-based matching (best window size) Ground truth
Graph Cuts Graph cuts Ground truth
Correspondence as Optimization Most stereo algorithms attempt to minimize a functional that usually consists of two terms: $E(d) = E_{\text{data}}(d) + \lambda E_{\text{smooth}}(d)$, where $E_{\text{data}}$ penalizes poor match quality (unary term) and $E_{\text{smooth}}$ penalizes non-smooth (or even non-fronto-parallel) reconstructions (binary term). Many different optimization approaches have been proposed
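As a concrete instance of such a two-term functional, the sketch below evaluates a data term plus a weighted smoothness term for one scanline. The specific choices (absolute intensity difference, absolute disparity difference between neighbors, a fixed border penalty, the weight `lam`) are assumptions made for illustration, not the definitions used on the slide.

```python
def data_term(left_row, right_row, disp, border_cost=255):
    """Sum of |I_left(x) - I_right(x - d(x))|; border_cost is an assumed penalty off-image."""
    total = 0
    for x, d in enumerate(disp):
        xr = x - d
        if 0 <= xr < len(right_row):
            total += abs(left_row[x] - right_row[xr])
        else:
            total += border_cost
    return total

def smoothness_term(disp):
    """Sum of |d(x) - d(x+1)| over neighboring pixels."""
    return sum(abs(a - b) for a, b in zip(disp, disp[1:]))

def energy(left_row, right_row, disp, lam=10):
    return data_term(left_row, right_row, disp) + lam * smoothness_term(disp)

right_row = [10, 10, 50, 90, 20, 10, 10, 10, 10, 10]
left_row = [10, 10, 10, 10, 50, 90, 20, 10, 10, 10]
shifted = [0, 0, 2, 2, 2, 2, 2, 2, 2, 2]  # matches the 2-pixel shift, one disparity jump
flat = [0] * 10                           # perfectly smooth but photometrically wrong
print(energy(left_row, right_row, shifted), energy(left_row, right_row, flat))  # 20 240
```

The weight `lam` sets the trade-off: raising it makes the single disparity jump in `shifted` more expensive relative to the photometric error of `flat`.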
Part B) 4 Stereo Algorithms Dynamic programming Minimal cut/Max flow Space carving Graph cut optimization
1D Methods: Dynamic Programming Discretize the 3-D space Find the correct curve at every slice (A slice = epipolar plane)
Dynamic programming Find correspondences of each epipolar line separately
Dynamic programming
Dynamic programming How do we find the best curve? Assign a weight to every edge (match, insertion, deletion), then find the shortest path (e.g., with Dijkstra's algorithm)
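The shortest-path formulation for a single pair of scanlines can be sketched with a cost table, since the graph of match, insertion, and deletion edges is acyclic. This is a minimal illustration under stated assumptions: the match weight is the absolute intensity difference, insertions and deletions (unmatched pixels) carry a fixed cost `occ`, and a plain table fill is used in place of Dijkstra's algorithm. The function name and inputs are hypothetical.

```python
def dp_scanline(left, right, occ=15):
    """Align two scanlines; return (total cost, list of (left_index, right_index) matches)."""
    n, m = len(left), len(right)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occ
    for j in range(1, m + 1):
        cost[0][j] = j * occ
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # match
                cost[i - 1][j] + occ,  # left pixel unmatched
                cost[i][j - 1] + occ,  # right pixel unmatched
            )
    # Backtrack along the cheapest path to recover the matches.
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occ:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return cost[n][m], matches

print(dp_scanline([10, 50, 90, 20], [50, 90, 20, 30]))
# (30, [(1, 0), (2, 1), (3, 2)])
```

Each match (i, j) gives a disparity of i - j, here 1 for all three matched pixels; the first left pixel and the last right pixel are left unmatched. Because the path only moves forward in both scanlines, the ordering constraint is built in.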
Results
Dynamic programming Advantages Simple, efficient Globally optimal Disadvantages Each slice computed independently (smoothness is not enforced between slices) Problems due to discretization (tilted planes)
Stereo Algorithms Brief review of 4 algorithms: Dynamic programming Minimal cut/Max flow Space carving Graph cut optimization
Min Cut/Max Flow
Min Cut/Max Flow Main idea: let's solve all the DP problems together. Objective: find the optimal cut using all the slices simultaneously.
Min Cut/Max Flow Construct a graph: Every voxel (3-D point in space) is a node Every node is connected to its 6 neighbors
Min Cut/Max Flow Weights on the edges: Data cost: change in pixel value. Smoothness cost: change in depth. (figure: each node has two data edges and four smooth edges, to its neighbors in the same slice and in the next slice/row)
Min Cut/Max Flow Add a source and a sink (figure: attached with edges of weight ∞), then find the min cut
Min Cut/Max Flow Data penalty Smoothness penalty
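How a cut can encode depths with a data and a smoothness penalty can be illustrated on a toy graph. The sketch below is a simplification and not the slide's 6-connected voxel grid: two pixels A and B each get a chain source → (p, 1) → (p, 2) → sink whose three edges carry assumed data costs for depths 0, 1, 2, and equal-level nodes are linked by smoothness edges of weight `lam`. Large reverse edges (`BIG`, standing in for ∞) keep each chain from being cut twice. The max-flow routine is a plain Edmonds-Karp implementation; all names and costs are hypothetical.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max flow. Returns (cut value, set of nodes on the source side)."""
    residual = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)  # reverse edge for undoing flow
    flow = 0
    while True:
        parent = {source: None}  # breadth-first search for an augmenting path
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)  # nodes still reachable from the source
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

BIG = 10**9                                  # stands in for the infinite weights
D = {"A": [5, 1, 4], "B": [3, 4, 1]}         # assumed data costs for depths 0, 1, 2
lam = 2                                      # assumed smoothness weight
cap = {}

def add(u, v, c):
    cap.setdefault(u, {})[v] = c

for p, costs in D.items():
    add("s", (p, 1), costs[0])
    add((p, 1), (p, 2), costs[1])
    add((p, 2), "t", costs[2])
    add((p, 2), (p, 1), BIG)                 # forbid cutting a chain twice
for level in (1, 2):
    add(("A", level), ("B", level), lam)
    add(("B", level), ("A", level), lam)

value, side = min_cut(cap, "s", "t")
depth = {p: sum((p, level) in side for level in (1, 2)) for p in D}
print(value, depth)  # 4 {'A': 1, 'B': 2}
```

The cut value 4 equals the data cost of A at depth 1 (1) plus the data cost of B at depth 2 (1) plus one smoothness edge (2), which is the minimum of data penalty plus smoothness penalty over all nine depth assignments.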
Results Input Min cut Dynamic programming
Min Cut/Max Flow Advantages All slices are optimized simultaneously Efficient Disadvantages Extension to multi-camera is difficult Discretization
Space Carving Multi-view stereo: every point in space corresponds to a match in the images. Compute a data term for each match (“photo-consistency”). (figure: example values 0.2, 0.3, 0.9, 0.8, 0.4, 0.5)
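A photo-consistency test can be sketched as a spread measure over the intensities a voxel projects to in the different views. The version below is a deliberately reduced illustration: it uses the variance as the data term and a fixed threshold, and it ignores visibility and sweep order, which a real space-carving pass has to handle. The names, samples, and threshold are assumptions.

```python
def photo_consistency(samples):
    """Variance of the intensities observed for one voxel across views (lower = more consistent)."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def carve(voxels, threshold):
    """Keep the voxels whose observations agree; the rest are carved away."""
    return [name for name, samples in voxels.items()
            if photo_consistency(samples) <= threshold]

# Hypothetical observations: one intensity per camera that sees the voxel.
voxels = {
    "v0": [0.80, 0.82, 0.79],  # views agree: likely on the surface
    "v1": [0.20, 0.90, 0.50],  # views disagree: likely empty space
}
print(carve(voxels, threshold=0.01))  # ['v0']
```

The threshold reflects the trade-off noted for space carving: a strict value lets noise punch holes in the shape, a loose one thickens it.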
Space Carving Dynamic data term (taking occlusion into account) Order of sweep is important
Space Carving
Space Carving Done for all slices simultaneously
Space Carving Computes a bound on the object, the visual hull More camera views: better result
Space Carving: Results
Space Carving Advantages: true multi-view stereo; handles occlusion. Disadvantages: limited to the visual hull; lacks a smoothness term; noise may introduce holes, while allowing for noise may thicken the shape; discretization
Graph Cut Optimization Stereo is a minimization problem Possible solution: local search (gradient descent) Problem: inefficient, local minima Instead, search larger areas at every iteration
Graph Cut Optimization Construct a graph to represent the problem. Nodes: pixels (in the first image) and k discrete depth values (1, 2, …, k). Edges: from every pixel node to a depth node (data term), and between neighboring nodes (smoothness). Assign weights corresponding to pixel intensities to get a global cost function
Graph Cut Optimization Objective: multiway cut. Edges: every pixel remains connected to exactly one depth node; edges between neighboring nodes remain only if both are connected to the same depth node. Nodes are assigned the depth that they are connected to. Multiway cut is NP-hard, so solve iteratively
Graph Cut Optimization α-β swap: nodes labeled α or β (i.e., connected to depth node α or β) can change their labeling to α or β. Edges between neighbors are updated according to the new labeling; other edges are not changed. Finding the best swap = min cut!
Graph Cut Optimization Example: 1-2 swap (figure: pixel nodes connected to depth nodes 1, 2, 3, …, k)
Graph Cut Optimization Example: 1-2 swap Connect the nodes labeled 1 or 2 to both labels
Graph Cut Optimization Example: 1-2 swap Mark 1 as source and 2 as sink; find the minimal cut
Graph Cut Optimization Example: 1-2 swap Erase the edges that were on the cut. Result: a new labeling of the 1,2 nodes
Graph Cut Optimization Start with an arbitrary labeling. For every pair {α, β} ⊂ {1,…,k}: find the best α-β swap (minimizing the function) and update the graph (add and erase edges). Quit when no pair improves the cost function. Induce pixel labels
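The outer loop of the swap algorithm can be sketched on a toy 1D problem. Note the substitution: the best α-β swap is found here by brute-force enumeration over the pixels labeled α or β, which only works for a handful of pixels; in the actual method this step is a min cut. The energy (per-pixel data costs plus a Potts smoothness penalty `lam` for unequal neighbors), the labels 0..k-1, and all names are assumptions for illustration.

```python
from itertools import combinations, product

def energy(labels, data, lam):
    """Data cost of each pixel's label plus lam for every pair of unequal neighbors."""
    e = sum(data[p][l] for p, l in enumerate(labels))
    return e + lam * sum(a != b for a, b in zip(labels, labels[1:]))

def best_swap(labels, data, lam, alpha, beta):
    """Best relabeling of the alpha/beta pixels (brute force standing in for min cut)."""
    idx = [p for p, l in enumerate(labels) if l in (alpha, beta)]
    best = list(labels)
    for choice in product((alpha, beta), repeat=len(idx)):
        cand = list(labels)
        for p, l in zip(idx, choice):
            cand[p] = l
        if energy(cand, data, lam) < energy(best, data, lam):
            best = cand
    return best

def swap_optimize(data, lam, k):
    labels = [0] * len(data)  # arbitrary initial labeling
    improved = True
    while improved:           # quit when no pair improves the cost
        improved = False
        for alpha, beta in combinations(range(k), 2):
            new = best_swap(labels, data, lam, alpha, beta)
            if energy(new, data, lam) < energy(labels, data, lam):
                labels, improved = new, True
    return labels

# Hypothetical data costs: data[pixel][label] for 4 pixels and 3 depth labels.
data = [[0, 5, 5], [1, 0, 5], [5, 0, 1], [5, 5, 0]]
result = swap_optimize(data, lam=2, k=3)
print(result, energy(result, data, 2))  # [0, 1, 1, 2] 4
```

In this run the 0-1 swap first lowers the energy from 11 to 7, the 1-2 swap then lowers it to 4, and a second pass over all pairs finds no further improvement, so the loop stops.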
Graph Cuts: Results
Graph Cut Optimization Advantages: state-of-the-art results; efficient; bound on approximation quality; the same technique can be applied to other problems (e.g., image restoration). Disadvantages: discretization; occlusion. Still room for improvement
Summary Stereo vision: shape reconstruction from two or more images Steps: Rectification Correspondence search Depth estimation Algorithms: Dynamic programming Min cut/max flow Space carving Graph cuts