What we didn’t have time for CS664 Lecture 26 Thursday 12/02/04 Some slides c/o Dan Huttenlocher, Stefano Soatto, Sebastian Thrun
Administrivia Final project is due at noon on Friday 12/17 Write-up only (5MB max) Be sure to include some pictures Let me know if you missed any quiz for a good reason
Outline Geometry Graph-based segmentation Statistics
Homogeneous coordinates Identify a point in the image plane with the ray passing through that point (pixel) (x, y) ≡ (λx, λy, λ) for non-zero λ (X, Y, Z) ≡ (X/Z, Y/Z, 1) for non-zero Z
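A minimal sketch of the two identifications above. The function names and the `scale` parameter are illustrative assumptions, not part of the lecture.

```python
def to_homogeneous(x, y, scale=1.0):
    """Lift pixel (x, y) to the homogeneous triple (scale*x, scale*y, scale)."""
    if scale == 0:
        raise ValueError("scale must be non-zero")
    return (scale * x, scale * y, scale)


def from_homogeneous(X, Y, Z):
    """Map (X, Y, Z) with non-zero Z back to the pixel (X/Z, Y/Z)."""
    if Z == 0:
        raise ValueError("Z must be non-zero")
    return (X / Z, Y / Z)


# Every non-zero scaling of the same triple names the same pixel.
p = from_homogeneous(*to_homogeneous(3.0, 4.0, scale=2.0))
q = from_homogeneous(*to_homogeneous(3.0, 4.0, scale=-0.5))
```

Both `p` and `q` come back as the original pixel, which is the sense in which a pixel is identified with a whole ray.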
Advantages Many non-linear operations become linear in homogeneous coordinates Example: (X, Y, Z) projects to (fX/Z, fY/Z) In homogeneous coordinates: 2D point = (3x4 camera projection matrix) × 3D point
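A small sketch of projection as a matrix product, assuming an ideal pinhole camera at the origin looking down the Z axis with focal length f. The helper names `project` and `pinhole_matrix` are hypothetical.

```python
def pinhole_matrix(f):
    """3x4 projection matrix for an ideal pinhole camera at the origin."""
    return [
        [f, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]


def project(P, point3d):
    """Apply a 3x4 matrix P to a 3D point, then divide out the last coordinate."""
    X, Y, Z = point3d
    homog = (X, Y, Z, 1.0)
    u, v, w = (sum(row[i] * homog[i] for i in range(4)) for row in P)
    if w == 0:
        raise ValueError("point has no finite image (w == 0)")
    return (u / w, v / w)


# (X, Y, Z) = (3, 4, 8) with f = 2 should land at (fX/Z, fY/Z).
image_point = project(pinhole_matrix(2.0), (3.0, 4.0, 8.0))
```

The only non-linear step, the division by Z, is deferred to the final dehomogenization; everything before it is a linear map.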
Camera projection matrix
Epipolar geometry Figure labels: epipole, epipolar plane, epipolar line Stefano Soatto (c) 2002
Pencil of planes Different epipolar planes for different scene points x Each plane is defined by the two camera origins and x
Epipolar lines are important For pixel p in I there is a corresponding epipolar line in I’ This allows us to limit the search! Generalization of stereo to arbitrary camera positions Classical stereo has parallel cameras
Example: verged stereo
Examples: motion Figure panels: parallel to image plane; forward
Essential matrix E Ex is perpendicular to x's epipolar line in the other image So if x' corresponds to x then x'^T E x = 0 Captures the scene geometry We assume the cameras are calibrated Otherwise we get the fundamental matrix
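A numerical check of the constraint x'^T E x = 0, as a sketch under stated assumptions: the second camera's coordinates are X' = R X + t, and E is built as [t]_x R (the standard construction, not spelled out on the slide). The rotation, translation and scene point are made-up values.

```python
import math


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


def epipolar_residual(E, x, x_prime):
    """Return x'^T E x, which is zero for a perfect correspondence."""
    return sum(a * b for a, b in zip(x_prime, matvec(E, x)))


# Hypothetical relative pose: 0.1 rad rotation about Y, plus a translation.
theta = 0.1
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [1.0, 0.0, 0.2]
E = matmul(skew(t), R)

# One scene point, expressed in each camera's coordinates.
X = [0.5, -0.3, 4.0]
X_prime = [r + ti for r, ti in zip(matvec(R, X), t)]

# Calibrated image points (X/Z, Y/Z, 1) in each view.
x = [c / X[2] for c in X]
x_prime = [c / X_prime[2] for c in X_prime]

good = epipolar_residual(E, x, x_prime)                          # true match
bad = epipolar_residual(E, x, [x_prime[0], x_prime[1] + 0.1, 1.0])  # shifted off the line
```

The true match gives a residual at floating-point noise level, while the shifted point does not, which is what lets the epipolar line limit the correspondence search.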
Estimating the geometry The essential matrix has 5 parameters Can estimate from 5 corresponding points Fundamental matrix has 7 The question of “how few perfect correspondences do you need” has spawned an unfortunately large literature
Yet more optimization We can estimate the essential matrix from a bunch of point matches A similar technique can be used to compute structure from motion Bundle adjustment
RANSAC (line fitting) Variant of generate-and-test Pick a small set of points at random Fit them via least squares Points “far” from this line are outliers Repeat until you find a line with very few outliers
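The loop above can be sketched as follows. This is one possible reading, not the lecture's code: the sample size, distance threshold, stopping count and seed are assumed parameters, the fit is ordinary least squares on y = a*x + b (so vertical lines are not handled), and "far" is measured as vertical distance.

```python
import random


def fit_line(points):
    """Least-squares fit of y = a*x + b; returns None for a degenerate sample."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx


def ransac_line(points, sample_size=2, threshold=0.5,
                max_outliers=3, max_iters=200, seed=0):
    """Return ((a, b), outliers) for the best line found."""
    rng = random.Random(seed)
    best = None
    for _ in range(max_iters):
        model = fit_line(rng.sample(points, sample_size))
        if model is None:
            continue
        a, b = model
        outliers = [(x, y) for x, y in points
                    if abs(y - (a * x + b)) > threshold]
        if best is None or len(outliers) < len(best[1]):
            best = (model, outliers)
        if len(outliers) <= max_outliers:  # "very few outliers": stop early
            break
    return best


# Ten points on y = 2x + 1 plus three gross outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(10)]
gross = [(2.0, 30.0), (5.0, -20.0), (7.0, 40.0)]
(a, b), outliers = ransac_line(inliers + gross)
```

A plain least-squares fit over all thirteen points would be dragged toward the outliers; fitting only to small random samples and scoring by outlier count avoids that.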
RANSAC (camera geometry) Pick a small set of corresponding pixels At least 5 (essential) or 7 (fundamental) Compute the matrix from these See how many corresponding pixels this matrix explains
Graph-based Segmentation
Segmentation by min cut Figure labels: image pixels; w, the similarity measure; minimum cut * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Min cuts don’t segment well Figure labels: ideal cut; cuts with lesser weight than the ideal cut * Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Normalized cuts Instead of the min cut, minimize the normalized cut: the cut weight (a measure of dissimilarity between the sets A and B) normalized by each set's total connection to the graph NP-hard to minimize Rely on continuous approximation
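A toy comparison of the two criteria, assuming the usual definition Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), where assoc(A, V) is the total weight from nodes in A to all nodes. The graph and its weights are made up for illustration.

```python
def cut(weights, A, B):
    """Total weight of edges with one endpoint in A and the other in B."""
    return sum(w for (u, v), w in weights.items()
               if (u in A and v in B) or (u in B and v in A))


def assoc(weights, A):
    """Total weight from nodes in A to all nodes (sum of degrees in A)."""
    return sum(w * ((u in A) + (v in A)) for (u, v), w in weights.items())


def ncut(weights, A, B):
    c = cut(weights, A, B)
    return c / assoc(weights, A) + c / assoc(weights, B)


# Two tight triangles joined by a weak edge, plus a weakly attached node 6.
weights = {
    (0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
    (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
    (2, 3): 0.3,   # the "ideal" cut separates the two triangles
    (5, 6): 0.2,   # cheaper cut that only isolates node 6
}
ideal = ({0, 1, 2}, {3, 4, 5, 6})
isolate = ({6}, {0, 1, 2, 3, 4, 5})

cut_ideal, cut_isolate = cut(weights, *ideal), cut(weights, *isolate)
ncut_ideal, ncut_isolate = ncut(weights, *ideal), ncut(weights, *isolate)
```

The plain cut weight prefers isolating the single node, while the normalized value prefers the balanced split, because a tiny set has almost no total connection to divide by.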
Normalized cuts examples
Limitations of normalized cuts Works by binary partitioning Slow and memory-intensive Textured backgrounds are problems
Other graph-based methods Many other variants on min cuts Typical cuts, nested cuts, etc. No clear winner for segmentation Perhaps mean shift?
MST-based segmentation Minimum spanning tree is the cheapest way to connect all pixels into a single component (or “region”) Merge two components when the cheapest edge between them is cheap compared to a measure of the internal variation Provably good segmentation under a fairly natural definition Neither too coarse nor too fine
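A sketch of the merge rule on a 1D signal, as a Kruskal-style pass with union-find. Several things here are assumptions rather than the lecture's definitions: the internal variation of a component is taken to be the largest edge already merged into it, "cheap compared to" is implemented as w <= internal + k/size, and the scale parameter `k` is illustrative.

```python
def segment(values, k=5.0):
    """Label a 1D signal by merging neighbors across sufficiently cheap edges."""
    n = len(values)
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n  # largest merged edge inside each component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Edges join adjacent samples; weight is the absolute difference.
    edges = sorted((abs(values[i + 1] - values[i]), i, i + 1)
                   for i in range(n - 1))
    for w, i, j in edges:
        a, b = find(i), find(j)
        if a == b:
            continue
        # Merge only if the edge is cheap relative to both components.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges arrive in increasing order, so w is the max
    return [find(i) for i in range(n)]


labels = segment([10, 11, 10, 12, 50, 51, 49, 50])
```

The k/size term makes small components easy to merge and large ones progressively stricter, so the big jump in the middle of the signal survives as a boundary while the small wiggles do not.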
Example output Solves many problems with normalized cuts
More statistics
Dimensionality reduction We can represent orange points only by their v 1 coordinate
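A minimal sketch of keeping only the v1 coordinate, assuming v1 is the top principal direction of the centered data. It is found here by power iteration on the covariance matrix; the sample points are made up.

```python
import math


def principal_direction(points, iters=50):
    """Return (mean, v1): the data mean and the top principal direction."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed start vector; must not be orthogonal to v1
    for _ in range(iters):  # power iteration converges to the top eigenvector
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def v1_coordinate(point, mean, v):
    """Single number that replaces the full point."""
    return sum((point[j] - mean[j]) * v[j] for j in range(len(v)))


def reconstruct(coord, mean, v):
    """Approximate point recovered from its v1 coordinate alone."""
    return [mean[j] + coord * v[j] for j in range(len(v))]


points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9), (4.0, 4.1)]
mean, v1 = principal_direction(points)
coords = [v1_coordinate(p, mean, v1) for p in points]
approx = [reconstruct(c, mean, v1) for c in coords]
```

Because the points lie close to a line, one number per point reconstructs them with small error; the same idea applied to n-pixel images gives a low-dimensional face space.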
Eigenfaces An n-pixel image is a point in R^n Find low-dimensional representation of face images (from a training set) Recognition by finding the closest face in face space
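Markov Random Fields MRF defining property: a pixel's label depends on the rest of the field only through its neighbors Hammersley-Clifford Theorem: the joint distribution factors into potentials over cliques of the neighborhood graph Figure labels: image pixels (vertices); neighborhood relationships (n-links) Notation: disparity at pixel p; configuration (the disparities at all pixels)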
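The two statements in their standard textbook form, as a sketch. All symbols are assumed notation, since the slide's own equations did not survive: f_p is the disparity at pixel p, f the configuration, S the set of pixels, N_p the neighbors of p, and V_C a potential on clique C.

```latex
\begin{align*}
\Pr\bigl(f_p \mid f_{S \setminus \{p\}}\bigr) &= \Pr\bigl(f_p \mid f_{N_p}\bigr)
  && \text{(Markov property)} \\
\Pr(f) &\propto \exp\Bigl(-\sum_{C} V_C(f)\Bigr)
  && \text{(Hammersley-Clifford, for } \Pr(f) > 0\text{)}
\end{align*}
```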
MAP estimation of an MRF Bayes rule: posterior given the observed data ∝ likelihood function (sensor noise) × prior (MRF model)
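Written out, with O for the observed data and f for the configuration (both symbols are assumed here):

```latex
\begin{align*}
\Pr(f \mid O) &= \frac{\Pr(O \mid f)\,\Pr(f)}{\Pr(O)} \;\propto\; \Pr(O \mid f)\,\Pr(f) \\
f^{*} &= \arg\max_{f}\; \Pr(O \mid f)\,\Pr(f)
\end{align*}
```

The denominator Pr(O) does not depend on f, so it can be dropped when maximizing.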
Energy minimization Energy = data term (sensor noise) + smoothness term (MRF prior)
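A toy energy for a labeling along one scanline, as a sketch. The specific terms are assumptions chosen for illustration: a squared-difference data term (as if the sensor noise were Gaussian) and an absolute-difference smoothness term between adjacent pixels, weighted by a hypothetical `smoothness_weight`.

```python
def energy(labels, observed, smoothness_weight=1.0):
    """Data term plus weighted smoothness term for a 1D labeling."""
    data = sum((f - o) ** 2 for f, o in zip(labels, observed))
    smooth = sum(abs(labels[i + 1] - labels[i])
                 for i in range(len(labels) - 1))
    return data + smoothness_weight * smooth


observed = [1, 1, 5, 1, 1]   # a single noisy spike
keep_spike = [1, 1, 5, 1, 1]  # trusts the data completely
flatten = [1, 1, 1, 1, 1]     # trusts the smoothness prior completely

low_weight = (energy(keep_spike, observed, 1.0), energy(flatten, observed, 1.0))
high_weight = (energy(keep_spike, observed, 3.0), energy(flatten, observed, 3.0))
```

With a small weight the labeling that keeps the spike has lower energy; with a larger weight the flattened labeling wins, which is the trade-off the minimization resolves.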