Perceptual Organization: Segmentation and Optical Flow
Inspiration from psychology. The Gestalt school: grouping is key to visual perception (“The whole is greater than the sum of its parts”). Examples: subjective contours, occlusion, familiar configuration.
Gestalt grouping factors
Emergence
Motion and perceptual organization. Even “impoverished” motion data can evoke a strong percept. G. Johansson, “Visual Perception of Biological Motion and a Model for Its Analysis”, Perception and Psychophysics 14 (YouTube video)
Image segmentation
The goals of segmentation: obtain primitives for other tasks; perceptual organization, recognition; graphics, image manipulation.
Goal 1: Primitives for other tasks. Group together similar-looking pixels (“superpixels”) for efficiency of further processing. This is a “bottom-up”, unsupervised process. X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Image parsing or semantic segmentation: Segments as primitives for recognition J. Tighe and S. Lazebnik, ECCV 2010, IJCV 2013
Goal 2: Recognition. Separate image into coherent “objects”. “Bottom-up” or “top-down” process? Supervised or unsupervised? [Figure: image and human segmentation from the Berkeley segmentation database]
Goal 3: Image manipulation Interactive segmentation for graphics
Approaches to segmentation: segmentation as clustering, segmentation as graph partitioning, segmentation as labeling.
Segmentation as clustering Source: K. Grauman
Segmentation as clustering. K-means clustering based on intensity or color is essentially vector quantization of the image attributes. Clusters don’t have to be spatially coherent. [Figure: image, intensity-based clusters, color-based clusters]
Segmentation as clustering Clustering based on (r,g,b,x,y) values enforces more spatial coherence
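A minimal sketch of clustering on (r,g,b,x,y) features, in plain Python. The helper names (`pixel_features`, `kmeans`), the `spatial_weight` parameter, and the farthest-point initialization are illustrative assumptions, not something the slides specify.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def pixel_features(image, spatial_weight=1.0):
    """Turn a 2-D grid of (r, g, b) tuples into (r, g, b, x, y) vectors.

    spatial_weight is a hypothetical knob: larger values make position
    count more relative to color.
    """
    feats = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            feats.append((r, g, b, spatial_weight * x, spatial_weight * y))
    return feats

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Deterministic farthest-point initialization (an illustrative choice).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, RED, BLUE, BLUE],
         [RED, RED, BLUE, BLUE]]
labels = kmeans(pixel_features(image), k=2)  # row-major labels, one per pixel
```

Dropping the x and y components from the feature vector recovers the color-only clustering of the earlier slide, where clusters need not be spatially coherent.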
Segmentation as graph partitioning. Node for every pixel. Edge between every pair of pixels (or every pair of “sufficiently close” pixels). Each edge is weighted by the affinity or similarity of the two nodes: the edge between pixels i and j carries weight w_ij. Source: S. Seitz
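Measuring affinity. Represent each pixel by a feature vector x and define an appropriate distance function. Role of σ: a scale parameter σ controls how quickly affinity falls off with distance. [Figure: affinities for small σ and for large σ]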
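The affinity formula itself is not preserved in this text. A common choice, assumed here rather than taken from the slide, is a Gaussian kernel on the feature distance; the sketch below shows how σ changes the affinity of the same pair of points.

```python
import math

def affinity(xi, xj, sigma):
    """Gaussian affinity exp(-||xi - xj||^2 / (2 sigma^2)).

    Assumption: Euclidean distance between feature vectors; the slide only
    asks for "an appropriate distance function".
    """
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-d2 / (2.0 * sigma ** 2))

# Same pair of points, two scales: with a small sigma only very close
# points keep a noticeable affinity; with a large sigma distant points do too.
near = affinity((0.0, 0.0), (3.0, 4.0), sigma=1.0)
far = affinity((0.0, 0.0), (3.0, 4.0), sigma=10.0)
```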
Segmentation as graph partitioning. Break graph into segments: delete links that cross between segments. Easiest to break links that have low affinity:
– similar pixels should be in the same segments
– dissimilar pixels should be in different segments
[Figure: graph broken into segments A, B, C; edge between i and j with weight w_ij] Source: S. Seitz
Graph cut. A set of edges whose removal makes a graph disconnected. Cost of a cut: sum of weights of cut edges. A graph cut gives us a segmentation. What is a “good” graph cut and how do we find one? [Figure: cut separating segments A and B] Source: S. Seitz
Minimum cut. We can do segmentation by finding the minimum cut in a graph. Efficient algorithms exist for doing this. [Figure: minimum cut example]
Normalized cut. Drawback: minimum cut tends to cut off very small, isolated components. [Figure: the ideal cut, and cuts with lesser weight than the ideal cut] * Slide from Khurram Hassan-Shafique, CAP5415 Computer Vision 2003
Normalized cut. To encourage larger segments, normalize the cut by the total weight of edges incident to the segment. The normalized cut cost is:
ncut(A, B) = w(A, B) / w(A, V) + w(A, B) / w(B, V)
where w(A, B) = sum of weights of all edges between A and B, and V is the set of all nodes. Intuition: big segments will have a large w(A, V), thus decreasing ncut(A, B). Finding the globally optimal cut is NP-complete, but a relaxed version can be solved using a generalized eigenvalue problem. J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000.
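To make the drawback of the plain minimum cut concrete, here is a small sketch on a hypothetical six-node graph. It assumes w(A, V) is computed as the sum of W(i, j) over i in A and all j, which matches the degree matrix D used by the normalized cut algorithm; the graph and its weights are made up for illustration.

```python
def build_affinity(n, edges):
    """Symmetric n x n affinity matrix from (i, j, weight) triples."""
    W = [[0.0] * n for _ in range(n)]
    for i, j, wt in edges:
        W[i][j] = W[j][i] = wt
    return W

def w(W, a, b):
    """Sum of affinities W[i][j] over i in a and j in b."""
    return sum(W[i][j] for i in a for j in b)

def ncut(W, a, b):
    """Normalized cut cost of the partition (a, b) of the node set."""
    v = a | b
    cut = w(W, a, b)
    return cut / w(W, a, v) + cut / w(W, b, v)

# Two clusters {0,1,2} and {3,4,5}; node 5 hangs off node 4 by a weak edge.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (4, 5, 0.15),
         (1, 3, 0.1), (2, 3, 0.1)]
W = build_affinity(6, EDGES)

balanced = ({0, 1, 2}, {3, 4, 5})
isolated = ({5}, {0, 1, 2, 3, 4})

# Plain cut cost prefers chopping off the single node 5 ...
cut_balanced, cut_isolated = w(W, *balanced), w(W, *isolated)
# ... while the normalized cost prefers the balanced split.
ncut_balanced, ncut_isolated = ncut(W, *balanced), ncut(W, *isolated)
```

On this graph the isolating cut is cheaper in raw weight, but its normalized cost is close to 1 because the single node's total edge weight is almost entirely cut.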
Normalized cut: Algorithm
1. Let W be the affinity matrix of the graph (n x n for n pixels).
2. Let D be the diagonal matrix with entries D(i, i) = Σ_j W(i, j).
3. Solve the generalized eigenvalue problem (D − W)y = λDy for the eigenvector with the second smallest eigenvalue. The ith entry of y can be viewed as a “soft” indicator of the component membership of the ith pixel.
4. Use 0 or the median value of the entries of y to split the graph into two components.
To find more than two components, either recursively bipartition the graph or run k-means clustering on values of several eigenvectors.
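The two-way split can be sketched with NumPy (assumed to be installed). The function name `ncut_bipartition` and the toy graph are illustrative; the generalized problem is solved through the equivalent symmetric form D^(-1/2) (D − W) D^(-1/2) z = λz with y = D^(-1/2) z, which assumes every node has positive total affinity.

```python
import numpy as np

def ncut_bipartition(W):
    """Split a graph in two using the second-smallest generalized eigenvector."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # D(i, i) = sum_j W(i, j)
    d_inv_sqrt = 1.0 / np.sqrt(d)          # assumes all degrees are > 0
    # Symmetric form of (D - W) y = lambda D y.
    L_sym = (np.diag(d) - W) * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]             # eigenvector of the 2nd smallest eigenvalue
    return y, y > 0                            # threshold the soft indicator at 0

# Hypothetical toy graph: clusters {0,1,2} and {3,4,5} joined by weak edges.
W = np.zeros((6, 6))
for i, j, wt in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                 (3, 4, 1.0), (4, 5, 0.15),
                 (1, 3, 0.1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = wt

y, side = ncut_bipartition(W)
```

The sign of an eigenvector is arbitrary, so which cluster ends up on the `True` side can differ between runs or library versions; only the grouping is meaningful. For n pixels the dense matrix here has n x n entries, so a real image would need a sparse affinity matrix and a sparse eigensolver.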
Example result
Challenge How to define affinities for segmenting highly textured images?
Segmenting textured images. Convolve image with a bank of filters. Find textons by clustering vectors of filter bank outputs. [Figure: image, filter bank, texton map] J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1), 7-27, 2001.
Segmenting textured images. Convolve image with a bank of filters. Find textons by clustering vectors of filter bank outputs. Represent pixels by texton histograms computed over neighborhoods at some “local scale”. Define affinities as similarities between local texton histograms. J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1), 7-27, 2001.
Pitfall of texture features. Possible solution: check for “intervening contours” when computing affinities. J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1), 7-27, 2001.
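A rough sketch of the last two steps, starting from an already computed texton map (one integer texton label per pixel). The square window, the chi-squared histogram distance, the exponential mapping to an affinity, and the value of `sigma` are all assumptions made for illustration; the slide only says that affinities are similarities between local texton histograms.

```python
import math

def local_histogram(texton_map, cx, cy, radius, num_textons):
    """Normalized histogram of texton labels in a square window around (cx, cy)."""
    hist = [0.0] * num_textons
    height, width = len(texton_map), len(texton_map[0])
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            hist[texton_map[y][x]] += 1.0
    total = sum(hist)
    return [v / total for v in hist]

def chi2(h1, h2):
    """Chi-squared distance between two normalized histograms (range 0 to 1)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def texture_affinity(h1, h2, sigma=0.25):
    """Hypothetical affinity: close to 1 for similar histograms, near 0 otherwise."""
    return math.exp(-chi2(h1, h2) / sigma)

# Toy texton map: textons 0/1 in a checkerboard on the left half,
# textons 2/3 in a checkerboard on the right half.
texton_map = [[(x + y) % 2 + (2 if x >= 4 else 0) for x in range(8)] for y in range(4)]

h_left_a = local_histogram(texton_map, 1, 1, radius=1, num_textons=4)
h_left_b = local_histogram(texton_map, 2, 2, radius=1, num_textons=4)
h_right = local_histogram(texton_map, 5, 1, radius=1, num_textons=4)

same_texture = texture_affinity(h_left_a, h_left_b)
different_texture = texture_affinity(h_left_a, h_right)
```

A window that straddles the boundary between the two halves would mix both sets of textons, so pixels near the boundary get histograms that resemble neither texture.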
Results: Berkeley Segmentation Engine
Berkeley Segmentation Engine
Normalized cuts: Pro and con. Pro: generic framework, can be used with many different features and affinity formulations. Con: high storage requirement and time complexity, since it involves solving a generalized eigenvalue problem of size n x n, where n is the number of pixels.
Efficient graph-based segmentation. Runs in time nearly linear in the number of edges. Easy to control coarseness of segmentations. Results can be unstable. P. Felzenszwalb and D. Huttenlocher, Efficient Graph-Based Image Segmentation, IJCV 2004
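The slide gives no algorithmic detail, so the following is only a sketch of the greedy merging idea as I understand it from the cited paper: process edges in order of increasing dissimilarity and merge two components when the joining edge is not much heavier than the edges already inside them. The function name and the toy 1-D signal are illustrative; `k` is the scale parameter that controls coarseness. Note that edge weights here are dissimilarities, not affinities.

```python
def segment_graph(num_nodes, edges, k):
    """Greedy component merging with a union-find structure.

    edges: list of (weight, u, v), weight = dissimilarity between u and v.
    k: scale parameter; larger k tends to produce larger components.
    Returns one component representative per node.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes   # heaviest edge merged into each component so far

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    for wt, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        # Merge only if the joining edge is no heavier than either component's
        # internal difference plus its size-dependent slack k / |C|.
        if wt <= min(internal[ru] + k / size[ru], internal[rv] + k / size[rv]):
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            internal[ru] = wt   # edges arrive in ascending order, so wt is the new maximum
    return [find(u) for u in range(num_nodes)]

# Toy 1-D "image": neighbors are connected, weight = absolute intensity difference.
signal = [10, 11, 12, 50, 51, 52]
edges = [(abs(signal[i] - signal[i + 1]), i, i + 1) for i in range(len(signal) - 1)]

fine = segment_graph(len(signal), edges, k=5)      # keeps the two plateaus apart
coarse = segment_graph(len(signal), edges, k=200)  # merges everything
```

After sorting, each edge is handled with near-constant-time union-find operations, which is consistent with the near-linear running time mentioned on the slide.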
Segmentation as labeling. Suppose we want to segment an image into foreground and background: a binary labeling problem. User sketches out a few strokes on foreground and background… How do we label the rest of the pixels? Source: N. Snavely
Binary segmentation as energy minimization. Define a labeling L as an assignment of a 0-1 label (background or foreground) to each pixel. Find the labeling L that minimizes an energy with two parts: a data term (how similar is each labeled pixel to the foreground or background?) and a smoothness term (encourage spatially coherent segments). Source: N. Snavely
Data term: each pixel has two costs, its “distance” to the background and its “distance” to the foreground, both computed by creating a color model from user-labeled pixels. Source: N. Snavely
Smoothness term: neighboring pixels should generally have the same labels, unless the pixels have very different intensities. Each pair of neighbors p and q is weighted by the similarity in intensity of p and q (the figure shows example weights of 10.0 and 0.1). Source: N. Snavely
Binary segmentation as energy minimization. For this problem, we can efficiently find the global minimum using the max flow / min cut algorithm. Source: N. Snavely. Y. Boykov and M.-P. Jolly, Interactive Graph Cuts for Optimal Boundary and Region Segmentation of Objects in N-D Images, ICCV 2001
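A small sketch of how such an energy can be turned into a min cut problem, on a hypothetical six-pixel 1-D image. The energy form (per-pixel cost plus a weighted penalty for each pair of neighbors with different labels), the absolute-difference color model, the weights 10.0 and 0.1, and `lam` are assumptions for illustration. The max flow routine is a simple Edmonds-Karp implementation, not the solver used in the cited paper, and the result is checked against brute force.

```python
from collections import deque
import itertools

def energy(labels, cost_bg, cost_fg, pair_w, lam):
    """Data term plus lam times the smoothness term for a 0-1 labeling."""
    data = sum(cost_fg[p] if lab == 1 else cost_bg[p] for p, lab in enumerate(labels))
    smooth = sum(wt * abs(labels[p] - labels[q]) for (p, q), wt in pair_w.items())
    return data + lam * smooth

def min_cut_labels(cost_bg, cost_fg, pair_w, lam):
    """Minimize the energy exactly with an s-t min cut (Edmonds-Karp max flow)."""
    n = len(cost_bg)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]   # residual capacities
    for p in range(n):
        cap[s][p] = cost_bg[p]   # this edge is cut when p is labeled background
        cap[p][t] = cost_fg[p]   # this edge is cut when p is labeled foreground
    for (p, q), wt in pair_w.items():
        cap[p][q] += lam * wt
        cap[q][p] += lam * wt

    def bfs():
        """Breadth-first search from s over edges with remaining capacity."""
        prev = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in prev and cap[u][v] > 1e-12:
                    prev[v] = u
                    queue.append(v)
        return prev

    while True:
        prev = bfs()
        if t not in prev:
            break
        path, v = [], t
        while prev[v] is not None:
            path.append((prev[v], v))
            v = prev[v]
        flow = min(cap[u][v] for u, v in path)   # bottleneck along the path
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow
    reachable = bfs()   # pixels still connected to s are foreground
    return [1 if p in reachable else 0 for p in range(n)]

def brute_force_min(cost_bg, cost_fg, pair_w, lam):
    """Smallest energy over all 2^n labelings (only feasible for tiny n)."""
    n = len(cost_bg)
    return min(energy(list(lab), cost_bg, cost_fg, pair_w, lam)
               for lab in itertools.product([0, 1], repeat=n))

intensity = [10, 12, 11, 200, 205, 198]
cost_bg = [float(abs(i - 10)) for i in intensity]    # hypothetical background model
cost_fg = [float(abs(i - 200)) for i in intensity]   # hypothetical foreground model
pair_w = {(p, p + 1): (10.0 if abs(intensity[p] - intensity[p + 1]) < 20 else 0.1)
          for p in range(len(intensity) - 1)}

labels = min_cut_labels(cost_bg, cost_fg, pair_w, lam=1.0)
```

Brute force enumerates 2^n labelings and is only there as a check on the toy example; the graph construction is what scales to real images.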
Recall: Stereo as energy minimization. Window W1(i) in image I1 is matched against window W2(i + D(i)) in image I2, where D(i) is the disparity at pixel i. The energy combines a data term and a smoothness term. Energy functions of this form can be minimized using graph cuts. Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001
GrabCut. C. Rother, V. Kolmogorov, and A. Blake, “GrabCut” — Interactive Foreground Extraction using Iterated Graph Cuts, SIGGRAPH 2004