K-Means Segmentation
Segmentation
* Pictures from “Mean Shift: A Robust Approach toward Feature Space Analysis”, by D. Comaniciu and P. Meer, http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Segmentation and Grouping
- Motivation: not all information is evidence
  - Obtain a compact representation from an image, motion sequence, or set of tokens
  - The representation should support the application
  - A broad theory is absent at present
- Grouping (or clustering): collect together tokens that “belong together”
- Fitting: associate a model with tokens. Issues:
  - Which model?
  - Which token goes to which element?
  - How many elements are in the model?
General ideas
- Tokens: whatever we need to group (pixels, points, surface elements, etc.)
- Top-down segmentation: tokens belong together because they lie on the same object
- Bottom-up segmentation: tokens belong together because they are locally coherent
- These two are not mutually exclusive
Why do these tokens belong together? A disturbing possibility is that they all lie on a sphere; but then, if we didn’t know that the tokens belonged together, where did the sphere come from?
A driving force behind the gestalt movement is the observation that it isn’t enough to think about pictures in terms of separating figure and ground (e.g. foreground and background). This is (partially) because there are too many different possibilities in pictures like this one. Is a square with a hole in it the figure? or a white circle? or what?
Basic ideas of grouping in humans
- Figure-ground discrimination
  - Grouping can be seen in terms of allocating some elements to a figure, some to ground
  - An impoverished theory
- Gestalt properties
  - Elements in a collection can have properties that result from relationships (e.g. the Müller-Lyer effect): Gestaltqualität
  - A series of factors affect whether elements should be grouped together: the Gestalt factors
The famous Müller-Lyer illusion. The point is that the horizontal bar has properties that come only from its membership in a group (it looks shorter in the lower picture, but is actually the same size), and that these properties can’t be discounted: you can’t look at this figure, ignore the arrowheads, and thereby make the two bars seem to be the same size.
Some criteria that tend to cause tokens to be grouped.
More such
Occlusion cues seem to be very important in grouping. Most people find it hard to read the 5 numerals in this picture
but easy in this
The story is in the book (figure 14.7)
Illusory contours; a curious phenomenon where you see an object that appears to be occluding.
Groupings by Invisible Completions
University of Missouri at Columbia
Here, the 3D nature of grouping is apparent: why do these tokens belong together?
- Corners and creases in 3D: length is interpreted differently for “in” and “out” corners
- The (in) line at the far end of the corridor must be longer than the (out) near line if they measure the same size
And the famous invisible dog eating under a tree:
A Final Example
Segmentation as clustering
- Cluster together tokens (pixels, etc.) that belong together
- Agglomerative clustering
  - Attach each cluster to the cluster it is closest to
  - Repeat
- Divisive clustering
  - Split a cluster along the best boundary
- Point-cluster distance
  - single-link clustering
  - complete-link clustering
  - group-average clustering
- Dendrograms yield a picture of the output as the clustering process continues
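The agglomerative procedure can be sketched as follows. This is a minimal illustration, not an efficient implementation: the function name `agglomerative`, its `linkage` parameter, and the toy points are assumptions made for this example, and distances are assumed to be Euclidean.

```python
import numpy as np

def agglomerative(points, n_clusters, linkage="single"):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point

    def cluster_distance(a, b):
        block = dist[np.ix_(a, b)]
        if linkage == "single":      # distance between the closest pair
            return block.min()
        if linkage == "complete":    # distance between the farthest pair
            return block.max()
        return block.mean()          # group average

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters

pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2], [5.0, 5.0], [5.1, 5.0]]
groups = agglomerative(pts, n_clusters=2)
```

Recording the distance of each merge instead of stopping at a fixed `n_clusters` would give the information a dendrogram displays.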
Simple clustering algorithms
K-Means
- Choose a fixed number of clusters
- Choose cluster centers and point-cluster allocations to minimize the error
  $$\Phi(\text{clusters}, \text{data}) = \sum_{i \in \text{clusters}} \; \sum_{j \in \text{cluster } i} \lVert x_j - \mu_i \rVert^2$$
- This can’t be done by search, because there are too many possible allocations
- Algorithm (alternate the two steps):
  - Fix cluster centers; allocate each point to the closest cluster
  - Fix allocation; compute the best cluster centers
- x could be any set of features for which we can compute a distance (be careful about scaling)
* From Marc Pollefeys COMP 256 2003
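A minimal sketch of the two alternating steps, assuming Euclidean distance and an initialisation that picks k data points at random. The function name `kmeans`, the `seed` parameter, and the toy points are illustrative choices, not part of the original slides.

```python
import numpy as np

def kmeans(x, k, n_iter=100, seed=0):
    """Alternate allocation and center updates until the centers stop moving."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: k distinct data points chosen at random.
    centers = x[rng.choice(len(x), size=k, replace=False)]

    def allocate(centers):
        # Squared distance from every point to every center.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        # Step 1: fix centers, allocate each point to the closest one.
        labels = allocate(centers)
        # Step 2: fix allocation, recompute each center as its cluster mean.
        new_centers = centers.copy()
        for c in range(k):
            members = x[labels == c]
            if len(members):             # an empty cluster keeps its old center
                new_centers[c] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break

    labels = allocate(centers)
    error = ((x - centers[labels]) ** 2).sum()   # the quantity being minimized
    return centers, labels, error

points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels, error = kmeans(points, k=2)
```

Each step can only lower the error or leave it unchanged, so the loop reaches a local minimum; different seeds may give different local minima on harder data.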
K-Means
K-Means * From Marc Pollefeys COMP 256 2003
Image Segmentation by K-Means
1. Select a value of K.
2. Select a feature vector for every pixel (color, texture, position, or a combination of these).
3. Define a similarity measure between feature vectors (usually Euclidean distance).
4. Apply the K-Means algorithm.
5. Apply the connected components algorithm.
6. Merge any component smaller than some threshold into the adjacent component that is most similar to it.
* From Marc Pollefeys COMP 256 2003
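The connected components step in the procedure above can be sketched as a flood fill over the map of cluster labels. This is an illustration only: the function name `connected_components`, the choice of 4-connectivity, and the toy label map are assumptions made for this example.

```python
from collections import deque
import numpy as np

def connected_components(labels):
    """Split each cluster of a 2D label map into 4-connected regions."""
    labels = np.asarray(labels)
    h, w = labels.shape
    regions = -np.ones((h, w), dtype=int)    # -1 marks "not visited yet"
    n_regions = 0
    for sy in range(h):
        for sx in range(w):
            if regions[sy, sx] != -1:
                continue
            # Flood fill from this seed over neighbours with the same cluster id.
            regions[sy, sx] = n_regions
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and regions[ny, nx] == -1
                            and labels[ny, nx] == labels[y, x]):
                        regions[ny, nx] = n_regions
                        queue.append((ny, nx))
            n_regions += 1
    return regions, n_regions

# A toy K-means output: cluster 0 appears in two spatially separate pieces.
cluster_map = np.array([[0, 0, 1, 0],
                        [0, 1, 1, 0],
                        [1, 1, 0, 0]])
regions, n_regions = connected_components(cluster_map)
```

Here the two clusters become three regions, because K-means itself places no spatial constraint on cluster membership.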
Results of K-Means Clustering: I gave each pixel the mean intensity or mean color of its cluster; this is basically just vector quantizing the image intensities/colors. Notice that there is no requirement that clusters be spatially localized, and they’re not. Panels: image; clusters on intensity; clusters on color (K-means clustering using intensity alone and color alone).
K-Means is an approximation to EM
- Model (hypothesis space): mixture of N Gaussians
- Latent variables: correspondence of data and Gaussians
- We notice:
  - Given the mixture model, it’s easy to calculate the correspondence
  - Given the correspondence, it’s easy to estimate the mixture models
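To make the two observations concrete, here is a hedged sketch of EM for a one-dimensional mixture of Gaussians. The function name `em_gmm_1d`, the initial means, the small variance floor, and the toy data are all assumptions for illustration. The E-step computes a soft correspondence (responsibilities); K-means can be read as the same loop with each responsibility rounded to 0 or 1 and the variances held fixed and equal.

```python
import numpy as np

def em_gmm_1d(x, means, n_iter=50):
    """EM for a 1D Gaussian mixture, starting from the given means."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float).copy()
    k = len(means)
    variances = np.full(k, x.var())     # assumed start: overall data variance
    weights = np.full(k, 1.0 / k)       # assumed start: equal mixing weights
    for _ in range(n_iter):
        # E-step: given the mixture model, compute the soft correspondence.
        dens = (weights / np.sqrt(2 * np.pi * variances)
                * np.exp(-(x[:, None] - means) ** 2 / (2 * variances)))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: given the correspondence, re-estimate the mixture model.
        nk = resp.sum(axis=0)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        weights = nk / len(x)
    return means, variances, weights, resp

data = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8]
means, variances, weights, resp = em_gmm_1d(data, means=[1.0, 4.0])
hard_labels = resp.argmax(axis=1)       # the K-means-style hard allocation
```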
Generalized K-Means (EM)
K-means using color alone, 11 segments. Panels: image; clusters on color.
K-means using color alone, 11 segments.
K-means using colour and position, 20 segments. Here I’ve represented each pixel as (r, g, b, x, y), which means that segments prefer to be spatially coherent. These are just some of the 20 segments.
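A sketch of how the (r, g, b, x, y) representation might be built. The function name `pixel_features` and the `position_weight` parameter are hypothetical, and the code assumes an H x W x 3 image with color values in [0, 1]. Position is rescaled to [0, 1] so that color and position contribute on comparable scales, which is the scaling concern raised for K-means features.

```python
import numpy as np

def pixel_features(image, position_weight=1.0):
    """Return one (r, g, b, x, y) row per pixel, in row-major pixel order."""
    image = np.asarray(image, dtype=float)
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Rescale coordinates to [0, 1]; max(..., 1) avoids dividing by zero
    # for images that are a single pixel wide or tall.
    pos = np.stack([xs / max(w - 1, 1), ys / max(h - 1, 1)], axis=-1)
    feats = np.concatenate([image, position_weight * pos], axis=-1)
    return feats.reshape(h * w, 5)

image = np.zeros((2, 3, 3))          # a tiny all-black test image
feats = pixel_features(image)
```

A larger `position_weight` makes spatial distance count more, so segments become more compact; a weight of 0 reduces to clustering on color alone.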
Graph theoretic clustering
- Represent tokens (which are associated with each pixel) using a weighted graph
  - affinity matrix (pij has affinity of 0)
- Cut up this graph to get subgraphs with strong interior links and weaker exterior links
Graph Representations. Figure: an example graph and its adjacency matrix W.
Weighted Graphs. Figure: an example weighted graph and its weight matrix W.
Minimum Cut
- A cut of a graph G is a set of edges S such that removal of S from G disconnects G.
- The minimum cut is the cut of minimum weight, where the weight of cut <A,B> is given as
  $$w(\langle A, B \rangle) = \sum_{x \in A,\; y \in B} w(x, y)$$
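A small sketch of the cut weight as the sum of weights on edges that cross from A to B, together with a brute-force search over all bipartitions. The brute force is exponential in the number of nodes and is only meant to make the definition concrete on a toy graph; the function names and the example matrix are assumptions for illustration.

```python
from itertools import combinations
import numpy as np

def cut_weight(W, in_a):
    """Sum of weights on edges going from set A (mask True) to set B."""
    W = np.asarray(W, dtype=float)
    in_a = np.asarray(in_a, dtype=bool)
    return W[np.ix_(in_a, ~in_a)].sum()

def brute_force_min_cut(W):
    """Try every bipartition with node 0 in A; only feasible for tiny graphs."""
    n = len(W)
    best = None
    for r in range(0, n - 1):                 # extra nodes joining node 0 in A
        for extra in combinations(range(1, n), r):
            in_a = np.zeros(n, dtype=bool)
            in_a[[0, *extra]] = True
            w = cut_weight(W, in_a)
            if best is None or w < best[0]:
                best = (w, in_a)
    return best

# Toy symmetric weight matrix: strong links 0-1 and 2-3, weak links across.
W = np.array([[0, 3, 1, 0],
              [3, 0, 0, 1],
              [1, 0, 0, 4],
              [0, 1, 4, 0]], dtype=float)
min_weight, min_mask = brute_force_min_cut(W)
```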
Minimum Cut and Clustering
Image Segmentation & Minimum Cut. Figure labels: image pixels; pixel neighborhood; similarity measure w; minimum cut.
Minimum Cut
- There can be more than one minimum cut in a given graph.
- All minimum cuts of a graph can be found in polynomial time [1].
Finding the Minimal Cuts: Spectral Clustering Overview. Figure: data, then similarities, then block detection.
Eigenvectors and Blocks [Ng et al., NIPS 02]
- Block matrices have block eigenvectors: the eigensolver gives λ1 = 2, λ2 = 2, λ3 = 0, λ4 = 0, and the leading eigenvectors have entries .71, .71 on one block.
- Near-block matrices have near-block eigenvectors: with off-block entries .2 and -.2, the eigensolver gives λ1 = 2.02, λ2 = 2.02, λ3 = -0.02, λ4 = -0.02, and the leading eigenvectors have nonzero entries .71, .69, .14 and -.14, .69, .71.
Spectral Space
- Can put items into blocks by eigenvectors. Figure: items plotted by their coordinates in e1 and e2.
- Resulting clusters are independent of row ordering. Figure: the same plot for a row-permuted copy of the matrix.
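The block and near-block claims can be checked numerically. The matrices below are an assumed reconstruction: two 2x2 blocks of ones, perturbed by off-block entries of 0.2 and -0.2. Because the two largest eigenvalues are equal, the solver may return any orthonormal basis of that eigenspace, so the sketch compares items through inner products of their eigenvector coordinates, which do not depend on that choice.

```python
import numpy as np

def leading_eigenvectors(W, k):
    """Return the k largest eigenvalues of symmetric W and their eigenvectors."""
    vals, vecs = np.linalg.eigh(W)            # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]

block = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

near_block = block.copy()
near_block[0, 2] = near_block[2, 0] = 0.2     # small off-block perturbation
near_block[1, 3] = near_block[3, 1] = -0.2

block_vals, block_vecs = leading_eigenvectors(block, 2)
near_vals, near_vecs = leading_eigenvectors(near_block, 2)

# Row i of *_vecs is item i in spectral space; inner products between rows
# are large within a block and small across blocks.
block_sim = block_vecs @ block_vecs.T
near_sim = near_vecs @ near_vecs.T
```

For the exact block matrix, items in the same block have identical spectral coordinates; for the near-block matrix they are only approximately equal, which is what makes a simple clustering step in spectral space sufficient.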
The Spectral Advantage The key advantage of spectral clustering is the spectral space representation:
Clustering and Classification Once our data is in spectral space: Clustering Classification
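Measuring Affinity
- Intensity: $\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_i^2} \lVert I(x) - I(y) \rVert^2\right\}$
- Distance: $\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_d^2} \lVert x - y \rVert^2\right\}$
- Texture: $\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_t^2} \lVert c(x) - c(y) \rVert^2\right\}$
  Here c(x) is a vector of filter outputs. A natural thing to do is to square the outputs of a range of different filters at different scales and orientations, smooth the result, and stack these into a vector.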
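A sketch of an affinity matrix, assuming the common Gaussian form exp(-d^2 / (2 sigma^2)) applied to whatever per-token feature is chosen (intensity, position, or a texture vector). The function name `gaussian_affinity` and the toy intensities are illustrative assumptions. The example also shows how the scale sigma changes which tokens count as similar.

```python
import numpy as np

def gaussian_affinity(features, sigma):
    """Affinity matrix A[i, j] = exp(-||f_i - f_j||^2 / (2 sigma^2))."""
    f = np.asarray(features, dtype=float)
    if f.ndim == 1:                  # treat a flat list as scalar features
        f = f[:, None]
    d2 = ((f[:, None, :] - f[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

intensities = [0.0, 0.1, 1.0]
A_small = gaussian_affinity(intensities, sigma=0.2)   # only near values are similar
A_large = gaussian_affinity(intensities, sigma=10.0)  # everything looks similar
```

With a small sigma the matrix is close to block structured; with a large sigma almost every entry is near 1 and the grouping information is washed out.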
Scale affects affinity This is figure 14.18
Drawbacks of Minimum Cut
- The weight of a cut grows with the number of edges in the cut, so minimum cut favors cutting off small, isolated sets of nodes.
- Figure labels: cuts with lesser weight than the ideal cut; ideal cut.
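Normalized Cuts [1]
- The normalized cut is defined as
  $$\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$$
  where cut(A,B) is the total weight of the edges between A and B, and assoc(A,V) is the total weight of the edges from nodes in A to all nodes in the graph V.
- Ncut(A,B) is a measure of the dissimilarity of sets A and B. It is small if
  - weights between clusters are small
  - weights within a cluster are large
- Minimizing Ncut(A,B) maximizes a measure of similarity within the sets A and B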
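A sketch comparing the plain cut weight with the normalized cut on a toy graph, assuming the usual Shi and Malik form Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V). The graph (two triangles joined by a weak bridge, plus one weakly attached outlier node) and all names are made up for illustration.

```python
import numpy as np

def cut(W, in_a):
    """Total weight of edges between A (mask True) and B (mask False)."""
    return W[np.ix_(in_a, ~in_a)].sum()

def ncut(W, in_a):
    """Cut weight normalized by each side's total connection to the graph."""
    W = np.asarray(W, dtype=float)
    in_a = np.asarray(in_a, dtype=bool)
    c = cut(W, in_a)
    assoc_a = W[in_a, :].sum()       # assoc(A, V)
    assoc_b = W[~in_a, :].sum()      # assoc(B, V)
    return c / assoc_a + c / assoc_b

W = np.zeros((7, 7))
for group in ([0, 1, 2], [3, 4, 5]):         # two triangles with unit weights
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.3                      # weak bridge between the triangles
W[5, 6] = W[6, 5] = 0.2                      # outlier node 6, weakly attached

ideal = np.array([True, True, True, False, False, False, False])
outlier = np.array([False] * 6 + [True])

cut_ideal, cut_outlier = cut(W, ideal), cut(W, outlier)
ncut_ideal, ncut_outlier = ncut(W, ideal), ncut(W, outlier)
```

Isolating the outlier has the smaller cut weight, but its normalized cut is about 1 because the outlier's whole association with the graph is removed, while the split between the triangles has a much smaller normalized cut.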
Finding Minimum Normalized-Cut
- Finding the minimum normalized cut is NP-hard.
- Polynomial-time approximations are generally used for segmentation.
Finding Minimum Normalized-Cut
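Finding Minimum Normalized-Cut
- It can be shown that
  $$\min_x \mathrm{Ncut}(x) = \min_y \frac{y^T (D - W)\, y}{y^T D\, y}$$
  such that $y_i \in \{1, -b\}$ and $y^T D \mathbf{1} = 0$, where W is the weight matrix and D is the diagonal matrix with $D_{ii} = \sum_j W_{ij}$.
- If y is allowed to take real values, then the minimization can be done by solving the generalized eigenvalue system
  $$(D - W)\, y = \lambda D\, y$$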
Algorithm
1. Compute the matrices W and D.
2. Solve for the eigenvectors with the smallest eigenvalues.
3. Use the eigenvector with the second smallest eigenvalue to bipartition the graph.
4. Recursively partition the segmented parts if necessary.
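A minimal sketch of one bipartition step, assuming the generalized system (D - W) y = lambda D y from the normalized cut formulation. It is solved here through the equivalent symmetric problem D^(-1/2) (D - W) D^(-1/2) z = lambda z with y = D^(-1/2) z. Splitting at y > 0 is one simple choice made for this example (other thresholds are possible), and the code assumes every node has positive degree. The function name `ncut_bipartition` and the toy graph are illustrative.

```python
import numpy as np

def ncut_bipartition(W):
    """Split a graph in two using the second smallest generalized eigenvector."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                          # diagonal of D (node degrees)
    d_inv_sqrt = 1.0 / np.sqrt(d)              # assumes all degrees are positive
    # D^(-1/2) (D - W) D^(-1/2) = I - D^(-1/2) W D^(-1/2)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)         # ascending eigenvalues
    y = d_inv_sqrt * vecs[:, 1]                # second smallest, mapped back to y
    return y > 0, vals[1]                      # assumed threshold: sign of y

# Two triangles with unit weights joined by one weak bridge edge.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1

in_a, second_eigenvalue = ncut_bipartition(W)
```

Recursive partitioning would call the same function again on the sub-matrix of W restricted to each side. For images, W is large but sparse, so a sparse eigensolver would normally replace the dense `eigh` call used in this sketch.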
Example. Figure panels: points; matrix; eigenvector.
More than two segments. Two options:
- Recursively split each side to get a tree, continuing till the eigenvalues are too small
- Use the other eigenvectors
More than two segments
Normalized cuts
- The current criterion evaluates within-cluster similarity, but not across-cluster difference
- Instead, we’d like to maximize the within-cluster similarity compared to the across-cluster difference
- Write the graph as V, one cluster as A and the other as B
- Maximize
  $$\frac{\mathrm{assoc}(A, A)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{assoc}(B, B)}{\mathrm{assoc}(B, V)}$$
  i.e. construct A, B such that their within-cluster similarity is high compared to their association with the rest of the graph
This is figure 14.23; the caption there explains it all. Figure from “Image and video segmentation: the normalised cut framework”, by Shi and Malik, copyright IEEE, 1998
Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000 This is figure 14.24, whose caption gives the story
Drawbacks of Minimum Normalized Cut
- Huge storage requirement and time complexity
- Bias towards partitioning into equal segments
- Has problems with textured backgrounds