Segmentation Techniques
Luis E. Tirado
PhD qualifying exam presentation
Northeastern University
Segmentation
Spectral Clustering
– Graph-cut
– Normalized graph-cut
Expectation Maximization (EM) clustering
Graph Theory Terminology
Graph G(V,E)
– Set of vertices and edges
– Numbers on edges represent weights
Graphs for clustering
– Points are vertices
– Weights decrease with distance
– Segmentation: look for a minimum cut in the graph
(Figure: a weighted graph partitioned into vertex subsets A and B)
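A minimal sketch of how such a weighted graph can be built from data points, assuming a Gaussian affinity w_ij = exp(-||x_i - x_j||^2 / (2σ^2)); the function and parameter names are illustrative.

```python
import numpy as np

def affinity_matrix(points, sigma=1.0):
    """Build a weighted, undirected graph from data points.

    Each point is a vertex; edge weights decrease with distance,
    here via a Gaussian kernel exp(-||xi - xj||^2 / (2 sigma^2)).
    """
    diffs = points[:, None, :] - points[None, :, :]   # pairwise differences
    sq_dists = np.sum(diffs ** 2, axis=-1)            # squared Euclidean distances
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))        # affinity (weight) matrix
    np.fill_diagonal(W, 0.0)                          # no self-loops
    return W
```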
Spectral Clustering
Graph-cut
– Represent the undirected, weighted graph G = (V,E) by its affinity matrix A
– Use eigenvectors of A for segmentation
  – Assume k elements and c clusters
  – Represent cluster n with a vector w_n of k components
  – Values represent cluster association; normalize so that w_n^T w_n = 1
– Extract good clusters
  – Select the w_n that maximizes the association w_n^T A w_n
  – The solution w_n is an eigenvector of A; select the eigenvector with the largest eigenvalue
(Example graph from Forsyth & Ponce)
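A small illustration of this step, under the assumption that the dominant cluster is read off by thresholding the magnitude of the leading eigenvector of A; the function name and threshold are illustrative.

```python
import numpy as np

def dominant_cluster(A, threshold=1e-2):
    """Graph-cut style clustering via the leading eigenvector of A.

    Maximizing w^T A w subject to w^T w = 1 is solved by the eigenvector
    of A with the largest eigenvalue; large-magnitude entries indicate
    points associated with the dominant cluster.
    """
    eigvals, eigvecs = np.linalg.eigh(A)       # A is symmetric
    w = eigvecs[:, -1]                         # eigenvector with largest eigenvalue
    w = w * np.sign(w[np.argmax(np.abs(w))])   # fix sign so large entries are positive
    return np.abs(w) > threshold               # boolean mask of the dominant cluster
```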
Spectral Clustering
Normalized Cut
– Addresses drawbacks of graph-cut
– Define the association between a vertex subset A and the full set V as:
  assoc(A, V) = Σ_{u∈A, t∈V} w(u, t)
– Previously we maximized the within-cluster association assoc(A, A); the normalized cut also accounts for each subset's total association assoc(A, V). Define the normalized cut between subsets A and B as:
  Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)
Spectral Clustering
Normalized Cuts Algorithm
– Define the degree matrix D with D(i, i) = Σ_j A(i, j), where A is the affinity matrix
– Define a vector x depicting cluster membership: x_i = 1 if point i is in A, and -1 otherwise
– Define y, a real-valued approximation to x
– We now wish to minimize the objective function:
  Ncut ≈ (y^T (D - A) y) / (y^T D y)
– This constitutes solving the generalized eigenvalue problem:
  (D - A) y = λ D y
– The solution is the eigenvector with the second-smallest eigenvalue
– If the resulting normalized cut is over some threshold, re-partition the graph recursively
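A minimal numerical sketch of one level of this recursion, assuming SciPy is available: it solves (D - A) y = λ D y and splits the graph by the sign of the second-smallest generalized eigenvector. The function name is illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut_bipartition(A):
    """One level of the normalized-cuts recursion.

    Solves the generalized eigenproblem (D - A) y = lambda * D * y and
    partitions the vertices by the sign of the second-smallest eigenvector.
    """
    D = np.diag(A.sum(axis=1))          # degree matrix
    eigvals, eigvecs = eigh(D - A, D)   # generalized symmetric-definite problem
    y = eigvecs[:, 1]                   # eigenvector of the second-smallest eigenvalue
    return y > 0                        # boolean partition of the vertices
```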
Probabilistic Mixture Resolving Approach to Clustering
Expectation Maximization (EM) Algorithm
– Density estimation of data points in an unsupervised setting
– Finds maximum-likelihood (ML) estimates when the data depend on latent variables
  – E step: compute the likelihood expectation, treating the latent variables as if observed
  – M step: compute ML estimates of the parameters by maximizing the expectation above
– Start with a Gaussian Mixture Model:
  p(x) = Σ_k π_k N(x | μ_k, Σ_k)
– Segmentation: reformulate as a missing-data problem; the latent variable Z provides the labeling
– Gaussian bivariate PDF:
  N(x | μ, Σ) = (1 / (2π |Σ|^{1/2})) exp(-(1/2) (x - μ)^T Σ^{-1} (x - μ))
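A small sketch of evaluating this mixture density for bivariate data, assuming SciPy; the mixture weights, means, and covariances below are purely illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, weights, means, covs):
    """Evaluate p(x) = sum_k pi_k * N(x | mu_k, Sigma_k) at point(s) x."""
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=cov)
               for pi, mu, cov in zip(weights, means, covs))

# Illustrative two-component bivariate mixture
weights = [0.6, 0.4]
means = [np.zeros(2), np.array([3.0, 3.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]
print(gmm_density(np.array([1.0, 1.0]), weights, means, covs))
```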
Probabilistic Mixture Resolving Approach to Clustering
EM Process
– Maximize the log-likelihood function:
  log L(θ) = Σ_i log Σ_k π_k N(x_i | μ_k, Σ_k)
– Not trivial (a log of a sum); introduce Z and denote the complete data Y = [X^T Z^T]^T:
  log L_c(θ) = Σ_i Σ_k z_ik [log π_k + log N(x_i | μ_k, Σ_k)]
– If we knew the complete data Y, the ML estimates would be easy to compute
Probabilistic Mixture Resolving Approach to Clustering
EM Steps
– E step: compute the responsibilities from the current parameters
  γ(z_ik) = π_k N(x_i | μ_k, Σ_k) / Σ_j π_j N(x_i | μ_j, Σ_j)
– M step: re-estimate the parameters using the responsibilities
  μ_k = (1/N_k) Σ_i γ(z_ik) x_i
  Σ_k = (1/N_k) Σ_i γ(z_ik) (x_i - μ_k)(x_i - μ_k)^T
  π_k = N_k / N, where N_k = Σ_i γ(z_ik)
– Iterate until the log-likelihood converges
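A compact sketch of these iterations for a bivariate mixture, assuming SciPy; the initialization and fixed iteration count are simplified for illustration, and the function name is not from the original slides.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    """EM for a Gaussian mixture: alternate responsibility (E) and parameter (M) updates."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    pis = np.full(K, 1.0 / K)
    mus = X[rng.choice(N, K, replace=False)]          # initialize means at random points
    covs = np.array([np.cov(X.T) for _ in range(K)])  # initialize with the data covariance
    for _ in range(n_iter):
        # E step: responsibilities gamma[i, k]
        gamma = np.column_stack([pi * multivariate_normal.pdf(X, mean=mu, cov=cov)
                                 for pi, mu, cov in zip(pis, mus, covs)])
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M step: weighted maximum-likelihood estimates
        Nk = gamma.sum(axis=0)
        mus = (gamma.T @ X) / Nk[:, None]
        covs = np.array([((gamma[:, k, None] * (X - mus[k])).T @ (X - mus[k])) / Nk[k]
                         for k in range(K)])
        pis = Nk / N
    return pis, mus, covs, gamma
```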
Spectral Clustering Results (result figures)
EM Clustering Results (result figures)
Conclusions
For a simple case like the four-Gaussian example, both algorithms perform well, as the results show.
From the literature (k = number of clusters):
– EM is good for small k; it gives only a coarse segmentation for large k
  – Needs to know the number of components to cluster
  – Initial conditions are essential; prior knowledge helps accelerate convergence and reach a local/global maximum of the likelihood
– Ncut gives good results for large k
  – For a fully connected graph, it has intensive space and computation-time requirements
– Graph cut's first-eigenvector approach finds points in the 'dominant' cluster
  – Not very consistent; the literature advocates the normalized approach
– In the end, the choice is a tradeoff that depends on the source data
References (for slide images)
J. Shi and J. Malik, "Normalized Cuts and Image Segmentation." http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
C. Bishop, "Latent Variables, Mixture Models and EM." http://cmp.felk.cvut.cz/cmp/courses/recognition/Resources/_EM/Bishop-EM.ppt
R. Nugent and L. Stanberry, "Spectral Clustering." http://www.stat.washington.edu/wxs/Stat593-s03/Student-presentations/SpectralClustering2.ppt
S. Candemir, "Graph-based Algorithms for Segmentation." http://www.bilmuh.gyte.edu.tr/BIL629/special%20section-%20graphs/GraphBasedAlgorithmsForComputerVision.ppt
W. H. Liao, "Segmentation: Graph-Theoretic Clustering." http://www.cs.nccu.edu.tw/~whliao/acv2008/segmentation_by_graph.ppt
D. Forsyth and J. Ponce, "Computer Vision: A Modern Approach."
Supplementary material
K-means (used by some clustering algorithms)
– Determine the Euclidean distance of each object in the data set to the (randomly picked) center points
– Construct K clusters by assigning every point to its closest center
– Move the center points to the actual centers of the resulting clusters, and repeat
Responsibilities
Responsibilities r_nk assign data points to clusters such that
  r_nk ∈ {0, 1} and Σ_k r_nk = 1 for every data point n
Example: 5 data points and 3 clusters give a 5×3 responsibility matrix with exactly one 1 per row
K-means Cost Function
  J = Σ_n Σ_k r_nk ||x_n - μ_k||^2
  (μ_k: prototypes, r_nk: responsibilities, x_n: data)
Minimizing the Cost Function
E-step: minimize J w.r.t. the responsibilities r_nk
– assigns each data point to its nearest prototype
M-step: minimize J w.r.t. the prototypes μ_k
– gives μ_k = Σ_n r_nk x_n / Σ_n r_nk
– each prototype is set to the mean of the points in its cluster
Convergence is guaranteed since there is a finite number of possible settings for the responsibilities
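A minimal sketch of these two alternating steps; the random initialization, iteration cap, and convergence check are illustrative choices, not part of the original slides.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Alternate the E-step (assign to nearest prototype) and the M-step (recompute means)."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]          # random initial prototypes
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # E-step: assign each data point to its nearest prototype
        dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # M-step: each prototype becomes the mean of the points assigned to it
        new_mu = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):                       # finitely many assignments -> converges
            break
        mu = new_mu
    return mu, labels
```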
Limitations of K-means
Hard assignments of data points to clusters
– a small shift of a data point can flip it to a different cluster
Not clear how to choose the value of K
– and the value must be chosen beforehand
– Solution: replace the 'hard' clustering of K-means with the 'soft' probabilistic assignments of EM
Not robust to outliers
– data points far from a centroid can pull the centroid away from the true cluster center
Example: Mixture of 3 Gaussians (figure)
Contours of Probability Distribution (figure)
EM Algorithm – Informal Derivation
Let us proceed by simply differentiating the log likelihood.
Setting the derivative with respect to μ_k equal to zero gives
  0 = Σ_n γ(z_nk) Σ_k^{-1} (x_n - μ_k)
giving
  μ_k = (1/N_k) Σ_n γ(z_nk) x_n, with N_k = Σ_n γ(z_nk)
which is simply the weighted mean of the data.
Ng, Jordan, Weiss Algorithm
Form the matrix L = D^{-1/2} A D^{-1/2}, where A is the affinity matrix and D is the degree matrix
Find x_1, ..., x_k, the k largest eigenvectors of L
These form the columns of the new matrix X
– Note: we have reduced the dimension from n×n to n×k
Ng, Jordan, Weiss Algorithm (continued)
Form the matrix Y
– Renormalize each of X's rows to have unit length: Y_ij = X_ij / (Σ_j X_ij^2)^{1/2}
Treat each row of Y as a point in R^k
– Cluster into k clusters via K-means
Final cluster assignment
– Assign point i to cluster j iff row i of Y was assigned to cluster j
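A compact sketch of the whole pipeline, reusing the illustrative affinity_matrix and kmeans helpers sketched earlier in this supplement; all names are assumptions, not code from the original slides.

```python
import numpy as np

def njw_spectral_clustering(A, k):
    """Ng-Jordan-Weiss: normalize A, embed with the top-k eigenvectors, run K-means on the rows."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = D_inv_sqrt @ A @ D_inv_sqrt                    # L = D^{-1/2} A D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(L)
    X = eigvecs[:, -k:]                                # k largest eigenvectors as columns
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    _, labels = kmeans(Y, k)                           # rows of Y clustered with K-means
    return labels
```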
Reasoning for Ng, Jordan, Weiss
If we eventually use K-means, why not just apply K-means to the original data?
– This method allows us to cluster non-convex regions
User's Prerogative
– Choice of k, the number of clusters
– Choice of the scaling factor σ used in the affinity
  – Realistically, search over σ and pick the value that gives the tightest clusters
– Choice of clustering method
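A brief sketch of that search, reusing the illustrative helpers above and scoring each candidate σ by the within-cluster scatter of the resulting clusters; this particular tightness measure is an assumption, not something specified on the slides.

```python
import numpy as np

def pick_sigma(points, k, sigmas):
    """Search over sigma; keep the value whose clustering is tightest (smallest scatter)."""
    best_sigma, best_score = None, np.inf
    for sigma in sigmas:
        A = affinity_matrix(points, sigma)             # from the earlier sketch
        labels = njw_spectral_clustering(A, k)
        score = sum(np.sum((points[labels == c] - points[labels == c].mean(axis=0)) ** 2)
                    for c in np.unique(labels))        # within-cluster scatter
        if score < best_score:
            best_sigma, best_score = sigma, score
    return best_sigma
```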
Comparison of Methods (Authors | Matrix used | Procedure / Eigenvectors used)
– Perona & Freeman | Affinity A | 1st (largest) eigenvector x; recursive procedure, but can be used non-recursively with the k largest eigenvectors for simple cases
– Shi & Malik | D − A, with D a degree matrix | 2nd smallest generalized eigenvector; also recursive
– Ng, Jordan, Weiss | Affinity A; user inputs k | Normalizes A, finds the k largest eigenvectors, forms X, normalizes X's rows, clusters the rows
Advantages/Disadvantages
Perona & Freeman
– For block-diagonal affinity matrices, the first eigenvector finds points in the "dominant" cluster; not very consistent
Shi & Malik
– The 2nd generalized eigenvector minimizes the affinity between groups normalized by the affinity within each group; no guarantee once the discrete constraints are relaxed
Ng, Jordan, Weiss
– Again depends on the choice of k
– Claim: effectively handles clusters whose overlap or connectedness varies across clusters
(Figure: for three example affinity matrices, the Perona/Freeman 1st eigenvector and the Shi/Malik 2nd generalized eigenvector are shown side by side.)