Segmentation Techniques
Luis E. Tirado
PhD qualifying exam presentation
Northeastern University

Page 2: Segmentation
– Spectral clustering
  – Graph-cut
  – Normalized graph-cut
– Expectation Maximization (EM) clustering


Page 4: Graph Theory Terminology
– Graph G(V,E): a set of vertices and edges; the numbers on the edges represent weights
– Graphs for clustering:
  – Points are vertices
  – Weights decrease with distance
  – Segmentation: look for a minimum cut in the graph
[Figure: example weighted graph partitioned into vertex sets A and B]
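To make the "weights decrease with distance" idea concrete, here is a minimal numpy sketch (not from the original slides) that builds an affinity matrix from 2-D points with a Gaussian kernel; the function name, the sigma value, and the toy data are all illustrative choices.

```python
import numpy as np

def affinity_matrix(points, sigma=1.0):
    """Build a dense affinity matrix from 2-D points.

    Weights decay with Euclidean distance via a Gaussian kernel, so
    nearby points get edge weights near 1 and distant points get
    weights near 0. sigma is a user-chosen scale parameter.
    """
    diff = points[:, None, :] - points[None, :, :]   # pairwise differences
    dist2 = np.sum(diff ** 2, axis=-1)               # squared distances
    A = np.exp(-dist2 / (2.0 * sigma ** 2))          # Gaussian similarity
    np.fill_diagonal(A, 0.0)                         # no self-loops
    return A

# Two well-separated blobs as toy "clusters"
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])
A = affinity_matrix(pts, sigma=0.5)
print(A.shape)  # (20, 20)
```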

Page 5: Spectral Clustering – Graph-cut
– Represent the undirected, weighted graph G = (V,E) by its affinity matrix A
– Use eigenvectors of A for segmentation
  – Assume k elements and c clusters
  – Represent cluster n by a vector $w_n$ of k components whose values encode cluster association; normalize so that $w_n^T w_n = 1$
  – Extract good clusters: select the $w_n$ that maximizes $w_n^T A w_n$
  – Solution: $w_n$ is an eigenvector of A; select the eigenvector with the largest eigenvalue
(from Forsyth & Ponce)
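A hedged sketch of this eigenvector argument, assuming the affinity matrix A from the previous snippet: maximizing $w^T A w$ under $w^T w = 1$ is solved by the leading eigenvector of A, and thresholding its entries gives a rough "dominant" cluster. The function name and the threshold rule are my own.

```python
import numpy as np

def dominant_cluster(A):
    """Pick out the 'dominant' cluster from a symmetric affinity matrix A.

    Maximizing w^T A w subject to w^T w = 1 is solved by the eigenvector
    of A with the largest eigenvalue; entries of that eigenvector with
    large magnitude indicate membership in the dominant cluster.
    """
    vals, vecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    w = vecs[:, -1]                     # eigenvector with the largest eigenvalue
    if w.sum() < 0:                     # the sign of an eigenvector is arbitrary
        w = -w
    return w > 0.5 * w.max()            # simple illustrative threshold
```

Applied to the two-blob affinity matrix above, this should flag roughly one of the two blobs; in practice one would re-run it on the remaining points to peel off further clusters.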

Page 6: Spectral Clustering – Normalized Cut
– Addresses drawbacks of the plain graph-cut
– Define the association between a vertex subset A and the full set V as
  $\mathrm{assoc}(A, V) = \sum_{u \in A,\; t \in V} w(u, t)$
– Previously we maximized assoc(A,A); the normalized cut instead measures the cut cost as a fraction of the total connections from each subset to all vertices. Define the normalized cut as
  $\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$

Page 7: Spectral Clustering – Normalized Cuts Algorithm
– Define the degree matrix $D(i,i) = \sum_j A(i,j)$, where A is the affinity matrix
– Define the indicator vector x of cluster membership: $x_i = 1$ if point i is in A, and $-1$ otherwise
– Define a real-valued approximation y to x
– We now wish to minimize the objective
  $\min_y \frac{y^T (D - A)\, y}{y^T D\, y}$
– This amounts to solving the generalized eigensystem $(D - A)\, y = \lambda D y$
– The solution is the eigenvector with the second smallest eigenvalue
– Recursively re-partition the resulting segments, using a threshold on the Ncut value as the stopping criterion
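The following is a sketch of a single normalized-cut split along the lines described above, using SciPy's generalized symmetric eigensolver; splitting on the sign of the second eigenvector (rather than searching over split points, as Shi & Malik do) and the helper name are simplifications of mine. It assumes every vertex has positive degree.

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(A):
    """One normalized-cut split of a weighted graph given its affinity matrix A.

    Solves the generalized eigenproblem (D - A) y = lambda * D * y and
    splits on the sign of the eigenvector with the second smallest
    eigenvalue (the smallest belongs to the trivial constant vector).
    """
    d = A.sum(axis=1)               # vertex degrees (assumed strictly positive)
    D = np.diag(d)
    L = D - A                       # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)         # generalized problem, ascending eigenvalues
    y = vecs[:, 1]                  # second smallest eigenvalue
    return y >= 0                   # boolean partition of the vertices
```

The Ncut value of the proposed split can then be evaluated and used to decide whether to recurse on each side.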

Page 8: Probabilistic Mixture-Resolving Approach to Clustering
– Expectation Maximization (EM) algorithm
  – Density estimation of data points in an unsupervised setting
  – Finds maximum-likelihood (ML) estimates when the data depend on latent variables
  – E step: expectation of the likelihood, treating the latent variables as if they were observed
  – M step: computes ML estimates of the parameters by maximizing the expectation from the E step
– Start with a Gaussian mixture model:
  $p(x \mid \Theta) = \sum_{j=1}^{c} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)$
– Segmentation: reformulate as a missing-data problem; a latent variable Z provides the labeling
– Gaussian bivariate PDF:
  $\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\, |\Sigma|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$

Page 9: Probabilistic Mixture-Resolving Approach to Clustering – EM Process
– Maximize the log-likelihood function
  $\log L(\Theta \mid X) = \sum_{i=1}^{N} \log \sum_{j=1}^{c} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)$
– This is not trivial (the log of a sum); introduce the latent labels Z and denote the complete data $Y = [X^T\, Z^T]^T$
– If the complete data Y were known, the ML estimate would be easy:
  $\log L(\Theta \mid Y) = \sum_{i=1}^{N} \log\!\left(\pi_{z_i} \, \mathcal{N}(x_i \mid \mu_{z_i}, \Sigma_{z_i})\right)$

Page 10: Probabilistic Mixture-Resolving Approach to Clustering – EM Steps
– E step: compute the responsibilities (posterior label probabilities) under the current parameters
  $\gamma_{ij} = \frac{\pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}{\sum_{l} \pi_l \, \mathcal{N}(x_i \mid \mu_l, \Sigma_l)}$
– M step: re-estimate the parameters, using the responsibilities as soft counts
  $N_j = \sum_i \gamma_{ij}, \qquad \pi_j = \frac{N_j}{N}, \qquad \mu_j = \frac{1}{N_j}\sum_i \gamma_{ij}\, x_i, \qquad \Sigma_j = \frac{1}{N_j}\sum_i \gamma_{ij}\,(x_i-\mu_j)(x_i-\mu_j)^T$
– Iterate until the log-likelihood (or the parameters) stop changing
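For concreteness, here is a compact numpy sketch of these E and M steps for a Gaussian mixture with full covariances; the initialization, the small regularization term, and all names are choices of mine rather than anything prescribed by the slides, and no numerical safeguards are included.

```python
import numpy as np

def em_gmm(X, k, n_iter=100, seed=0):
    """Fit a k-component Gaussian mixture to X (n x d) with EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                          # mixing weights
    mu = X[rng.choice(n, k, replace=False)]           # initial means from the data
    cov = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * k)

    for _ in range(n_iter):
        # E step: responsibilities gamma[i, j] = P(z_i = j | x_i)
        gamma = np.empty((n, k))
        for j in range(k):
            diff = X - mu[j]
            inv = np.linalg.inv(cov[j])
            norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov[j]))
            gamma[:, j] = pi[j] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
        gamma /= gamma.sum(axis=1, keepdims=True)

        # M step: re-estimate weights, means and covariances from soft counts
        Nk = gamma.sum(axis=0)
        pi = Nk / n
        mu = (gamma.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            cov[j] = (gamma[:, j, None] * diff).T @ diff / Nk[j] + 1e-6 * np.eye(d)

    return pi, mu, cov, gamma
```

Hard cluster labels then follow from `gamma.argmax(axis=1)`.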

Page 11: Spectral Clustering Results [figure]

Page 12: Spectral Clustering Results [figure]

Page 13: EM Clustering Results [figure]

Page 14: Conclusions
– For a simple case like the four-Gaussian example, both algorithms perform well, as the results show
– From the literature (k = number of clusters):
  – EM is good for small k; it gives only a coarse segmentation for large k
    – It needs to know the number of components in advance
    – Initial conditions matter; prior knowledge helps accelerate convergence and reach a good local (or the global) maximum of the likelihood
  – Ncut gives good results for large k
    – For a fully connected graph it has heavy memory and computation requirements
  – The plain graph-cut's first-eigenvector approach finds points in the 'dominant' cluster, but is not very consistent; the literature advocates the normalized approach
– In the end, the choice is a tradeoff that depends on the source data

Page 15: References (for slide images)
– J. Shi & J. Malik, "Normalized Cuts and Image Segmentation"
– C. Bishop, "Latent Variables, Mixture Models and EM"
– R. Nugent & L. Stanberry, "Spectral Clustering" – presentations/SpectralClustering2.ppt
– S. Candemir, "Graph-based Algorithms for Segmentation" – %20graphs/GraphBasedAlgorithmsForComputerVision.ppt
– W. H. Liao, "Segmentation: Graph-Theoretic Clustering"
– D. Forsyth & J. Ponce, "Computer Vision: A Modern Approach"

Page 16: Supplementary Material

Page 17: K-means (used by some clustering algorithms)
– Determine the Euclidean distance from each object in the data set to the (randomly picked) center points
– Construct K clusters by assigning every point to its closest center
– Move the center points to the actual centers (means) of the resulting clusters
– Repeat the assignment and update steps until the assignments stop changing (see the sketch after the cost-function slides below)

Page 18: Responsibilities
– Responsibilities $r_{nk}$ assign data points to clusters such that $r_{nk} \in \{0, 1\}$ and $\sum_k r_{nk} = 1$
– Example: 5 data points and 3 clusters (each row of the responsibility matrix contains a single 1)

Page 19: K-means Cost Function
$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \| x_n - \mu_k \|^2$
where the $\mu_k$ are the prototypes, the $r_{nk}$ are the responsibilities, and the $x_n$ are the data

Page 20: Minimizing the Cost Function
– E-step: minimize J with respect to the responsibilities $r_{nk}$
  – assigns each data point to its nearest prototype
– M-step: minimize J with respect to the prototypes $\mu_k$
  – gives $\mu_k = \frac{\sum_n r_{nk}\, x_n}{\sum_n r_{nk}}$
  – each prototype is set to the mean of the points in its cluster
– Convergence is guaranteed, since there is only a finite number of possible settings for the responsibilities
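Putting the three K-means slides together, here is a minimal numpy sketch of Lloyd's algorithm (the E/M alternation just described); the initialization from random data points and the convergence test are illustrative choices of mine.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's K-means: alternate hard assignments and mean updates."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), k, replace=False)]       # initial prototypes
    for _ in range(n_iter):
        # "E step": responsibility = index of the nearest prototype
        dist2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        labels = dist2.argmin(axis=1)
        # "M step": move each prototype to the mean of its assigned points
        new_mu = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else mu[j]
                           for j in range(k)])
        if np.allclose(new_mu, mu):                    # assignments have stabilized
            break
        mu = new_mu
    return labels, mu
```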

Page 21: Limitations of K-means
– Hard assignments of data points to clusters
  – a small shift of a data point can flip it to a different cluster
– It is not clear how to choose the value of K, and the value must be chosen beforehand
  – Solution: replace the 'hard' clustering of K-means with the 'soft' probabilistic assignments of EM
– Not robust to outliers
  – data far from a centroid may pull the centroid away from the real one

Page 22: Example: Mixture of 3 Gaussians [figure]

Page 23: Contours of Probability Distribution [figure]

Page 24: EM Algorithm – Informal Derivation
– Let us proceed by simply differentiating the log-likelihood
– Setting the derivative with respect to $\mu_k$ equal to zero gives
  $0 = \sum_{n} \frac{\pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j\, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}\; \Sigma_k^{-1} (x_n - \mu_k)$
– giving
  $\mu_k = \frac{1}{N_k} \sum_{n} \gamma(z_{nk})\, x_n, \qquad N_k = \sum_n \gamma(z_{nk})$
– which is simply the weighted mean of the data, with the responsibilities as weights

Page 25: Ng, Jordan, Weiss Algorithm
– Form the normalized matrix $L = D^{-1/2} A D^{-1/2}$
– Find $x_1, \ldots, x_k$, the k largest eigenvectors of L
– These form the columns of the new matrix X
  – Note: the dimension has been reduced from n×n to n×k

Page 26: Ng, Jordan, Weiss Algorithm
– Form the matrix Y by renormalizing each row of X to unit length:
  $Y_{ij} = X_{ij} \big/ \big(\sum_j X_{ij}^2\big)^{1/2}$
– Treat each row of Y as a point in $\mathbb{R}^k$
– Cluster the rows into k clusters via K-means
– Final cluster assignment: assign point i to cluster j iff row i of Y was assigned to cluster j
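A sketch of the Ng, Jordan, Weiss recipe from these two slides, using scipy.cluster.vq.kmeans2 for the final K-means step; the function name is mine, and the snippet assumes a symmetric affinity matrix with zero diagonal and strictly positive vertex degrees.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def njw_spectral_clustering(A, k):
    """Ng-Jordan-Weiss spectral clustering on an affinity matrix A.

    1. Form L = D^{-1/2} A D^{-1/2}.
    2. Take the k eigenvectors of L with the largest eigenvalues as columns of X.
    3. Renormalize each row of X to unit length, giving Y.
    4. Run K-means on the rows of Y and carry the labels back to the points.
    """
    d = A.sum(axis=1)                            # vertex degrees (assumed > 0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = D_inv_sqrt @ A @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)               # ascending eigenvalues
    X = vecs[:, -k:]                             # k largest eigenvectors (n x k)
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    _, labels = kmeans2(Y, k, minit='++')        # K-means in eigenvector space
    return labels
```

Because the K-means step runs in the eigenvector space rather than on the raw coordinates, this can handle the non-convex clusters mentioned on the next slide (e.g., two concentric rings) that plain K-means cannot separate.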

Page 27: Reasoning for Ng, Jordan, Weiss
– If we eventually use K-means, why not just apply K-means to the original data?
– This method allows us to cluster non-convex regions: in the eigenvector space, K-means can separate clusters (such as concentric rings) that are not linearly separable in the original coordinates

Page 28: User's Prerogative
– Choice of k, the number of clusters
– Choice of the scaling factor $\sigma$ used when building the affinity matrix
  – Realistically, search over $\sigma$ and pick the value that gives the tightest clusters
– Choice of clustering method

Page 29: Comparison of Methods
– Perona & Freeman
  – Matrix used: affinity A
  – Procedure/eigenvectors used: 1st (largest) eigenvector x; recursive procedure; can be used non-recursively with the k largest eigenvectors for simple cases
– Shi & Malik
  – Matrix used: D − A, with D a degree matrix
  – Procedure/eigenvectors used: 2nd smallest generalized eigenvector; also recursive
– Ng, Jordan, Weiss
  – Matrix used: affinity A; user inputs k
  – Procedure/eigenvectors used: normalizes A, finds the k largest eigenvectors, forms X, normalizes the rows of X, clusters the rows

Page 30: Advantages/Disadvantages
– Perona & Freeman
  – For block-diagonal affinity matrices, the first eigenvector finds points in the "dominant" cluster; not very consistent
– Shi & Malik
  – The 2nd generalized eigenvector minimizes the affinity between groups normalized by the affinity within each group; no guarantee, and the relaxation is constrained
– Ng, Jordan, Weiss
  – Again depends on the choice of k
  – Claim: effectively handles clusters whose overlap or connectedness varies across clusters

Page 31 Affinity Matrix Perona/Freeman Shi/Malik 1 st eigenv. 2 nd gen. eigenv. Affinity Matrix Perona/Freeman Shi/Malik 1 st eigenv. 2 nd gen. eigenv. Affinity Matrix Perona/Freeman Shi/Malik 1 st eigenv. 2 nd gen. eigenv.