Spectral Clustering Eric Xing Lecture 8, August 13, 2010

Most slides are from Eyal David’s presentation

Machine Learning: Spectral Clustering. Eric Xing, Lecture 8, August 13, 2010. © Eric Xing @ CMU, 2006-2010. Reading:

Data Clustering


Data Clustering. Two different criteria: compactness (e.g., k-means, mixture models) and connectivity (e.g., spectral clustering).

Spectral Clustering: from data to pairwise similarities.

Weighted Graph Partitioning. Some graph terminology: objects (e.g., pixels, data points) i ∈ I are the vertices of graph G; edges (i, j) are the pairs with W_ij > 0; the similarity matrix is W = [W_ij]; the degree is d_i = Σ_{j∈G} W_ij, and d_A = Σ_{i∈A} d_i is the degree of a subset A; Assoc(A, B) = Σ_{i∈A} Σ_{j∈B} W_ij.

Cuts in a Graph. An (edge) cut is a set of edges whose removal makes the graph disconnected. Weight of a cut: cut(A, B) = Σ_{i∈A} Σ_{j∈B} W_ij = Assoc(A, B). Minimum-cut criterion: minimize cut(A, Ā). More generally, the cut weight can be normalized by the degree of each side, which leads to the normalized cut below.
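These quantities are plain sums over entries of W. A minimal numpy sketch (the helper names `assoc`, `cut`, and `ncut` are mine, not from the lecture):

```python
import numpy as np

def assoc(W, A, B):
    """Assoc(A, B) = sum of W_ij over i in A, j in B."""
    return W[np.ix_(A, B)].sum()

def cut(W, A, B):
    """Weight of the cut between disjoint node sets A and B."""
    return assoc(W, A, B)

def ncut(W, A, B):
    """Normalized cut: cut weight normalized by each side's degree d_A, d_B."""
    d = W.sum(axis=1)                   # node degrees d_i
    dA, dB = d[A].sum(), d[B].sum()     # d_A, d_B
    return cut(W, A, B) / dA + cut(W, A, B) / dB

# Two tight pairs joined by one weak edge
W = np.array([[0, 1, .1, 0],
              [1, 0,  0, 0],
              [.1, 0, 0, 1],
              [0, 0,  1, 0.]])
A, B = [0, 1], [2, 3]
print(cut(W, A, B))   # 0.1: only the weak edge crosses the cut
```

Note that d_A here equals Assoc(A, V), matching the degree definition on the terminology slide.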

Graph-based Clustering. Data grouping; image segmentation. Let G = {V, E} with edge weights W_ij. Affinity matrix: W = [W_ij]. Degree matrix: D = diag(d_1, …, d_n). Laplacian matrix: L = D − W. (Bipartite) partition vector: y ∈ {+1, −1}^n.
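These matrices can be built in a few lines. A sketch assuming a Gaussian similarity (the lecture's exact affinity function is not shown here; `graph_matrices` is my own name):

```python
import numpy as np

def graph_matrices(X, sigma=1.0):
    """Gaussian affinity W, degree matrix D, and Laplacian L = D - W
    from a data matrix X (n points x d features); sigma is the kernel width."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)           # no self-loops
    D = np.diag(W.sum(axis=1))         # degree matrix
    return W, D, D - W
```

Every row of L sums to zero, i.e., L·1 = 0, which is exactly why the constant vector shows up as the first eigenvector in the relaxation slide below.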

Affinity Function. Commonly a Gaussian kernel, W_ij = exp(−‖x_i − x_j‖² / (2σ²)); affinities grow as σ grows. How does the choice of σ affect the results? What would be the optimal choice for σ?
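The claim that affinities grow with σ is easy to check numerically. A small illustration, assuming the common Gaussian form W_ij = exp(−‖x_i − x_j‖²/(2σ²)):

```python
import numpy as np

# Gaussian affinity between two fixed points as sigma varies
x, y = np.array([0.0, 0.0]), np.array([3.0, 4.0])   # distance 5
d2 = ((x - y) ** 2).sum()
sigmas = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
affinities = np.exp(-d2 / (2 * sigmas ** 2))
# affinities increase monotonically with sigma: a larger scale treats the
# same pair as "more similar", blurring cluster boundaries
```

At small σ only very close points look similar; at large σ almost everything does, which is why the choice of scale matters so much.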

Clustering via Optimizing Normalized Cut. The normalized cut: Ncut(A, B) = cut(A, B)/d_A + cut(A, B)/d_B. Computing an optimal normalized cut over all possible partitions y is NP-hard. Shi & Malik (2000) transform the Ncut objective into matrix form: minimize y^T (D − W) y / (y^T D y), subject to y_i ∈ {1, −b} and y^T D 1 = 0. This is a Rayleigh quotient, but with the discrete constraint it is still an NP-hard problem.

Relaxation. Instead, relax y into the continuous domain by solving the generalized eigenvalue system (D − W) y = λ D y, which, by the Rayleigh quotient theorem, minimizes the quotient above over real-valued y subject to y^T D 1 = 0. Note that (D − W) 1 = 0, so the first eigenvector is y_0 = 1 with eigenvalue 0. The second-smallest eigenvector is the real-valued solution to this problem!

Algorithm.
1. Define a similarity function between two nodes.
2. Compute the affinity matrix W and the degree matrix D.
3. Solve (D − W) y = λ D y, e.g., via a singular value decomposition (SVD) of the graph Laplacian.
4. Use the eigenvector y* with the second-smallest eigenvalue to bipartition the graph: for each threshold k, let A_k = {i | y_i among the k largest elements of y*} and B_k = {i | y_i among the n − k smallest elements of y*}, and compute Ncut(A_k, B_k).
5. Output the partition (A_k, B_k) with the smallest Ncut.
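The steps above can be sketched as follows. This is my own minimal implementation, using a sign split of the second eigenvector rather than the full threshold sweep, and solving the generalized problem through the equivalent symmetric one:

```python
import numpy as np

def ncut_bipartition(W):
    """Bipartition in the spirit of Shi & Malik: solve (D - W) y = lambda D y
    via the symmetric problem D^{-1/2} (D - W) D^{-1/2} z = lambda z, then
    split on the sign of the second-smallest generalized eigenvector.
    (The algorithm above also sweeps thresholds; sign(0) is the simplest cut.)"""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.diag(d) - W                       # graph Laplacian
    L_sym = D_inv_sqrt @ L @ D_inv_sqrt
    vals, Z = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    y = D_inv_sqrt @ Z[:, 1]                 # second-smallest generalized eigenvector
    return y >= 0                            # boolean cluster labels

# two triangles connected by one weak edge
W = np.array([[0, 1, 1, .01, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [.01, 0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0.]])
labels = ncut_bipartition(W)
```

On this graph the second eigenvector is nearly piecewise constant, so the sign split recovers the two triangles.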

Ideally … the second eigenvector is piecewise constant: its entries y_{2i} take one value for i ∈ A and a different value for the remaining nodes, so the partition can be read off directly.

Example (Xing et al, 2001)

Poor features can lead to a poor outcome (Xing et al, 2001)

Cluster vs. Block matrix

Compare to Minimum Cut. Criterion for partition: minimize cut(A, B), first proposed by Wu and Leahy. A problem: the weight of a cut is directly proportional to the number of edges in the cut, so the minimum-cut criterion favors cutting off small, isolated sets of nodes; such cuts can have less weight than the ideal cut.

Superior Performance? K-means and Gaussian mixture methods are biased toward convex clusters.

Ncut is superior in certain cases.

Why?
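One way to see why connectivity-based methods handle non-convex clusters: on a neighborhood graph, each cluster becomes a connected component, and the multiplicity of the Laplacian's zero eigenvalue equals the number of components. A small illustration with two concentric rings (the data and the ε threshold are my own choices):

```python
import numpy as np

# two concentric rings: non-convex clusters that defeat centroid methods
t = np.linspace(0, 2 * np.pi, 20, endpoint=False)
ring1 = np.c_[np.cos(t), np.sin(t)]          # radius 1
ring2 = 5 * np.c_[np.cos(t), np.sin(t)]      # radius 5
X = np.vstack([ring1, ring2])

# epsilon-neighborhood graph: connect points closer than eps = 2
sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = (sq < 2.0 ** 2).astype(float)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W               # graph Laplacian

# multiplicity of eigenvalue 0 = number of connected components
n_zero = int((np.linalg.eigvalsh(L) < 1e-8).sum())
```

Each ring forms one component (neighbors within a ring are closer than ε, while the rings are at least distance 4 apart), so the Laplacian has exactly two zero eigenvalues and the corresponding eigenvectors indicate the rings.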

General Spectral Clustering: from data to pairwise similarities.

Representation. Partition matrix X (pixels × segments): X = [x_1, …, x_K], where x_l is the binary indicator vector of segment l. Pairwise similarity matrix W. Degree matrix D = diag(d_1, …, d_n). Laplacian matrix L = D − W.

A Spectral Clustering Algorithm (Ng, Jordan, and Weiss 2003). Given a set of points S = {s_1, …, s_n}:
1. Form the affinity matrix A, e.g., A_ij = exp(−‖s_i − s_j‖²/(2σ²)) with A_ii = 0.
2. Define the diagonal matrix D with D_ii = Σ_k A_ik.
3. Form the matrix L = D^{−1/2} A D^{−1/2}.
4. Stack the k largest eigenvectors of L to form the columns of a new matrix X.
5. Renormalize each row of X to unit length, giving a new matrix Y.
6. Cluster the rows of Y as points in R^k.
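A sketch of the procedure above (my own implementation with a simple deterministic k-means, not the authors' code; `njw_spectral_clustering` is a name I chose):

```python
import numpy as np

def njw_spectral_clustering(S, k, sigma=1.0, n_iter=50):
    """Ng-Jordan-Weiss-style clustering: Gaussian affinity, row-normalized
    top-k eigenvectors of D^{-1/2} A D^{-1/2}, then k-means on the rows."""
    sq = ((S[:, None] - S[None, :]) ** 2).sum(-1)     # pairwise squared distances
    A = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    d = A.sum(axis=1)
    L = A / np.sqrt(np.outer(d, d))                   # D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(L)
    X = vecs[:, -k:]                                  # k largest eigenvectors as columns
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length rows
    # deterministic farthest-point initialization, then Lloyd's k-means
    idx = [0]
    for _ in range(1, k):
        dist = ((Y[:, None] - Y[idx]) ** 2).sum(-1).min(axis=1)
        idx.append(int(dist.argmax()))
    C = Y[idx]
    for _ in range(n_iter):
        labels = ((Y[:, None] - C[None]) ** 2).sum(-1).argmin(axis=1)
        C = np.array([Y[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                      for j in range(k)])
    return labels
```

The row normalization in step 5 is what makes the final k-means robust: even when the top eigenvectors are an arbitrary rotation within their eigenspace, the normalized rows of each cluster collapse to nearly the same point on the unit sphere.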

Spectral clustering vs. k-means.

Why does it work? It is k-means in the spectral space!

Eigenvectors and blocks. Block matrices have block eigenvectors: e.g., a matrix of two all-ones 2×2 blocks has eigenvalues λ1 = λ2 = 2 and λ3 = λ4 = 0, and the eigensolver returns leading eigenvectors (.71, .71, 0, 0) and (0, 0, .71, .71), each supported on one block. Near-block matrices (off-block entries perturbed to about ±.2) have near-block eigenvectors: λ1 = λ2 = 2.02, λ3 = λ4 = −0.02, with leading eigenvectors approximately (.71, .69, .14, −.14) and (−.14, .14, .69, .71).
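The block-eigenvector claim is easy to verify numerically. A small check (my perturbation E is illustrative, not necessarily the slide's exact matrix):

```python
import numpy as np

# exact block matrix: two 2x2 all-ones blocks
M = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [0., 0., 1., 1.]])
vals, vecs = np.linalg.eigh(M)       # ascending eigenvalues: [0, 0, 2, 2]
# the top eigenspace is spanned by the two block indicators
# (.71, .71, 0, 0) and (0, 0, .71, .71), up to rotation within the eigenspace

# near-block matrix: perturb the off-diagonal blocks slightly
E = 0.2 * np.array([[0., 0., 1., -1.],
                    [0., 0., -1., 1.],
                    [1., -1., 0., 0.],
                    [-1., 1., 0., 0.]])
vals2 = np.linalg.eigvalsh(M + E)    # eigenvalues shift only slightly
```

By Weyl's inequality the eigenvalues of M + E differ from those of M by at most the spectral norm of E (here 0.4), which is the sense in which "near-block matrices have near-block eigenvectors".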

Spectral Space. Items can be put into blocks using the eigenvectors: plot each item i at coordinates (e1_i, e2_i) given by the two leading eigenvectors, and the blocks separate. The clusters are clear regardless of row ordering, since permuting the rows of the matrix permutes the eigenvector entries in the same way.

More formally … recall the generalized Ncut objective over a partition matrix X (pixels × segments): minimizing it is equivalent to spectral clustering.

Spectral Clustering. Algorithms that cluster points using eigenvectors of matrices derived from the data, obtaining a data representation in a low-dimensional space that can be easily clustered. A variety of methods use the eigenvectors differently (we have seen an example), and the approach is empirically very successful. Authors disagree on which eigenvectors to use and how to derive clusters from these eigenvectors. Two general methods follow.

Method #1: partition using only one eigenvector at a time, and apply the procedure recursively. Example: image segmentation using the second-smallest eigenvector to define the optimal cut; each cut recursively generates two clusters.

Method #2: use k eigenvectors (k chosen by the user) and directly compute a k-way partitioning. Experimentally this has been observed to be better.

Toy examples. Images from Matthew Brand (TR-2002-42).

User’s Prerogative. Choice of k, the number of clusters. Choice of the scaling factor σ: realistically, search over a range of σ and pick the value that gives the tightest clusters. Choice of clustering method: k-way or recursive bipartite. Choice of the kernel for the affinity matrix.
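The "search over σ and keep the tightest clusters" heuristic can be sketched as below; `tightness` (within-cluster sum of squares) and `best_sigma` are my own names, and any of the clustering routines above could be plugged in as `cluster_fn`:

```python
import numpy as np

def tightness(S, labels):
    """Within-cluster sum of squared distances to centroids (lower = tighter)."""
    return sum(((S[labels == j] - S[labels == j].mean(axis=0)) ** 2).sum()
               for j in np.unique(labels))

def best_sigma(S, k, sigmas, cluster_fn):
    """Cluster at each candidate sigma and keep the sigma whose clustering
    is tightest. cluster_fn(S, k, sigma) -> labels is any spectral routine."""
    scores = {s: tightness(S, cluster_fn(S, k, s)) for s in sigmas}
    return min(scores, key=scores.get)
```

This is a crude model-selection loop; other tightness measures (e.g., eigengap-based criteria) are also used in practice.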

Conclusions. Good news: simple and powerful methods to segment images, flexible and easy to apply to other clustering problems. Bad news: high memory requirements (use sparse matrices), and results are very dependent on the scale factor σ for a specific problem.