1
Spectral Clustering Advisor: 王聖智 (S. J. Wang) Student: 羅介暐 (Jie-Wei Luo)
2
Outline Motivation Graph overview Spectral Clustering Another point of view Conclusion
3
Motivation K-means performs very poorly in this space because the dataset exhibits complex cluster shapes.
4
K-means
5
Spectral Clustering [Figure: scatter plot of a 2D data set, clustered by k-means vs. spectral clustering.] U. von Luxburg. A tutorial on spectral clustering. Technical report, Max Planck Institute for Biological Cybernetics, Germany, 2007.
6
Graph overview Graph Partitioning Graph notation Graph Cut Distance and Similarity
7
Graph partitioning First: a graph representation of the data. Then: graph partitioning. In this talk: mainly how to find a good partitioning of a given graph using spectral properties of that graph.
8
Graph notation Always assume that the similarities s_ij are symmetric and non-negative; then the graph is undirected and can be weighted.
9
Graph notation Degree of vertex v_i ∈ V: d_i = Σ_j w_ij, the sum of the weights of the edges incident to v_i. The degree matrix D is the diagonal matrix with d_1, ..., d_n on the diagonal.
10
Graph Cuts Mincut: minimize cut(A_1, A_2). However, mincut often simply separates one individual vertex from the rest of the graph. Balanced cuts (RatioCut, Ncut) fix this. Problem: finding an optimal balanced (normalized) graph cut is NP-hard. Approximation: spectral graph partitioning. The standard objectives are sketched below.
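A sketch of the standard cut objectives, following the von Luxburg tutorial that this deck cites; here |A| denotes the number of vertices in A and vol(A) = Σ_{i∈A} d_i:
\[ \mathrm{cut}(A_1, A_2) = \sum_{i \in A_1,\; j \in A_2} w_{ij}, \qquad \mathrm{RatioCut}(A_1, A_2) = \frac{\mathrm{cut}(A_1, A_2)}{|A_1|} + \frac{\mathrm{cut}(A_1, A_2)}{|A_2|}, \qquad \mathrm{Ncut}(A_1, A_2) = \frac{\mathrm{cut}(A_1, A_2)}{\mathrm{vol}(A_1)} + \frac{\mathrm{cut}(A_1, A_2)}{\mathrm{vol}(A_2)}. \]
The balancing denominators are what make the relaxed problems lead to eigenvectors of the graph Laplacians used later in the deck.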
16
Spectral Clustering Unnormalized graph Laplacian Normalized graph Laplacian Other explanation Example
17
Spectral clustering - main algorithm Input: similarity matrix S, number k of clusters to construct.
1. Build the similarity graph.
2. Compute the first k eigenvectors v_1, ..., v_k of the matrix L (for unnormalized spectral clustering) or L_rw (for normalized spectral clustering).
3. Build the matrix V ∈ R^{n×k} with the eigenvectors as columns.
4. Interpret the rows of V as new data points z_i ∈ R^k.
5. Cluster the points z_i with the k-means algorithm in R^k.
A code sketch follows.
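A minimal sketch of this algorithm in Python/NumPy (my own illustration, not code from the slides); S is assumed to be a precomputed symmetric similarity matrix, and the normalized variant solves the generalized eigenproblem L v = λ D v, whose eigenvectors are exactly those of L_rw:

import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(S, k, normalized=True):
    # S: symmetric, non-negative similarity matrix (the weighted adjacency matrix W)
    W = np.asarray(S, dtype=float)
    d = W.sum(axis=1)                  # vertex degrees
    D = np.diag(d)
    L = D - W                          # unnormalized graph Laplacian
    if normalized:
        # generalized problem L v = lambda D v  <=>  L_rw v = lambda v
        vals, vecs = eigh(L, D)
    else:
        vals, vecs = eigh(L)
    V = vecs[:, :k]                    # first k eigenvectors as columns
    # rows of V are the new data points z_i in R^k; cluster them with k-means
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)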
18
Example-1 Similarity graph: vertices {1, 2, 3} form a triangle; vertices 4 and 5 are joined by a single edge.
W (adjacency matrix):
 0 1 1 0 0
 1 0 1 0 0
 1 1 0 0 0
 0 0 0 0 1
 0 0 0 1 0
D (degree matrix): diag(2, 2, 2, 1, 1)
L = D - W (Laplacian matrix):
  2 -1 -1  0  0
 -1  2 -1  0  0
 -1 -1  2  0  0
  0  0  0  1 -1
  0  0  0 -1  1
19
Example-1 Because the similarity graph has two connected components ({1, 2, 3} and {4, 5}), L has a double zero eigenvalue. The first two eigenvectors are the indicator vectors of the components, written as columns:
 v1 v2
  1  0
  1  0
  1  0
  0  1
  0  1
20
Example-1 Stack the first k eigenvectors as columns of V; the rows of V give the new clustering space, mapping vertex i to y_i = (v1(i), v2(i)):
y1 = (1, 0), y2 = (1, 0), y3 = (1, 0), y4 = (0, 1), y5 = (0, 1)
Use k-means clustering in this new space.
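Checking Example-1 numerically with NumPy (my own verification, not from the slides): the Laplacian of this 5-node graph has a double zero eigenvalue, and the rows of the first two eigenvectors fall into exactly two groups, {1, 2, 3} and {4, 5}:

import numpy as np

W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W      # L = D - W

vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
print(np.round(vals, 3))            # [0. 0. 2. 3. 3.] -> double zero eigenvalue
print(np.round(vecs[:, :2], 3))     # rows 1-3 identical, rows 4-5 identical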
21
Unnormalized graph Laplacian Defined as L = D - W; the key property behind the proof on this slide is sketched below.
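The property the slide's proof presumably establishes is the standard one from the cited tutorial: for every vector f ∈ R^n,
\[ f^{\top} L f = f^{\top} D f - f^{\top} W f = \tfrac{1}{2} \sum_{i,j=1}^{n} w_{ij}\,(f_i - f_j)^2 \;\ge\; 0, \]
so L is symmetric and positive semi-definite, and L·1 = 0, i.e. the constant one vector is always an eigenvector with eigenvalue 0.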
22
Unnormalized graph Laplacian Relation between spectrum and clusters (with proof): the multiplicity k of the eigenvalue 0 equals the number k of connected components A_1, ..., A_k of the graph, and the corresponding eigenspace is spanned by the characteristic (indicator) vectors 1_{A_1}, ..., 1_{A_k} of those components (so all eigenvectors of eigenvalue 0 are piecewise constant).
23
Unnormalized graph Laplacian Interpretation: if we set s_ij = 1 / d(x_i, x_j)^2, then applying L to a function on the vertices looks like a discrete version of the standard Laplace operator (see the sketch below).
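A minimal way to make the "discrete Laplace operator" remark concrete, assuming w_ij = s_ij = 1 / d(x_i, x_j)^2 as on the slide: for a function f on the vertices,
\[ (Lf)_i = d_i f_i - \sum_j w_{ij} f_j = \sum_j w_{ij}\,(f_i - f_j), \]
i.e. (Lf)_i measures how much f at x_i deviates from a similarity-weighted average of its neighbours, the discrete analogue of (minus) the continuous Laplacian.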
24
Normalized graph Laplacian Define L_sym = D^{-1/2} L D^{-1/2} and L_rw = D^{-1} L. Relation between L_sym and L_rw: they have the same eigenvalues λ; u is an eigenvector of L_rw iff D^{1/2} u is an eigenvector of L_sym (for the same λ).
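The eigenvalue/eigenvector correspondence summarized in the slide's table follows from a one-line substitution; a sketch:
\[ L_{\mathrm{rw}} u = \lambda u \;\Longleftrightarrow\; D^{-1} L u = \lambda u \;\Longleftrightarrow\; D^{-1/2} L D^{-1/2}\,(D^{1/2} u) = \lambda\,(D^{1/2} u) \;\Longleftrightarrow\; L_{\mathrm{sym}}\,(D^{1/2} u) = \lambda\,(D^{1/2} u). \]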
25
Normalized graph Laplacian Spectral properties similar to L: positive semi-definite, smallest eigenvalue is 0. Attention: for L_rw the eigenspace of 0 is spanned by the indicators 1_{A_i} (piecewise constant), but for L_sym it is spanned by D^{1/2} 1_{A_i} (not piecewise constant).
26
Random walk explanation General observation: a random walk on the graph has transition matrix P = D^{-1} W; note that L_rw = I - P. Specific observation about Ncut: define P(A|B) as the probability of jumping from B to A, assuming the random walk starts in the stationary distribution. Then Ncut(A, B) = P(A|B) + P(B|A). Interpretation: spectral clustering tries to construct groups such that the random walk stays within the same group as long as possible.
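The same statement in the notation of the cited tutorial, assuming the walk starts from its stationary distribution π (with π_i = d_i / vol(V)) and writing Ā for the complement of A:
\[ P = D^{-1} W, \qquad L_{\mathrm{rw}} = I - P, \qquad \mathrm{Ncut}(A, \bar{A}) = P(\bar{A} \mid A) + P(A \mid \bar{A}), \]
where P(B | A) is the probability that the walk, currently in A under π, jumps to B in the next step; minimizing Ncut therefore looks for a partition the random walk rarely crosses.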
27
Possible Explanations
28
Example-2 In the embedded space given by two leading eigenvectors, clusters are trivial to separate.
29
Example-3 In the embedded space given by three leading eigenvectors, clusters are trivial to separate.
30
Another point of view
33
Connections
34
PCA Linear combinations of the original data X_i give the new variables Z_i.
35
Dimension-reduction comparison between PCA and the Laplacian eigenmap PCA reduces dimension via linear combinations and minimizes the reconstruction error, but that is not necessarily helpful for separating clusters. Spectral clustering reduces dimension nonlinearly in a way that does help clustering; however, it has no explicit "rank-reduce function" that can be applied to new data, whereas PCA does. A toy comparison is sketched below.
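To illustrate this comparison concretely (my own toy example using scikit-learn, not from the slides): on a two-moons dataset, the best one-dimensional PCA projection still mixes the two clusters, while spectral clustering separates them; on the other hand, only PCA provides an explicit projection that can be applied to new points:

import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, SpectralClustering

X, y = make_moons(n_samples=300, noise=0.05, random_state=0)

# PCA: linear dimension reduction that minimizes reconstruction error,
# but the 1-D projection mixes the two moons, so k-means on it does poorly.
pca = PCA(n_components=1)
Z = pca.fit_transform(X)
km = KMeans(n_clusters=2, n_init=10).fit_predict(Z)

# Spectral clustering: nonlinear embedding via graph eigenvectors separates the moons,
# but there is no explicit map (like pca.transform) to embed previously unseen points.
sc = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                        n_neighbors=10, random_state=0).fit_predict(X)

agree = lambda a, b: max(np.mean(a == b), np.mean(a != b))  # handle label swap
print("k-means on PCA projection:", agree(km, y))
print("spectral clustering:      ", agree(sc, y))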
36
Conclusion Why is spectral clustering useful? It does not make strong assumptions on cluster shape; it is simple to implement (solving an eigenproblem); the spectral clustering objective does not have local optima; it has several different derivations; it is successful in many applications. What are potential problems? It can be sensitive to the choice of parameters (e.g., k in the kNN graph), and it is computationally expensive on large non-sparse graphs.