Diffusion Maps and Spectral Clustering


1 Diffusion Maps and Spectral Clustering
Machine Learning Seminar Series. Authors: Ronald R. Coifman et al. (Yale University). Presenter: Nilanjan Dasgupta (SIG Inc.).

2 Motivation
[Figure: data points ("datum" markers) lying on a low-dimensional manifold, plotted against X, Y, Z axes.]
Data lie on a low-dimensional manifold whose shape is not known a priori. PCA fails to give a compact representation because the manifold is not linear. Spectral clustering serves as a non-linear dimensionality reduction scheme.

3 Outline
Non-linear dimensionality reduction and spectral clustering.
Diffusion-based probabilistic interpretation of spectral methods.
The eigenvectors of the normalized graph Laplacian are discrete approximations of the eigenfunctions of the continuous Fokker-Planck operator.
Justification of the success of spectral clustering.
Conclusions.

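4 Spectral clustering
Normalized graph Laplacian: given N data points $\{x_i\}_{i=1}^{N}$, where each $x_i \in \mathbb{R}^p$, the distance (similarity) between any two points $x_i$ and $x_j$ is given by
$$L_{i,j} = k(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\varepsilon}\right),$$
with a Gaussian kernel of width $\varepsilon$, and a diagonal normalization matrix $D_{i,i} = \sum_j L_{i,j}$. Solve the normalized eigenvalue problem $L\psi = \lambda D\psi$, or equivalently $M\psi = \lambda\psi$ with $M = D^{-1}L$. Use the first few eigenvectors of M as a low-dimensional representation of the data, or as good coordinates for clustering.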
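A minimal NumPy sketch of the construction on this slide. It assumes the Gaussian kernel has the form exp(-||x_i - x_j||^2 / (2ε)) and that M = D^{-1}L; the function names, the toy two-cluster data, and the value of ε are illustrative assumptions, not taken from the presentation.

```python
import numpy as np

def normalized_graph_matrix(X, eps):
    """Return (L, d, M) for points X of shape (N, p).

    L is the Gaussian kernel matrix (assumed form exp(-||xi - xj||^2 / (2*eps))),
    d holds the diagonal of D (row sums of L), and M = D^{-1} L.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq_dists / (2.0 * eps))
    d = L.sum(axis=1)
    M = L / d[:, None]  # each row of M sums to 1
    return L, d, M

def spectral_coordinates(M, k):
    """Return the k leading non-trivial eigenvalues and right eigenvectors of M."""
    # M is not symmetric, so use the general solver. Its eigenvalues are real
    # in exact arithmetic; round-off imaginary parts are discarded.
    evals, evecs = np.linalg.eig(M)
    order = np.argsort(-evals.real)
    evals = evals.real[order]
    evecs = evecs.real[:, order]
    # Index 0 is the trivial pair (eigenvalue 1, constant eigenvector).
    return evals[1:k + 1], evecs[:, 1:k + 1]

# Toy data (hypothetical): two tight clusters in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
               rng.normal(2.0, 0.1, (20, 2))])

L, d, M = normalized_graph_matrix(X, eps=0.5)
evals, coords = spectral_coordinates(M, k=1)

# With two well-separated clusters, the sign of the first non-trivial
# eigenvector is enough to split them.
labels = (coords[:, 0] > 0).astype(int)
```

For more than two clusters one would typically keep several eigenvectors and run a standard clustering method (k-means, for example) on the rows of `coords`.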

5 Spectral Clustering : previous work
Non-linear dimensionality analysis by S. Roweis and L. Saul (published in Science, 2000).
Belkin & Niyogi (NIPS'02) show that if the data are sampled uniformly from the low-dimensional manifold, the first few eigenvectors of $M = D^{-1}L$ are discrete approximations of the eigenfunctions of the Laplace-Beltrami operator on the manifold.
Meila & Shi (AISTATS'01) interpret M as a stochastic matrix representing a random walk on the graph.

6 Diffusion distance and Diffusion map
A symmetric matrix $M_s$ can be derived from M as $M_s = D^{1/2} M D^{-1/2}$. M and $M_s$ have the same N eigenvalues $\{\lambda_j\}_{j=0}^{N-1}$. Under the random-walk representation of the graph M:
$\phi$: left eigenvector of M
$\psi$: right eigenvector of M
$\varepsilon$: time step
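The symmetric form is convenient numerically, because a symmetric eigensolver can be used and the left and right eigenvectors of M recovered by rescaling. The sketch below assumes M_s = D^{1/2} M D^{-1/2} (equivalently D^{-1/2} L D^{-1/2}), with left eigenvectors D^{1/2}v and right eigenvectors D^{-1/2}v for each eigenvector v of M_s. The function name, the random test data, and the kernel width are illustrative assumptions.

```python
import numpy as np

def left_right_eigenvectors(L):
    """Eigen-decompose M = D^{-1} L through its symmetric conjugate.

    Returns (evals, phi, psi): eigenvalues in descending order, with the
    left eigenvectors of M as columns of phi and the right eigenvectors
    as columns of psi.
    """
    d = L.sum(axis=1)
    d_sqrt = np.sqrt(d)
    Ms = L / np.outer(d_sqrt, d_sqrt)   # D^{-1/2} L D^{-1/2}, symmetric
    evals, V = np.linalg.eigh(Ms)       # eigh returns ascending order
    order = np.argsort(-evals)
    evals, V = evals[order], V[:, order]
    phi = V * d_sqrt[:, None]           # left eigenvectors:  D^{1/2} v
    psi = V / d_sqrt[:, None]           # right eigenvectors: D^{-1/2} v
    return evals, phi, psi

# Hypothetical test data: 30 random points in 3-D, kernel width eps = 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
L = np.exp(-sq_dists / (2.0 * 1.0))
M = L / L.sum(axis=1, keepdims=True)

evals, phi, psi = left_right_eigenvectors(L)

# Residuals of the right and left eigenvector equations (both near zero).
right_residual = np.abs(M @ psi - psi * evals).max()
left_residual = np.abs(phi.T @ M - evals[:, None] * phi.T).max()
```

With this scaling the two families are biorthonormal, that is `phi.T @ psi` is the identity matrix up to round-off.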

7 Diffusion distance and Diffusion map
$\varepsilon$ has a dual interpretation: it is both the time step and the kernel width. If one starts a random walk from location $x_i$, the probability of landing at location y after r time steps is given by $p(r, y \mid x_i) = e_i M^r$, where $e_i$ is a row vector of zeros except for a 1 in the ith position. For large $\varepsilon$, all points in the graph are connected ($M_{i,j} > 0$) and the eigenvalues of M satisfy $1 = \lambda_0 > \lambda_1 \ge \dots \ge \lambda_{N-1} \ge 0$.

8 Diffusion distance and Diffusion map
One can show that, regardless of the starting point $x_i$, $\lim_{r \to \infty} p(r, y \mid x_i) = \phi_0(y)$, where $\phi_0$ is the left eigenvector of M with eigenvalue $\lambda_0 = 1$, with $\phi_0(x_i) = D_{i,i} / \sum_j D_{j,j}$. The eigenvector $\phi_0(x)$ has a dual interpretation:
1. It is the stationary probability distribution on the curve, i.e., the probability of landing at location x after taking infinitely many steps of the random walk (independent of the start location).
2. It is the density estimate at location x.
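The convergence to a stationary distribution can be checked numerically. The sketch below assumes the r-step distribution of a walk started at x_i is the ith row of M^r (the row vector e_i times M^r) and that the stationary distribution is the normalized vector of row sums of the kernel matrix. The data, kernel width, and helper name are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: 25 points in the unit square, Gaussian kernel of width
# eps = 0.5 (wide enough that every pair of points is well connected).
rng = np.random.default_rng(2)
X = rng.uniform(size=(25, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
L = np.exp(-sq_dists / (2.0 * 0.5))
d = L.sum(axis=1)
M = L / d[:, None]

# Candidate stationary distribution: normalized row sums of L.
phi0 = d / d.sum()

def walk_distribution(M, i, r):
    """Distribution over nodes after r steps of a walk started at node i."""
    e_i = np.zeros(M.shape[0])
    e_i[i] = 1.0
    return e_i @ np.linalg.matrix_power(M, r)

# Two different start nodes end up with the same long-run distribution.
p_from_0 = walk_distribution(M, 0, 200)
p_from_7 = walk_distribution(M, 7, 200)
gap = np.abs(p_from_0 - p_from_7).max()
```

Because `phi0` is proportional to the row sums of the kernel matrix, it is large where many points lie close together, which is the sense in which it acts as a density estimate.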

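9 Diffusion distance
For any finite time r,
$$p(r, y \mid x) = \phi_0(y) + \sum_{k \ge 1} \psi_k(x)\, \lambda_k^r\, \phi_k(y),$$
where $\psi_k$ and $\phi_k$ are the right and left eigenvectors of the graph Laplacian M, and $\lambda_k^r$ is the kth eigenvalue of $M^r$ (arranged in descending order). Given the definition of the random walk, we define the diffusion distance as a distance measure at time t between two pmfs:
$$D_t^2(x_0, x_1) = \| p(t, y \mid x_0) - p(t, y \mid x_1) \|_w^2 = \sum_y \left( p(t, y \mid x_0) - p(t, y \mid x_1) \right)^2 w(y),$$
with the empirical choice $w(y) = 1/\phi_0(y)$.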

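10 Diffusion Map
Diffusion distance: $D_t^2(x_0, x_1) = \| p(t, y \mid x_0) - p(t, y \mid x_1) \|_w^2$.
Diffusion map: a mapping between the original space and the first k eigenvectors, $\Psi_t(x) = \left( \lambda_1^t \psi_1(x), \lambda_2^t \psi_2(x), \dots, \lambda_k^t \psi_k(x) \right)$.
Relationship: $D_t^2(x_0, x_1) = \sum_{j \ge 1} \lambda_j^{2t} \left( \psi_j(x_0) - \psi_j(x_1) \right)^2 = \| \Psi_t(x_0) - \Psi_t(x_1) \|^2$.
This relationship justifies using the Euclidean distance in diffusion-map space for spectral clustering. Since $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{N-1} \ge 0$, it is justified to stop at an appropriate k with a negligible error of order $O((\lambda_{k+1}/\lambda_k)^t)$.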
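The relationship between diffusion distance and Euclidean distance in diffusion-map space can be verified on a small example. The sketch below assumes the weight w(y) = 1/φ_0(y) with φ_0 normalized to sum to 1, and right eigenvectors scaled so that ψ_0 is the constant vector 1; under that normalization the two distances should agree when all N-1 non-trivial eigenvectors are kept. The data, kernel width, time t, and function names are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: 20 points in the unit square, kernel width eps = 0.05.
rng = np.random.default_rng(3)
X = rng.uniform(size=(20, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
L = np.exp(-sq_dists / (2.0 * 0.05))
d = L.sum(axis=1)
M = L / d[:, None]
phi0 = d / d.sum()                      # stationary distribution

# Right eigenvectors of M via the symmetric matrix D^{-1/2} L D^{-1/2}.
d_sqrt = np.sqrt(d)
evals, V = np.linalg.eigh(L / np.outer(d_sqrt, d_sqrt))
order = np.argsort(-evals)
evals, V = evals[order], V[:, order]
# Scaling chosen so that psi[:, 0] is constant with magnitude 1.
psi = V * np.sqrt(d.sum()) / d_sqrt[:, None]

def diffusion_map(t, k):
    """Rows are the k-dimensional diffusion-map coordinates at time t."""
    return psi[:, 1:k + 1] * evals[1:k + 1] ** t

def diffusion_distance_sq(t, i, j):
    """Squared diffusion distance computed directly from the rows of M^t."""
    Mt = np.linalg.matrix_power(M, t)
    diff = Mt[i] - Mt[j]
    return np.sum(diff ** 2 / phi0)     # weight w(y) = 1 / phi0(y)

t = 3
full = diffusion_map(t, k=len(X) - 1)   # all non-trivial eigenvectors
trunc = diffusion_map(t, k=3)           # truncated map

d_exact = diffusion_distance_sq(t, 0, 1)
d_full = np.sum((full[0] - full[1]) ** 2)
d_trunc = np.sum((trunc[0] - trunc[1]) ** 2)
```

Here `d_full` should match `d_exact` up to round-off, while `d_trunc` can only be smaller, since truncation drops non-negative terms from the sum.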

11 Asymptotics of Diffusion Map
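Suppose $\{x_i\}$ are sampled i.i.d. from a probability density p(x) defined over a manifold $\Omega$.
[Figure: the manifold plotted against X, Y, Z axes.]
Suppose $p(x) = e^{-U(x)}$, where U(x) is the potential (energy) at location x. As $N \to \infty$, the random walk on the discrete graph converges to a random walk on the continuous manifold $\Omega$. The forward and backward operators are given by
$$T_f[\phi](x) = \int_\Omega M(x \mid y)\, \phi(y)\, p(y)\, dy, \qquad T_b[\psi](x) = \int_\Omega M(y \mid x)\, \psi(y)\, p(y)\, dy,$$
where $M(x \mid y)$ is the transition probability from y to x in one time step $\varepsilon$.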

12 Asymptotics of Diffusion Map
$T_f[\phi]$ is the probability distribution after one time step $\varepsilon$, where $\phi(x)$ is the probability distribution on the graph at t = 0. $T_b[\psi](x)$ is the mean of the function $\psi$ after one time step $\varepsilon$, for a random walk that started at location x at time t = 0. Consider the limit $\varepsilon \to 0$, i.e., when each data point has infinitely many nearby neighbors. In that limit, the random walk converges to a diffusion process whose probability density evolves continuously in time as
$$\frac{\partial p(x, t)}{\partial t} = \lim_{\varepsilon \to 0} \frac{p(x, t + \varepsilon) - p(x, t)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{T_f - I}{\varepsilon}\, p(x, t).$$

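13 Fokker-Planck operator
Infinitesimal generators (propagators):
$$H_f = \lim_{\varepsilon \to 0} \frac{T_f - I}{\varepsilon}, \qquad H_b = \lim_{\varepsilon \to 0} \frac{T_b - I}{\varepsilon}.$$
The eigenfunctions of $T_f$ and $T_b$ converge to those of $H_f$ and $H_b$, respectively. The backward generator is given by the Fokker-Planck operator
$$H_b \psi = \Delta \psi - 2 \nabla \psi \cdot \nabla U,$$
which corresponds to a diffusion process in a potential field 2U(x).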

14 Spectral clustering and Fokker-Planck operator
The term $2 \nabla \psi \cdot \nabla U$ is interpreted as a drift term towards low potential (higher data density). The left and right eigenvectors of M can be viewed as discrete approximations of the eigenfunctions of $T_f$ and $T_b$, respectively. $T_f$ and $T_b$ can in turn be viewed as approximations to $H_f$ and $H_b$, which in the asymptotic case ($\varepsilon \to 0$) describe a diffusion process with potential 2U(x), where $p(x) = \exp(-U(x))$.

