Metric Learning for Clustering


1 Metric Learning for Clustering
Jianping Fan CS Department UNC-Charlotte

2 Problems of K-means
Inter-cluster distances are maximized
Intra-cluster distances are minimized
Distance Function
Optimization Step
Assignment Step
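The two alternating steps of K-means named on this slide can be sketched as below. This is a minimal illustration in plain Python that assumes squared Euclidean distance as the distance function; the sample points and initial centers are made up for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance (the assumed distance function)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Assignment step: index of the nearest center for every point."""
    return [min(range(len(centers)),
                key=lambda c: squared_distance(p, centers[c]))
            for p in points]

def update(points, labels, centers):
    """Optimization step: move each center to the mean of its points."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centers.append(old)  # leave an empty cluster's center as-is
            continue
        new_centers.append(tuple(sum(m[k] for m in members) / len(members)
                                 for k in range(len(old))))
    return new_centers

def kmeans(points, centers, iterations=10):
    """Alternate the two steps for a fixed number of iterations."""
    for _ in range(iterations):
        labels = assign(points, centers)
        centers = update(points, labels, centers)
    return assign(points, centers), centers

# Hypothetical toy data: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels, centers = kmeans(points, centers=[(0.0, 0.0), (5.0, 5.0)])
```

Because both steps depend only on `squared_distance`, swapping in a learned distance changes the clustering without changing the loop.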

3 Space Transformation for Clustering (via similarity or distance function)

4 Data Transformation for Clustering
Traditional distance function
Weighted distance function
Equivalent to first applying the linear transformation y = Ax, then using Euclidean distance in the new space of y’s
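The equivalence stated on this slide can be checked numerically. The sketch below assumes the weighted distance uses the weight matrix AᵀA, so that (x − z)ᵀ AᵀA (x − z) equals the squared Euclidean distance between Ax and Az; the matrix and points are arbitrary example values.

```python
def matvec(A, x):
    """Matrix-vector product y = Ax."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def euclidean_sq(u, v):
    """Squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def gram(A):
    """Weight matrix M = A^T A built from the linear map A."""
    cols = list(zip(*A))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols]
            for ci in cols]

def weighted_sq(x, z, M):
    """Weighted squared distance (x - z)^T M (x - z)."""
    diff = [a - b for a, b in zip(x, z)]
    return sum(d * md for d, md in zip(diff, matvec(M, diff)))

# Arbitrary example values.
A = [[2.0, 0.0], [1.0, 1.0]]
x, z = [1.0, 2.0], [3.0, 1.0]

lhs = weighted_sq(x, z, gram(A))                 # weighted distance in x-space
rhs = euclidean_sq(matvec(A, x), matvec(A, z))   # Euclidean distance in y-space
```

Both `lhs` and `rhs` evaluate to the same number, which is the sense in which learning a weighted distance is the same as learning a linear transformation of the data.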

5 Data Transformation for Clustering
Objective Function

6 Distance Function for KNN
Consider a KNN classifier: for each query point x, we want the K nearest neighbors of the same class to become closer to x in the new metric

7 Distance Function for KNN
Convex objective function (SDP)
Penalize large distances between each input and its target neighbors
Penalize small distances between each input and all other points of a different class
Points from different classes are separated by a large margin
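The slide does not show the formula, so the following is only a sketch of a loss with the two penalties it describes, written in the style of large-margin nearest neighbor methods. The hinge form, the unit `margin`, and the trade-off weight `c` are assumptions, and the loss is evaluated for a given linear map `L` rather than solved as an SDP.

```python
def sq_dist(x, z, L):
    """Squared distance after the linear map L: ||L(x - z)||^2."""
    diff = [a - b for a, b in zip(x, z)]
    proj = [sum(l * d for l, d in zip(row, diff)) for row in L]
    return sum(p * p for p in proj)

def large_margin_loss(X, y, targets, L, c=1.0, margin=1.0):
    """Pull term plus c times a hinge-based push term.

    targets[i] lists the indices of the same-class target neighbors of X[i].
    """
    pull = 0.0
    push = 0.0
    for i, neighbors in enumerate(targets):
        for j in neighbors:
            d_ij = sq_dist(X[i], X[j], L)
            pull += d_ij  # penalize large distances to target neighbors
            for k in range(len(X)):
                if y[k] != y[i]:
                    # penalize differently labeled points that come within
                    # the margin of the target-neighbor distance
                    push += max(0.0, margin + d_ij - sq_dist(X[i], X[k], L))
    return pull + c * push

# Hypothetical toy data: two points of class 0, one point of class 1.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
y = [0, 0, 1]
targets = [[1], [0], []]

identity = [[1.0, 0.0], [0.0, 1.0]]
stretch = [[1.0, 0.0], [0.0, 2.0]]  # stretches the second axis

loss_identity = large_margin_loss(X, y, targets, identity)
loss_stretch = large_margin_loss(X, y, targets, stretch)
```

In this toy case, stretching the axis that separates the two classes pushes the differently labeled point outside the margin, so the loss under `stretch` is lower than under `identity`.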

8 Applications Image Clustering

9 Applications Image Clustering

10 Applications Image Clustering

11 image i, image j

12 image i, image j: d_ji,m

13 image i, image j, image k: D_ji = Σ_m w_j,m d_ji,m

14 image i, image j, image k: D_ji < D_ki

15 image i, image j, image k: D_ji < D_ki; w_j,m = ?
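The weighted sum on these slides can be sketched as follows. The example assumes that the comparison on the slides reads D_ji < D_ki, that D_ki is formed the same way from image k's own weights, and that the weights and elementary distances below are made-up values.

```python
def image_distance(weights, elementary):
    """D_ji = sum over m of w_j,m * d_ji,m."""
    return sum(w * d for w, d in zip(weights, elementary))

def triplet_satisfied(D_ji, D_ki):
    """True when image j is ranked closer to image i than image k is."""
    return D_ji < D_ki

# Hypothetical weights and elementary distances for three features (m = 1..3).
w_j = [0.5, 2.0, 0.0]
d_ji = [1.0, 0.25, 3.0]
w_k = [1.0, 1.0, 1.0]
d_ki = [2.0, 0.5, 1.0]

D_ji = image_distance(w_j, d_ji)
D_ki = image_distance(w_k, d_ki)
ok = triplet_satisfied(D_ji, D_ki)
```

Learning the weights w_j,m then amounts to choosing them so that as many such triplet comparisons as possible hold.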

16 Distance Function for Clustering
Distance function for multiple attributes

17 Distance Function for Clustering
Distance function for multiple attributes

18 Distance Function for Clustering
Distance function for multiple attributes

19 Distance Function for Clustering
Distance function for multiple attributes
Distance parameterized by a positive definite (p.d.) d × d matrix A
The similarity measure is the associated generalized inner product (kernel)
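The formulas on this slide did not survive extraction. A minimal sketch, assuming the usual forms d_A(x, z) = sqrt((x − z)ᵀ A (x − z)) for the distance and k_A(x, z) = xᵀ A z for the generalized inner product, shows how the two are related; the matrix and points are example values.

```python
import math

def inner_A(x, z, A):
    """Generalized inner product x^T A z."""
    Az = [sum(a * b for a, b in zip(row, z)) for row in A]
    return sum(a * b for a, b in zip(x, Az))

def distance_A(x, z, A):
    """Distance sqrt((x - z)^T A (x - z)) parameterized by A."""
    diff = [a - b for a, b in zip(x, z)]
    return math.sqrt(inner_A(diff, diff, A))

# Example symmetric positive definite matrix and two points.
A = [[2.0, 0.5], [0.5, 1.0]]
x, z = [1.0, 0.0], [0.0, 2.0]

d = distance_A(x, z, A)
# The same distance written through the inner product, for symmetric A:
via_kernel = math.sqrt(inner_A(x, x, A) - 2 * inner_A(x, z, A) + inner_A(z, z, A))
```

With A set to the identity matrix, `distance_A` reduces to the ordinary Euclidean distance.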

20 Distance Function for Clustering
Distance function for multiple attributes
Goal: keep all data points within the same class close, while separating data points from different classes
Formulate as a constrained convex programming problem:
minimize the distance between the data pairs in S (similar pairs),
subject to the data pairs in D (dissimilar pairs) being well separated
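The slide's own equation is not recoverable from the transcript. One common way to write such a program is given below as an assumption, not as the slide's exact formula; the constant 1 on the separation constraint is an arbitrary choice of scale.

```latex
\begin{aligned}
\min_{A} \quad & \sum_{(x_i, x_j) \in S} \lVert x_i - x_j \rVert_A^2 \\
\text{s.t.} \quad & \sum_{(x_i, x_j) \in D} \lVert x_i - x_j \rVert_A \ge 1, \\
& A \succeq 0,
\end{aligned}
\qquad \text{where } \lVert x_i - x_j \rVert_A = \sqrt{(x_i - x_j)^{\top} A \,(x_i - x_j)} .
```

The objective is linear in A and both constraints define convex sets, which is what makes the problem a convex program.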

21 Distance Function for Clustering
Distance function for multiple attributes
A is positive semi-definite
Ensures the non-negativity and the triangle inequality of the metric
The number of parameters is quadratic in the number of features
Difficult to scale to a large number of features
Simplify the computation
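To make the scaling point concrete, the sketch below counts the free parameters of a full symmetric d × d matrix and compares it with a diagonal matrix. Restricting A to a diagonal is only one possible simplification and is an assumption here; the slide does not say which simplification is intended.

```python
def full_matrix_parameters(d):
    """Free parameters of a symmetric d x d matrix: d(d + 1) / 2."""
    return d * (d + 1) // 2

def diagonal_parameters(d):
    """Free parameters when A is restricted to a diagonal matrix."""
    return d

def diagonal_distance_sq(x, z, weights):
    """(x - z)^T diag(weights) (x - z), valid as a metric when weights >= 0."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))

n_full = full_matrix_parameters(1000)
n_diag = diagonal_parameters(1000)
example = diagonal_distance_sq([1.0, 2.0], [0.0, 0.0], [2.0, 0.5])
```

For 1000 features the full matrix has 500500 free parameters against 1000 for the diagonal one, and keeping the diagonal weights non-negative is enough to keep the matrix positive semi-definite.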

22 Distance Function for Clustering
Distance function for multiple attributes The simplest mapping is a linear transformation

23 Distance Function for Clustering
Distance function for multiple attributes

24 Distance Function for Clustering
Distance function for multiple attributes

