1
Metric Learning for Clustering
Jianping Fan CS Department UNC-Charlotte
2
Problems of K-Means
Inter-cluster distances are maximized
Intra-cluster distances are minimized
Distance function
Optimization step
Assignment step
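The two steps named on this slide can be sketched as follows. This is a minimal illustration assuming plain Euclidean distance and Lloyd-style alternation; the function name `kmeans`, the initialisation from random data points, and the fixed iteration count are assumptions, not something the slide specifies.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd-style k-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct data points (an assumed choice).
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Optimization step: each centre moves to the mean of its points.
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:  # leave an empty cluster's centre unchanged
                centres[c] = members.mean(axis=0)
    return labels, centres
```

Because the assignment step relies entirely on the distance function, a poorly chosen distance gives poor clusters, which is the motivation for learning the metric.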
3
Space Transformation for Clustering (via similarity or distance function)
4
Data Transformation for Clustering
Traditional distance function
Weighted distance function
Equivalent to first applying the linear transformation y = Ax, then using Euclidean distance in the new space of y's
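The equivalence stated above can be checked numerically. The sketch below assumes that A is the transformation matrix in y = Ax, so the induced weight matrix is AᵀA; the function names are hypothetical.

```python
import numpy as np

def weighted_distance(x, z, A):
    """Distance induced by y = A x: sqrt((x - z)^T A^T A (x - z))."""
    diff = x - z
    M = A.T @ A  # induced weight matrix (positive semi-definite by construction)
    return float(np.sqrt(diff @ M @ diff))

def transformed_euclidean(x, z, A):
    """Euclidean distance measured after mapping both points through A."""
    return float(np.linalg.norm(A @ x - A @ z))
```

For example, with `A = np.diag([2.0, 0.5])` the first feature counts four times as much in the squared distance and the second a quarter as much, and both functions return the same value for any pair of points.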
5
Data Transformation for Clustering
Objective Function
6
Distance Function for KNN
Consider KNN: for each query point x, we want the K nearest neighbors of the same class to become closer to x in the new metric.
7
Distance Function for KNN
Convex Objective Function (SDP)
Penalize large distances between each input and its target neighbors
Penalize small distances between each input and all other points of a different class
Points from different classes are separated by a large margin
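The two penalties can be written as a "pull" term and a hinge-style "push" term. The sketch below only evaluates such a loss for a given linear map `L`; it does not solve the SDP. The unit margin, the trade-off weight `c`, and the `targets` dictionary (index of each input mapped to its target neighbors) are assumptions for illustration, since the slide does not show its exact formula.

```python
import numpy as np

def lmnn_style_loss(X, y, L, targets, c=1.0):
    """Pull target neighbours close; push other-class points a unit margin further away."""
    Z = X @ L.T  # map every point into the learned space
    pull, push = 0.0, 0.0
    for i, neighbours in targets.items():
        for j in neighbours:
            d_ij = np.sum((Z[i] - Z[j]) ** 2)
            pull += d_ij  # penalise large distances to target neighbours
            for l in np.flatnonzero(y != y[i]):
                d_il = np.sum((Z[i] - Z[l]) ** 2)
                # hinge: a differently labelled point l should be at least 1 further than j
                push += max(0.0, 1.0 + d_ij - d_il)
    return float(pull + c * push)
```

The push term is zero whenever every differently labelled point already lies outside the margin, so only the violating points contribute to the loss.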
8
Applications: Image Clustering
9
Applications: Image Clustering
10
Applications: Image Clustering
11
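[Figure: image i and image j]
12
[Figure: image i and image j, with elementary distance $d_{ji,m}$]
13
[Figure: images i, j, and k, with $D_{ji} = \sum_m w_{j,m}\, d_{ji,m}$]
14
[Figure: images i, j, and k, with $D_{ji} < D_{ki}$]
15
[Figure: images i, j, and k, with $D_{ji} < D_{ki}$; which weights $w_{j,m}$?]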
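The figure sequence above builds an image-to-image distance as a weighted sum of elementary distances and then compares two such distances. A minimal sketch follows, assuming the comparison reads D_ji < D_ki (image j should be closer to image i than image k is) and assuming D_ki is formed from image k's own weights in the same way; the function names are hypothetical.

```python
import numpy as np

def image_distance(w_j, d_ji):
    """D_ji = sum_m w_{j,m} * d_{ji,m}: weighted sum of elementary distances."""
    return float(np.dot(w_j, d_ji))

def triplet_satisfied(w_j, d_ji, w_k, d_ki):
    """True when image j is closer to image i than image k is (D_ji < D_ki)."""
    return image_distance(w_j, d_ji) < image_distance(w_k, d_ki)
```

Learning the weights then amounts to choosing non-negative values for which as many such triplet comparisons as possible hold.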
16
Distance Function for Clustering
Distance function for multiple attributes
17
Distance Function for Clustering
Distance function for multiple attributes
18
Distance Function for Clustering
Distance function for multiple attributes
19
Distance Function for Clustering
Distance function for multiple attributes
Distance parameterized by a positive definite d × d matrix A
The similarity measure is the associated generalized inner product (kernel)
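The slide's equations did not survive in this transcript. A standard way to write a distance parameterized by a positive definite matrix A, together with the associated generalized inner product, is given below; the symbols $d_A$ and $k_A$ are assumed notation.

```latex
d_A(x, x') = \sqrt{(x - x')^{\top} A \,(x - x')}, \qquad
k_A(x, x') = x^{\top} A \, x', \qquad
d_A(x, x')^2 = k_A(x, x) - 2\,k_A(x, x') + k_A(x', x')
```

With A equal to the identity matrix this reduces to the ordinary Euclidean distance and dot product.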
20
Distance Function for Clustering
Distance function for multiple attributes
Goal: keep all data points within the same class close together, while separating data points from different classes.
Formulate as a constrained convex programming problem:
Minimize the distance between the data pairs in S (pairs from the same class)
Subject to: the data pairs in D (pairs from different classes) are well separated
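One well-known way to write such a program is shown below as a sketch; the slide's exact equations are not preserved here, so the squared norm in the objective and the constant 1 in the separation constraint are assumptions.

```latex
\begin{aligned}
\min_{A} \quad & \sum_{(x_i, x_j) \in S} \lVert x_i - x_j \rVert_A^2 \\
\text{s.t.} \quad & \sum_{(x_i, x_j) \in D} \lVert x_i - x_j \rVert_A \ge 1, \\
& A \succeq 0,
\end{aligned}
\qquad \text{where } \lVert x_i - x_j \rVert_A = \sqrt{(x_i - x_j)^{\top} A \,(x_i - x_j)}
```

The separation constraint rules out the trivial solution A = 0, which would collapse every point onto a single location.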
21
Distance Function for Clustering
Distance function for multiple attributes
A is positive semi-definite
Ensures the non-negativity and the triangle inequality of the metric
The number of parameters is quadratic in the number of features
Difficult to scale to a large number of features
Simplify the computation
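Two common ways to handle these points are sketched below, both as assumptions rather than the method used in the slides: projecting a symmetric matrix back onto the positive semi-definite cone by clipping negative eigenvalues, and restricting A to a diagonal matrix so that only d parameters are needed instead of d × d.

```python
import numpy as np

def project_to_psd(A):
    """Nearest (Frobenius-norm) positive semi-definite matrix to a symmetric A."""
    A = (A + A.T) / 2.0                    # symmetrise against round-off
    eigvals, eigvecs = np.linalg.eigh(A)
    eigvals = np.clip(eigvals, 0.0, None)  # drop negative eigenvalues
    return (eigvecs * eigvals) @ eigvecs.T

def diagonal_metric_distance(x, z, a):
    """Distance with A = diag(a), a >= 0: only d parameters instead of d*d."""
    diff = x - z
    return float(np.sqrt(np.sum(a * diff ** 2)))
```

A projection of this kind is typically applied after each gradient update on A, while the diagonal form trades expressiveness (no feature correlations) for scalability.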
22
Distance Function for Clustering
Distance function for multiple attributes
The simplest mapping is a linear transformation
23
Distance Function for Clustering
Distance function for multiple attributes
24
Distance Function for Clustering
Distance function for multiple attributes