Metric Learning for Clustering Jianping Fan CS Department UNC-Charlotte http://webpages.uncc.edu/jfan/
Problems of K-MEANs: Distance Function. Inter-cluster distances are maximized; intra-cluster distances are minimized. K-MEANs alternates an optimization step and an assignment step.
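The two steps named on this slide can be sketched as follows. This is a minimal illustration assuming standard K-means with Euclidean distance; the function names and the `dist` argument are hypothetical and only show where the choice of distance function enters.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def assign(points, centers, dist=euclidean):
    """Assignment step: index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Optimization step: move each center to the mean of its points.

    The mean is the minimizer for squared Euclidean distance; another
    distance function would need its own update rule.
    """
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(old)  # leave an empty cluster's center as is
            continue
        new_centers.append(tuple(
            sum(m[d] for m in members) / len(members) for d in range(len(old))
        ))
    return new_centers

def kmeans(points, centers, dist=euclidean, iters=20):
    """Alternate the two steps until the centers stop moving (iters >= 1)."""
    for _ in range(iters):
        labels = assign(points, centers, dist)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
```

The result depends entirely on `dist`: rescaling one attribute changes which center is nearest, which is the problem the following slides address.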
Space Transformation for Clustering (via similarity or distance function)
Data Transformation for Clustering: traditional distance function vs. weighted distance function. The weighted distance function is equivalent to first applying the linear transformation y = Ax, then using the Euclidean distance in the new space of y’s.
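The equivalence stated on this slide can be checked numerically. The sketch below assumes the weighted distance is the quadratic form sqrt((x - z)^T (A^T A) (x - z)); the slide's own formula did not survive extraction, so that form, the helper names, and the example values are assumptions.

```python
import math

def matvec(A, x):
    """Matrix-vector product y = A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def euclidean(u, v):
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def weighted_distance(A, x, z):
    """sqrt((x - z)^T M (x - z)) with M = A^T A (assumed form)."""
    n = len(x)
    diff = [xi - zi for xi, zi in zip(x, z)]
    # M[i][j] = sum_k A[k][i] * A[k][j]
    M = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
         for i in range(n)]
    quad = sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)

A = [[2.0, 0.0], [1.0, 0.5]]          # arbitrary example transformation
x, z = [1.0, 2.0], [3.0, -1.0]

d_weighted = weighted_distance(A, x, z)                  # in the original space
d_transformed = euclidean(matvec(A, x), matvec(A, z))    # after y = A x
```

Both values agree up to rounding, so learning a weighted distance and learning a linear transformation of the data are two views of the same thing.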
Data Transformation for Clustering Objective Function
Distance Function for KNN. Consider a KNN classifier: for each query point x, we want the K nearest neighbors of the same class to become closer to x in the new metric.
Distance Function for KNN: Convex Objective Function (SDP). Penalize large distances between each input and its target neighbors. Penalize small distances between each input and all other points of a different class. Points from different classes are separated by a large margin.
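The two penalties can be sketched as a "pull" term and a hinge-style "push" term. This is an illustration under stated assumptions, not the slide's exact objective: the unit margin, the trade-off constant `c`, the parameterization by a linear map `L`, and all names are hypothetical. It is written in terms of `L` for readability; the convex (SDP) form is obtained in terms of the matrix L^T L.

```python
def sq_dist(L, x, z):
    """Squared Euclidean distance after applying the linear map L."""
    diff = [xi - zi for xi, zi in zip(x, z)]
    proj = [sum(l * d for l, d in zip(row, diff)) for row in L]
    return sum(p * p for p in proj)

def large_margin_loss(L, X, y, targets, c=1.0, margin=1.0):
    """Return (total, pull, push) for the given linear map L.

    targets maps each index i to the indices of its target neighbors
    (same-class points that should end up close to i).
    """
    pull = 0.0
    push = 0.0
    for i, neighbors in targets.items():
        for j in neighbors:
            d_ij = sq_dist(L, X[i], X[j])
            pull += d_ij  # penalize large distance to a target neighbor
            for l in range(len(X)):
                if y[l] != y[i]:
                    # penalize a differently labeled point that is not at
                    # least `margin` farther away than the target neighbor
                    push += max(0.0, margin + d_ij - sq_dist(L, X[i], X[l]))
    return pull + c * push, pull, push

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.5]]
y = [0, 0, 1]
targets = {0: [1], 1: [0]}

identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[0.5, 0.0], [0.0, 4.0]]   # shrink axis 1, stretch axis 2

loss_identity = large_margin_loss(identity, X, y, targets)[0]
loss_stretched = large_margin_loss(stretched, X, y, targets)[0]
```

In this toy data the second map brings the two same-class points together and moves the other-class point beyond the margin, so its loss is lower.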
Applications Image Clustering
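[Figure, built up over five slides: images i, j and k. Elementary distances $d_{ji,m}$ between image j and image i are combined into the image-level distance $D_{ji} = \sum_m w_{j,m}\, d_{ji,m}$; an inequality relates $D_{ji}$ and $D_{ki}$; the weights $w_{j,m}$ are the unknowns to be learned.]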
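The weighted sum from the figure, D_ji = Σ_m w_j,m d_ji,m, can be written out directly. The weights and elementary distances below are made-up numbers used only for illustration, and the function name is hypothetical.

```python
def image_distance(weights_j, elementary_ji):
    """D_ji: elementary distances d_ji,m weighted by image j's own w_j,m."""
    if len(weights_j) != len(elementary_ji):
        raise ValueError("one weight is needed per elementary distance")
    return sum(w * d for w, d in zip(weights_j, elementary_ji))

w_j = [0.5, 2.0, 0.0]    # hypothetical learned weights for image j
d_ji = [0.2, 0.1, 0.9]   # hypothetical elementary distances from j to i

D_ji = image_distance(w_j, d_ji)
```

A zero weight removes the corresponding elementary distance from the comparison, which is how learning the weights selects the informative attributes of image j.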
Distance Function for Clustering Distance function for multiple attributes
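Distance Function for Clustering: Distance function for multiple attributes. Distance parameterized by a positive definite (p.d.) d × d matrix A: $d_A(x, x') = \sqrt{(x - x')^\top A\,(x - x')}$. The similarity measure is the associated generalized inner product (kernel): $k_A(x, x') = x^\top A\, x'$.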
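A small sketch of this parameterization, assuming the usual quadratic-form reading of a distance parameterized by a matrix A and of its generalized inner product. The function names and the example matrix are hypothetical.

```python
import math

def inner_A(A, u, v):
    """Generalized inner product u^T A v."""
    return sum(u[i] * A[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

def dist_A(A, x, z):
    """Distance sqrt((x - z)^T A (x - z))."""
    diff = [xi - zi for xi, zi in zip(x, z)]
    return math.sqrt(inner_A(A, diff, diff))

A = [[2.0, 0.5], [0.5, 1.0]]   # symmetric and positive definite
x, z = [1.0, 0.0], [0.0, 1.0]

d = dist_A(A, x, z)
# The same distance, recovered from the inner product alone:
d_from_kernel = math.sqrt(
    inner_A(A, x, x) - 2.0 * inner_A(A, x, z) + inner_A(A, z, z)
)
```

The second expression shows why the distance and the similarity measure are "associated": expanding the quadratic form gives the distance purely in terms of inner products.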
Distance Function for Clustering: Distance function for multiple attributes. Goal: keep all data points within the same class close, while separating data points from different classes. Formulate this as a constrained convex programming problem: minimize the distance between the data pairs in S (pairs from the same class), subject to the data pairs in D (pairs from different classes) being well separated.
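The objective and the constraint can be evaluated for a candidate matrix A as below. This is a sketch of one common way to instantiate the problem, not necessarily the slide's exact formulation: summing squared distances over S, requiring the summed distances over D to be at least 1, and the hand-picked candidate matrix are all assumptions.

```python
import math

def dist_A(A, x, z):
    """Distance sqrt((x - z)^T A (x - z))."""
    diff = [xi - zi for xi, zi in zip(x, z)]
    n = len(diff)
    return math.sqrt(sum(diff[i] * A[i][j] * diff[j]
                         for i in range(n) for j in range(n)))

def objective(A, X, S):
    """Sum of squared distances over same-class pairs (to be minimized)."""
    return sum(dist_A(A, X[i], X[j]) ** 2 for i, j in S)

def separation(A, X, D):
    """Sum of distances over different-class pairs (must stay large)."""
    return sum(dist_A(A, X[i], X[j]) for i, j in D)

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
S = [(0, 1), (2, 3)]   # same class: the points differ along attribute 2
D = [(0, 2), (1, 3)]   # different classes: they differ along attribute 1

identity = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0], [0.0, 0.01]]   # hand-picked, down-weights attribute 2

obj_identity = objective(identity, X, S)
obj_candidate = objective(candidate, X, S)
sep_candidate = separation(candidate, X, D)
```

Down-weighting the attribute that varies within a class lowers the objective while the pairs in D stay just as far apart, which is the behavior the optimization is meant to find automatically.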
Distance Function for Clustering: Distance function for multiple attributes. A is positive semi-definite, which ensures the non-negativity and the triangle inequality of the metric. The number of parameters is quadratic in the number of features, so the approach is difficult to scale to a large number of features; the computation needs to be simplified.
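To make the scaling concern concrete, the sketch below counts the free parameters of a symmetric d × d matrix and compares them with a diagonal restriction of A. The slide does not say how the computation is simplified; restricting A to a diagonal is one common option and is an assumption here, as are the function names.

```python
import math

def n_params_full(d):
    """Free entries of a symmetric d x d matrix."""
    return d * (d + 1) // 2

def n_params_diagonal(d):
    """Free entries when A is restricted to a diagonal matrix."""
    return d

def dist_diag(a, x, z):
    """Distance for A = diag(a): a per-attribute weighted Euclidean distance.

    A diagonal matrix is positive semi-definite exactly when every
    diagonal entry is non-negative.
    """
    if any(w < 0 for w in a):
        raise ValueError("diagonal weights must be non-negative")
    return math.sqrt(sum(w * (xi - zi) ** 2 for w, xi, zi in zip(a, x, z)))

full_1000 = n_params_full(1000)
diag_1000 = n_params_diagonal(1000)
d_example = dist_diag([4.0, 0.0], [1.0, 5.0], [0.0, -3.0])
```

With a zero weight, two distinct points can have distance zero, which is consistent with A being only positive semi-definite rather than positive definite.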
Distance Function for Clustering: Distance function for multiple attributes. The simplest mapping is a linear transformation.