Metric Learning for Clustering


Metric Learning for Clustering. Jianping Fan, CS Department, UNC-Charlotte. http://webpages.uncc.edu/jfan/

Problems of K-means. K-means depends entirely on its distance function: in the assignment step each point is assigned to the nearest cluster center, and in the optimization (update) step the centers are recomputed, so that intra-cluster distances are minimized and inter-cluster distances are maximized. A fixed Euclidean distance may not reflect which points should actually be grouped together, which is what motivates learning the distance function.
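To make the role of the distance function concrete, here is a minimal k-means sketch (the standard textbook algorithm, not code from the slides); the squared-Euclidean computation in the assignment step is exactly what a learned metric would replace:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means; the distance function (here squared Euclidean)
    is the only place the geometry of the data enters."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        # Update step: each center becomes the mean of its points.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers
```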

Space Transformation for Clustering (via a similarity or distance function).

Data Transformation for Clustering. The traditional distance function is the plain Euclidean distance; the weighted distance function replaces it with a quadratic form in a weight matrix. Using the weighted distance is equivalent to first applying a linear transformation y = Ax and then using the Euclidean distance in the new space of y's.
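A small numerical check of that equivalence (a sketch with invented data; here L denotes the linear transformation and M = LᵀL the induced weight matrix, since later slides reuse the symbol A for the weight matrix itself):

```python
import numpy as np

rng = np.random.default_rng(0)
x, z = rng.normal(size=5), rng.normal(size=5)

L = rng.normal(size=(5, 5))   # linear transformation y = L x
M = L.T @ L                   # induced weight matrix (PSD by construction)

# Weighted distance under M ...
d_weighted = np.sqrt((x - z) @ M @ (x - z))
# ... equals the Euclidean distance after transforming with L.
d_euclid = np.linalg.norm(L @ x - L @ z)

print(np.isclose(d_weighted, d_euclid))   # True
```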

Data Transformation for Clustering: objective function.

Distance Function for KNN. Consider k-nearest-neighbor classification: for each query point x, we want the k nearest neighbors of the same class (its target neighbors) to become closer to x under the new metric.
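A minimal sketch of how such target neighbors could be picked (the helper name target_neighbors is mine; it assumes every class has at least k other points):

```python
import numpy as np

def target_neighbors(X, y, k=3):
    """For each point, indices of its k nearest same-class neighbors
    under plain Euclidean distance, fixed before any metric is learned."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n = len(X)
    targets = np.zeros((n, k), dtype=int)
    for i in range(n):
        same = np.flatnonzero((y == y[i]) & (np.arange(n) != i))
        d = np.linalg.norm(X[same] - X[i], axis=1)
        targets[i] = same[np.argsort(d)[:k]]
    return targets
```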

Distance Function for KNN: a convex objective function (an SDP). It penalizes large distances between each input and its target neighbors, and penalizes small distances between each input and all other points of a different class, so that points from different classes end up separated by a large margin. A standard way to write this objective is shown below.
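The slides show the objective only as an image; one standard form of this large-margin nearest-neighbor objective (the notation here is reconstructed, not copied from the slides) is:

```latex
\min_{M \succeq 0}\;
\sum_{i,\ j \rightsquigarrow i} d_M(x_i, x_j)^2
\;+\; c \sum_{i,\ j \rightsquigarrow i}\ \sum_{l}
      (1 - y_{il})\,\big[\, 1 + d_M(x_i, x_j)^2 - d_M(x_i, x_l)^2 \,\big]_+
```

Here d_M(x, x')² = (x − x')ᵀ M (x − x'), j ⇝ i means x_j is a target neighbor of x_i, y_{il} = 1 exactly when x_i and x_l share a label, and [·]₊ is the hinge. The first term implements the pull on target neighbors, the hinge term pushes away differently labeled points that violate the unit margin, and the constraint M ⪰ 0 makes the problem a semidefinite program.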

Applications: image clustering (example results shown as figures on the slides).

Illustration: comparing images via multiple features. For a pair of images i and j, each feature m yields a per-feature distance d_{ji,m}. These are combined into an overall distance D_ji = Σ_m w_{j,m} · d_{ji,m} using feature weights w_{j,m}. If image j should be judged closer to image i than image k is, the learned weights must satisfy D_ji < D_ki. The open question on these slides is how to choose the weights w_{j,m}; a small numerical sketch follows.
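A tiny numerical sketch of those quantities (all values are invented, and a single shared weight vector w is used for simplicity):

```python
import numpy as np

# Per-feature distances of images j and k from a reference image i.
d_ji = np.array([0.2, 0.8, 0.1])   # d_{ji,m} for features m = 1..3
d_ki = np.array([0.5, 0.3, 0.9])

w = np.array([0.6, 0.1, 0.3])      # learned feature weights

D_ji = w @ d_ji                    # D_ji = sum_m w_m * d_{ji,m}
D_ki = w @ d_ki
print(D_ji, D_ki, D_ji < D_ki)     # the constraint D_ji < D_ki should hold
```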

Distance Function for Clustering: distance function for multiple attributes. The distance is parameterized by a positive-definite d × d matrix A, d_A(x, z) = sqrt((x − z)ᵀ A (x − z)); the associated similarity measure is the generalized inner product (kernel) ⟨x, z⟩_A = xᵀ A z.
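A short sketch of the pairing between the learned distance and its kernel (random data; the check at the end verifies d_A(x, z)² = ⟨x, x⟩_A − 2⟨x, z⟩_A + ⟨z, z⟩_A):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))   # 6 points, 4 attributes

L = rng.normal(size=(4, 4))
A = L.T @ L                   # positive semi-definite parameter matrix

# Kernel matrix of generalized inner products <x, z>_A = x^T A z.
K = X @ A @ X.T

# Recover squared distances d_A(x, z)^2 from the kernel.
diag = np.diag(K)
D2 = diag[:, None] - 2 * K + diag[None, :]
print(np.allclose(D2[0, 1], (X[0] - X[1]) @ A @ (X[0] - X[1])))  # True
```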

Distance Function for Clustering: distance function for multiple attributes. Goal: keep the data points within the same class close together while separating the data points from different classes. This is formulated as a constrained convex programming problem: minimize the total distance between the data pairs in the similarity set S, subject to the constraint that the data pairs in the dissimilarity set D are well separated (with A positive semi-definite). A simplified numerical sketch is given below.
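One way such a program can be attacked numerically is gradient descent with a projection back onto the positive semi-definite cone. The sketch below uses a penalty term instead of an explicit separation constraint, so it is only in the spirit of the slide's formulation, not the exact algorithm; the names learn_metric and psd_project are mine:

```python
import numpy as np

def psd_project(A):
    """Project a symmetric matrix onto the positive semi-definite cone."""
    w, V = np.linalg.eigh((A + A.T) / 2.0)
    return (V * np.clip(w, 0.0, None)) @ V.T

def learn_metric(X, similar_pairs, dissimilar_pairs,
                 lam=1.0, lr=0.01, n_iters=200):
    """Minimize the sum of squared distances over similar pairs while a
    penalty (weight lam) rewards large distances over dissimilar pairs."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    A = np.eye(d)
    for _ in range(n_iters):
        grad = np.zeros((d, d))
        for i, j in similar_pairs:        # pull similar pairs together
            diff = (X[i] - X[j])[:, None]
            grad += diff @ diff.T         # grad of (x_i - x_j)^T A (x_i - x_j)
        for i, j in dissimilar_pairs:     # push dissimilar pairs apart
            diff = (X[i] - X[j])[:, None]
            dist = np.sqrt(max((diff.T @ A @ diff).item(), 1e-12))
            grad -= lam * (diff @ diff.T) / (2.0 * dist)  # grad of sqrt term
        A = psd_project(A - lr * grad)    # keep A positive semi-definite
    return A
```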

Distance Function for Clustering: distance function for multiple attributes. A is positive semi-definite, which ensures non-negativity and the triangle inequality of the metric. However, the number of parameters is quadratic in the number of features, which makes the method difficult to scale to a large number of features, so the computation has to be simplified.

Distance Function for Clustering: distance function for multiple attributes. The simplest mapping is a linear transformation.
