Integrating Constraints and Metric Learning in Semi-Supervised Clustering
Mikhail Bilenko, Sugato Basu, Raymond J. Mooney
ICML 2004
Presented by Xin Li

Semi-Supervised Clustering
(Three figure slides: a running example with K = 4 clusters.)

How to Exploit Supervision in Clustering
- Incorporate supervision as constraints
- Learn a distance metric using supervision
- Integrate these two approaches

K-means Clustering
- Data points $X = \{x_1, x_2, \ldots\}$; cluster centroids $L = \{l_1, l_2, \ldots, l_k\}$
- Euclidean distance: $d(x_i, l_h) = \|x_i - l_h\|$
- Minimizing: $\mathcal{J}_{\mathrm{kmeans}} = \sum_{x_i \in X} \|x_i - l_{h(i)}\|^2$, where $h(i)$ denotes the cluster to which $x_i$ is assigned
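
For concreteness, here is a minimal NumPy implementation of this loop (Lloyd's algorithm); the initialization and convergence test are my own choices, not taken from the slides.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means (Lloyd's algorithm): alternate assignments and centroid updates."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = np.array([
            X[labels == h].mean(axis=0) if np.any(labels == h) else centroids[h]
            for h in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```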

Clustering with Constraints
- Pairwise constraints:
  - $M$: must-link pairs $(x_i, x_j)$ that should be in the same cluster
  - $C$: cannot-link pairs $(x_i, x_j)$ that should be in different clusters
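
The slide does not reproduce the objective, but in the pairwise-constrained K-means formulation (PCKMeans) that this line of work builds on, the K-means objective is augmented with violation costs; roughly:

$$\mathcal{J}_{\mathrm{pckm}} = \sum_{x_i \in X} \|x_i - l_{h(i)}\|^2 \;+\; \sum_{(x_i, x_j) \in M} w_{ij}\,\mathbb{1}\!\left[h(i) \neq h(j)\right] \;+\; \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\,\mathbb{1}\!\left[h(i) = h(j)\right]$$

where $w_{ij}$ and $\bar{w}_{ij}$ are the costs of violating the corresponding must-link and cannot-link constraints.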

Learning a Pairwise Distance Metric
- Cast as binary classification: $(x_i, x_j) \to 0/1$
  - $M \to$ positive examples: $(x_i, x_j)$ are in the same cluster
  - $C \to$ negative examples: $(x_i, x_j)$ are in different clusters
- Apply the learned distance metric in clustering
- Drawback: metric learning and clustering are decoupled, performed as two separate stages
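
One simple way to instantiate this two-stage recipe (my illustration; the slide does not name a specific classifier) is to fit a logistic regression on per-dimension squared differences of each pair and reuse its weights as a diagonal metric:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def learn_diagonal_metric(X, must_link, cannot_link):
    """Treat pairs as binary classification:
    must-link -> 1 (same cluster), cannot-link -> 0 (different clusters).
    Features: per-dimension squared differences of the pair."""
    pairs = list(must_link) + list(cannot_link)
    feats = np.array([(X[i] - X[j]) ** 2 for i, j in pairs])
    labels = np.array([1] * len(must_link) + [0] * len(cannot_link))
    clf = LogisticRegression().fit(feats, labels)
    # P(same) should fall as the pair gets farther apart, so the coefficients
    # on squared differences come out negative; negate and clip them to get
    # non-negative diagonal metric weights.
    w = np.maximum(-clf.coef_.ravel(), 0.0)
    return w  # distance: d(x, y)^2 = sum_d w[d] * (x[d] - y[d])^2
```

The learned weights can then be handed to any off-the-shelf clusterer, which makes the decoupling the slide criticizes explicit.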

Unsupervised Clustering with Metric Learning
- Learn a distance metric that optimizes a clustering quality function
- Equivalent to maximizing the complete-data log-likelihood under generalized K-means
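
Concretely, with a parameterized (Mahalanobis) distance $\|x - y\|_A = \sqrt{(x - y)^{\mathsf T} A (x - y)}$, maximizing the complete-data log-likelihood under generalized K-means amounts, up to constants, to minimizing:

$$\mathcal{J} = \sum_{x_i \in X} \left( \|x_i - l_{h(i)}\|_A^2 - \log \det A \right)$$

The $\log \det A$ term keeps the objective from being driven to zero by shrinking $A$.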

Integrating Constraints and Metric Learning
Combining the previous two formulations yields an objective function that minimizes cluster dispersion under the learned metrics while penalizing constraint violations.
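
Since the equation itself did not survive transcription, here is the combined objective as given (modulo notation) in the ICML 2004 paper:

$$\mathcal{J}_{\mathrm{mpckm}} = \sum_{x_i \in X} \left( \|x_i - l_{h(i)}\|_{A_{h(i)}}^2 - \log \det A_{h(i)} \right) + \sum_{(x_i, x_j) \in M} w_{ij}\, f_M(x_i, x_j)\, \mathbb{1}\!\left[h(i) \neq h(j)\right] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\, f_C(x_i, x_j)\, \mathbb{1}\!\left[h(i) = h(j)\right]$$

with one matrix $A_h$ per cluster and penalty functions $f_M$, $f_C$ described on the next slide.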

Penalty for Violating Constraints
- The penalty for violating a must-link constraint between distant points should be higher than between nearby points.
- The penalty for violating a cannot-link constraint between nearby points should be higher than between distant points.
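
In the paper, these two requirements are met by making the penalties functions of the learned metrics (reconstructed here; treat the exact form as a sketch):

$$f_M(x_i, x_j) = \tfrac{1}{2}\|x_i - x_j\|_{A_{h(i)}}^2 + \tfrac{1}{2}\|x_i - x_j\|_{A_{h(j)}}^2$$

$$f_C(x_i, x_j) = \|x'_{h(i)} - x''_{h(i)}\|_{A_{h(i)}}^2 - \|x_i - x_j\|_{A_{h(i)}}^2$$

where $(x'_h, x''_h)$ is the maximally separated pair of points under metric $A_h$, so $f_C$ is non-negative and grows as the violating pair gets closer.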

MPCK-MEANS Algorithm
- Constraints are utilized during cluster initialization and when assigning points to clusters.
- The distance metrics are adapted by re-estimating the weight matrices $A_h$ during each iteration.
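
A deliberately stripped-down sketch of that loop is below. It assumes a single global diagonal metric, a uniform constraint cost `w`, and random initialization; the paper instead uses one full matrix $A_h$ per cluster, distance-scaled penalties, and constraint-based initialization. All names are mine.

```python
import numpy as np

def mpck_means_lite(X, k, must_link, cannot_link, w=1.0, n_iters=20, seed=0):
    """Simplified MPCK-means-style loop: E-step with constraint penalties,
    M-step re-estimating centroids and a global diagonal metric."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.ones(d)                                   # diagonal metric weights
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.zeros(n, dtype=int)

    def dist2(x, y):                                 # squared distance under the metric
        return float(np.sum(a * (x - y) ** 2))

    for _ in range(n_iters):
        # Largest pairwise squared distance under the current metric, used
        # to make cannot-link penalties larger for nearby pairs.
        dmax2 = max(dist2(X[i], X[j]) for i in range(n) for j in range(i + 1, n))

        # E-step: greedy assignment; a point pays its metric distance plus
        # penalties for constraints it would violate given the current
        # labels of its constraint partners.
        for i in rng.permutation(n):
            costs = np.empty(k)
            for h in range(k):
                c = dist2(X[i], centroids[h]) - np.sum(np.log(a))
                for p, q in must_link:
                    if i in (p, q):
                        j = q if p == i else p
                        if labels[j] != h:           # must-link violated
                            c += w * dist2(X[i], X[j])
                for p, q in cannot_link:
                    if i in (p, q):
                        j = q if p == i else p
                        if labels[j] == h:           # cannot-link violated
                            c += w * (dmax2 - dist2(X[i], X[j]))
                costs[h] = c
            labels[i] = int(costs.argmin())

        # M-step: centroids, then the metric. The paper's closed form also
        # folds in violated-constraint terms; this sketch uses only the
        # per-dimension within-cluster scatter.
        for h in range(k):
            if np.any(labels == h):
                centroids[h] = X[labels == h].mean(axis=0)
        scatter = np.zeros(d)
        for h in range(k):
            pts = X[labels == h]
            if len(pts):
                scatter += ((pts - centroids[h]) ** 2).sum(axis=0)
        a = n / (scatter + 1e-9)
    return labels, centroids, a
```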

Initialization
- Form an initial guess of the clusters.
- Assign each point $x$ to one of the $K$ clusters in a way that satisfies the constraints.
- Compute the centroid of each cluster.
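
In the paper, the initial guess comes from the must-link neighborhoods, i.e. the transitive closure of $M$. A small union-find sketch of that step (helper names are mine):

```python
def must_link_neighborhoods(n, must_link):
    """Transitive closure of must-link constraints via union-find:
    points connected by a chain of must-links form one neighborhood."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in must_link:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Neighborhoods with more than one point seed the initial clusters.
    return [g for g in groups.values() if len(g) > 1]
```

The centroids of these neighborhoods seed the $K$ initial clusters; if fewer than $K$ neighborhoods exist, the remaining centers are filled in heuristically (e.g., by farthest-first traversal).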

E-step
Every point $x$ is assigned to the cluster that minimizes the sum of (a) the distance from $x$ to the cluster centroid under that cluster's local metric and (b) the cost of any constraint violations incurred by the assignment.
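
Spelled out against the combined objective above, the assignment rule is (my reconstruction):

$$h(i) \leftarrow \arg\min_{h} \Big( \|x_i - l_h\|_{A_h}^2 - \log \det A_h + \sum_{(x_i, x_j) \in M} w_{ij}\, f_M(x_i, x_j)\, \mathbb{1}\!\left[h \neq h(j)\right] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\, f_C(x_i, x_j)\, \mathbb{1}\!\left[h = h(j)\right] \Big)$$

Points are (re)assigned greedily in random order, so each point sees its constraint partners' current assignments.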

M-step
- Re-estimate each centroid $l_h$ as the mean of the points currently assigned to cluster $h$.
- Update the metrics: setting the gradient of the objective with respect to each metric to zero, $\partial \mathcal{J}_{\mathrm{mpckm}} / \partial A_h = 0$, yields a closed-form update for the weights of $A_h$.
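
Solving that equation gives each metric as a scaled inverse of the cluster scatter augmented by violated-constraint terms; reconstructed from the paper, so treat the details as a sketch:

$$A_h = |X_h| \Big( \sum_{x_i \in X_h} (x_i - l_h)(x_i - l_h)^{\mathsf T} + \sum_{\substack{(x_i, x_j) \in M_h \\ h(i) \neq h(j)}} \tfrac{1}{2} w_{ij} (x_i - x_j)(x_i - x_j)^{\mathsf T} + \sum_{\substack{(x_i, x_j) \in C_h \\ h(i) = h(j)}} \bar{w}_{ij} \big( (x'_h - x''_h)(x'_h - x''_h)^{\mathsf T} - (x_i - x_j)(x_i - x_j)^{\mathsf T} \big) \Big)^{-1}$$

where $X_h$ is the set of points currently in cluster $h$ and $M_h$, $C_h$ are the constraints involving them.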

Experimental Setting

Single Metric, Diagonal Matrix A
(Results figures.)

Multiple Metrics, Full Matrix A
(Results figures.)

Conclusion and Discussion
- This paper presented MPCK-MEANS, a new approach to semi-supervised clustering.
- Supervision and metric learning both help clustering; in most cases, multiple distance metrics are not necessary.
- Question 1: If we have supervision in clustering, why not use it the same way as in a typical classification task?
- Question 2: If there is an infinite number of classes, can we gain from supervision on only some of them?