Integrating Constraints and Metric Learning in Semi-Supervised Clustering
Mikhail Bilenko, Sugato Basu, Raymond J. Mooney
ICML 2004
Presented by Xin Li

Semi-Supervised Clustering
(Three figure slides: a running example with K = 4 clusters.)

How to Exploit Supervision in Clustering
- Incorporate supervision as constraints
- Learn a distance metric using supervision
- Integrate these two approaches

K-means Clustering
- Data points $X = \{x_1, x_2, \ldots\}$; cluster centroids $L = \{l_1, l_2, \ldots, l_k\}$
- Euclidean distance: $d(x_i, l_h) = \|x_i - l_h\|$
- Minimizing: $\mathcal{J}_{\mathrm{kmeans}} = \sum_{x_i \in X} \|x_i - l_{h(i)}\|^2$, where $h(i)$ denotes the cluster to which $x_i$ is assigned
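
For concreteness, here is a minimal NumPy implementation of this loop (Lloyd's algorithm); the initialization and convergence test are my own choices, not taken from the slides.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means (Lloyd's algorithm): alternate assignments and centroid updates."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = np.array([
            X[labels == h].mean(axis=0) if np.any(labels == h) else centroids[h]
            for h in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```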

Clustering with Constraints
- Pairwise constraints:
  - $M$: must-link pairs $(x_i, x_j)$ that should be in the same cluster
  - $C$: cannot-link pairs $(x_i, x_j)$ that should be in different clusters
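
The slide does not reproduce the objective, but in the pairwise-constrained K-means formulation (PCKMeans) that this line of work builds on, the K-means objective is augmented with violation costs; roughly:

$$\mathcal{J}_{\mathrm{pckm}} = \sum_{x_i \in X} \|x_i - l_{h(i)}\|^2 \;+\; \sum_{(x_i, x_j) \in M} w_{ij}\,\mathbb{1}\!\left[h(i) \neq h(j)\right] \;+\; \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\,\mathbb{1}\!\left[h(i) = h(j)\right]$$

where $w_{ij}$ and $\bar{w}_{ij}$ are the costs of violating the corresponding must-link and cannot-link constraints.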

Learning a Pairwise Distance Metric
- Cast as binary classification: $(x_i, x_j) \to 0/1$
  - $M \to$ positive examples: $(x_i, x_j)$ are in the same cluster
  - $C \to$ negative examples: $(x_i, x_j)$ are in different clusters
- Apply the learned distance metric in clustering
- Drawback: metric learning and clustering are decoupled, performed as two separate stages
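
One simple way to instantiate this two-stage recipe (my illustration; the slide does not name a specific classifier) is to fit a logistic regression on per-dimension squared differences of each pair and reuse its weights as a diagonal metric:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def learn_diagonal_metric(X, must_link, cannot_link):
    """Treat pairs as binary classification:
    must-link -> 1 (same cluster), cannot-link -> 0 (different clusters).
    Features: per-dimension squared differences of the pair."""
    pairs = list(must_link) + list(cannot_link)
    feats = np.array([(X[i] - X[j]) ** 2 for i, j in pairs])
    labels = np.array([1] * len(must_link) + [0] * len(cannot_link))
    clf = LogisticRegression().fit(feats, labels)
    # P(same) should fall as the pair gets farther apart, so the coefficients
    # on squared differences come out negative; negate and clip them to get
    # non-negative diagonal metric weights.
    w = np.maximum(-clf.coef_.ravel(), 0.0)
    return w  # distance: d(x, y)^2 = sum_d w[d] * (x[d] - y[d])^2
```

The learned weights can then be handed to any off-the-shelf clusterer, which makes the decoupling the slide criticizes explicit.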

Unsupervised Clustering with Metric Learning
- Learn a distance metric that optimizes a clustering quality function
- Equivalent to maximizing the complete-data log-likelihood under generalized K-means
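
Concretely, with a parameterized (Mahalanobis) distance $\|x - y\|_A = \sqrt{(x - y)^{\mathsf T} A (x - y)}$, maximizing the complete-data log-likelihood under generalized K-means amounts, up to constants, to minimizing:

$$\mathcal{J} = \sum_{x_i \in X} \left( \|x_i - l_{h(i)}\|_A^2 - \log \det A \right)$$

The $\log \det A$ term keeps the objective from being driven to zero by shrinking $A$.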

Integrating Constraints and Metric Learning
Combining the previous two formulations yields an objective function that minimizes cluster dispersion under the learned metrics while penalizing constraint violations.
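
Since the equation itself did not survive transcription, here is the combined objective as given (modulo notation) in the ICML 2004 paper:

$$\mathcal{J}_{\mathrm{mpckm}} = \sum_{x_i \in X} \left( \|x_i - l_{h(i)}\|_{A_{h(i)}}^2 - \log \det A_{h(i)} \right) + \sum_{(x_i, x_j) \in M} w_{ij}\, f_M(x_i, x_j)\, \mathbb{1}\!\left[h(i) \neq h(j)\right] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\, f_C(x_i, x_j)\, \mathbb{1}\!\left[h(i) = h(j)\right]$$

with one matrix $A_h$ per cluster and penalty functions $f_M$, $f_C$ described on the next slide.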

Penalty for Violating Constraints
- The penalty for violating a must-link constraint between distant points should be higher than between nearby points.
- The penalty for violating a cannot-link constraint between nearby points should be higher than between distant points.
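
In the paper, these two requirements are met by making the penalties functions of the learned metrics (reconstructed here; treat the exact form as a sketch):

$$f_M(x_i, x_j) = \tfrac{1}{2}\|x_i - x_j\|_{A_{h(i)}}^2 + \tfrac{1}{2}\|x_i - x_j\|_{A_{h(j)}}^2$$

$$f_C(x_i, x_j) = \|x'_{h(i)} - x''_{h(i)}\|_{A_{h(i)}}^2 - \|x_i - x_j\|_{A_{h(i)}}^2$$

where $(x'_h, x''_h)$ is the maximally separated pair of points under metric $A_h$, so $f_C$ is non-negative and grows as the violating pair gets closer.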

MPCK-MEANS Algorithm
- Constraints are utilized during cluster initialization and when assigning points to clusters.
- The distance metrics are adapted by re-estimating the weight matrices $A_h$ during each iteration.
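
A deliberately stripped-down sketch of that loop is below. It assumes a single global diagonal metric, a uniform constraint cost `w`, and random initialization; the paper instead uses one full matrix $A_h$ per cluster, distance-scaled penalties, and constraint-based initialization. All names are mine.

```python
import numpy as np

def mpck_means_lite(X, k, must_link, cannot_link, w=1.0, n_iters=20, seed=0):
    """Simplified MPCK-means-style loop: E-step with constraint penalties,
    M-step re-estimating centroids and a global diagonal metric."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.ones(d)                                   # diagonal metric weights
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.zeros(n, dtype=int)

    def dist2(x, y):                                 # squared distance under the metric
        return float(np.sum(a * (x - y) ** 2))

    for _ in range(n_iters):
        # Largest pairwise squared distance under the current metric, used
        # to make cannot-link penalties larger for nearby pairs.
        dmax2 = max(dist2(X[i], X[j]) for i in range(n) for j in range(i + 1, n))

        # E-step: greedy assignment; a point pays its metric distance plus
        # penalties for constraints it would violate given the current
        # labels of its constraint partners.
        for i in rng.permutation(n):
            costs = np.empty(k)
            for h in range(k):
                c = dist2(X[i], centroids[h]) - np.sum(np.log(a))
                for p, q in must_link:
                    if i in (p, q):
                        j = q if p == i else p
                        if labels[j] != h:           # must-link violated
                            c += w * dist2(X[i], X[j])
                for p, q in cannot_link:
                    if i in (p, q):
                        j = q if p == i else p
                        if labels[j] == h:           # cannot-link violated
                            c += w * (dmax2 - dist2(X[i], X[j]))
                costs[h] = c
            labels[i] = int(costs.argmin())

        # M-step: centroids, then the metric. The paper's closed form also
        # folds in violated-constraint terms; this sketch uses only the
        # per-dimension within-cluster scatter.
        for h in range(k):
            if np.any(labels == h):
                centroids[h] = X[labels == h].mean(axis=0)
        scatter = np.zeros(d)
        for h in range(k):
            pts = X[labels == h]
            if len(pts):
                scatter += ((pts - centroids[h]) ** 2).sum(axis=0)
        a = n / (scatter + 1e-9)
    return labels, centroids, a
```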

Initialization
- Form an initial guess of the clusters.
- Assign each point $x$ to one of the $K$ clusters in a way that satisfies the constraints.
- Compute the centroid of each cluster.
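
In the paper, the initial guess comes from the must-link neighborhoods, i.e. the transitive closure of $M$. A small union-find sketch of that step (helper names are mine):

```python
def must_link_neighborhoods(n, must_link):
    """Transitive closure of must-link constraints via union-find:
    points connected by a chain of must-links form one neighborhood."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in must_link:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Neighborhoods with more than one point seed the initial clusters.
    return [g for g in groups.values() if len(g) > 1]
```

The centroids of these neighborhoods seed the $K$ initial clusters; if fewer than $K$ neighborhoods exist, the remaining centers are filled in heuristically (e.g., by farthest-first traversal).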

E-step
Every point $x$ is assigned to the cluster that minimizes the sum of (a) the distance from $x$ to the cluster centroid under that cluster's local metric and (b) the cost of any constraint violations incurred by the assignment.
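
Spelled out against the combined objective above, the assignment rule is (my reconstruction):

$$h(i) \leftarrow \arg\min_{h} \Big( \|x_i - l_h\|_{A_h}^2 - \log \det A_h + \sum_{(x_i, x_j) \in M} w_{ij}\, f_M(x_i, x_j)\, \mathbb{1}\!\left[h \neq h(j)\right] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\, f_C(x_i, x_j)\, \mathbb{1}\!\left[h = h(j)\right] \Big)$$

Points are (re)assigned greedily in random order, so each point sees its constraint partners' current assignments.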

M-step
- Re-estimate each centroid $l_h$ as the mean of the points currently assigned to cluster $h$.
- Update the metrics: setting the gradient of the objective with respect to each metric to zero, $\partial \mathcal{J}_{\mathrm{mpckm}} / \partial A_h = 0$, yields a closed-form update for the weights of $A_h$.
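
Solving that equation gives each metric as a scaled inverse of the cluster scatter augmented by violated-constraint terms; reconstructed from the paper, so treat the details as a sketch:

$$A_h = |X_h| \Big( \sum_{x_i \in X_h} (x_i - l_h)(x_i - l_h)^{\mathsf T} + \sum_{\substack{(x_i, x_j) \in M_h \\ h(i) \neq h(j)}} \tfrac{1}{2} w_{ij} (x_i - x_j)(x_i - x_j)^{\mathsf T} + \sum_{\substack{(x_i, x_j) \in C_h \\ h(i) = h(j)}} \bar{w}_{ij} \big( (x'_h - x''_h)(x'_h - x''_h)^{\mathsf T} - (x_i - x_j)(x_i - x_j)^{\mathsf T} \big) \Big)^{-1}$$

where $X_h$ is the set of points currently in cluster $h$ and $M_h$, $C_h$ are the constraints involving them.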

Experimental Setting

Single Metric, Diagonal Matrix A
(Results figures.)

Multiple Metrics, Full Matrix A
(Results figures.)

Conclusion and Discussion
- This paper presented MPCK-MEANS, a new approach to semi-supervised clustering.
- Supervision and metric learning both help clustering; in most cases, multiple distance metrics are not necessary.
- Question 1: If we have supervision in clustering, why not use it the same way as in a typical classification task?
- Question 2: If there is an infinite number of classes, can we gain from supervision on only some of them?