K-means and conditional mixture models

Lecture 14: K-means and conditional mixture models

K-means. Variational EM (VEM) is a general bound-optimization algorithm in which we are free to choose a parameterized family of posteriors. If we restrict the responsibilities of a mixture of Gaussians (MoG) to delta functions instead of general distributions, we recover the k-means algorithm: at each E-step we are forced to pick a single winning cluster rather than make soft assignments. K-means minimizes a distortion cost, not the log-likelihood of the MoG model, so it won't return maximum-likelihood parameters. It is typically fast, but prone to local minima.
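A minimal sketch of this hard-assignment view of k-means, assuming Euclidean distance and centroid initialization from random data points (the function name and stopping rule are illustrative, not from the slides): the "E-step" assigns each point to a single winning cluster, and the "M-step" recomputes each centroid as the mean of its assigned points.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Hard-assignment EM: delta-function responsibilities instead of soft ones."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # init from data points
    for _ in range(n_iters):
        # "E-step": pick the single winning cluster for each point
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)  # (n, k)
        labels = dists.argmin(axis=1)
        # "M-step": each centroid becomes the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):  # distortion cost has converged
            break
        centroids = new_centroids
    return centroids, labels
```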

Conditional Mixture Models. Recall the generative and discriminative approaches to classification and regression. We now make these models more flexible by introducing a mixture over a latent variable z. Generative: p(x|z,y) p(z|y) p(y). Discriminative: p(y|z,x) p(z|x). The gating distribution p(z|x) acts as an input-dependent soft switch between the different models, while p(y|z,x) are the expert models, for example a separate linear regression for each value of z. For regression this gives soft piecewise-linear curve fitting (with uncertainty). The parameters are learned with EM.
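A rough sketch of EM for a discriminative conditional mixture (mixture of linear-regression experts), assuming a softmax gate p(z|x), Gaussian experts with fixed variance sigma2, and a few gradient steps for the gate's M-step; the function names, the small ridge term, and the step sizes are illustrative choices, not from the slides.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mixture_of_experts(X, y, k, n_iters=50, sigma2=1.0, seed=0):
    """EM for a conditional mixture of linear-regression experts.

    Gate (soft switch):  p(z=j | x) = softmax_j(x @ V)
    Experts:             p(y | z=j, x) = N(y; x @ W[:, j], sigma2)
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, k))   # expert weights
    V = rng.normal(scale=0.1, size=(d, k))   # gating weights

    for _ in range(n_iters):
        # E-step: responsibilities r[i, j] = p(z=j | x_i, y_i)
        gate = softmax(X @ V)                           # (n, k)
        resid = y[:, None] - X @ W                      # (n, k)
        lik = np.exp(-0.5 * resid ** 2 / sigma2)        # unnormalized Gaussian
        r = gate * lik
        r /= r.sum(axis=1, keepdims=True)

        # M-step (experts): weighted least squares, one expert per value of z
        for j in range(k):
            A = (X * r[:, j:j + 1]).T @ X + 1e-6 * np.eye(d)  # small ridge for stability
            W[:, j] = np.linalg.solve(A, X.T @ (r[:, j] * y))

        # M-step (gate): gradient ascent on the expected complete log-likelihood
        for _ in range(10):
            V += 0.1 / n * X.T @ (r - softmax(X @ V))
    return W, V

def predict(X, W, V):
    """Soft piecewise-linear prediction: gate-weighted average of the experts."""
    gate = softmax(X @ V)
    return (gate * (X @ W)).sum(axis=1)
```

The prediction blends the experts with the input-dependent gate, which is what produces the soft piecewise-linear fit mentioned above.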