1
Unsupervised Training and Clustering
Alexandros Potamianos
Dept. of ECE, Tech. Univ. of Crete
Fall 2004-2005
2
Unsupervised Training
Definition: the training set samples are unlabelled (unclassified).
Motivation:
- Labeling is hard / time consuming
- Fully automatic adaptation of models (in the field)
3
Maximum Likelihood Training
Given: N training examples drawn from c classes, i.e., D = {x_1, x_2, …, x_N} (no class assignments are given!)
Estimate:
- Class priors P(w_i)
- Feature PDF parameters θ_i: p(x | θ_i, w_i)
Sometimes the number of classes c is not given and also has to be estimated.
4
Unsupervised ML estimation
The ML estimates θ_i satisfy
Σ_k P(w_i | x_k, θ) ∇_{θ_i} log p(x_k | w_i, θ_i) = 0
Compared with supervised ML there is an additional term, P(w_i | x_k, θ): the class membership function for each sample x_k.
Unsupervised ML is a version of EM.
Pseudo-EM: P(w_i | x_k, θ) is binary (0 or 1).
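The class membership term P(w_i | x_k, θ) is the class posterior obtained via Bayes' rule. Below is a minimal sketch, assuming one-dimensional Gaussian class-conditional densities with made-up priors, means, and standard deviations (none of which are from the lecture), of the soft membership used by EM and its binarized pseudo-EM version:

```python
# Minimal sketch: class membership P(w_i | x_k, theta) via Bayes' rule,
# and its binarized "pseudo-EM" version. Priors, means and standard
# deviations below are illustrative assumptions, not from the lecture.
import numpy as np

x = np.array([0.2, 1.9, 2.1, -0.3])      # samples x_k (1-D for simplicity)
priors = np.array([0.5, 0.5])            # class priors P(w_i)
means = np.array([0.0, 2.0])             # component means
stds = np.array([1.0, 1.0])              # component standard deviations

# Class-conditional densities p(x_k | w_i, theta_i), shape (N, c)
lik = np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))

# Soft membership (EM): posterior = prior * likelihood / evidence
post = priors * lik
post /= post.sum(axis=1, keepdims=True)

# Pseudo-EM: replace the soft membership with a hard (0/1) assignment
hard = (post == post.max(axis=1, keepdims=True)).astype(float)

print(np.round(post, 3))   # soft memberships P(w_i | x_k, theta)
print(hard)                # binary memberships used by pseudo-EM
```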
5
Mixture of Gaussians Estimates
A linear combination of Gaussians with weights a_i:
p(x_k) = Σ_i a_i N(x_k; μ_i, Σ_i)
ML estimates:
a_i = (1/N) Σ_k P(w_i | x_k)
μ_i = Σ_k P(w_i | x_k) x_k / Σ_k P(w_i | x_k)
Σ_i = Σ_k P(w_i | x_k) (x_k − μ_i)(x_k − μ_i)^T / Σ_k P(w_i | x_k)
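These ML re-estimation formulas are the M-step of EM for a Gaussian mixture. Below is a minimal sketch of one EM iteration; the function name `em_step` and the density computation are my own illustrative choices rather than code from the lecture:

```python
# A minimal sketch of one EM iteration for a mixture of Gaussians,
# implementing the ML update formulas above. Shapes: X is (N, d),
# a is (c,), mu is (c, d), Sigma is (c, d, d).
import numpy as np

def em_step(X, a, mu, Sigma):
    """E-step: compute P(w_i | x_k). M-step: re-estimate a_i, mu_i, Sigma_i."""
    N, d = X.shape
    c = len(a)
    # E-step: unnormalized terms a_i * N(x_k; mu_i, Sigma_i)
    resp = np.zeros((N, c))
    for i in range(c):
        diff = X - mu[i]
        inv = np.linalg.inv(Sigma[i])
        norm_const = np.sqrt(((2 * np.pi) ** d) * np.linalg.det(Sigma[i]))
        resp[:, i] = a[i] * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1)) / norm_const
    resp /= resp.sum(axis=1, keepdims=True)      # P(w_i | x_k)

    # M-step: re-estimate weights, means, covariances
    Nk = resp.sum(axis=0)                        # sum_k P(w_i | x_k)
    a_new = Nk / N                               # a_i = (1/N) sum_k P(w_i | x_k)
    mu_new = (resp.T @ X) / Nk[:, None]          # weighted sample means
    Sigma_new = np.zeros_like(Sigma)
    for i in range(c):
        diff = X - mu_new[i]
        Sigma_new[i] = (resp[:, i, None] * diff).T @ diff / Nk[i]
    return a_new, mu_new, Sigma_new
```

Iterating `em_step` from some initial guess until the parameters stop changing yields the (locally optimal) ML mixture estimates.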
6
Clustering: Basic Isodata
1. Select an initial partition of the data into c classes and compute the cluster means.
2. Classify the training samples using a classification criterion (e.g., Euclidean distance to the cluster means).
3. Recompute the cluster means based on the training set classification decisions.
4. If there is no change in the sample means, stop; else go to step 2.
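The basic Isodata loop above is essentially k-means. Here is a minimal sketch assuming a random initial partition, Euclidean distance, and a fixed number of clusters c; the function name `basic_isodata` and the empty-cluster handling are illustrative choices:

```python
# Minimal sketch of the basic Isodata loop: classify samples to the nearest
# mean, recompute the means, stop when the means no longer change.
import numpy as np

def basic_isodata(X, c, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initial "partition" -> pick c random samples as initial means
    means = X[rng.choice(len(X), size=c, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Step 2: classify each sample to the nearest mean (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute cluster means from the new assignments
        # (keep the old mean if a cluster became empty)
        new_means = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                              else means[i] for i in range(c)])
        # Step 4: stop if the means did not change, otherwise iterate
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means
```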
7
Iterative clustering algorithms
Top-down algorithms:
- Start from a single class containing all the data.
- Split a class (e.g., based on its standard deviation).
- Continue splitting the "largest" class until the desired number of clusters is reached.
Bottom-up algorithms:
- Start with each training sample as a separate class.
- Merge classes (e.g., using a nearest-neighbour rule (NNR) criterion) until the desired number of classes is reached.
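As an illustration of the bottom-up approach, here is a minimal sketch that starts with one cluster per sample and repeatedly merges the two closest clusters under a nearest-neighbour (single-link) criterion; the function name `bottom_up_cluster` and the single-link choice are assumptions for the example, not prescribed by the lecture:

```python
# Minimal sketch of bottom-up (agglomerative) clustering: each sample starts
# as its own class; the two nearest clusters are merged until the desired
# number of classes remains. Distance between clusters is the closest pair
# of samples across them (single-link / nearest-neighbour criterion).
import numpy as np

def bottom_up_cluster(X, n_classes):
    clusters = [[i] for i in range(len(X))]          # one class per sample
    while len(clusters) > n_classes:
        best = (0, 1, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a].extend(clusters[b])              # merge the nearest pair
        del clusters[b]
    return clusters                                  # lists of sample indices
```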