1 Unsupervised Learning and Clustering
Shyh-Kang Jeng
Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia, National Taiwan University
2 Supervised vs. Unsupervised Learning
Supervised training procedures
– Use samples labeled by their category membership
Unsupervised training procedures
– Use unlabeled samples
3 Reasons for Interest
Collecting and labeling a large set of sample patterns can be costly
– e.g., speech
Train with a large amount of unlabeled data, then use supervision to label the groupings found
– Useful for "data mining" applications
Improve performance on data whose pattern characteristics change slowly by tracking them in an unsupervised mode
– e.g., automated food classification as seasons change
4 Reasons for Interest
Unsupervised methods can find features that are then useful for categorization
– Data-dependent "smart preprocessing" or "smart feature extraction"
Perform exploratory data analysis to gain insight into the nature or structure of the data
– The discovery of distinct clusters may suggest altering the approach to designing the classifier
5 Basic Assumptions to Begin With
The samples come from a known number c of classes
The prior probabilities P(ω_j) for each class are known
The forms of the class-conditional probability densities p(x | ω_j, θ_j) are known
The values of the parameter vectors θ_1, …, θ_c are unknown
The category labels are unknown
6 Mixing Density
p(x | θ) = Σ_{j=1}^{c} p(x | ω_j, θ_j) P(ω_j), with θ = (θ_1, …, θ_c)
The priors P(ω_j) are the mixing parameters; the p(x | ω_j, θ_j) are the component densities
7 Goal and Approach
Use samples drawn from the mixture density to estimate the unknown parameter vector θ
Once θ is known, we can decompose the mixture into its components and use a maximum a posteriori (MAP) classifier on the derived densities
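As a concrete illustration of this decomposition, here is a minimal NumPy sketch, assuming a fully known two-component 1-D Gaussian mixture (the priors, means, standard deviations, and sample values are all made up for the example). It evaluates the mixing density and assigns each sample to the component with the largest posterior, i.e., a MAP classifier on the derived densities:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Univariate normal density N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical fully-known mixture: p(x|theta) = sum_j P(w_j) p(x|w_j, theta_j)
priors = np.array([0.6, 0.4])   # P(w_1), P(w_2)
means = np.array([-1.0, 2.0])   # component means theta_1, theta_2
sigmas = np.array([1.0, 0.5])   # component standard deviations

x = np.array([-1.5, 0.3, 1.8])  # samples to classify

# Joint P(w_j) p(x_k | w_j) for every sample/component pair, shape (n, 2)
joint = priors * np.vstack([gaussian_pdf(x, m, s) for m, s in zip(means, sigmas)]).T
mixture = joint.sum(axis=1)              # mixing density p(x_k | theta)
posteriors = joint / mixture[:, None]    # P(w_j | x_k)
labels = posteriors.argmax(axis=1)       # MAP classification
print(labels)                            # -> [0 0 1]
```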
8 Existence of Solutions
Suppose an unlimited number of samples is available and nonparametric methods can be used
If only one value of θ can produce the observed values of p(x | θ), a solution is possible in principle
If several different values of θ can produce the same values of p(x | θ), there is no hope of obtaining a unique solution
9 Identifiable Density
A density p(x | θ) is identifiable if θ ≠ θ′ implies there exists an x for which p(x | θ) ≠ p(x | θ′)
10 An Example of an Unidentifiable Mixture of Discrete Distributions
11 An Example of an Unidentifiable Mixture of Gaussian Distributions
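For the Gaussian case, one classic source of unidentifiability is component symmetry: with equal priors, swapping the two component means leaves the mixture unchanged, so θ cannot be recovered uniquely. A quick NumPy check (all numeric values chosen arbitrarily for the demo):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma=1.0):
    """Univariate normal density N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture(x, mu1, mu2):
    # Equal priors P(w_1) = P(w_2) = 1/2, unit-variance components
    return 0.5 * gaussian_pdf(x, mu1) + 0.5 * gaussian_pdf(x, mu2)

x = np.linspace(-5, 5, 101)
# Swapping the component means yields exactly the same density values,
# so theta = (mu_1, mu_2) and theta' = (mu_2, mu_1) are indistinguishable.
print(np.allclose(mixture(x, -1.0, 2.0), mixture(x, 2.0, -1.0)))  # True
```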
12 Maximum-Likelihood Estimates
Log-likelihood of the observed samples: l = Σ_{k=1}^{n} ln p(x_k | θ)
13 Maximum-Likelihood Estimates
Necessary condition on the estimate θ̂_i: Σ_{k=1}^{n} P(ω_i | x_k, θ̂) ∇_{θ_i} ln p(x_k | ω_i, θ̂_i) = 0
14 Maximum-Likelihood Estimates
where, by Bayes' rule, P(ω_i | x_k, θ̂) = p(x_k | ω_i, θ̂_i) P(ω_i) / Σ_{j=1}^{c} p(x_k | ω_j, θ̂_j) P(ω_j)
15 Maximum-Likelihood Estimates for Unknown Priors
When the priors are also unknown, maximize the likelihood subject to P(ω_i) ≥ 0 and Σ_{i=1}^{c} P(ω_i) = 1
16 Maximum-Likelihood Estimates for Unknown Priors
The resulting estimate averages the posteriors over the samples: P̂(ω_i) = (1/n) Σ_{k=1}^{n} P̂(ω_i | x_k, θ̂)
17 Application to Normal Mixtures
Component densities: p(x | ω_i, θ_i) ~ N(μ_i, Σ_i)
Three cases (? = unknown, × = known):
Case  μ_i  Σ_i  P(ω_i)  c
  1    ?    ×     ×     ×
  2    ?    ?     ?     ×
  3    ?    ?     ?     ?
18 Case 1: Unknown Mean Vectors
The ML estimate is a weighted average of the samples: μ̂_i = Σ_{k=1}^{n} P(ω_i | x_k, μ̂) x_k / Σ_{k=1}^{n} P(ω_i | x_k, μ̂)
19 Case 1: Unknown Mean Vectors
Since μ̂_i appears on both sides, iterate: μ̂_i(j+1) = Σ_k P(ω_i | x_k, μ̂(j)) x_k / Σ_k P(ω_i | x_k, μ̂(j))
20 Case 1: Unknown Mean Vectors
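A minimal NumPy sketch of this Case-1 iteration, assuming two univariate components with known unit variances and known equal priors; the synthetic data, initial guesses, and iteration cap are made up for the example:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma=1.0):
    """Univariate normal density N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
# Synthetic samples drawn from a known two-component mixture (demo only)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

priors = np.array([0.5, 0.5])   # known P(w_i)
mu = np.array([-1.0, 1.0])      # initial guesses for the unknown means

for _ in range(50):
    # E-step: posteriors P(w_i | x_k, mu_hat), shape (n, 2)
    joint = priors * np.vstack([gaussian_pdf(x, m) for m in mu]).T
    post = joint / joint.sum(axis=1, keepdims=True)
    # M-step: mu_i = sum_k P(w_i|x_k) x_k / sum_k P(w_i|x_k)
    new_mu = (post * x[:, None]).sum(axis=0) / post.sum(axis=0)
    if np.allclose(new_mu, mu):  # stop when the means no longer change
        break
    mu = new_mu

print(mu)  # should approach the true means (-2, 3)
```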
21 Case 2: All Parameters Unknown
With no constraints on the parameters, the likelihood can be made arbitrarily large (e.g., by letting a component collapse onto a single sample), so we seek the largest finite local maximum
22 Case 2: All Parameters Unknown
Re-estimation formulas:
P̂(ω_i) = (1/n) Σ_k P̂(ω_i | x_k)
μ̂_i = Σ_k P̂(ω_i | x_k) x_k / Σ_k P̂(ω_i | x_k)
Σ̂_i = Σ_k P̂(ω_i | x_k) (x_k − μ̂_i)(x_k − μ̂_i)^T / Σ_k P̂(ω_i | x_k)
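These re-estimation formulas are the familiar EM updates for a Gaussian mixture. Below is a compact sketch under that reading; the function name em_gmm, the random-sample initialization, and the fixed iteration count are illustrative assumptions, not the slides' own choices:

```python
import numpy as np

def em_gmm(x, c, iters=100, seed=0):
    """EM sketch for Case 2: means, covariances, and priors all unknown
    (the number of components c is still assumed known)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    priors = np.full(c, 1.0 / c)
    mu = x[rng.choice(n, c, replace=False)]   # init means at random samples
    cov = np.stack([np.eye(d)] * c)           # init covariances at identity
    for _ in range(iters):
        # E-step: joint P(w_i) p(x_k | w_i) for every sample/component pair
        joint = np.empty((n, c))
        for i in range(c):
            diff = x - mu[i]
            inv = np.linalg.inv(cov[i])
            norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov[i]))
            joint[:, i] = priors[i] * norm * np.exp(
                -0.5 * np.einsum('nd,de,ne->n', diff, inv, diff))
        post = joint / joint.sum(axis=1, keepdims=True)   # P(w_i | x_k)
        nk = post.sum(axis=0)
        # M-step: the re-estimation formulas from the slide above
        priors = nk / n
        mu = (post.T @ x) / nk[:, None]
        for i in range(c):
            diff = x - mu[i]
            cov[i] = (post[:, i, None] * diff).T @ diff / nk[i]
    return priors, mu, cov
```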
23 k-Means Clustering
Approximate P̂(ω_i | x_k, μ̂) by 1 for the nearest mean and 0 otherwise; the ML iteration then reduces to the k-means procedure
24 k-Means Clustering
begin initialize n, c, μ_1, μ_2, …, μ_c
    do classify the n samples according to the nearest μ_i
       recompute the μ_i
    until no change in any μ_i
    return μ_1, μ_2, …, μ_c
end
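A direct NumPy rendering of this pseudocode; the random-sample initialization is an illustrative choice, and the sketch assumes no cluster ever becomes empty:

```python
import numpy as np

def k_means(x, c, seed=0):
    """Alternate nearest-mean classification and mean recomputation
    until the means stop changing, as in the pseudocode above."""
    rng = np.random.default_rng(seed)
    mu = x[rng.choice(len(x), c, replace=False)]   # initialize mu_1..mu_c
    while True:
        # classify the n samples according to the nearest mu_i
        dist = np.linalg.norm(x[:, None, :] - mu[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # recompute each mu_i as the mean of its assigned samples
        # (assumes every cluster keeps at least one sample)
        new_mu = np.array([x[labels == i].mean(axis=0) for i in range(c)])
        if np.allclose(new_mu, mu):                # until no change in mu_i
            return mu, labels
        mu = new_mu
```

The returned means can then serve as the starting points for the more exact computations mentioned on the next slide.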
25 k-Means Clustering
Computational complexity: O(n d c T), where T is the number of iterations
In practice, T is generally much less than the number of samples n
The means obtained can be accepted as the answer, or used as starting points for more exact computations
26 k-Means Clustering
27 k-Means Clustering