
1 Unsupervised Learning and Clustering
Shyh-Kang Jeng
Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia, National Taiwan University

2 Supervised vs. Unsupervised Learning
Supervised training procedures
– Use samples labeled by their category membership
Unsupervised training procedures
– Use unlabeled samples

3 Reasons for interest
Collecting and labeling a large set of sample patterns can be costly
– e.g., speech
A classifier can be trained on a large amount of unlabeled data, with supervision used only to label the groupings found
– useful for "data mining" applications
Performance on data whose pattern characteristics drift slowly can be improved by tracking the drift in an unsupervised mode
– e.g., automated food classification as the seasons change

4 Reasons for interest
Unsupervised methods can find features that are then useful for categorization
– data-dependent "smart preprocessing" or "smart feature extraction"
Exploratory data analysis can give insight into the nature or structure of the data
– discovery of distinct clusters may suggest altering the approach to designing the classifier

5 Basic Assumptions to Begin with
The samples come from a known number c of classes
The prior probabilities P(ω_j) for each class are known
The forms of the class-conditional probability densities p(x | ω_j, θ_j) are known
The values of the parameter vectors θ_1, …, θ_c are unknown
The category labels are unknown

6 Mixing Density
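The equation on this slide was not captured in the transcript; in the notation of the preceding slide, the mixing (mixture) density is presumably the standard form

$$
p(\mathbf{x} \mid \boldsymbol{\theta}) = \sum_{j=1}^{c} p(\mathbf{x} \mid \omega_j, \boldsymbol{\theta}_j)\, P(\omega_j),
$$

where $\boldsymbol{\theta} = (\boldsymbol{\theta}_1, \ldots, \boldsymbol{\theta}_c)$, the $P(\omega_j)$ are the known priors, and the $p(\mathbf{x} \mid \omega_j, \boldsymbol{\theta}_j)$ are the component (class-conditional) densities.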

7 Goal and Approach
Use samples drawn from the mixture density to estimate the unknown parameter vector θ
Once θ is known, the mixture can be decomposed into its components, and a maximum a posteriori classifier can be used on the derived densities
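As a concrete statement of the approach (standard, not transcribed from the slide): with an estimate $\hat{\boldsymbol{\theta}}$ in hand, assign $\mathbf{x}$ to the class $\omega_i$ that maximizes the posterior

$$
P(\omega_i \mid \mathbf{x}, \hat{\boldsymbol{\theta}}) = \frac{p(\mathbf{x} \mid \omega_i, \hat{\boldsymbol{\theta}}_i)\, P(\omega_i)}{\sum_{j=1}^{c} p(\mathbf{x} \mid \omega_j, \hat{\boldsymbol{\theta}}_j)\, P(\omega_j)}.
$$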

8 Existence of Solutions
Suppose an unlimited number of samples is available and nonparametric methods can be used
If only one value of θ can produce the observed values of p(x | θ), a solution is possible in principle
If several different values of θ can produce the same values of p(x | θ), there is no hope of obtaining a unique solution

9 Identifiable Density
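The definition itself did not survive the transcript; the usual statement is that a density $p(\mathbf{x} \mid \boldsymbol{\theta})$ is identifiable if $\boldsymbol{\theta} \neq \boldsymbol{\theta}'$ implies there exists an $\mathbf{x}$ such that $p(\mathbf{x} \mid \boldsymbol{\theta}) \neq p(\mathbf{x} \mid \boldsymbol{\theta}')$, i.e., distinct parameter values produce distinct mixture densities.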

10 An Example of an Unidentifiable Mixture of Discrete Distributions
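A standard example of this kind (presumably the one the slide shows, following Duda, Hart & Stork, which these slides track): let $x \in \{0, 1\}$ and

$$
P(x \mid \boldsymbol{\theta}) = \tfrac{1}{2}\,\theta_1^{x}(1-\theta_1)^{1-x} + \tfrac{1}{2}\,\theta_2^{x}(1-\theta_2)^{1-x},
$$

so that $P(x = 1 \mid \boldsymbol{\theta}) = (\theta_1 + \theta_2)/2$. The samples determine only the sum $\theta_1 + \theta_2$: for instance, $\boldsymbol{\theta} = (0.1, 0.7)$ and $\boldsymbol{\theta} = (0.3, 0.5)$ yield exactly the same distribution, so $\boldsymbol{\theta}$ cannot be recovered uniquely.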

11 An Example of an Unidentifiable Mixture of Gaussian Distributions
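Likewise for Gaussians (a standard illustration; the slide's figure was lost): with equal priors,

$$
p(x \mid \boldsymbol{\theta}) = \tfrac{1}{2}\, N(x;\, \theta_1, 1) + \tfrac{1}{2}\, N(x;\, \theta_2, 1)
$$

is unchanged when $\theta_1$ and $\theta_2$ are swapped, so the parameters can at best be identified up to a permutation of the component labels.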

12–14 Maximum-Likelihood Estimates
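A reconstruction of the derivation these slides carry (the standard treatment; the equations were not transcribed): given i.i.d. samples $\mathcal{D} = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ drawn from the mixture, the log-likelihood is

$$
l(\boldsymbol{\theta}) = \sum_{k=1}^{n} \ln p(\mathbf{x}_k \mid \boldsymbol{\theta}).
$$

Assuming the components are functionally independent (so $\boldsymbol{\theta}_i$ appears only in the $i$-th component), setting $\nabla_{\boldsymbol{\theta}_i}\, l = \mathbf{0}$ gives the necessary conditions

$$
\sum_{k=1}^{n} P(\omega_i \mid \mathbf{x}_k, \boldsymbol{\theta})\, \nabla_{\boldsymbol{\theta}_i} \ln p(\mathbf{x}_k \mid \omega_i, \boldsymbol{\theta}_i) = \mathbf{0}, \qquad i = 1, \ldots, c,
$$

where the posterior comes from Bayes' rule:

$$
P(\omega_i \mid \mathbf{x}_k, \boldsymbol{\theta}) = \frac{p(\mathbf{x}_k \mid \omega_i, \boldsymbol{\theta}_i)\, P(\omega_i)}{\sum_{j=1}^{c} p(\mathbf{x}_k \mid \omega_j, \boldsymbol{\theta}_j)\, P(\omega_j)}.
$$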

15–16 Maximum-Likelihood Estimates for Unknown Priors
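When the priors are unknown as well, one maximizes $l$ subject to $P(\omega_i) \geq 0$ and $\sum_{i=1}^{c} P(\omega_i) = 1$; the standard result (presumably what these slides derive) is that the maximum-likelihood estimate of each prior is the sample average of the posteriors,

$$
\hat{P}(\omega_i) = \frac{1}{n} \sum_{k=1}^{n} \hat{P}(\omega_i \mid \mathbf{x}_k, \hat{\boldsymbol{\theta}}),
$$

with the gradient conditions above continuing to hold at $\hat{\boldsymbol{\theta}}$.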

17 Application to Normal Mixtures
Component densities: p(x | ω_i, θ_i) ~ N(μ_i, Σ_i)
Three cases are considered (? = unknown, × = known):

Case | μ_i | Σ_i | P(ω_i) | c
-----|-----|-----|--------|---
  1  |  ?  |  ×  |   ×    | ×
  2  |  ?  |  ?  |   ?    | ×
  3  |  ?  |  ?  |   ?    | ?

18–20 Case 1: Unknown Mean Vectors
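For normal components with known covariances and priors and unknown means, the necessary condition specializes (standard result; the slides' equations were lost) to

$$
\hat{\boldsymbol{\mu}}_i = \frac{\sum_{k=1}^{n} P(\omega_i \mid \mathbf{x}_k, \hat{\boldsymbol{\mu}})\, \mathbf{x}_k}{\sum_{k=1}^{n} P(\omega_i \mid \mathbf{x}_k, \hat{\boldsymbol{\mu}})},
$$

a posterior-weighted average of the samples. Because the posteriors themselves depend on $\hat{\boldsymbol{\mu}}$, the equation is solved iteratively: compute posteriors from the current means, update the means, and repeat. This hill-climbing scheme converges to a local, not necessarily global, maximum of the likelihood.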

21–22 Case 2: All Parameters Unknown
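With means, covariances, and priors all unknown, the likelihood is actually unbounded (a component can collapse onto a single sample with vanishing covariance), so one seeks a well-behaved local maximum instead. The standard iterative updates, presumably what these slides show (they are the EM updates for a Gaussian mixture), are

$$
\hat{P}(\omega_i) = \frac{1}{n} \sum_{k=1}^{n} \hat{P}(\omega_i \mid \mathbf{x}_k), \qquad
\hat{\boldsymbol{\mu}}_i = \frac{\sum_{k} \hat{P}(\omega_i \mid \mathbf{x}_k)\, \mathbf{x}_k}{\sum_{k} \hat{P}(\omega_i \mid \mathbf{x}_k)},
$$

$$
\hat{\boldsymbol{\Sigma}}_i = \frac{\sum_{k} \hat{P}(\omega_i \mid \mathbf{x}_k)\, (\mathbf{x}_k - \hat{\boldsymbol{\mu}}_i)(\mathbf{x}_k - \hat{\boldsymbol{\mu}}_i)^{t}}{\sum_{k} \hat{P}(\omega_i \mid \mathbf{x}_k)},
$$

with the posteriors $\hat{P}(\omega_i \mid \mathbf{x}_k)$ recomputed from the current parameter estimates on each pass.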

23 k-Means Clustering

24 k-Means Clustering

begin initialize n, c, μ1, μ2, …, μc
  do classify n samples according to nearest μi
     recompute μi
  until no change in μi
  return μ1, μ2, …, μc
end
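A minimal NumPy sketch of the procedure above (not from the slides; the function name, random initialization, and the stopping test on assignments are my own choices):

```python
import numpy as np

def k_means(X, c, mu=None, max_iter=100, seed=None):
    """Cluster the n rows of X (an n x d array) into c groups.

    mu: optional (c x d) array of initial means; if None, c distinct
    samples are chosen at random. Returns (means, labels).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if mu is None:
        mu = X[rng.choice(n, size=c, replace=False)].astype(float)
    labels = np.full(n, -1)
    for _ in range(max_iter):
        # Classify the n samples according to the nearest mean mu_i.
        dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # "Until no change in mu_i": fixed assignments imply fixed means.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recompute each mean as the centroid of its assigned samples.
        for i in range(c):
            members = X[labels == i]
            if len(members) > 0:   # keep the old mean if a cluster empties
                mu[i] = members.mean(axis=0)
    return mu, labels
```

For example, mu, labels = k_means(X, c=3, seed=0) partitions X into three clusters; per the next slide, the resulting means can also serve as starting points for the more exact maximum-likelihood computations.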

25 k-Means Clustering
Complexity is O(ndcT) for n samples of dimension d, c clusters, and T iterations
In practice the number of iterations T is generally much smaller than the number of samples
The means obtained can be accepted as the answer, or used as starting points for more exact computations

26–27 k-Means Clustering (example figures)

