Classification
Heejune Ahn, SeoulTech
Last updated: May 3, 2015
Outline
Introduction: purpose, types, and an example
Classification design: design flow
Simple classifiers: linear discriminant functions, Mahalanobis distance
Bayesian classification
K-means clustering: unsupervised learning
1. Purpose
Purpose: to support decision making; a core topic of pattern recognition (in artificial intelligence).
Automation and human intervention: a human specifies the task (what classes, what features) and the algorithm to be used; training then tunes the algorithm parameters.
Overall model: images -> features (patterns, structures) -> classifier (classification rules) -> classes.
2. Supervised vs. unsupervised
Supervised (classification): trained from labeled examples (provided by humans).
Unsupervised (clustering): uses only the feature data, exploiting the mathematical (statistical) properties of the data set.
3. An example: classifying nuts
Features: circularity and line-fit error, measured from an image of each object.
Classifier (classification rules): assigns each object to one of three classes: pine nut, lentil, or pumpkin seed.
Observations
What if only a single feature is used?
How should the singular (outlier) points be handled?
Classification amounts to drawing decision boundaries in the feature space.
Terminology
4. Design Flow
5. Prototypes & min-distance classifier Prototypes mean of training samples in each class
6. Linear discriminant
Linear discriminant function: g(x1, x2) = a*x1 + b*x2 + c. The decision boundary is the line g(x1, x2) = 0, and the sign of g determines the class. See Ex 11.1 and Fig. 11.6.
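A minimal sketch of classifying two-feature samples with a linear discriminant (the coefficients a, b, c below are arbitrary placeholders, not the values of Ex 11.1):

% Classify 2-D feature vectors by the sign of g(x1,x2) = a*x1 + b*x2 + c
a = 1.0; b = -2.0; c = 0.5;          % placeholder coefficients
X = [0.2 0.9; 1.5 0.3; 0.7 0.7];     % rows are feature vectors [x1 x2]
g = a*X(:,1) + b*X(:,2) + c;         % discriminant value for each sample
labels = 1 + (g < 0);                % class 1 if g >= 0, class 2 otherwise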
8. Mahalanobis distance
Problem with the minimum-distance classifier: only the mean value is used; the distribution (spread) of each class is ignored. For example, in the figure std(class 1) << std(class 2), so a point equidistant from both means is more plausibly a member of class 2.
Mahalanobis distance takes the variance into account (the larger the variance, the smaller the distance):
d_M(x) = sqrt( (x - mu)' * Sigma^(-1) * (x - mu) ).
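A minimal sketch of a two-class Mahalanobis-distance classifier using the Statistics Toolbox function mahal (class1X, class2X, and testX are assumed sample matrices, not names from the text):

% Assign each test point to the class with the smaller Mahalanobis distance.
d1 = mahal(testX, class1X);          % squared Mahalanobis distance to class 1
d2 = mahal(testX, class2X);          % squared Mahalanobis distance to class 2
predY = 1 + (d2 < d1);               % class 2 whenever d2 is smaller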
9. Bayesian classification
Idea: assign each data point to the most probable class, using probabilities that are known a priori.
Assumption: the priors (class probabilities) are known.
Tool: Bayes' theorem.
10. Bayes decision rule
Classification rule: assign x to the class w_i with the largest posterior probability P(w_i | x).
Bayes' theorem: P(w_i | x) = p(x | w_i) * P(w_i) / p(x), where p(x | w_i) is the class-conditional probability density function, P(w_i) is the prior probability, and p(x) is the total probability. Since p(x) is the same for every class, it is not used in the classification decision.
Intuitively, the posterior weighs how well each class explains the observation against how common that class is.
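A tiny numerical sketch of the rule for one scalar feature with two Gaussian classes (all numbers below are made-up illustration values, not from the text):

% Posterior via Bayes' rule for a scalar observation x
x = 1.2;
prior = [0.7 0.3];                              % P(w1), P(w2)
lik   = [normpdf(x, 0, 1), normpdf(x, 2, 1)];   % p(x|w1), p(x|w2)
post  = prior .* lik;                           % unnormalized posteriors
post  = post / sum(post);                       % divide by total probability p(x)
[~, decision] = max(post);                      % most probable class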
Interpretation
We need to know the priors and the class-conditional pdfs, which are often not available directly.
A common remedy is the MVN (multivariate normal) distribution model, which is practically a quite good approximation.
MVN: the N-dimensional normal distribution with mean vector mu and covariance matrix Sigma,
p(x) = (2*pi)^(-N/2) * |Sigma|^(-1/2) * exp( -(1/2) * (x - mu)' * Sigma^(-1) * (x - mu) ).
12. Bayesian classifier for M variates
Taking log() of the posterior (log is a monotonically increasing function, so the maximizing class is unchanged) gives the discriminant
g_i(x) = ln p(x | w_i) + ln P(w_i)
       = -(1/2) * (x - mu_i)' * Sigma_i^(-1) * (x - mu_i) - (1/2) * ln |Sigma_i| + ln P(w_i) + const,
and x is assigned to the class with the largest g_i(x).
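A minimal sketch of this Gaussian Bayesian classifier in MATLAB, with per-class mean, covariance, and prior estimated from training data (trainX, trainY, and testX are assumed data matrices; mvnpdf is the Statistics Toolbox MVN density):

% Pick the class with the largest log-posterior under an MVN model per class.
classes = unique(trainY);
K = numel(classes);
scores = zeros(size(testX, 1), K);
for k = 1:K
    Xk = trainX(trainY == classes(k), :);
    mu_k = mean(Xk, 1);
    Sigma_k = cov(Xk);
    prior_k = size(Xk, 1) / size(trainX, 1);    % prior from class frequencies
    scores(:, k) = log(mvnpdf(testX, mu_k, Sigma_k)) + log(prior_k);
end
[~, idx] = max(scores, [], 2);
predY = classes(idx);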
Case 1: identical, independent features (Sigma_i = sigma^2 * I for every class)
The discriminant becomes linear (a "linear machine"): the decision boundaries are hyperplanes.
Note: when the class priors are equal, the rule reduces to the minimum-distance criterion.
Case 2: all classes share the same covariance matrix
The discriminant is again linear in x.
MATLAB: [class, err] = classify(test, training, group[, type, prior]), where 'training' and 'test' are feature matrices and 'group' holds the training labels. Use type 'diaglinear' for a naive Bayes classifier.
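An example call to classify, using MATLAB's built-in fisheriris sample data purely as an illustration (it is not the data set used in the slides):

% Naive Bayes ('diaglinear') classification with classify
load fisheriris                          % meas: 150x4 features, species: labels
training = meas(1:2:end, :);             % odd rows for training
test     = meas(2:2:end, :);             % even rows for testing
group    = species(1:2:end);
[class, err] = classify(test, training, group, 'diaglinear');
accuracy = mean(strcmp(class, species(2:2:end)));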
Ex 11.3: classification results with wrong priors vs. correct priors (figure).
13. Ensemble classifier
Combining multiple classifiers exploits diversity, similar to asking multiple experts for a decision.
AdaBoost:
Weak classifier: chance (1/2) < accuracy << 1.0.
Mis-classified training data are weighted more heavily for the next classifier.
(Diagram: weak classifiers H_1(x), ..., H_T(x) trained on weight distributions D_1(x), ..., D_T(x), starting from a uniform distribution, then combined with weights a_t.)
AdaBoost in detail
Given: training data (x_1, y_1), ..., (x_m, y_m) with labels y_i in {-1, +1}.
Initialize weights: D_1(i) = 1/m.
For t = 1, ..., T:
1. WeakLearn: return the weak classifier h_t with minimum error e_t w.r.t. the distribution D_t.
2. Choose a_t = (1/2) * ln( (1 - e_t) / e_t ).
3. Update D_{t+1}(i) = D_t(i) * exp( -a_t * y_i * h_t(x_i) ) / Z_t, where Z_t is a normalization factor chosen so that D_{t+1} is a distribution.
Output the strong classifier: H(x) = sign( sum_t a_t * h_t(x) ).
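A toy MATLAB sketch of these update rules using 1-D decision stumps as the weak learner (function and variable names are illustrative; this is not the book's implementation):

% AdaBoost with threshold stumps h(x) = sgn*sign(x - thr) on 1-D data.
% x: m x 1 features, y: m x 1 labels in {-1,+1}, T: number of rounds
function model = adaboost_train(x, y, T)
    m = numel(y);
    D = ones(m, 1) / m;                        % D_1(i) = 1/m
    model = struct('thr', {}, 'sgn', {}, 'alpha', {});
    for t = 1:T
        bestErr = inf;                         % WeakLearn: minimize weighted error
        for thr = unique(x)'
            for sgn = [-1 1]
                h = sgn * sign(x - thr);  h(h == 0) = sgn;
                err = sum(D .* (h ~= y));
                if err < bestErr
                    bestErr = err; bestThr = thr; bestSgn = sgn; bestH = h;
                end
            end
        end
        alpha = 0.5 * log((1 - bestErr) / max(bestErr, eps));   % a_t
        D = D .* exp(-alpha * y .* bestH);     % reweight training data
        D = D / sum(D);                        % Z_t normalization
        model(t) = struct('thr', bestThr, 'sgn', bestSgn, 'alpha', alpha);
    end
end

Prediction then sums the alpha-weighted stump outputs and takes the sign, as in the strong classifier H(x) above.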
14. K-means clustering
K-means: unsupervised classification. Group the data to minimize the total within-cluster distance J = sum_i || x_i - c_{k(i)} ||^2, where c_{k(i)} is the center of the cluster assigned to x_i.
Iterative algorithm: (re-)assign each x_i to the nearest cluster center, then (re-)calculate each center c_i as the mean of its members; repeat until the assignments stop changing.
Demo: http://shabal.in/visuals/kmeans/3.html
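A plain sketch of this iterative loop (Lloyd's algorithm) in MATLAB; a toy illustration only, which does not handle empty clusters and uses random samples as initial centers:

% Alternate assignment and update steps until the labels stop changing.
function [labels, centers] = kmeans_sketch(X, K)
    n = size(X, 1);
    centers = X(randperm(n, K), :);            % initial centers: random samples
    labels = zeros(n, 1);
    while true
        [~, newLabels] = min(pdist2(X, centers), [], 2);   % assignment step
        if isequal(newLabels, labels), break; end
        labels = newLabels;
        for k = 1:K                             % update step: recompute means
            centers(k, :) = mean(X(labels == k, :), 1);
        end
    end
end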
Issues
Sensitive to the initial centroid values: multiple trials are needed, and the best result is kept.
K (the number of clusters) must be given. There is a trade-off: a larger K always yields a smaller objective function, and no optimal algorithm exists to determine K.
Nevertheless, k-means is used in most unsupervised clustering today.
Ex11.4 & F11.10 kmeans function [classIndexes, centers] = kmeans(data, k, options) k : # of clusters Options: ‘Replicates', ‘Display’