Basic Classification
Which is that?
The Classification Problem
- On the basis of some examples, determine to which class a previously unobserved instance belongs
- Can be analogous to learning
  - Supervised: a teacher prescribes class composition
  - Unsupervised: class memberships are formed autonomously
Common Classification Methods
- Template Matching
- Correlation
- Bayesian Classifier
- Neural Networks
- Fuzzy Clustering
- Support Vector Machines
- Principal Component Analysis
- Independent Component Analysis
Template Matching
- Identify or create class templates
- For a given entity x:
  - Find the distances from x to each of the class templates
  - Associate x with the class whose template is minimally distant
  - Optionally, update the class template
- (A code sketch of this procedure follows below)
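A minimal sketch of this procedure, assuming the templates are class mean vectors as in Example 1 below; the function names and the small data set are illustrative, not part of the slides:

```python
import numpy as np

def class_templates(exemplars_by_class):
    """Create one template per class by averaging its exemplars (the mean vector m_i)."""
    return {label: np.mean(np.asarray(X, dtype=float), axis=0)
            for label, X in exemplars_by_class.items()}

def classify(x, templates):
    """Assign x to the class whose template is minimally distant (Euclidean distance)."""
    x = np.asarray(x, dtype=float)
    return min(templates, key=lambda label: np.linalg.norm(x - templates[label]))

# Illustrative two-class data
exemplars = {
    "class 1": [(1.0, 2.0), (1.7, 1.8), (2.1, 2.3)],
    "class 2": [(4.0, 5.2), (3.8, 4.2), (4.5, 5.9)],
}
templates = class_templates(exemplars)
print(classify((2.0, 2.0), templates))   # -> "class 1"
print(classify((4.4, 5.0), templates))   # -> "class 2"
```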
Example 1
(Scatter plot of two classes of points in the (x1, x2) plane, with class means m1 and m2)
Example 1: Create Class Templates
- Class exemplars are ordered pairs $(x_1, x_2)$, which may be written as vectors $x^T = \langle x_1, x_2 \rangle$
- The mean vectors $m_i$ are obtained by averaging the component values of the class exemplars for each class $i$
Example 1: Find Minimum Distance
- Distance from a vector $x$ to each class mean $m_i$: $\mathrm{Distance}_i(x) = \|x - m_i\| = \left[(x - m_i)^T (x - m_i)\right]^{1/2}$
- Note: $(x - m_i)^T (x - m_i) = x^T x - x^T m_i - m_i^T x + m_i^T m_i = \|x\|^2 - 2\, x^T m_i + \|m_i\|^2 = \|x\|^2 - 2\left(x^T m_i - \tfrac{1}{2}\|m_i\|^2\right)$
- $x^T x = \|x\|^2$ is fixed for all $i$
- Thus, $\mathrm{Distance}_i(x)$ is minimized when the quantity $x^T m_i - \tfrac{1}{2}\|m_i\|^2$ is maximized
Example 1: The Decision Boundary
- The decision boundary $d_{ij}$ with respect to classes $i$ and $j$: $d_{ij}(x) = \mathrm{Distance}_i(x) - \mathrm{Distance}_j(x) = 0$
- $\Rightarrow \left(\|x\|^2 - 2\left(x^T m_i - \tfrac{1}{2}\|m_i\|^2\right)\right) - \left(\|x\|^2 - 2\left(x^T m_j - \tfrac{1}{2}\|m_j\|^2\right)\right) = 0$
- $\Rightarrow \left(x^T m_i - \tfrac{1}{2}\|m_i\|^2\right) - \left(x^T m_j - \tfrac{1}{2}\|m_j\|^2\right) = 0$
- $\Rightarrow d_{ij}(x) = x^T (m_i - m_j) - \tfrac{1}{2}\,(m_i - m_j)^T (m_i + m_j) = 0$
- Note: This is not the same as Eq. 12.2-6
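One way to read this result, obtained by factoring the expression above: the boundary is the perpendicular bisector of the segment joining the two class means,

\[
d_{ij}(x) = (m_i - m_j)^T\!\left(x - \frac{m_i + m_j}{2}\right) = 0,
\]

i.e., the hyperplane through the midpoint $(m_i + m_j)/2$ whose normal is $m_i - m_j$; the points on it are exactly those equidistant from $m_i$ and $m_j$.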
Example 2: Details
- Class 1 exemplars (columns $x_1$, $x_2$) include $(1, 2)$, $(1.7, 1.8)$, $(2.1, 2.3)$, …; class mean $m_1 = \langle 1.5, 1.8 \rangle$
- Class 2 exemplars (columns $x_1$, $x_2$) include $(4, 5.2)$, $(3.8, 4.2)$, $(4.5, 5.9)$, …; class mean $m_2 = \langle 4.5, 5.1 \rangle$
Example 2: Decision Boundary
- $d_{ij}(x) = x^T (m_i - m_j) - \tfrac{1}{2}\,(m_i - m_j)^T (m_i + m_j) = 0$
- $(m_1 - m_2)^T = \langle 1.5, 1.8 \rangle - \langle 4.5, 5.1 \rangle = \langle -3, -3.3 \rangle$
- $(m_1 + m_2)^T = \langle 1.5, 1.8 \rangle + \langle 4.5, 5.1 \rangle = \langle 6, 6.9 \rangle$
- $-\tfrac{1}{2}\,(m_1 - m_2)^T (m_1 + m_2) = 20.385$
- $d_{12}(x) = \langle x_1, x_2 \rangle \langle -3, -3.3 \rangle^T + 20.385 = -3 x_1 - 3.3 x_2 + 20.385 = 0$, equivalently $3 x_1 + 3.3 x_2 - 20.385 = 0$
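A short sketch that reproduces these numbers from the two class means on the slide; the helper function is illustrative:

```python
import numpy as np

# Class means from Example 2
m1 = np.array([1.5, 1.8])
m2 = np.array([4.5, 5.1])

# Coefficients of d12(x) = x^T (m1 - m2) - 1/2 (m1 - m2)^T (m1 + m2) = 0
w = m1 - m2                          # [-3.0, -3.3]
b = -0.5 * (m1 - m2) @ (m1 + m2)     # approx 20.385

def d12(x):
    """Boundary value: positive on the class-1 side, negative on the class-2 side."""
    return np.asarray(x, dtype=float) @ w + b

print(w, b)                           # [-3.  -3.3]  ~20.385
print(d12(m1) > 0, d12(m2) < 0)       # True True
```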
Correlation
- Commonly used to locate similar patterns in 1- or 2-dimensional domains
- Identify the pattern x to which to correlate
- Find the correlation of x to the samples
- Associate x with the samples whose correlation to x is largest
- Report the locations of the highly correlated samples
Example 3: Finding Eyes
(Image example: an eye pattern x is correlated against the image to locate matching regions)
Computational Matters
- Normalized correlation is typically computed using Pearson’s r
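For reference, for $n$ paired samples $(x_i, y_i)$ with sample means $\bar{x}$ and $\bar{y}$, Pearson's $r$ is

\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}.
\]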
Notation and Interpretation
- $n$: the number of pairs of values $x$ and $y$ for which the degree of correlation is to be determined
- $|r| \le 1$
- $r = 0$ if $x$ and $y$ are uncorrelated
- $r > 0$ if $y$ increases (decreases) as $x$ increases (decreases), i.e., $x$ and $y$ are positively correlated (to some degree)
- $r < 0$ if $y$ decreases (increases) as $x$ increases (decreases), i.e., $x$ and $y$ are negatively correlated (to some degree)
- To assess the relative strengths of two values $r_1$ and $r_2$, compare their squares: if $r_1 = 0.2$ and $r_2 = 0.4$, $r_2$ indicates four times as strong a correlation
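A minimal sketch that computes r and uses it to locate a pattern by sliding it over a larger 2-D array, as in the correlation procedure above; the function names and the random test data are illustrative:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r between two equal-sized arrays (flattened)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

def best_match(image, template):
    """Slide the template over the image; return the offset with the largest r."""
    th, tw = template.shape
    best, best_pos = -np.inf, None
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            r = pearson_r(image[i:i + th, j:j + tw], template)
            if r > best:
                best, best_pos = r, (i, j)
    return best_pos, best

# Illustrative usage: embed the template in a random image and recover its location
rng = np.random.default_rng(0)
template = rng.random((5, 5))
image = rng.random((40, 40))
image[12:17, 20:25] = template
print(best_match(image, template))   # -> ((12, 20), 1.0) up to floating point
```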
Example 4: 5x5 Grid Patterns
(Pairs of 5x5 grid patterns compared by correlation, with r = 0.343, r = 0.514, and r = -1.0)
Bayesian Classifier
- Assigns x to the class with the highest posterior probability, minimizing the probability of classification error
- Has a simple closed form, and is commonly applied, when the class data are Gaussian
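A minimal sketch of a Bayes classifier under the Gaussian assumption, with each class modeled by its sample mean, covariance, and prior; the function and variable names are illustrative:

```python
import numpy as np

def fit_gaussian_bayes(exemplars_by_class):
    """Estimate mean, covariance, and prior for each class from its exemplars."""
    total = sum(len(X) for X in exemplars_by_class.values())
    params = {}
    for label, X in exemplars_by_class.items():
        X = np.asarray(X, dtype=float)
        params[label] = (X.mean(axis=0), np.cov(X, rowvar=False), len(X) / total)
    return params

def log_posterior(x, mean, cov, prior):
    """Log of prior times Gaussian density, up to a constant shared by all classes."""
    diff = np.asarray(x, dtype=float) - mean
    return (np.log(prior)
            - 0.5 * np.log(np.linalg.det(cov))
            - 0.5 * diff @ np.linalg.solve(cov, diff))

def classify(x, params):
    """Assign x to the class with the largest (log) posterior."""
    return max(params, key=lambda label: log_posterior(x, *params[label]))
```

With equal priors and identical, spherical covariance matrices, this rule reduces to the minimum-distance template matcher sketched earlier.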
Fuzzy Classifiers
- Reference: Jang, Sun, and Mizutani, Neuro-Fuzzy and Soft Computing
- Fuzzy C-Means (FCM) — see the sketch below
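A minimal sketch of the FCM update loop, alternating membership and center updates for a fuzziness exponent m; the interface is illustrative and follows the standard algorithm rather than any particular library:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, eps=1e-9, seed=0):
    """Fuzzy C-Means: return cluster centers and the fuzzy membership matrix U (c x n)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                       # memberships of each point sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # Distances of every point to every center (c x n), floored to avoid division by zero
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.maximum(d, eps)
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1))
        U = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
    return centers, U
```

Each point then belongs to every cluster with a degree of membership in [0, 1]; a hard label, if needed, is the cluster of maximum membership.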
Neural Networks
- Feedforward networks and the backpropagation training algorithm (a minimal sketch follows below)
- Adaptive resonance theory
- Kohonen networks
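A minimal sketch of a one-hidden-layer feedforward network trained by backpropagation (gradient descent on squared error); the layer sizes, learning rate, and XOR test data are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, hidden=4, lr=1.0, epochs=10000):
    """Train a network with one hidden layer of sigmoid units by backpropagation."""
    n_in, n_out = X.shape[1], y.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, n_out))
    b2 = np.zeros(n_out)
    for _ in range(epochs):
        # Forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: propagate the squared-error derivative through each layer
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return W1, b1, W2, b2

# Illustrative usage: learn XOR, a classic test that requires the hidden layer
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, b1, W2, b2 = train(X, y)
# Outputs after training; should be close to y for most random initializations
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
```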