Pattern Recognition: Statistical and Neural
Nanjing University of Science & Technology
Lonnie C. Ludeman
Lecture 17, Oct 21, 2005
Lecture 17 Topics
1. View of Perceptron Algorithm in Pattern Space
2. Fractional Correction Perceptron Algorithm
3. Simplified Perceptron Algorithm
4. Derivation of the Perceptron Algorithm
5. Extension of Perceptron Algorithm to the M-Class Case: 3 Special Cases
Motivation
[Figure: samples from C1 and samples from C2 scattered in pattern space]
Question: How do we find a separating hyperplane? (a "needle in the haystack")
Answer: The Perceptron Algorithm! (Other approaches exist, such as random selection.)
Motivation
[Figure: samples from C1 and C2 with a separating hyperplane between them]
Linear Discriminant Functions
d(x) = wTx
where x = [x1 x2 … xn 1]T is the augmented pattern vector and w = [w1 w2 … wn wn+1]T is the weight vector.
Decision Rule: if d(x) > 0 decide C1; if d(x) < 0 decide C2.
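A minimal sketch of this decision rule in Python (the numeric values are arbitrary illustrations, not from the lecture):

    import numpy as np

    w = np.array([1.0, -2.0, 0.5])     # weight vector [w1, w2, w_{n+1}]
    x = np.array([3.0, 1.0])           # raw pattern
    x_aug = np.append(x, 1.0)          # augmented pattern vector [x1, x2, 1]
    d = w @ x_aug                      # d(x) = wT x = 3 - 2 + 0.5 = 1.5
    print("C1" if d > 0 else "C2")     # decision rule: d(x) > 0 -> C1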
Review: Perceptron Algorithm
Finds a hyperplane that separates two sets of patterns.
Algorithm: present the training samples cyclically; for each new training sample x(k), update the weight vector w(k) as follows:
if x(k) is from C1 and wT(k)x(k) ≤ 0: w(k+1) = w(k) + c x(k)
if x(k) is from C2 and wT(k)x(k) ≥ 0: w(k+1) = w(k) − c x(k)
otherwise: w(k+1) = w(k)
Stop when a full pass produces no corrections. Note: if c is too large we may not get convergence.
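A minimal sketch of this fixed-increment rule in Python (NumPy assumed; the sample data and the choice c = 1 are illustrative, not from the lecture):

    import numpy as np

    def perceptron(samples_c1, samples_c2, c=1.0, max_epochs=100):
        # Augment each pattern with a trailing 1
        X1 = np.hstack([samples_c1, np.ones((len(samples_c1), 1))])
        X2 = np.hstack([samples_c2, np.ones((len(samples_c2), 1))])
        w = np.zeros(X1.shape[1])
        for _ in range(max_epochs):
            changed = False
            for x in X1:                      # want wT x > 0
                if w @ x <= 0:
                    w = w + c * x
                    changed = True
            for x in X2:                      # want wT x < 0
                if w @ x >= 0:
                    w = w - c * x
                    changed = True
            if not changed:                   # full pass, no corrections
                break
        return w

    # Illustrative data: two linearly separable clusters
    C1 = np.array([[2.0, 2.0], [3.0, 1.0]])
    C2 = np.array([[-1.0, -1.0], [-2.0, 0.0]])
    print(perceptron(C1, C2))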
View of Perceptron Correction in Pattern Space
[Figure: a misclassified x(k) pulls the weight vector via w(k+1) = w(k) + c x(k), rotating the hyperplane toward correct classification of x(k)]
Fractional Correction Perceptron Algorithm
Weight Update Equation (applied when sample x(k) is misclassified):
w(k+1) = w(k) ± λ [ |wT(k)x(k)| / xT(k)x(k) ] x(k)
with + when x(k) is from C1 and − when x(k) is from C2,
where λ is the fraction of the distance to the hyperplane wTx = 0 that is corrected, 0 < λ ≤ 2; λ = 1 places the new weight vector exactly on the hyperplane.
[Figures: successive fractional corrections of the weight vector, viewed in the original pattern space]
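A sketch of the fractional-correction update in Python (a single pass over augmented samples; λ = 1.5 is an arbitrary choice):

    import numpy as np

    def fractional_correction_pass(X1, X2, w, lam=1.5):
        """One pass of the fractional-correction rule over augmented
        samples: rows of X1 should give w.x > 0, rows of X2 w.x < 0.
        lam (0 < lam <= 2) is the fraction of the distance to the
        hyperplane w.x = 0 that each correction covers."""
        for x in X1:
            d = w @ x
            if d <= 0:                                  # misclassified C1
                w = w + lam * (abs(d) / (x @ x)) * x
        for x in X2:
            d = w @ x
            if d >= 0:                                  # misclassified C2
                w = w - lam * (abs(d) / (x @ x)) * x
        return w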
Simplified Perceptron Algorithm
Given samples from two classes C1 and C2 (in augmented form), we want a weight vector w such that
wTx > 0 for all x from C1 and wTx < 0 for all x from C2.
Negating the augmented patterns of class C2 turns the second condition into wT(−x) > 0.
We could define the two sets
C1′ = {augmented patterns of C1} and C2′ = {negated augmented patterns of C2}.
Consider the combined set of samples x in C1′ ∪ C2′; then the weight vector update is
w(k+1) = w(k) + c x(k) if wT(k)x(k) ≤ 0, otherwise w(k+1) = w(k).
That is, all of the samples should lie on the positive side of the hyperplane boundary.
Simplified algorithm:
1. Augment all patterns in C1 and C2.
2. Negate the augmented patterns of C2.
3. Combine the two sets above to form one set of samples.
4. Iterate through this set using the weight update
w(k+1) = w(k) + c x(k) if wT(k)x(k) ≤ 0, otherwise w(k+1) = w(k).
This is a simplification of the perceptron algorithm in that it contains only one if branch. A code sketch follows below.
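A minimal sketch of steps 1-4 in Python (illustrative, with c = 1 by default):

    import numpy as np

    def simplified_perceptron(samples_c1, samples_c2, c=1.0, max_epochs=100):
        # Steps 1-3: augment both classes, negate C2's patterns, combine
        X1 = np.hstack([samples_c1, np.ones((len(samples_c1), 1))])
        X2 = -np.hstack([samples_c2, np.ones((len(samples_c2), 1))])
        X = np.vstack([X1, X2])
        w = np.zeros(X.shape[1])
        # Step 4: iterate with the single-branch update
        for _ in range(max_epochs):
            changed = False
            for x in X:
                if w @ x <= 0:          # the only "if" branch
                    w = w + c * x
                    changed = True
            if not changed:             # all samples on the positive side
                break
        return w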
Derivation of the Perceptron Algorithm
Define the following performance measure for a single (augmented, negated) sample x:
J(w) = ½ ( |wTx| − wTx )
so that J(w) = 0 when wTx > 0 (x correctly classified) and J(w) = |wTx| > 0 when wTx < 0.
Minimizing J for each sample will satisfy the conditions desired. To minimize J we can use the gradient algorithm, where the weight update is
w(k+1) = w(k) − c ∂J/∂w evaluated at w = w(k).
The partial derivatives with respect to each weight are determined as
∂J/∂wi = ½ ( sgn(wTx) − 1 ) xi = { 0 if wTx > 0 ; −xi if wTx < 0 }   for i = 1, 2, …, n+1.
Substituting these partials into the weight update equation yields the following, for i = 1, 2, …, n+1:
wi(k+1) = wi(k) + c xi(k) if wT(k)x(k) < 0, and wi(k+1) = wi(k) otherwise.
This can be written in the vector form
w(k+1) = w(k) + c x(k) if wT(k)x(k) < 0, w(k+1) = w(k) otherwise,
which is the perceptron algorithm in augmented and negated form. (end of proof)
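A quick numerical check of the gradient used above (a sketch; the sample and weights are arbitrary values chosen so that wTx < 0):

    import numpy as np

    def J(w, x):
        d = w @ x
        return 0.5 * (abs(d) - d)      # 0 if w.x >= 0, |w.x| if w.x < 0

    x = np.array([1.0, 2.0, 1.0])      # an augmented sample
    w = np.array([-1.0, 0.2, 0.0])     # chosen so that w @ x < 0
    eps = 1e-6
    grad = np.array([(J(w + eps * e, x) - J(w - eps * e, x)) / (2 * eps)
                     for e in np.eye(len(w))])
    print(grad)                        # approximately [-1, -2, -1] = -x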
Other Perceptron-like Algorithms
If we use different performance measures in the preceding proof, we get perceptron-like algorithms that accomplish the separation of classes.
Other meaningful performance measures exist. Each leads to a different weight update equation, and each requires its own convergence theorem to be useful.
So far we have only worked with separating two classes, but most problems have more than two.
Question: Can we modify the perceptron algorithm to work with more than two classes?
Answer: Yes, for certain special cases.
Case 1. Pattern classes group separable
Case 1. K pattern classes, group separable
Given: Sk is the set of samples from class Ck, for k = 1, 2, …, K.
Define: S = S1 ∪ S2 ∪ … ∪ SK.
Assume: the pattern classes are group separable, i.e., each Sk is linearly separable from its complement:
S1 linearly separable from S1′ = S − S1
S2 linearly separable from S2′ = S − S2
…
SK linearly separable from SK′ = S − SK
Find K linear discriminant functions that separate the Sk in the following manner. For k = 1, 2, …, K:
dk(x) = wk1x1 + wk2x2 + … + wknxn + wk,n+1
with dk(x) > 0 for x in Sk and dk(x) < 0 for x in Sk′.
Decision Rule: if dj(x) > 0, then decide Cj.
Solution: find each dk(x) by using the Perceptron algorithm on the two classes Sk and Sk′.
Solution (continued):
(1) Find d1(x) by using the Perceptron algorithm on the two sets S1 and S1′.
(2) Find d2(x) by using the Perceptron algorithm on the two sets S2 and S2′.
…
(K) Find dK(x) by using the Perceptron algorithm on the two sets SK and SK′.
Solution (continued): this gives the decision boundaries, for k = 1, 2, …, K, as
0 = wk1x1 + wk2x2 + … + wknxn + wk,n+1
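A sketch of this one-versus-rest construction in Python, reusing the simplified_perceptron function from the earlier sketch (the helper names train_group_separable and classify_group are mine, not the lecture's):

    import numpy as np

    def train_group_separable(class_samples, c=1.0, max_epochs=100):
        """class_samples: list of K arrays, one per class Ck.
        Trains one weight vector per class so that d_k(x) > 0 on S_k
        and d_k(x) < 0 on the complement S_k' = S - S_k."""
        weights = []
        for k, Sk in enumerate(class_samples):
            rest = np.vstack([s for j, s in enumerate(class_samples) if j != k])
            weights.append(simplified_perceptron(Sk, rest, c, max_epochs))
        return weights

    def classify_group(weights, x):
        """Decide C_k if d_k(x) > 0; ambiguous regions return None."""
        x_aug = np.append(x, 1.0)
        positive = [k for k, w in enumerate(weights) if w @ x_aug > 0]
        return positive[0] if len(positive) == 1 else None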
Case 2. Pattern classes pairwise separable
Case 2. Pattern classes pairwise separable
Find linear discriminant functions that separate all pairs of the Sk in the following manner. For k = 1, 2, …, K and j = 1, 2, …, K with j ≠ k:
dkj(x) = wkj1x1 + wkj2x2 + … + wkjnxn + wkj,n+1
with dkj(x) > 0 for x in Sk and dkj(x) < 0 for x in Sj.
Decision Rule: for k = 1, 2, …, K, if dkj(x) > 0 for all j ≠ k, then decide Ck. On boundaries decide randomly.
Solution: for k = 1, 2, …, K and j = 1, 2, …, K with j ≠ k, find dkj(x) by using the Perceptron algorithm on the two classes Sk and Sj.
Notes: (1) dkj(x) = −djk(x). (2) This requires determining only K(K−1)/2 discriminant functions.
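A sketch of the pairwise construction in Python, again reusing the simplified_perceptron sketch from earlier (the helper names are illustrative):

    import numpy as np
    from itertools import combinations

    def train_pairwise(class_samples, c=1.0, max_epochs=100):
        """Train d_kj for each unordered pair k < j with the two-class
        perceptron; d_jk(x) = -d_kj(x), so only K(K-1)/2 runs are needed."""
        d = {}
        for k, j in combinations(range(len(class_samples)), 2):
            d[(k, j)] = simplified_perceptron(class_samples[k],
                                              class_samples[j], c, max_epochs)
        return d

    def classify_pairwise(d, x, K):
        """Decide C_k if d_kj(x) > 0 for all j != k; else return None
        (the lecture's rule decides randomly on boundaries)."""
        x_aug = np.append(x, 1.0)
        def dkj(k, j):
            return d[(k, j)] @ x_aug if k < j else -(d[(j, k)] @ x_aug)
        for k in range(K):
            if all(dkj(k, j) > 0 for j in range(K) if j != k):
                return k
        return None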
Case 3. K pattern classes separable by K discriminant functions
Here a single set of K discriminant functions d1(x), …, dK(x) is found such that, for every sample x from Ck, dk(x) > dj(x) for all j ≠ k; the decision rule selects the class with the maximum discriminant.
Flow Diagram for Case 3: Perceptron separability
[Flow diagram: compute d1(x), d2(x), …, dK(x) in parallel and pass them to a maximum selector, which outputs the decided class]
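A sketch of the maximum-selector decision the flow diagram suggests (assuming K weight vectors trained elsewhere; classify_max is my name, not the lecture's):

    import numpy as np

    def classify_max(weights, x):
        """Case 3 decision: evaluate all K discriminants on the augmented
        pattern and select the class with the largest value."""
        x_aug = np.append(x, 1.0)
        return int(np.argmax([w @ x_aug for w in weights]))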
Summary, Lecture 17
1. View of Perceptron Algorithm in Pattern Space
2. Fractional Correction Perceptron Algorithm
3. Simplified Perceptron Algorithm
4. Derivation of the Perceptron Algorithm
5. Extension of Perceptron Algorithm to the M-Class Case: 3 Special Cases
End of Lecture 17