What is Unsupervised Learning? Learning without a teacher. No feedback to indicate the desired outputs. The network must by itself discover the relationship of interest from the input data.
The Nearest Neighbor Classifier [Figure: stored samples x(1), x(2), x(3), x(4) and a new input whose class is to be determined.]
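The nearest neighbor idea can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and made-up 2-D samples and labels; none of the data or names come from the slides.

```python
import math

def nearest_neighbor(prototypes, labels, query):
    """Return the label of the stored prototype closest to `query`.

    Euclidean distance is an assumption made for this sketch.
    """
    best = min(range(len(prototypes)),
               key=lambda k: math.dist(prototypes[k], query))
    return labels[best]

# Illustrative data: four stored samples x(1)..x(4) with hypothetical labels.
prototypes = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["A", "A", "B", "B"]

print(nearest_neighbor(prototypes, labels, (0.5, 0.2)))  # closest to x(1)
print(nearest_neighbor(prototypes, labels, (5.4, 4.8)))  # closest to x(3)
```

The query simply inherits the class of whichever stored sample it is closest to.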
The Hamming Networks Store a set of classes represented by a set of binary prototypes. Given an incomplete binary input, find the class to which it belongs. Use Hamming distance as the distance measure. Distance vs. similarity.
The Hamming Net [Figure: inputs x1, x2, ..., xn feed a Similarity Measurement layer, followed by a MAXNET (Winner-Take-All) layer.]
The Hamming Distance [Figure: worked example comparing two vectors x and y; the individual vector entries were lost in extraction. The example reports Hamming Distance = 3 and Sum = 1.]
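The Hamming distance can be computed either by counting mismatched positions or, for bipolar (+1/-1) vectors, from the dot product via HD(x, y) = (n - x·y) / 2. The sketch below shows both. The two vectors are illustrative, chosen only so that they agree with the distance of 3 and the sum of 1 that survive on the slide; they are not the slide's vectors, and reading the slide's "Sum" as the dot product is an assumption.

```python
def hamming_distance(x, y):
    """Count the positions at which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def hamming_from_dot(x, y):
    """Hamming distance of two bipolar (+1/-1) vectors via the dot product.

    Each matching position contributes +1 and each mismatch -1 to the dot
    product, so dot = n - 2 * HD and HD = (n - dot) / 2.
    """
    n = len(x)
    dot = sum(a * b for a, b in zip(x, y))
    return (n - dot) // 2

# Illustrative bipolar vectors (assumed, not taken from the slide).
x = [1, -1, 1, -1, 1, -1, 1]
y = [1, 1, 1, -1, -1, -1, -1]

print(hamming_distance(x, y))
print(sum(a * b for a, b in zip(x, y)))
print(hamming_from_dot(x, y))
```

Because the dot product is a weighted sum of the inputs, a single neuron can compute it, which is what makes the dot-product form useful for a network.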
The Hamming Net [Figure: inputs x1, x2, ..., xm feed the Similarity Measurement layer (nodes 1, ..., n), whose outputs feed the MAXNET (Winner-Take-All) layer with outputs y1, y2, ..., yn. Weights to be determined: W_S = ? and W_M = ?]
The Stored Patterns [Figure: the same two-layer network (Similarity Measurement, then MAXNET, with W_S = ? and W_M = ?), followed by a single Similarity Measurement node k with inputs x1, x2, ..., xm and a constant term m/2.]
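One common textbook way to answer the W_S and W_M questions is sketched below. It is an assumption, not something the surviving slide text confirms. Row k of W_S is half of stored bipolar pattern k and each node gets a bias of m/2, so node k outputs m minus the Hamming distance to pattern k. The MAXNET uses self-excitation 1 and mutual inhibition -epsilon with 0 < epsilon < 1/n. The stored patterns, the input, and all function names are hypothetical.

```python
def similarity_layer(stored, x):
    """Node k outputs 0.5 * (s_k . x) + m/2, which equals m - HD(x, s_k)."""
    m = len(x)
    return [0.5 * sum(s_i * x_i for s_i, x_i in zip(s, x)) + m / 2
            for s in stored]

def maxnet(scores, epsilon=None, max_iters=1000):
    """Winner-take-all by repeated mutual inhibition.

    Each node keeps its own activation and subtracts epsilon times the sum
    of the others; negative activations are clipped to zero. Stops when at
    most one node is still active, or after max_iters (exact ties at the
    maximum never resolve on their own).
    """
    n = len(scores)
    if epsilon is None:
        epsilon = 1.0 / (2 * n)  # assumed choice inside 0 < epsilon < 1/n
    y = list(scores)
    for _ in range(max_iters):
        total = sum(y)
        y = [max(0.0, v - epsilon * (total - v)) for v in y]
        if sum(1 for v in y if v > 0) <= 1:
            break
    return y

# Hypothetical stored bipolar patterns (one per class) and an input.
stored = [[1, 1, 1, 1], [1, -1, -1, 1], [-1, -1, -1, -1]]
x = [1, 1, 1, -1]

scores = similarity_layer(stored, x)
out = maxnet(scores)
winner = max(range(len(out)), key=lambda k: out[k])
print(scores, out, winner)
```

The first layer turns distance into similarity, so the closest stored pattern has the largest score, and the MAXNET then suppresses every node except that one.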
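Weight update:
– Method 1: $\Delta w_j = \eta (i_l - w_j)$
– Method 2: $\Delta w_j = \eta i_l$
In each method, $w_j$ is moved closer to $i_l$
– Normalize the weight vector to unit length after it is updated
– Sample input vectors are also normalized
– Distance (formula missing in the source)
[Figure: vector diagrams for the two methods. Method 1 moves $w_j$ to $w_j + \eta (i_l - w_j)$ along the direction $i_l - w_j$; Method 2 moves $w_j$ to $w_j + \eta i_l$, in the direction of $i_l + w_j$.]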
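The two update rules can be compared side by side. This is a minimal sketch, assuming 2-D vectors, a learning rate of 0.5, and the helper names shown, all of which are illustrative. Both functions renormalize the weight vector to unit length after the update, as the slide prescribes.

```python
import math

def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def update_method1(w, i, eta):
    """Method 1: w <- w + eta * (i - w), then renormalize."""
    return normalize([wc + eta * (ic - wc) for wc, ic in zip(w, i)])

def update_method2(w, i, eta):
    """Method 2: w <- w + eta * i, then renormalize."""
    return normalize([wc + eta * ic for wc, ic in zip(w, i)])

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative unit vectors: the winner's weight and one training sample.
w = normalize([1.0, 0.0])
i = normalize([0.0, 1.0])

w1 = update_method1(w, i, eta=0.5)
w2 = update_method2(w, i, eta=0.5)
print(cosine(w, i), cosine(w1, i), cosine(w2, i))
```

In both cases the cosine between the weight vector and the sample increases, which is the sense in which the winner is moved closer to the sample.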
$w_j$ moves toward the center of a cluster of sample vectors after repeated weight updates –Node j wins for three training samples: $i_1$, $i_2$, and $i_3$ –Initial weight vector $w_j(0)$ –After being trained successively on $i_1$, $i_2$, and $i_3$, the weight vector changes to $w_j(1)$, $w_j(2)$, and $w_j(3)$ [Figure: samples $i_1$, $i_2$, $i_3$ and the successive weight vectors $w_j(0)$, $w_j(1)$, $w_j(2)$, $w_j(3)$.]
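The drift toward the cluster center can be traced numerically. The sketch below assumes that node j wins for all three samples (as in the slide's scenario), uses the Method 1 update with renormalization, and invents three 2-D samples and a learning rate of 0.3 purely for illustration.

```python
import math

def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def train_winner(w, samples, eta):
    """Apply w <- normalize(w + eta * (i - w)) for each sample in turn.

    Returns the list [w(0), w(1), ..., w(len(samples))].
    """
    history = [w]
    for i in samples:
        w = normalize([wc + eta * (ic - wc) for wc, ic in zip(w, i)])
        history.append(w)
    return history

# Hypothetical samples i1, i2, i3 from one cluster, normalized to unit length.
samples = [normalize(v) for v in ([1.0, 0.2], [0.9, 0.5], [0.8, 0.7])]
w0 = normalize([0.0, 1.0])

history = train_winner(w0, samples, eta=0.3)
center = normalize([sum(c) / len(samples) for c in zip(*samples)])

for t, w in enumerate(history):
    cos = sum(a * b for a, b in zip(w, center))
    print(f"w_j({t}) = [{w[0]:.3f}, {w[1]:.3f}]  cos to center = {cos:.3f}")
```

Each step pulls the weight vector partway toward the current sample, so over the three updates it ends up pointing much closer to the cluster's mean direction than it did initially.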
Example: one weight vector will always win no matter which class the sample is from; the other is stuck and will not participate in learning. To get it unstuck, let output nodes have some conscience: temporarily shut off nodes that have had a very high winning rate (it is hard to determine what rate should be considered “very high”). [Figure: weight vectors w1 and w2 relative to the sample clusters.]
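A conscience mechanism can be sketched as follows. This is only one possible reading: a node is excluded from the competition while its share of past wins exceeds `max_rate`, a hypothetical threshold picked for illustration (the slide notes that choosing it is hard). The samples, initial weights, and learning rate are also made up.

```python
import math

def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def train(samples, weights, eta=0.5, max_rate=0.7):
    """Competitive learning with a simple win-rate conscience.

    Returns (wins, weights). Setting max_rate=1.0 disables the conscience,
    since a win rate can never exceed 1.
    """
    weights = [list(w) for w in weights]
    wins = [0] * len(weights)
    for total, x in enumerate(samples):
        # Nodes with a too-high winning rate are temporarily shut off.
        eligible = [j for j in range(len(weights))
                    if total == 0 or wins[j] / total <= max_rate]
        if not eligible:  # fall back if every node is shut off
            eligible = list(range(len(weights)))
        # For unit vectors, the largest dot product is the nearest weight.
        j = max(eligible,
                key=lambda k: sum(a * b for a, b in zip(weights[k], x)))
        wins[j] += 1
        weights[j] = normalize([wc + eta * (xc - wc)
                                for wc, xc in zip(weights[j], x)])
    return wins, weights

# Two hypothetical classes; the first weight vector starts near both,
# the second starts far from both and would otherwise stay stuck.
a, b = normalize([0.9, 0.44]), normalize([0.44, 0.9])
samples = [a, b] * 5
init = [normalize([1.0, 1.0]), [-1.0, 0.0]]

print(train(samples, init, max_rate=1.0)[0])  # no conscience
print(train(samples, init, max_rate=0.7)[0])  # with conscience
```

Without the conscience the far-away node never wins; with it, the dominant node is briefly shut off, the stuck node wins a sample, and its weight vector is pulled toward the data so it can take part in later competitions.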
Example: results depend on the sequence of sample presentation. Solution: initialize each $w_j$ to a randomly selected input vector $i_l$, choosing input vectors that are far away from each other. [Figure: weight vectors w1 and w2 before and after applying the solution.]
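One way to obtain initial weights that are input vectors far from each other is a farthest-first selection: pick one sample at random, then repeatedly add the sample whose distance to the nearest already chosen sample is largest. This heuristic is an assumption swapped in for illustration; the slide does not say how "far away" is enforced. The data and the function name are hypothetical.

```python
import math
import random

def init_far_apart(samples, k, rng=None):
    """Choose k samples as initial weight vectors, spread far apart.

    The first is drawn at random; each later one maximizes its distance
    to the closest sample chosen so far (farthest-first heuristic).
    """
    rng = rng or random.Random(0)
    chosen = [rng.choice(samples)]
    while len(chosen) < k:
        nxt = max(samples,
                  key=lambda s: min(math.dist(s, c) for c in chosen))
        chosen.append(nxt)
    return [list(c) for c in chosen]

# Two illustrative clusters of 2-D samples.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

w_init = init_far_apart(samples, k=2)
print(w_init)
```

With two well-separated clusters, the two initial weight vectors land in different clusters regardless of which sample is drawn first, which reduces the dependence on presentation order.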