1
Introduction to Neural Networks John Paxton Montana State University Summer 2003
2
Chapter 4: Competition
Force a decision (yes, no, maybe) to be made. Winner take all is a common approach.
Kohonen learning: w_j(new) = w_j(old) + α(x – w_j(old)), where w_j is the weight vector closest to x, as determined by Euclidean distance.
3
MaxNet (Lippman, 1987)
Fixed-weight competitive net.
Activation function: f(x) = x if x > 0, else 0.
Architecture: two units a_1 and a_2, each with a self-connection of weight 1 and mutual inhibitory connections of weight –ε.
4
Algorithm
1. w_ij = 1 if i = j, otherwise –ε
2. a_j(0) = s_j, t = 0
3. a_j(t+1) = f[ a_j(t) – ε Σ_{k≠j} a_k(t) ]
4. go to step 3 if more than one node has a non-zero activation
Special case: more than one node has the same maximum activation.
5
Example
s_1 = .5, s_2 = .1, ε = .1
a_1(0) = .5, a_2(0) = .1
a_1(1) = .49, a_2(1) = .05
a_1(2) = .485, a_2(2) = .001
a_1(3) = .4849, a_2(3) = 0
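A minimal Python sketch of the MaxNet iteration above, reproducing this example (ε = .1, initial activations .5 and .1); the function name maxnet and the step limit are my own choices, not part of the original slides.

```python
# MaxNet sketch: repeatedly apply a_j(t+1) = f[a_j(t) - eps * sum of the other activations].
def maxnet(activations, eps=0.1, max_steps=100):
    f = lambda x: x if x > 0 else 0.0              # activation: pass positives, clamp the rest to 0
    a = list(activations)
    for _ in range(max_steps):
        total = sum(a)
        a = [f(a_j - eps * (total - a_j)) for a_j in a]
        if sum(1 for a_j in a if a_j > 0) <= 1:    # stop once at most one node is still active
            break
    return a

print(maxnet([0.5, 0.1]))   # node 1 survives at about .4849, node 2 is driven to 0
```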
6
Mexican Hat (Kohonen, 1989)
Contrast enhancement.
Architecture (w_0, w_1, w_2, w_3): unit x_i receives weight w_0 from itself, w_1 from its radius-1 neighbors x_{i-1} and x_{i+1}, w_2 from its radius-2 neighbors, and so on.
For neighbors x_{i-3} … x_{i+3}, the signs of the connections into x_i are 0, –, +, +, +, –, 0: excitatory near the unit, inhibitory farther out, zero beyond that.
7
Algorithm
1. initialize weights
2. x_i(0) = s_i
3. for some number of steps do
4.   x_i(t+1) = f[ Σ_k w_k x_{i+k}(t) ]
5.   x_i(t+1) = max(0, x_i(t+1))
8
Example
Units x_1, x_2, x_3, x_4, x_5
radius 0 weight = 1
radius 1 weight = 1
radius 2 weight = –.5
all other radii weights = 0
s = (0 .5 1 .5 0)
f(x) = 0 if x < 0, x if 0 <= x <= 2, 2 otherwise
9
Example
x(0) = (0 .5 1 .5 0)
x_1(1) = 1(0) + 1(.5) – .5(1) = 0
x_2(1) = 1(0) + 1(.5) + 1(1) – .5(.5) = 1.25
x_3(1) = –.5(0) + 1(.5) + 1(1) + 1(.5) – .5(0) = 2.0
x_4(1) = 1.25
x_5(1) = 0
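A short Python sketch of one Mexican Hat update using the radius weights and clipping function from this example; the function name mexican_hat_step and its parameters are illustrative choices.

```python
# One Mexican Hat update step for the 5-unit example above.
# Radius-0 and radius-1 weights are 1, the radius-2 weight is -0.5, all other radii are 0.
def mexican_hat_step(x, weights=(1.0, 1.0, -0.5), lo=0.0, hi=2.0):
    n, r = len(x), len(weights) - 1
    new_x = []
    for i in range(n):
        s = 0.0
        for k in range(-r, r + 1):                 # only radii 0..r contribute
            if 0 <= i + k < n:
                s += weights[abs(k)] * x[i + k]
        new_x.append(min(max(s, lo), hi))          # f clips the result to [0, 2]
    return new_x

x0 = [0.0, 0.5, 1.0, 0.5, 0.0]
print(mexican_hat_step(x0))   # -> [0.0, 1.25, 2.0, 1.25, 0.0], matching the worked example
```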
10
Why the name? Plot x(0) and x(1) against unit position x_1 … x_5: the resulting activation curve is shaped like a Mexican hat.
11
Hamming Net (Lippman, 1987)
Maximum likelihood classifier.
The similarity of two vectors is taken to be n – H(v_1, v_2), where H is the Hamming distance and n is the vector length.
Uses MaxNet with this similarity metric.
12
Architecture
Concrete example: three input units x_1, x_2, x_3 feed two similarity units y_1 and y_2, whose outputs go into a MaxNet.
13
Algorithm
1. w_ij = s_i(j)/2
2. n is the dimensionality of a vector
3. y_in.j = Σ_i x_i w_ij + n/2
4. select max(y_in.j) using MaxNet
14
Example
Training examples: (1 1 1), (-1 -1 -1)
n = 3
y_in.1 = 1(.5) + 1(.5) + 1(.5) + 1.5 = 3
y_in.2 = 1(-.5) + 1(-.5) + 1(-.5) + 1.5 = 0
These two quantities are the similarities n – H, i.e. the number of components on which the input agrees with each exemplar. They are then fed into MaxNet.
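A small Python sketch of the Hamming net similarity computation (w_ij = s_i(j)/2, bias n/2) for the two exemplars above; the winner would then be selected by MaxNet. The function name is hypothetical.

```python
# Hamming net sketch: y_in.j = sum_i x_i * w_ij + n/2 with w_ij = s_i(j)/2.
# Each y_in.j equals the number of components where x agrees with exemplar j, i.e. n - H(x, s(j)).
def hamming_similarities(x, exemplars):
    n = len(x)
    return [sum(x_i * (s_i / 2.0) for x_i, s_i in zip(x, s)) + n / 2.0 for s in exemplars]

exemplars = [(1, 1, 1), (-1, -1, -1)]
print(hamming_similarities((1, 1, 1), exemplars))   # -> [3.0, 0.0]; MaxNet then selects unit 1
```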
15
Kohonen Self-Organizing Maps (Kohonen, 1989)
Maps inputs onto one of m clusters.
Human brains seem to be able to self-organize.
16
Architecture
Input units x_1 … x_n are fully connected to cluster units y_1 … y_m.
17
Neighborhoods
Linear (distances from the winning unit #):
3 2 1 # 1 2 3
Rectangular (grid distances from the winning unit #):
2 2 2 2 2
2 1 1 1 2
2 1 # 1 2
2 1 1 1 2
2 2 2 2 2
18
Algorithm
1. initialize w_ij
2. select topology of y_i
3. select learning rate parameters
4. while stopping criteria not reached
5.   for each input vector do
6.     compute D(j) = Σ_i (w_ij – x_i)^2 for each j
19
Algorithm (continued)
7.     select minimum D(j)
8.     update neighborhood units: w_ij(new) = w_ij(old) + α[x_i – w_ij(old)]
9.   update α
10.  reduce radius of neighborhood at specified times
20
Example
Place (1 1 0 0), (0 0 0 1), (1 0 0 0), (0 0 1 1) into two clusters.
α(0) = .6, α(t+1) = .5 * α(t)
Random initial weights (rows are inputs, columns are cluster units 1 and 2):
.2 .8
.6 .4
.5 .7
.9 .3
21
Example
Present (1 1 0 0).
D(1) = (.2 – 1)^2 + (.6 – 1)^2 + (.5 – 0)^2 + (.9 – 0)^2 = 1.86
D(2) = .98
D(2) wins!
22
Example
w_i2(new) = w_i2(old) + .6[x_i – w_i2(old)]
New weights (cluster 1 unchanged, cluster 2 updated):
.2 .92 (bigger)
.6 .76 (bigger)
.5 .28 (smaller)
.9 .12 (smaller)
This example assumes no neighborhood.
23
Example
After many epochs the weights converge to approximately:
0 1
0 .5
.5 0
1 0
(1 1 0 0) -> category 2
(0 0 0 1) -> category 1
(1 0 0 0) -> category 2
(0 0 1 1) -> category 1
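A compact Python sketch of the self-organizing map algorithm applied to this two-cluster example (no neighborhood, α(0) = .6 halved each epoch, the same initial weights); the function and variable names and the epoch count are my own choices.

```python
# Kohonen SOM sketch for the two-cluster example above (no neighborhood updates).
def train_som(vectors, weights, alpha=0.6, epochs=10):
    # weights[j] is the weight vector of cluster unit j
    for _ in range(epochs):
        for x in vectors:
            # squared Euclidean distance D(j) = sum_i (w_ij - x_i)^2
            d = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
            j = d.index(min(d))                                     # winning unit
            weights[j] = [w_i + alpha * (x_i - w_i)                 # move the winner toward x
                          for w_i, x_i in zip(weights[j], x)]
        alpha *= 0.5                                                # reduce the learning rate
    return weights

vectors = [(1, 1, 0, 0), (0, 0, 0, 1), (1, 0, 0, 0), (0, 0, 1, 1)]
weights = [[0.2, 0.6, 0.5, 0.9],    # cluster 1 initial weights
           [0.8, 0.4, 0.7, 0.3]]    # cluster 2 initial weights
print(train_som(vectors, weights))  # cluster 2 ends near (1, .5, 0, 0), cluster 1 near (0, 0, .5, 1)
```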
24
Applications
Grouping characters
Travelling Salesperson Problem
– Cluster units can be represented graphically by weight vectors
– Linear neighborhoods can be used with the first and last cluster units connected
25
Learning Vector Quantization Kohonen, 1989 Supervised learning There can be several output units per class
26
Architecture
Like Kohonen nets, but no topology is assumed for the output units.
Each y_i represents a known class.
Input units x_1 … x_n are fully connected to output units y_1 … y_m.
27
Algorithm
1. initialize the weights (first m training examples, or random)
2. choose α
3. while stopping criteria not reached do (e.g., number of iterations, or α is very small)
4.   for each training vector do
28
Algorithm (continued)
5.   find minimum ||x – w_j||
6.   if the winning unit's class is the target class:
       w_j(new) = w_j(old) + α[x – w_j(old)]
     else:
       w_j(new) = w_j(old) – α[x – w_j(old)]
7.   reduce α
29
Example
(1 1 -1 -1) belongs to category 1
(-1 -1 -1 1) belongs to category 2
(-1 -1 1 1) belongs to category 2
(1 -1 -1 -1) belongs to category 1
(-1 1 1 -1) belongs to category 2
2 output units: y_1 represents category 1 and y_2 represents category 2
30
Example
Initial weights (where did these come from? — the first two training examples, per step 1):
w_1 = (1 1 -1 -1), w_2 = (-1 -1 -1 1)
α = .1
31
Example
Present training example 3, (-1 -1 1 1). It belongs to category 2.
D(1) = (1 + 1)^2 + (1 + 1)^2 + (-1 – 1)^2 + (-1 – 1)^2 = 16
D(2) = 4
Category 2 wins. That is correct!
32
Example
w_2(new) = (-1 -1 -1 1) + .1[(-1 -1 1 1) – (-1 -1 -1 1)] = (-1 -1 -.8 1)
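A Python sketch of one LVQ pass over the remaining training vectors, starting from the codebook above with α = .1; the names lvq_epoch, codebook, and classes are illustrative, not from the original slides.

```python
# LVQ sketch: move the winning codebook vector toward x if its class is correct, away otherwise.
def lvq_epoch(data, codebook, classes, alpha=0.1):
    # data: list of (x, target_class); codebook[j] has class classes[j]
    for x, target in data:
        d = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in codebook]
        j = d.index(min(d))                                   # winning output unit
        sign = 1.0 if classes[j] == target else -1.0
        codebook[j] = [w_i + sign * alpha * (x_i - w_i)
                       for w_i, x_i in zip(codebook[j], x)]
    return codebook

codebook = [[1, 1, -1, -1], [-1, -1, -1, 1]]   # initialized from the first two training examples
classes = [1, 2]
data = [((-1, -1, 1, 1), 2), ((1, -1, -1, -1), 1), ((-1, 1, 1, -1), 2)]   # remaining examples
print(lvq_epoch(data, codebook, classes))      # first update reproduces w_2 = (-1, -1, -.8, 1)
```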
33
Issues
How many y_i should be used?
How should we choose the class that each y_i should represent?
LVQ2 and LVQ3 are enhancements to LVQ that sometimes modify the runner-up as well.
34
Counterpropagation (Hecht-Nielsen, 1987)
There are input, output, and clustering layers.
Can be used to compress data.
Can be used to approximate functions.
Can be used to associate patterns.
35
Stages Stage 1: Cluster input vectors Stage 2: Adapt weights from cluster units to output units
36
Stage 1 Architecture
Input units x_1 … x_n and output units y_1 … y_m both connect to the cluster units z_1 … z_p (weights w_11 … and v_11 …).
37
Stage 2 Architecture
The winning cluster unit z_j connects to the output units x*_1 … x*_n and y*_1 … y*_m (weights t_j1 … and v_j1 …).
38
Full Counterpropagation
Stage 1 Algorithm
1. initialize weights, α, β
2. while stopping criteria is false do
3.   for each training vector pair do
4.     minimize ||x – w_j|| + ||y – v_j||
       w_j(new) = w_j(old) + α[x – w_j(old)]
       v_j(new) = v_j(old) + β[y – v_j(old)]
5. reduce α, β
39
Stage 2 Algorithm
1. while stopping criteria is false
2.   for each training vector pair do
3.     perform step 4 above (find the winning cluster unit)
4.     t_j(new) = t_j(old) + α[x – t_j(old)]
       v_j(new) = v_j(old) + β[y – v_j(old)]
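A rough Python sketch of full counterpropagation following the two stages above. The output-side weights that the slides write as t_j and v_j are named t and u here to keep them distinct from the stage-1 weights; the learning-rate values, epoch counts, and function names are assumptions for illustration.

```python
# Full counterpropagation sketch: stage 1 clusters (x, y) pairs, stage 2 learns the output weights.
import random

def train_full_cpn(pairs, p, alpha=0.1, beta=0.1, epochs1=50, epochs2=50):
    # pairs: list of (x, y) training vector pairs; p: number of cluster units
    n, m = len(pairs[0][0]), len(pairs[0][1])
    w = [[random.random() for _ in range(n)] for _ in range(p)]   # x-to-cluster weights
    v = [[random.random() for _ in range(m)] for _ in range(p)]   # y-to-cluster weights
    t = [[0.0] * n for _ in range(p)]                             # cluster-to-x* weights
    u = [[0.0] * m for _ in range(p)]                             # cluster-to-y* weights

    def winner(x, y):
        # choose the cluster unit minimizing ||x - w_j|| + ||y - v_j||
        return min(range(p),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(x, w[j])) ** 0.5 +
                                 sum((a - b) ** 2 for a, b in zip(y, v[j])) ** 0.5)

    for _ in range(epochs1):              # stage 1: move the cluster weights toward each pair
        for x, y in pairs:
            j = winner(x, y)
            w[j] = [wi + alpha * (xi - wi) for wi, xi in zip(w[j], x)]
            v[j] = [vi + beta * (yi - vi) for vi, yi in zip(v[j], y)]
    for _ in range(epochs2):              # stage 2: learn the output-side weights
        for x, y in pairs:
            j = winner(x, y)
            t[j] = [ti + alpha * (xi - ti) for ti, xi in zip(t[j], x)]
            u[j] = [ui + beta * (yi - ui) for ui, yi in zip(u[j], y)]
    return w, v, t, u
```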
40
Partial Example
Approximate y = 1/x over [0.1, 10.0]
1 x unit, 1 y unit, 10 z units, 1 x* unit, 1 y* unit
41
Partial Example
v_11 = .11, w_11 = 9.0
v_12 = .14, w_12 = 7.0
…
v_10,1 = 9.0, w_10,1 = .11
Test x = .12: predict 9.0.
In this example, the output weights will converge to the cluster weights.
42
Forward-Only Counterpropagation
Sometimes the function y = f(x) is not invertible.
Architecture (only 1 z unit active): input units x_1 … x_n feed cluster units z_1 … z_p, which feed output units y_1 … y_m.
43
Stage 1 Algorithm
1. initialize weights, α (.1), β (.6)
2. while stopping criteria is false do
3.   for each input vector do
4.     find minimum ||x – w||
       w(new) = w(old) + α[x – w(old)]
5. reduce α
44
Stage 2 Algorithm
1. while stopping criteria is false do
2.   for each training vector pair do
3.     find minimum ||x – w||
       w(new) = w(old) + α[x – w(old)]
       v(new) = v(old) + β[y – v(old)]
4. reduce β
Note: interpolation is possible.
45
Example
y = f(x) over [0.1, 10.0], 10 z_i units
After phase 1, the z_i cluster weights ≈ 0.5, 1.5, …, 9.5.
After phase 2, the weights from the z_i units to the output unit ≈ 5.5, 0.75, …, 0.1.
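A Python sketch of forward-only counterpropagation approximating y = 1/x on [0.1, 10.0] with 10 cluster units, following the two stages above; the sampled training inputs, learning rates, epoch counts, and function names are illustrative assumptions, not values from the slides.

```python
# Forward-only counterpropagation sketch: approximate y = 1/x with 10 cluster units.
import random

def train_forward_cpn(xs, f, p=10, alpha=0.1, beta=0.6, epochs=200):
    w = [random.uniform(0.1, 10.0) for _ in range(p)]   # x-to-cluster weights (scalar x here)
    v = [0.0] * p                                        # cluster-to-y* weights

    def winner(x):
        return min(range(p), key=lambda j: abs(x - w[j]))   # nearest cluster unit

    for _ in range(epochs):                   # stage 1: spread the cluster units over the inputs
        for x in xs:
            j = winner(x)
            w[j] += alpha * (x - w[j])
    for _ in range(epochs):                   # stage 2: learn the output weight of each cluster
        for x in xs:
            j = winner(x)
            v[j] += beta * (f(x) - v[j])
    return w, v

xs = [random.uniform(0.1, 10.0) for _ in range(500)]
w, v = train_forward_cpn(xs, lambda x: 1.0 / x)
j = min(range(len(w)), key=lambda k: abs(0.12 - w[k]))   # query at x = 0.12
print(round(w[j], 2), round(v[j], 2))                    # nearest cluster and its learned output
```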