Intelligent Control Methods Lecture 14: Neuronal Nets (Part 2) Slovak University of Technology Faculty of Material Science and Technology in Trnava
2 Decision (solution, active dynamics)
The input x_k = (x_1k, x_2k, ..., x_nk) is applied to the input layer. The neurons in each layer compute their output signals from their inputs, threshold and transfer function. The output signals are multiplied by the connection weights and passed on to the next layer. The output of the last layer is y_k = (y_1k, y_2k, ..., y_mk). With the decision y_k the net indicates what x_k is.
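The decision phase described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the sigmoid transfer function, the convention of subtracting the threshold from the weighted sum, and the names `layer_output` and `decide` are not taken from the lecture.

```python
import math

def sigmoid(s):
    # One possible transfer function (an assumption; the slide names none).
    return 1.0 / (1.0 + math.exp(-s))

def layer_output(inputs, weights, thresholds, transfer):
    # weights[i][j] is the weight of the connection from input j to neuron i.
    return [
        transfer(sum(w * x for w, x in zip(row, inputs)) - theta)
        for row, theta in zip(weights, thresholds)
    ]

def decide(x, layers, transfer=sigmoid):
    # layers is a list of (weights, thresholds) pairs, ordered input -> output.
    signal = list(x)
    for weights, thresholds in layers:
        signal = layer_output(signal, weights, thresholds, transfer)
    return signal

# Hypothetical one-layer net with two inputs and one output neuron.
tiny_net = [([[1.0, 1.0]], [0.0])]
y = decide([0.0, 0.0], tiny_net)
```

The signal is simply handed from layer to layer; no weight is changed during a decision.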
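3 Decision example: character classification
Inputs: x1 = 1, x2 = 1, x3 = 0, x4 = 1, ..., x15 = 1 (w11, ... are the connection weights).
Outputs and their rounded values:
y1 = 0.84 -> 1
y2 = 0.04 -> 0
y3 = 0.75 -> 1
y4 = 0.92 -> 1
y5 = 0.12 -> 0
y = (1,0,1,1,0) corresponds to "R".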
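The rounding step in this example can be sketched as follows. The cut value 0.5 and the names `to_code` and `CODEBOOK` are assumptions made for illustration; the codebook holds only the single entry given on the slide.

```python
def to_code(outputs, cut=0.5):
    # Round each real-valued output to 0 or 1 (the cut value is an assumption).
    return tuple(1 if y >= cut else 0 for y in outputs)

# Only the code word shown on the slide; other characters are not listed there.
CODEBOOK = {(1, 0, 1, 1, 0): "R"}

y = (0.84, 0.04, 0.75, 0.92, 0.12)
code = to_code(y)
label = CODEBOOK.get(code)
print(code, label)
```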
4 Learning (adaptive dynamics)
Theoretically possible ways of learning:
- construction of new connections
- removal of connections
- changing the threshold values of neurons
- changing the transfer functions
- changing the number of layers or the number of neurons in the layers
- changing the synaptic weights (the only way used in practice)
Basic idea: the synaptic weights are adapted to values that guarantee the proper output vector y for each input vector x.
One way (not the only one) of adapting the weights: learning with a training set.
5 Learning with training set:
Training set: M = {(x_1, b_1), ..., (x_n, b_n)}
x_k: input vector
b_k: correct output vector (response)
The real response of the net to the input x_k: y_k
Adapted (learnt) net: y_k = b_k
6 Learning with training set:
- A vector x_k from the training set is applied to the net input.
- The signal spreads through the net and the net produces an output y_k.
- y_k is compared with b_k.
- The needed changes of the synaptic weights are computed from the differences (net mistake) between y_k and b_k. The biggest weight changes are made in the connections where the differences are greatest (delta rule).
- Because the difference is measured at the output, the first calculations are performed in the output layer. The calculation then moves through the net from right to left (back propagation).
- The global net mistake is calculated after the whole training set has been used. If it is within the allowed range, the adaptation process ends. Otherwise the training set must be used again (perhaps for thousands of iterations).
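The loop described above (apply each pattern, compare, adapt, check the global mistake, repeat) can be sketched for the simplest case of a single neuron. The AND training set, the fixed threshold 0.5, the learning rate 0.1 and the function names are assumptions chosen only to keep the sketch short; a multi-layer net additionally needs the back propagation step.

```python
def step(s, theta=0.5):
    # Unit step transfer function with a fixed threshold (assumed value).
    return 1 if s > theta else 0

def train(training_set, eta=0.1, allowed_error=0.0, max_epochs=1000):
    w = [0.0] * len(training_set[0][0])      # initial synaptic weights
    for epoch in range(1, max_epochs + 1):
        global_error = 0.0
        for x, b in training_set:
            y = step(sum(wi * xi for wi, xi in zip(w, x)))
            diff = b - y                      # mistake for this pattern
            global_error += 0.5 * diff ** 2
            # Delta rule: the change is proportional to the mistake and the input.
            w = [wi + eta * diff * xi for wi, xi in zip(w, x)]
        if global_error <= allowed_error:     # mistake within the allowed range
            return w, epoch
    return w, max_epochs

AND_SET = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, epochs = train(AND_SET)
predictions = [step(sum(wi * xi for wi, xi in zip(weights, x))) for x, _ in AND_SET]
print(weights, epochs, predictions)
```

The training set is passed through repeatedly; the loop ends as soon as one complete pass produces no mistake.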
7 Learning with training set:
The net global mistake (with weights w) is built up in three steps: the mistake of output element j, the mistake of pattern k, and the global mistake over the whole training set.
The sum of squared errors is used as the mistake of pattern k, therefore the optimal weights are:
w_opt = arg min_w E(w)
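One common way of writing the three mistakes named on this slide is given below. It is a sketch, not necessarily the lecture's own notation: the symbols e_jk, E_k and the factor 1/2 are assumptions (the factor only simplifies the derivative and does not change the position of the minimum).

```latex
\begin{aligned}
e_{jk} &= y_{jk} - b_{jk}
  && \text{mistake of output element } j \text{ for pattern } k \\
E_k(w) &= \tfrac{1}{2} \sum_{j=1}^{m} \left( y_{jk} - b_{jk} \right)^2
  && \text{mistake of pattern } k \\
E(w) &= \sum_{k} E_k(w)
  && \text{global mistake over the training set} \\
w_{\mathrm{opt}} &= \arg\min_{w} E(w)
\end{aligned}
```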
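8 Learning with training set:
The changes Δw_ij are computed so that the mistake of pattern k, E_k, is minimised: each weight is moved against the gradient of E_k,
Δw_ij = -η ∂E_k/∂w_ij
η: learning rate, defines the speed of the learning process.
Notation: w_ij is the weight of the connection from neuron j to neuron i, y_ij is the signal on that connection, y_i is the output of neuron i.
The process is iterative and starts from initial synaptic weights that have been set up beforehand.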
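9 Learning with training set:
The result of the derivation is the delta rule: the change of w_ij equals the learning rate times the mistake contribution of the neuron times the signal on the connection.
For the output layer: the mistake contribution comes directly from the difference between the correct and the real output.
For a hidden layer: the mistake contributions of all neurons in the next layer, into which the neuron's output leads, are needed.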
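Written out in symbols, and in the form used by the worked XOR example later in this lecture, the delta rule can be sketched as follows. The symbol δ and the index h for neurons of the next layer are assumptions introduced here. For a differentiable transfer function f, each δ_i is additionally multiplied by the derivative f'(s_i) of the neuron's transfer function; the XOR example uses unit step functions and leaves this factor out.

```latex
\begin{aligned}
\Delta w_{ij} &= \eta \, \delta_i \, y_{ij} \\
\delta_i &= b_i - y_i
  && \text{output layer} \\
\delta_i &= \sum_{h} w_{hi} \, \delta_h
  && \text{hidden layer, } h \text{ runs over the neurons fed by neuron } i
\end{aligned}
```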
10 Learning with training set:
The weights are adapted most strongly where the mistakes are greatest.
The calculations are performed in the direction from the output layer to the input layer (back propagation).
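11 Example of neuronal net learning:
[Figure: net with inputs p (neuron 1) and q (neuron 2), hidden neurons 3 and 4, and output neuron 5 with output y_5; connection weights w_31, w_41, w_32, w_42, w_53, w_54]
XOR:
p q y
0 0 0
0 1 1
1 0 1
1 1 0
Threshold θ = 0.01 (for all neurons), unit step transfer functions, learning rate η = 1.
The initial weights for the 1st pattern (p = 1, q = 1) are:
w_31(1) = -4.9, w_41(1) = 4.6, w_32(1) = 5.0, w_42(1) = -5.1, w_53(1) = 2.2, w_54(1) = 2.5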
12 Example of neuronal net learning:
For the 1st pattern (p = 1, q = 1) the correct output is b(1) = 0. The net gives y_53(1) = 1, y_54(1) = 0, y_5(1) = 1. The learning rate is η = 1.
Output layer:
Δw_53(1) = η (b_5(1) - y_5(1)) y_53(1) = 1 · (0 - 1) · 1 = -1
Δw_54(1) = η (b_5(1) - y_5(1)) y_54(1) = 1 · (0 - 1) · 0 = 0
Hidden layer:
Δw_31(1) = η [w_53(1) (b_5(1) - y_5(1))] y_31(1) = 1 · [2.2 · (0 - 1)] · 1 = -2.2
Δw_41(1) = η [w_54(1) (b_5(1) - y_5(1))] y_41(1) = 1 · [2.5 · (0 - 1)] · 1 = -2.5
Δw_32(1) = η [w_53(1) (b_5(1) - y_5(1))] y_32(1) = 1 · [2.2 · (0 - 1)] · 1 = -2.2
Δw_42(1) = η [w_54(1) (b_5(1) - y_5(1))] y_42(1) = 1 · [2.5 · (0 - 1)] · 1 = -2.5
13 Example of neuronal net learning:
New weights for pattern 2 (p = 0, q = 1): w_ij(2) = w_ij(1) + Δw_ij(1)
w_31(2) = w_31(1) + Δw_31(1) = -4.9 + (-2.2) = -7.1
w_41(2) = w_41(1) + Δw_41(1) = 4.6 + (-2.5) = 2.1
w_32(2) = w_32(1) + Δw_32(1) = 5.0 + (-2.2) = 2.8
w_42(2) = w_42(1) + Δw_42(1) = -5.1 + (-2.5) = -7.6
w_53(2) = w_53(1) + Δw_53(1) = 2.2 + (-1) = 1.2
w_54(2) = w_54(1) + Δw_54(1) = 2.5 + 0 = 2.5
When pattern 2 is applied, the net produces the correct output. The expected and the real output are equal, so the weights are not changed. The same holds for patterns 3 and 4. The net is learnt.
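The arithmetic of this example can be re-checked with a short script. It is a sketch under two assumptions: a neuron fires when its weighted sum is strictly greater than the threshold, and the function names `forward` and `train_on_pattern` are chosen here for illustration. The weights, the threshold 0.01 and the learning rate 1 are the values from the example.

```python
THETA = 0.01  # threshold of every neuron (value from the example)
ETA = 1.0     # learning rate (value from the example)

def step(s):
    # Unit step; firing on s > THETA rather than s >= THETA is an assumption.
    return 1 if s > THETA else 0

def forward(w, p, q):
    y3 = step(w["w31"] * p + w["w32"] * q)
    y4 = step(w["w41"] * p + w["w42"] * q)
    y5 = step(w["w53"] * y3 + w["w54"] * y4)
    return y3, y4, y5

def train_on_pattern(w, p, q, b):
    y3, y4, y5 = forward(w, p, q)
    d5 = b - y5               # mistake of the output neuron
    d3 = w["w53"] * d5        # mistake contribution of hidden neuron 3
    d4 = w["w54"] * d5        # mistake contribution of hidden neuron 4
    return {
        "w31": w["w31"] + ETA * d3 * p,
        "w32": w["w32"] + ETA * d3 * q,
        "w41": w["w41"] + ETA * d4 * p,
        "w42": w["w42"] + ETA * d4 * q,
        "w53": w["w53"] + ETA * d5 * y3,
        "w54": w["w54"] + ETA * d5 * y4,
    }

w1 = {"w31": -4.9, "w41": 4.6, "w32": 5.0, "w42": -5.1, "w53": 2.2, "w54": 2.5}
w2 = train_on_pattern(w1, p=1, q=1, b=0)
xor_outputs = {(p, q): forward(w2, p, q)[2] for p in (0, 1) for q in (0, 1)}
print({name: round(value, 1) for name, value in w2.items()})
print(xor_outputs)
```

After the single adaptation step the net answers all four XOR patterns correctly, which is why no further weight changes occur.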
14 Applications of NNs:
Pattern recognition (classification)
- example: recognition of characters in a raster
Optimisation
- The training set contains various input combinations and their optimal outputs. The learnt net can then find the optimum for other inputs, too.
- Used in cases where an analytical formulation of the input-output relation is missing.
15 Applications of NNs:
Data evaluation, state monitoring (system, organism, TP, ...)
Example: chemical column
Inputs (linguistic variables):
- volume flow of the input hydrocarbon mixture
- volume flow of the reflux (= backward) flow
- volume flow of the heating steam
- volume flow of the product output
Outputs:
- temperature
- pressure
- concentration of the distilled material in the product
- contamination concentration at the column bottom
16 Applications of NNs:
Process control
Example: abrasive cutting (the material circulates together with the abrasive along a closed loop).
Inputs:
- flow speed
- hardness of the abrasive, hardness of the material
- size of the abrasive, size of the material
- number of cycles
Outputs:
- material decrease
- surface roughness
Regulation
The controller constants are estimated from the combination of input, state, output and desired values.
17 Neuronal nets: concluding remarks
There are no rules for estimating the number of layers and the number of neurons in the layers. (Nets with hidden layers are used. The number of input neurons depends on the number of inputs; the number of output neurons i depends on the number of needed outputs n (example: 2^i > n). The number of hidden layers is 1 or 2, and the number of neurons in the hidden layers is low.)
[Figure: net global mistake versus net size]
18 Neuronal nets: concluding remarks
I have not found recommendations for the choice of the transfer function, the threshold value, or the initial setup of the synaptic weights (average values from the allowed range, random values?).
The learning rate has to be chosen as a trade-off: a small value needs more iterations but is more precise; a bigger one learns rapidly, but it can oscillate around the extreme.
It is not defined when to stop the learning process. After some iterations the net global mistake can start to grow (net overlearning).
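The remark about the learning rate can be illustrated on a toy problem. The sketch below is not a neural net: it assumes a single weight and the mistake function E(w) = w^2 (gradient 2w, extreme at w = 0), chosen only because its behaviour is easy to follow by hand. The function name `descend` and the two rates are likewise assumptions.

```python
def descend(eta, w0=1.0, steps=5):
    # Gradient descent on the assumed mistake function E(w) = w**2.
    path = [w0]
    for _ in range(steps):
        w = path[-1]
        path.append(w - eta * 2 * w)   # w <- w - eta * dE/dw
    return path

slow = descend(eta=0.1)          # small rate: steady but slow approach to 0
oscillating = descend(eta=0.9)   # big rate: jumps from one side of 0 to the other
print([round(w, 3) for w in slow])
print([round(w, 3) for w in oscillating])
```

With the small rate the weight approaches the extreme from one side only; with the big rate each step overshoots it, so the sign of w alternates.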