BP – Review
CS/CMPE 333 – Neural Networks
Notation

Consider an MLP with P input, Q hidden, and M output neurons. The network has two layers, each with its own inputs and outputs: two single-layer networks connected in series, where the output of the first becomes the input to the second. For convenience, each layer can be considered separately. If both layers must be tracked at once, a superscript index can be used to indicate the layer number, e.g. $w^{2}_{12}$.
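As a concrete reading of this notation, here is a minimal NumPy sketch of the forward pass through the two layers (the helper names sigmoid and forward, and the use of logistic activations in both layers, are illustrative assumptions rather than something fixed by the slides):

```python
import numpy as np

def sigmoid(v):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-v))

def forward(x, W1, W2):
    """Forward pass through the two-layer MLP described above.

    x  : input vector, length P
    W1 : (Q, P+1) layer-1 weights, first column = bias weights
    W2 : (M, Q+1) layer-2 weights
    The bias input x_0 = -1 is prepended at each layer, as on the slides.
    """
    y0 = np.concatenate(([-1.0], x))    # augmented input, length P+1
    y1 = sigmoid(W1 @ y0)               # hidden outputs, length Q
    y1 = np.concatenate(([-1.0], y1))   # prepend bias for layer 2
    y2 = sigmoid(W2 @ y1)               # network outputs, length M
    return y1, y2
```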
Identifying Parameters

Letter indices i, j, k, m, n, etc. are used to identify parameters. If two or more indices are used, their alphabetical order indicates the relative positions of the parameters. E.g., $x_i$, $y_j$ indicates that the variable x belongs to a layer that precedes the layer of the variable y (i -> j).

$w_{ji}$ = synaptic weight connecting neuron i to neuron j
[Figure: a two-layer MLP. Inputs $x_1, \ldots, x_P$ together with the bias input $x_0 = -1$ feed Layer 1 (weight matrix $\mathbf{W}^1$, Q hidden neurons); Layer 2 (weight matrix $\mathbf{W}^2$) produces the outputs $y_1, \ldots, y_M$.]
BP Equations (1)

Delta rule: $w_{ji}(n+1) = w_{ji}(n) + \Delta w_{ji}(n)$, where $\Delta w_{ji}(n) = \eta\,\delta_j(n)\,y_i(n)$.

The local gradient $\delta_j(n)$ is given by:

If neuron j lies in the output layer: $\delta_j(n) = \varphi_j'(v_j(n))\,e_j(n)$

If neuron j lies in a hidden layer: $\delta_j(n) = \varphi_j'(v_j(n)) \sum_k \delta_k(n)\,w_{kj}(n)$
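A per-neuron sketch of these update rules might look as follows (all names are hypothetical; phi_prime_vj stands for $\varphi_j'(v_j(n))$, which the slides leave abstract at this point):

```python
import numpy as np

def delta_rule_update(w_ji, eta, delta_j, y_i):
    """Delta rule: w_ji(n+1) = w_ji(n) + eta * delta_j(n) * y_i(n)."""
    return w_ji + eta * delta_j * y_i

def delta_output(phi_prime_vj, e_j):
    """Output-layer local gradient: delta_j = phi'_j(v_j) * e_j."""
    return phi_prime_vj * e_j

def delta_hidden(phi_prime_vj, deltas_k, w_kj):
    """Hidden-layer local gradient: delta_j = phi'_j(v_j) * sum_k delta_k * w_kj,
    where deltas_k and w_kj are the next layer's deltas and the weights leaving j."""
    return phi_prime_vj * float(np.dot(deltas_k, w_kj))
```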
BP Equations (2)

When logistic sigmoidal activation functions are used, $\delta_j(n)$ is given by:

If neuron j lies in the output layer: $\delta_j(n) = y_j(n)[1 - y_j(n)]\,e_j(n) = y_j(n)[1 - y_j(n)][d_j(n) - y_j(n)]$

If neuron j lies in a hidden layer: $\delta_j(n) = y_j(n)[1 - y_j(n)] \sum_k \delta_k(n)\,w_{kj}(n)$
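For the logistic sigmoid, the derivative can be written purely in terms of the neuron's output, which is why $v_j(n)$ no longer appears in these formulas. A short standalone sketch (function names are again my own):

```python
def phi_prime_logistic(y_j):
    """Logistic-sigmoid derivative written via the output:
    phi'(v_j) = y_j * (1 - y_j), so v_j(n) need not be stored."""
    return y_j * (1.0 - y_j)

def delta_output_logistic(y_j, d_j):
    """Output-layer local gradient: delta_j = y_j (1 - y_j)(d_j - y_j)."""
    return phi_prime_logistic(y_j) * (d_j - y_j)
```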
Matrix/Vector Notation (1)

$w_{ji}$ = the synaptic weight from the ith neuron to the jth neuron (where neuron i precedes neuron j); $w_{ji}$ is the element in the jth row and ith column of the weight matrix $\mathbf{W}$.

Consider a feedforward network with P inputs, Q hidden neurons, and M outputs. What should be the dimensions of $\mathbf{W}$ from the hidden layer to the output layer? $\mathbf{W}$ will have M rows and Q+1 columns; the first column is for the bias inputs.
Matrix/Vector Notation (2)

$y_j$ = output of the jth neuron (in a layer); $\mathbf{y}$ = the vector whose jth element is $y_j$.

What should be the dimension of $\mathbf{y}$ for the hidden layer? $\mathbf{y}$ is a vector of length Q+1, where the first element is the bias input $-1$.

What should be the dimension of $\mathbf{y}$ for the output layer? $\mathbf{y}$ is a vector of length M; no bias element is needed, since this is the last layer of the network. (The sketch below checks these shapes.)
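A quick dimension check of these conventions, with arbitrary example sizes (the variable names follow the earlier sketches and are my own):

```python
import numpy as np

P, Q, M = 4, 3, 2                           # example sizes: inputs, hidden, outputs

W1 = np.zeros((Q, P + 1))                   # hidden-layer weights; column 0 = bias weights
W2 = np.zeros((M, Q + 1))                   # output-layer weights: M rows, Q+1 columns

x  = np.concatenate(([-1.0], np.ones(P)))   # input with prepended bias x_0 = -1, length P+1
y1 = np.concatenate(([-1.0], np.ones(Q)))   # hidden-layer output vector, length Q+1
y2 = np.ones(M)                             # output vector, length M (no bias element)

print((W1 @ x).shape)    # (3,)  -> one activation per hidden neuron
print((W2 @ y1).shape)   # (2,)  -> one activation per output neuron
```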
BP Equations in Vector/Matrix Form

Delta rule: $\mathbf{W}_j(n+1) = \mathbf{W}_j(n) + \Delta\mathbf{W}_j(n)$, where $\Delta\mathbf{W}_j(n) = \eta\,[\boldsymbol{\delta}_j(n)\,\mathbf{y}_i(n)^T]$ (an outer product).

When logistic sigmoidal activation functions are used, $\boldsymbol{\delta}_j(n)$ is given by the following (in the following, omit the bias elements from the vectors and matrices):

If layer j is the output layer: $\boldsymbol{\delta}_j(n) = \mathbf{y}_j(n) \odot [\mathbf{1} - \mathbf{y}_j(n)] \odot [\mathbf{d}_j(n) - \mathbf{y}_j(n)]$

If layer j is a hidden layer: $\boldsymbol{\delta}_j(n) = \mathbf{y}_j(n) \odot [\mathbf{1} - \mathbf{y}_j(n)] \odot \mathbf{W}_k(n)^T \boldsymbol{\delta}_k(n)$

Here $\odot$ denotes elementwise multiplication, and layer k is the layer following layer j.
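Putting the vector/matrix equations together, one full training step on a single pattern might be sketched as follows in NumPy (bp_step, the in-place weight updates, and the default learning rate are illustrative choices, not prescribed by the slides):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def bp_step(x, d, W1, W2, eta=0.1):
    """One backprop step in the vector/matrix form above.
    W1: (Q, P+1), W2: (M, Q+1); first columns hold the bias weights,
    and the bias input is fixed at -1 as on the slides.
    Weights are updated in place."""
    # forward pass
    y0 = np.concatenate(([-1.0], x))                  # length P+1
    y1 = np.concatenate(([-1.0], sigmoid(W1 @ y0)))   # length Q+1
    y2 = sigmoid(W2 @ y1)                             # length M

    # local gradients (bias elements omitted, as the slide says)
    delta2 = y2 * (1 - y2) * (d - y2)                        # output layer
    delta1 = y1[1:] * (1 - y1[1:]) * (W2[:, 1:].T @ delta2)  # hidden layer

    # delta rule with outer products: dW = eta * delta * y_prev^T
    W2 += eta * np.outer(delta2, y1)
    W1 += eta * np.outer(delta1, y0)
    return y2
```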