
1 Neural Networks

2 Today's Class
Neural Networks
The Perceptron Model
The Multi-layer Perceptron (MLP)
The forward pass in an MLP (inference)
The backward pass in an MLP (backpropagation)

3 Perceptron Model
Frank Rosenblatt (1957), Cornell University
[Diagram: inputs x_1, x_2, x_3, x_4 with weights w_1, w_2, w_3, w_4 feeding the activation function]
f(x) = 1 if Σ_{i=1}^{n} w_i x_i + b > 0, and 0 otherwise

4 Perceptron Model (Frank Rosenblatt, 1957, Cornell University)
Same diagram and decision rule as the previous slide.

5 Perceptron Model (Frank Rosenblatt, 1957, Cornell University)
Same diagram and decision rule as slide 3.
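A minimal NumPy sketch of the decision rule above (the input, weight, and bias values are hypothetical, chosen only for illustration):

    import numpy as np

    def perceptron(x, w, b):
        # Output 1 when the weighted sum plus bias is positive, otherwise 0.
        return 1 if np.dot(w, x) + b > 0 else 0

    # Four inputs and weights, matching the slide's x_1..x_4 and w_1..w_4.
    x = np.array([0.5, -1.0, 2.0, 0.0])
    w = np.array([0.2, 0.4, -0.1, 1.5])
    b = -0.3
    print(perceptron(x, w, b))   # prints 1 or 0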

6 Activation Functions
Step(x)
Sigmoid(x) = 1 / (1 + e^{-x})
Tanh(x)
ReLU(x) = max(0, x)
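A NumPy sketch of these four activations (an illustration, not the lecture's own code):

    import numpy as np

    def step(x):
        return (x > 0).astype(float)       # 1 where x > 0, else 0

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))    # squashes values into (0, 1)

    def tanh(x):
        return np.tanh(x)                  # squashes values into (-1, 1)

    def relu(x):
        return np.maximum(0.0, x)          # max(0, x), element-wise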

7 Two-layer Multi-layer Perceptron (MLP)
[Diagram: inputs x_1 … x_4 feed a "hidden" layer a_1 … a_4, which produces the output ŷ_1; ŷ_1 is compared against the target y_1 by a loss / criterion]

8 Linear Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·] (ground-truth label)      ŷ_i = [f_c  f_d  f_b]
g_c = w_{c1} x_{i1} + w_{c2} x_{i2} + w_{c3} x_{i3} + w_{c4} x_{i4} + b_c
g_d = w_{d1} x_{i1} + w_{d2} x_{i2} + w_{d3} x_{i3} + w_{d4} x_{i4} + b_d
g_b = w_{b1} x_{i1} + w_{b2} x_{i2} + w_{b3} x_{i3} + w_{b4} x_{i4} + b_b
f_c = e^{g_c} / (e^{g_c} + e^{g_d} + e^{g_b})
f_d = e^{g_d} / (e^{g_c} + e^{g_d} + e^{g_b})
f_b = e^{g_b} / (e^{g_c} + e^{g_d} + e^{g_b})

9 Linear Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·] (ground-truth label)      ŷ_i = [f_c  f_d  f_b]
w = [ w_{c1} w_{c2} w_{c3} w_{c4} ;  w_{d1} w_{d2} w_{d3} w_{d4} ;  w_{b1} w_{b2} w_{b3} w_{b4} ]      b = [b_c  b_d  b_b]
g_c = w_{c1} x_{i1} + w_{c2} x_{i2} + w_{c3} x_{i3} + w_{c4} x_{i4} + b_c   (and similarly g_d, g_b)
f_c = e^{g_c} / (e^{g_c} + e^{g_d} + e^{g_b})   (and similarly f_d, f_b)

10 Linear Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
w = [ w_{c1} w_{c2} w_{c3} w_{c4} ;  w_{d1} w_{d2} w_{d3} w_{d4} ;  w_{b1} w_{b2} w_{b3} w_{b4} ]      b = [b_c  b_d  b_b]
g = w x^T + b^T
f_c = e^{g_c} / (e^{g_c} + e^{g_d} + e^{g_b})   (and similarly f_d, f_b)

11 Linear Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
w = (3×4 weight matrix, as above)      b = [b_c  b_d  b_b]
g = w x^T + b^T
f = softmax(g)

12 Linear Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
f = softmax(w x^T + b^T)
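A NumPy sketch of this linear + softmax classifier, with shapes matching the slides (4 features, 3 classes); the actual numbers are hypothetical:

    import numpy as np

    def softmax(g):
        e = np.exp(g - g.max())            # subtract the max for numerical stability
        return e / e.sum()

    x = np.array([0.2, -1.3, 0.4, 2.0])    # x_i: 4 features
    w = np.random.randn(3, 4)              # one row of weights per class (c, d, b)
    b = np.random.randn(3)                 # one bias per class

    g = w @ x + b                          # g = w x^T + b^T: the 3 class scores
    f = softmax(g)                         # f = [f_c, f_d, f_b], sums to 1
    print(f)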

13 Two-layer MLP + Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
a_1 = sigmoid(w^{[1]} x^T + b^{[1]T})
f = softmax(w^{[2]} a_1^T + b^{[2]T})
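A NumPy sketch of this two-layer forward pass (the layer sizes, 4 inputs, 5 hidden units, 3 classes, are hypothetical):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def softmax(g):
        e = np.exp(g - g.max())
        return e / e.sum()

    x  = np.random.randn(4)       # input x_i
    w1 = np.random.randn(5, 4)    # w^[1], b^[1]: first (hidden) layer
    b1 = np.random.randn(5)
    w2 = np.random.randn(3, 5)    # w^[2], b^[2]: second (output) layer
    b2 = np.random.randn(3)

    a1 = sigmoid(w1 @ x + b1)     # hidden activations
    f  = softmax(w2 @ a1 + b2)    # class probabilities [f_c, f_d, f_b]
    print(f)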

14 N-layer MLP + Softmax
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
a_1 = sigmoid(w^{[1]} x^T + b^{[1]T})
a_2 = sigmoid(w^{[2]} a_1^T + b^{[2]T})
…
a_k = sigmoid(w^{[k]} a_{k-1}^T + b^{[k]T})
…
f = softmax(w^{[n]} a_{n-1}^T + b^{[n]T})

15 How to train the parameters?
x_i = [x_{i1}  x_{i2}  x_{i3}  x_{i4}]      y_i = [· · ·]      ŷ_i = [f_c  f_d  f_b]
a_1 = sigmoid(w^{[1]} x^T + b^{[1]T})
a_2 = sigmoid(w^{[2]} a_1^T + b^{[2]T})
a_k = sigmoid(w^{[k]} a_{k-1}^T + b^{[k]T})
f = softmax(w^{[n]} a_{n-1}^T + b^{[n]T})

16 Forward pass (Forward-propagation)
[Diagram: inputs x_1 … x_4 feed hidden units a_1 … a_4, which produce the output ŷ_1 and the loss]
z_j = Σ_i w_{1ij} x_i + b_{1j}
a_j = Sigmoid(z_j)
p_1 = Σ_i w_{2i} a_i + b_2
ŷ_1 = Sigmoid(p_1)
Loss = L(ŷ_1, y_1)
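A NumPy sketch of this forward pass; the slide only writes L(ŷ_1, y_1), so the binary cross-entropy used below is an assumption:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x  = np.random.randn(4)        # inputs x_1..x_4
    w1 = np.random.randn(4, 4)     # w_{1ij}: weights into the hidden units
    b1 = np.random.randn(4)
    w2 = np.random.randn(4)        # w_{2i}: weights into the output unit
    b2 = 0.1
    y  = 1.0                       # target y_1

    z    = w1 @ x + b1             # z_j = Σ_i w_{1ij} x_i + b_{1j}
    a    = sigmoid(z)              # a_j = Sigmoid(z_j)
    p1   = w2 @ a + b2             # p_1 = Σ_i w_{2i} a_i + b_2
    yhat = sigmoid(p1)             # ŷ_1 = Sigmoid(p_1)
    loss = -(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))   # assumed BCE loss
    print(loss)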

17 How to train the parameters?
a_1 = sigmoid(w^{[1]} x^T + b^{[1]T})
a_2 = sigmoid(w^{[2]} a_1^T + b^{[2]T})
a_k = sigmoid(w^{[k]} a_{k-1}^T + b^{[k]T})
f = softmax(w^{[n]} a_{n-1}^T + b^{[n]T})
We can still use SGD. We need the gradients ∂l/∂w^{[k]}_{ij} and ∂l/∂b^{[k]}_i.

18 How to train the parameters?
(same model as the previous slide)
We can still use SGD. We need ∂l/∂w^{[k]}_{ij} and ∂l/∂b^{[k]}_i, where l = loss(f, y).

19 How to train the parameters?
Same content as the previous slide.

20 How to train the parameters?
a_1 = sigmoid(w^{[1]} x^T + b^{[1]T}),  a_2 = sigmoid(w^{[2]} a_1^T + b^{[2]T}),  …,  a_k = sigmoid(w^{[k]} a_{k-1}^T + b^{[k]T}),  f = softmax(w^{[n]} a_{n-1}^T + b^{[n]T}),  l = loss(f, y)
By the chain rule:
∂l/∂w^{[k]}_{ij} = (∂l/∂a_{n-1}) (∂a_{n-1}/∂a_{n-2}) … (∂a_{k+1}/∂a_k) (∂a_k/∂w^{[k]}_{ij})

21 Backward pass (Back-propagation)
[Diagram: the same two-layer network as slide 16, with gradients flowing from the loss back toward the parameters (GradParams) and the inputs (GradInputs)]
∂L/∂ŷ_1 = ∂/∂ŷ_1 L(ŷ_1, y_1)
∂L/∂p_1 = (∂/∂p_1 Sigmoid(p_1)) · ∂L/∂ŷ_1
∂L/∂a_i = (∂/∂a_i (Σ_k w_{2k} a_k + b_2)) · ∂L/∂p_1 = w_{2i} · ∂L/∂p_1
∂L/∂w_{2i} = (∂p_1/∂w_{2i}) · ∂L/∂p_1 = a_i · ∂L/∂p_1      (GradParams)
∂L/∂z_i = (∂/∂z_i Sigmoid(z_i)) · ∂L/∂a_i
∂L/∂w_{1ij} = (∂z_j/∂w_{1ij}) · ∂L/∂z_j = x_i · ∂L/∂z_j      (GradParams)
∂L/∂x_k = Σ_j (∂z_j/∂x_k) · ∂L/∂z_j = Σ_j w_{1kj} · ∂L/∂z_j      (GradInputs)
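A NumPy sketch of this backward pass for the two-layer network of slide 16 (again assuming a binary cross-entropy loss, which the slides do not specify):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass (as on slide 16), with hypothetical sizes and values.
    x  = np.random.randn(4); y = 1.0
    w1 = np.random.randn(4, 4); b1 = np.random.randn(4)
    w2 = np.random.randn(4);    b2 = 0.1
    z    = w1 @ x + b1
    a    = sigmoid(z)
    p1   = w2 @ a + b2
    yhat = sigmoid(p1)

    # Backward pass: apply the chain rule from the loss back to the inputs.
    dL_dyhat = -(y / yhat) + (1 - y) / (1 - yhat)   # derivative of the assumed BCE loss
    dL_dp1   = yhat * (1 - yhat) * dL_dyhat         # through Sigmoid(p_1)
    dL_dw2   = a * dL_dp1                           # GradParams, layer 2
    dL_db2   = dL_dp1
    dL_da    = w2 * dL_dp1                          # ∂L/∂a_i = w_{2i} ∂L/∂p_1
    dL_dz    = a * (1 - a) * dL_da                  # through Sigmoid(z_j)
    dL_dw1   = np.outer(dL_dz, x)                   # GradParams, layer 1
    dL_db1   = dL_dz
    dL_dx    = w1.T @ dL_dz                         # GradInputs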

22 Softmax + Negative Log Likelihood
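The slide's equations are not captured in this transcript; as a sketch, the standard softmax + negative log-likelihood pairing and its gradient with respect to the scores (the predicted probabilities minus the one-hot target):

    import numpy as np

    def softmax(g):
        e = np.exp(g - g.max())
        return e / e.sum()

    g = np.array([2.0, -1.0, 0.5])   # class scores (hypothetical)
    y = 0                            # index of the correct class

    f   = softmax(g)
    nll = -np.log(f[y])              # negative log-likelihood of the correct class

    dL_dg = f.copy()                 # gradient w.r.t. the scores g:
    dL_dg[y] -= 1.0                  # softmax probabilities minus the one-hot target
    print(nll, dL_dg)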

23 Linear layer
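The slide's code is not captured in the transcript; a minimal sketch of a linear (fully connected) layer as a module with forward and backward methods:

    import numpy as np

    class Linear:
        def __init__(self, n_in, n_out):
            self.w = 0.01 * np.random.randn(n_out, n_in)   # weight matrix
            self.b = np.zeros(n_out)                       # bias vector

        def forward(self, x):
            self.x = x                      # cache the input for the backward pass
            return self.w @ x + self.b

        def backward(self, grad_out):
            self.grad_w = np.outer(grad_out, self.x)   # ∂L/∂w  (GradParams)
            self.grad_b = grad_out                     # ∂L/∂b  (GradParams)
            return self.w.T @ grad_out                 # ∂L/∂x  (GradInputs)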

24 ReLU layer
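Likewise, a sketch of a ReLU layer in the same modular style:

    import numpy as np

    class ReLU:
        def forward(self, x):
            self.mask = x > 0               # remember where the input was positive
            return np.maximum(0.0, x)

        def backward(self, grad_out):
            return grad_out * self.mask     # pass gradient through only where x > 0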

25 Two-layer Neural Network – Forward Pass
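Using the hypothetical Linear and ReLU sketches above (and the softmax from the slide 22 sketch), a two-layer forward pass might look like this; the sizes and the class index are made up for illustration:

    import numpy as np

    fc1, act, fc2 = Linear(4, 5), ReLU(), Linear(5, 3)

    x = np.random.randn(4)
    y = 2                                    # index of the correct class

    h      = act.forward(fc1.forward(x))     # first linear layer + ReLU
    scores = fc2.forward(h)                  # class scores
    probs  = softmax(scores)                 # from the earlier softmax sketch
    loss   = -np.log(probs[y])               # negative log-likelihood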

26 Two-layer Neural Network – Backward Pass
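And the matching backward pass, chaining each module's backward method from the loss gradient down to the input, followed by a plain SGD update (the learning rate is a hypothetical choice):

    dscores     = probs.copy()       # gradient of softmax + NLL w.r.t. the scores
    dscores[y] -= 1.0

    dh = fc2.backward(dscores)       # fills fc2.grad_w and fc2.grad_b
    dz = act.backward(dh)            # gate the gradient through the ReLU
    dx = fc1.backward(dz)            # fills fc1.grad_w and fc1.grad_b; dx is ∂L/∂x

    lr = 0.1
    for layer in (fc1, fc2):
        layer.w -= lr * layer.grad_w
        layer.b -= lr * layer.grad_b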

27 Convolutional Layer

28 Convolutional Layer

29 Convolutional Layer Weights

30 Convolutional Layer Weights
[Figure: worked example of applying the filter weights at one location of the input]

31 Convolutional Layer Weights
[Figure: the worked example continued at the next location]

32 Convolutional Layer (with 4 filters)
Weights: 4×1×9×9
Input: 1×224×224
Output: 4×224×224 (with zero padding and stride = 1)

33 Convolutional Layer (with 4 filters)
Weights: 4×1×9×9
Input: 1×224×224
Output: 4×112×112 (with zero padding and stride = 2)
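A quick PyTorch sketch to verify these two output shapes; padding = 4 is an assumption (it is what keeps a 9×9 kernel "same-size" at stride 1, while the slides only say "zero padding"):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 224, 224)    # a batch of one 1x224x224 input

    conv_s1 = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=9, stride=1, padding=4)
    conv_s2 = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=9, stride=2, padding=4)

    print(conv_s1(x).shape)   # torch.Size([1, 4, 224, 224])
    print(conv_s2(x).shape)   # torch.Size([1, 4, 112, 112])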

34 Convolutional Layer in Torch
Weights: nOutputPlane × nInputPlane × kW × kH
nOutputPlane: the number of convolutional filters in this layer
nInputPlane: the number of input channels (e.g. 3 for RGB inputs)

35 Convolutional Layer in Keras
Convolution2D(nOutputPlane, kW, kH, input_shape=(3, 224, 224), subsample=(2, 2), border_mode='valid')
Weights: nOutputPlane × nInputPlane × kW × kH
nOutputPlane: the number of convolutional filters in this layer
nInputPlane: the number of input channels (e.g. 3 for RGB inputs)

36 Convolutional Layer in PyTorch
Weights: out_channels × in_channels × kernel_size × kernel_size
out_channels: the number of convolutional filters in this layer
in_channels: the number of input channels (e.g. 3 for RGB inputs)
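For example, a sketch with hypothetical sizes, just to show the parameter names and the resulting weight shape:

    import torch.nn as nn

    # 3 input channels (RGB), 64 filters, 9x9 kernels, stride 2, zero padding of 4.
    conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=9, stride=2, padding=4)

    print(conv.weight.shape)   # torch.Size([64, 3, 9, 9]) = out_channels x in_channels x kH x kW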

37 Automatic Differentiation
You only need to write code for the forward pass; the backward pass is computed automatically.
PyTorch (mostly Facebook)
TensorFlow (mostly Google)
DyNet (team includes UVA Prof. Yangfeng Ji)
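A small PyTorch autograd sketch of the idea: only the forward computation is written, and the gradients come from .backward() (the sizes and values are hypothetical):

    import torch
    import torch.nn.functional as F

    w = torch.randn(3, 4, requires_grad=True)   # parameters we want gradients for
    b = torch.zeros(3, requires_grad=True)
    x = torch.randn(4)
    y = torch.tensor([1])                       # correct class index

    scores = w @ x + b                          # forward pass: linear scores
    loss = F.cross_entropy(scores.unsqueeze(0), y)   # softmax + NLL in one call

    loss.backward()                             # backward pass, computed automatically
    print(w.grad.shape, b.grad.shape)           # ∂loss/∂w and ∂loss/∂b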

38 Questions?

