COMP5331 Other Classification Models: Neural Network. Prepared by Raymond Wong. Some of the notes about Neural Networks are borrowed from LW Chan's notes. Presented by Raymond Wong.
What we have learnt for classification: the Decision Tree, the Bayesian Classifier, and the Nearest Neighbor Classifier.
Other Classification Models: Neural Network
Neural Network: a computing system made up of simple and highly interconnected processing elements. Other terminologies: Connectionist Models, Parallel Distributed Processing (PDP) Models, Artificial Neural Networks, Computational Neural Networks, Neurocomputers.
Neural Network: this approach is inspired by the way that brains process information, which is entirely different from the way that conventional computers do. Information processing occurs at many identical and simple processing elements called neurons (also called units, cells, or nodes). Interneuron connection strengths, known as synaptic weights, are used to store the knowledge.
Advantages of Neural Networks: Parallel processing – each neuron operates individually. Fault tolerance – if a small number of neurons break down, the whole system is still able to operate with only a slight degradation in performance.
Neural Network for OR: a single neuron maps inputs x1 and x2 to an output y realizing the OR function (truth table: (0,0) -> 0, (0,1) -> 1, (1,0) -> 1, (1,1) -> 1). The neuron has a front part, which combines the inputs into a single value net, and a back part, which maps net to the output y.
The front part multiplies each input by a weight and adds a bias b: net = w1 x1 + w2 x2 + b, where w1 and w2 are the weights.
The back part applies an activation function to net, which can be a linear function (y = net, or y = a * net) or a non-linear function.
Activation Function: common non-linear choices are the Threshold Function (Step Function, Hard Limiter), the Piecewise-Linear Function, the Sigmoid Function, and the Radial Basis Function.
Threshold Function (Step Function, Hard Limiter): y = 1 if net >= 0, and y = 0 if net < 0.
Piecewise-Linear Function: y = 1 if net >= a; y = (1/2)(net/a + 1) if -a < net < a; y = 0 if net <= -a.
Sigmoid Function: y = 1 / (1 + e^(-net)).
Radial Basis Function: y = e^(-net^2).
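As a minimal Python sketch, the four non-linear activation functions above can be written directly from their definitions (the function names are my own choices, not from the slides):

```python
import math

def step(net):
    """Threshold / step / hard limiter: 1 if net >= 0, else 0."""
    return 1 if net >= 0 else 0

def piecewise_linear(net, a=1.0):
    """0 below -a, 1 above a, and a linear ramp (1/2)(net/a + 1) in between."""
    if net >= a:
        return 1.0
    if net <= -a:
        return 0.0
    return 0.5 * (net / a + 1)

def sigmoid(net):
    """Smooth S-shaped curve: 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def radial_basis(net):
    """Gaussian bump centred at net = 0: e^(-net^2)."""
    return math.exp(-net ** 2)
```

All four map net to the range [0, 1]; the step function is the one used in the OR example that follows.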
Neural Network for OR: the front part computes net = w1 x1 + w2 x2 + b, and the back part applies the threshold function, y = 1 if net >= 0 and y = 0 if net < 0.
Learning: let the learning rate be a real number η. Learning is done by w_i <- w_i + η(d – y)x_i and b <- b + η(d – y), where d is the desired output and y is the output of our neural network.
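The learning rule above can be sketched as one small Python function (the function name and argument order are my own):

```python
def perceptron_update(w, b, x, d, y, eta=0.8):
    """Apply w_i <- w_i + eta*(d - y)*x_i and b <- b + eta*(d - y)."""
    w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
    b = b + eta * (d - y)
    return w, b

# One update with eta = 0.8, starting from w1 = w2 = b = 1,
# on input (0, 0) with desired output d = 0 but actual output y = 1:
w, b = perceptron_update([1.0, 1.0], 1.0, [0, 0], d=0, y=1)
```

Since both inputs are 0 here, only the bias changes: b moves from 1 to 0.2 (up to floating-point rounding), while the weights stay at 1.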
Neural Network for OR (training view): for each training example (x1, x2) with desired output d, compute net = w1 x1 + w2 x2 + b and y = 1 if net >= 0, y = 0 if net < 0, then apply the learning rule.
Worked example with η = 0.8 and initial values w1 = 1, w2 = 1, b = 1. Input (x1, x2) = (0, 0), d = 0: net = 1*0 + 1*0 + 1 = 1, so y = 1. Incorrect! Update: w1 = w1 + η(d – y)x1 = 1 + 0.8*(0-1)*0 = 1; w2 = w2 + η(d – y)x2 = 1 + 0.8*(0-1)*0 = 1; b = b + η(d – y) = 1 + 0.8*(0-1) = 0.2.
Input (0, 1), d = 1: net = 1*0 + 1*1 + 0.2 = 1.2, so y = 1. Correct! Update: w1 = 1 + 0.8*(1-1)*0 = 1; w2 = 1 + 0.8*(1-1)*1 = 1; b = 0.2 + 0.8*(1-1) = 0.2.
Input (1, 0), d = 1: net = 1*1 + 1*0 + 0.2 = 1.2, so y = 1. Correct! Update: w1 = 1 + 0.8*(1-1)*1 = 1; w2 = 1 + 0.8*(1-1)*0 = 1; b = 0.2 + 0.8*(1-1) = 0.2.
Input (1, 1), d = 1: net = 1*1 + 1*1 + 0.2 = 2.2, so y = 1. Correct! Update: w1 = 1 + 0.8*(1-1)*1 = 1; w2 = 1 + 0.8*(1-1)*1 = 1; b = 0.2 + 0.8*(1-1) = 0.2.
Input (0, 0), d = 0: net = 1*0 + 1*0 + 0.2 = 0.2, so y = 1. Incorrect! Update: w1 = 1 + 0.8*(0-1)*0 = 1; w2 = 1 + 0.8*(0-1)*0 = 1; b = 0.2 + 0.8*(0-1) = -0.6. We repeat the above process until the neural network outputs the correct value of y for every possible input (i.e., y = d for each input).
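The whole procedure — cycling through the OR examples and updating until y = d everywhere — can be sketched as a short training loop (η = 0.8 and initial values w1 = w2 = b = 1 as in the worked example; the epoch limit is a safeguard I added):

```python
def step(net):
    return 1 if net >= 0 else 0

def train_perceptron(data, w, b, eta=0.8, max_epochs=100):
    """Cycle through the data, updating w and b after each misclassified
    example, until y = d for every input (or the epoch limit is hit)."""
    for _ in range(max_epochs):
        converged = True
        for x1, x2, d in data:
            y = step(w[0] * x1 + w[1] * x2 + b)
            if y != d:
                converged = False
                w[0] += eta * (d - y) * x1
                w[1] += eta * (d - y) * x2
                b    += eta * (d - y)
        if converged:
            break
    return w, b

# Truth table of OR as (x1, x2, d) triples.
OR_DATA = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
w, b = train_perceptron(OR_DATA, [1.0, 1.0], 1.0)
```

Tracing this by hand reproduces the slides: the first pass lowers b to 0.2, a later pass on input (0, 0) lowers it again to -0.6, after which every input is classified correctly and training stops.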
Neural Network for OR (after training): net = w1 x1 + w2 x2 + b with the threshold function y = 1 if net >= 0 and y = 0 if net < 0 now computes the OR function exactly.
Limitation: a single neuron of this kind can only solve linearly separable problems.
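A quick way to see this limitation is a brute-force check (my own illustration, not from the slides): search a small grid of (w1, w2, b) values for a single threshold neuron that reproduces a truth table. The grid finds a solution for OR, but no setting — on this grid or anywhere, since XOR is not linearly separable — works for XOR:

```python
def step(net):
    return 1 if net >= 0 else 0

def separable_on_grid(data):
    """Return True if some (w1, w2, b) from a small grid makes a single
    threshold neuron output d for every (x1, x2, d) in data."""
    vals = [i / 2 for i in range(-4, 5)]   # -2.0, -1.5, ..., 2.0
    for w1 in vals:
        for w2 in vals:
            for b in vals:
                if all(step(w1 * x1 + w2 * x2 + b) == d
                       for x1, x2, d in data):
                    return True
    return False

OR_TABLE  = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
XOR_TABLE = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

For OR, e.g. w1 = w2 = 1, b = -0.5 works; for XOR, the single-neuron search necessarily comes up empty.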
Multi-layer Perceptron (MLP): neurons are connected in layers, so that the outputs of one layer of neurons serve as the inputs to the next layer.
An MLP with an input layer (x1, ..., x5), a hidden layer, and an output layer (y1, ..., y4).
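A minimal sketch of the forward pass through such a network, using the slide's dimensions (5 inputs, 4 outputs; the hidden-layer size of 3, the random weights, and sigmoid activations are my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 5, 3, 4   # hidden size 3 is an arbitrary choice

# One weight matrix and bias vector per layer of connections.
W1, b1 = rng.standard_normal((n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.standard_normal((n_out, n_hidden)), np.zeros(n_out)

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def forward(x):
    """Input layer -> hidden layer -> output layer."""
    h = sigmoid(W1 @ x + b1)   # hidden activations, shape (3,)
    return sigmoid(W2 @ h + b2)  # outputs y1..y4, shape (4,)

y = forward(np.ones(n_in))
```

Each hidden neuron is exactly the single neuron from the earlier slides (weighted sum plus bias, then an activation function); stacking a layer of them in front of the output layer is what gives the extra power.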
Advantages of MLP: it can solve both linearly separable problems and non-linearly separable problems. MLP has been proven to be a universal approximator, i.e., it can model a very wide class of functions y = f(x).
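As a concrete illustration of that extra power, here is a tiny MLP with hand-picked weights (my own choice of weights, not from the slides) that computes XOR — the classic non-linearly-separable function a single neuron cannot compute:

```python
def step(net):
    return 1 if net >= 0 else 0

def xor_mlp(x1, x2):
    """Hidden neuron h1 fires for x1 OR x2, h2 fires for x1 AND x2;
    the output neuron computes h1 AND NOT h2, which is XOR."""
    h1 = step(x1 + x2 - 0.5)   # x1 OR x2
    h2 = step(x1 + x2 - 1.5)   # x1 AND x2
    return step(h1 - h2 - 0.5)
```

Each of the three neurons is linear-plus-threshold, yet their two-layer composition realizes a function no single such neuron can.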