COMP1942 — Other Classification Models: Neural Network
Prepared by Raymond Wong. Some of the notes about Neural Network are borrowed from LW Chan's notes. Presented by Raymond Wong. Screenshots captured by Kai Ho Chan. raywong@cse
What we learnt for Classification
Decision Tree
Bayesian Classifier
Nearest Neighbor Classifier
Other Classification Models: Neural Network
Neural Network
How to use the data mining tool
Neural Network
A computing system made up of simple and highly interconnected processing elements
Other terminologies:
Connectionist Models
Parallel Distributed Processing (PDP) models
Artificial Neural Networks
Computational Neural Networks
Neurocomputers
Neural Network
This approach is inspired by the way that brains process information, which is entirely different from the way that conventional computers do
Information processing occurs at many identical and simple processing elements called neurons (also called units, cells or nodes)
Interneuron connection strengths, known as synaptic weights, are used to store the knowledge
Advantages of Neural Network
Parallel processing – each neuron operates independently of the others
Fault tolerance – if a small number of neurons break down, the whole system can still operate with only a slight degradation in performance
Neuron Network for OR

OR Function:
x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   1

The network takes inputs x1 and x2 and produces output y from a single neuron. The neuron has two parts:
Front: computes net = w1x1 + w2x2 + b, where w1 and w2 are weights and b is a bias
Back: applies an activation function to net to produce the output y
Activation function:
Linear function: y = net, or y = a · net
Non-linear function
Activation Function
Non-linear functions:
Threshold Function (Step Function, Hard Limiter)
Piecewise-Linear Function
Sigmoid Function
Radial Basis Function
Threshold Function (Step Function, Hard Limiter)
y = 1 if net ≥ 0
y = 0 if net < 0
Piecewise-Linear Function
y = 1 if net ≥ a
y = ½((1/a)·net + 1) if −a < net < a
y = 0 if net ≤ −a
Sigmoid Function
y = 1 / (1 + e^(−net))
Radial Basis Function
y = e^(−net²)
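The four non-linear activation functions listed above can be sketched in Python (a minimal illustration; the function names are my own):

```python
import math

def step(net):
    # Threshold / step / hard limiter: 1 if net >= 0, else 0
    return 1 if net >= 0 else 0

def piecewise_linear(net, a=1.0):
    # Saturates at 0 and 1, linear in between
    if net >= a:
        return 1.0
    if net <= -a:
        return 0.0
    return 0.5 * (net / a + 1)

def sigmoid(net):
    # Smooth S-shaped curve between 0 and 1
    return 1.0 / (1.0 + math.exp(-net))

def radial_basis(net):
    # Gaussian bump peaking at net = 0
    return math.exp(-net ** 2)
```

All four map the weighted sum `net` into the range [0, 1]; the step function is the one used in the OR example that follows.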
Neuron Network for OR (with threshold activation):
net = w1x1 + w2x2 + b
y = 1 if net ≥ 0; y = 0 if net < 0
Learning
Let η be the learning rate (a real number)
Learning is done by wi ← wi + η(d − y)xi, where d is the desired output and y is the output of our neural network
b ← b + η(d − y)
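One step of this learning rule can be sketched in Python (variable names follow the slides; η is `eta`):

```python
def perceptron_update(w, b, x, d, eta):
    """Apply one step of the learning rule:
    wi <- wi + eta*(d - y)*xi  and  b <- b + eta*(d - y),
    where y is the current thresholded output."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = 1 if net >= 0 else 0
    w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
    b = b + eta * (d - y)
    return w, b, y
```

When y already equals d, the factor (d − y) is zero and the weights are left unchanged; otherwise each weight moves in the direction that reduces the error.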
Neuron Network for OR — training data:
x1  x2  d
0   0   0
0   1   1
1   0   1
1   1   1
net = w1x1 + w2x2 + b
y = 1 if net ≥ 0; y = 0 if net < 0
Worked example (η = 0.8; initial w1 = 1, w2 = 1, b = 1):

Input (x1, x2) = (0, 0), d = 0: net = 1·0 + 1·0 + 1 = 1, so y = 1. Incorrect!
w1 ← w1 + η(d − y)x1 = 1 + 0.8·(0 − 1)·0 = 1
w2 ← w2 + η(d − y)x2 = 1 + 0.8·(0 − 1)·0 = 1
b ← b + η(d − y) = 1 + 0.8·(0 − 1) = 0.2

Input (0, 1), d = 1: net = 1·0 + 1·1 + 0.2 = 1.2, so y = 1. Correct! (d − y = 0, so w1, w2 and b are unchanged.)

Input (1, 0), d = 1: net = 1·1 + 1·0 + 0.2 = 1.2, so y = 1. Correct! (No change.)

Input (1, 1), d = 1: net = 1·1 + 1·1 + 0.2 = 2.2, so y = 1. Correct! (No change.)

Input (0, 0), d = 0: net = 0.2, so y = 1. Incorrect!
b ← b + η(d − y) = 0.2 + 0.8·(0 − 1) = −0.6 (w1 and w2 are unchanged since x1 = x2 = 0)

We repeat the above process until the neural network outputs the correct value of y (i.e., y = d) for each possible input.
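The whole trace above (η = 0.8, all weights and the bias initialised to 1) can be reproduced by cycling over the OR training data until every output matches d — a minimal sketch:

```python
def train_or(eta=0.8):
    # OR training data: (x1, x2) -> d, as on the slides
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w, b = [1.0, 1.0], 1.0  # initial weights and bias from the example
    while True:
        all_correct = True
        for x, d in data:
            net = w[0] * x[0] + w[1] * x[1] + b
            y = 1 if net >= 0 else 0
            if y != d:
                all_correct = False
                # wi <- wi + eta*(d - y)*xi ;  b <- b + eta*(d - y)
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
                b += eta * (d - y)
        if all_correct:       # stop when y = d for every input
            return w, b
```

For this data the loop converges to w1 = 1, w2 = 1, b = −0.6, which classifies all four OR inputs correctly.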
Limitation
A single neuron can only solve linearly separable problems (e.g., it cannot compute the XOR function)
Multi-layer Perceptron (MLP)
Instead of a single neuron, the inputs x1 and x2 feed into several interconnected neurons, whose outputs are combined to produce the output y
Multi-layer Perceptron (MLP)
Input layer: x1, x2, x3, x4, x5
Hidden layer
Output layer: y1, y2, y3, y4
Advantages of MLP
Can solve both linearly separable problems and non-linearly separable problems
A universal approximator: MLP has been proven to be a universal approximator, i.e., it can model any function y = f(x)
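As a concrete illustration of why hidden layers matter, a two-layer network of the same threshold neurons can compute XOR, which no single neuron can. The weights below are hand-picked for clarity; a trained MLP would arrive at an equivalent solution:

```python
def step(net):
    # Threshold activation: 1 if net >= 0, else 0
    return 1 if net >= 0 else 0

def xor_mlp(x1, x2):
    # Hidden neuron 1 fires when OR(x1, x2) holds: net = x1 + x2 - 0.5
    h1 = step(x1 + x2 - 0.5)
    # Hidden neuron 2 fires when AND(x1, x2) holds: net = x1 + x2 - 1.5
    h2 = step(x1 + x2 - 1.5)
    # Output fires when OR is true but AND is not -> XOR
    return step(h1 - h2 - 0.5)
```

Each hidden neuron carves out one linearly separable region (OR and AND), and the output neuron combines them into the non-linearly separable XOR.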
Neural Network
How to use the data mining tool
How to use the data mining tool
We can use XLMiner for neural networks
Open "neuralNetwork.xls" in MS Excel
Data source: select the Workbook, the Worksheet and the Data range
Data settings: First row contains header; # Rows in training set: 4; # Columns: 3; select the variables in the input data
Classes in the output variable: # Classes: 2; Specify "Success" class (for Lift Chart): 1; Specify initial cutoff probability value for success: 0.5
Normalize input data
Network Architecture: # Hidden layers (max 4): 1; # Nodes per layer: 1
Training options: # Epochs: 1000
Gradient descent step size: 0.1; Error tolerance: 0.01; Weight change momentum: 0.6; Weight decay
Output options: Detailed report; Summary report; Score training data; Lift charts
Score new data: In worksheet
Data source (new data): select the Workbook, the Worksheet and the Data range
First row contains headers; # Rows: 4; # Cols: 2
Match the variables in the new data to the continuous variables in the input data (Match selected, Match sequentially, Match by name; Unmatch selected, Unmatch all)