Neural Network - 2 Mayank Vatsa

Neural Network - 2 Mayank Vatsa Lecture slides are prepared using several teaching resources and no authorship is claimed for any slides.

The Perceptron Binary classifier functions Threshold activation function

The Perceptron: Threshold Activation Function Binary classifier functions Threshold activation function

Derivation of the GDR: the vector of derivatives that forms the gradient can be obtained by differentiating E; the weight update rule for standard gradient descent can then be summarized as shown below.
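The formulas themselves are images on the original slide; for a single linear unit with a sum-of-squares error, the standard gradient descent rule they presumably summarize is:

```latex
E(\mathbf{w}) = \tfrac{1}{2}\sum_{k}\bigl(t_k - y_k\bigr)^{2}, \qquad
\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i}
           = \eta\sum_{k}\bigl(t_k - y_k\bigr)\,x_{ki}
```

Here t_k and y_k are the target and the output for training example k, x_{ki} is its i-th input, and η is the learning rate.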

Example: Input 0: x1 = 12, Weight 0: w1 = 0.5; Input 1: x2 = 4, Weight 1: w2 = -1.
Input 0 * Weight 0 = 12 * 0.5 = 6; Input 1 * Weight 1 = 4 * (-1) = -4.
Sum = 6 + (-4) = 2. Output = sign(Sum) = sign(2) = +1.
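A minimal Python sketch of this forward pass (the function name and the use of NumPy are ours; the numbers match the example above):

```python
import numpy as np

def perceptron_output(x, w):
    """Weighted sum followed by a sign (threshold) activation."""
    s = np.dot(x, w)
    return 1 if s >= 0 else -1

# Numbers from the worked example on the slide
x = np.array([12.0, 4.0])        # inputs x1, x2
w = np.array([0.5, -1.0])        # weights w1, w2
print(np.dot(x, w))              # 2.0
print(perceptron_output(x, w))   # +1
```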

AB problem

XOR problem (figure sequence: the two XOR classes cannot be separated by a single line; two linear decision boundaries, combined by an AND unit in a second layer, separate them)
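As a concrete illustration (our own, with hand-picked weights rather than learned ones), two threshold units whose outputs are combined by a third unit compute XOR:

```python
def step(s):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if s > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)     # fires when at least one input is 1 (OR)
    h2 = step(x1 + x2 - 1.5)     # fires only when both inputs are 1 (AND)
    return step(h1 - h2 - 0.5)   # OR and not AND  ->  XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))   # prints 0, 1, 1, 0
```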

Three-layer networks (diagram: inputs x1, x2, …, xn; hidden layers; output)

Feed-forward layered network (diagram: input layer → 1st hidden layer → 2nd hidden layer → output layer)

Different Non-Linearly Separable Problems: types of decision regions by network structure (illustrated in the original on the exclusive-OR problem, class separation, and the most general region shapes):
Single-Layer: half plane bounded by a hyperplane.
Two-Layer: convex open or closed regions.
Three-Layer: arbitrary regions (complexity limited by the number of nodes).

In the perceptron/single-layer nets, we used gradient descent on the error function to find the correct weights: Δwji = η (tj - yj) xi. Errors/updates are local to the node, i.e. the change in the weight from node i to output j (wji) is controlled by the input xi that travels along the connection and the error signal (tj - yj) from output j.

But with more layers, how are the weights for the first two layers found when the error is computed only at layer 3? There is no direct error signal for the first layers!

Objective of Multilayer NNet: given a training set, find the weights w1, …, wm so that the network output for each input x = (x1, …, xn) matches the desired output, for all training examples k.

Learn the Optimal Weight Vector: the same training set and goal; find the weight vector (w1, …, wm) that achieves the goal for all k.

First Complex NNet Algorithm: the multilayer feedforward NNet.

Training: Backprop algorithm. It searches for weight values that minimize the total error of the network over the set of training examples, by repeating the following two passes (a minimal sketch of both passes is given below):
Forward pass: compute the outputs of all units in the network, and the error of the output layer.
Backward pass: the network error is used for updating the weights (credit assignment problem). Starting at the output layer, the error is propagated backwards through the network, layer by layer, by recursively computing the local gradient of each neuron.
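A minimal NumPy sketch of the two passes described above, assuming one hidden layer, sigmoid units, and a squared-error loss (none of these choices are prescribed by the slide):

```python
import numpy as np

def sigmoid(e):
    return 1.0 / (1.0 + np.exp(-e))

def backprop_step(x, target, W1, W2, eta=0.5):
    """One forward + backward pass for a single-hidden-layer network.

    W1: (n_hidden, n_inputs) weights, input -> hidden
    W2: (n_outputs, n_hidden) weights, hidden -> output
    Returns updated weights and the remaining output error.
    """
    # Forward pass: compute the outputs of all units
    h = sigmoid(W1 @ x)          # hidden activations
    y = sigmoid(W2 @ h)          # network outputs

    # Backward pass: propagate the error, layer by layer
    delta_out = (target - y) * y * (1 - y)        # local gradient, output layer
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # local gradient, hidden layer

    # Gradient-descent weight updates
    W2 = W2 + eta * np.outer(delta_out, h)
    W1 = W1 + eta * np.outer(delta_hid, x)
    return W1, W2, target - y

# Tiny usage example: one update on a single training pair
x = np.array([0.0, 1.0])
t = np.array([1.0])
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(3, 2))   # 3 hidden units, 2 inputs
W2 = rng.normal(scale=0.5, size=(1, 3))   # 1 output, 3 hidden units
W1, W2, err = backprop_step(x, t, W1, W2)
print("output error after one step:", err)
```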

Back-propagation training algorithm illustrated: Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error. (Diagram: forward step = network activation and error computation; backward step = error propagation.)

Learning Algorithm: Backpropagation. The pictures below illustrate how signals propagate through the network. Symbols w(xm)n denote the weights of the connections between network input xm and neuron n in the input layer; yn denotes the output signal of neuron n.

Learning Algorithm: Backpropagation

Learning Algorithm: Backpropagation

Learning Algorithm: Backpropagation. Propagation of signals through the hidden layer: symbols wmn represent the weights of the connections between the output of neuron m and the input of neuron n in the next layer.
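The formulas on this and the neighbouring slides are images; in the notation just introduced, the forward propagation through any layer presumably takes the standard form (f is the activation function):

```latex
e_n = \sum_{m} w_{mn}\, y_m, \qquad y_n = f(e_n)
```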

Learning Algorithm: Backpropagation

Learning Algorithm: Backpropagation

Learning Algorithm: Backpropagation Propagation of signals through the output layer.

Learning Algorithm: Backpropagation. In the next step of the algorithm, the output signal of the network, y, is compared with the desired output value (the target) found in the training data set. The difference is called the error signal δ of the output-layer neuron.
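The formula is an image in the original; writing z for the desired (target) value, the error signal is simply the difference:

```latex
\delta = z - y
```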

Learning Algorithm: Backpropagation. The idea is to propagate the error signal δ (computed in a single teaching step) back to all neurons whose output signals were inputs to the neuron in question.

Learning Algorithm: Backpropagation. The weight coefficients wmn used to propagate the errors back are the same as those used when computing the output value; only the direction of data flow is reversed (signals are propagated from the outputs towards the inputs, layer by layer). This technique is used for all network layers. If the propagated errors come from several neurons, they are added, as in the formula below.
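Written out (the original shows this only as a picture), the error propagated back to a hidden neuron m is the weighted sum of the error signals of the neurons n it feeds:

```latex
\delta_m = \sum_{n} w_{mn}\, \delta_n
```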

Learning Algorithm: Backpropagation. When the error signal of each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are modified.
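The update formulas appear only as images in the original; a standard form consistent with the notation of these slides, with learning rate η (our symbol), is:

```latex
w'_{mn} = w_{mn} + \eta\,\delta_n\,\frac{df(e_n)}{de}\, y_m
```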

Bias in NNet

Bias in NNet

Bias in NNet. This example shows that the bias term allows us to make affine transformations of the data.
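A one-line way to state this (our own notation): without a bias the decision surface w·x = 0 must pass through the origin, while a bias b shifts it, giving an affine transformation:

```latex
y = f(\mathbf{w}\cdot\mathbf{x}) \quad\longrightarrow\quad y = f(\mathbf{w}\cdot\mathbf{x} + b)
```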

Single-Hidden Layer NNet (diagram: inputs x1, …, xn feed m hidden units, whose outputs are combined through weights w1, …, wm)

Radial Basis Function Networks (diagram: the same structure, with radial basis hidden units; the combined output is y)

Non-Linear Models: the weights are adjusted by the learning process.

Typical Radial Functions: Gaussian, Hardy multiquadric, inverse multiquadric.
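The defining formulas are images in the original; the usual definitions of these three radial functions (with r the distance to the centre, σ a width, and c > 0 a shape parameter, all our notation) are:

```latex
\text{Gaussian: } \phi(r) = e^{-r^{2}/(2\sigma^{2})}, \qquad
\text{Hardy multiquadric: } \phi(r) = \sqrt{r^{2}+c^{2}}, \qquad
\text{inverse multiquadric: } \phi(r) = \frac{1}{\sqrt{r^{2}+c^{2}}}
```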

Gaussian Basis Function (σ = 0.5, 1.0, 1.5)

Most General RBF (figure)

The Topology of RBF NNet (diagram: inputs x1, …, xn are feature vectors; hidden units correspond to subclasses; output units y1, …, ym correspond to classes)

Radial Basis Function Networks: given a training set, find the weights w1, …, wm so that the network output for each input x = (x1, …, xn) matches the desired output, for all training examples k.

Learn the Optimal Weight Vector: the same training set and goal; find the weight vector (w1, …, wm) that achieves the goal for all k.

Regularization: the same training set and goal for all k, with a regularization term added to the objective; if regularization is not needed, set the regularization parameter λ to 0.

Learn the Optimal Weight Vector: minimize the (regularized) sum-of-squares error, as written out below.
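The cost function itself is an image on the original slide; a standard regularized sum-of-squares objective consistent with the surrounding slides (λ is the regularization parameter mentioned above) would be:

```latex
E(\mathbf{w}) \;=\; \sum_{k}\bigl(d_k - y(\mathbf{x}_k)\bigr)^{2}
\;+\; \lambda \sum_{j} w_j^{2},
\qquad y(\mathbf{x}) = \sum_{j} w_j\,\phi_j(\mathbf{x})
```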

Learning Kernel Parameters (diagram: inputs x1, …, xn, l kernels, weights w11, …, wml, outputs y1, …, ym; trained on the training set)

What to Learn? The weights wij, the centres μj of the φj's, the widths σj of the φj's, and the number of φj's.

Two-Stage Training. Step 1 determines the centres μj, the widths σj, and the number of the φj's. Step 2 determines the weights wij.
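A minimal sketch of such a two-stage procedure, assuming k-means for the centres, a shared heuristic width, and a regularized least-squares solve for the weights (common choices, not necessarily those used in the lecture):

```python
import numpy as np
from scipy.cluster.vq import kmeans2   # simple k-means for stage 1

def rbf_design_matrix(X, centres, width):
    """Gaussian basis functions evaluated at every training point."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def train_rbf(X, d, n_basis=10, lam=1e-3):
    # Stage 1: choose centres (k-means) and a shared width heuristically
    centres, _ = kmeans2(X, n_basis, minit='points')
    width = np.mean(np.linalg.norm(centres[:, None] - centres[None, :], axis=2)) + 1e-12
    # Stage 2: solve the regularized least-squares problem for the weights
    Phi = rbf_design_matrix(X, centres, width)
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_basis), Phi.T @ d)
    return centres, width, w

# Usage: fit a noisy sine curve
X = np.linspace(0, 2 * np.pi, 200)[:, None]
d = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(0).normal(size=200)
centres, width, w = train_rbf(X, d)
pred = rbf_design_matrix(X, centres, width) @ w
```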

Learning Rule: the backpropagation learning rule also applies to RBF networks.

Three-layer RBF neural network

Auto-encoders

Creating your own architecture. Variables: the input, the hidden-layer characteristics, and the output layer; then learn the parameters and weights.
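As an illustration of turning those choices into code, a minimal configurable feed-forward network (sigmoid hidden units, linear output; all names and defaults here are ours, not the lecture's):

```python
import numpy as np

class SimpleMLP:
    """Minimal configurable feed-forward network (forward pass only).

    layer_sizes, e.g. [4, 8, 3]: 4 inputs, one hidden layer of 8 units,
    3 outputs.
    """
    def __init__(self, layer_sizes, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = [rng.normal(scale=0.1, size=(n_out, n_in + 1))  # +1 column for bias
                        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

    def forward(self, x):
        a = np.asarray(x, dtype=float)
        for i, W in enumerate(self.weights):
            a = W @ np.append(a, 1.0)          # affine step: weights plus bias
            if i < len(self.weights) - 1:      # sigmoid on hidden layers only
                a = 1.0 / (1.0 + np.exp(-a))
        return a                               # linear output layer

net = SimpleMLP([4, 8, 3])
print(net.forward([0.2, -1.0, 0.5, 3.0]))      # 3 output values
```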