CS621: Artificial Intelligence
Lectures 22-23: Sigmoid Neuron, Backpropagation
(Lectures 20 and 21, on Graphical Models, were taken by Anup)
Pushpak Bhattacharyya, Computer Science and Engineering Department, IIT Bombay

Training of the MLP
Multilayer Perceptron (MLP). Question: how do we find weights for the hidden layers when no target output is available for them? This is the credit assignment problem, to be solved by gradient descent.

Gradient Descent Technique
Let E be the error at the output layer:

E = (1/2) Σ_{j=1..p} Σ_{i=1..n} (t_i − o_i)_j²

where t_i = target output and o_i = observed output, i is the index going over the n neurons in the outermost layer, and j is the index going over the p patterns (1 to p). Example: for XOR, p = 4 and n = 1.
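
As a concrete illustration, here is a minimal Python sketch (not from the slides; the observed outputs are made-up placeholders) computing E for the XOR case:

# E = 1/2 * sum over patterns j and output neurons i of (t_i - o_i)^2.
# XOR: p = 4 patterns, n = 1 output neuron.
targets  = [0.0, 1.0, 1.0, 0.0]   # t for the four XOR input patterns
observed = [0.1, 0.8, 0.7, 0.2]   # o from some hypothetical network

E = 0.5 * sum((t - o) ** 2 for t, o in zip(targets, observed))
print(E)  # 0.5 * (0.01 + 0.04 + 0.09 + 0.04) = 0.09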

Weights in a Feedforward NN
w_mn is the weight of the connection from the n-th neuron to the m-th neuron. E versus the weight vector is a complex surface in the space defined by the weights w_ij. The quantity −∂E/∂w_mn gives the direction in which a movement of the operating point in the w_mn coordinate space will result in the maximum decrease in error:

Δw_mn ∝ −∂E/∂w_mn

[Figure: neuron n feeding neuron m through weight w_mn.]
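
To make the descent direction concrete, a one-dimensional sketch (my illustration, with an assumed toy error surface E(w) = (w − 3)²):

# Gradient descent on a toy one-dimensional error surface E(w) = (w - 3)^2.
# Stepping along -dE/dw is the direction of steepest decrease in E.
def dE_dw(w):
    return 2.0 * (w - 3.0)

w, eta = 0.0, 0.1   # starting point and learning rate (assumed values)
for _ in range(100):
    w -= eta * dE_dw(w)   # delta_w = -eta * dE/dw
print(w)  # approaches the minimum at w = 3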

Sigmoid Neurons
Gradient descent needs a derivative computation, which is not possible with the perceptron because of the discontinuous step function it uses. Sigmoid neurons, whose derivatives are easy to compute, are used instead:

o = 1 / (1 + e^(−net))

The computing power comes from the non-linearity of the sigmoid function.

Derivative of the Sigmoid Function
With o = 1 / (1 + e^(−net)),

do/d(net) = e^(−net) / (1 + e^(−net))² = o (1 − o)

so the derivative is expressible in terms of the output itself, which makes it cheap to compute.
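
The identity is easy to verify numerically; a small Python check (assuming the standard logistic sigmoid):

import math

def sigmoid(net):
    # Logistic sigmoid: o = 1 / (1 + exp(-net)).
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_derivative(net):
    # Derivative via the identity do/dnet = o * (1 - o).
    o = sigmoid(net)
    return o * (1.0 - o)

# Compare the identity against a central finite difference at net = 0.7.
net, h = 0.7, 1e-6
numeric = (sigmoid(net + h) - sigmoid(net - h)) / (2.0 * h)
print(sigmoid_derivative(net), numeric)  # both approximately 0.2217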

Training Algorithm
1. Initialize weights to random values.
2. For input x = <x_n, x_{n−1}, …, x_0>, with target output t and observed output o, modify weights as follows:

w_i ← w_i + Δw_i, where Δw_i = −η ∂E/∂w_i and E = (1/2)(t − o)²

3. Iterate until E < ε (a chosen threshold).

Calculation of Δw_i
Using E = (1/2)(t − o)², o = sigmoid(net), and net = Σ_i w_i x_i:

Δw_i = −η ∂E/∂w_i = −η (∂E/∂o)(∂o/∂net)(∂net/∂w_i) = η (t − o) o (1 − o) x_i
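
Putting the update rule to work, a minimal training-loop sketch for a single sigmoid neuron (the task, learning rate, and initialization are illustrative assumptions, not from the slides):

import math, random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

# Single sigmoid neuron learning AND; x[0] = 1 acts as the bias input.
patterns = [([1.0, 0.0, 0.0], 0.0),
            ([1.0, 0.0, 1.0], 0.0),
            ([1.0, 1.0, 0.0], 0.0),
            ([1.0, 1.0, 1.0], 1.0)]
w = [random.uniform(-0.5, 0.5) for _ in range(3)]
eta = 0.5  # learning rate (assumed value)

for epoch in range(10000):
    E = 0.0
    for x, t in patterns:
        o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        # delta_w_i = eta * (t - o) * o * (1 - o) * x_i
        w = [wi + eta * (t - o) * o * (1.0 - o) * xi for wi, xi in zip(w, x)]
        E += 0.5 * (t - o) ** 2
    if E < 0.01:   # iterate until E falls below the threshold
        break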

Observations
Does the training technique support our intuition? The larger x_i is, the larger Δw_i is: the error burden is borne by the weights corresponding to large input values.

Backpropagation on feedforward network

Backpropagation Algorithm
[Figure: fully connected feedforward network with an input layer (n input neurons), hidden layers, and an output layer (m output neurons); neuron i in one layer feeds neuron j in the next layer through weight w_ji.]
Fully connected feedforward network; a pure FF network (no connections jumping over layers).

Gradient Descent Equations
Δw_ji = −η ∂E/∂w_ji = −η (∂E/∂net_j)(∂net_j/∂w_ji) = η δ_j o_i

where net_j is the net input to neuron j, o_i is the output of neuron i feeding it, and δ_j = −∂E/∂net_j.

Backpropagation – for the Outermost Layer
For an output neuron j with sigmoid activation,

δ_j = −∂E/∂net_j = (t_j − o_j) o_j (1 − o_j)

so Δw_ji = η (t_j − o_j) o_j (1 − o_j) o_i.

Backpropagation for Hidden Layers
[Figure: layered network with neuron k in the output layer (m output neurons), neuron j in a hidden layer below it, and neuron i in the layer below j, above the input layer (n input neurons).]
δ_k is propagated backwards to find the value of δ_j.

Backpropagation – for Hidden Layers
For a hidden neuron j,

δ_j = o_j (1 − o_j) Σ_k w_kj δ_k

where k ranges over the neurons in the layer above j, so Δw_ji = η δ_j o_i.

General Backpropagation Rule
General weight updating rule:

Δw_ji = η δ_j o_i

where
δ_j = (t_j − o_j) o_j (1 − o_j) for the outermost layer, and
δ_j = o_j (1 − o_j) Σ_k w_kj δ_k for hidden layers.
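
Combining the two cases, a compact Python sketch of the whole algorithm on a 2-2-1 network learning XOR (the architecture, learning rate, and initialization are illustrative choices, not taken from the slides):

import math, random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

random.seed(0)
# 2-2-1 network; the last weight in each row multiplies a constant bias input of 1.
W_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # w_ji: hidden j <- input i
W_out = [random.uniform(-1, 1) for _ in range(3)]                      # w_kj: output <- hidden j
eta = 0.5
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

for epoch in range(20000):
    for x, t in data:
        # Forward pass
        xi = x + [1.0]                                 # inputs plus bias
        h = [sigmoid(sum(w * v for w, v in zip(row, xi))) for row in W_hid]
        hj = h + [1.0]                                 # hidden outputs plus bias
        o = sigmoid(sum(w * v for w, v in zip(W_out, hj)))

        # Backward pass: delta for the outermost layer
        delta_o = (t - o) * o * (1.0 - o)
        # Hidden layer: delta_j = o_j (1 - o_j) * sum_k w_kj * delta_k
        delta_h = [h[j] * (1.0 - h[j]) * W_out[j] * delta_o for j in range(2)]

        # Weight updates: delta_w_ji = eta * delta_j * o_i
        W_out = [w + eta * delta_o * v for w, v in zip(W_out, hj)]
        for j in range(2):
            W_hid[j] = [w + eta * delta_h[j] * v for w, v in zip(W_hid[j], xi)]

for x, t in data:
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in W_hid]
    o = sigmoid(sum(w * v for w, v in zip(W_out, h + [1.0])))
    print(x, t, round(o, 3))   # outputs should approach the 0/1 targets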

How Does It Work?
Input propagates forward and error propagates backward (e.g., for XOR).
[Figure: feedforward network for XOR. Recoverable labels: an output neuron with threshold 0.5 and input weights w1 = 1 and w2 = 1; a hidden neuron labeled x1x2 with threshold 1.5; a connection weight of −1.]