CS621: Artificial Intelligence Lecture 24: Backpropagation Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

Slides:



Advertisements
Similar presentations
Artificial Neural Networks
Advertisements

Multi-Layer Perceptron (MLP)
NEURAL NETWORKS Backpropagation Algorithm
Artificial Intelligence 13. Multi-Layer ANNs Course V231 Department of Computing Imperial College © Simon Colton.
Mehran University of Engineering and Technology, Jamshoro Department of Electronic Engineering Neural Networks Feedforward Networks By Dr. Mukhtiar Ali.
S. Mandayam/ ANN/ECE Dept./Rowan University Artificial Neural Networks / Fall 2004 Shreekanth Mandayam ECE Department Rowan University.
Supervised learning 1.Early learning algorithms 2.First order gradient methods 3.Second order gradient methods.
The back-propagation training algorithm
Connectionist models. Connectionist Models Motivated by Brain rather than Mind –A large number of very simple processing elements –A large number of weighted.
Data Mining with Neural Networks (HK: Chapter 7.5)
CHAPTER 11 Back-Propagation Ming-Feng Yeh.
S. Mandayam/ ANN/ECE Dept./Rowan University Artificial Neural Networks ECE /ECE Fall 2006 Shreekanth Mandayam ECE Department Rowan University.
CS623: Introduction to Computing with Neural Nets (lecture-6) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.
CS623: Introduction to Computing with Neural Nets (lecture-10) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.
CS621: Artificial Intelligence Lecture 27: Backpropagation applied to recognition problems; start of logic Pushpak Bhattacharyya Computer Science and Engineering.
Artificial Neural Networks
Biointelligence Laboratory, Seoul National University
Artificial Neural Networks (ANN). Output Y is 1 if at least two of the three inputs are equal to 1.
Artificial Neural Networks
Neural Networks Chapter 6 Joost N. Kok Universiteit Leiden.
Neural Networks AI – Week 23 Sub-symbolic AI Multi-Layer Neural Networks Lee McCluskey, room 3/10
11 CSE 4705 Artificial Intelligence Jinbo Bi Department of Computer Science & Engineering
CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 35: Backpropagation; need for.
Machine Learning Dr. Shazzad Hosain Department of EECS North South Universtiy
CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 25: Backpropagation and NN based IR.
Artificial Intelligence Methods Neural Networks Lecture 4 Rakesh K. Bissoondeeal Rakesh K. Bissoondeeal.
Artificial Intelligence Techniques Multilayer Perceptrons.
CS 478 – Tools for Machine Learning and Data Mining Backpropagation.
CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 31: Feedforward N/W; sigmoid.
CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 30: Perceptron training convergence;
11 1 Backpropagation Multilayer Perceptron R – S 1 – S 2 – S 3 Network.
Instructor: Prof. Pushpak Bhattacharyya 13/08/2004 CS-621/CS-449 Lecture Notes CS621/CS449 Artificial Intelligence Lecture Notes Set 4: 24/08/2004, 25/08/2004,
Prof. Pushpak Bhattacharyya, IIT Bombay 1 CS 621 Artificial Intelligence Lecture /10/05 Prof. Pushpak Bhattacharyya Artificial Neural Networks:
Back-Propagation Algorithm AN INTRODUCTION TO LEARNING INTERNAL REPRESENTATIONS BY ERROR PROPAGATION Presented by: Kunal Parmar UHID:
CS621 : Artificial Intelligence
CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 32: sigmoid neuron; Feedforward.
Introduction to Neural Networks Introduction to Neural Networks Applied to OCR and Speech Recognition An actual neuron A crude model of a neuron Computational.
CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 21: Perceptron training and convergence.
Pushpak Bhattacharyya Computer Science and Engineering Department
Neural Networks Presented by M. Abbasi Course lecturer: Dr.Tohidkhah.
Neural Networks Teacher: Elena Marchiori R4.47 Assistant: Kees Jong S2.22
Image Source: ww.physiol.ucl.ac.uk/fedwards/ ca1%20neuron.jpg
EEE502 Pattern Recognition
CS621: Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 45– Backpropagation issues; applications 11 th Nov, 2010.
CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 25: Backpropagation and Application.
Previous Lecture Perceptron W  t+1  W  t  t  d(t) - sign (w(t)  x)] x Adaline W  t+1  W  t  t  d(t) - f(w(t)  x)] f’ x Gradient.
Chapter 6 Neural Network.
CS623: Introduction to Computing with Neural Nets (lecture-17) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.
CS623: Introduction to Computing with Neural Nets (lecture-9) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.
Fall 2004 Backpropagation CS478 - Machine Learning.
CS623: Introduction to Computing with Neural Nets (lecture-5)
with Daniel L. Silver, Ph.D. Christian Frey, BBA April 11-12, 2017
CSE P573 Applications of Artificial Intelligence Neural Networks
CS621: Artificial Intelligence
CSC 578 Neural Networks and Deep Learning
ECE 471/571 - Lecture 17 Back Propagation.
CS621: Artificial Intelligence Lecture 17: Feedforward network (lecture 16 was on Adaptive Hypermedia: Debraj, Kekin and Raunak) Pushpak Bhattacharyya.
CSE 573 Introduction to Artificial Intelligence Neural Networks
CS623: Introduction to Computing with Neural Nets (lecture-4)
CS 621 Artificial Intelligence Lecture 25 – 14/10/05
CS623: Introduction to Computing with Neural Nets (lecture-9)
Backpropagation.
CS621 : Artificial Intelligence
CS623: Introduction to Computing with Neural Nets (lecture-5)
CS623: Introduction to Computing with Neural Nets (lecture-3)
CS621: Artificial Intelligence Lecture 22-23: Sigmoid neuron, Backpropagation (Lecture 20 and 21 taken by Anup on Graphical Models) Pushpak Bhattacharyya.
Prof. Pushpak Bhattacharyya, IIT Bombay
CS621: Artificial Intelligence Lecture 18: Feedforward network contd
Outline Announcement Neural networks Perceptrons - continued
CS621: Artificial Intelligence Lecture 17: Feedforward network (lecture 16 was on Adaptive Hypermedia: Debraj, Kekin and Raunak) Pushpak Bhattacharyya.
Presentation transcript:

CS621: Artificial Intelligence Lecture 24: Backpropagation Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay

Gradient Descent Technique Let E be the error at the output layer t i = target output; o i = observed output i is the index going over n neurons in the outermost layer j is the index going over the p patterns (1 to p) Ex: XOR:– p=4 and n=1

Backpropagation algorithm Fully connected feed forward network Pure FF network (no jumping of connections over layers) Hidden layers Input layer (n i/p neurons) Output layer (m o/p neurons) j i w ji ….

Gradient Descent Equations

Backpropagation – for outermost layer

Backpropagation for hidden layers Hidden layers Input layer (n i/p neurons) Output layer (m o/p neurons) j i …. k  k is propagated backwards to find value of  j

Backpropagation – for hidden layers

General Backpropagation Rule General weight updating rule: Where for outermost layer for hidden layers

How does it work? Input propagation forward and error propagation backward (e.g. XOR) w 2 =1w 1 =1 θ = 0.5 x1x2x1x2 x1x2x1x2 x1x1 x2x

Local Minima Due to the Greedy nature of BP, it can get stuck in local minimum m and will never be able to reach the global minimum g as the error can only decrease by weight change.

Momentum factor 1.Introduce momentum factor.  Accelerates the movement out of the trough.  Dampens oscillation inside the trough.  Choosing β : If β is large, we may jump over the minimum.

Symmetry breaking If mapping demands different weights, but we start with the same weights everywhere, then BP will never converge. w 2 =1w 1 =1 θ = 0.5 x1x2x1x2 x1x2x1x2 x1x1 x2x XOR n/w: if we s started with identical weight everywhere, BP will not converge

Example - Character Recognition Output layer – 26 neurons (all capital) First output neuron has the responsibility of detecting all forms of ‘A’ Centralized representation of outputs In distributed representations, all output neurons participate in output

An application in Medical Domain

Expert System for Skin Diseases Diagnosis Bumpiness and scaliness of skin Mostly for symptom gathering and for developing diagnosis skills Not replacing doctor’s diagnosis

Architecture of the FF NN input neurons, 20 hidden layer neurons, 10 output neurons Inputs: skin disease symptoms and their parameters – Location, distribution, shape, arrangement, pattern, number of lesions, presence of an active norder, amount of scale, elevation of papuls, color, altered pigmentation, itching, pustules, lymphadenopathy, palmer thickening, results of microscopic examination, presence of herald pathc, result of dermatology test called KOH

Output 10 neurons indicative of the diseases: – psoriasis, pityriasis rubra pilaris, lichen planus, pityriasis rosea, tinea versicolor, dermatophytosis, cutaneous T-cell lymphoma, secondery syphilis, chronic contact dermatitis, soberrheic dermatitis

Training data Input specs of 10 model diseases from 250 patients 0.5 is some specific symptom value is not knoiwn Trained using standard error backpropagation algorithm

Testing Previously unused symptom and disease data of 99 patients Result: Correct diagnosis achieved for 70% of papulosquamous group skin diseases Success rate above 80% for the remaining diseases except for psoriasis psoriasis diagnosed correctly only in 30% of the cases Psoriasis resembles other diseases within the papulosquamous group of diseases, and is somewhat difficult even for specialists to recognise.

Explanation capability Rule based systems reveal the explicit path of reasoning through the textual statements Connectionist expert systems reach conclusions through complex, non linear and simultaneous interaction of many units Analysing the effect of a single input or a single group of inputs would be difficult and would yield incor6rect results

Explanation contd. The hidden layer re-represents the data Outputs of hidden neurons are neither symtoms nor decisions

Discussion Symptoms and parameters contributing to the diagnosis found from the n/w Standard deviation, mean and other tests of significance used to arrive at the importance of contributing parameters The n/w acts as apprentice to the expert

Exercise Find the weakest condition for symmetry breaking. It is not the case that only when ALL weights are equal, the network faces the symmetry problem.