CS344: Introduction to Artificial Intelligence (associated lab: CS386)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 35: Backpropagation; need for multiple layers and non-linearity
7th April, 2011

Backpropagation for hidden layers

[Figure: feedforward network with an input layer (n i/p neurons), hidden layers, and an output layer (m o/p neurons); i indexes a neuron feeding neuron j, and k indexes a neuron in the layer above j.]

∆w_ji = η δ_j o_i
δ_j = (t_j − o_j) o_j (1 − o_j) for the outermost layer
δ_j = o_j (1 − o_j) Σ_k w_kj δ_k for hidden layers
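
A minimal numpy sketch of these update rules for a single hidden layer (function and variable names are mine, not the lecture's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, t, W1, W2, eta=0.5):
    """One weight update on a single pattern for a 1-hidden-layer net."""
    h = sigmoid(W1 @ x)                   # hidden activations o_j
    o = sigmoid(W2 @ h)                   # output activations o_k
    delta_out = (t - o) * o * (1 - o)     # (t_k - o_k) o_k (1 - o_k): outermost layer
    delta_hid = h * (1 - h) * (W2.T @ delta_out)  # o_j (1 - o_j) sum_k w_kj delta_k
    W2 += eta * np.outer(delta_out, h)    # dw_kj = eta * delta_k * o_j
    W1 += eta * np.outer(delta_hid, x)    # dw_ji = eta * delta_j * x_i
    return W1, W2
```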

Observations on weight change rules
- Does the training technique support our intuition?
- The larger x_i is, the larger ∆w_i is.
- The error burden is borne by the weight values corresponding to large input values.

Observations contd.
- ∆w_i is proportional to the departure of o from the target t.
- Saturation behaviour when o is 0 or 1.
- If o < t then ∆w_i > 0, and if o > t then ∆w_i < 0, which is consistent with Hebb’s law.
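
These observations can be checked directly against the single sigmoid neuron rule ∆w_i = η (t − o) o (1 − o) x_i from the earlier lectures in this series (a sketch; the numbers are illustrative):

```python
# Delta rule for one sigmoid output: dw_i = eta * (t - o) * o * (1 - o) * x_i
def delta_w(eta, t, o, x_i):
    return eta * (t - o) * o * (1 - o) * x_i

print(delta_w(0.5, t=1.0, o=0.2, x_i=1.0))    # o < t  ->  positive change
print(delta_w(0.5, t=0.0, o=0.8, x_i=1.0))    # o > t  ->  negative change
print(delta_w(0.5, t=1.0, o=0.999, x_i=1.0))  # saturation: change ~ 0
print(delta_w(0.5, t=1.0, o=0.2, x_i=5.0))    # larger x_i -> larger change
```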

Hebb’s law

[Figure: neuron n_i connected to neuron n_j through weight w_ji.]

- If n_j and n_i are both in the excitatory state (+1), the change in weight must be such that it enhances the excitation; the change is proportional to both levels of excitation: ∆w_ji ∝ e(n_j) e(n_i).
- If n_i and n_j are in a mutual state of inhibition (one is +1 and the other is −1), the change in weight is such that the inhibition is enhanced (the change in weight is negative).
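
In update form (a one-line sketch; e(·) gives the excitation level ±1):

```python
# Hebbian change: dw_ji proportional to e(n_j) * e(n_i)
def hebb_delta(eta, e_nj, e_ni):
    return eta * e_nj * e_ni

print(hebb_delta(0.1, +1, +1))  # +0.1: both excited, excitation enhanced
print(hebb_delta(0.1, +1, -1))  # -0.1: mutual inhibition, inhibition enhanced
```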

Saturation behaviour
- The algorithm is iterative and incremental.
- If the weight values or the number of inputs is very large, the net input will be large and the output will lie in the saturation region.
- The weight values hardly change in the saturation region.
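
Assuming the sigmoid unit from the previous lectures, the factor o(1 − o) in the weight change shows why learning stalls in the saturation region (sketch):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# o(1 - o) is the slope factor that multiplies every weight change;
# it collapses towards 0 once the net input drives o near 0 or 1.
for net in [0.0, 2.0, 5.0, 10.0]:
    o = sigmoid(net)
    print(f"net={net:5.1f}  o={o:.5f}  o(1-o)={o * (1 - o):.6f}")
# net=0 gives o(1-o)=0.25 (fast learning); net=10 gives ~0.000045 (stalled)
```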

Local Minima
Due to the greedy nature of BP, it can get stuck in a local minimum m and will never be able to reach the global minimum g, as the error can only decrease by weight change.

Momentum factor
1. Introduce a momentum factor.
   - Accelerates the movement out of the trough.
   - Dampens oscillation inside the trough.
   - Choosing β: if β is large, we may jump over the minimum.
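
One common formulation (notation assumed, not preserved in this transcript) carries over a fraction β of the previous change: ∆w(n) = η δ o + β ∆w(n−1). A 1-D sketch on a hypothetical error surface:

```python
# Gradient descent with a momentum factor on E(w) = (w - 3)^2 (hypothetical).
def grad_E(w):
    return 2.0 * (w - 3.0)   # dE/dw; minimum at w = 3

w, dw_prev = 0.0, 0.0
eta, beta = 0.1, 0.5         # too large a beta risks jumping over the minimum
for _ in range(100):
    dw = -eta * grad_E(w) + beta * dw_prev   # momentum carries past steps
    w += dw
    dw_prev = dw
print(w)                     # close to 3.0
```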

Symmetry breaking
If the mapping demands different weights but we start with the same weights everywhere, then BP will never converge.

[Figure: XOR n/w — an output neuron with θ = 0.5 and weights w_1 = 1, w_2 = 1 over two hidden units computing x_1·x̄_2 and x̄_1·x_2 from inputs x_1, x_2; if we started with identical weights everywhere, BP will not converge.]
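
A sketch of why this happens, on a hypothetical 2-2-1 net trained on XOR: with identical initial weights, both hidden neurons compute the same output and receive the same error, so every update keeps them identical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([0., 1., 1., 0.])        # XOR targets

W1 = np.full((2, 2), 0.5)             # identical weights everywhere
W2 = np.full(2, 0.5)
for _ in range(1000):
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x)
        o = sigmoid(W2 @ h)
        d_o = (t - o) * o * (1 - o)
        d_h = h * (1 - h) * W2 * d_o
        W2 += 0.5 * d_o * h
        W1 += 0.5 * np.outer(d_h, x)
print(W1)   # the two hidden rows are still identical: symmetry never broke
```

Replacing the np.full initialisations with small random values (e.g. np.random.uniform(-0.5, 0.5, ...)) is the usual way to break the symmetry.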

Updating Weights
Change in weight for multiple patterns:
- Offline (batch): accumulate the change and reflect it after an epoch.
- Online (incremental): reflect the change after each pattern.
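
A sketch of the two modes for a single sigmoid neuron (names and data are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, x, t):                      # weight-change direction for one pattern
    o = sigmoid(w @ x)
    return (t - o) * o * (1 - o) * x

X = np.array([[1., 0.], [0., 1.], [1., 1.]])   # illustrative patterns
T = np.array([0., 0., 1.])
eta = 0.5

w_online = np.zeros(2)                  # online: update after each pattern
for x, t in zip(X, T):
    w_online += eta * grad(w_online, x, t)

w_batch = np.zeros(2)                   # offline: accumulate, apply after the epoch
dw = np.zeros(2)
for x, t in zip(X, T):
    dw += eta * grad(w_batch, x, t)
w_batch += dw

print(w_online, w_batch)                # the two modes generally differ
```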

Example-1: Digit Recognition System – 7-segment display

7-segment display – Network Design (Centralized Representation)

7-segment display – inputs and target outputs

[Table: inputs X6 X5 X4 X3 X2 X1 X0 (one per segment) against target outputs O9 O8 O7 O6 O5 O4 O3 O2 O1 O0 (one per digit); each row gives a digit’s segment pattern with the corresponding output neuron set to 1.]
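
A sketch of how such a training set can be laid out, using conventional 7-segment digit patterns; the segment-to-X_i assignment here is my assumption, not the lecture’s table:

```python
import numpy as np

# Conventional 7-segment patterns for digits 0-9, bits ordered X6..X0
# (segments a..g); this assignment is illustrative only.
SEGMENTS = {
    0: "1111110", 1: "0110000", 2: "1101101", 3: "1111001", 4: "0110011",
    5: "1011011", 6: "1011111", 7: "1110000", 8: "1111111", 9: "1111011",
}

X = np.array([[int(b) for b in SEGMENTS[d]] for d in range(10)], float)
T = np.eye(10)            # centralized targets: O_d = 1 for digit d, else 0
print(X.shape, T.shape)   # (10, 7) inputs, (10, 10) one-hot targets
```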

Example-2 – Character Recognition
- Output layer: 26 neurons (all capital letters).
- The first output neuron has the responsibility of detecting all forms of ‘A’.
- This is a centralized representation of the outputs.
- In a distributed representation, all output neurons participate in the output.
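
A sketch of the two output encodings (the 5-bit distributed code is an illustrative choice, not the lecture’s):

```python
import numpy as np

letters = [chr(c) for c in range(ord('A'), ord('Z') + 1)]

def one_hot(ch):
    """Centralized: one neuron per letter; neuron 0 fires only for 'A'."""
    v = np.zeros(len(letters))
    v[letters.index(ch)] = 1.0
    return v

def binary_code(ch):
    """Distributed: all 5 neurons participate in coding every letter
    (ceil(log2(26)) = 5 bits suffice for 26 classes)."""
    idx = letters.index(ch)
    return np.array([(idx >> b) & 1 for b in range(5)], float)

print(one_hot('A'))       # [1. 0. 0. ... 0.]
print(binary_code('C'))   # [0. 1. 0. 0. 0.]
```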

An application in the Medical Domain

Expert System for Skin Disease Diagnosis
- Bumpiness and scaliness of skin.
- Mostly for symptom gathering and for developing diagnosis skills.
- Not replacing the doctor’s diagnosis.

Architecture of the FF NN
- Input neurons, 20 hidden-layer neurons, 10 output neurons.
- Inputs: skin disease symptoms and their parameters – location, distribution, shape, arrangement, pattern, number of lesions, presence of an active border, amount of scale, elevation of papules, color, altered pigmentation, itching, pustules, lymphadenopathy, palmar thickening, results of microscopic examination, presence of herald patch, result of the dermatology test called KOH.

Output
10 neurons indicative of the diseases: psoriasis, pityriasis rubra pilaris, lichen planus, pityriasis rosea, tinea versicolor, dermatophytosis, cutaneous T-cell lymphoma, secondary syphilis, chronic contact dermatitis, seborrheic dermatitis.

Training data
- Input specifications of 10 model diseases from 250 patients.
- 0.5 is used when some specific symptom value is not known.
- Trained using the standard error backpropagation algorithm.
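
A sketch of the input encoding implied by the slide (the symptom names and [0, 1] scaling are assumptions):

```python
import numpy as np

UNKNOWN = 0.5   # per the slide: 0.5 stands in for an unknown symptom value

def encode(symptoms, order):
    """Map a dict of symptom values in [0, 1] to the network's input vector."""
    return np.array([symptoms.get(name, UNKNOWN) for name in order])

order = ["itching", "amount_of_scale", "elevation_of_papules"]  # illustrative
x = encode({"itching": 1.0, "amount_of_scale": 0.25}, order)
print(x)   # [1.   0.25 0.5 ] -- the unknown symptom encoded as 0.5
```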

Testing
- Previously unused symptom and disease data of 99 patients.
- Result: correct diagnosis achieved for 70% of papulosquamous group skin diseases.
- Success rate above 80% for the remaining diseases, except for psoriasis: psoriasis was diagnosed correctly in only 30% of the cases.
- Psoriasis resembles other diseases within the papulosquamous group and is somewhat difficult even for specialists to recognise.

Explanation capability
- Rule-based systems reveal the explicit path of reasoning through textual statements.
- Connectionist expert systems reach conclusions through complex, non-linear and simultaneous interactions of many units.
- Analysing the effect of a single input or a single group of inputs would be difficult and would yield incorrect results.

Explanation contd.
- The hidden layer re-represents the data.
- Outputs of hidden neurons are neither symptoms nor decisions.

Discussion
- Symptoms and parameters contributing to the diagnosis can be found from the n/w.
- Standard deviation, mean and other tests of significance are used to arrive at the importance of the contributing parameters.
- The n/w acts as an apprentice to the expert.

Exercise
Find the weakest condition for symmetry breaking. It is not the case that the network faces the symmetry problem only when ALL the weights are equal.