CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 25: Backpropagation and NN based IR.

Backpropagation algorithm: a fully connected feedforward network; pure FF network (no jumping of connections over layers). [Figure: input layer (n i/p neurons), hidden layers, output layer (m o/p neurons); weight w_ji labels the edge from neuron i to neuron j.]

Gradient Descent Equations
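The equation images on this slide did not survive extraction. A standard reconstruction, assuming the usual notation of these lectures (η the learning rate, t_m the target and o_m the observed output of the m-th output neuron):

$$E = \frac{1}{2}\sum_{m}(t_m - o_m)^2, \qquad \Delta w_{ji} = -\eta\,\frac{\partial E}{\partial w_{ji}}$$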

Backpropagation – for outermost layer
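This slide's equations are also missing from the transcript; for sigmoid output neurons the standard form (a reconstruction, not verbatim from the slide) is:

$$\delta_j = (t_j - o_j)\,o_j(1 - o_j), \qquad \Delta w_{ji} = \eta\,\delta_j\,o_i$$

where o_i is the output of neuron i feeding into neuron j.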

Backpropagation for hidden layers: [Figure: input layer (n i/p neurons), hidden layers, output layer (m o/p neurons); δ_k from the layer above is propagated backwards to find the value of δ_j.]

Backpropagation – for hidden layers
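Again the equation image is lost; the standard hidden-layer rule, reconstructed under the same sigmoid assumption, is:

$$\delta_j = o_j(1 - o_j)\sum_{k \in \text{next layer}} \delta_k\,w_{kj}$$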

General Backpropagation Rule: the general weight updating rule is Δw_ji = η δ_j o_i, where δ_j takes one form for the outermost layer and another for hidden layers, as reconstructed below.
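Reconstructed (the slide's equation images are gone; this is the textbook form consistent with the two preceding slides):

$$\Delta w_{ji} = \eta\,\delta_j\,o_i, \qquad \delta_j = \begin{cases} (t_j - o_j)\,o_j(1 - o_j) & \text{for the outermost layer} \\ o_j(1 - o_j)\sum_{k}\delta_k\,w_{kj} & \text{for hidden layers} \end{cases}$$

To make the rule concrete, here is a minimal NumPy sketch of one training step for a single-hidden-layer sigmoid network; the names (W1, W2, eta) and the fixed two-layer shape are illustrative assumptions, not from the original slides:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, t, W1, W2, eta=0.5):
    """One gradient-descent step on E = 1/2 * sum((t - o)^2); returns updated weights."""
    # Forward pass
    h = sigmoid(W1 @ x)                    # hidden layer outputs o_i
    o = sigmoid(W2 @ h)                    # output layer outputs o_j

    # Outermost layer: delta_j = (t_j - o_j) o_j (1 - o_j)
    delta_o = (t - o) * o * (1 - o)

    # Hidden layer: delta_j = o_j (1 - o_j) * sum_k delta_k w_kj
    delta_h = h * (1 - h) * (W2.T @ delta_o)

    # General rule: Delta w_ji = eta * delta_j * o_i
    W2 = W2 + eta * np.outer(delta_o, h)
    W1 = W1 + eta * np.outer(delta_h, x)
    return W1, W2

Repeating backprop_step over the training set until E stops falling is exactly the greedy procedure whose failure modes the next slides discuss.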

Issues in the training algorithm: it is a greedy algorithm that always changes weights such that E reduces, so it may get stuck in a local minimum. If we observe that E is not getting reduced any more, the following may be the reasons:

Issues in the training algorithm contd. 1. Stuck in a local minimum. 2. Network paralysis (high −ve or +ve i/p makes neurons saturate). 3. η (the learning rate) is too small.

Diagnostics in action (1): if stuck in a local minimum, try the following: re-initialize the weight vector, increase the learning rate, or introduce more neurons in the hidden layer.

Diagnostics in action (2) 2) Observe the outputs: If they are close to 0 or 1, try the following: 1.Scale the inputs or divide by a normalizing factor. 2.Change the shape and size of the sigmoid.

Kolmogorov Statement pertaining to Hidden Layer Design. Kolmogorov's statement: a feedforward network with three layers (input, hidden and output), with an appropriate I/O relation that can vary from neuron to neuron, is sufficient to compute any function. However, more hidden layers reduce the size of individual layers.

Momentum factor: to increase the speed of learning, introduce a momentum factor. It accelerates the movement out of a trough and dampens the oscillation inside the trough. Choosing the momentum factor: if it is too large, we may jump over the global minimum.
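The update with momentum, reconstructed (the slide's symbol did not survive extraction; β is assumed here for the momentum factor):

$$\Delta w_{ji}(t) = \eta\,\delta_j\,o_i + \beta\,\Delta w_{ji}(t-1)$$

The second term reuses the previous step's weight change, which is what accelerates movement along a consistent downhill direction and damps oscillation inside a trough.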

An application in the Medical Domain

Expert System for Skin Disease Diagnosis: deals with bumpiness and scaliness of skin; mostly for symptom gathering and for developing diagnosis skills; not a replacement for the doctor's diagnosis.

Architecture of the FF NN: input neurons, 20 hidden layer neurons, 10 output neurons. Inputs: skin disease symptoms and their parameters – location, distribution, shape, arrangement, pattern, number of lesions, presence of an active border, amount of scale, elevation of papules, color, altered pigmentation, itching, pustules, lymphadenopathy, palmar thickening, results of microscopic examination, presence of a herald patch, result of the dermatology test called KOH.

Output: 10 neurons indicative of the diseases – psoriasis, pityriasis rubra pilaris, lichen planus, pityriasis rosea, tinea versicolor, dermatophytosis, cutaneous T-cell lymphoma, secondary syphilis, chronic contact dermatitis, seborrheic dermatitis.

Training data: input specs of 10 model diseases from 250 patients; 0.5 used if some specific symptom value is not known; trained using the standard error backpropagation algorithm.

Testing: previously unused symptom and disease data of 99 patients. Correct diagnosis achieved for 70% of papulosquamous group skin diseases; success rate above 80% for the remaining diseases, except for psoriasis, which was diagnosed correctly in only 30% of the cases. Psoriasis resembles other diseases within the papulosquamous group and is somewhat difficult even for specialists to recognise.

Explanation capability: rule-based systems reveal the explicit path of reasoning through textual statements, whereas connectionist expert systems reach conclusions through complex, non-linear and simultaneous interaction of many units. Analysing the effect of a single input or a single group of inputs would be difficult and would yield incorrect results.

Explanation contd.: the hidden layer re-represents the data; outputs of hidden neurons are neither symptoms nor decisions.

Discussion: symptoms and parameters contributing to the diagnosis are found from the n/w; standard deviation, mean and other tests of significance are used to arrive at the importance of contributing parameters; the n/w acts as an apprentice to the expert.

Neural Network for IR: slides taken from the lecture-notes material accompanying the book Modern Information Retrieval by Ricardo Baeza-Yates.

Neural Network for IR, from the work by Wilkinson & Hingston, SIGIR'91. [Figure: three-layer network with query term nodes k_a, k_b, k_c on the left, document term nodes k_1 … k_t in the middle, and document nodes d_1 … d_j, d_j+1 … d_N on the right.]

Three-layer network: query terms issue the first inputs; these inputs propagate across the network to reach the document nodes. Second level of propagation: document nodes might themselves generate new inputs which affect the document term nodes; document term nodes might respond with new inputs of their own.

Weight Assignments: weight associated with an edge from a query term node k_i to the corresponding document term node k_i:
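The formula image is missing; in the neural network model described in Modern Information Retrieval this weight is the normalized query term weight (a reconstruction under that assumption):

$$\overline{w_{iq}} = \frac{w_{iq}}{\sqrt{\sum_{i=1}^{t} w_{iq}^{2}}}$$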

Next layer weight assignment: weight associated with an edge from a document term node k_i to a document node d_j:
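Again reconstructed from the book's model, this is the normalized document term weight:

$$\overline{w_{ij}} = \frac{w_{ij}}{\sqrt{\sum_{i=1}^{t} w_{ij}^{2}}}$$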

Document Ranking: after the first level of input propagation, the activation level of a document node d_j is given by the sum reconstructed below, which is exactly the ranking of the Vector model. The feedback has the effect of thesaurus-based expansion.
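With the two normalized weights above, the activation of d_j is (a reconstruction, the slide's image being lost):

$$\sum_{i=1}^{t}\overline{w_{iq}}\,\overline{w_{ij}} \;=\; \frac{\sum_{i=1}^{t} w_{iq}\,w_{ij}}{\sqrt{\sum_{i=1}^{t} w_{iq}^{2}}\;\sqrt{\sum_{i=1}^{t} w_{ij}^{2}}}$$

i.e. the cosine similarity between query q and document d_j used by the vector model.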

Terrier Assignment: corpus – …_1989.tar.gz (link); 50 topics (link).