Pushpak Bhattacharyya Computer Science and Engineering Department

Slides:

Advertisements

Similar presentations

CS344: Principles of Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 11, 12: Perceptron Training 30 th and 31 st Jan, 2012.

Advertisements

CS344: Introduction to Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 15, 16: Perceptrons and their computing power 6 th and.

Lecture 11 Neural Networks L. Manevitz Dept of Computer Science Tel: 2420.

CS623: Introduction to Computing with Neural Nets (lecture-10) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 25: Perceptrons; # of regions;

CS621: Artificial Intelligence Lecture 11: Perceptrons capacity Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

Instructor: Prof. Pushpak Bhattacharyya 13/08/2004 CS-621/CS-449 Lecture Notes CS621/CS449 Artificial Intelligence Lecture Notes Set 6: 22/09/2004, 24/09/2004,

Prof. Pushpak Bhattacharyya, IIT Bombay 1 CS 621 Artificial Intelligence Lecture /10/05 Prof. Pushpak Bhattacharyya Linear Separability,

CS623: Introduction to Computing with Neural Nets Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 31: Feedforward N/W; sigmoid.

CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 30: Perceptron training convergence;

CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 29: Perceptron training and.

Instructor: Prof. Pushpak Bhattacharyya 13/08/2004 CS-621/CS-449 Lecture Notes CS621/CS449 Artificial Intelligence Lecture Notes Set 4: 24/08/2004, 25/08/2004,

Prof. Pushpak Bhattacharyya, IIT Bombay 1 CS 621 Artificial Intelligence Lecture /10/05 Prof. Pushpak Bhattacharyya Artificial Neural Networks:

CS621 : Artificial Intelligence

CS344: Introduction to Artificial Intelligence (associated lab: CS386) Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 32: sigmoid neuron; Feedforward.

CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 21: Perceptron training and convergence.

CS623: Introduction to Computing with Neural Nets (lecture-12) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

Neural NetworksNN 21 Architecture We consider the architecture: feedforward NN with one layer It is sufficient to study single layer perceptrons with.

CS623: Introduction to Computing with Neural Nets (lecture-9) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

CS621 : Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 20: Neural Net Basics: Perceptron.

CS621: Artificial Intelligence Lecture 10: Perceptrons introduction Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

CS621 : Artificial Intelligence

CS623: Introduction to Computing with Neural Nets (lecture-18) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

CS344 : Introduction to Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 5- Deduction Theorem.

Chapter 2 Single Layer Feedforward Networks

CS623: Introduction to Computing with Neural Nets (lecture-5)

CS344: Introduction to Artificial Intelligence (associated lab: CS386)

Pushpak Bhattacharyya Computer Science and Engineering Department

Wed June 12 Goals of today’s lecture. Learning Mechanisms

CS621: Artificial Intelligence

Pattern Recognition: Statistical and Neural

CS623: Introduction to Computing with Neural Nets (lecture-2)

CS621: Artificial Intelligence Lecture 17: Feedforward network (lecture 16 was on Adaptive Hypermedia: Debraj, Kekin and Raunak) Pushpak Bhattacharyya.

CSE (c) S. Tanimoto, 2004 Neural Networks

Perceptron as one Type of Linear Discriminants

Learning via Neural Networks

Pushpak Bhattacharyya Computer Science and Engineering Department

CS623: Introduction to Computing with Neural Nets (lecture-4)

Perceptrons Introduced in1957 by Rosenblatt

CS 621 Artificial Intelligence Lecture 25 – 14/10/05

Fuzzy Inferencing – Inverted Pendulum Problem

CS480/680: Intro to ML Lecture 01: Perceptron 9/11/18 Yao-Liang Yu.

CSE (c) S. Tanimoto, 2001 Neural Networks

CS623: Introduction to Computing with Neural Nets (lecture-9)

CS621: Artificial Intelligence

CS344 : Introduction to Artificial Intelligence

CS621: Artificial Intelligence

CSE (c) S. Tanimoto, 2002 Neural Networks

CS621: Artificial Intelligence

CS 621 Artificial Intelligence Lecture 27 – 21/10/05

Neuro-Computing Lecture 2 Single-Layer Perceptrons

Artificial Intelligence 9. Perceptron

Chapter - 3 Single Layer Percetron

ARTIFICIAL INTELLIGENCE

CS621: Artificial Intelligence Lecture 12: Counting no

CS623: Introduction to Computing with Neural Nets (lecture-5)

CS623: Introduction to Computing with Neural Nets (lecture-14)

CS623: Introduction to Computing with Neural Nets (lecture-3)

CS344 : Introduction to Artificial Intelligence

CS621: Artificial Intelligence

CSE (c) S. Tanimoto, 2007 Neural Nets

CS621: Artificial Intelligence Lecture 22-23: Sigmoid neuron, Backpropagation (Lecture 20 and 21 taken by Anup on Graphical Models) Pushpak Bhattacharyya.

Prof. Pushpak Bhattacharyya, IIT Bombay

CS 621 Artificial Intelligence Lecture /09/05 Prof

CS621: Artificial Intelligence Lecture 14: perceptron training

CS621: Artificial Intelligence Lecture 18: Feedforward network contd

CS623: Introduction to Computing with Neural Nets (lecture-11)

CS621 : Artificial Intelligence

CS621: Artificial Intelligence Lecture 17: Feedforward network (lecture 16 was on Adaptive Hypermedia: Debraj, Kekin and Raunak) Pushpak Bhattacharyya.

Presentation transcript:

CS621: Artificial Intelligence Lecture 15: perceptron training contd; Proof of Convergence of PTA Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay

Perceptron Training Algorithm (PTA) Preprocessing: The computation law is modified to y = 1 if ∑wixi > θ y = o if ∑wixi < θ  . . . θ, ≤ w1 w2 wn x1 x2 x3 xn . . . θ, < w1 w2 w3 wn x1 x2 x3 xn w3

PTA – preprocessing cont… 2. Absorb θ as a weight  3. Negate all the zero-class examples . . . θ w1 w2 w3 wn x2 x3 xn . . . θ w1 w2 w3 wn x2 x3 xn x1 w0=θ x0= -1 x1

Example to demonstrate preprocessing OR perceptron 1-class <1,1> , <1,0> , <0,1> 0-class <0,0> Augmented x vectors:- 1-class <-1,1,1> , <-1,1,0> , <-1,0,1> 0-class <-1,0,0> Negate 0-class:- <1,0,0>

Example to demonstrate preprocessing cont.. Now the vectors are x0 x1 x2 X1 -1 0 1 X2 -1 1 0 X3 -1 1 1 X4 1 0 0

Perceptron Training Algorithm Start with a random value of w ex: <0,0,0…> 2. Test for wxi > 0 If the test succeeds for i=1,2,…n then return w Modify w, wnext = wprev + xfail Goto 2

Tracing PTA on OR-example w=<0,0,0> wx1 fails w=<-1,0,1> wx4 fails w=<0,0,1> wx2 fails w=<-1,1,1> wx4 fails w=<0,1,1> wx4 fails w=<1,1,1> wx1 fails w=<0,1,2> wx4 fails w=<1,1,2> wx2 fails w=<0,2,2> wx4 fails w=<1,2,2> success

Proof of Convergence of PTA Perceptron Training Algorithm (PTA) Statement: Whatever be the initial choice of weights and whatever be the vector chosen for testing, PTA converges if the vectors are from a linearly separable function.

Proof of Convergence of PTA Suppose wn is the weight vector at the nth step of the algorithm. At the beginning, the weight vector is w0 Go from wi to wi+1 when a vector Xj fails the test wiXj > 0 and update wi as wi+1 = wi + Xj Since Xjs form a linearly separable function,  w* s.t. w*Xj > 0 j

Proof of Convergence of PTA Consider the expression G(wn) = wn . w* | wn| where wn = weight at nth iteration G(wn) = |wn| . |w*| . cos  |wn| where  = angle between wn and w* G(wn) = |w*| . cos  G(wn) ≤ |w*| ( as -1 ≤ cos  ≤ 1)

Behavior of Numerator of G wn . w* = (wn-1 + Xn-1fail ) . w* wn-1 . w* + Xn-1fail . w* (wn-2 + Xn-2fail ) . w* + Xn-1fail . w* ….. w0 . w* + ( X0fail + X1fail +.... + Xn-1fail ). w* w*.Xifail is always positive: note carefully Suppose |Xj| ≥  , where  is the minimum magnitude. Num of G ≥ |w0 . w*| + n  . |w*| So, numerator of G grows with n.

Behavior of Denominator of G |wn| =  wn . wn  (wn-1 + Xn-1fail )2  (wn-1)2 + 2. wn-1. Xn-1fail + (Xn-1fail )2  (wn-1)2 + (Xn-1fail )2 (as wn-1. Xn-1fail ≤ 0 )  (w0)2 + (X0fail )2 + (X1fail )2 +…. + (Xn-1fail )2 |Xj| ≤  (max magnitude) So, Denom ≤  (w0)2 + n2

Some Observations Numerator of G grows as n Denominator of G grows as  n => Numerator grows faster than denominator If PTA does not terminate, G(wn) values will become unbounded.

Some Observations contd. But, as |G(wn)| ≤ |w*| which is finite, this is impossible! Hence, PTA has to converge. Proof is due to Marvin Minsky.

Convergence of PTA proved Whatever be the initial choice of weights and whatever be the vector chosen for testing, PTA converges if the vectors are from a linearly separable function.