
Chapter 3: Single Layer Perceptron. Reference: Neural Networks, Simon Haykin, Prentice-Hall, 2nd edition.

Single Layer Perceptron: Architecture. We consider a feed-forward neural network with one layer. It is sufficient to study single-layer perceptrons with just one neuron, since the neurons in a layer do not interact with one another.

Perceptron: Neuron Model. The perceptron uses a non-linear (McCulloch-Pitts) model of the neuron: the inputs x1, x2, …, xm are weighted by w1, w2, …, wm and summed together with the bias b to give the induced local field v; the output is y = φ(v), where φ is the sign function:
φ(v) = +1 if v >= 0
φ(v) = -1 if v < 0
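As an illustration (not part of the original slides), a minimal Python sketch of this neuron model; the function name perceptron_output is a placeholder:

    import numpy as np

    def perceptron_output(x, w, b):
        """McCulloch-Pitts neuron: weighted sum plus bias, passed through the sign function."""
        v = np.dot(w, x) + b          # induced local field v = w^T x + b
        return 1 if v >= 0 else -1    # sign function phi(v)

    # Example: two inputs, weights w = [2, -1], bias b = 0
    print(perceptron_output(np.array([1.0, 1.0]), np.array([2.0, -1.0]), 0.0))  # prints 1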

Perceptron: Applications. The perceptron is used for classification: it assigns each example to one of two classes C1 and C2. If the output of the perceptron is +1, the input is assigned to class C1; if the output is -1, the input is assigned to C2.

Perceptron: Classification. The equation w1x1 + w2x2 + b = 0 describes a hyperplane (a straight line in two dimensions) in the input space. This hyperplane is the decision boundary separating the two classes: the decision region for C1 is w1x1 + w2x2 + b > 0, and the decision region for C2 is w1x1 + w2x2 + b <= 0.

Perceptron: Limitations. The perceptron can only model linearly separable functions. It can be used to model the Boolean functions AND, OR, and COMPLEMENT (NOT), but it cannot model XOR. Why?
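As an illustrative sketch (the weights below are hand-picked for this transcript, not taken from the slides, and are not the only possible choice), a perceptron of the form above realizes AND, OR, and NOT for 0/1 inputs with ±1 outputs, reusing the perceptron_output sketch given earlier:

    # Hand-picked (illustrative) weight/bias pairs for 0/1 inputs, +1 = true, -1 = false.
    AND = (np.array([1.0, 1.0]), -1.5)   # fires (+1) only when both inputs are 1
    OR  = (np.array([1.0, 1.0]), -0.5)   # fires when at least one input is 1
    NOT = (np.array([-1.0]), 0.5)        # complement of a single input

    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x,
              perceptron_output(np.array(x, dtype=float), *AND),
              perceptron_output(np.array(x, dtype=float), *OR))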

Perceptron: Limitations. XOR is not linearly separable: it is impossible to separate its two classes C1 and C2 in the (x1, x2) plane with only one line.

Perceptron: Learning Algorithm. Variables and parameters:
x(n) = input vector = [+1, x1(n), x2(n), …, xm(n)]^T
w(n) = weight vector = [b(n), w1(n), w2(n), …, wm(n)]^T
b(n) = bias
y(n) = actual response
d(n) = desired response
η = learning rate parameter

The fixed-increment learning algorithm:
1. Initialization: set w(0) = 0.
2. Activation: activate the perceptron by applying an input example (vector x(n) and desired response d(n)).
3. Compute the actual response of the perceptron: y(n) = sgn[w^T(n)x(n)].
4. Adapt the weight vector: if d(n) and y(n) differ, then w(n+1) = w(n) + η[d(n) - y(n)]x(n), where d(n) = +1 if x(n) ∈ C1 and d(n) = -1 if x(n) ∈ C2.
5. Continuation: increment the time step n by 1 and go to the Activation step.
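A minimal Python sketch of this algorithm (the name train_perceptron and the epoch structure are illustrative additions); it assumes numeric inputs and ±1 labels, and stops after an epoch with no mistakes or after max_epochs passes over the data:

    import numpy as np

    def train_perceptron(X, d, eta=1.0, max_epochs=100):
        """Fixed-increment perceptron learning.

        X : array of shape (N, m) with the raw input vectors.
        d : array of shape (N,) with desired responses in {+1, -1}.
        Returns the weight vector [b, w1, ..., wm].
        """
        # Prepend the fixed +1 component so the bias is learned as w[0].
        Xa = np.hstack([np.ones((X.shape[0], 1)), X])
        w = np.zeros(Xa.shape[1])                      # w(0) = 0
        for _ in range(max_epochs):
            errors = 0
            for x_n, d_n in zip(Xa, d):
                y_n = 1 if w @ x_n >= 0 else -1        # y(n) = sgn(w^T x)
                if y_n != d_n:                         # adapt only on mistakes
                    w += eta * (d_n - y_n) * x_n       # w(n+1) = w(n) + eta[d(n)-y(n)]x(n)
                    errors += 1
            if errors == 0:                            # every example classified correctly
                break
        return w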

Example. Consider a training set C1 ∪ C2, where:
C1 = {(1, 1), (1, -1), (0, -1)} are the elements of class +1
C2 = {(-1, -1), (-1, 1), (0, 1)} are the elements of class -1
Use the perceptron learning algorithm to classify these examples, with w(0) = [1, 0, 0]^T and η = 1.

Single Layer Perceptron Example. The algorithm converges to the decision boundary 2x1 - x2 = 0, which places the C1 points (+) on one side and the C2 points (-) on the other.
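Assuming the train_perceptron sketch above (which starts from w(0) = 0 rather than the [1, 0, 0]^T used in the slide), the example can be checked numerically; the exact boundary found depends on the initialization and the presentation order, but it always separates the two classes:

    X = np.array([[1, 1], [1, -1], [0, -1],      # C1
                  [-1, -1], [-1, 1], [0, 1]])    # C2
    d = np.array([1, 1, 1, -1, -1, -1])
    w = train_perceptron(X, d, eta=1.0)
    print(w)   # [b, w1, w2]; the line w1*x1 + w2*x2 + b = 0 separates C1 from C2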

Convergence of the learning algorithm. Suppose the datasets C1 and C2 are linearly separable. The perceptron convergence algorithm then converges after n0 iterations, with n0 ≤ nmax, on the training set C1 ∪ C2.
Proof: suppose x ∈ C1 implies output +1 and x ∈ C2 implies output -1. For simplicity assume w(1) = 0 and η = 1. Suppose the perceptron incorrectly classifies x(1), …, x(n) ∈ C1, so that w^T(k)x(k) ≤ 0 for k = 1, …, n. The error-correction rule gives w(k+1) = w(k) + x(k), hence
w(2) = w(1) + x(1)
w(3) = w(2) + x(2)
…
w(n+1) = x(1) + … + x(n).

Convergence theorem (proof). Let w0 be a weight vector such that w0^T x(n) > 0 for every x(n) ∈ C1; such a w0 exists because C1 and C2 are linearly separable. Let α = min { w0^T x(n) : x(n) ∈ C1 }. Then
w0^T w(n+1) = w0^T x(1) + … + w0^T x(n) ≥ nα.
By the Cauchy-Schwarz inequality, ||w0||² ||w(n+1)||² ≥ [w0^T w(n+1)]², so
||w(n+1)||² ≥ n²α² / ||w0||².   (A)

Convergence theorem (proof). Now consider another route. From w(k+1) = w(k) + x(k), the squared Euclidean norm satisfies
||w(k+1)||² = ||w(k)||² + ||x(k)||² + 2 w^T(k)x(k).
The last term is ≤ 0 because x(k) is misclassified, so ||w(k+1)||² ≤ ||w(k)||² + ||x(k)||² for k = 1, …, n. Since ||w(1)||² = 0,
||w(2)||² ≤ ||x(1)||²
||w(3)||² ≤ ||w(2)||² + ||x(2)||²
…
||w(n+1)||² ≤ Σ_{k=1}^{n} ||x(k)||².

Convergence theorem (proof). Let β = max { ||x(n)||² : x(n) ∈ C1 }. Then
||w(n+1)||² ≤ nβ.   (B)
For sufficiently large n, (B) comes into conflict with (A): the lower bound in (A) grows as n², while the upper bound in (B) grows only as n. Therefore n cannot be greater than the value nmax at which (A) and (B) are satisfied with equality. The perceptron convergence algorithm terminates in at most
nmax = β ||w0||² / α²
iterations.
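As an illustrative calculation (the particular w0 below is hypothetical; any vector with w0^T x > 0 on C1 works), the bound can be evaluated for the C1 examples used earlier:

    import numpy as np

    # Evaluate n_max = beta * ||w0||^2 / alpha^2 for the C1 examples above.
    C1 = np.array([[1, 1], [1, -1], [0, -1]])
    w0 = np.array([2.0, -1.0])                 # a separating vector: w0^T x > 0 on C1

    alpha = min(C1 @ w0)                       # alpha = min w0^T x(n) over C1
    beta = max((C1 ** 2).sum(axis=1))          # beta  = max ||x(n)||^2 over C1
    n_max = beta * (w0 @ w0) / alpha ** 2
    print(alpha, beta, n_max)                  # 1.0, 2, 10.0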

Adaline: Adaptive Linear Element. The output y is a linear combination of the inputs x:
y = Σ_{j=1}^{m} wj xj.

Adaline: Adaptive Linear Element. Adaline uses a linear neuron model and the Least-Mean-Square (LMS) learning algorithm. The idea is to minimize the squared error, which is a function of the weights. The minimum of the error function E can be found by the method of steepest descent.

Steepest Descent Method. Start from an arbitrary point, find the direction in which E decreases most rapidly (the direction of -∇E), make a small step in that direction, and repeat.
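A minimal sketch of a single steepest-descent step, assuming the gradient of E is available as a function grad_E (an illustrative name):

    def steepest_descent_step(w, grad_E, eta=0.1):
        """Move a small step in the direction in which E decreases most rapidly."""
        return w - eta * grad_E(w)   # the steepest-descent direction is -grad E(w)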

Least-Mean-Square algorithm (Widrow-Hoff algorithm). The gradient of E is approximated using only the current example: with the error e(n) = d(n) - w^T(n)x(n), the gradient estimate is ∇E(n) ≈ -x(n)e(n). The update rule for the weights therefore becomes
w(n+1) = w(n) + η x(n)e(n).

Summary of the LMS algorithm. Training sample: input signal vector x(n) and desired response d(n). User-selected parameter: η > 0.
Initialization: set ŵ(1) = 0.
Computation: for n = 1, 2, …
e(n) = d(n) - ŵ^T(n)x(n)
ŵ(n+1) = ŵ(n) + η x(n)e(n)
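A minimal Python sketch of the LMS algorithm as summarized above (the function name lms_train and the fixed number of epochs are illustrative additions):

    import numpy as np

    def lms_train(X, d, eta=0.01, epochs=50):
        """LMS (Widrow-Hoff) training of an Adaline unit.

        X : array of shape (N, m) with input vectors (a +1 component is prepended so the bias is w[0]).
        d : array of shape (N,) with desired responses.
        """
        Xa = np.hstack([np.ones((X.shape[0], 1)), X])
        w = np.zeros(Xa.shape[1])                      # w_hat(1) = 0
        for _ in range(epochs):
            for x_n, d_n in zip(Xa, d):
                e_n = d_n - w @ x_n                    # e(n) = d(n) - w^T(n) x(n)
                w += eta * x_n * e_n                   # w(n+1) = w(n) + eta x(n) e(n)
        return w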