Lecture 3: Single Layer Perceptron


Lecture 3: Single Layer Perceptron. PMR5406 Redes Neurais e Lógica Fuzzy. Based on: Neural Networks, Simon Haykin, Prentice-Hall, 2nd edition, and course slides by Elena Marchiori, Vrije Universiteit.

Single Layer Perceptron: Architecture
We consider the architecture: a feed-forward NN with one layer. It is sufficient to study single layer perceptrons with just one neuron, since the neurons in a single layer do not interact and each can be trained independently.

Perceptron: Neuron Model
The perceptron uses a non-linear (McCulloch-Pitts) model of the neuron: the inputs x1, x2, …, xm are weighted by w1, w2, …, wm and summed together with a bias b to give the induced local field v = w1x1 + … + wmxm + b; the output is y = φ(v), where φ is the sign function:
φ(v) = +1 if v ≥ 0, and φ(v) = −1 if v < 0.
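As an illustration, a minimal Python sketch of this neuron model (the function and argument names are ours, not from the slides):

```python
import numpy as np

def predict(weights, bias, x):
    """Perceptron neuron: the sign of the induced local field v = w.x + b."""
    v = np.dot(weights, x) + bias
    return 1 if v >= 0 else -1   # phi(v): +1 if v >= 0, -1 otherwise
```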

Perceptron: Applications
The perceptron is used for classification: it assigns each input example to one of two classes C1, C2. If the output of the perceptron is +1, the input is assigned to class C1; if the output is −1, the input is assigned to C2.

Perceptron: Classification
The equation w1x1 + w2x2 + b = 0 describes a hyperplane in the input space (a line, in two dimensions). This hyperplane is the decision boundary used to separate the two classes: the decision region for C1 is w1x1 + w2x2 + b > 0, and the decision region for C2 is w1x1 + w2x2 + b ≤ 0.

Perceptron: Limitations
The perceptron can only model linearly separable functions. For example, it can model the Boolean functions AND, OR, and COMPLEMENT (a quick check is sketched below), but it cannot model XOR. Why?
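With the ±1 encoding used in these slides, hand-picked weights realize AND and OR; the sketch below reuses the predict function from above, and the particular weight values are our choice:

```python
# Hand-picked weights for AND and OR on {-1, +1} inputs (our choice of values).
for x1 in (-1, 1):
    for x2 in (-1, 1):
        and_out = predict([1, 1], -1, [x1, x2])  # +1 only when both inputs are +1
        or_out = predict([1, 1], +1, [x1, x2])   # -1 only when both inputs are -1
        print(x1, x2, and_out, or_out)
```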

Perceptron: Limitations
XOR is not linearly separable: it is impossible to separate the classes C1 and C2 with a single line. Whichever diagonal pair of corners forms C1, summing its two constraints w1x1 + w2x2 + b > 0 gives 2b > 0 (the weight terms cancel), while summing the two constraints w1x1 + w2x2 + b ≤ 0 for C2 gives 2b ≤ 0, a contradiction.

Perceptron: Learning Algorithm
Variables and parameters:
x(n) = input vector = [+1, x1(n), x2(n), …, xm(n)]^T
w(n) = weight vector = [b(n), w1(n), w2(n), …, wm(n)]^T
b(n) = bias
y(n) = actual response
d(n) = desired response
η = learning rate parameter

The Fixed-Increment Learning Algorithm
Initialization: set w(0) = 0.
Activation: activate the perceptron by applying input example (vector x(n) and desired response d(n)).
Compute actual response of the perceptron: y(n) = sgn[w^T(n)x(n)].
Adapt weight vector: if d(n) and y(n) differ, then
w(n + 1) = w(n) + η[d(n) − y(n)]x(n),
where d(n) = +1 if x(n) ∈ C1 and d(n) = −1 if x(n) ∈ C2.
Continuation: increment time step n by 1 and go to the Activation step.
A runnable sketch of this loop follows.
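A minimal Python sketch of the fixed-increment rule, assuming inputs are augmented with a leading +1 so that the bias is w[0] (function and parameter names are ours):

```python
import numpy as np

def train_perceptron(X, d, eta=1.0, w0=None, max_epochs=100):
    """Fixed-increment perceptron learning.
    X: rows are augmented inputs [+1, x1, ..., xm]; d: labels in {-1, +1}."""
    w = np.zeros(X.shape[1]) if w0 is None else np.array(w0, dtype=float)
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, d):
            y = 1 if np.dot(w, x) >= 0 else -1   # y(n) = sgn[w^T(n) x(n)]
            if y != target:
                w = w + eta * (target - y) * x   # w(n+1) = w(n) + eta[d(n)-y(n)]x(n)
                errors += 1
        if errors == 0:                          # all examples correctly classified
            break
    return w
```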

Example
Consider a training set C1 ∪ C2, where:
C1 = {(1, 1), (1, −1), (0, −1)} are elements of class +1,
C2 = {(−1, −1), (−1, 1), (0, 1)} are elements of class −1.
Use the perceptron learning algorithm to classify these examples, with w(0) = [1, 0, 0]^T and η = 1.

Single Layer Perceptron: Example
The algorithm converges to the decision boundary 2x1 − x2 = 0, with C1 on the side where 2x1 − x2 > 0 and C2 on the other side (a numerical check follows).
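Running the train_perceptron sketch above on this training set, with the augmented encoding [+1, x1, x2] (an assumption of ours), yields a separating weight vector; the exact values may differ from the boundary on the slide, but every example ends up correctly classified:

```python
X = np.array([[1, 1, 1], [1, 1, -1], [1, 0, -1],    # C1, augmented with a leading +1
              [1, -1, -1], [1, -1, 1], [1, 0, 1]])  # C2
d = np.array([1, 1, 1, -1, -1, -1])

w = train_perceptron(X, d, eta=1.0, w0=[1, 0, 0])
print(w)  # [b, w1, w2]: the boundary is b + w1*x1 + w2*x2 = 0
```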

Convergence of the Learning Algorithm
Suppose the datasets C1, C2 are linearly separable. Then the perceptron convergence algorithm converges after n0 iterations, with n0 ≤ nmax, on the training set C1 ∪ C2.
Proof: suppose x ∈ C1 ⇒ output = +1 and x ∈ C2 ⇒ output = −1. For simplicity assume w(1) = 0 and η = 1. Suppose the perceptron incorrectly classifies x(1), …, x(n) ∈ C1, so that w^T(k)x(k) ≤ 0 for k = 1, …, n. The error-correction rule w(k + 1) = w(k) + x(k) then gives:
w(2) = w(1) + x(1)
w(3) = w(2) + x(2)
…
w(n + 1) = x(1) + … + x(n)

Convergence Theorem (proof)
Let w0 be such that w0^T x(n) > 0 for all x(n) ∈ C1. Such a w0 exists because C1 and C2 are linearly separable.
Let α = min { w0^T x(n) | x(n) ∈ C1 }. Then
w0^T w(n + 1) = w0^T x(1) + … + w0^T x(n) ≥ nα.
By the Cauchy-Schwarz inequality, ||w0||² ||w(n + 1)||² ≥ [w0^T w(n + 1)]², so
||w(n + 1)||² ≥ n²α² / ||w0||².   (A)

Convergence Theorem (proof)
Now we consider another route. From w(k + 1) = w(k) + x(k), taking squared Euclidean norms:
||w(k + 1)||² = ||w(k)||² + ||x(k)||² + 2w^T(k)x(k).
The last term is ≤ 0 because x(k) is misclassified, hence
||w(k + 1)||² ≤ ||w(k)||² + ||x(k)||², for k = 1, …, n.
Since ||w(1)||² = 0:
||w(2)||² ≤ ||x(1)||²
||w(3)||² ≤ ||x(1)||² + ||x(2)||²
…
||w(n + 1)||² ≤ ||x(1)||² + … + ||x(n)||²

Convergence Theorem (proof)
Let β = max { ||x(n)||² | x(n) ∈ C1 }. Then
||w(n + 1)||² ≤ nβ.   (B)
For sufficiently large values of n, (B) comes into conflict with (A). Thus n cannot be greater than the value nmax for which (A) and (B) are both satisfied with the equality sign:
nmax²α² / ||w0||² = nmaxβ, so nmax = β||w0||² / α².
The perceptron convergence algorithm therefore terminates in at most nmax = β||w0||²/α² iterations.
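As a sanity check, the bound can be computed for the earlier example. A sketch under our own assumptions: each augmented C2 point is negated so that every sample satisfies w0^T x > 0, and the separating vector w0 = [0, 2, −1] is read off the boundary 2x1 − x2 = 0:

```python
import numpy as np

# Augmented C1 samples followed by negated augmented C2 samples.
samples = np.array([[1, 1, 1], [1, 1, -1], [1, 0, -1],
                    [-1, 1, 1], [-1, 1, -1], [-1, 0, -1]])
w0 = np.array([0.0, 2.0, -1.0])        # from the boundary 2*x1 - x2 = 0

alpha = min(samples @ w0)              # alpha = min w0^T x = 1
beta = max((samples**2).sum(axis=1))   # beta = max ||x||^2 = 3
n_max = beta * (w0 @ w0) / alpha**2    # = 3 * 5 / 1 = 15 weight updates at most
print(alpha, beta, n_max)
```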

Adaline: Adaptive Linear Element
The output y is a linear combination of the inputs x1, x2, …, xm with weights w1, w2, …, wm:
y = w1x1 + w2x2 + … + wmxm

Adaline: Adaptive Linear Element
Adaline uses a linear neuron model and the Least-Mean-Square (LMS) learning algorithm. The idea: minimize the squared error, which is a function of the weights. We can find the minimum of the error function E by means of the steepest descent method.

Steepest Descent Method
Start with an arbitrary point.
Find the direction in which E decreases most rapidly: the negative of the gradient.
Make a small step in that direction, and repeat.
A sketch of this loop follows.
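A generic steepest-descent loop in Python; a minimal sketch assuming the gradient of E is available as a function (all names here are ours):

```python
import numpy as np

def steepest_descent(grad_E, w, eta=0.01, steps=1000):
    """Repeatedly step against the gradient, the direction in which E decreases fastest."""
    for _ in range(steps):
        w = w - eta * grad_E(w)
    return w

# Example: minimize E(w) = ||w - 1||^2, whose gradient is 2(w - 1).
w_min = steepest_descent(lambda w: 2 * (w - np.ones(2)), w=np.zeros(2))
print(w_min)  # approaches [1, 1]
```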

Least-Mean-Square Algorithm (Widrow-Hoff Algorithm)
Instead of the true gradient of E, LMS uses an instantaneous approximation based on the current error e(n) = d(n) − w^T(n)x(n): with the instantaneous cost E(n) = ½ e²(n), the gradient with respect to the weights is approximated as −x(n)e(n). The update rule for the weights therefore becomes:
w(n + 1) = w(n) + η x(n)e(n)

Summary of the LMS Algorithm
Training sample: input signal vector x(n), desired response d(n).
User-selected parameter: η > 0.
Initialization: set ŵ(1) = 0.
Computation: for n = 1, 2, … compute
e(n) = d(n) − ŵ^T(n)x(n)
ŵ(n + 1) = ŵ(n) + η x(n)e(n)
A compact sketch of this loop follows.
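A compact Python sketch of this summary, assuming augmented inputs as before (function and variable names are ours):

```python
import numpy as np

def train_lms(X, d, eta=0.01, epochs=50):
    """LMS (Widrow-Hoff): w(n+1) = w(n) + eta * x(n) * e(n)."""
    w = np.zeros(X.shape[1])           # initialization: w-hat(1) = 0
    for _ in range(epochs):
        for x, target in zip(X, d):
            e = target - np.dot(w, x)  # linear neuron: no sign function on the output
            w = w + eta * x * e        # Widrow-Hoff update
    return w
```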