Least-Mean-Square Algorithm CS/CMPE 537 – Neural Networks.


Linear Adaptive Filter

A linear adaptive filter (LAF) performs a linear transformation of a signal according to a performance measure that is minimized or maximized.
The development of LAFs followed the work of Rosenblatt (the perceptron) and other early neural network researchers.
- LAFs can be considered linear single-layer feedforward neural networks.
- The least-mean-square (LMS) algorithm is a popular learning algorithm for LAFs (and for linear single-layer networks in general).
Wide applicability:
- Signal processing
- Control

Historical Note

Linear associative memory (early 1970s)
- Function: memory by association
- Type: linear single-layer feedforward network
Perceptron (late 1950s, early 1960s)
- Function: pattern classification
- Type: nonlinear single-layer feedforward network
Linear adaptive filter, or Adaline (1960s)
- Function: adaptive signal processing
- Type: linear single-layer feedforward network

Spatial Filter
(figure slide)

Wiener-Hopf Equations (1)

The goal is to find the optimum weights that minimize the difference between the system output y and some desired response d in the mean-square sense.
System equations:
  y = \sum_{k=1}^{p} w_k x_k
  e = d - y
Performance measure (cost function):
  J = \frac{1}{2} E[e^2], where E is the expectation operator.
- Find the optimum weights for which J is a minimum.

Wiener-Hopf Equations (2)

Substituting and simplifying:
  J = \frac{1}{2} E[d^2] - E[\sum_{k=1}^{p} w_k x_k d] + \frac{1}{2} E[\sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k x_j x_k]
Noting that expectation is a linear operator and the weights are constants:
  J = \frac{1}{2} E[d^2] - \sum_{k=1}^{p} w_k E[x_k d] + \frac{1}{2} \sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k E[x_j x_k]
Let r_d = E[d^2], r_{dx}(k) = E[d x_k], and r_x(j,k) = E[x_j x_k]. Then
  J = \frac{1}{2} r_d - \sum_{k=1}^{p} w_k r_{dx}(k) + \frac{1}{2} \sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k r_x(j,k)
To find the optimum weights, set each partial derivative to zero:
  \nabla_{w_k} J = \partial J / \partial w_k = -r_{dx}(k) + \sum_{j=1}^{p} w_j r_x(j,k) = 0, for k = 1, 2, ..., p
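The same result can be written compactly in vector form; the vector/matrix symbols below (weight vector w, cross-correlation vector r_dx, autocorrelation matrix R_x) are introduced here for compactness and do not appear in the original slides:

```latex
% Vector form of the cost function and its gradient (assumed notation):
%   \mathbf{w} = (w_1,\dots,w_p)^T, \quad (\mathbf{r}_{dx})_k = E[d\,x_k], \quad (R_x)_{jk} = E[x_j x_k]
J(\mathbf{w}) = \tfrac{1}{2} r_d - \mathbf{w}^{T}\mathbf{r}_{dx} + \tfrac{1}{2}\,\mathbf{w}^{T} R_x \mathbf{w}
\qquad
\nabla_{\mathbf{w}} J = -\mathbf{r}_{dx} + R_x \mathbf{w}
% Setting the gradient to zero gives the Wiener-Hopf (normal) equations:
R_x\, \mathbf{w}_o = \mathbf{r}_{dx}
```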

Wiener-Hopf Equations (3)

Let w_{ok} denote the optimum weights. Then
  \sum_{j=1}^{p} w_{oj} r_x(j,k) = r_{dx}(k), for k = 1, 2, ..., p
- This system of equations is known as the Wiener-Hopf equations. Its solution yields the optimum weights for the Wiener filter (spatial filter); a small numerical sketch is given below.
- Solving the Wiener-Hopf equations requires inverting the autocorrelation matrix formed by r_x(j,k), which can be computationally expensive.
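As an illustration, here is a minimal numpy sketch of solving the Wiener-Hopf equations. The data model (a noisy linear system with weights w_true) and all variable names are assumptions made for the example, not part of the slides.

```python
import numpy as np

# Minimal sketch: solve the Wiener-Hopf (normal) equations R_x w_o = r_dx
# for a p-tap filter, using synthetic example data.
rng = np.random.default_rng(0)
p, n = 3, 10_000

# Example data: inputs x and a desired response d that is a noisy linear
# function of x (the "unknown system" the filter should match).
x = rng.standard_normal((n, p))
w_true = np.array([0.5, -1.0, 2.0])
d = x @ w_true + 0.1 * rng.standard_normal(n)

# Estimated correlations r_x(j,k) = E[x_j x_k] and r_dx(k) = E[d x_k].
R_x = (x.T @ x) / n            # p x p autocorrelation matrix
r_dx = (x.T @ d) / n           # cross-correlation vector

# Optimum (Wiener) weights: solve R_x w_o = r_dx.
w_o = np.linalg.solve(R_x, r_dx)
print("Wiener solution:", w_o)  # close to w_true
```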

Method of Steepest Descent (1)
(figure slide)

Method of Steepest Descent (2)

Iteratively move in the direction of steepest descent (opposite to the gradient direction) until the minimum is approximately reached.
Let w_k(n) be the weight at iteration n. The gradient at iteration n is
  \nabla_{w_k} J(n) = -r_{dx}(k) + \sum_{j=1}^{p} w_j(n) r_x(j,k)
The adjustment applied to w_k(n) at iteration n is
  \Delta w_k(n) = -\eta \nabla_{w_k} J(n) = \eta [ r_{dx}(k) - \sum_{j=1}^{p} w_j(n) r_x(j,k) ]
where η is a positive learning-rate parameter.
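A minimal numpy sketch of this iteration follows; the synthetic data, step size, and iteration count are illustrative assumptions, mirroring the Wiener-filter sketch above.

```python
import numpy as np

# Minimal sketch of steepest descent on the mean-square-error surface.
# Here r_dx and R_x are assumed known (computed from example data).
rng = np.random.default_rng(0)
p, n = 3, 10_000
x = rng.standard_normal((n, p))
d = x @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.standard_normal(n)
R_x, r_dx = (x.T @ x) / n, (x.T @ d) / n

eta = 0.1                      # positive learning-rate parameter (illustrative)
w = np.zeros(p)                # initial weights w_k(0)
for _ in range(200):
    grad = -r_dx + R_x @ w     # gradient of J at iteration n
    w = w - eta * grad         # w(n+1) = w(n) - eta * gradient
print("Steepest-descent solution:", w)   # approaches the Wiener solution
```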

Method of Steepest Descent (3)

The cost function J(n) = \frac{1}{2} E[e^2(n)] is the ensemble average of the squared errors at instant n, taken over a population of identical filters.
An identical update rule can be derived when the cost function is J = \frac{1}{2} \sum_{i=1}^{n} e^2(i).
The method of steepest descent requires knowledge of the environment: specifically, the terms r_{dx}(k) and r_x(j,k) must be known.
What happens in an unknown environment?
- Use estimates -> the least-mean-square algorithm.

Least-Mean-Square Algorithm (1)

The LMS algorithm is based on instantaneous estimates of r_x(j,k) and r_{dx}(k):
  r'_x(j,k;n) = x_j(n) x_k(n)
  r'_{dx}(k;n) = x_k(n) d(n)
Substituting these estimates into the steepest-descent rule, the update becomes
  w'_k(n+1) = w'_k(n) + \eta [ x_k(n) d(n) - \sum_{j=1}^{p} w'_j(n) x_j(n) x_k(n) ]
            = w'_k(n) + \eta [ d(n) - \sum_{j=1}^{p} w'_j(n) x_j(n) ] x_k(n)
            = w'_k(n) + \eta [ d(n) - y(n) ] x_k(n), for k = 1, 2, ..., p
- This is also known as the delta rule or the Widrow-Hoff rule.
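A minimal numpy sketch of the LMS update follows; the synthetic data and step size are illustrative assumptions, as in the earlier sketches.

```python
import numpy as np

# Minimal sketch of the LMS (delta / Widrow-Hoff) rule: one weight update
# per sample, using only the instantaneous input x(n) and error e(n).
rng = np.random.default_rng(0)
p, n = 3, 10_000
x = rng.standard_normal((n, p))
d = x @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.standard_normal(n)

eta = 0.01                     # learning-rate parameter (illustrative)
w = np.zeros(p)                # weight estimates w'_k(0)
for i in range(n):
    y_i = w @ x[i]             # filter output y(n)
    e_i = d[i] - y_i           # error e(n) = d(n) - y(n)
    w = w + eta * e_i * x[i]   # w'(n+1) = w'(n) + eta * e(n) * x(n)
print("LMS estimate:", w)      # hovers around the Wiener solution
```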

LMS Algorithm (2)
(figure slide)

LMS vs. Method of Steepest Descent

LMS:
- Can operate in an unknown environment
- Can operate in stationary and non-stationary environments (optimum seeking and tracking)
- Minimizes the instantaneous squared error
- Stochastic
- Approximate

Steepest descent:
- Cannot operate in an unknown environment (r_x and r_dx must be known)
- Can operate in a stationary environment only (no adaptation or tracking)
- Minimizes the mean-square error (or the sum of squared errors)
- Deterministic
- Exact

Adaline (1)
(figure slide)

Adaline (2)

The Adaline (adaptive linear element) is an adaptive signal-processing / pattern-classification machine that uses the LMS algorithm. It was developed by Widrow and Hoff.
Inputs x are either -1 or +1, the threshold lies between 0 and 1, and the output is either -1 or +1.
The LMS algorithm is used to determine the weights. Instead of the output y, the net input u is used in the error computation, i.e., e = d - u (because y is quantized in the Adaline).
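A minimal training sketch follows; the logic-AND dataset, learning rate, and epoch count are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Minimal sketch of Adaline training on bipolar (+1/-1) patterns.
# The error is computed on the net input u, not on the quantized output y.
X = np.array([[-1, -1], [-1, +1], [+1, -1], [+1, +1]])   # bipolar inputs
d = np.array([-1, -1, -1, +1])                            # bipolar targets (AND)

eta = 0.1
w = np.zeros(2)
b = 0.0                                   # bias term playing the role of the threshold

for epoch in range(20):
    for x_n, d_n in zip(X, d):
        u = w @ x_n + b                   # net input u
        e = d_n - u                       # Adaline error: e = d - u
        w = w + eta * e * x_n             # LMS / Widrow-Hoff update
        b = b + eta * e

y = np.sign(X @ w + b)                    # quantized output y = sign(u)
print("weights:", w, "bias:", b, "outputs:", y)
```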