Least-Mean-Square Algorithm
CS/CMPE 537 – Neural Networks
Linear Adaptive Filter
A linear adaptive filter performs a linear transformation of a signal, with its parameters adjusted according to a performance measure that is minimized or maximized.
The development of LAFs followed the work of Rosenblatt (perceptron) and other early neural network researchers.
LAFs can be viewed as linear single-layer feedforward neural networks.
The least-mean-square (LMS) algorithm is a popular learning algorithm for LAFs (and linear single-layer networks).
Wide applicability: signal processing, control.
Historical Note
Linear associative memory (early 1970s): memory by association; a linear single-layer feedforward network.
Perceptron (late 1950s, early 1960s): pattern classification; a nonlinear single-layer feedforward network.
Linear adaptive filter, or Adaline (1960s): adaptive signal processing; a linear single-layer feedforward network.
Spatial Filter
Wiener-Hopf Equations (1)
The goal is to find the optimum weights that minimize the difference between the system output y and some desired response d in the mean-square sense.
System equations:
  y = Σ_{k=1}^p w_k x_k
  e = d - y
Performance measure (cost function):
  J = 0.5 E[e^2], where E is the expectation operator.
Find the optimum weights for which J is a minimum.
Wiener-Hopf Equations (2)
Substituting e = d - y and expanding:
  J = 0.5 E[d^2] - E[Σ_{k=1}^p w_k x_k d] + 0.5 E[Σ_{j=1}^p Σ_{k=1}^p w_j w_k x_j x_k]
Noting that expectation is a linear operator and the weights are constants:
  J = 0.5 E[d^2] - Σ_{k=1}^p w_k E[x_k d] + 0.5 Σ_{j=1}^p Σ_{k=1}^p w_j w_k E[x_j x_k]
Let r_d = E[d^2], r_dx(k) = E[d x_k], and r_x(j,k) = E[x_j x_k]. Then
  J = 0.5 r_d - Σ_{k=1}^p w_k r_dx(k) + 0.5 Σ_{j=1}^p Σ_{k=1}^p w_j w_k r_x(j,k)
To find the optimum weights, set the gradient to zero:
  ∇_{w_k} J = ∂J/∂w_k = -r_dx(k) + Σ_{j=1}^p w_j r_x(j,k) = 0, k = 1, 2, …, p
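For completeness, here is a worked version of the differentiation step that the slide compresses, using the same symbols defined above (this step is filled in here, not spelled out on the slide):

```latex
\frac{\partial J}{\partial w_k}
  = \frac{\partial}{\partial w_k}\!\left[\tfrac{1}{2} r_d
      - \sum_{k'=1}^{p} w_{k'}\, r_{dx}(k')
      + \tfrac{1}{2}\sum_{j=1}^{p}\sum_{k'=1}^{p} w_j w_{k'}\, r_x(j,k')\right]
  = -\,r_{dx}(k) + \sum_{j=1}^{p} w_j\, r_x(j,k)
```

The factor 0.5 on the quadratic term cancels because differentiating with respect to w_k picks up both the j = k and k' = k contributions, and r_x(j,k) = r_x(k,j) by symmetry.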
Wiener-Hopf Equations (3)
Let w_ok denote the optimum weights. Then
  Σ_{j=1}^p w_oj r_x(j,k) = r_dx(k), k = 1, 2, …, p
This system of equations is known as the Wiener-Hopf equations. Its solution yields the optimum weights of the Wiener filter (spatial filter).
Solving the Wiener-Hopf equations requires inverting the autocorrelation matrix r_x(j,k), which can be computationally expensive.
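As a rough numerical illustration (not from the slides), the Wiener-Hopf system can be solved from sample estimates of the correlations; the function and variable names below are illustrative assumptions:

```python
import numpy as np

def wiener_solution(X, d):
    """Estimate the optimum (Wiener) weights from samples.

    X : (n_samples, p) array of input vectors x(n)
    d : (n_samples,) array of desired responses d(n)
    """
    n = X.shape[0]
    R_x = (X.T @ X) / n      # sample estimate of r_x(j, k) = E[x_j x_k]
    r_dx = (X.T @ d) / n     # sample estimate of r_dx(k) = E[d x_k]
    # Solve R_x w_o = r_dx rather than explicitly inverting R_x,
    # which is cheaper and numerically safer.
    return np.linalg.solve(R_x, r_dx)

# Example: recover the weights of a noisy linear system.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))
w_true = np.array([0.5, -1.0, 2.0])
d = X @ w_true + 0.01 * rng.standard_normal(1000)
print(wiener_solution(X, d))   # approximately [0.5, -1.0, 2.0]
```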
Method of Steepest Descent (1)
Method of Steepest Descent (2)
Iteratively move in the direction of steepest descent (opposite the gradient direction) until the minimum is reached approximately.
Let w_k(n) be the weight at iteration n. The gradient at iteration n is
  ∇_{w_k} J(n) = -r_dx(k) + Σ_{j=1}^p w_j(n) r_x(j,k)
The adjustment applied to w_k(n) at iteration n is
  Δw_k(n) = -η ∇_{w_k} J(n) = η[r_dx(k) - Σ_{j=1}^p w_j(n) r_x(j,k)]
where η is a positive learning-rate parameter.
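A minimal sketch of the steepest-descent iteration, assuming the statistics R_x and r_dx are known exactly; the function name, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

def steepest_descent(R_x, r_dx, eta=0.05, n_iters=500):
    """Iterate w(n+1) = w(n) - eta * grad J(n) toward the Wiener solution."""
    p = len(r_dx)
    w = np.zeros(p)
    for _ in range(n_iters):
        grad = -r_dx + R_x @ w   # gradient of J with respect to w
        w = w - eta * grad       # move opposite the gradient direction
    return w
```

For a sufficiently small η (smaller than 2 divided by the largest eigenvalue of R_x), the iteration converges to the same weights that solve the Wiener-Hopf equations.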
Method of Steepest Descent (3)
The cost function J(n) = 0.5 E[e^2(n)] is the ensemble average of the squared errors at instant n, drawn from a population of identical filters.
An identical update rule can be derived when the cost function is J = 0.5 Σ_{i=1}^n e^2(i).
The method of steepest descent requires knowledge of the environment: the terms r_dx(k) and r_x(j,k) must be known.
What happens in an unknown environment? Use instantaneous estimates, which leads to the least-mean-square algorithm.
Least-Mean-Square Algorithm (1)
The LMS algorithm is based on instantaneous estimates of r_x(j,k) and r_dx(k):
  r'_x(j,k; n) = x_j(n) x_k(n)
  r'_dx(k; n) = x_k(n) d(n)
Substituting these estimates, the update rule becomes
  w'_k(n+1) = w'_k(n) + η[x_k(n) d(n) - Σ_{j=1}^p w'_j(n) x_j(n) x_k(n)]
            = w'_k(n) + η[d(n) - Σ_{j=1}^p w'_j(n) x_j(n)] x_k(n)
            = w'_k(n) + η[d(n) - y(n)] x_k(n), k = 1, 2, …, p
This is also known as the delta rule or the Widrow-Hoff rule.
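A minimal sketch of the LMS (delta rule) update above, processing one sample per iteration; the learning-rate value and names are illustrative assumptions:

```python
import numpy as np

def lms(X, d, eta=0.01):
    """Least-mean-square filtering: one weight update per sample.

    X : (n_samples, p) input vectors x(n)
    d : (n_samples,) desired responses d(n)
    Returns the final weight estimates and the per-sample errors.
    """
    n, p = X.shape
    w = np.zeros(p)
    errors = np.empty(n)
    for i in range(n):
        y = w @ X[i]            # filter output y(n)
        e = d[i] - y            # instantaneous error e(n)
        w = w + eta * e * X[i]  # delta / Widrow-Hoff update
        errors[i] = e
    return w, errors
```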
LMS Algorithm (2)
LMS vs. Method of Steepest Descent

| LMS | Steepest Descent |
| --- | --- |
| Can operate in an unknown environment | Cannot operate in an unknown environment (r_x and r_dx must be known) |
| Can operate in stationary and non-stationary environments (optimum seeking and tracking) | Can operate in a stationary environment only (no adaptation or tracking) |
| Minimizes the instantaneous squared error | Minimizes the mean-square error (or sum of squared errors) |
| Stochastic | Deterministic |
| Approximate | Exact |
Adaline (1)
Adaline (2)
The Adaline (adaptive linear element) is an adaptive signal-processing / pattern-classification machine that uses the LMS algorithm. It was developed by Widrow and Hoff.
Inputs x are either -1 or +1, the threshold lies between 0 and 1, and the output is either -1 or +1.
The LMS algorithm is used to determine the weights. Instead of the output y, the net input u is used in the error computation, i.e., e = d - u (because y is quantized in the Adaline).
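A sketch of Adaline training under the description above, with bipolar inputs/targets and the error computed from the net input u; folding the threshold into a bias weight is an implementation assumption for illustration:

```python
import numpy as np

def train_adaline(X, d, eta=0.01, epochs=20):
    """Train an Adaline: inputs and targets are -1/+1, the error uses the net input u."""
    n, p = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append 1 so the last weight acts as the threshold (bias)
    w = np.zeros(p + 1)
    for _ in range(epochs):
        for i in range(n):
            u = w @ Xb[i]             # net input (before quantization)
            e = d[i] - u              # error uses u, not the quantized output y
            w = w + eta * e * Xb[i]   # LMS / Widrow-Hoff update
    return w

def adaline_output(w, x):
    """Quantized Adaline output: +1 if the net input is positive, else -1."""
    u = w @ np.append(x, 1.0)
    return 1 if u > 0 else -1
```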