6.3 Gradient Search of Chi-Square Space

Presentation transcript:

6.3 Gradient Search of Chi-Square Space
- example data are from the same Gaussian peak used in Section 6.2
- the basic gradient search strategy is outlined
- the mathematical details of estimating the gradient are given
- an automatic Mathcad program is described
- the ease with which the gradient search can traverse "tilted" chi-square space is shown
6.3 : 1/8

Gaussian Example Continued
This is the same example used in Section 6.2: find the least-squares coefficients that minimize chi-square for the Gaussian equation. The following (x, y) data were collected across the peak:
(45, 0.001)  (46, 0.010)  (47, 0.037)  (48, 0.075)
(49, 0.120)  (50, 0.178)  (51, 0.184)  (52, 0.160)
(53, 0.126)  (54, 0.064)  (55, 0.034)
The initial parameter guesses were g1,0 = 2.00 and g1,1 = 51.0.
6.3 : 2/8
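For readers following along outside Mathcad, the sketch below sets up the example in Python/NumPy. The exact Gaussian form and the unit weighting of chi-square are assumptions (the slides do not state them); a unit-area Gaussian with a[0] as the standard deviation and a[1] as the center is consistent with the fitted values quoted later (2.139, 50.937).

```python
import numpy as np

# (x, y) data from the slide, collected across the Gaussian peak.
x = np.array([45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55], dtype=float)
y = np.array([0.001, 0.010, 0.037, 0.075, 0.120, 0.178,
              0.184, 0.160, 0.126, 0.064, 0.034])

def f(x, a):
    """Model y-value (vectorized over x here).

    Assumed form: unit-area Gaussian, a[0] = standard deviation,
    a[1] = center.  This is an inference, not stated on the slide.
    """
    return np.exp(-(x - a[1])**2 / (2.0 * a[0]**2)) / (a[0] * np.sqrt(2.0 * np.pi))

def chisqr(y, x, a):
    """Chi-square at the location a in coefficient space.

    Unit weights are assumed; with known standard deviations each
    residual would be divided by sigma_i.
    """
    return np.sum((y - f(x, a))**2)

# Initial guesses from the slide.
a_guess = np.array([2.00, 51.0])
```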

Basic Strategy
- Start with the first guesses, g1,0 and g1,1, and compute chi-square.
- Choose a value of Δa for each axis that will give the desired parameter resolution.
- At the location (g1,0, g1,1), compute the local direction of steepest descent (the negative of the mathematical gradient).
- Continue in the direction of steepest descent until the value of chi-square no longer decreases. The values of the two coefficients producing the lowest chi-square are now the second guesses, g2,0 and g2,1.
- Re-compute the local direction of steepest descent at the new location, (g2,0, g2,1), and again continue along it until chi-square no longer decreases. This gives the next guess at the coefficients.
- Repeat the process at each new location until chi-square space becomes flat, i.e. the gradient is zero.
6.3 : 3/8
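A minimal Python sketch of this outer loop is given below. It is not the Mathcad worksheet; it reuses chisqr() from the sketch above and relies on dir() and move(), which are sketched under the next two slides.

```python
def grad(y, x, a, res, max_iter=200):
    """Gradient search: repeat steepest-descent legs from the initial
    guesses a until chi-square space becomes flat (no further move).
    res holds the desired resolution (delta-a) for each coefficient axis.
    """
    a = np.asarray(a, dtype=float)
    for _ in range(max_iter):
        a_new = move(y, x, a, res)     # one leg of steepest descent (sketched below)
        if np.allclose(a_new, a):      # no move possible: gradient is essentially zero
            return a_new
        a = a_new
    return a                           # iteration limit reached; return best point found
```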

Mathematical Details
The direction of steepest descent is given by the negative of the gradient. For two coefficients this direction is

    -∇χ² = -( ∂χ²/∂a0 , ∂χ²/∂a1 )

Often the coefficient resolution, Δa, is too large to obtain reasonable estimates of the derivatives. Instead, the partial derivatives are estimated using a fraction of the resolution, fΔa, where 0 < f < 1; a reasonable value is f = 0.01 to f = 0.1:

    ∂χ²/∂ai ≈ [ χ²(ai + fΔai) - χ²(ai) ] / (fΔai)

The step sizes have to be adjusted for each parameter so that travel is in the direction of steepest descent. With L the length of the descent vector, the step δi along each parameter axis is

    L = [ (∂χ²/∂a0)² + (∂χ²/∂a1)² ]^½        δi = -Δai (∂χ²/∂ai) / L

6.3 : 4/8
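Continuing the sketch, the function below implements this prescription: forward differences over the fraction f of the resolution, then scaling each step by Δa/L. It plays the role of the worksheet's dir() function (the name shadows Python's built-in dir, which is harmless here); the resolution is treated as one value per coefficient axis.

```python
def dir(y, x, a, res, frac=0.05):
    """Coefficient step sizes that follow the steepest descent.

    Partial derivatives of chi-square are estimated with forward
    differences over frac * delta-a (0 < frac < 1), then each step is
    scaled so that delta_i = -res_i * (dchi2/da_i) / L, where L is the
    length of the gradient vector.
    """
    a = np.asarray(a, dtype=float)
    res = np.broadcast_to(np.asarray(res, dtype=float), a.shape)
    partials = np.zeros_like(a)
    chi0 = chisqr(y, x, a)
    for i in range(a.size):
        a_trial = a.copy()
        a_trial[i] += frac * res[i]
        partials[i] = (chisqr(y, x, a_trial) - chi0) / (frac * res[i])
    L = np.sqrt(np.sum(partials**2))    # length of the descent (gradient) vector
    if L == 0.0:
        return np.zeros_like(a)         # locally flat: no preferred direction
    return -res * partials / L          # step size along each coefficient axis
```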

Gradient Search Program Description
A program that automatically computes the minimum in chi-square space is shown in the Mathcad worksheet, "6.3 Gradient Search Program.mcd". It uses five functions, with x, y and a as vector inputs and res as a scalar input:
- f(x,a): inputs are the x-data and the initial coefficient guesses, a; output is the corresponding y-value as a scalar.
- chisqr(y,x,a): inputs are the x- and y-data and the a-coefficients; output is chi-square at the location given by a.
- dir(y,x,a,res): inputs are the data, coefficients, and the resolution along all coefficient axes, res; output is a vector containing the coefficient step sizes that will follow the steepest descent.
- move(y,x,a,res): moves along the direction of steepest descent until the gradient is zero; output is the a-vector at the minimum.
- grad(y,x,a,res): performs a gradient search using the initial guesses given in the a-vector at the specified resolution; output is a vector containing the coefficient values at the minimum.
6.3 : 5/8
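The Mathcad worksheet itself does not appear in the transcript, so the sketch below only fills in the remaining piece, a move() analogue. The split of work assumed here (move() follows one fixed descent direction until chi-square stops decreasing, while grad(), sketched earlier, strings such legs together) follows the Basic Strategy slide and may differ in detail from the worksheet's internal organization.

```python
def move(y, x, a, res, max_steps=10000):
    """Step repeatedly along the current direction of steepest descent
    until chi-square no longer decreases; return the best a-vector found.
    """
    a = np.asarray(a, dtype=float)
    step = dir(y, x, a, res)        # descent direction, fixed for this leg
    if np.all(step == 0.0):
        return a                    # already at a flat point
    chi_best = chisqr(y, x, a)
    for _ in range(max_steps):
        a_trial = a + step
        chi_trial = chisqr(y, x, a_trial)
        if chi_trial >= chi_best:   # chi-square stopped decreasing
            break
        a, chi_best = a_trial, chi_trial
    return a
```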

Gradient Search Program Output
Start with the initial guesses and a resolution about 100 times finer than the resolution of the guesses. This result matches the manual search. If higher resolution is desired, use the above output as the initial guesses. Note that these values are slightly different from those obtained with the grid search, (2.14100, 50.93800), because the gradient search can locate points "off the grid." The minimum chi-square is the same value.
6.3 : 6/8
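A usage example under the same assumptions: the resolution values below are illustrative (the worksheet output shown on the slide, with its actual numbers, is not reproduced in the transcript), and the second, finer pass is seeded with the first result, as the slide suggests.

```python
# First pass from the initial guesses on slide 2 (illustrative resolutions).
fit1 = grad(y, x, np.array([2.00, 51.0]), res=np.array([0.01, 0.01]))

# Second pass at higher resolution, using the first output as the new guesses.
fit2 = grad(y, x, fit1, res=np.array([0.001, 0.001]))

print(fit2, chisqr(y, x, fit2))  # should land near (2.139, 50.937) for the assumed model
```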

Chi-Square Space
The contour graph shows the route taken by the gradient search to locate the minimum. It took two moves to reach the minimum, but the second move covered a distance too small to show up on the graph. Note that the grid lines are not followed. The minimum χ² at the resolution used for the graph (Δa0 = 0.01, Δa1 = 0.01) occurs at a0 = 2.139 and a1 = 50.937, with χ²min = 0.0002459.
6.3 : 7/8

Gradient Search with Covariance
The gradient search of "tilted" chi-square space works much better than a grid search. The grid search required 23 changes of coefficient axis while locating the minimum; the gradient search found the minimum with only three changes in direction!
6.3 : 8/8