6.6 The Marquardt Algorithm



Presentation on theme: "6.6 The Marquardt Algorithm"— Presentation transcript:

1 6.6 The Marquardt Algorithm
- limitations of the gradient and Taylor expansion methods
- recasting the Taylor expansion in terms of chi-square derivatives
- recasting the gradient search into an iterative matrix formalism
- Marquardt's algorithm automatically combines the gradient and Taylor expansion methods
- program description and example output for the Gaussian peak

2 Limitations of Previous Methods
The gradient search works well when the slope of chi-square space is steep, but it "flounders" near the minimum because the gradient approaches zero.
A second-order Taylor series expansion works best when the initial guesses lie within a region of chi-square space having positive curvature. When the initial estimates are in a region of negative curvature, the iterative procedure diverges.
The Marquardt algorithm automatically switches between the gradient search and the Taylor expansion to enhance convergence to the minimum in chi-square space.

3 Taylor Series Derivatives
One of the biggest drawbacks of the second-order Taylor expansion is the need to provide a functional form for the first and second derivatives. This is especially acute when writing a computer program. Marquardt showed that these functions can be replaced by numeric differentiation of chi-square.
Additionally, Marquardt pointed out that the Δyᵢ term in the last equation sums to a small value as N increases, since the deviations have a pdf with a mean of zero.
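One way to realize this numerically is with central differences. The following is a sketch, not the worksheet's exact formulas, assuming the chapter's definitions βᵣ = −½ ∂χ²/∂aᵣ and αᵣ,ₛ = ½ ∂²χ²/∂aᵣ∂aₛ and a derivative step size Δ:

\[
\beta_r \approx -\frac{\chi^2(a_r+\Delta) - \chi^2(a_r-\Delta)}{4\,\Delta},
\qquad
\alpha_{r,r} \approx \frac{\chi^2(a_r+\Delta) - 2\chi^2(a_r) + \chi^2(a_r-\Delta)}{2\,\Delta^2},
\]
\[
\alpha_{r,s} \approx \frac{\chi^2(a_r+\Delta,\,a_s+\Delta) - \chi^2(a_r+\Delta,\,a_s-\Delta)
 - \chi^2(a_r-\Delta,\,a_s+\Delta) + \chi^2(a_r-\Delta,\,a_s-\Delta)}{8\,\Delta^2}
\quad (r \neq s).
\]

Only chi-square evaluations are required, which is what makes the approach convenient in a program.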

4 Gradient Search Matrix Formalism
The second important thing Marquardt recognized was that the gradient search can be written as an iterative matrix problem, just like the Taylor series expansion. The matrix solution is δ = Δa·β, where δ and β are the previously defined vectors, and Δa is a diagonal matrix containing the step sizes along each coefficient axis in chi-square space.
Since χ² is unitless, each βᵣ has units of 1/aᵣ. Since each δᵣ must have units of aᵣ, each step size, Δaᵣ,ᵣ, has to have units of aᵣ². The most natural choice for the step sizes is one proportional to the coefficient variance, (α⁻¹)ᵣ,ᵣ. Finally, the step should be some small fraction of the variance, (α⁻¹)ᵣ,ᵣ/λ, where λ > 1. The Δa matrix is then obtained by multiplying the diagonal of α by the constant λ, where λ serves the same role as the factor f in the gradient search.
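Collecting these statements into one place (a summary in the chapter's notation; the slide's own display equation did not survive extraction):

\[
\boldsymbol{\delta} = \Delta a\,\boldsymbol{\beta},
\qquad
\Delta a_{r,r} = \frac{(\alpha^{-1})_{r,r}}{\lambda},\;\; \lambda > 1,
\qquad\text{so}\qquad
\delta_r = \frac{(\alpha^{-1})_{r,r}}{\lambda}\,\beta_r .
\]

Each component of the step points downhill (the direction of βᵣ) with a length controlled by λ, exactly as the factor f controls the step length in the gradient search.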

5 Gradient Search Matrix Formalism
Quick review of the gradient search. The gradient of χ² has the components ∂χ²/∂aᵣ. The step size is proportional to the magnitude of the gradient, weighted by the empirical factor f = 0.01. But (remarkably) the step can also be written as δ = Δa·β, where Δa is a diagonal matrix.

6 Matrix Solution
Marquardt [Journal of the Society for Industrial and Applied Mathematics, 1963, vol. 11, pp. 431-441] demonstrated that the second-order expansion and the gradient search could be combined into one mathematical operation. To do this, define a new α' matrix. When λ << 1, the solution corresponds to a Taylor expansion. When λ >> 1, the solution corresponds to a gradient search.
The calculation uses the standard iterative procedure, where the subscript cur denotes the current guesses and the subscript new denotes the improved estimates of the coefficient values. Iteration involves substituting the new guesses for the current guesses and repeating the matrix algebra.
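Written out explicitly (a reconstruction in the standard Marquardt form, consistent with the chapter's notation; the slide's original display equations are not in the transcript):

\[
\alpha'_{j,j} = \alpha_{j,j}\,(1+\lambda),
\qquad
\alpha'_{j,k} = \alpha_{j,k} \;\; (j \neq k),
\]
\[
\boldsymbol{\beta} = \alpha'\,\boldsymbol{\delta}
\quad\Rightarrow\quad
\boldsymbol{\delta} = (\alpha')^{-1}\boldsymbol{\beta},
\qquad
\mathbf{a}_{\mathrm{new}} = \mathbf{a}_{\mathrm{cur}} + \boldsymbol{\delta}.
\]

For λ → 0 these equations reduce to the second-order Taylor expansion; for large λ the diagonal dominates and each δⱼ ≈ βⱼ/(λ αⱼ,ⱼ), which is a gradient-search step.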

7 The Algorithm
1. Compute chi-square at the initial guesses.
2. Assemble the α matrix and β vector using the partial derivatives.
3. Start with λ equal to a small number, say 10⁻³, and assemble the α' matrix - the fit will start with a Taylor expansion.
4. Solve the matrix equation for a_new and compute chi-square at a_new.
5. (a) If χ²(a_new) ≥ χ²(a_cur), multiply λ by 10, reassemble α', and repeat step 4 with a_cur - make the fit more like a gradient search.
   (b) If χ²(a_new) < χ²(a_cur), divide λ by 10, reassemble α', and repeat step 4 with a_cur = a_new - make the fit more like an expansion.
   (c) If χ²(a_new) ≅ χ²(a_cur), stop the iterations and use a_cur as the estimates of the coefficients - the minimum has been found.
6. Set λ = 0 and compute α⁻¹ at a_cur. Use the diagonal elements to obtain the variances of the coefficients.
A code sketch of these steps follows the list.
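Below is a minimal NumPy sketch of steps 1-6. It is an illustration only, not the Mathcad worksheet's code: the function names (chisq, beta_alpha, marquardt), the convergence tolerance, and the use of central differences for the chi-square derivatives are all assumptions.

```python
import numpy as np

def chisq(y, x, a, model):
    """Chi-square (sum of squared deviations) at the coefficient vector a."""
    return np.sum((y - model(x, a)) ** 2)

def beta_alpha(y, x, a, d, model):
    """beta_r = -1/2 d(chi2)/da_r and alpha_rs = 1/2 d2(chi2)/da_r da_s,
    estimated with central differences of step size d (the Delta of the text)."""
    n = len(a)
    beta = np.zeros(n)
    alpha = np.zeros((n, n))
    chi0 = chisq(y, x, a, model)
    for r in range(n):
        ap, am = a.copy(), a.copy()
        ap[r] += d
        am[r] -= d
        chip, chim = chisq(y, x, ap, model), chisq(y, x, am, model)
        beta[r] = -(chip - chim) / (4.0 * d)
        alpha[r, r] = (chip - 2.0 * chi0 + chim) / (2.0 * d ** 2)
        for s in range(r + 1, n):
            app, apm, amp, amm = ap.copy(), ap.copy(), am.copy(), am.copy()
            app[s] += d
            apm[s] -= d
            amp[s] += d
            amm[s] -= d
            alpha[r, s] = alpha[s, r] = (
                chisq(y, x, app, model) - chisq(y, x, apm, model)
                - chisq(y, x, amp, model) + chisq(y, x, amm, model)
            ) / (8.0 * d ** 2)
    return beta, alpha

def marquardt(y, x, a0, d, model, lam=1e-3, tol=1e-8, max_iter=200):
    """Steps 1-6: adjust lambda until chi-square stops changing."""
    a_cur = np.asarray(a0, dtype=float)
    chi_cur = chisq(y, x, a_cur, model)                    # step 1
    iters = 0
    while iters < max_iter:
        iters += 1
        beta, alpha = beta_alpha(y, x, a_cur, d, model)    # step 2
        alpha_p = alpha + lam * np.diag(np.diag(alpha))    # step 3: assemble alpha'
        delta = np.linalg.solve(alpha_p, beta)             # step 4: solve for the step
        a_new = a_cur + delta
        chi_new = chisq(y, x, a_new, model)
        if abs(chi_new - chi_cur) < tol:                   # step 5(c): converged
            break
        if chi_new >= chi_cur:                             # step 5(a): more gradient-like
            lam *= 10.0
        else:                                              # step 5(b): accept, more Taylor-like
            lam /= 10.0
            a_cur, chi_cur = a_new, chi_new
    # step 6: lambda = 0, diagonal of alpha^-1 gives the coefficient variances
    _, alpha = beta_alpha(y, x, a_cur, d, model)
    variances = np.diag(np.linalg.inv(alpha))
    return a_cur, chi_cur, lam, iters, variances
```

The returned tuple mirrors the output described for the worksheet's marquart function (coefficients at the minimum, chi-square at the minimum, final λ, iteration count), plus the step-6 variances.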

8 Marquardt Program Description
A program that automatically computes the minimum in chi-square space is shown in the Mathcad worksheet, "6.6 Marquardt Algorithm.mcd". It uses six functions with a variety of inputs:
- f(x,a): inputs are the x-data and the coefficient guesses, a; output is the corresponding y-value as a scalar.
- chisqr(y,x,a): inputs are the x- and y-data and the a-coefficients; output is chi-square at the location given by a.
- beta(y,x,a,Δ): inputs are the data, the coefficients, and the step size, Δ, used to compute the derivatives; output is the β vector, which contains −0.5 times the chi-square first derivatives.
- alpha(y,x,a,Δ): computes the derivatives; output is the α matrix, which contains the second-order derivatives.
- setl(α,λ): computes the α' matrix given α and λ.
- marquart(y,x,a,Δ): executes the Marquardt algorithm; output is a vector containing the coefficient values at the minimum, the value of chi-square at the minimum, the final value of λ, and the number of required iterations.
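For comparison only (this is not part of the worksheet), the same kind of Gaussian-peak fit can be run with SciPy, whose curve_fit routine uses a Levenberg-Marquardt minimizer for unconstrained problems. The model form, the synthetic data, and the noise level below are placeholder assumptions, not the worksheet's actual example; the numbers 2 and 51 merely echo the guesses quoted on the next slide, and their assignment to width and center here is itself an assumption.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, center, width):
    # placeholder unit-height Gaussian peak, two coefficients for illustration
    return np.exp(-((x - center) ** 2) / (2.0 * width ** 2))

rng = np.random.default_rng(0)
x = np.linspace(40.0, 60.0, 101)
y = gaussian(x, 51.0, 2.0) + rng.normal(scale=0.005, size=x.size)

popt, pcov = curve_fit(gaussian, x, y, p0=[51.0, 2.0])
print(popt)                      # fitted coefficients
print(np.sqrt(np.diag(pcov)))    # their standard deviations
```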

9 Marquardt Program Output
Start with the initial guesses, a₀ = 2 and a₁ = 51, with Δ set to the largest value that gives a stable derivative. Three iterations were required. λ = 0.001·0.1³ = 10⁻⁶, which means the search was always a Taylor expansion. The coefficients have the lowest χ² that we have seen.
Test the function in a region of negative curvature, a₀ = 6 and a₁ = … . The large number of iterations indicates that the algorithm spent a significant fraction of the time performing a gradient search.

10 Coefficient Errors
The coefficients were set to those at the minimum, a₀ = … and a₁ = … . The alpha matrix was recalculated with λ = 0. The α⁻¹ matrix yielded the diagonal elements, (α⁻¹)₀,₀ = … and (α⁻¹)₁,₁ = … . The estimated variance of the fit was s² = 2.73×10⁻⁵. The coefficient variances can be computed from these two terms.
All of the methods gave statistically indistinguishable results for the least-squares coefficients. The Taylor and Marquardt methods gave statistically indistinguishable standard deviations for the coefficients. The grid search is slow but easy to implement manually; the Marquardt algorithm is the most tolerant of bad guesses.
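Written out, the combination of these two terms is as follows (a reconstruction assuming the conventions used earlier in the chapter, with N data points and n fitted coefficients):

\[
s^2 = \frac{\chi^2_{\min}}{N - n},
\qquad
\sigma_{a_r}^2 = s^2\,(\alpha^{-1})_{r,r},
\qquad
\sigma_{a_r} = \sqrt{s^2\,(\alpha^{-1})_{r,r}} .
\]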

