5.2 Least-Squares Fit to a Straight Line

- Total probability and chi-square
- Minimizing chi-square for a straight line
- Solving the determinants
- Example least-squares fit
- Weighted least-squares with an example
- Least-squares with counting errors
- Counting example with an iterative solution
Probability for $y_i = a_0 + a_1 x_i$

The dependent variable, y, is related to the independent variable, x, by an equation representing a straight line. If the true values of the coefficients were known (assume x is error-free), the mean of y would be given by

$$ \mu = a_0 + a_1 x $$

To test some hypotheses, estimates of the coefficients $a_0$ and $a_1$ are needed. To obtain these estimates, N values of x are chosen and the corresponding values of y measured. It is assumed that the error in each measured y is described by a normal pdf and that the magnitude of this error, $\sigma$, can vary with x. For any one chosen value $x = x_i$, the probability that some particular value, $y_i$, would be measured is

$$ p_i = \frac{1}{\sigma_i\sqrt{2\pi}}\,\exp\!\left[-\frac{(y_i - a_0 - a_1 x_i)^2}{2\sigma_i^2}\right] $$
Total Probability and Chi-Square

Now consider the case where N measurements are made. The method of maximum likelihood is applied to the total probability of the N measurements, in the belief that this procedure provides the most probable estimates of the coefficients.

$$ p_{\mathrm{total}} = \prod_{i=1}^{N} p_i = \left[\prod_{i=1}^{N}\frac{1}{\sigma_i\sqrt{2\pi}}\right]\exp\!\left[-\frac{1}{2}\sum_{i=1}^{N}\frac{(y_i - a_0 - a_1 x_i)^2}{\sigma_i^2}\right] $$

Only the summation inside the exponential contains $a_0$ and $a_1$. As a result, only this term matters when maximizing $p_{\mathrm{total}}$ with respect to $a_0$ and $a_1$. Maximization is achieved by minimizing the summation within the exponent, a procedure called least-squares. This particular summation is so common that it is given its own name and symbol: chi-square, $\chi^2$.

$$ \chi^2 = \sum_{i=1}^{N}\frac{(y_i - a_0 - a_1 x_i)^2}{\sigma_i^2} $$
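To make the definition concrete, here is a minimal numerical sketch (Python with NumPy; the slides themselves contain no code) that evaluates $\chi^2$ for trial coefficients. It uses the data from the unweighted example later in this section and assumes unit standard deviations for every point.

```python
import numpy as np

def chi_square(a0, a1, x, y, sigma):
    """Chi-square for the straight-line model y = a0 + a1*x."""
    r = (y - (a0 + a1 * x)) / sigma
    return np.sum(r**2)

# Data from the unweighted example later in this section.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([16.88, 9.42, 2.41, -5.28, -13.23])
sigma = np.ones_like(y)                        # assumed equal, unit errors

print(chi_square(24.51, -7.49, x, y, sigma))   # ~0.28 at the least-squares fit
print(chi_square(24.53, -7.22, x, y, sigma))   # ~4.5 at the true coefficients
```

Chi-square is smaller at the fitted coefficients than at the true ones, anticipating the point made in the chi-square-minimum example below.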
Minimizing Chi-Square

For a straight line, chi-square has the form

$$ \chi^2 = \sum_{i=1}^{N}\frac{(y_i - a_0 - a_1 x_i)^2}{\sigma_i^2} $$

The method of maximum likelihood requires that chi-square be minimized with respect to the two coefficients. This is done by taking the partial derivative with respect to each coefficient and setting the expression equal to zero.

$$ \frac{\partial \chi^2}{\partial a_0} = -2\sum_{i=1}^{N}\frac{y_i - a_0 - a_1 x_i}{\sigma_i^2} = 0 \qquad \frac{\partial \chi^2}{\partial a_1} = -2\sum_{i=1}^{N}\frac{x_i\,(y_i - a_0 - a_1 x_i)}{\sigma_i^2} = 0 $$
Minimizing Chi-Square

Rearranging gives a system of two equations in the two unknowns ($a_0$, $a_1$):

$$ \sum\frac{y_i}{\sigma_i^2} = a_0\sum\frac{1}{\sigma_i^2} + a_1\sum\frac{x_i}{\sigma_i^2} \qquad \sum\frac{x_i y_i}{\sigma_i^2} = a_0\sum\frac{x_i}{\sigma_i^2} + a_1\sum\frac{x_i^2}{\sigma_i^2} $$

All of the summations are known constants calculable from the experimental data set. The two equations above can be rewritten using matrix notation:

$$ \begin{pmatrix} \sum y_i/\sigma_i^2 \\ \sum x_i y_i/\sigma_i^2 \end{pmatrix} = \begin{pmatrix} \sum 1/\sigma_i^2 & \sum x_i/\sigma_i^2 \\ \sum x_i/\sigma_i^2 & \sum x_i^2/\sigma_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} $$

or, more compactly, $\mathbf{b} = \mathbf{M}\,\mathbf{a}$.
Minimizing Chi-Square

Now, we just need to invert the equation to isolate and solve for the most probable coefficients $\mathbf{a}$:

$$ \mathbf{a} = \mathbf{M}^{-1}\mathbf{b} $$

For a 2×2 matrix, the inverse is straightforward to calculate:

$$ \mathbf{M}^{-1} = \frac{1}{\det\mathbf{M}}\begin{pmatrix} M_{22} & -M_{12} \\ -M_{21} & M_{11} \end{pmatrix} $$

Substituting back in the explicit expressions for the elements of the vectors and matrices yields

$$ a_0 = \frac{1}{\Delta}\left(\sum\frac{x_i^2}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{x_i y_i}{\sigma_i^2}\right) \qquad a_1 = \frac{1}{\Delta}\left(\sum\frac{1}{\sigma_i^2}\sum\frac{x_i y_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2}\right) $$

where the determinant is

$$ \Delta = \sum\frac{1}{\sigma_i^2}\sum\frac{x_i^2}{\sigma_i^2} - \left(\sum\frac{x_i}{\sigma_i^2}\right)^2 $$
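The explicit solution above translates directly into code. The sketch below (Python/NumPy, an illustration rather than anything from the slides) solves the normal equations with the 2×2 inverse for arbitrary per-point standard deviations.

```python
import numpy as np

def line_fit(x, y, sigma):
    """Most probable (a0, a1) for y = a0 + a1*x with per-point errors sigma,
    via the explicit 2x2 inverse of the normal-equation matrix M."""
    w = 1.0 / sigma**2                    # 1/sigma_i^2 appears in every sum
    S, Sx, Sxx = w.sum(), (w * x).sum(), (w * x**2).sum()
    Sy, Sxy = (w * y).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2               # determinant of M
    a0 = (Sxx * Sy - Sx * Sxy) / delta
    a1 = (S * Sxy - Sx * Sy) / delta
    return a0, a1
```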
Solving the Determinant (2)

In most experiments, the noise on y does not depend upon x, and all $\sigma_i = \sigma$. The common factor of $1/\sigma^2$ then cancels from numerator and denominator, and the un-weighted least-squares solution is given by the following, where the subscripts have been left off to simplify writing:

$$ a_0 = \frac{\sum x^2 \sum y - \sum x \sum xy}{\Delta} \qquad a_1 = \frac{N\sum xy - \sum x \sum y}{\Delta} \qquad \Delta = N\sum x^2 - \left(\sum x\right)^2 $$

Note that the units of $\Delta$ are now x², giving $a_0$ units of y and $a_1$ units of y/x.
Example Unweighted Least-Squares

N = 5 values of x were selected and the value of y measured:

(1, 16.88) (2, 9.42) (3, 2.41) (4, −5.28) (5, −13.23)

The summations were computed as: $\sum x = 15$, $\sum x^2 = 55$, $\sum y = 10.20$, $\sum xy = -44.33$

The graph shows the raw data plus the regression line, $y = 24.51 - 7.49x$.
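As a check, a short script (Python/NumPy, not part of the slides) reproduces the summations and coefficients; the last digits can differ slightly because the tabulated y values are rounded.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([16.88, 9.42, 2.41, -5.28, -13.23])

N = len(x)
Sx, Sxx, Sy, Sxy = x.sum(), (x**2).sum(), y.sum(), (x * y).sum()
delta = N * Sxx - Sx**2
a0 = (Sxx * Sy - Sx * Sxy) / delta
a1 = (N * Sxy - Sx * Sy) / delta

print(f"Sx={Sx}, Sx2={Sxx}, Sy={Sy:.2f}, Sxy={Sxy:.2f}")
print(f"y = {a0:.2f} + ({a1:.2f})x")   # close to the slide's y = 24.51 - 7.49x
```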
Example Chi-Square Minimum

Two pairs of graphs accompany this slide: one pair shows how the total probability is maximized at the least-squares coefficients, and the other shows how chi-square is minimized at those same coefficients. Since these were synthetic data, the true coefficients are known: $\mu_0 = 24.53$, $\mu_1 = -7.22$. The true values give neither the maximum probability nor the minimum chi-square!
Weighted Least-Squares

In a weighted least-squares, the inverse of the variance is defined as the weight:

$$ w_i = \frac{1}{\sigma_i^2} $$

When the variance is large, the weight is small, and vice versa. The least-squares equations can be written using this new variable. For example,

$$ a_0 = \frac{\sum w x^2 \sum w y - \sum w x \sum w x y}{\Delta} \qquad \Delta = \sum w \sum w x^2 - \left(\sum w x\right)^2 $$

With normally distributed noise it is uncommon to have a separate weight for each value of x. It is more common to have ranges of the independent variable with different weights. Also, the weights are often whole numbers, since only the relative values of $\sigma$ may be known. For the example on the next slide there is a change in sensitivity at x = 5, where $\sigma = 5$ for $x \le 5$ and $\sigma = 1$ for $x > 5$. The corresponding relative weights are $w = 1$ for $x \le 5$ and $w = 25$ for $x > 5$.
Example Weighted Least-Squares (1)

N = 11 values of x were selected and the value of y measured:

(0, 4.75) (1, 11.14) (2, 21.74) (3, 37.13) (4, 37.43) (5, 45.53)
(6, 47.38) (7, 54.36) (8, 61.77) (9, 68.51) (10, 74.99)

The first row of values (x = 0 through 5) was measured with a sensitivity 5 times larger than the second row; the higher sensitivity made the noise 5 times larger. The first 6 values were therefore weighted 1, and the last 5 weighted 25.
Example Weighted Least-Squares (2)

The experimental data and regression lines are shown in the figure: the blue line is the weighted least-squares, while the magenta line is the unweighted least-squares. The error in the first six points was purposely made large enough that the difference in standard deviation can be seen visually. Note how the unweighted least-squares incorrectly fits the points for x = 6 through 10.
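The sketch below (Python/NumPy, illustrative only; the slides quote no coefficient values for this example) computes both fits so the difference described above can be checked numerically.

```python
import numpy as np

x = np.arange(11, dtype=float)
y = np.array([4.75, 11.14, 21.74, 37.13, 37.43, 45.53,
              47.38, 54.36, 61.77, 68.51, 74.99])
w = np.where(x <= 5, 1.0, 25.0)          # relative weights from the slide

def weighted_fit(x, y, w):
    """Weighted least-squares straight line from the summations above."""
    S, Sx, Sxx = w.sum(), (w * x).sum(), (w * x**2).sum()
    Sy, Sxy = (w * y).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2
    return (Sxx * Sy - Sx * Sxy) / delta, (S * Sxy - Sx * Sy) / delta

print("weighted:   a0=%.2f, a1=%.2f" % weighted_fit(x, y, w))
print("unweighted: a0=%.2f, a1=%.2f" % weighted_fit(x, y, np.ones_like(x)))
```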
Weighting with Counting Errors (1)

When the y-axis involves counts with a Poisson pdf, each value of y will have a unique $\sigma$. To the extent that the Poisson pdf approximates a normal pdf, the least-squares equations can be used. Since the variance of a Poisson distribution equals its mean, the equations are obtained by substituting $\mu_i$ for $\sigma_i^2$:

$$ a_0 = \frac{1}{\Delta}\left(\sum\frac{x_i^2}{\mu_i}\sum\frac{y_i}{\mu_i} - \sum\frac{x_i}{\mu_i}\sum\frac{x_i y_i}{\mu_i}\right) \qquad a_1 = \frac{1}{\Delta}\left(\sum\frac{1}{\mu_i}\sum\frac{x_i y_i}{\mu_i} - \sum\frac{x_i}{\mu_i}\sum\frac{y_i}{\mu_i}\right) $$

$$ \Delta = \sum\frac{1}{\mu_i}\sum\frac{x_i^2}{\mu_i} - \left(\sum\frac{x_i}{\mu_i}\right)^2 $$

With counting experiments y, $\mu$, and $\sigma$ have no units. Thus $\Delta$ has units of x², making $a_0$ unitless and giving $a_1$ units of x⁻¹.
Weighting with Counting Errors (2)

Since the $\mu_i$ are unknown, an iterative procedure must be used (a code sketch follows this list):

1. For the first step, estimate $\mu_i$ with the experimental $y_i$ and solve for $a_0$ and $a_1$.
2. Use the $a_0$ and $a_1$ obtained in step (1) to compute a better estimate of the means, $\mu_i = a_0 + a_1 x_i$.
3. Repeat step (2) using the revised estimates of $\mu_i$.
4. Continue repeating until the values of $a_0$ and $a_1$ are stable to the desired precision.

The procedure gives the same coefficient values as the method of maximum likelihood applied to the Poisson pdf.
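A minimal implementation of the iteration (Python/NumPy; the tolerance and iteration cap are arbitrary choices, not from the slides), run on the data of the next slide, should reproduce the iteration table shown there.

```python
import numpy as np

def poisson_line_fit(x, y, tol=1e-4, max_iter=50):
    """Iteratively reweighted least-squares for count data:
    weights 1/mu_i, with mu_i re-estimated from the current fit."""
    mu = y.astype(float)                 # step 1: estimate mu_i with y_i
    a0 = a1 = None
    for _ in range(max_iter):
        w = 1.0 / mu
        S, Sx, Sxx = w.sum(), (w * x).sum(), (w * x**2).sum()
        Sy, Sxy = (w * y).sum(), (w * x * y).sum()
        delta = S * Sxx - Sx**2
        a0_new = (Sxx * Sy - Sx * Sxy) / delta
        a1_new = (S * Sxy - Sx * Sy) / delta
        if a0 is not None and abs(a0_new - a0) < tol and abs(a1_new - a1) < tol:
            break                        # step 4: coefficients are stable
        a0, a1 = a0_new, a1_new
        mu = a0 + a1 * x                 # steps 2-3: revised estimate of means
    return a0, a1

x = np.arange(11, dtype=float)
y = np.array([20, 31, 36, 41, 37, 50, 58, 52, 58, 54, 56], dtype=float)
print(poisson_line_fit(x, y))            # converges near a0 = 26.09, a1 = 3.75
```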
Example with Counting Errors

N = 11 values of x were selected and the value of y measured:

(0, 20) (1, 31) (2, 36) (3, 41) (4, 37) (5, 50)
(6, 58) (7, 52) (8, 58) (9, 54) (10, 56)

iteration    a0        a1
    1       25.352    3.775
    2       26.052    3.753
    3       26.083    3.747
    4       26.086    3.746
  MML

The blue regression line is the initial iteration; the magenta line is the Poisson MML solution.