Modelling data and curve fitting: Least squares, Maximum likelihood, Chi squared, Confidence limits, General linear fits (Chapter 15, Numerical Recipes, Press et al.)
Best fit straight line: Assume we measure a parameter y for a set of x values, giving a set of data $[x_i]$ and $[y_i]$. We want to model the data using a linear relation $y(x_i) = a + b x_i$.
Best fit straight line: How do we find the coefficients a and b that give the best fit to the data? Given a pair of values for a and b, we need to define a measure of the ‘goodness of fit’. Then choose the a and b values that give the best fit.
Least squares fit: For each data point $x_i$, calculate the difference between the measured $y_i$ and the model prediction $a + b x_i$: $\Delta y_i = y_i - a - b x_i$. Note that $\Delta y_i$ can be positive or negative, so $\sum_i \Delta y_i$ can be zero even for a poor fit. Minimizing the sum of the squared residuals, $S = \sum_i (\Delta y_i)^2$, will give a good overall fit. Computationally, try a range of values for a and b, and for each pair calculate S. The pair which gives the smallest S is the best fit.
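The computational recipe on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a recommended fitting routine: the data values, the trial range (0.0 to 3.0) and the step size (0.1) are all made up for the example, and the function names are hypothetical.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Return S = sum of (y_i - a - b*x_i)^2 for the line y = a + b*x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def grid_search_line(xs, ys, a_values, b_values):
    """Try every (a, b) pair and return (S, a, b) for the smallest S."""
    best = None
    for a in a_values:
        for b in b_values:
            s = sum_squared_residuals(xs, ys, a, b)
            if best is None or s < best[0]:
                best = (s, a, b)
    return best


# Illustrative data lying close to y = 1 + 2x (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

# Trial values from 0.0 to 3.0 in steps of 0.1 for both parameters.
trial = [i / 10 for i in range(31)]
s_min, a_best, b_best = grid_search_line(xs, ys, trial, trial)
print(f"a = {a_best:.1f}, b = {b_best:.1f}, S = {s_min:.3f}")
```

The answer is only as precise as the grid spacing, and the cost grows with the number of trial values for each parameter, which is why the analytic solution is preferred when one exists.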
Maximum likelihood: It can be shown that the parameters that minimize the sum of the squares are the most likely, given the measured data. Assume the x values are exact, and the measurement errors on the y values are Gaussian, with mean zero and standard deviation $\sigma$. So $y_i = y_{\mathrm{true}}(x_i) + \epsilon_i$, where $\epsilon_i$ is a random variable drawn from a Gaussian distribution.
Example Gaussian distribution
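If the true values of a and b are $a_0$ and $b_0$ then $y_{\mathrm{true}}(x_i) = a_0 + b_0 x_i$. So the probability of observing $y_i$ is $P(y_i) \propto \exp\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right]$ (assuming $\sigma$ is the same for all measurements).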
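And the probability of observing the whole dataset $[y_i]$ is $P(a_0, b_0) \propto \prod_i \exp\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right] = \exp\left[-\frac{1}{2\sigma^2}\sum_i (y_i - a_0 - b_0 x_i)^2\right]$. We can use Bayes theorem to relate this to how likely it is that the model parameters are a and b.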
Bayes theorem: Given two events A and B, the conditional probabilities are related by $P(A|B) P(B) = P(B|A) P(A)$. $P(A|B)$ is the probability of A happening, given that B has happened. $P(A)$ is the probability of A happening, independent of B.
Application of Bayes theorem: Consider a model M and some data D. Then Bayes theorem tells you the probability that the model is right, given the data that you have observed: $P(M|D) = P(D|M) P(M) / P(D)$. So the probability of a particular model, given the data, depends on the probability of observing your data given the model. If no model is preferred in advance ($P(M)$ is the same for every model), the most probable model is the one for which the observed data is most likely. Vary a and b to find the maximum of $P(M(a,b)|D)$, which is then proportional to the probability of the dataset defined earlier, evaluated at (a, b).
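Maximizing $P \propto \exp\left[-\frac{1}{2\sigma^2}\sum_i (y_i - a - b x_i)^2\right]$ means minimizing $\sum_i (y_i - a - b x_i)^2$. So for uniform Gaussian errors, maximum likelihood is the same as least squares.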
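The step from the likelihood to the sum of squares can be made explicit by taking logarithms. The following is a short sketch, assuming N independent measurements with the same Gaussian $\sigma$ and writing S for the sum of squared residuals:

```latex
\begin{align*}
P(a,b) &\propto \prod_{i=1}^{N} \exp\left[-\frac{(y_i - a - b x_i)^2}{2\sigma^2}\right]
        = \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i - a - b x_i)^2\right] \\
\ln P(a,b) &= \text{const} - \frac{S}{2\sigma^2},
\qquad S = \sum_{i=1}^{N} (y_i - a - b x_i)^2
\end{align*}
```

Because the logarithm is monotonic and $\sigma$ is fixed, the (a, b) pair with the largest P is the pair with the smallest S.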
Non-Gaussian errors: Sometimes you know the errors are not Gaussian, so least squares may not be the best method. Minimizing the sum of the absolute values of the residuals, $\sum_i |\Delta y_i|$, is very robust against outliers. It is equivalent to using the median instead of the mean. In general use M-estimates: maximum-likelihood estimates based on a non-Gaussian error distribution.
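The link between the two merit functions and the median and mean can be checked numerically for the simplest possible model, a constant y = c. The sketch below uses made-up measurements with one deliberate outlier and a brute-force scan over trial constants; the names and the scan range are assumptions for illustration only.

```python
import statistics


def sum_abs(values, c):
    """Sum of absolute residuals for the constant model y = c."""
    return sum(abs(v - c) for v in values)


def sum_sq(values, c):
    """Sum of squared residuals for the constant model y = c."""
    return sum((v - c) ** 2 for v in values)


# Made-up measurements of a constant quantity, with one outlier (50.0).
data = [9.8, 10.1, 10.0, 9.9, 50.0]

# Scan trial constants from 0.00 to 60.00 in steps of 0.01.
trial = [i / 100 for i in range(6001)]
c_abs = min(trial, key=lambda c: sum_abs(data, c))
c_sq = min(trial, key=lambda c: sum_sq(data, c))

print("least absolute deviation:", c_abs, "median:", statistics.median(data))
print("least squares:", c_sq, "mean:", statistics.mean(data))
```

The absolute-value fit stays with the bulk of the data, while the least squares fit is pulled strongly towards the outlier.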
$\chi^2$ (Chi squared): If the uncertainty $\sigma_i$ is different for each measurement, then define a quantity $\chi^2 = \sum_i \left(\frac{y_i - a - b x_i}{\sigma_i}\right)^2$. If the errors are Gaussian, then minimizing $\chi^2$ will give the maximum likelihood values of the parameters.
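As a small illustration of how the per-point uncertainties enter, the sketch below evaluates chi squared for a straight line, dividing each residual by its own uncertainty before squaring. The data and uncertainties are made up, and the function name is hypothetical.

```python
def chi_squared(xs, ys, sigmas, a, b):
    """chi^2 = sum of ((y_i - a - b*x_i) / sigma_i)^2 for the line y = a + b*x."""
    return sum(((y - a - b * x) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


# Made-up data: the last point is far from y = 1 + 2x but has a large uncertainty.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.2, 4.9, 9.0]
sigmas = [0.1, 0.2, 0.1, 2.0]

chi2 = chi_squared(xs, ys, sigmas, a=1.0, b=2.0)
print(f"chi squared = {chi2:.2f}")
```

Here the residuals 0.2, -0.1 and 2.0 each equal their own uncertainty in magnitude, so each contributes about 1 to the total. In an unweighted sum of squares the last point alone would dominate.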
Example of $\chi^2$ minimum
Finding minimum of $\chi^2$ (numerically): Calculate $\chi^2 = \sum_i (\Delta y_i / \sigma_i)^2$ for a grid of a and b values and pick the point that is the minimum.
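Finding minimum of $\chi^2$ (analytically): Differentiate $\chi^2$ with respect to a and b and set $\partial\chi^2/\partial a = 0$ and $\partial\chi^2/\partial b = 0$. Leads to $a \sum_i \frac{1}{\sigma_i^2} + b \sum_i \frac{x_i}{\sigma_i^2} = \sum_i \frac{y_i}{\sigma_i^2}$ and $a \sum_i \frac{x_i}{\sigma_i^2} + b \sum_i \frac{x_i^2}{\sigma_i^2} = \sum_i \frac{x_i y_i}{\sigma_i^2}$, two linear equations that can be solved for a and b.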
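Setting the two derivatives to zero gives a pair of linear equations in a and b, which can be solved in closed form. The sketch below does this with weighted sums; the variable names (sw, sx, sy, sxx, sxy) and the test data are chosen here for illustration. It also returns the parameter variances that follow from the same sums by standard error propagation, which is an addition not shown on the slide.

```python
def fit_line(xs, ys, sigmas):
    """Return (a, b, var_a, var_b) minimizing chi^2 for y = a + b*x."""
    # Weights 1/sigma_i^2 and the weighted sums from d(chi^2)/da = d(chi^2)/db = 0.
    w = [1.0 / s ** 2 for s in sigmas]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))

    # Determinant of the 2x2 system; zero when all x values coincide.
    delta = sw * sxx - sx ** 2
    if delta == 0.0:
        raise ValueError("x values must not all be identical")

    a = (sxx * sy - sx * sxy) / delta
    b = (sw * sxy - sx * sy) / delta
    return a, b, sxx / delta, sw / delta


# Made-up data lying exactly on y = 1 + 2x, with unequal uncertainties.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
sigmas = [0.1, 0.2, 0.1, 0.3]

a, b, var_a, var_b = fit_line(xs, ys, sigmas)
print(f"a = {a:.3f} +/- {var_a ** 0.5:.3f}, b = {b:.3f} +/- {var_b ** 0.5:.3f}")
```

Unlike a grid search, this gives the minimum directly, with no dependence on a trial range or step size.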
Confidence interval: For N data points and M fitted parameters, the minimum value $\chi^2_{\min}$ has a chi-square distribution with N-M degrees of freedom. The difference $\Delta\chi^2 = \chi^2 - \chi^2_{\min}$ has a chi-square distribution with M degrees of freedom (for M parameters). The probability of a given parameter value being the true value is given by the probability of getting the observed $\Delta\chi^2$ for that value. When $\Delta\chi^2 = 1$ this corresponds to 68%, i.e. 1σ, for a single parameter.
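The correspondence between a change in chi squared and a confidence level can be checked with the closed-form cumulative chi-square distributions for one and two degrees of freedom. The sketch below uses only the standard `math` module; the function names are hypothetical.

```python
import math


def confidence_one_parameter(delta_chi2):
    """P(chi^2 <= delta_chi2) for a chi-square distribution with 1 degree of freedom."""
    return math.erf(math.sqrt(delta_chi2 / 2.0))


def confidence_two_parameters(delta_chi2):
    """P(chi^2 <= delta_chi2) for a chi-square distribution with 2 degrees of freedom."""
    return 1.0 - math.exp(-delta_chi2 / 2.0)


for d in (1.0, 4.0):
    print(f"1 parameter,  delta chi^2 = {d}: {confidence_one_parameter(d):.4f}")
print(f"2 parameters, delta chi^2 = 2.30: {confidence_two_parameters(2.30):.4f}")
```

A change of 1 gives about 68% and a change of 4 about 95% for a single parameter. For two parameters considered jointly, the same 68% level needs a change of about 2.30.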
The value of $\chi^2$: The value of $\chi^2$ tells you more about the model and the data. If $\chi^2$ is significantly greater than the number of degrees of freedom, either the real errors are greater than the $\sigma_i$ that you used, or the model is not good. If $\chi^2$ is significantly less than the number of degrees of freedom, either the real errors are smaller than the $\sigma_i$ that you used, or the model has too many parameters.
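One way to make this comparison concrete is to divide chi squared by the number of degrees of freedom, and to measure its distance from the expected value in units of the expected scatter. A chi-square distribution with ν degrees of freedom has mean ν and variance 2ν. The sketch below uses that as a rough guide only; the function name and the example numbers are made up.

```python
import math


def chi2_summary(chi2, n_points, n_params):
    """Return (reduced chi^2, deviation from the expected value in standard deviations)."""
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    reduced = chi2 / dof
    # A chi-square variable with dof degrees of freedom has mean dof, variance 2*dof.
    n_sigma = (chi2 - dof) / math.sqrt(2.0 * dof)
    return reduced, n_sigma


# Hypothetical fit: chi^2 = 25 from 12 data points and 2 fitted parameters.
reduced, n_sigma = chi2_summary(25.0, n_points=12, n_params=2)
print(f"reduced chi^2 = {reduced:.2f}, {n_sigma:.1f} standard deviations above expectation")
```

A reduced value well above 1 points to underestimated errors or a poor model; a value well below 1 points to overestimated errors or too many parameters.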
General linear models: Express your model as the sum of basis functions with linear coefficients: $y(x) = \sum_{k=1}^{M} a_k X_k(x)$. The functions $X_k(x)$ can be arbitrary, but are fixed. A common example is a polynomial fit, where the functions $X_k(x)$ are powers of x.
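Finding minimum of $\chi^2$ (analytically for general model): With $\chi^2 = \sum_i \left[\frac{y_i - \sum_j a_j X_j(x_i)}{\sigma_i}\right]^2$, differentiate with respect to each parameter $a_k$ and set the derivatives to zero: $\sum_i \frac{1}{\sigma_i^2}\left[y_i - \sum_j a_j X_j(x_i)\right] X_k(x_i) = 0$. Define a matrix $\alpha$ and a vector $\beta$: $\alpha_{kj} = \sum_i \frac{X_j(x_i) X_k(x_i)}{\sigma_i^2}$ and $\beta_k = \sum_i \frac{y_i X_k(x_i)}{\sigma_i^2}$. $\alpha$ is called the curvature matrix.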
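Then the equations can be written in matrix form: $\sum_j \alpha_{kj} a_j = \beta_k$, i.e. $\alpha \mathbf{a} = \beta$. And the solutions are given by $a_j = \sum_k C_{jk} \beta_k$, where $C = \alpha^{-1}$ is the inverse of the curvature matrix. C is also called the covariance matrix.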
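The matrix formulation can be sketched in plain Python for an arbitrary list of basis functions. This is a minimal illustration under stated assumptions: it builds the curvature matrix and the right-hand-side vector from per-point uncertainties, then solves the linear system directly by Gauss-Jordan elimination rather than forming the inverse matrix C explicitly. All function names and the quadratic test data are made up for the example.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left unchanged.
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("curvature matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def general_linear_fit(xs, ys, sigmas, basis):
    """Return coefficients a_k minimizing chi^2 for y(x) = sum_k a_k * X_k(x)."""
    m = len(basis)
    alpha = [[0.0] * m for _ in range(m)]  # curvature matrix
    beta = [0.0] * m
    for x, y, s in zip(xs, ys, sigmas):
        values = [f(x) for f in basis]
        w = 1.0 / s ** 2
        for k in range(m):
            beta[k] += w * y * values[k]
            for j in range(m):
                alpha[k][j] += w * values[k] * values[j]
    return solve_linear_system(alpha, beta)


# Quadratic example with basis functions 1, x, x^2 and noise-free made-up data.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]
sigmas = [0.1] * len(xs)

coeffs = general_linear_fit(xs, ys, sigmas, basis)
print([round(c, 6) for c in coeffs])
```

Solving the system directly is enough when only the coefficients are needed. The inverse matrix C is worth computing when the parameter uncertainties are wanted as well.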
Non-linear fits: The easiest approach is to make the problem linear, for example by taking logs. Otherwise use a direct parameter search for the minimum $\chi^2$.
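As an illustration of the "take logs" approach, an exponential model y = A exp(kx) becomes a straight line in ln y against x, with intercept ln A and slope k. The sketch below assumes every y is positive and uses an unweighted straight-line fit; the function names and the noise-free test data are made up.

```python
import math


def fit_line_unweighted(xs, ys):
    """Least-squares fit of y = a + b*x with equal weights; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_exponential(xs, ys):
    """Fit y = A * exp(k*x) by fitting a straight line to (x, ln y)."""
    log_ys = [math.log(y) for y in ys]  # requires every y > 0
    ln_amplitude, k = fit_line_unweighted(xs, log_ys)
    return math.exp(ln_amplitude), k


# Noise-free made-up data from y = 3 * exp(-1.2 * x).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3.0 * math.exp(-1.2 * x) for x in xs]

amplitude, rate = fit_exponential(xs, ys)
print(f"A = {amplitude:.4f}, k = {rate:.4f}")
```

Taking logs also changes the errors: an uncertainty of σ on y becomes roughly σ/y on ln y, so with real data a weighted fit in log space is usually the better choice.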
Workshop: Least squares straight line fit, and measuring and interpreting chi square. Non-linear fit using a simple search for the minimum chi square.