1
Parameter Estimation and Fitting to Data
Maximum likelihood. Least squares. Goodness-of-fit. Examples.
Elton S. Smith, Jefferson Lab
2
Parameter estimation
3
Properties of estimators
4
An estimator for the mean
5
An estimator for the variance
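A minimal sketch of the two estimators named in these slide titles, assuming the usual choices are meant: the arithmetic mean, and the unbiased sample variance with its 1/(N - 1) factor. The function names and the data values are illustrative only.

```python
def sample_mean(values):
    """Arithmetic mean: an unbiased estimator of the population mean."""
    return sum(values) / len(values)


def sample_variance(values):
    """Unbiased variance estimator.

    Divides by N - 1 rather than N because the mean is itself
    estimated from the same data.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sample_mean(values)
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [1.0, 2.0, 3.0, 4.0]  # made-up numbers for illustration
print(sample_mean(data), sample_variance(data))
```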
6
The Likelihood function via example
We have a data set given by N data pairs (x_i, y_i ± σ_i), graphically represented below. The goal is to determine the fixed, but unknown, μ = f(x). σ is known or estimated from the data.
7
Gaussian probabilities (least-squares)
We assume that at a fixed value of x_i we have made a measurement y_i, and that the measurement was drawn from a Gaussian probability distribution with mean y(x_i) = a + b x_i and variance σ_i².
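Under the Gaussian assumption just stated, and assuming the N measurements are independent, the likelihood and its logarithm can be written out as follows. This is a sketch of the standard argument: maximizing ln L with respect to a and b is the same as minimizing χ².

```latex
\begin{aligned}
L(a,b) &= \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}
          \exp\!\left[-\frac{\left(y_i - a - b x_i\right)^2}{2\sigma_i^2}\right] \\
\ln L(a,b) &= -\frac{1}{2}\sum_{i=1}^{N}
          \frac{\left(y_i - a - b x_i\right)^2}{\sigma_i^2}
          \;-\; \sum_{i=1}^{N}\ln\!\left(\sqrt{2\pi}\,\sigma_i\right)
        \;=\; -\frac{1}{2}\,\chi^2(a,b) + \text{const}
\end{aligned}
```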
8
χ² minimization
9
Solution for linear fit
For simplicity, assume constant σ = σ_i. Then solve two simultaneous equations for the two unknowns, a and b. Parameter uncertainties can be estimated from the curvature of the χ² function.
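For constant σ, the two simultaneous equations have a closed-form solution. The sketch below assumes the standard least-squares normal equations and takes the parameter covariance from the inverse of half the χ² curvature matrix. The function name `linear_fit` and its return layout are illustrative, not taken from the slides.

```python
def linear_fit(x, y, sigma):
    """Least-squares fit of y = a + b*x with the same sigma on every point.

    Returns (a, b, var_a, var_b, cov_ab).
    """
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx          # determinant of the normal equations

    a = (sxx * sy - sx * sxy) / delta  # intercept
    b = (n * sxy - sx * sy) / delta    # slope

    # Covariance matrix = sigma^2 / delta * [[sxx, -sx], [-sx, n]]
    var_a = sigma ** 2 * sxx / delta
    var_b = sigma ** 2 * n / delta
    cov_ab = -sigma ** 2 * sx / delta
    return a, b, var_a, var_b, cov_ab


# Noise-free points on y = 5 + x, so the fit should return a = 5, b = 1.
x = list(range(10))
y = [5.0 + xi for xi in x]
a, b, var_a, var_b, cov_ab = linear_fit(x, y, sigma=0.5)
print(a, b, var_a ** 0.5, var_b ** 0.5, cov_ab)
```

The central values do not depend on σ when it is constant; only the uncertainties do.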
10
Parameter uncertainties
In a graphical method the uncertainty in the parameter estimator θ_0 is obtained by changing χ² by one unit: χ²(θ_0 ± σ_θ) = χ²(θ_0) + 1. In general, using maximum likelihood: ln L(θ_0 ± σ_θ) = ln L(θ_0) − 1/2.
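The Δχ² = 1 rule can be checked numerically in the simplest case. The sketch below assumes a single parameter θ (a constant fitted to measurements that share the same σ), for which the expected answer is σ/√N. The helper names and the data values are made up for illustration.

```python
import math


def chi2_const(theta, y, sigma):
    """chi2 for the hypothesis that every y_i measures the same value theta."""
    return sum(((yi - theta) / sigma) ** 2 for yi in y)


def delta_chi2_error(y, sigma, tol=1e-10):
    """Return (theta0, error), where chi2(theta0 + error) = chi2_min + 1."""
    theta0 = sum(y) / len(y)            # chi2 minimum for equal sigma
    chi2_min = chi2_const(theta0, y, sigma)
    lo, hi = theta0, theta0 + 10.0 * sigma  # chi2 rises by >= 100 at hi
    while hi - lo > tol:                # bisection on the crossing point
        mid = 0.5 * (lo + hi)
        if chi2_const(mid, y, sigma) - chi2_min < 1.0:
            lo = mid
        else:
            hi = mid
    return theta0, 0.5 * (lo + hi) - theta0


y = [4.8, 5.3, 5.1, 4.9]                # made-up measurements
theta0, err = delta_chi2_error(y, sigma=0.5)
print(theta0, err, 0.5 / math.sqrt(len(y)))
```

With more than one fitted parameter, the χ² would also have to be re-minimized over the other parameters at each scan point, which this sketch does not do.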
11
The Likelihood function via example
What does the fit look like? ROOT fit to ‘pol1’. Additional information about the fit: χ² and probability. Were the assumptions for the fit valid? This question is addressed by exploring the significance and the goodness of the fit.
12
Testing significance/goodness of fit
Quantify the level of agreement between the data and a hypothesis without explicit reference to alternative hypotheses. This is done by defining a goodness-of-fit statistic, and the goodness-of-fit is quantified using the p-value. When χ² is the goodness-of-fit statistic, the p-value is given by p = ∫_{χ²_obs}^{∞} f(z; n_d) dz, where f(z; n_d) is the χ² p.d.f. for n_d degrees of freedom. The p-value is a function of the observed value χ²_obs and is therefore itself a random variable. If the hypothesis used to compute the p-value is true, then p will be uniformly distributed between zero and one.
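As a rough illustration, the χ² p-value can be evaluated with the standard library alone. The sketch assumes the identity p = Q(n_d/2, χ²/2), where Q is the regularized upper incomplete gamma function, and builds it up with the recurrence Q(a+1, x) = Q(a, x) + x^a e^(-x) / Γ(a+1). The function name `chi2_pvalue` is illustrative, and the loop is only meant for moderate χ² and n_d.

```python
import math


def chi2_pvalue(chi2_obs, ndf):
    """Probability of a chi2 at least as large as chi2_obs for ndf degrees of freedom."""
    if ndf < 1 or chi2_obs < 0:
        raise ValueError("need ndf >= 1 and chi2_obs >= 0")
    x = 0.5 * chi2_obs
    if ndf % 2 == 0:
        a, q = 1.0, math.exp(-x)              # Q(1, x) = exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))   # Q(1/2, x) = erfc(sqrt(x))
    # Step a up to ndf / 2 using Q(a+1, x) = Q(a, x) + x^a e^-x / Gamma(a+1)
    while a < 0.5 * ndf - 1e-9:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


for chi2_obs, ndf in [(2.0, 2), (3.841, 1), (7.815, 3), (8.0, 8)]:
    print(ndf, chi2_obs, chi2_pvalue(chi2_obs, ndf))
```

Where SciPy is available, `scipy.stats.chi2.sf(chi2_obs, ndf)` computes the same quantity.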
13
χ² distribution Gaussian-like
14
p-value for χ² distribution
15
Using the goodness-of-fit
Data generated using Y(x) = x, σ = 0.5. Compare three different polynomial fits to the same data:
y(x) = p0 + p1 x
y(x) = p0
y(x) = p0 + p1 x + p2 x²
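The comparison of the three polynomial fits can be sketched as below. The x positions (0 to 9), the random seed, and the helper `polyfit_chi2` are assumptions made for illustration; the slide specifies only Y(x) = x and σ = 0.5. The fit solves the least-squares normal equations by Gaussian elimination and reports χ² and the number of degrees of freedom for each polynomial.

```python
import random


def polyfit_chi2(x, y, sigma, degree):
    """Least-squares polynomial fit with constant sigma.

    Returns (params, chi2, ndf) with params = [p0, p1, ...].
    """
    m = degree + 1
    # Normal equations A p = v, A[j][k] = sum x^(j+k), v[j] = sum y x^j
    A = [[sum(xi ** (j + k) for xi in x) for k in range(m)] for j in range(m)]
    v = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(m)]
    # Forward elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            v[r] -= f * v[col]
    # Back substitution
    p = [0.0] * m
    for r in reversed(range(m)):
        s = sum(A[r][c] * p[c] for c in range(r + 1, m))
        p[r] = (v[r] - s) / A[r][r]
    chi2 = sum(((yi - sum(pk * xi ** k for k, pk in enumerate(p))) / sigma) ** 2
               for xi, yi in zip(x, y))
    return p, chi2, len(x) - m


rng = random.Random(2024)              # fixed seed, arbitrary choice
sigma = 0.5
x = list(range(10))                    # assumed x positions
y = [xi + rng.gauss(0.0, sigma) for xi in x]

results = {deg: polyfit_chi2(x, y, sigma, deg) for deg in (0, 1, 2)}
for deg, (params, chi2, ndf) in results.items():
    print(f"degree {deg}: chi2/ndf = {chi2:.2f}/{ndf}")
```

Because the three models are nested, χ² can only decrease as terms are added, so χ² per degree of freedom (or the p-value) is the quantity to compare rather than χ² itself.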
16
χ²/DOF vs degrees of freedom
17
More about the likelihood method
Recall the likelihood for least squares, which assumed Gaussian probabilities. The probability density, however, depends on the application. Proceed as before by maximizing ln L (the χ² term enters with a minus sign). The values of the estimated parameters might not be very different, but the uncertainties can be greatly affected.
18
Applications
σ_i² = constant; σ_i² = y_i; σ_i² = Y(x)
Poisson distribution (see PDG Eq ). Stirling’s approximation: ln(n!) ≈ n ln(n) − n
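For the Poisson case, the way Stirling's approximation enters can be sketched as follows. The symbols are assumptions introduced here: n_i is the observed count in bin i and ν_i is the mean predicted by the fit function for that bin.

```latex
\begin{aligned}
\ln L &= \sum_i \left[\, n_i \ln \nu_i - \nu_i - \ln(n_i!) \,\right] \\
      &\approx \sum_i \left[\, n_i \ln \nu_i - \nu_i - n_i \ln n_i + n_i \,\right]
       \qquad \left(\ln n! \approx n \ln n - n\right) \\
-2 \ln L &\approx 2 \sum_i \left[\, \nu_i - n_i + n_i \ln\frac{n_i}{\nu_i} \,\right]
\end{aligned}
```

The last expression plays the role of χ² for counting data; bins with n_i = 0 contribute 2ν_i, since the logarithmic term vanishes there.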
19
Exercise 3 - Linear Fits
Assume a parent distribution of the form y(x) = a + bx, with a = 5, b = 1. Assume one experiment collects a data set of ten points of the form (x_i, y_i ± σ), i = 0, 1, 2, ..., 9, with the measurements y_i following a Gaussian distribution with a fixed width σ = 0.5. Invent the data points y_i for one experiment. Fit the data y_i to the form y = a + bx. Determine y and the uncertainty of y as a function of x from the fit.
20
Linear Fit – one “experiment”
Fit for one “experiment” showing the fitted parameters. The uncertainty σ_y can be computed using σ_y² = σ_a² + x² σ_b² + 2x σ_ab.
21
Linear Fit
22
Linear Fit – Covariance Matrix
23
Fitted Results to 1000 “experiments”
For each fit, plot the fitted value of the intercept and slope. Fit the distributions to Gaussian functions.
Mean = ± 0.010, Sigma = 0.303. What is the relation between these two?
Mean = ± , Sigma =
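The repeated-experiment study can be sketched as below, using the setup of Exercise 3 (a = 5, b = 1, σ = 0.5, x = 0, ..., 9). The seed and the helper names are arbitrary choices, and the sample mean and standard deviation stand in for the Gaussian fits to the distributions. One plausible reading of the question about the two numbers is that the uncertainty on the mean equals the width divided by the square root of the number of experiments: 0.303/√1000 ≈ 0.0096, consistent with the quoted 0.010.

```python
import math
import random


def fit_line(x, y):
    """Unweighted least-squares line; returns (intercept, slope)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx
    return (sxx * sy - sx * sxy) / delta, (n * sxy - sx * sy) / delta


def run_experiments(n_exp=1000, a_true=5.0, b_true=1.0, sigma=0.5, seed=12345):
    """Generate n_exp data sets and fit each; returns (intercepts, slopes)."""
    rng = random.Random(seed)
    x = list(range(10))
    intercepts, slopes = [], []
    for _ in range(n_exp):
        y = [a_true + b_true * xi + rng.gauss(0.0, sigma) for xi in x]
        a, b = fit_line(x, y)
        intercepts.append(a)
        slopes.append(b)
    return intercepts, slopes


def mean_and_sigma(values):
    """Sample mean and standard deviation (N - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


intercepts, slopes = run_experiments()
mean_a, sigma_a = mean_and_sigma(intercepts)
err_mean_a = sigma_a / math.sqrt(len(intercepts))  # uncertainty on the mean
print(f"intercept: mean = {mean_a:.3f} +/- {err_mean_a:.3f}, sigma = {sigma_a:.3f}")
```

The width of the intercept distribution should come out close to the single-fit uncertainty σ_a ≈ 0.294 predicted by the covariance matrix for these x values.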
24
Plot difference between fitted and true values
Fit a Gaussian to slices of y(x) − y_fit. The uncertainty σ_y can be computed using σ_y² = σ_a² + x² σ_b² + 2x σ_ab; setting σ_ab = 0 drops the last term. The correlation term is important.
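How much the correlation term matters can be seen by evaluating σ_y with and without σ_ab. The sketch assumes the Exercise 3 setup (x = 0, ..., 9, σ = 0.5) and the analytic covariance matrix of a straight-line fit with constant σ; the function names are illustrative.

```python
import math


def fit_covariance(x, sigma):
    """Analytic (var_a, var_b, cov_ab) for a fit of y = a + b*x with constant sigma."""
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    delta = n * sxx - sx * sx
    return sigma ** 2 * sxx / delta, sigma ** 2 * n / delta, -sigma ** 2 * sx / delta


def sigma_y(xv, var_a, var_b, cov_ab):
    """Propagated uncertainty on y = a + b*xv."""
    return math.sqrt(var_a + xv ** 2 * var_b + 2.0 * xv * cov_ab)


x = list(range(10))
var_a, var_b, cov_ab = fit_covariance(x, sigma=0.5)
for xv in (0.0, 4.5, 9.0):
    full = sigma_y(xv, var_a, var_b, cov_ab)
    no_corr = sigma_y(xv, var_a, var_b, 0.0)   # correlation term dropped
    print(f"x = {xv}: sigma_y = {full:.3f}, with sigma_ab = 0: {no_corr:.3f}")
```

At the mean of the x values (x = 4.5) the full formula reduces to σ/√N ≈ 0.158, while dropping σ_ab gives about 0.384, more than twice as large.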
25
Statistical tests
In addition to estimating parameters, one often wants to assess the validity of statements concerning the underlying distribution.
Hypothesis tests provide a rule for accepting or rejecting hypotheses depending on the outcome of an experiment. [Comparison of H0 vs H1]
In goodness-of-fit tests one gives the probability to obtain a level of incompatibility with a certain hypothesis that is greater than or equal to the level observed in the actual data. [How good is my assumed H0?]
Formulation of the relevant question is critical to deciding how to answer it.
26
Summary of second lecture
Parameter estimation: illustrated the method of maximum likelihood using the least squares assumption.
Use of goodness-of-fit statistics to determine the validity of underlying assumptions used to determine parent parameters.