Examples using ROOT (http://root.cern.ch)
Ex1: Event Generation (Binomial Distribution)
Ex2: Generate various random distributions
Ex3: Linear fits
Ex4: Determination of the area under a Gaussian
Ex5: Bayesian inference
Elton S. Smith, Jefferson Lab
Exercise 1 – Generate Binomial Distribution
Exercise: Use the ROOT random number generator to generate 10 random numbers according to the binomial distribution. Compare with the parent probability distribution function.
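As a rough illustration of Exercise 1, the sketch below uses plain Python rather than ROOT (in ROOT, `TRandom::Binomial(ntot, prob)` plays the role of `binomial_draw`). The exercise does not fix the number of trials or the success probability, so `N_TRIALS = 10` and `P_SUCCESS = 0.5` are assumed values, and the helper names are hypothetical.

```python
import math
import random


def binomial_pmf(k, n, p):
    """Parent probability P(k; n, p) of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_draw(rng, n, p):
    """One binomial variate: count successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


# Assumed values for illustration; the exercise does not specify them.
N_TRIALS, P_SUCCESS, N_DRAWS = 10, 0.5, 10

rng = random.Random(12345)  # fixed seed so the run is reproducible
draws = [binomial_draw(rng, N_TRIALS, P_SUCCESS) for _ in range(N_DRAWS)]

# Compare the observed counts with the expectation N_DRAWS * P(k).
for k in range(N_TRIALS + 1):
    observed = draws.count(k)
    expected = N_DRAWS * binomial_pmf(k, N_TRIALS, P_SUCCESS)
    print(f"k={k:2d}  observed={observed:2d}  expected={expected:5.2f}")
```

With only 10 generated numbers the agreement with the parent distribution is necessarily coarse; increasing `N_DRAWS` makes the comparison sharper.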
Exercise 1 - output
Exercise 2 – Various random distributions
Exercise: Use the ROOT random number generator to generate random numbers according to a
  Uniform distribution
  Exponential distribution
  Gaussian distribution
  Poisson distribution
Compare with the parent probability distribution function.
Extra credit: Generate random numbers according to an arbitrary input function.
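A possible sketch of Exercise 2 in plain Python follows (ROOT's `TRandom` classes offer `Uniform`, `Exp`, `Gaus` and `Poisson`, and `TF1::GetRandom` samples an arbitrary function). All distribution parameters here are assumed values. The Poisson generator uses Knuth's multiplication method, and the "arbitrary function" is handled by accept-reject sampling on a hypothetical polynomial `poly(x) = 1 + x^2` over [0, 1]; neither choice comes from the slides.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed for reproducibility


def poisson_draw(rng, mean):
    """Poisson variate via Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def accept_reject(rng, f, x_lo, x_hi, f_max):
    """Draw x in [x_lo, x_hi] with density proportional to f, where 0 <= f <= f_max."""
    while True:
        x = rng.uniform(x_lo, x_hi)
        if rng.random() * f_max <= f(x):
            return x


def poly(x):
    """Hypothetical input function for the extra-credit part."""
    return 1.0 + x * x


N = 20000
samples = {
    "uniform": [rng.uniform(0.0, 1.0) for _ in range(N)],
    "exponential": [rng.expovariate(1.0 / 2.0) for _ in range(N)],  # mean 2
    "gaussian": [rng.gauss(0.0, 1.0) for _ in range(N)],
    "poisson": [poisson_draw(rng, 3.0) for _ in range(N)],
    "polynomial": [accept_reject(rng, poly, 0.0, 1.0, 2.0) for _ in range(N)],
}

# Means of the parent distributions; for the polynomial,
# integral of x(1 + x^2) over [0, 1] divided by integral of (1 + x^2) = 9/16.
parent_means = {
    "uniform": 0.5,
    "exponential": 2.0,
    "gaussian": 0.0,
    "poisson": 3.0,
    "polynomial": 9.0 / 16.0,
}

for name, values in samples.items():
    print(f"{name:12s} sample mean={statistics.fmean(values):7.4f}  "
          f"parent mean={parent_means[name]:7.4f}")
```

Comparing means is only the simplest check; a fuller comparison would overlay each histogram on the parent probability density, as the exercise asks.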
Random distributions
Polynomial function
Exercise 3 – Linear Fits
Assume a parent distribution of the form y(x) = a + bx, with a = 5, b = 1.
Assume one experiment collects a data set of ten points of the form $(x_i, y_i \pm \sigma)$, i = 0, 1, 2, ..., 9, with the measurements $y_i$ following a Gaussian distribution with a fixed width $\sigma = 0.5$.
Invent the data points $y_i$ for one experiment.
Fit the data $y_i$ to the form y = a + bx.
Determine y and the uncertainty of y as a function of x from the fit.
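One way to carry out the single-experiment fit without ROOT is the closed-form least-squares solution for equal uncertainties, sketched below. The slides do not state the x values, so `x_i = i` is an assumption, and `fit_line` is a hypothetical helper name.

```python
import random


def fit_line(xs, ys, sigma):
    """Least-squares fit of y = a + b*x for equal uncertainties sigma.

    Returns (a, b, var_a, var_b, cov_ab).
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / delta
    b = (n * sxy - sx * sy) / delta
    var_a = sigma**2 * sxx / delta
    var_b = sigma**2 * n / delta
    cov_ab = -(sigma**2) * sx / delta
    return a, b, var_a, var_b, cov_ab


A_TRUE, B_TRUE, SIGMA = 5.0, 1.0, 0.5
rng = random.Random(7)  # fixed seed for reproducibility
xs = [float(i) for i in range(10)]  # assumed: x_i = i
ys = [A_TRUE + B_TRUE * x + rng.gauss(0.0, SIGMA) for x in xs]

a, b, var_a, var_b, cov_ab = fit_line(xs, ys, SIGMA)
print(f"a = {a:.3f} +/- {var_a ** 0.5:.3f}")
print(f"b = {b:.3f} +/- {var_b ** 0.5:.4f}")
print(f"cov(a, b) = {cov_ab:.5f}")
```

Under these assumptions the parameter uncertainties depend only on the x values and on sigma, not on the particular data set: about 0.294 for the intercept and 0.0550 for the slope, with a negative covariance.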
Exercise 3 – Linear Fits: Extra credit
Generate 1000 Monte Carlo experiments.
For each experiment, fit the data set to the functional form given above.
Plot the difference between the fitted function and the data.
Histogram the difference between the fitted parameters and the true parameters a and b.
Use the width of the distribution to determine the uncertainty in the parameters.
Compare with the estimated uncertainties in the fit.
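The Monte Carlo study can be sketched in plain Python as follows, again assuming `x_i = i` (not stated in the slides). Instead of histogramming and fitting Gaussians, this simplified version takes the sample standard deviation of the parameter differences as the width.

```python
import random
import statistics

A_TRUE, B_TRUE, SIGMA = 5.0, 1.0, 0.5
XS = [float(i) for i in range(10)]  # assumed: x_i = i
N_EXPERIMENTS = 1000


def fit_line(xs, ys):
    """Least-squares intercept and slope for equal uncertainties."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


rng = random.Random(2718)  # fixed seed for reproducibility
da, db = [], []  # fitted minus true parameter, one entry per experiment
for _ in range(N_EXPERIMENTS):
    ys = [A_TRUE + B_TRUE * x + rng.gauss(0.0, SIGMA) for x in XS]
    a, b = fit_line(XS, ys)
    da.append(a - A_TRUE)
    db.append(b - B_TRUE)

spread_a = statistics.stdev(da)
spread_b = statistics.stdev(db)
mean_da, mean_db = statistics.fmean(da), statistics.fmean(db)
cov_ab = sum((u - mean_da) * (v - mean_db) for u, v in zip(da, db)) / (N_EXPERIMENTS - 1)

print(f"intercept: mean diff = {mean_da:+.4f}, width = {spread_a:.4f}")
print(f"slope:     mean diff = {mean_db:+.4f}, width = {spread_b:.4f}")
print(f"sample covariance of (a, b) = {cov_ab:.5f}")
```

The widths can then be compared with the uncertainties estimated by a single fit; the sample covariance is negative because a larger fitted slope is compensated by a smaller fitted intercept.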
Linear Fit – one “experiment”
Fit for one “experiment” showing the fitted parameters
Repeat 1000 times
Fitted Results to 1000 “experiments”
For each fit, plot the fitted value of the intercept and slope. Fit the distributions to Gaussian functions.
Intercept: Mean = 5.004 ± 0.010, Sigma = 0.303
Slope: Mean = 0.9997 ± 0.0018, Sigma = 0.05626
What is the relation between these two?
Plot difference between fitted and true values
Fit Gaussian to slices of y(x) − yfit
The uncertainty $\sigma_y$ can be computed using
$\sigma_y^2 = \sigma_a^2 + x^2 \sigma_b^2 + 2x\,\sigma_{ab}$
(plot label: $\sigma_{ab} = 0$)
Correlation term is important
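To see how much the correlation term matters, the sketch below evaluates the propagated uncertainty on y with and without the covariance, using the closed-form covariance matrix of a straight-line fit with equal uncertainties. It assumes `x_i = 0, 1, ..., 9` and sigma = 0.5, as in Exercise 3; `sigma_y` is a hypothetical helper name.

```python
import math

SIGMA = 0.5
XS = [float(i) for i in range(10)]  # assumed: x_i = i

# Closed-form covariance matrix elements for y = a + b*x with equal uncertainties.
n = len(XS)
sx = sum(XS)
sxx = sum(x * x for x in XS)
delta = n * sxx - sx * sx
var_a = SIGMA**2 * sxx / delta
var_b = SIGMA**2 * n / delta
cov_ab = -(SIGMA**2) * sx / delta


def sigma_y(x, include_correlation=True):
    """Propagated uncertainty on y = a + b*x at position x."""
    variance = var_a + x * x * var_b
    if include_correlation:
        variance += 2.0 * x * cov_ab
    return math.sqrt(variance)


for x in XS:
    print(f"x={x:3.0f}  sigma_y={sigma_y(x):.4f}  "
          f"without correlation={sigma_y(x, include_correlation=False):.4f}")
```

At the mean of the x values (x = 4.5) the full expression gives sigma / sqrt(10), about 0.158, while dropping the correlation term gives about 0.384, so neglecting the covariance substantially overestimates the uncertainty there.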
Exercise 4 – Fitted Gaussian area
Many measurements of a variable x have been accumulated in one experiment. The measurements are dominated by experimental resolution, so the distribution of measurements is Gaussian.
Obtain Integral
  Generate 1000 measurements (one experiment) according to a Gaussian distribution.
  Fit the distribution to a Gaussian and determine the area under the curve and its uncertainty.
Systematic Study (extra credit)
  Generate 100 experiments with 1000 measurements each and empirically determine areas and uncertainties.
  Use fit defaults and option=‘L’.
  Compare and discuss.
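The difference between a least-squares fit and a likelihood fit of the area can be illustrated with a deliberately simplified sketch. Unlike the exercise, which fits area, mean and width, this version holds the Gaussian shape fixed at its true values and fits only the area, because that case has a closed-form solution for both estimators. The binning (50 bins on [-5, 5]) is an assumed choice, and the least-squares version uses errors sqrt(n_i) and skips empty bins, which is intended to mimic a default chi-square histogram fit.

```python
import math
import random
import statistics

N_EVENTS, N_EXPERIMENTS = 1000, 100
X_LO, X_HI, N_BINS = -5.0, 5.0, 50  # assumed binning
WIDTH = (X_HI - X_LO) / N_BINS


def gauss_cdf(x):
    """Cumulative distribution of the unit Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Probability content of each bin for the known (fixed) unit Gaussian shape.
edges = [X_LO + i * WIDTH for i in range(N_BINS + 1)]
probs = [gauss_cdf(edges[i + 1]) - gauss_cdf(edges[i]) for i in range(N_BINS)]


def histogram(values):
    """Bin the values; entries outside [X_LO, X_HI) are dropped."""
    counts = [0] * N_BINS
    for v in values:
        if X_LO <= v < X_HI:
            counts[min(int((v - X_LO) / WIDTH), N_BINS - 1)] += 1
    return counts


def area_chi2(counts):
    """Minimise sum (n_i - A*p_i)^2 / n_i over the non-empty bins."""
    num = sum(p for p, n in zip(probs, counts) if n > 0)
    den = sum(p * p / n for p, n in zip(probs, counts) if n > 0)
    return num / den


def area_likelihood(counts):
    """Maximise the binned Poisson likelihood: A = sum(n_i) / sum(p_i)."""
    return sum(counts) / sum(probs)


rng = random.Random(31415)  # fixed seed for reproducibility
chi2_areas, like_areas = [], []
for _ in range(N_EXPERIMENTS):
    counts = histogram([rng.gauss(0.0, 1.0) for _ in range(N_EVENTS)])
    chi2_areas.append(area_chi2(counts))
    like_areas.append(area_likelihood(counts))

mean_chi2 = statistics.fmean(chi2_areas)
mean_like = statistics.fmean(like_areas)
print(f"least squares: mean area = {mean_chi2:.1f}")
print(f"likelihood:    mean area = {mean_like:.1f}")
```

In this simplified setting the likelihood estimate reproduces the number of generated events, while the least-squares estimate comes out low, because bins that fluctuate downward receive a smaller error and therefore a larger weight.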
Fitted Gaussian area – one “experiment”
Fitting Option=default
[Figure: fitted distribution of Variable x (xunits); plot annotations (969), (322), (5)]
Fitted Gaussian area – 100 “experiments”
Events Generated: Mean = 1000 ± 3.7, Sigma = 34
Fitted Gaussian Sum: Mean = 976.5 ± 3.7, Sigma = 35
Fitted Gaussian area – one “experiment”
Fitting Option=‘L’
[Figure: fitted distribution of Variable x (xunits); plot annotations (997), (322), (5)]
Fitted Gaussian area – 100 “experiments”
Events Generated: Mean = 1000 ± 3.7, Sigma = 34
Fitted Gaussian Sum: Mean = 998 ± 3.5, Sigma = 32
Summary of examples of fitting
Linear fit
  Obtained values of the uncertainties of the parameters by Monte Carlo.
  Demonstrated that these uncertainties correspond to the computed estimate.
  The correlation term must be included.
Determination of areas under Gaussian distributions
  The area and uncertainties were computed.
  The computation requires inclusion of the correlation term.
  Monte Carlo determination of the area shows that the use of least-squares fits leads to estimates for the area that are systematically low.
Bayesian inference – example
An experiment is interested in identifying pions in the presence of a large number of muon tracks.
A detector designed to identify pions has a 96% efficiency for tagging pions as pions and a 1% chance of misidentifying a muon and tagging it as a pion.
Assume there are 10 times more muons than pions.
Assume that a track has been tagged as a pion. What is the degree of belief that the track is a pion?
How does this change if there are 100 times more muons than pions?
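The question can be worked through with Bayes' theorem, P(pion | tag) = P(tag | pion) P(pion) / [P(tag | pion) P(pion) + P(tag | muon) P(muon)]. The sketch below assumes that pions and muons are the only track types, so the priors follow directly from the muon-to-pion ratio; `posterior_pion` is a hypothetical helper name.

```python
def posterior_pion(efficiency, misid, muons_per_pion):
    """Degree of belief that a track tagged as a pion really is a pion.

    efficiency:     P(tag | pion)
    misid:          P(tag | muon)
    muons_per_pion: ratio of muon to pion tracks (sets the priors)
    """
    prior_pion = 1.0 / (1.0 + muons_per_pion)
    prior_muon = 1.0 - prior_pion
    evidence = efficiency * prior_pion + misid * prior_muon
    return efficiency * prior_pion / evidence


for ratio in (10, 100):
    belief = posterior_pion(efficiency=0.96, misid=0.01, muons_per_pion=ratio)
    print(f"{ratio:4d} muons per pion: P(pion | tagged) = {belief:.3f}")
```

With these inputs the result is 0.96 / 1.06, about 0.906, for 10 muons per pion and 0.96 / 1.96, about 0.490, for 100 muons per pion: even a 1% misidentification rate dominates once the background is large enough.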
Bayesian inference
[Figure labels: π, μ, μ]
Dependence on background ratio