The asymmetric uncertainties on data points in Roofit Few slides on ‘easy/trivial’ topic that will hopefully leave you confused di-muon invariant mass [MeV/c2] Candidates (50 MeV/c2) CERN seminar Flavio Archilli 14-2-2017 Ivo van Vulpen (Nikhef/UvA, ATLAS)
How are our asymmetric uncertainties on data points defined ? Not ± √n 4 Candidates (50 MeV/c2) di-muon invariant mass [MeV/c2] 2/20
statistics 3/20
You are responsible for how you summarize your measurement Solving statistics problems in general Discussions (in large experiments): - strong opinions, very outspoken ‘experts’ - use RooSomethingFancy: made by experts & debugged - let’s do what we did last time, let’s be conservative - … You are responsible for how you summarize your measurement The tools you use have assumptions, biases, pitfalls, … , so you know best how others (and you) should interpret your measurement 4/20
Six reasonable options Discuss in your group which one you prefer How should we represent this measurement ? 4 Candidates (50 MeV/c2) di-muon invariant mass [MeV/c2] Basis for this talk: http://www.nikhef.nl/~ivov/Statistics/PoissonError/BobCousins_Poisson.pdf 5/20
Option 1: assign NO uncertainty ± 0.00 The number of observed events is what it is: 4 The uncertainty is in the interpretation step, i.e. on the model-parameters that you extract from it 6/20
Comfortable territory: Poisson distribution Probability to observe n events when λ are expected Binomial with n∞, p0 en np=λ Number of observed events #observed λ hypothesis P(Nobs|λ=4.9) λ=4.90 4 7/20
Properties of the Poisson distribution the famous √n properties λ=1.00 (1) Mean: (2) Variance: (3) Most likely: first integer ≤ λ λ=4.90 λ=5.00 8/20
Option 2: the famous sqrt(n) +2.00 -2.00 Poisson variance for λ equal to measured number of events … but Poisson distribution is asymetric: 9/20
Just treat it like a normal measurement What you have: 1) construct the Likelihood λ as free parameter 2) Find value of λ that maximizes the Likelihood 3) Determine error interval: Δ(-2Log(Lik.)) = +1 10/20
Likelihood Likelihood Poisson: probability to observe n events when λ are expected Likelihood P(Nobs=4|λ) #observed λ hypothesis λ (hypothesis) fixed scan 11/20
Option 3: Likelihood 12/20 P(Nobs=4|λ) Log(Likelihood) +2.35 P(Nobs=4|λ) -1.68 λ (hypothesis) -2Log(L) Log(Likelihood) -1.68 +2.35 -2Log(P(Nobs=4|λ)) ΔL=+1 12/20 2.32 4.00 6.35
Bayesian: statement on value of λ What you have: What you want: Likelihood pdf for λ Probability to observe Nobs events … given a specific hypothesis (λ) What can we say about theory (λ) … given # of observed events (4) 13/20
Bayesian: statement on value of λ Choice of prior P(λ): (basis for religious wars) Here: assume all values for λ equally likely Likelihood Probability density function λ (hypothesis) λ (hypothesis) Posterior PDF for λ Integrate to get confidence interval
Option 4 & 5: representing the Bayesian posterior Bayes central +3.16 68% 16% 16% -1.16 Bayes equal probability +2.40 68% 8.3% 23.5% -1.71 15/20
Option 6: Frequentist approach Find values of λ that are on border of being compatible with observed #events λ=7.16 If λ > 7.16 then probability to observe 4 events (or less) <16% 16% Note: also uses ‘data you didn’t observe’, i.e. a bit like definition of significance smallest λ (>n) for which P(n≤nobs|λ) ≤ 0.159 +3.16 λ=2.09 -1.91 largest λ (<n) for which P(n≥nobs|λ) ≤ 0.159 16% 16/20
The six options 8 7 -1.91 +3.16 6 +2.00 +2.35 +3.16 +2.40 5 4 3 -2.00 -1.68 -1.16 -1.71 2 1 Nothing Poisson Likelihood Bayesian central Bayesian equal prob Frequentist Δλ: 0.00 4.00 4.03 4.32 4.11 5.07 17/20
The six options Δλ: 0.00 4.00 4.03 4.32 4.11 5.07 RooFit default 18/20 -1.91 +3.16 6 +2.00 +2.35 +3.16 +2.40 5 4 3 -2.00 -1.68 -1.16 -1.71 2 1 Nothing Poisson Likelihood Bayesian central Bayesian equal prob Frequentist Δλ: 0.00 4.00 4.03 4.32 4.11 5.07 18/20
Discussions in other experiment: Example discussion in CDF: https://www-cdf.fnal.gov/physics/statistics/notes/pois_eb.txt We feel it is important to have a relatively simple rule that is readily understood by readers. A reader does not want to have to work hard simply to understand what an error bar on a plot represents. <…> Since the use of +-sqrt(n) is so widespread, the argument in favour of an alternative should be convincing in order for it to be adopted. 19/20
Summary di-muon invariant mass [MeV/c2] Candidates (50 MeV/c2) +3.16 4 -1.91 - You now know how Roofit ‘Poisson errors’ are defined Note: choice has no impact on likelihood fits - Do you agree with RooFit default? What about empty bins then? - Perfect topic to confuse and irritate people over coffee do it!