Statistics for the LHC Lecture 3: Setting limits

Slides:



Advertisements
Similar presentations
Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 1 Using the Profile Likelihood in Searches for New Physics arXiv:
Advertisements

27 th March CERN Higgs searches: CL s W. J. Murray RAL.
G. Cowan Statistics for HEP / LAL Orsay, 3-5 January 2012 / Lecture 2 1 Statistical Methods for Particle Physics Lecture 2: Tests based on likelihood ratios.
Statistics In HEP 2 Helge VossHadron Collider Physics Summer School June 8-17, 2011― Statistics in HEP 1 How do we understand/interpret our measurements.
G. Cowan Statistics for HEP / NIKHEF, December 2011 / Lecture 3 1 Statistical Methods for Particle Physics Lecture 3: Limits for Poisson mean: Bayesian.
1 Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London IoP Half.
G. Cowan Statistical methods for HEP / Freiburg June 2011 / Lecture 3 1 Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion.
G. Cowan RHUL Physics Comment on use of LR for limits page 1 Comment on definition of likelihood ratio for limits ATLAS Statistics Forum CERN, 2 September,
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan 2011 CERN Summer Student Lectures on Statistics / Lecture 41 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability.
G. Cowan Statistics for HEP / NIKHEF, December 2011 / Lecture 2 1 Statistical Methods for Particle Physics Lecture 2: Tests based on likelihood ratios.
G. Cowan Statistical Data Analysis / Stat 4 1 Statistical Data Analysis Stat 4: confidence intervals, limits, discovery London Postgraduate Lectures on.
G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits Lecture 2: Tests based on likelihood.
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 3 1 Statistical Methods for Discovery and Limits Lecture 3: Limits for Poisson mean: Bayesian.
Graduierten-Kolleg RWTH Aachen February 2014 Glen Cowan
880.P20 Winter 2006 Richard Kass 1 Confidence Intervals and Upper Limits Confidence intervals (CI) are related to confidence limits (CL). To calculate.
G. Cowan SUSSP65, St Andrews, August 2009 / Statistical Methods 2 page 1 Statistical Methods in Particle Physics Lecture 2: Limits and Discovery.
G. Cowan 2009 CERN Summer Student Lectures on Statistics1 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability densities,
G. Cowan Weizmann Statistics Workshop, 2015 / GDC Lecture 31 Statistical Methods for Particle Physics Lecture 3: asymptotics I; Asimov data set Statistical.
G. Cowan RHUL Physics page 1 Status of search procedures for ATLAS ATLAS-CMS Joint Statistics Meeting CERN, 15 October, 2009 Glen Cowan Physics Department.
G. Cowan, RHUL Physics Discussion on significance page 1 Discussion on significance ATLAS Statistics Forum CERN/Phone, 2 December, 2009 Glen Cowan Physics.
G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 41 Statistics for the LHC Lecture 4: Bayesian methods and further topics Academic.
1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.
G. Cowan Lectures on Statistical Data Analysis Lecture 4 page 1 Lecture 4 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
G. Cowan Statistical methods for particle physics / Cambridge Recent developments in statistical methods for particle physics HEP Phenomenology.
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
G. Cowan, RHUL Physics Statistics for early physics page 1 Statistics jump-start for early physics ATLAS Statistics Forum EVO/Phone, 4 May, 2010 Glen Cowan.
G. Cowan RHUL Physics Status of Higgs combination page 1 Status of Higgs Combination ATLAS Higgs Meeting CERN/phone, 7 November, 2008 Glen Cowan, RHUL.
In Bayesian theory, a test statistics can be defined by taking the ratio of the Bayes factors for the two hypotheses: The ratio measures the probability.
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Statistical methods for HEP / Freiburg June 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits in HEP Experiments Day 2: Discovery.
G. Cowan RHUL Physics Statistical Issues for Higgs Search page 1 Statistical Issues for Higgs Search ATLAS Statistics Forum CERN, 16 April, 2007 Glen Cowan.
1 Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London IoP Half.
G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 21 Statistics for the LHC Lecture 2: Discovery Academic Training Lectures CERN,
G. Cowan SOS 2010 / Statistical Tests and Limits -- lecture 31 Statistical Tests and Limits Lecture 3 IN2P3 School of Statistics Autrans, France 17—21.
G. Cowan SLAC Statistics Meeting / 4-6 June 2012 / Two Developments 1 Two developments in discovery tests: use of weighted Monte Carlo events and an improved.
G. Cowan CERN Academic Training 2012 / Statistics for HEP / Lecture 21 Statistics for HEP Lecture 2: Discovery and Limits Academic Training Lectures CERN,
Discussion on significance
arXiv:physics/ v3 [physics.data-an]
Statistics for HEP Lecture 3: More on Discovery and Limits
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
LAL Orsay, 2016 / Lectures on Statistics
Comment on Event Quality Variables for Multivariate Analyses
Lecture 4 1 Probability (90 min.)
School on Data Science in (Astro)particle Physics
Recent developments in statistical methods for particle physics
Statistical Methods for the LHC Discovering New Physics
Statistical Methods for Particle Physics (II)
TAE 2017 / Statistics Lecture 3
Statistical Methods for Particle Physics Lecture 3: further topics
HCPSS 2016 / Statistics Lecture 1
TAE 2018 / Statistics Lecture 3
Statistical Data Analysis / Stat 4
Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.
TAE 2018 Benasque, Spain 3-15 Sept 2018 Glen Cowan Physics Department
Statistical Tests and Limits Lecture 2: Limits
Computing and Statistical Data Analysis / Stat 7
TRISEP 2016 / Statistics Lecture 1
TAE 2017 / Statistics Lecture 2
Statistical Methods for HEP Lecture 3: Discovery and Limits
Decomposition of Stat/Sys Errors
Introduction to Statistics − Day 4
Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.
Statistics for HEP Roger Barlow Manchester University
Computing and Statistical Data Analysis / Stat 10
Statistical Methods for Particle Physics Lecture 3: further topics
Presentation transcript:

Statistics for the LHC Lecture 3: Setting limits Academic Training Lectures CERN, 14–17 June, 2010 indico.cern.ch/conferenceDisplay.py?confId=77830 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3 TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAA

Outline Lecture 1: Introduction and basic formalism Probability, statistical tests, parameter estimation. Lecture 2: Discovery Quantifying discovery significance and sensitivity Systematic uncertainties (nuisance parameters) Lecture 3: Exclusion limits Frequentist and Bayesian intervals/limits Lecture 4: Further topics More on Bayesian methods / model selection G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Interval estimation — introduction In addition to a ‘point estimate’ of a parameter we should report an interval reflecting its statistical uncertainty. Desirable properties of such an interval may include (PDG): communicate objectively the result of the experiment; have a given probability of containing the true parameter; provide information needed to draw conclusions about the parameter possibly incorporating stated prior beliefs. Often use +/- the estimated standard deviation of the estimator. In some cases, however, this is not adequate: estimate near a physical boundary, e.g., an observed event rate consistent with zero. We will look at both Frequentist and Bayesian intervals. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Frequentist confidence intervals Consider an estimator for a parameter q and an estimate We also need for all possible q its sampling distribution Specify upper and lower tail probabilities, e.g., a = 0.05, b = 0.05, then find functions ua(q) and vb(q) such that: G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Confidence interval from the confidence belt The region between ua(q) and vb(q) is called the confidence belt. Find points where observed estimate intersects the confidence belt. This gives the confidence interval [a, b] Confidence level = 1 - a - b = probability for the interval to cover true value of the parameter (holds for any possible true q). G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Confidence intervals in practice The recipe to find the interval [a, b] boils down to solving → a is hypothetical value of q such that → b is hypothetical value of q such that G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Confidence intervals by inverting a test Confidence intervals for a parameter q can be found by defining a test of the hypothesized value q (do this for all q): Specify values of the data that are ‘disfavoured’ by q (critical region) such that P(data in critical region) ≤ g for a prespecified g, e.g., 0.05 or 0.1. If data observed in the critical region, reject the value q . Now invert the test to define a confidence interval as: set of q values that would not be rejected in a test of size g (confidence level is 1 - g ). The interval will cover the true value of q with probability ≥ 1 - g. Equivalent to confidence belt construction; confidence belt is acceptance region of a test. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Relation between confidence interval and p-value Equivalently we can consider a significance test for each hypothesized value of q, resulting in a p-value, pq.. If pq < g, then we reject q. The confidence interval at CL = 1 – g consists of those values of q that are not rejected. E.g. an upper limit on q is the greatest value for which pq ≥ g. In practice find by setting pq = g and solve for q. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Meaning of a confidence interval G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Central vs. one-sided confidence intervals Intervals from other types of tests (e.g. likeihood ratio) can have a, b variable depending on the parameter, but fixed 1 – a – b. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Setting limits: Poisson data with background Count n events, e.g., in fixed time or integrated luminosity. s = expected number of signal events b = expected number of background events n ~ Poisson(s+b): Suppose the number of events found is roughly equal to the expected number of background events, e.g., b = 4.6 and we observe nobs = 5 events. The evidence for the presence of signal events is not statistically significant, → set upper limit on the parameter s. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Upper limit for Poisson parameter Find the hypothetical value of s such that there is a given small probability, say, g = 0.05, to find as few events as we did or less: Solve numerically for s = sup, this gives an upper limit on s at a confidence level of 1-g. Example: suppose b = 0 and we find nobs = 0. For 1-g = 0.95, → G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Calculating Poisson parameter limits To solve for slo, sup, can exploit relation to 2 distribution: Quantile of 2 distribution For low fluctuation of n this can give negative result for sup; i.e. confidence interval is empty. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Limits near a physical boundary Suppose e.g. b = 2.5 and we observe n = 0. If we choose CL = 0.9, we find from the formula for sup Physicist: We already knew s ≥ 0 before we started; can’t use zero- length interval to report result of expensive experiment! Statistician: The interval is designed to cover the true value only 90% of the time — this was clearly not one of those times. Not uncommon dilemma when limit of parameter is close to a physical boundary. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Expected limit for s = 0 Physicist: I should have used CL = 0.95 — then sup = 0.496 Even better: for CL = 0.917923 we get sup = 10-4 ! Reality check: with b = 2.5, typical Poisson fluctuation in n is at least √2.5 = 1.6. How can the limit be so low? Look at the mean (or median) limit for the no-signal hypothesis (s = 0) (sensitivity). Distribution of 95% CL limits with b = 2.5, s = 0. Mean upper limit = 4.44 G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Likelihood ratio limits (Feldman-Cousins) Define likelihood ratio for hypothesized parameter value s: Here is the ML estimator, note Critical region defined by low values of likelihood ratio. Resulting intervals can be one- or two-sided (depending on n). (Re)discovered for HEP by Feldman and Cousins, Phys. Rev. D 57 (1998) 3873. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Feldman-Cousins intervals for Poisson mean Upper/lower edge of intervals for s from n ~ Poisson(s+b) (On plots, m = s.) Feldman & Cousins, PRD 57 (1998) 3873 G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

More on intervals from LR test (Feldman-Cousins) Caveat with coverage: suppose we find n >> b. Usually one then quotes a measurement: If, however, n isn’t large enough to claim discovery, one sets a limit on s. FC pointed out that if this decision is made based on n, then the actual coverage probability of the interval can be less than the stated confidence level (‘flip-flopping’). FC intervals remove this, providing a smooth transition from 1- to 2-sided intervals, depending on n. But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL, p-value of s=0 still substantial. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Generic search (again) Recall from yesterday the prototype analysis with a primary measured histogram where we search for signal: Possibly as well a subsidiary measurement to constrain some of the nuisance parameters (e.g., background rate/shape). Model ni, mi as Poisson distributed; likelihood function is G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Profile likelihood ratio for upper limits For purposes of setting an upper limit on m use where Note for purposes of setting an upper limit, one does not regard an upwards fluctuation of the data as representing incompatibility with the hypothesized m. Note also here we allow the estimator for m be negative (but must be positive). G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Alternative test statistic for upper limits Assume physical signal model has m > 0, therefore if estimator for m comes out negative, the closest physical model has m = 0. Therefore could also measure level of discrepancy between data and hypothesized m with This is in fact the test statistic used in the Higgs CSC combination. Performance not identical to but very close to qm (of previous slide). qm is simpler in important ways (Fayard, Nisati et al.) G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Relation between test statistics and ~ Assuming the Wald approximation for – 2lnl(m), qm and qm both have monotonic relation with m. ~ And therefore quantiles of qm, qm can be obtained directly from those of m (which is Gaussian). ˆ ̃ G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Distribution of qm Similar results for qm G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Distribution of qm ̃ Similar results for qm ̃ G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Monte Carlo test of asymptotic formulae O. Vitells, E. Gross G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Example: exclusion sensitivity Median p-value of m = 1 hypothesis versus Higgs mass assuming background-only data (ATLAS, arXiv:0901.0512). G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

The “CLs” issue When the b and s+b hypotheses are well separated, there is a high probability of excluding the s+b hypothesis (ps+b < a) if in fact the data contain background only (power of test of s+b relative to the alternative b is high). f (Q|b) f (Q| s+b) ps+b pb G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

The “CLs” issue (2) But if the two distributions are close to each other (e.g., we test a Higgs mass far above the accessible kinematic limit) then there is a non-negligible probability of rejecting s+b even though we have low sensitivity (test of s+b low power relative to b). In limiting case of no sensitivity, the distri- butions coincide and the probability of exclusion = a (e.g. 0.05). But we should not regard a model as excluded if we have no sensitivity to it! f (Q|s+b) f (Q|b) ps+b pb G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

The CLs solution The CLs solution (A. Read et al.) is to base the test not on the usual p-value (CLs+b), but rather to divide this by CLb (one minus the background of the b-only hypothesis, i.e., f (q|s+b) Define: f (q|b) 1-CLb = pb CLs+b = ps+b Reject s+b hypothesis if: Reduces “effective” p-value when the two distributions become close (prevents exclusion if sensitivity is low). G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

CLs discussion In the CLs method the p-value is reduced according to the recipe Statistics community does not smile upon ratio of p-values An alternative would to regard parameter m as excluded if: (a) p-value of m < 0.05 (b) power of test of m with respect to background-only exceeds a specified threshold i.e. “Power Constrained Limits”. Coverage is 1-a if one is sensitive to the tested parameter (sufficient power) otherwise never exclude (coverage is then 100%). Ongoing study. In any case should produce CLs result for purposes of comparison with other experiments. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Wrapping up lecture 3 General concept of a confidence interval Constructed to cover true value of the parameter with specified probability. Interval is random, not the parameter. Intervals (limits) from inversion of LR test. CLs issue: In case of no sensitivity, false exclusion rate = 1 - CL CLs solution: ps → ps / (1 – pb) Alternative solution: exclude parameter only if power of test exceeds minimum threshold. Next: Bayesian limits, more on systematics G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Extra slides G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Intervals from the likelihood function In the large sample limit it can be shown for ML estimators: (n-dimensional Gaussian, covariance V) defines a hyper-ellipsoidal confidence region, If then G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Distance between estimated and true q G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Approximate confidence regions from L( ) So the recipe to find the confidence region with CL = 1- is: For finite samples, these are approximate confidence regions. Coverage probability not guaranteed to be equal to 1- ; no simple theorem to say by how far off it will be (use MC). Remember here the region is random, not the parameter. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Example of interval from ln L( ) For n=1 parameter, CL = 0.683, Qg = 1. G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Properties of upper limits Example: take b = 5.0, 1 -  = 0.95 Upper limit sup vs. n Mean upper limit vs. s (N.B. here Feldman-Cousins “upper-limit” refers to the upper edge of the interval, which can be two-sided.) G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Upper limit versus b Feldman & Cousins, PRD 57 (1998) 3873 If n = 0 observed, should upper limit depend on b? Classical: yes Bayesian: no FC: yes G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3

Coverage probability of intervals Because of discreteness of Poisson data, probability for interval to include true value in general > confidence level (‘over-coverage’) G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 3