Statistical Issues for ATLAS Physics

Slides:

Advertisements

Similar presentations

1 Statistical Methods in HEP, in particular... Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London

Advertisements

G. Cowan RHUL Physics Bayesian methods for HEP / DESY Terascale School page 1 Bayesian statistical methods for HEP Terascale Statistics School DESY, Hamburg.

1 Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London IoP Half.

Machine Learning CMPT 726 Simon Fraser University

G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.

G. Cowan 2011 CERN Summer Student Lectures on Statistics / Lecture 41 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability.

G. Cowan RHUL Physics Statistical Methods for Particle Physics / 2007 CERN-FNAL HCP School page 1 Statistical Methods for Particle Physics (2) CERN-FNAL.

G. Cowan Lectures on Statistical Data Analysis Lecture 14 page 1 Statistical Data Analysis: Lecture 14 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis Lecture 13 page 1 Statistical Data Analysis: Lecture 13 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan RHUL Physics Statistical Methods for Particle Physics / 2007 CERN-FNAL HCP School page 1 Statistical Methods for Particle Physics CERN-FNAL Hadron.

G. Cowan RHUL Physics Moriond QCD 2007 page 1 Bayesian analysis and problems with the frequentist approach Rencontres de Moriond (QCD) La Thuile,

G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.

G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 3 1 Statistical Methods for Discovery and Limits Lecture 3: Limits for Poisson mean: Bayesian.

Lecture II-2: Probability Review

Statistical Analysis of Systematic Errors and Small Signals Reinhard Schwienhorst University of Minnesota 10/26/99.

G. Cowan SUSSP65, St Andrews, August 2009 / Statistical Methods 1 page 1 Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65.

G. Cowan 2009 CERN Summer Student Lectures on Statistics1 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability densities,

G. Cowan Lectures on Statistical Data Analysis Lecture 3 page 1 Lecture 3 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.

1 Introduction to Statistical Methods for High Energy Physics Glen Cowan 2006 CERN Summer Student Lectures CERN Summer Student Lectures on Statistics Glen.

G. Cowan Lectures on Statistical Data Analysis Lecture 1 page 1 Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics;

G. Cowan CLASHEP 2011 / Topics in Statistical Data Analysis / Lecture 1 1 Topics in Statistical Data Analysis for HEP Lecture 1: Parameter Estimation CERN.

G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1 Lecture 2 1 Probability Definition, Bayes’ theorem, probability densities and their properties,

Uniovi1 Some statistics books, papers, etc. G. Cowan, Statistical Data Analysis, Clarendon, Oxford, 1998 see also R.J. Barlow,

G. Cowan RHUL Physics Bayesian Higgs combination page 1 Bayesian Higgs combination based on event counts (follow-up from 11 May 07) ATLAS Statistics Forum.

Chapter 20 Classification and Estimation Classification – Feature selection Good feature have four characteristics: –Discrimination. Features.

1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.

G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.

1 Introduction to Statistics − Day 3 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.

G. Cowan Lectures on Statistical Data Analysis Lecture 4 page 1 Lecture 4 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.

G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.

1 Introduction to Statistics − Day 2 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.

G. Cowan Lectures on Statistical Data Analysis Lecture 9 page 1 Statistical Data Analysis: Lecture 9 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan SUSSP65, St Andrews, August 2009 / Statistical Methods 1 page 1 Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65.

G. Cowan Lectures on Statistical Data Analysis Lecture 6 page 1 Statistical Data Analysis: Lecture 6 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.

G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability Definition, Bayes’ theorem, probability densities and their properties,

G. Cowan RHUL Physics Statistical Issues for Higgs Search page 1 Statistical Issues for Higgs Search ATLAS Statistics Forum CERN, 16 April, 2007 Glen Cowan.

1 Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London IoP Half.

G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability Definition, Bayes’ theorem, probability densities and their properties,

Statistics for HEP Lecture 1: Introduction and basic formalism

Discussion on significance

Lecture 1.31 Criteria for optimal reception of radio signals.

arXiv:physics/ v3 [physics.data-an]

Statistics for HEP Lecture 1: Introduction and basic formalism

Statistics for the LHC Lecture 3: Setting limits

Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.

Comment on Event Quality Variables for Multivariate Analyses

Lecture 4 1 Probability (90 min.)

Introduction to Instrumentation Engineering

Hidden Markov Models Part 2: Algorithms

Graduierten-Kolleg RWTH Aachen February 2014 Glen Cowan

Computing and Statistical Data Analysis / Stat 8

Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.

Summarizing Data by Statistics

Lecture 3 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.

Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.

Statistics for Particle Physics Lecture 3: Parameter Estimation

TAE 2018 Benasque, Spain 3-15 Sept 2018 Glen Cowan Physics Department

Statistical Tests and Limits Lecture 2: Limits

Computing and Statistical Data Analysis / Stat 6

Computing and Statistical Data Analysis / Stat 7

Introduction to Unfolding

Statistics for HEP Lecture 1: Introduction and basic formalism

Parametric Methods Berlin Chen, 2005 References:

Introduction to Statistics − Day 4

Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.

Computing and Statistical Data Analysis / Stat 10

Presentation transcript:

Statistical Issues for ATLAS Physics ATLAS Statistics Workshop CERN, 18-19 January, 2007 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Outline 1 Probability Frequentist vs. Subjective (Bayesian) 2 Statistical tests multivariate methods, goodness-of-fit tests 3 Parameter estimation maximum likelihood, least squares, variance of estimators 4 Interval estimation setting limits 5 Further topics systematic errors, MCMC G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Frequentist Statistics − general philosophy In frequentist statistics, probabilities are associated only with the data, i.e., outcomes of repeatable observations. Probability = limiting frequency Probabilities such as P (Higgs boson exists), P (0.117 < as < 0.121), etc. are either 0 or 1, but we don’t know which. The tools of frequentist statistics tell us what to expect, under the assumption of certain probabilities, about hypothetical repeated observations. The preferred theories (models, hypotheses, ...) are those for which our observations would be considered ‘usual’. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Bayesian Statistics − general philosophy In Bayesian statistics, interpretation of probability extended to degree of belief (subjective probability). Use this for hypotheses: probability of the data assuming hypothesis H (the likelihood) prior probability, i.e., before seeing the data posterior probability, i.e., after seeing the data normalization involves sum over all possible hypotheses Bayesian methods can provide more natural treatment of non- repeatable phenomena: systematic uncertainties, probability that Higgs boson exists,... No golden rule for priors (“if-then” character of Bayes’ thm.) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Statistical tests (in a particle physics context) Suppose the result of a measurement for an individual event is a collection of numbers x1 = number of muons, x2 = mean pt of jets, x3 = missing energy, ... follows some n-dimensional joint pdf, which depends on the type of event produced, i.e., was it For each reaction we consider we will have a hypothesis for the pdf of , e.g., etc. Often call H0 the signal hypothesis (the event type we want); H1, H2, ... are background hypotheses. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Selecting events Suppose we have a data sample with two kinds of events, corresponding to hypotheses H0 and H1 and we want to select those of type H0. Each event is a point in space. What decision boundary should we use to accept/reject events as belonging to event type H0? H1 Probably start with cuts: H0 accept G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Other ways to select events Or maybe use some other sort of decision boundary: linear or nonlinear H1 H1 H0 H0 accept accept How can we do this in an ‘optimal’ way? G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Test statistics Construct a ‘test statistic’ of lower dimension (e.g. scalar) Goal is to compactify data without losing ability to discriminate between hypotheses. We can work out the pdfs Decision boundary is now a single cut on t. This effectively divides the sample space into two regions where we either: accept H0 (acceptance region) or reject it (critical region). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Significance level and power of a test Probability to reject H0 if it is true (error of the 1st kind): (significance level) Probability to accept H0 if H1 is true (error of the 2nd kind): (1 - b = power) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Efficiency, purity, etc. Signal efficiency Background efficiency Expected number of signal events: s = s s L Expected number of background events: b =  b b L Prior probabilities ps, pb proportional to cross sections, so for e.g. the signal purity, G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Constructing a test statistic How can we select events in an ‘optimal way’? Neyman-Pearson lemma (proof in Brandt Ch. 8) states: To get the lowest eb for a given es (highest power for a given significance level), choose acceptance region such that where c is a constant which determines es. Equivalently, optimal scalar test statistic is N.B. any monotonic function of this is just as good. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Purity vs. efficiency — optimal trade-off Consider selecting n events: expected numbers s from signal, b from background; → n ~ Poisson (s + b) Suppose b is known and goal is to estimate s with minimum relative statistical error. Take as estimator: Variance of Poisson variable equals its mean, therefore → So we should maximize equivalent to maximizing product of signal efficiency  purity. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Why Neyman-Pearson doesn’t always help The problem is that we usually don’t have explicit formulae for the pdfs Instead we may have Monte Carlo models for signal and background processes, so we can produce simulated data, and enter each event into an n-dimensional histogram. Use e.g. M bins for each of the n dimensions, total of Mn cells. But n is potentially large, → prohibitively large number of cells to populate with Monte Carlo data. Compromise: make Ansatz for form of test statistic with fewer parameters; determine them (e.g. using MC) to give best discrimination between signal and background. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Fisher discriminant Assume linear test statistic, H1 and maximize ‘separation’ between the two classes: H0 Corresponds to a linear decision boundary. accept Equivalent to Neyman-Pearson if the signal and background pdfs are multivariate Gaussian with equal covariances; otherwise not optimal, but still often a simple, practical solution. Sometimes first transform data to better approximate Gaussians. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Nonlinear test statistics The optimal decision boundary may not be a hyperplane, → nonlinear test statistic Multivariate statistical methods are a Big Industry: H1 Neural Networks, Support Vector Machines, Boosted decision trees, Kernel density methods, ... H0 accept Particle Physics can benefit from progress in Machine Learning. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Neural networks: the multi-layer perceptron Use e.g. logistic sigmoid activation function, Define values for ‘hidden nodes’ The network output is given by G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Neural network discussion Why not use all of the available input variables? Fewer inputs → fewer parameters to be adjusted, → parameters better determined for finite training data. Some inputs may be highly correlated → drop all but one. Some inputs may contain little or no discriminating power between the hypotheses → drop them. NN exploits higher moments (nonlinear features) of joint pdf f(x|H), but these may not be well modeled in training data. Better to have simper t(x) where you can ‘understand what it’s doing’. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Neural network discussion (2) The purpose of the statistical test is often to select objects for further study and then measure their properties. Need to avoid input variables that are correlated with the properties of the selected objects that you want to study. (Not always easy; correlations may be poorly known.) Some NN references: L. Lönnblad et al., Comp. Phys. Comm., 70 (1992) 167; C. Peterson et al., Comp. Phys. Comm., 81 (1994) 185; C.M. Bishop, Neural Networks for Pattern Recognition, OUP (1995); John Hertz et al., Introduction to the Theory of Neural Computation, Addison-Wesley, New York (1991). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Testing goodness-of-fit Suppose hypothesis H predicts pdf for a set of observations We observe a single point in this space: What can we say about the validity of H in light of the data? Decide what part of the data space represents less compatibility with H than does the point more compatible with H less compatible with H (Not unique!) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

p-values Express ‘goodness-of-fit’ by giving the p-value for H: p = probability, under assumption of H, to observe data with equal or lesser compatibility with H relative to the data we got. This is not the probability that H is true! In frequentist statistics we don’t talk about P(H) (unless H represents a repeatable observation). In Bayesian statistics we do; use Bayes’ theorem to obtain where p (H) is the prior probability for H. For now stick with the frequentist approach; result is p-value, regrettably easy to misinterpret as P(H). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The significance of an observed signal Suppose we observe n events; these can consist of: nb events from known processes (background) ns events from a new process (signal) If ns, nb are Poisson r.v.s with means s, b, then n = ns + nb is also Poisson, mean = s + b: Suppose b = 0.5, and we observe nobs = 5. Should we claim evidence for a new discovery? Give p-value for hypothesis s = 0: G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The significance of a peak Suppose we measure a value x for each event and find: Each bin (observed) is a Poisson r.v., means are given by dashed lines. In the two bins with the peak, 11 entries found with b = 3.2. The p-value for the s = 0 hypothesis is: G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The significance of a peak (2) But... did we know where to look for the peak? → give P(n ≥ 11) in any 2 adjacent bins Is the observed width consistent with the expected x resolution? → take x window several times the expected resolution How many bins  distributions have we looked at? → look at a thousand of them, you’ll find a 10-3 effect Did we adjust the cuts to ‘enhance’ the peak? → freeze cuts, repeat analysis with new data How about the bins to the sides of the peak... (too low!) Should we publish???? G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Making a discovery Often compute p-value of the ‘background only’ hypothesis H0 using test variable related to a characteristic of the signal. p-value = Probability to see data as incompatible with H0, or more so, relative to the data observed. Requires definition of ‘incompatible with H0’ HEP folklore: claim discovery if p-value equivalent to a 5 fluctuation of Gaussian variable (one-sided) Actual p-value at which discovery becomes believable will depend on signal in question (subjective) Why not do Bayesian analysis? Often don’t know how to assign meaningful prior probabilities G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Parameter estimation Consider a pdf containing one or more undetermined parameters: r.v. parameter Suppose we have a sample of observed values: We want to find some function of the data to estimate the parameter(s): ← estimator written with a hat No golden rule -- construct estimators to possess ‘desirable’ properties (small or zero bias, small variance, etc.) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Properties of estimators If we were to repeat the entire measurement, the estimates from each would follow a pdf: best large variance biased We want small (or zero) bias (systematic error): → average of repeated measurements should tend to true value. And we want a small variance (statistical error): → small bias & variance are in general conflicting criteria G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The likelihood function Suppose the output of our measurement is some set of quantities x (here symbolizes the whole set of measured values). A hypothesized model will predict the joint pdf for x, f (x; q). The model may contain undetermined parameters q = (q1, ..., qm). Now evaluate f (x; q) with the data sample obtained and regard it as a function of the parameter(s). This is the likelihood function: If the data are n independent and identically distributed (iid) observations of x: x1, ..., xn, then the joint pdf of x factorizes and G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Maximum likelihood estimators If the hypothesized q is close to the true value, then we expect a high probability to get data like that which we actually found. So we define the maximum likelihood (ML) estimator(s) to be the parameter value(s) for which the likelihood is maximum. ML estimators not guaranteed to have any ‘optimal’ properties, (but in practice they’re very good). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The method of least squares Suppose we measure N values, y1, ..., yN, assumed to be independent Gaussian r.v.s with Assume known values of the control variable x1, ..., xN and known variances We want to estimate q, i.e., fit the curve to the data points. The likelihood function is G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The method of least squares (2) The log-likelihood function is therefore So maximizing the likelihood is equivalent to minimizing Minimum of this quantity defines the least squares estimator, even for cases when data are not Gaussian. Outliers arising from non-Gaussian errors can have unexpectedly large weight in LS fits. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Parameter estimation: discussion There are some cases where we would intentionally choose a biased (non-ML) estimator, e.g., in unfolding problems where we accept a small bias in exchange for a large reduction in variance. And there could be cases where, for sake of convenience, we choose LS rather than ML even for non-Gaussian data. And occasionally we choose an unbiased estimator rather than ML if it’s not too difficult, e.g., the sample variance. But usually ML estimators are about as good as you can do. “No one ever got fired for maximizing the likelihood.” — anonymous G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Confidence intervals Frequentist intervals (limits) for a parameter q can be found by defining a test of the hypothesized value q (do this for all q): Specify values of the data x that are ‘disfavoured’ by q (critical region) such that P(x in critical region) ≤ g for a prespecified g, e.g., 0.05 or 0.1. If x is observed in the critical region, reject the value q. Now invert the test to define a confidence interval as: set of q values that would not be rejected in a test with significance level g (confidence level is 1 - g ). The interval will cover the true value of q with probability ≥ 1 - g. Equivalent to Neyman confidence-belt construction. Confidence belt is acceptance region of test. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Poisson data with background Count n events, e.g., in fixed time or integrated luminosity. s = expected number of signal events b = expected number of background events n ~ Poisson(s+b): Sometimes b known, other times it is in some way uncertain. Goals: (i) convince people that s ≠ 0 (discovery); (ii) measure or place limits on s, taking into consideration the uncertainty in b. Widely discussed in HEP community, see e.g. proceedings of PHYSTAT meetings, Durham, Fermilab, CERN workshops... G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Setting limits: ‘classical method’ E.g. for upper limit on s, take critical region to be low values of n, limit sup at confidence level 1 - b thus found from Similarly for lower limit at confidence level 1 - a, Sometimes choose a = b = g /2 → central confidence interval. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Calculating classical limits To solve for slo, sup, can exploit relation to 2 distribution: Quantile of 2 distribution For low fluctuation of n this can give negative result for sup; i.e. confidence interval is empty. b G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

The Bayesian approach In Bayesian statistics need to start with ‘prior pdf’ p(q), this reflects degree of belief about q before doing the experiment. Bayes’ theorem tells how our beliefs should be updated in light of the data x: Integrate posterior pdf p(q | x) to give interval with any desired probability content. For e.g. Poisson parameter 95% CL upper limit from G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Bayesian prior for Poisson parameter Include knowledge that s ≥0 by setting prior p(s) = 0 for s<0. Often try to reflect ‘prior ignorance’ with e.g. Not normalized but this is OK as long as L(s) dies off for large s. Not invariant under change of parameter — if we had used instead a flat prior for, say, the mass of the Higgs boson, this would imply a non-flat prior for the expected number of Higgs events. Doesn’t really reflect a reasonable degree of belief, but often used as a point of reference; or viewed as a recipe for producing an interval whose frequentist properties can be studied (coverage will depend on true s). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Bayesian interval with flat prior for s Solve numerically to find limit sup. For special case b = 0, Bayesian upper limit with flat prior numerically same as classical case (‘coincidence’). Otherwise Bayesian limit is everywhere greater than classical (‘conservative’). Never goes negative. Doesn’t depend on b if n = 0. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Likelihood ratio limits (Feldman-Cousins) Define likelihood ratio for hypothesized parameter value s: Here is the ML estimator, note Critical region defined by low values of likelihood ratio. Resulting intervals can be one- or two-sided (depending on n). (Re)discovered for HEP by Feldman and Cousins, Phys. Rev. D 57 (1998) 3873. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

More on intervals from LR test (Feldman-Cousins) Caveat with coverage: suppose we find n >> b. Usually one then quotes a measurement: If, however, n isn’t large enough to claim discovery, one sets a limit on s. FC pointed out that if this decision is made based on n, then the actual coverage probability of the interval can be less than the stated confidence level (‘flip-flopping’). FC intervals remove this, providing a smooth transition from 1- to 2-sided intervals, depending on n. But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL, p-value of s=0 still substantial. Part of upper-limit ‘wasted’? G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Properties of upper limits Example: take b = 5.0, 1 -  = 0.95 Upper limit sup vs. n Mean upper limit vs. s G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Upper limit versus b Feldman & Cousins, PRD 57 (1998) 3873 b If n = 0 observed, should upper limit depend on b? Classical: yes Bayesian: no FC: yes G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Coverage probability of confidence intervals Because of discreteness of Poisson data, probability for interval to include true value in general > confidence level (‘over-coverage’) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Discussion on limits Different sorts of limits answer different questions. A frequentist confidence interval does not (necessarily) answer, “What do we believe the parameter’s value is?” Coverage — nice, but crucial? Look at sensitivity, e.g., E[sup | s = 0]. Consider also: politics, need for consensus/conventions; convenience and ability to combine results, ... For any result, consumer will compute (mentally or otherwise): Need likelihood (or summary thereof). consumer’s prior G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Statistical vs. systematic errors Statistical errors: How much would the result fluctuate upon repetition of the measurement? Implies some set of assumptions to define probability of outcome of the measurement. Systematic errors: What is the uncertainty in my result due to uncertainty in my assumptions, e.g., model (theoretical) uncertainty; modelling of measurement apparatus. Usually taken to mean the sources of error do not vary upon repetition of the measurement. Often result from uncertain value of, e.g., calibration constants, efficiencies, etc. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Systematic errors and nuisance parameters Response of measurement apparatus is never modelled perfectly: y (measured value) model: truth: x (true value) Model can be made to approximate better the truth by including more free parameters. systematic uncertainty ↔ nuisance parameters G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

A typical fit (symbolic) Given measurements: and (usually) covariances: Predicted value: expectation value control variable PDF parameters, s, etc. bias Often take: Minimize Equivalent to maximizing L() » e-2/2, i.e., least squares same as maximum likelihood using a Gaussian likelihood function. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Its Bayesian equivalent Take Joint probability for all parameters and use Bayes’ theorem: To get desired probability for , integrate (marginalize) over b: → Posterior is Gaussian with mode same as least squares estimator,  same as from 2 = 2min + 1. (Back where we started!) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Marginalizing with Markov Chain Monte Carlo In a Bayesian analysis we usually need to integrate over some (or all) of the parameters, e.g., Probability density for prediction of observable () Integrals often high dimension, usually cannot be done in closed form or with acceptance-rejection Monte Carlo. Markov Chain Monte Carlo (MCMC) has revolutionized Bayesian computation. (Google words: Metropolis-Hastings, MCMC) Produces a correlated sequence of points in the sampled space. Correlations here not fatal, but stat. error larger than naive √n. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

A prior for bias b(b) with longer tails Represents ‘error on the error’; standard deviation of ps(s) is ss. b(b) b Gaussian (s = 0) P(|b| > 4sys) = 6.3 £ 10-5 s = 0.5 P(|b| > 4sys) = 0.65% G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

A simple test Suppose fit effectively averages four measurements. Take sys = stat = 0.1, uncorrelated. Case #1: data appear compatible Posterior p(|y): measurement p(|y) experiment  Usually summarize posterior p(|y) with mode and standard deviation: G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Simple test with inconsistent data Case #2: there is an outlier Posterior p(|y): measurement p(|y) experiment  → Bayesian fit less sensitive to outlier. → Error now connected to goodness-of-fit. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Goodness-of-fit vs. size of error In LS fit, value of minimized 2 does not affect size of error on fitted parameter. In Bayesian analysis with non-Gaussian prior for systematics, a high 2 corresponds to a larger error (and vice versa). 2000 repetitions of experiment, s = 0.5, here no actual bias. posterior   from least squares 2 G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Bayesian limits with uncertainty on b Uncertainty on b goes into the prior, e.g., Put this into Bayes’ theorem, Marginalize over b, then use p(s|n) to find intervals for s with any desired probability content. Controversial part here is prior for signal s(s) (treatment of nuisance parameters is easy). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Cousins-Highland method Regard b as ‘random’, characterized by pdf (b). Makes sense in Bayesian approach, but in frequentist model b is constant (although unknown). A measurement bmeas is random but this is not the mean number of background events, rather, b is. Compute anyway This would be the probability for n if Nature were to generate a new value of b upon repetition of the experiment with b(b). Now e.g. use this P(n;s) in the classical recipe for upper limit at CL = 1 - b: Result has hybrid Bayesian/frequentist character. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

‘Integrated likelihoods’ Consider again signal s and background b, suppose we have uncertainty in b characterized by a prior pdf b(b). Define integrated likelihood as also called modified profile likelihood, in any case not a real likelihood. Now use this to construct likelihood ratio test and invert to obtain confidence intervals. Feldman-Cousins & Cousins-Highland (FHC2), see e.g. J. Conrad et al., Phys. Rev. D67 (2003) 012002 and Conrad/Tegenfeldt PHYSTAT05 talk. Calculators available (Conrad, Tegenfeldt, Barlow). G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Interval from inverting profile LR test Suppose we have a measurement bmeas of b. Build the likelihood ratio test with profile likelihood: and use this to construct confidence intervals. See PHYSTAT05 talks by Cranmer, Feldman, Cousins, Reid. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Wrapping up... Some areas to think about: Look at Bayesian methods, especially for systematics. Multivariate methods (→ machine learning) Practical concerns: conventions(?), ease of use, etc. Tools: ROOT, R, ...? (Much room for discussion here.) Hopefully Nature will be kind and we won’t be so worried about setting limits as we were at LEP...:-) G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Extra slides G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Some statistics books, papers, etc. G. Cowan, Statistical Data Analysis, Clarendon, Oxford, 1998 see also www.pp.rhul.ac.uk/~cowan/sda R.J. Barlow, Statistics, A Guide to the Use of Statistical in the Physical Sciences, Wiley, 1989 see also hepwww.ph.man.ac.uk/~roger/book.html L. Lyons, Statistics for Nuclear and Particle Physics, CUP, 1986 W. Eadie et al., Statistical and Computational Methods in Experimental Physics, North-Holland, 1971 (2nd ed. imminent) S. Brandt, Statistical and Computational Methods in Data Analysis, Springer, New York, 1998 (with program library on CD) W.M. Yao et al. (Particle Data Group), Review of Particle Physics, Journal of Physics G 33 (2006) 1; see also pdg.lbl.gov sections on probability statistics, Monte Carlo G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Bayes’ theorem From the definition of conditional probability we have and but , so Bayes’ theorem First published (posthumously) by the Reverend Thomas Bayes (1702−1761) An essay towards solving a problem in the doctrine of chances, Philos. Trans. R. Soc. 53 (1763) 370; reprinted in Biometrika, 45 (1958) 293. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Digression: marginalization with MCMC Bayesian computations involve integrals like often high dimensionality and impossible in closed form, also impossible with ‘normal’ acceptance-rejection Monte Carlo. Markov Chain Monte Carlo (MCMC) has revolutionized Bayesian computation. Google for ‘MCMC’, ‘Metropolis’, ‘Bayesian computation’, ... MCMC generates correlated sequence of random numbers: cannot use for many applications, e.g., detector MC; effective stat. error greater than √n . Basic idea: sample multidimensional look, e.g., only at distribution of parameters of interest. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

MCMC basics: Metropolis-Hastings algorithm Goal: given an n-dimensional pdf generate a sequence of points Proposal density e.g. Gaussian centred about 1) Start at some point 2) Generate 3) Form Hastings test ratio 4) Generate 5) If move to proposed point else old point repeated 6) Iterate G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Metropolis-Hastings (continued) This rule produces a correlated sequence of points (note how each new point depends on the previous one). For our purposes this correlation is not fatal, but statistical errors larger than naive The proposal density can be (almost) anything, but choose so as to minimize autocorrelation. Often take proposal density symmetric: Test ratio is (Metropolis-Hastings): I.e. if the proposed step is to a point of higher , take it; if not, only take the step with probability If proposed step rejected, hop in place. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics

Metropolis-Hastings caveats Actually one can only prove that the sequence of points follows the desired pdf in the limit where it runs forever. There may be a “burn-in” period where the sequence does not initially follow Unfortunately there are few useful theorems to tell us when the sequence has converged. Look at trace plots, autocorrelation. Check result with different proposal density. If you think it’s converged, try it again with 10 times more points. G. Cowan RHUL Physics Statistical Issues for ATLAS Physics