Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 1 Using the Profile Likelihood in Searches for New Physics arXiv:1007.1727.

Slides:



Advertisements
Similar presentations
G. Cowan Statistics for HEP / LAL Orsay, 3-5 January 2012 / Lecture 2 1 Statistical Methods for Particle Physics Lecture 2: Tests based on likelihood ratios.
Advertisements

Statistics In HEP 2 Helge VossHadron Collider Physics Summer School June 8-17, 2011― Statistics in HEP 1 How do we understand/interpret our measurements.
G. Cowan RHUL Physics Profile likelihood for systematic uncertainties page 1 Use of profile likelihood to determine systematic uncertainties ATLAS Top.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem, random variables, pdfs 2Functions.
G. Cowan Statistics for HEP / NIKHEF, December 2011 / Lecture 3 1 Statistical Methods for Particle Physics Lecture 3: Limits for Poisson mean: Bayesian.
G. Cowan RHUL Physics Comment on use of LR for limits page 1 Comment on definition of likelihood ratio for limits ATLAS Statistics Forum CERN, 2 September,
G. Cowan Statistical methods for HEP / Birmingham 9 Nov Recent developments in statistical methods for particle physics Particle Physics Seminar.
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan 2011 CERN Summer Student Lectures on Statistics / Lecture 41 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability.
G. Cowan RHUL Physics Higgs combination note status page 1 Status of Higgs Combination Note ATLAS Statistics/Higgs Meeting Phone, 7 April, 2008 Glen Cowan.
G. Cowan Statistics for HEP / NIKHEF, December 2011 / Lecture 2 1 Statistical Methods for Particle Physics Lecture 2: Tests based on likelihood ratios.
G. Cowan Statistical Data Analysis / Stat 4 1 Statistical Data Analysis Stat 4: confidence intervals, limits, discovery London Postgraduate Lectures on.
G. Cowan Shandong seminar / 1 September Some Developments in Statistical Methods for Particle Physics Particle Physics Seminar Shandong University.
G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits Lecture 2: Tests based on likelihood.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 3 1 Statistical Methods for Discovery and Limits Lecture 3: Limits for Poisson mean: Bayesian.
Graduierten-Kolleg RWTH Aachen February 2014 Glen Cowan
1 Glen Cowan Statistics Forum News Glen Cowan Eilam Gross ATLAS Statistics Forum CERN, 3 December, 2008.
G. Cowan 2009 CERN Summer Student Lectures on Statistics1 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability densities,
G. Cowan Lectures on Statistical Data Analysis Lecture 3 page 1 Lecture 3 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
G. Cowan Orsay 2014 / Discussion on Statistics1 Topics in Statistics for Particle Physics Discussion on Statistics LAL Orsay 16 June 2014 Glen Cowan Physics.
G. Cowan CLASHEP 2011 / Topics in Statistical Data Analysis / Lecture 21 Topics in Statistical Data Analysis for HEP Lecture 2: Statistical Tests CERN.
G. Cowan Statistical techniques for systematics page 1 Statistical techniques for incorporating systematic/theory uncertainties Theory/Experiment Interplay.
G. Cowan Weizmann Statistics Workshop, 2015 / GDC Lecture 31 Statistical Methods for Particle Physics Lecture 3: asymptotics I; Asimov data set Statistical.
G. Cowan RHUL Physics page 1 Status of search procedures for ATLAS ATLAS-CMS Joint Statistics Meeting CERN, 15 October, 2009 Glen Cowan Physics Department.
G. Cowan, RHUL Physics Discussion on significance page 1 Discussion on significance ATLAS Statistics Forum CERN/Phone, 2 December, 2009 Glen Cowan Physics.
G. Cowan S0S 2010 / Statistical Tests and Limits 1 Statistical Tests and Limits Lecture 1: general formalism IN2P3 School of Statistics Autrans, France.
1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.
Easy Limit Statistics Andreas Hoecker CAT Physics, Mar 25, 2011.
G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.
1 Introduction to Statistics − Day 3 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.
G. Cowan NExT Workshop, 2015 / GDC Lecture 11 Statistical Methods for Particle Physics Lecture 1: introduction & statistical tests Fifth NExT PhD Workshop:
G. Cowan Lectures on Statistical Data Analysis Lecture 4 page 1 Lecture 4 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
G. Cowan Statistical methods for particle physics / Cambridge Recent developments in statistical methods for particle physics HEP Phenomenology.
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
G. Cowan, RHUL Physics Statistics for early physics page 1 Statistics jump-start for early physics ATLAS Statistics Forum EVO/Phone, 4 May, 2010 Glen Cowan.
G. Cowan Statistical methods for particle physics / Wuppertal Recent developments in statistical methods for particle physics Particle Physics.
G. Cowan Terascale Statistics School 2015 / Statistics for Run II1 Statistics for LHC Run II Terascale Statistics School DESY, Hamburg March 23-27, 2015.
G. Cowan RHUL Physics Status of Higgs combination page 1 Status of Higgs Combination ATLAS Higgs Meeting CERN/phone, 7 November, 2008 Glen Cowan, RHUL.
G. Cowan Lectures on Statistical Data Analysis Lecture 9 page 1 Statistical Data Analysis: Lecture 9 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Cargese 2012 / Statistics for HEP / Lecture 21 Statistics for HEP Lecture 2: Discovery and Limits International School Cargèse August 2012 Glen.
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Statistical methods for HEP / Freiburg June 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits in HEP Experiments Day 2: Discovery.
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan IHEP seminar / 19 August Some Developments in Statistical Methods for Particle Physics Particle Physics Seminar IHEP, Beijing 19 August,
G. Cowan St. Andrews 2012 / Statistics for HEP / Lecture 21 Statistics for HEP Lecture 2: Discovery and Limits 69 th SUSSP LHC Physics St. Andrews
G. Cowan iSTEP 2014, Beijing / Statistics for Particle Physics / Lecture 31 Statistical Methods for Particle Physics Lecture 3: systematic uncertainties.
G. Cowan RHUL Physics Statistical Issues for Higgs Search page 1 Statistical Issues for Higgs Search ATLAS Statistics Forum CERN, 16 April, 2007 Glen Cowan.
G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 21 Statistics for the LHC Lecture 2: Discovery Academic Training Lectures CERN,
G. Cowan SLAC Statistics Meeting / 4-6 June 2012 / Two Developments 1 Two developments in discovery tests: use of weighted Monte Carlo events and an improved.
G. Cowan CERN Academic Training 2012 / Statistics for HEP / Lecture 21 Statistics for HEP Lecture 2: Discovery and Limits Academic Training Lectures CERN,
Discussion on significance
Computing and Statistical Data Analysis / Stat 11
Estimating Statistical Significance
Statistics for the LHC Lecture 3: Setting limits
Some Statistical Tools for Particle Physics
LAL Orsay, 2016 / Lectures on Statistics
Comment on Event Quality Variables for Multivariate Analyses
Recent developments in statistical methods for particle physics
Statistical Methods for the LHC Discovering New Physics
Statistical Methods for Particle Physics (II)
TAE 2017 / Statistics Lecture 3
TAE 2018 / Statistics Lecture 3
Statistical Methods for HEP Lecture 3: Discovery and Limits
Introduction to Statistics − Day 4
Statistical Methods for Particle Physics Lecture 3: further topics
Presentation transcript:

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 1 Using the Profile Likelihood in Searches for New Physics arXiv: PHYSTAT2011 CERN, January, 2011 Glen Cowan 1, Kyle Cranmer 2, Eilam Gross 3, Ofer Vitells 3

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 2 Outline Prototype search analysis for LHC Test statistics based on profile likelihood ratio Systematics covered via nuisance parameters Sampling distributions to get significance/sensitivity Asymptotic formulae from Wilks/Wald Examples: n ~ Poisson (  s + b), m ~ Poisson(  b) Shape analysis Conclusions

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Prototype search analysis Search for signal in a region of phase space; result is histogram of some variable x giving numbers: Assume the n i are Poisson distributed with expectation values G. Cowan signal where background strength parameter

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Prototype analysis (II) Often also have a subsidiary measurement that constrains some of the background and/or shape parameters: Assume the m i are Poisson distributed with expectation values G. Cowan nuisance parameters (  s,  b,b tot ) Likelihood function is

Using the Profile Likelihood in Searches for New Physics / PHYSTAT The profile likelihood ratio Base significance test on the profile likelihood ratio: G. Cowan maximizes L for specified  maximize L The likelihood ratio of point hypotheses gives optimum test (Neyman-Pearson lemma). The profile LR hould be near-optimal in present analysis with variable  and nuisance parameters .

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Test statistic for discovery Try to reject background-only (  = 0) hypothesis using G. Cowan i.e. here only regard upward fluctuation of data as evidence against the background-only hypothesis. Note that even though here physically  ≥ 0, we allow to be negative. In large sample limit its distribution becomes Gaussian, and this will allow us to write down simple expressions for distributions of our test statistics.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT p-value for discovery G. Cowan Large q 0 means increasing incompatibility between the data and hypothesis, therefore p-value for an observed q 0,obs is will get formula for this later From p-value get equivalent significance,

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Expected (or median) significance / sensitivity When planning the experiment, we want to quantify how sensitive we are to a potential discovery, e.g., by given median significance assuming some nonzero strength parameter  ′. G. Cowan So for p-value, need f(q 0 |0), for sensitivity, will need f(q 0 |  ′),

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Wald approximation for profile likelihood ratio To find p-values, we need: For median significance under alternative, need: G. Cowan Use approximation due to Wald (1943) sample size

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Noncentral chi-square for  2ln (  ) G. Cowan If we can neglect the O(1/√N) term,  2ln (  ) follows a noncentral chi-square distribution for one degree of freedom with noncentrality parameter As a special case, if  ′ =  then  = 0 and  2ln (  ) follows a chi-square distribution for one degree of freedom (Wilks).

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Distribution of q 0 Assuming the Wald approximation, we can write down the full distribution of q  as G. Cowan The special case  ′ = 0 is a “half chi-square” distribution:

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Cumulative distribution of q 0, significance From the pdf, the cumulative distribution of q  is found to be G. Cowan The special case  ′ = 0 is The p-value of the  = 0 hypothesis is Therefore the discovery significance Z is simply

Using the Profile Likelihood in Searches for New Physics / PHYSTAT The Asimov data set To estimate median value of  2ln (  ), consider special data set where all statistical fluctuations suppressed and n i, m i are replaced by their expectation values (the “Asimov” data set): G. Cowan Asimov value of  2ln (  ) gives non- centrality param.  or equivalently, 

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Relation between test statistics and G. Cowan

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Profile likelihood ratio for upper limits For purposes of setting an upper limit on  use G. Cowan Note for purposes of setting an upper limit, one does not regard an upwards fluctuation of the data as representing incompatibility with the hypothesized . Note also here we allow the estimator for  be negative (but must be positive). where

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Alternative test statistic for upper limits Assume physical signal model has  > 0, therefore if estimator for  comes out negative, the closest physical model has  = 0. Therefore could also measure level of discrepancy between data and hypothesized  with G. Cowan Performance not identical to but very close to q  (of previous slide). q  is simpler in important ways.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Relation between test statistics and G. Cowan Assuming the Wald approximation for – 2ln (  ), q  and q  both have monotonic relation with . ~ And therefore quantiles of q , q  can be obtained directly from those of  (which is Gaussian). ˆ ̃ ~

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Distribution of q  Similar results for q  G. Cowan

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 Similar results for q  ̃ 19 Distribution of q  G. Cowan ̃

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formula G. Cowan Here take  = 1. Asymptotic formula is good approximation to 5  level (q 0 = 25) already for b ~ 20.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formulae G. Cowan Significance from asymptotic formula, here Z 0 = √q 0 = 4, compared to MC (true) value. For very low b, asymptotic formula underestimates Z 0. Then slight overshoot before rapidly converging to MC value.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formulae G. Cowan Asymptotic f (q 0 |1) good already for fairly small samples. Median[q 0 |1] from Asimov data set; good agreement with MC.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formulae G. Cowan Consider again n ~ Poisson (  s + b), m ~ Poisson(  b) Use q  to find p-value of hypothesized  values. E.g. f (q 1 |1) for p-value of  =1. Typically interested in 95% CL, i.e., p-value threshold = 0.05, i.e., q 1 = 2.69 or Z 1 = √q 1 = Median[q 1 |0] gives “exclusion sensitivity”. Here asymptotic formulae good for s = 6, b = 9.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formulae G. Cowan Same message for test based on q . q  and q  give similar tests to the extent that asymptotic formulae are valid. ~ ~

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Example 2: Shape analysis G. Cowan Look for a Gaussian bump sitting on top of:

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Monte Carlo test of asymptotic formulae G. Cowan Distributions of q  here for  that gave p  = 0.05.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Using f(q  |0) to get error bands G. Cowan We are not only interested in the median[q  |0]; we want to know how much statistical variation to expect from a real data set. But we have full f(q  |0); we can get any desired quantiles.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Distribution of upper limit on  G. Cowan ±1  (green) and ±2  (yellow) bands from MC; Vertical lines from asymptotic formulae

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Limit on  versus peak position (mass) G. Cowan ±1  (green) and ±2  (yellow) bands from asymptotic formulae; Points are from a single arbitrary data set.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Using likelihood ratio L s+b /L b G. Cowan Many searches at the Tevatron have used the statistic likelihood of  = 1 model (s+b) likelihood of  = 0 model (bkg only) This can be written

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Wald approximation for L s+b /L b G. Cowan Assuming the Wald approximation, q can be written as i.e. q is Gaussian distributed with mean and variance of To get  2 use 2 nd derivatives of lnL with Asimov data set.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Example with L s+b /L b G. Cowan Consider again n ~ Poisson (  s + b), m ~ Poisson(  b) b = 20, s = 10,  = 1. So even for smallish data sample, Wald approximation can be useful; no MC needed.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 33 Summary ˆ Asymptotic distributions of profile LR applied to an LHC search. Wilks: f (q  |  ) for p-value of . Wald approximation for f (q  |  ′). “Asimov” data set used to estimate median q  for sensitivity. Gives  of distribution of estimator for . Asymptotic formulae especially useful for estimating sensitivity in high-dimensional parameter space. Can always check with MC for very low data samples and/or when precision crucial. Implementation in RooStats (ongoing). Thanks to Louis Fayard, Nancy Andari, Francesco Polci, Marumi Kado for their observations related to allowing a negative estimator for .

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 34 Extra slides

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 35 Discovery significance for n ~ Poisson(s + b) Consider again the case where we observe n events, model as following Poisson distribution with mean s + b (assume b is known). 1) For an observed n, what is the significance Z 0 with which we would reject the s = 0 hypothesis? 2) What is the expected (or more precisely, median ) Z 0 if the true value of the signal rate is s?

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 36 Gaussian approximation for Poisson significance For large s + b, n → x ~ Gaussian( ,  ),  = s + b,  = √(s + b). For observed value x obs, p-value of s = 0 is Prob(x > x obs | s = 0),: Significance for rejecting s = 0 is therefore Expected (median) significance assuming signal rate s is

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 37 Better approximation for Poisson significance Likelihood function for parameter s is or equivalently the log-likelihood is Find the maximum by setting gives the estimator for s:

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 38 Approximate Poisson significance (continued) The likelihood ratio statistic for testing s = 0 is For sufficiently large s + b, (use Wilks’ theorem), To find median[Z 0 |s+b], let n → s + b (i.e., the Asimov data set): This reduces to s/√b for s << b.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 39 n ~ Poisson(  s+b), median significance, assuming  = 1, of the hypothesis  = 0 “Exact” values from MC, jumps due to discrete data. Asimov √q 0,A good approx. for broad range of s, b. s/√b only good for s « b.

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Profile likelihood ratio for unified interval We can also use directly G. Cowan as a test statistic for a hypothesized . where Large discrepancy between data and hypothesis can correspond either to the estimate for  being observed high or low relative to .

Using the Profile Likelihood in Searches for New Physics / PHYSTAT Distribution of t  Using Wald approximation, f (t  |  ′) is noncentral chi-square for one degree of freedom: G. Cowan Special case of  =  ′ is chi-square for one d.o.f. (Wilks). The p-value for an observed value of t  is and the corresponding significance is

G. Cowan Using the Profile Likelihood in Searches for New Physics / PHYSTAT Combination of channels For a set of independent decay channels, full likelihood function is product of the individual ones: Trick for median significance: estimator for  is equal to the Asimov value  ′ for all channels separately, so for combination, For combination need to form the full function and maximize to find estimators of , . → ongoing ATLAS/CMS effort with RooStats framework where