Likelihood ratio test to determine best number of parameters (G. Cowan, RHUL Physics, ATLAS Statistics Forum)

Presentation transcript:

G. Cowan RHUL Physics LR test to determine number of parameters page 1 Likelihood ratio test to determine best number of parameters ATLAS Statistics Forum CERN, 18 February, 2009 Glen Cowan Physics Department Royal Holloway, University of London

Introduction
Present study motivated by discussions with Eilam, Stephan Horner, Sascha Caron, et al., regarding Stephan's presentation on SUSYFit at the 3 December 2008 Statistics Forum. Discussions also took place in the Top Properties meeting (16 Dec 08) and the Exotics meeting (22 Jan 08).
The basic idea is to develop a general method for increasing the number of parameters in a model, stopping when the fit is OK. Systematics in the original model are then included in the statistical errors of the extended model.
A draft note is attached on the agenda page; also at

Determining distributions: systematics
E.g. the M_ll distribution from the Z'→dilepton search (CSC Book p 1709) uses a 4-parameter function for signal. Sidebands provide an estimate of the background. So nothing in the real analysis comes from MC, but one should still consider some systematic due to the fact that the assumed parametric functions are not perfect.

A general strategy (see attached note)
Suppose one needs to know the shape of a distribution. An initial model (e.g. MC) is available, but known to be imperfect.
Q: How can one incorporate the systematic error arising from use of the incorrect model?
A: Improve the model. That is, introduce more adjustable parameters into the model so that for some point in the enlarged parameter space it is very close to the truth. Then profile the likelihood with respect to the additional (nuisance) parameters. The correlations with the nuisance parameters will inflate the errors in the parameters of interest.
The difficulty is deciding how to introduce the additional parameters.

Comparing model vs. data
In the example shown, the model and data clearly don't agree well. To compare, use e.g. Pearson's chi-square,
$$\chi^2_P = \sum_{i=1}^{N} \frac{(n_i - \nu_i)^2}{\nu_i} .$$
Model the number of entries $n_i$ in the ith bin as $n_i \sim \mathrm{Poisson}(\nu_i)$. Then $\chi^2_P$ will follow a chi-square distribution for N degrees of freedom for sufficiently large $n_i$.
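As a minimal sketch of the comparison on this slide, the snippet below evaluates Pearson's chi-square, taken here to be $\chi^2_P = \sum_i (n_i - \nu_i)^2 / \nu_i$ over N bins. The function name `pearson_chi2` and the 4-bin numbers are hypothetical and are not taken from the talk.

```python
def pearson_chi2(n, nu):
    """Pearson chi-square: sum over bins of (n_i - nu_i)^2 / nu_i.

    n  : observed bin contents
    nu : model expectation values (all must be > 0)
    """
    if len(n) != len(nu):
        raise ValueError("n and nu must have the same number of bins")
    if any(v <= 0 for v in nu):
        raise ValueError("all expectation values nu_i must be positive")
    return sum((ni - vi) ** 2 / vi for ni, vi in zip(n, nu))


# Hypothetical 4-bin example (illustrative numbers only).
n_obs = [12, 9, 7, 4]
nu_model = [10.0, 10.0, 5.0, 5.0]
chi2_p = pearson_chi2(n_obs, nu_model)
print(f"chi2_P = {chi2_p:.3f} for N = {len(n_obs)} bins")
```

The statistic is only approximately chi-square distributed when the bin contents are not large, which is why the slide adds the condition on $n_i$.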

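Model-data comparison with likelihood ratio
This is very similar to a comparison based on the likelihood ratio
$$\lambda(\boldsymbol{\nu}) = \frac{L(\boldsymbol{\nu})}{L(\hat{\boldsymbol{\nu}})} ,$$
where $L(\boldsymbol{\nu}) = P(\mathbf{n}; \boldsymbol{\nu})$ is the likelihood and the hat indicates the ML estimator (value that maximizes the likelihood). Here it is easy to show that $\hat{\nu}_i = n_i$. Equivalently use the logarithmic variable
$$q = -2 \ln \lambda(\boldsymbol{\nu}) = 2 \sum_{i=1}^{N} \left[ n_i \ln\frac{n_i}{\nu_i} + \nu_i - n_i \right] .$$
If the model is correct, q ~ chi-square for N degrees of freedom.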
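For independent Poisson bins with the saturated estimate $\hat{\nu}_i = n_i$, the logarithmic variable works out to $q = 2 \sum_i [\, n_i \ln(n_i/\nu_i) + \nu_i - n_i \,]$. The sketch below evaluates this sum; the function name `poisson_lr_q` and the example numbers are hypothetical. Bins with $n_i = 0$ contribute only $\nu_i$, since $n \ln n \to 0$.

```python
import math


def poisson_lr_q(n, nu):
    """q = -2 ln lambda = 2 * sum_i [ n_i ln(n_i / nu_i) + nu_i - n_i ]."""
    if len(n) != len(nu):
        raise ValueError("n and nu must have the same number of bins")
    total = 0.0
    for ni, vi in zip(n, nu):
        if vi <= 0:
            raise ValueError("all expectation values nu_i must be positive")
        term = vi - ni
        if ni > 0:
            # The n ln(n/nu) term vanishes in the limit n -> 0.
            term += ni * math.log(ni / vi)
        total += term
    return 2.0 * total


# Hypothetical 4-bin example (illustrative numbers only).
n_obs = [12, 9, 7, 4]
nu_model = [10.0, 10.0, 5.0, 5.0]
q_obs = poisson_lr_q(n_obs, nu_model)
print(f"q = {q_obs:.4f}")
```

Each term in the sum is non-negative, so q is zero only when the model reproduces every bin exactly.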

p-values
Using either $\chi^2_P$ or q, state the level of data-model agreement by giving the p-value: the probability, under assumption of the model, of obtaining an equal or greater incompatibility with the data relative to that found with the actual data:
$$p = \int_{\chi^2_{P,\mathrm{obs}}}^{\infty} f_{\chi^2}(z; N)\, dz \quad \text{or} \quad p = \int_{q_{\mathrm{obs}}}^{\infty} f_{\chi^2}(z; N)\, dz ,$$
where (in both cases) the integrand is the chi-square distribution for N degrees of freedom,
$$f_{\chi^2}(z; N) = \frac{1}{2^{N/2}\, \Gamma(N/2)}\, z^{N/2 - 1} e^{-z/2} .$$
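The p-value is the upper tail integral of the chi-square pdf, which equals the regularized upper incomplete gamma function $Q(N/2, z_{\mathrm{obs}}/2)$. The following is a sketch using only the Python standard library (series expansion for small arguments, continued fraction otherwise); in practice a library routine such as a chi-square survival function would normally be used. The names `_gamma_q` and `chi2_p_value` and the example inputs are hypothetical.

```python
import math


def _gamma_q(a, x, eps=1e-12, max_iter=1000):
    """Regularized upper incomplete gamma function Q(a, x)."""
    if a <= 0 or x < 0:
        raise ValueError("need a > 0 and x >= 0")
    if x == 0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion for P(a, x); then Q = 1 - P.
        ap = a
        term = 1.0 / a
        total = term
        for _ in range(max_iter):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefactor)


def chi2_p_value(stat, ndof):
    """Probability for a chi-square variable with ndof dof to exceed stat."""
    return _gamma_q(0.5 * ndof, 0.5 * stat)


# Hypothetical example: observed statistic 25.0 with N = 20 bins.
p_example = chi2_p_value(25.0, 20)
print(f"p = {p_example:.3g}")
```

The same function applies to either statistic, since both are referred to the chi-square distribution for N degrees of freedom.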

A simple example
The naive model (a) could have been e.g. from MC (here statistical errors are suppressed; the point is to illustrate how to incorporate systematics).
[Figure legend: 0th order model; True model (Nature); Data]

Comparison with the 0th order model
The 0th order model gives q = 258.8, p = … × …

Enlarging the model
Here try to enlarge the model by multiplying the 0th order distribution by a function s:
$$\nu_i \rightarrow s(x_i)\, \nu_i ,$$
where $x_i$ is the position of bin i and s(x) is a linear superposition of Bernstein basis polynomials of order m:
$$s(x) = \sum_{k=0}^{m} \beta_k\, b_{k,m}(x) .$$

Bernstein basis polynomials
$$b_{k,m}(x) = \binom{m}{k}\, x^k (1 - x)^{m-k}, \quad k = 0, \ldots, m, \quad 0 \le x \le 1 .$$
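A short sketch of the basis and of a scale function built from it, using the standard definition $b_{k,m}(x) = \binom{m}{k} x^k (1-x)^{m-k}$ on $0 \le x \le 1$. The names `bernstein_basis` and `scale_function`, the coefficient symbol `beta`, and the assumption that the observable has been mapped onto [0, 1] are choices made here for illustration.

```python
import math


def bernstein_basis(k, m, x):
    """Bernstein basis polynomial b_{k,m}(x) for 0 <= k <= m and 0 <= x <= 1."""
    if not 0 <= k <= m:
        raise ValueError("need 0 <= k <= m")
    return math.comb(m, k) * x ** k * (1.0 - x) ** (m - k)


def scale_function(beta, x):
    """s(x) = sum_k beta_k * b_{k,m}(x), with order m = len(beta) - 1."""
    m = len(beta) - 1
    return sum(b * bernstein_basis(k, m, x) for k, b in enumerate(beta))


# With all coefficients equal to 1 the scale function is identically 1,
# so the 0th order model is recovered as a special case.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
unit_scale = [scale_function([1.0, 1.0, 1.0, 1.0], x) for x in xs]
print(unit_scale)

# Hypothetical order-2 coefficients giving a smooth x-dependent correction.
tilted = [scale_function([0.8, 1.0, 1.3], x) for x in xs]
print(tilted)
```

Because the basis polynomials of a given order sum to one, setting every coefficient to 1 leaves the original distribution unchanged, which makes the 0th order model a point in the enlarged parameter space.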

Enlarging the parameter space
Using increasingly high order for the basis polynomials gives an increasingly flexible function. At each stage compare the p-value of the likelihood ratio test to some threshold, e.g., 0.1 or 0.2, to decide whether to include the additional parameter. Iterate this procedure, and stop when the data do not require addition of further parameters.
Once the enlarged model has been found, simply include it in any further statistical procedures; the statistical errors from the additional parameters will account for the systematic uncertainty in the original model.
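The stopping rule can be sketched as follows, assuming the fits have already been done and that `q_min[j]` holds the minimized value of q for the model with j additional parameters. The sketch further assumes that the models are nested and that, by Wilks' theorem, the decrease in q from adding one parameter follows a chi-square distribution with 1 degree of freedom when the extra parameter is not needed; the talk itself does not spell this out. The function names and all numbers are hypothetical.

```python
import math


def lr_p_value_one_dof(delta_q):
    """Upper tail probability of a chi-square variable with 1 dof."""
    return math.erfc(math.sqrt(max(delta_q, 0.0) / 2.0))


def choose_n_extra(q_min, threshold=0.1):
    """Return the number of additional parameters at which to stop.

    q_min[j] is the minimized q for the model with j additional parameters.
    The next parameter is included only if the p-value of the likelihood
    ratio test between j and j + 1 parameters is at or below the threshold.
    """
    for j in range(len(q_min) - 1):
        delta_q = q_min[j] - q_min[j + 1]
        if lr_p_value_one_dof(delta_q) > threshold:
            return j  # data do not require parameter j + 1
    return len(q_min) - 1  # every tested extension was still required


# Hypothetical sequence of minimized q values (illustrative numbers only).
q_values = [120.0, 45.0, 22.0, 21.5, 21.4]
n_extra = choose_n_extra(q_values, threshold=0.1)
print(f"stop after {n_extra} additional parameters")
```

A larger threshold such as 0.2 makes the procedure more willing to add parameters, trading larger statistical errors for a smaller risk of an unmodelled systematic effect.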

Fits using increasing numbers of parameters
[Figure: fits for successive numbers of parameters; one panel is marked "Stop here".]

Goodness-of-fit for the extended models
[Figure: p-values. One q statistic gives the overall goodness-of-fit; the other compares the model with n_par parameters to that with n_par + 1.]

Summary
The example shown here uses a very general idea; a similar philosophy is applied in many analyses (cf. choosing the order of a polynomial for an LS fit).
The example here assumes the distribution can be corrected by a scale factor; a somewhat different strategy is needed for the tail of a distribution, where MC bin contents go to zero.
What to do if e.g. the overall goodness-of-fit is not great, but additional parameters do not help? (Tom LeCompte: F-test using ratio of chi-squares?)
How to proceed if the additional parameters add too much flexibility, e.g., what if the normalization is well known, but not, say, the slope?
Stephan Horner et al. have done similar things with SUSYFit (next talk).