An Answer and a Question. Limits: Combining 2 results. Significance: Does Δχ² give χ²? Roger Barlow, BIRS meeting, July 2006.


2 Revisit εs+b. The calculator (used in BaBar) is based on Cousins and Highland: frequentist for s, Bayesian integration for ε and b. See C.P.C. 149 (2002); different priors available (uniform in ε, 1/ε, ln ε).
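The Cousins–Highland hybrid the slide refers to can be sketched in a few lines: frequentist in the source strength s, with the nuisance parameters ε and b integrated out by Monte Carlo. This is an illustrative sketch, not the BaBar calculator itself; the function name, the Gaussian priors, and their parameters are assumptions.

```python
import math
import random

def p_value_cousins_highland(n_obs, s, eps_mean, eps_sigma, b_mean, b_sigma,
                             n_throws=20000, seed=1):
    """P(N >= n_obs | s), averaging the Poisson probability over
    Gaussian priors for the efficiency eps and background b
    (the Bayesian-integration step of Cousins & Highland)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_throws):
        eps = max(rng.gauss(eps_mean, eps_sigma), 0.0)  # truncate at physical boundary
        b = max(rng.gauss(b_mean, b_sigma), 0.0)
        mu = eps * s + b
        # P(N >= n_obs) = 1 - CDF(n_obs - 1) for a Poisson with mean mu
        cdf = sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n_obs))
        total += 1.0 - cdf
    return total / n_throws
```

Scanning s until this p-value crosses the desired confidence level gives the hybrid upper limit.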

3 Combining Limits? With 2 measurements x = 1.1 ± 0.1 and x = 1.2 ± 0.1 the combination is obvious. With 2 upper limits at 90% CL and 90% CL, all we can say is 90% CL.
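The "obvious" combination of the two Gaussian measurements is the inverse-variance weighted average; a minimal sketch (the function name is mine):

```python
def combine(measurements):
    """Inverse-variance weighted average of (value, error) pairs:
    the 'obvious' combination for Gaussian measurements."""
    weights = [1.0 / err**2 for _, err in measurements]
    value = sum(w * x for (x, _), w in zip(measurements, weights)) / sum(weights)
    error = (1.0 / sum(weights)) ** 0.5
    return value, error

x, err = combine([(1.1, 0.1), (1.2, 0.1)])  # -> (1.15, ~0.071)
```

No comparably simple recipe exists for combining two upper limits, which is the point of the slide.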

4 Frequentist problem. Given N₁ events with efficiency ε₁, background b₁, and N₂ events with efficiency ε₂, background b₂ (could be 2 experiments, or 2 channels in the same experiment). For significance we need to calculate, given source strength s, the probability of a result {N₁, N₂} or less.

5 What does "or less" mean? Is (3,4) larger or smaller than (2,5)? [Plot: the (N₁, N₂) plane divided into "More", "Less", and "??" regions.]

6 Constraint. If ε₁ = ε₂ and b₁ = b₂ then N₁ + N₂ is sufficient, so we cannot just take the lower-left quadrant as "less". (And the example given yesterday is trivial.)

7 Suggestion. Estimate s by maximising the log (Poisson) likelihood ln L = Σᵢ [−(εᵢs + bᵢ) + Nᵢ ln(εᵢs + bᵢ)], hence Σᵢ [Nᵢεᵢ/(εᵢs + bᵢ) − εᵢ] = 0. Order results by the value of s obtained from solving this. It is easier than it looks: for a given {Nᵢ} the left-hand side is monotonically decreasing in s. Solve once to get s_data, then explore s space generating many {Nᵢ}: the sign of Σᵢ [Nᵢεᵢ/(εᵢs_data + bᵢ) − εᵢ] tells you whether the estimated s is greater or less than s_data.
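The solve-once-then-check-the-sign trick can be sketched as follows (the function names and bisection bracket are assumptions; the key point is that the score is monotonically decreasing in s):

```python
def score(s, N, eff, bkg):
    """d(log L)/ds for log L = sum_i [-(eff_i*s + b_i) + N_i*ln(eff_i*s + b_i)];
    monotonically decreasing in s for positive efficiencies and backgrounds."""
    return sum(n * e / (e * s + b) for n, e, b in zip(N, eff, bkg)) - sum(eff)

def s_hat(N, eff, bkg, hi=1e6, tol=1e-9):
    """Maximum-likelihood source strength, by bisection on the score."""
    if score(0.0, N, eff, bkg) <= 0.0:
        return 0.0          # maximum is at the physical boundary s = 0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, N, eff, bkg) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_exceeds(N_toy, s_data, eff, bkg):
    """Ordering trick from the slide: the sign of the score at s_data
    says whether this toy's s-hat is above or below s_data,
    without re-solving for each toy {N_i}."""
    return score(s_data, N_toy, eff, bkg) > 0.0
```

For example, with ε = (1, 1), b = (1, 1) and N = (6, 6), the score vanishes at ŝ = 5, and any toy dataset can be ordered against that value by one evaluation of the score.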

8 Message. This is implemented in the code via the "Add experiment" button (up to 10 experiments). Comments as to whether this is useful are welcome.

9 Significance. An analysis looking for bumps: pure background gives χ²_old = 60 for 37 dof (probability 1%). Not good, but not totally impossible. A fit to background + bump (4 new parameters) gives a better χ²_new = 28. Question: Is this significant? Answer: Yes. Question: How much? Answer: The significance is √(χ²_old − χ²_new) = √(60 − 28) = 5.65. (Schematic only! No reference to any experimental data, real or fictitious.) Puzzle: how does a 3σ discrepancy become a 5σ discovery?
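The arithmetic on this slide, and what happens if the 4 extra parameters are naively taken into account, can be checked directly. The 33 dof for the bump fit is implied (37 − 4), and the closed-form survival function used here holds for even numbers of degrees of freedom:

```python
import math

chi2_old = 60.0   # pure background, 37 dof
chi2_new = 28.0   # background + bump, 4 extra parameters (33 dof implied)

delta = chi2_old - chi2_new          # 32
naive_sigma = math.sqrt(delta)       # the sqrt(Delta chi^2) recipe: ~5.66

def chi2_sf_even(x, k):
    """Survival function P(chi^2_k > x), exact closed form for even k."""
    return math.exp(-x / 2) * sum((x / 2) ** j / math.factorial(j)
                                  for j in range(k // 2))

# Treating Delta chi^2 = 32 as a chi^2 with 4 dof (one per added parameter)
# gives a p-value around 2e-6, already milder than the 1-dof reading.
p_4dof = chi2_sf_even(delta, 4)
```

Even before asking whether Δχ² follows a χ² at all, counting the degrees of freedom correctly softens the claimed significance.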

10 Justification? "We always do it this way." "Belle does it this way." "CLEO does it this way."

11 Possible Justification: the Likelihood Ratio Test, a.k.a. the Maximum Likelihood Ratio Test. If M₁ and M₂ are models with maximum likelihoods L₁ and L₂ for the data, then 2 ln(L₂/L₁) is distributed as a χ² with n₂ − n₁ degrees of freedom (nᵢ = number of free parameters of Mᵢ), provided that: 1. M₂ contains M₁ ✓ 2. The Ns are large ✓ 3. The errors are Gaussian ✓ 4. The models are linear ✗

12 Does it matter? Investigate with a toy MC. Generate a uniform distribution in 100 bins; ⟨N⟩ per bin is large, so the Poisson is reasonably Gaussian. Fit with: a uniform distribution (99 dof); a linear distribution (98 dof); a cubic (96 dof): a₀ + a₁x + a₂x² + a₃x³; flat + Gaussian (96 dof): a₀ + a₁ exp(−0.5(x − a₂)²/a₃²). The cubic is linear in its parameters; the Gaussian is not linear in a₂ and a₃.
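A stripped-down version of this toy MC, using only the flat and linear models (1 extra parameter, so Δχ² should follow a χ² with 1 dof and the p-values should come out flat), might look like the following. The Gaussian approximation to the Poisson, the bin count, and all names are assumptions:

```python
import math
import random

rng = random.Random(42)
NBINS, MU = 100, 100.0            # large mean per bin: Poisson ~ Gaussian
NTOYS = 500

def toy():
    # Gaussian approximation to Poisson(MU), as the slide argues is adequate
    return [rng.gauss(MU, math.sqrt(MU)) for _ in range(NBINS)]

def delta_chi2_flat_vs_linear(n):
    x = [(i + 0.5) / NBINS for i in range(NBINS)]
    # flat fit: the best constant is the sample mean
    mean = sum(n) / NBINS
    ss_flat = sum((ni - mean) ** 2 for ni in n)
    # linear fit a0 + a1*x by ordinary least squares
    xbar = sum(x) / NBINS
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (ni - mean) for xi, ni in zip(x, n))
    a1 = sxy / sxx
    ss_lin = ss_flat - a1 * sxy    # residual sum of squares after the line
    return (ss_flat - ss_lin) / MU # chi^2 improvement from 1 extra parameter

pvals = []
for _ in range(NTOYS):
    d = delta_chi2_flat_vs_linear(toy())
    pvals.append(math.erfc(math.sqrt(d / 2)))  # chi^2 (1 dof) survival function
```

For the linear (and cubic) models the resulting p-value distribution is flat, matching the slide's "method OK"; repeating the exercise with the flat + Gaussian model instead produces the low-P peak the later slides describe.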

13 One "experiment". [Plot: one toy dataset with the flat, linear, cubic, and flat + Gaussian fits overlaid.]

14 Calculate χ² probabilities of the differences between models. Compare linear and uniform models: 1 dof, probability distribution flat, method OK. Compare cubic and uniform models: 3 dof, probability distribution flat, method OK. Compare flat + Gaussian and uniform models: 3 dof, probability distribution very unflat, method invalid. The peak at low P corresponds to large Δχ², i.e. false claims of a significant signal.

15 Not all parameters are equally useful. If 2 models have the same number of parameters and both contain the true model, one can still give better results than the other; this tells us nothing about the data. [Plot: Δχ² for flat + Gaussian vs. cubic, models with the same number of parameters; flat + Gaussian tends to be lower.] Conclude: Δχ² does not give χ²?

16 But surely… In the large-N limit, ln L is parabolic in the fitted parameters. Model 2 contains Model 1 (with a₂ = 0, etc.), so we expect ln L to increase by the equivalent of 3 units of χ². Question: What is wrong with this argument? Not asymptotic? A different probability? Or is it right, and the previous analysis is wrong?