An Answer and a Question Limits: Combining 2 results Significance: Does 2 give 2 ? Roger Barlow BIRS meeting July 2006
2 Revisit s+b Calculator (used in BaBar) based on Cousins and Highland: frequentist for s, Bayesian integration for and b See and C.P.C. 149 (2002) different priors (uniform in ,1/ , ln )
3 Combining Limits? With 2 measurements x=1.1 0.1 and x=1.2 0.1 the combination is obvious With 2 measurements 90% CL and 90% CL all we can say is 90% CL
4 Frequentist problem Given N 1 events with effcy 1, background b 1 N 2 events with effcy 2, background b 2 (Could be 2 experiments, or 2 channels in same experiment) For significance need to calculate, given source strength s, probability of result {N 1,N 2 } or less.
5 What does “Or less” mean? Is (3,4) larger or smaller than (2,5) ? N1N1 N2N2 More Less ??
6 Constraint If 1 = 2 and b 1 =b 2 then N 1 +N 2 is sufficient. So cannot just take lower left quadrant as ‘less’. (And the example given yesterday is trivial)
7 Suggestion Could estimate s by maximising log (Poisson) likelihood -( i s +b i ) + N i ln ( i s +b i ) Hence N i i /( i s +b i ) - i =0 Order results by the value of s they give from solving this Easier than it looks. For a given {N i } this quantity is monotonic decreasing with s. Solve once to get s data, explore s space generating many {N i } : sign of N i i /( i s data +b i ) - i tells you whether this estimated s is greater or less than s data
8 Message This is implemented in the code – ‘Add experiment’ button (up to 10) Comments as to whether this is useful are welcome
9 Significance Analysis looking for bumps Pure background gives 2 old of 60 for 37 dof (Prob 1%). Not good but not totally impossible Fit to background+bump (4 new parameters) gives better 2 new of 28 Question: Is this significant? Answer: Yes Question: How much? Answer: Significance is ( 2 new - 2 old ) = (60-28)=5.65 Schematic only!! No reference to any experimental data, real or fictitious Puzzle. How does a 3 sigma discrepancy become a 5 sigma discovery?
10 Justification? ‘We always do it this way’ ‘Belle does it this way’ ‘CLEO does it this way’
11 Possible Justification Likelihood Ratio Test a.k.a. Maximum Likelihood Ratio Test If M 1 and M 2 are models with max. likelihoods L 1 and L 2 for the data, then 2ln(L 2 / L 1 ) is distributed as a 2 with N 1 - N 2 degrees of freedom Provided that 1.M 2 contains M 1 2.Ns are large 3.Errors are Gaussian 4.Models are linear
12 Does it matter? Investigate with toy MC Generate with Uniform distribution in 100 bins, = is large and Poisson is reasonably Gaussian Fit with –Uniform distribution (99 dof) –Linear distribution (98 dof) –Cubic (96 dof):a 0 +a 1 x + a 2 x 2 + a 3 x 3 –Flat+Gaussian (96 dof): a 0 +a 1 exp(-0.5(x- a 2 ) 2 /a 3 2 ) Cubic is linear: Gaussian is not linear in a 2 and a 3
13 One ‘experiment’ Flat +Gauss Cubic linear Flat
14 Calculate 2 probabilities of differences in models Compare linear and uniform models. 1 dof. Probability flat Method OK Compare cubic and uniform models. 3 dof. Probability flat Method OK Compare flat+gaussian and uniform models. 3 dof. Probability very unflat Method invalid Peak at low P corresponds to large 2 i.e. false claims of significant signal
15 Not all parameters are equally useful If 2 models have the same number of parameters and both contain the true model, one can give better results than the other. This tells us nothing about the data Shows 2 for flat+gauss v. cubic Same number of parameters Flat+gauss tends to be lower Conclude: 2 does not give 2 ?
16 But surely… In the large N limit, ln L is parabolic in fitted parameters. Model 2 contains Model 1 with a 2 =0 etc. So expect ln L to increase by equivalent of 3 in chi squared. Question. What is wrong with this argument? Asymptopic? Different probability? Or is it right and the previous analysis is wrong?