Presentation is loading. Please wait.

Presentation is loading. Please wait.

Eilam Gross, Higgs Statistics, Orsay 09

Similar presentations


Presentation on theme: "Eilam Gross, Higgs Statistics, Orsay 09"— Presentation transcript:

1 http://eilam.weizmann.ac.il/Images/TheHiggsMechanism.swf 1
Eilam Gross, Higgs Statistics, Orsay 09 Eilam Gross, Higgs Statistics, Santander 2009

2 Statistical issues for Higgs Physics
Eilam Gross, Weizmann Institute of Science 2 Eilam Gross, Higgs Statistics, Orsay 2009

3 Modified Frequentist - CLs
Bayesian Modified Frequentist - CLs 3 Eilam Gross, Higgs Statistics, Orsay 09

4 CMS & ATLAS Higgs Prospects
Profile Likelihood Bayesian ATLAS CMS Grיegory Schott SUSY 2009 4 Eilam Gross, Higgs Statistics, Orsay 09

5 Discovery vs Exclusion
Higgs statistics is about testing one hypothesis against another hypothesis One hypothesis is the Standard Model with no Higgs Boson (H0) Another hypothesis is the SM with a Higgs boson with a specific mass mH (H1) Rejecting the No-Higgs (H0) hypothesis DISCOVERY Rejecting the Higgs hypothesis (H1) EXCLUDING the Higgs at the 95% CL: Usually H0 is referred to as the null hypothesis, but it depends on the context. 5 Eilam Gross, Higgs Statistics, Orsay 09

6 Basic Definition: Test Statistic
Given an hypothesis H0 (Background only) one wants to test against an alternate hypothesis H1 (Higgs with mass mH) One (very good) way is to construct a test statistic Q and use it to accept or reject an hypothesis For a physicist the test statistic is part of the analysis model Eilam Gross, Higgs Statistics, Orsay 09

7 Basic Definitions: p-Value
The pdf of Q…. A lot of it is about a language…. A jargon Discovery…. A deviation from the SM - from the background only hypothesis… When will one reject an hypothesis? p-value = probability that result is as or less compatible with the background only hypothesis Control region a (or size a) defines the criterion It is a custom to choose a=2.8710-7 If result falls within the control region, i.e. p< a the BG only hypothesis is rejected A discovery Control region Of size a Eilam Gross, Higgs Statistics, Orsay 09

8 From p-values to Gaussian significance
It is a custom to express the p-value as the significance associated to it, had the pdf were Gaussians Beware of 1 vs 2-sided definitions! 8 Eilam Gross, Higgs Statistics, Orsay 09 8

9 LR motivation- The Neyman-Pearson Lemma
When performing a hypothesis test between two simple hypotheses, H0 and H1, the Likelihood Ratio test, which rejects H0 in favor of H1, is the most powerful test of size  for a threshold  Define a test statistic Note: Likelihoods are functions of the data, even though we often not specify it explicitly 9 Eilam Gross, Higgs Statistics, Orsay 09 9

10 Basic Definitions: Confidence Interval & Coverage
Say you have a measurement mmeas of mH with mtrue being the unknown true value of mH Assume you know the pdf of p(mmeas|mH) Given the measurement you deduce somehow (based on your statistical model) that there is a 90% Confidence interval [m1,m2].... The correct statement: In an ensemble of experiments 90% of the obtained confidence intervals will contain the true value of mH. The misconception; Given the data, the probability that there is a Higgs with a mass inH the interval [m1,m2] is 90%. Eilam Gross, Higgs Statistics, Orsay 09

11 Basic Definitions: Confidence Interval & Coverage
Confidence Level: A CL of 90% means that in an ensemble of experiments, each producing a confidence interval, 90% of the confidence intervals will contain the true value of mH Normally, we make one experiment and try to estimate from this one experiment the confidence interval at a specified CL% Confidence Level…. If in an ensemble of (MC) experiments our estimated Confidence Interval fail to contain the true value of mH 90% of the cases (for every possible mH) we claim that our method undercover If in an ensemble of (MC) experiments the true value of mH is covered within the estimated confidence interval , we claim a coverage Eilam Gross, Higgs Statistics, Orsay 09

12 Basic Definitions: Confidence Interval & Coverage
The basic question will come again and again: WHAT IS THE IMPORTANCE OF COVERAGE for a Physicist? The “ problem” : Maybe coverage answers the wrong question… you want to know what is the probability that the Higgs boson exists and is in specific mass range… Excluding a Higgs with a mass<mH at the 95% CL aims to mean that the interval [0,mH] does not contain a Higgs Boson with that mass with a 95% probability This is a wrong statement, and the right statement depends on the statistical model. You can never test the signal by itself, only s+b, so your statement is about s(mH)+b , including the coverage…. We might prefer not to exclude a signal to which we are not sensitive to at the price of undercoverage (CLS) Eilam Gross, Higgs Statistics, Orsay 09

13 Subjective Bayesian is Good for YOU
Thomas Bayes (b 1702) a British mathematician and Presbyterian minister Eilam Gross, Higgs Statistics, Orsay 09

14 What is the Right Question
Is there a Higgs Boson? What do you mean? Given the data , is there a Higgs Boson? Can you really answer that without any a priori knowledge of the Higgs Boson? Change your question: What is your degree of belief in the Higgs Boson given the data… Need a prior degree of belief regarding the Higgs Boson itself… Make sure that when you quote your answer you also quote your prior assumption! Can we assign a probability to a model P(Higgs)? Of course not, we can only assign to it a degree of belief! The most refined question is: Assuming there is a Higgs Boson with some mass mH, how well the data agrees with that? But even then the answer relies on the way you measured the data (i.e. measurement uncertainties), and that might include some pre-assumptions, priors! Eilam Gross, Higgs Statistics, Orsay 09

15 What is the Right Answer?
The Question is: Is there a Higgs Boson? Is there a God? In the book the author uses “divine factors” to estimate the P(Earth|God), a prior for God of 50% He “calculates” a 67% probability for God’s existence given earth… In Scientific American July 2004, playing a bit with the “divine factors” the probability drops to 2%... Eilam Gross, Higgs Statistics, Orsay 09

16 Basic Definitions: The Bayesian Way
Can the model have a probability (what is Prob(SM)?) ? We assign a degree of belief in models parameterized by µ Instead of talking about confidence intervals we talk about credible intervals, where p(µ|x) is the credibility of µ given the data. Eilam Gross, Higgs Statistics, Orsay 09

17 Systematics Why “download” only music?
“Download” original ideas as well….  Eilam Gross, Higgs Statistics, Orsay 09

18 Systematics is Important
An analysis might be killed by systematics Eilam Gross, Higgs Statistics, Orsay 09

19 Basic Definitions: Nuisance Parameters (Systematics)
There are two kinds of parameters: Parameters of interest (signal strength… cross section… µ) Nuisance parameters (background cross section, b, signal efficiency) The nuisance parameters carry systematic uncertainties There are two related issues: Classifying and estimating the systematic uncertainties Implementing them in the analysis The physicist must make the difference between cross checks and identifying the sources of the systematic uncertainty. Shifting cuts around and measure the effect on the observable… Very often the observed variation is dominated by the statistical uncertainty in the measurement. Eilam Gross, Higgs Statistics, Orsay 09

20 Basic Definitions: Implementation of Nuisance Parameters
Implement by marginalizing or profiling Marginalization (Integrating) (The C&H Hybrid) Integrate L over possible values of nuisance parameters (weighted by their prior belief functions -- Gaussian,gamma, others...) Consistent Bayesian interpretation of uncertainty on nuisance parameters Note that in that sense MC “statistical” uncertainties (like background statistical uncertainty) are systematic uncertainties Eilam Gross, Higgs Statistics, Orsay 09

21 Integrating Out The Nuisance Parameters (Marginalization)
Our degree of belief in µ is the sum of our degree of belief in µ given  (nuisance parameter), over “all” possible values of  That’s a Bayesian way Eilam Gross, Higgs Statistics, Orsay 09

22 Basic Definitions: Priors
A prior probability is interpreted as a description of what we believe about a parameter preceding the current experiment Informative Priors: When you have some information about  the prior might be informative (Gaussian or Truncated Gaussians…) Most would say that subjective informative priors about the parameters of interest should be avoided (“….what's wrong with assuming that there is a Higgs in the mass range [115,140] with equal probability for each mass point?”) Subjective informative priors about the Nuisance parameters are more difficult to argue with These Priors can come from our assumed model (Pythia, Herwig etc…) These priors can come from subsidiary measurements of the response of the detector to the photon energy, for example. Some priors come from subjective assumptions (theoretical, prejudice symmetries….) of our model Eilam Gross, Higgs Statistics, Orsay 09

23 Basic Definitions: Priors – Uninformative Priors
Uninformative Priors: All priors on the parameter of interest are usually uninformative…. Therefore flat uninformative priors are most common in HEP. When taking a uniform prior for the Higgs mass [115,]… is it really uninformative? do uninformative priors exist? When constructing an uninformative prior you actually put some information in it… But a prior flat in the coupling g will not be flat in s~g2 Depends on the metric! Note, flat priors are improper and might lead to serious problems of undercoverage (when one deals with >1 channel, i.e. beyond counting, one should AVOID them) Eilam Gross, Higgs Statistics, Orsay 09

24 DISCOVERY 24 Eilam Gross, Higgs Statistics, Orsay 2009

25 Basic Definitions: Rayleigh Distribution
Eilam Gross, Higgs Statistics, Orsay 09

26 The Discovery Case Study
We assume a Gaussian “Higgs” signal (s) on top of a Rayleigh shaped background (b) The signal strength is µ µ=1, SM Higgs µ=0, SM without Higgs Two hypothetical measurements Data NOTE: b=b() mH 26 Eilam Gross, Higgs Statistics, Orsay 09 26

27 The Discovery Case Study
BG control sample scaled to the expected BG via a factor  The control sample constraints the background NOTE: b=b() mH 27 Eilam Gross, Higgs Statistics, Orsay 09

28 A Simultaneous Fit Two measurements mH mH
28 Eilam Gross, Higgs Statistics, Orsay 09 28

29 mH mH 29 Eilam Gross, Higgs Statistics, Orsay 09 29

30 The Discovery Case Study
Note, in this example, the signal towards the end of the background mass distribution (mH=20,80) is better separated from the signal near the middle (mH=50). mH mH mH 30 Eilam Gross, Higgs Statistics, Orsay 09

31 LR motivation- The Neyman-Pearson Lemma
When performing a hypothesis test between two simple hypotheses, H0 and H1, the Likelihood Ratio test, which rejects H0 in favor of H1, is the most powerful test of size  for a threshold  Define a test statistic Note: Likelihoods are functions of the data, even though we often not specify it explicitly 31 Eilam Gross, Higgs Statistics, Orsay 09

32 The frequentist LR (CLb) method
Define a test statistics 32 Eilam Gross, Higgs Statistics, Orsay 09

33 The frequentist LR (CL) method
Define a test statistics Use MC to generate the pdf of  under H0(B only) and H1 (S+B) p-value 33 Eilam Gross, Higgs Statistics, Orsay 09

34 The frequentist LR (CL) method
Define a test statistics Use MC to generate the pdf of  under H0(B only) and H1 (S+B) Let obs be a result of one experiment (LHC) p-value logobserved 34 Eilam Gross, Higgs Statistics, Orsay 09

35 Example (from SUSY02): Simulating BG Only Experiments
The likelihood ratio, -2ln(mH) tells us how much the outcome of an experiment is signal-like NOTE, here the s+b pdf is plotted to the left! Discriminator (from SUSY02): -2ln(mH) Eilam Gross, Higgs Statistics, Orsay 09 s+b like b-like

36 Example (from SUSY02): Simulating S(mH)+b Experiments
The likelihood ratio, -2ln(mH) tells us how much the outcome of an experiment is signal-like NOTE, here the s+b pdf is plotted to the left! s+b like b-like -2ln(mH) Eilam Gross, Higgs Statistics, Orsay 09

37 Straighten Things Up s+b like b-like
Eilam Gross, Higgs Statistics, Orsay 09

38 The frequentist LR (CL) method
Define a test statistics Use MC to generate the pdf of  under H0(B only) and H1 (S+B) Let  be a result of one experiment (LHC) The p-value is the probability to get an observation which is less b-like than the observed one If the result of the experiment (LHC) yields a p-value< 2.8· a 5 discovery is claimed NOTE: the p-value can be interpreted as a frequency  this is a frequentist approach p-value logobserved 38 Eilam Gross, Higgs Statistics, Orsay 09

39 Profile Likelihood Let us consider a counting experiment, measuring n
Define Construct a Likelihood Test the bg-only ( µ=0 ) hypothesis Eilam Gross, Higgs Statistics, Orsay 09

40 Profile Likelihood Define the PL ratio
Define the test statistics to be Note: If the data is b-like If the data contains a signal q0 b-like s+b-like 40 Eilam Gross, Higgs Statistics, Orsay 09

41 The Profile Likelihood Simulator
The Profile Likelihood Matlab Simulator © O. Vitells & E. Gross Eilam Gross, Higgs Statistics, Orsay 09 41

42 Systematics Normally, the background, b(), has an uncertainty which has to be taken into account. In this case  is called a nuisance parameter (which we associate with background systematics) How can we take into account the nuisance parameters? One way: marginalize them (integrate them out using priors)  the Hybrid CL (mix frequentist and Bayesian approach) Another way is profiling via the MLEs: prior 42 Eilam Gross, Higgs Statistics, Orsay 09 42

43 Systematics via Profiling
is the MLE of is the MLE of under H1 is the MLE of under H0 PL Ratio LR (CLs+b) 43 Eilam Gross, Higgs Statistics, Orsay 09 43

44 Wilks Theorem Under a set of regularity conditions and for a sufficiently large data sample, Wilks’ theorem says that for a hypothesized value of μ, the pdf of the statistic −2lnλ (μ) approaches the chi-square pdf for one degree of freedom One of the conditions is HµH0 Eilam Gross, Higgs Statistics, Orsay 09

45 Wilks Theorem distributes as a c2 with 1 d.o.f for experiments under the hypothesis Hµ i.e. q0 distributes as c2 for b-only experiments, q1 distributes as c2 for s+b experiments This ensures simplicity, coverage, speed Eilam Gross, Higgs Statistics, Orsay 09

46 Wilks Theorem distributes as a c2 with 1 d.o.f for experiments under the hypothesis Hµ Z is the significance Eilam Gross, Higgs Statistics, Orsay 09 46

47 Discovery - Illustrated
Eilam Gross, Higgs Statistics, Orsay 09 47 p0 is the level of compatibility between the data and the no-Higgs hypothesis If p0 is smaller than ~2.8∙10-7 we claim a 5s discovery

48 Median Sensitivity To estimate the median sensitivity of an experiment (before looking at the data), one can either perform lots of s+b experiments and estimate the median q0,med or evaluate q0 with respect to a representative data set, the ASIMOV data set with m=1, i.e. n=s+b Eilam Gross, Higgs Statistics, Orsay 09

49 The Profile Likelihood Simulator
The Profile Likelihood Matlab Simulator © O. Vitells & E. Gross Eilam Gross, Higgs Statistics, Orsay 09 49

50 The ASIMOV data sets The name of the Asimov data set is inspired by the short story Franchise, by Isaac Asimov [1]. In it, elections are held by selecting a single voter to represent the entire electorate. The "Asimov" Representative Data-set for Estimating Median Sensitivities with the Profile Likelihood G/ Cowan, K. Cranmer, E. Gross , O. Vitells, under preparation [1] Isaac Asimov, Franchise, in Isaac Asimov: The Complete Stories, Vol. 1, Broadway Books, 1990. Eilam Gross, Higgs Statistics, Orsay 09

51 The Profile Likelihood Simulator
The Profile Likelihood Matlab Simulator © O. Vitells & E. Gross Eilam Gross, Higgs Statistics, Orsay 09

52 Wilks Theorem with Nuisance Parameters
if there are n parameters of interest, i.e., those parameters that do not get a double hat in the numerator of the likelihood ratio then −2lnλ (μ) asymptotically follows a chi-square distribution for n degrees of freedom. Eilam Gross, Higgs Statistics, Orsay 09

53 The Profile Likelihood with Nuisance Parameters
The procedure is identical to the one without nuisance parameters Eilam Gross, Higgs Statistics, Orsay 09

54 CL & CI - Wikipedia A confidence interval (CI) is a particular kind of interval estimate of a population parameter. Instead of estimating the parameter by a single value, an interval likely to include the parameter is given. Thus, confidence intervals are used to indicate the reliability of an estimate. How likely the interval is to contain the parameter is determined by the confidence level or confidence coefficient. Increasing the desired confidence level will widen the confidence interval. Eilam Gross, Higgs Statistics, Orsay 09

55 The Profiled CL way Generate the pdf of -2lnPL under H0 and H1
Define CI for q0: [-,qobs] The CL of this CI is given by p0 ''='' 1-CLb ATLAS, CERN – Open Cowan, Cranmer, E.G., Vitells, in preparation 55 Eilam Gross, Higgs Statistics, Orsay 09

56 The Profiled CL way p0 ''='' 1-CLb The median significance can be obtained with the one Asimov data set n~s+b, m~b ATLAS, CERN – Open Cowan, Cranmer, E.G., Vitells, in preparation 56 Eilam Gross, Higgs Statistics, Orsay 09 56

57 The Profiled CL way In this example a Higgs with a mass mH<32 or mH>52 is expected to be discovered, i.e. if the Higgs exists in this mass range it will be discovered >50% of hypothetical LHC experiments 57 Eilam Gross, Higgs Statistics, Orsay 09

58 The Profile Likelihood Ratio
CL Profiled LR PROFILE LIKELIHOOD RATIO IS DIFFERENT! PL Ratio: Test the null H0 hypothesis Use the Wilks theorem and the Asimov data set to deduce the p0,med 58 58 Eilam Gross, Higgs Statistics, Orsay 09

59 The frequentist Profile Likelihood Ratio vs Profiled CL
Its not surprising that using a modified LR (not the Neyman-Pearson motivated LR) gives a slightly less sensitive result ,yet Why using a method with a slightly lower sensitivity? Because of the Wilks theorem which can save us hours and days of computer time Profiled CL PL Ratio 59 Eilam Gross, Higgs Statistics, Orsay 09

60 Expected Discovery Sensitivity Combination of Channels
Each channel has its PLR with respect to the Asimov data set (x=s+b) For each channel We find That is, lA(0|x=s+b) approximates the median value of l (0) one would obtain from data generated according to the s+b hypothesis. The value of lA(0|x=s+b) is used to determine the median q0,med, which is used to find the median p-value, pmed. Eilam Gross, Higgs Statistics, Orsay 09

61 The Look Elsewhere Effect
Eilam Gross, Higgs Statistics, Orsay 2009

62 Look Elsewhere Effect To establish a discovery we try to reject the background only hypothesis H0 against the alternate hypothesis H1 H1 could be A Higgs Boson with a specified mass mH A Higgs Boson at some mass mH in the search mass range The look elsewhere effect deals with the floating mass case Let the Higgs mass, mH, and the signal strength µ be 2 parameters of interest 62 Eilam Gross, Higgs Statistics, Orsay 09

63 Look Elsewhere Effect 2 parameters of interest: the signal strength µ and the Higgs mass mH mH mH 63 Eilam Gross, Higgs Statistics, Orsay 09

64 Look Elsewhere Effect Letting the Higgs mass float we find that the background- only experiments distribute approximately as a The median sensitivity is given by the corresponding p- value Does Does it satisfy Wilks theorem? 64 Eilam Gross, Higgs Statistics, Orsay 09

65 Look Elsewhere Effect Back of the envelope: mH
65 Eilam Gross, Higgs Statistics, Orsay 09

66 Discovery via the Bayes Way: Bayes Factors
66 Eilam Gross, Higgs Statistics, Orsay 2009 66

67 The Bayes Way Eilam Gross, Higgs Statistics, Orsay 09

68 Frequentist ~ Bayesian ?
Eilam Gross, Higgs Statistics, Orsay 09

69 Frequentist ~ Bayesian ?
We found With some work (E.G.,O.Vitells, POS Krakow 09) we show that for the Asimov data set (s+b) Finally we find Eilam Gross, Higgs Statistics, Orsay 09

70 Frequentist ~ Bayesian ?
Eilam Gross, Higgs Statistics, Orsay 09

71 Frequentist~Bayesian Why Does It Work
This relationship between the Bayes factor and the frequentist PL ratio, though disturbing in the first place, is not surprising when you come to think about it. Wilks theorem ensures that using the PL ratio you do not need to perform any toy MC experiments to tell a significance of an observation based on the one observed data set. This is also the characteristics of a Bayesian hypothesis test. Eilam Gross, Higgs Statistics, Orsay 09

72 EXCLUSION 72 Eilam Gross, Higgs Statistics, Orsay 2009

73 Exclusion Case Study M=20 M=50 M=80 mH mH mH
73 Eilam Gross, Higgs Statistics, Orsay 09

74 Exclusion and p-value To exclude the s(mH)+b hypothesis (H1) we try to reject it i.e. calculating the observed or expected p-value of the test statistic under the s(mH)+b pdf If the p-value ps+b<5% we claim the s+b hypothesis was rejected at >95% CL Here we use the arguable relationship CL=1-ps+b Eilam Gross, Higgs Statistics, Orsay 09

75 The frequentist CLs+b method
Generate the pdf of -2ln under H0 and H1 Test the H1 hypothesis Define CI for q1: [-,qobs] The CL of this CI is given by p1 ''='' CLs+b 75 Eilam Gross, Higgs Statistics, Orsay 09 75

76 The frequentist CLs+b method
Use the LR as a test statistics To take systematics unto account integrate the nuisance parameters or profile them The exclusion is given by the s(mH)+b hypothesis p-value ps+b=CLs+b If ps+b<5%, the s(mH)+b hypothesis is rejected at the 95% CL p-value p-value=CLs+b 76 Eilam Gross, Higgs Statistics, Orsay 09 76

77 Exclusion – Illustrated
Eilam Gross, Higgs Statistics, Orsay 09 77 p1 is the level of compatibility between the data and the Higgs hypothesis If p1 is smaller than 0.05 we claim an exclusion at the 95% CL

78 Exclusion with Profile Likelihood
Exclusion is related to the probability of the “would be” signal to fluctuate down to the background only region (i.e. the p-value of the s+b hypothesis) To evaluate the median sensitivity of an experiment we generate a BG only data and calculate the median q1,medp1,medZmed. Exclusion at the 95% C.L. (one sided) means Z=1.64 78 Eilam Gross, Higgs Statistics, Orsay 09

79 Deriving an Upper Limit
µ is the signal strength When measured it can be interpreted as To get a “95% CL” upper limit on µ one has to solve This will give the 95% credible (or confidence) interval [0,µ95] If this interval contains the value µ=1, the SM Higgs is NOT excluded This µ95 is therefore interpreted as a 95% upper limit on µ, i.e. µupper. Eilam Gross, Higgs Statistics, Orsay 09

80 Exclusion Expected Limit, Combination of Channels
Each channel has its PLR with respect to the Asimov data set (background only, x=b) For each channel We find That is, lA(m|x=b) approximates the median value of l (m) one would obtain from data generated according to the background-only hypothesis. The value of lA(m|x=b) is used to determine the median qmed, which is used to find the median p-value, pmed. This has to be computed for all µ and the point where pµmed = 0.05 gives the 95% CL upper limit on µmed. Eilam Gross, Higgs Statistics, Orsay 09

81 Profile Likelihood Ratio
Test the S(mH)+b hypothesis i.e. test the µ=1 hypotheis q1 distributes as a 2 under s(mH)+b experiments (H1) The exclusion significance can be expressed in terms of an equivalent exclusion CL The exclusion sensitivity is the median CL, and using toy MCs one can find the 1 and 2  bands If ps+b<5% we (using a wrong jargon) say that the signal is excluded at >95% CL (CL=1-ps+b( Note that the exclusion CL is identified here with CL=1-CLs+b where CLs+b=ps+b 81 Eilam Gross, Higgs Statistics, Orsay 09

82 Exclusion Profile Likelihood Ratio
If ps+b<5%, the s(mH)+b hypothesis is rejected at the 95% CL A Higgs with a specific mass mH is excluded at the 95% CL if the observed p-value of the s(mH)+b hypothesis is below 0.05 In this example a Higgs Boson is expected to be excluded p1<0.05 (CL>95%) in all the mass range 82 Eilam Gross, Higgs Statistics, Orsay 09

83 Exclusion Bayesian Let be the posterior for µ NOTE: The pdf of the posterior is based on the one observed data event with the likelihood integrated over the nuisance parameters To set an upper limit on the signal strength calculate the credibility interval [0,µ95] Data = Asimov b 83 Eilam Gross, Higgs Statistics, Orsay 09

84 Exclusion Bayesian Let be the posterior for µ NOTE: The toy MC are needed just to find the median sensitivity, but once the data is delivered, it is sufficient to determine the upper limit using the posterior integration Data = b-only 84 Eilam Gross, Higgs Statistics, Orsay 09

85 Exclusion Bayesian We find that the credibility interval [0,µ95] does not contain µ95=1 (SM) for mH<28 or mH>61 This is sometimes wrongly expressed as an exclusion at the 95% CL mH 85 Eilam Gross, Higgs Statistics, Orsay 09

86 Comparing Bayesian to Frequentist PL
saddle-point approximation (for flat priors) For the Asimov BG only data Note: Taking the proper normalizations into account we find the following equivalence: 86 Eilam Gross, Higgs Statistics, Orsay 09

87 Exclusion Bayesian vs PL Ratio
Comparing a credibility Bayesian interval to 95% frequentist CL is like comparing oranges to apples…. Yet In Bayesian statistics, the observed data is sufficient to infer how strong is an hypothesis (assuming some priors). In the Profile Likelihood frequentist approach, Wilks’ theorem ensures that under a set of regularity conditions and for a sufficiently large data sample, for an hypothesized value of µ, the mH NOTE: One has to be careful about the 1-sided vs 2-sided significance 87 Eilam Gross, Higgs Statistics, Orsay 09

88 The problem with the CLs+b method
Use the LR as a test statistics To take systematics unto account integrate the nuisance parameters or profile them The exclusion is given by the s(mH)+b hypothesis p-value ps+b=CLs+b If ps+b<5%, the s(mH)+b hypothesis is rejected at the 95% CL p-value p-value=CLs+b 88 Eilam Gross, Higgs Statistics, Orsay 09

89 The modified frequentist CLs
CLs+b enables the exclusion of the s(mH)+b hypothesis, a downward fluctuation of the background might lead to an exclusion of a signal to which one is not sensitive (with a very low cross section) To protect against such fluctuations, the CL was redefined in a non-frequentist way to be Statisticians do not like this p-values ratio, yet, physics-wise it is conservative in a sense of coverage. Alex Read J.Phys.G28: ,2002 89 Eilam Gross, Higgs Statistics, Orsay 09 Eilam Gross, Higgs Statistics, Santander 2009

90 A wrong jargon – mixing p-values and Confidence Levels
s+b like b-like Eilam Gross, Higgs Statistics, Orsay 09

91 CLs+b and CLb s+b like b-like Observed Likelihood
1-CLb is the p value of the b-hypothesis, i.e. the probability to get a result less compatible with the BG only hypothesis than the observed one (in experiments where BG only hypothesis is true) CLs+b is the p-value of the s+b hypothesis, i.e. the probability to get a result which is less compatible with a Higgs signal when the signal hypothesis is true! A small CLs+b leads to an exclusion of the signal hypothesis at the CL=1- CLs+b confidence level. s+b like b-like Observed Likelihood Eilam Gross, Higgs Statistics, Orsay 09

92 The Problem of Small Signal
<Nobs>=s+b leads to the physical requirement that Nobs>b A very small expected s might lead to an anomaly when Nobs fluctuates far below the expected background, b. At one point DELPHI alone had CLs+b=0.03 for mH=116 GeV However, the cross section for 116 GeV Higgs at LEP was too small and Delphi actually had no sensitivity to observe it The frequntist would say: Suppose there is a 116 GeV Higgs…. Only 3% of the confidence intervals contain the true value of qSM(mH) i.e. in 3% of the experiments the true signal would be rejected… (one would obtain a result incompatible or more so with m=116) i.e. a 116 GeV Higgs is excluded at the 97% CL….. 97% of the intervals [qobs,] do not contain qSM (mH) Observed Likelihood Eilam Gross, Higgs Statistics, Orsay 09

93 The CLs Method for Upper Limits
Inspired by Zech(Roe and Woodroofe)’s derivation for counting experiments A. Read suggested the CLs method with In the DELPHI example, CLs=0.03/0.13=0.26, i.e. a 116 GeV could not be excluded at the 97% CL anymore….. (pb=1-CLb=0.87) Observed Likelihood Eilam Gross, Higgs Statistics, Orsay 09

94 The Meaning of CLs False exclusion rate of Signal when Signal is true
mH 115 GeV False exclusion rate of Signal when Signal is true Is it really that bad that a method undercovers where Physics is sort of handicapped… (due to loss of sensitivity)? Eilam Gross, Higgs Statistics, Orsay 09

95 The modified frequentist CLs
In this example, while using PL or the CLs the Higgs is excluded in all the mass range, the CLs reduces the sensitivity and does not allow to exclude a Higgs with 30<mH<60 95 Eilam Gross, Higgs Statistics, Orsay 09

96 Conclusions We have explored and compared all the methods to test hypotheses that are currently in use in the High Energy Physics market (PLR, CLs+b, CLs, Bayesian ) We have shown that all methods tend to give similar results, (for both exclusion and discovery using flat priors) weather one integrates the nuisance parameters or profile them Even though we have used typical case studies, real life might be different and all available methods should be explored 96 Eilam Gross, Higgs Statistics, Orsay 09


Download ppt "Eilam Gross, Higgs Statistics, Orsay 09"

Similar presentations


Ads by Google