Frequentist approach Bayesian approach Statistics I.


Presentation on theme: "Frequentist approach Bayesian approach Statistics I."— Presentation transcript:

1 Frequentist approach Bayesian approach Statistics I

2

3 2

4

5 PHYSTAT 05 - Oxford 12th - 15th September 2005
Statistical problems in Particle Physics, Astrophysics and Cosmology

6

7

8

9 Frequentist confidence intervals (figure labels: θ1, θ2, x)

10 θ1 < θ < θ2 when x1 < x < x2
Figure labels: true value θ, CL, θ1, x, x1, x2, possible interval, with the boundary cases x = x1 and x = x2. Since θ1 < θ < θ2 when x1 < x < x2, P(θ1 < θ < θ2) = P(x1 < x < x2) = CL

11 Neyman integrals
Elementary statistics may be WRONG!! (Figure labels: θ1, θ2, x.)

12

13 Search for pivotal variables
Three approaches: Neyman integrals (figure labels: θ1, x, θ2), bootstrap, search for pivotal variables. The last method avoids the graphical procedure and the solution of the Neyman integrals

14 Because P{Q} does not contain the parameter!

15 Estimation of the sample mean
since Due to the Central Limit theorem we have a pivot quantity when N>>1 Hence:
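The formulas of this slide did not survive in the transcript. As a hedged reminder of the standard textbook form of the argument (the symbols m for the sample mean, s for the sample standard deviation, μ for the true mean and t for the normal quantile are assumptions, not taken from the slide):

```latex
\frac{m-\mu}{s/\sqrt{N}} \sim N(0,1) \quad (N \gg 1)
\qquad\Longrightarrow\qquad
P\!\left[\, m - t\,\frac{s}{\sqrt{N}} \;\le\; \mu \;\le\; m + t\,\frac{s}{\sqrt{N}} \,\right] = CL
```

The left-hand quantity is a pivot because its distribution does not depend on μ, so the probability statement can be inverted directly into an interval for μ.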

16 t is the quantile of the normal distribution: t = 1, area 84%
Quantile α = 0.84 (t_α = 1): P[|f − p| < t σ] = 68%. Solving this inequality for p, with σ depending on p, gives the Wilson interval; replacing σ by its estimate from f gives the Wald interval
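A minimal sketch of the two intervals named on this slide, assuming the usual binomial setting f = x/n and σ² = p(1 − p)/n. The function names are illustrative and not from the slides.

```python
import math
from statistics import NormalDist


def wald_interval(x, n, cl):
    """Wald interval: f +/- t * sqrt(f(1-f)/n), with sigma estimated at f."""
    f = x / n
    t = NormalDist().inv_cdf(0.5 + cl / 2.0)  # two-sided normal quantile
    half = t * math.sqrt(f * (1.0 - f) / n)
    return f - half, f + half


def wilson_interval(x, n, cl):
    """Wilson interval: the values of p that satisfy |f - p| <= t * sqrt(p(1-p)/n)."""
    f = x / n
    t = NormalDist().inv_cdf(0.5 + cl / 2.0)
    denom = 1.0 + t * t / n
    centre = (f + t * t / (2.0 * n)) / denom
    half = t * math.sqrt(f * (1.0 - f) / n + t * t / (4.0 * n * n)) / denom
    return centre - half, centre + half


print(wald_interval(5, 20, 0.90))
print(wilson_interval(5, 20, 0.90))
```

The Wald interval collapses to zero width at x = 0 or x = n, while the Wilson interval stays inside [0, 1] and keeps a finite width there.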

17 p1 p2 p

18 The 90% CL Gaussian upper limit
Figure labels: 90% area, 10% area, 1.28 σ, observed value.
Meaning I: with this upper limit, values less than the observed one are possible with a probability < 10%.
Meaning II: a larger upper limit should give values less than the observed one in less than 10% of the experiments.
Meaning III: the probability to be wrong is 10%
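A short sketch of where the 1.28 σ comes from, assuming a Gaussian measurement with known σ. The function name and the example numbers (observed value 0, σ = 1) are illustrative only.

```python
from statistics import NormalDist


def gaussian_upper_limit(observed, sigma, cl=0.90):
    """Upper limit: the mean for which values below `observed` have probability 1 - cl."""
    t = NormalDist().inv_cdf(cl)  # one-sided normal quantile, about 1.28 for cl = 0.90
    return observed + t * sigma


limit = gaussian_upper_limit(observed=0.0, sigma=1.0)
# Probability of a result below the observed value if the true mean sits at the limit
tail = NormalDist(mu=limit, sigma=1.0).cdf(0.0)
print(limit, tail)
```

Here `tail` is the 10% area of Meaning I: if the true value were at the limit, a result as low as the observed one would occur in 10% of the experiments.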

19

20

21 The trigger problem: the probability to be a muon after the trigger, P(μ|T)

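22 Prior: 10,000 particles, of which 9000 π and 1000 μ
Trigger: of the 9000 π, 8550 are rejected and 450 accepted; of the 1000 μ, 50 are rejected and 950 accepted. Enrichment: 950/(950 + 450) = 68%. Efficiency: (950 + 450)/10,000 = 14%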
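The same numbers follow from Bayes' theorem. In the sketch below the trigger acceptances (95% for muons, 5% for pions) are read off the counts on the slide. The function and variable names are illustrative.

```python
def muon_after_trigger(prior_mu, acc_mu, acc_pi):
    """Return P(mu | T) and the overall trigger efficiency P(T)."""
    prior_pi = 1.0 - prior_mu
    p_trigger = acc_mu * prior_mu + acc_pi * prior_pi  # total probability of a trigger
    return acc_mu * prior_mu / p_trigger, p_trigger


# 1000 muons out of 10,000 particles; 950/1000 muons and 450/9000 pions accepted
p_mu_given_t, p_t = muon_after_trigger(prior_mu=0.10, acc_mu=0.95, acc_pi=0.05)
print(f"P(mu|T) = {p_mu_given_t:.3f}, P(T) = {p_t:.3f}")
```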

23 Bayesian

24

25

26

27

28 Bayesian credible interval

29

30 From coin tossing to physics: the efficiency measurement
ArXiv:physics/ v1 Valid also for k=0 and k=n

31 Elementary example
20 events have been generated and 5 passed the cut. What is the estimate of the efficiency with CL = 90%? x = 5, n = 20, CL = 90%. Frequentist result: ε ∈ [ε1, ε2] = [0.104, 0.455]. Bayesian result: ε ∈ [ε1, ε2] = [0.122, 0.423]. What meaning??
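The frequentist numbers can be checked with a small sketch, assuming they come from the exact central (Clopper-Pearson) construction with a probability of (1 − CL)/2 in each tail. The helper names are illustrative, and the limits are found by plain bisection on the binomial sums.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))


def bisect(func, lo, hi, tol=1e-10):
    """Root of a monotonic function that changes sign on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(x, n, cl):
    """Central exact interval: each tail has probability (1 - cl) / 2."""
    alpha = 1.0 - cl
    # Lower limit: P(X >= x | p1) = alpha / 2; p1 = 0 when x = 0.
    p1 = 0.0 if x == 0 else bisect(
        lambda p: 1.0 - binom_cdf(x - 1, n, p) - alpha / 2, 0.0, 1.0)
    # Upper limit: P(X <= x | p2) = alpha / 2; p2 = 1 when x = n.
    p2 = 1.0 if x == n else bisect(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return p1, p2


print(clopper_pearson(5, 20, 0.90))
```

Under this assumption the result can be compared directly with the frequentist interval [0.104, 0.455] quoted on the slide.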

32 Efficiency calculation: an OPEN PROBLEM!!
Wilson interval (1934); Wald (1950), standard in physics; exact frequentist Clopper-Pearson (1934) (PDG); Bayes: this is not frequentist but can be tested in a frequentist way

33 Coverage simulation
Generate x: gRandom → Binomial(p,N) → x. With 1 − CL = α, compute the interval limits p1, p2 using TMath::BinomialI(p,N,x). If p1 < p < p2, k++. After n trials, ε = k/n. One expects ε ~ CL
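A pure-Python sketch of this loop, with stand-ins for the ROOT calls: the binomial draw is a sum of Bernoulli trials and the limits p1, p2 are the exact central limits found by bisection. All names, the seed and the number of trials are assumptions made for illustration.

```python
import math
import random


def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(x, n + 1))


def solve_increasing(func, target, tol=1e-9):
    """Find p in [0, 1] with func(p) = target, for func increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_interval(x, n, cl):
    """Exact central limits p1, p2 with probability alpha/2 in each tail."""
    alpha = 1.0 - cl
    p1 = 0.0 if x == 0 else solve_increasing(
        lambda p: binom_tail_ge(x, n, p), alpha / 2)
    # P(X <= x | p) = alpha/2 is equivalent to P(X >= x + 1 | p) = 1 - alpha/2
    p2 = 1.0 if x == n else solve_increasing(
        lambda p: binom_tail_ge(x + 1, n, p), 1.0 - alpha / 2)
    return p1, p2


def coverage(p_true, n, cl, n_trials, seed=1):
    """Fraction of simulated experiments whose interval contains p_true."""
    rng = random.Random(seed)
    intervals = [exact_interval(x, n, cl) for x in range(n + 1)]  # one per outcome
    k = 0
    for _ in range(n_trials):
        x = sum(rng.random() < p_true for _ in range(n))  # one binomial draw
        p1, p2 = intervals[x]
        if p1 <= p_true <= p2:
            k += 1
    return k / n_trials


eps = coverage(p_true=0.25, n=20, cl=0.90, n_trials=20000)
print(f"coverage = {eps:.3f} for stated CL = 0.90")
```

Because x is discrete, the exact construction cannot hit the stated CL exactly and covers at least as often as stated.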

34 Simulate many x with a true p and check how often the intervals contain the true value p. Compare this frequency with the stated CL. CHAOS? CL = 0.95, n = 50

35 Simulate many x with a true p and check how often the intervals contain the true value p. Compare this frequency with the stated CL. CL = 0.90, n = 20

36 In the estimation of the efficiency (probability) the coverage is “chaotic”
The new standard (not yet for physicists) is to use the exact frequentist or the formula. The standard formula should be abandoned: BYE-BYE
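The "chaotic" behaviour can be reproduced without random numbers by summing the binomial probabilities of all outcomes whose interval contains p. The sketch below assumes that the "standard formula" is the Wald interval f ± t·sqrt(f(1 − f)/n). The grid of p values and the function names are illustrative.

```python
import math
from statistics import NormalDist


def wald_covers(x, n, p, t):
    """True if the Wald interval built from x successes contains p."""
    f = x / n
    half = t * math.sqrt(f * (1.0 - f) / n)
    return f - half <= p <= f + half


def exact_coverage(p, n, cl):
    """Sum the binomial probabilities of all x whose Wald interval contains p."""
    t = NormalDist().inv_cdf(0.5 + cl / 2.0)
    return sum(
        math.comb(n, x) * p**x * (1.0 - p)**(n - x)
        for x in range(n + 1)
        if wald_covers(x, n, p, t)
    )


grid = [i / 100 for i in range(5, 96)]
cov = [exact_coverage(p, 20, 0.90) for p in grid]
for p, c in zip(grid[::15], cov[::15]):
    print(f"p = {p:.2f}  coverage = {c:.3f}")
```

The coverage jumps as p crosses the edges of the discrete set of intervals, and for small p it falls far below the stated CL because the x = 0 interval has zero width.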

37 The problem persists also with large samples!
0.95 0.90 0.86

38 (2001)

39 Counting experiments: Poisson case
Wilson interval (1934); Wald (1950), standard in physics; exact frequentist Clopper-Pearson (1934) (PDG); Bayes: this is not frequentist but can be tested in a frequentist way

40 Poissonian Coverage simulation
CL=68%

41 Poissonian Coverage simulation
CL=90%

42 Poissonian Coverage simulation: maximum probability constraint
Figure labels: CL, k, n

43 Poissonian Coverage simulation: max likelihood constraint
Feldman & Cousins, Phys. Rev. D 57 (1998) 3873. Figure labels: k, n

44 Poissonian Coverage simulation
CL=68%

45 Poissonian Coverage simulation
CL=90%

46 Frequentism is the best way to give the result of an experiment
By adopting a practical attitude, Bayesian formulae can also be tested in a frequentist way. Frequentism is the best way to give the result of an experiment in physics: x ± σ

47 Quantum Mechanics: frequentist or Bayesian?
Born or Bohr? The standard interpretation is frequentist

48 Signal over Background in Physics Analysis of counting experiments
Some case studies Statistics II

49

50 .. From the Curtis Meyer review (Miami 2004)

51 The first result PRL 91(2003)012002

52 4.6 sigma! Is it convincing???

53 Hypothesis test I true density N

54 Hypothesis test II mb + ms true density N

55 Parameter estimation Nb N= Ns + Nb

56 PRL 91(2003)012002

57 This is the most common Seldom used Recently Proposed
(hypothesis test)

58

59 ???

60 HERMES: 27.6 GeV positron beam on deuterium (2004)

61

62 No 5σ effect!!

63 A powerful method: Maximum Likelihood
Figure labels: hypothesis, observation. The form p(x; θ) is fitted to the data by maximizing the ordinates of the observed data
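As a minimal illustration of the idea, the sketch below fits the single parameter of an exponential density to simulated data by maximizing the product of the ordinates p(x_i; τ), that is, by minimizing −ln L. The choice of density, the sample, the seed and the use of a golden-section search are all assumptions made for the example.

```python
import math
import random


def neg_log_likelihood(tau, data):
    """-ln L for the exponential density p(x; tau) = exp(-x / tau) / tau."""
    return sum(math.log(tau) + x / tau for x in data)


def minimize_golden(func, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return 0.5 * (a + b)


rng = random.Random(7)
data = [rng.expovariate(1.0 / 2.0) for _ in range(500)]  # simulated with true tau = 2
tau_hat = minimize_golden(lambda tau: neg_log_likelihood(tau, data), 0.1, 10.0)
print(tau_hat, sum(data) / len(data))
```

For this density the maximum is known analytically (the sample mean), which gives a simple check of the numerical fit.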

64

65

66

67 ... in Physics. Figure labels: P(H0), P(H1), α, β, 1 − α, 1 − β, exp value, power

68 A Milestone: the Neyman-Pearson theorem. Likelihood Ratio Test

69 Likelihood Ratio ni from MC samples!

70

71 Steps of the likelihood ratio test
(1) Determine the ratio si/bi for each bin (model + MC simulation). (2) Find the lnQ pdf by simulating ni from background only (with the same experimental statistics). (3) Find the lnQ pdf by simulating ni with signal + background. (4) Calculate lnQ for the data ni and make the test. Figure labels: −lnQ, ni
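The four steps can be sketched with toy experiments, assuming independent Poisson bins so that ln Q = Σ ni ln(1 + si/bi) − Σ si. The bin contents, the observed counts, the seed and all function names below are hypothetical and serve only to make the example run.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def ln_q(n, s, b):
    """ln Q = ln[L(s+b) / L(b)] for independent Poisson bins."""
    return sum(ni * math.log(1.0 + si / bi) for ni, si, bi in zip(n, s, b)) - sum(s)


def toy_ln_q(rng, means, s, b, n_toys):
    """ln Q distribution for toy experiments drawn from the given bin means."""
    return [ln_q([poisson(rng, m) for m in means], s, b) for _ in range(n_toys)]


# Step 1: hypothetical signal and background expectations per bin
s = [0.5, 2.0, 4.0, 2.0, 0.5]
b = [10.0, 9.0, 8.0, 7.0, 6.0]
rng = random.Random(3)

lnq_b = toy_ln_q(rng, b, s, b, 5000)                                   # step 2
lnq_sb = toy_ln_q(rng, [si + bi for si, bi in zip(s, b)], s, b, 5000)  # step 3

n_data = [11, 12, 13, 8, 6]      # hypothetical observed counts
lnq_data = ln_q(n_data, s, b)    # step 4

# Fraction of background-only toys at least as signal-like as the data
p_value_b = sum(q >= lnq_data for q in lnq_b) / len(lnq_b)
print(lnq_data, p_value_b)
```

Bins with a larger si/bi enter with a larger weight, which is why the ratio is determined bin by bin in the first step.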

72

73 ALEPH, DELPHI, L3, OPAL, 2003. One can sum over the bins of histograms from different experiments and construct a GLOBAL statistic!

74 ALEPH DELPHI L3 OPAL 2003 mH 5% mH ≥ GeV/c2 CL=95%

75 Conclusions

76

77

78

79

80

81 Bayes formula s [P(S|T)]

82

83

84

85

86 The non parametric Sampling methods The best one !!!

87 Non parametric Bootstrap

88

89 Same element in different samples (sampling with and without replacement); same element in the same sample (sampling with replacement)
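A minimal sketch of the nonparametric bootstrap with replacement, applied to the error on a sample mean so that the result can be compared with the usual s/√n formula. The Gaussian toy sample, the seed and the number of resamples are assumptions made for the example.

```python
import math
import random
import statistics


def bootstrap_error_of_mean(sample, n_boot, rng):
    """Standard deviation of the mean over resamples drawn with replacement."""
    n = len(sample)
    means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)


rng = random.Random(11)
sample = [rng.gauss(10.0, 2.0) for _ in range(200)]  # toy data set

err_boot = bootstrap_error_of_mean(sample, 2000, rng)
err_formula = statistics.stdev(sample) / math.sqrt(len(sample))
print(err_boot, err_formula)
```

Each resample has the same size as the original sample, and the same element may appear in it more than once, which is the "sampling with replacement" case on the slide.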

90 check the bootstrap with MC !!!

91

92 The dual Bootstrap
Fix the background on one sample and calculate the peak signal with another sample to avoid biases!! Repeat on bootstrap samples (dual bootstrap)

93

94 Conclusions
Poissonian counting: most of the tests do not consider the error on the background and overestimate the signal. Often true (mean) values and measured values are improperly confused. Binomial counting: a general theory exists and should be applied. The errors should be calculated by MC methods and the procedure checked with MC toy models. Nonparametric bootstrap methods should also be used by physicists

95 end

96 The other branch of Statistics: Hypothesis Testing

97 2s

98

99

100 Photoproduction on a deuterium target

101

102

103

104 The extended likelihood
N is a function of θ, as in the case of a detector efficiency. If there is no functional relation between N and θ, the result is the same as for the non-extended likelihood

105 error on the mean 1/√n bootstrap underestimates

106 error on the mean 1/√n

107

108 Bootstrap of B1 and Bi data
Trimmed mean 50%. Correlation between measurements. Weighted resampling: Int(1/σ²) times. The error on measurements is not considered. Scope of the analysis: to test whether only the errors or the data themselves are unreliable

109

110 Results Bootstrap: Some data are unreliable Standard analysis:

111

112

113

114 Useful when the two samples are
signal and background....

115 if 110 GeV H is true...

116

117 A word on the permutation tests

118

119

120

121

122 Hypothesis test III

123 end

124

125 Elementary example II. There is a large number of marbles, which are either white or black, and you wish information on the white fraction, m. You draw a single marble, and it is white. What is the fraction m with 90% confidence? Here p1 denotes the lower limit on m. Classical: p1 = 1 − CL ⇒ m ≥ 0.1. Bayesian: flat prior, p1² = 1 − CL ⇒ m ≥ 0.316; 1/m prior, p1 = 1 − CL ⇒ m ≥ 0.100; m prior, p1³ = 1 − CL ⇒ m ≥ 0.464
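The three Bayesian numbers follow from one line of algebra: the likelihood of drawing one white marble is m, so a prior proportional to m^a gives a posterior proportional to m^(a+1), whose cumulative distribution on [0, 1] is m^(a+2). A small sketch, with an illustrative function name:

```python
def lower_limit(prior_power, cl):
    """Lower limit on m for a prior proportional to m**prior_power and one white draw.

    The posterior is proportional to m**(prior_power + 1) on [0, 1], so its
    cumulative distribution is m**(prior_power + 2); the limit leaves
    probability 1 - cl below it.
    """
    return (1.0 - cl) ** (1.0 / (prior_power + 2))


for name, power in [("1/m prior", -1), ("flat prior", 0), ("m prior", 1)]:
    print(f"{name}: m >= {lower_limit(power, 0.90):.3f}")
```

The 1/m prior reproduces the classical limit m ≥ 0.1, while the other two priors pull the limit upwards.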

126 A medical test is 100% effective on sick people, but is also positive on 5% of healthy people. If the disease affects 1% of the population, what is the probability of being sick if the test is positive? The probability of being sick if the test is positive: P(M|P)

127 100 people: 99 healthy, 1 sick. Healthy: 94 negative, 5 positive. Sick: 0 negative, 1 positive. sick/positive = 1/6 ≈ 17%

128

129 Check with the simulation
Simulate many x with n = 20 and the true ε = 0.25 and check how often the frequentist and Bayesian intervals [p1, p2] contain the true value 0.25. Stated CL = 90%. Frequentist result: coverage = 93.6 ± 0.3%. Bayesian result: coverage = 86.8 ± 0.3%. Bayes tends to underestimate

130 Wrong: Least Informative Prior (LIP)

131

132

133 Uniform Jeffreys’ Prior

134

135 Simulate many x with a true p and check how often the intervals contain the true value p. Compare this frequency with the stated CL. CHAOS !!!!!!! CL = 0.90, n = 20

136 Poissonian Coverage simulation

137

138

