Statistical methods in LHC data analysis part II.2 Luca Lista INFN Napoli.


Contents
– Upper limits
– Treatment of background
– Bayesian limits
– Modified frequentist approach (CLs method)
– Profile likelihood
– Nuisance parameters and the Cousins-Highland approach

Significance: claiming a discovery
If we measure a signal yield s sufficiently inconsistent with zero, we can claim a discovery.
Statistical significance = probability α to observe s or a larger signal in the case of a pure background fluctuation.
It is often preferred to quote an “nσ” significance, where:
$\alpha = \frac{1}{\sqrt{2\pi}} \int_n^{\infty} e^{-x^2/2}\,dx$
Usually, in the literature:
– If the significance is > 3σ (“3σ”) one claims “evidence of”
– If the significance is > 5σ (“5σ”) one claims “observation” (discovery!); the probability of a background fluctuation is then 2.87 × 10^-7
In ROOT: n = TMath::NormalQuantile(1 - alpha)
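The conversion between the one-sided tail probability α and the “nσ” significance can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical and are not part of ROOT.

```python
import math
from statistics import NormalDist

def p_value_from_significance(n_sigma):
    """One-sided Gaussian tail probability beyond n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def significance_from_p_value(alpha):
    """Inverse relation: the n such that the upper tail beyond n equals alpha."""
    return -NormalDist().inv_cdf(alpha)

if __name__ == "__main__":
    for n in (3.0, 5.0):
        print(n, p_value_from_significance(n))
```

The sign convention matters: the quantile of a small α is negative, which is why the inverse function flips the sign.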

When using upper limits
Experiments do not always lead to discoveries.
What information can be determined if no evidence of signal is observed?
One possible definition of upper limit:
– “largest value of the signal s for which the probability of a signal under-fluctuation smaller or equal to what has been observed is less than a given level α (usually 10% or 5%)”
– Upper limit at x Confidence Level = s such that α(s) = 1 − x
– Similar to a confidence interval with a central value, but the interval is now fully asymmetric
Other approaches are possible:
– Bayesian limits: extreme of an interval [0, s] over which the posterior probability is 1 − α. It's a different definition!
– Unified Feldman-Cousins frequentist limits

Frequentist vs Bayesian intervals
The two approaches address different questions.
Frequentist:
– Probability that a fixed μ ∈ [μ1, μ2] is 1 − α
– The interval [μ1, μ2] is a random interval determined from the experiment response
– Choosing the interval requires an ordering principle (fully asymmetric, central, unified)
Bayesian:
– The a posteriori probability (degree of belief) of μ being in the interval [μ1, μ2] is equal to 1 − α

Event counting
Assume we have a signal process on top of a background process, each leading to a Poissonian number of counts:
$P(n_s; s) = \frac{e^{-s} s^{n_s}}{n_s!}, \qquad P(n_b; b) = \frac{e^{-b} b^{n_b}}{n_b!}$
If we can't discriminate signal and background events, we will measure:
– n = n_s + n_b
The distribution of n is again Poissonian, with expectation s + b:
$P(n; s, b) = \frac{e^{-(s+b)} (s+b)^{n}}{n!}$

Simplest case: zero events
If we observe zero events we can state that:
– No background events have been observed (n_b = 0)
– No signal events have been observed (n_s = 0)
The Poissonian probability to observe zero events expecting s (assuming the expected background b is negligible) is:
– $p(n_s = 0) = P_s(0) = e^{-s} = 1 - \mathrm{C.L.} = \alpha$
So, naïvely, $s < s_{up} = -\ln(1 - \mathrm{C.L.})$:
– s < 3.00 at 95% C.L.
– s < 2.30 at 90% C.L.
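As a quick numerical check of the zero-event formula, the short sketch below evaluates s_up = −ln(1 − C.L.); the function name is an illustrative choice.

```python
import math

def zero_event_upper_limit(cl):
    """Upper limit on s when zero events are observed and background is negligible."""
    return -math.log(1.0 - cl)

if __name__ == "__main__":
    for cl in (0.90, 0.95):
        print(f"C.L. = {cl:.2f}: s < {zero_event_upper_limit(cl):.4f}")
```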

Bayesian interpretation
We saw that, assuming a uniform prior for s, the posterior PDF for s is:
$f(s|0) = e^{-s}$
The cumulative distribution is:
$F(s|0) = 1 - e^{-s}$
In particular:
$F(s_{up}|0) = 1 - \alpha \;\Rightarrow\; s_{up} = -\ln\alpha$
which gives, by chance, an identical result.
[Figure: f(s|0) versus s, with the α = 5% tail indicated]

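Upper limits in case of background
Bayesian approach (flat prior, from 0 to ∞):
$1 - \alpha = \int_0^{s_{up}} f(s|n)\,ds$
where (for fixed b!):
$f(s|n) = \frac{e^{-(s+b)} (s+b)^n}{\int_0^{\infty} e^{-(s'+b)} (s'+b)^n\,ds'}$
The limit can be obtained by inverting the equation:
$\alpha = e^{-s_{up}} \frac{\sum_{m=0}^{n} (s_{up}+b)^m / m!}{\sum_{m=0}^{n} b^m / m!}$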
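The inversion has no closed form for n > 0, so it is usually done numerically. The sketch below assumes the standard flat-prior expression α = e^{−s} Σ_{m≤n} (s+b)^m/m! / Σ_{m≤n} b^m/m! and solves it by bisection; the function names and the search bound `s_max` are illustrative choices.

```python
import math

def tail_probability(s, b, n):
    """Posterior probability that the signal exceeds s (flat prior, fixed b, n observed)."""
    num = sum((s + b) ** m / math.factorial(m) for m in range(n + 1))
    den = sum(b ** m / math.factorial(m) for m in range(n + 1))
    return math.exp(-s) * num / den

def bayesian_upper_limit(n, b, cl=0.90, s_max=100.0):
    """Solve tail_probability(s) = 1 - cl for s by bisection (the tail decreases with s)."""
    alpha = 1.0 - cl
    lo, hi = 0.0, s_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tail_probability(mid, b, n) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for n, b in [(0, 0.0), (0, 3.0), (3, 0.0), (3, 2.0)]:
        print(n, b, round(bayesian_upper_limit(n, b), 3))
```

For n = 0 the sums reduce to 1 and the limit is independent of b, as in the zero-event case.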

Upper limits with background
Graphical view (O. Helene, 1983)
[Figure: Bayesian upper limit s_up versus expected background b, for different observed counts n]

Upper limit with event counting
From PDG, in case of no background:
“It happens that the upper limit from [central Neyman interval] coincides numerically with the Bayesian upper limit for a Poisson parameter, using a uniform prior p.d.f. for ν.”

Poissonian background uncertainty
Assume we estimate the background from sidebands by applying scaling factors, so that the signal estimate is s = n − b, with b obtained from the scaled counts n_sb and n_cb
– sb = “side band”, cb = “corner band”
Computing the upper limit on s at a given C.L. can be as difficult as the expression given in K.K. Gan et al., 1998, which includes a physical integration bound.
[Figure: signal region with side bands (sb) and corner bands (cb)]

Zech's “frequentist” interpretation
Restrict the probability to the observed condition that the number of background events does not exceed the number of observed events:
“In an experiment where b background events are expected and n events are found, P(n_b; b) no longer corresponds to our improved knowledge of the background distributions. Since n_b can only take the numbers n_b ≤ n, it has to be renormalized to the new range of n_b:”
$P'(n_b; b) = \frac{P(n_b; b)}{\sum_{n_b'=0}^{n} P(n_b'; b)}$
This leads to a result identical to the Bayesian approach!
Zech's attempted frequentist derivation was criticized by Highland: it does not respect the coverage requirement.
It is often used in a “pragmatic” way, and was recommended for some time by the PDG.

Zech's derivation: references
Bayesian solution first proposed by O. Helene:
– O. Helene, Nucl. Instr. and Meth. A 212 (1983), p. 319, Upper limit of peak area (Bayesian)
Attempt to derive the same conclusion with a frequentist approach:
– G. Zech, Nucl. Instr. and Meth. A 277 (1989) 608–610, Upper limits in experiments with background or measurement errors
Frequentist validity criticized by Highland:
– V.L. Highland, Nucl. Instr. and Meth. A 398 (1997), Comment on “Upper limits in experiments with background or measurement errors” [Nucl. Instr. and Meth. A 277 (1989) 608–610]
Zech agrees that his derivation is not rigorously frequentist:
– G. Zech, Nucl. Instr. and Meth. A 398 (1997), Reply to ‘Comment on “Upper limits in experiments with background or measurement errors” [Nucl. Instr. and Meth. A 277 (1989) 608–610]’
Cousins' overview and summary of the controversy:
– Workshop on Confidence Limits, March 2000, Fermilab

Frequentist (classic) upper limits
Construction of the Neyman belt with discrete values: coverage can't be satisfied exactly, so one requires P(s ∈ [s1, s2]) ≥ 1 − α.
Build the belt for all s; it is fully asymmetric in this example (lower limit choice).
Determine the upper limit on s given the observed n_obs.
The limit s_up on s is such that:
$\sum_{n=0}^{n_{obs}} P(n; s_{up}) = \sum_{n=0}^{n_{obs}} \frac{e^{-s_{up}} s_{up}^{n}}{n!} = \alpha$
In case of n_obs = 0 the simple formula holds:
$s_{up} = -\ln\alpha$
[Figure: P(n; s) versus n for s = 4, b = 0; 1 − α = 90.84%, α = 9.16%]
Identical to the Bayesian limit for this simple case.
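The classical construction for the background-free case can be sketched numerically: find the s for which the probability of observing n_obs or fewer events equals α. The sketch below uses bisection; names and the search bound are illustrative assumptions. The s = 4 example from the slide corresponds to n_obs = 1.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** n / math.factorial(n) for n in range(n_obs + 1))

def classical_upper_limit(n_obs, cl=0.90, s_max=100.0):
    """Value of s for which P(n <= n_obs; s) equals 1 - cl, found by bisection."""
    alpha = 1.0 - cl
    lo, hi = 0.0, s_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(poisson_cdf(1, 4.0))          # about 0.0916, the alpha quoted for s = 4
    print(classical_upper_limit(0))     # about 2.30 at 90% C.L.
```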

Poissonian using Feldman-Cousins
b = 3, 90% C.L.
Frequentist ordering based on the likelihood ratio (see slides from part I)
The belt depends on b, of course
G. Feldman, R. Cousins, Phys. Rev. D 57 (1998) 3873

Limits using Feldman-Cousins
G. Feldman, R. Cousins, Phys. Rev. D 57 (1998) 3873
90% C.L.
Note that the curve for n = 0 decreases with b, while the result of the Bayesian calculation is constant at 2.3.
Frequentist intervals do not express P(μ|x)!
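The dependence of the n = 0 limit on b can be reproduced with a small grid-based sketch of the likelihood-ratio ordering. This is an illustrative implementation under simplifying assumptions (a fixed grid in s, a truncated range of n, no further adjustment of the raw intervals), so its output may differ somewhat from published tables; all names are hypothetical.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability, with the mean = 0 case handled explicitly."""
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))

def fc_acceptance(s, b, cl=0.90, n_max=60):
    """Set of n accepted for signal s, ranking n by the likelihood ratio."""
    ranked = []
    for n in range(n_max + 1):
        best_mean = max(0.0, n - b) + b      # best physically allowed mean (s >= 0)
        ratio = poisson_pmf(n, s + b) / poisson_pmf(n, best_mean)
        ranked.append((ratio, n))
    ranked.sort(reverse=True)
    accepted, total = set(), 0.0
    for _, n in ranked:
        accepted.add(n)
        total += poisson_pmf(n, s + b)
        if total >= cl:
            break
    return accepted

def fc_upper_limit(n_obs, b, cl=0.90, s_max=15.0, step=0.01):
    """Largest grid value of s whose acceptance set contains n_obs."""
    upper = 0.0
    for i in range(int(round(s_max / step)) + 1):
        s = i * step
        if n_obs in fc_acceptance(s, b, cl):
            upper = s
    return upper

if __name__ == "__main__":
    for b in (0.0, 1.0, 3.0):
        print(b, fc_upper_limit(0, b))
```

Scanning a grid rather than solving for the boundary keeps the sketch short and copes with the fact that the acceptance set changes discontinuously with s.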

A Close-up
C. Giunti, Phys. Rev. D 59 (1999)
Note the ‘ripple’ structure

Limits in case of no background
From PDG: “Unified” (i.e. Feldman-Cousins) limits for Poissonian counting in case of no background
Larger than Bayesian limits

From PDG Review…
“The intervals constructed according to the unified procedure for a Poisson variable n consisting of signal and background have the property that for n = 0 observed events, the upper limit decreases for increasing expected background. This is counter-intuitive, since it is known that if n = 0 for the experiment in question, then no background was observed, and therefore one may argue that the expected background should not be relevant. The extent to which one should regard this feature as a drawback is a subject of some controversy”

Problems of “classic” (frequentist) methods
The presence of background may introduce problems in interpreting the meaning of upper limits.
A statistical under-fluctuation of the background may lead to the exclusion of a signal of zero at 95% C.L.
– Unphysical estimated “negative” signal?
“tends to say more about the probability of observing a similar or stronger exclusion in future experiments with the same expected signal and background than about the non-existence of the signal itself” [*]
What we should conclude is just that there is not sufficient information to discriminate between the b and s+b hypotheses.
Adding channels that have low signal sensitivity may produce upper limits that are severely worse than those obtained without those channels.
[*] A. L. Read, Modified frequentist analysis of search results (the CLs method), 1st Workshop on Confidence Limits, CERN, 2000

CLs: Higgs search at LEP-II
Analysis channels separated by experiment (Aleph, Delphi, L3, Opal) and by decay mode.
Using the likelihood ratio discriminator:
$Q = \frac{L(s+b)}{L(b)}$
Confidence level estimator (different from Feldman-Cousins):
$CL_s = \frac{CL_{s+b}}{CL_b}$
– Gives over-coverage w.r.t. the classical limit (CL_s > CL_{s+b}: conservative)
– Similarities with the Bayesian C.L.
Identical to the Bayesian limit for Poissonian counting!
“approximation to the confidence in the signal hypothesis, one might have obtained if the experiment had been performed in the complete absence of background”
No problem when adding channels with low discrimination.
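For a single Poissonian counting channel the confidence levels reduce to cumulative Poisson probabilities, which makes the equivalence with the Bayesian flat-prior limit easy to check numerically. The sketch below assumes that simple counting case, taking CL_{s+b} = P(n ≤ n_obs; s+b) and CL_b = P(n ≤ n_obs; b); function names are illustrative.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** n / math.factorial(n) for n in range(n_obs + 1))

def cls(n_obs, s, b):
    """CLs = CL_{s+b} / CL_b for one counting channel."""
    cl_sb = poisson_cdf(n_obs, s + b)
    cl_b = poisson_cdf(n_obs, b)
    return cl_sb / cl_b

if __name__ == "__main__":
    # With zero observed events CLs = exp(-s) regardless of b,
    # so the 90% C.L. limit stays at about 2.30.
    print(cls(0, 2.302585, 3.0))
    print(cls(2, 3.0, 1.0), poisson_cdf(2, 4.0))
```

Because CL_b ≤ 1, CLs is never smaller than CL_{s+b}, which is the source of the over-coverage noted above.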

Observations on the CLs method
“A specific modification of a purely classical statistical analysis is used to avoid excluding or discovering signals which the search is in fact not sensitive to”
“The use of CLs is a conscious decision not to insist on the frequentist concept of full coverage (to guarantee that the confidence interval doesn’t include the true value of the parameter in a fixed fraction of experiments).”
“confidence intervals obtained in this manner do not have the same interpretation as traditional frequentist confidence intervals nor as Bayesian credible intervals”
A. L. Read, Modified frequentist analysis of search results (the CLs method), 1st Workshop on Confidence Limits, CERN, 2000

Profile Likelihood
Use a different test statistic:
$\lambda(\mu) = \frac{L(\mu, \hat{\hat{\theta}}(\mu))}{L(\hat{\mu}, \hat{\theta})}$
– Numerator: fix μ, fit θ
– Denominator: fit both μ and θ
Wilks' theorem ensures an asymptotic χ² distribution of −2 ln λ.
Popular technique in ATLAS.
CMS and LEP more frequently use:
$Q = \frac{L(\text{new signal})}{L(\text{SM only})}$
RooStats::ProfileLikelihoodCalculator
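To make the “fix μ, fit θ” step concrete, here is a minimal sketch for a toy model that is an assumption of this example, not taken from the slides: a signal-region count n_on ~ Poisson(sig + bkg) and a control-region count n_off ~ Poisson(tau · bkg), with the background bkg as the nuisance parameter. For this model the conditional fit of bkg has a closed form (root of a quadratic). The sketch assumes n_off > 0 and n_on > n_off / tau.

```python
import math

def log_likelihood(sig, bkg, n_on, n_off, tau):
    """log L (up to constants) for n_on ~ Pois(sig + bkg), n_off ~ Pois(tau * bkg)."""
    return (n_on * math.log(sig + bkg) - (sig + bkg)
            + n_off * math.log(tau * bkg) - tau * bkg)

def profiled_background(sig, n_on, n_off, tau):
    """Background that maximizes the likelihood for a fixed signal yield."""
    a = 1.0 + tau
    c = n_on + n_off - a * sig
    return (c + math.sqrt(c * c + 4.0 * a * n_off * sig)) / (2.0 * a)

def minus_two_log_lambda(sig, n_on, n_off, tau):
    """-2 ln lambda(sig): conditional fit over unconditional fit."""
    sig_hat = n_on - n_off / tau          # fit both signal and background
    bkg_hat = n_off / tau
    bkg_hat_hat = profiled_background(sig, n_on, n_off, tau)   # fix signal, fit background
    return 2.0 * (log_likelihood(sig_hat, bkg_hat, n_on, n_off, tau)
                  - log_likelihood(sig, bkg_hat_hat, n_on, n_off, tau))

if __name__ == "__main__":
    q0 = minus_two_log_lambda(0.0, 20, 10, 1.0)
    print("q0 =", q0, " approximate significance =", math.sqrt(q0))
```

Taking the square root of −2 ln λ(0) as a significance relies on the asymptotic χ² behaviour mentioned above, so it is only approximate at low counts.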

What if s is exactly zero
If a signal does not exist, s is exactly zero
– E.g.: searches for non-existing exotic processes
In that case, experiments will observe no signal events (within background fluctuation), but the finite size of the experimental sample will always determine an upper limit greater than zero (at, say, 95% C.L.)
– Feldman-Cousins gives a C.L. lower than $e^{-s}$ when zero events are seen
So, the number of experiments where s is greater than the upper limit is zero, not 5%!
This is one reason for the criticism of frequentist upper limits by Bayesians.

Concrete example I Higgs search at LEP I (L3)

Higgs search at LEP
Production via $e^+e^- \to HZ^* \to b\bar{b}\,\ell^+\ell^-$
Higgs candidate mass measured via the missing mass to the lepton pair
Most of the background rejected via kinematic cuts and isolation requirements for the lepton pair
Search mainly dominated by statistics
A few background events survived the selection (first observed in L3 at LEP-I)

First Higgs candidate (m_H ≈ 70 GeV)

Extended likelihood approach
Assume both signal and background are present, with different PDFs for the mass distribution: a Gaussian peak f_s(m) for signal, a flat f_b(m) for background (from Monte Carlo samples):
$L(s, b) = \frac{e^{-(s+b)}}{n!} \prod_{i=1}^{n} \left[ s\, f_s(m_i) + b\, f_b(m_i) \right]$
A Bayesian approach can be used to extract the upper limit (with flat prior):
$\frac{\int_0^{s_{up}} L(s)\,ds}{\int_0^{\infty} L(s)\,ds} = 1 - \alpha$

Application to Higgs search at L3
[Figure: upper limit in events compared with the “standard” limit; LEP-I data]

Comparison with frequentist C.L.
Toy MC can be generated for different signal and background scenarios.
The “classic” C.L. can be computed by counting the fraction of toy experiments above/below the Bayesian limit.
Always conservative!
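The toy Monte Carlo check can be sketched for the simple counting case (not the mass-dependent likelihood used in the L3 analysis). The sketch generates Poissonian toys for an assumed true signal and background, computes the flat-prior Bayesian limit for each, and counts how often the limit covers the true signal. The chosen yields, seed, and names are illustrative assumptions.

```python
import math
import random

def bayesian_upper_limit(n, b, cl=0.90, s_max=100.0):
    """Flat-prior Bayesian upper limit for n observed events and fixed background b."""
    alpha = 1.0 - cl
    den = sum(b ** m / math.factorial(m) for m in range(n + 1))
    lo, hi = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        num = sum((mid + b) ** m / math.factorial(m) for m in range(n + 1))
        if math.exp(-mid) * num / den > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_sample(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def coverage(s_true, b, cl=0.90, n_toys=5000, seed=1):
    """Fraction of toy experiments whose upper limit is at or above the true signal."""
    rng = random.Random(seed)
    limits = {}                      # cache: the limit depends only on n
    covered = 0
    for _ in range(n_toys):
        n = poisson_sample(rng, s_true + b)
        if n not in limits:
            limits[n] = bayesian_upper_limit(n, b, cl)
        if s_true <= limits[n]:
            covered += 1
    return covered / n_toys

if __name__ == "__main__":
    print(coverage(4.0, 2.0))        # expected to be at or above the nominal 0.90
```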

Concrete example II Combined Higgs search at LEP II

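Combined Higgs search at LEP-II
Extended likelihood definition (product over channels k, with candidates j in each channel):
$L(\eta) = \prod_k \frac{e^{-(\eta s_k + b_k)}}{n_k!} \prod_{j=1}^{n_k} \left[ \eta\, s_k S_k(x_{jk}) + b_k B_k(x_{jk}) \right]$
η = 0 for the b-only hypothesis, η = 1 for the s + b hypothesis
Likelihood ratio:
$Q = \frac{L(\eta = 1)}{L(\eta = 0)}$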

Higgs candidates mass

Mass scan plot
Green: 68%, Yellow: 95%

Mass scan by experiment

Mass scan by channel

Background hypothesis C.L.

Background C.L. by experiment

Signal hypothesis C.L.

The new limit from Tevatron (I)

The new limit from Tevatron (II)

The new limit from Tevatron (III)

“Look elsewhere” effect
When searching for a signal over a wide range of unknown parameters (e.g.: the Higgs mass), the chance that an over-fluctuation may occur at at least one point increases with the searched range
– Average fraction of false discoveries at a fixed point (False Discovery Rate): FDR = (1 − CL_b) / (1 − CL_{s+b})
It is not sufficient to test the significance at the most likely point!
– The significance would be overestimated
Magnitude of the effect: roughly proportional to the ratio of the search range over the resolution
– Better resolution = less chance to have more events compatible with the same mass value
The effect can be evaluated with brute-force toy Monte Carlo
– Run N experiments with background only, find the largest ‘local’ significance over the whole search range, and get its distribution to determine the ‘overall’ significance
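The brute-force procedure can be sketched under a strong simplifying assumption: the search range is modelled as a fixed number of independent mass points, each with a standard-normal local significance under background only. Real analyses have correlated points and a full likelihood scan, so this only illustrates the bookkeeping; names, seed, and toy counts are hypothetical.

```python
import math
import random

def local_p_value(z):
    """One-sided Gaussian tail probability for a local significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def global_p_value(z_obs, n_points, n_toys=20000, seed=7):
    """Fraction of background-only toys whose largest local significance reaches z_obs."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_toys):
        z_max = max(rng.gauss(0.0, 1.0) for _ in range(n_points))
        if z_max >= z_obs:
            exceed += 1
    return exceed / n_toys

if __name__ == "__main__":
    z_obs, n_points = 2.0, 20
    print("local p-value :", local_p_value(z_obs))
    print("global p-value:", global_p_value(z_obs, n_points))
    # For independent points the toys should approach 1 - (1 - p_local)^n_points
    print("analytic check:", 1.0 - (1.0 - local_p_value(z_obs)) ** n_points)
```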

Nuisance parameters
So-called “nuisance parameters” are unknown parameters that are not interesting for the measurement
– E.g.: detector resolution, uncertainty in backgrounds, other systematic errors, etc.
Two main possible approaches:
Add the nuisance parameters, together with the interesting unknown, to your likelihood model
– But the model becomes more complex!
– Easier to incorporate in a fit than in upper limits
“Integrate them away” (→ Bayesian)

Nuisance parameters in the Bayesian approach
No particular treatment: P(μ|x) is obtained as a marginal PDF, “integrating out” θ:
$P(\mu|x) = \int P(\mu, \theta|x)\,d\theta$

Cousins-Highland hybrid approach
No fully rigorous prescription exists for incorporating nuisance parameters within a frequentist approach.
Hybrid approach proposed by Cousins and Highland
– Integrate the posterior PDF over the nuisance parameters (Nucl.Instr.Meth.A , 1992)
– Some Bayesian approach in the integration…: “seems to be acceptable to many pragmatic frequentists” (G. Zech, Eur. Phys. J. C 4 (2002) 12)
Bayesian integration of the PDF, then the likelihood is used in a frequentist way
– Some numerical studies with toy Monte Carlo showed that the frequentist calculation gives very similar results in many cases
RooStats::HybridCalculator

Concrete example III Search for B → τν at BaBar

Upper limits to B → τν at BaBar
Reconstruct one B with complete hadronic decays
Look for a tau decay on the other side with missing energy (neutrinos)
– Five decay channels used: μνν, eνν, πν, ππ⁰ν, πππν
Likelihood function: product of Poissonian likelihoods for the five channels
Background is known with finite uncertainties from side bands, applying scaling factors (taken from MC)

Combined likelihood
Combine the five channels with the likelihood:
$L(s + b) = \prod_{i=1}^{5} \frac{e^{-(s_i + b_i)} (s_i + b_i)^{n_i}}{n_i!}$
Define the likelihood ratio estimator, as for the combined Higgs search:
$Q = \frac{L(s+b)}{L(b)}$
If Q shows a significant minimum, a non-null measurement of s can be determined.
More discriminating variables may be incorporated in the likelihood definition.

Upper limit evaluation
Use toy Monte Carlo to generate a large number of counting experiments.
Evaluate the C.L. for the signal hypothesis, defined as the ratio of the C.L. for the s+b and b hypotheses:
$CL_s = \frac{CL_{s+b}}{CL_b}$
Frequentist, so far!
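A toy Monte Carlo evaluation of this ratio for a product of Poissonian channels can be sketched as follows. The sketch assumes −2 ln Q = 2 Σ s_i − 2 Σ n_i ln(1 + s_i/b_i), which follows from the Poissonian likelihood ratio, and counts toys that are less signal-like than the observation. Channel yields, seed, and function names are illustrative assumptions, and every b_i must be positive.

```python
import math
import random

def minus_two_ln_q(counts, signals, backgrounds):
    """-2 ln Q for a product of Poissonian channels (lower values are more signal-like)."""
    return 2.0 * sum(signals) - 2.0 * sum(
        n * math.log(1.0 + s / b) for n, s, b in zip(counts, signals, backgrounds))

def poisson_sample(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def toy_cls(observed, signals, backgrounds, n_toys=20000, seed=3):
    """Return (CL_{s+b}, CL_b, CLs) estimated from toy experiments."""
    rng = random.Random(seed)
    q_obs = minus_two_ln_q(observed, signals, backgrounds)

    def fraction_less_signal_like(means):
        count = 0
        for _ in range(n_toys):
            toy = [poisson_sample(rng, m) for m in means]
            if minus_two_ln_q(toy, signals, backgrounds) >= q_obs - 1e-9:
                count += 1
        return count / n_toys

    cl_sb = fraction_less_signal_like([s + b for s, b in zip(signals, backgrounds)])
    cl_b = fraction_less_signal_like(backgrounds)
    return cl_sb, cl_b, cl_sb / cl_b

if __name__ == "__main__":
    # Five hypothetical channels with made-up yields
    print(toy_cls([1, 0, 2, 1, 3], [0.8, 0.5, 1.0, 0.6, 0.7], [1.0, 0.5, 2.0, 1.5, 3.0]))
```

For a single channel the result can be compared with the cumulative Poisson probabilities, which is a convenient sanity check before moving to several channels.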

Including (Gaussian) uncertainties
The nuisance parameters are the backgrounds b_i, known with some uncertainty from the side-band extrapolation.
Convolve the likelihood with a Gaussian PDF
– Note: b_i is the estimated background, not the “true” one!
… but the C.L. is evaluated anyway with a frequentist approach (toy Monte Carlo)!
Analytical integrability leads to a huge CPU saving! (L. Lista, Nucl. Instr. and Meth. A 517 (2004) 360–363)
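The effect of smearing the background can be illustrated numerically for a single counting channel. The sketch below is not the analytical method of the cited paper: it simply averages the Poisson cumulative probability over a Gaussian in b, truncated at b ≥ 0, using a midpoint rule, and then forms the CLs ratio. The integration range of ±5σ, the grid size, and all names are assumptions of this example.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** n / math.factorial(n) for n in range(n_obs + 1))

def smeared_cdf(n_obs, s, b0, sigma_b, n_grid=400):
    """Poisson CDF averaged over a Gaussian in b (mean b0, width sigma_b, b >= 0)."""
    lo = max(0.0, b0 - 5.0 * sigma_b)
    hi = b0 + 5.0 * sigma_b
    step = (hi - lo) / n_grid
    num = den = 0.0
    for i in range(n_grid):
        b = lo + (i + 0.5) * step
        weight = math.exp(-0.5 * ((b - b0) / sigma_b) ** 2)
        num += weight * poisson_cdf(n_obs, s + b)
        den += weight
    return num / den

def cls_with_uncertainty(n_obs, s, b0, sigma_b):
    """CLs built from the background-smeared probabilities."""
    return smeared_cdf(n_obs, s, b0, sigma_b) / smeared_cdf(n_obs, 0.0, b0, sigma_b)

def upper_limit(n_obs, b0, sigma_b, cl=0.90, s_max=50.0):
    """Signal yield at which the smeared CLs drops to 1 - cl (bisection)."""
    lo, hi = 0.0, s_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cls_with_uncertainty(n_obs, mid, b0, sigma_b) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for sigma_b in (0.01, 0.5, 1.0):
        print(sigma_b, upper_limit(2, 3.0, sigma_b))
```

With zero observed events the smearing cancels in the ratio, so the limit stays at −ln α whatever the background uncertainty.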

Analytical expression
Simplified analytic derivation of Q, expressed through polynomials p_n defined with a recursive relation.
… but in many cases it's hard to be so lucky!

Branching ratio: ∫L dt = 82 fb⁻¹
Low statistics scenario
[Figure: results without and with background uncertainty]
No evidence for a local minimum
RooStats::HypoTestInverter

Branching ratio: ∫L dt = 350 fb⁻¹
[Figure: results without and with background uncertainty]
Significance: 2.2σ (with uncertainties), 2.7σ (without uncertainties)
Expectations from MC under some assumptions. The analysis is still “blind” now!

In conclusion
Many recipes and approaches are available.
Bayesian and frequentist approaches lead to similar results in the easiest cases, but may diverge in frontier cases.
Be ready to master both approaches!
Bayesian and frequentist limits have very different meanings.
The unified Feldman-Cousins approach or the modified frequentist versions (CLs, profile likelihood) address in a unified way the upper limit and the central value + confidence interval cases
– The Bayesian approach is still needed for the “hybrid” treatment of nuisance parameters (Cousins-Highland)
If you want your paper to be approved soon:
– Be consistent with your assumptions
– Understand the meaning of what you are computing
– Try to adopt a popular and consolidated approach (and, better, software tools!), wherever possible
– Debate your preferred statistical technique in a statistics paper, not in a physics result publication!

References
Upper limits with event counting:
– PDG: Statistics review
– CERN Yellow Report, Workshop on Confidence Limits
Feldman-Cousins approach:
– G. Feldman, R. Cousins, Phys. Rev. D 57 (1998) 3873
Upper limits for Higgs search:
– The LEP Working Group for Higgs Boson Searches, Physics Letters B 565 (2003) 61–75
– A. L. Read, Modified frequentist analysis of search results (the CLs method), 1st Workshop on Confidence Limits, CERN, 2000
– V. Innocente, L. Lista, Nucl. Instr. and Meth. A 340 (1994), Evaluation of the upper limit to rare processes in the presence of background, and comparison between the Bayesian and classical approaches
B to tau nu at BaBar:
– BABAR Collaboration, Phys. Rev. Lett. 95 (2005) 041804, Search for the Rare Leptonic Decay B⁻ → τ⁻ ν
Nuisance parameters:
– R.D. Cousins, V.L. Highland, Nucl.Instr.Meth.A (1992)
– C. Blocker, CDF/MEMO/STATISTICS/PUBLIC/7539, 2006
Including uncertainties:
– O. Helene, Nucl. Instr. and Meth. A 212 (1983), p. 319, Upper limit of peak area (Bayesian)
– G. Zech, Nucl. Instr. and Meth. A 277 (1989) 608–610, Upper limits in experiments with background or measurement errors
– V.L. Highland, Nucl. Instr. and Meth. A 398 (1997), Comment on “Upper limits in experiments with background or measurement errors” [Nucl. Instr. and Meth. A 277 (1989) 608–610]
– G. Zech, Nucl. Instr. and Meth. A 398 (1997), Reply to ‘Comment on “Upper limits in experiments with background or measurement errors” [Nucl. Instr. and Meth. A 277 (1989) 608–610]’
– K.K. Gan, et al., Nucl. Instr. and Meth. A 412 (1998) 475, Incorporation of the statistical uncertainty in the background estimate into the upper limit on the signal
– G. Zech, Eur. Phys. J. C 4 (2002) 12
– L. Lista, Nucl. Instr. and Meth. A 517 (2004) 360–363, Including Gaussian uncertainty on the background estimate for upper limit calculations using Poissonian sampling
– C. Giunti, Phys. Rev. D 59 (1999), New ordering principle for the classical statistical analysis of Poisson processes with background
– G. Zech, Frequentist and Bayesian confidence intervals, EPJdirect C12, 1–81 (2002)