RooStatsCms: a tool for analyses modelling, combination and statistical studies D. Piparo, G. Schott, G. Quast Institut für Experimentelle Kernphysik Universität.

Slides:



Advertisements
Similar presentations
Investigation on Higgs physics Group Ye Li Graduate Student UW - Madison.
Advertisements

Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 1 Using the Profile Likelihood in Searches for New Physics arXiv:
Probability and Statistics Basic concepts II (from a physicist point of view) Benoit CLEMENT – Université J. Fourier / LPSC
27 th March CERN Higgs searches: CL s W. J. Murray RAL.
Recent Results on the Possibility of Observing a Standard Model Higgs Boson Decaying to WW (*) Majid Hashemi University of Antwerp, Belgium.
Some answers to the questions/comments at the Heavy/Light presentation of March 17, Outline : 1) Addition of the id quadrant cut (slides 2-3); 2)
1 6 th September 2007 C.P. Ward Sensitivity of ZZ→llνν to Anomalous Couplings Pat Ward University of Cambridge Neutral Triple Gauge Couplings Fit Procedure.
DPF Victor Pavlunin on behalf of the CLEO Collaboration DPF-2006 Results from four CLEO Y (5S) analyses:  Exclusive B s and B Reconstruction at.
G. Cowan RHUL Physics Comment on use of LR for limits page 1 Comment on definition of likelihood ratio for limits ATLAS Statistics Forum CERN, 2 September,
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan RHUL Physics Bayesian Higgs combination page 1 Bayesian Higgs combination using shapes ATLAS Statistics Meeting CERN, 19 December, 2007 Glen Cowan.
8. Hypotheses 8.2 The Likelihood ratio at work K. Desch – Statistical methods of data analysis SS10.
Statistical aspects of Higgs analyses W. Verkerke (NIKHEF)
Discovery Experience: CMS Giovanni Petrucciani (UCSD)
880.P20 Winter 2006 Richard Kass 1 Confidence Intervals and Upper Limits Confidence intervals (CI) are related to confidence limits (CL). To calculate.
1 Glen Cowan Statistics Forum News Glen Cowan Eilam Gross ATLAS Statistics Forum CERN, 3 December, 2008.
G. Cowan 2009 CERN Summer Student Lectures on Statistics1 Introduction to Statistics − Day 4 Lecture 1 Probability Random variables, probability densities,
Why do Wouter (and ATLAS) put asymmetric errors on data points ? What is involved in the CLs exclusion method and what do the colours/lines mean ? ATLAS.
Results of combination Higgs toy combination, within and across experiments, with RooStats Grégory Schott Institute for Experimental Nuclear Physics of.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #23.
1 HEP 2008, Olympia, Greece Ariadni Antonaki Dimitris Fassouliotis Christine Kourkoumelis Konstantinos Nikolopoulos University of Athens Studies for the.
1 Proposal to change limits for exclusion N.Andari, L. Fayard, M.Kado, F.Polci LAL- Orsay Y.Fang, Y.Pan, H.Wang, Sau Lan Wu University of Wisconsin-Madison.
Sensitivity Prospects for Light Charged Higgs at 7 TeV J.L. Lane, P.S. Miyagawa, U.K. Yang (Manchester) M. Klemetti, C.T. Potter (McGill) P. Mal (Arizona)
1 TTU report “Dijet Resonances” Kazim Gumus and Nural Akchurin Texas Tech University Selda Esen and Robert M. Harris Fermilab Jan. 26, 2006.
ROOT and statistics tutorial Exercise: Discover the Higgs, part 2 Attilio Andreazza Università di Milano and INFN Caterina Doglioni Université de Genève.
G. Cowan RHUL Physics Bayesian Higgs combination page 1 Bayesian Higgs combination based on event counts (follow-up from 11 May 07) ATLAS Statistics Forum.
Moriond QCD 2001Enrico Piotto EP division CERN SM Higgs Search at LEP in Channels other than 4 Jets Enrico Piotto EP Division, CERN SM Higgs Search at.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #25.
G. Cowan RHUL Physics page 1 Status of search procedures for ATLAS ATLAS-CMS Joint Statistics Meeting CERN, 15 October, 2009 Glen Cowan Physics Department.
G. Cowan, RHUL Physics Discussion on significance page 1 Discussion on significance ATLAS Statistics Forum CERN/Phone, 2 December, 2009 Glen Cowan Physics.
Measurements of Top Quark Properties at Run II of the Tevatron Erich W.Varnes University of Arizona for the CDF and DØ Collaborations International Workshop.
Experience from Searches at the Tevatron Harrison B. Prosper Florida State University 18 January, 2011 PHYSTAT 2011 CERN.
1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.
Easy Limit Statistics Andreas Hoecker CAT Physics, Mar 25, 2011.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #24.
1 Introduction to Statistics − Day 3 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.
Projected Exclusion Limits on the SM Higgs Boson Cross Section by Combining Higgs Channels at LHC Tommaso Dorigo (INFN and University of Padova) for the.
Study of pair-produced doubly charged Higgs bosons with a four muon final state at the CMS detector (CMS NOTE 2006/081, Authors : T.Rommerskirchen and.
G. Cowan RHUL Physics Analysis Strategies Workshop -- Statistics Forum Report page 1 Report from the Statistics Forum ATLAS Analysis Strategies Workshop.
G. Cowan, RHUL Physics Statistics for early physics page 1 Statistics jump-start for early physics ATLAS Statistics Forum EVO/Phone, 4 May, 2010 Glen Cowan.
G. Cowan RHUL Physics Status of Higgs combination page 1 Status of Higgs Combination ATLAS Higgs Meeting CERN/phone, 7 November, 2008 Glen Cowan, RHUL.
The Higgs boson Faster-than-light neutrino’s 09:00-12:3014:00-17:00 Aart Heijboer and Ivo van Vulpen Nikhef Topical lectures on Statistics (3 rd day)
Andrzej Bożek for Belle Coll. I NSTITUTE OF N UCLEAR P HYSICS, K RAKOW ICHEP Beijing 2004  3 and sin(2  1 +  3 ) at Belle  3 and sin(2  1 +  3 )
Search for the Standard Model Higgs in  and  lepton final states P. Grannis, ICHEP 2012 for the DØ Collaboration Tevatron, pp √s = 1.96 TeV -
1 Studies on Exclusion Limits N.Andari, L. Fayard, M.Kado, F.Polci LAL- Orsay LAL meeting 23/07/2009.
Stano Tokar, slide 1 Top into Dileptons Stano Tokar Comenius University, Bratislava With a kind permissison of the CDF top group Dec 2004 RTN Workshop.
Search for a Standard Model Higgs Boson in the Diphoton Final State at the CDF Detector Karen Bland [ ] Department of Physics,
Shawn Hayes Cohorts: Tianqi Tang, Molly Herman Higgs Analysis for CLIC at 380 GeV simulations.
Search for Invisible Higgs Decays at the ILC Ayumi Yamamoto, Akimasa Ishikawa, Hitoshi Yamamoto (Tohoku University) Keisuke Fujii (KEK)
G. Cowan RHUL Physics Statistical Issues for Higgs Search page 1 Statistical Issues for Higgs Search ATLAS Statistics Forum CERN, 16 April, 2007 Glen Cowan.
S. Ferrag, G. Steele University of Glasgow. RooStats and MClimit comparison Exercise to use RooStats by an MClimit-formatted person: – Use two programs.
Search for Standard Model Higgs in ZH  l + l  bb channel at DØ Shaohua Fu Fermilab For the DØ Collaboration DPF 2006, Oct. 29 – Nov. 3 Honolulu, Hawaii.
Investigation on CDF Top Physics Group Ye Li Graduate Student UW - Madison.
Status of the Higgs to tau tau analysis Carlos Solans Cristobal Cuenca.
Barbara Storaci, Nicola Serra, Niels Tuning Bs →μ+μ- Bfys-meeting, 15 th May
Discussion on significance
Bounds on light higgs in future electron positron colliders
Status of the Higgs to tau tau
Statistical Significance & Its Systematic Uncertainties
Statistical Methods used for Higgs Boson Searches
CMS RooStats Higgs Combination Package
Grégory Schott Institute for Experimental Nuclear Physics
Search for WHlnbb at the Tevatron DPF 2009
Overview of CMS H→ analysis and the IHEP/IPNL contributions
° status report analysis details: overview; “where we are”; plans: before finalizing result.. I.Larin 02/13/2009.
B  at B-factories Guglielmo De Nardo Universita’ and INFN Napoli
Application of CLs Method to ATLAS Higgs Searches
Top mass measurements at the Tevatron and the standard model fits
Why do Wouter (and ATLAS) put asymmetric errors on data points ?
° status report analysis details: overview; “where we are”; plans: before finalizing result.. I.Larin 02/13/2009.
Presentation transcript:

RooStatsCms: a tool for analyses modelling, combination and statistical studies D. Piparo, G. Schott, G. Quast Institut für Experimentelle Kernphysik Universität Karlsruhe

D. Piparo – IPRD082 Outline The need for a tool for statistical methods and channels combination A possible solution: RooStatsCms Benchmark analysis: H →  (VBF) The “modified frequentist” method –Significance –SM cross-section exclusion The “profile likelihood” method –Upper limits

D. Piparo – IPRD083 The need for a tool Reliable implementation of multiple statistical methods Combine analyses: –Information lies more at the analysis level than at the result level –Consistent treatment of constraints and their correlations: no double counting –Stronger limits on quantities like Higgs production cross section, mass... Do not replace existing analyses but complement their results Easy user interface Satisfactory documentation Crucial especially in the early phases of the data taking

D. Piparo – IPRD084 RooStatsCms A possible solution: RooStatsCms (RSC). Based on RooFit:RooFit –Originally developed in BaBar, used in many experiments/collaborations –Part of standard ROOT distribution RSC runs on a laptop. Three parts: –Modelling and combination –Statistical methods (based on likelihood ratios) –Advanced graphic routines Doxy documentation of every class, method and member. It comes with CINT dictionaries (macros, interactive root). Available to CMS at: www-ekp.physik.uni-karlsruhe.de/~RooStatsCms. www-ekp.physik.uni-karlsruhe.de/~RooStatsCms –Visit tinyurl.com/rscpasswd for username and passwordtinyurl.com/rscpasswd –More material in the CMS Wiki –Statistical methods and graphic routines public: www-ekp.physik.uni-karlsruhe.de/~RooStatsKarlsruhe www-ekp.physik.uni-karlsruhe.de/~RooStatsKarlsruhe –RooStatsKarlsruhe: part of the negociations towards a common tool with Atlas RSC: in “production phase” –Workshop at CERN in June –Approved results

D. Piparo – IPRD085 RSC – Modelling 1/2 Build a complete combined analysis model from ASCII datacards (“config files”) –Background and signal components of each analysis –Shapes from parametrisation or histos –Constraints and their correlations –Basic syntax: include, if... –Two lines of C++ to produce the RooFit Pdf Datacard advantages: –Automatic bookkeeping of what is done –Factorise model from C++ code –Easy to share RscCombinedModel mymodel ("hzz4l"); RooAbsPdf* sb_pdf=mymodel.getPdf(); C++ ASCII Card 2 analyses

D. Piparo – IPRD086 RSC – Modelling 2/2 Yields can be expressed as products of different terms: – Branching Ratios – Efficiencies – Cross section – Luminosity – σ H / σ SM Each term: systematics can be included The same applies also to shape parameters Relate terms from one analysis to the other with correlations Yield = BR · ε · σ Prod · Lumi · σ H / σ SM

D. Piparo – IPRD087 Combination example: PTDR 30 fb -1 H→  H→ 4l Reproduced analysis of PTDR: H→ZZ→4l and H→  –(bkgs H→ZZ 100% correlated) Added combination of H→ZZ→4l and H→  –counting and non counting experiment: symmetrical treatment Significance estimator: sqrt(2lnQ) Variable Q=L s+b /L b with L s+b,L b likelihoods in the sig+bkg and bkg only hypotheses

D. Piparo – IPRD088 RSC – Statistical Methods 1/2 Perform a statistical analysis of your result RSC statistical methods: based on likelihood ratios Two statistical methods well tested: –The -2lnQ distributions for hypothesis separation –The Profile Likelihood method Sometimes analysis time-consuming (lots of toy-MC experiments): –“Batch friendly”: sum up your results Easy to get out of results plots in a presentation-ready form

D. Piparo – IPRD089 RSC – Statistical Methods 2/2 Statistical Methods – Mother: StatisticalMethod LimitCalculatorPLScanFCCalculator LimitResultsPLScanResultsFCResults Statistical Methods Results – Mother: StatisticalResult LimitPlot PLScanPlot (add also FC curves) Statistical Plot – Mother: StatisticalPlot Constraint.cc ConstrBlock2.cc ConstrBlock3.cc ConstrBlockArray.cc Constraints Mother: NLLPenalty.cc LEPBandPlot ExclusionBandPlot + Organisation of the classes:

D. Piparo – IPRD0810 RSC – Treatment of systematics Marginalisation MC phase-space integration Lots of toy experiments Profiling Penalty term in the likelihood (logL T = logL base + logL Penalty ) e.g. 1. One uncorrelated Gaussian constraint logL P ~ 0.5·(m-m 0 ) 2 / σ m 2 2. Correlated Gaussian constraints logL P ~ 0.5·(m-m 0 ) T · V -1 · (m-m 0 ), V is correlation matrix No toys: go for a few fits High statistics/ Gaussian case: two methods converge

11 Benchmark analysis: VBF H →  Used as benchmark for the tool Results approved by the CMS collaboration Vector boson fusion Integrated lumi: 1 fb -1 Small signal on a significant background No discovery expected with this lumi Four mass hypotheses: –115,125,135,145 GeV MassN Sig (12% sys) N Bkg (30% sys) D. Piparo – IPRD08

12 Separation of s+b and b only CL sb 1-CL b Idea: separation of hypotheses using the likelihoods ratio, Q, assuming signal+background (“s+b”) and the background-only “b” hypotheses, as test statistic Consider “P-values” (also called 1-CL S+B, 1-CL B ) of -2lnQ distributions obtained from s+b and b samples Treatment of systematics: For every toy MC experiment, before the generation of the toy dataset, parameters affected by systematics are properly fluctuated. Distributions built with toy MC experiments D. Piparo – IPRD08

13 CL B : background CL, measure of the compatibility of the experiment with the B- only hypothesis 1 – CL B : probability for a B-only experiment to give a more S+B-like likelihood ratio than the observed one Correspondence between 1 – CL B and the resulting significance (Gaussian approximation): - # of standard deviations of an (assumed) Gaussian distribution of the background. - Take CL B assuming the expected s+b yield (i.e. median -2lnQ for s+b distribution) CL S+B : measure of the compatibility of the experiment with the S+B hypothesis if CL is small ( < 5% ) the S+B hypothesis can be excluded at more than 95% CL but it does not mean that the signal hypothesis is excluded at that level Modified frequentist approach: take CL S the signal significance, to be: CL S ≡ CL S+B / CL B (heavily used by LEP, HERA and TEVATRON experiments) Modified frequentist approach – Significance D. Piparo – IPRD08

14 VBF H→  : Significance Significance calculated for the H→  analysis using 1-CL B In this case significance does not tell us much. The question becomes: “Which production cross section can I exclude with the data I have?” D. Piparo – IPRD08

15 Modified Frequentist method – Exclusion Assume to observe the expected background (i.e. median of the background distribution) and no signal Amplify the SM production cross section by a factor necessary to obtain CL s =0.05 → “95% exclusion” Bands: Assume to observe N b + n · sqrt (N b ), where n=2,1,-1,-2 for the -2,-1,1,2 sigma band border respectively Systematics taken into account in distributions of -2lnQ (marginalisation) Obtained with real data Less exclusion power than expected: “bad luck” More exclusion power than expected: “good luck” ~ 80 h on one CPU D. Piparo – IPRD08

16 The “profile likelihood” method Likelihood scanned w.r.t. a variable (e.g. signal yield) At each point, partial likelihood maximized w.r.t. nuisance parameters Intersection with horizontal lines gives upper limits / two sided intervals Systematics taken into account with penalty terms in the Likelihoods (profiling) Likelihood scan Interpolated scan minimum Horizontal cuts See PLCalcuator, PLResults, PLPlot documentation D. Piparo – IPRD08

17 Limits and coverage Again VBF H→  as benchmark (no systematics here) With profile likelihood the 95% CL UL is events = 6.7 SM cross section –to compare to ~5.5 with CL s Coverage: frequence in which in of toy experiments the “real” value is included in the confidence interval Coverage tested with several MC toys experiments: –For low signal yields, the profile likelihood method largely over-covers –The method works well for large signal (and luminosity) Plot of coverage VS Ns Plot of upper limits at 95% CL (Δ logL = 1.36) Median Signal Yield D. Piparo – IPRD08

18 Conclusions RooStatsCms - tool for statistical studies and analyses combination in the CMS collaboration Implemented and tested existing and widely accepted statistical methods: in 'production phase' Study of VBF H→  carried out: –SM production cross section exclusion power –PL likelihood upper limits The tool has been 'adopted' by the Higgs WG; it will be used for the Higgs results. Extensive X-checks done or planned It became solid tool –Example macros, documentation, CMS workshop, … Integration in ROOT being discussed Working on documenting the tool and used methods in a support document Future Plans Continue crosschecking with independent tools Add other statistical methods (Working on a full frequentist approach) Improve MC integration technique and numerical procedures –Such as approach based on Markov chains Monte Carlo D. Piparo – IPRD08