Using Profile Likelihood ratios at the LHC Clement Helsens, CERN-PH Top LHC-France, CC IN2P3-Lyon, 22 March 2013.


Outline
- Introduction
- Reminder of statistics
- Hypothesis testing
- Profile likelihood ratio
- Some examples to help build an analysis: from real analyses, from toy MC
- Conclusion

Introduction: Disclaimers
- This talk is not a lecture in statistics! I will not encourage you to use any particular tool or method, and I will only talk about (hybrid) frequentist methods, not about Bayesian marginalization.
- This talk should be seen as a methodology to follow when one wants to use profiling in an analysis. For the examples I will only talk about searches (the LHC is a discovery machine).
- I will try to give tips for performing an analysis using profiling rather than reviewing analyses that use it. This might help you obtain better results.

Hypothesis Testing 1/5
Deciding between two hypotheses:
- Null hypothesis H0 (background only, processes already known)
- Test hypothesis H1 (background + alternative model)
Why can't we just decide by testing the H0 hypothesis alone? Why do we need an alternative hypothesis?
- Data points are randomly distributed: if a discrepancy between the data and the H0 hypothesis is observed, we would be obliged to call it a random fluctuation.
- H0 might look globally right but have slightly wrong predictions: if we look at enough different distributions, we will find some that are mis-modeled. Having a second hypothesis provides guidance on where to look.
- Duhem–Quine thesis: it is impossible to test a scientific hypothesis in isolation, because an empirical test of the hypothesis requires one or more background assumptions (auxiliary hypotheses).

Hypothesis Testing 2/5
Is square A darker than square B? (There is only one correct answer.)

Hypothesis Testing 3/5
Is square A darker than square B? (There is only one correct answer.)

Hypothesis Testing 4/5
Since the perception of the human visual system is affected by context, square A appears to be darker than square B, but they are exactly the same shade of gray.

Hypothesis Testing 5/5
So proving one hypothesis wrong does not mean the proposed alternative must be right. For example, in a search for highly energetic processes (like heavy quarks), one uses inclusive distributions like HT (the scalar sum Σ pT). If discrepancies are observed in the tails of HT, does this necessarily mean we have new physics?

Frequentist Hypothesis Testing 1/2
1) Construct a quantity that ranks outcomes as being more signal-like or more background-like, called a test statistic. Example: search for a new particle by counting events passing selection cuts; expect B events under H0 and S+B events under H1; the number of observed events n_obs is a good test statistic.
2) Build a prediction of the test statistic separately assuming H0 is true and assuming H1 is true.
3) Run the experiment and get n_obs (in our case, run the LHC + ATLAS/CMS).
4) Compute the p-value.

Frequentist Hypothesis Testing 2/2
One could ask: what is the chance of getting n == n_obs? (The chance of getting exactly 1000 events when 1000 are predicted is small.) Instead, for a Poisson-distributed count, compute the p-value: the probability of an outcome at least as extreme as the one observed. If p < p_thr, we can make a statement; we commonly use p_thr = 0.05 and say we can exclude the hypothesis under test at the 95% C.L. (confidence level). Note: a p-value is not the probability that H0 is true.
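The counting-experiment p-value above can be sketched with SciPy; the numbers (b = 3.0 expected background, n_obs = 8 observed) are hypothetical illustrations, not taken from any analysis:

```python
from scipy.stats import poisson

def pvalue_excess(n_obs, b):
    """p-value: probability of observing n >= n_obs events when b are expected (H0)."""
    return poisson.sf(n_obs - 1, b)  # sf(k, mu) = P(n > k), so this is P(n >= n_obs)

# Hypothetical numbers: expect b = 3.0 background events, observe 8.
p = pvalue_excess(8, 3.0)
print(p)  # ~0.012: below 0.05, but far from a discovery-level threshold
```

Note that the p-value is one-sided here: only upward fluctuations count as "more signal-like".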

Log Likelihood Ratio
What should be done if we do not want a simple counting experiment? Neyman–Pearson lemma (1933): the likelihood ratio is the "uniformly most powerful" test statistic. It acts like a difference of χ² in the Gaussian limit. Used at the Tevatron (mclimit, collie); needs pseudo-data.
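For a single-bin counting experiment the Neyman–Pearson ratio Q = L(s+b) / L(b) has a simple closed form, since the factorials of the two Poisson terms cancel. A minimal sketch (the yields are hypothetical):

```python
import numpy as np

def minus_two_ln_q(n, s, b):
    """-2 ln Q for a single-bin counting experiment, where
    Q = Poisson(n; s+b) / Poisson(n; b); the n! factors cancel in the ratio."""
    return 2.0 * (s - n * np.log1p(s / b))

# Hypothetical yields: s = 5 signal, b = 10 background.
print(minus_two_ln_q(15, 5.0, 10.0))  # n well above b: negative (signal-like)
print(minus_two_ln_q(8, 5.0, 10.0))   # n near b: positive (background-like)
```

Because -2lnQ is monotonically decreasing in n, low values of -2lnQ are signal-like, matching the LEP-style plots on the next slide.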

P-values and -2lnQ (from LEP)
- p-value for testing H0: 1-CL_b = P(-2lnQ ≤ -2lnQ_obs | H0); used for discovery.
- p-value for testing H1: CL_sb = P(-2lnQ ≥ -2lnQ_obs | H1); used for exclusion.
- For exclusion, use instead CL_s = CL_sb / CL_b, with CL_b = P(-2lnQ ≥ -2lnQ_obs | H0); better for small numbers of expected events. If CL_s ≤ 0.05 → 95% C.L. exclusion. CL_s does not exclude where there is no sensitivity.
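CL_sb, CL_b and CL_s can be estimated by throwing toy pseudo-experiments under each hypothesis and comparing -2lnQ with its observed value. A sketch for a single-bin counting experiment (all yields hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

def cls_from_toys(n_obs, s, b, n_toys=200_000):
    """CLs = CLsb / CLb for a counting experiment, using
    -2lnQ = 2*(s - n*ln(1 + s/b)); lower -2lnQ (larger n) is more signal-like."""
    q = lambda n: 2.0 * (s - n * np.log1p(s / b))
    q_obs = q(n_obs)
    q_b = q(rng.poisson(b, n_toys))        # toys under H0
    q_sb = q(rng.poisson(s + b, n_toys))   # toys under H1
    cl_sb = np.mean(q_sb >= q_obs)  # P(-2lnQ >= -2lnQ_obs | H1)
    cl_b = np.mean(q_b >= q_obs)    # P(-2lnQ >= -2lnQ_obs | H0)
    return cl_sb / cl_b

# Hypothetical: b = 3 expected, s = 6 predicted, observe exactly b events.
print(cls_from_toys(3, 6.0, 3.0))  # ~0.03: below 0.05, excluded at 95% C.L.
```

Dividing by CL_b is what protects against excluding where there is no sensitivity: when the observed data are background-like, CL_b is sizable and inflates CL_s.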

Sensitivity
- H0 and H1 not separated at all → large CL_sb → no sensitivity: not able to exclude H1.
- H0 and H1 well separated → very small CL_sb → very sensitive: if there is no signal, able to exclude H1.
You may want to reconsider the modeling if -2lnQ_obs > 10 or < -15.

Incorporating systematics
Our Monte-Carlo model can never be perfect, nor can our theoretical predictions; this is what systematic uncertainties are for. We parameterize our ignorance of the model predictions with nuisance parameters (systematics are usually called nuisance parameters). What we usually do (in hybrid/frequentist methods):
- Define each nuisance parameter from two variations, typically the ±1σ ones, and allow it to vary in a range.
- Assume a probability density for the nuisance parameters: Gaussian (most used), but it could also be log-normal or unconstrained.
- Assume an interpolation method: linear (MINUIT can run into trouble at 0) or parabolic.
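The vertical interpolation between the ±1σ variations might look like this piecewise-linear sketch (the yields are hypothetical; real tools such as RooStats offer several interpolation schemes, including the parabolic one mentioned above):

```python
import numpy as np

def interpolate_syst(nominal, up, down, theta):
    """Piecewise-linear vertical interpolation of a +/-1 sigma systematic.
    theta is the nuisance parameter in units of sigma (0 = nominal)."""
    nominal, up, down = map(np.asarray, (nominal, up, down))
    return np.where(theta >= 0,
                    nominal + theta * (up - nominal),
                    nominal - theta * (down - nominal))

# Hypothetical one-bin yields: nominal 100, +1 sigma -> 120, -1 sigma -> 90.
print(interpolate_syst(100.0, 120.0, 90.0, 0.5))   # halfway to the up variation
print(interpolate_syst(100.0, 120.0, 90.0, -0.5))  # halfway to the down variation
```

The kink at theta = 0 is exactly the feature that can make MINUIT struggle with the linear scheme, since the derivative is discontinuous there.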

Fitting/Profiling
Fitting == profiling nuisance parameters. Fitting or profiling nuisance parameters should/could be seen as an optimization step; one usually uses MINUIT for the fit. A nuisance parameter could be, for example, the b-tagging efficiency. Imagine the performance group is not able to measure the b-tagging efficiency very accurately: large values of the b-tagging systematic will be observed, and it could even be the dominant one. What if we see that data/MC agree very well in control regions? Shall we estimate the sensitivity without profiling? It might be better to use the information in the data!

Deeper in the Log Likelihood Ratio
Models with large uncertainties will be hard to exclude: either many different nuisance parameters, or one parameter that has a big impact. The likelihood is maximized twice, once assuming H1 and once assuming H0; the resulting maxima are functions of the fitted nuisance parameters.
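A minimal sketch of profiling for a single-bin counting experiment with one Gaussian-constrained background nuisance parameter, using SciPy as a stand-in for MINUIT (all numbers are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm, poisson

def profiled_nll(n_obs, s, b0, sigma_b=0.5):
    """Minimize the negative log likelihood over one nuisance parameter theta
    that scales the background, b = b0 * (1 + sigma_b * theta), with a unit
    Gaussian constraint on theta. Returns (min NLL, fitted theta)."""
    def nll(theta):
        b = b0 * (1.0 + sigma_b * theta)
        if b <= 0:
            return np.inf
        return -poisson.logpmf(n_obs, s + b) - norm.logpdf(theta)
    res = minimize_scalar(nll, bounds=(-1.9, 5.0), method="bounded")
    return res.fun, res.x

# Hypothetical: 50% background uncertainty, 15 observed events.
nll_h0, theta_h0 = profiled_nll(n_obs=15, s=0.0, b0=10.0)  # H0: theta pulled up
nll_h1, theta_h1 = profiled_nll(n_obs=15, s=5.0, b0=10.0)  # H1: theta near zero
llr = 2.0 * (nll_h0 - nll_h1)  # profiled log likelihood ratio
print(theta_h0, theta_h1, llr)
```

Under H0 the fit absorbs part of the excess by pulling the background up, at the price of the Gaussian constraint penalty; this is exactly how profiling trades systematic freedom against exclusion power.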

What is done in practice
Fit twice on data: once assuming H0, once assuming H1; two sets of fitted parameters are extracted. When running toy MC, one should do the same, assuming H0 and assuming H1. So at the end of the day, four fits are needed to obtain the expected values used to compute the confidence level.

Building an analysis using profiling
If you are running a cut-and-count analysis, you cannot use profiling of nuisance parameters: all the systematics have the same impact for all the samples (all normalization, no shape). If you are using a shape analysis that is tight enough, there is maybe also no need for profiling. But if you have sidebands (enough bins or channels to constrain the nuisance parameters), you might want to consider using it. A number of things need to be checked (not a complete list!):
- whether the fitted nuisance parameters are constrained in data;
- the pull distributions: (fitted − injected)/(fitted error);
- the fitted error.
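The pull check can be illustrated with a toy: generate the true ("injected") nuisance value, smear it with a stand-in fit resolution, and verify that the pulls have mean ~0 and width ~1 (the 0.8σ resolution is a hypothetical choice, and the analytic smearing is a stand-in for a real MINUIT fit):

```python
import numpy as np

rng = np.random.default_rng(42)

def pulls(n_toys=5000):
    """Toy check of a nuisance-parameter pull: (fitted - injected) / fitted error."""
    injected = rng.normal(0.0, 1.0, n_toys)           # true theta per toy
    fitted = injected + rng.normal(0.0, 0.8, n_toys)  # fit with 0.8 sigma resolution
    fitted_err = 0.8                                  # error reported by the fit
    return (fitted - injected) / fitted_err

p = pulls()
print(p.mean(), p.std())  # near 0 and 1 if the fit is unbiased with correct errors
```

A pull width well above 1 signals under-estimated fit errors (or, as in the shape-systematic example later, one parameter absorbing the others); a width well below 1 signals an over-estimated prior constraint.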

Fitting or not fitting?
See Favara and Pieri, hep-ex/. Some channels, or bins within channels, might be better off being neglected when estimating the sensitivity, in order to gain discrimination power: if the systematic uncertainty on the background B exceeds the expected signal S, they reduce the sensitivity. Fitting the backgrounds helps to constrain them: sidebands with little signal provide useful information, but they need to be fitted.

Toy MC example: Binning
All cases: 500 GeV t', 100% mixing to Wb; only ttbar is considered as background. A systematic is added (normalization only): 50% in total for the background (the same in all bins). The comparison is made for:
- statistical-only nuisance parameters;
- statistical + systematics, no profiling;
- statistical + systematics, with profiling.

Toy MC example: Case 1
Nominal distributions for background and signal.
CLs(STAT only) = 1.5e-5; CLs(STAT+SYST) = 2.9e-5; CLs(STAT+SYST PROF) = 2.2e-5

Toy MC example: Case 2
Set the first bin to: signal 0, background 100 (S/B = 0).
CLs(STAT only) = 1.5e-5; CLs(STAT+SYST) = 2.8e-5; CLs(STAT+SYST PROF) = 1.4e-5

Toy MC example: Case 3
Set the first bin to: signal 10, background 100 (S/B = 0.1).
CLs(STAT only) = 1.2e-5; CLs(STAT+SYST) = 2.0e-4; CLs(STAT+SYST PROF) = 1.7e-5

Toy MC example: Summary

Case   S, B (first bin)   S/B    CLs (STAT)   CLs (STAT+SYST)   CLs (STAT+SYST Prof)
1      0.25, 0.35         0.71   1.5e-5       2.9e-5            2.2e-5
2      0, 100             0      1.5e-5       2.8e-5            1.4e-5
3      10, 100            0.1    1.2e-5       2.0e-4            1.7e-5
4      1, 100             0.01   1.5e-5       3.2e-5            1.6e-5
5      100, 100           1      8.0e-6       3.2e-5            1.7e-5
6      1, 1               1      1.3e-5       2.4e-5            2.3e-5

If not fitting, bins with large B and medium S degrade the sensitivity by a lot! Fitting helps to recover the sensitivity.

Toy MC example: Profiling
In the next slides I will take another toy-MC example:
- Signal: Gaussian
- BG1: linearly falling background
- BG2: flat background
Data are fluctuations around the expected Monte-Carlo predictions. Systematics:
- Normalization only: luminosity ±5% for all the samples; BG1 ±20%; BG2 ±20%
- One shape systematic affecting BG1 and BG2

Optimize the binning 1/4
Two competing effects:
1) Splitting events into classes with very different S/B improves the sensitivity of a search or a measurement; adding events in categories with low S/B to events in categories with higher S/B dilutes information and reduces sensitivity → pushes towards more bins.
2) Insufficient Monte Carlo can cause some bins to be empty, or nearly so, and reliable predictions of signal and background are needed in each bin → pushes towards fewer bins.

Optimize the binning 2/4
It doesn't matter that there are bins with zero data events: most of the time a search analysis is built blinded, so you do not know a priori whether all your bins will be populated with data events, and there is always a Poisson probability for observing zero events. The problem is a wrong prediction: a zero background expectation with a nonzero signal expectation is a "discovery"! Never have bins with an empty background prediction. Pay attention to the Monte-Carlo error: keep in mind that the statistical error in each bin is an uncorrelated nuisance parameter. Do not hesitate to merge bins in order to reduce the statistical error in each bin below a certain threshold, for example ΔB/B < 10%.
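A possible bin-merging pass implementing the ΔB/B < 10% rule: a greedy left-to-right sketch with hypothetical yields (real analyses may merge differently, e.g. starting from the falling tail):

```python
import numpy as np

def merge_bins(counts, errors, max_rel_err=0.10):
    """Merge adjacent bins from the left until each merged bin satisfies
    err/count < max_rel_err. MC stat errors are added in quadrature.
    Returns (merged counts, merged errors)."""
    merged, merged_err2 = [], []
    c_acc, e2_acc = 0.0, 0.0
    for c, e in zip(counts, errors):
        c_acc += c
        e2_acc += e * e
        if c_acc > 0 and np.sqrt(e2_acc) / c_acc < max_rel_err:
            merged.append(c_acc)
            merged_err2.append(e2_acc)
            c_acc, e2_acc = 0.0, 0.0
    if c_acc > 0:  # fold any leftover tail into the last accepted bin
        if merged:
            merged[-1] += c_acc
            merged_err2[-1] += e2_acc
        else:
            merged.append(c_acc)
            merged_err2.append(e2_acc)
    return np.array(merged), np.sqrt(merged_err2)

# Hypothetical falling background with large relative errors in the tail.
counts = np.array([400.0, 200.0, 50.0, 20.0, 5.0, 2.0])
errors = np.sqrt(counts)
c, e = merge_bins(counts, errors)
print(c, e / c)  # every merged bin now has a relative error below 10%
```

The total yield is preserved; only the granularity changes, which is the point of the exercise.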

Optimize the binning 3/4
Binning (1) is obviously too fine. Binning (2) seems more or less okay. Binning (3) is obviously too coarse → reduced sensitivity.

Optimize the binning 4/4
Binning (1) has ΔB/B always > 10%. Binning (2) has ΔB/B always < 10%. Binning (3) has a very small ΔB/B but only 2 bins! Take binning (2) in the following (one could even have considered a non-uniform binning).

Pre-fit plot
Very large systematics at low values; (pseudo-)data compatible with the MC predictions.

Shape systematic
A real shape systematic; asymmetric.

Context of the study
I will consider 3 cases in the following: no fitting; fitting the shape systematic only; fitting all the systematics.

No fitting
Expected CLs → not able to exclude.

Fitting the shape systematic 1/2
Expected CLs → still not able to exclude, but a much better result: the uncertainty is reduced.
Post-fit considering H0: shape ±0.252σ. Post-fit considering H1: shape ±0.256σ.

Fitting the shape systematic 2/2
We have a constraint here: H0 shape ±0.252σ. The pulls are wide, meaning that the shape systematic is also absorbing the other systematics. (Plots: pull, injected/fitted, fitted error.)

Fitting all systematics 1/5
Expected CLs → still not able to exclude, but better results: the uncertainty is reduced.
Post-fit considering H0: BG1_XS ±0.81σ; BG2_XS ±0.81σ; shape ±0.38σ; luminosity ±0.98σ.
Post-fit considering H1: BG1_XS ±0.94σ; BG2_XS ±0.82σ; shape ±0.39σ; luminosity ±0.97σ.

Fitting all systematics, BG1_XS 2/5
No constraining power: H0 BG1_XS ±0.81σ. Pulls, errors and fitted values look good. (Plots: pull, injected/fitted, fitted error.)

Fitting all systematics, BG2_XS 3/5
No constraining power: H0 BG2_XS ±0.81σ. Pulls, errors and fitted values look good. (Plots: pull, injected/fitted, fitted error.)

Fitting all systematics, Luminosity 4/5
No constraining power: H0 luminosity ±0.98σ. Pulls, errors and fitted values look good. (Plots: pull, injected/fitted, fitted error.)

Fitting all systematics, Shape 5/5
There is constraining power here: H0 shape ±0.38σ. Pulls, errors and fitted values look good. The shape systematic is obviously too large! Maybe it compares two models in a region of phase space where one of them is obviously wrong… (Plots: pull, fitted error.)

Constraining the nuisance parameters
One can argue (during internal review, for example) that fitting nuisance parameters in data is similar to a measurement. So if, for example, one fits the b-tagging efficiency in data to be (in units of σ) 0.5 ± 0.2σ, does this mean we can derive a measurement of the b-tagging efficiency with 0.2σ precision? Or maybe, as in the toy Monte Carlo, the error is over-estimated, and in your signal region (which in most cases does not contain signal) you observe that your data/MC comparisons are within the systematics.

Fitting overall parameters
Another solution besides profiling is to fit overall parameters, or normalization factors, which should be seen as correction factors. This can be used, for example:
- when you have a dominant background;
- when you have enough sidebands to constrain the parameter;
- when you have evidence that the data/MC agreement in the control region is not great and your systematic uncertainties are very large.
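Fitting a single free normalization factor for a dominant background can be sketched as a one-parameter binned Poisson likelihood fit (all yields here are hypothetical, not the ttbar numbers of the next slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

def fit_norm(data, template):
    """Fit a free normalization (correction) factor k for a background template
    to data by maximizing a binned Poisson likelihood."""
    data, template = np.asarray(data, float), np.asarray(template, float)
    nll = lambda k: -np.sum(poisson.logpmf(data, k * template))
    return minimize_scalar(nll, bounds=(0.1, 10.0), method="bounded").x

# Hypothetical control region where MC underestimates the background by ~35%.
template = np.array([100.0, 80.0, 60.0, 40.0])
data = np.array([135.0, 110.0, 80.0, 55.0])
print(fit_norm(data, template))  # correction factor ~1.36
```

For a single overall scale the maximum-likelihood solution is just the ratio of total yields, which makes this easy to cross-check by hand.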

Fitting overall parameters, example 1/4
Example of Ht+X (ATL-CONF), using the HT distribution as discriminant: the scalar sum of all the objects' pT in the event. A "poor man's way" to discover new physics: if something unexpected appears in the HT tails, it is either mis-modeling or signal, and HT cannot be used to identify the type of new particle… This analysis suffers from large systematics and from what obviously seems to be a mis-modeling of HT.

Fitting overall parameters, example 2/4
Obvious incorrectness of the ttbar heavy/light-flavor background, especially in the 6-jets, 4-tags, low-HT region (= control region). This analysis fits two free parameters, ttbar+HF and ttbar+light: ttbar+HF: 1.35 ± 0.11 (stat); ttbar+light: 0.87 ± 0.02 (stat).

Fitting overall parameters, example 3/4
No evidence of signal, and no strong mis-modeling outside of the systematic bands: when un-blinded, the analysis did not find any signal. The two free parameters are fitted again: ttbar+HF: 1.21 ± 0.08 (stat); ttbar+light: 0.88 ± 0.02 (stat).


Other tips that could help when performing a profiled analysis
- Merging channels: if you are performing an analysis using leptons (for example a single-lepton analysis), you can merge electrons and muons if there is no reason the physics differs between the two lepton flavors → this helps to gain statistics in the tails.
- Merging backgrounds: if you are suffering from low Monte-Carlo statistics for small backgrounds and the shapes of those small backgrounds look similar, why not merge them into a single sample!
- Merging systematics: it is also possible to merge small systematics that have basically the same effect. For example, if you have several lepton systematics (like trigger SF, reco SF, ID SF), it might be better to merge them into a single systematic.
Note that when merging channels or backgrounds, the systematic treatment should remain consistent.

Other tips that could help when performing a profiled analysis
You might also want to consider smoothing histograms. Be very cautious here: if there is no shape to start with, a smoothing algorithm might invent one… Keep in mind that profiling nuisance parameters is, at the end of the day, a fit (using MINUIT); if you give MINUIT crappy/shaky templates, it cannot do miracles… The number of parameters and their variations are the most important things when doing profiling.

Summary
I hope you now know everything about profiling. Profiling should really be seen as an optimization step that helps to recover the degradation due to systematics. Now time for discussion.
References:
- mclimit: cdf.fnal.gov/~trj/mclimit/production/mclimit.html
- RooStats
- Wikipedia has a lot of interesting and detailed information about statistics!

Bonus slides

Toy MC example: Case 4
Set the first bin to: signal 1, background 100 (S/B = 0.01).
CLs(STAT only) = 1.5e-5; CLs(STAT+SYST) = 3.2e-5; CLs(STAT+SYST PROF) = 1.6e-5

Toy MC example: Case 5
Set the first bin to: signal 100, background 100 (S/B = 1).
CLs(STAT only) = 8.0e-6; CLs(STAT+SYST) = 3.2e-5; CLs(STAT+SYST PROF) = 1.7e-5

Toy MC example: Case 6
Set the first bin to: signal 1, background 1 (S/B = 1).
CLs(STAT only) = 1.3e-5; CLs(STAT+SYST) = 2.4e-5; CLs(STAT+SYST PROF) = 2.3e-5

Another likelihood ratio 1/4
One used in RooStats (hep-ex/) and at the LHC:
λ(µ) = L(µ, θ̂̂) / L(µ̂, θ̂),
where the numerator maximizes L for the specified µ (the conditional fit) and the denominator is the unconditional maximum, with the fit done on data. Here the fitting is not an optimization; it is needed for the correctness of the model. µ̂ is the best-fit value of the signal rate. One should distinguish between µ = 0 (zero signal, SM, null hypothesis H0) and µ > 0 (test hypothesis H1).
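For a counting experiment with no nuisance parameters, λ(µ) has a closed form. A minimal sketch (the yields are hypothetical, and µ̂ is clipped at zero, one common convention for one-sided tests):

```python
from scipy.stats import poisson

def q_mu(n_obs, mu, s, b):
    """-2 ln lambda(mu) for a single-bin counting experiment without nuisance
    parameters: lambda(mu) = L(mu) / L(mu_hat), with mu_hat = max(0, (n - b)/s)."""
    mu_hat = max(0.0, (n_obs - b) / s)
    return 2.0 * (poisson.logpmf(n_obs, mu_hat * s + b)
                  - poisson.logpmf(n_obs, mu * s + b))

# Hypothetical: s = 10 signal, b = 50 background expected.
print(q_mu(50, 1.0, 10.0, 50.0))  # data at background level: tension with mu = 1
print(q_mu(50, 0.0, 10.0, 50.0))  # data exactly match mu = 0: no tension
```

By construction q_mu is zero when the tested µ equals the best fit, and grows as the data disfavor the tested µ: the behavior the asymptotic formulae on the next slides quantify.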

Another likelihood ratio 2/4
Wald approximation for the profile LLR (1943): -2lnλ(µ) ≈ (µ − µ̂)²/σ² + O(1/√N), where N is the sample size. -2lnλ(µ) then follows a non-central chi-square distribution (central, by Wilks' theorem, when µ is the true value).

Another likelihood ratio 3/4
Asimov dataset: to estimate the median value of -2lnλ(µ), consider a special dataset where all the statistical fluctuations are suppressed. The Asimov value of -2lnλ(µ) gives the non-centrality parameter.
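Evaluating the discovery test statistic on the Asimov dataset (n = s + b) of a single-bin counting experiment gives the well-known median-significance formula of Cowan, Cranmer, Gross and Vitells (the yields below are hypothetical):

```python
import numpy as np

def asimov_significance(s, b):
    """Median discovery significance from the Asimov dataset (n = s + b):
    Z_A = sqrt(2 * ((s + b) * ln(1 + s/b) - s))."""
    return np.sqrt(2.0 * ((s + b) * np.log1p(s / b) - s))

print(asimov_significance(10.0, 100.0))  # ~0.98, close to s/sqrt(b) = 1.0
print(asimov_significance(10.0, 2.0))    # ~4.8; s/sqrt(b) = 7.1 would overestimate
```

For s << b this reduces to the familiar s/√b, while for small backgrounds it stays honest where s/√b does not.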

Another likelihood ratio 4/4
At the end of the day we have asymptotic formulae: much faster than running toy MC, and a very good approximation in most cases, though Poisson discreteness can make them break down.