Proposals for near-future BG determinations from control regions Renaud Bruneliere – Freiburg
Outline
- Motivations - Basics
- What we are doing/planning for 2010 data taking
- Open issues
Warning: not many plots in this talk…
The Facts
SUSY without prejudice at the LHC, John Conley (Universität Bonn), at SUSY10: https://indico.desy.de/materialDisplay.py?contribId=225&sessionId=232&materialId=slides&confId=2955
Percentage of MSSM24 points found by an “ATLAS”-like search: ATL-PHYS-PUB-2010-010
Multijets + 0-lepton searches are dominated by vector boson + jets backgrounds in the tails (the regions used to compute significance)
Is a 20% uncertainty in the tails on VB+jets feasible with 1 fb^-1? And how much for 10-40 pb^-1? Do we have the tools to reach this?
Why a data-driven background estimation?
Until summer 2010, we were mostly doing data / Monte Carlo comparisons. Encouraging results, but…
A pure MC estimation of VB+jets means:
- Large uncertainties on the expected rate (cross-section, luminosity)
- Strong dependence on the detector simulation (e.g. Jet Energy Scale uncertainties ~ 10%)
Move progressively to data-driven methods.
However, a data-driven method does not mean “no Monte Carlo”.
What does data-driven background estimation mean?
By data-driven, I mean: NSR = A × (NCS − B), where
- NSR: estimated number of events in the signal region, e.g. Zvv+(2jets)
- A: acceptance correction (from MC or data)
- NCS: observed number of events in the control sample (from data), e.g. Zll+(2jets)
- B: background contamination in the control sample (from MC and/or data), e.g. ttbar->llvv+jets
Control sample:
- Process/phase space as close as possible to the background in the signal region
- Signal-free whenever possible
Acceptance correction:
- Can be estimated from data, as with the ABCD method (but this relies on strong assumptions, usually based on MC)
- More commonly obtained from MC: A = N(SR,MC)/N(CS,MC) => sensitive to Monte Carlo & theory uncertainties through A (a ratio)
[Figure: ABCD regions A, B, C, D in the (MT, ETmiss) plane]
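A minimal Python sketch of the estimate described on this slide, assuming the form NSR = A × (NCS − B) with A taken from MC as N(SR,MC)/N(CS,MC). The function names and all yields are hypothetical, chosen only for illustration.

```python
def transfer_factor(n_sr_mc: float, n_cs_mc: float) -> float:
    """Acceptance correction A = N(SR,MC) / N(CS,MC), taken from Monte Carlo."""
    if n_cs_mc <= 0:
        raise ValueError("MC yield in the control sample must be positive")
    return n_sr_mc / n_cs_mc


def estimate_signal_region(n_cs_data: float, b: float, a: float) -> float:
    """Data-driven estimate NSR = A * (NCS - B).

    n_cs_data: observed events in the control sample (data)
    b:         expected contamination of the control sample (MC and/or data)
    a:         acceptance correction (transfer factor)
    """
    return a * (n_cs_data - b)


# Hypothetical yields, for illustration only.
a = transfer_factor(n_sr_mc=50.0, n_cs_mc=20.0)
n_sr = estimate_signal_region(n_cs_data=30.0, b=4.0, a=a)
print(f"A = {a:.2f}, estimated NSR = {n_sr:.1f}")
```

Only the ratio of MC yields enters, so an overall rate mismodelling (cross-section, luminosity) common to both regions cancels in A.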
Uncertainty on data-driven estimate
Error propagation: tradeoff between 3 terms, the uncertainties on A, B and NCS
Goal: for every control sample, try to choose the phase space that optimizes σ(NSR)
But if A and B are based on Monte Carlo, σ(NSR) should rely mostly on σ(NCS):
- NCS is statistically well defined (Poisson)
- A relies on a subjective comparison of models (Pythia vs Herwig fragmentation) and on the variation of parameters (scales) => assumed to be described by a prior pdf (Gaussian, uniform, BW) with rms = difference.
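A sketch of the first-order error propagation, assuming the estimate has the form N_SR = A (N_CS − B) and that A, B and N_CS are uncorrelated (an assumption; the slide does not state the correlations):

```latex
\[
\left(\frac{\sigma_{N_{SR}}}{N_{SR}}\right)^{2}
= \left(\frac{\sigma_{A}}{A}\right)^{2}
+ \frac{\sigma_{N_{CS}}^{2}}{\left(N_{CS}-B\right)^{2}}
+ \frac{\sigma_{B}^{2}}{\left(N_{CS}-B\right)^{2}},
\qquad
\sigma_{N_{CS}} = \sqrt{N_{CS}} \quad \mathrm{(Poisson)}
\]
```

The first term is the theory/MC term, the second the control-sample statistics, the third the contamination. With the "rms = difference" convention, \sigma_A would be taken as |A(model 1) − A(model 2)|.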
Control samples: Zll+jets
What are the possible control samples?
Z->ll+jets:
- Process very close to Z->vv => σ(A) small if same phase space cuts
- σ(B), σ(A) << σ(NCS)
- NCS ~ NSR/10 => statistically limited!
Solution: relax some of the phase space cuts (like the Meff cut) to increase NCS (and so σ(A)!)
Conclusion:
- Need to extrapolate with MC from the Meff core to the tails (Meff > 1000 GeV)
- What are the theory uncertainties in the tails when doing so?
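The tradeoff can be illustrated with a small Python sketch. It assumes the estimate NSR = A × (NCS − B), uncorrelated inputs and a Poisson error on NCS; the helper name and all numbers (yields, 5% and 15% uncertainties on A) are hypothetical.

```python
import math


def relative_uncertainty(a, sigma_a, n_cs, b=0.0, sigma_b=0.0):
    """Relative uncertainty on NSR = A * (NCS - B), first-order propagation.

    Assumes A, B and NCS are uncorrelated and NCS has a Poisson error sqrt(NCS).
    """
    net = n_cs - b
    if net <= 0 or a <= 0:
        raise ValueError("need A > 0 and NCS > B")
    theory_term = (sigma_a / a) ** 2      # acceptance correction (MC/theory)
    stat_term = n_cs / net ** 2           # control-sample statistics
    bkg_term = (sigma_b / net) ** 2       # contamination
    return math.sqrt(theory_term + stat_term + bkg_term)


# Hypothetical numbers: same cuts as the signal region (few events, A well known)
# versus a relaxed Meff cut (10x more events, larger extrapolation uncertainty).
tight = relative_uncertainty(a=10.0, sigma_a=0.5, n_cs=10)
relaxed = relative_uncertainty(a=1.0, sigma_a=0.15, n_cs=100)
print(f"tight cuts:   {tight:.1%}")
print(f"relaxed cuts: {relaxed:.1%}")
```

With these made-up inputs the relaxed selection wins, but only as long as the extrapolation uncertainty on A stays below the statistical gain.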
Control samples: Wlv+jets
What are the possible control samples?
W->lv+jets:
- W/Z ratio & W+jets cross-section well studied as SM processes
- NCS ~ NSR => not statistically limited
- B large (ttbar) for Njets ≥ 2
- Contamination from signal
Possible solution(s):
- Estimate W+jets and ttbar simultaneously. But N(W+jets)/N(ttbar) is difficult to control
- ttbar control sample (reconstruct masses and isolate them)
- W charge asymmetry: N(W+) - N(W-) ~ 0.2-0.3 × (N(W+) + N(W-)). Solves both ttbar and signal contamination
Conclusion:
- Probably the best method for first data
- W charge asymmetry looks interesting, but needs a good understanding of the pdfs (in the Meff tails)
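A minimal Python sketch of the charge-asymmetry idea, assuming that ttbar and signal contribute equally to both lepton charges and therefore cancel in N(+) − N(−). The asymmetry value would come from MC (0.2-0.3 according to the slide); the function name and the yields are hypothetical.

```python
import math


def w_from_charge_asymmetry(n_plus, n_minus, asym_mc):
    """Estimate the W+jets yield from the lepton-charge difference.

    n_plus, n_minus: observed events with a positive / negative lepton
    asym_mc:         (N(W+) - N(W-)) / (N(W+) + N(W-)) predicted by MC
    Charge-symmetric contributions are assumed to cancel in the difference.
    Returns (estimated W yield, statistical uncertainty).
    """
    if not 0.0 < asym_mc <= 1.0:
        raise ValueError("asymmetry must be in (0, 1]")
    diff = n_plus - n_minus
    n_w = diff / asym_mc
    # Independent Poisson counts: var(N+ - N-) = N+ + N-
    stat = math.sqrt(n_plus + n_minus) / asym_mc
    return n_w, stat


# Hypothetical sample: 600 W events (375 W+, 225 W-) plus 150 charge-symmetric
# events per charge from ttbar.
n_w, stat = w_from_charge_asymmetry(n_plus=525, n_minus=375, asym_mc=0.25)
print(f"N(W) = {n_w:.0f} +- {stat:.0f} (stat)")
```

The price of the cancellation is visible in the statistical error, which is inflated by 1/asymmetry, on top of the pdf uncertainty on the asymmetry itself.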
Control samples: prompt γ+jets
What are the possible control samples?
γ+jets:
- NCS ~ NSR => not statistically limited
- B could be large (QCD) and difficult to control
- A largely unknown?
Conclusion:
- Results obtained by CMS seem encouraging
- Requires some careful theory studies
End goal is to combine all 3 control samples into a single estimate of the Wlv+jets and Zvv+jets backgrounds
The different control samples will be weighted by σ(NSR)
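One possible way to combine the control samples, sketched in Python: an inverse-variance weighted average. This is an assumption, since the slide only says that the samples are weighted by the uncertainty on NSR, and it treats the estimates as uncorrelated, which would not hold for shared theory or detector systematics. All numbers are hypothetical.

```python
import math


def combine(estimates):
    """Inverse-variance weighted average of (value, sigma) pairs.

    Assumes the individual estimates are uncorrelated.
    Returns (combined value, combined sigma).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    weights = []
    for _, sigma in estimates:
        if sigma <= 0:
            raise ValueError("uncertainties must be positive")
        weights.append(1.0 / sigma ** 2)
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / math.sqrt(total)


# Hypothetical NSR estimates (value, sigma) from three control samples.
samples = [
    (100.0, 20.0),  # e.g. from Zll+jets
    (110.0, 10.0),  # e.g. from Wlv+jets
    (90.0, 30.0),   # e.g. from gamma+jets
]
value, sigma = combine(samples)
print(f"combined NSR = {value:.1f} +- {sigma:.1f}")
```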
Theory uncertainties on VB+jets
Currently existing/used tools in ATLAS for VB+jets:
Alpgen (+Herwig+Jimmy) (v2.13) (see Keith Edmonds' talk)
- Wlv+5jets (Wlvbb+3jets), Zll+5jets (Zllbb+3jets), Zvv+5jets, (possibly γ+5jets)
- Theory uncertainty: vary scales: renormalization scale (ktfac), factorization scale (qfac), dynamic scale choice (iqopt)
- Matching scales Eclus, Rclus
Sherpa (v1.2.2) (see Frank Siegert's talk)
- Only starting to use it for searches
- Vary scale by 0.5, 2.0
- Matching scale 10 GeV
MCFM (see James Buchanan's talk) (Blackhat in future?)
- Dynamic scale choice: HT/2
- Theory uncertainty: vary scales, use different pdf sets/members
- Apply parton -> hadron corrections obtained with Pythia/Herwig
Any obvious MC tool missing?
Theory uncertainties: open issues?
Reminder: theory uncertainties on VB+jets enter only through A, so we need a robust estimate:
- On ratios (W/Z, W(SR)/W(CS), W+/W-) and NOT rates
- When extrapolating numbers into the tails (Meff)
- Our channels are inclusive => 0-lepton+2jets (PT>40, |η|<2.5) means some fraction of 3-jet and 4-jet events
Questions:
- Can we rely on LO scale changes as a robust theory uncertainty on ratios, or does it factorize completely?
- Do we need parton-level NLO predictions (even approximate) to compare with Alpgen/Sherpa (at parton level / at hadron level with PowHeg)?
- Is there any way to combine NLO VB+2jets with NLO VB+3jets,… or do we have to compare exclusive jet multiplicities?
- When looking into the high Meff (HT) tails with NLO programs, which scale choice should we use?
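To make the "ratios, not rates" point concrete, here is a Python sketch of a scale-variation envelope on A = N(SR,MC)/N(CS,MC). It assumes the scale is varied coherently in numerator and denominator (factors 0.5 and 2.0), which is exactly the assumption questioned above. The yields are hypothetical and do not come from any generator.

```python
def envelope(values, nominal_key=1.0):
    """Largest relative deviation from the nominal among the scale variations."""
    nominal = values[nominal_key]
    return max(abs(v - nominal) for v in values.values()) / nominal


# Hypothetical MC yields per scale factor (0.5, nominal, 2.0).
n_sr = {0.5: 60.0, 1.0: 50.0, 2.0: 42.0}     # signal region
n_cs = {0.5: 235.0, 1.0: 200.0, 2.0: 172.0}  # control sample

# Coherent variation: the same scale factor is used in SR and CS.
ratio = {k: n_sr[k] / n_cs[k] for k in n_sr}

rel_rate = envelope(n_sr)    # uncertainty on the absolute SR rate
rel_ratio = envelope(ratio)  # uncertainty on the transfer factor A
print(f"rate:  {rel_rate:.1%}")
print(f"ratio: {rel_ratio:.1%}")
```

With these made-up inputs most of the scale dependence cancels in the ratio. Whether such a cancellation is physical or just an artefact of varying both regions together is the open question.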