Methodology and examples to determine the fake rate and separate signal from background
– Using a fit on the sideband.
– Using an independent control sample.
– Using a control region (ABCD method).
– Subtraction method.
Estimation using sideband
In the Z→e⁺e⁻ case there is a well-defined sideband, so one can perform the extrapolation by fitting S+B to separate signal from background.
– The signal and background normalizations are extracted directly from the fit.
– The electron ID efficiency is the ratio of the signal extracted after the loose ID cut to the probes under the signal peak.
Plots: Z→ee on top of di-jet (JF17); separating Z→ee from di-jet.
Using the InsituEgammaPerformance package developed by Maria Fiascaris and Oliver Arnaez.
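As a rough illustration of the sideband idea, the sketch below replaces the full S+B fit with a simpler stand-in: a straight-line least-squares fit to the sideband bins, interpolated under the peak. The function names, the linear background shape, and the assumption that the peak window coincides with bin edges are all illustrative choices, not part of the InsituEgammaPerformance package.

```python
def sideband_subtract(counts, edges, peak_lo, peak_hi):
    """Return (signal, background) yields inside [peak_lo, peak_hi].

    counts[i] is the yield in the bin [edges[i], edges[i + 1]).
    Assumption: the background is linear in mass and the peak window
    is aligned with bin edges; bins not fully inside it are sideband.
    """
    n_bins = len(counts)
    centres = [0.5 * (edges[i] + edges[i + 1]) for i in range(n_bins)]
    in_peak = [edges[i] >= peak_lo and edges[i + 1] <= peak_hi
               for i in range(n_bins)]

    # Least-squares straight line through the sideband bins only.
    side = [(x, y) for x, y, p in zip(centres, counts, in_peak) if not p]
    mean_x = sum(x for x, _ in side) / len(side)
    mean_y = sum(y for _, y in side) / len(side)
    sxx = sum((x - mean_x) ** 2 for x, _ in side)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in side)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Interpolate the background under the peak and subtract it.
    background = sum(slope * x + intercept
                     for x, p in zip(centres, in_peak) if p)
    total = sum(n for n, p in zip(counts, in_peak) if p)
    return total - background, background


def id_efficiency(n_signal_pass, n_signal_probes):
    """Electron ID efficiency as a ratio of background-subtracted yields."""
    return n_signal_pass / n_signal_probes
```

Running `sideband_subtract` once on all probes and once on the probes passing the ID cut gives the two signal yields that enter `id_efficiency`.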
Estimation using an independent sample (control sample)
Sometimes, to estimate the fake rate of sample A, we use an independent sample B.
– In principle, B must have higher statistics than A (especially in the early-data case).
– For example: use di-jet/photon-jet to estimate the jet fake rate of W+jet/Z+jet etc.
The other way is to extract the shapes of some variables for sample A from sample B.
– Obtain the normalization with a fit.
Estimation of W+jets (j→e) using γ-jet
Electron isolation is applied beforehand.
σ(W+jet)|tight (e id) = σ(γ-jet)|tight (e id) × ( σ(W+jet)|loose (e id) / σ(γ-jet)|loose (e id) )
– These plots show the results in the context of H→W(μν)W(eν).
– γ-jet is used for the estimation because of its high statistics (a similar reason as for using di-jet): the γ-jet cross-section is a factor of 10 larger than that of W+jets.
– The diagrams of γ-jet are close to those of W+jets. Both are quark-enriched jets.
– After tight electron isolation, the systematic uncertainty is less than ~20%.
Plot legend: W+jet, γ-jet.
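The relation on this slide scales the tight-ID rate measured in the photon-jet control sample by the loose-ID ratio between the two samples. A minimal sketch, with hypothetical function and argument names and with the inputs treated as plain numbers (cross-sections or yields in the same units):

```python
def extrapolate_tight(ctrl_tight, target_loose, ctrl_loose):
    """Predict the tight-ID rate of the target sample (W+jet).

    ctrl_tight   : control sample (photon-jet) rate after tight electron ID
    target_loose : target sample (W+jet) rate after loose electron ID
    ctrl_loose   : control sample (photon-jet) rate after loose electron ID

    Assumption: the tight/loose ratio is the same in both samples,
    which is what the slide's relation expresses.
    """
    if ctrl_loose <= 0:
        raise ValueError("control-sample loose rate must be positive")
    return ctrl_tight * (target_loose / ctrl_loose)
```

The same function can be read the other way round: it applies the control-sample tight/loose ratio `ctrl_tight / ctrl_loose` to the loose-ID W+jet rate.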
Estimation of W(eν)+jet (j→e) using di-jet
How do the different jet components perform?
– Different jet components (quark/gluon/HF) have different fake rates (upper left plot).
– Isolation appears to reduce the discrepancy (upper right plot). A similar trend is seen in the very high pT region (page 23).
– Isolated photons can convert into electrons: if the fraction of those electrons in W+jet differs from that in di-jet, this effect must be taken into account.
Plots: fake rate for the different jet components; isolation cut etcone20/et < 0.3.
Result of applying the jet fake rate to W(eν)+jet (j→e)
– After applying isolation, the predicted and observed results are more consistent.
– The agreement can be improved further if the effects of photon conversions are taken into account.
Plot: number of events, normalized to 200 pb⁻¹.
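Applying a measured fake rate to predict the number of fake electrons amounts to weighting the jets in the target sample by the rate. The sketch below assumes, purely for illustration, that both the jet counts and the fake rates are binned in the same variable (for example jet pT); the names are hypothetical.

```python
def predict_fakes(jet_counts, fake_rates):
    """Predicted number of fake electrons: sum over bins of N_jets x fake rate.

    jet_counts[i] : number of candidate jets in bin i of the target sample
    fake_rates[i] : fake rate measured in the control sample for bin i
    Assumption: both lists use the same binning.
    """
    if len(jet_counts) != len(fake_rates):
        raise ValueError("jet_counts and fake_rates must share one binning")
    return sum(n * r for n, r in zip(jet_counts, fake_rates))
```

The prediction can then be compared bin by bin, or in total, with the observed number of fake electrons.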
One more example: QCD subtraction in the measurement of W(eν) using photon selection
– The dominant background for the measurement of inclusive W(eν) is QCD jets.
– E_T^miss is a discriminating variable.
– The shape of E_T^miss for QCD jets can be obtained from events passing the photon selection.
– The photon selection can also be used to estimate the shapes of QCD shower shape variables.
– With the available MC statistics, the uncertainty on the accuracy is ≤7.5%.
Plot label: W+jets.
Using control region (ABCD method)
Choose two variables:
– Variable 1 and variable 2 are weakly correlated.
– A→B or A→C goes from the signal region to a background-enriched region.
– In region B+D, extract the background shape for var2. In region C+D, extract the background shape for var1.
– With the shape(s), we can fit the data and extrapolate N_bkg^A from the fit: F = f_s PDF(s) + (1 − f_s) PDF(b).
– Or estimate the background normalization in the signal-like region: N_bkg^A = (N_B / N_D) × N_C.
– For the signal shape, we sometimes take advantage of Z→ee events (tag-and-probe method).
Diagram: regions A, B, C, D in the (var1, var2) plane; each axis is divided into a signal region and a background region.
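The counting version of the ABCD estimate can be sketched in a few lines. The uncertainty shown is plain Poisson error propagation on the three control-region counts, which is an assumption added here for illustration; it ignores signal contamination and any residual correlation between the two variables.

```python
import math


def abcd_estimate(n_b, n_c, n_d):
    """Background in signal region A from control regions B, C and D.

    Returns (estimate, statistical uncertainty) with
    N_A = N_B * N_C / N_D, assuming the two variables are uncorrelated
    for background and the control regions are signal free.
    """
    if min(n_b, n_c, n_d) <= 0:
        raise ValueError("all control-region counts must be positive")
    estimate = n_b * n_c / n_d
    # Relative Poisson errors of the three counts added in quadrature.
    rel_err = math.sqrt(1.0 / n_b + 1.0 / n_c + 1.0 / n_d)
    return estimate, estimate * rel_err
```

For example, `abcd_estimate(100, 400, 1600)` estimates 25 background events in region A, with the statistical error dominated by the smallest count.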
Example: measurement of fake electrons in W(eν)+jet from QCD
– Two variables are chosen: E_T^miss (var 1) and E_T^had/E_T (var 2).
– All ID cuts except hadronic leakage are applied beforehand.
– In the background-enriched region, one can get the shape of E_T^miss for di-jet.
– The estimated E_T^miss shape for di-jet is consistent with the true one.
Background fractions: MC: 0.44 ± 0.02 (stat.); fit estimate: 0.46 ± 0.02 (stat.)
Plot legend: di-jet, W+jet.
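A background fraction like the one quoted here comes from fitting signal and background templates to the data distribution. The sketch below is a simplified stand-in for such a fit: a one-parameter scan of a binned Poisson likelihood with the total yield fixed to the data. The function name, the scan granularity, and the requirement that every bin has a non-zero template are assumptions made for this illustration.

```python
import math


def fit_background_fraction(data, sig_shape, bkg_shape, steps=1000):
    """Scan the background fraction f_b that best describes the data.

    data      : observed counts per bin
    sig_shape : signal template per bin (any normalization)
    bkg_shape : background template per bin (any normalization)
    Assumption: in every bin at least one template is non-zero.
    """
    n_tot = sum(data)
    s_norm = [s / sum(sig_shape) for s in sig_shape]
    b_norm = [b / sum(bkg_shape) for b in bkg_shape]

    best_f, best_nll = None, float("inf")
    for k in range(1, steps):  # f_b strictly between 0 and 1
        f_b = k / steps
        nll = 0.0
        for n, s, b in zip(data, s_norm, b_norm):
            mu = n_tot * ((1.0 - f_b) * s + f_b * b)
            nll += mu - n * math.log(mu)  # Poisson NLL up to a constant
        if nll < best_nll:
            best_f, best_nll = f_b, nll
    return best_f
```

In a real analysis the scan would be replaced by a proper minimizer that also returns the statistical uncertainty on the fraction.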
Subtraction method
Suppose we have two samples, A and B: one (sample B) is more sensitive to some cut and the other (sample A) is less sensitive to it (as the left plot shows). How do we estimate the contribution of each, given that they are mixed in data?
Step 1: Determine the cut efficiency on A, Eff.(A), as the cut is scanned. It is obtained independently from some other sample to keep the analysis data driven.
Step 2: Plot contribution/Eff.(A) (e.g. measured cross-section/Eff.(A)) for data as a function of the cut or of Eff.(A). Because A is more or less flat (right plot), the data curve should approach a flat region asymptotically as the cut becomes tighter and tighter.
Step 3: The asymptotic flat value is the contribution of A (it can be derived from a fit of an asymptotic function).
Step 4: Subtracting the asymptotic flat value from data gives the contribution from B.
Plots: left, cross-section vs. cut or Eff.(A) for sample A and sample B; right, cross-section/Eff.(A) vs. cut or Eff.(A) for data (mixed A and B), sample A and sample B.
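The four steps can be sketched as follows. Instead of fitting an asymptotic function, this illustration takes the plateau as the average of the ratio over the tightest few scan points, which is a deliberate simplification; the function name and the `n_tight` parameter are hypothetical.

```python
def subtract_flat_component(eff_a, measured, n_tight=2):
    """Split a measured scan into a flat component A and a remainder B.

    eff_a    : efficiency of the cut on sample A at each scan point,
               ordered from the loosest to the tightest cut (step 1)
    measured : measured contribution in data at the same scan points
    n_tight  : number of tightest points averaged to get the plateau
    Returns (plateau, b_contribution), one B value per scan point.
    """
    # Step 2: data divided by the efficiency of A at every scan point.
    ratio = [m / e for m, e in zip(measured, eff_a)]
    # Step 3: the plateau at tight cuts is the contribution of A.
    plateau = sum(ratio[-n_tight:]) / n_tight
    # Step 4: subtract the plateau and undo the efficiency division.
    b_contribution = [(r - plateau) * e for r, e in zip(ratio, eff_a)]
    return plateau, b_contribution
```

The plateau is the efficiency-corrected contribution of A, so the amount of A surviving a given cut is `plateau * eff_a[i]`.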
Example: W+jets estimation from data
Context: estimate the W(→μν)+jet contribution to H→W(→μν)W(→eν)+0jet.
– From data we expect to see a mixed contribution, as the red curve (bottom plot) shows.
– The reducible background W+jets (blue curve) drops very fast when approaching the tight electron ID region.
– For the irreducible background, cross-section/electron eff. is more or less flat as a function of electron efficiency.
– We can estimate the flat value from the low electron efficiency region of the red curve.
– Subtracting the flat value from the red curve gives the estimated W+jets contribution (green curve). It is consistent with the true W+jets (blue curve) at our working point.
– To make the method fully data driven, the electron ID efficiency can be obtained independently from Z→ee (top plot).
Plot axis: from loose electron ID to tight electron ID.
Last example: estimation of fake electrons in ttbar
Goal: estimate the fake and heavy-flavor (HF) backgrounds to isolated electrons in the ttbar sample. The fit separates more than two components.
Pdf = f_bkg × BKG + (1 − f_bkg) × SIGNAL, where BKG = f_HF × HF + (1 − f_HF) × FAKE

                     Values from MC truth    Values from fit
Tot BKG content      9.97%                   9.5 ± 0.8%
BKG from HF          6.21%                   7.7 ± 0.7%
BKG from fakes       2.92%                   1.8 ± 0.7%
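The nested model on this slide can be written as a small composition of shapes. In the sketch below the component shapes are passed in as ordinary Python callables, and all names are illustrative; it only shows how the two fit fractions map onto the three component fractions.

```python
def total_pdf(x, f_bkg, f_hf, signal, hf, fake):
    """Pdf = f_bkg * BKG + (1 - f_bkg) * SIGNAL,
    with BKG = f_hf * HF + (1 - f_hf) * FAKE.

    signal, hf, fake : callables returning each normalized shape at x
    """
    bkg = f_hf * hf(x) + (1.0 - f_hf) * fake(x)
    return f_bkg * bkg + (1.0 - f_bkg) * signal(x)


def component_fractions(f_bkg, f_hf):
    """Fractions of the whole sample carried by each component."""
    return {
        "signal": 1.0 - f_bkg,
        "hf": f_bkg * f_hf,
        "fake": f_bkg * (1.0 - f_hf),
    }
```

With the fitted central values, `component_fractions(0.095, 0.077 / 0.095)` reproduces the 7.7% heavy-flavor and 1.8% fake contents of the table.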
We have the estimated bkg normalization, rejection etc. Is it possible or necessary to converge on some common tools for the fake electron estimation?
Diagram: inputs (PDFs from control region, control sample, Z→ee etc., plus data) → transparent tools that handle the fit and interpret the result.
o One advantage of common tools is that everyone uses a common definition, so results from different people can be compared more easily.
o Experience and ideas can be shared more easily. Everyone can repeat the analyses.
Conclusion
– The study of fake electrons is crucial for many processes; in addition to e-gamma, many other groups (SM, Higgs, top, exotic, SUSY) have to deal with it.
– We briefly introduced the current methods to estimate fake electron rates and tried to categorize them.
Back-up
Background estimation: ZQ(b/c)Q
Motivation: ZQQ could be % of ZZ* at low Higgs mass.
Procedure:
– Measure the shape of Zqq from a light-quark-enriched region (analysis cuts, except for using egamma instead of 2 loose electrons).
– Extract the chosen shower shape variable (i.e. R37) from a fit to data. Validate the MC.
– Extract the normalization from a restricted fit in R37 to data.
– Extrapolate from egamma to loose electrons using the MC to predict the Zqq contribution.
– ZQQ predicted (loose electrons) = Data − Zqq extrapolated (loose electrons).
At 200 pb⁻¹, mostly because of low statistics and ZQq contamination, the uncertainty is ~50%.
Estimation of W+jets using di-jet
In the context of H→WW: good agreement with the MC expectation.