1
H→γγ Analysis Techniques: How can we get the most from a small signal?
Kyle Armour, Satyaki Bhattacharya, Jim Branson, James Letts, Tony Lee, Vladimir Litvin, Harvey Newman, Sergey Schevchenko
2
What’s New in this Talk
Use of large background samples produced by the Caltech group, including full jet-jet and γ-jet samples.
Use of the reconstructed vertex in the tracker (instead of MC cheating to get the vertex). This is now realistic, and mass resolution deteriorates somewhat.
Use of the tracker for the isolation variable.
Other technical improvements in the analysis.
Some conservatism has set in.
April 14, 2004, Jim Branson for UCSD/Caltech
3
Performance is still “Too Good”
With somewhat conservative handling of low MC statistics, the luminosity needed for a 5σ discovery at M = 120 GeV is (in round numbers):
2 fb⁻¹ for jet-jet background
2 fb⁻¹ for γ-jet background
0.5 fb⁻¹ for γγ background
We have not yet combined the backgrounds.
If there are no big problems, we should add the needed luminosities to get the total needed to discover the Higgs at MH = 120 GeV.
4
Outline
Overview of analysis method
  If you don’t understand the method, you will focus on the wrong issues.
Some details of analysis
  Separation into categories based on shower size
  Treatment of mass distribution
  Neural Net to optimize use of kinematic and isolation variables
  MC statistics issues
Where to go from here
Some new and disturbingly good results.
5
Summary of Analysis Methods
Higgs mass hypothesis.
Background normalization and characteristics from sidebands.
Gray Box: Sort signal and background events into bins in the same way. Use mass, isolation, kinematics and shower size to sort events into bins.
A single histogram in log(s/b), showing signal (s) and background (b), combines the histograms from the categories using s/b in each bin.
Compute CL using random (s+b) and b-only trials from the expectation in the histogram.
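The merging step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `merge_by_log_sb`, the use of base-10 logs, and the choice of output bin edges are hypothetical and not taken from the talk. Every bin of every category is placed in a common histogram according to its own log(s/b), with signal and background treated identically.

```python
import numpy as np

def merge_by_log_sb(categories, edges):
    """Merge per-category (s, b) histograms into one histogram in log10(s/b).

    categories: list of (s, b) pairs, each an array of expected counts per bin.
    edges: bin edges of the merged log10(s/b) histogram (an assumed choice).
    """
    s_all = np.concatenate([np.asarray(s, dtype=float) for s, _ in categories])
    b_all = np.concatenate([np.asarray(b, dtype=float) for _, b in categories])
    keep = (s_all > 0) & (b_all > 0)  # log(s/b) is undefined otherwise
    x = np.log10(s_all[keep] / b_all[keep])
    # Signal and background are filled into the same bins, chosen by s/b alone.
    s_hist, _ = np.histogram(x, bins=edges, weights=s_all[keep])
    b_hist, _ = np.histogram(x, bins=edges, weights=b_all[keep])
    return s_hist, b_hist

# Two toy categories with two bins each (made-up numbers).
cats = [([1.0, 2.0], [100.0, 20.0]), ([0.5, 4.0], [50.0, 4.0])]
s_hist, b_hist = merge_by_log_sb(cats, edges=[-2.5, -1.5, -0.5, 0.5])
```

Bins with the same s/b end up together regardless of which category they came from, which is why a poor sorting can only lose sensitivity.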
6
Simple Example: Caltech Analysis
|m-mH|<2; kinematic cuts; isolation cuts.
Gray Box: Sort signal and background events into bins in the same way. Use mass, isolation, kinematics and shower size to sort events into bins.
Higgs mass hypothesis.
Bins: accepted and trash. The trash bin is probably off the left of the graph.
[Figure: signal and background vs. log(s/b)]
7
Using Categories and the Mass
Divide events into 4 barrel and 4 endcap categories using r9.
Compute s/b from mass distributions (maybe make some cuts).
Higgs mass hypothesis.
Use mass and r9 to sort data into bins.
Significant improvement in analysis performance.
[Figure: events vs. log(s/b); s/b not normalized to cross sections]
8
Combining Mass and Other Information
Neural Net: kinematic and isolation information.
Fit functions: ln(s/b) from γγ mass and ln(s/b) from Neural Net, from fits to s and b.
[Figure: ln(s/b) from γγ mass X ln(s/b) from Neural Net = log likelihood ratio per event]
Nota bene: The resulting plot is just a histogram of events. Each event is plotted in a bin computed from the log likelihood ratio. Signal and background are treated identically. All improvements due to combination must be real.
Rapidly falling background distribution in region with significant signal.
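A minimal sketch of the per-event combination, assuming the mass and the Neural Net output are independent, so that the two s/b ratios multiply and their logs add. The fit functions below are toy stand-ins (a Gaussian signal peak over a flat background, and linear NN-output densities), not the fits used in the talk; all names and parameter values are hypothetical.

```python
import math

def ln_sb_mass(m, m_h=120.0, sigma=1.0, window=20.0):
    """ln(s/b) from mass: toy Gaussian signal density over a flat background density."""
    s = math.exp(-0.5 * ((m - m_h) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    b = 1.0 / window
    return math.log(s / b)

def ln_sb_nn(nn, eps=1e-6):
    """ln(s/b) from NN output: toy densities s(nn) = 2*nn and b(nn) = 2*(1 - nn) on [0, 1]."""
    nn = min(max(nn, eps), 1.0 - eps)  # keep the log finite at the edges
    return math.log(nn / (1.0 - nn))

def event_llr(m, nn):
    """Log likelihood ratio per event: sum of the two ln(s/b) terms (independence assumed)."""
    return ln_sb_mass(m) + ln_sb_nn(nn)

on_peak = event_llr(120.0, 0.9)   # signal-like in both mass and NN output
off_peak = event_llr(125.0, 0.9)  # same NN output, far from the mass hypothesis
```

Each event, signal or background, is then histogrammed at its own `event_llr` value.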
9
Jet-jet Performance Curves
[Figure: background fraction vs. signal efficiency]
10
Summary of Analysis Methods
Very Important Point: Sorting can easily be sub-optimal, but it shouldn’t lead to a result which is “too good”.
Gray Box: Sort signal and background events into bins in the same way.
Higgs mass hypothesis.
Background normalization and characteristics from data.
[Figure: signal and background vs. log(s/b); the various colors represent different categories]
Low MC statistics can lead to a result which is “too good” due to:
downward fluctuation of bkgd in a bin
over-training of NN
my trick to improve statistics.
For jet-jet, MC weight is about 700 times signal MC weight and about 19 times a data event at inverse fb!
Compute CL using random (s+b) and b-only trials from the expectation in the histogram.
11
The Current Performance
Luminosity in inverse fb needed for a median discovery CL equivalent to 5 sigma:
Irreducible background: 0.5 fb⁻¹
γ+jet background: 2.1 fb⁻¹
Jet-jet background: 2.0 fb⁻¹
We worry that this performance is “too good”.
Are background simulations correct enough? No problem when we have data.
Are we being tricked by low statistics of MC? Have studied this more carefully and taken a conservative approach.
Clearly the performance is much better than previous simpler analyses.
12
Analysis Details
13
r9 and Categories
r9 = (Sum of 9)/ESC (uncorrected)
Selects unconverted or late converting photons.
Better mass resolution.
Also discriminates against jets.
[Figure: r9 distributions for signal and background, with the categories and the unconverted photons marked]
14
Categories
Cuts, in category order: r9min>0.95; 0.95>r9min>0.90; 0.90>r9min>0.75; 0.75>r9min
super 0: Both Photons in Barrel. Categories 1, 2, 3, 4; events 1028, 360, 363, 342
super 5: At least 1 Photon in Endcap. Categories 6, 7, 8, 9; events 1258, 374, 275
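The category assignment can be sketched as below. It assumes that r9min is the smaller of the two photons' r9 values and that an event exactly on a threshold falls into the lower-r9 category; neither detail is stated on the slide, and the function names are hypothetical.

```python
def r9(e_3x3, e_sc):
    """r9 = (sum of 9 crystals) / (uncorrected supercluster energy)."""
    return e_3x3 / e_sc

def category(r9_a, r9_b, both_in_barrel):
    """Return category 1-4 (both photons in barrel) or 6-9 (at least one in endcap)."""
    r9_min = min(r9_a, r9_b)  # assumption: r9min is the worse of the two photons
    if r9_min > 0.95:
        k = 1
    elif r9_min > 0.90:
        k = 2
    elif r9_min > 0.75:
        k = 3
    else:
        k = 4
    # Super-category 0 holds categories 1-4, super-category 5 holds 6-9.
    return k if both_in_barrel else k + 5

cat_barrel = category(0.97, 0.99, both_in_barrel=True)
cat_endcap = category(0.92, 0.98, both_in_barrel=False)
```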
15
Reconstructed Mass r9>0.95, Barrel
Linear correction of S25 for S1/S9.
Includes most unconverted photons plus many others which also have good resolution.
16
Reconstructed Mass 0.90<r9<0.95
Peak resolution deteriorates a bit and tail develops.
17
Mass r9>0.95; at least 1 γ in Endcap
18
Mass r9<0.90; Endcap
The worst category has a mass resolution about 3 times wider than the best.
19
Choosing the Best Vertex
Marco improved the performance of the vertex finder somewhat by summing the PTs of tracks coming from a vertex.
We now calculate mass using the found vertex.
[Figure: (found vertex z) - (true z), in cm]
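A minimal sketch of the two steps, under stated assumptions: the vertex with the largest summed track PT is taken as the found vertex, and each photon direction runs from that vertex (on the beam line, at height z) to the cluster position in the ECAL. The data layout and function names are hypothetical; the actual vertex finder is not described in the talk.

```python
import math

def choose_vertex(vertices):
    """Pick the vertex whose associated tracks have the largest summed pT."""
    return max(vertices, key=lambda v: sum(v["track_pts"]))

def diphoton_mass(e1, pos1, e2, pos2, z_vtx):
    """Invariant mass of two photons, with directions taken from (0, 0, z_vtx)."""
    d1 = (pos1[0], pos1[1], pos1[2] - z_vtx)
    d2 = (pos2[0], pos2[1], pos2[2] - z_vtx)
    dot = sum(a * b for a, b in zip(d1, d2))
    norm = math.sqrt(sum(a * a for a in d1)) * math.sqrt(sum(b * b for b in d2))
    cos_angle = dot / norm
    # m^2 = 2 E1 E2 (1 - cos(opening angle)) for massless photons.
    return math.sqrt(max(0.0, 2.0 * e1 * e2 * (1.0 - cos_angle)))

# Toy event: two candidate vertices (z in cm, pT in GeV), two back-to-back clusters.
vertices = [
    {"z": -3.1, "track_pts": [0.8, 1.1, 0.6]},
    {"z": 4.2, "track_pts": [12.0, 7.5, 3.3]},
]
best = choose_vertex(vertices)
m_found = diphoton_mass(60.0, (130.0, 0.0, 0.0), 60.0, (-130.0, 0.0, 0.0), best["z"])
m_origin = diphoton_mass(60.0, (130.0, 0.0, 0.0), 60.0, (-130.0, 0.0, 0.0), 0.0)
```

The difference between `m_found` and `m_origin` shows how a shift in vertex z changes the opening angle and therefore the reconstructed mass.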
20
The Neural Net Input Variables
We really have not yet optimized the performance by trying many sets of inputs.
This is pretty much what we thought a priori would be a good set of variables.
Jet-jet and γ-jet: SCiso1, SCiso2, ET1/(ET1+ET2), ET2, etn1[4], etn2[4], |η1-η2|
Irreducible background: ET1, ET2, ESC1, ESC2, |η1-η2|
Training varies.
21
Training the Neural Net for γ+jets or jet-jet
Train NN in 2 super-categories:
Both photons in barrel
At least 1 in endcap (seems to need larger background sample)
Keep best validation
Stability for now cycles
Fit NN in 2×4=8 r9 categories.
22
Training and Fitting NN for jj
Fits are used for binning; individual events are binned.
[Figures: jet-jet barrel super-category; jet-jet cat1, 6 events in signal region]
23
Training and Fitting the NN for γ+j
[Figures: γ-jet barrel super-category; γ-jet cat1]
24
NN Inputs: ET1 and ET2
[Figure: distributions for jets, γ+jet, γγ and signal]
25
NN Inputs: ET1/(ET1+ET2)
[Figure: distributions for jets, γ+jet, γγ and signal]
26
NN Inputs: |η1-η2|
[Figure: distributions for jets, γ+jet, γγ and signal]
27
NN Inputs: SuperCluster Energies
[Figures: distributions for jets, γ+jet, γγ and signal]
28
NN Inputs: Tracks in a cone
[Figures: distributions for jets, γ+jet, γγ and signal]
29
NN Inputs: ECAL Isolation
[Figures: distributions for jets, γ+jet, γγ and signal]
30
Mass Hypothesis Independence
Train NN with three signal files merged as input:
120, 130, 140 GeV signal
background masses in the range GeV.
Analyze for Higgs mass hypothesis of 120 GeV, binning just the 120 GeV signal file.
All mass dependence is in the mass variable.
NN performance does not seem to depend on mass.
Varying Higgs mass hypothesis appears to be very simple.
We could now just scan mass hypothesis in the range from using this training.
31
Improving Background MC “Statistics”
Use difference between measured mass and generated mass for signal.
Generated mass represents Higgs mass hypothesis.
Background mass is smooth and uncorrelated with NN variables.
Smear background mass to improve MC estimate.
Each bkgd event is plotted 21 times between m-10 and m+10 GeV.
Larger sample of events on mass peak.
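The smearing trick might look like the sketch below. It assumes the 21 copies are placed at uniform 1 GeV steps from m-10 to m+10 GeV and that each copy carries 1/21 of the original event weight so the total normalization is unchanged; the slide states only the number of copies and the range. The function name is hypothetical.

```python
import numpy as np

def smear_background_mass(masses, weights, half_width=10.0, n_copies=21):
    """Replicate each background event at n_copies masses in [m - half_width, m + half_width]."""
    masses = np.asarray(masses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    offsets = np.linspace(-half_width, half_width, n_copies)  # assumed uniform steps
    smeared = (masses[:, None] + offsets[None, :]).ravel()
    # Assumption: split the weight evenly so the summed weight is preserved.
    smeared_w = np.repeat(weights / n_copies, n_copies)
    return smeared, smeared_w

# Two toy jet-jet events of weight 9.5 each, neither on the 120 GeV peak.
m_smeared, w_smeared = smear_background_mass([110.0, 125.0], [9.5, 9.5])
near_peak = int(np.sum(np.abs(m_smeared - 120.0) < 1.5))
```

This is only valid to the extent that the background mass really is smooth and uncorrelated with the NN variables, as the slide states.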
32
The MC Samples
Sample         generated             simulated   after cuts   weight (fb/ev)   Weight
Box, Born      1.46×10⁶, 1.74×10⁶    same        292K         0.025            0.1
γ+jet          1.3×10⁸               1.0×10⁶     106K         0.7              2.8
Jets           10¹⁰                  4.2×10⁶     45K          9.5              38
Signal (120)   6000                              4374         0.015            0.06
We should attempt to make these 1 or less.
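As a small worked example of what these weights mean: a weight in fb per MC event, multiplied by an integrated luminosity in fb⁻¹, gives the number of data events that one MC event stands for. In the table the last column is 4 times the fb/ev column, which would correspond to 4 fb⁻¹; that reading is an inference from the numbers, not something the slide states. The helper name is hypothetical.

```python
# Weights in fb per MC event, copied from the table.
weight_fb_per_event = {
    "Box, Born": 0.025,
    "γ+jet": 0.7,
    "Jets": 9.5,
    "Signal (120)": 0.015,
}

def data_events_per_mc_event(weight_fb, lumi_fb_inv):
    """Number of data events represented by one MC event at the given luminosity."""
    return weight_fb * lumi_fb_inv

# Assumption: 4 fb^-1 is chosen because it reproduces the table's last column.
at_4 = {name: data_events_per_mc_event(w, 4.0) for name, w in weight_fb_per_event.items()}
jets_over_signal = weight_fb_per_event["Jets"] / weight_fb_per_event["Signal (120)"]
```

A weight well above 1 means a single MC event fluctuating in or out of a bin moves the background estimate by more than one data event, which is the concern behind the target of 1 or less.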
33
γ+jets Background
34
jet-jet Background
[Figure, labeled “old”]
35
Irreducible Background
36
Rebinning or Smoothing jet-jet
Rebin so that every bin has at least 10 MC entries.
Smooth and extrapolate.
Rebinning or smoothing occurs in histograms for 8 categories (which tend to be smooth and continuous).
These are then merged into one plot using s/b.
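One way to do the rebinning is a simple greedy merge of adjacent bins, sketched below. The merge direction (left to right) and the handling of a leftover tail (folded into the last merged bin) are assumptions; the slide gives only the 10-entry requirement. The function name is hypothetical.

```python
def rebin_min_entries(counts, edges, min_entries=10):
    """Merge adjacent bins until every merged bin has at least min_entries raw MC entries.

    counts: raw (unweighted) MC entries per bin; edges: len(counts) + 1 bin edges.
    """
    new_counts, new_edges = [], [edges[0]]
    acc = 0
    for i, c in enumerate(counts):
        acc += c
        if acc >= min_entries:
            new_counts.append(acc)
            new_edges.append(edges[i + 1])
            acc = 0
    # Leftover bins at the end that never reached min_entries.
    if new_edges[-1] != edges[-1]:
        if new_counts:
            new_counts[-1] += acc       # fold the tail into the last merged bin
            new_edges[-1] = edges[-1]
        else:
            new_counts.append(acc)      # whole histogram is below the threshold
            new_edges.append(edges[-1])
    return new_counts, new_edges

merged_counts, merged_edges = rebin_min_entries([40, 12, 6, 3, 2, 1], [0, 1, 2, 3, 4, 5, 6])
```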
37
CL for jet-jet Background
Generate random “background only” and “signal + background” trials according to expectation.
Use log likelihood estimator to order trials.
Calculate median CLs.
5 sigma discovery is usually easier than 5 sigma exclusion.
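The trial procedure can be sketched with a toy Monte Carlo. This sketch assumes independent Poisson counts in each bin and uses the per-bin weight ln(1 + s/b) as the log likelihood estimator, a common choice for a binned Poisson likelihood ratio; the talk does not specify the estimator, and the function name and inputs are hypothetical. It returns the fraction of background-only trials that look at least as signal-like as the median signal + background trial.

```python
import numpy as np

def median_discovery_p_value(s, b, n_trials=20000, seed=1):
    """Toy-MC p-value of the background-only hypothesis at the median (s+b) outcome."""
    s = np.asarray(s, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.log1p(s / b)  # per-bin weight ln(1 + s/b); constant terms do not affect ordering
    rng = np.random.default_rng(seed)
    # One row per trial, one column per bin of the merged histogram.
    q_b = rng.poisson(b, size=(n_trials, b.size)) @ w
    q_sb = rng.poisson(s + b, size=(n_trials, b.size)) @ w
    q_median = np.median(q_sb)
    return float(np.mean(q_b >= q_median))

p_small = median_discovery_p_value([5.0, 3.0, 1.0], [50.0, 10.0, 1.0])
p_large = median_discovery_p_value([20.0, 12.0, 4.0], [50.0, 10.0, 1.0])
```

Resolving a 5 sigma p-value (about 3×10⁻⁷) by brute force needs far more background-only trials than this sketch generates, so `p_large` here only shows that the value falls below what 20000 trials can resolve.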
38
CL for γ+jet Background
γ+jet and jet-jet backgrounds behave similarly.
Luck will play a role in how quickly an experiment can find the Higgs.
39
CL for Irreducible Background
[Figure annotation: Poisson region]
40
Do you feel lucky?
While the median CL is a reasonable thing to quote, a discovery can be made much more quickly if we get a “lucky” dataset.
It could also come much more slowly.
One sigma fluctuation can change significance drastically.
Different analyses can vary on the same data.
41
Some Tests I Have Tried
Let background normalization float separately for NN<0.5 and NN>0.5.
  No big change in standard results.
Use only two categories: barrel and endcap.
  Required lumi increases by about 80%.
Use only mass distribution with a cut on NN output at 0.5.
  Required lumi increases 100% for irreducible.
  Stays about the same for other two!
Using 4 GeV mass window, no categories, and counting events requires 16 fb⁻¹ for jet-jet.
  Consistent with earlier analysis.
42
Not Optimized
We have NOT optimized this analysis yet.
First guess at NN inputs
Some trying variables at NN output level
No optimized cuts
No variation in training to optimize result
No optimization of categories
Most choices have been based on experience and intuition so far.
Primarily aiming at something that is technically working and conservative at this point.
43
Combine the Backgrounds?
We have so far resisted the urge to combine the backgrounds to get an overall performance.
A little more stability is needed.
To first order, one would add the luminosities required for each background.
This assumes no pathologies.
The two big backgrounds will be very easy to add.
Irreducible background is different.
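Why the required luminosities add to first order can be seen in a simple counting approximation. This is a sketch under an assumption that is not the talk's method: a single fixed selection with significance Z = s/√b, where S is the signal rate and B_i the rate of background i, both per unit luminosity L. L_i is the luminosity needed to reach significance Z with background i alone.

```latex
% Assumption: counting significance Z = s / sqrt(b) in one fixed selection.
\begin{align}
Z &= \frac{S\,L_i}{\sqrt{B_i\,L_i}}
  \;\Rightarrow\; L_i = \frac{Z^2 B_i}{S^2}, \\
L_{\text{tot}} &= \frac{Z^2 \sum_i B_i}{S^2} = \sum_i L_i .
\end{align}
```

With the luminosities quoted earlier in the talk (0.5, 2.1 and 2.0 fb⁻¹), this first-order sum would come to about 4.6 fb⁻¹. The talk itself does not quote a combined figure, and the binned log(s/b) analysis need not follow the counting approximation exactly.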
44
Conclusions
It appears this type of analysis must be tried at startup of LHC.
If it works, H→γγ will dominate low mass Higgs search.
We can train on data and signal MC, once we have enough data.
Still need background MC to trust training.
Questions to answer:
How good should the calibration be?
What must we understand about background?
What, other than mass resolution, must be optimized?
How big must the background simulations be?
Can we use tracker to reliably recover some conversions?
What is optimal HLT?
45
What’s Next
New pass through MC input files without cuts.
Clean up analysis.
Combine the backgrounds.
We need new events and effectively better statistics.
Accurate background MC is important for now.
I’d suggest we aim at (weight ≤ 1) for all backgrounds.
Can we do this more efficiently?
Need we do it more reliably?
Are there generator upgrades we can make quickly?
46
New and Disturbingly Good Results