Project on H→ττ and multivariate methods iSTEP 2016
Motivations: Define a multivariate classifier to separate signal from background; improve the procedure that defines the selection region based on the classifier, and determine the expected discovery significance; extend the analysis to multiple bins to gain further significance.
Strategies: Fisher discriminant; BDT (boosted decision tree); MLP (multilayer perceptron); binned analysis.
Fisher
BDT
MLP
Analysis: Binned analysis. Choose some number of bins (~40) for the histogram of the test statistic t. In bin i, find the expected numbers of signal and background events, si and bi. Then write the likelihood function for the strength parameter μ given the data n1, ..., nN.
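For reference, the standard construction for a binned analysis of this kind treats the bin contents as independent Poisson counts with mean μ si + bi. The block below writes that out. The symbols s_tot, b_tot, f(t|s), f(t|b) and the bin edges are assumed here for illustration and may differ from the notation used in the talk.

```latex
% Expected signal and background in bin i, with assumed bin edges t_i^{lo}, t_i^{hi}
s_i = s_{\mathrm{tot}} \int_{t_i^{\mathrm{lo}}}^{t_i^{\mathrm{hi}}} f(t \mid s)\, dt ,
\qquad
b_i = b_{\mathrm{tot}} \int_{t_i^{\mathrm{lo}}}^{t_i^{\mathrm{hi}}} f(t \mid b)\, dt

% Likelihood for the strength parameter \mu given data n_1, ..., n_N
L(\mu) = \prod_{i=1}^{N} \frac{(\mu s_i + b_i)^{n_i}}{n_i!}\, e^{-(\mu s_i + b_i)}
```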
Analysis: Show that ln L(μ) can be written as a sum of μ-dependent terms plus C, where C represents terms that do not depend on μ. Therefore, to find the estimator μ̂, we set the derivative of ln L(μ) with respect to μ to zero and solve for μ.
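A minimal sketch of this step, assuming the Poisson form ln L(μ) = Σ [ni ln(μ si + bi) − μ si] + C, so that the equation to solve is Σ [ni si / (μ si + bi) − si] = 0. This is not the fitPar.cc routine used in the project. The function names are hypothetical, every bi is assumed positive, and the estimate is clipped at μ = 0, which is enough for a test of μ = 0.

```python
def score(mu, n, s, b):
    """Derivative of ln L(mu): sum_i [ n_i s_i / (mu s_i + b_i) - s_i ]."""
    return sum(ni * si / (mu * si + bi) - si for ni, si, bi in zip(n, s, b))


def fit_mu(n, s, b, tol=1e-10):
    """Solve score(mu) = 0 for mu >= 0 by bisection (assumes all b_i > 0).

    For mu >= 0 the score decreases monotonically, so there is at most one
    root. If the score is already non-positive at mu = 0, return 0.0.
    """
    if score(0.0, n, s, b) <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while score(hi, n, s, b) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, n, s, b) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical three-bin example: data equal to s + b should give mu-hat = 1.
s = [1.0, 2.0, 5.0]
b = [10.0, 8.0, 3.0]
n = [si + bi for si, bi in zip(s, b)]
mu_hat = fit_mu(n, s, b)
```

Bisection is used here only because it needs no external libraries; any one-dimensional root finder would serve.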
Analysis: The statistic q0 is used to test μ = 0. The discovery significance (the significance of the test of μ = 0) is estimated from the formula Z = √q0.
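For reference, the profile-likelihood statistic commonly used for a discovery test is written out below. It is assumed here to be the definition of q0 intended on this slide, with μ̂ the maximum-likelihood estimator of μ.

```latex
q_0 =
\begin{cases}
  -2 \ln \dfrac{L(0)}{L(\hat{\mu})} , & \hat{\mu} \ge 0 , \\[2ex]
  0 , & \hat{\mu} < 0 ,
\end{cases}
\qquad
Z = \sqrt{q_0}
```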
Analysis: Leave out the least important input variables when computing each event's value of the statistic t:
DER_prodeta_jet_jet: The product of the pseudorapidities of the two jets (undefined if PRI_jet_num ≤ 1).
DER_lep_eta_centrality: The centrality of the pseudorapidity of the lepton with regard to the two jets.
PRI_jet_leading_phi: The azimuth angle φ of the leading jet (undefined if PRI_jet_num = 0).
PRI_jet_subleading_pt: The transverse momentum of the subleading jet, that is, the jet with the second-largest transverse momentum (undefined if PRI_jet_num ≤ 1).
PRI_jet_subleading_eta: The pseudorapidity η of the subleading jet (undefined if PRI_jet_num ≤ 1).
PRI_jet_subleading_phi: The azimuth angle φ of the subleading jet (undefined if PRI_jet_num ≤ 1).
Conclusions: Optimize the value of tcut to obtain the maximum expected discovery significance. Fisher: tcut = 0.2, Z0 = 2.09745, ZA = 2.09434. BDT: tcut = 0.3, Z0 = 3.26939, ZA = 3.23772.
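The cut optimization can be sketched as a simple scan over candidate values of tcut. The sketch assumes that events with t ≥ tcut are selected and that the significance is the usual Asimov approximation for a counting experiment, Z = √(2((s + b) ln(1 + s/b) − s)). Whether this matches the ZA values quoted above exactly is an assumption, and all function names and toy inputs are hypothetical.

```python
import math


def z_asimov(s, b):
    """Asimov approximation to the median discovery significance (needs b > 0)."""
    if b <= 0.0:
        raise ValueError("background expectation must be positive")
    q = 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return math.sqrt(max(q, 0.0))  # guard against tiny negative round-off


def yield_above(scores, weights, t_cut):
    """Weighted number of events with classifier output t >= t_cut."""
    return sum(w for t, w in zip(scores, weights) if t >= t_cut)


def scan_cuts(sig_scores, sig_weights, bkg_scores, bkg_weights, cuts):
    """Return (best_t_cut, best_Z) over the candidate cuts, or None if none is usable."""
    best = None
    for t_cut in cuts:
        s = yield_above(sig_scores, sig_weights, t_cut)
        b = yield_above(bkg_scores, bkg_weights, t_cut)
        if b <= 0.0:
            continue  # Z is undefined without background in this sketch
        z = z_asimov(s, b)
        if best is None or z > best[1]:
            best = (t_cut, z)
    return best


# Hypothetical toy inputs: four signal and four background events, unit weights.
sig_t, sig_w = [0.9, 0.8, 0.7, 0.2], [1.0, 1.0, 1.0, 1.0]
bkg_t, bkg_w = [0.1, 0.2, 0.3, 0.8], [1.0, 1.0, 1.0, 1.0]
best = scan_cuts(sig_t, sig_w, bkg_t, bkg_w, cuts=[0.0, 0.5, 0.75])
```

In a real analysis the scores and weights would come from the classifier output on the simulated signal and background samples, and the cut grid would be much finer.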
Conclusions: Use the routine fitPar.cc to solve the likelihood equation for μ̂. With the Asimov data set we find μ̂ = 1; we then evaluate q0 on this data set and use the formula Z = √q0 to estimate the median discovery significance. Fisher: q0 = 5.7394, Z = 2.39571. BDT: q0 = 14.2301, Z = 3.77241.
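For the Asimov data set ni = si + bi the fit gives μ̂ = 1, so under the Poisson likelihood assumed in this sketch q0 reduces to the closed form 2 Σ [(si + bi) ln(1 + si/bi) − si]. The code below evaluates it. The function names and bin contents are hypothetical and are not the histograms behind the numbers quoted above.

```python
import math


def q0_asimov(s, b):
    """q0 on the Asimov data set n_i = s_i + b_i (mu-hat = 1), all b_i > 0."""
    return 2.0 * sum(
        (si + bi) * math.log(1.0 + si / bi) - si for si, bi in zip(s, b)
    )


def median_significance(s, b):
    """Median discovery significance Z = sqrt(q0) from the Asimov data set."""
    return math.sqrt(max(q0_asimov(s, b), 0.0))


# Same totals (s = 10, b = 100), one bin versus two bins of different purity.
z_one_bin = median_significance([10.0], [100.0])
z_two_bins = median_significance([9.0, 1.0], [10.0, 90.0])
```

The two-bin value is larger than the one-bin value for the same totals, which is the reason for going from a single cut to a binned analysis: bins with a high signal-to-background ratio are no longer diluted.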
Outlook: Binned analysis: numbers, weights. GPU parallel computing to improve efficiency. Simulated data from experiment. SVM. Generate ni ~ Poisson(bi) to find the p-value and estimate the discovery significance.
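The toy Monte Carlo idea in the last item could look roughly like the sketch below: generate background-only data ni ~ Poisson(bi), compute q0 for each toy, and take the fraction of toys with q0 at or above the observed value as the p-value. It assumes the Poisson likelihood with strength parameter μ, a fit clipped at μ = 0, and Knuth's sampler, which is adequate only for moderate bi. All names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist


def poisson(lam, rng):
    """Knuth's multiplicative Poisson sampler (use only for moderate lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def fit_mu(n, s, b, tol=1e-8):
    """Maximum-likelihood mu, clipped at 0, found by bisection on d ln L / d mu."""
    def score(mu):
        return sum(ni * si / (mu * si + bi) - si for ni, si, bi in zip(n, s, b))
    if score(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while score(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def q0(n, s, b):
    """q0 = -2 ln[L(0) / L(mu-hat)], set to 0 when mu-hat is not positive."""
    mu = fit_mu(n, s, b)
    if mu <= 0.0:
        return 0.0
    return 2.0 * sum(
        ni * math.log(1.0 + mu * si / bi) - mu * si for ni, si, bi in zip(n, s, b)
    )


def toy_p_value(q_obs, s, b, n_toys=2000, seed=1):
    """Fraction of background-only toys with q0 >= q_obs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_toys):
        n = [poisson(bi, rng) for bi in b]
        if q0(n, s, b) >= q_obs:
            hits += 1
    return hits / n_toys


def z_from_p(p):
    """Convert a one-sided p-value (0 < p < 1) to a Gaussian significance."""
    return NormalDist().inv_cdf(1.0 - p)


# Hypothetical two-bin example.
s, b = [2.0, 5.0], [20.0, 10.0]
p = toy_p_value(4.0, s, b)
```

A significance near 3 or more corresponds to a very small p-value, so the number of toys needed grows quickly; that is where the asymptotic formula Z = √q0 or parallel computing becomes useful.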
THANKS! Zheng Yi, Xu Hongge, Li Ruibo, Ding Wei, Hu Boshen