Background Rejection Activities in Italy
Gamma-ray Large Area Space Telescope
Francesco Longo, University and INFN, Trieste, Italy (francesco.longo@ts.infn.it)
On behalf of the “North-East” INFN group, with thanks in particular to R. Rando, O. Tibolla, Y. Lei, G. Busetto and P. Azzi (University and INFN Padova)
Bkg Rejection activity
- Starting from “simple cuts”
- Collection of background-rejection documentation
- Understanding the IM cuts (DC1 variables and cuts)
- Classification Trees in R
- Recent developments
- Ready for new data
More info: http://sirad.pd.infn.it/glast/ground_sw/dc2.html
“By hand” cuts
- First iteration using the already suggested cuts
- Look into the Merit tuple to find the rejection efficiency and gamma acceptance
- Reference docs (Atwood): “Instrument response studies”, “Post-Rome background rejection”
- Datasets: DC1 prep background ntuples; DC1 prep gamma Merit ntuple
- Divide events by particle type: gamma (signal), gamma (bkg), electron+positron, protons
Calorimeter categories
Definitions:
- No Cal:   CalEnergySum < 5 || CalTotRLn ≤ 2
- Low Cal:  5 < CalEnergySum ≤ 350 && CalTotRLn > 2
- Med Cal:  350 < CalEnergySum ≤ 3500 && CalTotRLn > 2
- High Cal: CalEnergySum > 3500 && CalTotRLn > 2
“Good energy” events:
- good_energy = (EvtEnergySumOpt - MCEnergy) / MCEnergy
- |good_energy| ≤ 35%
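A minimal R sketch of these selections, assuming the Merit ntuple has been exported to a data frame named merit with the columns used above (CalEnergySum, CalTotRLn, EvtEnergySumOpt, MCEnergy); this is an illustration, not the original analysis code:

```r
## Illustrative sketch: tag each event with its calorimeter category and
## a "good energy" flag (|dE/E| <= 35%). `merit` is an assumed data frame.
cal_category <- function(d) {
  ifelse(d$CalEnergySum < 5 | d$CalTotRLn <= 2, "NoCal",
  ifelse(d$CalEnergySum <= 350,                 "LowCal",
  ifelse(d$CalEnergySum <= 3500,                "MedCal", "HighCal")))
}

merit$calCat     <- factor(cal_category(merit),
                           levels = c("NoCal", "LowCal", "MedCal", "HighCal"))
merit$goodEnergy <- abs((merit$EvtEnergySumOpt - merit$MCEnergy) / merit$MCEnergy) <= 0.35

## fraction of good-energy events in each category (cf. the tables below)
tapply(merit$goodEnergy, merit$calCat, mean)
```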
Signal events split

            Total     No Cal            Low Cal          Med Cal           High Cal
  g (all)   722594    314226 (43.4%)    122081 (17%)     114224 (15.8%)    172063 (23.8%)
  g (good)  -         -                 81139 (66.4%)    94869 (83%)       142082

(percentages in the g (good) row are relative to g (all) in the same category)
Background events split

  ALL       Total     No Cal           Low Cal          Med Cal          High Cal
  e         746232    649293 (87%)     93933 (13%)      1139 (0.2%)      1867 (0.3%)
  p         1165277   608568 (52%)     284425 (24%)     178084 (15%)     94200 (8%)
  g         288491    276358 (96%)     10066 (3.5%)     1227 (0.4%)      840

  GOOD                                 Low Cal          Med Cal          High Cal
  e                                    60093 (64%)      854 (75%)        1365 (73%)
  p                                    35407 (12%)      10914 (5.7%)     21033 (22%)
  g                                    5185 (51.5%)     797 (65%)        459 (55%)

(percentages in the GOOD table are relative to ALL in the same category)
[Figure: MCEnergy distributions for gamma events in the Low Cal and High Cal categories, split into ALL, GOOD and BAD energy reconstruction]
“Tree” cuts
- Use the IM classification XML file
- Develop a “Node” structure by parsing the IM XML output (see the sketch below)
- Check the cuts
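A sketch of the “Node” parsing idea in R with the xml2 package; the element and attribute names (Node, id, predicate) and the file name are placeholders, since the actual IM XML schema is not reproduced here:

```r
## Hypothetical sketch: read an IM-style XML tree into nested R lists.
## <Node id="..." predicate="..."> ... </Node> is an assumed layout,
## not the real IM output format.
library(xml2)

parse_node <- function(x) {
  list(
    id        = xml_attr(x, "id"),
    predicate = xml_attr(x, "predicate"),
    children  = lapply(xml_find_all(x, "./Node"), parse_node)  # recurse into daughters
  )
}

doc  <- read_xml("im_classification.xml")   # placeholder file name
tree <- parse_node(xml_root(doc))
```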
CT approach
[Diagram: classification-tree structure; each node carries an ID and a predicate, terminal nodes return 0/1]
Signal events split

            Total      No Cal           Low Cal          Med Cal          High Cal
  g (all)   722,594    314,226          122,081          114,224          172,063
  g (good)  -          64,110 (20.4%)   86,190 (70.6%)   90,201 (79.0%)   138,793 (80.7%)

(percentages in the g (good) row are relative to g (all) in the same category)
Good Cal E [figure slide]
Starting with Classification Trees
- Use of the R program rpart (recursive partitioning)
- Searching to optimize “goodCal”
- At each step rpart reports the cost-complexity parameter of the tree, the number of splits, the relative error, and the error obtained from cross validation with its corresponding sigma (see the sketch below)
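A minimal rpart sketch for the “goodCal” optimisation; the data frame, the goodCal label and the predictor list are illustrative, not the actual training configuration:

```r
## Illustrative sketch: classification tree for a binary goodCal target.
library(rpart)

merit$goodCal <- factor(merit$goodEnergy)   # assumed 0/1 target (flag from the sketch above)

fit <- rpart(goodCal ~ CalEnergySum + CalTotRLn,
             data   = merit,
             method = "class",   # classification
             cp     = 0.001)     # grow a deep tree, prune afterwards

## Cost-complexity table: CP, number of splits, relative error,
## cross-validation error (xerror) and its standard deviation (xstd)
printcp(fit)

## Prune back to the subtree with the smallest cross-validation error
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)
```

The printcp output lists exactly the quantities mentioned above: cost-complexity, number of splits, relative error, and the cross-validation error with its sigma.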
Classification Trees with rpart [figure slides]
One Tree per E bin [figure slides] (see the sketch below)
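A sketch of the one-tree-per-energy-bin approach on the same illustrative data frame; the bin edges are placeholders, not the ones actually used:

```r
## Illustrative sketch: train an independent rpart tree in each energy bin.
library(rpart)

e_breaks   <- c(0, 100, 1000, 10000, Inf)                # MeV, placeholder bins
merit$Ebin <- cut(merit$EvtEnergySumOpt, breaks = e_breaks)

trees <- lapply(split(merit, merit$Ebin), function(d)
  rpart(goodCal ~ CalEnergySum + CalTotRLn, data = d, method = "class"))

names(trees)   # one fitted tree per energy bin, indexed by bin label
```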
Classification Trees with R [figure slide]
rpart Classification [figure slides]
Error Costs [figure slides] (see the loss-matrix sketch below)
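In rpart, asymmetric error costs can be introduced through a loss matrix passed via parms; a minimal sketch in which the cost values are illustrative, not those actually studied:

```r
## Illustrative sketch: make accepting a "bad" event 5x more costly than
## rejecting a "good" one. Rows = true class, columns = predicted class,
## ordered as levels(merit$goodCal) (assumed FALSE/TRUE, i.e. bad/good).
library(rpart)

loss <- matrix(c(0, 5,    # true bad : predicted bad / predicted good
                 1, 0),   # true good: predicted bad / predicted good
               nrow = 2, byrow = TRUE)

fit_cost <- rpart(goodCal ~ CalEnergySum + CalTotRLn,
                  data   = merit,
                  method = "class",
                  parms  = list(loss = loss))
```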
Random Forests [figure slides]
rForest (randomForest) package in R (see the sketch below)
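A minimal randomForest sketch on the same illustrative data; the package and functions are the standard CRAN randomForest ones, while the data frame, predictors and settings are placeholders:

```r
## Illustrative sketch: random forest for the same goodCal target.
library(randomForest)

rf <- randomForest(goodCal ~ CalEnergySum + CalTotRLn,
                   data       = merit,
                   ntree      = 500,     # number of trees in the forest
                   importance = TRUE)    # keep variable-importance measures

print(rf)         # out-of-bag error estimate and confusion matrix
varImpPlot(rf)    # which variables drive the classification
```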
Conclusions
- Work is progressing…
- Work on new variables has started
- More results will come…