Unconverted Gamma/Pi0 discrimination in ECAL Barrel with CMSSW_3_1_2
J. Tao, IHEP-Beijing
CMS-Beijing weekly meeting, Oct. 15, 2009
Introduction
MC samples: single γ and π0 samples in CMSSW_3_1_2
  6 PT bins: 15-25 GeV; 25-35 GeV; 35-45 GeV; 45-55 GeV; 55-65 GeV; 65-75 GeV
  η (geometry) range: -3.0 to 3.0; φ range: -π to π
  10k events for each PT bin
Unconverted Gamma/Pi0 discrimination in ECAL Barrel
  Unconverted photon selection based on ConvID track N
  6 PT bins of reconstructed photon: 20-25 GeV; 25-35 GeV; 35-45 GeV; 45-55 GeV; 55-65 GeV; 65-75 GeV
ROOT version: 5.24/00
TMVA analysis: BDT & MLP
Method:
  NN: 12 variables (previous presentation)
  Moments variables from CMS AN-2008/075: 3 variables
  Parametric EM shower fitting variables: 6 variables
  Combination: NN + Moments + Parametric
Moments variable 1 – second-order moments
[Figure: distribution of energy deposit for a π0 with … = 6 cm, overlaid with the major (solid line) and minor (dotted line) axes.]
The major and minor axes are defined as the eigenvectors of the covariance matrix of the cluster, where N is the number of crystals in the cluster, ηi and φi are the η, φ indices that identify the i-th crystal of the cluster, and each crystal enters with a weight wi. (The eigenvalues are λ1 and λ2 in the NN variables.)
Moments of order n are defined in terms of dMAJi (dMINi), the distance between the center of the i-th crystal and the major (minor) axis, expressed in terms of η, φ indices.
Moments with n > 2 are strongly correlated with the second-order moments and therefore add no discriminating power, so only MMAJ2 and MMIN2 are considered. In fact only MMAJ2 is powerful for the γ/π0 discrimination.
[Figure: distribution of the average value of MMAJ2 for γ and π0, as a function of EST.]
[Figure: distribution of the average value of MMIN2 for γ and π0, as a function of EST.]
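The eigen-decomposition behind the major and minor axes can be sketched as follows. This is a minimal illustration, not the code used for the slides: the function name `principal_spreads` is hypothetical, the per-crystal weights are taken as a plain input because their definition did not survive on this slide, and no claim is made about how the two eigenvalues map onto the names MMAJ2 and MMIN2.

```python
import math

def principal_spreads(etas, phis, weights):
    """Return (larger eigenvalue, smaller eigenvalue, major-axis angle).

    etas, phis: crystal (eta, phi) indices; weights: per-crystal weights
    (hypothetical input, the slide's weight definition is not assumed).
    The angle is measured from the eta axis, in radians.
    """
    wsum = sum(weights)
    if wsum <= 0:
        raise ValueError("weights must have a positive sum")
    # Weighted centroid in index space.
    mean_eta = sum(w * e for w, e in zip(weights, etas)) / wsum
    mean_phi = sum(w * p for w, p in zip(weights, phis)) / wsum
    # Elements of the weighted 2x2 covariance matrix.
    s_ee = sum(w * (e - mean_eta) ** 2 for w, e in zip(weights, etas)) / wsum
    s_pp = sum(w * (p - mean_phi) ** 2 for w, p in zip(weights, phis)) / wsum
    s_ep = sum(w * (e - mean_eta) * (p - mean_phi)
               for w, e, p in zip(weights, etas, phis)) / wsum
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = 0.5 * (s_ee + s_pp)
    root = math.hypot(0.5 * (s_ee - s_pp), s_ep)
    # Direction of the eigenvector belonging to the larger eigenvalue.
    angle = 0.5 * math.atan2(2.0 * s_ep, s_ee - s_pp)
    return half_trace + root, half_trace - root, angle
```

For a cluster elongated along η the larger eigenvalue carries the spread and the major-axis angle is zero; a π0 with two resolved photons is expected to look like this along the line joining the two deposits.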
Moments variable 2 – Lateral moment LAT
LAT is defined in terms of r2 and of E1 and E2, the energies of the two most energetic crystals of the cluster.
The expressions use ri, the radius vector of the i-th crystal, r, the radius vector of the cluster centroid, and rM, the Molière radius of the electromagnetic calorimeter (2.2 cm for PbWO4). Note that the two most energetic crystals are left out of the sum in r2.
[Figure: distribution of LAT]
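A minimal sketch of a lateral-moment calculation, assuming the commonly used form LAT = r2 / (r2 + rM^2 (E1 + E2)) with r2 = sum over i >= 3 of Ei |ri - r|^2. The equations themselves are not recoverable from this slide, so this form, the energy-weighted centroid, and the name `lateral_moment` are all assumptions made for illustration.

```python
def lateral_moment(crystals, r_moliere=2.2):
    """Assumed LAT form: r2 / (r2 + rM^2 * (E1 + E2)).

    crystals: list of (energy, x, y) with positions in cm.
    r_moliere: Moliere radius in cm (2.2 cm for PbWO4, as quoted above).
    """
    if len(crystals) < 3:
        # With only the two leading crystals the sum in r2 is empty.
        return 0.0
    ordered = sorted(crystals, key=lambda c: c[0], reverse=True)
    e_tot = sum(e for e, _, _ in ordered)
    # Assumption: the cluster centroid is the linear energy-weighted mean.
    cx = sum(e * x for e, x, _ in ordered) / e_tot
    cy = sum(e * y for e, _, y in ordered) / e_tot
    # The two most energetic crystals are left out of the sum.
    r2 = sum(e * ((x - cx) ** 2 + (y - cy) ** 2) for e, x, y in ordered[2:])
    e1, e2 = ordered[0][0], ordered[1][0]
    return r2 / (r2 + r_moliere ** 2 * (e1 + e2))
```

With this form LAT lies between 0 and 1 and grows as more energy sits far from the centroid, which is the behaviour expected for the broader π0 clusters.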
Moments variable 3 – Pseudo-Zernike moment
The pseudo-Zernike moments Anm are defined in terms of two integers n and m and of the complex polynomials V*nm(ρi, φi), expressed in polar coordinates.
The cluster shape variable PZM used in the analysis is the norm of the A20 moment, PZM = |A20|.
[Figure: distribution of PZM]
TMVA analysis with Moments variables
[Figure panels: PT20-25GeV, PT20-25GeV, PT25-35GeV]
--- TFHandler_Factory : Ranking input variables...
--- IdTransformation  : Ranking result (top variable is best ranked)
--- IdTransformation  : ---------------------------------------
--- IdTransformation  : Rank : Variable       : Separation
--- IdTransformation  :    1 : NN_UncEB_Mmaj2 : 2.592e-01
--- IdTransformation  :    2 : NN_UncEB_LAT   : 8.184e-02
--- IdTransformation  :    3 : NN_UncEB_PZM   : 6.097e-02
--- IdTransformation  :    4 : NN_UncEB_Mmin2 : 1.055e-02
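The "Separation" column follows the TMVA definition, <S^2> = 1/2 ∫ (yS(y) - yB(y))^2 / (yS(y) + yB(y)) dy, where yS and yB are the normalized signal and background distributions of a variable. It is 0 for identical shapes and 1 for shapes with no overlap. The sketch below evaluates it on a simple histogram. The 50 equal-width bins and the function name are assumptions, and TMVA estimates the distributions differently, so the numbers will not match the log exactly.

```python
def separation(signal, background, nbins=50):
    """Binned estimate of the separation <S^2> between two samples."""
    lo = min(min(signal), min(background))
    hi = max(max(signal), max(background))
    if hi == lo:
        return 0.0  # both samples sit at a single value
    width = (hi - lo) / nbins

    def fractions(values):
        # Fraction of the sample falling in each common bin.
        counts = [0] * nbins
        for v in values:
            idx = min(int((v - lo) / width), nbins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    s = fractions(signal)
    b = fractions(background)
    # With bin fractions, the bin width cancels out of the integral.
    return 0.5 * sum((si - bi) ** 2 / (si + bi)
                     for si, bi in zip(s, b) if si + bi > 0)
```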
Results with Moments variables
Results from CMS AN-2008/075
Samples: 150K single γ and 150K single π0 with energy uniformly distributed between 30 GeV and 70 GeV, within the barrel region.
[Figure: efficiency of photon identification vs π0 rejection for MMAJ2, LAT and PZM.]
[Figure: efficiency of photon identification vs π0 rejection for Fisher discriminants containing MMAJ2.]
My own results with the moments variables for the different PT bins are shown in the slides that follow the introduction of the parametric EM shower fitting method, where the discrimination results obtained with different variables are compared.
Parametric EM shower fitting method
Same as the analysis in CMSSW_1_6_7.
Formulae: longitudinal profile + lateral profile.
Longitudinal profile determined from the CMS Geant4 simulation: parameters a & b.
For the gamma EM shower with the B field on, the energy spreading is, to good approximation, only in the φ-direction, so the lateral formula in a layer is no longer isotropic at a given r.
Correction:
  The original COG obtained by the energy log-weighted method is split into 2 new COG points;
  2 interaction points with a layer are obtained;
  In a layer, the energy in a crystal is obtained from the average effect of the lateral formula originating at the 2 interaction points.
For the EM shower fitting method, 6 variables were used in the TMVA analysis: A; ΔE/Edep5x5, where ΔE = E0 − Edep5x5; …; …1; …2; …2/Edep5x5 instead of …2 (in SW167).
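For reference, a log-weighted center of gravity can be sketched as below. It assumes the usual form wi = max(0, w0 + ln(Ei / ΣE)); the default w0 = 4.2 and the function name are assumptions, not values taken from this analysis. The splitting of the COG into two points is not sketched, since its details are not given here.

```python
import math

def log_weighted_cog(crystals, w0=4.2):
    """Log-weighted centroid of a cluster.

    crystals: list of (energy, x, y). w0 is an assumed cutoff parameter:
    crystals carrying less than exp(-w0) of the energy get zero weight.
    """
    e_tot = sum(e for e, _, _ in crystals if e > 0)
    wsum = xsum = ysum = 0.0
    for e, x, y in crystals:
        if e <= 0:
            continue  # the logarithm is undefined for non-positive energies
        w = max(0.0, w0 + math.log(e / e_tot))
        wsum += w
        xsum += w * x
        ysum += w * y
    if wsum == 0.0:
        raise ValueError("no crystal passes the weight cutoff")
    return xsum / wsum, ysum / wsum
```

Compared with linear energy weighting, the logarithm gives the shower tails more influence, while the cutoff removes crystals dominated by noise.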
TMVA analysis: BDT
[Figure panels: PT20-25, PT25-30, PT20-25]
Analysis with different variables
6 PT bins for the events with EM fit OK: Fit status = 3 && not at the limits.
3 separate variable groups: NN (N12), Moments (M3), EM Fit (F6).
4 combined variable groups: M3+F6 (9), N12+M3 (15), N12+F6 (18), N12+M3+F6 (21).
Events for each PT bin: one half for training and the rest for testing.

Fit OK / Used (efficiency)
PT bins (GeV)   γ                        π0
20-25           13426 / 15617 (86.0%)    9295 / 12684 (73.3%)
25-35           26871 / 30990 (87.0%)    18707 / 23364 (80.1%)
35-45           26796 / 30399 (88.1%)    18833 / 22655 (83.1%)
45-55           23324 / 26307 (88.7%)    16521 / 19442 (85.0%)
55-65           23294 / 25854 (90.1%)    15831 / 18276 (86.6%)
65-75           18632 / 20349 (91.6%)    14871 / 16695 (89.1%)
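The fit efficiencies are simply Fit OK / Used. The short check below recomputes them from the listed counts and adds a binomial uncertainty, which is not on the slide and is shown only as an illustration. The counts reproduce the quoted percentages except for the 25-35 GeV photon entry, where 26871 / 30990 gives 86.7% against the quoted 87.0%.

```python
import math

def fit_efficiency(n_ok, n_used):
    """Return (efficiency, binomial standard error) for n_ok out of n_used."""
    eff = n_ok / n_used
    err = math.sqrt(eff * (1.0 - eff) / n_used)
    return eff, err

# (PT bin, photon ok, photon used, pi0 ok, pi0 used), as listed above.
counts = [
    ("20-25", 13426, 15617, 9295, 12684),
    ("25-35", 26871, 30990, 18707, 23364),
    ("35-45", 26796, 30399, 18833, 22655),
    ("45-55", 23324, 26307, 16521, 19442),
    ("55-65", 23294, 25854, 15831, 18276),
    ("65-75", 18632, 20349, 14871, 16695),
]

for pt_bin, g_ok, g_used, p_ok, p_used in counts:
    g_eff, g_err = fit_efficiency(g_ok, g_used)
    p_eff, p_err = fit_efficiency(p_ok, p_used)
    print(f"{pt_bin}: photon {100 * g_eff:.1f} +- {100 * g_err:.1f} %, "
          f"pi0 {100 * p_eff:.1f} +- {100 * p_err:.1f} %")
```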
TMVA analysis results: BDT & MLP
π0 rejection efficiency at 90% photon efficiency, for the events with EM fit OK

PT bins (GeV)  Method  M3     F6     M3+F6  N12    N12+M3  N12+F6  N12+M3+F6
20-25          BDT     58.7%  62.3%  70.2%  68.4%  72.0%   71.4%   73.4%
20-25          MLP     59.6%  63.3%  69.5%  71.0%  72.7%   73.3%   74.7%
35-45          BDT     38.8%  50.2%  59.2%  52.1%  62.0%   59.8%   66.0%
45-55          MLP     26.2%  38.2%  41.2%  41.0%  42.9%   48.3%   50.5%
55-65          MLP     32.0%  33.3%  43.3%  33.1%  44.8%   40.4%   49.8%

Rows with missing cells (values in source order; column positions not recoverable):
25-35          BDT, MLP  44.4% 62.2% 64.8% 66.1% 66.2% 71.2% 59.0% 65.0% 69.2% 69.6%
35-45          MLP       39.3% 48.0% 57.7% 51.8% 60.0% 59.5%
45-55          BDT       26.7% 40.7% 41.8% 44.3% 47.9%
55-65          BDT       32.5% 34.0% 33.7% 45.9% 40.0% 49.1%
65-75          BDT       17.0% 30.2% 29.7% 30.1% 34.5%
65-75          MLP       17.6% 30.7% 30.0% 30.4% 35.6% 35.7%
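The figure of merit in this table, the π0 rejection at 90% photon efficiency, can be illustrated with an unbinned working-point scan over classifier outputs. This is only a sketch: TMVA reads the number off its own ROC histograms, and the function name and inputs here are hypothetical.

```python
import math

def rejection_at_signal_efficiency(sig_scores, bkg_scores, target_eff=0.90):
    """Return (threshold, signal efficiency, background rejection).

    Events are kept when score >= threshold. The threshold is the loosest
    cut that keeps at least target_eff of the signal sample.
    """
    ordered = sorted(sig_scores, reverse=True)
    # Number of signal events that must pass; the small offset guards
    # against floating-point round-up in target_eff * len(ordered).
    n_keep = max(1, math.ceil(target_eff * len(ordered) - 1e-9))
    threshold = ordered[n_keep - 1]
    sig_eff = sum(1 for s in sig_scores if s >= threshold) / len(sig_scores)
    rejection = sum(1 for b in bkg_scores if b < threshold) / len(bkg_scores)
    return threshold, sig_eff, rejection
```

Here `sig_scores` would be the BDT or MLP outputs for test-sample photons and `bkg_scores` those for test-sample π0s; the returned rejection is the quantity tabulated above.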
Signal eff. vs Bkg rejection: BDT & MLP (I)
[Figure panels: PT20-25, PT20-25, PT25-35, PT25-35]
Signal eff. vs Bkg rejection: BDT & MLP (II)
[Figure panels: PT35-45, PT35-45, PT45-55, PT45-55]
Signal eff. vs Bkg rejection: BDT & MLP (III)
[Figure panels: PT55-65, PT55-65, PT65-75, PT65-75]
Correlation of the inputs
[Figure panels: PT25-35, PT25-35, PT55-65, PT55-65]
Response of the analysis: PT25-35 (I)
[Figure panels: PT25-35: M3, PT25-35: M3, PT25-35: F6, PT25-35: F6]
The K-S test checks whether two datasets differ significantly. The null hypothesis (both are drawn from the same distribution) is rejected if the probability P is small.
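A minimal unbinned two-sample K-S test, for illustration only. The probabilities printed on TMVA response plots come from a comparison of binned histograms, so they need not agree with this sketch. The function names are hypothetical, and the probability uses the standard asymptotic Kolmogorov series with a small-sample correction to the effective sample size.

```python
import math

def ks_statistic(a, b):
    """Maximum distance D between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past x, so that ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_probability(d, n1, n2):
    """Asymptotic probability of a distance >= d under the null hypothesis."""
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here and P is ~1 anyway
    total = 0.0
    for k in range(1, 101):
        total += 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, total))
```

A small P for the training and test response of the same class indicates that the two distributions differ.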
Response of the analysis: PT25-35 (II)
[Figure panels: PT25-35: N12, PT25-35: N12, PT25-35: M3+F6+N12, PT25-35: M3+F6+N12]
Response of the analysis: PT55-65
[Figure panels: PT55-65: F6, PT55-65: F6, PT55-65: M3+F6+N12, PT55-65: M3+F6+N12]
Before applying the “Splitting method”, first look at the bias of the calculation when parameterized values are used instead of the wide distributions of each parameter.
PT20-75GeV Photon samples
(Fitted E0 – Edeposit) / Edeposit PT20-75GeV Photon samples
PT20-75GeV Photon samples
Calculation with parametric EM shower
Photon PT35to45 sample
[Figure: seed crystal, Calculated E1 / Deposit E1]
[Figure: Calculated E5x5 / Deposit E5x5]
The energy difference of crystals is taken as the first step of the “Splitting method”. A bias in the energy from the parametric EM shower calculation will strongly affect the method.
Summary and plan
With the single γ and π0 samples, the unconverted/converted γ/π0 discrimination was studied with the TMVA method, using different categories of variables. N12+M3+F6 is the best combination.
First look at the bias of the calculation with the parametric EM shower: the results of the calculation using the parameterized values instead of the distributions of each parameter show that a bias exists for some events.
Ongoing: trying the “Splitting method”.
Backup slides