Data analysis in HEP: a statistical toolkit. S. Donadio, S. Guatelli, B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo. 1st Workshop on Italy-Japan Collaboration on Geant4 Medical Application.
Data analysis in HEP. Goal: provide tools for the statistical comparison of distributions: equivalent reference distributions, experimental measurements, data from reference sources, and functions deriving from theoretical calculations or fits. Use cases: detector monitoring, simulation validation, reconstruction vs. expectation, regression testing, physics analysis. Detector monitoring, for example, checks whether the detector behaviour stays constant over more than one run.
GoF statistical toolkit. A project to develop a statistical comparison system for the comparison of distributions and goodness-of-fit testing. Qualitative evaluation; quantitative evaluation.
Software process guidelines. Unified Software Development Process, specifically tailored to the project: practical guidance and tools from the RUP; both rigorous and lightweight; mapping onto ISO 15504. Guidance from ISO 15504. Incremental and iterative life cycle model (spiral approach).
Architectural guidelines. The project adopts a solid architectural approach: to offer the functionality and the quality needed by the users; to be maintainable over a large time scale; to be extensible, to accommodate future evolutions of the requirements. Component-based approach, to facilitate re-use and integration in different frameworks. AIDA: adopt a (HEP) standard; no dependence on any specific analysis tool.
The algorithms are specialised on the kind of distribution (binned/unbinned). Every algorithm has been rigorously tested. Documentation available: http://www.ge.infn.it/geant4/analysis/HEPstatistics/
Chi-squared test. Applies to binned distributions. It can also be useful for unbinned distributions, but the data must first be grouped into classes. It cannot be applied if the theoretical (expected) frequency in any class is < 5. In that case, one could merge contiguous classes until the minimum theoretical frequency is reached; otherwise one could use the Yates formula.
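The merging of contiguous classes described above can be sketched as follows. This is a minimal illustration in plain Python, not the toolkit's implementation: the function names merge_low_bins and chi2_statistic, the example counts, and the left-to-right merging order are all assumptions made for the example. The number of degrees of freedom is taken as the number of merged classes minus one, which assumes equal totals and no fitted parameters.

```python
def merge_low_bins(observed, expected, min_expected=5.0):
    """Merge contiguous classes until each has expected frequency >= min_expected.

    Hypothetical helper: merges left to right and folds a too-small
    trailing remainder into the last merged class.
    """
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    if acc_exp > 0.0:
        if merged_exp:
            # Leftover tail is below the threshold: add it to the last class.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # Everything together is still below the threshold: one class.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi2_statistic(observed, expected):
    """Pearson chi-squared: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up example: the first two and the last two classes have E < 5.
observed = [1, 3, 10, 12, 9, 3, 2]
expected = [2.0, 4.0, 9.0, 11.0, 8.0, 4.0, 2.0]

obs_m, exp_m = merge_low_bins(observed, expected)
chi2 = chi2_statistic(obs_m, exp_m)
ndf = len(obs_m) - 1  # assumes equal totals and no fitted parameters
print(obs_m, exp_m, round(chi2, 4), ndf)
```

In this example the seven original classes collapse to five, each with a theoretical frequency of at least 5, at the cost of a coarser description of the tails.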
More sophisticated algorithms. Unbinned distributions: Kolmogorov-Smirnov test, Goodman approximation of the KS test, Kuiper test. These are supremum statistics. [Figure: original distributions and their empirical distribution functions, with the distance Dmn marked.]
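A minimal sketch of the supremum statistics named above, computed from the two empirical distribution functions (EDFs). It is plain Python with hypothetical function names and made-up samples, not the toolkit's code. The Kolmogorov-Smirnov distance is the largest absolute EDF difference, the Kuiper distance is the sum of the largest positive and largest negative differences, and the Goodman approximation is written in its commonly quoted form, 4 D^2 mn/(m+n), to be compared with a chi-squared distribution with 2 degrees of freedom.

```python
from bisect import bisect_right


def edf_differences(sample_a, sample_b):
    """Return (D_plus, D_minus): the maxima of F_a - F_b and of F_b - F_a."""
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    d_plus = d_minus = 0.0
    # EDFs are step functions, so the extreme differences occur at data points.
    for x in a + b:
        diff = bisect_right(a, x) / m - bisect_right(b, x) / n
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return d_plus, d_minus


def ks_distance(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: sup |F_a - F_b|."""
    return max(edf_differences(sample_a, sample_b))


def kuiper_distance(sample_a, sample_b):
    """Kuiper distance: D_plus + D_minus."""
    return sum(edf_differences(sample_a, sample_b))


def goodman_chi2(d, m, n):
    """Goodman approximation, compared with chi-squared with 2 dof."""
    return 4.0 * d * d * m * n / (m + n)


# Made-up samples: sample_a surrounds sample_b, so both D_plus and D_minus are large.
sample_a = [0.1, 0.2, 2.5, 2.6]
sample_b = [1.0, 1.1, 1.2, 1.3]

d_ks = ks_distance(sample_a, sample_b)
d_kuiper = kuiper_distance(sample_a, sample_b)
chi2_goodman = goodman_chi2(d_ks, len(sample_a), len(sample_b))
print(d_ks, d_kuiper, chi2_goodman)
```

The example is chosen so that the two statistics differ: the Kuiper distance picks up both the early excess and the late deficit of sample_a, while the Kolmogorov-Smirnov distance only records the larger of the two.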
More powerful algorithms. Unbinned distributions: Cramer-von Mises test, Anderson-Darling test. These are tests containing a weighting function. These algorithms are so powerful that we decided to implement their equivalents for binned distributions: Fisz-Cramer-von Mises test, k-sample Anderson-Darling test.
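The role of the weighting function can be illustrated with the textbook two-sample forms of these statistics for unbinned data. The sketch below is an assumption-laden illustration, not the toolkit's implementation: the function names are hypothetical, the samples are made up, and ties between observations are assumed absent. Both statistics sum the squared EDF difference over the pooled sample; Cramer-von Mises uses a constant weight, while Anderson-Darling uses 1/(H(1-H)), where H is the pooled EDF, which emphasises the tails.

```python
from bisect import bisect_right


def weighted_edf_statistic(sample_a, sample_b, weight):
    """(m n / N^2) * sum over pooled points of weight(H) * (F_a - F_b)^2.

    H is the pooled EDF; the largest pooled point is skipped because
    F_a = F_b = 1 there. Assumes no tied observations.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    total = m + n
    pooled = sorted(a + b)
    acc = 0.0
    for i, x in enumerate(pooled[:-1], start=1):
        h = i / total
        diff = bisect_right(a, x) / m - bisect_right(b, x) / n
        acc += weight(h) * diff * diff
    return m * n / total**2 * acc


def cramer_von_mises(sample_a, sample_b):
    """Two-sample Cramer-von Mises statistic (constant weight)."""
    return weighted_edf_statistic(sample_a, sample_b, lambda h: 1.0)


def anderson_darling(sample_a, sample_b):
    """Two-sample Anderson-Darling statistic (weight 1 / (H (1 - H)))."""
    return weighted_edf_statistic(
        sample_a, sample_b, lambda h: 1.0 / (h * (1.0 - h))
    )


# Made-up samples without ties.
sample_a = [0.1, 0.2, 2.5, 2.6]
sample_b = [1.0, 1.1, 1.2, 1.3]

t_cvm = cramer_von_mises(sample_a, sample_b)
a2_ad = anderson_darling(sample_a, sample_b)
print(t_cvm, a2_ad)
```

Unlike a supremum statistic, every pooled observation contributes to the result, so differences spread along the whole range of x accumulate.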
Is χ² the most powerful algorithm? The power of a test is the probability of correctly rejecting the null hypothesis, that is, of rejecting it when it is false. In terms of power: χ² < supremum statistics tests < tests containing a weight function. χ² loses information in a test for an unbinned distribution by grouping the data into cells. Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires n^(4/5) observations compared to n observations for χ² to attain the same power. Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov's, since they compare the two distributions all along the range of x, rather than looking for a marked difference at one point.
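The definition of power can be made concrete with a small Monte Carlo estimate. The sketch below is illustrative only and is not taken from the toolkit: the function names, the Gaussian samples, the shift of 0.5, the sample size and the number of trials are all assumptions. It uses the asymptotic two-sample Kolmogorov-Smirnov critical value at the 5% level, c = 1.358, and counts how often the test rejects when the null hypothesis is true (false-alarm rate) and when it is false (power).

```python
import math
import random
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: largest absolute EDF difference."""
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    return max(abs(bisect_right(a, x) / m - bisect_right(b, x) / n) for x in a + b)


def ks_rejects(sample_a, sample_b, c_alpha=1.358):
    """Asymptotic two-sample KS decision; c_alpha = 1.358 is the 5% level."""
    m, n = len(sample_a), len(sample_b)
    return ks_distance(sample_a, sample_b) > c_alpha * math.sqrt((m + n) / (m * n))


def rejection_rate(shift, n=100, trials=200, seed=1):
    """Fraction of simulated sample pairs for which the test rejects."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(shift, 1.0) for _ in range(n)]
        rejected += ks_rejects(a, b)
    return rejected / trials


size = rejection_rate(shift=0.0)   # null hypothesis true: false-alarm rate
power = rejection_rate(shift=0.5)  # null hypothesis false: estimated power
print(size, power)
```

Repeating the same estimate with another test statistic on the same simulated samples is one way to compare the power of two tests for a given alternative.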
User's point of view. Simple user layer: the user only deals with AIDA objects and the choice of comparison algorithm, and selects the algorithm by writing one line of code. The user is completely shielded from both statistical and computing complexity. [Diagram: user, toolkit, statistical result.]
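The idea of a thin user layer, where the choice of algorithm is a single line, can be sketched as below. This is a hypothetical Python illustration of the design only: the toolkit itself works on AIDA objects, and none of the names used here (ComparisonResult, ALGORITHMS, compare) come from its API. Plain lists of numbers stand in for the AIDA objects.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonResult:
    """Hypothetical result object returned to the user."""
    algorithm: str
    distance: float


def _edf_differences(sample_a, sample_b):
    """Largest positive and largest negative EDF difference."""
    a, b = sorted(sample_a), sorted(sample_b)
    diffs = [bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b) for x in a + b]
    return max(0.0, max(diffs)), max(0.0, -min(diffs))


def _kolmogorov_smirnov(sample_a, sample_b):
    return max(_edf_differences(sample_a, sample_b))


def _kuiper(sample_a, sample_b):
    return sum(_edf_differences(sample_a, sample_b))


# Registry of available algorithms; the statistical details stay behind it.
ALGORITHMS = {
    "kolmogorov-smirnov": _kolmogorov_smirnov,
    "kuiper": _kuiper,
}


def compare(sample_a, sample_b, algorithm):
    """User layer: pick an algorithm by name and get a result back."""
    return ComparisonResult(algorithm, ALGORITHMS[algorithm](sample_a, sample_b))


reference = [1.0, 1.1, 1.2, 1.3]
measured = [0.1, 0.2, 2.5, 2.6]

# The only line the user changes to switch test:
result = compare(measured, reference, algorithm="kuiper")
print(result)
```

Switching to another test means changing only the algorithm argument; the data handling and the statistics remain hidden from the user.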
Examples of practical applications
Microscopic validation of physics. Chi-squared test; plot legend: NIST, Geant4 Standard, Geant4 LowE. Results for the three comparisons shown: χ²(N-S)=0.267, ν=28, p=1 and χ²(N-L)=1.315, ν=28, p=1; χ²(N-S)=0.532, ν=28, p=1 and χ²(N-L)=1.928, ν=28, p=1; χ²(N-S)=0.373, ν=28, p=1 and χ²(N-L)=5.882, ν=28, p=1. Geant4 simulations are statistically comparable with reference data (NIST database http://www.nist.gov).
Test beam at Bessy, Bepi-Colombo mission. X-ray fluorescence spectrum in Iceland basalt (E_IN = 6.5 keV). [Figure: counts vs. energy (keV).] Very complex distributions: χ² is not appropriate (< 5 entries in some bins; physical information would be lost if rebinned). Anderson-Darling: Ac (95%) = 0.752. Experimental measurements are comparable with Geant4 simulations.
Medical applications: hadron therapy. Kolmogorov-Smirnov: D(EXP-GEANT4)=0.11, p=n.s. Goodman approximation of Kolmogorov-Smirnov: χ²(EXP-GEANT4)=3.8, ν=2, p=n.s. Experimental measurements are comparable with Geant4 simulations.
Conclusions. This is a new, up-to-date, easy to handle and powerful tool for statistical comparison in particle physics. It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP. AIDA interfaces allow its integration in any other data analysis tool. Applications in: HEP, astrophysics, medical physics, …