Barbara Mascialino, INFN Genova An update on the Goodness of Fit Statistical Toolkit B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo

Presentation transcript:

Barbara Mascialino, INFN Genova
An update on the Goodness of Fit Statistical Toolkit
B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo
Geant4 Space Users’ Workshop

Goodness of Fit testing
Goodness-of-fit testing is the mathematical foundation for the comparison of data distributions.
One-sample problem: a sample is compared with a theoretical distribution.
Two-sample problem: sample 1 is compared with sample 2.
Use cases in experimental physics
Regression testing
– Throughout the software life-cycle
Online DAQ
– Monitoring detector behaviour w.r.t. a reference
Simulation validation
– Comparison with experimental data
Reconstruction
– Comparison of reconstructed vs. expected distributions
Physics analysis
– Comparison with theoretical distributions
– Comparisons of experimental distributions
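The difference between the one-sample and the two-sample problem can be illustrated with the Kolmogorov-Smirnov distance. The sketch below is a minimal standard-library illustration, not code from the Statistical Toolkit; the function names and the uniform reference distribution are assumptions chosen for the example.

```python
import bisect


def edf(sample):
    """Return the empirical distribution function (EDF) of a sample."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect.bisect_right(data, x) / n


def ks_two_sample(sample1, sample2):
    """Two-sample problem: largest distance between the two EDFs."""
    f1, f2 = edf(sample1), edf(sample2)
    # Both EDFs are step functions, so the maximum is reached at a data point.
    return max(abs(f1(x) - f2(x)) for x in list(sample1) + list(sample2))


def ks_one_sample(sample, cdf):
    """One-sample problem: largest distance between the EDF and a theoretical CDF."""
    data = sorted(sample)
    n = len(data)
    # Compare the CDF with the EDF just after and just before each jump.
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(data)
    )


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example reference only)."""
    return min(1.0, max(0.0, x))
```

For instance, `ks_two_sample(simulated, measured)` compares two data sets directly, while `ks_one_sample(measured, uniform_cdf)` compares one data set with a theoretical curve.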

G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, “A Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science (2004), 51 (5)
B. Mascialino, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, “New developments of the Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science (2006), 53 (6), to be published

Software process guidelines
Adopt a process
– software quality
Unified Process, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO (and CMM)
Incremental and iterative life-cycle
1st cycle: 2-sample GoF tests
– 1-sample GoF in preparation

Architectural guidelines
The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a large time scale
– to be extensible, to accommodate future evolutions of the requirements
Component-based architecture
– to facilitate re-use and integration in diverse frameworks
– layer architecture pattern
– core component for statistical computation
– independent components for interface to user analysis environments
Dependencies
– no dependence on any specific analysis tool
– can be used by any analysis tools, or together with any analysis tools
– offer a (HEP) standard (AIDA) for the user layer

The algorithms are specialised according to the kind of distribution (binned or unbinned)

GoF algorithms in the Statistical Toolkit
TWO-SAMPLE PROBLEM
Unbinned distributions
– Anderson-Darling test
– Anderson-Darling approximated test
– Cramer-von Mises test
– Generalised Girone test
– Goodman test (Kolmogorov-Smirnov test in chi-squared approximation)
– Kolmogorov-Smirnov test
– Kuiper test
– Tiku test (Cramer-von Mises test in chi-squared approximation)
– Weighted Kolmogorov-Smirnov test (2 flavours)
– Weighted Cramer-von Mises test
Binned distributions
– Anderson-Darling test
– Anderson-Darling approximated test
– Chi-squared test
– Fisz-Cramer-von Mises test
– Tiku test (Cramer-von Mises test in chi-squared approximation)
It is the most complete software for the comparison of two distributions, even among commercial/professional statistics tools.
It provides all 2-sample (edf) GoF algorithms existing in statistics literature
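Most of the unbinned tests listed above are built on the difference between the two empirical distribution functions (EDFs) and differ in how that difference is summarised. As a rough illustration, the sketch below computes the Kuiper and the two-sample Cramer-von Mises statistics from the same EDF differences. It is a textbook-style sketch with assumed function names, not the toolkit's implementation, and it returns only the test statistics, not p-values.

```python
import bisect


def edf_differences(sample1, sample2):
    """F1(z) - F2(z) evaluated at every point z of the pooled sample."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    return [
        bisect.bisect_right(s1, z) / n1 - bisect.bisect_right(s2, z) / n2
        for z in s1 + s2
    ]


def kuiper(sample1, sample2):
    """Kuiper statistic: largest positive plus largest negative EDF deviation."""
    d = edf_differences(sample1, sample2)
    return max(max(d), 0.0) + max(-min(d), 0.0)


def cramer_von_mises(sample1, sample2):
    """Two-sample Cramer-von Mises statistic: scaled sum of squared EDF differences."""
    n1, n2 = len(sample1), len(sample2)
    d = edf_differences(sample1, sample2)
    return n1 * n2 / (n1 + n2) ** 2 * sum(x * x for x in d)
```

The Kolmogorov-Smirnov statistic would be `max(abs(x) for x in d)` on the same list, which shows why one core routine can serve several tests.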

User Layer
Simple user layer
Shields the user from the complexity of the underlying algorithms and design
Only deal with the user’s analysis objects and choice of comparison algorithm
First release: user layer for AIDA analysis objects
– LCG Architecture Blueprint, Geant4 requirement
Second release: added user layer for ROOT analysis objects
– in response to user requirements
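The layering described here, a core component for statistical computation plus thin user-layer components per analysis environment, can be sketched as follows. Every name in this example (`ToyHistogram`, `ALGORITHMS`, `compare`) is hypothetical and chosen only for illustration; the real toolkit exposes its own interfaces for AIDA and ROOT objects, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Core layer: works on plain sequences of numbers, with no analysis-tool types.
Algorithm = Callable[[Sequence[float], Sequence[float]], float]


def chi2_binned(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample chi-squared statistic for histograms with identical binning."""
    n1, n2 = sum(a), sum(b)
    total = 0.0
    for x, y in zip(a, b):
        if x + y > 0:  # bins empty in both histograms carry no information
            total += (x * (n2 / n1) ** 0.5 - y * (n1 / n2) ** 0.5) ** 2 / (x + y)
    return total


# Registry of core algorithms, selected by name from the user layer.
ALGORITHMS: Dict[str, Algorithm] = {"chi2": chi2_binned}


# User layer: one thin adapter per analysis environment.
@dataclass
class ToyHistogram:
    """Hypothetical stand-in for a user's analysis object."""
    bin_contents: List[float]


def compare(h1: ToyHistogram, h2: ToyHistogram, algorithm: str) -> float:
    """Extract plain numbers from the user's objects and delegate to the core."""
    return ALGORITHMS[algorithm](h1.bin_contents, h2.bin_contents)
```

In this arrangement, supporting another analysis environment means writing another adapter like `compare`, while the core algorithms stay untouched.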

Which test to use?
Do we really need such a wide collection of GoF tests? Why?
Which is the most appropriate test to compare two distributions?
How “good” is a test at recognizing real equivalent distributions and rejecting fake ones?
The choice of the most suitable GoF test can be performed on the basis of two different criteria:
– Computational performance
– Statistical performance (power)

A) Performance of the GoF tests
AVERAGE CPU TIME

Test                                   Binned Distributions   Unbinned Distributions
Anderson-Darling                       (0.69±0.01) ms         (16.9±0.2) ms
Anderson-Darling (approximated)        (0.60±0.01) ms         (16.1±0.2) ms
Chi-squared                            (0.55±0.01) ms         -
Cramer-von Mises                       (0.44±0.01) ms         (16.3±0.2) ms
Generalised Girone                     -                      (15.9±0.2) ms
Goodman                                -                      (11.9±0.1) ms
Kolmogorov-Smirnov                     -                      (8.9±0.1) ms
Kuiper                                 -                      (12.1±0.1) ms
Tiku                                   (0.69±0.01) ms         (16.7±0.2) ms
Watson                                 -                      (14.2±0.1) ms
Weighted Kolmogorov-Smirnov (AD)       -                      (14.0±0.1) ms
Weighted Kolmogorov-Smirnov (Buning)   -                      (14.0±0.1) ms
Weighted Cramer-von Mises              -                      (14.0±0.1) ms

B) Power of GoF tests
The power of a test is the probability of correctly rejecting the null hypothesis when it is false
Systematic study of all existing GoF tests in progress
– made possible by the extensive collection of tests in the Statistical Toolkit
– power of the GoF tests evaluated in a variety of alternative situations
No clear winner: the statistical performance of a test depends on the features of the distributions to be compared (skewness and tailweight) and on the sample size
General recipe (practical recommendations)
1) first classify the type of the distributions in terms of skewness and tailweight
2) choose the most appropriate test for that type of distribution, using the proposed quantitative model to identify the best test
Topic still subject to research activity in the domain of statistics
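The power of a test can be estimated by Monte Carlo: draw many pairs of samples from two distributions that really differ and count how often the test rejects the null hypothesis. The sketch below does this for the Kolmogorov-Smirnov test, using the standard large-sample critical value at the 5% level (coefficient 1.358). The Gaussian parent distributions, sample size and number of trials are assumptions picked for the example and are not the configurations studied by the authors.

```python
import bisect
import math
import random


def ks_distance(s1, s2):
    """Largest distance between the EDFs of two samples."""
    a, b = sorted(s1), sorted(s2)
    return max(
        abs(bisect.bisect_right(a, z) / len(a) - bisect.bisect_right(b, z) / len(b))
        for z in a + b
    )


def rejects(s1, s2, c_alpha=1.358):
    """Large-sample KS decision at the 5% level (c_alpha = 1.358)."""
    n, m = len(s1), len(s2)
    return ks_distance(s1, s2) > c_alpha * math.sqrt((n + m) / (n * m))


def estimate_power(draw1, draw2, size=50, trials=500, seed=1):
    """Fraction of pseudo-experiments in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s1 = [draw1(rng) for _ in range(size)]
        s2 = [draw2(rng) for _ in range(size)]
        hits += rejects(s1, s2)
    return hits / trials


# Alternative hypothesis true: same shape, shifted by one standard deviation.
power = estimate_power(lambda r: r.gauss(0, 1), lambda r: r.gauss(1, 1))
# Null hypothesis true: the rejection rate should stay close to or below 5%.
false_alarm = estimate_power(lambda r: r.gauss(0, 1), lambda r: r.gauss(0, 1))
```

Repeating the estimate with parent distributions of different skewness and tailweight, and with different sample sizes, is one way to see why no single test wins everywhere.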

Examples of practical applications

Statistical Toolkit Usage
Geant4 physics validation
– rigorous approach: quantitative evaluation of Geant4 physics models with respect to established reference data
– see for instance: K. Amako et al., Comparison of Geant4 electromagnetic physics models against the NIST reference data, IEEE Trans. Nucl. Sci. (2005)
LCG Simulation Validation project
– see for instance: A. Ribon, Testing Geant4 with a simplified calorimeter setup
CMS
– validation of “new” histograms w.r.t. “reference” ones in OSCAR Validation Suite
Usage also in space science, medicine, statistics, etc.

Validation of Geant4 e.m. physics models vs. NIST reference data
Electron Stopping Power
Experimental set-up
Physics models under test:
– Geant4 Standard
– Geant4 Low Energy – Livermore
– Geant4 Low Energy – Penelope
Reference data: NIST ESTAR - ICRU 37
χ² test (to include data uncertainties in the computation of the test statistics value)
[Figure: p-value vs. Z for Geant4 LowE Penelope, Geant4 Standard and Geant4 LowE EEDL, with the H0 rejection area marked; p-value stability study; second panel labelled NIST - XCOM]
The three Geant4 models are equivalent
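A chi-squared statistic that includes the data uncertainties can be sketched as below, with the simulated and reference uncertainties combined in quadrature point by point. This is a generic formulation under stated assumptions: independent points, no fitted parameters (so the number of degrees of freedom equals the number of points), and a p-value helper that uses the closed form valid only for an even number of degrees of freedom. The function names are hypothetical and the numbers are toy values, not Geant4 or NIST data.

```python
import math


def chi2_with_uncertainties(sim, sim_err, ref, ref_err):
    """Chi-squared statistic with both uncertainties combined in quadrature."""
    return sum(
        (s - r) ** 2 / (es ** 2 + er ** 2)
        for s, es, r, er in zip(sim, sim_err, ref, ref_err)
    )


def chi2_p_value_even_dof(chi2, dof):
    """Upper-tail chi-squared probability, closed form for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive, even dof")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k  # builds (chi2/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Toy values for illustration only.
sim = [1.02, 2.05, 2.96, 4.10]
sim_err = [0.05, 0.05, 0.05, 0.05]
ref = [1.00, 2.00, 3.00, 4.00]
ref_err = [0.03, 0.03, 0.03, 0.03]

chi2 = chi2_with_uncertainties(sim, sim_err, ref, ref_err)
p_value = chi2_p_value_even_dof(chi2, dof=len(sim))
```

A p-value above the chosen significance threshold means the null hypothesis of compatibility between simulation and reference is not rejected.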

Validation of Geant4 Atomic Relaxation vs NIST reference data
Fluorescence
Kolmogorov-Smirnov test: D and p-value
[Figure: Geant4 vs. NIST (○) as a function of shell-start and shell-end]

Validation of Geant4 electromagnetic and hadronic models against proton data
Physics configuration
– Low Energy EM – ICRU49: p, ions
– Low Energy EM – Livermore: γ, e-
– Standard EM: e+
– HadronElastic with BertiniElastic
– Bertini Inelastic
0.5 M events
[Figure: Geant4 (LowE EM – ICRU49, BertiniElastic, Bertini Inelastic) vs. experimental data; axis in mm]
[Table: p-values of the CvM, KS and AD tests for the left branch, the right branch and the whole curve]
CvM: Cramer-von Mises test
KS: Kolmogorov-Smirnov test
AD: Anderson-Darling test

Barbara Mascialino, INFN Genova  2 not appropriate (< 5 entries in some bins, physical information would be lost if rebinned) Anderson-Darling A c (95%) =0.752 Test beam at Bessy Bepi-Colombo mission Energy (keV) Counts X-ray fluorescence spectrum in Iceand basalt (E IN =6.5 keV) Very complex distributions Experimental measurements are comparable with Geant4 simulations

Inflatable habitat vs a conventional rigid habitat
Comparison of alternative vehicle concepts in human missions to Mars
Reference: rigid structures as in the ISS (2 - 4 cm Al)
[Figure: average energy deposit (MeV) of GCR p vs. depth in the phantom (cm); curves for 4 cm Al and 10 cm water with the EM, Bertini and Binary physics configurations]
Kolmogorov-Smirnov test
– Multi-layer + 10 cm water equivalent to 4 cm Al
– Multi-layer + 5 cm water equivalent to 2.15 cm Al
An inflatable habitat exhibits a shielding capability equivalent to a conventional rigid one
Energy deposited in phantom (MeV) by shielding material (columns: EM, Bertini, Binary)
ML + 5 cm water 73.5 ± ± ± 0.4
ML + 10 cm water 71.9 ± ± ±
cm Al 72.9 ± ± ±
cm Al 73.9 ± ± ± 0.5

Conclusions
A novel, complete software toolkit for statistical analysis is being developed
– all the two-sample GoF tests available in statistical domain + chi-squared test
– rigorous architectural design
– rigorous software process
It is the most complete software for the comparison of two distributions, even among commercial/professional statistics tools.
A systematic study of the power of GoF tests is in progress
– unexplored area of research
Application in various domains
– Geant4, HEP, space science, medicine…
Feedback and suggestions are very much appreciated