Update on the Goodness of Fit Toolkit
Maria Grazia Pia, INFN Genova
B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo

Update on the Goodness of Fit Toolkit
M.G. Pia, INFN Genova
B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo
PHYSTAT 2005, Oxford, September 2005

Historical background…
Validation of Geant4 physics models through comparison of simulation vs experimental data or reference databases
[Figure: Fluorescence spectrum from Icelandic basalt (Mars-like rock): experimental data and simulation. ESA Bepi Colombo mission to Mercury, test beam at Bessy]
[Figure: Photon attenuation coefficient, Al: Geant4 Standard, Geant4 LowE, NIST data. Electromagnetic models in Geant4 w.r.t. NIST reference]

Some use cases
Regression testing
  – Throughout the software life-cycle
Online DAQ
  – Monitoring detector behaviour w.r.t. a reference
Simulation validation
  – Comparison with experimental data
Reconstruction
  – Comparison of reconstructed vs. expected distributions
Physics analysis
  – Comparisons of experimental distributions (ATLAS vs. CMS Higgs?)
  – Comparison with theoretical distributions (data vs. Standard Model)
The test statistics computation concerns the agreement between the two samples’ empirical distribution functions

G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, “A Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science (2004), 51 (5):
More information on the project web site

Vision of the project
Basic vision
  – General purpose tool
  – Toolkit approach (choice open to users)
  – Open source product
  – Independent from specific analysis tools
  – Easily usable in analysis and other tools
Rigorous software process
Build on a solid architecture
Clearly define scope, objectives
Flexible, extensible, maintainable system
Software quality

Software process guidelines
Adopt a process
  – the key to software quality...
Unified Process, specifically tailored to the project
  – practical guidance and tools from the RUP
  – both rigorous and lightweight
  – mapping onto ISO (and CMM)
Incremental and iterative life-cycle
1st cycle: 2-sample GoF tests
  – 1-sample GoF in preparation

Architectural guidelines
The project adopts a solid architectural approach
  – to offer the functionality and the quality needed by the users
  – to be maintainable over a large time scale
  – to be extensible, to accommodate future evolutions of the requirements
Component-based architecture
  – to facilitate re-use and integration in diverse frameworks
  – layer architecture pattern
  – core component for statistical computation
  – independent components for interface to user analysis environments
Dependencies
  – no dependence on any specific analysis tool
  – can be used by any analysis tools, or together with any analysis tools
  – offer various specific user layers for usage with AIDA, ROOT, etc.

User Layer
Simple user layer
Shields the user from the complexity of the underlying algorithms and design
Only deal with the user’s analysis objects and choice of comparison algorithm
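
The separation between the core computation and the user layer can be pictured with a small Python sketch. This is not the toolkit's API: `ToyHistogram`, `chi2_distance` and `compare_histograms` are hypothetical names introduced only for illustration, and the chi-squared form shown assumes two histograms with identical binning and equal total counts.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Core layer (hypothetical): works on plain sequences of numbers only,
# with no knowledge of any analysis tool.
Algorithm = Callable[[Sequence[float], Sequence[float]], float]


def chi2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Chi-squared distance between two binned distributions.

    Assumes identical binning and equal total counts; empty bin pairs are skipped.
    """
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


@dataclass
class ToyHistogram:
    """Hypothetical stand-in for a user's analysis object (e.g. a histogram)."""
    edges: List[float]
    counts: List[float]


def compare_histograms(h1: ToyHistogram, h2: ToyHistogram, algorithm: Algorithm) -> float:
    """User layer (hypothetical): unpack analysis objects, delegate to the core."""
    if h1.edges != h2.edges:
        raise ValueError("histograms must share the same binning")
    return algorithm(h1.counts, h2.counts)


reference = ToyHistogram(edges=[0.0, 1.0, 2.0], counts=[10.0, 20.0])
new = ToyHistogram(edges=[0.0, 1.0, 2.0], counts=[20.0, 10.0])
distance = compare_histograms(reference, new, chi2_distance)
```

In this arrangement, supporting another kind of analysis object only requires another thin adapter like `compare_histograms`; the core function stays untouched.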

GoF algorithms (currently implemented)
Algorithms for binned distributions
  – Anderson-Darling test
  – Chi-squared test
  – Fisz-Cramer-von Mises test
  – Tiku test (Cramer-von Mises test in chi-squared approximation)
Algorithms for unbinned distributions
  – Anderson-Darling test
  – Cramer-von Mises test
  – Goodman test (Kolmogorov-Smirnov test in chi-squared approximation)
  – Kolmogorov-Smirnov test
  – Kuiper test
  – Tiku test (Cramer-von Mises test in chi-squared approximation)
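
Several of the unbinned tests listed here are built on the empirical distribution functions (EDFs) of the two samples. As a minimal sketch in Python, independent of the toolkit's own implementation, the two-sample Kolmogorov-Smirnov and Cramer-von Mises statistics can be written as follows; the function names are illustrative, and only the test statistics are computed, not their p-values.

```python
from bisect import bisect_right


def edf(sample):
    """Return the empirical distribution function of a sample as a callable."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance: largest |F_a(x) - F_b(x)|.

    Both EDFs are step functions that change only at sample points,
    so it is enough to scan the pooled sample.
    """
    fa, fb = edf(a), edf(b)
    return max(abs(fa(x) - fb(x)) for x in list(a) + list(b))


def cvm_statistic(a, b):
    """Two-sample Cramer-von Mises statistic.

    Sum of squared EDF differences over the pooled sample,
    scaled by n*m / (n+m)^2.
    """
    n, m = len(a), len(b)
    fa, fb = edf(a), edf(b)
    total = sum((fa(x) - fb(x)) ** 2 for x in list(a) + list(b))
    return n * m / (n + m) ** 2 * total


sample_1 = [1.0, 2.0, 3.0, 4.0]
sample_2 = [3.0, 4.0, 5.0, 6.0]
d_ks = ks_statistic(sample_1, sample_2)
t_cvm = cvm_statistic(sample_1, sample_2)
```

The Kolmogorov-Smirnov statistic reacts to the single largest EDF gap, while the Cramer-von Mises statistic accumulates the gap over the whole range, which is one reason the tests can differ in power.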

Recent extensions: algorithms
Fisz-Cramer-von Mises test and Anderson-Darling test
  – exact asymptotic distribution (earlier: critical values)
Tiku test
  – Cramer-von Mises test in a chi-squared approximation
New tests: weighted Kolmogorov-Smirnov, weighted Cramer-von Mises
  – various weighting functions available in literature
In preparation
  – Watson test (can be applied in case of cyclic observations, like Kuiper test)
  – Girone test
It is the most complete software for the comparison of two distributions, even among commercial/professional statistics tools
  – goal: provide all 2-sample GoF algorithms existing in statistics literature
Publication in preparation to describe the new algorithms

Recent extensions: user layer
First release: user layer for AIDA analysis objects
  – LCG Architecture Blueprint, Geant4 requirement
July 2005: added user layer for ROOT histograms
  – in response to user requirements
Other user layer implementations foreseen
  – easy to add
  – sound architecture decouples the mathematical component and the user’s representation of analysis objects
  – different requirements from various user communities: satisfy them without introducing dependencies on any analysis tools

Software release
Releases are publicly downloadable from the web
  – code, documentation etc.
For the convenience of LCG users, releases are also distributed with LCG AA software as “external contributions”
Also ported to Java, distributed with JAS
Release with new algorithms planned in autumn
  – plus a publication on recent extensions
Releases include extensive user documentation
  – statistics algorithms
  – how to use the software
The project is systematically accompanied by publications in refereed journals to document the recognition of its scientific value

Examples of Usage
Geant4 physics validation
  – rigorous approach: quantitative evaluation of Geant4 physics models with respect to established reference data
  – see for instance: K. Amako et al., “Comparison of Geant4 electromagnetic physics models against the NIST reference data”, IEEE Trans. Nucl. Sci. (2005)
LCG Simulation Validation project
  – see for instance: A. Ribon, Testing Geant4 with a simplified calorimeter setup
CMS
  – validation of “new” histograms w.r.t. “reference” ones in OSCAR Validation Suite
Usage also in space science, medicine etc.

Power of GoF tests
Do we really need such a wide collection of GoF tests? Why?
Which is the most appropriate test to compare two distributions?
How “good” is a test at recognizing real equivalent distributions and rejecting fake ones?
Which test to use?

Systematic study of GoF tests
No comprehensive study of the relative power of GoF tests exists in literature
  – novel research in statistics (not only in physics data analysis!)
Systematic study of all existing GoF tests in progress
  – made possible by the extensive collection of tests in the Statistical Toolkit
Provide guidance to the users based on sound quantitative arguments
Preliminary results available
Publication in preparation

Method for the evaluation of power
N = 1000 Monte Carlo replicas
Pseudoexperiment: a random drawing of two samples from two parent distributions
  – Sample 1 (size n) drawn from parent distribution 1
  – Sample 2 (size m) drawn from parent distribution 2
  – the GoF test is applied to the two samples
For each test, the p-value computed by the GoF Toolkit derives from the analytical calculation of the asymptotic distribution, often depending on the sample sizes
Confidence Level = 0.05 (the threshold below which a p-value counts as a rejection)
Power = (# pseudoexperiments with p-value < 0.05) / (# pseudoexperiments); the slide writes this threshold as (1-CL)
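
The procedure on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the toolkit's code. The Kolmogorov-Smirnov test stands in for "a GoF test", its p-value comes from the standard asymptotic Kolmogorov series, the rejection threshold is taken to be 0.05, and all function names are hypothetical.

```python
import math
import random
from bisect import bisect_right


def ks_distance(a, b):
    """Largest absolute difference between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def ks_pvalue(a, b, terms=100):
    """Asymptotic two-sample KS p-value: Q(lam) = 2 * sum (-1)^(k-1) exp(-2 k^2 lam^2)."""
    n, m = len(a), len(b)
    lam = ks_distance(a, b) * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return 1.0  # the series converges poorly here; the true value is ~1
    q = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, q))


def estimate_power(draw_1, draw_2, n, m, replicas=1000, threshold=0.05, seed=12345):
    """Fraction of pseudoexperiments in which the test rejects (p-value < threshold)."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(replicas):
        sample_1 = [draw_1(rng) for _ in range(n)]
        sample_2 = [draw_2(rng) for _ in range(m)]
        if ks_pvalue(sample_1, sample_2) < threshold:
            rejected += 1
    return rejected / replicas


# Example: Gaussian parent versus a Gaussian parent shifted by one unit.
power = estimate_power(
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(1.0, 1.0),
    n=30, m=30, replicas=200,
)
```

When both samples come from the same parent, the same function estimates the false rejection rate instead of the power, which gives a quick sanity check on the p-value calculation.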

Parent distributions
[Plots of the parent distributions: Uniform, Gaussian, Double exponential, Cauchy, Exponential, Contaminated Normal Distribution 1, Contaminated Normal Distribution 2]
Also Breit-Wigner, other distributions being considered

Characterization of distributions
Each parent distribution is characterized by its skewness S and tailweight T
[Table: parent distribution, S, T; the numerical values are not preserved in this transcript]
  f1(x) Uniform
  f2(x) Gaussian
  f3(x) Double exponential
  f4(x) Cauchy
  f5(x) Exponential
  f6(x) Contaminated normal
  f7(x) Contaminated normal
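
The transcript does not preserve how S and T are defined. The Python sketch below assumes quantile-based definitions, S = (x_0.975 - x_0.5) / (x_0.5 - x_0.025) and T = (x_0.975 - x_0.025) / (x_0.875 - x_0.125). This is an assumption, although with it the double exponential gives T = 2.161, the value quoted for that distribution on a later slide. The helper names are illustrative.

```python
import math
from statistics import NormalDist


def skewness(quantile):
    """Assumed quantile-based skewness: equals 1 for a symmetric distribution."""
    return (quantile(0.975) - quantile(0.5)) / (quantile(0.5) - quantile(0.025))


def tailweight(quantile):
    """Assumed quantile-based tailweight: larger values mean heavier tails."""
    return (quantile(0.975) - quantile(0.025)) / (quantile(0.875) - quantile(0.125))


def uniform_quantile(p):
    """Quantile function of the uniform distribution on [0, 1]."""
    return p


def double_exponential_quantile(p):
    """Quantile function of the double exponential (Laplace) distribution."""
    return math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))


def exponential_quantile(p):
    """Quantile function of the unit-rate exponential distribution."""
    return -math.log(1.0 - p)


gaussian_quantile = NormalDist().inv_cdf

t_uniform = tailweight(uniform_quantile)
t_gaussian = tailweight(gaussian_quantile)
t_double_exp = tailweight(double_exponential_quantile)
s_exponential = skewness(exponential_quantile)
```

Under these assumed definitions the uniform, Gaussian and double exponential distributions fall into the short, medium and long tailweight classes used in the comparative evaluation (T < 1.5, 1.5 < T < 2, T > 2).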

The power increases with the sample size (analytical calculation of the asymptotic distribution)
The “location-scale problem”
Case: Parent1 = Parent2
[Plot: power vs. sample size N for the Kolmogorov-Smirnov test, CL = 0.05; one curve per parent distribution (Uniform, Normal, Exponential, Double Exponential, Contaminated Normal 1, Contaminated Normal 2, Cauchy); regions marked “small size samples” and “moderate size samples”]
Preliminary

The “general shape problem”
Case: Parent1 ≠ Parent2 (S1 = S2 = 1)
A) Symmetric distributions
[Plot: power vs. tailweight T2 of Distribution 2, CL = 0.05; curves for Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling; Distribution 1 and Double Exponential (T1 = 2.161)]
For short/medium tailed distributions: KS ~ CVM ~ AD
For long tailed distributions: KS, CVM and AD are ordered with one “~” and one “<” (the placement of the two symbols is not recoverable from the transcript)
B) Skewed versus symmetric distributions
Power (several values and errors are missing in the transcript):
  Distribution 1 – Distribution 2 | KS       | CVM     | AD
  CN2 – Normal                    | 55.6 ± … | … ± …   | … ± 1.1
  CN2 – CN1                       | 24.9 ± … | … ± …   | … ± 1.6
  CN2 – Double Exponential        | 37.6 ± … | … ± …   | … ± 1.6
Preliminary

Comparative evaluation of tests
Most suitable tests by skewness (rows) and tailweight (columns):
  Skewness | Short (T < 1.5) | Medium (1.5 < T < 2) | Long (T > 2)
  S ~ 1    | KS              | KS – CVM             | CVM – AD
  S > 1.5  | KS – AD         | AD                   | CVM – AD
Preliminary

Preliminary results
No clear winner for all the considered distributions in general
  – the performance of a test depends on its intrinsic features as well as on the features of the distributions to be compared
Practical recommendations
  1) first classify the type of the distributions in terms of skewness and tailweight
  2) choose the most appropriate test given the type of distributions
Systematic study of the power in progress
  – for both binned and unbinned distributions
Topic still subject to research activity in the domain of statistics
Publication in preparation
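
The two-step recommendation can be sketched as a simple lookup in Python. The table encoded below is the preliminary one from the "Comparative evaluation of tests" slide, as read from the flattened transcript, so the cell assignment should be treated as tentative. The handling of boundary values (T exactly 1.5 or 2) and of skewness values between 1 and 1.5 is an assumption, and the names are illustrative.

```python
def tail_class(tailweight):
    """Step 1a: classify the tailweight T (boundary handling is an assumption)."""
    if tailweight < 1.5:
        return "short"
    if tailweight <= 2.0:
        return "medium"
    return "long"


def skew_class(skewness):
    """Step 1b: classify the skewness S; anything up to 1.5 is treated as S ~ 1."""
    return "skewed" if skewness > 1.5 else "symmetric"


# Step 2: preliminary recommendations, as reconstructed from the slide.
RECOMMENDED_TESTS = {
    ("symmetric", "short"): ["KS"],
    ("symmetric", "medium"): ["KS", "CVM"],
    ("symmetric", "long"): ["CVM", "AD"],
    ("skewed", "short"): ["KS", "AD"],
    ("skewed", "medium"): ["AD"],
    ("skewed", "long"): ["CVM", "AD"],
}


def recommend_tests(skewness, tailweight):
    """Return the tests suggested for a distribution with the given S and T."""
    return RECOMMENDED_TESTS[(skew_class(skewness), tail_class(tailweight))]


# Example: a symmetric, long tailed distribution (S = 1, T = 2.161).
suggested = recommend_tests(1.0, 2.161)
```

Because the slide marks these results as preliminary, a lookup of this kind is best read as a starting point for choosing a test rather than a fixed rule.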

Outlook
1-sample GoF tests (comparison w.r.t. a function)
Comparison of two/multi-dimensional distributions
Systematic study of the power of GoF tests
Goal: to provide an extensive set of the algorithms published so far in statistics literature, with a critical evaluation of their relative strengths and applicability
Treatment of errors, filtering
New release coming soon
New papers in preparation
Other components beyond GoF?
Suggestions are welcome…

Conclusions
A novel, complete software toolkit for statistical analysis is being developed
  – rich set of algorithms
  – sound architectural design
  – rigorous software process
A systematic study of the power of GoF tests is in progress
  – unexplored area of research
Application in various domains
  – Geant4, HEP, space science, medicine…
Feedback and suggestions are very much appreciated
The project is open to developers interested in statistical methods