Release Validation
J. Apostolakis, M. Asai, G. Cosmo, S. Incerti, V. Ivantchenko, D. Wright for Geant4
12 January 2009
Content
Addresses recommendations #20, #21 and #22.
Geant4 Delta Review, Jan 2009
Recommendation 20: common validation procedure
“We recommend defining and automating a common validation procedure to be run for every release, monitoring a comprehensive set of variables and exploiting the comparisons with the collected experimental results.”
The current procedure for validating each release, started in June 2006 and incrementally improved, utilizes:
– a suite of simplified calorimeter setups, run with large statistics using Grid resources, which test in regression a large set of observables including shower profile and composition, track length and population of particle species, and exiting neutrons. Configurations span a set of energies from 1 to 300 GeV, a set of materials typical of LHC applications, and a set of incident particle types.
Tests are carried out by a small team, for several release candidates before each release, and are substantially automated for submission and checking. In particular, these calorimeter regression tests:
– report any deviations (significant at a P-level of 1% in any one of several statistical tests) between the currently tested version and the baseline distribution. These comparisons are generated as plots, so they can be checked if a pattern of deviations is found;
– provide summary information for key observables;
– summarize the number of deviations found.
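The deviation report described above can be illustrated with a minimal sketch. The slides do not name the statistical tests used, so this example assumes a single two-sample Kolmogorov-Smirnov test with an asymptotic p-value; the function names and the dictionary layout of observables are hypothetical and are not part of the Geant4 test suite.

```python
import math


def ks_statistic(baseline, candidate):
    """Largest distance between the empirical CDFs of two samples."""
    if not baseline or not candidate:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(baseline), sorted(candidate)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past every entry equal to x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_p_value(d, n, m):
    """Asymptotic p-value from the Kolmogorov series (an approximation)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here and the value is ~1
    total = 0.0
    for k in range(1, 101):
        total += (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
    return min(1.0, max(0.0, 2.0 * total))


def find_deviations(baseline, candidate, p_level=0.01):
    """Return {observable name: p-value} for observables below p_level.

    Both arguments map a (hypothetical) observable name to a list of values.
    """
    flagged = {}
    for name, ref_sample in baseline.items():
        new_sample = candidate[name]
        d = ks_statistic(ref_sample, new_sample)
        p = ks_p_value(d, len(ref_sample), len(new_sample))
        if p < p_level:
            flagged[name] = p
    return flagged
```

The number of entries in the returned dictionary corresponds to the "number of deviations found" summary; a production setup would typically combine several tests and histogram-based comparisons rather than this single unbinned test.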
Recommendation 20 (cont.): common validation procedure
Additional tests are undertaken at more frequent intervals, with fewer configurations tested. We are planning to adapt a number of these for release validation. Currently the tests include:
1. A set of comparisons of five simplified calorimeters (similar to LHC ones), used to validate all changes in EM processes for HEP calorimeter applications.
   – Observables such as visible energy and resolution are measured for different range cut values, and with different EM precision options.
   – Currently these are run to validate EM developments when changes are made and EM tags are prepared for a release, and also after every release (see one test case on the next page).
   – We plan to adapt all these setups for use as regression tests for release candidates in 2009, adding new scripts to compare key observables with values obtained by previous releases.
2. A selection of setups for each of the 31 EM system integration tests, run in an automated manner for simple comparisons at medium statistics, to provide a basic test of the majority of standard EM models.
   – The following values are tested: stopping powers, ranges, energy deposition in a thin layer, shower profile, cross sections, scattering angles.
   – The tests use the processes via components (builders) used by the reference Physics Lists (e.g. QGSP, QGSP_EMV, ...).
   – Simplified regression testing (using diff) and human checking of changes are used for most tests.
   – Currently two tests provide an automatic criterion for acceptance, based on a Chi^2 criterion. They measure shower profiles in uniform cylindrical and rectangular (crystal) calorimeters.
   – We plan to adapt a set of thin target tests and the Fano cavity test for use in regression testing for release validation by June 2010. The selection of the tests to be used is pending.
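As an illustration of an automatic Chi^2-based acceptance criterion for shower profiles, the following is a minimal sketch. The profile layout (a list of value and uncertainty pairs per bin), the function names, and the threshold of 2.0 on the reduced Chi^2 are all assumptions made for this example; the slides do not state the actual criterion used by the two tests.

```python
def reduced_chi2(reference, candidate):
    """Reduced Chi^2 between two binned profiles.

    Each profile is a list of (value, uncertainty) tuples, one per bin.
    Bins where both uncertainties are zero carry no information and
    are skipped.
    """
    if len(reference) != len(candidate):
        raise ValueError("profiles must have the same number of bins")
    chi2 = 0.0
    ndf = 0
    for (ref, ref_err), (new, new_err) in zip(reference, candidate):
        variance = ref_err ** 2 + new_err ** 2
        if variance <= 0.0:
            continue
        chi2 += (ref - new) ** 2 / variance
        ndf += 1
    if ndf == 0:
        raise ValueError("no bins with a usable uncertainty")
    return chi2 / ndf


def accept_profile(reference, candidate, max_reduced_chi2=2.0):
    """Accept the candidate if the reduced Chi^2 is below the threshold.

    The default threshold is a placeholder chosen for illustration only.
    """
    return reduced_chi2(reference, candidate) <= max_reduced_chi2
```

A script of this kind could replace the diff-plus-human-check step for profiles where statistical fluctuations make a textual diff uninformative.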
One test: Simplified ATLAS Barrel-type materials and sizes
Results of one test case:
– Patch: geant patch-02
– Development version: geant ref-09 (1 Nov 2008)
– Releases: Release 9.1 (12 Dec 2007), Release 9.2 (19 Dec 2008)
The same plot can be found in the file atlasbar.gif in the directories for all other releases and development tags tested.
Recommendation 21: Validation metrics
“We recommend defining quantitative metrics for validation results.”
In most cases, we do not currently have such a metric.
– A number of comparisons already utilize quantitative metrics, including:
   – the comparison to energy deposition under Fano conditions, a.k.a. the "Fano cavity" test. The result is the ratio of the MC energy deposition estimate to that predicted by Fano’s theorem; see recent Geant4 results in the Geant talk of S. Elles, and a short explanation and previous results from 2007;
   – the comparison to the muon scattering data of MUSCAT.
We identify different cases:
– models which describe the relevant data well. Examples are most models of EM interactions, many of which are well established and/or derived from first principles. For these comparisons, metrics such as chi-squared are well suited.
– models which do not fully describe the relevant data and/or have large deviations from data. Metrics such as chi-squared are generally not very useful in these cases. This is typically the case for hadronic physics models. For these cases we are adopting the ratio of Monte Carlo results to experimental data (“MC/data”) as our metric.
Recommendation 21 (cont.): Validation metrics
We plan to roll out the implementation of the MC/data metric in the validation suites wherever possible, in new validations and by retrofitting existing ones.
– This will be used in:
   – plots (as in MUSCAT here)
   – tables: average MC/data values over the chosen, critical parameter regions.
Additional information regarding this recommendation is provided in the EM presentation.
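The MC/data metric and its average over a parameter region can be sketched as follows. This is only an illustration under stated assumptions: the point layout (x, MC value, MC error, data value, data error), the function names, the standard uncorrelated error propagation, and the unweighted mean are choices made for this example, since the slides do not specify how the averages are formed.

```python
import math


def mc_over_data(mc, mc_err, data, data_err):
    """Return (ratio, uncertainty) for one MC/data comparison point.

    The uncertainty assumes uncorrelated MC and data errors.
    """
    if data == 0.0:
        raise ValueError("data value must be non-zero")
    ratio = mc / data
    err = math.sqrt((mc_err / data) ** 2 + (mc * data_err / data ** 2) ** 2)
    return ratio, err


def average_ratio(points, lo, hi):
    """Unweighted mean of MC/data over points with lo <= x < hi.

    Each point is a tuple (x, mc, mc_err, data, data_err), where x is the
    parameter (for example an energy or angle) that defines the region.
    """
    ratios = [
        mc_over_data(mc, mc_err, data, data_err)[0]
        for (x, mc, mc_err, data, data_err) in points
        if lo <= x < hi
    ]
    if not ratios:
        raise ValueError("no points in the requested region")
    return sum(ratios) / len(ratios)
```

One row of a summary table would then hold the region boundaries and the value returned by `average_ratio`; a value close to 1 indicates agreement between simulation and experiment in that region.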
Recommendation 22: Easy access to validation results
“We recommend that all validation results, both the quantitative metrics and the underlying distributions, be made easily accessible to the user.”
Our preferred approach is to select from each release a subset of validations which demonstrates the state of the most-used models and physics lists. This is currently done:
– for hadronic physics at URL and at URL
– for electromagnetic physics at
Improvements that have been made since early 2007 include:
– the collection of hadronic validation results into two points of access (see above links), both linked to the Geant4 web pages
– the collection of electromagnetic results into a single point of access (see above link), linked to the Geant4 web pages
Recommendation 22 (cont.): Easy access to validation results
We see improving the accessibility of validation results as an important way to improve communication, both internally in Geant4 and with our users. Yet there are significant challenges, primarily due to the effort required to keep the validation results current and accessible:
– the effort to explain the comparisons for use by others,
– the need for improved automation in order to provide current comparisons.
Plans for 2009:
– Enable easier access to the majority of validation results from key validations, making them accessible from the main hadronic validation page:
   – low energy and cascade (test30) results, e.g. this plot
   – inclusive pion production (test35) results
   – total target yield of neutrons (test45) results
– Iterate on providing a first, simple explanation of results
– Continue to add new validation results as they become available