Geant4: Towards major release 10
Gabriele Cosmo, CERN PH/SFT
On behalf of the Geant4 Collaboration



Outline
- Introduction of multi-threading for event-level parallelism
  - Review of features
  - Performance measurements
- Highlights of new developments & features planned for 10.0
- For physics developments, see in the posters session:
  - "Geant4 Electromagnetic Physics for LHC Upgrade", V.Ivantchenko et al.
  - "Recent Developments in the Geant4 Hadronic Framework", W.Pokorski et al.
- Conclusions & final considerations
CHEP 2013, Amsterdam - 17 October 2013
Geant4 - Towards major release 10 - G.Cosmo

Geant4 10.0
- First major release since 2007
- Important modifications introduced to most classes
  - Adaptations to thread-safety for event-level parallelism
  - Additional API for user-action classes
  - Backwards compatible with old API in sequential mode
  - Major revision of internal data initialisation in all areas
  - Reviewed memory management
- New and extended features
- Removal of obsolete/deprecated code and interfaces
- May imply changes/adaptation to user's code

Multi-threading: from prototype to production …
- Capitalizing on the work started back in 2009 by X.Dong and G.Cooperman, Northeastern University
- Big effort brought to success
  - 10.0-beta announced on June 28th, on schedule
  - Final release expected for December 6th
- Timeline: G4MT 9.4 (2011), G4MT 9.5 (2012), G4 10.0-beta (Jun. 2013), G4 10.0 (Dec. 2013), G4 10 series (2014+)
- Stages along the timeline, in order: proof of principle, identify objects to be shared, first testing, MT code integrated into G4, API re-design, examples migration, further testing, first optimisations, public release, all functionalities ported to MT, further refinements, focus on further performance improvements

Multi-threading: 10.0 features - 1/2
- Event-level parallelism
- Each worker thread proceeds independently
  - Initializes its state from a master thread
  - Identifies its part of the work (events)
  - Generates hits in its own hits-collection
  - Uses thread-private objects and state
  - Shares read-only data structures (e.g. geometry, cross-sections, …)
  - Has its own read-write part in a few 'shared/split' objects
- Possibility to install/run Geant4 either in pure sequential or parallel (MT) mode
  - Choice at configuration/installation time
  - Sequential mode set as the default
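The worker model listed above can be illustrated with a minimal sketch in standard C++ threads. This is not Geant4 code: `Geometry`, `Hit` and `runEvents` are hypothetical names, the "simulation" is a toy formula, and the way workers claim events (a shared atomic counter) is an assumption chosen for brevity, not necessarily how Geant4 distributes events.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-ins; none of these types are Geant4 classes.
struct Geometry { double halfLength; };      // shared, read-only data
struct Hit { int eventId; double energy; };  // produced per event

// Each worker reads the shared geometry, claims event numbers from an
// atomic counter, and fills only its own (thread-private) hits collection.
std::vector<std::vector<Hit>> runEvents(const Geometry& geometry,
                                        int numEvents, int numThreads) {
    assert(numThreads > 0);
    std::vector<std::vector<Hit>> hitsPerThread(numThreads);
    std::atomic<int> nextEvent{0};

    auto worker = [&](int threadId) {
        std::vector<Hit>& myHits = hitsPerThread[threadId];
        for (int ev = nextEvent.fetch_add(1); ev < numEvents;
             ev = nextEvent.fetch_add(1)) {
            // Toy "simulation": the deposit depends on event id and geometry.
            myHits.push_back({ev, geometry.halfLength * (ev + 1)});
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();
    return hitsPerThread;
}
```

Calling `runEvents(Geometry{2.0}, 100, 4)` returns four hits collections that together contain one hit per event. No lock is needed on the hits because no two threads ever write to the same collection.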

Multi-threading: 10.0 features - 2/2
- Focus on "lock-free" code
  - Metrics currently in use: linearity of speed-up (w.r.t. #threads)
- Enforce use of POSIX standards to allow for integration with user preferred parallelization frameworks (e.g. TBB, MPI, …)
- Absolute throughput optimisations are ongoing and will follow
- Design aimed to minimize changes in users' code
  - Keep API changes at minimum
  - Allows for backwards compatibility

Multi-threading: Porting applications …
- Few changes needed in user code:
  1. Change main() to use G4MTRunManager – one line
  2. Create Sensitive Detector & Field in a new method
  3. Adapt to per-event RNG seeding (potential change)
  4. Check User 'Action' classes (Step, Track, Event)
- Choice - handling output: per thread or accumulate?
  - Geant4 automatically performs reductions (accumulation) when using scorers or G4Run derived classes
- Testing
  - Check output of runs – MT vs 1-thread vs Sequential
- See:
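Why per-event RNG seeding matters for the "MT vs 1-thread vs Sequential" check can be shown with a small sketch in plain C++. It is an illustration under stated assumptions, not the Geant4 seeding mechanism: `simulateEvent`, `runAll` and `reduceRun` are hypothetical names, and the seed derivation from (run seed, event id) is invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Toy event: the result depends only on a seed derived from the event id,
// never on which thread happens to process the event.
double simulateEvent(std::uint32_t runSeed, int eventId) {
    std::seed_seq seq{runSeed, static_cast<std::uint32_t>(eventId)};
    std::mt19937 engine(seq);
    return static_cast<double>(engine()) / 4294967296.0;  // value in [0, 1)
}

// Process all events with numThreads workers; thread t takes events
// t, t + numThreads, t + 2 * numThreads, ... and writes distinct slots.
std::vector<double> runAll(std::uint32_t runSeed, int numEvents,
                           int numThreads) {
    assert(numThreads > 0);
    std::vector<double> perEvent(numEvents, 0.0);
    auto worker = [&](int threadId) {
        for (int ev = threadId; ev < numEvents; ev += numThreads)
            perEvent[ev] = simulateEvent(runSeed, ev);
    };
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();
    return perEvent;
}

// Reduction (accumulation) over events in a fixed order, so the
// floating-point sum does not depend on thread scheduling.
double reduceRun(const std::vector<double>& perEvent) {
    double total = 0.0;
    for (double e : perEvent) total += e;
    return total;
}
```

With this scheme `runAll(42, 1000, 1)` and `runAll(42, 1000, 4)` yield identical per-event values, which is the property a reproducibility test compares. If the generator state were instead shared or seeded per thread, the result would depend on how events are distributed among workers.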

Multi-threading: Performance – 1/4
- Good efficiency and excellent linearity vs. number of threads (~95%)
- Extra gain factor from 1.1 to 1.5 in HT-mode on HT-capable hardware
- No measured CPU degradation vs. sequential runs (*)
(*) Based on performance analysis on full-CMS benchmark (last September development release of Geant4) by S.Yung Jun, FNAL, on AMD Opteron™ 6128, 32 cores
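The linearity metric can be made concrete with a short sketch. It assumes that the quoted "~95%" is a parallel efficiency, i.e. speed-up divided by number of threads; the slide does not spell out the definition, and the function names and timings below are hypothetical.

```cpp
#include <cassert>

// Speed-up of a multi-threaded run relative to a sequential run
// processing the same number of events.
double speedup(double sequentialSeconds, double parallelSeconds) {
    assert(parallelSeconds > 0.0);
    return sequentialSeconds / parallelSeconds;
}

// Parallel efficiency: 1.0 means perfectly linear scaling.
double efficiency(double sequentialSeconds, double parallelSeconds,
                  int numThreads) {
    assert(numThreads > 0);
    return speedup(sequentialSeconds, parallelSeconds) / numThreads;
}
```

For example, a job that takes 3200 s sequentially and 105 s on 32 threads has a speed-up of about 30.5 and an efficiency of about 0.95 under this definition.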

Multi-threading: Performance – 2/4
- Intel® Xeon Phi™ coprocessor (MIC) (*)
  - 60 cores (4 HW threads each), 16 GB RAM
- Excellent results: additional factor ~2 in events produced w.r.t. host only
- Confirmed good scalability up to 240 threads
  - Full physics: 50 GeV pions with B-field on
- Reduced use of memory (see next slide)
[Plot annotation: HT mode]
(*) Analysis on full-CMS benchmark on latest September development release by A.Dotti, SLAC

Multi-threading: Performance – 3/4
- Intel® Xeon Phi™ coprocessor
  - Using out-of-the-box 10.0-beta (i.e. no optimisations)
- ~40 MB/thread
  - Baseline: Full-CMS benchmark; 200 MB (geometry and physics)
- Speedup almost linear with reasonably small increase of memory usage
[Plot: memory usage (MB) vs. number of threads]
(*) Analysis on full-CMS benchmark for release 10.0-beta by A.Dotti, SLAC
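The memory figures can be turned into a rough estimate. The sketch below assumes a linear model, a shared baseline of 200 MB plus about 40 MB for each worker thread, which is one reading of the numbers on the slide rather than a measured fit. The comparison with one independent process per worker is likewise an assumption added for illustration, and both function names are hypothetical.

```cpp
#include <cassert>

// Estimated memory of one multi-threaded process: the read-only baseline
// (geometry and physics tables) is shared, each thread adds its own part.
double estimatedMemoryMB(int numThreads, double baselineMB = 200.0,
                         double perThreadMB = 40.0) {
    assert(numThreads >= 0);
    return baselineMB + perThreadMB * numThreads;
}

// Assumed alternative: one sequential process per worker, each holding
// its own copy of the baseline data.
double multiProcessMemoryMB(int numWorkers, double baselineMB = 200.0,
                            double perThreadMB = 40.0) {
    assert(numWorkers >= 0);
    return numWorkers * (baselineMB + perThreadMB);
}
```

Under this model 240 threads would need roughly 9800 MB, which fits in the 16 GB of RAM quoted for the coprocessor, whereas 240 independent processes would need about 57600 MB.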

Multi-threading: Performance – 4/4
- Exynos 4412 Prime quad-core 1.7 GHz, ARM Cortex A9 (*)
  - Based on latest September development release
  - Full-CMS benchmark with full physics (single 50 GeV) with B-field turned on
  - Each thread processing 100 events
- Still good linearity vs. number of working threads
- See also presentation by P.Elmer et al.: "Explorations of the viability of ARM and Intel Xeon Phi for Physics Processing"
(*) Preliminary analysis on full-CMS benchmark (last September development release of Geant4) by A.Dotti, SLAC

Multi-threading: Physics validation results …
- 20 GeV proton on W-LAr
  - Full showers simulated
  - FTFP_BERT physics-list
- Sequential: 5000 events
- Multi-threaded: events 4 threads; results for 1 thread shown
- Aiming for perfect reproducibility vs. sequential

Multi-threading: Next to come … - 1
- Review and further refinements to API
  - Based on feedback from users and Beta testers
  - Rationalisation and better modularisation of code for the initialisation of threads
  - Further simplification for user-code migration
- Further improve performance
  - Identify and solve hotspots
  - Investigate use of thread-private malloc (to remove hidden locks in new/delete)
  - Improve event throughput (inter-algorithm parallelism)

Multi-threading: Next to come … - 2
- Address and solve a few limitations & problems affecting version 10.0-beta
- Improve testing coverage
- Further investigations on task-based parallelism (TBB)
  - TBB works already with Geant4-MT
  - Provide one or two examples based on the new API
- Study heterogeneous parallelism (MPI together with multi-threading)
  - Use in hybrid systems (host + one [or more] MIC card)
- Adoption of check-pointing technique (DMTCP) to improve start-up time

Developments in release 10.0 …
Highlights on kernel modules

Geometry: 10.0-beta features
- Replaced UI commands for geometry overlaps check
  - Now based on built-in overlaps checking for random points generated on solids' surfaces
  - Now consistently working also for parameterised volumes
  - Possibility to tune resolution for the test and set tolerances
  - Possibility to define depth interval in geometrical tree
- Introduction of gravity field and magnetic field gradient
- Use of precise safety computation by default in navigation
- Archived obsolete BREPs classes and module
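The idea behind an overlap check based on random surface points can be sketched for the simplest possible solid. The code below is a toy illustration restricted to spheres, not the Geant4 implementation: `Sphere`, `isInside` and `countOverlappingPoints` are hypothetical names, and the "resolution" (number of sampled points) and "tolerance" parameters only mimic the tunable quantities mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Sphere { double cx, cy, cz, radius; };  // hypothetical toy solid

// True if the point lies inside the sphere by more than the tolerance.
bool isInside(const Sphere& s, double x, double y, double z,
              double tolerance) {
    const double dx = x - s.cx, dy = y - s.cy, dz = z - s.cz;
    return std::sqrt(dx * dx + dy * dy + dz * dz) < s.radius - tolerance;
}

// Sample 'resolution' random points on the surface of 'a' and count how
// many fall inside 'b'. A non-zero count signals an overlap (or that 'a'
// sits within 'b').
int countOverlappingPoints(const Sphere& a, const Sphere& b, int resolution,
                           double tolerance, unsigned seed) {
    std::mt19937 engine(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    int overlaps = 0;
    for (int i = 0; i < resolution; ++i) {
        // A normalised Gaussian vector is uniformly distributed on the
        // unit sphere.
        const double x = gauss(engine), y = gauss(engine), z = gauss(engine);
        const double norm = std::sqrt(x * x + y * y + z * z);
        if (norm == 0.0) continue;  // degenerate sample, skip it
        const double px = a.cx + a.radius * x / norm;
        const double py = a.cy + a.radius * y / norm;
        const double pz = a.cz + a.radius * z / norm;
        if (isInside(b, px, py, pz, tolerance)) ++overlaps;
    }
    return overlaps;
}
```

Two well-separated spheres give a count of zero, while two unit spheres whose centres are one radius apart give a count that grows with the resolution. A real check would also sample the second solid's surface and handle mother/daughter containment, which this sketch leaves out.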

Geometry: Geometrical primitives
- AIDA Unified Solids library integration
  - As optional component, for replacing the original solids
  - Provides optimised implementation for a large number of geometrical primitives and constructs
    - box, orb, sphere (+sphere section), tube (+cylindrical section), cone (+conical section), simple, generic & arbitrary trapezoid, tetrahedron, polycone, polyhedra, extruded solid, tessellated solid and new Multi-Union structure

Geometry: Unified Solids Library performance – a couple of examples …
- Significant speedup achieved for some shapes
- Tessellated shape: now making possible fine-grained tessellation

Multi-Union construct
  Method          Speedup
  Inside          2423x
  DistanceToIn    1334x
  DistanceToOut   1976x

Tessellated shape (LHCb VELO RF-foil)
  Information                                  Value
  Number of facets
  Number of voxels
  Memory saved compared with original Geant4   22% (51 MB)

More features … Highlights
- Adoption of fast mathematical functions for exp() and log()
  - Extracted from VDT library (D.Piparo et al.) & adapted
  - Expected CPU performance improvements
- Automatically generating isotope vector with natural abundances (NIST materials)
- Variables shadowing …
- Units & constants inclusion
- Enhanced CMake build system
  - Deprecated GNUMake based tools
- Redesigned examples (basic & extended)
  - Several examples migrated to support multi-threading
- Updated data sets
  - Ability to treat compressed data for G4NDL library
- New framework for "generic" biasing for physics-based biasing
  - Based on wrapper and helper classes

More features … Visualization & Analysis
- Improved Qt support & GUI
  - Ability to display in MT and sequential mode
- GL with no graphics card
  - To use for automated tests or launch GL graphics from batch
- See also: "Geant4 application in a Web browser", L.Garnier et al.
- Redesigned interfaces for analysis/histogramming; multi-thread capable
  - See poster: "Integration of g4tools in Geant4", I.Hrivnacova et al.

Summary
- Release 10.0 is going to introduce 'optional' event-level parallelism through use of independent working threads
  - Excellent scalability vs. #threads up to O(100) threads with no performance penalty vs. sequential mode
  - Physics validation tests done so far are positive
  - Aiming to achieve exact event reproducibility vs. sequential mode
  - Allowing for easy & smooth migration of users' code
- Lots of new features in all areas in view of the final release in December
- 10.0-beta notes:
- Work plan: