Vector Prototype Status
Philippe Canal (for the VP team)
Components
– Scheduler
– UGeom
– Physics Simulation
7/21/14 Vector Prototype Status
About Scheduler
– Data structures
– Baskets and basket management
– Basket managers (per LV)
– Track and basket lifecycle
– Transport (physics and geometry) and track phases
– Scheduler workflow
GeantTrack
Track identifiers – event, slot (memory management), track ID, PDG, G5 code
Particle identifiers – PDG, GeantV code, charge, mass, species
Kinematics – position, direction, momentum, energy
Status flags – status, N steps, N null steps, boundary flag, pending flag
Geometry/physics context – process, proposed step, current step, edep, distance to boundary, safety, current path, next path
[Figure: memory layout of GeantTrack and GeantTrack_v. GeantTrack fields: fEvent, fEvslot, fParticle, fPDG, …, fXpos, fYpos, fZpos, fXdir, fYdir, fZdir, …, Edep, Pstep, Snext, Safety, *fPath, *fNextpath. GeantTrack_v (SOA of fNtracks) arrays: fEventV, fEvslotV, fParticleV, fPDGV, …, fXposV, fYposV, fZposV, fXdirV, fYdirV, fZdirV, …, fEdepV, fPstepV, fSnextV, fSafetyV, fPathV, fNextpathV. Other labels: fBuffer, fNtracks=10, padding=32, vector 1, vector 2, GeantTrackPool, 192 bytes, TO FIX.]
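The SOA layout in the figure can be illustrated with a small sketch. It assumes that all arrays are carved out of one buffer and that each array starts on a 32-byte boundary, which is one reading of the "fBuffer" and "padding=32" labels. The names `TrackSoA`, `RoundUp`, `kPadding` and `fMaxtracks` are hypothetical, only a few of the fields are shown, and this is not the actual GeantTrack_v implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Round a byte count up to the next multiple of `align` (a power of two).
inline std::size_t RoundUp(std::size_t nbytes, std::size_t align) {
  return (nbytes + align - 1) & ~(align - 1);
}

// Hypothetical SOA container holding a handful of the fields from the slide.
struct TrackSoA {
  static constexpr std::size_t kPadding = 32;  // assumed meaning of "padding=32"

  int fNtracks = 0;         // number of tracks currently stored
  int fMaxtracks = 0;       // capacity of every array
  char *fBuffer = nullptr;  // single allocation backing all arrays
  int *fEventV = nullptr;
  double *fXposV = nullptr;
  double *fYposV = nullptr;
  double *fZposV = nullptr;
  double *fSnextV = nullptr;
  double *fSafetyV = nullptr;

  explicit TrackSoA(int maxtracks) : fMaxtracks(maxtracks) {
    assert(maxtracks > 0);
    const std::size_t n = static_cast<std::size_t>(maxtracks);
    const std::size_t intBytes = RoundUp(n * sizeof(int), kPadding);
    const std::size_t dblBytes = RoundUp(n * sizeof(double), kPadding);
    // The total is a multiple of kPadding, as std::aligned_alloc (C++17) requires.
    fBuffer = static_cast<char *>(std::aligned_alloc(kPadding, intBytes + 5 * dblBytes));
    assert(fBuffer != nullptr);
    // Hand out consecutive padded slices of the buffer to the arrays.
    char *cursor = fBuffer;
    fEventV = reinterpret_cast<int *>(cursor);
    cursor += intBytes;
    double **arrays[] = {&fXposV, &fYposV, &fZposV, &fSnextV, &fSafetyV};
    for (double **array : arrays) {
      *array = reinterpret_cast<double *>(cursor);
      cursor += dblBytes;
    }
  }
  ~TrackSoA() { std::free(fBuffer); }
  TrackSoA(const TrackSoA &) = delete;
  TrackSoA &operator=(const TrackSoA &) = delete;
};
```

With 10 tracks, as in the figure, each `double` array needs 80 bytes and is padded to 96, so consecutive arrays start 96 bytes apart and every array is 32-byte aligned.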
Track_v operations (overhead)
– Prerequisite for using vectorized methods: tracks must be contiguous at the beginning of the arrays (fEventV, fParticleV, …)
– Methods are called as Method(fXposV, …, fNtracks) or Method(GeantTrack_v &)
– During transport, tracks stop, leaving holes in the container
– Operations: Compact, Move
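A compaction step of this kind can be sketched as follows. The sketch assumes that holes are tracked with one flag per slot and that track order does not need to be preserved, so holes are filled with live tracks taken from the tail. `HoleyTracks`, `fHoles` and `MoveTrack` are hypothetical names and the container is reduced to two fields; this is not the GeantTrack_v code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced SOA: two fields plus a per-slot hole marker.
struct HoleyTracks {
  std::vector<int> fEventV;
  std::vector<double> fXposV;
  std::vector<bool> fHoles;  // true where a track was removed
  int fNtracks = 0;          // slots [0, fNtracks) are in use, holes included
};

// Copy every field of slot `from` into slot `to`.
inline void MoveTrack(HoleyTracks &t, int from, int to) {
  t.fEventV[to] = t.fEventV[from];
  t.fXposV[to] = t.fXposV[from];
}

// Fill holes with live tracks taken from the tail so that the remaining
// tracks are contiguous at the start of the arrays. Returns the new count.
inline int Compact(HoleyTracks &t) {
  assert(static_cast<std::size_t>(t.fNtracks) <= t.fHoles.size());
  int first = 0;
  int last = t.fNtracks - 1;
  while (true) {
    // Trailing holes are simply dropped.
    while (last >= 0 && t.fHoles[last]) {
      t.fHoles[last] = false;
      --last;
    }
    // Find the first hole in front of the last live track.
    while (first < last && !t.fHoles[first]) ++first;
    if (first >= last) break;
    MoveTrack(t, last, first);
    t.fHoles[first] = false;
    --last;
  }
  t.fNtracks = last + 1;
  return t.fNtracks;
}
```

Each hole costs at most one track move, which is the overhead the slide title refers to.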
Track_v operations (overhead)
– Track selection according to some criteria: Reshuffle
– Tracks have to be copied to a receiver during rescheduling: Copy
– Concurrency support
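The two operations can be sketched on a reduced container. The sketch assumes that Reshuffle gathers the selected tracks at the front of the arrays, so that a vector method can run on the first `nselected` entries, and that Copy appends a range of tracks to a receiving container. `SelectableTracks`, `fSelectedV`, `SwapTracks` and `CopyTracks` are hypothetical names, and the concurrency aspect mentioned on the slide is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reduced SOA: two fields plus a per-track selection flag.
struct SelectableTracks {
  std::vector<int> fEventV;
  std::vector<double> fXposV;
  std::vector<char> fSelectedV;  // non-zero when the track passes the criterion
  int fNtracks = 0;
};

// Swap every field of slots i and j.
inline void SwapTracks(SelectableTracks &t, int i, int j) {
  std::swap(t.fEventV[i], t.fEventV[j]);
  std::swap(t.fXposV[i], t.fXposV[j]);
  std::swap(t.fSelectedV[i], t.fSelectedV[j]);
}

// Move the selected tracks to the front; returns how many were selected.
// Selected tracks keep their relative order, the others may not.
inline int Reshuffle(SelectableTracks &t) {
  assert(static_cast<std::size_t>(t.fNtracks) <= t.fSelectedV.size());
  int nselected = 0;
  for (int i = 0; i < t.fNtracks; ++i) {
    if (!t.fSelectedV[i]) continue;
    if (i != nselected) SwapTracks(t, i, nselected);
    ++nselected;
  }
  return nselected;
}

// Append tracks [begin, end) of src to the receiver dst.
inline void CopyTracks(const SelectableTracks &src, int begin, int end,
                       SelectableTracks &dst) {
  assert(0 <= begin && begin <= end && end <= src.fNtracks);
  assert(static_cast<std::size_t>(dst.fNtracks) == dst.fEventV.size());
  for (int i = begin; i < end; ++i) {
    dst.fEventV.push_back(src.fEventV[i]);
    dst.fXposV.push_back(src.fXposV[i]);
    dst.fSelectedV.push_back(src.fSelectedV[i]);
    ++dst.fNtracks;
  }
}
```

A typical use would call `Reshuffle`, process the first `nselected` tracks in vector mode, and hand the rest to another container with `CopyTracks(t, nselected, t.fNtracks, receiver)`.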
GeantBasket
Elementary work unit for GeantV
– Baskets currently hold only tracks that are physically inside a given logical volume
– Input GeantTrack_v array, filled by the scheduler
– Output GeantTrack_v array, filled during transport
Baskets have thread-local access during transport, but concurrent access during scheduling
[Figure labels: Input, Scheduler, Transport, Physics, Output]
Automatic basket scheduling
Concurrent track addition, garbage collection, collection of tracks from prioritized events
Adjustable threshold
– T_vol = N_tracks_in_flight / (2 N_threads), rounded to a multiple of 4 (min 4, max 256)
[Figure labels: Volume, BM, fThreshold, current, empty, Basket pool, Transport queue, GeantScheduler, bottleneck]
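The threshold rule can be written as a few lines of arithmetic. The sketch reads the formula as N_tracks_in_flight divided by twice the number of threads, and it assumes that the rounding goes down to the nearest multiple of 4 before the limits are applied; the slide does not say which direction is used. `BasketThreshold` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Per-volume basket threshold: tracks in flight shared over 2 * nthreads,
// rounded to a multiple of 4 and limited to the range [4, 256].
inline int BasketThreshold(int ntracksInFlight, int nthreads) {
  assert(ntracksInFlight >= 0 && nthreads > 0);
  int threshold = ntracksInFlight / (2 * nthreads);
  threshold -= threshold % 4;  // assumption: round down to a multiple of 4
  return std::min(256, std::max(4, threshold));
}
```

For example, 1000 tracks in flight on 4 threads give 1000 / 8 = 125, which becomes 124 after rounding; very small and very large populations end up at the limits 4 and 256.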
Basket lifecycle
[Figure labels: empty, full, Basket pool, TGeoVolume, Basket manager, current, Generator, Scheduler, 1…N volumes, Transport queue, Propagator, transported, recycle, AddTrack, priority AddTrack, Push on threshold, Push on garbage collection, 1…N workers]
Track lifecycle
[Figure labels: PhysicsSelect (fProcessV[i], fPstepV[i]); PropagateTracks; Input tracks; Output tracks; kCrossing; kExiting; kPhysics; kKilled (geom); PostStep (continuous) (fXposV[i], …, fXdir[i], …, fPV[i], fEV[i]); PostStep (discrete); kNew; kKilled (phys)]
PropagateTracks
PostponedAction
– kVector – continue in vector mode
– kSingle – call PropagateTracksSingle at the given stage
– kPostpone – copy remaining tracks to output
MarkRemoved + Compact – compact holes and copy these tracks to the output
[Figure: vector loop over stage0, stage1, stage2. Labels, in order: ComputeTransportLength; FindNextBoundaryAndStep (fSnextV[i], fSafetyV[i]); Propagate Neutrals; kCrossing, kExiting, kPhysics; MarkRemoved, Compact(output); Propagate Safe<Pstep; kPhysics; Propagate close to boundary; kCrossing, kExiting; Propagate with safety]
Propagation to boundaries
Safety-based approach (algorithm very slow)
What is the step in a magnetic field that shifts the final particle position by no more than ε with respect to linear propagation?
– If the proposed step is within the isotropic safety: use safety
– Otherwise take into account only the safe_step value, in competition with the distance to boundary and the proposed step
C = 1/R, ε = 1 micron, safe_step = 2√(ε/C)
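The step limit can be evaluated numerically with a short sketch. It reads the slide's formula as safe_step = 2√(ε/C), which is the dimensionally consistent grouping: ε is a length and C an inverse length, so ε/C is a length squared. `SafeStep` and `CompetingStep` are hypothetical names, and only the "otherwise" branch of the rule is coded, as the smallest of the three competing lengths.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Step length in a magnetic field for a track with curvature C = 1/R and
// tolerance eps on the shift with respect to linear propagation.
// eps and 1/curvature must be expressed in the same length unit.
inline double SafeStep(double eps, double curvature) {
  assert(eps > 0. && curvature > 0.);
  return 2. * std::sqrt(eps / curvature);
}

// Assumed form of the competition: the step actually taken is the smallest
// of the safe step, the distance to the next boundary and the proposed step.
inline double CompetingStep(double safeStep, double snext, double pstep) {
  return std::min(safeStep, std::min(snext, pstep));
}
```

For ε = 1 micron = 1e-4 cm and a radius of curvature R = 100 cm (C = 0.01 cm⁻¹), this gives 2√(1e-2 cm²) = 0.2 cm, so tracks far from any boundary advance in steps of a few millimetres.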
Track stages
[Figure labels: Imported, Pending (threshold), Queued for pickup, Being transported, Queued to be dispatched, Scheduled, Basket manager, Transport queue, Generator, Basket transport, Scheduler queue, Scheduler dispatch, Priority dispatch]
Scheduler
Pulls transported baskets, dispatches tracks to basket managers per volume
– Not anymore!
Applying policies to:
– Provide work balancing (concurrency)
– Keep memory under control
– Keep the vectors up (most of the time)
Scheduler workflow
PE = prioritized event; PE range = event number range for priority events
[Figure: flowchart. Labels, in order: Recycle transported baskets; Event done?; Digitize event, ImportTracks; Last event done?; EXIT; Priority is ON?; Last PE done?; Stop priority mode; Queue flushed?; Flush priority baskets; Q size<min; Adjust basket size; Priority = ON, PE range = (last, last+4); Collect prioritized tracks (once); Empty Q?; Garbage collect; Check track counters; Digitize transported events and inject new events into released slot]
Priority mode: the scheduler puts all tracks from priority events into special baskets and injects them at every loop regardless of their content
Garbage collect mode, when the queue is empty: inject every basket regardless of its content
Monitoring
Main bottleneck: GeantObjectPool::Borrow/Return
Performance
1000 events with 100 tracks each, measured on a 24-core dual socket E GHz (IVB).
Physics Simulation Strategy
– Implement tabulated physics
  Backport to Geant4 as a single process (incorporating all implemented physics)
  Compare backported physics to regular G4
  – Both physics performance and run-time performance
  Then compare VP with tabulated physics against G4 with tabulated physics
– Implement vectorized physics
  Same scheme for verification
Tabulated Physics
Everything (except decay) is implemented both behind Geant4 (as a TotalPhysicsProcess) and behind VP
Simple final state correction is implemented
– scaling of the 3-momentum; not correct, but nothing else can be done for now
exampleN03
– exampleN03 can now be executed with the tabulated physics (default physics list TABPHYS) as well as with the FTFP_BERT, FTFP_BERT_HP and QBBC physics lists. The physics list is selected with the -p flag at execution.
– both production cuts (fixed to 1.0 [keV]) and tracking cuts are set (in energy) when exampleN03 is executed with one of the original Geant4 physics lists
– tracking cuts can be set with the -l flag at execution (for both the G4 and TABPHYS physics lists)
Geant4+FTFP_BERT vs Geant4+TABPHYS
First results using e− as primaries with energies of 30, 300, 3000, [MeV]:
– production and tracking cuts are the same and set in energy
– we don’t have range tables in the tabulated physics
– linLossLimit is set to 100% in Geant4
– fluctuations and decay are switched off in Geant4
Energy grid of tabulation:
– E_p = 30, 300, [MeV]: 1000 bins between 1.0 [keV] and 3.0 [GeV] (log scale) and 10 final states
– E_p = 30 [GeV]: 100 bins between 1.0 [keV] and 1.0 [TeV] (log scale) and 5 final states
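A logarithmic energy grid like the one described above can be sketched as follows. The sketch assumes that the bins are equally wide in ln(E) and that energies outside the grid are assigned to the first or last bin; how the tabulated physics actually stores and looks up its tables is not stated on the slide. `LogEnergyGrid`, `Edge` and `FindBin` are hypothetical names, and energies are taken in MeV.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: nbins logarithmically spaced bins between emin and emax.
struct LogEnergyGrid {
  double fEmin;      // lower edge of the first bin
  double fEmax;      // upper edge of the last bin
  int fNbins;        // number of bins
  double fLogDelta;  // bin width in ln(E)

  LogEnergyGrid(double emin, double emax, int nbins)
      : fEmin(emin), fEmax(emax), fNbins(nbins),
        fLogDelta(std::log(emax / emin) / nbins) {
    assert(emin > 0. && emax > emin && nbins > 0);
  }

  // Lower edge of bin i; Edge(fNbins) is the upper end of the grid.
  double Edge(int i) const { return fEmin * std::exp(i * fLogDelta); }

  // Index of the bin containing e, clamped to the grid range.
  int FindBin(double e) const {
    if (e <= fEmin) return 0;
    if (e >= fEmax) return fNbins - 1;
    const int ibin = static_cast<int>(std::log(e / fEmin) / fLogDelta);
    return std::min(ibin, fNbins - 1);
  }
};
```

With `LogEnergyGrid grid(1.e-3, 3.e3, 1000)`, which corresponds to 1000 bins between 1 keV and 3 GeV, a 30 MeV particle falls into bin 691, and the lookup costs one logarithm instead of a search through the bin edges.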
Next steps
Further data will be generated using Geant4
– switching on fluctuations
– setting linLossLimit back to 1-2 % and so on, to see these effects...
– Decay
Can start debugging the prototype!
First by comparing these simple statistics generated by Geant4+TABPHYS and GeantV+TABPHYS
Geometry
Vectorized Propagator implemented
Merg(ing) with Usolids
– Repository merged
– Backward compatible interface
Shapes
– Box
– Paraboloid
– Trapezoid
– Parallelepiped
– Tube
Coming soon
– Hyperboloid
– Polyhedra
– Orb
– TRD
Trapezoid
Connecting Geant-V and VecGeom
Geant-V could already use VecGeom in serial mode
Geant-V can now use VecGeom in vector mode
– added missing pieces in Geant-V: some thread-local data to provide workspace
– completed and tested vector navigation functionality in VecGeom
– connected the two
First glance at performance
Did some initial “valgrind --tool=callgrind” benchmarks of Geant-V
Used scheduler version with hard working scheduler thread (on 4+1 threads)
– 200 events
– Ex03 geometry
– SSE instructions
Major CPU users
– 13 % log
– 13 % Geometry Navigation
– 5 % memcpy
Current influence of VecGeom on overall performance
Geometry Integration Next Steps
Performance tuning of vector navigation
– global to local transformation not yet optimal
– a couple of other ideas
Comparison to tabulated Geant4 simulation
Gradually more complicated geometries (we should now put tubes, traps, ...)
Summary
Progress on all 3 parts in both:
– Performance
– Breadth
Moving along toward silver bullet measurement