
1 Preliminary Ideas for a New Project Proposal

2
▪ Motivation
▪ Vision
▪ More details
▪ Impact for Geant4
▪ Project and Timeline
P. Mato/CERN

3
▪ For the last 40 years HEP event processing frameworks have had the same structure
  ◦ initialize; loop events {loop modules {…} }; finalize
  ◦ O-O has not added anything substantial
  ◦ It is simple, intuitive, easy to manage, scalable
▪ Current frameworks were designed in the late 1990s
  ◦ We now know better what is really needed
  ◦ Unnecessary complexity hurts performance
▪ Inadequate for the many-core era
  ◦ Multi-process, multi-threads, GPUs, vectorization, etc.
  ◦ The one-job-per-core model scales well but requires too much memory and sequential file merging

4
▪ Same framework for simulation, reconstruction, analysis, and high-level trigger applications
▪ Common framework for use by any experiment
▪ Decomposition of the processing of each event into ‘chunks’ that can be executed concurrently
▪ Ability to process several [many] events concurrently
▪ Optimized scheduling and associated data structures
▪ Minimize any processing that requires exclusive access to resources, because it breaks concurrency
▪ Support for various hardware/software technologies
▪ Facilitate the integration of existing LHC application code (algorithmic part)
▪ Quick delivery of running prototypes: the opportunity of the 18-month LHC shutdown

5
▪ Current frameworks used by LHC experiments support all data processing applications
  ◦ High-level trigger, reconstruction, analysis, etc.
  ◦ Nothing really new here
▪ But simulation applications are designed with one big ‘chunk’ in which all Geant4 processing happens
  ◦ We want to improve the full and fast simulation using the set of common services and infrastructure
  ◦ See later the implications for Geant4
▪ Running on the major platforms
  ◦ Linux, MacOSX, Windows

6
▪ Frameworks can be shared between experiments
  ◦ E.g. Gaudi is used by LHCb, ATLAS, HARP, MINERVA, GLAST, BES3, etc.
▪ We can do better this time :-)
  ◦ Expect to work closely with LHC experiments
  ◦ Aim to support ATLAS and CMS at least
▪ Special emphasis on requirements from:
  ◦ New experiments
    ▪ E.g. Linear Collider, SuperB, etc.
  ◦ Different processing paradigms
    ▪ E.g. fixed-target experiments, astroparticles

7
▪ Framework with the ability to schedule concurrent tasks
  ◦ Full data dependency analysis would be required (no global data or hidden dependencies)
▪ Not much gain expected with ‘chunks’ as they are designed today
  ◦ See CMS estimates at CHEP’10 (*)
  ◦ Algorithm decomposition can be influenced by the framework capabilities
▪ ‘Chunks’ could be processed by different hardware/software
  ◦ CPU, GPU, threads, processes, etc.
[Figure: Input, Processing, and Output stages along a Time axis]

8
▪ Need to deal with tails of sequential processing
  ◦ See Rene’s presentation (*)
▪ Introducing pipeline processing
  ◦ Never tried before!
  ◦ Exclusive access to resources can be pipelined, e.g. file writing
▪ Need to design a very powerful scheduler

9
▪ Start with a sea of ‘chunks’
▪ Then combine them according to their required inputs/outputs
  ◦ Inputs/outputs define dependencies => solve them
▪ Organize the ‘chunks’ in queues according to their ‘state’
  ◦ Running, Ready, Waiting, etc.
[Figure: Input Module, Processors 1 to 3 with In/Out ports, and Histogram 1]
Markus Frank’s design ideas
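The dependency-solving step can be illustrated with a minimal Python sketch. It assumes each ‘chunk’ declares the data it reads and writes; the data names (`raw`, `hits`, `clusters`, `tracks`) and the function `concurrent_waves` are hypothetical and only loosely follow the module names in the figure. It groups chunks into ‘waves’ whose members have all inputs available and could therefore run concurrently. This is one simple way to resolve the dependencies, not the scheduler proposed here.

```python
# Hypothetical chunk declarations: name -> (required inputs, produced outputs).
CHUNKS = {
    "InputModule": (set(), {"raw"}),
    "Processor1": ({"raw"}, {"hits"}),
    "Processor2": ({"raw"}, {"clusters"}),
    "Processor3": ({"hits", "clusters"}, {"tracks"}),
    "Histogram1": ({"tracks"}, set()),
}


def concurrent_waves(chunks):
    """Group chunks into waves; all chunks in one wave can run concurrently."""
    available = set()       # data products produced so far
    pending = dict(chunks)  # chunks still in the 'Waiting' state
    waves = []
    while pending:
        # A chunk is 'Ready' once every input it needs is available.
        ready = sorted(n for n, (ins, _) in pending.items() if ins <= available)
        if not ready:
            raise ValueError(f"unsatisfiable inputs for: {sorted(pending)}")
        for name in ready:
            available |= pending.pop(name)[1]
        waves.append(ready)
    return waves


waves = concurrent_waves(CHUNKS)
for number, wave in enumerate(waves):
    print(number, wave)
```

In this toy configuration Processor1 and Processor2 land in the same wave because both depend only on the input data.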

10
[Figure: Dataflow Manager, InputProcessor, and multiple Processor instances, each wrapped in an Executor with an input port and an output port]

11
▪ Dataflow manager
▪ Knows, for each Executor:
  ◦ Input data (mandatory)
  ◦ Output data
▪ Whenever new data arrives
  ◦ Evaluates which executor fits the new data content
  ◦ Gives the executor to a worker thread
[Figure: Dataflow Manager and InputProcessor]
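A minimal Python sketch of this data-driven dispatch, under stated assumptions: the class `Executor`, the function `run_dataflow`, and the data names are hypothetical, and a standard-library thread pool stands in for the worker threads. Each time a worker finishes and new data lands in the store, the manager re-evaluates which pending executors now have all their mandatory inputs.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class Executor:
    """Wraps a processor together with the data it needs and produces."""

    def __init__(self, name, inputs, outputs, func):
        self.name = name
        self.inputs = set(inputs)    # mandatory input data
        self.outputs = set(outputs)  # declared output data (informational here)
        self.func = func             # takes a dict of inputs, returns a dict of outputs


def run_dataflow(executors, initial_data, n_workers=4):
    store = dict(initial_data)  # data available so far
    pending = list(executors)   # executors still waiting for data
    running = {}                # future -> executor currently on a worker
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while pending or running:
            # Give every executor that fits the available data to a worker.
            for ex in [e for e in pending if e.inputs.issubset(store)]:
                pending.remove(ex)
                args = {key: store[key] for key in ex.inputs}
                running[pool.submit(ex.func, args)] = ex
            if not running:
                names = sorted(e.name for e in pending)
                raise RuntimeError(f"inputs never become available for: {names}")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                running.pop(future)
                store.update(future.result())  # new data arrives
    return store


executors = [
    Executor("count", {"hits"}, {"n_hits"}, lambda d: {"n_hits": len(d["hits"])}),
    Executor("reco", {"raw"}, {"hits"}, lambda d: {"hits": [2 * x for x in d["raw"]]}),
    Executor("sum", {"raw"}, {"raw_sum"}, lambda d: {"raw_sum": sum(d["raw"])}),
]
store = run_dataflow(executors, {"raw": [1, 2, 3]})
print(store)
```

The order in which executors are listed does not matter; `count` only starts once `reco` has delivered `hits`.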

12
[Figure: Dataflow Manager, InputProcessor, and three Executors with I/O]

13
▪ Task: formal workload to be given to a worker thread
▪ To schedule a Task
  ◦ Acquire a worker from the idle queue
  ◦ Attach the task to the worker
  ◦ Start the worker
▪ Once finished
  ◦ Put the worker back in the idle queue
  ◦ Put the Task back in the “sea”
  ◦ Check the work queue for rescheduling
[Figure: Dataflow Manager (Scheduler) with an idle queue of Workers, a busy queue of Workers with attached Tasks, and waiting work]
Markus Frank’s design ideas
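The acquire/attach/start cycle can be sketched in Python with an explicit idle queue. This is only an illustration under simplifying assumptions: workers are represented by plain labels, each task runs in a fresh thread, and the function name `schedule` is hypothetical. Blocking on the idle queue plays the role of checking the work queue for rescheduling.

```python
import queue
import threading


def schedule(tasks, n_workers=2):
    """Run (name, callable) tasks using a fixed set of workers from an idle queue."""
    idle = queue.Queue()
    for index in range(n_workers):
        idle.put(f"worker-{index}")
    results = {}
    lock = threading.Lock()
    threads = []

    def run(worker, name, func):
        try:
            value = func()
            with lock:
                results[name] = (worker, value)
        finally:
            idle.put(worker)  # once finished, the worker goes back to the idle queue

    for name, func in tasks:  # the waiting work
        worker = idle.get()   # acquire a worker; blocks until one is idle
        thread = threading.Thread(target=run, args=(worker, name, func))  # attach task
        thread.start()        # start the worker
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results, idle.qsize()


tasks = [(f"task-{i}", lambda i=i: i * i) for i in range(5)]
results, idle_workers = schedule(tasks)
print(results, idle_workers)
```

At most `n_workers` tasks are in flight at any time, and all workers are back in the idle queue when the work is done.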

14
▪ Resource ‘locking’ can strongly limit parallelism
▪ Need to restrict/limit access to some resources
  ◦ E.g. database connections, files for writing, shared memory for writing, etc.
▪ The blocking time should be reduced to the bare minimum
▪ Best would be to have only one processing instance accessing these resources
  ◦ ‘Only-one-writer’ rule
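One common way to realize an ‘only-one-writer’ rule is to let a single thread own the resource and have all other threads hand their output to it through a queue. The Python sketch below assumes this pattern; the function name and record format are hypothetical, and an in-memory buffer stands in for the output file. Producers only block for the short time it takes to enqueue a record.

```python
import io
import queue
import threading


def run_with_single_writer(n_workers=4, events_per_worker=3):
    out = io.StringIO()      # stands in for the output file
    records = queue.Queue()  # hand-off between the workers and the writer
    stop = object()          # sentinel telling the writer to finish

    def writer():
        # The only code that ever touches `out`.
        while True:
            item = records.get()
            if item is stop:
                break
            out.write(item + "\n")

    def worker(worker_id):
        for event in range(events_per_worker):
            records.put(f"worker={worker_id} event={event}")

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    workers = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for thread in workers:
        thread.start()
    for thread in workers:
        thread.join()
    records.put(stop)  # all producers are done, so the sentinel is the last item
    writer_thread.join()
    return out.getvalue().splitlines()


lines = run_with_single_writer()
print(len(lines))
```

No lock around the output is needed because no other thread can reach it.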

15
▪ Concrete algorithms can be parallelized with some effort
  ◦ Making use of threads, OpenMP, MPI, GPUs, etc.
  ◦ But it is difficult to integrate them into a complete application
    ▪ E.g. MT-G4 with Parallel Gaudi
  ◦ Performance-wise, it only makes sense to parallelize the complete application, not just parts of it
▪ Developing and validating parallel code is difficult
  ◦ ‘Physicists’ should be spared this
  ◦ In any case, concurrency will limit what can and cannot be done in algorithmic code
▪ At the framework level you have the overall view and control of the application

16
▪ Re-engineer G4 to use the new framework
  ◦ Make use of the common set of foundation packages (math, vectors, utility classes, etc.)
  ◦ With this we can get an effortless integration with non-G4-core functionality (visualization, I/O, configurability, interactivity, analysis, etc.)
▪ Concurrent processing of sets of ‘tracks’ and ‘events’
  ◦ Development of Rene’s ideas of ‘baskets’ of particles organized by particle type, volume shape, etc.
  ◦ Need to develop an efficient summing (‘reduce’) of the results
  ◦ Study reproducibility of results (random number sequence)
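The three concerns in the last bullet group (baskets, reduce, reproducibility) can be shown together in a toy Python model. It is not Geant4 code and not the basket design referred to above: the track format, the smearing factor, and all function names are assumptions made for illustration. Each basket gets its own random number generator seeded from the event seed and the basket key, and the partial results are combined with an exactly rounded sum, so the event result does not depend on the order in which baskets are processed.

```python
import math
import random
from collections import defaultdict


def make_baskets(tracks):
    """Group (particle_type, energy) tracks into baskets keyed by particle type."""
    baskets = defaultdict(list)
    for particle_type, energy in tracks:
        baskets[particle_type].append(energy)
    return baskets


def process_basket(event_seed, particle_type, energies):
    # A per-basket generator makes the random sequence independent of scheduling.
    rng = random.Random(f"{event_seed}:{particle_type}")
    # Toy 'physics': each track deposits between 90% and 100% of its energy.
    return sum(energy * rng.uniform(0.9, 1.0) for energy in energies)


def simulate_event(tracks, event_seed, basket_order=None):
    baskets = make_baskets(tracks)
    keys = basket_order or sorted(baskets)
    partial = [process_basket(event_seed, key, baskets[key]) for key in keys]
    # 'Reduce' step: math.fsum is exactly rounded, hence order-independent.
    return math.fsum(partial)


tracks = [("e-", 1.0), ("gamma", 2.0), ("e-", 0.5), ("pi+", 3.0)]
forward = simulate_event(tracks, event_seed=42)
shuffled = simulate_event(tracks, event_seed=42, basket_order=["pi+", "gamma", "e-"])
print(forward == shuffled)
```

With a single generator shared across baskets, the same comparison would generally fail, because the numbers drawn by each basket would depend on processing order.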

17
▪ Major cleanup of obsolete physics and functionality
  ◦ Needed in any case for 15-year-old software
▪ Ability to run full and fast MC together using common infrastructure (e.g. geometry, conditions, etc.)
  ◦ Today’s frameworks allow running e.g. different ‘tracking algorithms’ in the same program
  ◦ Defining clearly the input and output types

18
▪ Collaboration of CERN with FNAL, DESY and possibly other labs
  ◦ Start with a small number of people
  ◦ Open to people willing to collaborate
  ◦ Strong interactions with ATLAS and CMS (and others)
    ▪ E.g. instrumentation of existing applications to provide requirements
  ◦ Strong collaboration with the Geant4 team
▪ Quick delivery of running prototypes (I and II)
  ◦ First prototype in 12 months :-)
▪ Agile project management with ‘short’ cycles
  ◦ Weekly meetings to review progress and update plans

19
▪ We need to evaluate some of the existing technologies and design partial prototypes of critical parts
  ◦ Examples: OpenCL, impact of vectorization, transactional memory, fast scheduling algorithms, etc.
▪ The idea would be to organize these R&D activities in short cycles
  ◦ Coordinating the interested people to cover all aspects
  ◦ Reaching conclusions (yes/no) within a few months

20
[Figure: project timeline, 2011 to 2014, with markers for ‘Today’ and the LHC shutdown. Labels shown:]
▪ Project definition
▪ R&D, technology evaluation and design of critical parts
▪ Complete Prototype I
▪ Initial adaptation of LHC and Geant4 applications
▪ Complete Prototype II with experience of porting LHC applications
▪ First production quality release

