
Preliminary Ideas for a New Project Proposal

 Motivation
 Vision
 More details
 Impact for Geant4
 Project and Timeline

P. Mato/CERN 2

 For the last 40 years HEP event processing frameworks have had the same structure
◦ initialize; loop events { loop modules { … } }; finalize
◦ O-O has not added anything substantial
◦ It is simple, intuitive, easy to manage, and scalable
 Current frameworks were designed in the late 1990s
◦ We now know better what is really needed
◦ Unnecessary complexity impacts performance
 Inadequate for the many-core era
◦ Multi-process, multi-threading, GPUs, vectorization, etc.
◦ The one-job-per-core model scales well but requires too much memory and sequential file merging
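The classic structure (initialize; loop events { loop modules { … } }; finalize) can be sketched in a few lines. This is a minimal illustration in Python; the Module base class and the Doubler example are invented for the sketch and do not come from any actual framework:

```python
class Module:
    """One processing step, with the three classic framework hooks."""
    def initialize(self): pass
    def process(self, event): return event
    def finalize(self): pass

class Doubler(Module):
    """Toy module: 'processes' an event by doubling it."""
    def process(self, event): return event * 2

def run_job(modules, events):
    # initialize; loop events { loop modules { ... } }; finalize
    for m in modules:
        m.initialize()
    out = []
    for event in events:      # event loop
        for m in modules:     # module loop: each module transforms the event
            event = m.process(event)
        out.append(event)
    for m in modules:
        m.finalize()
    return out
```

Every module sees one event at a time and the whole job is strictly sequential, which is exactly why this structure is simple and scalable to manage but leaves extra cores idle.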

 Same framework for simulation, reconstruction, analysis and high-level trigger applications
 Common framework for use by any experiment
 Decomposition of the processing of each event into ‘chunks’ that can be executed concurrently
 Ability to process several [many] events concurrently
 Optimized scheduling and associated data structures
 Minimize any processing requiring exclusive access to resources, because it breaks concurrency
 Support for various hardware/software technologies
 Facilitate the integration of existing LHC application code (the algorithmic part)
 Quick delivery of running prototypes: the 18-month LHC shutdown is the opportunity

 The current frameworks used by the LHC experiments support all data processing applications
◦ High-level trigger, reconstruction, analysis, etc.
◦ Nothing really new here
 But simulation applications are designed with one big ‘chunk’ in which all the Geant4 processing happens
◦ We aim to improve full and fast simulation using the common set of services and infrastructure
◦ See later the implications for Geant4
 Running on the major platforms
◦ Linux, MacOSX, Windows

 Frameworks can be shared between experiments
◦ E.g. Gaudi is used by LHCb, ATLAS, HARP, MINERVA, GLAST, BES3, etc.
 We can do better this time :-)
◦ Expect to work closely with the LHC experiments
◦ Aim to support at least ATLAS and CMS
 Special emphasis on requirements from:
◦ New experiments
 E.g. Linear Collider, SuperB, etc.
◦ Different processing paradigms
 E.g. fixed-target experiments, astroparticle physics

 Framework with the ability to schedule concurrent tasks
◦ Full data-dependency analysis would be required (no global data or hidden dependencies)
 Not much gain expected with today’s ‘chunks’ as currently designed
◦ See the CMS estimates at CHEP’10
◦ Algorithm decomposition can be influenced by the framework capabilities
 ‘Chunks’ could be processed by different hardware/software
◦ CPU, GPU, threads, processes, etc.

[Diagram: event processing as Input, Processing and Output chunks laid out over time]
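As an illustration of what such a data-dependency analysis could look like, here is a hypothetical Python sketch (the chunk registry format is invented for this example) that derives, purely from each chunk's declared inputs and outputs, which groups of chunks may run concurrently:

```python
def schedule_levels(chunks):
    """chunks: {name: (inputs, outputs)}. Returns successive groups of
    chunks that may execute concurrently, in dependency order."""
    produced = set()            # data labels available so far
    remaining = dict(chunks)
    levels = []
    while remaining:
        # a chunk is ready once all its declared inputs exist
        ready = [name for name, (ins, _) in remaining.items()
                 if set(ins) <= produced]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        levels.append(sorted(ready))
        for name in ready:
            produced |= set(remaining.pop(name)[1])
    return levels

# Hypothetical reconstruction chunks and the data they declare:
chunks = {
    "read": ([], ["raw"]),
    "trk":  (["raw"], ["tracks"]),
    "calo": (["raw"], ["clusters"]),
    "vtx":  (["tracks", "clusters"], ["vertices"]),
}
```

With these declarations, `schedule_levels(chunks)` yields `[["read"], ["calo", "trk"], ["vtx"]]`: tracking and calorimetry could run in parallel, which is exactly the information a hidden dependency or a piece of global state would destroy.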

 Need to deal with the tails of sequential processing
◦ See Rene’s presentation
 Introducing pipeline processing
◦ Never tried before!
◦ Exclusive access to resources can be pipelined, e.g. file writing
 Need to design a very powerful scheduler

 Start with a sea of ‘chunks’
 Then combine them according to their required inputs/outputs
◦ Inputs/outputs define the dependencies => solve them
 Organize the ‘chunks’ in queues according to their ‘state’
◦ Running, Ready, Waiting, etc.

[Diagram: an Input Module feeding Processors 1–3 and a histogramming module, connected through their In/Out ports]

Markus Frank’s design ideas

[Diagram: the Dataflow Manager connected to an Input Processor and multiple Executor (wrapper) instances through input and output ports]

 The dataflow manager knows, for each executor:
◦ its input data (mandatory)
◦ its output data
 Whenever new data arrives:
◦ evaluate which executors fit the new data content
◦ hand each such executor to a worker thread
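A toy sketch of this data-driven evaluation (Python; the executor registry and data labels are made up for illustration): whenever a new data item is recorded, the manager re-checks which executors now have all their mandatory inputs and runs them. In a real framework the dispatch would hand the executor to a worker thread rather than call it inline.

```python
class DataflowManager:
    """Keeps, for each executor, its mandatory inputs; dispatches an
    executor as soon as all of them are present in the event store."""
    def __init__(self, executors):
        self.executors = dict(executors)   # name -> (inputs, function)
        self.store = {}                    # data label -> value
        self.log = []                      # dispatch order, for inspection

    def on_new_data(self, label, value):
        self.store[label] = value
        for name, (needs, func) in list(self.executors.items()):
            if name not in self.executors:
                continue                   # dispatched during a recursive call
            if all(n in self.store for n in needs):
                del self.executors[name]   # each executor runs once per event
                self.log.append(name)
                out_label, out_value = func(self.store)
                # new output may in turn enable further executors
                self.on_new_data(out_label, out_value)
```

For example, registering a hypothetical "track" executor needing "raw" and a "vertex" executor needing "tracks", then announcing the raw data, triggers both in dependency order without any explicit schedule being written down.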

[Diagram: the Dataflow Manager dispatching data from the Input Processor to several Executors through their I/O ports]

 Task: a formal unit of work to be given to a worker thread
 To schedule a task:
◦ acquire a worker from the idle queue
◦ attach the task to the worker
◦ start the worker
 Once finished:
◦ put the worker back in the idle queue
◦ the task goes back to the ‘sea’
◦ check the work queue for rescheduling

[Diagram: the Dataflow Manager (Scheduler) with idle and busy worker queues and a queue of waiting work]

Markus Frank’s design ideas
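The worker bookkeeping described above can be sketched as follows (a sequential Python stand-in; real workers would be threads, and the names are illustrative only):

```python
from collections import deque

class Scheduler:
    """Tracks workers in idle/busy queues and parks tasks when no
    worker is free, as in the scheduling steps listed above."""
    def __init__(self, n_workers):
        self.idle = deque(f"worker-{i}" for i in range(n_workers))
        self.busy = {}            # worker -> its current task
        self.waiting = deque()    # tasks with no free worker yet

    def submit(self, task):
        if self.idle:
            worker = self.idle.popleft()   # acquire worker from idle queue
            self.busy[worker] = task       # attach task to worker
            return worker                  # (a real scheduler starts it here)
        self.waiting.append(task)          # no worker free: queue the task
        return None

    def finished(self, worker):
        self.busy.pop(worker)
        self.idle.append(worker)           # worker back to the idle queue
        if self.waiting:                   # check work queue for rescheduling
            return self.submit(self.waiting.popleft())
        return None
```

With a single worker, submitting two tasks parks the second one, and completing the first immediately reschedules it onto the freed worker.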

 Resource ‘locking’ can strongly limit parallelism
 Need to restrict/limit access to some resources
◦ E.g. database connections, files open for writing, shared memory for writing, etc.
 The blocking time should be reduced to the bare minimum
 Best would be to have only one processing instance accessing these resources
◦ The ‘only-one-writer’ rule
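The ‘only-one-writer’ rule can be sketched with standard Python threads and a queue: many producers hand their output to a single writer thread, so no lock on the output resource is ever contended. A plain list stands in for the output file; the function and names are invented for the sketch.

```python
import threading, queue

def run_single_writer(n_producers, items_per_producer, sink):
    """Funnel all output through one writer thread ('only-one-writer')."""
    q = queue.Queue()

    def producer(pid):
        for i in range(items_per_producer):
            q.put((pid, i))          # hand off; no output access here

    def writer():
        while True:
            item = q.get()
            if item is None:         # sentinel: all producers are done
                return
            sink.append(item)        # only this thread touches the 'file'

    w = threading.Thread(target=writer)
    w.start()
    producers = [threading.Thread(target=producer, args=(p,))
                 for p in range(n_producers)]
    for t in producers: t.start()
    for t in producers: t.join()
    q.put(None)                      # tell the writer to finish
    w.join()
```

The producers block only for the cheap queue hand-off, not for the write itself, which is the sense in which exclusive access is ‘pipelined’ rather than locked.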

 Concrete algorithms can be parallelized with some effort
◦ Making use of threads, OpenMP, MPI, GPUs, etc.
◦ But it is difficult to integrate them in a complete application
 E.g. MT-G4 with parallel Gaudi
◦ Performance-wise it only makes sense to parallelize the complete application, not only parts of it
 Developing and validating parallel code is difficult
◦ ‘Physicists’ should be spared from this
◦ In any case, concurrency will limit what can and cannot be done in algorithmic code
 At the framework level you have the overall view and control of the application

 Re-engineer G4 to use the new framework
◦ Make use of the common set of foundation packages (math, vectors, utility classes, etc.)
◦ With this we get an effortless integration with non-G4-core functionality (visualization, I/O, configurability, interactivity, analysis, etc.)
 Concurrent processing of sets of ‘tracks’ and ‘events’
◦ Development of Rene’s idea of ‘baskets’ of particles organized by particle type, volume shape, etc.
◦ Need to develop an efficient summing (‘reduce’) of the results
◦ Study the reproducibility of results (random-number sequences)
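The reproducibility point can be illustrated with a small sketch (Python; the simulation function is a toy stand-in): if the random-number generator is seeded per event rather than shared across threads, the results are identical whatever order the events are processed in, and a deterministic ‘reduce’ step then assembles them in event-number order.

```python
import random

def simulate_event(event_id, n_tracks=5):
    """Toy 'simulation': a per-event generator seeded from the event id,
    so each event's random sequence is independent of execution order."""
    rng = random.Random(event_id)
    return [rng.random() for _ in range(n_tracks)]

def reduce_results(per_event):
    """Deterministic reduce: assemble in event-number order, whatever
    the completion order was."""
    return [per_event[eid] for eid in sorted(per_event)]
```

Processing the events forwards or backwards (a stand-in for any concurrent completion order) gives bit-identical output, which is what makes validation of a concurrent simulation tractable.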

 Major cleanup of obsolete physics and functionality
◦ Needed in any case for 15-year-old software
 Ability to run full and fast MC together using common infrastructure (e.g. geometry, conditions, etc.)
◦ Today’s frameworks allow running, e.g., different ‘tracking algorithms’ in the same program
◦ Defining clearly the input and output types

 Collaboration of CERN with FNAL, DESY and possibly other labs
◦ Start with a small number of people (at the beginning)
◦ Open to people willing to collaborate
◦ Strong interaction with ATLAS and CMS (and others)
 E.g. instrumentation of existing applications to provide requirements
◦ Strong collaboration with the Geant4 team
 Quick delivery of running prototypes (I and II)
◦ First prototype in 12 months :-)
 Agile project management with ‘short’ cycles
◦ Weekly meetings to review progress and update plans

 We need to evaluate some of the existing technologies and design partial prototypes of critical parts
◦ Examples: OpenCL, impact of vectorization, transactional memory, fast scheduling algorithms, etc.
 The idea would be to organize these R&D activities in short cycles
◦ Coordinating the interested people to cover all aspects
◦ Coming to conclusions (yes/no) within a few months

[Timeline diagram, from today through the LHC shutdown:]
◦ Project definition
◦ R&D, technology evaluation and design of critical parts
◦ Complete Prototype I
◦ Initial adaptation of the LHC and Geant4 applications
◦ Complete Prototype II, with the experience of porting the LHC applications
◦ First production-quality release