Trigger Software Upgrades
John Baines, Tomasz Bold, Joerg Stelzer, Werner Wiedenmann

Trigger Software Upgrades Meetings

Purpose of these meetings:
- Forum to bring together people working on Phase-I Trigger software upgrades targeted at Run 3.
- Coordinate HLT work on frameworks and on the exploitation of new technologies.

Organisation & Meetings

Trigger Core Software
- Covers both operations & upgrade.
- Meetings: Fridays 15:00 (chairs: Joerg Stelzer, Attila Krasznahorkay, Werner Wiedenmann).
- Periodic meetings will be dedicated to Software Upgrades (chaired by Tomasz & John); currently planned: 19 Sep, 5 Dec (software weeks), other dates as needed.

DAQ/HLT Software and Operations
- Covers both operations & upgrade.
- Meetings: Thursdays 14:00 (chairs: Rainer Hauser, Wainer Vandelli).

Meeting in Copenhagen TDAQ week
- Parallel session Tuesday 15th July.
- Discussion session focusing on:
  - Online/HLT interface: present and past experience & discussion of implications for new framework requirements.
  - Accelerators: how to quantify benefits & cost, including the cost of additional online complexity.

Motivation for Trigger Software Upgrades

Meet physics requirements within online & offline resource constraints:
- Cleverer selections to maintain HLT rejection.
- Faster code that fully exploits the capability of the farm hardware.

HLT upgrades to match Detector & L1 upgrades:
- FTK, Muon New Small Wheel, L1Topo.

Exploit technology evolution:
- Increased number of cores: it may no longer be possible to run an application per core.
- Possible trend towards a larger number of small, low-power cores with less memory, either instead of or in addition to larger CPU cores.
- Availability of more specialised hardware, e.g. GPGPUs.
- Evolution of compilers, libraries etc.

Upgrade Work Packages

The TDAQ Phase-I TDR defines "Trigger" and "Online" work packages. In practice they are closely coupled:

Online:
- HLT Processing Unit: evaluate & exploit new technologies
- Online core software, infrastructure
- Configuration, control, monitoring
- Dataflow, event format
- Detector software & tools

Trigger:
- Trigger core software: evaluate & exploit new technologies
- Menus & algorithms
- Simulation

TDAQ Phase-I upgrade TDR work packages: Trigger Core Software, DAQ/HLT Software, Signatures & Menus.

Tasks: Trigger Core Software

Design & implementation of the new offline/HLT framework:
- Requirements, design, prototyping and implementation of the new framework, in collaboration with offline and other experiments.
- Design & implementation of the Steering/Scheduler: a common HLT/offline mechanism for concurrent algorithm scheduling (see the sketch below).
- Interface to the online software.
- Design & implementation of HLT-specific features/extensions of the new framework.

Exploitation of the new framework:
- Central work to migrate signatures and algorithms.
- Monitoring (especially cost monitoring) able to handle parallel, asynchronous component execution.
- Tools for parallel software validation and debugging.

Infrastructure for offloading work:
- to GPUs, other co-processors or idle cores.

Trigger configuration upgrades:
- Support changes to the Level-1 hardware and HLT software.

Support for FTK:
- Steering & RegionSelector.
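To make the idea of concurrent algorithm scheduling concrete, here is a minimal, self-contained sketch using a TBB flow graph. It is an illustration only, not the ATLAS design: the algorithm names and the way the dependencies are wired are invented for the example, and a real scheduler would build the graph from declared data dependencies rather than hard-coded edges.

```cpp
// Minimal sketch (not ATLAS code): dependency-driven scheduling of
// "algorithms" with a TBB flow graph, as one possible concurrency model.
// All names here are invented for illustration.
#include <tbb/flow_graph.h>
#include <iostream>

int main() {
  using namespace tbb::flow;
  graph g;

  // Each node stands in for an HLT/offline algorithm.
  broadcast_node<continue_msg> start(g);
  continue_node<continue_msg> caloDataPrep(g, [](continue_msg) {
    std::cout << "calo data preparation\n"; return continue_msg{};
  });
  continue_node<continue_msg> idDataPrep(g, [](continue_msg) {
    std::cout << "ID data preparation\n"; return continue_msg{};
  });
  continue_node<continue_msg> clustering(g, [](continue_msg) {
    std::cout << "topo-clustering (needs calo data)\n"; return continue_msg{};
  });
  continue_node<continue_msg> tracking(g, [](continue_msg) {
    std::cout << "tracking (needs ID data)\n"; return continue_msg{};
  });
  continue_node<continue_msg> hypo(g, [](continue_msg) {
    std::cout << "hypothesis: accept/reject\n"; return continue_msg{};
  });

  // Edges encode data dependencies; independent branches run concurrently.
  make_edge(start, caloDataPrep);
  make_edge(start, idDataPrep);
  make_edge(caloDataPrep, clustering);
  make_edge(idDataPrep, tracking);
  make_edge(clustering, hypo);
  make_edge(tracking, hypo);

  start.try_put(continue_msg{});
  g.wait_for_all();
}
```

The calorimeter and ID branches can execute concurrently, and the hypothesis node only fires once both branches have completed, mirroring a step decision that needs all of its inputs.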

Tasks: New Technologies

- Evaluate CPU and co-processor/accelerator developments.
- Software optimisation:
  - using profiling tools and techniques, expert code inspection and code redesign;
  - make better use of the parallelism provided by CPU architectures (a vectorisation sketch is given below).
- Look at new compilers, languages and libraries to facilitate optimal use of new hardware and parallel programming techniques.
- Define best practices for the implementation of framework & algorithms on the chosen hardware.
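As an illustration of the kind of code redesign involved, the sketch below shows a hypothetical struct-of-arrays hit container whose per-hit loop a compiler can auto-vectorise. The container and function names are invented for the example and are not ATLAS EDM classes.

```cpp
// Minimal sketch (invented example): exposing data parallelism so the
// compiler can auto-vectorise a per-hit calculation.  A struct-of-arrays
// layout keeps each coordinate contiguous in memory, which typically
// vectorises better than an array of hit objects.
#include <vector>
#include <cmath>

struct HitsSoA {                      // hypothetical EDM-style container
  std::vector<float> x, y, z;
};

// The inner loop has no branches and no pointer aliasing, so compilers can
// emit SIMD instructions (e.g. with -O3 and an appropriate -march setting).
void computeRadii(const HitsSoA& hits, std::vector<float>& r) {
  const std::size_t n = hits.x.size();
  r.resize(n);
  for (std::size_t i = 0; i < n; ++i) {
    r[i] = std::sqrt(hits.x[i] * hits.x[i] +
                     hits.y[i] * hits.y[i] +
                     hits.z[i] * hits.z[i]);
  }
}
```

The same layout also maps naturally onto GPU threads, which is why EDM restructuring is useful for CPU and GPU alike.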

Tasks: Trigger Menus and Algorithms

- Speed up of code, especially detector-specific code for data preparation & reconstruction.
- Improve selections:
  - maintain efficiency w.r.t. offline, and rejection;
  - track offline changes;
  - improved robustness w.r.t. pile-up;
  - benefit from use of FTK information.

Tasks: Simulation

- Ability to simulate the trigger as run online (use of old software versions).
- FTK simulation (fast and full).
- Fast trigger simulation (L1+HLT) based on parameterisation.
- Explore a flexible approach in common with the Integrated Simulation Framework.

Timescales: Framework, Steering & New Technologies (draft version for discussion)

[Timeline chart spanning LS1 (from 2014 Q3/Q4) through commissioning to Run 3, with three activity rows:
- Framework: design & prototype; implement core functionality; extend to full functionality.
- New Technologies: evaluate; implement infrastructure; exploit new technologies in algorithms.
- Algorithms & Menus: speed up code, thread-safety, investigate possibilities for internal parallelisation; implement algorithms in the new framework.
Milestones: requirements capture complete (TDR +0 months); design of framework & HLT components complete, narrow hardware choices e.g. use GPUs or not (TDR +6 months); framework core functionality complete incl. HLT components & new-technology support, prototype with 1 or 2 chains, fix PC architecture (TDR +12 months); then simple menu, full menu complete, final software complete (framework & algorithms), HLT software commissioning complete.]

Today's Meeting

Aims for today's meeting are to discuss and start to form a plan on:
1) How to speed up algorithms: code optimisation, vectorisation, internal parallelisation.
   - What are the priorities? What tools are there to help? What code re-design is needed (e.g. EDM)?
2) How do we evaluate, choose and exploit future technologies & architectures in the HLT farm?
   - What technologies to follow? What demonstrators/prototypes are needed? What infrastructure is needed? What do we need to measure?

Additional Material

Timescales: draft version for discussion

[Timeline chart spanning LS1 (from 2014 Q3/Q4) through commissioning to Run 3, adding FTK and simulation milestones to the framework milestones of the previous slide:
- Framework: requirements capture complete (TDR +0 months); design of framework & HLT components complete, narrow hardware choices e.g. use GPUs or not (TDR +6 months); framework core functionality complete incl. HLT components & new-technology support (TDR +12 months); new-framework (NF) prototype with 1 or 2 chains; simple menu implemented in the NF; fix PC architecture; full menu complete; final software complete (framework & algorithms); HLT software commissioning complete.
- FTK: initial FTK chains; FTK fast simulation; all FTK chains.
- Simulation: trigger fast simulation design complete; trigger fast simulation complete; trigger fast simulation validated.]

GPUs

Benefits:
- Potential for very large speed-ups for specific algorithms/parts of code (up to ~x30), partly from EDM and code restructuring (factor 2-3?) and partly from use of the GPU.
- A lot of interest; a good way to bring in new people.

Issues:
- Lower speed-ups for some other algorithms/code.
- Overheads to ship conditions & event data to/from the GPU.
- Need to rewrite code in a specialist language (CUDA, OpenCL).
- Need to restructure EDM and code to be parallelisable (but this is useful for CPU as well as GPU).
- Rapidly evolving hardware: code restructured for a specific hardware generation may be much less efficient on a different one.
- GPGPUs becoming less general purpose? Trend to more cores, less memory?

Questions:
- Important to evaluate & track this technology, but how much effort should we invest?
- What can we learn from demonstrators? How complete do they need to be?
- Language: proprietary (e.g. CUDA) or cross-platform (e.g. OpenCL)?
- How to integrate with Athena? What framework infrastructure is needed? APE, dOpenCL etc. (an asynchronous-offload sketch is given below).
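To give a feel for the kind of framework infrastructure that offloading implies (submitting work to an accelerator and hiding its latency behind other activity, as an APE-style client would), here is a minimal host-side sketch using a plain C++ worker thread and futures. It is an assumption-laden illustration: the service, the request shape and the overlap with CPU work are invented for the example and do not represent the APE interface.

```cpp
// Minimal sketch (not the APE interface): an "offload service" that accepts
// work requests on a queue, processes them on a dedicated worker (standing in
// for a GPU/co-processor), and returns futures so the caller can overlap
// other CPU work with the offloaded computation.
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <numeric>
#include <queue>
#include <thread>
#include <vector>
#include <iostream>

class OffloadService {
public:
  OffloadService() : worker_([this] { run(); }) {}
  ~OffloadService() {
    { std::lock_guard<std::mutex> lk(m_); done_ = true; }
    cv_.notify_one();
    worker_.join();
  }
  // Submit a task; in a real system this would serialise the data, ship it
  // to the accelerator and set the result once it has been copied back.
  std::future<double> submit(std::function<double()> task) {
    auto p = std::make_shared<std::promise<double>>();
    auto fut = p->get_future();
    { std::lock_guard<std::mutex> lk(m_);
      queue_.push([p, task] { p->set_value(task()); }); }
    cv_.notify_one();
    return fut;
  }
private:
  void run() {
    for (;;) {
      std::function<void()> job;
      { std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
        if (done_ && queue_.empty()) return;
        job = std::move(queue_.front()); queue_.pop(); }
      job();                      // executed off the caller's thread
    }
  }
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool done_ = false;
  std::thread worker_;
};

int main() {
  OffloadService svc;
  std::vector<double> data(1000000, 0.5);
  // "Offload" a reduction; the caller could keep doing CPU work meanwhile.
  auto result = svc.submit([&data] {
    return std::accumulate(data.begin(), data.end(), 0.0);
  });
  // ... other CPU work could run here before blocking on the result ...
  std::cout << "offloaded sum = " << result.get() << "\n";
}
```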

Frameworks

Desirable to have a common framework for trigger & offline:
- unique window of opportunity now to influence the framework design.

Requirements capture ongoing:
- FFReq: joint Trigger + Offline; bi-weekly meetings; Tomasz + Ben (John ex officio).
- Parallel session at TDAQ week to discuss online constraints.

Prototyping:
- GaudiHive: based on real algorithms, so far offline code only (CaloHive, IDHive); stalled due to issues with Tools, Services, Incidents.
- TBB scheduler (Tomasz) based on dummy algorithms (see the sketch below).

Questions:
- What can we learn from demonstrators? Do we need real algorithms?
- What HLT-specific components are needed? Can the offline & HLT schedulers be the same?
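For scheduler studies of this kind, a dummy algorithm typically just burns CPU for a configurable time so that the scheduling behaviour can be measured without real reconstruction code. The sketch below shows one possible shape for such a dummy algorithm; the class name and interface are invented for illustration and are not the prototype's actual code.

```cpp
// Minimal sketch (invented names): a dummy algorithm that emulates the CPU
// cost of a real algorithm, for use in scheduler prototypes where the real
// reconstruction code is not (yet) thread-safe or available.
#include <chrono>
#include <iostream>
#include <string>

class DummyAlg {
public:
  DummyAlg(std::string name, std::chrono::milliseconds cpuTime)
      : name_(std::move(name)), cpuTime_(cpuTime) {}

  // Busy-wait rather than sleep, so the "algorithm" really occupies a core
  // and the scheduler sees realistic CPU contention.
  void execute() const {
    const auto start = std::chrono::steady_clock::now();
    volatile double x = 0.0;                 // prevent the loop being optimised away
    while (std::chrono::steady_clock::now() - start < cpuTime_) {
      x = x + 1.0;
    }
    std::cout << name_ << " done\n";
  }

private:
  std::string name_;
  std::chrono::milliseconds cpuTime_;
};

int main() {
  DummyAlg caloDataPrep("CaloDataPrep", std::chrono::milliseconds(5));
  caloDataPrep.execute();   // in a prototype this call would be made by the scheduler
}
```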

Some Issues for Discussion: Optimisation & New Technologies

Code optimisation:
- Code profiling, optimisation & thread-safety are a vital first step: how do we motivate & attract more effort for this?
- Can all/most code used in the trigger (incl. increasing amounts of offline code) be made thread-safe? What do we do if it can't? (A thread-safety sketch is given below.)
- Restructuring the EDM and code is vital for internal parallelisation: is this achievable?
- What is the correct balance between re-writing and re-use?

New Technologies:
- GPUs are a speculative activity: how much effort should we put into it?
- How do we make architecture decisions (e.g. GPU or not)? What input is needed?
- What do we need to measure with GPU demonstrators? How complete do they need to be? What can we learn from standalone demonstrators, and when must they be integrated in Athena?
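As a concrete (and invented) example of what making trigger/offline code thread-safe usually means in practice: per-event results cached in a member make a tool unusable from several threads at once, whereas returning the per-event data to the caller keeps the method re-entrant. The class and method names below are hypothetical.

```cpp
// Minimal sketch (hypothetical names): removing per-event state from a tool
// so that the same instance can be called concurrently for different events.
#include <vector>

struct Cluster { double energy = 0.0; };

// NOT thread-safe: the cached result is shared between concurrent callers.
class CaloToolUnsafe {
public:
  void process(const std::vector<double>& cellEnergies) {
    cache_.clear();                    // data race if called from two threads
    for (double e : cellEnergies) cache_.push_back({e});
  }
  const std::vector<Cluster>& clusters() const { return cache_; }
private:
  std::vector<Cluster> cache_;         // mutable per-event state
};

// Thread-safe: no member state is written; the per-event result is returned
// to the caller (or, in a framework, recorded in a per-event store).
class CaloToolSafe {
public:
  std::vector<Cluster> process(const std::vector<double>& cellEnergies) const {
    std::vector<Cluster> out;
    out.reserve(cellEnergies.size());
    for (double e : cellEnergies) out.push_back({e});
    return out;
  }
};
```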

Assessment Criteria for the Cost/Benefit of GPUs

- Increase in throughput: compare the throughput of a fully occupied CPU node running C++ (e.g. 2x16 cores with hyperthreading) with the same system plus a GPU integrated into Athena via APE. Reference 1: original C++ code. Reference 2: C++ code restructured and optimised to the same level as the GPU code.
- Cost: cost of hardware & support; effort needed to port code to OpenCL/CUDA.
- Hardware integration: physical size, heat output, how mounted (PCI…).
- Software integration: interaction with run-control, farm monitoring, error reporting.
- Maintenance: how easy it is to maintain the software & to pass on maintenance to others.
- Debugging: how easy/difficult it is to pinpoint errors occurring online/on the Grid so that they can be reported & assigned (by a non-expert) & debugged (by an expert).

Some Issues for Discussion: Frameworks

- What questions do we need framework demonstrators to answer? How complete do they have to be? What can be learnt with dummy algorithms & what needs real code?
- How do we make the choice of framework technology (e.g. GaudiHive or another)?
- Is it a framework requirement to minimise the modifications of algorithm code? Or can we assume significant algorithm code renewal?

Possible Next Steps

Code optimisation.

Framework requirements:
- Complete the framework requirements capture.

Framework demonstrator:
- Step 1: Simple demonstrator: implement with a modified GaudiHive scheduler and/or the TBB scheduler, with a small menu (few chains, few steps per chain), step-wise execution of dummy algorithms, and a menu decision after each step (a sketch of this step-wise execution is given below).
- Step 2: Extended prototype: once the problem with tools, services and incidents is solved, implement a small menu running a few real algorithms, to identify any issues using a more realistic prototype.

GPU demonstrator:
- Calo data preparation & TopoCluster.
- ID data preparation & ID tracking.
- Muon data preparation & muon tracking.
- Integrated into Athena using APE.
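The step-wise execution with a decision after each step could look something like the sketch below: chains are evaluated step by step, and a chain that fails a step's hypothesis is not executed further (early rejection). This is only a hedged illustration of the control flow being described; the structures, chain names and fixed step depth are invented, not the demonstrator's actual design.

```cpp
// Minimal sketch (invented structures): a small menu of chains, each made of
// a few steps, executed step-wise with a decision after every step so that
// rejected chains stop consuming CPU early.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

struct Step {
  std::string name;
  std::function<bool()> hypothesis;   // dummy algorithm + hypothesis decision
};

struct Chain {
  std::string name;
  std::vector<Step> steps;
  bool active;                        // still in the running for this event
};

int main() {
  std::vector<Chain> menu = {
    {"HLT_e25",  {{"calo step",     [] { return true;  }},
                  {"track step",    [] { return false; }}}, true},  // dies at step 2
    {"HLT_mu20", {{"muon step",     [] { return true;  }},
                  {"combined step", [] { return true;  }}}, true},
  };

  const std::size_t nSteps = 2;       // all chains have the same depth in this sketch
  for (std::size_t step = 0; step < nSteps; ++step) {
    for (auto& chain : menu) {
      if (!chain.active) continue;    // early rejection: skip dead chains
      chain.active = chain.steps[step].hypothesis();
      std::cout << chain.name << " " << chain.steps[step].name
                << (chain.active ? " passed\n" : " rejected\n");
    }
    // Menu decision after each step: stop the event if nothing is still active.
    bool anyActive = false;
    for (const auto& chain : menu) anyActive = anyActive || chain.active;
    if (!anyActive) { std::cout << "event rejected\n"; break; }
  }
}
```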

Work Areas Needing People

Framework & HLT Steering:
- Framework: demonstrator evaluation, requirements capture, design, implementation of HLT-specific components.
- Steering:

GPUs: example of a complete L2 ID chain implemented on a GPU (Dmitry Emeliyanov)

[Timing table, values not reproduced in this transcript: time (ms) per tau RoI (0.6x0.6), ttbar events at 2x10^34, comparing C++ on a 2.4 GHz CPU with CUDA on a Tesla C2050 and giving the CPU/GPU speedup, for the data preparation and L2 tracking stages: data preparation, seeding, seed extension, triplet merging, clone removal, CPU-GPU transfer (n/a, 0.1, n/a), and total.]

Data Preparation Code [backup slide: figure only, not reproduced in this transcript]
