Trigger Software Upgrades
John Baines, Tomasz Bold, Joerg Stelzer, Werner Wiedenmann

Trigger Software Upgrades Meetings
Purpose of these meetings:
- Provide a forum that brings together people working on Phase-I Trigger software upgrades targeted at Run 3
- Coordinate HLT work on frameworks and the exploitation of new technologies
Diagram label: Run 3 Phase-I Upgrades

Organisation & Meetings
Trigger Core Software
- Covers both operations & upgrade
- Meetings: Fridays 15:00 (chairs: Joerg Stelzer, Attila Krasznahorkay, Werner Wiedenmann)
- Periodic meetings will be dedicated to Software Upgrades (chaired by Tomasz & John); currently planned: 19 Sep, 5 Dec (sw weeks), other dates as needed
DAQ/HLT Software and Operations
- Covers both operations & upgrade
- Meetings: Thursdays 14:00 (chairs: Rainer Hauser, Wainer Vandelli)
Meeting in Copenhagen TDAQ week
- Parallel session Tuesday 15th July
- Discussion session focusing on:
  - Online/HLT interface: present and past experience & discussion of implications for new framework requirements
  - Accelerators: how to quantify benefits & cost, including the cost of additional online complexity

Motivation for Trigger Software Upgrades
Meet physics requirements within online & offline resource constraints:
- Cleverer selections to maintain HLT rejection
- Faster code that fully exploits the capability of the farm hardware
HLT upgrades to match Detector & L1 Upgrades:
- FTK, Muon New Small Wheel, L1Topo
Exploit Technology Evolution:
- Increased no. of cores => may no longer be possible to run an application per core
- Possible trend to a higher no. of small, low-power cores with lower memory; may be instead of, or in addition to, larger CPU cores
- Availability of more specialised hardware, e.g. GPGPU
- Evolution of compilers, libraries etc.

Upgrade Work Packages
TDAQ Phase-I TDR defines "Trigger" and "Online" work packages. In practice they are closely coupled:
Online:
- HLT Processing Unit
- Evaluate & Exploit new technologies
- Online Core Software, Infrastructure
- Configuration, Control
- Monitoring
- Dataflow, Event Format
- Detector Software & Tools
Trigger:
- Trigger Core software
- Evaluate & Exploit new technologies
- Menus & Algorithms
- Simulation
TDAQ Phase-I upgrade TDR: https://cds.cern.ch/record/1602235
Diagram labels: Trigger Core Software, DAQ/HLT Software, Signatures & Menus; annotation "Discuss today"

Tasks: Trigger Core Software
Design & implementation of new offline/HLT framework:
- Requirements, design, prototyping and implementation of the new framework, in collaboration with offline and other experiments
- Design & implementation of Steering/Scheduler: common HLT/offline mechanism for concurrent algorithm scheduling
- Interface to online software
- Design & implement HLT-specific features/extensions of the new framework
Exploitation of the new framework:
- Central work to migrate signatures and algorithms
- Monitoring (especially cost monitoring) able to handle parallel, asynchronous component execution
- Tools for parallel software validation and debugging
Infrastructure for offloading work:
- to GPU/other co-processor/idle cores
Trigger configuration upgrades:
- Support changes to the Level-1 hardware and HLT software
Support for FTK:
- Steering & RegionSelector

Tasks: New Technologies
- Evaluate CPU and co-processor/accelerator developments
- Software optimisation:
  - using profiling tools and techniques, expert code inspection and code redesign
  - make better use of parallelism provided by CPU architectures
- Look at new compilers, languages and libraries to facilitate optimal use of new hardware and parallel programming techniques
- Define best practices for implementation of framework & algorithms on chosen hardware

Tasks: Trigger Menus and Algorithms
Speed up of code:
- especially detector-specific code for data preparation & reconstruction
Improve selections:
- maintain efficiency w.r.t. offline & rejection
- track offline changes
- improve robustness w.r.t. pile-up
- benefit from use of FTK information

Tasks: Simulation
- Ability to simulate the trigger as run online: use of old software version
- FTK simulation (fast and full)
- Fast trigger simulation (L1+HLT) based on parameterisation
- Explore flexible approach in common with Integrated Simulation Framework

Timescales: Framework, Steering & New Technologies (draft version for discussion)
[Timeline chart: 2014 Q3/Q4 and LS1 through commissioning to Run 3; rows: Framework, New Tech., Algs & Menus]
Phases shown per row:
- Framework: Design & Prototype; Implement core functionality; Extend to full functionality; Commissioning; Run
- New Tech.: Evaluate; Implement Infrastructure; Exploit New Tech. in Algorithms
- Algs & Menus: Speed up code, thread-safety, investigate possibilities for internal parallelisation; Implement Algorithms in new framework
Milestones labelled on the chart:
- HLT software Commissioning Complete
- Final Software Complete: Framework & Algos.
- Fix PC architecture
- Framework Core Functionality Complete, incl. HLT components & new tech. support
- Design of Framework & HLT Components Complete
- Narrow h/w choices, e.g. use or not GPU
- Full menu complete
- Simple menu
- Requirements Capture Complete
- Prototype with 1 or 2 chains
Date labels: TDR +0 mon., TDR +6 mon., TDR +12 mon.

Today's Meeting
Aims for today's meeting: discuss and start to form a plan on:
1) How to speed up algorithms: code optimisation, vectorisation, internal parallelisation
   - What are the priorities?
   - What tools are there to help?
   - What code re-design is needed (e.g. EDM)?
2) How do we evaluate, choose and exploit future technologies & architectures in the HLT farm?
   - What technologies to follow?
   - What demonstrators/prototypes are needed?
   - What infrastructure is needed?
   - What do we need to measure?
Agenda

Additional Material

Timescales: draft version for discussion
[Timeline chart: 2014 Q3/Q4 and LS1 through to Run 3]
Milestones labelled on the chart:
- HLT software Commissioning Complete
- Final Software Complete: Framework & Algos.
- Fix PC architecture
- Framework Core Functionality Complete, incl. HLT components & new tech. support
- Design of Framework & HLT Components Complete
- Narrow h/w choices, e.g. use or not GPU
- Full menu complete
- Requirements Capture Complete
- Initial FTK Chains
- FTK Fast Sim.
- All FTK Chains
- Trigger Fast Sim. Complete
- Trigger Fast Sim. validated
- Trigger Fast Sim. Design Complete
- Simple menu implemented in NF
- NF prototype with 1 or 2 chains
Date labels: TDR +0 mon., TDR +6 mon., TDR +12 mon.

GPUs
Benefits:
- Potential for very large speed-ups for specific algorithms/parts of code (up to ~30x)
  - Partly from EDM and code restructuring (factor 2-3?) and partly from use of GPU
- A lot of interest; a good way to bring in new people
Issues:
- Lower speed-ups for some other algorithms/code
- Overheads to ship conditions & event data to/from GPU
- Need to rewrite code in a specialist language (CUDA, OpenCL)
- Need to restructure EDM and code to be parallelisable (but useful for CPU as well as GPU)
- Rapidly evolving h/w => code restructured for a specific h/w may be much less efficient on a different h/w
- GPGPU becoming less General Purpose? Trend to more cores, less memory?
Questions:
- Important to evaluate & track this technology, but how much effort should we invest?
- What can we learn from demonstrators? How complete do they need to be?
- Language: proprietary, e.g. CUDA, or cross-platform, e.g. OpenCL?
- How to integrate with Athena? What framework infrastructure is needed? APE, dOpenCL etc.

Frameworks
Desirable to have a common framework for trigger & offline:
- unique window of opportunity now to influence framework design
Requirements capture ongoing:
- FFReq: joint Trigger + Offline. Bi-weekly meetings. Tomasz + Ben (John ex officio)
- Parallel session at TDAQ week to discuss online constraints
Prototyping:
- GaudiHive: based on real algorithms, so far offline code only (CaloHive, IDHive). Stalled due to issues with Tools, Services, Incidents
- TBB scheduler (Tomasz), based on dummy algos.
Questions:
- What can we learn from demonstrators? Do we need real algorithms?
- What HLT-specific components are needed? Can the offline & HLT schedulers be the same?

Some Issues for Discussion: Optimisation & New Technologies
Code optimisation:
- Code profiling & optimisation & thread-safety are a vital first step: how do we motivate & attract more effort for this?
- Can all/most code used in the trigger (incl. increasing amounts of offline code) be made thread-safe? What do we do if it can't?
- Restructuring EDM and code is vital for internal parallelisation: is this achievable?
- What is the correct balance between re-writing and re-use?
New Technologies:
- GPUs are a speculative activity: how much effort should we put into it?
- How do we make architecture decisions (e.g. GPU or not)? What input is needed?
- What do we need to measure with GPU demonstrators? How complete do they need to be? What can we learn from standalone demonstrators and when must they be integrated in Athena?

Assessment criteria for Cost/Benefit for GPU
- Increase in throughput: compare the throughput of a fully occupied CPU node running C++ (e.g. 2x16 cores with hyperthreading) with the same system with the addition of a GPU integrated into Athena via APE.
  - Reference 1: original C++ code
  - Reference 2: C++ code restructured and optimised to the same level as the GPU code
- Cost: cost of hardware & support; effort needed to port code to OpenCL/CUDA
- hw integration: physical size, heat output, how mounted - PCI…
- sw integration: interaction with run-control, farm monitoring, error reporting
- Maintenance: how easy to maintain software & to pass on maintenance to others
- Debugging: how easy/difficult it is to pinpoint errors occurring online/on Grid so that they can be reported & assigned (by non-expert) & debugged (by expert)
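The throughput criterion can be illustrated with a minimal Python sketch. All numbers below are made-up placeholders, not measurements, and the names `NodeMeasurement` and `throughput_gain` are hypothetical helpers introduced only for this illustration. The sketch compares a GPU-equipped node against both references, so that the gain from code restructuring can be separated from the gain from the GPU itself.

```python
from dataclasses import dataclass


@dataclass
class NodeMeasurement:
    """Events processed by one fully occupied node in a given wall time."""
    label: str
    events_processed: int
    wall_seconds: float

    @property
    def throughput_hz(self) -> float:
        return self.events_processed / self.wall_seconds


def throughput_gain(reference: NodeMeasurement, candidate: NodeMeasurement) -> float:
    """Ratio of candidate throughput to reference throughput."""
    return candidate.throughput_hz / reference.throughput_hz


# Placeholder numbers only (assumptions for illustration, not measured values).
ref1 = NodeMeasurement("Reference 1: original C++", 10_000, 500.0)
ref2 = NodeMeasurement("Reference 2: restructured C++", 10_000, 400.0)
gpu = NodeMeasurement("CPU node + GPU", 10_000, 250.0)

gain_vs_ref1 = throughput_gain(ref1, gpu)        # total gain over the original code
gain_vs_ref2 = throughput_gain(ref2, gpu)        # gain attributable to the GPU
gain_restructuring = throughput_gain(ref1, ref2)  # gain from restructuring alone

for name, gain in [("vs Reference 1", gain_vs_ref1),
                   ("vs Reference 2", gain_vs_ref2),
                   ("restructuring only", gain_restructuring)]:
    print(f"{name}: x{gain:.2f}")
```

Quoting the gain against Reference 2 as well as Reference 1 avoids crediting the GPU with speed-up that a restructured CPU implementation would deliver anyway.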

Some Issues for Discussion: Frameworks
- What questions do we need framework demonstrators to answer? How complete do they have to be? What can be learnt with dummy algos & what needs real code?
- How do we make the choice of framework technology (e.g. GaudiHive or another)?
- Is it a framework requirement to minimise the modifications of algo. code? Or can we assume significant algo. code renewal?

Possible Next Steps
Code Optimisation
Framework Requirements:
- Complete framework requirements capture
Framework Demonstrator:
- Step 1: Simple demonstrator: implement with modified GaudiHive scheduler and/or TBB scheduler, with a small menu (few chains, few steps per chain), step-wise execution of dummy algorithms and a menu decision after each step
- Step 2: Extended prototype: once the problem with tools, services and incidents is solved, implement a small menu running a few real algorithms to identify any issues using a more realistic prototype
GPU demonstrator:
- Calo data prep & TopoCluster
- ID data prep & ID tracking
- Muon data prep & MuonTracking
- Integrated into Athena using APE
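The Step 1 demonstrator logic (a small menu, step-wise execution of dummy algorithms, a menu decision after each step) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's `ThreadPoolExecutor` stands in for the GaudiHive or TBB scheduler, and the menu, chain names, thresholds and the `run_menu` helper are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_dummy_algo(threshold):
    """Build a dummy algorithm that passes events with value >= threshold."""
    def algo(event):
        return event["value"] >= threshold
    return algo


# Hypothetical small menu: chain name -> ordered steps (one dummy algorithm each).
MENU = {
    "chain_a": [make_dummy_algo(10), make_dummy_algo(20)],
    "chain_b": [make_dummy_algo(5), make_dummy_algo(50), make_dummy_algo(60)],
}


def run_menu(event, menu, pool):
    """Run all chains step by step; return the sorted names of accepted chains."""
    active = set(menu)  # chains still alive after the previous step
    accepted = set()
    step = 0
    while active:
        # Schedule this step's algorithm of every active chain concurrently.
        futures = {name: pool.submit(menu[name][step], event) for name in active}
        # Menu decision after the step: drop failed chains, accept finished ones.
        for name, future in futures.items():
            if not future.result():
                active.discard(name)
            elif step == len(menu[name]) - 1:
                accepted.add(name)
                active.discard(name)
        step += 1
    return sorted(accepted)


with ThreadPoolExecutor(max_workers=2) as pool:
    decisions = {v: run_menu({"value": v}, MENU, pool) for v in (3, 25, 100)}

print(decisions)
```

Processing stops as soon as no chain remains active, so later steps never run for an event that every chain has already rejected.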

Work Areas needing people: Framework & HLT Steering
Framework: demonstrator evaluation, requirements capture, design, implementation of HLT-specific components
Steering:

GPUs
Example of complete L2 ID chain implemented on GPU (Dmitry Emeliyanov)
Time (ms) for Tau RoI 0.6x0.6, tt events at 2x10^34:

Step              C++ on 2.4 GHz CPU   CUDA on Tesla C2050   Speedup CPU/GPU
Data Prep.        27                   3                     9
Seeding           8.3                  1.6                   5
Seed ext.         156                  7.8                   20
Triplet merging   7.4                  3.4                   2
Clone removal     70                   6.2                   11
CPU-GPU xfer      n/a                  0.1                   n/a
Total             268                  22                    12

Row groups on the slide: Data Prep., L2 Tracking
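The speed-up column follows directly from the two timing columns. The Python sketch below recomputes it, assuming the per-step timings are read from the flattened slide as Data Prep. 27/3 ms, Seeding 8.3/1.6 ms, Seed ext. 156/7.8 ms, Triplet merging 7.4/3.4 ms and Clone removal 70/6.2 ms (CPU/GPU), plus 0.1 ms of CPU-GPU transfer; that reading is an assumption.

```python
# Per-step timings in ms, as read from the slide (the reading is an assumption).
cpu_ms = {
    "Data Prep.": 27.0,
    "Seeding": 8.3,
    "Seed ext.": 156.0,
    "Triplet merging": 7.4,
    "Clone removal": 70.0,
}
gpu_ms = {
    "Data Prep.": 3.0,
    "Seeding": 1.6,
    "Seed ext.": 7.8,
    "Triplet merging": 3.4,
    "Clone removal": 6.2,
    "CPU-GPU xfer": 0.1,  # transfer overhead exists only on the GPU path
}

# Speed-up per step is CPU time divided by GPU time.
speedups = {step: cpu_ms[step] / gpu_ms[step] for step in cpu_ms}

# The overall speed-up must charge the transfer overhead to the GPU side.
total_cpu = sum(cpu_ms.values())
total_gpu = sum(gpu_ms.values())
total_speedup = total_cpu / total_gpu

for step, s in speedups.items():
    print(f"{step:16s} x{s:.1f}")
print(f"{'Total':16s} x{total_speedup:.1f}")
```

Rounded to whole numbers, these ratios agree with the speed-ups quoted on the slide. The total is dominated by seed extension and clone removal, so the slowest-accelerating steps (such as triplet merging) have little effect on the overall figure.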

Data Preparation Code
