Slide 1: The ATLAS Online High Level Trigger Framework: Experience reusing Offline Software Components in the ATLAS Trigger
Werner Wiedenmann, University of Wisconsin, Madison, Wisconsin, on behalf of the ATLAS Collaboration
CHEP 2009, Prague, March 2009

Slide 2: The ATLAS Trigger
- Level-1: hardware trigger; output rate < 75 kHz (upgradable to 100 kHz); latency 2.5 μs
- Level-2: target processing time ~40 ms per event on a 2 GHz / 4-core CPU; output rate ~3 kHz
- Event Filter: target processing time ~4 s per event on a 2 GHz / 4-core CPU; output rate to data recording ~200 Hz
- High Level Triggers (HLT) = Level-2 + Event Filter: a software trigger running on commodity hardware
- Event size: ~1.6 MB

Slide 3: HLT Hardware
- Final system:
  - 17 Level-2 racks, 62 Event Filter racks
  - 28 racks configurable as Level-2 or Event Filter (XPU)
  - 1 rack = 31 PCs (1U)
  - 1 Local File Server per rack, served by Central File Servers (CFS)
- Installed so far: 27 XPU racks, ~35% of the final system
- XPU nodes: 2x Harpertown CPUs at 2.5 GHz, 2 GB of memory per core, network booted

Slide 4: ATLAS High Level Triggers
- Level-2 (75 kHz → 3 kHz)
  - Partial event reconstruction in Regions of Interest (RoI) provided by Level-1
  - Event data in the RoIs are requested by the selection algorithms from the readout system over the network
  - The HLT selection software runs in the Level-2 Processing Unit (L2PU); the selection algorithms run in a worker thread
- Event Filter (3 kHz → 200 Hz)
  - Full event reconstruction seeded by the Level-2 result, with offline-type algorithms
  - Independent Processing Tasks (PT) run the selection software on Event Filter (EF) farm nodes
- The HLT event selection software is based on the ATLAS Athena offline framework
- The HLT framework interfaces the HLT event selection algorithms to the online system
  - Driven by the run control and data flow software
  - The event loop is managed by the data flow software
  - This allows HLT algorithms to run unchanged in the trigger and in the offline environment (see the sketch below)
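The following minimal C++ sketch (hypothetical interface and class names, not the actual Athena/Gaudi API) illustrates the idea: the same selection algorithm code is driven either by an offline event loop or, unchanged, by the online data flow software.

```cpp
// Minimal sketch with hypothetical names (not the real Athena/Gaudi API):
// the same selection algorithm runs unchanged whether the event loop is
// owned by the offline framework or by the online data flow software.
#include <iostream>
#include <vector>

struct Event { int id; /* raw data, RoI descriptors, ... */ };

// Gaudi-style algorithm interface: initialize / execute / finalize.
class ISelectionAlg {
public:
    virtual ~ISelectionAlg() = default;
    virtual bool initialize() = 0;
    virtual bool execute(const Event& ev) = 0;   // one call per event
    virtual bool finalize() = 0;
};

// A selection algorithm written once; it does not know whether events
// come from an offline file or from the online readout system.
class ExampleRoISelection : public ISelectionAlg {
public:
    bool initialize() override { return true; }
    bool execute(const Event& ev) override {
        // ... reconstruct inside the RoI and take an accept/reject decision
        return ev.id % 2 == 0;                   // placeholder decision
    }
    bool finalize() override { return true; }
};

// Offline: the framework owns the event loop (e.g. reading a file).
void offlineEventLoop(ISelectionAlg& alg, const std::vector<Event>& file) {
    alg.initialize();
    for (const auto& ev : file) alg.execute(ev);
    alg.finalize();
}

// Online: the data flow software owns the loop and pushes each event
// into the same, unchanged algorithm via the HLT framework.
void onlineProcessOneEvent(ISelectionAlg& alg, const Event& ev) {
    std::cout << "event " << ev.id
              << (alg.execute(ev) ? " accepted\n" : " rejected\n");
}

int main() {
    ExampleRoISelection alg;
    offlineEventLoop(alg, {{1}, {2}, {3}});      // offline-style use
    alg.initialize();
    onlineProcessOneEvent(alg, {4});             // online-style use
    alg.finalize();
    return 0;
}
```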

Slide 5: HLT Software in the Level-2 and Event Filter Data Flow
- The data flow / data collection applications (L2PU for Level-2, Processing Tasks for the Event Filter) host the HLT event selection software through a Steering Controller
- The HLT framework is based on ATHENA/GAUDI; the hosting applications belong to the online / data collection framework
- Level-2: event processing in worker threads
- Event Filter: event processing in Processing Tasks
A sketch of such a controller boundary is given below.
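As an illustration, here is a hedged C++ sketch (hypothetical method names; the real Steering Controller interface is not reproduced here) of the kind of boundary through which the data flow application could drive the Athena-based selection software.

```cpp
// Illustrative sketch with hypothetical names: the online application owns
// the run-control state machine and the event loop, and drives the
// Athena/Gaudi-based selection software through this controller boundary.
#include <cstdint>
#include <vector>

using EventFragment = std::vector<std::uint8_t>;  // serialized raw event data
using HLTResult     = std::vector<std::uint8_t>;  // serialized trigger decision

class ISteeringController {
public:
    virtual ~ISteeringController() = default;
    // State-machine transitions driven by run control.
    virtual bool configure()   = 0;  // read trigger menu, set up services
    virtual bool connect()     = 0;  // conditions data, online services
    virtual bool prepareForRun(unsigned runNumber) = 0;
    // Called by the data flow software for every event assigned to this node.
    virtual bool process(const EventFragment& input, HLTResult& output) = 0;
    virtual bool stopRun()     = 0;
    virtual bool unconfigure() = 0;
};
```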

Slide 6: HLT Event Selection Software
- Framework: ATHENA/GAUDI
- Reuse of offline components
- Common to Level-2 and the Event Filter
- Offline algorithms are used in the Event Filter
(The slide shows the Level-2 and Event Filter selection software sitting on top of the HLT data flow software.)

Slide 7: HLT Framework
- The framework provides special Athena services where necessary to allow algorithms to run transparently online and offline; whenever possible the offline infrastructure services are reused
- Athena services with a special "online" backend are used, e.g., for
  - interfacing to the online run control, the state machine and the online configuration database (e.g. for the readout buffer configuration)
  - interfacing the algorithm configuration to the Trigger Database, which holds the menu configuration
  - interfacing to online conditions data (Online Information Service, conditions database)
  - forwarding data requests to the data flow in Level-2
  - ...
A sketch of such a dual-backend service is given below.
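As a hedged illustration (hypothetical service name and methods, not the actual Athena service API), a dual-backend service could look like this: algorithms depend only on the abstract interface, and the online or offline backend is selected by the job configuration.

```cpp
// Minimal sketch with hypothetical names (not the actual Athena service API):
// an abstract service with an offline and an "online" backend, so that
// algorithms depend only on the interface and run unchanged in both worlds.
#include <map>
#include <stdexcept>
#include <string>

class IConfigSvc {
public:
    virtual ~IConfigSvc() = default;
    virtual std::string get(const std::string& key) const = 0;
};

// Offline backend: read the configuration from local job options / files.
class OfflineConfigSvc : public IConfigSvc {
    std::map<std::string, std::string> values_;
public:
    explicit OfflineConfigSvc(std::map<std::string, std::string> v)
        : values_(std::move(v)) {}
    std::string get(const std::string& key) const override {
        auto it = values_.find(key);
        if (it == values_.end()) throw std::runtime_error("missing key " + key);
        return it->second;
    }
};

// Online backend: in the real system this would query the online
// configuration database / Trigger Database; here it is only a stub.
class OnlineConfigSvc : public IConfigSvc {
public:
    std::string get(const std::string& key) const override {
        // ... query the online configuration database (stubbed out here)
        return "online-value-for-" + key;
    }
};
```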

Slide 8: HLT Framework (continued)
- The HLT framework is responsible for
  - packaging and forwarding the HLT decisions to the online system, e.g.
    - the detailed Level-2 and Event Filter decision record for subsequent reconstruction steps
    - the event output destination (stream tag information)
    - flagging events for calibration purposes and building such events only partially
    - ...
  - error handling and forwarding of error conditions from the algorithms to the online applications
  - forwarding of monitoring histograms and counters to the online monitoring infrastructure
  - ...
- Level-2 and the Event Filter share many common framework components; the differences are mainly due to the different data access (RoI-based versus full event)
A sketch of the kind of per-event record the framework might forward is shown below.
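Purely as an illustration (hypothetical types and field names, not the actual ATLAS data format), the per-event information listed above could be gathered in a record like this before being handed to the online system.

```cpp
// Illustrative sketch (hypothetical types): the kind of per-event record
// the HLT framework forwards to the online system after the selection ran.
#include <cstdint>
#include <string>
#include <vector>

struct StreamTag {
    std::string name;        // e.g. a physics or calibration stream name
    std::string type;        // "physics", "calibration", "debug", ...
    bool fullEventBuild;     // false -> build only part of the event data
};

struct HLTDecisionRecord {
    bool accepted = false;                      // overall accept/reject
    std::vector<std::uint32_t> passedChains;    // trigger chains that fired
    std::vector<StreamTag> streamTags;          // output destinations
    std::vector<std::uint8_t> serializedResult; // detailed record for later steps
};
```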

Slide 9: Development Model
- HLT algorithm development, testing and optimization is done mainly in the offline environment
- Developers can use the online emulators athenaMT (Level-2) and athenaPT (Event Filter) to test their applications
  - The emulators reproduce the complete L2PU/PT environment with specific boundary conditions for data access
  - Command-line tools: no need to set up a complex data flow system
  - They allow testing of the complete state machine
  - Written in Python/C++
- This decouples HLT algorithm development from TDAQ data flow development; the HLT framework software provides the integration layer

Slide 10: The ATLAS HLT Software Project
- The HLT framework packages form the AtlasHLT software project, which
  - is set up like the other offline software projects
  - uses many facilities of the offline build and testing system (automatic tests, an automatic memory-leak test for Level-2, ...)
  - is hosted in the offline version control repository
  - follows the offline build and patching strategy
  - directly extends the Gaudi base services
- The AtlasHLT project depends on most of the ATLAS software projects (Gaudi, LCG, AtlasCore, AtlasTrigger with the HLT algorithms, tdaq-common, the online tdaq software, ...), so AtlasHLT has to be last in the build sequence

Slide 11: Operational Experience
- The HLT framework has been successfully tested and used over many years, e.g. in
  - the ATLAS test beam in 2004 and TDAQ commissioning runs
  - cosmic data taking and detector commissioning periods
- First beam on September 10th: the HLT was configured in pass-through mode
- The HLT framework has been used regularly for cosmic data taking and commissioning runs, e.g. continuous cosmic data taking after September 12th:
  - 216 million events
  - 453 TB of data
  - 400k files
- It proved to be very versatile and allowed reacting quickly to changing detector conditions
- The emulators athenaMT/PT were used to analyze events of the DEBUG streams offline and, where appropriate, to reassign them to the analysis streams
- This provided valuable feedback on functionality, stability and reliability

Slide 12: Operational Experience (continued)
- System tests with the full menu and simulated data showed that the event processing times are consistent with the specification (on 2.5 GHz quad-core CPUs):
  - Level-2: ~45 ms
  - Event Filter: ~260 ms (the Event Filter was not fully exploited by this setup)
- Test conditions: 7-hour run, Level-2 input rate ~60 kHz (80% of the design rate), 2880 L2PUs (70% of the final system)
(The slide shows rate plots of Level-2 and Event Filter rejected events, with an annotation marking the effect of cron jobs.)

Slide 13: Event Parallelism and Multi-Core Machines
- Multi-core machines are used for the HLT processors; processor development brings a rapidly increasing number of CPU cores and calls for massive parallelism
- Such parallelism is typically inherent to high energy physics selection and reconstruction as "event parallelism"
- The original ATLAS HLT design exploits event parallelism in
  - Level-2: parallel event processing in multiple worker threads (with the possibility to fall back to multiple selection processes)
  - Event Filter: multiple event processing tasks
- The HLT framework supports both ways of exploiting event parallelism; a special version of the Gaudi/Athena framework creates selection algorithm instances for the worker threads
- Experience with multi-threading in Level-2 has shown that it is very difficult to keep a large code base, maintained by an open developer community, thread safe and thread efficient
- Therefore the Level-2 processing unit is run with a single worker thread, one Level-2 processing unit per CPU core (a minimal sketch contrasting the two models is given below)
  - This also allows Level-2 to profit from the large ATLAS offline code base (mostly written and designed in the "pre-multi-core era")
  - It increases the software commonality between Level-2 and the Event Filter and allows a transparent move of Level-2 algorithms to the Event Filter
  - However, it increases the resource usage (memory, network connections, ...)
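The minimal C++ sketch below (generic code, not ATLAS software) contrasts the two ways of exploiting event parallelism on a multi-core node: several worker threads inside one process versus one single-threaded processing unit per core.

```cpp
// Minimal sketch (not ATLAS code) contrasting the two models of event
// parallelism: worker threads in one process vs. one process per core.
#include <cstdio>
#include <thread>
#include <vector>
#include <sys/wait.h>   // wait()  (POSIX only)
#include <unistd.h>     // fork(), _exit()

void eventLoop(int workerId) {
    // ... pull events from the data flow and run the selection algorithms
    std::printf("worker %d processing events\n", workerId);
}

// Model A: one process, N worker threads (requires thread-safe algorithms).
void runWithWorkerThreads(int nThreads) {
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) workers.emplace_back(eventLoop, i);
    for (auto& t : workers) t.join();
}

// Model B: one single-threaded processing unit per core; the existing
// offline code base can be reused without thread-safety work, at the cost
// of multiplied memory and network resources per node.
void runWithProcessPerCore(int nCores) {
    for (int i = 0; i < nCores; ++i) {
        if (fork() == 0) { eventLoop(i); _exit(0); }  // child = one processing unit
    }
    while (wait(nullptr) > 0) {}                      // parent waits for all
}

int main() {
    runWithWorkerThreads(4);   // Model A
    runWithProcessPerCore(4);  // Model B
    return 0;
}
```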

Slide 14: Experience with Multi-Threading in the HLT Framework
- Event processing in multiple worker threads offers
  - code sharing
  - small context-switch times ("lightweight processes")
  - automatic sharing of many hardware resources
- However, it requires careful tuning, and it proves difficult to keep the code thread safe and thread efficient over release cycles
- This impacts the capability to react quickly to changing conditions and to deploy new code
(The slide shows a timeline of an L2PU with 3 worker threads: during initialization and event processing the worker threads are blocked due to the common memory pool used for STL containers.)

Slide 15: Experience with Multiple Processes in the HLT Framework
- One instance of a Level-2 processing unit or Event Filter processing task is run on every CPU core
  - Easy to do with the existing code base; a priori no code changes are required
  - Facilitates the reuse of offline components
- Good scaling with the number of cores is observed; this provides the required throughput
- However, the resource requirements are multiplied by the number of process instances:
  - memory: ~1 to 1.5 GB per application
  - file descriptors and network sockets
  - number of controlled applications: ~7k at present, ~20k in the final system
(The slide shows scaling and Event Filter memory-requirement plots for a machine with 8 cores in total.)

Slide 16: Resource Optimization
- Typically, in HEP applications all processes use a large amount of constant configuration and detector description data
  - This is a common problem for offline reconstruction and the trigger
- Different prototypes have been tried; see contribution #244, S. Binet, "Harnessing multicores: strategies and implementations in ATLAS"
- The OS-based memory optimization "KSM" (kernel same-page merging) is also very interesting
(The slide shows a diagram of processing tasks on four cores, contrasting per-task data with data that could be shared between the tasks in memory.)
A minimal sketch of one common sharing technique follows.
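One widely used approach, shown here only as a sketch under the assumption of a POSIX/Linux system and not as the specific ATLAS prototype, is to initialize the constant data once in a parent process and then fork the per-core workers, so that the read-only pages are shared copy-on-write.

```cpp
// Sketch (not the ATLAS prototype): load the large constant configuration /
// detector description once, then fork one worker per core. On Linux the
// forked children share these read-only pages copy-on-write, so the physical
// memory cost is paid once instead of once per process.
#include <cstdio>
#include <vector>
#include <sys/wait.h>
#include <unistd.h>

std::vector<double> loadDetectorDescription() {
    // Placeholder for ~400 MB of constant geometry/conditions data.
    return std::vector<double>(50000000, 1.0);
}

void workerEventLoop(int core, const std::vector<double>& geo) {
    // Reads geo but never writes it, so its pages stay shared with the parent.
    std::printf("worker on core %d using %zu geometry words\n", core, geo.size());
}

int main() {
    const auto geometry = loadDetectorDescription();  // initialize before fork
    const int nCores = 4;
    for (int core = 0; core < nCores; ++core) {
        if (fork() == 0) { workerEventLoop(core, geometry); _exit(0); }
    }
    while (wait(nullptr) > 0) {}   // parent waits for all workers
    return 0;
}
```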

Slide 17: Conclusions
- The ATLAS HLT framework, based on the ATHENA offline framework, has been used successfully in many different data taking periods and provides the required performance
- The reuse of offline software components allows the trigger to benefit from a large developer community and a large code base
- The ability to develop, optimize and test trigger algorithms in the offline environment is used successfully to establish the ATLAS HLT selection menu
- The optimal use of multi-core processors requires further framework enhancements to reduce the resource utilization; many of these issues are shared with offline and can profit from common solutions
- The HLT framework is ready for the new data taking periods