Experience with multi-threaded C++ applications in the ATLAS DataFlow. Szymon Gadomski, University of Bern, Switzerland and INP Cracow, Poland, on behalf of the ATLAS Trigger/DAQ DataFlow.


Experience with multi-threaded C++ applications in the ATLAS DataFlow
Szymon Gadomski, University of Bern, Switzerland and INP Cracow, Poland, on behalf of the ATLAS Trigger/DAQ DataFlow, CHEP 2003 conference
Performance problems found and solved:
– STL containers
– thread scheduling
– other

CHEP, March 2003, S. Gadomski, "Experience with multi-threaded C++ in ATLAS DataFlow"

ATLAS DataFlow software
Flow of data in the ATLAS DAQ system:
– data to LVL2 (part of the event), to the EF (whole event), to mass storage,
– see talks by Giovanna Lehmann (overview of DataFlow) and by Stefan Stancu (networking).
PCs, standard Linux, applications written in C++ (so far using only gcc to compile), standard network technology (Gb Ethernet).
"Soft" real-time system, no guaranteed response time. The average response time is what matters.
Common tasks (exchanging messages, state machine, access to the configuration db, reporting errors, …) use a framework (well, actually two…).

ATLAS DataFlow software (2)
State of the project:
– development done mostly in ,
– measurements for the Technical Design Report – performance,
– preparation for beam-test support – stability, robustness and deployment.
7 kinds of applications (+3 kinds of controllers).
Always several threads (independent execution paths within one application, sharing its resources).
Roles, challenges and uses of threads are very different.
In this short talk only a few examples:
– use of threads, problems, solutions.

Testbed at CERN
1U PCs (>= 2 GHz), 4U PCs (>= 2 GHz), FPGA traffic generators.

LVL2 processing unit (L2PU) – role
Multiplicities are indicative only: ROB 1600x, ROS 140x, L2SV 10x, L2PU up to 500x, pROS 1x.
The L2PU is a DataFlow application with an interface to the control software. It:
– gets the LVL1 decision (L1 + RoI data) from the L2SV,
– asks the ROSs for detector data (RoI only) and gets it,
– makes the LVL2 decision and sends it,
– sends the detailed LVL2 result to the pROS (from there towards mass storage; open choice).

L2PU design
Input thread: receives the LVL1 Result from the L2SV and RoI data from the ROSs; assembles the RoI data and adds the event to the event queue.
Worker threads (several): get the next event from the queue; run the LVL2 selection code; when more data is needed, request it and wait for the RoI data, then continue the selection; if the event is accepted, send the LVL2 Result to the pROS; send the LVL2 Decision; when complete, restart the worker on the next event.

Sub-farm Interface (SFI) – role
Multiplicities are indicative only (SFI 50x, ROS 140x, DFM 1x).
The SFI is a DataFlow application with an interface to the control. It:
– gets an event id from the DFM (L2 accept),
– asks the ROSs for all the event data and gets it,
– builds the complete event and buffers it,
– sends it to the Event Filter on request.
The DFM handles LVL2 accepts and rejects, assigns events to the SFIs, and exchanges clear and End-of-Event (EoE) messages with the ROSs.

SFI design
– Different threads for requesting and receiving data.
– Threads for assembly and for sending to the Event Handler.
Input thread, Request thread, Assembly thread and Event Handler. The DFM sends event assigns and receives End-of-Event messages; the Request thread sends data requests to the ROSs and re-asks for missing fragment IDs; the Input thread receives ROS fragments; the Assembly thread builds full events, which the Event Handler sends to the EF. EB rate per SFI ~50 Hz.

Lesson with L2PU and SFI – STL containers
With no apparent dependence between threads in the code, it was observed that the threads were not running independently; adding more threads had no effect. (Plot: time blocked vs. number of threads.)
VisualThreads, using an instrumented pthread library, showed why:
– STL containers use a memory pool, by default one per executable. There is a lock, so threads may block each other.

Lesson with L2PU and SFI – STL containers (2)
The solution is to use the pthread allocator: independent memory pools for each thread, so no lock and no blocking. Use it for all containers used at event rate. Be careful with creating objects in one thread and deleting them in another. (Plot: threads blocked less often vs. number of threads.)

SFI history

  Date        Change                                       EB         EB + Output to EF
  30 Oct '02  First integration on testbed                 0.5 MB/s   -
  13 Nov      Sending data requests at a regular pace      8.0 MB/s   -
  14 Nov      Reduce the number of threads                 15 MB/s    -
  20 Nov      Switch off hyper-threading                   17 MB/s    -
  21 Nov      Introduce credit-based traffic shaping       28 MB/s    -
  13 Dec      First try on throughput                      -          14 MB/s
  17 Jan      Chose pthread allocator for STL objects      53 MB/s    18 MB/s
  29 Jan      DC buffer recycling when sending             56 MB/s    19 MB/s
  05 Feb      IOVec storage type in the EFormat library    58 MB/s    46 MB/s
  21 Feb      Buffer pool per thread                       64 MB/s    48 MB/s
  21 Feb      Grouping interthread communication           73 MB/s    51 MB/s
  26 Feb      Avoiding one system call per message         80 MB/s    55 MB/s

Most improvements (and most problems) are related to threads.

Lessons from the SFI
Traffic shaping (limiting the number of outstanding requests for data) eliminates packet loss.
Grouping interthread communication decreases the frequency of thread activation.
Some improvements in more predictable areas: avoiding copies and system calls, avoiding object creation by recycling buffers, avoiding contention by giving each thread its own buffers.
Optimizations were driven by measurements with full functionality. Effective development: the developer works on a good testbed, tests and optimizes, in a short cycle.

Performance of the SFI
– Reaching the I/O limit at 95 MB/s; otherwise CPU limited.
– 35% performance gain with at least 8 ROLs/ROS.
– Will approach the I/O limit for 1 ROL/ROS with a faster CPU.
(Plot: EB-only throughput vs. #ROLs/ROS; flat at the 95 MB/s I/O limit, CPU limited below it; 2.4 GHz CPU.)

Readout System (ROS) – role
ROBin, I/O Manager, ~12 buffers for data, ROS controller; serves data requests from LVL2 or the EB.
RoI collection and partial event building. Not exactly like the SFI:

                 ROS                     SFI
  Request rate   24 kHz L2, 3 kHz EB     50 Hz
  Data per req.  2 kB LVL2, 8 kB EB      1.5 MB
  Data rate      72 MB/s                 75 MB/s

All numbers approximate.

IOManager in ROS
Requests (L2, EB, delete) arrive in a request queue; request-handler threads serve them, fetching data from the RobIns. Control, error and trigger handling run separately; scheduling is left to the standard Linux scheduler. The number of request handlers is configurable.

Thread scheduling problem
System without interrupts: poll and yield. The standard Linux scheduler puts the yielding thread away until the next time slice, up to 10 ms. The solution is to change the scheduling in the kernel. For kernels there exists an unofficial patch (tested on CERN RH7.2); for CERN RH7.3 there is a CERN-certified patch, linux_2.4.18_18_sched.yield.patch. Result: 20 μs latency for getting data. This is an evolving field; we need to continue evaluating thread-related changes of Linux kernels.

Conclusions
The DataFlow of the ATLAS DAQ has a set of applications managing the flow of data. All prototypes exist, have been optimized, are used for performance measurements, and are prepared for the beam test. Standard technology (Gb Ethernet, PCs, standard Linux, multi-threaded C++ compiled with gcc) meets the ATLAS requirements. A few lessons were learned.

Backup slides

Data Flow Manager (DFM) – role
Multiplicities are indicative only.
The DFM is a DataFlow application with an interface to the OnlineSW. It receives LVL2 accepts and rejects from the L2SVs, assigns events to the SFIs, and exchanges clear and End-of-Event (EoE) messages with the ROSs; the SFOs (30x) write events to disk files and mass storage.

DFM design
– Bulk of the work is done in the I/O thread (I/O rate ~4 kHz).
– A cleanup thread identifies timed-out events.
– Fully embedded in the DC framework.
Threads allow for independent and parallel processing within an application. The I/O thread handles L2 decisions from the L2SVs, event assigns to the SFIs, End-of-Event messages, and clears to the ROSs, plus load balancing and bookkeeping; timeouts are passed to the cleanup thread.

STL containers (3)
(Plot only.)

SFI performance
Input up to 95 MB/s (~3/4 of the 1 Gbit/s line). Input and output at 55 MB/s (~1/2 line speed). With all the logic of event building and all the objects involved, the performance is already close to the network limit (on a 2.4 GHz PC).

Performance of event building
Max EB rate with 8 SFIs ~350 Hz (17% of the ATLAS EB rate). (Plot: EB rate vs. number of SFIs; 1 DFM; hardware emulators of the ROS.)

After the patch
(Plot: L2 request rate (kHz) vs. number of request handlers, for simulated I/O latencies of 2, 5, 10, 20, 50, 100 and 1000 μs. Xeon/2 GHz, Linux with the CERN scheduling patch; 100% L2 requests, 1 ROL per L2 request, release grouping = 100.)

Flow of messages
1a: L2SV_LVL1Result (RoIB/L2SV → L2PU); 1b: L2PU_LVL2Decision (L2PU → L2SV).
2a: L2PU_DataRequest (L2PU → ROS/ROB, 1..i); 2b: ROS/ROB_Fragment (sequential processing or time out).
3a: L2PU_LVL2Result (L2PU → pROS); 3b: pROS_Ack.
4a: L2SV_LVL2Decision (L2SV → DFM; wait for LVL2 decision or time out, reassign); 4b: DFM_Ack.
5a: DFM_Decision and 5a': DFM_SFIAssign (DFM → SFI); 5b: SFI_EoE (wait EoE or time out); SFI_FlowControl, DFM_FlowControl.
6a: SFI_DataRequest (SFI → ROS/ROB, 1..n; build event; receive or time out); 6b: ROS/ROB_EventFragment.
7: DFM_Clear (DFM → ROS/ROB).
Note: 6a (SFI_DataRequest) associated with 5a (DFM_Decision) is used for error recovery. Built events go to the EF.

Deployment view
RODs feed the RO{B,S}, which connect to the LVL2 Switch and the EB Switch. The RoIB and LVL2 Supervisors sit on the SV Switch with the LVL2 Processors; the DFMs on the DFM Switch; the SFIs feed the Local EF Farms via SubFarm Switches, and the EF Switch connects to a remote EF farm.