The ATLAS Trigger and Data Acquisition: Architecture & Status
Benedetto Gorini, CERN - Physics Department, on behalf of the ATLAS TDAQ community
CHEP 06, Mumbai, 13-19 February 2006

Architecture
[ATLAS detector drawing: length 44 m, diameter 22 m, weight 7000 t]

Global concept
- The ATLAS TDAQ architecture is based on a three-level trigger hierarchy.
- The LVL2 selection mechanism uses only a subset of the event data (Regions of Interest) to reduce the rate of selected events without full event building.
- This greatly reduces the demand on dataflow power.

ARCHITECTURE (functional elements)
[Diagram: Trigger and DAQ columns. The calorimeters, muon trigger chambers and other detectors feed Level 1, which operates at the 40 MHz bunch-crossing rate with a 2.5 us latency while the data are held in the front-end (FE) pipelines.]

Level-1 latency
- Interactions occur every 25 ns; in 25 ns particles travel 7.5 m.
- Cable lengths are ~100 metres; in 25 ns signals travel 5 m.
- Total Level-1 latency = TOF + cables + processing + distribution = 2.5 us.
- For those 2.5 us, all detector signals must be stored in the front-end electronics pipelines (see the pipeline-depth sketch below).
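As a back-of-the-envelope check (not shown on the slide itself), the required pipeline depth follows directly from the numbers above: the Level-1 latency divided by the 25 ns bunch spacing gives the number of bunch crossings each channel must buffer.

```python
# Minimal sketch: Level-1 pipeline depth implied by the quoted numbers.
# Assumes the 2.5 us latency and 25 ns bunch spacing stated above.

BUNCH_SPACING_NS = 25        # LHC bunch spacing
LEVEL1_LATENCY_US = 2.5      # TOF + cables + processing + distribution

pipeline_depth = (LEVEL1_LATENCY_US * 1000) / BUNCH_SPACING_NS
print(f"Pipeline must hold ~{pipeline_depth:.0f} bunch crossings per channel")
# -> ~100 bunch crossings stored in the front-end pipelines
```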

ARCHITECTURE (functional elements)
[Diagram, detector readout added: on a L1 accept (100 kHz) the Read-Out Drivers (RODs) push event fragments at ~160 GB/s over the Read-Out Links (S-LINK) into the Read-Out Buffers (ROBs) of the Read-Out Systems (ROS); the Region of Interest Builder (ROIB) collects the LVL1 RoI records.]

Region of Interest - Why?
- The Level-1 selection is dominated by local signatures, based on coarse-granularity data (calorimeters, muon trigger chambers) and without access to the inner tracking detectors.
- Important further rejection can be gained by local analysis of the full-granularity detector data.
- The geographical addresses of the interesting signatures identified by LVL1 (the Regions of Interest) allow the relevant local data of each detector to be accessed, sequentially.
- Typically there are 1-2 RoIs per event accepted by LVL1: <RoIs/event> ~ 1.6.
- The resulting total amount of RoI data is minimal: a few % of the Level-1 throughput.

RoI mechanism - How?
- There is a simple correspondence, region <-> ROB number(s), for each detector: for each RoI the list of ROBs holding the corresponding data from each detector is quickly identified by the LVL2 processors (a lookup sketch follows below).
- This mechanism provides a powerful and economical way to add an important rejection factor before full event building --> the ATLAS RoI-based Level-2 trigger.
- The result is a ReadOut network roughly one order of magnitude smaller, at the cost of higher control traffic.
[Diagram: an event with 4 RoI eta-phi addresses; note that this example is atypical, the average number of RoIs/event is ~1.6.]
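A minimal sketch of the kind of region-to-ROB lookup the slide describes; the map contents, granularity and function names here are illustrative assumptions, not the actual ATLAS implementation.

```python
# Illustrative sketch of the RoI -> ROB correspondence described above.
# The region granularity and ROB identifiers below are invented for the example.

# Per-detector map from a coarse (eta, phi) region index to the ROB(s) holding its data.
ROB_MAP = {
    "LAr": {(0, 0): [0x410001], (0, 1): [0x410002]},
    "SCT": {(0, 0): [0x220003, 0x220004], (0, 1): [0x220005]},
}

def region_index(eta: float, phi: float) -> tuple[int, int]:
    """Quantise an RoI centre into the coarse region index used by the map."""
    return (int(eta // 0.4), int(phi // 0.4))

def robs_for_roi(eta: float, phi: float, detector: str) -> list[int]:
    """Return the ROB source IDs a LVL2 processor would request for this RoI."""
    return ROB_MAP[detector].get(region_index(eta, phi), [])

# Example: an RoI at (eta, phi) = (0.2, 0.5) touches only one SCT buffer,
# so LVL2 pulls that fragment instead of building the full event.
print([hex(r) for r in robs_for_roi(0.2, 0.5, "SCT")])
```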

ARCHITECTURE (functional elements)
[Diagram, LVL2 added: the RoI Builder (ROIB) passes the RoI records to the L2 Supervisors (L2SV), which assign events to the L2 Processing Units (L2P) over the L2 network (L2N); the L2PUs request the RoI data (~2% of the event) from the ROSs, giving ~3 GB/s of RoI traffic; average LVL2 latency ~10 ms at the 100 kHz LVL1 accept rate.]

L2PU performance & number of nodes
Assume:
- 100 kHz LVL1 accept rate
- 500 dual-CPU PCs for LVL2
So:
- each L2PU handles 100 Hz
- 10 ms average latency per event in each L2PU
We need enough computing power to avoid exceeding the time budget:
- In the TDR we estimated that 8 GHz dual-CPU machines would do the job.
- We will not reach 8 GHz per CPU any time soon, but multi-core, multi-CPU PCs show scaling, so we will reach the equivalent of 2 x 8 GHz performance per PC.
See Kostas Kordas's talk. (A sizing sketch follows below.)
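The arithmetic behind these numbers can be written down directly; this is only a sketch of the slide's assumptions (one L2PU per CPU is an assumption made here), not the actual system configuration.

```python
# Sketch of the LVL2 farm sizing arithmetic quoted on the slide.
# Assumption: one L2PU application per CPU, i.e. 2 L2PUs per dual-CPU PC.

LVL1_ACCEPT_RATE_HZ = 100_000   # 100 kHz LVL1 accept rate
N_PCS = 500                     # dual-CPU PCs foreseen for LVL2
L2PUS_PER_PC = 2

n_l2pus = N_PCS * L2PUS_PER_PC
rate_per_l2pu = LVL1_ACCEPT_RATE_HZ / n_l2pus   # -> 100 Hz per L2PU
time_budget_ms = 1000.0 / rate_per_l2pu         # -> 10 ms average per event

print(f"{rate_per_l2pu:.0f} Hz per L2PU, {time_budget_ms:.0f} ms average latency budget")
```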

LVL2 algorithm timing (measurements on a 2.4 GHz dual Opteron)
[Figure summarised:]
- Calorimeter, single e + pile-up at 2x10^33 cm^-2 s^-1: mean time to accept an electron 3 ms, to reject a jet 1.8 ms.
- SCT & Pixel tracking in the RoI (get data in RoI, unpacking & cluster formation, space-point formation, pattern recognition, track fit): total ~3.5 ms, c.f. ~12 ms for the standard offline chain.
- Muon, single mu + background in a 0.2 x 0.2 RoI: data preparation 5.2 ms + track finding 1.3 ms = 6.5 ms total, to be reduced to ~2 ms.

RoI collection & RoIB-L2SV scalability
- The L2SV gets the RoI info from the RoIB, assigns an L2PU to work on the event, and load-balances its L2PU sub-farm.
- Questions: does the scheme work? What happens if the LVL1 accept rate increases?
- Tests with RoI info preloaded into the L2SVs and muon data preloaded into the ROSs show that the LVL2 system is able to sustain the LVL1 input rate:
  - with a 1-L2SV system for a LVL1 rate of ~35 kHz
  - with a 2-L2SV system for a LVL1 rate of ~70 kHz (the workload really is distributed 50-50 between the two sub-farms)
- The maximum LVL1 rate therefore scales as ~35 kHz multiplied by the number of L2SVs.
(A toy load-balancing sketch follows below.)
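To illustrate the kind of load balancing the L2SV does, here is a toy round-robin assignment together with the scaling claim above; the scheduling policy and names are purely illustrative assumptions, not the real L2SV logic.

```python
# Toy sketch: an L2SV distributing LVL1-accepted events over its L2PU sub-farm,
# plus the observed scaling of the maximum LVL1 rate with the number of L2SVs.
from collections import Counter
from itertools import cycle

MAX_RATE_PER_L2SV_KHZ = 35        # sustainable LVL1 rate measured with one L2SV

def max_lvl1_rate_khz(n_l2sv: int) -> float:
    """Scaling observed in the preload tests: rate grows linearly with the number of L2SVs."""
    return n_l2sv * MAX_RATE_PER_L2SV_KHZ

# Round-robin assignment of 1000 events to L2PUs (illustrative policy only).
l2pus = ["l2pu-01", "l2pu-02", "l2pu-03", "l2pu-04"]
assignments = Counter(l2pu for _, l2pu in zip(range(1000), cycle(l2pus)))

print(max_lvl1_rate_khz(2), "kHz with 2 L2SVs")   # -> 70 kHz
print(assignments)                                 # -> 250 events per L2PU
```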

ARCHITECTURE (functional elements)
[Diagram, event building added: LVL2 accepts (~3.5 kHz) are passed via the DataFlow Manager (DFM) to the Event Builder; the Sub-Farm Inputs (SFIs) collect the full event fragments from the ROSs over the Event Building network (EBN), giving ~3+6 GB/s of combined RoI and event-building traffic out of the ROSs.]

Event Builder: needs for ATLAS
Assume:
- 100 kHz LVL1 accept rate, 3.5% LVL2 accept rate --> 3.5 kHz input to event building
- 1.6 MB event size --> 3.5 x 1.6 = 5600 MB/s total input
Wish:
- event building using 60-70% of a Gbit link, i.e. ~70 MB/s into each event-building node (SFI)
So:
- 5600 MB/s into the EB system / (70 MB/s per EB node) --> ~80 SFIs needed for full ATLAS (assumes the system is network-limited, which is true with "fast" PCs nowadays)
- When an SFI also serves the EF, its throughput decreases by ~20% --> actually need 80/0.80 = 100 SFIs.
(The arithmetic is spelled out in the sketch below.)
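The same sizing arithmetic, written out as a small calculation for clarity; the numbers are taken directly from the slide.

```python
# Sketch of the event-builder sizing arithmetic quoted above.

LVL1_RATE_KHZ = 100
LVL2_ACCEPT_FRACTION = 0.035      # 3.5% of LVL1 accepts
EVENT_SIZE_MB = 1.6
SFI_THROUGHPUT_MB_S = 70          # ~60-70% of a Gbit link per SFI
EF_SERVING_PENALTY = 0.80         # throughput drops ~20% when an SFI also feeds the EF

eb_input_rate_khz = LVL1_RATE_KHZ * LVL2_ACCEPT_FRACTION       # -> 3.5 kHz
eb_bandwidth_mb_s = eb_input_rate_khz * 1000 * EVENT_SIZE_MB   # -> 5600 MB/s
sfis_raw = eb_bandwidth_mb_s / SFI_THROUGHPUT_MB_S             # -> 80 SFIs
sfis_with_ef = sfis_raw / EF_SERVING_PENALTY                   # -> 100 SFIs

print(f"{eb_bandwidth_mb_s:.0f} MB/s total -> {sfis_raw:.0f} SFIs, "
      f"{sfis_with_ef:.0f} when also serving the EF")
```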

ROS performance
- A ROS unit contains 12 readout buffers --> 150 units are needed for ATLAS.
- A ROS unit is implemented as a 3.4 GHz PC housing 4 custom PCI-X cards (ROBINs); each ROBIN implements 3 readout buffers.
- Validation: (1) a paper-model estimate of the requirements on the ROS units, (2) measurements on real ROS hardware.
[Plot: request rate of the "hottest" ROS from the paper model vs. measured ROS performance, with the low- and high-luminosity operating regions indicated. The performance of the final ROS (PC + ROBIN) is already above the requirements.]
See Gokhan Unel's talk. (A paper-model style sketch follows below.)
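For flavour, a "paper model" for a single ROS can be as simple as adding the LVL2 RoI request rate it sees to the event-building request rate; the numbers and the RoI hit probability below are illustrative assumptions, not the values used in the real ATLAS paper model.

```python
# Illustrative "paper model" for the request rate seen by one ROS unit.
# The probability that an RoI touches this particular ROS is a made-up example value.

LVL1_RATE_HZ = 100_000        # LVL1 accept rate
EB_RATE_HZ = 3_500            # event-building rate (every built event reads every ROS)
ROI_HIT_PROBABILITY = 0.05    # assumed fraction of LVL1 accepts whose RoIs touch this ROS

lvl2_requests_hz = LVL1_RATE_HZ * ROI_HIT_PROBABILITY   # RoI data requests to this ROS
eb_requests_hz = EB_RATE_HZ                             # full-event requests to this ROS
total_requests_hz = lvl2_requests_hz + eb_requests_hz

print(f"~{total_requests_hz / 1000:.1f} kHz of requests into this ROS")
# The real paper model repeats this per ROS to find the "hottest" one and
# compares it with the measured PC+ROBIN performance, as in the plot above.
```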

ARCHITECTURE (functional elements)
[Diagram, Event Filter added: built events (~3.5 kHz) are sent over the Event Filter network (EFN) to the Event Filter Processors (EFP), which take of order a second per event and accept ~200 Hz; the Sub-Farm Outputs (SFOs) write the accepted events to storage at ~300 MB/s. Overall rate reduction: 40 MHz --> 100 kHz (LVL1) --> ~3.5 kHz (LVL2) --> ~200 Hz (EF).]
(The rate and bandwidth chain is summarised in the sketch below.)
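A compact restatement of the rate and bandwidth chain of the diagram; the per-stage rejection factors are simply ratios of the quoted numbers, and the 1.6 MB event size is taken from the event-builder slide.

```python
# Sketch of the trigger chain rates and output bandwidth quoted in the diagram.

EVENT_SIZE_MB = 1.6
stages = [                          # (name, output rate in Hz)
    ("bunch crossings", 40_000_000),
    ("LVL1 accept",        100_000),
    ("LVL2 accept",          3_500),
    ("EF accept",              200),
]

for (prev_name, prev_rate), (name, rate) in zip(stages, stages[1:]):
    print(f"{name}: {rate:>9} Hz  (rejection x{prev_rate / rate:.0f} vs {prev_name})")

output_bw = stages[-1][1] * EVENT_SIZE_MB
print(f"Mass-storage bandwidth ~ {output_bw:.0f} MB/s")  # ~320 MB/s, i.e. the ~300 MB/s quoted
```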

ARCHITECTURE (functional elements)
[Same diagram, with the components grouped into the High Level Trigger (LVL2 + Event Filter) and the Dataflow (detector readout, RoI collection, event building and output).]

ARCHITECTURE (functional elements)
[Same diagram, annotated with the farm sizes: ~150 ROS nodes, ~500 LVL2 nodes, ~100 event-builder nodes, ~1600 Event Filter nodes, plus the infrastructure for control, communication and databases.]

DAQ/HLT Large Scale Tests 2005
- LXBATCH test bed at CERN, 5 weeks (12 June - 22 July 2005); farm size increased in steps from 100 to 700 dual nodes (each node: <= 1 GHz, >= 500 MB RAM, Linux SLC3).
Objectives:
- Verify the functionality of the integrated DAQ/HLT software system at a scale close to the final ATLAS size (integration, sub-system and selected component tests; no dataflow performance, since only one network was available).
- Study the optimal control structure and investigate the best HLT sub-farm granularity.
- Take selected performance measurements for trend analysis.
- Include, for the first time, trigger and event-selection algorithms in the DAQ/HLT tests; access the conditions database; test the DAQ/HLT/offline software distribution method.
Evolution of the test-bed size (nodes): 2001: 100, 2002: 220, 2003: 220, 2004: 230, 4/2005: 300/800, 7/2005: 700.
See Doris Burckhart-Chromek's talk.

TDAQ Deployment
[Diagram: the functional architecture (Level 1, detector readout, ROS, LVL2, event building, Event Filter) mapped onto its deployment at the experiment site.]

Status
[ATLAS detector drawing: length 44 m, diameter 22 m, weight 7000 t]

RoI Builder
- A small RoI Builder is in place in USA15 (adequate for 3 subsystems, mostly for testing and prototyping); stand-alone tests are done.
- The full system is being built and tested; delivery is expected in March.
- Tests with muon, TTC and CTP inputs are being done now.

Read-Out System
ROS production: 150 units in total, 700 ROBINs.
ROS PCs:
- Tendering for the full amount is completed; the test of the selected PC (Elonex) was successful.
- The first 50 PCs have been delivered; the next 60 are to be ordered soon.
ROBINs:
- The first 380 are at CERN, all tested and validated with a ROS PC.
Full ROS systems:
- 50 units are being installed now: installation is completed for the Tile calorimeter (8 ROSs) and ongoing for the LAr calorimeter (27 installed so far).
- ~60 more later in 2006; the rest is deferred.
- 11 ROS prototypes (pre-series) are installed in USA15 and were successfully used with TileCal for cosmic data taking.
"Mobile" ROS:
- While waiting for the final ROSs, simpler systems are provided to the detectors for tests and commissioning, e.g. a 7-FILAR (28-link) mobile ROS for LAr in USA15.

Pre-series system in ATLAS Point-1: 8 racks (~10% of the final dataflow)
Underground (USA15):
- RoIB rack (TC rack + horizontal cooling): 50% of the RoIB
- One ROS rack (TC rack + horizontal cooling): 12 ROS, 48 ROBINs
Surface (SDX1):
- One full L2 rack (TDAQ rack): 30 HLT PCs
- Partial EF rack (TDAQ rack): 12 HLT PCs
- Partial Supervisor rack (TDAQ rack): 3 HE PCs
- Partial ONLINE rack (TDAQ rack): 4 HLT PCs (monitoring), 2 LE PCs (control), 2 central file servers
- One switch rack (TDAQ rack): 128-port GEth for L2 + EB
- Partial EFIO rack (TDAQ rack): 10 HE PCs (6 SFI, 2 SFO, 2 DFM)
See Gokhan Unel's talk.

DAQ infrastructure already installed at the ATLAS site
Definition: the combination of hardware and software needed as the basis on which the other DAQ services are installed and commissioned.
DAQ infrastructure components:
- DAQ infrastructure servers: file, boot and log servers; configuration database servers.
- Online services and online servers: the online software infrastructure (control, configuration, monitoring) and the machines on which these services run.
- Control computers: the computers from which users operate the system.
- Software installation and upgrades.
See M. Dobson's talk.

First cosmics seen by ATLAS in the pit
[Figure: first cosmic-ray events recorded by ATLAS in the pit.]

Conclusions
- The ATLAS TDAQ system has a three-level trigger hierarchy, making use of the Region-of-Interest mechanism --> an important reduction of data movement.
- The architecture has been validated via the deployment of full systems on testbed prototypes, at the ATLAS H8 test beam, and on the pre-series system now being exploited; detailed modelling has been used to extrapolate to full size.
- The dataflow applications and protocols have been tested and perform according to specifications.
- L2 software performance studies, based on complete algorithms and realistic raw-data input, indicate that our target processing times are realistic.
- The system design is complete and installation has started; some components are already being used for ATLAS detector commissioning (online infrastructure at Point-1, mobile ROS for single-detector readout).