
1 The ATLAS Trigger and Data Acquisition: Architecture & Status. Benedetto Gorini, CERN Physics Department, on behalf of the ATLAS TDAQ community. CHEP 06, Mumbai, February 2006

2 Architecture
[ATLAS detector drawing: 44 m long, 22 m in diameter, weight 7000 t]

3 Global concept
The ATLAS TDAQ architecture is based on a three-level trigger hierarchy. It uses a LVL2 selection mechanism that works on a subset of the event data (the Regions of Interest) to reduce the rate of selected events without full event building, which greatly reduces the demand on dataflow power.

4 ARCHITECTURE (Functional elements)
[Diagram: Trigger and DAQ functional elements. The calorimeters and muon trigger chambers feed Level 1 at the 40 MHz bunch-crossing rate; during the 2.5 μs Level-1 latency the signals from all detectors are held in front-end (FE) pipelines.]

5 Level-1 latency
Interactions occur every 25 ns; in 25 ns particles travel 7.5 m, while signals travel only ~5 m along cables of ~100 m length. The total Level-1 latency (time of flight + cables + processing + distribution) is 2.5 μs, and for those 2.5 μs all detector signals must be stored in front-end electronics pipelines.
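As a back-of-envelope check (not on the slide itself), the quoted latency and bunch spacing fix the minimum depth of the front-end pipelines; a minimal sketch:

```python
# Back-of-envelope sketch: depth of the front-end pipelines implied by the
# Level-1 latency quoted above, assuming 25 ns bunch spacing.

BUNCH_SPACING_NS = 25.0   # interactions every 25 ns
LVL1_LATENCY_US = 2.5     # TOF + cables + processing + distribution

pipeline_depth = (LVL1_LATENCY_US * 1000.0) / BUNCH_SPACING_NS
print(f"Pipeline depth: {pipeline_depth:.0f} bunch crossings")  # -> 100
```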

6 ARCHITECTURE (Functional elements)
[Diagram: on a Level-1 accept (100 kHz) the detector read-out is triggered. Read-Out Drivers (RODs) send event fragments over Read-Out Links (S-LINK) at a total of 160 GB/s into the Read-Out Systems (ROSs), which hold them in Read-Out Buffers (ROBs). The Level-1 Region-of-Interest records go to the Region of Interest Builder (ROIB).]

7 Region of Interest - Why?
The Level-1 selection is dominated by local signatures, based on coarse-granularity data from the calorimeters and muon trigger chambers, without access to the inner tracking detectors. Important further rejection can therefore be gained from a local analysis of the full-granularity detector data. The geographical addresses of the interesting signatures identified by LVL1 (the Regions of Interest) allow the relevant local data of each detector to be accessed sequentially. Typically there are 1-2 RoIs per event accepted by LVL1 (<RoIs/ev> ≈ 1.6), so the resulting total amount of RoI data is minimal: a few % of the Level-1 throughput.

8 RoI mechanism - How?
There is a simple correspondence, region <-> ROB number(s), for each detector, so for each RoI the LVL2 processors quickly identify the list of ROBs holding the corresponding data from each detector. This mechanism provides a powerful and economical way to add an important rejection factor before full event building: the ATLAS RoI-based Level-2 trigger needs a read-out network roughly one order of magnitude smaller, at the cost of higher control traffic. [Diagram: an event with 4 RoIs and their η-φ addresses; note that this example is atypical, the average is ~1.6 RoIs per event.]
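A minimal sketch of the region <-> ROB correspondence idea described above. The η-φ binning, detector names and ROB identifiers are illustrative assumptions, not the actual ATLAS mapping, which is detector-specific and comes from the read-out cabling database:

```python
# Illustrative sketch of an RoI -> ROB lookup. Granularity, detectors and
# ROB ids below are hypothetical; not the real ATLAS mapping.

from collections import defaultdict

ETA_BINS, PHI_BINS = 16, 32   # assumed lookup granularity (not ATLAS values)

def build_lookup():
    """Pre-computed table: (detector, eta_bin, phi_bin) -> list of ROB ids."""
    table = defaultdict(list)
    # In the real system this is filled from the read-out cabling database;
    # these two entries are made up for the example.
    table[("LAr", 3, 7)] = [0x420011, 0x420012]
    table[("Pixel", 3, 7)] = [0x110003]
    return table

def robs_for_roi(table, eta, phi, detectors):
    """Return, per detector, the ROBs holding the data inside one RoI."""
    eta_bin = int((eta + 2.5) / 5.0 * ETA_BINS)          # assume |eta| < 2.5
    phi_bin = int(phi / 6.2832 * PHI_BINS) % PHI_BINS    # phi in [0, 2*pi)
    return {det: table[(det, eta_bin, phi_bin)] for det in detectors}

table = build_lookup()
print(robs_for_roi(table, eta=-1.4, phi=1.5, detectors=["LAr", "Pixel"]))
```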

9 ARCHITECTURE (Functional elements)
[Diagram: the ROIB forwards the RoI records of L1-accepted events (100 kHz) to the L2 Supervisors (L2SVs), which assign each event to a L2 Processing Unit (L2PU) over the L2 network (L2N). The L2PUs request the RoI data (~2% of the event, ~3 GB/s in total) from the ROSs; the average L2 decision time is ~10 ms.]

10 L2PU performance & number of nodes
Assume a 100 kHz LVL1 accept rate and 500 dual-CPU PCs for LVL2; then each L2PU handles ~100 Hz, i.e. a ~10 ms average latency per event in each L2PU. We need enough computing power to stay within this time budget. In the TDR we estimated that 8 GHz dual machines would do the job. We will not reach 8 GHz per CPU any time soon, but multi-core, multi-CPU PCs show good scaling, so we will reach the equivalent of 2 x 8 GHz performance per PC. See Kostas Kordas's talk.
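The sizing argument above, worked through as a small sketch; the one-L2PU-per-CPU assumption is mine for illustration, the rates and latency are those quoted on the slide:

```python
# Sketch of the LVL2 farm sizing argument. The one-L2PU-per-CPU assumption
# is illustrative, not a statement about the actual configuration.

LVL1_RATE_HZ = 100_000        # LVL1 accept rate
MEAN_L2_LATENCY_S = 0.010     # ~10 ms average decision time per event
PCS = 500                     # dual-CPU PCs foreseen for LVL2

l2pus = 2 * PCS                                       # assume one L2PU per CPU
rate_per_l2pu = LVL1_RATE_HZ / l2pus                  # -> 100 Hz
events_in_flight = rate_per_l2pu * MEAN_L2_LATENCY_S  # -> 1.0

print(f"{rate_per_l2pu:.0f} Hz per L2PU, "
      f"{events_in_flight:.1f} events in flight per L2PU on average")
```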

11 LVL2 algorithm timings
Measured on a 2.4 GHz dual Opteron. Electron RoI (single e + pile-up at 2x10^33 cm^-2 s^-1, 0.2 x 0.2 RoI): mean time to accept an electron ~3 ms and to reject a jet ~1.8 ms; the chain of getting the calorimeter and SCT/Pixel data in the RoI, bytestream unpacking and cluster formation, space-point formation, pattern recognition and track fit totals ~3.5 ms, compared with ~12 ms for the standard offline reconstruction. Muon RoI (single μ + background): data preparation 5.2 ms, track finding 1.3 ms, total 6.5 ms, to be reduced to ~2 ms. [Plot residue: per-step processing time vs pseudorapidity (η).]

12 RoI collection & RoIB-L2SV scalability
The L2SV gets the RoI information from the RoIB, assigns an L2PU to work on the event, and load-balances its L2PU sub-farm. Questions: does the scheme work, and what happens if the LVL1 accept rate increases? A test with RoI information preloaded into the L2SVs and muon data preloaded into the ROSs shows that the LVL2 system is able to sustain the LVL1 input rate: with 1 L2SV for a LVL1 rate of ~35 kHz, and with 2 L2SVs for ~70 kHz, the workload being genuinely distributed between the two sub-farms. The maximum LVL1 rate therefore scales as ~35 kHz times the number of L2SVs.
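A toy sketch of the L2SV role described above (assign an event, keep the sub-farm balanced). The "fewest outstanding events" policy is an assumption made here for illustration, not the actual ATLAS load-balancing algorithm:

```python
# Toy L2SV: assign LVL1-accepted events to the L2PU sub-farm. The
# least-loaded policy is an illustrative assumption.

class L2Supervisor:
    def __init__(self, n_l2pus):
        self.outstanding = [0] * n_l2pus   # events currently assigned per L2PU

    def assign(self, l1_event_id):
        """Pick the L2PU with the fewest outstanding events."""
        l2pu = min(range(len(self.outstanding)), key=lambda i: self.outstanding[i])
        self.outstanding[l2pu] += 1
        return l2pu

    def done(self, l2pu):
        """L2PU reports its LVL2 decision; free one slot."""
        self.outstanding[l2pu] -= 1

sv = L2Supervisor(n_l2pus=8)
print([sv.assign(evt) for evt in range(10)])   # spreads over L2PUs 0..7, then reuses
```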

13 ARCHITECTURE (Functional elements)
[Diagram: events accepted by LVL2 (~3.5 kHz) are assigned by the Dataflow Manager (DFM) to the Event Builder: Sub-Farm Input (SFI) nodes collect the full event data from the ROSs over the Event Building network (EBN). The total ROS output rises to ~3+6 GB/s (RoI requests plus event building).]

14 Event Builder: needs for ATLAS
Assume a 100 kHz LVL1 accept rate and a 3.5% LVL2 accept rate, giving a 3.5 kHz input to event building, and a 1.6 MB event size, i.e. 3.5 kHz x 1.6 MB = 5600 MB/s total input. We wish to do event building using 60-70% of a Gbit network link, i.e. ~70 MB/s into each Event Building node (SFI). So 5600 MB/s into the EB system divided by 70 MB/s per EB node means we need ~80 SFIs for the full ATLAS (assuming we are network-limited, which is true with "fast" PCs nowadays). When an SFI also serves the EF, its throughput decreases by ~20%, so we actually need 80/0.80 = 100 SFIs.
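The same arithmetic as a runnable sketch, using only the numbers quoted on the slide:

```python
# Worked version of the Event Builder sizing arithmetic above.

LVL1_RATE_KHZ = 100.0
LVL2_ACCEPT_FRACTION = 0.035          # 3.5%
EVENT_SIZE_MB = 1.6
SFI_INPUT_MB_S = 70.0                 # ~60-70% of a GbE link per SFI
EF_SERVING_PENALTY = 0.20             # throughput drop when the SFI also serves EF

eb_input_khz = LVL1_RATE_KHZ * LVL2_ACCEPT_FRACTION          # 3.5 kHz
eb_input_mb_s = eb_input_khz * 1000 * EVENT_SIZE_MB          # 5600 MB/s
sfis = eb_input_mb_s / SFI_INPUT_MB_S                        # ~80
sfis_with_ef = sfis / (1.0 - EF_SERVING_PENALTY)             # ~100

print(f"{eb_input_mb_s:.0f} MB/s -> {sfis:.0f} SFIs, "
      f"{sfis_with_ef:.0f} with the EF-serving penalty")
```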

15 ROS performance
ROS units contain 12 read-out buffers each, so 150 units are needed for ATLAS. A ROS unit is implemented as a 3.4 GHz PC housing 4 custom PCI-X cards (ROBINs), each ROBIN implementing 3 read-out buffers. [Plots: (1) paper-model estimate of the requirements on the ROS units, (2) measurements on real ROS hardware, showing the "hottest" ROS from the paper model together with the low- and high-luminosity operating regions.] The performance of the final ROS (PC + ROBIN) is already above the requirements. See Gokhan Unel's talk.
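A quick consistency check of the counts above; the reading that the 700 ROBINs on the production slide include spares beyond the 600 mounted in ROS units is mine, not a statement from the slides:

```python
# Consistency check of the ROS/ROBIN counts quoted above.

ROS_UNITS = 150
ROBINS_PER_ROS = 4
BUFFERS_PER_ROBIN = 3

buffers_per_ros = ROBINS_PER_ROS * BUFFERS_PER_ROBIN   # 12 read-out buffers per ROS
total_buffers = ROS_UNITS * buffers_per_ros            # 1800 read-out buffers
mounted_robins = ROS_UNITS * ROBINS_PER_ROS            # 600 ROBINs mounted in ROSs
print(buffers_per_ros, total_buffers, mounted_robins)
```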

16 ARCHITECTURE (Functional elements)
[Diagram: full chain. Level 1 reduces the 40 MHz bunch-crossing rate to 100 kHz, LVL2 to ~3.5 kHz, and the Event Filter to ~200 Hz. Built events are sent over the Event Filter network (EFN) to the Event Filter Processors (EFPs), which take of order a second per event; accepted events (~0.2 kHz) go to the Sub-Farm Output (SFO) at ~300 MB/s.]
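A small sketch of the rejection achieved at each level and the resulting output bandwidth, using only the rates quoted in these diagrams plus the 1.6 MB event size from the Event Builder slide (200 Hz x 1.6 MB ≈ 320 MB/s, consistent with the ~300 MB/s quoted); the rounding and layout are mine:

```python
# Rate and bandwidth reduction through the trigger chain, using the
# figures quoted in the architecture diagrams (1.6 MB events).

EVENT_SIZE_MB = 1.6
levels = [
    ("Bunch crossings", 40_000_000),
    ("LVL1 accept",        100_000),
    ("LVL2 accept",          3_500),
    ("EF accept",              200),
]

for (name, rate), (_, next_rate) in zip(levels, levels[1:]):
    print(f"{name:>15}: {rate:>11,} Hz  (rejection x{rate / next_rate:.0f})")

out_name, out_rate = levels[-1]
print(f"{out_name:>15}: {out_rate:>11,} Hz  -> "
      f"{out_rate * EVENT_SIZE_MB:.0f} MB/s to storage")
```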

17 ARCHITECTURE (Functional elements)
[Same diagram as the previous slide, with the functional elements grouped into the Level 1 trigger, the High Level Trigger (LVL2 and Event Filter) and the Dataflow (read-out, event building, output).]

18 ARCHITECTURE (Functional elements)
[Same diagram annotated with the system size: ~150 ROS nodes, ~500 LVL2 nodes, ~100 Event Builder nodes and ~1600 Event Filter nodes, plus the infrastructure for control, communication and databases.]

19 DAQ/HLT Large Scale Tests 2005
Carried out on the LXBATCH test bed at CERN over 5 weeks, 12 June to 22 July 2005, with the farm size increasing in steps from 100 to 700 dual nodes (each node: <= 1 GHz, >= 500 MB RAM, Linux SLC3). Objectives: verify the functionality of the integrated DAQ/HLT software system at a scale close to the final ATLAS size; study the optimal control structure; investigate the best HLT sub-farm granularity; take selected performance measurements for trend analysis; include for the first time trigger and event-selection algorithms (LVL2 and EF with algorithms, integrated with event building) in the DAQ/HLT tests; access the conditions database; and test the DAQ/HLT/offline software distribution method (HLT image). These were integration, sub-system and selected component tests; no dataflow performance was measured, since only one network was available. See Doris Burckhart-Chromek's talk.

20 TDAQ Deployment
[Diagram: the functional architecture (Level 1, detector read-out, ROS, LVL2, Event Builder, Event Filter) as deployed for ATLAS.]

21 Status

22 RoI Builder
A small RoI Builder is in place in USA15 (adequate for 3 subsystems, mostly for testing and prototyping); stand-alone tests are done. The full system is being built and tested, with delivery expected in March. Tests with muon, TTC and CTP inputs are being done now.

23 ROS production: total 150 units, 700 ROBINs
ROS PCs: tendering for the full amount is completed and the test of the selected PC (Elonex) was successful; the first 50 PCs have been delivered and the next 60 are to be ordered soon. ROBINs: the first 380 are at CERN, all tested and validated with a ROS PC. Full ROS systems: 50 units are being installed now; installation is completed for the Tile calorimeter (8 ROSs) and ongoing for the LAr calorimeter (27 installed so far); ~60 more follow later in 2006 and the rest is deferred. 11 ROS prototypes (pre-series) are installed in USA15 and were successfully used with TileCal for cosmic data taking. "Mobile" ROS: while waiting for the final ROSs, simpler systems are provided to the detectors for tests and commissioning, e.g. a 7-FILAR (28-link) mobile ROS for LAr in USA15.

24 Pre-series system in ATLAS point-1
The pre-series system comprises 8 racks, about 10% of the final dataflow, split between the underground counting room (USA15) and the surface building (SDX1). In USA15: a RoIB rack (TC rack with horizontal cooling, 50% of the RoIB) and one ROS rack (TC rack with horizontal cooling, 12 ROSs with 48 ROBINs). In SDX1: one full L2 rack (TDAQ rack, 30 HLT PCs), a partial EF rack (TDAQ rack, 12 HLT PCs), a partial supervisor rack (TDAQ rack, 3 HE PCs), a partial online rack (TDAQ rack, 4 HLT PCs for monitoring, 2 LE PCs for control, 2 central file servers), one switch rack (TDAQ rack, 128-port GEth switch for L2 + EB), and a partial EFIO rack (TDAQ rack, 10 HE PCs: 6 SFIs, SFOs, DFM). See Gokhan Unel's talk.

25 DAQ Infrastructure already installed at the ATLAS site
Definition: the combination of hardware and software needed as the basis on which the other DAQ services are installed and commissioned. Components: the DAQ infrastructure servers (file, boot and log servers, configuration database servers); the online services and online servers, i.e. the online software infrastructure (control, configuration, monitoring) and the machines on which these services run; the control computers, i.e. the computers from which users operate; and software installation and upgrades. See M. Dobson's talk.

26 First cosmics seen by ATLAS in the pit

27 Conclusions
The ATLAS TDAQ system has a three-level trigger hierarchy making use of the Region-of-Interest mechanism, giving an important reduction in data movement. The architecture has been validated through the deployment of full systems on testbed prototypes, at the ATLAS H8 test beam, and on the pre-series system now being exploited; detailed modelling has been used to extrapolate to full size. The dataflow applications and protocols have been tested and perform according to specification. L2 software performance studies based on complete algorithms and realistic raw-data input indicate that our target processing times are realistic. The system design is complete and installation has started; some components are already being used for ATLAS detector commissioning, such as the online infrastructure at Point 1 and the mobile ROS for single-detector read-out.

