Operating the ATLAS Data-Flow System with the First LHC Collisions
Nicoletta Garelli, CERN – Physics Department/ADT, on behalf of the ATLAS TDAQ group
Outline
- Introduction: run efficiency of ATLAS (A Toroidal LHC ApparatuS); the ATLAS Trigger and Data AcQuisition (TDAQ) system
- TDAQ working conditions in 2010: rates & bandwidths; Event Builder & local storage; High Level Trigger (HLT) farm
- TDAQ monitoring system and working-point prediction
- TDAQ working beyond design specifications: ATLAS full/partial event building; exploring the TDAQ potential fast
ATLAS Recorded Luminosity
- LHC: built to find the Higgs boson and new physics beyond the Standard Model
- Nominal working conditions: p-p beams at √s = 14 TeV, L = 10^34 cm^-2 s^-1, bunch crossing every 25 ns; Pb-Pb beams at √s = 5.5 TeV, L = 10^27 cm^-2 s^-1
- 2010: √s = 7 TeV (first collisions on March 30th); commissioning with 150 ns bunch trains, up to 233 colliding bunches in ATLAS (October); peak L ~ 10^32 cm^-2 s^-1 (October); first ion collisions scheduled in November
- Figure: CERN accelerator complex (SPS, PS, LHC with ALICE, ATLAS, CMS, LHCb) and ATLAS integrated luminosity vs. time; on October 14th, 2010: 20.6 pb^-1 delivered in stable beams, 18.94 pb^-1 recorded with the detector ready
ATLAS Run Efficiency
- ATLAS efficiency at stable beams, √s = 7 TeV (not luminosity weighted), over 752.7 h of stable beams (March 30th – October 11th)
- Run Efficiency, 96.5% (green): fraction of time in which ATLAS is recording data while the LHC is delivering stable beams
- Run Efficiency Ready, 93% (grey): fraction of time in which ATLAS is recording physics data with the innermost detectors at nominal voltages (a safety aspect: these detectors are only ramped to nominal voltage after stable beams are declared and ramped down again before the beams are dumped, the "warm start/stop" procedure, which accounts for most of the difference between the two numbers)
- Key functionality for maximizing the efficiency:
  - data taking starts at the beginning of the LHC fill
  - stop-less removal/recovery: automated removal and recovery of channels which stopped the trigger
  - dynamic resynchronization: automated procedure to resynchronize channels which lost synchronization with the LHC clock, without stopping the trigger
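As a quick illustration, the two efficiencies above can be turned back into hours of data taking; this is a minimal sketch using only the numbers quoted on this slide (the breakdown of the lost time is not detailed here):

```python
# Translate the two run efficiencies into hours, taking the definitions
# literally (time recording / time of stable beams).  Only the numbers
# quoted on the slide enter here; the split of the lost time is not.

stable_beam_hours = 752.7        # stable beams at 7 TeV, Mar 30 - Oct 11

run_eff = 0.965                  # recording while LHC delivers stable beams
run_eff_ready = 0.93             # recording with innermost detectors at nominal HV

print(f"recording:         {run_eff * stable_beam_hours:6.1f} h")
print(f"recording 'ready': {run_eff_ready * stable_beam_hours:6.1f} h")
print(f"not recording:     {(1 - run_eff) * stable_beam_hours:6.1f} h "
      "(warm start/stop, recoveries, resynchronizations, ...)")
```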
TDAQ Design
- Figure: ATLAS trigger/DAQ architecture, from the detector front-ends through Level 1, the ReadOut System, Level 2, the Event Builder and the Event Filter to CERN data storage
- ATLAS event size ~1.5 MB at 25 ns bunch spacing (40 MHz crossing rate)
- Level 1 (Calo/Muon custom hardware): accept rate 75 (upgradeable to 100) kHz, front-end latency < 2.5 μs; defines the Regions Of Interest (ROIs); 112 (150) GB/s from the Read-Out Drivers (RODs) into the ReadOut System
- Level 2: requests only the ROI data (~2% of the event) through the Data Collection Network; decision time ~40 ms; accept rate ~3 kHz
- Event Builder: assembles L2-accepted events in the SubFarmInputs at ~4.5 GB/s
- Event Filter: processes fully built events in ~4 s; accept rate ~200 Hz; ~300 MB/s shipped through the SubFarmOutputs to CERN data storage
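The design bandwidths quoted above follow directly from the accept rates and the average event size; a back-of-the-envelope sketch of that arithmetic (numbers are the ones on this slide):

```python
# Design throughput at each trigger level = accept rate x average event size.

EVENT_SIZE_MB = 1.5              # average ATLAS event size

design_rates_hz = {
    "Level 1 accept":      75_000,   # upgradeable to 100 kHz
    "Level 2 accept":       3_000,
    "Event Filter accept":    200,
}

for stage, rate in design_rates_hz.items():
    throughput_gb_s = rate * EVENT_SIZE_MB / 1000.0
    print(f"{stage:20s}: {rate:>7d} Hz -> {throughput_gb_s:6.1f} GB/s")

# -> ~112.5 GB/s into the ReadOut System, ~4.5 GB/s into the Event
#    Builder, ~0.3 GB/s (~300 MB/s) to CERN data storage.
```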
ATLAS TDAQ System
- Figure: layout of the TDAQ system with the number of nodes per component (the control, configuration and monitoring network is not shown)
- Underground (UX15/USA15): ~90M detector channels read out through VME-based Read-Out Drivers (RODs); ~1600 Read-Out Links into the Read-Out Subsystems (ROSes, ~150 nodes); Level 1 trigger, Region Of Interest (ROI) Builder and Timing Trigger Control (TTC)
- Surface (SDX1): Level 2 farm and Level 2 Supervisors, DataFlow Manager, Event Builder (SubFarm Inputs, SFIs), Event Filter (EF) farm, Local Storage (SubFarm Outputs, SFOs), network switches, online/monitoring and file servers; data files shipped to the CERN computer centre
- Event data requests and delete commands flow down to the ROSes; the requested event data flow up through the Data Collection Network
- ~98% of the detector fully operational in 2010
TDAQ Farm Status
- Component – installed – comments:
  - Online & Monitoring – 100% – ~60 nodes
  - ROSes – ~150 nodes
  - ROIB & L2SVs
  - HLT (L2 + EF) – ~50% – ~800 xpu nodes; ~300 EF nodes
  - Event Builder – ~60 nodes (exploiting multi-core)
  - SFO – headroom for high instantaneous throughput
  - Networking – redundancy deployed in critical areas
- 27 xpu racks, ~800 xpu nodes; XPU = L2 or EF Processing Unit: on a run-by-run basis each node can be configured to run either as L2 or as EF
- The possibility to move processing power between L2 and EF gives high flexibility to meet the trigger needs; the functional assignment (L2 or EF) is not automated (see the sketch after this list)
- 2 Gbps output per xpu rack: maximum possible EF bandwidth with the 27 xpu racks ~6 GB/s
- The other farms were installed ahead of the LHC performance ramp-up; only the HLT farm grows with it
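To make the per-run XPU flexibility concrete, here is a hypothetical sketch of what such a rack assignment looks like; the rack names, the helper function and the splitting logic are illustrative only, not the ATLAS configuration system:

```python
# Hypothetical sketch of the run-by-run XPU flexibility: each of the
# 27 xpu racks is booked as either an L2 or an EF rack when the run is
# configured.  Names and the contiguous split are purely illustrative.

XPU_RACKS = [f"rack-xpu-{i:02d}" for i in range(1, 28)]

def assign_xpu_racks(n_l2_racks: int) -> dict:
    """Split the xpu racks between L2 and EF for the next run."""
    assert 0 <= n_l2_racks <= len(XPU_RACKS)
    return {"L2": XPU_RACKS[:n_l2_racks], "EF": XPU_RACKS[n_l2_racks:]}

# Example: the rebalancing of 24 September (9 -> 12 L2 xpu racks).
allocation = assign_xpu_racks(n_l2_racks=12)
print(len(allocation["L2"]), "L2 racks,", len(allocation["EF"]), "EF xpu racks")
```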
Event Builder and SubFarm Output
- The Event Builder (EB) collects the fragments of each L2-accepted event into a single data structure at a single place: 3 kHz (L2 accept rate) × 1.5 MB (event size) ≈ 4.5 GB/s (EB input)
- The EB is able to handle a wide range of event sizes, from O(100 kB) to O(10 MB)
- The EB sends built events to the EF farm; events accepted by the EF are sent to the SFOs
- SFO effective throughput (240 MB/s per node): 1.2 GB/s in total, to be compared with the ~300 MB/s design aim
- Event distribution into data files follows the data-stream assignment (express, physics, calibration, debug); data files are asynchronously transferred to mass storage
- Figures: (1) EB input bandwidth and (2) EF output bandwidth vs. time; dashed lines mark the nominal working points (design spec. ~4.5 GB/s for the EB input, ~300 MB/s for the SFO input); the acceptance was kept high while commissioning L2 and the EF
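An illustrative sketch (not ATLAS code) of what "collecting an event into a single data structure" means: an SFI gathers the fragment of a given L1-accepted event from every ROS and concatenates them. Fragment names, sizes and the header format are assumptions for illustration:

```python
# Illustrative event building: one full event is assembled from the
# per-ROS fragments belonging to the same Level-1 event identifier.

from typing import Dict

def build_event(l1_id: int, ros_fragments: Dict[str, bytes]) -> bytes:
    """Assemble one full event from the per-ROS fragments."""
    header = f"EVENT {l1_id} nfrag={len(ros_fragments)}\n".encode()
    payload = b"".join(ros_fragments[name] for name in sorted(ros_fragments))
    return header + payload

# A full event is ~1.5 MB spread over ~150 ROSes (~10 kB per ROS on average).
fragments = {f"ROS-{i:03d}": bytes(10_000) for i in range(150)}
event = build_event(l1_id=42, ros_fragments=fragments)
print(f"built event of {len(event) / 1e6:.2f} MB")
```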
SubFarm Output Beyond the Design
- Summer 2010 SFO farm: 5+1 nodes, each with 3 HW RAID arrays, total capacity ~50 TB (= 2 days of disk buffer in case of a mass-storage failure)
- Round-robin policy across the arrays to avoid concurrent I/O (see the sketch below)
- 4 Gbps connection to CERN data storage
- LHC van der Meer scan: SFO input rate up to 1.3 GB/s, data transfer to mass storage ~920 MB/s for about 1 hour
- LHC tertiary collimator setup: SFO input rate ~1 GB/s, data transfer to mass storage ~950 MB/s for about 2 hours
- ~1 GB/s input throughput sustained during special runs to allow low event rejection
- Figures: traffic to the SFOs and traffic to CERN data storage vs. time
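A minimal sketch of the round-robin policy mentioned above: successive data files rotate over the node's three RAID arrays, so the file currently being written and the file currently being shipped to mass storage do not compete for the same disks. Mount points and file naming are hypothetical:

```python
# Round-robin selection of the RAID array for each new SFO data file.

import itertools

RAID_MOUNT_POINTS = ["/raid1", "/raid2", "/raid3"]   # 3 HW RAID arrays per node
_next_array = itertools.cycle(RAID_MOUNT_POINTS)

def next_output_path(stream: str, file_index: int) -> str:
    """Pick the array and file name for the next data file of a stream."""
    return f"{next(_next_array)}/{stream}/data_{file_index:06d}.raw"

for i in range(5):
    print(next_output_path("physics_MinBias", i))
```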
Partial Event Building (PEB) & Event Stripping
- Calibration events require only part of the full event information: event size ≤ 500 kB
- Dedicated triggers for calibration, plus events selected for both physics and calibration
- PEB: calibration events are built using only the fragments on a list of detector identifiers (see the sketch below)
- Event Stripping: events selected by L2 for physics and calibration are completely built; the subset of the event useful for calibration is copied out at the EF or at the SFO, depending on the EF result
- Output bandwidth at the EB: calibration bandwidth ~2.5% of the physics bandwidth
- High rate, using few TDAQ resources for assembling, conveying and logging calibration events
- A beyond-design feature, extensively used: calibration and physics data are taken at the same time, so continuous calibration complements the dedicated calibration runs
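An illustrative sketch of the PEB idea: for a calibration trigger, only the fragments whose detector identifier is on the trigger's list are assembled, keeping the calibration event at or below ~500 kB instead of the full ~1.5 MB. Detector names and fragment sizes are illustrative:

```python
# Partial Event Building: keep only the fragments of the requested detectors.

def build_partial_event(fragments: dict, wanted_detectors: set) -> dict:
    """Return the subset of fragments belonging to the requested detectors."""
    return {det: frag for det, frag in fragments.items() if det in wanted_detectors}

full_event = {"PIXEL": bytes(250_000), "SCT": bytes(350_000),
              "LAR": bytes(400_000), "TILE": bytes(200_000), "MUON": bytes(300_000)}

lar_calib = build_partial_event(full_event, wanted_detectors={"LAR"})
print(f"{sum(map(len, lar_calib.values())) / 1e3:.0f} kB partial event "
      f"instead of {sum(map(len, full_event.values())) / 1e6:.1f} MB full event")
```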
Network Monitoring Tool (poster PO-WED-035)
- Scalable, flexible, integrated system for the collection and display, with the same look & feel, of: network statistics (200 network devices and 8500 ports), computer statistics, environmental conditions, data-taking parameters
- Tuned for network monitoring, but giving transparent access to the data collected by any number of monitoring tools: network utilization can be correlated with TDAQ performance
- Constant monitoring of the network performance and anticipation of possible limits
HLT Resources Monitoring Tool
- Dedicated calibration stream for monitoring the HLT performance (CPU consumption, decision time, rejection factor)
- L2: ~300 selection chains, decision time ~40 ms, rejection factor ~5
- EF: ~290 selection chains, decision time ~300 ms, rejection factor ~10
- Note: the rejection factors were still low during the commissioning phase
- Example, September 24th: for L ≥ 5×10^31 cm^-2 s^-1 the tools predicted a need for more CPU power at the HLT (HLT Resources Monitoring Tool) and for more bandwidth at the EB output (Network Monitoring Tool)
- 9 dedicated EF racks with 10 Gbps/rack were enabled (~11 GB/s of additional bandwidth): bandwidth sufficient even with a saturated EB
- HLT resources before / after September 24th:
  - max bandwidth: 4.5 GB/s → 15 GB/s
  - # L2 racks: 9 xpu → 12 xpu
  - # EF racks: 18 xpu → 15 xpu + 9 EF
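A rough sizing rule behind this kind of rebalancing: the number of busy processing slots at each level is approximately the input rate times the mean decision time. The sketch below only uses numbers quoted in these slides (~20 kHz L1 accept, L2 rejection ~5, ~40 ms and ~300 ms decision times); it is an illustration, not the actual monitoring tool:

```python
# Little's-law style estimate of the HLT processing slots kept busy.

def busy_slots(input_rate_hz: float, mean_time_s: float) -> float:
    """Average number of processing slots occupied at a trigger level."""
    return input_rate_hz * mean_time_s

l2_input_rate = 20_000                 # ~20 kHz L1 accept rate in 2010 running
ef_input_rate = l2_input_rate / 5      # L2 rejection factor ~5

print("L2 slots needed:", busy_slots(l2_input_rate, 0.040))   # ~40 ms  -> ~800
print("EF slots needed:", busy_slots(ef_input_rate, 0.300))   # ~300 ms -> ~1200
```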
TDAQ 2010
- Same data-flow view as the design slide, annotated with the 2010 working point: event size ~1.5 MB at 150 ns bunch spacing; L1 accept ~20 kHz (~30 GB/s read-out); L2 decision time ~40 ms, accept ~3.5 kHz (~5.5 GB/s into the EB); EF processing time ~300 ms, accept ~350 Hz (~550 MB/s to storage)
- A high rate test with random triggers reached ~65 kHz at Level 1
BEYOND Design – Trigger Commissioning: Balancing L2 and EF
- While commissioning the trigger, the Event Builder was operated at up to ~4.5 kHz and ~7 GB/s, above the 3 kHz / 4.5 GB/s design values
BEYOND Design – LHC van der Meer Scans
- Recording rates up to ~1.3 kHz and bandwidths up to ~2 GB/s
Outlook: Exploring the Data-Flow Phase Space
- Figure: sustainable working points as a function of the number of L2 xpu racks, of EF-specific racks and of the L2/EF processing times, for an assumed L1 rate of 100 kHz; today's and the design working points are marked
Conclusions
- In 2010 the ATLAS TDAQ system has operated beyond its design requirements to meet the changing working conditions, the trigger commissioning and the understanding of detector, accelerator and physics performance
- Monitoring tools are in place to predict the working conditions and thus establish the HLT resource balancing
- Partial Event Building is used to continuously tune the detectors, commission the trigger, maximize the rate at which calibration events can be collected, and promptly analyse the beam spot
- The data-logging farm is regularly used beyond its design specifications
- The number of EF nodes evolved to meet the EB rate and processing power required in 2010, i.e. L = O(10^32) cm^-2 s^-1; more CPU power will be installed to meet the evolving needs
- High Run Efficiency for physics of 93%
- Ready for steady running in 2011 with a further increase of L
Backup Slides
EB Performance
- Figure: EB throughput vs. number of SFIs. Red and green lines: building without the EF, with 90 SFIs, for two different network protocols. Blue line: throughput with the final configuration (today's working point marked). Violet line: last year's configuration, using as many SFIs as possible.
BEYOND Expectations – High Rate Test with Random Triggers
- Level 1 rate of ~65 kHz, corresponding to ~100 GB/s of read-out throughput
BEYOND Design – Full Scan for B-Physics
- The ReadOut System and the Data Collection Network sustained L2 full-event scans at 21 kHz (maximum ~23 kHz)
Explore the Possibilities of TDAQ
- Evaluate the TDAQ limits considering: the number of L2/EF xpu racks, the L2/EF processing times, the L2/EF rejection powers
- Figure: curves give the maximum EF processing time sustainable with the racks used as EF; the labels (5; 2.8 GB/s), (7; 3.6 GB/s), (8; 4.4 GB/s), (10; 5.4 GB/s), (12; 6.2 GB/s), (14; 7.2 GB/s), (16; 8 GB/s) pair a number of racks with the corresponding EB bandwidth; vertical dashed lines mark the EB bandwidth and the corresponding EF rejection power needed to keep the SFO bandwidth at ~500 MB/s
- We aim at a maximum EB throughput of ~5 GB/s (green dashed line), which implies an EF rejection of ~10 and an L2 rejection power of ~18, an L2 average time of ~40 ms (dashed black line) and ~8 L2 racks
- The L2/EF processing times, the L2 rejection power and the number of available racks are still far from the limits: TDAQ is ready for operation at L = 10^32 cm^-2 s^-1, and the HLT requirements can be adjusted as needed (see the sketch below)
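The curves on this plot follow from a simple relation: the maximum mean EF processing time that can be sustained is the number of EF processing slots divided by the event rate coming out of the EB. A hedged sketch of that relation; the cores-per-rack figure is an assumption for illustration only, not the actual farm specification:

```python
# Maximum sustainable mean EF processing time for a given number of EF
# racks and EB throughput.  CORES_PER_RACK is a hypothetical figure.

CORES_PER_RACK = 8 * 31          # assumed: ~31 nodes per rack, 8 cores each

def max_ef_processing_time(n_ef_racks: int,
                           eb_throughput_gb_s: float,
                           event_size_mb: float = 1.5) -> float:
    """Seconds of mean EF processing time sustainable at this EB throughput."""
    ef_input_rate_hz = eb_throughput_gb_s * 1000.0 / event_size_mb
    return n_ef_racks * CORES_PER_RACK / ef_input_rate_hz

# Example: target EB throughput ~5 GB/s (~3.3 kHz of 1.5 MB events).
for racks in (9, 15, 24):
    t = max_ef_processing_time(racks, eb_throughput_gb_s=5.0)
    print(f"{racks:2d} EF racks -> max mean EF time ~{t:.2f} s")
```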