ATLAS HLT in PPD
John Baines, Dmitry Emeliyanov, Julie Kirk, Monika Wielers, Will Dearnaley, Fred Wickens, Stephen Burke
Overview
Introduction
ID trigger
e/gamma trigger
B-physics trigger
ROS & Farms
Upgrade
Who’s doing what
John: HLT UK project leader (M&O and Upgrade)
Dmitry: Inner Detector Trigger & Upgrade – lead developer for L2STAR
Monika: Electron & photon trigger
Julie: B-physics trigger – B-trigger coordinator
Will: B-physics trigger
Fred: ROS & HLT farms
Stephen: Upgrade
The ATLAS Trigger
Level 1 (LVL1): fast, custom-built electronics; latency t < 2.5 μs; output < 75 kHz
Level 2 & Level 3 (Event Filter): software-based, running on a large PC farm
– Level 2: fast custom algorithms; reconstruction mainly in Regions of Interest (RoI) => limited data access; ~40 ms per event; output ~3 kHz
– Level 3 = Event Filter (EF): offline tools inside custom wrappers; access to full event information; ~1800 PCs; ~4 s per event; output ~300 Hz (~300 MB/s)
[Diagram: data flow from the Front End Pipelines through the Readout Buffers, Event Builder and Full Event Buffers to storage & offline processing; Level 1 uses Calo, Muon and Min. Bias inputs; Level 2 requests data in the RoI; the Event Filter has access to the full event]
Inner Detector Trigger: UK & Genoa
UK: UCL, Sussex, Manchester, RAL
RAL: L2 tracking
– L2STAR development
– Parallelisation / use of coprocessors
– Use of FTK tracks
Inner Detector Trigger: Dmitry
Pre-2012 running:
– common code, “service elements”
– algorithm-specific code
[Diagram: pre-2012 L2 ID tracking structure – TrigSteering runs TrigIDSCAN and TrigSiTrack; common service elements (Data Provider, Track Fitting, TRT track extension, Monitoring) plus algorithm-specific code (ZFinder and HitFilter for IDSCAN, combinatorial pattern recognition for SiTrack)]
Two HLT algorithms:
– SiTrack (Genoa): combinatorial method; used for the beamspot, tau & b-jet triggers
– IDSCAN (UK): fast histogramming method; used for the muon, e/gamma & B-physics triggers
Common: track fitting and TRT extension, written by Dmitry
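Below is a minimal, illustrative sketch of the fast-histogramming idea behind the IDSCAN ZFinder: space-point pairs are extrapolated to the beam line and their z-intercepts histogrammed, with the most populated bin giving the vertex position. The SpacePoint struct, findZVertex() and all cuts/units are assumptions for illustration, not the ATLAS code.

```cpp
// Minimal sketch of a fast-histogramming z-vertex finder in the spirit of the
// IDSCAN ZFinder.  Not the ATLAS code: SpacePoint, the phi cut and the
// z range (mm) are illustrative only.
#include <algorithm>
#include <cmath>
#include <iterator>
#include <vector>

struct SpacePoint { double r, z, phi; };   // hypothetical hit representation

// Pair space points from an inner and an outer layer, extrapolate each pair to
// the beam line (r = 0) and histogram the resulting z.  The most populated bin
// gives the candidate vertex z.
double findZVertex(const std::vector<SpacePoint>& inner,
                   const std::vector<SpacePoint>& outer,
                   double zMin = -250.0, double zMax = 250.0, int nBins = 500)
{
  std::vector<int> hist(nBins, 0);
  const double binWidth = (zMax - zMin) / nBins;

  for (const auto& a : inner) {
    for (const auto& b : outer) {
      if (b.r <= a.r) continue;                      // need a genuine inner/outer pair
      if (std::abs(a.phi - b.phi) > 0.1) continue;   // rough phi matching within the RoI
      // extrapolate the straight line through the pair to the beam line (r = 0)
      const double z0 = a.z - a.r * (b.z - a.z) / (b.r - a.r);
      if (z0 <= zMin || z0 >= zMax) continue;
      ++hist[static_cast<int>((z0 - zMin) / binWidth)];
    }
  }
  const auto peak = std::max_element(hist.begin(), hist.end());
  return zMin + (std::distance(hist.begin(), peak) + 0.5) * binWidth;
}
```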
ID Tracking Performance in 2011
[Plots: L2 ID execution time vs mean number of interactions (Oct 2011), and ID tracking efficiency for electrons]
– Electron RoI: ~10 ms/RoI
– Linear rise with luminosity (LHC design: ⟨μ⟩ = 23)
L2STAR: new L2 tracking framework
[Diagram: L2STAR design – TrigSteering runs a single unified algorithm with common Data Provider, Track Fitting, TRT track extension and Monitoring; a common interface ITrigL2PattRecoStrategy with m_findTracks(RoI, data) : HLT::ErrorCode and m_findTracks(data) : HLT::ErrorCode; a configurable list of track-finding strategy plug-ins: Strategy A (fast histogramming), Strategy B (combinatorial), Strategy F (start from FTK track)]
Design goals:
– Provide a single configurable algorithm for L2 tracking
– Simplify menus
– Remove duplication of code
– Provide different track-finding strategies
Status:
– L2STAR has replaced IDSCAN and SiTrack in the 2012 trigger menu
– L2STAR is being used for data-taking in 2012
– Next phase of development targets 2014 running
Dmitry: lead developer
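A hedged sketch of the plug-in design described above: a common strategy interface plus a unified algorithm that runs a configurable list of track-finding strategies. The interface and method names follow the slide (ITrigL2PattRecoStrategy, m_findTracks); the surrounding types (RoIDescriptor, SpacePointCollection, TrackCollection) and the TrigL2SiTrackFinder wrapper are illustrative placeholders rather than the real Athena classes.

```cpp
// Illustrative sketch of the L2STAR strategy pattern; not the ATLAS code.
#include <memory>
#include <vector>

struct RoIDescriptor {};          // placeholder RoI description
struct SpacePointCollection {};   // placeholder input data
struct TrackCollection {};        // placeholder output tracks

namespace HLT { enum ErrorCode { OK, ERROR }; }

class ITrigL2PattRecoStrategy {
public:
  virtual ~ITrigL2PattRecoStrategy() = default;
  // RoI-seeded pattern recognition
  virtual HLT::ErrorCode m_findTracks(const RoIDescriptor& roi,
                                      const SpacePointCollection& data,
                                      TrackCollection& out) = 0;
  // full-scan pattern recognition
  virtual HLT::ErrorCode m_findTracks(const SpacePointCollection& data,
                                      TrackCollection& out) = 0;
};

// The "unified algorithm": common data preparation, then each configured
// strategy (fast histogramming, combinatorial, FTK-seeded, ...) in turn.
class TrigL2SiTrackFinder {
public:
  void addStrategy(std::unique_ptr<ITrigL2PattRecoStrategy> s) {
    m_strategies.push_back(std::move(s));
  }
  HLT::ErrorCode execute(const RoIDescriptor& roi,
                         const SpacePointCollection& data,
                         TrackCollection& out) {
    for (auto& strategy : m_strategies) {
      if (strategy->m_findTracks(roi, data, out) != HLT::OK)
        return HLT::ERROR;
    }
    return HLT::OK;   // common track fit / TRT extension would follow here
  }
private:
  std::vector<std::unique_ptr<ITrigL2PattRecoStrategy>> m_strategies;
};
```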
e/gamma (Monika)
Finalizing the 2011 performance
– with the combined e/gamma efficiency group
– will appear in an e/gamma trigger conference note (nearly through the approval process)
– Efficiencies w.r.t. offline loose/medium/tight selections
– For the lowest unprescaled single triggers: e20_medium, e22_medium and e22vh_medium1
– Extract efficiencies in 1D as a function of η and E_T using tag & probe with Z→ee
– Calculate systematics by varying the Z(ee) invariant-mass window and the tightness of the tag
e/gamma (Monika)
2011 performance
– Extraction of scale factors (SF = ε(data)/ε(MC)) using MC11a, MC11c and Atlfast
– These numbers are available within the e/gamma combined performance group tools for use in any D3PD/AOD analysis
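As a minimal illustration of the tag-and-probe efficiency and scale-factor extraction described on these slides, the sketch below computes ε = passed/total per (η, E_T) bin with a simple binomial error and SF = ε(data)/ε(MC) with uncorrelated error propagation. The Counts/Efficiency types and the example numbers are assumptions, not the e/gamma group tools; the real systematics come from varying the Z mass window and the tag tightness.

```cpp
// Illustrative tag-and-probe efficiency and scale-factor calculation.
#include <cmath>
#include <cstdio>

struct Counts {        // probes in one (eta, ET) bin
  double passed;       // probes matched to the trigger object
  double total;        // all probes from Z->ee tag & probe
};

struct Efficiency { double value, error; };

Efficiency efficiency(const Counts& c)
{
  const double eps = (c.total > 0.0) ? c.passed / c.total : 0.0;
  // simple binomial uncertainty (systematics handled separately)
  const double err = (c.total > 0.0) ? std::sqrt(eps * (1.0 - eps) / c.total) : 0.0;
  return {eps, err};
}

// Scale factor SF = eff(data) / eff(MC), with uncorrelated error propagation.
Efficiency scaleFactor(const Counts& data, const Counts& mc)
{
  const Efficiency d = efficiency(data);
  const Efficiency m = efficiency(mc);
  if (d.value == 0.0 || m.value == 0.0) return {0.0, 0.0};
  const double sf  = d.value / m.value;
  const double rel = std::sqrt(std::pow(d.error / d.value, 2) +
                               std::pow(m.error / m.value, 2));
  return {sf, sf * rel};
}

int main() {
  const Counts data{940, 1000}, mc{960, 1000};   // made-up numbers
  const Efficiency sf = scaleFactor(data, mc);
  std::printf("SF = %.3f +- %.3f\n", sf.value, sf.error);
}
```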
e/gamma (Monika)
First look at 2012 performance using data from period A
– Measure efficiency using Z(ee) tag & probe
– For the lowest-threshold unprescaled triggers: e22vh_medium, e22vhi_medium, e24vh_medium, e24vhi_medium
Prepare for higher luminosity:
– Selections tighter than in 2011 → lower efficiency
– Check for pile-up dependence: not conclusive yet, but a hint of a small pile-up dependence
(v: variable threshold in η bins of 0.4 at L1; h: hadronic core isolation cut; i: track isolation used at EF)
e/gamma (Monika)
Looking in detail at where losses occur
– Losses due to the b-layer cut were traced back to incorrect handling of noisy modules
– Other sources of losses under study
B-physics - Julie
B-physics programme:
– Low p_T di-muon signatures: Onia studies (J/ψ→μ+μ−, ϒ→μ+μ−)
– Mixing and CP violation studies: B → J/ψ(μμ)X
– Rare and semi-rare B decays: B → μμ(X)
Trigger: low-p_T (4, 6 GeV) di-muon
– 2 muons at Level 1
– Confirmed in the High Level Trigger
– Require vertex fit and mass cuts
Unprescaled ‘2mu4’ trigger for J/ψ, ϒ and B→μ+μ− throughout 2011 data-taking
– Large samples of events recorded for the B-physics programme
– “DiMu” prescaled in the latter part of 2011 data-taking
– About 50% of the total 2011 data sample

Trigger           Mass window      No. of events (M)
2mu4_DiMu         1.4 – 14 GeV     27
2mu4_Jpsimumu     2.5 – 4.3 GeV    14
2mu4_Bmumu        4 – 8.5 GeV      3.7
2mu4_Upsimumu     8 – 12 GeV       9.1
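For illustration, a minimal sketch of the kind of di-muon invariant-mass window cut applied by the chains in the table above (e.g. 2.5–4.3 GeV for 2mu4_Jpsimumu). The Muon struct and the units are assumptions; this is not the ATLAS hypothesis algorithm.

```cpp
// Hedged sketch of a di-muon mass-window selection; illustrative only.
#include <cmath>

struct Muon { double px, py, pz, e; };   // four-momentum in MeV (assumption)

double invariantMassGeV(const Muon& m1, const Muon& m2)
{
  const double e  = m1.e  + m2.e;
  const double px = m1.px + m2.px;
  const double py = m1.py + m2.py;
  const double pz = m1.pz + m2.pz;
  const double m2inv = e * e - (px * px + py * py + pz * pz);
  return (m2inv > 0.0 ? std::sqrt(m2inv) : 0.0) / 1000.0;   // MeV -> GeV
}

// Accept the di-muon candidate if its invariant mass falls in the configured
// window, e.g. [2.5, 4.3] GeV for the J/psi chain, [4, 8.5] GeV for B->mumu.
bool passMassWindow(const Muon& m1, const Muon& m2,
                    double loGeV, double hiGeV)
{
  const double m = invariantMassGeV(m1, m2);
  return m > loGeV && m < hiGeV;
}
```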
B-physics: Changes for 2012
“2mu4” rates are too high above 3×10^33 → baseline triggers move to “2mu6_xxxxx”
BUT lower yield for B-physics; 2mu6 yield w.r.t. 2mu4:
  Jpsi      25%
  Upsilon   12%
  Bmumu     36%
To keep some of the lower thresholds, introduce “Barrel” triggers. Efficiencies w.r.t. L1_2MU4:
  EF_2mu4T_Bmumu              68%
  EF_mu4Tmu6_Bmumu            55%
  EF_2mu4T_Bmumu_Barrel       43%
  EF_mu4Tmu6_Bmumu_Barrel     34%
  EF_2mu4T_Bmumu_BarrelOnly   33%
  EF_2mu6_Bmumu               24%
Barrel triggers: OK for L1 rate, but the EF rate is still too high → introduce “delayed” stream
– Will only be processed when there is space at Tier-0 – possibly not until
B-trigger (Julie)
First look at 2012 data compared to 2011
– Some runs from period A (using mu18_medium)
– J/ψ tag-and-probe using “mu18_medium” as the tag
[Plot: trigger efficiency vs p_T of the probe muon, period L 2011 compared with period A 2012]
B-Physics Trigger: Will
– Trigger efficiency measurements
– Validation of the trigger in Sample A & Sample T productions
– Muon & B-physics on-call
– B-physics systematics from trigger choice
– Currently working on providing B-trigger info in D3PD Maker
ROS & Farm
ROS: UCL & RAL (Fred)
[Plot: peak L2 data request rates per ROS PC]
– UK delivered 1/3 of the Read-Out System: 700 ROBin boards manufactured & tested in the UK
– ROS PCs sustained the required rates: up to ~20 kHz request rate per ROS PC, plus an additional 4–6 kHz to the Event Builder
– 2011 programme of rolling replacement (motherboard, CPU, memory) prioritised the ROS PCs with the highest load
Upgrade
Next phase of L2STAR:
– Greater modularisation
– Tracking for single-node running (L2 and EF combined on the same PC)
– Use of FTK information in the HLT
Parallelisation of code, use of co-processors
Parallelisation for HLT Upgrade: Dmitry
Dmitry is leading work to investigate parallelisation in the ATLAS Trigger:
– Use of Graphics Processing Units (GPUs)
– Use of many-core CPUs
A number of HLT algorithms have been successfully parallelised and re-implemented for GPUs, in particular:
– the full data preparation chain (bytestream decoding, pixel/strip clusterisation and spacepoint making) for the Pixel and SCT detectors (Dmitry + Jacob Howard, Oxford)
– a parallel GPU-accelerated track finder based on SiTrack (Dmitry)
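A minimal sketch of the parallel decomposition exploited in the data-preparation work above: clusterisation is independent module-by-module, so modules can be distributed over worker threads (or, in the GPU implementation, over thread blocks). It is shown here as a plain many-core CPU version; RawHit, Cluster and the trivial clusteriseModule() stand-in are illustrative, not the ATLAS data-preparation code.

```cpp
// Module-parallel clusterisation sketch (CPU threads); illustrative only.
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

struct RawHit  { int row, col; };
struct Cluster { double localX, localY; int size; };

// Trivial per-module stand-in: one cluster per raw hit.  The real code merges
// adjacent pixels/strips and computes cluster positions and errors.
std::vector<Cluster> clusteriseModule(const std::vector<RawHit>& hits)
{
  std::vector<Cluster> out;
  out.reserve(hits.size());
  for (const auto& h : hits)
    out.push_back({static_cast<double>(h.col), static_cast<double>(h.row), 1});
  return out;
}

std::vector<std::vector<Cluster>>
clusteriseAllModules(const std::vector<std::vector<RawHit>>& modules,
                     unsigned nThreads = std::thread::hardware_concurrency())
{
  std::vector<std::vector<Cluster>> results(modules.size());
  std::atomic<std::size_t> next{0};

  auto worker = [&]() {
    // each thread repeatedly grabs the next unprocessed module
    for (std::size_t i = next++; i < modules.size(); i = next++)
      results[i] = clusteriseModule(modules[i]);
  };

  if (nThreads == 0) nThreads = 1;
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < nThreads; ++t) pool.emplace_back(worker);
  for (auto& th : pool) th.join();
  return results;
}
```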
GPU-based Tracking
– Spectacular speed-up: factor of up to ×26
– Full data preparation takes only 12 ms for the whole Pixel and SCT
– Factor of 12 speed-up for pattern recognition
Parallelisation for HLT Upgrade: Dmitry
Dmitry has developed a solution for integrating GPU-based code into Athena reconstruction:
– a “client-server” architecture with a multi-process server allows transparent GPU “sharing” between several Athena applications running on multiple CPU cores
GPU sharing test:
– Up to 4 parallel jobs can share the GPU and be accelerated without significant GPU saturation
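A hedged sketch of the client side of such a client-server scheme: each client process ships a work request to a separate GPU server process over a local socket and waits for the result, so several CPU processes can share one GPU. The socket path, the length-prefixed protocol and offloadToGpuServer() are illustrative assumptions, not the actual Athena integration.

```cpp
// Client-side sketch of a GPU offload server connection; illustrative only.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

// Send a blob of input data (e.g. encoded space points for one RoI) and
// receive a blob of output data (e.g. found track candidates).
bool offloadToGpuServer(const std::vector<uint8_t>& request,
                        std::vector<uint8_t>& reply,
                        const std::string& socketPath = "/tmp/gpu_server.sock")
{
  int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return false;

  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  std::strncpy(addr.sun_path, socketPath.c_str(), sizeof(addr.sun_path) - 1);
  if (::connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    ::close(fd);
    return false;
  }

  // simple length-prefixed protocol: 8-byte size, then payload
  uint64_t n = request.size();
  bool ok = ::write(fd, &n, sizeof(n)) == static_cast<ssize_t>(sizeof(n)) &&
            ::write(fd, request.data(), n) == static_cast<ssize_t>(n);

  uint64_t m = 0;
  ok = ok && ::read(fd, &m, sizeof(m)) == static_cast<ssize_t>(sizeof(m));
  if (ok) {
    reply.resize(m);
    ok = ::read(fd, reply.data(), m) == static_cast<ssize_t>(m);
  }
  ::close(fd);
  return ok;
}
```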
Trigger Operations
– Trigger Monitoring Expert: John, Monika
– Inner Detector & B-jet on-call: Dmitry
– Trigger Release Coordination: Dmitry, Stephen
– ID code maintenance/bug-fixes: Dmitry
– B-trigger menu, B-trigger code development & bug fixes: Julie
– Muon & B-trigger on-call: Will
– Trigger Validation shifts: Julie, Will
– e/gamma & Tau on-call shifts: Monika
– e/gamma code development: Monika
– Control room shifts: Monika, Stephen, Will
Extras
Trigger / DAQ architecture
[Diagram: overall Trigger/DAQ data flow]
– First-level trigger: dedicated links carry the decision via the Timing Trigger Control (TTC) to the Read-Out Drivers (RODs); the RoI Builder sends Regions of Interest to the LVL2 Supervisor
– Data of events accepted by the first-level trigger flow over 1600 Read-Out Links into ~150 Read-Out Subsystem (ROS) PCs: event data at ≤ 100 kHz, 1600 fragments of ~1 kByte each
– Second-level trigger: LVL2 Supervisor, LVL2 farm (~500) and pROS (stores the LVL2 output), connected to the ROSs via Gigabit Ethernet network switches; event data requests, requested event data and delete commands are exchanged with the ROSs
– Event Builder: ~100 SubFarm Inputs (SFIs), steered by the DataFlow Manager
– Event Filter (EF): ~1600 4-core dual-socket nodes; event data pulled partial at ≤ 100 kHz, full at ~3 kHz; event size ~1.5 MB
– 6 SubFarm Outputs (SFOs) with local storage; event rate ~200 Hz to data storage at the CERN computer centre
ARCHITECTURE
[Diagram: Trigger/DAQ rates, data volumes and latencies]
– Detector front-ends (Calo, MuTrCh, other detectors): 40 MHz, ~1 PB/s; FE pipelines hold data during the 2.5 μs LVL1 latency
– LVL1 (Calorimeter Trigger, Muon Trigger, Min. Bias Triggers): latency ~2.5 μs, accept rate 75 kHz; RoIs sent via the RoI Builder (ROIB) to the LVL2 Supervisor (L2SV)
– Read-Out Drivers (RODs) feed the Read-Out Sub-systems / Read-Out Buffers (ROS/ROB) over Read-Out Links at 120 GB/s
– LVL2 (L2SV, L2P processors, L2N network): ~40 ms per event; RoI requests, RoI data = 1–2% of the event, ~2 GB/s; accept rate ~3 kHz
– Event Builder (EB): ~3 GB/s
– Event Filter (EFP processors, EFN network): ~4 s per event; output ~300 Hz, ~300 MB/s
– LVL2 and the Event Filter together form the HLT