Published by Grace Dalton. Modified over 9 years ago.
1
Towards a Phase-II ‘Strawman’
…work in progress…
R. Middleton, on behalf of B. Barnett, I. Brawn, N. Gee, W. Qian, D. Sankey
2
Overview
Requirements
– (“motherhood and apple pie”)
Options
– 4 (representative) picked…obviously many other variations…
Algorithms
– what you want to do may have big architectural implications
Open questions…
3
Requirements / Options
Luminosity target ~5×10^34 cm^-2 s^-1
Preserve sensitivity to physics signatures
– low-momentum leptons: most challenging
HLT input rate not to exceed 75 (100) kHz
– to be checked with HLT
Options considered:
– 0: re-use Phase-I L1Calo, but with digital LAr/Tile frontend
– 1: keep current latency (~“2.5” µs); rate not to exceed 75 (100) kHz
– 2: allow longer latency (>6 µs), but rate still not to exceed 75 (100) kHz
– 3: 2-stage
  o L0 keeps latency (~“2.5” µs); rate up to 500 kHz
  o L1 latency up to a few tens of µs; rate ~100 kHz (whatever is suitable for HLT)
Ack: Will Buttinger – Nov’10 Upgrade Week
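The latency and rate figures above translate directly into front-end pipeline depths and rejection factors. A minimal sketch of that arithmetic follows, assuming the 40 MHz bunch-crossing clock quoted elsewhere in these slides; the function names are illustrative, and the 20 µs value is only a stand-in for "a few tens of µs".

```python
# Sketch only: converts the latency/rate targets into pipeline depths and
# rejection factors. The 40 MHz clock is taken from the slides; the helper
# names and the 20 us example latency are assumptions for illustration.
BUNCH_CROSSING_RATE_MHZ = 40.0


def pipeline_depth(latency_us: float) -> int:
    """Bunch crossings a synchronous pipeline must hold for a given latency."""
    return round(latency_us * BUNCH_CROSSING_RATE_MHZ)


def rejection_factor(input_rate_hz: float, output_rate_hz: float) -> float:
    """Factor by which a trigger stage must reduce the event rate."""
    return input_rate_hz / output_rate_hz


for label, latency_us in [("Option 1 (~2.5 us)", 2.5),
                          ("Option 2 (6 us)", 6.0),
                          ("Option 3 L1 (assumed 20 us)", 20.0)]:
    print(f"{label}: {pipeline_depth(latency_us)} bunch crossings of buffering")

print("L0 rejection (40 MHz -> 500 kHz):", rejection_factor(40e6, 500e3))
print("L1 rejection (500 kHz -> 100 kHz):", rejection_factor(500e3, 100e3))
```

In Option 3 only the L0 latency has to be buffered at 40 MHz; the longer L1 latency is buffered at the L0A rate instead.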
4
Option 0: Architecture
Re-use Phase-I architecture
– calorimeter frontends digital (for readout only)
– input still LVDS 0.1x0.1 trigger towers (2 options for this)
[Diagram. Labels: Calorimeter (synchronous, 40 MHz); digital frontends; FADC; ROD (BCID, sums, zero suppression); fibre; analogue towers 0.1x0.1; L1Calo-PPM (FADC, BCID etc.); Trigger / L1Calo (synchronous, 40 MHz); e/γ; Jet/E_T; CMM++; Topo (local / global), LVDS; L0Muon (TGC/RPC), muon tracks; to CTP; HLT; readout; latency ~3.2 µs]
5
Option 0: Current boundary conditions
Phase-I hardware
Keep within existing latency envelope
– in terms of triggering, offers no more than Phase-I
  o same calorimeter granularities
  o same feature extraction algorithms
Hence, does nothing to tackle the rate!
The status quo is not an option!
6
Current L1 Latency
[Figure from “Level-1 Timing”, N. Gee, USG/UPO, 27 Jul 2010]
7
Option 1: Architecture
Key issues
– cannot run a “sufficient algorithm” to get the rate down and stay within the latency budget
– L1Track precluded within this latency envelope
[Diagram. Labels: Calorimeter (synchronous, 40 MHz); digital frontends; FADC; ROD (BCID, sums, zero suppression); fibre; fine-granularity MiniTowers; Trigger / L1Calo (synchronous, 40 MHz); e/γ; Jet/E_T; Local Topo; Global Topo (e.g. jet/e disambiguation); L0Muon (TGC/RPC), muon tracks; to CTP; HLT; readout; latency ~3.2 µs]
8
Option 1: Current Boundary Conditions
Keep within design latency (2.5 to 3 µs) and rate (75-100 kHz)
– clearly ATLAS would prefer this, BUT viability is questionable…
– incompatible with a seeded (or even unseeded) L1Track
  o so L1Track probably not then useful (inferior to FTK at HLT)!
– incompatible with L1MDT (which is too slow)
– rate will be highest of options 1, 2 or 3 (at low thresholds)
  o no L1Track to help
  o no advanced, time-consuming algorithms to help
– latency may be too long (with LAr digital readout scheme), and if not, then there’s little headroom for future expansion
– however, interface to muon system, topo’ processing and E_T^miss correction with muons should all be possible
Actions
– MC study of rates (first, finest resolution minitowers, then if rates viable, optimise minitower size)
– very careful latency calculations needed
9
Latency with LAr on-detector digitisation
[Figure from “Level-1 Timing”, N. Gee, USG/UPO, 27 Jul 2010]
10
Option 2: Architecture
Similar to Option 1, but now enough time allowed for L1Track & L1MDT to contribute
[Diagram. Labels: Calorimeter (synchronous, 40 MHz); digital frontends; FADC; ROD (BCID, sums, zero suppression); fibre; MiniTowers; Trigger / L1Calo (synchronous, 40 MHz); e/γ; Jet/E_T; clusters; jets; L0Topo (local); L1Topo (global, e.g. jet/e disambiguation); L1Track (seed, ID tracks); L1T; MDT track; L1M; L0Muon (TGC/RPC), muon tracks; processing at 500 kHz (40 MHz); to CTP]
11
Option 2: as 1, but longer latency
As Option 1, but latency at least 6 µs to allow for L1Track
– in fact a subset of Option 3, but with private/internal L0A for L1Track
– L1A processing runs at internal L0A rate (500 kHz), but since the overall system is fully synchronous, L1A logic clocks at 40 MHz
– hard to include algorithms in firmware with variable loop length
  o perhaps can approximate with unrolled, fixed number of iterations
– synchronous processor is FPGA
  o or can a DSP guarantee latency?
– has much of the complexity of option 3, but few of the advantages…
  o detectors buffer all data until the L1A (since L0A is hidden); thus, no possibility of a “fast” clear (L0A)
  o no opportunity for refined feature extraction
  o limited scope for complex algorithms: can define latency to get required precision, but fixed L1A latency implies hard cut-off for iterative algorithms
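The "unrolled, fixed number of iterations" idea can be sketched by comparing a run-until-converged loop with a version that always executes the same number of passes. The example below uses an E_T-weighted cone centroid purely as a stand-in for an iterative algorithm; the tower format, cone radius, iteration count and the neglect of φ wrap-around are all assumptions made for illustration, not anything specified in the slides.

```python
import math

# Sketch only: towers are (eta, phi, et) tuples, an assumed format.
# Phi wrap-around is ignored to keep the example short.


def centroid_in_cone(towers, centre, radius):
    """E_T-weighted centroid of the towers within `radius` of `centre`."""
    ceta, cphi = centre
    sum_et = sum_eta = sum_phi = 0.0
    for eta, phi, et in towers:
        if math.hypot(eta - ceta, phi - cphi) <= radius:
            sum_et += et
            sum_eta += et * eta
            sum_phi += et * phi
    if sum_et == 0.0:
        return centre
    return (sum_eta / sum_et, sum_phi / sum_et)


def cone_variable(towers, seed, radius=0.4, tol=1e-6, max_iter=100):
    """Variable loop length: iterate until the centroid stops moving."""
    centre = seed
    for n in range(1, max_iter + 1):
        new = centroid_in_cone(towers, centre, radius)
        if math.hypot(new[0] - centre[0], new[1] - centre[1]) < tol:
            return new, n
        centre = new
    return centre, max_iter


def cone_fixed(towers, seed, radius=0.4, n_iter=4):
    """Fixed latency: always run exactly n_iter passes, converged or not."""
    centre = seed
    for _ in range(n_iter):
        centre = centroid_in_cone(towers, centre, radius)
    return centre


towers = [(0.0, 0.0, 10.0), (0.1, 0.0, 10.0), (0.3, 0.0, 5.0)]
print(cone_variable(towers, (0.0, 0.0)))
print(cone_fixed(towers, (0.0, 0.0)))
```

The fixed version wastes passes on easy events and truncates hard ones, which is the "hard cut-off for iterative algorithms" trade-off noted above.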
12
Option 3: Architecture
[Diagram. Labels: Calorimeter; FADC; ROD; fibre; fine-granularity MiniTowers; 0.05x0.025 MiniTowers; L0Calo FEX (e/γ τ-Jet); L0Muon FEX (TGC/RPC), TGC/RPC hits; L0Topo (global); L0CTP; L0A (fast clear) to ID & Muon subsystems; seeds; L1Track FEX, ID hits; L1Calo FEX (e/γ τ-Jet); L1Muon FEX (MDT), MDT hits; L1Topo (global); L1CTP; L1A; HLT; readout]
[Level-0: synchronous, 40 MHz, latency ~3 µs. Level-1: asynchronous, 500 kHz, latency 10s of µs]
[Processing stages shown: ROD, L0Fex, L0Topo, L1Fex, L1Topo]
[Functions listed against the stages: zero suppression, MiniTower sums, BCID; cluster / jet finding; sum E_T, E_T^miss; iterative cluster / jet finding; invariant mass; transverse mass; calibrated E_T; jet/e disambiguation; µ-isolation (calorimetric); track finding; sphericity, aplanarity; π0 find; ID-µ track match; ID-e track match]
13
Option 3: 2-stage h/w trigger
Assumes L0A latency ~3 µs & 500 kHz, and L1A at ~100 kHz with long latency (only limited by CHF for L0-FIFO buffer)
– L1A processing runs at L0A rate (500 kHz)
– if in software, processing can’t be synchronous
  o (e.g. variable loop lengths)
– need de-randomising buffer (at L1 input) to handle closely spaced L0A, and mean processing time of ~0.5 × mean interval between L0A
– probably not possible with L1 implemented as a single CPU/DSP
  o “farming” gives out-of-sequence results
  o BUT, must ensure results are in-sequence to avoid sub-detector nightmares
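The de-randomising buffer requirement can be explored with a toy queue simulation. The sketch below assumes exponentially distributed L0A spacing with a 2 µs mean (500 kHz), a single processor and a fixed service time; these are simplifying assumptions (real L0As arrive on 40 MHz clock ticks) and the function name is illustrative.

```python
import random
from collections import deque

# Sketch only: single-server FIFO with random (exponential) arrivals and a
# fixed processing time. Returns the deepest occupancy seen, i.e. a rough
# indication of the buffer depth needed for this particular random sequence.


def max_fifo_occupancy(n_events, mean_interval_us, service_us, seed=1):
    rng = random.Random(seed)
    t = 0.0                 # arrival time of the current L0A
    last_departure = 0.0    # time the processor finishes the previous event
    in_system = deque()     # departure times of events still buffered
    max_occupancy = 0
    for _ in range(n_events):
        t += rng.expovariate(1.0 / mean_interval_us)
        # Drop events that have already been processed by time t.
        while in_system and in_system[0] <= t:
            in_system.popleft()
        start = max(t, last_departure)
        last_departure = start + service_us
        in_system.append(last_departure)
        max_occupancy = max(max_occupancy, len(in_system))
    return max_occupancy


# Mean L0A interval of 2 us; processing at 0.5x and 0.9x that interval.
print("load 0.5:", max_fifo_occupancy(100_000, 2.0, 1.0))
print("load 0.9:", max_fifo_occupancy(100_000, 2.0, 1.8))
```

Comparing the two loads shows how quickly the required depth grows as the mean processing time approaches the mean L0A interval.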
14
Trigger Processors and Readout Buffers
L1 processing must…
– maintain results order
– sustain average L0 rate without FIFO overflow
L0A
– select or clear is sent for every event
– signal timing selects correct event
L1A
– select or clear is sent for every event
– L0A event number used as cross-check
L0-FIFO
– decouples 40 MHz regime; saves detectors from large amounts of 40 MHz buffer
[Diagram: Option 3 Detector Buffering Scheme. Labels: L0 Proc; L1 Proc; Scrolling; L0-FIFO; L1-Readout; 40 MHz sync.; 100 kHz async.; time; L0A number; Synchronous - 40 MHz; Asynchronous - 500 kHz]
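One way to meet the "maintain results order" requirement when L1 decisions complete out of sequence is a small re-order buffer keyed on the L0A number. The sketch below is a hypothetical illustration (the class name, the decision strings and numbering from 1 are assumptions), not the scheme proposed in the slides.

```python
import heapq

# Sketch only: holds L1 decisions that arrive out of order and releases them
# strictly in L0A-number order, one select-or-clear per event.


class ReorderBuffer:
    def __init__(self, first_event=1):
        self._next = first_event   # next L0A number allowed to leave
        self._pending = []         # min-heap of (l0a_number, decision)

    def push(self, l0a_number, decision):
        """Store one result; return every result now releasable in order."""
        heapq.heappush(self._pending, (l0a_number, decision))
        released = []
        while self._pending and self._pending[0][0] == self._next:
            released.append(heapq.heappop(self._pending))
            self._next += 1
        return released


buffer = ReorderBuffer()
for number, decision in [(2, "clear"), (1, "select"), (4, "clear"), (3, "select")]:
    print(number, "->", buffer.push(number, decision))
```

Because a select or clear is issued for every event, a gap in the sequence means a result is still outstanding, so later events simply wait behind it.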
15
L1A Processing Options - Option 2 vs. 3
Synchronous (left) - Option 2
– processors dedicated to different parts of data (e/γ/τ)
– fixed delays to re-synchronise processor outputs
– (results) clock at (~)500 kHz, with idle cycles for L0A gaps
Asynchronous (right) - Option 3
– variable latency processors work on different aspects of one event (feature farming), or even different events (event farming), or both!
– results are put in queues and combined when all ready
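The asynchronous "combine when all ready" step can be sketched as a collector that holds partial results per event until every expected feature has arrived. The class name, feature names and result values below are hypothetical and chosen only for illustration.

```python
# Sketch only: feature processors finish at different times; an event is
# handed on for a decision once all required features have been received.


class FeatureCombiner:
    def __init__(self, required_features):
        self._required = frozenset(required_features)
        self._partial = {}   # event number -> {feature name: result}

    def add(self, event_number, feature, result):
        """Record one feature result; return the full set once complete."""
        slot = self._partial.setdefault(event_number, {})
        slot[feature] = result
        if self._required.issubset(slot):
            return self._partial.pop(event_number)
        return None


combiner = FeatureCombiner(["calo", "track", "muon"])
print(combiner.add(7, "calo", {"n_clusters": 2}))
print(combiner.add(8, "calo", {"n_clusters": 0}))
print(combiner.add(7, "muon", {"n_muons": 1}))
print(combiner.add(7, "track", {"n_tracks": 5}))   # event 7 is now complete
```

Events can complete in any order here, so a re-ordering stage would still be needed downstream if results must leave in sequence.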
16
Algorithms
Feature extraction
– e/γ/τ clusters; jet finding
  o sliding window; iterative; …
– π0 identification
Isolation
– e/γ (HadCal); µ (calorimetric)
Overlaps
– e/jet disambiguation
Feature combination
– transverse / invariant mass
Event sums
– total E_T, missing E_T (significance)
Event topology
– sphericity; aplanarity
– rapidity gap
Ref: “Potential Topological Trigger Algorithms for the Phase I Upgrade” – Koll, Kraus & Linnemann
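For the feature-combination step, the standard massless-object approximations are m² = 2·E_T1·E_T2·(cosh Δη − cos Δφ) for invariant mass and m_T² = 2·E_T·E_T^miss·(1 − cos Δφ) for transverse mass. A minimal sketch follows; the function names and argument order are illustrative, and treating trigger objects as massless is an assumption.

```python
import math

# Sketch only: pairwise mass calculations from (E_T, eta, phi) trigger
# objects, treating each object as massless.


def invariant_mass(et1, eta1, phi1, et2, eta2, phi2):
    """Invariant mass of two massless objects."""
    m_squared = 2.0 * et1 * et2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m_squared, 0.0))


def transverse_mass(et, phi, et_miss, phi_miss):
    """Transverse mass of an object combined with the missing E_T."""
    mt_squared = 2.0 * et * et_miss * (1.0 - math.cos(phi - phi_miss))
    return math.sqrt(max(mt_squared, 0.0))


# Two back-to-back 50 GeV objects at eta = 0
print(invariant_mass(50.0, 0.0, 0.0, 50.0, 0.0, math.pi))
# A 40 GeV lepton opposite 40 GeV of missing E_T
print(transverse_mass(40.0, 0.0, 40.0, math.pi))
```

Both need every pair of candidate objects to be tried, which is the "combinations" workload that grows quickly with object multiplicity.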
17
Algorithm Characteristics
By no means a definitive, complete analysis!
– sometimes > 1 technology option…and probably full of errors
– complex algorithms may be simplified without greatly impacting triggering
– but, can start to see architectural implications
GPP = General Purpose Processor; GPGPU = General Purpose Graphics Processing Unit

Calculation      | Example                 | Characteristics      | Technology         | Level       | Domain
-----------------|-------------------------|----------------------|--------------------|-------------|-------
FEX              | e/γ clusters, jets      | parallel, sequential | FPGA               | 0           | local
FEX (fine grain) | jets - e.g. (anti)-k_T  | iterative, sorting   | GPP/GPGPU          | 1           | local
overlap          |                         | combinations         | FPGA with GPP core | 0 and/or 1  | local
match            |                         | combinations         | FPGA with GPP core | 0 and/or 1  | local
mass             |                         | combinations         | FPGA with GPP core | 1           | global
sums             |                         |                      | FPGA               | 0 (or 1 ?)  | global
topology         | sphericity              | tensor, diagonalise  | GPP/GPGPU          | 1           | global
topology(2)      |                         | histogram            | FPGA with GPP core | 1           | global
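The "tensor, diagonalise" character of a sphericity calculation can be illustrated with a reduced two-dimensional case, where the eigenvalues have a closed form. The sketch below computes transverse sphericity, S_T = 2·λ2 / (λ1 + λ2), from (px, py) pairs; it is a simplified stand-in for the full three-dimensional sphericity named in the table, and the function name and input format are assumptions.

```python
import math

# Sketch only: transverse sphericity from the 2x2 momentum tensor
#   [[sum px*px, sum px*py],
#    [sum px*py, sum py*py]]
# using the closed-form eigenvalues of a symmetric 2x2 matrix.


def transverse_sphericity(momenta):
    """momenta: iterable of (px, py). Returns S_T in [0, 1]."""
    a = b = c = 0.0
    for px, py in momenta:
        a += px * px
        b += px * py
        c += py * py
    trace = a + c
    if trace == 0.0:
        return 0.0
    half_diff = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    smaller_eigenvalue = trace / 2.0 - half_diff
    return 2.0 * smaller_eigenvalue / trace


# Back-to-back (pencil-like) event, then an isotropic one
print(transverse_sphericity([(10.0, 0.0), (-10.0, 0.0)]))
print(transverse_sphericity([(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]))
```

The tensor sums are simple accumulations, while the eigenvalue step involves a square root and a division, which hints at why the table places this kind of calculation on a GPP/GPGPU.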
18
Summary
Architecture Option 0
– What: re-use Phase-I; existing latency ~3 µs
– Why: no possibility to expand algorithms; finer granularity not an option; L1Track & L1MDT precluded
Architecture Option 1
– What: Phase-I with digital frontend; improved feature extraction; fine granularity; existing latency ~3 µs
– Why: little possibility to expand algorithms; L1Track & L1MDT precluded; little opportunity to reduce rate
Architecture Option 2
– What: 2-stage: L0, L1; expanded latency; seed L1Track, L1MDT (L0A can be kept internal); fine granularity
– Why: doable: limited time to expand algorithms; L1Track & L1MDT included; BUT messy timing structure; no option to improve on L0Calo
Architecture Option 3
– What: 2-stage: L0, L1; full architecture at both stages; possible asynchronous L1
– Why: doable: more flexibility (e.g. algorithms); timing domains clearer
Both 2 & 3 are feasible, but 3 has simpler characteristics, and offers more (algorithmic) flexibility
19
Some Questions
Which algorithms are we going to pursue?
– Monte Carlo studies…
– pileup effects
Is Level-1 synchronous or asynchronous?
– determines some technology choices
– if asynchronous & GPP/GPGPU used, does Level-1 merge with HLT?
  o is it just 500 kHz that stops this?
  o is data-flow/connectivity too complex?
What is the “real-estate” budget?
– can e/γ and jet be done on the same module?
Need a much more complete understanding of latencies (at Phase II)
Need (much more) detail in…
– dataflows, data concentration and cabling
– algorithm characteristics/complexity/timings/…
20
The End
(We’ll keep digging)