1 The New LHCb Trigger and DAQ Strategy: A System Architecture based on Gigabit-Ethernet
RT2003, Montreal. Niko Neufeld, CERN-EP & Univ. de Lausanne, for the LHCb Collaboration

2 LHCb Trigger

3 Two Software Trigger Levels
Both trigger levels run on commercial PCs.
Level-1:
- uses a reduced data set: only part of the sub-detectors (mostly the vertex detector and some tracking), with limited-precision data
- has a limited latency, because the data must be buffered in the front-end electronics
- reduces the event rate from 1.1 MHz to 40 kHz by selecting events with displaced secondary vertices
High Level Trigger (HLT):
- uses all detector information
- reduces the event rate from 40 kHz to 200 Hz for permanent storage
(Together this is an overall rejection factor of roughly 5500: about 27 at Level-1 and 200 in the HLT.)

4 Features
Two data streams to handle:
- Level-1 trigger: 1.1 MHz
- High Level Trigger: 40 kHz
Fully built from commercial components: (Gigabit) Ethernet throughout.
Push-through protocol, no re-transmissions; centralized flow control (a minimal sketch of the push-through idea follows below).
Latency control for Level-1 at several stages.
Scalable by adding CPUs and/or switch ports.
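The push-through protocol means a source simply sends and never waits for acknowledgements or retransmits; lost data is lost. A minimal sketch of that fire-and-forget pattern, using a plain UDP socket as a stand-in (the real front-ends emit raw IPv4 frames directly from their electronics, not via sockets; all names here are illustrative):

```c
/* Fire-and-forget send, sketching push-through semantics:
 * no ACKs, no retransmission, no connection state. */
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int push_fragment(const void *buf, size_t len,
                  const char *dst_ip, uint16_t dst_port)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);   /* UDP: unreliable by design */
    if (s < 0) return -1;

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port   = htons(dst_port);
    inet_pton(AF_INET, dst_ip, &dst.sin_addr);

    /* Send exactly once; if the frame is dropped in the network,
     * it is gone for good (flow control is handled centrally). */
    ssize_t n = sendto(s, buf, len, 0, (struct sockaddr *)&dst, sizeof dst);
    close(s);
    return n == (ssize_t)len ? 0 : -1;
}
```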

5 Architecture
[Diagram: the Front-end Electronics feed, through a Gb Ethernet Multiplexing Layer, into the Readout Network switch. Level-1 traffic (links at 44 kHz) and HLT traffic (349 links at 40 kHz, 2.3 GB/s) are merged onto mixed-traffic links; 90-153 SFCs connect the network to a CPU farm of ~1400 CPUs. The TFC system, the L1-decision sorter, the TRM, and the Storage System complete the picture.]

6 Front-end electronics
- Separation of the Level-1 and HLT paths: two Ethernet links into the network.
- Data must be packaged into IPv4 packets.
- Must be able to pack several events into "super-events" to reduce the packet rate into the network (a hypothetical layout is sketched below).
- Must provide sufficient buffer space to allow the Level-1 trigger algorithm to decide (53 ms total).
- Must assign the destination, which is centrally distributed (with the trigger system).
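To make the "super-event" idea concrete, here is a hypothetical C layout of a packet carrying up to 25 event fragments. The field names and widths are illustrative assumptions, not the actual LHCb front-end format:

```c
/* Hypothetical "super-event" layout: several event fragments share one
 * IPv4 packet, cutting the packet rate into the network by the packing
 * factor. Purely illustrative; not the real front-end data format. */
#include <stdint.h>

#define MAX_EVENTS_PER_SUPER 25   /* packing factor from the talk */

struct fragment_hdr {
    uint32_t event_id;    /* event number this fragment belongs to   */
    uint16_t length;      /* fragment payload length in bytes        */
    uint16_t source_id;   /* front-end link that produced it         */
};

struct super_event_hdr {
    uint32_t first_event_id;  /* id of the first packed event        */
    uint16_t n_events;        /* fragments that follow (<= 25)       */
    uint16_t dest_sfc;        /* destination, assigned centrally by
                                 the TFC/trigger system              */
    /* followed by n_events x (struct fragment_hdr + payload) */
};
```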

7 Event Building Network
- Built from Gigabit Ethernet switches (1000BaseT, i.e. copper UTP).
- Try to optimise the link load (~80%, or 100 MB/s) by using (cheap) office switches to multiplex the links from the front-end (see the back-of-envelope check below).
- Need a large core switch with ~100 x 100 ports; it can be built from smaller elements.
- Need switches with a sufficient amount of buffering and good internal congestion control.
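The 80% target follows from simple arithmetic: 1 Gbit/s is 125 MB/s, so 80% is about 100 MB/s of usable capacity per link. A back-of-envelope check of how many front-end links one multiplexer uplink can merge; only the capacity and the 80% target come from the talk, the per-link traffic figures are assumptions derived from the numbers slide:

```c
/* Back-of-envelope link-load check for the multiplexing layer.
 * Assumed inputs: total Level-1 traffic spread evenly over 126 links
 * (packing into super-events changes the frame rate, not the bandwidth). */
#include <stdio.h>

int main(void)
{
    const double gbe_capacity = 125e6;   /* 1 Gbit/s = 125 MB/s        */
    const double target_load  = 0.80;    /* ~100 MB/s usable per link  */

    const double event_rate = 1.1e6;     /* Hz, Level-1 input           */
    const double event_size = 4.8e3;     /* B, average Level-1 event    */
    const int    n_fe_links = 126;       /* Level-1 links from detector */

    double per_link_bw      = event_rate * event_size / n_fe_links;
    double links_per_uplink = target_load * gbe_capacity / per_link_bw;

    printf("per FE link: %.1f MB/s -> ~%.1f FE links per GbE uplink\n",
           per_link_bw / 1e6, links_per_uplink);
    return 0;
}
```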

8 CPU farm
More than 1000 PCs, partitioned into sub-farms, each consisting of:
- a Sub-farm Controller (SFC), acting as a gateway into the readout network
- a number of worker CPUs, known only to their sub-farm
The SFC:
- builds the events from the "super-event" fragments it receives
- distributes them among its workers in a load-balancing manner (sketched below)
- receives the trigger decisions from the workers and passes them on to permanent storage (HLT events) or to the decision sorter (Level-1 events)
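A minimal sketch of the load-balancing dispatch step: pick the worker with the fewest events in flight. The data structures are assumptions for illustration, not the actual SFC code:

```c
/* Least-loaded dispatch inside an SFC: each built event goes to the
 * worker with the fewest events still awaiting a trigger decision. */
#include <stddef.h>

struct worker {
    int fd;           /* connection to the worker node               */
    int in_flight;    /* events sent but no decision received yet    */
};

/* Return the index of the least-loaded worker; O(n) scan is fine
 * for the ~10-15 workers of one sub-farm. */
size_t pick_worker(const struct worker *w, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (w[i].in_flight < w[best].in_flight)
            best = i;
    return best;
}
```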

9 Latencies
[Diagram: the architecture of slide 5, with the latency sources marked along the data path from front-end to CPU farm.]
- Queuing latencies in the network (switch buffers)
- Queuing in the SFC ("all nodes are busy with a L1 event")
- Reception of the event and invocation of the trigger algorithm

10 Latencies due to queuing in the network or the farm
Latencies in the network can only be estimated from simulation, because they arise from the competition of large packets for the same output port (the forwarding latency of a packet in a switch is negligible).
Latencies in the sub-farm are due to statistical fluctuations in the Level-1 processing time:
- Simulation using simulated raw data shows that the fraction of events which run into the Level-1 time-out because of this is very small (< 10^-4); a toy version of such a simulation is sketched below.
- It goes down as the sub-farms grow in number of workers.
- This can and will be measured.
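A toy Monte Carlo of the sub-farm effect: events arrive at a fixed rate, go to the earliest-free worker, and we count how often queuing plus processing exceeds the latency budget. All parameters are illustrative assumptions (exponential processing times, a guessed share of the 53 ms left for the farm); this is not the simulation used by LHCb:

```c
/* Toy Monte Carlo of Level-1 latency in one sub-farm. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N_WORKERS  14          /* ~1400 CPUs / ~100 sub-farms (assumed) */
#define N_EVENTS   1000000
#define ARRIVAL_US 91.0        /* 1.1 MHz spread over ~100 sub-farms    */
#define MEAN_US    700.0       /* mean Level-1 processing time          */
#define BUDGET_US  20000.0     /* assumed farm share of the 53 ms       */

/* Exponentially distributed sample with the given mean. */
static double exp_sample(double mean)
{
    return -mean * log(1.0 - drand48());
}

int main(void)
{
    double free_at[N_WORKERS] = {0};   /* time each worker becomes free */
    long timeouts = 0;

    for (long i = 0; i < N_EVENTS; i++) {
        double t = i * ARRIVAL_US;     /* arrival time of this event    */

        int w = 0;                     /* earliest-free worker          */
        for (int j = 1; j < N_WORKERS; j++)
            if (free_at[j] < free_at[w]) w = j;

        double start = free_at[w] > t ? free_at[w] : t;
        free_at[w]   = start + exp_sample(MEAN_US);

        if (free_at[w] - t > BUDGET_US) timeouts++;  /* queue + service */
    }
    printf("timeout fraction: %.2e\n", (double)timeouts / N_EVENTS);
    return 0;
}
```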

11 Context Switching Latency
What is it? On a multi-tasking OS, whenever the OS switches from one process to another, it needs a certain time to do so.
Why do we worry? Because we run the L1 and the HLT algorithms concurrently on each CPU node.
Why do we want this concurrency? We want to use every available CPU cycle.

12 Scheduling and Latency
Using Linux, we have established two facts about the scheduler:
- Soft realtime priorities work: the Level-1 task is never interrupted until it finishes (a sketch of setting such a priority follows below).
- The context-switch latency is low: 10.1 ± 0.2 µs.
The measurements were done on a high-end server (2.4 GHz Pentium 4 Xeon, 400 MHz FSB); we should have machines at least 2x faster in 2007.
Conclusion: the scheme of running both tasks concurrently is sound.
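On Linux, a soft realtime priority of this kind is obtained with the SCHED_FIFO policy: the realtime task runs until it blocks or finishes, and ordinary SCHED_OTHER processes (here, the HLT task) only get the CPU when it is idle. A minimal sketch; the priority value is an arbitrary illustrative choice, and the call requires root privileges:

```c
/* Give the calling process (the Level-1 task) a soft realtime
 * priority so the ordinary-priority HLT task cannot preempt it. */
#include <sched.h>
#include <stdio.h>

int make_level1_realtime(void)
{
    struct sched_param sp = { .sched_priority = 50 };  /* illustrative */

    /* pid 0 = calling process; SCHED_FIFO = run until block/finish. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}
```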

13 System Design
God-given parameters: trigger rates, transport overheads, raw data size distributions per front-end link.
Chosen parameters: number of CPUs (1400), average link load (80%), maximum acceptable rate at the event-building SFCs (80 kHz), packing factor of events into "super-events" for transport (25).
Munch through a huge spread-sheet, apply some reasonable rounding, take care of partitioning, and voilà! (The core of the arithmetic is sketched below.)
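The heart of that spread-sheet is straightforward rate-times-size arithmetic. A sketch deriving the headline network figures from the parameters above; the overhead factor is a guess chosen to reproduce the quoted aggregate, and the real sheet works with per-link size distributions and partitioning:

```c
/* Core spread-sheet arithmetic of the system design (illustrative). */
#include <stdio.h>

int main(void)
{
    const double l1_rate  = 1.1e6;  /* Hz, Level-1 input               */
    const double hlt_rate = 40e3;   /* Hz, HLT input                   */
    const double l1_size  = 4.8e3;  /* B, average Level-1 event        */
    const double hlt_size = 38e3;   /* B, average full event           */
    const double packing  = 25;     /* events per super-event          */
    const double overhead = 1.06;   /* transport overhead (a guess)    */

    double raw   = l1_rate * l1_size + hlt_rate * hlt_size;
    double total = raw * overhead;

    double l1_super_rate = l1_rate / packing;  /* super-events per second */

    printf("aggregate through network: %.1f GB/s (raw %.1f GB/s)\n",
           total / 1e9, raw / 1e9);
    printf("L1 super-events: %.0f kHz, HLT events: %.0f kHz\n",
           l1_super_rate / 1e3, hlt_rate / 1e3);
    return 0;
}
```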

14 Some Numbers

Number of CPUs                                         1400
Aggregated rate through network (incl. all overheads)  7.2 GB/s
Links from detector (Level-1)                          126
Links from detector (HLT)                              349
Input ports into network                               97
Output ports from network                              91
Frame rate at SFC                                      80 kHz
Maximum time for Level-1 algorithm                     53 ms
Number of events in one super-event                    10 to 25
Average event size at 1.1 MHz (Level-1)                4.8 kB
Average event size at 40 kHz (full read-out)           38 kB
Mean CPU time for Level-1 algorithm (extrapolated)     700 µs

15 Summary
- LHCb's new software trigger system operates two read-out streams, at 40 kHz and 1.1 MHz, on the same infrastructure.
- One event stream requires hard latency restrictions to be obeyed.
- The system is based on Gigabit Ethernet and uses commercial, mostly commodity, hardware throughout.
- The system can be built today and afforded in three years from now.

