1
Computer Instrumentation: Triggering and DAQ
Jos Vermeulen, UvA / NIKHEF
Topical Lectures, 29 June 2007
2
ATLAS Trigger / DAQ DataFlow Overview (block diagram; the labels are summarised below)

Event data are pushed at ≤ 100 kHz, as 1600 fragments of ~1 kByte each: from the ATLAS detector (in UX15) via the subdetector Read-Out Drivers (RODs, VME, in USA15) and 1600 Read-Out Links into the Read-Out Subsystems (ROSs). The first-level trigger and the Timing, Trigger and Control (TTC) system use dedicated links; Regions of Interest are passed via the RoI Builder to the LVL2 Supervisor.

In SDX1 event data are pulled: partial events at ≤ 100 kHz by the second-level trigger (LVL2 farm + switches) and full events at ~3 kHz by the Event Builder (SubFarm Inputs, SFIs), steered by the DataFlow Manager; the pROS stores the LVL2 output. Event data requests, delete commands and the requested event data (data of events accepted by the first-level trigger) travel between SDX1 and USA15 over the network. The Event Filter (EF), running on dual 1-, 2- or 4-core CPU nodes, reduces the event rate to ~200 Hz; the SubFarm Outputs (SFOs) provide local storage and forward the data to the CERN computer centre for data storage.

Networking: Gigabit and 10 Gigabit Ethernet network switches; the ~150 ROS PCs connect via ~320×1 Gbit/s links and ~20 switches to ~40×10 Gbit/s up-links. (Further node counts indicated in the diagram: ~30, ~100, ~870 and ~1500 for the various farm and supervision components.)
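As a rough back-of-the-envelope check of these numbers (using only the figures quoted in the diagram): 1600 fragments × ~1 kByte × 100 kHz corresponds to up to ~160 GByte/s pushed into the ROSs in total; event building at ~3 kHz with full events of ~1.6 MByte corresponds to ~5 GByte/s into the SFIs; and ~200 Hz × ~1.6 MByte is ~0.3 GByte/s written to mass storage.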
3
RODs, ROS PCs and ROBINs
(The diagram repeats the read-out part of the overview: detector, RODs, first-level trigger, TTC, RoI Builder, 1600 Read-Out Links into the ROSs.)

Read-Out Drivers (RODs): subdetector-specific; they collect and process data (no event selection) and output it via the Read-Out Links (ROLs, 200 MByte/s optical fibres) to buffers on ROBIN cards in the Read-Out Subsystem (ROS) PCs.
The same type of ROLs, ROBINs and ROS PCs is used for all sub-detectors.
ROBINs: 64-bit, 66 MHz PCI cards with 3 ROL inputs each.
ROS PCs: 4U rack-mounted PCs with 4 ROBINs, i.e. 12 ROLs per ROS PC.
In total ~150 ROS PCs, connected via ~320×1 Gbit/s links and ~20 switches to ~40×10 Gbit/s up-links into the 10 Gigabit Ethernet core.
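Per ROS PC, again using the figures above, 12 ROLs × 200 MByte/s gives a maximum link input of 2.4 GByte/s, while at 100 kHz and ~1 kByte per fragment the nominal input is ~1.2 GByte/s; only the fraction requested by LVL2 and by event building leaves the ROS PC over its Gigabit Ethernet links.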
4
ROBIN: about 700 produced. Numbered callouts on the photograph:
(1) 3 Read-Out Link channels (200 MByte/s per channel), 64 MByte buffer memory per ROL
(2) electrical Gigabit Ethernet
(3) PowerPC processor (466 MHz), 128 MByte program and data memory
(4) Xilinx XC2V2000 FPGA
(5) 66 MHz PCI-64 interface
12-layer PCB, 220 × 106 mm, surface-mounted devices on both sides; power consumption ~15 W (operational).
5
PCI bus (Peripheral Component Interconnect): a parallel multi-master bus used on PC motherboards. The ROS PCs use the 66 MHz, 64-bit variety; desktop machines in most cases used the 33 MHz, 32-bit variety, but this is being phased out and replaced by PCI Express, which makes use of fast serial links, up to 16 lanes in parallel for a single slot. More information: http://www.pcisig.com/
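For orientation (standard PCI / PCI Express figures, not specific to these slides): the 66 MHz, 64-bit PCI bus peaks at 66 MHz × 8 bytes ≈ 533 MByte/s, shared by all devices on the bus, whereas first-generation PCI Express delivers ~250 MByte/s per lane per direction, so a 16-lane slot offers several GByte/s.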
6
ROBIN designers Gerard Kieft (NIKHEF), Barry Green (RHUL) and Andreas Kugel (Mannheim) at work at NIKHEF (photograph; the ROBIN is indicated).
7
ROS system integrators in USA-15 (Gokhan Ünel, Markus Joos, Benedetto Gorini, Louis Tremblet).
8
ROS: PC properties
- 4U, 19" rack-mountable PC
- Motherboard: Supermicro X6DHE-XB
- CPU: one 3.4 GHz Xeon; hyper-threading not used (uni-processor kernel)
- RAM: 512 MB
- Network: 2 Gigabit Ethernet ports onboard (1 used for the control network); 4 Gigabit Ethernet ports on a PCI-Express card (2 used + 2 free for upgrade) for LVL2 RoI collection and for Event Building
- Redundant power supply
- Network booted (no local hard disk)
- Remote management via IPMI
(Slide from H.P. Beck, LHEP Bern, TDAQ week, CERN, 21-25 May 2007)
9
DataFlow in a ROS PC (ROS: 3.4 GHz Xeon PC, Supermicro X6DHE-XB motherboard)

(Diagram: data from the RODs enter the ROBINs, each with FPGA, PowerPC (PPC) and event store, sitting on the PCI bus and having per-ROL request queues. Data requests from LVL2 or the Event Builder arrive at the request receiver ("trigger I/F"), go into the ROS request queue and are served by request handlers, one event per thread; the legend distinguishes DAQ threads, Linux processes, the scheduler and control threads; data fragments flow back to LVL2 or the EB, control messages steer the flow.)

The ROS Application is a multi-threaded C++ program running under Linux (SLC3 / SLC4). The PowerPC processor in the ROBIN runs a C program booted from FLASH memory. The application retrieves data fragments from the ROBINs, combines them into a single fragment and sends this to the requester. Fragments have to be requested for each ROL individually.
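To make the request-queue / request-handler pattern above concrete, here is a minimal, hypothetical C++ sketch (modern C++ threads; the class and function names are illustrative and not the actual ATLAS ROS code, which predates C++11 and talks to the ROBINs over PCI):

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct DataRequest {
    unsigned eventId;            // LVL1 event identifier
    std::vector<int> rols;       // ROLs whose fragments are requested
};

class RequestQueue {             // plays the role of the "ROS request queue"
public:
    void push(DataRequest r) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(r)); }
        cv_.notify_one();
    }
    DataRequest pop() {          // blocks until a request is available
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        DataRequest r = std::move(q_.front());
        q_.pop();
        return r;
    }
private:
    std::queue<DataRequest> q_;
    std::mutex m_;
    std::condition_variable cv_;
};

void requestHandler(RequestQueue& queue) {   // one event handled per thread at a time
    for (;;) {
        DataRequest req = queue.pop();       // LVL2 or EB request from the receiver
        for (int rol : req.rols) {
            (void)rol;  // here the real code would ask the ROBIN owning this ROL
                        // (over PCI) for the fragment of event req.eventId
        }
        // ... combine the fragments into a single one and send it to the requester ...
    }
}

int main() {
    RequestQueue queue;
    std::vector<std::thread> handlers;
    for (int i = 0; i < 8; ++i)              // several request handlers in parallel
        handlers.emplace_back(requestHandler, std::ref(queue));
    // The request receiver ("trigger I/F") would push DataRequest objects here.
    for (auto& t : handlers) t.join();       // handlers run until the process is stopped
}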
10
PCI Measurements

Setup (diagram): a data source with DOLAR cards (FPGA-based test data generators emulating RODs, free running at maximum speed) feeds the ROBINs in the ROS PC over the Read-Out Links; a test PC is connected to the ROS PC via Gigabit Ethernet (NIC to NIC). The test PC generates the following messages:
- "data request", 2 types: "LVL2" (data requested from only a few ROBIN channels) and "EB" (data requested from all ROBIN channels of the ROS)
- "event delete" (or "clear"), containing a list of 100 events
XOFF on the link throttles the data rate when a ROBIN data buffer is full.

Philosophy: let the system run as fast as possible in different ATLAS-like conditions, defined by different LVL2 accept and EB fractions. Measure the sustained "event delete" rate; this is the maximum first-level accept rate compatible with the LVL2 and EB rates following from the chosen fractions.
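Spelled out: at a first-level accept rate R, a LVL2 request fraction f_LVL2 and an EB fraction f_EB, the ROS PC has to serve f_LVL2 × R LVL2 requests and f_EB × R EB requests per second, plus R/100 delete messages per second (each "clear" lists 100 events); the sustained delete rate measured this way therefore gives directly the maximum R for the chosen fractions.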
11
Running the system at NIKHEF: 21.5 kHz event rate, EB fraction 100%, no LVL2 requests, 7 ROLs active.
12
IPMI (Intelligent Platform Management Interface): remote control and monitoring of a server via Ethernet.
13
14
The Event Builder Node: SFI
32 SFI PCs installed; final system ~100 SFIs
- 1U, 19" rack-mountable PC
- Motherboard: Supermicro H8DSR-i
- CPU: AMD Opteron 252, 2.6 GHz; SMP kernel
- RAM: 2 GB
- Network: 2 Gigabit Ethernet ports onboard (1 used for the control network, 1 used for data-in); 1 Gigabit Ethernet port on a PCI-Express card, used for data-out; 1 dedicated IPMI port
- Cold-swappable power supply
- Network booted; local hard disk to store event data, only used for commissioning
- Remote management via IPMI
(Slide from H.P. Beck, LHEP Bern, TDAQ week, CERN, 21-25 May 2007)
15
Event Building scaling properties
Perfect scaling observed up to 29 SFIs at an event size of 1.5 MB: 3.3 GB/s aggregated bandwidth, 2.2 kHz event building rate. Single SFIs run close to Gigabit Ethernet speed: 114 MB/s and 78 Hz per SFI.
(Slide from H.P. Beck, LHEP Bern, TDAQ week, CERN, 21-25 May 2007)
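As a consistency check (simple arithmetic on the numbers quoted above): 2.2 kHz × 1.5 MB ≈ 3.3 GB/s in total, and 3.3 GB/s divided over 29 SFIs is ≈ 114 MB/s, i.e. ~76-78 Hz per SFI, just below the ~125 MByte/s raw wire speed of a Gigabit Ethernet link.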
16
Compute server: Dell PowerEdge 1950, 2 × 4 cores at 1.86 GHz, 8 GByte memory, dual Gigabit Ethernet (copper). 130 machines installed in SDX1 as "interchangeable processing units" or "XPUs" (can be used as SFI as well as EF processor).
17
Interchangeable processing power
Standard processor rack with up-links to both the FrontEnd and BackEnd networks (plus a switch for the control network). The migration of processing power between L2 and EF is achieved by software enabling/disabling of the appropriate up-links.
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
18
XPUs
130 XPU PCs installed; each can act as an L2P (LVL2 processor) or as an EFP (Event Filter processor)
- 1U, 19" rack-mountable PC
- Motherboard: Intel
- CPU: 2 × Intel E5320 quad-core, 1.86 GHz; SMP kernel
- RAM: 8 GB, i.e. 1 GB per core
- Network: 2 Gigabit Ethernet ports onboard (1 used for control and IPMI, 1 used for data); VLANs allow connecting to both the DataCollection and BackEnd networks
- Cold-swappable power supply
- Network booted; local hard disk for recovery from crashes and for pre-loading events
(Slide from H.P. Beck, LHEP Bern, TDAQ week, CERN, 21-25 May 2007)
19
SFOs
6 SFO PCs installed: 5 + 1 live spare
- 5U, 19" rack-mountable PC
- Motherboard: Intel
- CPU: 2 × Intel E5130 dual-core, 2.0 GHz; SMP kernel
- RAM: 4 GB
- Network: 2 Gigabit Ethernet ports onboard (1 used for control and IPMI, 1 used for data); 3 Gigabit Ethernet ports on PCIe cards, used for data
- 3 SATA RAID controllers, 24 SATA disks of 500 GB each
- Quadruple cold-swappable power supplies
- Network booted
(Slide from H.P. Beck, LHEP Bern, TDAQ week, CERN, 21-25 May 2007)
20
Network architecture (diagram).
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
21
Network architecture
Two central switches, one VLAN per switch (V1 and V2); fault tolerant (tolerates one switch failure).
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
22
Networking
Ethernet is the dominant technology for LANs and is TDAQ's choice for its networks: multi-vendor, long-term support, commodity hardware (on-board GE adapters), etc.
Gigabit and Ten-Gigabit Ethernet: GE is used for end-nodes, 10GE whenever the bandwidth requirements exceed 1 Gbit/s.
Multi-vendor Ethernet switches/routers are available on the market:
- Chassis-based devices (~320 Gbit/s switching): GE line-cards with typically ~40 ports, 10GE line-cards with typically 4 ports
- Pizza-box devices (~60 Gbit/s switching): 24/48 GE ports, optional 10GE module with 2 up-links
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
23
What happens if a switch or link fails? A phone call, but nothing critical should happen after a single failure. Networks are made resilient by introducing redundancy:
- Component-level redundancy: deployment of devices with built-in redundancy (PSU, supervision modules, switching fabric)
- Network-level redundancy: deployment of additional devices/links to provide alternate paths between communicating nodes; protocols are needed to deal correctly (and efficiently) with multiple paths in the network
NB, Webster: resilient: (a) capable of withstanding shock without permanent deformation or rupture; (b) tending to recover from or adjust easily to misfortune or change.
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
24
Multiple Spanning Tree protocol and virtual LANs (VLANs)
(Diagram: a cluster connected via concentrator switches to two core switches, carrying VLAN X and VLAN Y; the redundant up-links form 2 loops: problem!)
See the Cisco whitepaper "Understanding Multiple Spanning Tree Protocol (802.1s)", http://www.cisco.com/warp/public/473/147.pdf
(From presentation by Stefan Stancu, TDAQ workshop, Mainz, October 2005)
25
Multiple Spanning Tree protocol and virtual LANs (VLANs)
(Diagram: the same topology with the Multiple Spanning Tree protocol active for VLAN X and VLAN Y: OK.)
(From presentation by Stefan Stancu, TDAQ workshop, Mainz, October 2005)
26
Multiple Spanning Tree protocol and virtual LANs (VLANs)
(Diagram: the same topology after a link failure: still OK.)
(From presentation by Stefan Stancu, TDAQ workshop, Mainz, October 2005)
27
Traffic shaping
(Diagram: many senders connected through a switch to a single requester; a queue starts to form at the switch port towards the requester if all senders send at the same time.)
By limiting the number of outstanding requests, the build-up of queues can be avoided and the associated risk of packet loss minimized.
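As an illustration of this idea (not the actual TDAQ code), a minimal C++ sketch of a credit scheme that caps the number of outstanding requests; the class name and the limit of 8 are made up:

#include <condition_variable>
#include <mutex>

class RequestCredits {
public:
    explicit RequestCredits(int maxOutstanding) : credits_(maxOutstanding) {}
    void acquire() {                       // call before sending a request; blocks if too many are in flight
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return credits_ > 0; });
        --credits_;
    }
    void release() {                       // call when the corresponding reply has arrived
        { std::lock_guard<std::mutex> lk(m_); ++credits_; }
        cv_.notify_one();
    }
private:
    int credits_;
    std::mutex m_;
    std::condition_variable cv_;
};

// Possible use in a requester (e.g. an SFI collecting fragments from many ROSs):
//   RequestCredits credits(8);                        // at most 8 requests outstanding
//   credits.acquire(); sendRequest(ros, eventId);     // sendRequest() is hypothetical
//   ... when the reply arrives: credits.release();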
28
BackEnd network
~2000 end-nodes; one core device with built-in redundancy; rack-level concentration, with link aggregation used for redundant up-links to the core.
(From presentation by Stefan Stancu, CHEP06, Mumbai, India)
29
Control network
~3000 end-nodes. Design assumption: the instantaneous traffic does not exceed 1 Gbit/s on any segment, including the up-links. One device suffices for the core layer, but better redundancy is achieved by deploying 2 devices. A rack-level concentration switch can be deployed for all units except for critical services.
30
Characteristics of the software running in the system:
Multi-threading is necessary for optimizing system performance. In the ROSs, LVL2 processors and SFIs, requests for data are generated and transmitted; while waiting for the data, other processing can be done, e.g. inputting data requested earlier. By assigning the various tasks to different threads, and also allowing several threads per task, processor utilization can be optimized. Machines with multiple multi-core CPUs provide a natural environment for multi-threaded software. Thread-safety is of course a must!
Communication between processes on different machines is essential for the system. TCP is preferred, but scalability is an issue; UDP, in conjunction with explicitly programmed time-outs, will in any case provide the necessary communication (a sketch follows below).
Remote process invocation is also essential for bringing up the system. Basic functionality is provided by ssh, which supports remote command execution, in conjunction with the specification of all required environment variables.
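As an illustration of request/response over UDP with an explicit time-out and retry (plain POSIX sockets; the function, its parameters and the retry policy are illustrative only, not the actual ATLAS message-passing layer):

#include <arpa/inet.h>
#include <cstddef>
#include <cstdint>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Send 'req' to host:port over UDP and wait up to timeoutMs for a reply,
// retrying a few times; returns true if a reply was received.
bool udpRequest(const char* host, uint16_t port,
                const void* req, size_t reqLen,
                void* reply, size_t replyLen,
                int timeoutMs, int retries) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return false;
    sockaddr_in addr{};                      // zero-initialised
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, host, &addr.sin_addr);
    bool ok = false;
    for (int attempt = 0; attempt <= retries && !ok; ++attempt) {
        sendto(fd, req, reqLen, 0,
               reinterpret_cast<const sockaddr*>(&addr), sizeof(addr));
        pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        if (poll(&pfd, 1, timeoutMs) > 0)    // reply arrived within the time-out?
            ok = recv(fd, reply, replyLen, 0) > 0;
        // otherwise: the request or the reply was lost, try again
    }
    close(fd);
    return ok;
}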
31
32
Configuration database
33
Modelling of the dataflow system:
Paper model: started as a "back of the envelope" calculation, now a small C++ program which, starting from a trigger menu and rejection factors, calculates average rates in the system.
Computer model: a C++ program based on discrete event simulation, allowing the behaviour of the system to be simulated with respect to dataflow aspects, again starting from a trigger menu and rejection factors.
Continuous effort is needed to keep up with new hardware and software introduced in the system, and to help in understanding the results of test measurements.
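A toy version of the "paper model" idea, only to illustrate the kind of calculation meant (a few nominal numbers from these slides are hard-coded; the real program starts from a full trigger menu with per-item rejection factors, and the computer model adds discrete event simulation on top):

#include <cstdio>

int main() {
    // Nominal figures quoted in these slides (not measured values)
    const double lvl1Rate      = 100e3;          // Hz, first-level accept rate
    const double lvl2Accept    = 3.0e3 / 100e3;  // fraction passed to event building (~3 kHz)
    const double efAccept      = 200.0 / 3.0e3;  // fraction passed to mass storage (~200 Hz)
    const double fragmentBytes = 1.0e3;          // ~1 kByte per ROL fragment
    const int    nRols         = 1600;

    const double ebRate      = lvl1Rate * lvl2Accept;                   // Hz
    const double storageRate = ebRate * efAccept;                       // Hz
    const double ebBandwidth = ebRate * nRols * fragmentBytes / 1.0e9;  // GByte/s

    std::printf("event building : %.1f kHz, ~%.1f GByte/s\n", ebRate / 1e3, ebBandwidth);
    std::printf("to mass storage: %.0f Hz\n", storageRate);
    return 0;
}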
34
Example output of the "computer model", design luminosity, 100 kHz LVL1 rate (~1 hr running on a 1.8 GHz G5 Mac).
(Screenshot: monitoring of the simulation, with scrolling lists of statistics and histogram titles; histograms shown: LVL2 decision time (µs), menu index vs. LVL2 decision time (µs), number of LVL1 triggers.)
35
Example output of the "computer model", design luminosity, 100 kHz LVL1 rate.
(Screenshot: monitoring of the simulation, with scrolling lists of statistics and histogram titles; histograms shown: time interval between LVL1 accept and completion of event building (µs), number of events buffered in the e.m. calorimeter ROBIN, number of LVL1 triggers.)