Department of Particle Physics & Astrophysics ATLAS TDAQ upgrade proposal TDAQ week at CERN Michael Huffer, November 19, 2008
Outline
DAQ support for next generation HEP experiments…
–“survey the requirements and capture their commonality”
One size does not fit all…
–generic building blocks
    the (Reconfigurable) Cluster Element (RCE)
    the Cluster Interconnect (CI)
–industry standard packaging
    ATCA
Packaged solutions
–the RCE board
–the CI board
Applicability to ATLAS (the proposal):
–motivation
–scope
–details
    ROM (Read-Out-Module)
    CIM (Cluster-Interconnect-Module)
    ROC (Read-Out-Crate)
–physical footprint, scaling & performance
Summary
Three building block concepts
Computational elements
–must be low-cost
    $$$
    footprint
    power
–must support a variety of computational models
–must have both flexible and performant I/O
Mechanism to connect together these elements
–must be low-cost
–must provide low-latency/high-bandwidth I/O
–must be based on a commodity (industry) protocol
–must support a variety of interconnect topologies
    hierarchical
    peer-to-peer
    fan-in & fan-out
Packaging solution for both element & interconnect
–must provide High Availability
–must allow scaling
–must support different physical I/O interfaces
–preferably based on a commercial standard
The Reconfigurable Cluster Element (RCE)
–employs System-On-Chip technology (SOC)
The Cluster Interconnect (CI)
–based on 10-Gigabit Ethernet (10-GE) switching
ATCA
–Advanced Telecommunication Computing Architecture
–crate based, serial backplane
(Reconfigurable) Cluster Element (RCE)
Bundled software:
–bootstrap loader
–Open Source kernel (RTEMS)
    POSIX compliant interfaces
    standard IP network stack
–exception handling support
–GNU cross-development environment (C & C++)
–remote (network) GDB debugger
–network console
Class libraries (C++) provide:
–DEI support
–configuration interface
[Block diagram labels: MGTs, Core, DSP tiles, Combinatoric logic, Resources, Processor (450 MHz PPC), MByte RLD-II, Boot Options, Memory Subsystem, Configuration, data, 128 MByte Flash, Data Exchange Interface (DEI), instruction, reset & bootstrap options]
Resources
Multi-Gigabit Transceivers (MGTs)
–up to 24 channels of:
    SER/DES
    input/output buffering
    clock recovery
    8b/10b encoder/decoder
    64b/66b encoder/decoder
–each channel can operate at up to 6.5 Gb/s
–channels may be bound together for greater aggregate speed
Combinatoric logic
–gates
–flip-flops (block RAM)
–I/O pins
DSP support
–contains up to 192 Multiply-Accumulate (MAC) units
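As a rough illustration of the MGT figures, the sketch below estimates the payload bandwidth left once line-encoding overhead is removed. The function name is hypothetical and protocol framing is ignored; only the 24-channel count, the 6.5 Gb/s line rate and the two encodings come from the slide.

```python
from fractions import Fraction

# Payload bits carried per transmitted bit for each line encoding.
ENCODING_EFFICIENCY = {
    "8b/10b": Fraction(8, 10),
    "64b/66b": Fraction(64, 66),
}

def payload_gbps(line_rate_gbps, channels, encoding):
    """Aggregate payload rate in Gb/s for `channels` channels bound together."""
    efficiency = ENCODING_EFFICIENCY[encoding]
    # Exact rational arithmetic avoids float rounding until the final conversion.
    return float(Fraction(str(line_rate_gbps)) * channels * efficiency)

# One channel, and all 24 channels, at the maximum line rate (illustrative only).
single_channel = payload_gbps(6.5, 1, "8b/10b")
all_channels = payload_gbps(6.5, 24, "8b/10b")
print(single_channel, all_channels)
```

With 8b/10b encoding, a fifth of the line rate is spent on encoding, so the usable rate of a channel is always below its quoted line rate; 64b/66b reduces that overhead to about 3%.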
Derived configuration - Cluster Element (CE)
[Block diagram labels: Combinatoric logic, MGTs, Core, 1.0/2/5/10.0 Gb/s, PGP, Ethernet MAC, MGTs, Combinatoric logic, E Gb/s, E1]
The Cluster Interconnect (CI)
Based on two Fulcrum FM224s
–24 port 10-GE switch
–is an ASIC (packaged in a 1433-ball BGA)
–XAUI interface (supports multiple speeds including 100-BaseT, 1-GE & 2.5 Gb/s)
–less than 24 watts at full capacity
–cut-through architecture (packet ingress/egress < 200 ns)
–full Layer-2 functionality (VLAN, multiple spanning tree, etc.)
–configuration can be managed or unmanaged
[Block diagram labels: Management bus, RCE, 10-GE L2 switch, Q0, Q1, Q2, Q3]
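To put the cut-through figure in context, the sketch below computes how long a frame takes just to be clocked onto a 10-GE link. That is the minimum delay a store-and-forward switch would add before it could begin forwarding. The 1500-byte frame size and the function name are assumptions for illustration; only the 10 Gb/s rate and the 200 ns bound come from the slide.

```python
def serialization_delay_ns(frame_bytes, link_gbps):
    """Time in nanoseconds to clock `frame_bytes` onto a `link_gbps` Gb/s link."""
    # 1 Gb/s is 1 bit per nanosecond, so bits / (Gb/s) is already in ns.
    return frame_bytes * 8 / link_gbps

CUT_THROUGH_BOUND_NS = 200  # ingress/egress bound quoted on the slide

# Assumed full-size 1500-byte frame on a 10-GE port.
store_and_forward_ns = serialization_delay_ns(1500, 10)
print(store_and_forward_ns, store_and_forward_ns / CUT_THROUGH_BOUND_NS)
```

Under these assumptions a store-and-forward hop would cost at least 1200 ns per full-size frame, about six times the quoted cut-through bound.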
A cluster of 12 elements
[Diagram labels: From Front-End systems, Elements, Cluster Interconnect, switching fabric, To back-end systems]
Why ATCA as a packaging standard?
An emerging telecom standard…
Its attractive features:
–backplane & packaging available as a commercial solution
–generous form factor
    8U x 1.2” pitch
–hot swap capability
–well-defined environmental monitoring & control
–emphasis on High Availability
–external power input is low voltage DC
    allows for rack aggregation of power
Its very attractive features:
–the concept of a Rear Transition Module (RTM)
    allows all cabling to be on rear (module removal without interruption of cable plant)
    allows separation of data interface from the mechanism used to process that data
–high speed serial backplane
    protocol agnostic
    provision for different interconnect topologies
RCE board + RTM (Block diagram)
[Block diagram labels: MFD, RCE, flash memory, RCE, slice 0 to slice 7, MFD, Fiber-optic transceivers, P2, Payload, RTM, P3, E1, E0, base, fabric]
RCE board + RTM
[Photo labels: Media Carrier with flash, Media Slice controller, RCE, Zone 1 (power), Zone 2, Zone 3, transceivers, RTM]
Cluster Interconnect board + RTM (Block diagram)
[Block diagram labels: MFD, CI, Q0, Q1, Q2, Q3, P2, P3, Payload, RTM, XFP, 1-GE (base), 10-GE (fabric), base, fabric]
Cluster Interconnect board + RTM
[Photo labels: CI, Zone 3, Zone 1, 10 GE switch, XFP, 1G Ethernet, RCE, XFP, RTM]
Typical (5 slot) ATCA crate
[Photo labels: fans, CI RTM, RCE RTM, CI board, Power supplies, RCE board, Shelf manager, Front, Back]
Motivation
Start with the premise that ROD replacement is inevitable…
–detector volume will scale upwards with luminosity
–modularity of Front-End-Electronics will change
Replacement is an opportunity to address additional concerns…
–longevity of existing RODs
    long-term maintenance & support
–many different ROD flavors
    difficult to capture commonality & reduce duplicated effort
–ROS Ingress/Egress imbalance
    capable of almost 2 Gbytes/sec input
    capable of less than 400/800 Mbytes/sec output
–scalability
    each added ROD requires (on average) adding one ROS/PC
        –one ROD (on average) drives two ROLs
        –one ROS/PC can process (roughly) 2 ROLs worth of data
    physical separation adds mechanical & operational constraints
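The imbalance and scalability arguments are simple arithmetic, sketched below. The function and constant names are hypothetical. The slide's approximate figures ("almost", "less than", "roughly") are treated as nominal values, and 2 Gbytes is taken as 2000 Mbytes.

```python
import math

ROLS_PER_ROD = 2       # "one ROD (on average) drives two ROLs"
ROLS_PER_ROS_PC = 2    # "one ROS/PC can process (roughly) 2 ROLs worth of data"

def ros_pcs_needed(num_rods):
    """ROS/PCs required to absorb the ROLs driven by `num_rods` RODs."""
    return math.ceil(num_rods * ROLS_PER_ROD / ROLS_PER_ROS_PC)

def ingress_egress_ratio(ingress_mbytes_s, egress_mbytes_s):
    """How many times larger the input capacity is than the output capacity."""
    return ingress_mbytes_s / egress_mbytes_s

ROS_INGRESS_MBYTES_S = 2000.0         # "almost 2 Gbytes/sec", taken as nominal
ROS_EGRESS_MBYTES_S = (400.0, 800.0)  # the two output figures quoted on the slide

ratios = [ingress_egress_ratio(ROS_INGRESS_MBYTES_S, e) for e in ROS_EGRESS_MBYTES_S]
print(ros_pcs_needed(10), ratios)
```

With these nominal values a ROS can accept between 2.5 and 5 times more data than it can ship out, and the ROS/PC count grows one-for-one with the ROD count.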
Scope & the Integrated Read-Out System (IROS)
ROD crates
–RODs
–crate controller
–L1 distribution & control
–“back-plane” boards
ROS/PC racks
–ROS/PCs
–ROBins
ROLs (between ROS & ROD)
“wires” connecting these components
Proposal calls for the replacement of the IROS…
–intrinsic modularity of the scheme allows replacing a subset
[Diagram labels: Level-2 (L2) trigger, Event Builder (EB), Integrated Read Out System (IROS), Detector Front-End Electronics]
Upstream & downstream systems would remain the same…
Proposal is constructed out of three elements:
–ROM (Read-Out-Module)
    combines functionality of ROD + ROS/PC
–CIM (Cluster-Interconnect-Module)
–ROC (Read-Out-Crate)
Read-Out-Module (ROM)
[Block diagram labels: From detector FEE, ROC backplane, Cluster Elements, P3, ROM, Rear Transition Module, P2, 10-GE switch, L1 fanout, switch management, (X4) 3.2 Gb/s, 10 Gb/s, (X2) 10 Gb/s, (X2) 2.5 Gb/s, CIM]
Read-Out-Crate (ROC)
[Block diagram labels: from L1, CIM, Rear Transition Module, 10-GE switch, P3, Backplane, Rear Transition Module, switch management, L1 fanout, 10-GE switch, Shelf Management, 10-GE switch, P3, ROMs, CIM, To monitoring & control, from L1, To L2 & Event Building, switch management, L1 fanout, (X12) 10 Gb/s, (X2) 10 Gb/s, (X2) 2.5 Gb/s]
IROS
[Diagram labels: USA15, UX15, SDX1, IROS, Switching fabric, L1 trigger, Event Builder farm, L2 farm, (X384) 3.2 Gb/s, (X24) 10 Gb/s; detector systems: TileCal, LAr, Pixel, SCT, TRT, MDT, CSC, Inner Tracker, Muon, Calorimeter, Cal, RPC, TGC, TPI, CTP]
IROS plant footprint
[Table columns: Detector System; Detector Subsystem; current: Total RODs, Total ROLs, ROD crates, the ROS (ROBIns, PCs); proposed: Total ROMs, Total crates, Total CIMs]
[Table rows: Calorimeter (LAr, TileCal); Inner Detector (Pixel, SCT, TRT); Muon (MDT, CSC); L1 (Calorimeter, Muon RPC, Muon TGC, MUCTPI, CTP); Totals]
Scaling & performance
[Table columns: Phase; Input event rates (kHz): IROS, L2, EB; Data size (Mbytes): Event, ROI; Output rates (Gbytes/s): L2, EB]
Performance requirements as a function of luminosity upgrade phase
–numbers derived from Andy’s upgrade talk (TDAQ week, May 2008)
–both ROI size & number change as a function of luminosity
Proposal scales linearly with number of ROMs…
–2 Gbytes/sec/ROM for L2 network
–0.5 Gbytes/sec/ROM for Event Building network
For plug replacement example this implies an output capacity of…
–270 Gbytes/sec for L2
–118 Gbytes/sec for Event Building
As a comparison current system has a total output capacity of…
–116 Gbytes/sec (8 NIC channels)
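The linear-scaling claim can be sketched as a one-line model. The function name is hypothetical, and the ROM count in the example is an arbitrary illustrative value, not the plug-replacement count from the footprint table; only the two per-ROM rates come from the slide.

```python
L2_GBYTES_PER_ROM = 2.0   # per-ROM output to the L2 network (from the slide)
EB_GBYTES_PER_ROM = 0.5   # per-ROM output to the Event Building network (from the slide)

def output_capacity(num_roms):
    """Return (L2, Event Building) output capacity in Gbytes/sec for `num_roms` ROMs."""
    return num_roms * L2_GBYTES_PER_ROM, num_roms * EB_GBYTES_PER_ROM

# 100 ROMs is a hypothetical count chosen only to show the scaling.
l2_capacity, eb_capacity = output_capacity(100)
print(l2_capacity, eb_capacity)
```

Doubling the number of ROMs doubles both capacities, which is all the slide's linear-scaling statement asserts.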
Summary
SLAC is positioning itself for a new generation of DAQ…
–strategy is based on the idea of modular building blocks
    inexpensive computational elements (the RCE)
    interconnect mechanism (the CI)
    industry standard packaging (ATCA)
–architecture is now relatively mature
    both demo boards (& corresponding RTMs) are functional
    RTEMS ported & operating
    network stack fully tested and functional
–performance and scaling meet expectations
–costs have been established (engineering scales):
    ~$1K/RCE (goal is less than $750)
    ~$1K/CI (goal is less than $750)
This is an outside view looking in (presumptuous + sometimes useful)
–Initiate discussion on the bigger picture
–Separate proposal abstraction from its implementation
    common substrate ROD
    integration of ROD + ROL + ROBin functionality
Inherent modularity of this scheme allows piece-meal (adiabatic) replacement
–can co-exist with current system
Leverage recent industry innovation
–System-On-Chip (SOC)
–High speed serial transmission
–Low cost, small footprint, high-speed switching (10-GE)
–Packaging standardization (serial backplanes and RTM)