Future experiment specific needs for LHCb. OpenFabrics/InfiniBand Workshop at CERN, Monday June 26. Sai Suman Cherukuwada and Niko Neufeld.

Presentation transcript:

Future experiment specific needs for LHCb. OpenFabrics/InfiniBand Workshop at CERN, Monday June 26. Sai Suman Cherukuwada and Niko Neufeld, CERN/PH.

Slide 2: LHCb Trigger-DAQ system: Today
LHC crossing rate: 40 MHz; visible events: 10 MHz
Two-stage trigger system:
–Level-0: synchronous, in hardware; 40 MHz → 1 MHz
–High Level Trigger (HLT): software on a CPU farm; 1 MHz → 2 kHz
Front-end Electronics (FE): interface to the Readout Network
Readout network:
–Gigabit Ethernet LAN
–full readout at 1 MHz
Event filter farm:
–~1800 to 2200 1U servers
[Diagram: FE boards, steered by the L0 trigger and the Timing and Fast Control, feed the Readout network, which feeds the CPU farm; accepted events go to permanent storage.]

Slide 3: LHCb DAQ system: features
On average, every 1 µs new data become available at each of ~300 sources (custom electronics boards, "TELL1")
Data from several 1 µs cycles ("triggers") are concatenated into one IP packet → reduces the message/packet rate
IP packets are pushed over 1000BaseT links → the short distances allow using 1000BaseT throughout
The destination IP address is assigned synchronously to all TELL1s via a custom optical network (TTC)
For each trigger, a PC server must receive IP packets from all TELL1 boards ("event building")
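A minimal sketch of this push-style event building, assuming a hypothetical trigger (event) number on each fragment and a plain Python dictionary in place of the real receiver software: every source pushes its fragment, and the destination PC declares the event complete once fragments from all sources have arrived.

    # Sketch of push-style event building (illustrative, not LHCb production code).
    # Each of N_SOURCES readout boards pushes one fragment per trigger; the
    # destination server assembles an event once all fragments have arrived.
    from collections import defaultdict

    N_SOURCES = 300          # ~300 TELL1 boards (from the slide)

    class EventBuilder:
        def __init__(self, n_sources):
            self.n_sources = n_sources
            self.pending = defaultdict(dict)   # trigger number -> {source_id: fragment}

        def receive(self, trigger, source_id, fragment):
            frags = self.pending[trigger]
            frags[source_id] = fragment
            if len(frags) == self.n_sources:   # event complete
                return self.pending.pop(trigger)
            return None                        # still waiting for more sources

    eb = EventBuilder(N_SOURCES)
    for src in range(N_SOURCES):
        event = eb.receive(trigger=42, source_id=src, fragment=b"...")
    assert event is not None and len(event) == N_SOURCES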

Slide 4: Terminology
channel: elementary sensitive element = 1 ADC = 8 to 10 bits. The entire detector comprises millions of channels
event: all data fragments (comprising several channels) created at the same discrete time together form an event. It is an electronic snapshot of the detector response to the original physics reaction
zero-suppression: send only the channel numbers of channels with a non-zero value (after applying a suitable threshold)
packing factor: number of event fragments ("triggers") packed into a single packet/message
–reduces the message rate
–optimises bandwidth usage
–is limited by the number of CPU cores in the receiving CPU (to guarantee prompt processing and thus limit latency)
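A toy illustration of the two terms most used in the rest of the talk; the ADC values and the threshold are made-up numbers:

    # Sketch of zero-suppression and the packing factor (illustrative values only).
    def zero_suppress(adc_values, threshold=2):
        """Keep only (channel number, value) pairs above threshold."""
        return [(ch, v) for ch, v in enumerate(adc_values) if v > threshold]

    raw = [0, 0, 3, 0, 9, 1, 0, 0]        # 8 channels, mostly below threshold
    print(zero_suppress(raw))              # -> [(2, 3), (4, 9)]

    # Packing factor: putting k triggers into one message divides the message
    # rate by k, at the cost of latency and memory in the receiving node.
    def message_rate(trigger_rate_hz, packing_factor):
        return trigger_rate_hz / packing_factor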

Slide 5: Following the data-flow
[Animated diagram: the Front-end Electronics of the subdetectors (VELO, TT, IT, OT, CALO, RICH, MUON) feed ~400 links (35 GByte/s) into the TELL1/UKL1 readout boards; on an "L0 Yes" the TFC system broadcasts the MEP destination (e.g. PC #876); each board packs its fragments (VELO #1/#2, RICH #1/#2, ...) into a Multi-Event Packet (MEP) addressed to that PC; the MEPs cross the Readout Network switches into the Event Filter Farm (50 subfarms), where the HLT process on PC #876 assembles them; a "MEP Request" is shown going back from the PC, and accepted events (the example shown is B → ΦΚs) end up in the Storage System.]

Slide 6: Data pre-processing: the LHCb common readout board TELL1
[Block diagram: A-RxCard/O-RxCard receiver cards feed four PP-FPGAs with L1 buffers (L1B), a SyncLink-FPGA and an RO-Tx stage; TTCrx (TTC) and ECS interfaces, a throttle output, and 4 x 1000BaseT links towards the DAQ.]
Receiver cards get data from the detector via optical fibres
FPGAs do the pre-processing, zero-suppression and data formatting (into IP packets)
The FPGA is attached to an Ethernet quad MAC over an SPI-3 bus (a simple FIFO protocol)
IP packets are pushed out to the Data Acquisition private LAN over 4 x 1000BaseT links

Slide 7: Improving the LHCb trigger
Triggering is filtering. The quality of the trigger is determined (using simulated data) by measuring how many of the possible good events are selected: efficiency ε = N_good-selected / N_good-all
Each stage has its own efficiency. LHCb loses mostly in the "L0" step: 40 MHz → 1 MHz
Reason: only coarse ("high pT") information is used
Solution: reconstruct secondary vertices at the collision rate of 40 MHz!

Slide 8: Upgrade
We want a DAQ and event filter which:
allows vertex triggering at the collision rate (40 MHz)
fits within the existing infrastructure:
–1 MW of power and cooling
–50 racks with a total space of 2200 Us
preserves the main good features of the current LHCb DAQ:
–simple, scalable, industry-standard technologies, as much as possible commodity items
costs < 10^7 of a reasonable currency

Slide 9: Two Options
Two-stage readout:
–Read out a subset (~10 kB per event) at 40 MHz. Data are buffered in the FL1 for a suitable amount of time: 40 ms (?)
–An algorithm on the event-filter farm selects 1 MHz of "good" events and informs (how?) the FL1 boards of its decision (yes/continue or no/discard)
–In case of "yes" the entire detector is read out: 35 GB/s at 1 MHz
Always read out the entire detector at 40 MHz ("brute force")
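A toy model of the first option's control flow, only to make the buffer-then-decide sequence explicit; the event sizes are the figures quoted later in the talk, while the random accept decision and the buffering in a Python dictionary are stand-ins, not real FL1 firmware:

    # Toy model of the two-stage readout (option 1); sizes and the decision
    # logic are placeholders, not a description of real hardware behaviour.
    import random

    VERTEX_DATA_KB = 10      # subset shipped to the farm at 40 MHz
    FULL_EVENT_KB = 35       # full event, read out only on a "yes"

    fl1_buffer = {}          # trigger number -> full event data kept on the board

    def first_stage_readout(trigger):
        fl1_buffer[trigger] = b"x" * (FULL_EVENT_KB * 1024)   # buffer the full event
        return b"v" * (VERTEX_DATA_KB * 1024)                 # ship the vertex subset

    def farm_decision(vertex_data):
        return random.random() < 1.0 / 40.0    # ~1 MHz accepted out of 40 MHz

    def handle_decision(trigger, accept):
        data = fl1_buffer.pop(trigger)          # free the buffer either way
        return data if accept else None         # second-stage readout only on "yes"

    accepted = sum(
        handle_decision(t, farm_decision(first_stage_readout(t))) is not None
        for t in range(10000)
    )
    print(f"accepted {accepted} of 10000 simulated triggers")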

Slide 10: Full read-out at 40 MHz
At a collision rate of 40 MHz the data rate for a full readout is ~1400 GB/s, or ~12 Tb/s
–→ a network with ~2 x 1200 x 10 Gigabit ports
Several switches are needed as building blocks → an optimised (non-Banyan) topology is highly desirable
Advantages:
–no latency constraints
–lower memory requirements on the FL1
Disadvantages:
–huge and expensive
–almost all of the data shipped will never be looked at (the physics algorithms do not change much)
–requires zero-suppression and FPGA pre-processing for all detector data at 40 MHz (not obvious)
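A back-of-envelope check of the figures quoted above, assuming a full event size of ~35 kB (inferred from the 35 GB/s at 1 MHz readout elsewhere in the talk) and ignoring protocol overhead:

    # Back-of-envelope check of the full 40 MHz readout numbers quoted above.
    event_size_kb = 35                       # assumed full event size
    collision_rate_hz = 40e6

    total_rate_gbytes = event_size_kb * 1e3 * collision_rate_hz / 1e9
    total_rate_tbits = total_rate_gbytes * 8 / 1e3
    print(f"full readout: {total_rate_gbytes:.0f} GB/s = {total_rate_tbits:.1f} Tb/s")

    # 10 Gigabit links at full nominal speed; in practice headroom is needed,
    # hence the ~1200 ports per side quoted on the slide.
    links_per_side = total_rate_tbits * 1e3 / 10
    print(f"~{links_per_side:.0f} x 10 Gigabit links into and out of the network")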

Slide 11: Parameters / Assumptions
Vertex reconstruction requires only a subset of the total event, roughly 10 kB at 40 MHz (essentially the Vertex Locator of the future + some successor of the TT)
FE with full 40 MHz readout capability
We dispose of the successor of the TELL1, the FL1*, which has several 10 Gigabit output links and can do pre-processing / zero-suppression at the required rate
Several triggers are packed into one MTP. This reduces the message rate from each board. In this presentation we assume 8 triggers per message, i.e. R_TX-message = 5 MHz (per FL1)
(*) FL1 for Future L1 or Fast L1 or FormuLa 1
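The message rate implied by these assumptions, as a short arithmetic sketch; the per-trigger fragment size used for the payload line is a hypothetical illustration, not a number from the slides:

    # Message-rate arithmetic implied by the assumptions above.
    trigger_rate_hz = 40e6
    triggers_per_message = 8                    # packing factor assumed on the slide

    msg_rate_per_fl1_hz = trigger_rate_hz / triggers_per_message
    print(f"R_TX-message per FL1: {msg_rate_per_fl1_hz/1e6:.0f} MHz")   # -> 5 MHz

    # For a given fragment size per trigger, the payload of one message is
    # simply packing factor x fragment size (headers ignored here).
    fragment_bytes = 50                         # hypothetical per-board fragment size
    print(f"message payload: {triggers_per_message * fragment_bytes} bytes")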

Slide 12: Data pre-processing: a new readout board, the FL1
[Block diagram: O-RxCard receiver cards feed four PP-FPGAs with L1 buffers (L1B), a SyncLink-FPGA and an RO-Tx stage with sync info and a host processor; TTC and ECS interfaces, a throttle output, and 4 x CX4 links towards the DAQ.]
Receiver cards get data from the detector via optical fibres
FPGAs do the pre-processing, zero-suppression and data formatting
The FPGA is attached to an HCA over a ??? bus (are there alternatives to PCIe?)
Output to the Data Acquisition private LAN over (up to) 4 x CX4 cables
A host processor is needed (??) to handle the complex protocol stack

Slide 13: Event filter farm for the upgraded LHCb
We need an event filter which can absorb 4 x 10^7 x 10 kB/s + 10^6 x 35 kB/s ≈ 435 GB/s!
Assume 2000 servers:
–a server is something which takes one U of space and has two processor sockets
–each socket holds a chip, which comprises several CPU cores
Each server must accept ~210 MB/s as 500 kHz of messages of ~400 bytes
Options for attaching the servers to the network:
–three Gigabit links as a trunk: not very practical, because > 130 links would have to be brought into one rack!
–use an (under-used) 10 Gigabit link
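A quick arithmetic check of the throughput figures on this slide (decimal units, no protocol overhead; it reproduces the rounded numbers quoted above):

    # Quick check of the farm input figures quoted on this slide.
    vertex_stream = 40e6 * 10e3       # 40 MHz x 10 kB of vertex data
    full_stream = 1e6 * 35e3          # 1 MHz x 35 kB of accepted full events
    total = vertex_stream + full_stream
    print(f"total farm input: {total/1e9:.0f} GB/s")           # ~435 GB/s

    n_servers = 2000
    per_server = total / n_servers
    msg_rate_per_server = 500e3                                 # from the slide
    print(f"per server: {per_server/1e6:.0f} MB/s, "
          f"mean message size {per_server/msg_rate_per_server:.0f} bytes")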

Slide 14: Server horoscopes
Quad-core processors from Intel and AMD will most likely be available in 2007
Could we have "octo-cores" by the end of 2008?
We can thus assume 8 cores running at 2 to 2.4 GHz (probably not more!) in one U
Commitment by Intel and AMD: power consumption per processor < 100 W
Reasonable rumours:
–2007 will see the first mainboards with a 10 Gigabit interface on board: most likely CX4, for either 10 Gigabit Ethernet or InfiniBand (?)

Slide 15: CPU power for triggering / latency / buffering
Assuming 2000 (multi-core) servers and 40 MHz of events, each core has on average 2.5 ms to reach a decision when processing the ~10 kB of vertex-detector data
–→ there should be at least 40 ms of buffering in the FL1s to cope with fluctuations in the processing time (the processing-time distribution is known to have long tails)
Assuming 400 FL1s, this means that they have to have 12.5 GB of buffer memory

Slide 16: [System diagram: the LHCb detector, at a cable distance of … m+, feeds Front-End L1 boards numbered 1…400, each with 10 Gbps links; these run ~60 m to high-density switches (400 ports "in" per switch) and a further 20 m+? to 50 farm racks (~65 Gbps per rack), each equipped with one 32-port or two 16-port switches. Numbered data flow: 1. read out 10 kB at 40 MHz and buffer on the FL1; 2. send to the farm for the trigger decision; 3. send the trigger decision to the FL1; 4. receive the trigger decision; 5. if the decision is positive, read out at 1 MHz.]

Slide 17: Power consumption
Probably need 512 MB per core (trigger process) x 8 → 4 GB per server
4 GB of high-speed memory + an on-board 10 Gigabit interface also need power (assume conservatively 50 W)
The 1U box should stay below 300 W
Total power for the CPUs < 600 kW
The 10 Gigabit distribution switches also need power (count on at least 250 W each)
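A rough power budget along these lines; the per-CPU figure is the < 100 W commitment from slide 14, and "one distribution switch per rack" is an assumption made only for this sketch:

    # Rough power budget following the assumptions on this slide.
    n_servers = 2000
    watts_per_cpu = 100          # vendor commitment quoted on slide 14
    cpus_per_server = 2
    other_watts = 50             # memory + on-board 10 Gigabit interface (assumed)

    server_watts = cpus_per_server * watts_per_cpu + other_watts
    assert server_watts <= 300   # the 1U box should stay below 300 W

    farm_kw = n_servers * 300 / 1e3          # use the 300 W envelope per box
    switch_kw = 50 * 0.250                   # assume one 250 W switch per rack
    print(f"farm: {farm_kw:.0f} kW, distribution switches: ~{switch_kw:.1f} kW")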

Slide 18: Open questions
Can an FPGA drive the HCA, or do we need an embedded host processor with an OS?
It would be nice to centrally assign the next destination (server) to all FL1 boards. This means determining the Queue Pair number and the DLID/DGID to send a message to. Can we use the InfiniBand network for this as well?
Almost the entire traffic is unidirectional (from the FL1s to the servers). Can we take advantage of this fact?
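To make the second question concrete, a toy sketch of central destination assignment: a controller walks round-robin through a table of (DLID, queue-pair number) pairs and publishes the next destination to every FL1, so that all boards address the fragments of a given set of triggers to the same server. The address table, the round-robin policy and the MEP numbering are purely illustrative assumptions, not part of the talk.

    # Toy central destination assignment (illustrative only): the controller
    # hands every FL1 the same (DLID, queue pair number) for a given MEP, so
    # that all fragments of one event meet at one server.
    from itertools import cycle

    # Hypothetical address table: one (DLID, QPN) per farm server.
    servers = [(0x10 + i, 100 + i) for i in range(8)]
    next_destination = cycle(servers)

    def assign_destinations(n_meps):
        """Return the broadcast schedule: MEP number -> (DLID, QPN)."""
        return {mep: next(next_destination) for mep in range(n_meps)}

    schedule = assign_destinations(4)
    for mep, (dlid, qpn) in schedule.items():
        print(f"MEP {mep}: send to DLID 0x{dlid:02x}, QPN {qpn}")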