Niko Neufeld niko.neufeld@cern.ch https://www.afinite.co.uk/media/85/Connectivity.jpg (quasi) real-time connectivity requirements, CERN openlab workshop.

Presentation transcript:

(quasi) real-time connectivity requirements. CERN openlab workshop, 23/4/17. Niko Neufeld niko.neufeld@cern.ch https://www.afinite.co.uk/media/85/Connectivity.jpg

Acknowledgements & disclaimer Most of this is not original, though some of it I arrived at independently. Too vast a topic for 15 minutes; you will be subject to heavy presenter's bias.

The 4 tasks with data: getting it from the sensors, distributing it, processing it, and storing it. (Diagram annotations: specific to Online computing; both Online & Offline; repeated cycles; not covered in this talk.)

The ideal experiment: get all the data which your instrument provides! In theory this should allow you to extract anything of interest; in practice this is a lot of data. A typical LHC experiment has 10^7 sensors with new values every 25 ns. Assuming (pessimistically) an 8-bit ADC for each sensor, this gives 400 Tbyte/second (!). What is limiting us?
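As a quick sanity check of that figure, the arithmetic can be written out directly (a minimal sketch; channel count, sample size and rate are the slide's own assumptions):

```python
# Back-of-envelope check of the raw data rate quoted above.
n_sensors = 1e7              # ~10^7 sensors in a typical LHC experiment
bytes_per_sample = 1         # pessimistic: one 8-bit ADC value per sensor
sampling_rate_hz = 40e6      # a new value every 25 ns -> 40 MHz

rate_bytes_per_s = n_sensors * bytes_per_sample * sampling_rate_hz
print(f"{rate_bytes_per_s / 1e12:.0f} Tbyte/s")   # -> 400 Tbyte/s
```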

The 4 limitations with data: getting it from the sensors, distributing it, processing it, and storing it. (Diagram annotations: specific to Online computing; not covered in this talk.)

The readout chain: Detector / Sensor -> Amplifier -> Filter -> Shaper -> Range compression -> Sampling (clock) -> Digital filter -> Zero suppression -> Buffer -> Feature extraction -> Buffer -> Format & Readout to the Data Acquisition System. Ideally all these functions are integrated in the front-end chip. There will be challenges with power (dissipation), in particular at high-rate detectors, and, in some cases, radiation.
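For illustration only, the "zero suppression" step in this chain amounts to discarding samples compatible with noise and keeping the channel addresses of the rest; a toy Python sketch (threshold and sample values are invented, and real front-ends implement this in ASIC/FPGA logic):

```python
def zero_suppress(samples, threshold):
    """Keep only (channel, value) pairs whose value exceeds the noise threshold."""
    return [(ch, v) for ch, v in enumerate(samples) if v > threshold]

adc_samples = [0, 1, 0, 42, 3, 0, 0, 17]          # hypothetical 8-bit ADC values
print(zero_suppress(adc_samples, threshold=2))     # -> [(3, 42), (4, 3), (7, 17)]
```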

The last (first) mile: the last mile is (still) in copper. Optical modules are too large, too power-hungry, or not radiation-hard enough. But copper is not what we want in the way of our particles! This is the ultimate limit to high-rate read-out.

Very high density readout (1): wireless. WADAPT: "Wireless Allowing Data And Power Transmission". Original idea by R. Brenner (Uppsala). Radial data transfer: communication between layers; the signal cannot traverse layers, allowing reuse of frequency channels.

The next link for LHC: LpGBTX (draft characteristics). 500 mW (target); 5.12 Gbit/s low power, 10.24 Gbit/s ("high" speed). Includes the possibility of strong Forward Error Correction (FEC). Bidirectional link, including the possibility to use it for synchronous (trigger) commands, clock and slow control: one link for everything. https://espace.cern.ch/GBT-Project/default.aspx

Silicon photonics: challenging, but a very hot topic with industry interest (hint, hint). High bandwidth (10, 25 Gbit/s). Allows very tight integration: very compact, dense optical transmitters directly from front-end ASICs. Lots of challenges: integration with other electronics and packaging; radiation hardness needs to be established (some first encouraging results have been achieved). Source: S. Seif El Nasr-Storey et al., http://cds.cern.ch/record/2025925

Transitioning to COTS (Commercial Off-The-Shelf). Diagram labels: custom electronics; transition module; switching network (optional, if event-building is required); on-detector / off-detector boundary; processing elements; data formatting; MUX; zero suppression; derandomizer buffer; data buffering during the trigger; DAQ.

Transitioning to COTS using PCIe and an (Intel) FPGA. Today, transitioning to COTS means PCIe or Ethernet. Example: PCIe40, developed by CPP Marseille, will be used by ALICE and LHCb for LHC Run 3. 48 bidirectional links of up to 10 Gbit/s each, free choice of protocol (e.g. GBT, Ethernet, etc.), multimode fibre at 850 nm. Can bring up to 110 Gbit/s into a PC over PCIe. A high-port-density, cost-effective solution. More integration possible? CPU/FPGA?
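The "up to 110 Gbit/s into a PC" figure is roughly what a PCIe Gen3 x16 slot can deliver once line encoding and protocol overheads are subtracted; a rough estimate (the ~15% protocol overhead is an assumption, not a number from the slide):

```python
# Rough usable bandwidth of a PCIe Gen3 x16 slot.
lanes = 16
gen3_rate_gt = 8.0               # 8 GT/s per lane
encoding = 128 / 130             # 128b/130b line encoding
protocol_efficiency = 0.85       # TLP/DLLP overhead: rough assumption

raw_gbit = lanes * gen3_rate_gt * encoding           # ~126 Gbit/s
usable_gbit = raw_gbit * protocol_efficiency         # ~107 Gbit/s, close to the quoted 110
print(f"raw {raw_gbit:.0f} Gbit/s, usable ~{usable_gbit:.0f} Gbit/s")
```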

Event-building. Every Readout Unit has a piece of one data-set ("event"). If all pieces must be brought together into a single compute unit for filtering, then event-building is required. This can most easily be done by a switching network. (Diagram: detector -> Readout Units -> switching network, through 100 m of rock -> Compute Units.)
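Conceptually, event-building is a gather operation keyed on the event identifier; a toy sketch (names and structure invented for illustration; real event builders work over Ethernet or InfiniBand with flow control, time-outs and load balancing):

```python
from collections import defaultdict

N_READOUT_UNITS = 4
pending = defaultdict(dict)            # event_id -> {source_id: fragment}

def on_fragment(event_id, source_id, fragment):
    """Collect one fragment; return the complete event once all pieces have arrived."""
    pending[event_id][source_id] = fragment
    if len(pending[event_id]) == N_READOUT_UNITS:
        return pending.pop(event_id)   # ready to be shipped to a filter-farm node
    return None

# Example: fragments of event 42 arriving from every readout unit
for src in range(N_READOUT_UNITS):
    event = on_fragment(42, src, f"fragment-from-RU{src}")
print(event)                           # complete once the 4th fragment has arrived
```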

LAN technologies: Ethernet (IEEE 802.3xx), InfiniBand (IB) and Omni-Path. Ethernet is "the LAN", normally used in conjunction with IP (and TCP). InfiniBand is an open standard with essentially a single vendor; it is a complete stack (no need for IP or TCP), used in high-performance computing and as a back-end for storage, with low CPU usage on clients thanks to remote direct memory access (RDMA), and low latency. Omni-Path is Intel's alternative to InfiniBand, with the lowest cost per unit bandwidth. Anything else is exotic. http://www.digibarn.com/collections/diagrams/ethernet-original/composit-ethernet-sketch.jpg

DAQ challenges: transport multiple Terabit/s reliably and cost-effectively; a 500-port, full-duplex, full bisection-bandwidth network, aiming at 80% sustained link load at >= 100 Gbit/s per link; integrate the network closely and efficiently with compute resources (be they classical CPU, "many-core" or accelerator-assisted); multiple network technologies should seamlessly co-exist in the same integrated fabric ("the right link for the right task").
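For scale, the sustained aggregate throughput implied by these requirements (counting each link once):

```python
ports = 500
link_speed_gbit = 100
sustained_load = 0.80

aggregate_tbit = ports * link_speed_gbit * sustained_load / 1000
print(f"~{aggregate_tbit:.0f} Tbit/s sustained")   # -> ~40 Tbit/s
```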

Future network needs for LHC DAQ (table of requirements for 2020 and 2026).

The evolution of network interconnects (timeline chart; only general-purpose technologies shown). Driven by SERDES speed: 10, 25, 50 GHz. Annotations: HDR (200 Gb/s) HCA expected; PCIe Gen3 available; PCIe Gen4; 100 GbE NIC.

How much does a Gigabit/s cost? Source: http://www.kernelsoftware.com/products/catalog/mellanox.html

Cost example: DAQ (fat tree), based on Ethernet (40G). Assume 100 data sources and 150 computers, with 40G per server! The switch radix here is 32. We need 35 switches and 474 cables; cost (assuming typical discounts) ~ 333,000 USD. This assumes perfect scaling! Source: https://computing.llnl.gov/tutorials/lc_resources/ DIY: http://www.broadcom.com/application/cloud_scale_network/cloud_configurator.php (design the network + network costs), http://www.colfaxdirect.com (adapter cost)
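In the same spirit, a two-level fat-tree sizer can be sketched in a few lines (a rough estimator only: it assumes a fully non-blocking split of the radix and will not reproduce the configurator's exact 35 switches and 474 cables):

```python
import math

def fat_tree_two_level(endpoints, radix):
    """Rough switch and cable counts for a non-blocking two-level fat tree."""
    down_per_leaf = radix // 2                 # half the ports down, half up
    leaves = math.ceil(endpoints / down_per_leaf)
    uplinks = leaves * (radix - down_per_leaf)
    spines = math.ceil(uplinks / radix)
    cables = endpoints + uplinks               # end-node cables + inter-switch cables
    return leaves + spines, cables

switches, cables = fat_tree_two_level(endpoints=100 + 150, radix=32)
print(switches, cables)                        # rough estimate for the 250-node example
```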

Scaling in practice. The good: more than 31 Tbit/s in flight! The bad: between 4 and 512 nodes we lose more than 50%. A lot of challenges remain!

Long-distance data transport, a.k.a. the Prevessin challenge. The servers can be hosted elsewhere; in that case the 32 Tbit/s of raw data need to be transported off-site. Distance: about 4 km (for ALICE and LHCb). Either lay many new fibres or use DWDM: using 25 GHz wavelengths, about 1200 lambdas are needed in the maximum configuration. The solution should be compact, does not need 99.999% availability, and should be scalable (starting smaller, to be ramped up to the maximum later). Traffic is essentially uni-directional: could this be exploited? In any case the data transport cost per Tbit/s is significant, so compression of data is attractive, if cost-efficient.
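The lambda count follows from dividing the aggregate rate by the per-wavelength capacity; assuming roughly 25 Gbit/s per lambda (an assumption chosen to be consistent with the quoted ~1200 figure):

```python
aggregate_tbit = 32           # raw data rate to be moved off-site
per_lambda_gbit = 25          # assumed capacity per DWDM wavelength

lambdas = aggregate_tbit * 1000 / per_lambda_gbit
print(f"~{lambdas:.0f} lambdas")   # -> ~1280, of the order of the ~1200 quoted
```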

Conclusions: significant challenges remain in all areas of data movement, not only online but also offline. Getting the data out and distributed is crucial for more and better physics in all future experiments. LAN growth is still strong, but Online requirements are very high and budgets are limited.