Boosting Event Building Performance Using Infiniband FDR for CMS Upgrade
Andrew Forrest – CERN (PH/CMD)
Technology and Instrumentation in Particle Physics Conference, 2 June 2014, Amsterdam

Outline
- Motivations
- A Quick Overview of Infiniband
- CMS Data Acquisition
- Event Building
- CMS Online Software Framework (XDAQ)
- Integrating Infiniband
- Preliminary Results
- Conclusions

Motivations
- DAQ hardware (PCs) from Run 1 is at end-of-life (>5 years old)
- A cost-effective solution is needed that meets the requirements for Run 2
- Opportunity to take advantage of technological advances
(Figure: top500.org interconnect statistics – Gigabit Ethernet, Infiniband)

A Quick Overview of Infiniband
- Infiniband is a switched-fabric network interconnect for high-performance communication
- Provides reliable communication
- Supports message-based transfer using send/receive semantics
- Multiple programming methods: Infiniband verbs, uDAPL (user-level Direct Access Programming Library), IPoIB
- Software available as part of OFED (OpenFabrics Enterprise Distribution)

CMS Data Acquisition

CMS DAQ Requirements for LHC Run 2

Parameter              Value
Data Sources (FEDs)    ~620
Trigger levels         2
First Level rate       100 kHz
Event size             1 to 2 MB
Readout Throughput     200 GB/s
High Level Trigger     1 kHz
Storage Bandwidth      2 GB/s
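The throughput rows follow directly from the rate and event-size rows above; a quick cross-check of the arithmetic (taking the 2 MB upper bound on the event size):

```latex
% Readout throughput = first-level trigger rate x event size
100\,\mathrm{kHz} \times 2\,\mathrm{MB} = 200\,\mathrm{GB/s}
% Storage bandwidth  = high-level trigger accept rate x event size
1\,\mathrm{kHz} \times 2\,\mathrm{MB} = 2\,\mathrm{GB/s}
```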

CMS DAQ2 Layout

CMS Event Builder Network

CMS Event Builder Network – Resulting Required Physical Resources

Resource type             Run 1       Run 2
RU                        ~640        84
BU                        ~
FU                        ~1260
Throughput Requirement    100 Gb/s    200 Gb/s

CMS Event Builder Layout
(Diagram: Event Manager (EVM), input sources (e.g. detector frontend), builder networks, Readout Units, Builder Units.)
- Event fragments are stored in physically separated memory systems
- Fragments are collected into full events by the Builder Units

CMS Online Software Framework (XDAQ)

XDAQ Framework
- XDAQ is a software platform created specifically for the development of distributed data acquisition systems
- Implemented in C++, developed by the CMS DAQ group
- Provides platform-independent services and tools for inter-process communication, configuration and control
- Builds upon industrial standards, open protocols and libraries, and is designed according to the object-oriented model

For further information about XDAQ see: J. Gutleber, S. Murray and L. Orsini, "Towards a homogeneous architecture for high-energy physics data acquisition systems", Computer Physics Communications, vol. 153, issue 2.

XDAQ Architecture
(Diagram, built up over several slides: the executive process provides a core and a plugin interface on a common foundation; application plugins are loaded into the executive and configured through XML; control and interfacing use SOAP, HTTP and RESTful protocols; hardware access layers cover VME, PCI and memory; communication protocols complete the stack.)
- Uniform building blocks: one or more executives per computer contain application and service components

Hardware Architecture – NUMA
- Non-Uniform Memory Access (NUMA): the distance between processing cores and memory or I/O interrupts varies for each core
- XDAQ provides utilities to take advantage of this (a sketch of the underlying mechanism follows)
(Diagram: four CPUs, each with its own memory controller, connected to I/O devices.)
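XDAQ's NUMA utilities are not shown here; as a rough illustration of the underlying mechanism, the following sketch uses libnuma directly to place a buffer on a chosen node (the node number and buffer size are assumptions):

```cpp
// Sketch: allocate buffer memory on a chosen NUMA node with libnuma.
// XDAQ wraps equivalent functionality; this only shows the mechanism.
// Build with: g++ numa_sketch.cc -lnuma
#include <numa.h>
#include <cstdio>

int main() {
    if (numa_available() < 0) {
        std::fprintf(stderr, "NUMA not supported on this host\n");
        return 1;
    }
    const int node = 0;                        // assumed node hosting the I/O devices
    const size_t size = 64 * 1024 * 1024;      // 64 MB of event-fragment buffers

    void* buffers = numa_alloc_onnode(size, node);   // pages placed on 'node'
    if (!buffers) {
        std::fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    // ... hand the region to a memory pool, register it with the NIC, etc. ...

    numa_free(buffers, size);
    return 0;
}
```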

Multithreading
- Workloops provide easy use of threads
- Work is assigned by the application
- Workloops can be bound to run on specific CPU cores by configuration (see the sketch below)
(Diagram: application thread assigns work to a workloop thread running on a CPU core; I/O controller shown.)
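As an illustration of what "binding a workloop to a core" means at the OS level, here is a minimal sketch using the standard pthread affinity call; the core number is an assumption and the XDAQ workloop API itself is not shown:

```cpp
// Sketch: pin the current thread to a given CPU core, as a workloop bound
// by configuration would be. Build with: g++ pin_sketch.cc -pthread
#include <pthread.h>
#include <sched.h>
#include <cstdio>

bool pinToCore(unsigned core) {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(core, &cpuset);
    // After this call the thread is only scheduled on 'core'.
    return pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset) == 0;
}

int main() {
    if (!pinToCore(2))                         // e.g. a core on the socket near the NIC
        std::fprintf(stderr, "failed to set affinity\n");
    // ... run the workloop's processing loop here ...
    return 0;
}
```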

Memory – Memory Pools
- Memory pools allocate and cache memory
- log2 best-fit allocation
- No fragmentation of memory over long runs
- Buffers are recycled during runs – constant time for retrieval
- All memory allocation can be bound to a specific NUMA node
(Diagram: CPU, memory pool of cached buffers; a sketch of the allocation scheme follows.)
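A minimal sketch of the log2 (power-of-two) best-fit idea, with per-size free lists so buffers are recycled in constant time; this illustrates the scheme, not the actual XDAQ pool implementation:

```cpp
// Sketch of a log2 best-fit pool: requested sizes are rounded up to the next
// power of two and buffers are recycled through per-size free lists, so
// retrieval is O(1) and long runs do not fragment the heap.
#include <array>
#include <cstddef>
#include <cstdlib>
#include <vector>

class Log2Pool {
public:
    void* allocate(std::size_t size) {
        const unsigned bin = binFor(size);
        auto& freeList = freeLists_[bin];
        if (!freeList.empty()) {                         // recycle a cached buffer
            void* p = freeList.back();
            freeList.pop_back();
            return p;
        }
        return std::malloc(std::size_t(1) << bin);       // grow the pool once
    }

    void release(void* p, std::size_t size) {
        freeLists_[binFor(size)].push_back(p);           // keep for reuse during the run
    }

private:
    static unsigned binFor(std::size_t size) {
        unsigned bin = 0;
        while ((std::size_t(1) << bin) < size) ++bin;    // round up to a power of two
        return bin;
    }
    std::array<std::vector<void*>, 64> freeLists_;
};
```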

Memory – Buffer Loaning
- Buffer loaning allows zero-copy passing of data between software layers and processes
(Diagram: Task A allocates a reference (step 1), loans it to Task B (step 2), and the reference is released (step 3); a sketch follows.)
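Buffer loaning can be pictured as reference counting on pool buffers: the last holder to release the reference returns the buffer to its pool, so the payload is never copied between layers. A sketch using std::shared_ptr with a custom deleter (illustrative only; Pool and the task functions are stand-ins, not the XDAQ reference-counting API):

```cpp
// Sketch: loaning a pool buffer between tasks without copying the payload.
// The shared_ptr deleter returns the buffer to its pool when the last
// reference is released.
#include <cstddef>
#include <cstdlib>
#include <memory>

struct Pool {                                            // trivial stand-in pool
    void* allocate(std::size_t n) { return std::malloc(n); }
    void  release(void* p, std::size_t) { std::free(p); }
};

using BufferRef = std::shared_ptr<void>;

BufferRef allocateReference(Pool& pool, std::size_t size) {
    void* raw = pool.allocate(size);                     // Step 1: allocate reference
    return BufferRef(raw, [&pool, size](void* p) { pool.release(p, size); });
}

void taskB(BufferRef /*data*/) {
    // Task B works directly on the loaned buffer; no copy is made.
}

int main() {
    Pool pool;
    BufferRef buf = allocateReference(pool, 4096);
    taskB(buf);                                          // Step 2: reference is loaned
    return 0;                                            // Step 3: last owner releases it
}
```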

Data Transmission – Peer Transports
- The user application is network- and protocol-independent
- Routing is defined by the XDAQ configuration
- Connections are set up through a peer-to-peer model
- The application uses the network transparently through the XDAQ framework
(Diagram: two nodes, each running a user application on top of XDAQ, a peer transport and a NIC; logical communication at the application level, physical transmission at the protocol level.)

Integrating Infiniband

Integration Aims
- Evaluation of two programming libraries: uDAPL and verbs, both from the OpenFabrics distribution
- uDAPL has connection support
- verbs is lower level in the software stack

Infiniband Peer Transports
- Two new XDAQ peer transports: uDAPL -> ptuDAPL, verbs -> ptIBVerbs
- Full integration into the XDAQ framework
- Event-based API
- Send/receive with reliable connections
- Buffer loaning – zero-copy
- Memory pools automatically register memory with the NIC (see the sketch below): translation of virtual to physical addresses, pinning memory to avoid swapping
(Diagram: user application on top of XDAQ, with ptIBVerbs and ptuDAPL above the Infiniband NIC.)
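Registering pool memory with the HCA is what pins it and produces the keys used in work requests. A minimal sketch with the OFED verbs API, reduced to the registration step (device choice, buffer size and error handling are simplified; this is not the ptIBVerbs code itself):

```cpp
// Sketch: register (and thereby pin) a pool buffer with the Infiniband HCA
// so it can be used for zero-copy send/receive.
// Build with: g++ reg_sketch.cc -libverbs
#include <infiniband/verbs.h>
#include <cstdio>
#include <cstdlib>

int main() {
    int n = 0;
    ibv_device** devs = ibv_get_device_list(&n);
    if (!devs || n == 0) { std::fprintf(stderr, "no IB devices\n"); return 1; }

    ibv_context* ctx = ibv_open_device(devs[0]);          // first HCA on the host
    ibv_pd*      pd  = ibv_alloc_pd(ctx);                 // protection domain

    const size_t size = 2 * 1024 * 1024;                  // one 2 MB event buffer
    void* buf = std::malloc(size);

    // Registration pins the pages (no swapping) and resolves the virtual to
    // physical translation once, up front, instead of on every transfer.
    ibv_mr* mr = ibv_reg_mr(pd, buf, size,
                            IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
    if (!mr) { std::fprintf(stderr, "ibv_reg_mr failed\n"); return 1; }

    std::printf("registered %zu bytes, lkey=0x%x\n", size, mr->lkey);

    ibv_dereg_mr(mr);
    std::free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```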

Connection Handling
- uDAPL provides IP-address-based connections
- verbs leaves the question of how to connect peers open
- For ptIBVerbs, a custom connection mechanism was implemented on top of IPoIB: a connection request is issued (1) and the peer returns its connection information in the reply (2) – see the sketch below
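The general shape of such a mechanism is to exchange the queue-pair parameters over an ordinary socket running on the IPoIB interface and then bring the reliable-connection QP up with the peer's values. A sketch of that exchange (the struct layout and helper are assumptions, not the actual ptIBVerbs wire format, and both ends are assumed to have the same architecture):

```cpp
// Sketch: each side sends its queue-pair parameters over a TCP socket that
// runs on the IPoIB interface, then uses the peer's reply to move its
// reliable-connection QP to the connected state.
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>

struct QpInfo {
    uint16_t lid;    // local identifier of the HCA port
    uint32_t qpn;    // queue pair number
    uint32_t psn;    // initial packet sequence number
};

// 'sock' is a TCP socket already connected to the peer over IPoIB.
bool exchangeQpInfo(int sock, const QpInfo& mine, QpInfo& theirs) {
    if (send(sock, &mine, sizeof(mine), 0) != (ssize_t)sizeof(mine))        // request
        return false;
    if (recv(sock, &theirs, sizeof(theirs), MSG_WAITALL) != (ssize_t)sizeof(theirs))
        return false;                                                       // reply
    return true;  // caller now transitions its QP (INIT -> RTR -> RTS) using 'theirs'
}
```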

Parallelism
- Work is distributed across several XDAQ workloops
- Workloops are bound to run on one or more cores
- Sending and receiving operations are separated and use different memory pools
(Diagram: application, event handling, sending and receiving stages around the IB layer, with allocate/free, post, send-completion and receive-completion/dispatch steps; a sketch of the receive side follows.)
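A sketch of what the receive-side workloop body could look like with verbs: poll the completion queue, dispatch each completed buffer to event handling, and re-post a receive from the pool. The dispatch and postReceive helpers are illustrative stand-ins; queue-pair and completion-queue setup is assumed to happen elsewhere:

```cpp
// Sketch of a receive workloop body: poll the completion queue, hand each
// completed buffer to the event-handling stage, and keep the receive queue
// full by re-posting a buffer from the pool.
#include <infiniband/verbs.h>
#include <cstdint>

void dispatch(uint64_t bufferId, uint32_t bytes);   // hand data to event handling
void postReceive(ibv_qp* qp, uint64_t bufferId);    // re-post a buffer from the pool

void receiveWorkloop(ibv_cq* cq, ibv_qp* qp) {
    ibv_wc wc[16];                                  // a batch of work completions
    for (;;) {
        int n = ibv_poll_cq(cq, 16, wc);            // non-blocking poll
        for (int i = 0; i < n; ++i) {
            if (wc[i].status != IBV_WC_SUCCESS) continue;   // error handling elided
            dispatch(wc[i].wr_id, wc[i].byte_len);  // receive completion -> dispatch
            postReceive(qp, wc[i].wr_id);           // post a fresh receive
        }
    }
}
```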

Preliminary Results

Test Setup
- Small scale: 4 nodes on 1 switch
- 1x1 and 2x2 (RU x BU) configurations for N-to-N and event-building tests
- Each node has:
  - Dual-socket Intel Xeon, 2.6 GHz
  - 16 GB RAM per socket (NUMA)
  - Mellanox ConnectX-3 VPI Infiniband FDR network card
  - OFED v2.x
  - Scientific Linux (CERN) 6

Effective unidirectional theoretical throughput in Gb/s:
Link width    SDR    DDR    QDR    FDR-10    FDR      EDR
1X            2      4      8      10        13.64    25
4X            8      16     32     40        54.54    100
12X           24     48     96     120       163.64   300

N-to-N Setup
- N clients each send to N servers for each 'message'
- The measurement is the receive rate at the servers
- No additional processing
- Fixed-size messages, round-robin dispatching in the senders
- The test shows the performance for unidirectional throughput
(Diagram: clients sending to servers.)

N-to-N

Event Building Setup
- Event fragments are generated in the RUs
- Fully built events are dropped in the BUs
- The measurement is the receive rate in the BUs
- Additional control messages are exchanged with the Event Manager
(Diagram: RUs, BUs and the Event Manager.)

Event Building

Larger Scale Tests
- ptIBVerbs used for larger-scale tests
- Up to 48x48 using an Infiniband FDR CLOS network
- Preliminary N-to-N tests
(Diagram: Infiniband FDR event builder CLOS network.)

ptIBVerbs Performance – N-to-N, Message Size vs Throughput

DAQ1 vs DAQ2 RU Performance

Conclusions

Conclusions
- Infiniband works well with event-building applications and with the CMS online software framework (XDAQ)
- CMS DAQ will use ptIBVerbs for data flow in LHC Run 2
- Compared to DAQ1, the performance allows an order-of-magnitude reduction in the physical resources needed for event building
- In the future: full DAQ2 tests

Thank You For Listening
Questions?

Additional Materials

OFED