The CMS Event Builder Demonstrator based on Myrinet. Frans Meijers. CHEP 2000, Padova, Italy, Feb 2000.

Presentation transcript:

The CMS Event Builder Demonstrator based on Myrinet
Frans Meijers, CERN/EP, on behalf of the CMS DAQ group
CHEP2000, Padova, Italy, Feb 2000
- Introduction
- Myrinet Overview
- Tests of the Switching Fabric
- Event Building Studies
- Future Work and Conclusions

Introduction
- DAQ architecture and EVB parameters
- Event building by switches. Crossbar
- EVB traffic shaping: barrel shifter
- Banyan network
- A multistage 1024 port switch
- The CMS DAQ system

DAQ architecture and EVB parameters
[Diagram: Detector Front-end, Level 1 Trigger, Readout Systems, Event Manager, Builder Networks, Builder and Filter Systems, Run Control, Computing Services]
- Level-1 maximum trigger rate: 100 kHz
- Average event size: 1 Mbyte
- Builder network (512x512 port) aggregate throughput: 1 Tbps
- Number of Readout Units: [...]
- Average event fragment size: [...] kbyte
- High Level Trigger acceptance: [...] %
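
The headline parameters can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check, not part of the talk: it takes 1 Mbyte as 10^6 bytes and assumes one data source per input port of the 512x512 network (both are assumptions; the function names are hypothetical).

```python
# Back-of-envelope check of the event builder parameters quoted on the slide.
LEVEL1_RATE_HZ = 100e3    # Level-1 maximum trigger rate (from the slide)
EVENT_SIZE_BYTES = 1e6    # average event size, assuming 1 Mbyte = 10^6 bytes
N_SOURCES = 512           # assumption: one source per port of the 512x512 network


def aggregate_throughput_bps(rate_hz: float, event_bytes: float) -> float:
    """Bits per second that must cross the builder network to build every event."""
    return rate_hz * event_bytes * 8


def average_fragment_bytes(event_bytes: float, n_sources: int) -> float:
    """Average fragment size if an event is spread evenly over n_sources."""
    return event_bytes / n_sources


if __name__ == "__main__":
    tbps = aggregate_throughput_bps(LEVEL1_RATE_HZ, EVENT_SIZE_BYTES) / 1e12
    frag = average_fragment_bytes(EVENT_SIZE_BYTES, N_SOURCES)
    print(f"aggregate throughput: {tbps:.1f} Tbps")
    print(f"average fragment size: {frag:.0f} bytes")
```

Under these assumptions the payload alone needs 0.8 Tbps, in line with the quoted 1 Tbps, and the average fragment comes out close to the 2 kbyte used later in the talk.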

Event building by switches. Crossbar
- NxN matrix: the number of crosspoints is N^2
- The maximum switch load for random traffic is about 63% (large N limit) due to head-of-line blocking
- Higher efficiency:
  - queues at input and/or output ports
  - traffic shaping (example: barrel shifter, 100%)
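
The 63% figure coincides with 1 - 1/e, the large-N limit of a simple one-slot contention model in which every input picks an output uniformly at random and only one contender per output gets through. That the slide's number comes from exactly this model is an assumption; the sketch below just evaluates the standard formula 1 - (1 - 1/N)^N.

```python
import math


def crossbar_efficiency(n: int) -> float:
    """Expected fraction of busy outputs of an n x n crossbar in one slot.

    Assumed model: each input independently requests a uniformly random
    output, and exactly one request per output is served.
    """
    return 1.0 - (1.0 - 1.0 / n) ** n


# Large-N limit of the expression above.
LARGE_N_LIMIT = 1.0 - math.exp(-1.0)

if __name__ == "__main__":
    for n in (2, 4, 8, 16, 512):
        print(f"{n:4d} x {n:<4d}: {crossbar_efficiency(n):.3f}")
    print(f"large-N limit: {LARGE_N_LIMIT:.3f}")
```

For N = 4 the same expression gives about 0.68.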

EVB traffic shaping: barrel shifter
- Sources emit to mutually exclusive destinations in a cycle
- Works only for fixed size chunks
- Needs synchronisation
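
A barrel-shifter schedule can be sketched in a few lines. The rule used here, destination = (source + time slot) mod N, is one common choice and is an assumption; the slide does not say which cyclic pattern is used.

```python
def barrel_shifter_destination(source: int, time_slot: int, n: int) -> int:
    """Destination served by `source` in `time_slot` (assumed rotation rule)."""
    return (source + time_slot) % n


def schedule(n: int) -> list[list[int]]:
    """schedule(n)[t][s] is the destination of source s in time slot t."""
    return [
        [barrel_shifter_destination(s, t, n) for s in range(n)]
        for t in range(n)
    ]


if __name__ == "__main__":
    for t, row in enumerate(schedule(4)):
        print(f"slot {t}: " + ", ".join(f"{s}->{d}" for s, d in enumerate(row)))
```

In every slot the destinations form a permutation of the sources, so no two sources ever contend for the same output, and over one full cycle each source visits every destination once.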

Banyan network
- Example: 8x8 made of 3 stages of 2x2 switches (8 = 2^3)
- Single path per connection
- Suffers from internal blocking
- Number of cross points: N log2 N
- For random traffic (no intermediate input queues and no output queues): efficiency drops with s, N; for "infinite" N, eff. 20%
- There exists a non-blocking barrel-shifting pattern
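
The last point can be checked on a small model. The sketch below assumes an omega (shuffle-exchange) wiring of 2x2 elements as a representative member of the banyan class; the slide does not specify the wiring, so this is an illustration rather than the network shown in the talk. It traces the unique path of each packet and looks for two packets on the same wire.

```python
def omega_path(source: int, dest: int, n_bits: int) -> list[int]:
    """Wire index occupied after each stage by a packet going source -> dest.

    After stage k the wire address consists of the low (n_bits - k) bits of
    the source followed by the top k bits of the destination.
    """
    mask = (1 << n_bits) - 1
    return [
        ((source << k) | (dest >> (n_bits - k))) & mask
        for k in range(1, n_bits + 1)
    ]


def is_conflict_free(permutation: list[int], n_bits: int) -> bool:
    """True if no two packets of the permutation share a wire at any stage."""
    paths = [omega_path(s, d, n_bits) for s, d in enumerate(permutation)]
    for stage in range(n_bits):
        wires = [path[stage] for path in paths]
        if len(set(wires)) != len(wires):
            return False
    return True


def barrel_shift(n: int, shift: int) -> list[int]:
    """Permutation in which source s sends to (s + shift) mod n."""
    return [(s + shift) % n for s in range(n)]


def bit_reversal(n_bits: int) -> list[int]:
    """Permutation sending s to the bit-reversed s; used as a blocking example."""
    return [int(format(s, f"0{n_bits}b")[::-1], 2) for s in range(1 << n_bits)]


if __name__ == "__main__":
    shifts_ok = all(is_conflict_free(barrel_shift(8, c), 3) for c in range(8))
    print("all barrel shifts pass without blocking:", shifts_ok)
    print("bit reversal passes without blocking:", is_conflict_free(bit_reversal(3), 3))
```

In this model every cyclic shift passes without internal blocking, while a permutation such as bit reversal does not, which illustrates why a barrel-shifter pattern suits a single-path network.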

A multistage 1024 port switch
- Banyan topology: NxN built out of nxn elements, N = n^s
- Basic unit: 8x8 crossbars
- 3 stages: 512x512, needs 192 crossbars in total
- Important to study multistage switches
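
The crossbar count follows from N = n^s: each of the s stages needs N/n elements. A minimal sketch of that count (the function name is hypothetical):

```python
def banyan_crossbar_count(n_ports: int, radix: int) -> tuple[int, int]:
    """Return (stages, crossbars) for an N x N banyan of radix x radix elements.

    Requires N to be an exact power of the radix (N = radix ** stages).
    """
    stages, size = 0, 1
    while size < n_ports:
        size *= radix
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be a power of radix")
    return stages, stages * (n_ports // radix)


if __name__ == "__main__":
    print(banyan_crossbar_count(512, 8))  # 512x512 out of 8x8 crossbars
    print(banyan_crossbar_count(16, 4))   # 16x16 out of 4x4 crossbars
```

For 512x512 out of 8x8 elements this gives 3 stages of 64 crossbars each, i.e. the 192 quoted above.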

The CMS DAQ system
[Diagram labels: Detector front-end readout, LV1, RU, EVM, FU, Ctrl, Computing and Communication Services]

Myrinet overview
- Myrinet features
- Myrinet switches
- Network Interface Card

Myrinet features
- Myrinet is a System Area Network (SAN)
- Point-to-point links, byte wide, full-duplex, 1.3 Gbps per direction, very low error rate
- Packet structure: routing header, payload and tail; each crossbar switch strips the leading byte from the routing header
- Wormhole routing (versus store-and-forward): no buffering, low latency, arbitrary length packets
- Byte-based flow control (STOP/GO)
- No packet loss inside the switching fabric
- 3Q 2000: link speed from 1.3 Gbps to 2.6 Gbps
[Diagram labels: ROUTING HEADER, PAYLOAD, CRC; STOP, GO]
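
The header-stripping rule lends itself to a short illustration. The sketch below is a toy model only: it assumes the routing byte is simply the output port number and leaves out the tail and flow control; the actual Myrinet byte encoding is not given in the talk.

```python
def forward_through_switch(packet: bytes) -> tuple[int, bytes]:
    """Model one crossbar: consume the leading routing byte.

    Returns (output port, remaining packet). Treating the byte value as the
    port number is an assumption made for this illustration.
    """
    if not packet:
        raise ValueError("packet has no routing byte left")
    return packet[0], packet[1:]


def route(packet: bytes, hops: int) -> tuple[list[int], bytes]:
    """Pass a packet through `hops` switches; return ports taken and remainder."""
    ports = []
    for _ in range(hops):
        port, packet = forward_through_switch(packet)
        ports.append(port)
    return ports, packet


if __name__ == "__main__":
    # Two routing bytes for a two-stage fabric, followed by the payload.
    ports, remainder = route(bytes([2, 3]) + b"event fragment", hops=2)
    print(ports, remainder)
```

Because the sender writes the whole route into the header, the switches need no routing tables, and what arrives at the destination is the payload with the routing bytes already consumed.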

Myrinet switches
- M2M-OCT-SW8: 32 ports, 8 times 4x4 crossbars
- Large switch fabric built out of 4x4 crossbar elements
- Now 8x8 crossbar available as basic element

Network interface card
[Block diagram of the M2M-PCI64 NIC. Labels: LANai7 (RISC, Pkt Interface, host DMA, Send DMA, Recv DMA), Memory (2 MByte, Address/Data), PCI Bridge, PCI 32 or 64 bit (33 or 66 MHz), 64 bit (66 MHz), 66 MHz, Myrinet SAN link 8 bit (80 MHz, NRZ)]
- Developed a custom Myrinet Control Program
  - controls the DMA engines
  - implements the low-level communication protocol

Switch tests
- Set-up for switch test
- Traffic conditions tested
- Point-to-point 1x1
- Parameters point-to-point 1x1
- Point-to-Point NxN - Mutually exclusive paths
- Block on output port
- Block on internal switch
- Random Traffic

Demonstrator set-up for switch tests
- 32 nodes, Linux PCs
- PC: 450 MHz PII, BX, PCI 33 MHz/32bit
- Myrinet switch: M2M-OCT-SW8, NIC: M2M-PCI64[A]
- Two-stage Banyan network out of 4x4 crossbars
[Diagram labels: sources, destinations]

Traffic conditions tested
- Random traffic
- Point-to-point traffic (fixed destinations)

Point-to-point 1x1
- Full host-NIC DMA: limited by PCI (33 MHz/32bit)
- Partial host-NIC DMA:
  - NIC memory to link: full packet
  - host to NIC: only headers
  - limited by SAN link
  - allows the switch to be loaded to the maximum
[Diagram labels: PCI, link]

Parameters point-to-point 1x1
- time per packet = overhead + size / speed
- Partial host-NIC DMA:
  - above 1 kbyte: linear behaviour
  - below 1 kbyte: plateau of 5 µs (NIC-host communication)
  - speed: 141 Mbyte/s -> 92% link eff.
- Full host-NIC DMA:
  - speed: 128 Mbyte/s -> PCI speed
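
The linear model "time per packet = overhead + size / speed" is easy to evaluate. The sketch below uses 5 µs and 141 Mbyte/s as example inputs; pairing these two numbers as overhead and speed of the same mode is an assumption, as is taking 1 Mbyte = 10^6 bytes so that 1 Mbyte/s equals 1 byte/µs.

```python
def time_per_packet_us(size_bytes: float, overhead_us: float,
                       speed_mbyte_per_s: float) -> float:
    """Linear model from the slide: overhead + size / speed, in microseconds.

    With 1 Mbyte = 10^6 bytes (assumed), 1 Mbyte/s equals 1 byte per µs.
    """
    return overhead_us + size_bytes / speed_mbyte_per_s


def throughput_mbyte_per_s(size_bytes: float, overhead_us: float,
                           speed_mbyte_per_s: float) -> float:
    """Effective throughput for back-to-back packets of the given size."""
    return size_bytes / time_per_packet_us(size_bytes, overhead_us, speed_mbyte_per_s)


if __name__ == "__main__":
    for size in (256, 1000, 2000, 4000, 16000):
        rate = throughput_mbyte_per_s(size, overhead_us=5.0, speed_mbyte_per_s=141.0)
        print(f"{size:6d} bytes: {rate:6.1f} Mbyte/s")
```

The model shows why small packets are dominated by the fixed overhead while large packets approach the asymptotic speed.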

Point-to-point NxN - Mutually exclusive paths
- Destination pattern: d = 4*(s%4) + s/4, s = 0-15
- As expected: aggregate throughput through the switch is linear in N
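
The destination pattern can be checked for exclusivity. The sketch below reads s/4 as integer division and assumes a two-stage layout in which source s attaches to first-stage crossbar s // 4, destination d to second-stage crossbar d // 4, and each pair of crossbars shares a single link; this wiring is an assumption used only for the check.

```python
def destination(source: int) -> int:
    """Pattern from the slide, with s/4 taken as integer division."""
    return 4 * (source % 4) + source // 4


def inter_stage_link(source: int) -> tuple[int, int]:
    """(first-stage crossbar, second-stage crossbar) used by this source.

    Assumes one link between every first-stage and second-stage crossbar.
    """
    return source // 4, destination(source) // 4


if __name__ == "__main__":
    for s in range(16):
        print(f"source {s:2d} -> destination {destination(s):2d} via link {inter_stage_link(s)}")
```

Under that wiring the 16 sources reach 16 distinct destinations over 16 distinct inter-stage links, so no two transfers share any resource.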

Block on output port
- Force m (= 1, 2, 3, 4) sources on the same destination
- Each source gets 1/m of V_max
- Measured at source #0

Block on internal switch
- Force 2 sources on different destinations, but through the same intermediate path
- As expected: plateau at V_max/2
- Measured at source #0

Random traffic
- Sources send, independently, to a random destination according to a uniform distribution
- Measured at destinations
- Efficiency:
  - 4x4: 69% (expect 68%)
  - 16x16: 51%
- Limited by head-of-line blocking
[Plot with curves for 1x1, 4x4 and 16x16]
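
The expected 68% for a single 4x4 crossbar can be reproduced with a small Monte Carlo. This is a memoryless one-slot model (each source draws a fresh uniform destination per slot, one winner per output), which is an assumption and much simpler than the wormhole-routed hardware measured in the talk.

```python
import random


def simulate_crossbar_efficiency(n: int, slots: int, seed: int = 1) -> float:
    """Average fraction of outputs that receive a packet per slot.

    Every source picks a uniformly random destination in each slot; sources
    that collide on a destination lose all but one packet for that slot.
    """
    rng = random.Random(seed)
    delivered = 0
    for _ in range(slots):
        requested = {rng.randrange(n) for _ in range(n)}
        delivered += len(requested)
    return delivered / (n * slots)


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n}x{n}: {simulate_crossbar_efficiency(n, slots=20000):.3f}")
```

The 16x16 value from this single-crossbar model is not directly comparable with the measured 51%, since the demonstrator's 16x16 fabric is a two-stage network of 4x4 elements.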

Event building studies
- EVB demonstrator set-up
- Event building protocol
- Variable size event fragments
- Event building performance
- Event building: scaling behaviour
- Traffic shaping
- EVB performance with traffic shaping
- Performance for variable size event fragments
- EVB with traffic shaping: scaling behaviour
- Traffic shaping: time evolution

EVB demonstrator set-up
- 32+1 Linux PCs (450 MHz PII, BX, PCI 33 MHz/32bit)
- Myrinet switch: M2M-OCT-SW8, NIC: M2M-PCI64[A]
- 16x16 two-stage Banyan network out of 4x4 crossbars
- Myrinet between RUs and BUs (full duplex): N-to-N traffic
- Fast Ethernet between BUs and EVM: N-to-1 traffic
- No emulation of Level-1 trigger

Event building protocol
[Sequence diagram between BU, EVM and RU. Messages: Request EvtId, EvtId, Request Data, Send Data, Clear EvtId. Other labels: level1, Myrinet]
- Several EvtId messages are grouped in a single Ethernet packet
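
The message sequence can be mimicked with a toy model. The message directions used below (the BU asks the EVM for an event id, pulls the data from every RU, then clears the id at the EVM) are inferred from the message names and the set-up, so they are an assumption; all class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EventManager:
    """Toy EVM: hands out event ids and tracks which ones are still open."""
    next_id: int = 0
    open_events: set = field(default_factory=set)

    def request_event_id(self) -> int:
        evt_id = self.next_id
        self.next_id += 1
        self.open_events.add(evt_id)
        return evt_id

    def clear(self, evt_id: int) -> None:
        self.open_events.discard(evt_id)


@dataclass
class ReadoutUnit:
    """Toy RU: returns a placeholder fragment for a requested event id."""
    ru_id: int

    def send_data(self, evt_id: int) -> bytes:
        return f"RU{self.ru_id}-evt{evt_id}".encode()


def build_event(evm: EventManager, rus: list[ReadoutUnit]) -> tuple[int, list[bytes]]:
    """One pass of the protocol as seen from a builder unit."""
    evt_id = evm.request_event_id()                     # Request EvtId / EvtId
    fragments = [ru.send_data(evt_id) for ru in rus]    # Request Data / Send Data
    evm.clear(evt_id)                                   # Clear EvtId
    return evt_id, fragments


if __name__ == "__main__":
    evm = EventManager()
    rus = [ReadoutUnit(i) for i in range(4)]
    print(build_event(evm, rus))
```

The grouping of several EvtId messages into one Ethernet packet is not modelled here.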

Variable size event fragments
- Log-normal distribution
- Example: average = 2 kbyte, RMS = 2 kbyte
- Mimics CMS data readout
[Diagram labels: Readout Units, EVB, Builder Units]
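
Fragment sizes with a given average and RMS can be drawn from a log-normal distribution as sketched below. Two assumptions are made: "RMS" is read as the standard deviation, and 2 kbyte is taken as 2000 bytes. The parameter conversion is the standard one for a log-normal: sigma^2 = ln(1 + (rms/mean)^2) and mu = ln(mean) - sigma^2 / 2.

```python
import math
import random


def lognormal_parameters(mean: float, rms: float) -> tuple[float, float]:
    """Convert the mean and standard deviation of the sizes into (mu, sigma)."""
    sigma_squared = math.log(1.0 + (rms / mean) ** 2)
    mu = math.log(mean) - sigma_squared / 2.0
    return mu, math.sqrt(sigma_squared)


def sample_fragment_sizes(n: int, mean: float = 2000.0, rms: float = 2000.0,
                          seed: int = 1) -> list[float]:
    """Draw n fragment sizes in bytes from the log-normal distribution."""
    rng = random.Random(seed)
    mu, sigma = lognormal_parameters(mean, rms)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    sizes = sample_fragment_sizes(100000)
    average = sum(sizes) / len(sizes)
    print(f"sample average: {average:.0f} bytes, largest: {max(sizes):.0f} bytes")
```

With RMS equal to the average the distribution has a long tail, so occasional fragments are many times larger than 2 kbyte.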

Event building performance
- No traffic shaping, fixed size event fragments
- Fragment rate per node† for 16x16 and 2 kbyte fragments: 30 kHz
- Results:
  - 1x1 is close to point-to-point
  - performance decreases from 4x4 to 8x8 to 16x16, as expected
  - from small sizes: overhead 7 µs
† Fragment rate per node = level-1 rate
[Plot with curves for 1x1, 4x4, 8x8 and 16x16; annotations: "2k", "unstable"]

Event building - scaling behaviour
- Take average fragment size of 2 kbyte
- Also variable size fragments
- Results:
  - for variable size: reduced performance, as expected
  - no scaling in N
- Need simulation for large N?

Traffic shaping
- Sources divide fragments into fixed size packets (blocks) and cycle through all destinations
- Inspired by ATM rate division (where the block size is 53 bytes)
- Should work for a large N multistage switch as well
- Implementation:
  - performed by the NIC control program
  - block size set to 4 kbyte (30 µs cycle)
  - barrel shifter without external synchronisation (Myrinet back pressure by HW flow control)
  - packets can be (partially) empty
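
One cycle of such a shaper, seen from a single source, might look like the sketch below. It models each per-destination queue as a plain byte count and visits destinations in the order (source + slot) mod N; how the NIC control program actually packs fragments into blocks is not described in the talk, so both choices are assumptions. The block size is taken as 4000 bytes.

```python
def fill_block(queued_bytes: int, block_bytes: int) -> tuple[int, int]:
    """Take up to one block of data from a queue.

    Returns (bytes carried by this block, bytes left in the queue). A block
    carrying fewer than block_bytes is sent partially empty.
    """
    sent = min(queued_bytes, block_bytes)
    return sent, queued_bytes - sent


def run_cycle(queues: list[int], source: int, block_bytes: int) -> list[int]:
    """One barrel-shifter cycle of one source; updates `queues` in place.

    queues[d] is the number of bytes waiting for destination d. Returns the
    bytes carried in each slot of the cycle.
    """
    n = len(queues)
    carried = []
    for slot in range(n):
        dest = (source + slot) % n
        sent, queues[dest] = fill_block(queues[dest], block_bytes)
        carried.append(sent)
    return carried


def block_time_us(block_bytes: int, link_mbyte_per_s: float) -> float:
    """Time to send one block, with 1 Mbyte/s taken as 1 byte per µs."""
    return block_bytes / link_mbyte_per_s


if __name__ == "__main__":
    queues = [5000, 0, 4000, 100]
    print(run_cycle(queues, source=1, block_bytes=4000), queues)
    print(f"{block_time_us(4000, 141.0):.1f} µs per block")
```

At the 141 Mbyte/s measured for the link, a 4 kbyte block takes a little under 30 µs, in line with the cycle time quoted on the slide.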

EVB performance with traffic shaping
- Fixed size event fragments
- Results: close to point-to-point
- Fragment rate per node for 16x16 and 2 kbyte fragments: 65 kHz
[Plot annotations: "2k", "4k"]

Performance for variable size event fragments
- Decrease of efficiency with larger RMS of the fragment size distribution (in agreement with Monte Carlo)
- Fragment rate per node for nominal average of 2k and RMS of 2k†: 60 kHz
† With full host-NIC DMA: about 80 Mbyte/s or 40 kHz
[Plot annotation: "2k"]

EVB traffic shaping - scaling behaviour
- EVB with traffic shaping: approximate scaling

Traffic shaping - time evolution (I)
[Plot: BS cycling rate * block size versus time, with traffic shaping. Annotations: "23:00", "throughput dropped", "barrel shifter stayed in sync", "2 hours (= [...] cycles, 10 Tbyte moved)"]

Traffic shaping - time evolution (II)
[Plot: BS cycling rate * block size over 1 hour (= 10^8 cycles), with traffic shaping; diagram labels: EVM, RU, BU]
- Perturb the system:
  - 1: slow down RU1: all BUs at reduced rate
  - 2: slow down BU1: only BU1 at reduced rate
- Barrel shifter stays in sync

Future work and conclusions
- Future work
- Conclusions

Future work
- Evaluate Myrinet 2000
  - available 3Q 2000
  - link speed from 1.3 Gbps to 2.6 Gbps
  - switches based on 8x8 crossbars as elementary units
- Further study of traffic shaping
- Simulation
  - extrapolate to large systems

Conclusions
- Established a 16x16 event builder demonstrator based on a Myrinet multistage switch and Linux PCs.
- Performed systematic switch studies; results as expected.
- Measured event building performance:
  - without traffic shaping: no scaling, as expected
  - with traffic shaping: approximate scaling
- For nominal event fragment sizes with average and RMS of 2 kbyte, achieved about 60 kHz trigger rate or 120 Mbyte/s per node (almost 2 Gbyte/s aggregate).
- That is, today, a factor two off from CMS needs, assuming scaling.
- Measurements provide parameters for simulation of large scale (512x512) systems.
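
The quoted figures are mutually consistent, as a short calculation shows. The sketch assumes 16 nodes (the 16x16 demonstrator) and uses the 100 kHz Level-1 rate from the introduction as the target; the function names are hypothetical.

```python
def node_throughput_mbyte_per_s(rate_khz: float, fragment_kbyte: float) -> float:
    """Per-node throughput; kHz times kbyte gives Mbyte/s."""
    return rate_khz * fragment_kbyte


def aggregate_gbyte_per_s(per_node_mbyte_per_s: float, n_nodes: int) -> float:
    """Aggregate throughput over all nodes in Gbyte/s."""
    return per_node_mbyte_per_s * n_nodes / 1000.0


def shortfall_factor(target_rate_khz: float, achieved_rate_khz: float) -> float:
    """How far the achieved trigger rate is from the target rate."""
    return target_rate_khz / achieved_rate_khz


if __name__ == "__main__":
    per_node = node_throughput_mbyte_per_s(60.0, 2.0)
    print(f"per node:  {per_node:.0f} Mbyte/s")
    print(f"aggregate: {aggregate_gbyte_per_s(per_node, 16):.2f} Gbyte/s")
    print(f"shortfall: {shortfall_factor(100.0, 60.0):.2f}")
```

The shortfall against a 100 kHz Level-1 rate comes out at roughly 1.7, which the slide rounds to a factor two.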

Extra Material

Multi-step Event Building
- Step 1: at 100 kHz, 0.25 of the data; rejection factor 10 from the High Level Trigger
- Step 2: at 10 kHz, remaining 0.75 of the data
- Throughput reduced to 0.25 + 0.1 x 0.75 = 0.33, i.e. a factor 3
- At the cost of control complexity and increased latency
- With a link speed of 1 Gbps, need a factor 2 from multi-step event building for a 100 kHz level-1 rate (assuming a 100% efficient switch)
- If higher speed links in [...], then single-step event builder
[Diagram labels: 100 kHz, 10 kHz]
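
The throughput reduction and the link requirement can be worked through numerically. The sketch assumes a 2 kbyte average fragment (the nominal value used elsewhere in the talk), taken as 2000 bytes, and a 100% efficient switch; the function names are hypothetical.

```python
def multistep_throughput_fraction(first_step_fraction: float, rejection: float) -> float:
    """Fraction of the single-step throughput needed with two-step building.

    All events move `first_step_fraction` of their data; only 1/rejection of
    the events move the remainder.
    """
    return first_step_fraction + (1.0 - first_step_fraction) / rejection


def required_link_gbps(rate_khz: float, fragment_bytes: float, fraction: float = 1.0) -> float:
    """Per-port link speed needed at 100% switch efficiency, in Gbps."""
    return rate_khz * 1e3 * fragment_bytes * 8 * fraction / 1e9


if __name__ == "__main__":
    fraction = multistep_throughput_fraction(0.25, 10.0)
    print(f"throughput fraction: {fraction:.3f}")
    print(f"single-step link need: {required_link_gbps(100.0, 2000.0):.2f} Gbps")
    print(f"multi-step link need:  {required_link_gbps(100.0, 2000.0, fraction):.2f} Gbps")
```

Single-step building at 100 kHz needs about 1.6 Gbps per port under these assumptions, which is why a 1 Gbps link calls for a reduction of roughly a factor 2.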