Implementation of the STAR Data Acquisition System using a Myrinet Network
J.M. Landgraf, M.J. LeVine, A. Ljubicic, Jr., M.W. Schulz (Brookhaven National Laboratory); J.M. Nelson (University of Birmingham); C. Adler, J.S. Lange (University of Frankfurt)

First Collisions at RHIC!
[Photo: STAR control room, June 12th, 2000, 9:00 pm]

Outline
- The STAR DAQ System
  - Components
  - Event Building Network
- Introduction to Myrinet
- Myrinet Implementation
  - Myrinet Software (GM)
  - STAR DAQ Software
  - myriLib
- Year 2 Event Builder
- Performance & Reliability

STAR DAQ
- DAQ Readout Units
  - VME crate-based
  - Custom RBs with ASICs & i960 CPUs
  - Motorola MVME Detector Broker
- L3
  - Linux farm (Compaq Alpha workstations)
  - Physics-based build decision
- Event Building Network
  - Token management
  - Event building
  - Event storage and buffering

DAQ / L3 Event Building Network
[Network diagram. Squares: MVME / VxWorks; circles: Alpha workstations / Linux; diamonds: UltraSPARC workstations / Solaris.]

What is Myrinet?
- Commercial network from Myricom (www.myri.com)
- Low cost (~$1K / card, $4-6K / switch)
- PCI / PMC network interface cards
- High bandwidth (1.28 + 1.28 Gbit/sec, full duplex)
- Low latency (13 usec)
- Scalable switched topology
- Network control performed in software
- Open-source MCP / driver software

Myrinet Architecture
- Network interface card (PCI64B)
  - LANai processor controls the network
  - Local memory buffer
  - Both network and PCI DMA engines
- Switches (see the conceptual sketch below)
  - Cut-through wormhole routing
  - CRC is recalculated at each stage, including the header
  - Stop/Go flow control mediated by a small slack buffer
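
The slides do not give the actual packet format; the following is a conceptual sketch only (the struct layout, field names, 8-hop depth, and compute_crc() are all invented for illustration) of how a source-routed, cut-through switch stage consumes one routing byte and recomputes the CRC:

    /* Conceptual model only -- not Myricom's real packet layout. */
    #include <string.h>

    enum { MAX_HOPS = 8 };                /* assumed network diameter */

    struct myri_packet {
        unsigned char hops[MAX_HOPS];     /* one port-select byte per switch stage */
        int           n_hops;             /* stages still to traverse */
        unsigned char payload[8192];
        unsigned int  crc;                /* covers header + payload */
    };

    /* Placeholder; the real hardware computes the CRC on the fly. */
    static unsigned int compute_crc(const struct myri_packet *p)
    {
        (void)p;
        return 0;
    }

    /* What one switch stage conceptually does under wormhole routing:
     * pick the output port from the first routing byte, strip it, and
     * recompute the CRC because the header has changed. */
    static int switch_stage(struct myri_packet *p)
    {
        int out_port = p->hops[0];
        p->n_hops--;
        memmove(p->hops, p->hops + 1, (size_t)p->n_hops);
        p->crc = compute_crc(p);
        return out_port;
    }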

Myrinet Throughput
We tested 32- and 64-bit Myrinet cards on:
- VxWorks: MVME 2604, MVME 2306
- Linux: Compaq Alpha
- Linux: Intel
- Solaris: UltraSPARC

Myrinet Software
- Myrinet driver (GM)
  - Network mapping
    - Each Myrinet node maintains a list of port offsets to every other node
    - Dynamic and static mapping supported
    - Alternate routes can be forced by the user
  - Variable-length messages
    - Sender / receiver provide buffers in advance for each size
    - Sender / receiver are notified and must return the buffer to GM
  - Directed sends
    - DMA directly to host memory
    - Receiver not notified
- GM imposes structure on the user program (see the sketch below)
  - Poll / block on gm_receive()
  - GM is not thread-safe
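
A minimal sketch of the receive structure GM imposes, assuming the standard GM-1 API; handle_message() and the 8 kB size class are invented for illustration, and error handling is omitted:

    /* Sketch of a GM receive loop. Buffers must be DMA-able memory
     * (e.g. from gm_dma_malloc()) and are handed to GM in advance,
     * one per size class in use. */
    #include <gm.h>

    #define SIZE_CLASS 13        /* 2^13 = 8 kB buffers (example choice) */

    extern void handle_message(void *buf, unsigned long len);  /* hypothetical */

    void receive_loop(struct gm_port *port, void *buf)
    {
        gm_provide_receive_buffer(port, buf, SIZE_CLASS, GM_LOW_PRIORITY);

        for (;;) {
            gm_recv_event_t *e = gm_receive(port);   /* polling variant */
            switch (gm_ntohc(e->recv.type)) {
            case GM_RECV_EVENT:
                handle_message(gm_ntohp(e->recv.buffer),
                               (unsigned long)gm_ntohl(e->recv.length));
                /* The buffer belongs to the application until it is
                 * explicitly returned to GM. */
                gm_provide_receive_buffer(port, gm_ntohp(e->recv.buffer),
                                          SIZE_CLASS, GM_LOW_PRIORITY);
                break;
            case GM_NO_RECV_EVENT:
                break;                     /* nothing pending; keep polling */
            default:
                gm_unknown(port, e);       /* lets GM process internal events
                                            * and fire send callbacks */
            }
        }
    }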

DAQ Software
- Software is message-based, using the ICCP message protocol:

      for (;;) {
          msgQReceive(&msg);
          switch (msg.cmd) { ... }
      }

- ICCP message protocol
  - 120-byte messages
  - Standard header
- Sending is routed to the proper network: daqMsgSend(node, &msg), where a destination is a node / task / domain
- Each network has an associated daemon (a fuller sketch follows below):
  - Local queue: que[task]
  - Myrinet: myriLib
  - Ethernet: ethComLib
  - VME: vmComLib
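
A runnable sketch of this dispatch pattern on VxWorks; only the 120-byte message size and the existence of a standard header come from the slides, while the field names and command codes below are assumptions:

    /* Sketch of an ICCP-style dispatch loop; the iccpMsg_t layout and
     * command codes are invented for illustration. */
    #include <vxWorks.h>
    #include <msgQLib.h>

    typedef struct {               /* 120 bytes total, per the slides */
        unsigned int cmd;          /* command code (assumed field) */
        unsigned int srcNode;      /* header fields (assumed) */
        unsigned int srcTask;
        char         payload[108];
    } iccpMsg_t;

    enum { CMD_RUN_START = 1, CMD_EVENT_ANNOUNCE = 2 };   /* hypothetical */

    extern void handleRunStart(iccpMsg_t *m);             /* hypothetical */
    extern void handleEvent(iccpMsg_t *m);                /* hypothetical */

    void taskMain(MSG_Q_ID q)
    {
        iccpMsg_t msg;
        for (;;) {
            if (msgQReceive(q, (char *)&msg, sizeof(msg),
                            WAIT_FOREVER) == ERROR)
                continue;          /* e.g. queue deleted */
            switch (msg.cmd) {
            case CMD_RUN_START:      handleRunStart(&msg); break;
            case CMD_EVENT_ANNOUNCE: handleEvent(&msg);    break;
            default:                 /* unknown command: drop */ break;
            }
        }
    }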

myriLib
- DAQ library which wraps GM
- Entry points: myriMsgSend(), myriMemCpy()
- What does it do?
  - Manages the DMA message buffers
  - Handles callback functions
  - Thread synchronization (see the sketch below)
  - Misc. (byte order, 32- vs. 64-bit, etc.)
  - Bypasses DMA limitations on Solaris
- Several flavors
  - Threaded vs. process
  - Buffered vs. unbuffered DMA copies
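
The slides do not show myriLib internals. Purely as a hypothetical sketch, assuming the library hides GM's callback-based completion behind a blocking call, here is one way a send wrapper could pair gm_send_with_callback() with a VxWorks semaphore; every name except the GM and VxWorks calls is invented:

    /* Hypothetical blocking-send wrapper; NOT the actual myriLib code.
     * Assumes a separate task is polling gm_receive()/gm_unknown(),
     * which is what dispatches send-completion callbacks, and that
     * access to the port is serialized elsewhere (GM is not
     * thread-safe). */
    #include <vxWorks.h>
    #include <semLib.h>
    #include <gm.h>

    struct sendCtx {
        SEM_ID      done;
        gm_status_t status;
    };

    static void sendCallback(struct gm_port *port, void *context,
                             gm_status_t status)
    {
        struct sendCtx *ctx = (struct sendCtx *)context;
        ctx->status = status;
        semGive(ctx->done);                 /* wake the blocked sender */
    }

    gm_status_t blockingSend(struct gm_port *port, void *msg,
                             unsigned int sizeClass, unsigned long len,
                             unsigned int destNode, unsigned int destPort)
    {
        struct sendCtx ctx;
        ctx.done = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

        gm_send_with_callback(port, msg, sizeClass, len, GM_LOW_PRIORITY,
                              destNode, destPort, sendCallback, &ctx);

        semTake(ctx.done, WAIT_FOREVER);    /* block until callback fires */
        semDelete(ctx.done);
        return ctx.status;
    }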

myriLib Operations
[Diagrams: threaded myriLib (VxWorks tasks); process myriLib; myriLib with buffering. These modes lead to extra latency / reduced throughput for directed sends.]

myriMemCpy() Throughput
[Throughput plot.]

Multi-Sender myriLib Throughput
[Plot: multiple 32-bit card MVME 2306 senders, one 64-bit UltraSPARC receiver.]

Year 2 Event Building
- Solaris Myrinet cards allow us to implement the EVB on the BB node
  - Removes a node from the network
  - Simplifies software: replaces point-to-point transfer with many-to-point transfer
  - More memory (1.5 GB vs. 256 MB)
  - Throughput increase via multiple pftp streams (30-35 MB/sec vs. 25 MB/sec)
  - Multi-CPU UltraSPARC machine
  - Compression on built events?
- Preliminary results show
  - Improved small-event performance (25 evts/sec → 140 evts/sec)
  - Improved throughput to BB (28 MB/sec → 100 MB/sec)

Year 1 Performance & Reliability
- RHIC data run
  - 3 months of data taking
  - ~15 days integrated stable beam
  - Little down time due to DAQ
- STAR performance
  - ~10 TB of data
  - ~2.03 million events
- Myrinet performance
  - 4 known message failures out of >10^8 messages
    - Cause not known
    - Reported by software
    - Each resulted in an aborted run
  - No known data corruption

Au-Au Central Collision
[L3 event display: a 130 GeV Au-Au collision with several thousand tracks, tracked in real time (~100 msec).]