Evolution of S-LINK to PCI interfaces

W. Iwanski (Henryk Niewodniczanski Institute of Nuclear Physics)
M. Joos, R. McLaren, J. Petersen, E. van der Bij (CERN)

W. Iwanski, LECC, Colmar, 9-13 September 2002

System evolution
[Diagram: three PC-based read-out configurations labelled Yesterday, Today and Tomorrow. Labels: S-LINK, 4 S-LINKs, SSP, S32PCI64, FILAR, slow PCI, fast PCI, Eth., MEM, CPU, PC.]

Current use of S-LINK to PCI interfaces in test beams
[Diagram: Muon test beam and TileCal test beam set-ups. Labels: VMEbus, SPS, S-LINK, SSPCI, PC.]
Features of SSP / SSPCI
- Simple hardware design (AMCC S5933 PCI controller)
- Max. speed = 80 (RIO2) / 132 (PC) MB/s
- Event overhead = ~8 µs
- One ROL per PMC / PCI slot

S32PCI64
[Figure: S32PCI64 card. Annotations: 32/64-bit PCI, 33/66 MHz; 3.3 V PCI slots only; 3.3 V S-LINK LDC plug-in.]

Overview of S32PCI64
[Block diagram. Blocks: S-LINK, 32-to-64 map, input FIFO, burst buffer FIFO, DMA engine, request FIFO (PCI address, length), acknowledge FIFO (ctrl words, length), control, status & interrupt registers, back-end control logic, 64-bit PCI core (33/66 MHz, 32/64-bit PCI). FIFO sizes shown: 1024 x 64-bit and 128 x 64-bit. Legend: FPGA local logic, commercial IP core.]

Features of S32PCI64
- 32/64-bit, 33/66 MHz PCI (3.3 V slots only)
- 32-bit S-LINK (3.3 V LDC plug-in only)
- 32-bit PCI-bus addressing
- No Initiator wait states during a burst
- Max. 64 Mbytes long DMA transfer
- Protocol overhead per event: 5 PCI single cycles
- Based on a commercial PCI IP core (PLDA)
- Highly autonomous data reception
- Interrupt generation selectable on several conditions
- Data packets longer than 1 kB are segmented
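
The segmentation rule stated in these features (data packets longer than 1 kB are split) can be illustrated with a minimal Python sketch. It assumes that "1 kB" means 1024 bytes, that "64 Mbytes" means 64 x 1024 x 1024 bytes, and that every sub-burst except the last is full. The function name `split_into_sub_bursts` is hypothetical and is not part of any S32PCI64 software.

```python
from typing import List

SUB_BURST_BYTES = 1024            # assumption: "1 kB" on the slide means 1024 bytes
MAX_DMA_BYTES = 64 * 1024 * 1024  # assumption: "64 Mbytes" means 64 * 2**20 bytes


def split_into_sub_bursts(packet_bytes: int) -> List[int]:
    """Return the sizes of the DMA sub-bursts for one packet (hypothetical helper)."""
    if not 0 < packet_bytes <= MAX_DMA_BYTES:
        raise ValueError("packet size must be between 1 byte and the DMA limit")
    full, rest = divmod(packet_bytes, SUB_BURST_BYTES)
    # All sub-bursts are full except possibly the last one.
    return [SUB_BURST_BYTES] * full + ([rest] if rest else [])


if __name__ == "__main__":
    for size in (100, 1024, 2500):
        print(size, "->", split_into_sub_bursts(size))
```

Under these assumptions a 2500-byte packet is transferred as three sub-bursts, while packets of up to 1024 bytes go out in a single burst.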

Features and applications of S32PCI64 based systems
- Less load on the CPU to handle link protocol
  - event overhead = ~2.5 µs
- 3.3 V PCI slots required
- For applications requiring full ATLAS input rate
- Applications: CosmoDAQ; mainly for prototypes
- Commercially available from NE (Nowoczesna Elektronika)

S32PCI64 test set-up
- Elonex PC
  - SuperMicro 370DLE motherboard
  - memory bandwidth > 528 MB/s
- SLIDAS data generator
- VMEtro PBT-515BX PCI Bus Analyser
- PCI bandwidth exerciser (PCI-Blaster)
[Diagram labels: SLIDAS, 160 MB/s, S32PCI64, PCI 66 MHz / 64 bit, ServerSet III LE, CPU, MEM, PC.]

PCI-Blaster
PCI bandwidth exerciser: firmware designed and implemented on the S32PCI64 board to benchmark the PCI bridge and memory in a PC
Features
- full speed PCI capability (528 MB/s)
- read and write capability
- read and write modes can be set up simultaneously
- simple programming model
- data transfers repeated a specified number of times or in an infinite loop
- 32-bit and 64-bit PCI bus (3.3 V PCI bus only)
- 33 and 66 MHz PCI clock speed
- 32-bit PCI-bus addressing
- no interrupt capability
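
The 528 MB/s figure quoted for full speed PCI follows from the bus width and the nominal clock: one bus-width word per clock cycle. The short Python sketch below reproduces that arithmetic for the four width/clock combinations named in the slides. It uses the nominal clock values 33 and 66 MHz, and the function name is hypothetical.

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: int) -> int:
    """Theoretical peak PCI rate in MB/s: one bus-width word per clock cycle."""
    if bus_width_bits % 8 != 0:
        raise ValueError("bus width must be a whole number of bytes")
    return (bus_width_bits // 8) * clock_mhz


if __name__ == "__main__":
    for width, clock in [(32, 33), (64, 33), (32, 66), (64, 66)]:
        print(f"{width}-bit / {clock} MHz: {pci_peak_mb_per_s(width, clock)} MB/s")
```

With these nominal values, 32-bit / 33 MHz PCI peaks at 132 MB/s, below the 160 MB/s S-LINK rate quoted in the slides, while 64-bit / 66 MHz PCI peaks at 528 MB/s.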

Overall throughput
[Figure: overall throughput plot.]

Other measurements
- Overhead of S32PCI64 hardware in single access from host CPU to S32PCI64
  - write -> 30 ns (2 wait states)
  - read -> 45 ns (3 wait states)
- Minimal time interval between 2 consecutive single commands seen on PCI bus (limited by CPU/PCI bridge)
  - write-write -> 75 ns
  - write-read -> 105 ns
  - read-read -> 330 ns
  - read-write -> 345 ns
- Gap between DMA sub-bursts (packets > 1 kB) -> 180 ns
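
The wait-state figures above are consistent with a 15 ns PCI clock period (2 wait states = 30 ns, 3 wait states = 45 ns). As a rough cross-check, the Python sketch below converts each quoted interval into PCI clock cycles. It assumes that all measurements were taken on the 66 MHz bus of the test set-up and that the clock period is exactly 15 ns. The dictionary keys and the function name are illustrative only.

```python
PCI_CLOCK_NS = 15  # assumption: nominal 66 MHz PCI clock, period taken as 15 ns

# Intervals in ns, copied from the measurements quoted in the slides.
MEASURED_NS = {
    "single write overhead": 30,
    "single read overhead": 45,
    "write-write": 75,
    "write-read": 105,
    "read-read": 330,
    "read-write": 345,
    "gap between DMA sub-bursts": 180,
}


def to_pci_clocks(interval_ns: int) -> int:
    """Convert an interval in ns to whole PCI clock cycles (hypothetical helper)."""
    clocks, remainder = divmod(interval_ns, PCI_CLOCK_NS)
    if remainder:
        raise ValueError("interval is not a whole number of 15 ns clock periods")
    return clocks


if __name__ == "__main__":
    for name, ns in MEASURED_NS.items():
        print(f"{name}: {ns} ns = {to_pci_clocks(ns)} PCI clocks")
```

Under this assumption every quoted interval is a whole number of clock cycles, for example 12 cycles for the gap between DMA sub-bursts.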

FILAR: Four Integrated Links for ATLAS Readout
[Figure: FILAR card. Annotations: pluggable f/o transceiver, ROL, Serdes, LDC protocol FPGA, S32PCI64, PCI 64 bit / 66 MHz 3.3 V.]

Main features of FILAR
- Four 2.5 Gbit/s HOLA S-LINK Destinations integrated
- Data channels individually enabled/disabled
- 32/64-bit 33/66 MHz PCI bus
- 32-bit S-LINK
- 32-bit PCI-bus addressing
- Protocol overhead per event and channel: 2 PCI single cycles
- Highly autonomous data reception
- Card temperature readout
- Module scheduled for 1Q/2003

Steps towards FILAR
- New library written in ROS context
- FILAR emulator: simplified specification of FILAR implemented in existing hardware (S32PCI64)
  - Data from one S-LINK connector copied to all four data channels
  - One common flow control signal: logical OR of all four XOFF signals
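
The emulator behaviour described above (one input copied to four channels, with a common XOFF formed as the logical OR of the per-channel XOFF signals) can be modelled with a small Python sketch. This is a toy software model, not the firmware. The class name, the `buffer_depth` parameter and the rule that a channel raises XOFF only when its buffer is completely full are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List

NUM_CHANNELS = 4  # the emulator copies one S-LINK input to four data channels


class FilarEmulatorModel:
    """Toy model of the emulator's shared flow control (hypothetical)."""

    def __init__(self, buffer_depth: int) -> None:
        self.buffer_depth = buffer_depth  # assumed per-channel buffer size in words
        self.channels: List[Deque[int]] = [deque() for _ in range(NUM_CHANNELS)]

    def channel_xoff(self, channel: int) -> bool:
        # Assumption: a channel asserts XOFF when its buffer is full.
        return len(self.channels[channel]) >= self.buffer_depth

    def common_xoff(self) -> bool:
        # Logical OR of all four per-channel XOFF signals.
        return any(self.channel_xoff(ch) for ch in range(NUM_CHANNELS))

    def push_word(self, word: int) -> bool:
        """Copy one input word to all channels; refuse it while XOFF is asserted."""
        if self.common_xoff():
            return False
        for buffer in self.channels:
            buffer.append(word)
        return True

    def drain(self, channel: int, n_words: int) -> int:
        """Remove up to n_words from one channel, as a DMA read-out would."""
        buffer = self.channels[channel]
        count = min(n_words, len(buffer))
        for _ in range(count):
            buffer.popleft()
        return count


if __name__ == "__main__":
    model = FilarEmulatorModel(buffer_depth=2)
    print(model.push_word(1), model.push_word(2))  # both words are accepted
    for ch in range(3):
        model.drain(ch, 2)                         # channels 0 to 2 are now empty
    print(model.push_word(3))                      # still refused: channel 3 is full
```

In this model a single undrained channel keeps the common XOFF asserted, so the three empty channels cannot be refilled until the fourth is read out.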

Features of FILAR software
- Package consists of Linux driver and user library
- Detection of data packets based on interrupts
- Support for:
  - multiple PCI cards
  - multiple channels
- Code optimised for low overhead
  - ~1.5 µs / event
- Current API will also be valid for final FILAR
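
To see why the per-event overhead matters most for small packets, the Python sketch below uses a simple serial model in which each event costs its transfer time at the link rate plus a fixed overhead. This model is an assumption for illustration and does not reproduce the measurements shown in the slides. The 160 MB/s link rate and the overhead values are taken from the slides, and the function name is hypothetical.

```python
def effective_throughput_mb_per_s(
    packet_bytes: int, link_mb_per_s: float, overhead_us: float
) -> float:
    """Throughput in MB/s if every event costs transfer time plus a fixed overhead.

    With 1 MB = 10**6 bytes, bytes / (MB/s) gives microseconds
    and bytes / microseconds gives MB/s.
    """
    if packet_bytes <= 0 or link_mb_per_s <= 0 or overhead_us < 0:
        raise ValueError("sizes and rates must be positive, overhead non-negative")
    transfer_us = packet_bytes / link_mb_per_s
    return packet_bytes / (transfer_us + overhead_us)


if __name__ == "__main__":
    S_LINK_MB_PER_S = 160.0  # S-LINK bandwidth quoted in the slides
    # Per-event overheads quoted in the slides, in microseconds.
    overheads_us = {"S32PCI64": 2.5, "FILAR software": 1.5}
    for size in (64, 256, 1024, 4096):
        row = ", ".join(
            f"{name}: {effective_throughput_mb_per_s(size, S_LINK_MB_PER_S, ovh):.1f} MB/s"
            for name, ovh in overheads_us.items()
        )
        print(f"{size} bytes -> {row}")
```

In this model the throughput approaches the link rate as packets grow, and a lower per-event overhead gives the largest relative gain at small packet sizes.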

Performance of FILAR emulator
[Figure: FILAR emulator performance plot.]

Performance (cont'd)
- Improved performance of FILAR emulator for small packets with respect to S32PCI64
- Better performance of FILAR emulator running one data channel than that of S32PCI64
- Performance of FILAR emulator running two, three or four data channels is compromised by a board limitation
  - the flow control signal, common to all channels here, stops new data to the whole interface whenever any data channel is getting full; this prevents other, already empty buffers from being re-filled
  - gaps of up to 4.5 µs seen between some DMAs
- If necessary, (still) costly single PCI cycles will be replaced by DMA

Summary
- Transition from 32-bit/33 MHz to 64-bit/66 MHz PCI
  - full bandwidth of S-LINK (160 MB/s)
- Stable design: evolution from S32PCI64 to FILAR
  - great part of software re-used
- Use of FPGA and IP core in designs
  - easy debugging, testing and upgrading
  - re-usability of existing hardware: PCI-Blaster, FILAR emulator
- Powerful and cost-optimised FILAR design
  - integration of 4 links in one card reduces costs and overcomes the limited number of fast 66 MHz PCI slots in a PC
  - pluggable optical transceivers further reduce costs when fewer than 4 data channels are needed